ETL Environment Standards

From MIKE2 Methodology

ETL Environment Standards define how the ETL environment is to be physically deployed. This is generally covered in the Solution Architecture, but it needs to be spelled out completely here. As with other types of IT development work, ETL development requires a controlled software development lifecycle, and discipline from individual developers, to stay maintainable.

Although these standards are listed as part of ETL Physical Design, ideally they are defined before the prototype. Once established, they can be re-used for future increments and need only be reviewed.

Key Standards to Establish for the ETL Environment

Environment Configuration

The following environments are recommended for ETL projects as a minimum:

  • Development
  • Testing
  • Production

ETL tool licensing restrictions usually mean that only one or two licensed servers are available. The best case is three physical ETL servers, one for each environment. When that is not possible, Development and Testing can usually be defined as separate logical environments residing on the same physical server.

Testing can cover several types of testing, from functional testing to user acceptance testing. Depending on the complexity of the project and the number of concurrent development streams envisioned, it may be desirable to have multiple logical testing environments. These environments may share database instances or require separate ones, depending on how much testing has to run concurrently.
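The layout described above can be written down as data so that it can be checked. The following is a minimal sketch in Python, not part of the MIKE2 standard: the class name, field names, server names and database instance names are all hypothetical. It models logical environments hosted on physical servers, verifies that the three minimum environment types exist, and shows which logical environments share a server.

```python
from dataclasses import dataclass

# The minimum set of environment types recommended for ETL projects.
REQUIRED_KINDS = {"Development", "Testing", "Production"}


@dataclass(frozen=True)
class LogicalEnvironment:
    name: str         # logical environment name (hypothetical naming scheme)
    kind: str         # one of REQUIRED_KINDS
    server: str       # physical ETL server hosting this environment
    db_instance: str  # database instance used by this environment


def missing_kinds(envs):
    """Return the required environment types that are not yet defined."""
    return REQUIRED_KINDS - {env.kind for env in envs}


def environments_by_server(envs):
    """Group logical environment names by the physical server hosting them."""
    grouped = {}
    for env in envs:
        grouped.setdefault(env.server, []).append(env.name)
    return grouped


# Example: two licensed servers, with Development and two Testing
# environments sharing one of them. All names are illustrative assumptions.
layout = [
    LogicalEnvironment("Development", "Development", "etl-server-1", "db-dev"),
    LogicalEnvironment("Testing-Functional", "Testing", "etl-server-1", "db-test"),
    LogicalEnvironment("Testing-UAT", "Testing", "etl-server-1", "db-test"),
    LogicalEnvironment("Production", "Production", "etl-server-2", "db-prod"),
]
```

In this example the two Testing environments share a database instance; giving each its own `db_instance` value would model the case where concurrent testing requires separate instances.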

Environment Control

For ETL development, the Testing and Production environments are largely read-only. Developers must be strongly discouraged, if not prohibited, from applying fixes directly in Production; the correct path is to make the change in Development, verify that it works, and then migrate it through Testing to Production.

An efficient migration procedure must be in place by the time Production goes live, so that emergency fixes can be fast-tracked from the updatable Development environment through to Production.
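The control rule above (edit only in Development, then migrate forward one environment at a time) can be sketched in Python as follows. This is an illustration under stated assumptions, not the API of any ETL tool: the function names, the `ControlViolation` exception and the use of integer version numbers are all hypothetical.

```python
# Promotion order taken from the text: Development -> Testing -> Production.
PROMOTION_PATH = ["Development", "Testing", "Production"]

# Assumption: only Development accepts direct edits.
UPDATABLE = {"Development"}


class ControlViolation(Exception):
    """Raised when an action would bypass the controlled lifecycle."""


def apply_change(environment, job, versions):
    """Edit a job directly; permitted only in an updatable environment."""
    if environment not in UPDATABLE:
        raise ControlViolation(f"{environment} is read-only; change {job} in Development")
    versions[environment][job] = versions[environment].get(job, 0) + 1
    return versions[environment][job]


def promote(job, source, versions):
    """Copy a job's current version from `source` to the next environment."""
    index = PROMOTION_PATH.index(source)
    if index == len(PROMOTION_PATH) - 1:
        raise ControlViolation(f"{source} is the end of the promotion path")
    if job not in versions[source]:
        raise ControlViolation(f"{job} does not exist in {source}")
    target = PROMOTION_PATH[index + 1]
    versions[target][job] = versions[source][job]
    return target


# Example: an emergency fix still follows the same path, only faster.
versions = {env: {} for env in PROMOTION_PATH}
apply_change("Development", "load_customers", versions)
promote("load_customers", "Development", versions)
promote("load_customers", "Testing", versions)
```

Because `promote` only ever moves a job one step forward, a fast-tracked fix differs from a normal release only in how quickly the two promotions follow each other.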

If Production is left updatable, developers or support staff have a shortcut: they can fix problems directly against Production jobs. This breaks the job development lifecycle. The version of the job in Development is no longer the latest version, Production fixes may be lost when new versions are delivered, and pushing Production versions back into Development may overwrite development work.
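One way to detect this kind of drift is to record a fingerprint of each job at migration time and compare it with what is actually in Production. The Python sketch below assumes that job definitions can be exported as text, which depends on the ETL tool; the function names and the sample job are hypothetical.

```python
import hashlib


def fingerprint(job_definition: str) -> str:
    """Return a SHA-256 hex digest of an exported job definition."""
    return hashlib.sha256(job_definition.encode("utf-8")).hexdigest()


def find_drift(released, production):
    """Return names of Production jobs that differ from what was released.

    released:   job name -> fingerprint recorded when the job was migrated
    production: job name -> job definition currently in Production
    """
    drifted = []
    for name, definition in production.items():
        # A job is flagged if it was never released through the migration
        # procedure, or if its content changed after release.
        if name not in released or fingerprint(definition) != released[name]:
            drifted.append(name)
    return sorted(drifted)


# Example: a job edited directly in Production after release.
released = {"load_orders": fingerprint("SELECT * FROM orders")}
production = {"load_orders": "SELECT * FROM orders WHERE status = 'OPEN'"}
drifted_jobs = find_drift(released, production)
```

A check like this does not prevent direct fixes, but it makes them visible before the next delivery overwrites them.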

It is permissible to have updatable Testing environments for verification testing, where different job scenarios are being verified. Stress testing, for example, may require many minor changes to jobs to check the effect on performance. In these cases, once the optimal job design has been found, the final job can be pushed back into Development, or the changes can be applied to the Development job manually. Ideally, however, performance settings would be controlled by parameters so that the jobs themselves do not have to be modified.
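As an illustration of parameter-driven performance settings, the Python sketch below keeps tuning values outside the job logic so that a stress test can vary them without editing the job. The parameter names `commit_interval` and `parallel_partitions`, and their default values, are assumptions chosen for the example; real ETL tools expose their own settings.

```python
# Hypothetical tuning parameters with default values.
DEFAULTS = {"commit_interval": 1000, "parallel_partitions": 2}


def resolve_parameters(environment_overrides, run_overrides=None):
    """Merge defaults, per-environment values and per-run values, in that order."""
    params = dict(DEFAULTS)
    params.update(environment_overrides)
    params.update(run_overrides or {})
    unknown = set(params) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return params


def batches(rows, commit_interval):
    """Split rows into commit-sized batches; the job logic itself never changes."""
    for start in range(0, len(rows), commit_interval):
        yield rows[start:start + commit_interval]


# Example: a stress test tries a smaller commit interval in Testing
# and more partitions for a single run.
params = resolve_parameters({"commit_interval": 500}, {"parallel_partitions": 8})
row_batches = list(batches(list(range(1200)), params["commit_interval"]))
```

Once the stress test has settled on the best values, only the parameter set needs to be migrated, not a modified job.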

Multiple Projects

Multiple ETL projects can exist in parallel in each of the environments. Development work can be divided up among multiple business units to keep the number of jobs in each ETL project maintainable. This division would generally flow through to Testing and Production.

Release Management Standards

Release Management Standards define the ETL version control approach to be used, including version control within the tool itself and/or in any external configuration management tool.

Relationship to Overall Implementation Guide

The following tasks from the Overall Implementation Guide act as input to the definition of ETL Environment Standards:

The output of this step is a set of updated ETL Environment Standards.
