Moving from Teradata to BigQuery? Automate the process!
|
|
|
|
|
Exiting Teradata - a technical challenge
Teradata is an "appliance" chosen by countless players. Its specialization in data warehousing and analytics has made it possible to build solutions with exceptional computing power and strong scalability. But most of Teradata's historical users are now moving to the cloud. One of our customers decided to migrate its Teradata assets to Google Cloud Platform (BigQuery), along with a number of data visualization technologies (SAP BO, Power BI).
We share here the methodology implemented as part of this migration.
Some key indicators regarding the platform to be migrated
|
|
|
|
|
Data assets:
- 400,000 data "containers";
- 270,000 tables;
- 11,000 files.
|
|
|
|
|
Script assets:
- 122,000 views;
- 1,200 macros;
- 500 BTEQ injection processes;
- 600 BO universes;
- 100,000 Webi reports;
- 30,000 data manipulation processes spread across 450 Stambia ETL projects.
|
|
|
|
|
Usage statistics*:
- 30% of the data used;
- 30% of tables/views/files used;
- 50% of transformations used.
*i.e. leading to a decision-making report or to application use
|
|
|
|
|
Processing:
- 1,500,000 queries per day;
- 880,000 inserts / updates / merges;
- ...
|
|
|
|
Three major challenges were identified to make this migration a success:
- continuously mapping the existing assets on the source platform, with all their dependencies;
- keeping a permanent inventory of the migration's progress: what has been migrated, what remains to be migrated;
- sharing the migration process with everyone, to avoid misunderstandings.
Automating these tasks was a must.
|
|
|
|
|
Stage #1: Mastering the source platform by mapping it
|
|
|
|
{openAudit} made it possible to master the internal processes through physical, field-level data lineage: in BTEQ, but also in the ETL/ELT, views, macros and other scripts involved in feeding the flows.
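To give a feel for what this parsing involves (this is not {openAudit}'s actual engine), here is a minimal Python sketch that extracts table-level lineage from a single view definition using the open-source sqlglot library; the view text and object names are invented.

```python
# Minimal lineage sketch (illustrative only): parse the SQL behind a view or a
# BTEQ step and list the physical tables it reads from. The view text and
# object names are made up; {openAudit}'s own parser is proprietary.
import sqlglot
from sqlglot import exp

view_select = """
SELECT s.sale_date, p.product_line, SUM(s.amount) AS revenue
FROM sales_raw.transactions AS s
JOIN ref_data.products AS p ON s.product_id = p.product_id
GROUP BY s.sale_date, p.product_line
"""

tree = sqlglot.parse_one(view_select, read="teradata")

# Every physical table referenced by the statement becomes an upstream node in
# the lineage graph; repeating this over all views, macros and BTEQ scripts
# yields the full dependency map.
sources = sorted({f"{t.db}.{t.name}" for t in tree.find_all(exp.Table)})
print(sources)  # e.g. ['ref_data.products', 'sales_raw.transactions']
```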
|
|
|
|
|
{openAudit} helped identify how the information is actually used, through an analysis of the audit database logs, covering both data consumption and data injection.
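As an illustration of this kind of log mining (not {openAudit}'s implementation), the sketch below ranks objects by how often they are queried over the last 90 days. It assumes DBQL logging, including object logging, is enabled; the DBC view and column names should be checked against your Teradata release, and the credentials are placeholders.

```python
# Hedged sketch: rank Teradata objects by how often they are actually queried.
# Assumes DBQL logging (including object logging) is enabled; view and column
# names (DBC.QryLogV, DBC.QryLogObjectsV, StartTime, ...) should be verified
# for your Teradata release. Connection details are placeholders.
import teradatasql

USAGE_SQL = """
SELECT o.ObjectDatabaseName,
       o.ObjectTableName,
       COUNT(DISTINCT o.QueryID) AS nb_queries_90d
FROM DBC.QryLogObjectsV AS o
JOIN DBC.QryLogV AS q
  ON q.ProcID = o.ProcID AND q.QueryID = o.QueryID
WHERE CAST(q.StartTime AS DATE) >= CURRENT_DATE - 90
GROUP BY 1, 2
ORDER BY 3 DESC
"""

con = teradatasql.connect(host="td-prod.example.com", user="audit_ro", password="***")
cur = con.cursor()
cur.execute(USAGE_SQL)
for db_name, table_name, nb_queries in cur.fetchall():
    print(db_name.strip(), table_name.strip(), nb_queries)
con.close()
```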
|
|
|
|
|
{openAudit} analyzed the task scheduling and linked it to data lineage, as well as to data usage.
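To illustrate why scheduling and lineage have to be cross-checked, here is a toy example with invented job names: once each job's inputs and outputs are known from lineage, a topological sort gives a valid load order, which can then be compared with what the scheduler actually does.

```python
# Toy example: derive a valid execution order from lineage-based dependencies
# and compare it with the actual scheduler configuration. Job names are invented.
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# job -> jobs whose output it consumes (as discovered through data lineage)
lineage_deps = {
    "load_transactions": set(),
    "load_products": set(),
    "build_daily_revenue": {"load_transactions", "load_products"},
    "refresh_bo_universe": {"build_daily_revenue"},
}

print(list(TopologicalSorter(lineage_deps).static_order()))
# e.g. ['load_transactions', 'load_products', 'build_daily_revenue', 'refresh_bo_universe']
```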
|
|
|
|
|
{openAudit} highlighted the impacts in the data visualization tools connected to Teradata (e.g. Power BI, SAP BO), in order to expose the related complexity (business rules) and to enable true end-to-end data lineage.
|
|
|
|
|
Stage #2: Automating the migration
|
|
|
|
|
Through a series of mechanisms, {openAudit} reproduced the essential processing in BigQuery: parsing, followed by the production of an enriched standard SQL. Note that some encapsulations (Shell and others) are likely to degrade the output. Note also that any ETL/ELT in the source system requires its processing logic to be transposed; for part of it, {openAudit} can speed up the project.
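By way of illustration only (this is not {openAudit}'s engine), the open-source sqlglot library shows what "parse, then re-emit standard SQL" looks like on a single made-up Teradata query.

```python
# Illustrative transpilation of one Teradata statement into BigQuery GoogleSQL
# using the open-source sqlglot library. The query itself is a made-up example.
import sqlglot

teradata_sql = """
SELECT TOP 10 sale_date, SUM(amount) AS revenue
FROM sales_raw.transactions
WHERE sale_date >= DATE '2023-01-01'
GROUP BY sale_date
ORDER BY revenue DESC
"""

bigquery_sql = sqlglot.transpile(teradata_sql, read="teradata", write="bigquery")[0]
print(bigquery_sql)  # TOP is rewritten as LIMIT; the output still needs review before deployment
```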
|
|
|
|
|
Stage #3: Mastering the deployment in GCP by mapping it
|
|
|
|
{openAudit} performed a dynamic analysis of BigQuery: scheduled queries, view scripts and JSON/CSV loading files, to enable an intelligent construction of the flows.
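As a hint of what such a dynamic analysis can rely on (a sketch, not {openAudit}'s implementation), the official google-cloud-bigquery client makes the target side easy to introspect; the project name below is a placeholder.

```python
# Sketch: list BigQuery datasets and tables, and recover the SQL behind each
# view so that target-side lineage can be rebuilt with the same SQL parser used
# on the source side. The project name is a placeholder.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")

for dataset in client.list_datasets():
    for item in client.list_tables(dataset.dataset_id):
        table = client.get_table(item.reference)
        if table.table_type == "VIEW":
            # table.view_query holds the view's SQL definition
            print(table.full_table_id, "->", table.view_query[:80], "...")
```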
|
|
|
|
|
{openAudit} analyzed the logs in Google Cloud Operations (formerly Stackdriver) to immediately understand how the information is used.
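The same question can also be asked of BigQuery itself. As a hedged sketch (region and project are placeholders, and this queries the INFORMATION_SCHEMA.JOBS view rather than Cloud Logging directly), each job's referenced tables can be counted to measure real usage on the target.

```python
# Sketch of target-side usage statistics: count, per table, how many jobs have
# referenced it over the last 90 days, using BigQuery's
# INFORMATION_SCHEMA.JOBS_BY_PROJECT view. Region and project are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")

USAGE_SQL = """
SELECT ref.dataset_id, ref.table_id, COUNT(DISTINCT job_id) AS nb_jobs_90d
FROM `region-eu`.INFORMATION_SCHEMA.JOBS_BY_PROJECT,
     UNNEST(referenced_tables) AS ref
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
GROUP BY ref.dataset_id, ref.table_id
ORDER BY nb_jobs_90d DESC
"""

for row in client.query(USAGE_SQL).result():
    print(row.dataset_id, row.table_id, row.nb_jobs_90d)
```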
|
|
|
|
|
{openAudit} mapped the ordering of tasks, linking it to data lineage and data usage.
|
|
|
|
|
{openAudit} introspected some of the "target" data visualization technologies that rely on GCP (Looker, Data Studio, BO Cloud, Power BI, ...), in order to reconstruct the intelligence by comparing the results.
|
|
|
|
|
Furthermore, the connectors could be migrated to BigQuery (including connectors routed through Datometry's Hyper-Q middleware, at the cost of some performance deterioration).
Conclusion
We do not believe that a migration of such ambition can be run on "kick-offs" and "deadlines" alone. It calls for an intelligent process based on a real mastery of both the source and the target platform, through continuous technical introspection of the processes and their uses, and on a graphical representation of the information systems that everyone can understand and exploit. Automating the migration brings undeniable added value in this context.
|
|
|
|
|
|