Moving from Teradata to BigQuery? Automate the process!

Exiting Teradata - a technical challenge

 

Teradata is an "appliance" chosen by countless organizations. Its specialization in data warehousing and analytics has enabled solutions with exceptional computing capacity and strong scalability.

But most of Teradata's historical users are now in the process of moving to the cloud.


One of our customers decided to move its Teradata assets to Google Cloud Platform (BigQuery), and to migrate a number of data visualization technologies (SAP BO, Power BI) along with them.

 

We share with you the methodology implemented as part of this migration. 

 

Some key indicators regarding the platform to be migrated 

Data assets:

  • 400,000 data “containers”;
  • 270,000 tables;
  • 11,000 files.

Script assets:

  • 122,000 views;
  • 1,200 macros;
  • 500 BTEQ injection processes;
  • 600 BO universes;
  • 100,000 Webi reports;
  • 30,000 data manipulation processes spread across 450 Stambia ETL projects.

Usage statistics*:

  • 30% of data used;
  • 30% of tables/views/files used;
  • 50% of transformations used.

*i.e. used in a way that leads to a decision-making report or to application use

Processing:

  • 1,500,000 queries per day;
  • 880,000 inserts / updates / merges per day.

Three major challenges were identified to make this migration a success:

 

  • It was necessary to continuously map what exists on the source platform, with all its dependencies;
  • It was necessary to maintain a permanent inventory of the migration's progress: what has been migrated, what remains to be migrated;
  • It was necessary to share the migration process with everyone, to avoid misunderstandings.

 

Automating these tasks was a must. 

 

Stage #1

Mastering the source platform by mapping it

{openAudit} made it possible to understand the internal processes via physical, field-level data lineage, in BTEQ as well as in ETL/ELT, views, macros, and the other scripts associated with feeding the flows.
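
To give an idea of the parsing involved, here is a minimal sketch of table-level lineage extraction from a Teradata statement, using the open-source sqlglot library rather than {openAudit}'s own parser; the SQL and table names are invented for illustration.

    import sqlglot
    from sqlglot import exp

    sql = """
    INSERT INTO dwh.sales_agg
    SELECT s.store_id, SUM(s.amount)
    FROM staging.sales s
    JOIN ref.stores r ON s.store_id = r.store_id
    GROUP BY 1
    """

    # Parse with the Teradata dialect.
    tree = sqlglot.parse_one(sql, read="teradata")

    def full_name(table):
        # Database-qualified name, ignoring aliases.
        return f"{table.db}.{table.name}" if table.db else table.name

    target = full_name(tree.this)  # the table being written to
    sources = {full_name(t) for t in tree.find_all(exp.Table)} - {target}

    print(f"{', '.join(sorted(sources))}  -->  {target}")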

 

{openAudit} helped identify how the information is used, via an analysis of the audit database logs, both for data consumption and data injection.
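
The underlying idea can be sketched with Teradata's query logging (DBQL): count how often each object appears in the logged queries. This assumes DBQL is enabled and uses the teradatasql driver; the exact log tables and type codes {openAudit} relies on may differ.

    import teradatasql

    QUERY = """
    SELECT ObjectDatabaseName, ObjectTableName, COUNT(*) AS hits
    FROM DBC.DBQLObjTbl          -- DBQL object log; 'Tab' = table-type objects
    WHERE ObjectType = 'Tab'
    GROUP BY 1, 2
    ORDER BY hits DESC
    """

    con = teradatasql.connect(host="td-prod", user="audit_user", password="***")
    cur = con.cursor()
    cur.execute(QUERY)
    for db, table, hits in cur.fetchall():
        print(f"{db}.{table}: {hits} logged queries")
    con.close()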

 

{openAudit} analyzed task scheduling and linked it to data lineage and data usage.
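
Conceptually, this comes down to building a dependency graph that mixes lineage, scheduling and usage edges, then walking it for impact analysis. A minimal sketch with networkx, with invented edges:

    import networkx as nx

    g = nx.DiGraph()
    g.add_edges_from([
        ("staging.sales", "dwh.sales_agg"),          # lineage edge (from a BTEQ script)
        ("dwh.sales_agg", "job:daily_refresh"),      # scheduling edge
        ("job:daily_refresh", "bo:sales_universe"),  # usage edge (BO universe)
    ])

    # Everything downstream of staging.sales: the impact of migrating (or dropping) it.
    print(sorted(nx.descendants(g, "staging.sales")))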

 

{openAudit} highlighted the impacts in the data visualization tools connected to Teradata (e.g. Power BI, SAP BO), to expose the related complexity (business rules) and to enable true end-to-end data lineage.

 

Stage #2

Automating the migration

Through a series of mechanisms, {openAudit} reproduced the essential processing in BigQuery: parsing the source code, then producing an enriched standard SQL.
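
As an illustration of this kind of transposition (here with the open-source sqlglot library, not {openAudit}'s mechanisms), a Teradata construct such as QUALIFY can be re-emitted as BigQuery standard SQL:

    import sqlglot

    teradata_sql = """
    SELECT cust_id, order_dt
    FROM dwh.orders
    QUALIFY ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY order_dt DESC) = 1
    """

    # Re-emit the statement in BigQuery's dialect.
    print(sqlglot.transpile(teradata_sql, read="teradata", write="bigquery", pretty=True)[0])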

Note that some encapsulations (shell scripts, among others) are likely to degrade the output.

Also note that an ETL/ELT in the source system requires a transposition of its processing; for some of these tools, {openAudit} can speed up the project.

Stage #3

Mastering deployment in GCP with mapping

{openAudit} performed dynamic analysis of BigQuery - scheduled queries, view scripts, and JSON/CSV-type loading files - to enable intelligent reconstruction of the flows.
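
For example, view definitions can be pulled from BigQuery's INFORMATION_SCHEMA so their SQL can be re-parsed for lineage; project and dataset names below are placeholders:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")

    sql = """
    SELECT table_name, view_definition
    FROM `my-gcp-project.analytics.INFORMATION_SCHEMA.VIEWS`
    """
    for row in client.query(sql).result():
        # view_definition is the SQL text of the view, ready to be parsed for lineage.
        print(row.table_name, "-", len(row.view_definition), "characters of SQL")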

{openAudit} analyzed the logs in Google Cloud Operations (formerly Stackdriver), to immediately understand how the information is used.
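
A close, easy-to-reproduce equivalent of this usage analysis relies on BigQuery's own job metadata (INFORMATION_SCHEMA.JOBS_BY_PROJECT), which records the tables referenced by each query; the project and region below are placeholders:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")

    sql = """
    SELECT r.dataset_id, r.table_id, COUNT(*) AS hits
    FROM `my-gcp-project.region-eu.INFORMATION_SCHEMA.JOBS_BY_PROJECT`,
         UNNEST(referenced_tables) AS r
    WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
    GROUP BY 1, 2
    ORDER BY hits DESC
    """
    for row in client.query(sql).result():
        print(f"{row.dataset_id}.{row.table_id}: {row.hits} jobs in the last 30 days")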

{openAudit} mapped the ordering of tasks, to link it to data lineage and data usage.

{openAudit} introspected some "target" data visualization technologies that rely on GCP (Looker, Data Studio, BO Cloud, Power BI, etc.), in order to reconstruct the intelligence by comparing the outputs.

Furthermore, the existing connectors could be redirected to BigQuery (in some cases through Datometry's Hyper-Q middleware, at the cost of some performance deterioration).

 

 

See also: 

 
Migrating from Oracle to PostgreSQL

 

 

Conclusion

We do not believe that a migration of this ambition can be run on "kick-offs" and "deadlines" alone. It calls for an intelligent process based on real mastery of both the source and target platforms, through continuous technical introspection of processes and usage, together with a graphical representation of the information systems that everyone can understand and exploit.

 

Automating the migration brings undeniable added value in this context.

 

 
