Moving from Teradata to BigQuery? Automate the process!

Exiting Teradata - a technical challenge

 

Teradata is an "appliance" chosen by countless organizations. Its specialization in data warehousing and analytics has enabled solutions with exceptional computing capacity and strong scalability.

But most of Teradata's historical users are now in the process of moving to the cloud.


One of our customers decided to move its Teradata assets to Google Cloud Platform (BigQuery), and to migrate a number of data visualization technologies (SAP BO, Power BI) along with them.

 

We share with you the methodology implemented as part of this migration. 

 

Some key indicators regarding the platform to be migrated 

Data assets:

  • 400,000 data “containers”;
  • 270,000 tables;
  • 11,000 files.

Script assets:

  • 122,000 views;
  • 1,200 macros;
  • 500 BTEQ injection processes;
  • 600 BO universes;
  • 100,000 Webi reports;
  • 30,000 data manipulation processes spread across 450 Stambia ETL projects.

Usage statistics*:

  • 30% of data used;
  • 30% of tables/views/files used;
  • 50% of transformations used.

*i.e. used in a way that leads to a decision-making report or to application use

Processing:

  • 1,500,000 queries per day;
  • 880,000 inserts / updates / merges per day.

Three major challenges were identified to make this migration a success:

 

  • It was necessary to continuously map what exists on the source platform, with all its dependencies;
  • It was necessary to maintain a permanent inventory of the migration's progress: what has been migrated, what remains to be migrated;
  • It was necessary to share the migration process with everyone, to avoid misunderstandings.

 

Automating these tasks was a must. 

 

Stage #1

Mastering the source platform by mapping it

{openAudit} made it possible to understand the internal processes via physical, field-level data lineage, in BTEQ as well as in ETL/ELT, views, macros, and the other scripts associated with feeding the flows.
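
To give an idea of the parsing involved, here is a minimal sketch of table-level lineage extraction from a Teradata statement, using the open-source sqlglot library rather than {openAudit}'s own parser; the SQL and table names are invented for illustration.

    import sqlglot
    from sqlglot import exp

    sql = """
    INSERT INTO dwh.sales_agg
    SELECT s.store_id, SUM(s.amount)
    FROM staging.sales s
    JOIN ref.stores r ON s.store_id = r.store_id
    GROUP BY 1
    """

    # Parse with the Teradata dialect.
    tree = sqlglot.parse_one(sql, read="teradata")

    def full_name(table):
        # Database-qualified name, ignoring aliases.
        return f"{table.db}.{table.name}" if table.db else table.name

    target = full_name(tree.this)  # the table being written to
    sources = {full_name(t) for t in tree.find_all(exp.Table)} - {target}

    print(f"{', '.join(sorted(sources))}  -->  {target}")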

 

{openAudit} helped identify how the information is used, via an analysis of the audit database logs, both for data consumption and data injection.
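
The underlying idea can be sketched with Teradata's query logging (DBQL): count how often each object appears in the logged queries. This assumes DBQL is enabled and uses the teradatasql driver; the exact log tables and type codes {openAudit} relies on may differ.

    import teradatasql

    QUERY = """
    SELECT ObjectDatabaseName, ObjectTableName, COUNT(*) AS hits
    FROM DBC.DBQLObjTbl          -- DBQL object log; 'Tab' = table-type objects
    WHERE ObjectType = 'Tab'
    GROUP BY 1, 2
    ORDER BY hits DESC
    """

    con = teradatasql.connect(host="td-prod", user="audit_user", password="***")
    cur = con.cursor()
    cur.execute(QUERY)
    for db, table, hits in cur.fetchall():
        print(f"{db}.{table}: {hits} logged queries")
    con.close()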

 

{openAudit} analyzed task scheduling and linked it to data lineage and data usage.
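
Conceptually, this comes down to building a dependency graph that mixes lineage, scheduling and usage edges, then walking it for impact analysis. A minimal sketch with networkx, with invented edges:

    import networkx as nx

    g = nx.DiGraph()
    g.add_edges_from([
        ("staging.sales", "dwh.sales_agg"),          # lineage edge (from a BTEQ script)
        ("dwh.sales_agg", "job:daily_refresh"),      # scheduling edge
        ("job:daily_refresh", "bo:sales_universe"),  # usage edge (BO universe)
    ])

    # Everything downstream of staging.sales: the impact of migrating (or dropping) it.
    print(sorted(nx.descendants(g, "staging.sales")))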

 

{openAudit} highlighted the impacts in the data visualization tools connected to Teradata (e.g. Power BI, SAP BO), to expose the related complexity (business rules) and to enable true end-to-end data lineage.

 

Stage #2

Automating the migration

Through a series of mechanisms, {openAudit} reproduced the essential processing in BigQuery: parsing the source code, then producing an enriched standard SQL.
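
As an illustration of this kind of transposition (here with the open-source sqlglot library, not {openAudit}'s mechanisms), a Teradata construct such as QUALIFY can be re-emitted as BigQuery standard SQL:

    import sqlglot

    teradata_sql = """
    SELECT cust_id, order_dt
    FROM dwh.orders
    QUALIFY ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY order_dt DESC) = 1
    """

    # Re-emit the statement in BigQuery's dialect.
    print(sqlglot.transpile(teradata_sql, read="teradata", write="bigquery", pretty=True)[0])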

Note that some encapsulations (shell scripts, among others) are likely to degrade the output.

Also note that an ETL/ELT in the source system requires a transposition of its processing; for some of these tools, {openAudit} can speed up the project.

Stage #3

Mastering deployment in GCP with mapping

{openAudit} performed dynamic analysis of BigQuery - scheduled queries, view scripts, and JSON/CSV-type loading files - to enable intelligent reconstruction of the flows.
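
For example, view definitions can be pulled from BigQuery's INFORMATION_SCHEMA so their SQL can be re-parsed for lineage; project and dataset names below are placeholders:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")

    sql = """
    SELECT table_name, view_definition
    FROM `my-gcp-project.analytics.INFORMATION_SCHEMA.VIEWS`
    """
    for row in client.query(sql).result():
        # view_definition is the SQL text of the view, ready to be parsed for lineage.
        print(row.table_name, "-", len(row.view_definition), "characters of SQL")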

{openAudit} analyzed the logs in Google Cloud Operations (formerly Stackdriver), to immediately understand how the information is used.
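
A close, easy-to-reproduce equivalent of this usage analysis relies on BigQuery's own job metadata (INFORMATION_SCHEMA.JOBS_BY_PROJECT), which records the tables referenced by each query; the project and region below are placeholders:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")

    sql = """
    SELECT r.dataset_id, r.table_id, COUNT(*) AS hits
    FROM `my-gcp-project.region-eu.INFORMATION_SCHEMA.JOBS_BY_PROJECT`,
         UNNEST(referenced_tables) AS r
    WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
    GROUP BY 1, 2
    ORDER BY hits DESC
    """
    for row in client.query(sql).result():
        print(f"{row.dataset_id}.{row.table_id}: {row.hits} jobs in the last 30 days")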

{openAudit} mapped the ordering of tasks, to link it to data lineage and data usage.

{openAudit} introspected some "target" data visualization technologies that rely on GCP (Looker, Data Studio, BO Cloud, Power BI, etc.), in order to reconstruct the intelligence by comparing the outputs.

Furthermore, the existing connectors could be redirected to BigQuery (in some cases through Datometry's Hyper-Q middleware, at the cost of some performance deterioration).

 

 

See also: 

 
Migrating from Oracle to PostgreSQL

 

 

Conclusion

We do not believe that a migration of this ambition can be run on "kick-offs" and "deadlines" alone. It calls for an intelligent process based on real mastery of both the source and target platforms, through continuous technical introspection of processes and usage, together with a graphical representation of the information systems that everyone can understand and exploit.

 

Automating the migration brings undeniable added value in this context.

 

 
