SQL & SQLOps at the service of digital transformation

 

 


In “on-premise” infrastructures, SQL procedural code and its derivatives were marginalized in favor of ETL/ELT.

 

ETL/ELT tools have proliferated in companies over the last 30 years, primarily to make the deployment of “data pipelines” accessible to less technical profiles. They were also meant to facilitate impact analyses, provide versioning, ease flow monitoring, allow jobs to be replicated, etc.

With the Cloud, SQL is making a triumphant comeback, for many very good reasons.

And we are also seeing SQLOps-oriented solutions appear that “augment” SQL by encapsulating it in tools offering additional guarantees.

 

But how do you make a clean transition from DataStage, Talend, BODS, SSIS, etc., to “plain” SQL or to SQLOps-oriented tools?

 
 

 

The resilience of SQL

Our experience shows that developers often continued to model flows in SQL before rebuilding them in the ETL tool in place.

Why?

Developers often have a strong relationship with the SQL language: it provides a natural way to describe and understand data relationships and patterns.

Using SQL queries allows you to quickly test data transformations directly in the database.

Using SQL allows applications to manage data without worrying about the underlying storage, simplifying deployment across different cloud platforms.

In short, SQL means simplicity, efficiency, and scalability.
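For instance, a transformation can be prototyped and checked directly in the database, as described above. A minimal sketch using Python's built-in sqlite3 module (table and column names are purely illustrative):

```python
import sqlite3

# An in-memory database stands in for any SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'alice', 120.0),
        (2, 'bob',    80.0),
        (3, 'alice',  50.0);
""")

# The transformation itself is a plain SQL query:
# aggregate revenue per customer.
rows = conn.execute("""
    SELECT customer, SUM(amount) AS revenue
    FROM orders
    GROUP BY customer
    ORDER BY customer
""").fetchall()

print(rows)  # [('alice', 170.0), ('bob', 80.0)]
```

The same query text could then be moved, largely unchanged, to Redshift, BigQuery, or Azure SQL.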

 

“The world of tomorrow” is largely focused on SQL, as evidenced by the massive return of SQL and its “dialects” for data processing among the hyperscalers: Redshift SQL at AWS, BigQuery SQL at GCP, or T-SQL in Azure. As is often the case, we tend to reinvent what we had somewhat forgotten 😊.

 

But SQL still has the limits it had before:

  • You can end up with an abundance of code that is difficult to maintain.
  • And the larger the system, the more expertise is required, which runs counter to domain-oriented architectures (“data mesh”) that aim to give business teams intensive and extensive use of data.
  • ... 
 

And SQLOps-oriented solutions?

SQLOps-oriented solutions encapsulate SQL.

They allow data analysts to write SQL code to define transformations, and to test, document, and execute them in a reproducible manner.

In short, they offer functionalities that combine the simplicity of SQL with a real control tool.

The best-known of these solutions are dbt, dbForge Studio, and Dataform from Google.

The main attributes of these tools:

  • They promote a modular approach to data development, allowing the creation of reusable packages. This simplifies the maintenance, updating, and management of data transformations.
  • Documentation is automatically generated from SQL query metadata.
  • These tools integrate automated testing functionality.
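The principle behind those automated tests is simple: each test is a query that must return zero rows to pass (this is how dbt's generic tests such as not_null and unique work under the hood, generated from declarative configuration). A sketch of the idea in Python with sqlite3, using hypothetical table and test names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, email TEXT);
    INSERT INTO customers VALUES (1, 'a@x.io'), (2, 'b@x.io');
""")

# Each data-quality test is a SELECT that must return no rows.
tests = {
    "id_not_null": "SELECT * FROM customers WHERE id IS NULL",
    "id_unique": "SELECT id FROM customers GROUP BY id HAVING COUNT(*) > 1",
}

# A test passes when its query finds no offending rows.
results = {name: len(conn.execute(sql).fetchall()) == 0
           for name, sql in tests.items()}
print(results)  # {'id_not_null': True, 'id_unique': True}
```

SQLOps tools add scheduling, lineage, and reporting around this same mechanism.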

 

...

 
 
 
 
 
 

Migrate?

Whether you want to migrate to plain SQL, to a SQL “dialect”, or to a dbt-type solution, you will need to be able to “deconstruct” everything that your ETL/ELT actually embeds and transpose it into the target technology. That can be very complicated, very long, and very expensive.

We offer a two-step method:

1. Simplify the source system: 

The granular data lineage of {openAudit}, our software, coupled with information-usage analysis, makes it possible to identify and discard “dead branches”, i.e. data pipelines that are continually replayed but produce data that is never consulted. These often represent more than 50% of a system.

 

Also see: 

Lower Cloud Costs

2. “Translate” ETL jobs: 

The very strong introspection capacity of {openAudit}, in particular with regard to the ETL/ELT present in legacy systems (DataStage / BODS / SSIS / Talend, etc.), makes it possible to build an exhaustive, dynamic inventory of all the actions performed on the data.

  • {openAudit} will “translate” these statements into plain SQL;
  • This SQL will be enriched so that it adheres as closely as possible to the target technology (BigQuery, Redshift, Azure SQL, etc.);
  • For a migration to dbt (for example), {openAudit} will encapsulate this SQL in dbt jobs.
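To illustrate the last step: encapsulating a translated SQL statement in dbt essentially means writing it to a model file in the project's models/ directory, with upstream dependencies expressed through dbt's ref() macro. A hypothetical sketch (file layout, model names, and the SQL itself are assumptions for illustration, not {openAudit}'s actual output):

```python
import tempfile
from pathlib import Path

# SQL as it might come out of an ETL-to-SQL translation step;
# {{ ref('stg_orders') }} is dbt's way of referencing an upstream model.
translated_sql = """\
SELECT customer_id, SUM(amount) AS total_amount
FROM {{ ref('stg_orders') }}
GROUP BY customer_id
"""

# A dbt model is simply a .sql file inside the project's models/ directory.
project = Path(tempfile.mkdtemp())
model_path = project / "models" / "fct_customer_totals.sql"
model_path.parent.mkdir(parents=True)
model_path.write_text(translated_sql)

print(model_path.name)  # fct_customer_totals.sql
```

Once the file is in place, dbt handles dependency ordering, execution, and documentation for the migrated flow.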

All of these migration mechanisms will be automated. 

Furthermore, views, views of views, dynamic procedures, and flows managed with procedural code will be reproduced in the target, following the structure of the source.

 
 

Conclusion 

 

SQL plays a central role in the Cloud environment, both at the database level and in the evolution of ETL methods, towards more flexible approaches adapted to the Cloud.

Judicious use of SQL dialects specific to cloud databases and adoption of SQLOps-oriented tools can help optimize data management in these dynamic environments.

Many would like to move in this direction, but migration is not simple, unless the processes are automated.

 

Also see: 

Migrate from SAP BO to Power BI 
