The first action to modernize an Information System: Eliminate unnecessary pipelines?


A data system could be compared to a vast road network, made up of various paths, each created to meet a specific need at a given time.


As this network expands and ages, some paths (data pipelines) become underutilized or unused (duplicated, obsolete).

The financial and organizational impacts are numerous: 

  • According to NTT, 60% of data in the Cloud is not used (source: IT Social).
  • According to Civo, for almost half of companies with more than 500 employees, the annual cost of the Cloud exceeds a million dollars, with growth rates that are difficult to sustain (source: ChannelNews).

 

The origin of these useless data pipelines, these “ghost paths”?

Over time, Information Systems accumulate pipelines that have become useless:

  • Pipelines created for projects that have since been abandoned.
  • Duplicate pipelines built for lack of coordination between departments; the advent of "data mesh" architectures appears to be a major accelerator of this trend.
  • Obsolete pipelines kept as a precaution ("you never know!") or to cover a perceived risk.


These “ghost paths” consume significant Cloud resources (storage, processing, bandwidth) that could be put to better use!

We have made progress on a software response that makes it possible to remedy this natural drift, a drift as old as physics itself: entropy, i.e. the "degree of disorder reflecting the natural tendency of things to evolve towards a state of chaos".

 

This drift is not inevitable. However, it is a race against time, because systems have such a strong inclination towards entropy that only industrialized mechanisms can cope with it.

 

This response is one of the features of {openAudit}, built on two mechanisms:

 

Technically and continuously identify pipelines to be decommissioned

It is possible to precisely map these complex tangles and identify unused pipelines. 

This approach requires two coordinated technical actions, both provided by our {openAudit} software:

Analysis of data usage: identifying “informational dead ends”.

 

  • {openAudit} analyzes the main technical stack to identify all the data consumed inside and outside the batch chains.
  • Data consumed by satellite applications (those that are not parsed) is also analyzed, to capture the full set of useful information.
  • This dual analysis can be subtle and is configured to reflect the business target: regulatory information, for example, may be consumed only periodically while still carrying significant added value.

 

Through a "mirror analysis", informational dead ends are identified factually and continuously.
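Conceptually, this mirror analysis amounts to comparing the inventory of datasets produced by the batch chains with the datasets actually observed as consumed, whether by downstream jobs or by satellite applications. The sketch below is a minimal illustration of that idea, assuming the inventory and observed reads have already been extracted (for example from a catalog and from query logs); the dataset names, fields and helper function are hypothetical, not {openAudit}'s internals.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dataset:
    name: str            # e.g. "sales.daily_agg"
    domain: str          # business line, reused later for prioritization
    regulatory: bool     # regulatory data may be read rarely yet must be kept

def find_dead_ends(produced, reads_from_jobs, reads_from_satellites):
    """Return datasets that are produced but never observed as consumed."""
    consumed = reads_from_jobs | reads_from_satellites
    dead_ends = []
    for ds in produced:
        if ds.name in consumed:
            continue            # still read somewhere: keep it
        if ds.regulatory:
            continue            # business rule: keep regulatory data regardless
        dead_ends.append(ds)
    return dead_ends

# Hypothetical inventory and observed reads.
produced = [
    Dataset("sales.daily_agg", "sales", regulatory=False),
    Dataset("risk.bcbs239_report", "risk", regulatory=True),
    Dataset("legacy.customer_export", "crm", regulatory=False),
]
batch_reads = {"sales.daily_agg"}
satellite_reads = set()

for ds in find_dead_ends(produced, batch_reads, satellite_reads):
    print("candidate dead end:", ds.name)   # -> legacy.customer_export
```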

Data Lineage: trace flows to isolate unnecessary chains

 

  • Data lineage makes it possible to trace a pipeline upstream from unused data to the first table that also feeds information consumed in another branch.
  • From that point, the unnecessary fraction of the chain can be deleted without impact (see the sketch below).
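Here is a minimal sketch of that upstream walk, assuming the lineage has already been extracted as a directed graph of "source feeds target" edges; the tables and graph below are illustrative, not the actual {openAudit} model. Starting from an unused terminal table, the walk climbs upstream and stops at the first node that also feeds a branch still being consumed; everything below that node is a candidate for decommissioning.

```python
from collections import defaultdict

# Hypothetical lineage: each edge means "source feeds target".
edges = [
    ("raw.orders", "staging.orders"),
    ("staging.orders", "mart.orders_live"),     # consumed branch
    ("staging.orders", "mart.orders_legacy"),   # unused branch
    ("mart.orders_legacy", "export.orders_old"),
]
consumed = {"mart.orders_live"}  # tables observed as read by users or apps

parents = defaultdict(set)   # target -> its sources
children = defaultdict(set)  # source -> its targets
for src, dst in edges:
    parents[dst].add(src)
    children[src].add(dst)

def feeds_consumption(node, seen=None):
    """True if this node, or anything downstream of it, is still consumed."""
    seen = seen or set()
    if node in consumed:
        return True
    seen.add(node)
    return any(feeds_consumption(c, seen) for c in children[node] - seen)

def removable_chain(unused_leaf):
    """Walk upstream from an unused leaf; collect nodes that are safe to remove."""
    removable, stack = set(), [unused_leaf]
    while stack:
        node = stack.pop()
        if feeds_consumption(node):
            continue  # this node also serves a live branch: stop here
        removable.add(node)
        stack.extend(parents[node] - removable)
    return removable

print(removable_chain("export.orders_old"))
# -> {'export.orders_old', 'mart.orders_legacy'}; staging.orders is kept
```

In a real system the same walk would of course run over the full, automatically extracted lineage rather than a hand-written edge list.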

Clean the Information System

 

{openAudit} runs continuously, which makes it possible for internal teams to organize the decommissioning of all unnecessary flows over an extended period.

Candidate pipelines can also be classified by business line, tool, or other criteria, to prioritize the process.
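As a simple illustration of such a classification, the candidates identified earlier can be grouped by business line (or by tool) and ranked, for example by estimated monthly cost, so that internal teams tackle the most expensive "ghost paths" first; the fields and figures below are hypothetical.

```python
from collections import defaultdict

# Hypothetical decommissioning candidates produced by the previous steps.
candidates = [
    {"pipeline": "legacy.customer_export", "domain": "crm",   "tool": "ETL-A", "monthly_cost": 1200},
    {"pipeline": "mart.orders_legacy",     "domain": "sales", "tool": "ETL-A", "monthly_cost": 300},
    {"pipeline": "export.orders_old",      "domain": "sales", "tool": "ETL-B", "monthly_cost": 150},
]

# Group by business line, then rank groups by total estimated cost.
by_domain = defaultdict(list)
for c in candidates:
    by_domain[c["domain"]].append(c)

ranking = sorted(by_domain.items(),
                 key=lambda kv: sum(c["monthly_cost"] for c in kv[1]),
                 reverse=True)

for domain, items in ranking:
    total = sum(c["monthly_cost"] for c in items)
    print(f"{domain}: {total} $/month across {len(items)} pipeline(s)")
```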

 

 

Modeling a harmonious system

We are currently developing an algorithm, which we have named "Harmony", that will automatically model a system so that it is as rational and efficient as possible, even when many proprietary technologies are at work (ETL, dataviz tools). More news to come!

And if you would like to discuss automated migration topics around these or other themes, we will be happy to do so.

 

 
