Reverse engineering, for true data observability

Reverse engineering is “the process of deconstructing a product or system to understand how it works and improve it”. It is the essential prerequisite for true observability. 

 

Reverse engineering can be applied to various fields. 

 

We are particularly interested in everything relating to information systems, and especially in data, together with everything that makes it usable: storage, processing, and exposure.

 

Why?

 

Because data is the fastest-growing and most heterogeneous part of information systems, and therefore the most complex.

 

Moves to the Cloud have required rapid modernization, but they have not slowed this continuous march toward hyper-complexity.

 

"Modernization" projects, reduction of technical debt, or simple knowledge sharing should all rely on solid data observability, enabled by technical reverse engineering of the systems involved.

 
 
 

For objective observability, exhaustive reverse engineering

 

We cannot claim to master something we only half-observe.

Reverse engineering of the data system must be exhaustive!

 

This approach should create overall coherence and make the reading of the system technology-agnostic.

It seems essential to us, at a minimum, to analyze the five technical stacks below:

  • Data inventory: physical data, persisted or in memory, views, reports...
  • Log analysis: to understand data consumption and injection.
  • Parsing the scheduler: to understand job orchestration.
  • Reverse engineering the code: to generate a data lineage (see the sketch after this list).
  • Introspection of the data visualization layer: to link technical and business information, and to consolidate intelligence (business rules).
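
As an illustration of the code-analysis step, here is a minimal sketch of table-level lineage extraction from simple SQL scripts. The regexes and table names are hypothetical; an industrial pass would rely on a full SQL parser handling dialects, CTEs, and subqueries.

```python
import re

# Minimal sketch: derive table-level lineage edges from simple
# "INSERT INTO ... SELECT ... FROM/JOIN ..." statements.
# Illustrative only; real reverse engineering needs a full SQL parser.

TARGET = re.compile(r"insert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCES = re.compile(r"(?:from|join)\s+([\w.]+)", re.IGNORECASE)

def lineage_edges(sql_script: str) -> set[tuple[str, str]]:
    """Return (source_table, target_table) pairs found in the script."""
    edges = set()
    for statement in sql_script.split(";"):
        target = TARGET.search(statement)
        if not target:
            continue
        for source in SOURCES.findall(statement):
            edges.add((source.lower(), target.group(1).lower()))
    return edges

if __name__ == "__main__":
    script = """
    INSERT INTO dwh.customer_agg
    SELECT c.id, SUM(o.amount)
    FROM stg.customers c
    JOIN stg.orders o ON o.customer_id = c.id
    GROUP BY c.id;
    """
    for src, dst in sorted(lineage_edges(script)):
        print(f"{src} -> {dst}")
```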

 

For objective observability, continuous reverse engineering

 

These analyses should be carried out continuously to guarantee an objective vision: what I "observe" must be an exact reflection of reality.

 

The volumes to be analyzed are often so large that shortcuts are needed to make these daily analyses feasible, for example a delta analysis or CDC ("Change Data Capture"); a watermark-based sketch follows.
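
Here is a minimal sketch of the watermark variant of delta analysis, assuming a hypothetical orders table with an updated_at column. Real CDC tools read the database's transaction log instead, but the watermark idea is the simplest illustration.

```python
import sqlite3

# Minimal sketch of a delta analysis: only rows modified since the last
# run are re-analyzed, using an "updated_at" watermark column. The table
# and column names are hypothetical.

def fetch_delta(conn: sqlite3.Connection, last_run: str) -> list[tuple]:
    """Return only the rows changed since the previous analysis run."""
    cursor = conn.execute(
        "SELECT id, payload, updated_at FROM orders WHERE updated_at > ?",
        (last_run,),
    )
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, payload TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "old", "2024-01-01T00:00:00"), (2, "new", "2024-06-01T12:00:00")],
)

# Only the row touched after the last run is pulled for re-analysis.
print(fetch_delta(conn, last_run="2024-03-01T00:00:00"))
```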

 

The virtues of reverse engineering in a data observability framework

 

Observability and Compliance

With the advent of the General Data Protection Regulation (GDPR), precise mapping and reinforced controls over processing involving personal data are now essential. From this perspective, observability based on continuous analysis of processes acts as a facilitator: it makes it possible to represent the different data production processes and to identify processing errors in the chains, especially since investigations can be triggered very simply by teams without technical knowledge.
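
To make this concrete, here is a minimal sketch that cross-references columns tagged as personal data with process-level lineage edges to list every process touching PII. The tags, process names, and tables are hypothetical illustrations.

```python
# Minimal sketch: cross-reference columns tagged as personal data with
# lineage edges to list every process that touches PII. All names here
# are hypothetical.

PII_COLUMNS = {"crm.customers.email", "crm.customers.birth_date"}

# (process, input_column, output_column) triples, e.g. from code analysis.
PROCESS_EDGES = [
    ("load_dwh", "crm.customers.email", "dwh.contacts.email"),
    ("build_kpi", "dwh.sales.amount", "mart.kpi.revenue"),
]

def processes_touching_pii(edges, pii):
    """Report processes whose inputs or outputs carry tagged columns."""
    hits = {}
    for process, source, target in edges:
        if source in pii or target in pii:
            hits.setdefault(process, []).append((source, target))
    return hits

for process, columns in processes_touching_pii(PROCESS_EDGES, PII_COLUMNS).items():
    print(process, columns)
```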

 

Observability and Security

Reverse engineering helps identify and correct weaknesses and vulnerabilities in the design or security of data systems. 

If a sensitive technical table with an abstract name broadcasts its data into a system, and that data ends up being queried by an unauthorized person, no one will know anything about it.

Having a precise flow map for sourcing or impact analysis helps in this DLP (Data Loss Prevention) context, as in the traversal sketch below.
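
Here is a minimal sketch of impact analysis as a breadth-first traversal of a table-level lineage graph: starting from a sensitive table, it lists every downstream asset its data can reach. The graph content is hypothetical.

```python
from collections import deque

# Minimal sketch: impact analysis as a breadth-first traversal of a
# lineage graph. Asset names are hypothetical.

LINEAGE = {
    "tech.t_ref_042": ["dwh.customers"],          # abstractly named source
    "dwh.customers": ["mart.marketing", "export.csv_feed"],
    "mart.marketing": ["dataviz.campaign_report"],
}

def downstream(graph: dict, start: str) -> set[str]:
    """Return every asset reachable from `start` through lineage edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Where could data leaking from the sensitive table end up?
print(sorted(downstream(LINEAGE, "tech.t_ref_042")))
```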

 

Observability and Governance

Robust reverse engineering provides the entire company with a detailed repository describing data flows, as well as a shared vision of the system architecture, thus promoting the development of effective data governance and a data quality strategy. This in-depth knowledge of the system is also very useful for optimally architecting projects:

  • By highlighting strategic flows and all their dependencies.
  • By offering a valuable tool for designing an optimal architecture through the simplification of information systems: coupling the definition of information uses with the analysis of processing (data lineage) makes it possible to identify useless data flows, prior to large decommissioning operations (see the sketch below).
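
As a sketch of that decommissioning step, the snippet below flags candidate flows by combining lineage with observed usage: an asset that feeds nothing downstream and that nobody has queried recently is a candidate. All names are hypothetical.

```python
# Minimal sketch: flag decommissioning candidates from lineage + usage.
# Asset names and the usage set are hypothetical illustrations.

LINEAGE = {
    "stg.orders": ["dwh.orders"],
    "dwh.orders": ["mart.sales"],
    "mart.sales": [],                    # leaf, but actively queried
    "dwh.legacy_extract": [],            # leaf, and nobody queries it
}
RECENTLY_QUERIED = {"mart.sales"}        # e.g. from log analysis

def decommission_candidates(graph: dict, used: set[str]) -> list[str]:
    """Assets with no downstream consumers and no recent direct usage."""
    return sorted(
        asset
        for asset, consumers in graph.items()
        if not consumers and asset not in used
    )

print(decommission_candidates(LINEAGE, RECENTLY_QUERIED))
# -> ['dwh.legacy_extract']
```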
 

Observability and Technological Obsolescence

Reverse engineering brings lower technical layers back to light by unearthing their specifications and sharing them.

This can help extend the life cycle of certain tools and reduce costs. For example, if you have an intelligible and widely shared reading of the entire COBOL layer, migrating away from it becomes potentially less urgent.

 

Observability and Creativity

By drawing inspiration from the best practices and innovations of others, observability promotes the renewal of good practices. Because we only invent what we have forgotten! 

By analyzing historical processes, we can draw on old (good) ideas to generate new methods.

 

Observability and Maintenance

Reverse engineering makes it possible to analyze failures by revealing the root causes of malfunctions. This analysis helps prevent future failures and improve system quality. Intervening on the root cause of a failure is greatly facilitated by making a map of the information system available to as many people as possible.

 

Observability and Migrations

Solid reverse engineering based on real processes provides an understanding of what is handled by each of the technologies involved: ETL, procedural code, schedulers, data visualization tools, etc. This opens the door to technological re-platforming onto third-party solutions, typically in the Cloud, while addressing the thorny issues of dependencies and security. The targets can be databases or data visualization tools.

 

Example: migrating SAP BO to Power BI on a fixed-price basis.
 

Conclusion 

 

Reverse engineering proves to be an essential tool for guaranteeing optimal data observability.

Ellipsys is a tech company specializing in the reverse engineering of complex systems: sharing understanding of them, simplifying them, and automating technical migrations.

 
