GDPR: Auditing personal data through traceability?




On May 25, 2018, the GDPR (General Data Protection Regulation) became applicable. As everyone knows, its main objective is to give EU citizens back control over their personal data.


A pitfall: the profusion and heterogeneity of personal data


Since 2018, holding personal data requires a legal basis. In practice this usually means either a contract with the data subject (!) or their consent, although the GDPR also lists other bases such as legal obligation or legitimate interest. Moreover, when the legal basis is "consent", that consent must be specific: the data subject must state which uses of their personal data they have agreed to.


Personal data is defined very broadly. Article 4 of the GDPR defines it as any information relating to "an identified or identifiable natural person", that is to say a person "who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person". Wow!


As a result, companies store astronomical volumes of data without really knowing where it comes from, where it goes, or who consults it…


Personal data audit


Open any book or website on implementing the GDPR: it will start by saying that Step #1 to becoming GDPR compliant is an audit of personal data.


This audit tells you what personal data you hold. It requires you to document what data you hold, where it is held, what it is used for, who accesses it, and so on. The general observation is that personal data tends to proliferate in internal systems. It also flows externally: companies constantly exchange data, but rarely keep track of what information has been shared, and with whom. Third parties may also access a company's network. Recently it was established that the average corporate network was accessed by 89 external parties every week!
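To make the audit questions concrete, here is a minimal Python sketch of what one inventory record could look like. The field names, systems and parties are purely illustrative assumptions, not a prescribed GDPR format; a real record of processing activities will contain more.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryEntry:
    """One row of a personal data inventory (all field names are illustrative)."""
    data_item: str      # what is held, e.g. "customer email"
    system: str         # where it is held
    purpose: str        # what it is used for
    legal_basis: str    # e.g. "contract" or "consent"
    accessed_by: list[str] = field(default_factory=list)  # internal teams
    shared_with: list[str] = field(default_factory=list)  # external parties


def external_recipients(inventory: list[InventoryEntry]) -> dict[str, set[str]]:
    """Map each external party to the data items shared with it."""
    recipients: dict[str, set[str]] = {}
    for entry in inventory:
        for party in entry.shared_with:
            recipients.setdefault(party, set()).add(entry.data_item)
    return recipients


# Hypothetical sample data: the same item appears in several systems.
inventory = [
    InventoryEntry("customer email", "CRM", "order follow-up", "contract",
                   accessed_by=["sales"], shared_with=["mailing provider"]),
    InventoryEntry("customer email", "test database", "testing", "unknown",
                   accessed_by=["developers"]),
    InventoryEntry("delivery address", "ERP", "shipping", "contract",
                   accessed_by=["logistics"],
                   shared_with=["carrier", "mailing provider"]),
]

print(external_recipients(inventory))
```

Even in this toy inventory, the same email address lives in two systems, one of them a test database with no documented legal basis: exactly the kind of proliferation the audit is meant to reveal.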


The key is traceability!


Data frequently passes through different systems. Typically, you will need to know whether you can follow every piece of personal data over time, from your data visualization tools down to all the technical layers that feed them. If a person wishes to correct inaccurate personal data, will this correction be replicated in all systems? Does this also hold for development and test environments?


An audit can tell you what you have… at the time of the audit. But an audit is a "one shot": as soon as it is done, it is obsolete. As data moves through the systems, you will need to know what changes have been made and how they have propagated. Good luck to the auditors!


Do you think you will really be able to answer these two questions at all times: where does the data come from, and where is it used? And the answers must reflect the situation at any given time t.
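One way to picture these two questions is as a graph of data flows, each with a validity period. The Python sketch below is only an illustration under that assumption: the `Flow` structure, the `reachable` function and the system names are hypothetical, not the API of any particular lineage tool.

```python
from datetime import date
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    """A directed data flow, valid from `start` until `end` (None = still active)."""
    source: str
    target: str
    start: date
    end: Optional[date] = None

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day < self.end)


def reachable(flows, origin, day, downstream=True):
    """Return every node reachable from `origin` through flows active on `day`.

    downstream=True answers "where is the data used?";
    downstream=False answers "where does it come from?".
    """
    seen, todo = set(), [origin]
    while todo:
        node = todo.pop()
        for f in flows:
            if not f.active_on(day):
                continue
            here, there = (f.source, f.target) if downstream else (f.target, f.source)
            if here == node and there not in seen:
                seen.add(there)
                todo.append(there)
    return seen


# Hypothetical flows between illustrative systems.
flows = [
    Flow("web_form", "crm", date(2018, 1, 1)),
    Flow("crm", "warehouse", date(2018, 1, 1)),
    Flow("warehouse", "dashboard", date(2019, 6, 1)),
    Flow("crm", "test_db", date(2018, 1, 1), date(2020, 1, 1)),
]

print(reachable(flows, "crm", date(2019, 1, 1)))
print(reachable(flows, "dashboard", date(2021, 1, 1), downstream=False))
```

The same question asked about two different dates gives two different answers, which is why a one-off audit cannot replace a lineage that is kept up to date.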


Data lineage


There is a well-established technique for tracing information through systems: it is called "data lineage". It comes with three conditions, however:


  • Lineage only makes sense if the uses of the information are defined: the "official" uses, but also the "ad hoc" ones. The data lineage must therefore be able to identify uncontrolled "scraping" of personal data within the company, across a perimeter that extends beyond IT.
  • If the mapping is not updated continuously, it will not be usable. The scan of the information system must therefore be continuous.
  • The business lines, where the "Compliance", "Security" and DPO teams generally sit, must be able to exploit the answers produced. They need a mapping that is as non-technical and as intelligible as possible. The "business" terms used in the lower layers (ERP, CRM) or the upper layers (data visualization) should therefore be propagated throughout the mapping.
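
The three points above can be sketched together in a few lines of Python: comparing two successive scans, flagging flows that match no declared use, and showing the result with business terms rather than technical names. Everything here (the `compare_scans` function, the table names, the glossary) is a hypothetical illustration, not the behavior of a specific product.

```python
def compare_scans(previous, current, declared_uses, glossary):
    """Compare two scans, each a set of (source, target) flows.

    Returns (added, removed, undeclared), with technical names replaced by
    business terms wherever the glossary has one.
    """
    def label(flow):
        return tuple(glossary.get(name, name) for name in flow)

    added = current - previous           # flows that appeared since the last scan
    removed = previous - current         # flows that disappeared
    undeclared = current - declared_uses  # flows with no documented use
    return (
        sorted(label(f) for f in added),
        sorted(label(f) for f in removed),
        sorted(label(f) for f in undeclared),
    )


# Hypothetical scans: an ad hoc spreadsheet export appears in the second one.
previous_scan = {("T_CUST", "DWH_CUST"), ("DWH_CUST", "RPT_SALES")}
current_scan = previous_scan | {("DWH_CUST", "export.xlsx")}
declared_uses = {("T_CUST", "DWH_CUST"), ("DWH_CUST", "RPT_SALES")}
glossary = {
    "T_CUST": "Customer (CRM)",
    "DWH_CUST": "Customer (warehouse)",
    "RPT_SALES": "Sales dashboard",
}

added, removed, undeclared = compare_scans(
    previous_scan, current_scan, declared_uses, glossary
)
print(added, removed, undeclared)
```

Run on every scan, a comparison of this kind is what turns a static map into a continuous one, and the glossary is what makes its output readable by a DPO rather than only by IT.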


Conclusion

A multi-technology data lineage solution that defines the uses of information and analyzes changes in data flows (deltas) will be a powerful lever for GDPR compliance. This goes beyond the traditional GDPR "checklist" (freely given consent, record of processing activities, etc.): it is a holistic approach to the subject, with answers that follow the GDPR framework, and it brings the company into the world of data governance!




#GRDP #GDPR #datalineage


ellipsys@ellipsys-bi

www.ellipsys-bi.com
