Making Data Anonymous: Data Protection Law & Rights in Detail 🛡️


On what basis is it possible to anonymize personal data, and what does anonymization actually mean? These questions are particularly important in the context of Big Data and Artificial Intelligence.

Introduction

The General Data Protection Regulation (GDPR) only applies to personal or personally identifiable data. Article 4 No. 5 GDPR defines pseudonymisation as processing personal data in such a way that they can no longer be attributed to a specific person without the use of additional, separately kept information. Pseudonymous data is therefore also covered by the GDPR, because a person can still be inferred from it: it remains personally identifiable.

The concept of identifiability ("Personenbeziehbarkeit") must be understood in a very broad sense. This is evident, inter alia, from the ruling of the EU Court of Justice of 19 October 2016 – C-582/14 (Breyer), according to which even dynamic IP addresses are to be considered personal (because identifiable) data. The German Federal Court of Justice confirmed this.

My thesis: The anonymization of personal data for legitimate purposes is always permissible.

My thesis, for which I provide arguments in this article.

The European Court of Justice ruled that data can be considered personal even if, in theory, it would be possible to identify a person from a data point by involving multiple third parties, such as intelligence agencies or telecommunications providers.

Therefore, the GDPR applies to all data that relates to an individual or, with reasonable and objectively feasible efforts, could be related to an individual.

Anonymous and anonymized data

Anonymous data are by definition not personally identifiable and cannot be linked to a person.

The GDPR does not apply to anonymized data.

Anonymous data cannot be traced back to a specific person.

Big Data applications, which also include Artificial Intelligence, make it possible in individual cases to infer a single person with a certain probability by evaluating a huge data pool. This can only succeed, however, if a register of persons is available. If there is no register of persons, it logically follows that no personal reference can be derived from (actually or securely) anonymous data.

Data that has been anonymized can be called anonymized data; it originated from personal data. The following table clarifies the difference.

Data type       | Data source
Anonymous data  | Random or similar
Anonymized data | Personally identifiable data
Distinction between anonymous and anonymized data.

Ultimately, terminology is irrelevant if anonymization has been carried out perfectly. Theoretically, it is still conceivable that an anonymization procedure is not mathematically flawless, because it leaves open numerical or stochastic possibilities to undo or crack the supposed anonymization.

This text does not delve into the question of when anonymization is perfect. Instead, it focuses on whether the anonymization process is compatible with the General Data Protection Regulation (GDPR). Before doing so, examples will clarify the difference between pseudonymous and anonymous data.

Examples of pseudonymous, anonymized, and anonymous data

Pseudonymous data are data that are not directly personally identifiable but can be related to individuals by adding further information. This is my informal explanation of a definition that is given somewhat differently in Article 4 No. 5 GDPR but is, in my opinion, essentially the same.

For simplicity's sake, pseudonyms and pseudonymous data will be treated as equivalent here. There may not even be a difference between these two terms, which I haven't investigated further and, for now, consider irrelevant.

As a starting point for pseudonymous data in this example, a personal datum is used, namely the exemplary IP address 192.168.66.77. The following table shows several ways to store a piece of information related to the IP address that can be evaluated later.

Type of data value | Example value | Derivation rule
Pseudonym | 4711 | Generate a hash value from the input value and replace the input value with it
Pseudonym | 192.168.66.x | Remove the last byte from the input value
Pseudonym | 0815 | Assign a random value to the input value and store the input value and the random value in separate databases
Anonymized | 0815 | Assign a random value to the input value and send the random value to a third party who processes it. Ideally, the data is anonymous at the third party (data importer), while at the data exporter the data remains personal
Anonymous | 0815 | Generate a random value that is completely independent of the input value. The input value can persist for legitimate purposes
Examples of pseudonymous, anonymized, and anonymous data.
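The derivation rules from the table can be sketched in code. The following is an illustrative sketch, not part of the original article; the choice of SHA-256, the token lengths, and the variable names are my assumptions.

```python
import hashlib
import secrets

ip = "192.168.66.77"  # the exemplary IP address from the table

# Pseudonym via hashing: deterministic, so anyone who can re-hash
# candidate inputs can link the value back to the IP (still personal).
pseudonym_hash = hashlib.sha256(ip.encode()).hexdigest()

# Pseudonym via truncation: remove the last byte of the address.
pseudonym_truncated = ".".join(ip.split(".")[:3]) + ".x"

# Pseudonym via random mapping: the table linking input value and
# random value is kept in a separate database, so the mapping can
# be undone by whoever holds that database.
random_value = secrets.token_hex(2)
separate_db = {random_value: ip}  # stored apart from the main data

# Anonymous value: random and completely independent of any input;
# no stored link back to the IP address exists.
anonymous_value = secrets.token_hex(2)

print(pseudonym_truncated)  # 192.168.66.x
```

The decisive difference is not visible in the values themselves but in the derivation rule and in whether a link back to the input value is retained, exactly as the table states.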

As the table demonstrates, you cannot tell directly from a value whether it is pseudonymous, anonymized, or anonymous. Rather, it depends on the derivation rule applied to the value. It is also crucial whether a link between the input value and the target value exists and whether this link can be reconstructed, for example by reading additional data or by recognizing the logic behind the (secretly kept) derivation rule.

For anonymized and anonymous data, it does not matter whether the originally collected personal data still exist or not. Anonymized and anonymous data are always without personal reference and can never be traced back to individuals. Data that is personally identifiable is by definition neither anonymous nor anonymized.

Perfectly anonymized data have the same data protection level as anonymous data

Logical deduction from the definition of the term "anonymization".

Anonymized data are anonymous data that were generated from collected personal or personally identifiable data. See also the definition of anonymization on Wikipedia. There, data still counts as anonymized even if tracing it back to individuals would be theoretically or practically possible, though only with very great and uneconomical effort.

In my view, anonymized data encompasses two perspectives:

  • Data exporter: the owner of personal data who has manipulated it. At the data exporter, personal data are present together with an additionally derived (manipulated) value. This manipulated value is sent to a third party in such a way that the third party cannot undo the manipulation.
  • Data importer: a third party (or also second party) that receives the processed data from the data exporter. The processed data allow no conclusion about a person; at the data importer, these data are therefore ideally anonymous. At the data exporter, the original data were or are personally identifiable.

For a more in-depth consideration of a practical example, see my article on the analysis tool Jentis.

Legal basis for anonymization of personal data

To be able to anonymize personal data, one must first obtain the personal data. Obtaining such data is, according to Article 4 No. 2 GDPR, a data processing operation. That merely being able to know data already constitutes data processing (provided an offer underlies it) is something I have investigated extensively in my article on the concept of data collection, where I also introduced the concept of the data container.

Anonymization that amounts to deleting or destroying personal data is examined here only with regard to the anonymization process itself; the deletion or destruction of the existing data is assumed to be permitted. See also Stürmer/Beck, ZD 2020, 626 on deletion through anonymization.

Article 6 GDPR provides the legal bases on which personal data may be processed. Processing within the meaning of Article 4 No. 2 GDPR is present when data are ([1])

  • obtained (have become known)
  • collected
  • organized
  • structured
  • stored
  • adapted
  • altered
  • retrieved
  • consulted
  • used
  • disclosed by transmission
  • disseminated
  • otherwise made available
  • aligned
  • combined
  • restricted
  • erased or
  • destroyed

At first glance, the processing operations that come into consideration for anonymization are use, alteration, erasure, and destruction. The last two are ruled out, because with anonymization the data continue to be used. Alteration does not quite fit either, because after anonymization the data no longer carry information about a person. The use of personal data, however, is certainly present and is, in my opinion, the only processing activity that comes into consideration for anonymization.

Special categories of data

If anonymization is a processing activity under the GDPR, then Article 9(1) GDPR must be taken into account where special categories of personal data are involved. However, Article 9(2) opens up the possibility of processing these special categories where the data subject has manifestly made the data public (letter e). Such public data could in any case be anonymized, even if critical data categories are involved. Letter g in turn enables anonymization for such data, possibly also where there is a substantial public interest involved; an example would be the evaluation of health data to mitigate a pandemic. Letter i could also be invoked in this regard.

Data anonymization for statistical data collection

Article 9(2)(j) GDPR explicitly permits the processing of special categories of data for statistical purposes, referring to Article 89(1) GDPR. Statistical data collection and the processing of anonymized data are also mentioned in Recital 26, where it is clarified that the GDPR does not apply to such processing.

In Recital 50, it is explicitly described again:

The further processing for […] scientific […] research purposes or for statistical purposes should be considered a permissible and lawful processing operation.

Excerpt from Recital 50 of the GDPR

Recital 50 explicitly states that statistical purposes allow for lawful processing. In my view, anonymizing personal data for scientific or statistical purposes is therefore always permissible, provided that the purposes of processing the subsequently anonymized data are legitimate.

Responsibility for data processing

Using the example of the IP address, which is necessarily transmitted every time a website is accessed, I would like to illustrate when responsibility arises from data processing.

When a user accesses a website, the operator of the website receives from the user their network address, i.e., their IP address. The network address is a personal data point. The operator of the website has therefore processed personal data of the user because

a) the website offers a service,

b) the user has accepted the offer,

c) the user thereby transmits their network address as personal data to the operator of the website,

d) the website operator received (collected) and thus processed the network address via their web server.

Therefore, every website must have a privacy policy in accordance with Article 12 GDPR. Exceptions are empty or almost empty websites that pursue no purpose. This legal provision is unfortunately unknown even to many data protection officers. One example illustrates this: I wrote to the responsible person, whom I once met in person, and explained the situation to him. He did not respond and left my inquiry unanswered. He knows my email address, because he included it in his mailing list, which he supplies with manually written emails containing information that also reaches me. Ultimately, I informed him that his website would be mentioned as a negative example in an article (this one).

Visiting a webpage therefore means that it is legally permissible for the operator to obtain the visitor's network address, because this is technically necessary.

Can this network address now be anonymized? Article 6 GDPR provides the legal bases for data processing. Here, Article 6(1)(f) GDPR, legitimate interest, is particularly relevant. I will come back to consent at a later point, for a different reason.

The relevant excerpt regarding legitimate interest states that data processing is lawful

…, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data.

Excerpt from Article 6 Paragraph 1 f GDPR

During anonymization, no new data requiring protection is created. Rather, the personal data already collected remains exactly as it is; it is neither passed on to third parties nor processed in a manner requiring protection.

According to my assessment, Article 6(1)f of the GDPR allows for the anonymization of personal data if there is a legitimate interest of the controller that justifies this. Such an interest could be the development of machine learning procedures or any other legitimate purpose that appears reasonable.

On the other hand, anonymization could be understood as a process that is not covered by data processing under the GDPR and therefore would not require any legal basis, so it would always be possible (at least if the purposes of processing the anonymous data are legitimate). See e.g. the essay Anonymisierung als datenschutzrelevante Verarbeitung? in the journal ZD 2020.

Purpose of Processing

The anonymization of the existing data is likely to pursue a different purpose than the original one. Article 6(4) GDPR states that it should then be checked,

…whether the processing for another purpose is compatible with the purpose for which the personal data were initially collected.

Excerpt from Article 6 Paragraph 4 GDPR

Article 5(1)(b) GDPR also establishes purpose limitation and a compatibility review. However, I see the purpose of anonymization as being derived from the purpose of using the anonymized data. And here I am of the opinion that any legitimate purpose for using anonymized data is sufficient to justify the anonymization, and thus the legitimate interest.

In my opinion, the purpose is irrelevant when it comes to anonymization, as long as the data controller has a legitimate interest under Article 6(1)(f) GDPR. The letters of Article 6(4) GDPR address:

  • Letter a: the connection between the purposes, namely the original purpose and the purpose of anonymization
  • Letter b: Relationship between the data subject and the controller
  • Letter c: Type of personal data
  • Letter d: Possible Consequences of the Intended Further Processing
  • Letter e: Existence of appropriate safeguards

In my view, the situation regarding anonymization is as follows:

  • Letter d: with perfect anonymization, there are per se no significant consequences
  • Letter e is irrelevant in the case of perfect anonymization
  • Letter c is likewise irrelevant in the case of perfect anonymization
  • Letter b may be relevant to the question of whether a legitimate interest exists. In the context of perfect anonymization, however, this letter may also be entirely irrelevant, which would need to be clarified by case law
  • Letter a applies, in my opinion, in the same way as letter b

Is anonymization possible with consent?

The logical answer is no. Pursuant to Article 7(3) GDPR, the data subject has the right to withdraw their consent at any time. Anonymization cannot be reversed, and consent to it therefore cannot meaningfully be withdrawn.

If anonymization could be reversed, it wouldn't be anonymization.

As long as a right of withdrawal must be granted, this question can be answered unequivocally. Even a deletion process cannot be undone, which is why consent as a legal basis for deleting data also falls short if a right of withdrawal is to be granted. This applies at least to perfect deletion. If backups still exist, it is at least doubtful whether a deletion has taken place at all; in any case, no perfect deletion would actually have been carried out. It may be considered acceptable for deleted data to remain in backups within the system. However, I suspect that if these backup data were to leak to third parties, they could become a major problem.

Consent without a right of withdrawal is conceivable in principle. At the very least, the data subject would have to be informed accordingly before giving consent. However, this contradicts either Article 7(3) GDPR or the fact that consent is being given to something that reduces or completely eliminates the person's status as a data subject.

Is anonymization a data modification or a data linkage?

The term data processing includes combining or altering personal data. Combining cannot occur during anonymization, because the data would then not be anonymous.

Nor is there fundamentally an alteration of personal data, because nothing of the personal or personally identifiable data remains.

In my opinion, there are at least two ways to achieve anonymization.

Anonymization when deleting personal data

If personal data is to be deleted, there is generally the opportunity to anonymize it first. This applies regardless of any legal grounds that might speak against it; I mean this only with regard to the possible process. If deletion is deemed appropriate or desired, anonymizing the data directly before deletion could be considered.

Anonymization by counting (statistical evaluation)

By anonymization through counting, I mean determining the number of persons exhibiting certain characteristics, starting from zero hits. An example of a characteristic is an age range, such as 40 to 49 years. With anonymization by counting, the stock of personal data can remain unchanged. However, it should be ensured that the data pool considered is large enough to avoid counts of individuals. Compare this with the concept of k-anonymity, which requires at least k individuals to be treated as a common mass in order to avoid stochastic conclusions about individual persons. See also Google FLoC, an unsuccessful approach to avoiding cookies.

If there is a data pool with 100,000 persons and an age specification per person, one can count how many persons fall into the mentioned age group. In the database, it is then noted that, for example, 27,583 persons are between 40 and 49 years old. Further characteristics can also be counted across all persons. However, this would be an example of anonymization for which (at least theoretically) a withdrawal would be possible. Yet a withdrawal could also bring dangers with it, because through withdrawals by several persons, the data pool could shrink to the point where perfect anonymization is no longer given. To counteract this, one could assign age groups in the anonymous data mass only for every x-th person, or one could evaluate only whole blocks of data records at once. Then a withdrawal would no longer be possible either; for anonymization, I generally consider withdrawal highly problematic and often theoretically unrealizable anyway.
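The counting approach can be sketched as follows. The data pool, the threshold k, and the function name are illustrative assumptions of mine; only the suppression of too-small groups reflects the k-anonymity idea discussed above.

```python
import random

# Hypothetical data pool: one age per person (illustrative only).
random.seed(42)
ages = [random.randint(18, 90) for _ in range(100_000)]

K = 10  # minimum group size, in the spirit of k-anonymity

def count_age_group(ages, low, high, k=K):
    """Count persons aged between low and high (inclusive); suppress
    the result if the group is so small that it approaches a count
    of individuals."""
    n = sum(low <= a <= high for a in ages)
    return n if n >= k else None  # None = suppressed

# Only the aggregate count is stored; the stock of personal data
# can remain unchanged, as described in the text.
persons_40_to_49 = count_age_group(ages, 40, 49)
```

With a pool of 100,000 persons, the 40-to-49 group is far above any sensible threshold; the suppression only kicks in when the pool shrinks, for example through the withdrawals discussed above.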

Discrimination through the evaluation of anonymous data is conceivable. For example, it could be determined that a certain age group is more susceptible to a particular disease. However, this would then be a factual finding (as long as the study is representative), of the kind that has also been made in various ways regarding Covid and that actually disadvantages many people in everyday life. That a person in the respective age group might object to the acquisition of such knowledge, which could possibly be detrimental to them, cannot be legally asserted (this is independent of the question of whether the process of data anonymization is permissible). I would simply consider it bad luck. After all, it is also personal bad luck for all children that they currently cannot receive Covid-19 vaccine protection.

Can anonymization always be permitted?

The question addresses whether anonymization of personal data is permissible regardless of the specific circumstances of the data controller, the affected person, and the purpose of the original, legitimate processing of the personal data.

I would like to invite you to a thought experiment, and remind you that great physicists used this very method to create the foundations of our current telecommunications, computers, smartphones, and satellite navigation.

Suppose you have lawfully obtained personal data from thousands of individuals. Now you anonymize this data and obtain a data pool consisting solely of anonymized data.

How can someone even gain knowledge that you possess anonymized data? The data is anonymous, after all, and cannot be traced back to a person! Nobody can complain to you, because the person in question was not knowingly affected. Although it may make no difference from a legal standpoint whether someone gains knowledge of an event or not, it remains the case that the process of anonymization could only be recognized by a so-called white-box insider. The term white box comes from computer science and means having intimate knowledge of a system, which can be investigated to understand how it works. This stands in contrast to a black box or grey box, which only allows observation from outside. The term insider was introduced by me in this context: normally, one speaks of a white-box test, but a test is something other than gaining knowledge. The term insider was therefore chosen to designate a person who has intimate knowledge of a system.

One could infer from Article 5(1)(a) GDPR that the affected persons must be notified of the anonymization of data. However, Article 5(1)(e) GDPR actually speaks in favor of anonymization, as it makes it possible to dispense with the original personal data immediately, or earlier than would otherwise be possible in the individual case. Letter f there also speaks in favor of anonymization, because data security is thereby increased by an infinite factor.

Let's assume a "data subject" now knows that their data has been anonymized. I won't delve into whether or not there is any impact on this person, or what kind of impact. Instead, I'm interested in whether and how the data belonging to this specific "data subject" could be removed from the pool of anonymized data. This question can be answered logically.

Even from a pool of anonymized data, individual data points derived from personal information cannot be selectively removed. If an "affected" person requested removal, the only available course of action would be to delete the entire data pool.

Let's say the pool of anonymized data has already been shared or combined with other anonymized data. Should all these further anonymized data pools, which originated from a million datasets of individuals, be deleted because one person has objected to the processing of their personal data? I argue that this is not the case.

Anonymization of IP addresses happens billions of times daily in Germany, without any information being provided to the user. I consider this legally compliant, for example when anonymously counting visitors to a website, at least as long as cookies are not used to prevent double-counting of the same person. Whether information obligations are violated when anonymization occurs without notice is not entirely clear to me, but it is a secondary question: the consequences of anonymization are manageable in any case.

It is worth noting that a visitor counter does not necessarily process personal data, and if it does, it may be for purposes related to those for which the data was originally collected.
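A visitor counter of the kind described can be sketched as follows: the network address is read only for technical delivery, and nothing but an aggregate count is ever stored. The function name is my illustrative assumption; whether such a counter processes personal data at all is exactly the question raised above.

```python
visitor_count = 0  # the only value that is ever stored

def record_visit(ip: str) -> None:
    """Increment the aggregate counter. The IP address is needed to
    deliver the page but is neither stored nor logged here; without
    cookies, repeat visits by the same person are simply counted again."""
    global visitor_count
    visitor_count += 1

record_visit("192.168.66.77")
record_visit("203.0.113.5")
print(visitor_count)  # 2
```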

Counterarguments

Following a discussion on XING, I would like to present some counterarguments that were raised to argue that anonymization is already covered by the data processing that is taking place.

Same legal basis?

As a counterargument, it was stated:

"_The legal basis for anonymization is the same as that on which the personal data were collected. For processing is a procedure or in most cases (there may be exceptions) a series of procedures

Whether a series of operations exists or not matters only if the purpose within that series remains the same. With anonymization, this is often not the case; just think of Big Data applications and statistical analyses for gaining insights.

Are data always stored somewhere?

The claim was that data would always be stored and therefore the process of anonymization was justified.

This is incorrect. Data processing under the GDPR already begins with data collection. Data collection, in turn, already exists where there is an objective possibility of obtaining knowledge of data. The prerequisite for this is that a message was sent to the data recipient on the basis of an offer by the data recipient. See my article on data collection.

Are data always being processed, used, or exploited in some way?

One part of the counter-argument was that data would always be processed, used, or exploited in some way. While I don't see this as a valid argument, I would like to respond to it.

Data are not always handled, used, or exploited. These terms are also partly irrelevant, because they are not part of the definition in Article 4 No. 2 GDPR. As written in the previous section, data may under certain circumstances never even become known; it is enough that they could become known. A "handling" or "use" of the data therefore sometimes does not occur at all. ([1])

Is deleting equivalent to anonymizing?

Clearly, deleting is different from anonymizing.

Anonymization is ultimately possible while retaining the original data, which does not appear to be the case with deletion.

When deleting data, storage space is released, whereas anonymizing data consumes additional or other storage space, depending on whether the original data remains intact or not.

Data anonymization involves significantly more effort than data deletion: to anonymize data, algorithms must be applied and mathematical procedures used to ensure the loss of personal reference.

Finally, from ordinary language usage alone one can conclude that deleting something is different from anonymizing it. It is true that in both processes, ideally, the personal reference is lost.

Does deleting data serve a different purpose than collecting it, and does that also apply to anonymization?

Deleting data always has a different purpose than the original purpose for which the data was collected. Therefore, one could conclude that anonymizing data for purposes other than the original data collection is also permitted.

I would say that is not true, because deletion is explicitly regulated in the GDPR. Deletion is required when the purpose for which the personal data was collected no longer applies; see, for example, Article 17 GDPR. A data subject can therefore obtain deletion by requesting it. Anonymization instead of deletion, however, cannot be demanded by a data subject. Article 17(3)(e), for example, limits the right to erasure where data are needed for the establishment, exercise, or defence of legal claims, whereas I see no possibility of restricting the anonymization of data, as long as anonymization is considered fundamentally permissible. This can be disputed, because anonymization will not always succeed perfectly; but I mean the undisputedly perfect anonymization.

The purpose of anonymization is the destruction of personal reference

That holds true only if the original data is deleted or destroyed during the anonymization process. Anonymization while retaining the original data does not change the overall degree of personal reference, considering the total data volume, provided the anonymization succeeds perfectly.

The purpose of anonymization, in my view, is rather to statistically evaluate data for specific purposes while safeguarding personality rights. What these purposes are depends on the type of evaluation or the goals pursued. I refer again to Big Data and Artificial Intelligence applications. For example, one could anonymize data from people infected with influenza to find out whether men contract a relevant illness more frequently than women. Entirely different data pools could be tapped for this purpose. The example is hypothetical and does not take into account that special categories of personal data enjoy special protection under Article 9 GDPR.

What responsibility arises from anonymization?

The question of responsibility for the anonymization process seems to me to be situated similarly to that of external links on websites. Someone who does not mark an external link as such could be held fully responsible for the resulting data processing by a third party. This can be limited if the link provider is not aware that the third party processes personal data unlawfully, namely by forwarding the visitor's network address and traffic data to another party. In most cases, probably not much happens. It looks different when the target of the link is being monitored by an intelligence agency.

With perfect data anonymization, nothing happens except a non-critical (statistical) evaluation without personal reference. Responsibility for this translates, mathematically speaking, into no liability. An affected person could certainly sue the party responsible for the anonymization process, but I question what the claim would be. It is unlikely that damages could be demanded; the plaintiff could sue for injunctive relief. Many things are conceivable here, but in my opinion this would lead to unproductive results and look more like suing for the sake of suing.

If the anonymization is not perfect, things may look different: in the strict sense, no anonymization would have occurred, which could make the data processing unlawful for lack of a legal basis.

Overall, I think responsibility stemming from anonymization will be extremely low or practically nonexistent. As long as there are millions of obvious data privacy violations on websites, it seems absurd to me, although arguably permissible, to focus narrowly on anonymization.

Conclusion

Especially for AI applications such as ChatGPT, other language models, or image processing procedures, as well as other analyses of larger data sets, the question of anonymizing data is relevant. In AI, there is also the copyright issue, which likewise raises the question of sufficient modification of data; anonymization can be seen as a modification of data.

The GDPR does not apply to anonymized data: once data has been anonymized, it falls outside the regulation's scope.

The process of anonymizing personal or personally identifiable data may be justified by a legal basis in accordance with Article 6(1) GDPR. In most cases, only legitimate interest comes into consideration. Article 6(4) GDPR helps to assess whether a legitimate interest exists. In addition, Recital 50 allows for the statistical processing of data without further purpose binding.

I believe it is possible for the legislature to enact a law that permits the anonymization of data. This would provide a legal basis in accordance with Article 6(2) of the GDPR. Whether this would be upheld would have to be examined by the ECJ in case of doubt. However, I believe such a law would be possible to create in order to provide clarity.

In my opinion, any perfect anonymization of legally obtained personal data is permissible, as long as the purposes pursued with the anonymous data are legitimate. For then there is always a legitimate interest in data anonymization, or Recital 50 applies, since a statistical processing takes place. I believe that even the authors of the aforementioned article from ZD 2020 see it this way, because after intensive consideration they write: "As long as – as is usually the case – there are no indications of conflicting interests of the data subject, the anonymization of personal data is therefore a permissible data processing operation under the GDPR."

I could even imagine that anonymizing illegally obtained personal data that is not specially protected is permissible under data protection law, because anonymization does nothing more than, at best, produce new insights (without personal reference). As written, this relates only to data protection, not to criminal law etc.

As far as I can judge, the retention of knowledge is generally not prohibited, and cannot be prohibited from a data protection perspective, if the underlying data basis is not protected. Here, too, I am concerned only with data protection, not with a criminal-law classification such as theft of documents.

If a new insight arises from evaluating classified documents, I consider that morally acceptable, provided there is no personal reference and the classified documents are withheld for purely strategic or political reasons. Legally, this may be assessed differently by intelligence agencies, etc., which raises the question of how legitimate the activities of individual intelligence agencies, and the laws that shape them, truly are. Criminal law may also assess this differently; I am speaking only about data protection under the GDPR.

I would be interested in your opinion on this, including any legal grounds that might contradict my arguments. I am marking this post as preliminary and do not yet consider all parts of it to be final. Update: Since publishing the post, no one has raised any serious objections. This at least suggests that there are no significant doubts at present.

Key takeaways of this article

Anonymous data may be processed without the consent of the data subject because it is no longer considered personal data.

Pseudonymous data can be traced back to individuals through additional information, while anonymous data can never be linked to specific people.

The perception of anonymized data can differ between the "data exporter" (the party that anonymizes the data) and the "data importer" (the party that receives the anonymized data).

The GDPR allows for the anonymization of data for scientific and statistical purposes, even if it involves sensitive data.

Perfect anonymization means that data is changed in such a way that it can no longer be attributed to a specific person.

While anonymized data can provide useful insights, there is a risk that individuals can still be identified from it.

Because anonymized data is often difficult to separate completely from the original data, it may not always be possible to fully remove information about an individual.

Deleting and anonymizing data are distinct processes with different purposes.
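The distinction drawn above between pseudonymous and anonymous data can be sketched in code. The following is a minimal illustration, not taken from the article: the names, the secret key, and the tiny data set are invented. Pseudonymization replaces the direct identifier with a token that can be re-derived by whoever holds the key (the "additional information"), so the result remains personal data; anonymization aggregates the records into statistics from which no individual can be recovered.

```python
# Hypothetical sketch of pseudonymization vs. anonymization.
# All names, keys, and records are invented for illustration.
import hashlib

records = [
    {"name": "Alice", "city": "Berlin", "age": 34},
    {"name": "Bob", "city": "Berlin", "age": 41},
    {"name": "Carol", "city": "Hamburg", "age": 29},
]

# The "additional information" that makes re-identification possible.
SECRET_KEY = "server-side-secret"

def pseudonymize(rec):
    """Replace the direct identifier with a keyed token.
    Whoever holds SECRET_KEY can re-compute the same token for a
    known name, so the output is still personal data."""
    token = hashlib.sha256((SECRET_KEY + rec["name"]).encode()).hexdigest()[:12]
    return {"id": token, "city": rec["city"], "age": rec["age"]}

def anonymize(recs):
    """Aggregate to counts per city: no output row refers to an
    individual any more, and the step is not reversible."""
    counts = {}
    for rec in recs:
        counts[rec["city"]] = counts.get(rec["city"], 0) + 1
    return counts

pseudo = [pseudonymize(r) for r in records]
anon = anonymize(records)
print(anon)  # {'Berlin': 2, 'Hamburg': 1}
```

Note that the pseudonymized records still carry quasi-identifiers (city, age); with small groups these alone can single out a person, which is exactly the residual re-identification risk the takeaways above warn about.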

About these key statements

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I arrive at my results by considering technology and law together, which seems to me absolutely essential when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
