
EU AI Regulation: Big Bang or False Start?


Artificial intelligence (AI) is based on large data sets. The EU protects personal or author-related data very well, which is good in itself, but it hinders the development of competitive AI systems. Further reasons speak against high-performance language models made in Germany. Can this dilemma be resolved?

Introduction

The most common use cases for AI are probably language models (LLMs) and image models. Video generators and object recognizers may soon join them. For simplicity's sake, this article therefore focuses on LLMs. The findings are largely or entirely transferable to many other model types, such as classifiers or medical diagnosis systems.

Currently, all competitive language models come from countries outside the EU. Mistral may be a small exception, although its language models are not quite at the forefront.

Aleph Alpha is no exception, as its new model Pharia-1 performs only moderately well in benchmarks, to put it politely.

Some believe the EU could still catch up. That won't happen, because for powerful language models only one thing is truly scarce: data. Nothing else. Not personnel, not technology, not money, not time. Nothing is missing except a very large amount of ideally representative data. And of course the data must be legally usable, which shrinks the pool of available data sets even further.

For very good language models, there is exactly one important ingredient missing in Europe:

Data.

Everything else is always available: one person, one or a few servers, and the best program code for AI training.

The reasons for the EU lagging behind in AI are literally enshrined in law.

Data protection laws

Data protection is very important. Numerous scandals prove this, scandals that primarily originate outside of Europe. Here are a few examples:

In the US, a very important presidential election was influenced by the illegal use of user analysis data from Google and Facebook (Meta) ("Cambridge Analytica").

Microsoft is being referred to as a security risk in the US by prominent entities, due to its lack of data security. ([1])

Meta is no better than Microsoft, but rather worse: Microsoft at least earns money with products as well as data, while Meta has nothing but user data, which it monetizes to the maximum. Data protection laws like the GDPR are more of a hindrance to that business model. ([1]) ([2])

Similar negative reports can be made about Google. That criminals are occasionally caught because US security authorities evaluate the use of Google products is not really reassuring. Anyone who, as an innocent citizen, happens to be in the wrong place at the wrong time can quickly be labeled a criminal and rot in prison despite their innocence, or even face the death penalty.

The General Data Protection Regulation (GDPR) has a very good basic idea. It was issued when AI was not yet an issue, and it is sensible in itself. But why is it effectively not applied? German data protection authorities sanction only in homeopathic doses.

The General Data Protection Regulation (GDPR) essentially only allows the use of personal data for AI training on the basis of a legitimate interest (cf. Art. 6(1) GDPR). Consent is ruled out for mass data, and a contract would be legally difficult to construct for mass data.

Worse still: for authorities, legitimate interest is NOT available as a legal basis (legitimate interest is point (f) of the aforementioned Art. 6(1) GDPR, which does not apply to public authorities in the performance of their tasks). Authorities therefore practically cannot train AI systems. This is particularly unfortunate, because authorities hold a lot of valuable data whose use could in turn benefit citizens.

The General Data Protection Regulation (GDPR) applies "only" to personal data, which also includes pseudonymised data (Article 4 No. 1 GDPR). The GDPR does not apply to anonymous data.

However, to put it somewhat hyperbolically, there is practically no truly anonymous data:

  1. Anonymous data is data for which the original data is no longer accessible (a very rare case).
  2. Anonymised data is less representative than the original data and therefore less valuable for AI training.
  3. Anonymisation itself is a data processing operation. In practice, authorities are virtually prohibited from carrying it out; others can practically only do so on the basis of a legitimate interest, which is difficult to assess.
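To make the pseudonymous-versus-anonymous distinction concrete, here is a minimal Python sketch (the e-mail address and candidate list are hypothetical): hashing an identifier is pseudonymisation, not anonymisation, because anyone holding candidate identifiers can recompute the hash and re-identify the record.

```python
import hashlib

def pseudonymise(email: str) -> str:
    """Replace an identifier with its SHA-256 hash. This is pseudonymisation,
    not anonymisation: whoever holds candidate identifiers can recompute the
    hash and re-identify the record, so under Art. 4 No. 1 GDPR such data
    remains personal data."""
    return hashlib.sha256(email.encode("utf-8")).hexdigest()

record = {"email": "alice@example.com", "visits": 42}
pseudonymous = {"user": pseudonymise(record["email"]), "visits": record["visits"]}

# Re-identification is trivial for anyone with a list of candidates:
candidates = ["bob@example.com", "alice@example.com"]
matches = [c for c in candidates if pseudonymise(c) == pseudonymous["user"]]
print(matches)  # the "masked" record points back to alice@example.com
```

The hash hides the address only from someone who has no candidate list at all; as soon as the original data (or a guessable population) exists somewhere, case 1 of the list above does not apply.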

We are talking about practice here. What holds in theory does not interest any company in the world that wants to solve concrete problems. Theoretical discussions leave out one thing: practical relevance.

In fact, for data protection reasons, mass data cannot simply flow into an AI system, for example for training the AI. ([1])

This also applies to public data on the internet. The following cases are problematic:

  1. Someone writes something about another person. This could be a statement of fact, or it could be defamation. The other person does not want this information to be public knowledge, and certainly not stored in an AI language model.
  2. A person publishes information about themselves. An AI stores this information because a crawler reads the person's website. Later, the person decides to withdraw the information and demands its deletion from the AI operator as well. However, data cannot be deleted from AI models. Try erasing a piece of information from your head: you can't. Your brain and the AI "brain" are both neural networks; there is no difference here. Believe it or not, what matters is that information cannot be removed from AI models.
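The claim that information cannot simply be "deleted" from a trained model can be illustrated with a toy sketch (a character-level bigram model standing in for a neural network; the name and text are hypothetical). The training text is not stored anywhere as a retrievable record, only as counts smeared across many transitions, yet the model can regenerate the name, so there is no single entry one could erase.

```python
from collections import defaultdict

# Toy "language model": bigram counts over characters.
# No field of the model contains the name; it lives distributed in the counts.
training_text = "the report was written by alice example. alice example lives in berlin."

model = defaultdict(lambda: defaultdict(int))
for a, b in zip(training_text, training_text[1:]):
    model[a][b] += 1

def generate(seed: str, length: int) -> str:
    """Greedy generation: always follow the most frequent successor."""
    out = seed
    for _ in range(length):
        successors = model.get(out[-1])
        if not successors:
            break
        out += max(successors, key=successors.get)
    return out

# Every key in the model is a single character, yet generation
# reconstructs the memorized name from the distributed counts:
print(generate("al", 10))  # contains "alice"
```

A real neural network stores its training signal in millions of weights instead of a count table, but the point is the same: "delete person X" has no corresponding operation on the parameters.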

To repeat: for data protection reasons, mass data cannot be used for AI training in the EU. This is at least an unwanted side effect of the otherwise very sensible GDPR.

Copyright

Under § 44b UrhG (text and data mining), German copyright law allows training AI with works protected by copyright. Such works may even be stored temporarily for AI training.
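In practice, § 44b(3) UrhG lets rights holders reserve this use in machine-readable form, and crawlers commonly check robots.txt for such reservations (whether robots.txt alone satisfies the machine-readable requirement is debated). Here is a minimal sketch using Python's standard urllib.robotparser, with a hypothetical robots.txt and example crawler names:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a publisher reserving its content
# against an AI crawler while allowing everyone else:
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The AI crawler is blocked, a generic crawler is not:
print(parser.can_fetch("GPTBot", "https://example.com/artikel/1"))         # False
print(parser.can_fetch("SomeSearchBot", "https://example.com/artikel/1"))  # True
```

A crawler that respects such a reservation must perform this check before fetching each page; whether ignoring it voids the § 44b privilege is ultimately a legal, not a technical, question.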

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
