Artificial intelligence as a bullshit magnet


AI is the hot topic that has already revolutionized our everyday lives and will continue to change them significantly. Many are suddenly AI experts. Many are calling for AI to be regulated. Many trivialize AI and say that AI language models do not process personal data. The following is an outline that aims to clarify misunderstandings.

Introduction

AI is both underestimated and overestimated. Most people, often myself included, do not understand the possibilities offered by AI systems. Just yesterday, I saw revolutionary AI approaches that were unknown two weeks ago. As someone who works very intensively with artificial intelligence, I feel this way almost every day.

Many think that AI is a hype that will soon fade away. Wrong! With the Transformer approach of 2017, I would say, the intelligence function of humans was deciphered. Instead of programming an algorithm to solve a problem, I only have to feed enough examples into my AI system, which runs under my desk. This is how even previously unknown hieroglyphs have been discovered and deciphered.

From a justified fear of the negative consequences of increasingly powerful AI systems, many are calling for regulation. But they don't say how.

Then there are trivializers who want to position themselves as AI experts or legal enablers. They tell others how, or simply that, they can use ChatGPT profitably. Even at the DSRI conference (German Foundation for Law and Informatics), one contribution claimed that AI models do not process personal data.

Others offer reassurance by pointing to the new informal data protection agreement between Europe and the USA. Just because data can now be sent to the USA without additional safeguards, some suggest that any data processing is therefore permitted.

A few details on the individual points follow.

Possibilities of AI systems

An AI can do everything a human can do and much more. Maybe not yet, but potentially (in a specific application area X) as early as next week. Robots with AI brains will soon walk around and experience the environment. They will learn exactly the way children learn. We will see who takes the place of the parents: it could be human trainers, but also other robots or algorithms.

An example of rapid development: until recently, AI language models could only process a small amount of text at once. This amount of text is referred to as the context length. Until just a few months ago, the context length in almost all AI language models I was familiar with was 1024 tokens (a token is roughly a word or word fragment).

The context length then increased almost every week, first to 2048, then to 4096, then to 8192, then to 16,000 tokens, and later to 32,000 tokens. ChatGPT recently boasted a context length of 128,000 tokens.

Yesterday I read about an approach that has been known in research for a few months. It can process a context length of one billion tokens (= 1,000,000,000) at once. A quick calculation: before, 128,000 tokens; one blink of an eye later, 1,000,000,000 tokens. That is an improvement by a factor of roughly 7800, just like that.
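The factor mentioned above can be checked with a quick back-of-the-envelope calculation, using the figures as given in the text:

```python
# Back-of-the-envelope check of the context-length jump described above.
old_context = 128_000          # tokens, the recent ChatGPT-class window
new_context = 1_000_000_000    # tokens, the one-billion-token research approach

factor = new_context / old_context
print(f"Improvement by a factor of about {factor:,.0f}")
```

The exact ratio is 7812.5, which the text rounds down to "roughly 7800".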

Moore's Law does not apply to artificial intelligence. Instead of a steady increase in performance or other factors every 12 to 24 months, there is a significant improvement in relevant AI properties virtually every month.

This is based on my concrete observations and on my own AI programs.

Another example: the Transformer approach mentioned above has a few weaknesses. It is very resource-hungry: even high-performance computers or graphics cards need a few seconds to generate an answer to a chatbot question. Every ChatGPT user knows what I am talking about. Now there is an approach that provides the same response quality but responds eight times faster and requires only a third of the expensive and scarce graphics card memory for its calculations.
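A minimal sketch of why standard Transformer attention is so resource-hungry: the attention step compares every token with every other token, so its cost grows quadratically with context length. This is a simplified model of the scaling behavior, not of any concrete implementation:

```python
# Self-attention compares every token with every other token, so the
# attention matrix has context_len * context_len entries.
def attention_matrix_entries(context_len: int) -> int:
    return context_len * context_len

# Doubling the context quadruples the attention matrix.
for n in (1024, 2048, 4096):
    print(f"context {n:>5}: {attention_matrix_entries(n):>12,} entries")
```

This quadratic growth is exactly what the newer, faster approaches try to avoid.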

If you are over 50 years old, I have good news for you: there is a chance that you will die of natural causes and in peace. All significantly younger people will experience the end of humanity, because AI systems will massively outperform, enslave or wipe us out. Possibly another catastrophe will occur beforehand, but this article is not about that.

Is AI just statistics?

The question is irrelevant. It does not matter whether the human brain is based on statistical processes; what matters is what comes out in the end. Obviously, our entire existence is based on statistical processes. Compare this with quantum physics, a very elementary and powerful theory. Quantum physics is based on the fact that the behavior of a single tiny particle of our existence cannot really be predicted. Rather, a statement can only be made about particles when many of them are considered and the average is drawn from the observations.

Obviously, German grammar is based on learning which words are typically strung together and fit together. That is also statistics. But hardly anyone talks about it.
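The point that stringing words together is "just statistics" can be illustrated with a toy bigram model that simply counts which word typically follows which. This is a deliberately minimal sketch with made-up example data, not how modern Transformers actually work:

```python
from collections import defaultdict

# Count word successors in a tiny corpus (a toy stand-in for training data).
corpus = "the cat sat on the mat and the cat ate the fish".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def most_likely_next(word: str):
    """Return the statistically most frequent successor of a word."""
    followers = counts[word]
    return max(followers, key=followers.get) if followers else None

print(most_likely_next("the"))  # "cat" is the most frequent successor here
```

Scaled up by many orders of magnitude and generalized beyond fixed word pairs, this counting-and-predicting principle is what language models refine.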

The regulation of AI

The capabilities of AI make many people rightly anxious or worried. Out of this felt helplessness, some demand regulation of AI systems. What exactly that means is usually not said. The only demands that have stuck in my mind are the following:

  • Labeling of AI-generated works: Images, videos, texts…
  • Disclosure of the sources used to train an AI system
  • Anything else? I can't think of anything worth mentioning right now

On the first point: granted. Labeling works is a good idea, but it changes almost nothing. Criminals will not start labeling their fake videos and fake news as artificially generated fakes, while everyone else will dutifully comply with the labeling duty. This will not save humanity, but it can be done. There is some benefit, but it is only a selective intervention with hardly any qualitative effect.

Disclosure of sources, i.e. of training data: anyone who demands this simply has no idea how AI models are structured. The demand comes years too late. The sources are usually known anyway:

  • The Pile (Text)
  • Common Crawl (Text)
  • LAION data set (images, run by a German non-profit association!)

The text data includes Wikipedia, news pages and popular websites in particular.

If I tell you that my AI model has read a chemistry encyclopedia, what would you do with that information? Theoretically, my AI could develop chemical warfare agents. In practice, this has already happened, as an experiment by benevolent researchers who generated active substances with AI and, for testing purposes, turned the "one" in their scoring function into a "minus one", so that toxicity was rewarded instead of penalized. The dangers are therefore considerable.

Example with my own AI

My question to my AI on 01.12.2023:

"Name specific events or opportunities to go out during the day on 02.12.2023 near Idstein/Germany (near Wiesbaden)! Only events indoors and only events that are more than 10 kilometers away from Idstein."

Answer from my AI (some parts omitted):

"On 02.12.2023 there are events such as the cello concert with Roger Morelló at the Wiesbaden Castle Festival and the weekly market in Wiesbaden from 11:00 am. All events are more than 10 kilometers away from Idstein."

--> I was looking for events for "tomorrow" in the vicinity, but not on my doorstep (otherwise too boring and familiar).
--> The AI found events for tomorrow in the vicinity that are not right on my doorstep.
--> Task fulfilled, and all of this without any optimization, which is certainly still possible and would bring even better results.
--> Start of programming this solution (local language model already in place): 01.12.2023. End of programming: 01.12.2023.

I subsequently found proof of my AI's statement: the event really does exist.
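A query like the one above can be assembled programmatically before it is handed to the local model. The following is a sketch under stated assumptions: `ask_local_llm` is a hypothetical stand-in for whatever local language model interface is running, not a real library call:

```python
from datetime import date, timedelta

def build_event_prompt(home: str, min_distance_km: int, day: date) -> str:
    """Assemble a date- and location-constrained query for a local LLM."""
    return (
        f"Name specific events or opportunities to go out during the day "
        f"on {day:%d.%m.%Y} near {home}! Only events indoors and only "
        f"events that are more than {min_distance_km} kilometers away."
    )

tomorrow = date(2023, 12, 1) + timedelta(days=1)
prompt = build_event_prompt("Idstein/Germany (near Wiesbaden)", 10, tomorrow)
# answer = ask_local_llm(prompt)   # hypothetical call to the local model
print(prompt)
```

Wrapping the constraints in a template like this is what makes "programming" such a solution a matter of a single day.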

In short, since I have to cut it short here: AI cannot be regulated. An AI can do everything a human can (see above). How does one regulate humans? Not at all, as long as they have not done anything bad. Only after something bad has happened (theft, murder, terrorism, incitement etc.) is it investigated, and then it is too late. As far as I know, democracies have never succeeded, and still do not succeed, in forbidding a human to think. An AI can think faster, longer and (soon) better than a human.

Incidentally, I think the question of whether an AI can be the author of a work is largely nonsensical. After all, if I have an image pre-generated by my own AI (without a watermark) and then claim that it is my own work, you will not be able to prove otherwise, or only with the greatest effort. Above all, AI-generated images or pieces of music can still be edited and refined manually.

Do language models process personal data?

Yes, almost always. And always when

a) the training data contains personal data or

b) the user uses personal data in their input (prompt) to the chatbot.

Point a) is a given for all language models I know of. See, for example, the huge training data sets The Pile and C4 (Colossal Clean Crawled Corpus), which are used in all common chatbot models.

Apparently, some wish that AI systems did not process personal data. The reality is: AI language models process personal data and store it as well.

Some accept that and then claim that personal data could be automatically recognized and anonymized. That is bullshit. Anyone who claims this has no idea about artificial intelligence or data protection. Unfortunately, there are particularly naive people and organizations that nominate themselves for innovation prizes with supposed solutions making empty promises about the anonymizability of data.

Privacy Shield II (Data Privacy Framework)

In purely formal terms, the data protection world for data transfers between the EU and the USA is once again intact. The accusation that led to the ECJ's Schrems II ruling and the invalidation of the Privacy Shield was that the USA is a surveillance state (FISA 702, EO 12333, CLOUD Act). Apparently this has been argued away with the Data Privacy Framework (DPF), which will probably soon be overturned by the ECJ as well.

The point is that personal data can now be transferred from the EU to the USA again without any special safeguards. Some fools infer from this, or suggest, that all data processing in the USA is now permitted.

It remains true that any processing of personal data must be based on one of the legal bases in Art. 6(1) GDPR. And yes, personal data is always transmitted to ChatGPT when the ChatGPT interface is used: the IP address is personal data and is always transmitted. Unfortunately, OpenAI does not care to adhere too closely to data protection regulations, because then its own AI could not be improved as effectively. Microsoft, as a shareholder of OpenAI, is not very interested in data protection either. See the new Outlook, which even takes the usernames and passwords of your mail accounts and retrieves and analyzes your data and your email correspondence. Not to mention the security problems of Microsoft (Azure), which Microsoft has downplayed and not (or only now, perhaps?) solved.

AI experts

The AI essays of many people who have little or no knowledge of technology are remarkable. AI is based on technology to a greater extent than almost any other achievement. So how can someone who understands nothing, or very little, about that technology make competent statements?

Then there are the ChatGPT disciples who want to earn money with recommendations and prompt improvements. At least they understand something about technology, namely that you do not speak into a computer mouse (like Scotty), but use it to move a cursor on the screen. This applies, of course, only to those who occasionally use their PC with its unnecessarily large monitor and unnecessarily efficient keyboard instead of a perfectly sufficient tiny smartphone keyboard and screen for people of advanced years with the best eyesight.

These ChatGPT disciples, who may have a minimal knowledge of technology and know how to use the internet, unfortunately often, indeed almost always, have no idea about data protection and no interest in it.

ChatGPT is a great system and can be used for harmless tasks with a clear conscience. But what about sensitive data?

Limits of AI

It is still the case that language models (LLMs) in particular often hallucinate, i.e. deliver false statements. That will remain the case, I say. Or would you say that people don't make false statements? Even experts often say false things that they later revise – provided they have insight. Apparently, in several million years of existence, mankind has not managed to change anything about its unreliability. Why should it be any different with artificial systems?

AI can certainly be more reliable than humans in many areas and extremely reliable in some areas. But when it comes to summarizing statements of claim in court, I fail to understand how AI can be seen as a solution for this.

Data-friendly AI systems

Data protection is of no interest at all to many companies. Fine, then let us take trade secrets instead. Who will give me their trade secret? Nobody? Why not? But if my name is ChatGPT, you will give it to me?

There are supposed to be documents whose confidentiality has been contractually agreed; many call this an NDA (non-disclosure agreement). If you upload such a document to ChatGPT to ask the chatbot for a summary, have you not already breached confidentiality? I say yes.

It would be even worse if you used the new Outlook to send confidential documents, because then Microsoft would automatically gain in-depth knowledge of them.

What many companies still do not understand: there are many things that ChatGPT cannot do at all, or does worse than their own AI systems could. One solution is self-sufficient AI systems that belong to your company. This also solves the data problem, because you decide whether data leaves your system at all and, if so, which data and to whom it is sent.

As a programmer, I can download new software libraries every day to solve problems in minutes that would previously have taken years to solve – or were impossible to solve at all.

In addition, and this is probably more interesting for many people, your own AI systems can access your company knowledge at any time and without effort and answer questions about it. The entry point into your own AI system could be an intelligent document search engine or an AI tool for data analysis. From the document search you can then seamlessly move on to a question-answering machine. If you knew what is already possible, you would enjoy the last years of your existence even more.
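A minimal sketch of what such an entry point could look like. Real in-house systems would use embeddings and a local language model; this keyword-overlap ranking with made-up example documents only illustrates the idea:

```python
def search(documents: dict[str, str], query: str, top_k: int = 2) -> list[str]:
    """Rank documents by how many query words they contain."""
    query_words = set(query.lower().split())
    scores = {
        name: len(query_words & set(text.lower().split()))
        for name, text in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

docs = {
    "nda.txt": "confidentiality agreement between the contracting parties",
    "sales.txt": "quarterly sales report with revenue analysis",
}
print(search(docs, "confidentiality agreement"))  # nda.txt ranks first
```

Feeding the top-ranked documents into a local language model as context is then the step from a document search to a question-answering machine.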

Incidentally, proprietary AI systems are not expensive. We are not talking about rocket-science projects involving hundreds or thousands of man-hours. A first AI system can be set up within a short time.

Conclusion

AI is not a fad, but a condition that will exist until the end of time. The question is not whether AI will be so powerful that we will suffer, but when. This development can no longer be stopped.

The reason for this is that anyone can download and use almost all of the concentrated AI knowledge, including AI software libraries and AI models (= electronic brains), on their own computer at any time.

AI offers possibilities that many people simply cannot yet imagine. They will mean the end of humanity.

Greetings to Prof. Schmidhuber, the German AI pioneer, who, according to my perception, saw things differently a few months ago. Maybe not today.

Regulation would only be possible if every computer purchase and every download from the internet were monitored. The new insights I gain every day make me shudder at the massive possibilities offered by AI. I am talking about the fact that, as a programmer, these possibilities are available to me right now, and even more so "tomorrow". All it takes is research in the relevant sources, which I do for an hour a day; yesterday even longer, until half past midnight, because the possibilities I read about and saw program code for are so fascinating and breathtaking. That was also the trigger for this article.

Have fun enjoying the last few years of your usual existence!

Key messages

Artificial intelligence is rapidly advancing and has the potential to do much more than we currently understand.

AI is developing much faster than predicted by Moore's Law, with significant improvements happening almost every month.

AI is incredibly powerful and can do almost anything a human can, including potentially harmful things. Because of this, regulating AI is extremely difficult, similar to regulating human thoughts and actions.

Despite advancements in AI, truly anonymizing personal data is impossible.

Companies should invest in their own AI systems to protect their data and gain a competitive advantage.

The author is deeply concerned about the rapidly advancing capabilities of AI and believes they pose a significant threat.

About

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
