Artificial Intelligence: Question-and-answer system for the Data Protection Blog Dr. GDPR

Sensitive data does not belong in foreign hands, American ones included, such as ChatGPT or the clouds of Microsoft, Google, or AWS. Fortunately, running your own AI systems is both possible and affordable. Business secrets no longer have to be fed into ChatGPT or any other cloud. This article describes an experiment: a question-answer assistant for this data protection blog, Dr. GDPR.

Introduction

Even if we have not cared much about data protection so far, we may start to now that our business secrets are at risk of being scattered all over the world. Some documents may be covered by legally binding confidentiality agreements. I doubt that confidentiality is still preserved once a document is uploaded to ChatGPT's or Google's cloud.

Data-friendly: Secure for all kinds of data, whether personal data (data protection), confidential data or business secrets.

Data-friendly is more than data-protective.

Even the often-despised topic of data protection is back on many people's minds. Search engines were, and still are, allowed to process data without intervention, yet when AI systems process the same data, data protection authorities send inquiries. Funny. This is probably due in part to the possibilities artificial intelligence offers, but just as much to herd mentality (if one authority looks into it, we can do so too without being seen as spoilsports, some officials may think). That is the only way I can explain why the most inactive data protection federal state in the world (Hesse) also announced a timid move in the form of an inquiry to ChatGPT.

A common use case for artificial intelligence is document search. More demanding are question-answer systems, or search engines that directly provide text summaries of the documents they find. My plan was to build a search system for the Dr. GDPR data protection blog, and a data-friendly one at that.

The search assistant for Dr. GDPR should provide an answer to natural language questions. Here is an example:

Does my website need a cookie popup?

The AI's answer is better than that of most people. For the answer from the Dr. GDPR AI, see below.

As the example question shows, some questions are phrased differently than would be academically correct. Many people ask whether something is compliant with data protection, usually meaning whether a specific data processing operation is lawful under the GDPR.

My AI should give the answer in its own words, based on the articles published so far on Dr. GDPR. Hallucinations are to be avoided in the process, since this is all about facts and legally relevant knowledge. Hallucinations are invented statements with no basis in fact. I will cover how hallucinations arise in a future article. They can be explained thoroughly, without resorting to speculation.

Prototype proves feasibility

With a prototype, I have shown that your own AI systems can be programmed and run locally on your own servers. The easy route would have been one of the following:

  • Use the interface of ChatGPT
  • Throw a lot of money at the problem and bless the Americans (Cloud)
  • Throw even more money at the problem and buy expensive hardware.

Buying expensive hardware is a viable option for larger companies, but not for many SMEs. I therefore chose a different setup, and cost was a factor in choosing the hardware. It helps to know that AI calculations run on graphics cards rather than on the main processor. The graphics card is not used to output images or text here. Instead, its thousands of mini-processors are repurposed to carry out the computationally intensive work of an AI faster than the single processor of even a very good personal computer could. Unfortunately, graphics cards with a lot of memory are expensive. A graphics card with 48 GB of memory cost 15,000 euros just a few months ago. Good AI models, however, need more like 96 GB, or even up to 400 GB, of this more expensive graphics memory spread across several cards (not hard drive storage, and not the cheaper RAM of a computer!).
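To get a feeling for where such memory figures come from, here is a rough back-of-the-envelope sketch. It assumes that the weights dominate memory use and that each weight takes a fixed number of bits. The model sizes and bit widths are hypothetical examples, not figures from this article, and a real system needs additional memory on top of the weights.

```python
def weight_memory_gb(num_parameters: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical model sizes, chosen only for illustration.
EXAMPLE_MODELS = {"7B parameters": 7e9, "70B parameters": 70e9}

for name, parameters in EXAMPLE_MODELS.items():
    for bits in (16, 8, 4):
        size = weight_memory_gb(parameters, bits)
        print(f"{name} at {bits} bits per weight: about {size:.1f} GB")
```

Under these assumptions, a model with 70 billion parameters stored at 16 bits per weight needs about 140 GB for its weights alone, which would have to be spread over several 48 GB graphics cards. Storing weights with fewer bits shrinks that figure proportionally; whether and how that is done in the setup described here is not stated.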

My AI systems, by contrast, run on minimal hardware, at least by the standards of artificial intelligence. An example: searching a company's own intranet documents with natural language questions works on a small rented server. A company's own server can of course be used as well. This is made possible by optimization techniques, which are paid for with additional technical complexity. Once that complexity has been mastered, it is no longer an obstacle.
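What "searching documents with a natural language question" means can be shown with a deliberately simple stand-in. The sketch below ranks documents by classical TF-IDF cosine similarity using only the Python standard library. It is not the AI model used for this blog, which the article does not describe in detail. The file names and document texts are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def rank_documents(question: str, documents: dict[str, str]) -> list[tuple[str, float]]:
    """Rank documents by TF-IDF cosine similarity to the question."""
    doc_counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    n_docs = len(documents)
    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter(word for counts in doc_counts.values() for word in counts)

    def vectorize(counts: Counter) -> dict[str, float]:
        # Smoothed IDF, so words found in every document keep a small weight.
        return {
            word: count * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, count in counts.items()
        }

    def cosine(a: dict[str, float], b: dict[str, float]) -> float:
        dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vector = vectorize(Counter(tokenize(question)))
    scores = [(name, cosine(query_vector, vectorize(counts)))
              for name, counts in doc_counts.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up example documents (hypothetical names and contents).
example_documents = {
    "cookies.txt": "A cookie popup is needed when a website sets cookies that require consent.",
    "fonts.txt": "Loading fonts from external servers transfers the visitor IP address.",
    "newsletter.txt": "A newsletter requires a documented opt-in from the subscriber.",
}
ranking = rank_documents("Does my website need a cookie popup?", example_documents)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A purely lexical method like this only matches identical words. It already fails on "need" versus "needed", which is one reason language models that capture meaning are preferred for real question-answer systems.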

Effective AI applications and language models

Question-answer assistants, however, need a bit more than intelligent document search. Documents should not only be found; content should also be extracted from them and presented as an answer. A simple way to do this is an extractive answer, which is a verbatim quote from the original text. Abstractive answer systems are harder to build and better. They answer in their own words and can even combine knowledge from several documents into one newly worded answer. Such an answer could not have been produced from a single document. A person would have had to find, read, and mentally process many documents. The AI takes over this unpleasant, time-consuming task, one that many people could not accomplish at all.
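The extractive variant can be illustrated with a toy example. The sketch below returns, word for word, the one sentence that shares the most words with the question. The scoring by word overlap and the sample texts are assumptions made for illustration only. An abstractive system would instead need a generative language model, which is not shown here.

```python
import re


def words(text: str) -> set[str]:
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def extractive_answer(question: str, documents: list[str]) -> str:
    """Return the single sentence that shares the most words with the question."""
    question_words = words(question)
    best_sentence, best_overlap = "", 0
    for document in documents:
        # Naive sentence split: ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", document.strip()):
            overlap = len(question_words & words(sentence))
            if overlap > best_overlap:
                best_sentence, best_overlap = sentence, overlap
    return best_sentence


# Made-up example texts (hypothetical, not taken from the blog).
sample_documents = [
    "Cookies store data on the device. "
    "A cookie popup is required if the website uses cookies that need consent.",
    "Fonts can be hosted locally. This avoids data transfers to third parties.",
]
answer = extractive_answer("Does my website need a cookie popup?", sample_documents)
print(answer)
```

Because the answer is copied from the source text, an extractive system of this kind cannot invent statements, but it also cannot merge information from several documents into one answer.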

My AI systems aim to be data-friendly. They should also run on the least expensive hardware possible. Both are achievable, as practice shows.

Use cases tested in more depth so far: document search, text understanding, image generation, image analysis, audio applications.

When we talk about searching and summarizing documents, we usually mean documents and answers in German. Put very briefly: German is unfortunately not a world language. That makes German texts much harder to process with an AI application than English or Chinese texts (though the latter would be extremely difficult for me, too).

My AI system therefore needs an electronic brain (a "model") that understands and "speaks" German. This raises the requirements for an AI architecture considerably. By default, the size of the AI model needed for German would not be usable on affordable hardware. But this problem can also be solved, as I found out.

Using powerful AI systems on servers that are both affordable and available in Germany (data protection! business secrets! confidentiality!) takes a few tricks. While building the AI solution, I felt like a participant in "Jugend forscht"!

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.