Artificial Intelligence: The Benefits of In-House AI Systems, with a Practical Example

Data is a valuable resource, especially when it comes to business secrets. For legal reasons, confidential and personal data should not be handed to third parties (such as ChatGPT). Besides confidentiality, in-house AI systems offer great flexibility and precise alignment with concrete requirements. A report from practice.

Introduction

"Because it's simply easy" was the slogan of a mobile phone provider. "Easy is the new wrong" is what is often said about data-intensive applications. Many people are not really interested in data protection. When it comes to employee data, data contractually secured as confidential, the groundwork for patents, or other business secrets, companies are more sensitive. After all, no one wants legal trouble. The desire to carry internal company knowledge out into the world is probably not very widespread either.

Artificial Intelligence: The legal approach examines what may be permitted and clarifies risks. The technical approach provides data-friendly systems and resolves many legal issues on its own.

Acting constructively rather than arguing is a good strategy, I think. Lawyers will still have enough to do even then.

ChatGPT is easy to use, but some people make things too easy for themselves, to their own detriment. This already shows that thinking is harder than doing something wrong or suboptimal. People even accept a greater total effort as long as each individual effort is small and merely repeated often: better a small effort 100 times with a high total cost than a medium-sized effort once with a significantly lower total cost.

Recently, Zoom, a provider of video conferencing software, formulated new terms of use. With them, Zoom grants itself the right to use all data received in Zoom video conferences almost arbitrarily. This includes passing on your data, including transcripts, and using it for machine learning ("training an AI"). This would not have happened with a data-friendly solution from Germany. Nor would it have been a problem with your own system. Now all Zoom users potentially have a problem.

All Zoom users potentially have a problem because they prefer supposedly free third-party systems to data-friendly solutions.

Thanks to Zoom for the decision-making help.

If you must take the easy route, at least access ChatGPT through its programming interface from your own program. Many applications can be built this way. Alongside remarkable abilities, ChatGPT brings several problems that cannot be remedied:

  • ChatGPT is very slow.
  • Most of ChatGPT's data is irrelevant for business applications (ballast that gets in the way, promotes hallucinations, slows the system down, and increases susceptibility to errors).
  • All data lands with OpenAI and thus with Microsoft.
  • Data is not secure with ChatGPT (see the belatedly added opt-out instead of consent, the data leak, American company policy, etc.).
  • ChatGPT is based on outdated general knowledge.
  • ChatGPT is not familiar with your company's documents and hopefully will never learn them.
  • ChatGPT costs money, depending on the number of processed text pieces (tokens). Uploading and analyzing a larger PDF will already make you poorer. Incorrect programming (infinite loop or recursion) will quickly ruin any budget.
  • ChatGPT is not infinitely scalable.
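The cost point in the list above can be made concrete with a little arithmetic. The following is only a minimal sketch: the price per 1,000 tokens, the spending cap, and the rule of thumb of roughly four characters per token are assumptions chosen for illustration, and the names `TokenBudget` and `estimate_tokens` are hypothetical. It shows how a hard cap can stop an accidental infinite loop before it ruins the budget.

```python
from dataclasses import dataclass


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count.

    Assumption: about 4 characters per token. Real tokenizers
    will give different counts for the same text.
    """
    return max(1, round(len(text) / chars_per_token))


@dataclass
class TokenBudget:
    """Tracks estimated spend for a token-priced service and enforces a cap."""

    price_per_1k_tokens: float  # hypothetical price; check the provider's price list
    max_spend: float            # hard cap, in the same currency as the price
    spent: float = 0.0

    def cost_of(self, tokens: int) -> float:
        """Cost of processing the given number of tokens."""
        return tokens / 1000 * self.price_per_1k_tokens

    def charge(self, tokens: int) -> float:
        """Book the cost, or raise before the cap would be exceeded."""
        cost = self.cost_of(tokens)
        if self.spent + cost > self.max_spend:
            raise RuntimeError("token budget exceeded, request not sent")
        self.spent += cost
        return cost
```

Calling `charge(estimate_tokens(document_text))` before every request means that a loop that never terminates fails with an exception once the cap is reached, instead of continuing to generate costs.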

If your inputs are also used to train or fine-tune a third-party AI model, privacy and confidentiality can no longer be guaranteed. A language model learns not only grammar and structure but also absorbs knowledge. The shortcomings listed above are more of a practical annoyance and hindrance than a legal problem, which means they cannot be resolved by legal means.

Similar things can be said about image generators like Dall-E or Midjourney. Many of these generators are based on an approach called Stable Diffusion. Almost all relevant methods of this kind use the LAION dataset. LAION, in turn, used the Common Crawl data dump to find websites that embed images along with image descriptions. Common Crawl is a massive dump of nearly every website. If one of your images has ended up in the image dataset, it is not stored there in its pure form. Rather, your company image (logo, product image, etc.) is held structurally in the artificial neurons of a third party's AI model. Getting that image out again is hardly possible; the AI model would have to be retrained instead. Whether the owner of the AI model will do this is questionable. After all, training is an extremely compute-intensive task with demanding data acquisition.

In-house AI systems

You are rid of all the problems mentioned above when you use your own AI system. I call systems of this type local AI systems or autonomous AI systems. They do not require an internet connection and could, in the best case, sit under your desk.

In-house AI systems offer these benefits:

  • Full Data Control: You decide which training data or pre-trained AI models are used.
  • Query your own data, not internet data: Feed your company documents and media into the system.
  • High speed: In any case, your system will be faster than ChatGPT if you want it to be. You will have far fewer users than popular AI platforms do. Moreover, you can reduce the data volume significantly.
  • Customizable at will: More on this below.
  • A wide range of application scenarios: Semantic search, text understanding, question-and-answer assistants, image generators, audio transcription, and many more.

Here is a practical example of what a local system can do for your company. The example runs on a low-cost server and works. It is, however, still in development and will eventually offer much more than it does now. The remaining work is no big deal and is only a matter of my prioritization.

Semantic search for corporate documents
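
As a rough illustration of the general idea only, and not the system described in this article, a search over documents can be sketched as: turn the query and each document into a vector, then rank the documents by cosine similarity to the query. Everything named here is a hypothetical placeholder. In particular, `toy_embed` merely counts words and is not semantic at all. It stands in for a locally hosted embedding model, which would be passed in through the `embed` parameter.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Placeholder embedding: lowercase word counts.

    Assumption: a real setup would replace this with a local
    embedding model that captures meaning, not just shared words.
    """
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    return dict(Counter(w for w in words if w))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: str,
    documents: List[str],
    embed: Callable[[str], Vector] = toy_embed,
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the top_k documents, best match first, with their scores."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Because the embedding function is a parameter, the ranking logic stays the same when the word-count placeholder is swapped for a model running on your own server, and no document has to leave the company network.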

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.