Artificial Intelligence: Works of Authors and Their Protection

Creators of works that are accessible online have a legal right to declare a reservation of use. This is meant to protect works from flowing into electronic brains. Does this approach work? This article outlines the possibilities and the limits.

Introduction

Artificial intelligence has enormous capabilities that often far surpass those of the average human being. The Turing test is considered passed. This test checks whether a computer comes across as being as intelligent as a human being. By now, the answer is yes. As ChatGPT shows, an AI can even outperform humans in certain areas, at least when averaged over all people. An AI knows no fatigue and can always draw on better hardware, unlike humans with their very limited brains. The only advantages humans still have, from my point of view, are their senses and the ability to explore and perceive the environment. This will soon change considerably in favor of artificial systems.

AI models can soak up authors' texts and images from the web almost at will, and with the law's blessing. The law grants authors a right to reserve use, but in practice they do not have it. The reasons are purely organizational and technical.

These astonishing abilities of AI are also frightening. Creators worry that their works will now be soaked up and taken apart by an electronic brain. Google has already been doing this, only nobody got as upset about it: someone enters a search term into the search engine. Instead of your website appearing for that term, so that you can win the user over for your legitimate purposes, the search engine itself gives the answer as an extract of your content. The user never even lands on your website but is siphoned off beforehand. You are the content provider and the fool. Google is happy about it. The user doesn't care.

Many authors of works available online have called for a consent requirement: the author would have to permit a machine learning model to use their work. Others demand only what the law already provides, namely an opt-out option. This is anchored in § 44b Abs. 3 UrhG and worded as follows:

Uses pursuant to paragraph 2 sentence 1 [reproduction of lawfully accessible works for text and data mining] are only permissible if the rights holder has not reserved them. A reservation of use for works accessible online is only effective if it is made in machine-readable form.

Section 44b(3) of the Copyright Act (UrhG)

Furthermore, copies of copyrighted works made for AI purposes must be deleted as soon as they are no longer needed. This is not a problem: if you read a text thoroughly, you know what it said afterwards even without the original. The same applies to an AI.

Technical reservations of use

Works that are freely accessible online include websites, linked PDF files, images, audio files, raw text files and free e-books. Under § 44b UrhG, authors of such works have no right of consent (opt-in), only an opt-out option. If the author does not signal an opt-out, their text can be read and used for text and data mining under that provision. I count applications of artificial intelligence among these mining processes. I am probably not alone in this view.

By the way, the term opt-out is not really a synonym for reservation of use. An opt-out also affects the past, whereas a reservation of use only affects the future. If a reservation of use is declared after a crawler has already read the content, it has no effect on that particular read operation.
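The timing argument above can be pictured with a minimal sketch. It assumes, as the paragraph does, that a reservation only binds read operations that happen at or after the moment it is declared. The function name and all dates are hypothetical and serve only as illustration.

```python
from datetime import datetime


def is_covered_by_reservation(read_time: datetime, reservation_time: datetime) -> bool:
    """Return True if a crawler read falls under the reservation of use.

    Assumption for this sketch: the reservation only affects reads that
    happen at or after the moment it was declared, never earlier ones.
    """
    return read_time >= reservation_time


# Hypothetical timeline; the dates are made up for illustration.
reservation_declared = datetime(2024, 6, 1)
crawler_reads = {
    "read before reservation": datetime(2024, 5, 15),
    "read after reservation": datetime(2024, 6, 10),
}

for label, read_time in crawler_reads.items():
    covered = is_covered_by_reservation(read_time, reservation_declared)
    print(f"{label}: covered by reservation = {covered}")
```

A true opt-out would also have to reach the first read, which a comparison against the declaration date cannot express.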

What does an opt-out option look like technically?

For search engines and other crawlers, this option already exists in the form of the robots.txt file. This file follows a generally established, widespread and well-known convention. Every search engine that wants to appear law-abiding respects this file.

The robots.txt file of a website is located at the root path, for example at dr-dsgvo.de/robots.txt. On my blog it looks like this:

# robots.txt
User-agent: ia_archiver
Disallow: /
User-agent: archive.org_bot
Disallow: /
User-agent: slurp
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /

Additional note: I also use a dynamic bot protection that blocks some search engines as well.

My robots.txt file declares that the Internet Archive should not crawl my website. This is expressed by the user agent named ia_archiver together with the Disallow directive. I also prohibit ChatGPT from crawling, as the self-explanatory user agent name ChatGPT-User indicates.
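How a well-behaved crawler might evaluate these rules can be sketched with the robots.txt parser from Python's standard library. This is an illustration, not a description of how any particular crawler works. The article URL and the helper names `build_parser` and `is_allowed` are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# The rules as listed in the article.
ROBOTS_TXT = """\
# robots.txt
User-agent: ia_archiver
Disallow: /
User-agent: archive.org_bot
Disallow: /
User-agent: slurp
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that is already available as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Ask whether the given user agent may fetch the given URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    url = "https://dr-dsgvo.de/example-article/"  # hypothetical path
    for agent in ("ia_archiver", "ChatGPT-User", "Googlebot"):
        print(f"{agent}: allowed = {is_allowed(parser, agent, url)}")
```

Agents that are listed are refused for every path, while an agent that is not listed, such as Googlebot here, is allowed by default. The check is purely voluntary: nothing in the file prevents a crawler from ignoring it.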

Which user agent name belongs to which search engine, crawler or AI platform is not known offhand. Large platforms publish the names of their crawlers (user agents). A crawler is a program that collects content accessible online.

The entire principle of the robots.txt file rests on convention. Technically, the procedure is extremely simple. Without the convention, however, the procedure would not exist.
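To give a sense of how simple the procedure is, here is a deliberately reduced, hand-written reader for the two directives used in this article. It is a sketch under stated simplifications: it only understands User-agent and Disallow, matches agent names exactly, and ignores Allow lines and wildcards. The function names `parse_rules` and `may_crawl` are made up for this example.

```python
def parse_rules(robots_txt: str) -> dict[str, list[str]]:
    """Map each lower-cased user agent name to its Disallow path prefixes."""
    rules: dict[str, list[str]] = {}
    current_agents: list[str] = []
    last_was_agent = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue  # blank or unrecognized line
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not last_was_agent:
                current_agents = []  # a new group of rules starts here
            current_agents.append(value.lower())
            rules.setdefault(value.lower(), [])
            last_was_agent = True
        elif field == "disallow":
            if value:  # an empty Disallow forbids nothing
                for agent in current_agents:
                    rules[agent].append(value)
            last_was_agent = False
    return rules


def may_crawl(rules: dict[str, list[str]], user_agent: str, path: str) -> bool:
    """A crawler may fetch a path unless one of its Disallow prefixes matches."""
    prefixes = rules.get(user_agent.lower(), rules.get("*", []))
    return not any(path.startswith(prefix) for prefix in prefixes)


SAMPLE = """\
# robots.txt
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /
"""

rules = parse_rules(SAMPLE)
print(may_crawl(rules, "CCBot", "/some-article/"))
print(may_crawl(rules, "SomeOtherBot", "/some-article/"))
```

The whole mechanism is a lookup by name. It only works because crawler operators agree to perform that lookup and website owners know which names to write down.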

In practice, authors cannot reserve the use of works accessible online against an AI. The reason is the lack of a technical convention. AI models that have already been trained do not take into account reservations that were only declared after training.

Refers to Section 44b(3) of the German Copyright Act.

Suppose you want to block a new AI platform that was announced in the press yesterday. How do you do it? Until yesterday you did not even know this platform existed, so you could not have looked up the user agent that you now want to block. After all, Roland or Susi could build their own AI model and use a crawler to soak up content from the internet.

You would have to find out the technical names of all possible AI platforms: mine, Roland's platforms 1 to 5000, Susi's AI platforms 1 to 13847, Elon's experiments, your neighbor's, those of all US-based AI companies, and so on.

At present, AI platforms can only be kept away from online content individually, and only once their existence is known.

Technical fact.
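The gap described here can be made concrete with a small sketch based on Python's standard robots.txt parser. The crawler names RolandBot and SusiBot are invented for illustration, as are the helper functions `render_blocklist` and `is_blocked`. The sketch assumes a name-based blocklist like the one on this blog.

```python
from urllib.robotparser import RobotFileParser

# Crawler names the site owner already knows about (taken from the article).
KNOWN_AGENTS = ["ChatGPT-User", "CCBot"]


def render_blocklist(agent_names: list[str]) -> str:
    """Build robots.txt content that disallows everything for each named agent."""
    lines = []
    for name in agent_names:
        lines.append(f"User-agent: {name}")
        lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_blocked(robots_txt: str, user_agent: str, url: str = "https://dr-dsgvo.de/") -> bool:
    """Return True if the rules forbid this user agent from fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    today = render_blocklist(KNOWN_AGENTS)
    # Hypothetical new crawlers that nobody has heard of yet.
    for newcomer in ("RolandBot", "SusiBot"):
        print(f"{newcomer} blocked today: {is_blocked(today, newcomer)}")

    # Only after the name becomes known can the owner add a rule for it.
    tomorrow = render_blocklist(KNOWN_AGENTS + ["RolandBot"])
    print(f"RolandBot blocked tomorrow: {is_blocked(tomorrow, 'RolandBot')}")
    print(f"SusiBot blocked tomorrow: {is_blocked(tomorrow, 'SusiBot')}")
```

Every crawler whose name is missing from the list passes by default, and each new name has to be discovered and added by hand. Content read in the meantime is not affected by the later rule.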

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.

Protect intellectual property and prevent crawling of own content