
The Complete Failure of Microsoft Copilot

Microsoft hails Copilot as a professional solution designed to excel at all sorts of tasks. A test with a standard task reveals that this claim is, even under a generous interpretation, completely inaccurate. Aside from these functional shortcomings, the question of data security arises as well.

What is Microsoft Copilot?

Copilot has something to do with Artificial Intelligence. What exactly Copilot is could not be determined during the test, and the results did not encourage further testing.

Microsoft itself answers the question of what Copilot is supposed to be in an email sent after registering for the free trial version. According to Microsoft, Copilot is a powerful AI system:

"Whether you want to learn programming, plan the perfect vacation, or simply need a little help writing a difficult email, your AI companion in everyday life helps you get everything done like a pro."

Source: Microsoft's welcome email "Welcome to Microsoft Copilot, Your AI Companion in Everyday Life."

This statement sounds as if you could accomplish a great many things very well with Copilot. Copilot, says Microsoft, empowers you to "get everything done like a pro."

The email even includes a concrete, prominently placed example:

Source: the above-mentioned welcome email for Copilot. Red frame added for this article (image was automatically translated).

So "summarizing answers" is mentioned. What exactly is meant by this is certainly not clear to the author of this article. The linked Microsoft page ("Try now") likewise shines with generic platitudes: "Turn inspirations into actions" and "Get more done – anytime, anywhere."

The Copilot Test

This test is certainly not representative of all the capabilities Copilot is supposed to deliver. However, it does test Copilot's suitability for a very common task: summarizing text.

Microsoft at least mentions summarized answers as the first use case (see above). Could this perhaps (also or especially) refer to summarizing text?

The task is therefore neither overwhelmingly difficult nor out of touch with reality. Almost everyone would probably think of it as a use case for AI systems.

Copilot's performance made two tests necessary. In the first test, Copilot was given the URL of a blog article and asked to summarize it. The result was so bad that a second test seemed only fair. This time, Microsoft's so-called Copilot was given the text manually, so that it would not be overwhelmed by having to retrieve an article from the internet.

Test: Summarize a blog article via URL

The question (prompt) to Copilot was simple:

Summarize the following blog article: https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei/

The question Copilot was asked to answer.
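
For readers who want to repeat the experiment: Copilot was queried through its web interface at https://copilot.microsoft.com/, which offers no comparable public API. Below is a minimal sketch for sending the same prompt to an arbitrary chat model, assuming the openai Python package and a placeholder model name (neither was part of this test):

    # Hedged reproduction sketch: sends the test prompt to any
    # OpenAI-compatible chat endpoint. Model name and client are
    # assumptions, not the setup used in the test described here.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    prompt = (
        "Summarize the following blog article: "
        "https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei/"
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat-capable model works
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)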

Copilot's answer was as follows:

Copilot's answer to the above question. As of 08.05.2024 (image was automatically translated).

The sources were anonymized in the screenshot. Of the five sources mentioned, four referred to one website and the fifth to another. Neither website is mentioned or linked in the text that was to be summarized.

The provided text, which Copilot was supposed to summarize, contains no information about "ADM systems". The author of the text is completely unaware of what an "ADM system" is supposed to be; as a computer scientist, he has never heard of it. Either 30+ years of IT experience were for nothing, or Copilot fabricated content and threw around factoids irrelevant to the task.

Copilot answers a standard task completely inadequately.
Copilot's answer shines by its uselessness.

See post for details.

Copilot writes something about "transparency, self-control and supervision". These terms do not appear in the text. Below the text there is only the keyword "Full data control" in a contact box, referring to an offline AI that, according to its own claim, makes Copilot unnecessary in many situations and often surpasses it. Likewise, "discrimination" is never mentioned in the original text, yet Copilot worked it into its answer.

The article Copilot was supposed to summarize is not primarily about the GDPR but about AI. The terms "data protection" and "GDPR" are barely mentioned in the core text (and where they are, only in passing, e.g. "… in the Dr. DSGVO blog").

Conclusion: Copilot completely messed up and did not solve the task.

Nowhere was there any indication that the answer could be wrong or that it would be best to double-check it.

On July 5, 2024, Copilot gave the following answer to the same question (with a slightly different wording):

Source: Microsoft Copilot with red annotations by the author (image was automatically translated).

The picture speaks for itself (note that it was automatically translated by the author's own AI system, which will soon be released as software-as-a-service; since German cannot be translated into English with identical length and word order, the translated screenshot cannot be perfect).

Test: Summarize blog article text

On to test number two. This time we want to rule out that the failure was caused by retrieving a URL from the internet; it could be that Copilot was simply overwhelmed by that.

This test should have been easier for Copilot after it completely failed the previous one: now the text of the blog article was entered into Copilot manually via copy & paste. This is what it looked like:

Copilot test: summarize the provided text (only an excerpt of the text is shown, because the full text was too long for the screenshot) (image was automatically translated).

Unfortunately, it was not possible to paste the entire article into Copilot's chat box. This was of course taken into account; it is not the reason for the following test result. Copilot's answer:

Source: https://copilot.microsoft.com/, as of 08.07.2024 (image was automatically translated).

The answer has nothing to do with the original question. Some evidence for the poor quality of the answer, which is below toddler level (a toddler with nothing to say would have done no worse):

  • GPT-3 was not mentioned in the text that Copilot was supposed to summarize (1st, 2nd, and 3rd paragraphs of the Copilot response).
  • The researchers mentioned by Copilot and their study were not mentioned in the text (1st + 2nd paragraph).
  • The aspects mentioned in the 3rd paragraph, "ability to form analogies" and "analogy problems", were not mentioned in the text. There was only talk of analog signals (versus digital signals), and the word "analog" was used in one other sentence, but in a completely subordinate role ("…then we talk analog about robots with a built-in computer").
  • The "large language models" mentioned by Copilot were not mentioned in the text. There was only talk of "language models". The words "large" or even "LLM" (like "Large Language Model") did not appear.
  • The sources mentioned by Copilot were not mentioned in the text (1st, 2nd, and 3rd paragraphs as well as Copilot's source citations).

If you remove from Copilot's response all statements that have nothing to do with the original text to be summarized, very little is left. In school, the teacher would probably give Copilot's answer a grade of 6 ("insufficient"), the worst German school grade.
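
This kind of check can be automated in a crude form: flag every term that appears in the summary but nowhere in the source text. A minimal sketch of the idea; the file name and term list are illustrative, and plain substring matching will miss inflected word forms:

    # Flag terms from a model's summary that never occur in the source
    # text. Illustrative only: file name and term list are assumptions,
    # and naive substring matching is deliberately crude.
    from pathlib import Path

    def unsupported_terms(source: str, terms: list[str]) -> list[str]:
        """Return the terms that do not occur anywhere in the source text."""
        source_lower = source.lower()
        return [t for t in terms if t.lower() not in source_lower]

    source_text = Path("blog_article.txt").read_text(encoding="utf-8")
    suspicious = unsupported_terms(
        source_text,
        ["GPT-3", "large language model", "analogy problems"],
    )
    print("Not found in source:", suspicious)  # each hit is a likely fabrication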

The result generated by Copilot is a disgrace. It shines with arbitrary statements and can be called completely wrong. The task was to summarize a given text; this task was not even attempted. Instead, the answer reads like a random search result summarized by a language model. On top of that, the answer is presented as if it were the (perfect) answer.

Nowhere on the user interface (web UI) is there a hint that Copilot's answers could be wrong or inaccurate. At least nowhere visible on the screen. Perhaps Microsoft expects the user to search for and read Microsoft knowledge articles until they find, somewhere, "Copilot can also utter complete nonsense. But hopefully that doesn't happen too often, because we are Microsoft and pretend that Copilot can be truly great at all possible tasks, such as summarizing answers".

Result with the Author's Own AI

The test could possibly have been unfair (which sounds rather ironic considering the task). Perhaps summarizing a text was simply too difficult and exceeds even the abilities of the smartest human (that was definitely irony). To rule this out, the same task was given to the author's own offline AI. Completely without Microsoft, Copilot, Azure or ChatGPT.

The task for the Offline-AI was:

Summarize the following webpage: https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei

The identical task, which was too difficult for Copilot, given to the offline AI.

The unedited answer from the offline AI (the original was in German and automatically translated into English for this English version of the article; please refer to the German original for the details):

The author claims that AI (Artificial Intelligence) and humans function similarly to token parrots, by making probability-based decisions. He emphasizes that it's not about the mechanism itself, but the outcome. The author criticizes that many people have a false notion of intelligence and cannot define it.
He argues that AI systems like neural networks are "simple" mechanisms that can solve complex problems. He also claims that humans are token parrots because they make probability-based decisions.
The author presents a beta version of an offline AI and emphasizes that this solution is better than ChatGPT and does not send data to third parties. He offers free consulting for companies and authorities that need an AI solution.
Overall, the author argues that AI systems function like token parrots and that it's not about the mechanism itself, but the outcome.

Outcome of the given task with an AI that completes the work itself, i.e. not a mere copilot.

Anyone can judge the outcome for themselves. A comparison with Copilot should make clear to everyone where the quality differences lie.

But surely it must be very complicated to do something that is barely solvable even for Copilot, right?

Yes, it is terribly complicated (irony). It took almost longer to write the program than to wait for the answer (which came quickly); the basis for this offline AI already existed. Every company can have such an AI of its own. Prototypes can be built just as quickly to test the fundamental suitability of an AI for specific use cases.

The offline AI, which delivers the above result, did the following automatically:

  1. Retrieval of the URL mentioned in the prompt
  2. Extraction of the core content (= blog article)
  3. Summarization of the blog article
  4. Output of the answer

Offline AI means that the AI can run completely autonomously, communicating with the internet or other IT systems only when needed. That is exactly what happened here when the blog article was retrieved from the internet.
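
A minimal sketch of these four steps, not the author's actual implementation: it assumes the requests and trafilatura Python packages for retrieval and content extraction, and leaves the summarization step as a placeholder, since the author's offline model is not public:

    # Sketch of the four pipeline steps under the stated assumptions.
    import requests
    import trafilatura

    def summarize_locally(text: str) -> str:
        """Placeholder for step 3: call a locally hosted language model."""
        raise NotImplementedError("plug in your offline model here")

    URL = "https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei"

    html = requests.get(URL, timeout=30).text  # 1. retrieve the URL from the prompt
    article = trafilatura.extract(html)        # 2. extract the core content (= blog article)
    summary = summarize_locally(article)       # 3. summarize the blog article
    print(summary)                             # 4. output the answer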

Conclusion

Copilot is evidently a marketing instrument from Microsoft and not a serious AI; at least that is what the test described here suggests. Incidentally, programming tasks can also be completed without Copilot, using freely available AI models that do a very good job.

Anyone who wants to upload their own data to the Microsoft cloud should think twice here, provided they haven't already been put off by Copilot's questionable capabilities.

What is annoying is Microsoft's boundless self-confidence, which does not match Copilot's shortcomings at all. Everywhere (email, website) Copilot is presented as if it were the savior.

Wouldn't you rather use a better solution? The prerequisite is that concrete use cases are considered. That would be honest.

Independently of Copilot, it is a problem that lazy or mediocre developers use AI assistants such as Copilot or ChatGPT to produce program code that is less secure than code written manually. This is suggested by a study from Stanford University. A fool with a tool is still a fool!

Key messages

Microsoft Copilot failed at the simple task of summarizing a blog article given by URL, providing inaccurate and unhelpful results.

Even when the text was provided directly, Copilot failed to summarize it accurately, instead producing irrelevant and fabricated information.

The author believes that AI systems like his own offline solution are more effective than tools such as ChatGPT and Copilot, because they focus on delivering tangible results rather than relying on complex mechanisms.

About

About the author on dr-dsgvo.de
My name is Klaus Meffert. I hold a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at both technology and law, which seems to me absolutely essential when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and the development of optimized and secure AI solutions.
