Microsoft hails Copilot as a professional solution designed to excel at all sorts of tasks. A test with a standard task shows that this claim is, even on a generous interpretation, completely inaccurate. Aside from these functional shortcomings, the question of data security arises.
What is Microsoft Copilot?
Copilot has something to do with Artificial Intelligence. What exactly Copilot is could not be determined during the test. The test results did not encourage further testing.
Microsoft's own answer to the question of what Copilot is supposed to be arrives via email after registering for the free trial version. According to Microsoft, Copilot is a powerful AI system:
"Whether you want to learn programming, plan the perfect vacation, or simply need a little help writing a difficult email, your AI companion in everyday life helps you get everything done like a pro."
Source: Microsoft's welcome email "Welcome to Microsoft Copilot, Your AI Companion in Everyday Life."
This statement sounds as if Copilot lets you accomplish many things very well. Copilot, says Microsoft, empowers you to "get everything done like a pro."
The email even prominently includes a concrete example:

So "summarizing answers" is mentioned. What exactly is meant by that is anything but clear to the author of this article. The linked Microsoft page ("Try now") likewise offers only general platitudes: "Turn inspirations into actions" and "Get more done – anytime, anywhere."
The Copilot Test
This test is certainly not representative of everything Copilot is supposed to be able to do. However, it does test Copilot's suitability for a very common task: summarizing text.
Microsoft at least mentions summarized answers as the first use case (see above). Could this perhaps (also or especially) refer to summarizing text?
The task is therefore neither overwhelmingly difficult nor out of touch with reality. Almost anyone would probably think of it as a use case for AI systems.
Copilot's performance made two tests necessary. In the first test, Copilot was given the URL of a blog article and was supposed to summarize it. The result was so bad that a second test seemed only fair. In the second test, Microsoft's so-called Copilot was given the text manually, so that it would not be overwhelmed by retrieving an article from the internet.
Test: Summarize a blog article via URL
The question (prompt) to Copilot was simple:
Summarize the following blog article: https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei/
The question Copilot was asked to answer.
Copilot's answer was as follows:

The sources were anonymized in the screenshot. Of the five sources mentioned, four referred to one website and the fifth to another. Neither website is mentioned or linked in the text that was to be summarized.
The provided text, which Copilot was supposed to summarize, contains no information about "ADM systems". The author of the text has no idea what an "ADM system" is supposed to be; as a computer scientist, he has never heard of it. Either 30+ years of IT experience count for nothing, or Copilot fabricated content or threw around factoids irrelevant to the task.
Copilot answers a standard task completely inappropriately.
See post for details.
Copilot's answer shines with its uselessness.
Copilot writes something about "transparency, self-control and supervision". These terms do not appear in the text. Below the text, there is only the keyword "Full data control" in a contact box, which refers to an offline AI that, according to its claim, makes Copilot unnecessary in many situations and often surpasses it. There was also no mention of "discrimination" in the original text, yet Copilot worked it into its answer.
The article that Copilot was supposed to summarize is not primarily about the GDPR but about AI. The terms "data protection" and "GDPR" hardly appear in the core text (and where they do, only incidentally, e.g. in "… in the Dr. GDPR Blog").
Conclusion: Copilot completely messed up and did not solve the task.
Nowhere was there any indication that the answer could be wrong or that it would be best to double-check it.
On July 5, 2024, Copilot gave the following answer to the same question (with a slightly different wording):

The picture speaks for itself (note that it was automatically translated by our own AI system, which will soon be released as software-as-a-service; since German cannot be translated into English with identical length and word order, the translated screenshot cannot be perfect).
Test: Summarize blog article text
Let's move on to test number two. We want to rule out that the failure was due to retrieving a URL from the internet; Copilot may simply have been overwhelmed by that.
This test should have been easier for Copilot after it completely failed the previous one. This time, the text of the blog article was entered into Copilot manually via copy & paste. This is what it looked like:

Unfortunately, it was not possible to paste the entire article into Copilot's chat box. This was of course taken into account; it is not the reason for the following test result. Incidentally, an input that exceeds a chat box or context window is no fundamental obstacle to summarization, as the sketch below shows.
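A common approach is to summarize the text in chunks and then summarize the combined partial summaries. Here is a minimal sketch in Python; the summarize callable is a stand-in for any model call and is an assumption for illustration, not Copilot's (unknown) internal mechanism:

    # Map-reduce summarization sketch for texts exceeding an input limit.
    # `summarize` is a placeholder for any model call; it is an assumption
    # for illustration, not Copilot's (unknown) internal mechanism.
    from typing import Callable

    def summarize_long(text: str, summarize: Callable[[str], str],
                       limit: int = 3000) -> str:
        # Split the text into chunks that respect the input limit.
        chunks = [text[i:i + limit] for i in range(0, len(text), limit)]
        # Map: summarize each chunk separately.
        partials = [summarize(chunk) for chunk in chunks]
        # Reduce: merge the partial summaries and condense once more
        # if the merged text still exceeds the limit.
        combined = " ".join(partials)
        return combined if len(combined) <= limit else summarize(combined)

Back to the test. Copilot's answer was: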

The answer has nothing to do with the original question. Here is some evidence for the poor quality of the answer, which is below that of a toddler; a toddler with nothing to say would have done no worse:
- GPT-3 was not mentioned in the text that Copilot was supposed to summarize (1st, 2nd, and 3rd paragraphs of the Copilot response).
- The researchers mentioned by Copilot and their study were not mentioned in the text (1st + 2nd paragraph).
- The aspects mentioned in the 3rd paragraph, "ability to form analogies" and "analogy problems", were not mentioned in the text. The text only spoke of analog signals (versus digital signals), and the word "analog" appeared in one other sentence, in a completely subordinate role ("…then we talk analog about robots with a built-in computer").
- The "large language models" mentioned by Copilot were not mentioned in the text. There was only talk of "language models". The words "large" or even "LLM" (like "Large Language Model") did not appear.
- The sources mentioned by Copilot were not mentioned in the text (1st, 2nd, and 3rd paragraphs as well as Copilot's source citations).
If you remove from Copilot's response all statements that have nothing to do with the original text, very little is left. In school, a teacher would probably give Copilot's answer a grade of 6, the worst German school grade ("unsatisfactory").
The result generated by Copilot is a disgrace. It shines with arbitrary statements and can be called completely wrong. The task was to summarize a given text; this task was not even attempted. Instead, the answer reads like a random search result summarized by a language model. On top of that, the answer is presented as if it were the (perfect) answer.
Nowhere on the user interface (web UI) is there a hint that Copilot's answers could be wrong or inaccurate, at least not anywhere visible on the screen. Perhaps Microsoft expects users to search for and read Microsoft knowledge articles until they somewhere find: "Copilot can also utter complete nonsense. But hopefully that doesn't happen too often, because we are Microsoft and pretend that Copilot can truly be great at all possible tasks, such as summarizing answers."
Result with Own AI
The test could possibly have been unfair (which sounds rather ironic considering the task). Perhaps summarizing a text is simply too difficult and exceeds even the abilities of the smartest human (that was definitely irony). To rule this out, the same task was given to our own offline AI, entirely without Microsoft, Copilot, Azure, or ChatGPT.
The task for the Offline-AI was:
Summarize the following webpage: https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei
The identical task statement, which was too difficult for Copilot, given to the offline AI.
The unedited answer from the offline AI follows (the original was in German and was automatically translated into English for this version of the article; please refer to the German original for details):
The author claims that AI (Artificial Intelligence) and humans function similarly to token parrots by making probability-based decisions. He emphasizes that it's not about the mechanism itself but the outcome. The author criticizes that many people have a false notion of intelligence and cannot define it.
The outcome of the given task with an AI that completes the work itself, hence not a "copilot".
He argues that AI systems like neural networks are "simple" mechanisms that can solve complex problems. He also claims that humans are token parrots because they make probability-based decisions.
The author presents a beta version of an offline AI and emphasizes that this solution is better than ChatGPT and does not send data to third parties. He offers free consulting for companies and authorities that need an AI solution.
Overall, the author argues that AI systems function like token parrots and that it's not about the mechanism itself but the outcome.
The outcome can be judged by anyone. A comparison with Copilot should show everyone where the quality differences lie.
But surely it must be very complicated to build something that handles a task even Copilot can hardly solve, right?
Yes, it is terribly complicated (irony). Writing the program took barely more time than waiting for the answer (which came quickly). The basis for this offline AI already existed. Every company can have such an AI of its own. Prototypes for testing the fundamental suitability of an AI for specific use cases can be built just as quickly.
The offline AI, which delivers the above result, did the following automatically:
- Retrieval of the URL mentioned in the prompt
- Extraction of the core content (= blog article)
- Summarization of the blog article
- Output of the answer
Offline AI means that the AI runs completely autonomously and can, if needed, communicate with the internet or other IT systems. That is exactly what happened when the blog article was retrieved from the internet. A minimal sketch of such a pipeline is shown below.
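For illustration, here is a hedged sketch of the four steps in Python. It assumes the requests, beautifulsoup4, and transformers packages; the author's actual offline AI is not public, so the extraction logic and the model named here (an English-language example) are assumptions, not his implementation:

    # Hypothetical sketch of the four pipeline steps listed above.
    # Assumptions: requests, beautifulsoup4 and transformers are installed;
    # the summarization model has been downloaded once and is cached locally.
    # The model named here is an illustrative English-language example,
    # not the author's actual model.
    import requests
    from bs4 import BeautifulSoup
    from transformers import pipeline

    def summarize_url(url: str) -> str:
        # Step 1: retrieve the URL mentioned in the prompt.
        html = requests.get(url, timeout=30).text
        # Step 2: extract the core content. Naive version: all paragraph
        # text; a real extractor would strip navigation, comments, footers.
        soup = BeautifulSoup(html, "html.parser")
        article = " ".join(p.get_text(" ", strip=True) for p in soup.find_all("p"))
        # Step 3: summarize the blog article with a locally cached model.
        summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
        result = summarizer(article[:3000], max_length=180, min_length=60)
        # Step 4: output the answer.
        return result[0]["summary_text"]

    print(summarize_url(
        "https://dr-dsgvo.de/ki-und-intelligenz-ist-der-mensch-nicht-auch-ein-token-papagei/"
    ))

Once the model is cached, no text leaves the machine at inference time. The pipeline itself is a few lines, which supports the point above about how quickly such prototypes can be built.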
Conclusion
Copilot is, after all, a marketing instrument from Microsoft and not a serious AI. At least that applies to the test described here. Incidentally, programming tasks can also be completed without Copilot, using freely available AI models that do a very good job.
Anyone who wants to upload their own data to the Microsoft cloud should think twice here, provided they haven't already been put off by Copilot's questionable capabilities.
What is annoying is Microsoft's boundless self-confidence, which does not match Copilot's shortcomings at all. Everywhere (email, website), Copilot is presented as if it were the savior.
Wouldn't you rather use a better solution? The prerequisite is that concrete use cases are considered. That would be honest.
Independently of Copilot, it is a problem when lazy or mediocre developers use AI assistants such as Copilot or ChatGPT to generate program code that is less secure than code written manually. This is suggested by a study from Stanford University. A fool with a tool is still a fool!
Key messages
Microsoft Copilot failed at the simple task of summarizing a blog article, providing inaccurate and unhelpful results in both tests.
Instead of summarizing the given text, Copilot produced irrelevant and fabricated information.
The author believes that AI systems, like their own offline solution, are more effective than tools like ChatGPT and Copilot because they focus on delivering tangible results rather than relying on complex mechanisms.




My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
