With the AI model DeepSeek-R1, China has thoroughly embarrassed the Americans. DeepSeek-R1 is significantly more efficient than OpenAI's ChatGPT. Even very small offshoots of R1 come close to OpenAI o1 on important benchmarks. The stock market reacted with shock waves. The fact that there is already another DeepSeek model was barely even noticed.
Introduction
Everyone knows ChatGPT. Everyone has known DeepSeek since "yesterday" at the latest. Equity investors in particular are likely to have noticed that something has happened in the AI market.
DeepSeek has made a blueprint freely available that can make OpenAI superfluous.
This was demonstrated by DeepSeek-R1 and its distilled models.
OpenAI keeps the newer versions of its top models (o1, o3, etc.) secret and conceals details about them from the public. DeepSeek, a company from China, on the other hand, is giving away its top model by making it public. Too bad for OpenAI, which wanted to make money with ChatGPT but will probably never reach the profit zone.
The costs of creating DeepSeek-R1 were significantly lower than those of any newer version of ChatGPT. Furthermore, R1 can be operated at full capacity by any medium-sized company itself; the smaller variants even run on laptops. Operating it yourself means: you download the R1 model, copy it onto your AI server (or laptop), then disconnect from the internet if desired, and can thus work completely autonomously with your own AI.
DeepSeek's AI models can be downloaded and operated autonomously on your own server or laptop.
This is very useful for many applications.
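To make "operating it yourself" concrete, here is a minimal Python sketch that queries a locally running model through the open-source Ollama runtime. The model tag `deepseek-r1:7b` and the default port 11434 are assumptions (tags change over time); only the Python standard library is used, and no internet connection is required once the model has been downloaded:

```python
import json
import urllib.request

# Ollama's default local endpoint; the server runs on your own machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model, prompt):
    """Build the HTTP request for a locally running Ollama server."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    return urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )

def ask(model, prompt):
    """Send a prompt to the local model and return its answer text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a prior `ollama pull deepseek-r1:7b`):
# print(ask("deepseek-r1:7b", "What is model distillation?"))
```

Because the request goes only to `localhost`, this matches the autonomous, offline operation described above.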
How good is DeepSeek-R1?
The quality of a machine learning model is assessed with benchmarks: standardized sets of test questions for the model. How well a model works for a specific application in your company can only be determined by (simple, fast) testing for that specific application. You just have to know what you want to use machine learning for.
The benchmarks give a very good indication of how good a model is. Here are the benchmark results published by DeepSeek itself:

In each group, the left-hand bar shows DeepSeek-R1 and the second bar the OpenAI o1 reference model. As can be seen, R1 is on par with o1. The benchmarks shown are standard ones. In detail, they are:
- AIME 2024: Mathematics problems
- Codeforces: Programming tasks
- GPQA Diamond: GPQA stands for "graduate-level Google-proof Q&A benchmark"; multiple-choice questions
- MATH-500: Mathematical reasoning problems
- MMLU (Measuring Massive Multitask Language Understanding): Multiple-choice questions from many knowledge domains
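At their core, such benchmarks are scored question sets. Here is a minimal Python sketch of how a multiple-choice benchmark is evaluated; the items and the toy model are made-up stand-ins, not real GPQA or MMLU content:

```python
# Minimal sketch of multiple-choice benchmark scoring.

def score(model, items):
    """Fraction of items for which the model picks the gold answer."""
    correct = sum(1 for question, gold in items if model(question) == gold)
    return correct / len(items)

def toy_model(question):
    # Stand-in for a real LLM: always answers "B".
    return "B"

items = [
    ("Question 1?", "B"),
    ("Question 2?", "A"),
    ("Question 3?", "B"),
]

print(f"accuracy = {score(toy_model, items):.2f}")  # → accuracy = 0.67
```

Real benchmark harnesses add prompt formatting and answer extraction, but the final number is exactly this kind of accuracy score.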
To answer the opening question: DeepSeek-R1 is apparently very good. The model does show some deliberate knowledge biases: it answers critical political questions the way China likes. However, the model is not primarily meant as a chatbot, so this detail, disturbing as it may be, is often irrelevant from a technical point of view.
The quality of DeepSeek has been confirmed by a number of users. Here is a review report comparing R1 and OpenAI o1 (quoted below):
- For logical reasoning, R1 is much better than any previous SOTA model up to o1. It's better than o1-preview, but one step below o1. This also shows in the ARC-AGI test.
- Mathematics: For mathematics it's the same: R1 is a killer, but o1 is better.
- Coding: I haven't had a chance to play much with it, but at first glance it's neck and neck with o1, and the fact that it costs 20x less makes it the practical winner.
- Writing: Here R1 takes the lead. It conveys the same impressions as the early Opus. It is free, less censored, has much more personality, is easy to control and is very creative compared to the other models, even compared to o1-pro.
The course of the future
Anyone following developments in the AI sector will notice several things:
- AI models are getting better and better.
- Better AI models are getting smaller and smaller (they are hardly "large" anymore, as the term "Large Language Models" suggests).
- The training methods for creating AI models are becoming increasingly sophisticated.
- Smaller existing AI models can easily be made much better by learning from the answers of newer models.
- Calling the pace of development "lightning fast" would be an understatement.
All this poses existential threats for OpenAI.
What is even more important is the following realization: with the help of so-called reinforcement learning, small, already existing AI models can be made significantly better. To do this, you take the answers that the master model R1 gives to questions and use them to train smaller student models (distillation). The intelligence of the student models then gets a huge boost, and they can reason and draw conclusions much better on their own. DeepSeek has described a sophisticated method (emergent self-reflection) that makes this work particularly well.
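The teacher-student step can be illustrated with the classic soft-label distillation loss, sketched below in pure Python. Note the hedge: DeepSeek's published distillates were fine-tuned directly on R1's generated answers (hard targets); the temperature-softened KL loss here is the textbook formulation of the same idea, not DeepSeek's exact recipe. The student is trained to pull its output distribution toward the teacher's:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence KL(teacher || student) over softened distributions.

    Minimizing this over many examples nudges the student's answers
    toward those of the teacher model.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical outputs -> zero loss; diverging outputs -> positive loss.
teacher = [2.0, 0.5, -1.0]
aligned = distillation_loss(teacher, [2.0, 0.5, -1.0])
drifted = distillation_loss(teacher, [0.0, 2.0, 1.0])
print(round(aligned, 6), drifted > aligned)  # → 0.0 True
```

In practice this loss (or plain cross-entropy on the teacher's generated text) is minimized with gradient descent over a large corpus of teacher answers.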
In this way, open-source models like Qwen-2.5 and Llama-3.1, which were already very good in their own right, could be made even better, with minimal effort. OpenAI, on the other hand, has to invest a lot of time, energy, and money to achieve better results. While OpenAI keeps its secrets, these advances are taking place in public and are freely accessible to everyone.
(Source: the test report quoted above)

My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by considering both technology and law, which seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
