Meta has released version 2 of its powerful AI model LLaMA, this time also for commercial use, likely in response to the current dominance of Microsoft/OpenAI and Google in the AI field. It can be operated locally, without problems concerning data protection, business secrets or employee data. A practical test.
Introduction
The model released by Meta on July 18 is a Large Language Model (LLM), suited to analyzing text. It can therefore be used, among other things, for the following purposes:
- Summarizing text (abstractive summarization: a summary in your own, new words).
- Extracting meaning from documents (example: which questions does a given document answer?).
- Document search engine (vector search).
- Answering questions using company documents as a knowledge base (question answering).
- Chatbot (conversational use).
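The document-search use case above can be illustrated with a toy example: a minimal bag-of-words vector search in plain Python. A real system would use a neural embedding model rather than word counts; all names and sample documents here are my own illustrations, not part of LLaMA.

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": a bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, documents):
    # Rank documents by similarity to the query, best match first.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = [
    "LLaMA 2 can be run locally for commercial use",
    "Stable Diffusion generates images from text prompts",
]
print(search("run a model locally", docs)[0])
```

The same ranking idea carries over to real vector search: only the embedding function changes, from word counts to a learned dense vector.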
Update: More recent and capable language models are used in Offline-AI on Dr. GDPR. ([1])
LLaMA is an abbreviation for Large Language Model Meta AI. Meta AI is the department of the Meta conglomerate that deals with artificial intelligence applications. Having collected vast amounts of user data via Facebook, Instagram and WhatsApp, Meta now uses such data to train AI models like LLaMA.
The LLaMA 2 language model can be run locally and in a privacy-friendly way, even for commercial applications. The hardware requirements are manageable.
See this post as well as other posts.
Besides models for language understanding, there are models suited to other data types. Many will already have heard of Stable Diffusion, an AI model that generates an image from a text prompt (as do DALL-E, Midjourney and others).
For basics I recommend one of my previous contributions on Artificial Intelligence:
- Foundations for AI systems. ([1])
- Question-Answer System with AI. ([1])
- Current AI is a revolution and is not based primarily on statistics. ([1]) ([2])
- Typical use cases, data protection, confidentiality, misunderstandings. ([1])
- Configuration parameters of a language assistant. ([1])
The hardware requirements for the smaller models are feasible. The model size is determined by the number of parameters in the model. Parameters are, roughly speaking, the weights of the connections between neurons; as a simplification, one can think of the parameter count as the number of connections in the electronic brain.
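To make "parameters are connection weights" concrete, here is a quick back-of-the-envelope calculation for a tiny fully connected network. The layer sizes are illustrative numbers of my own choosing, not LLaMA's actual architecture:

```python
def dense_params(layer_sizes, bias=True):
    # Each pair of adjacent layers contributes n_in * n_out weights,
    # plus one bias value per output neuron if biases are used.
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + (n_out if bias else 0)
    return total

# A toy network: 1000 inputs -> 500 hidden neurons -> 10 outputs
print(dense_params([1000, 500, 10]))  # 1000*500 + 500 + 500*10 + 10 = 505510
```

Real LLMs use attention layers rather than plain dense layers, but the principle is the same: the parameter count grows with the number and width of the layers.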
In AI models, the parameter count is abbreviated as follows (examples):
- 7B = 7 billion
- 13B = 13 billion
- 70B = 70 billion
The "B" stands for the English "billion", i.e. 10^9, which in German is a "Milliarde"; the German "Billion" (10^12) is a "trillion" in English. Models with, for example, 200 million parameters are then labeled 200M. Just as well, since in German the "M" for "Million" could otherwise be confused with an "M" for "Milliarde".
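These abbreviations can be parsed mechanically. A small helper of my own (illustrative code, not from any library) that converts such labels to absolute parameter counts:

```python
def param_count(label):
    # "7B" -> 7 * 10**9 (billion), "200M" -> 200 * 10**6 (million).
    scale = {"M": 10**6, "B": 10**9}
    return int(float(label[:-1]) * scale[label[-1].upper()])

print(param_count("7B"))    # 7000000000
print(param_count("200M"))  # 200000000
```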
The parameter count of a model is an excellent indicator of its language understanding: the higher the number, the better the model "speaks" or understands a language. But which language? Until recently, most models were trained primarily on English. There was, however, always some bycatch. Meaning: fortunately, the internet contains some texts that happen to be in German rather than English, Chinese or Spanish, so an AI model with a sufficiently large parameter count can understand German almost by accident. That is not meant ironically, even if it sounds that way.
The search engine Bing, with a GPT language model in the background, often provides false answers.
My opinion. See post.
Essential for a model are therefore its parameter count and its training language. Among the large models, I am not aware of any that was trained specifically on German; that may be different next week. One can see very nicely how slowly some companies, authorities and lawmakers work: while they think in years or three-year periods, four weeks is a long time in the AI scene. Have fun in the future (which has already begun), when technological progress and its problems overwhelm us all. I am protecting myself more carefully and am no longer waiting for laws or court rulings.
Also crucial for an AI model is its so-called context length. The context length indicates how large the text snippets are that the model can process; to achieve this, the model must be trained on snippets of that context length. The larger it gets, the better, but also the more computationally intensive. I had read at Meta that numerous A100 graphics cards with 80 GB VRAM each were used for training, for a total computing time of 3.3 million graphics-card hours. An A100 is a very expensive graphics card; until recently a single one cost around 15,000 euros. Such a card draws a maximum of 400 watts from the power outlet.
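Taking the figures above at face value, a rough upper bound for the training energy consumption follows directly. This is my own back-of-the-envelope arithmetic, not a number reported by Meta:

```python
gpu_hours = 3.3e6     # reported training time in graphics-card hours
watts_per_gpu = 400   # maximum power draw of one A100

# watt-hours -> kilowatt-hours: divide by 1000
energy_kwh = gpu_hours * watts_per_gpu / 1000
print(f"{energy_kwh:,.0f} kWh")  # 1,320,000 kWh, i.e. about 1.32 GWh
```

The true figure is lower, since the cards do not run at maximum draw the entire time, but the order of magnitude is instructive.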
The LLaMA 2 model has a context length of 4,096 tokens. That is clearly more than its predecessor, LLaMA version 1, which only had 2,048 tokens. Most models I am familiar with have had context lengths of just 1,024 tokens until now. GPT-4 has a context length of 8,192 tokens, but it is also extremely slow, judging by the chat interface and response times. There are even models out there with context lengths of 128,000 tokens nowadays; however, these currently come with relatively few parameters.
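Context length is measured in tokens, not characters. A common rough rule of thumb for English text is about four characters per token; German tends to need somewhat more tokens per word. A sketch for estimating whether a text fits a model's context window, assuming that approximate ratio (a real check would use the model's actual tokenizer):

```python
def fits_context(text, context_tokens, chars_per_token=4):
    # Rough estimate: token count ~ character count / 4 (English text).
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= context_tokens

doc = "x" * 20000  # ~5000 estimated tokens
print(fits_context(doc, 4096))  # too long for a 4096-token window
print(fits_context(doc, 8192))  # fits an 8192-token window
```

For question answering over longer documents, texts are typically split into chunks that each fit the window, which is exactly where the vector search mentioned above comes in.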
So, how good is LLaMA 2?
Practical test of the LLaMA 2 model
My practical test gives an insight and a first impression, nothing more. As a use case I chose text generation: answering a question based on Dr. GDPR articles as the knowledge base. I asked all questions in German.




My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT and data protection. I achieve my results by considering both technology and law, which seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
