
Artificial Intelligence: Practice test of the new LLaMA language model from Meta


Meta has likely released its powerful AI model LLaMA in version 2, also for commercial use, in response to the current dominance of Microsoft/OpenAI and Google in the AI field. It can be operated locally without problems concerning data protection, business secrets or employee data. A practice test.

Introduction

The model released by Meta on July 18 is a Large Language Model (LLM) and is designed for analyzing text. It can therefore be used, among other things, for the following use cases:

  • Text summarization (abstractive summarization: a summary in the model's own words).
  • Extracting meaning from documents (example: which questions does a given document answer?).
  • Document search engines (vector search).
  • Answering questions against company documents as a knowledge base (question answering).
  • Chatbots (conversation).
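The "vector search" use case from the list above can be sketched in a few lines. This is only an illustrative toy: a real system would use embeddings produced by a language model, while here a simple bag-of-words vector stands in so the example stays self-contained. All document texts and function names are made up for the example.

```python
# Toy sketch of vector search: embed documents and a query as vectors,
# then rank documents by cosine similarity to the query.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Hypothetical stand-in for a model embedding: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "cookie banners and consent under the GDPR",
    "training large language models on GPUs",
    "vector search over company documents",
]

def search(query: str) -> str:
    """Return the document most similar to the query."""
    return max(documents, key=lambda d: cosine(embed(query), embed(d)))

print(search("search documents with vectors"))
```

With model embeddings instead of word counts, the same ranking step also finds documents that share no literal words with the query, which is what makes this approach useful as a document search engine.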

Update: More recent and capable language models are used in Offline-AI on Dr. GDPR. ([1])

LLaMA is an abbreviation for Large Language Model Meta AI. Meta AI is the department of the Meta conglomerate that deals with artificial-intelligence applications. Having collected vast amounts of user data on Facebook, Instagram and WhatsApp, Meta now uses this data to train AI models such as LLaMA.

The LLaMA 2 language model can be run locally and in a privacy-friendly way, even for commercial applications, and the hardware requirements are manageable.


Besides models for language understanding, there are models for other data types. Many will have heard of Stable Diffusion, an AI model that generates an image from a text prompt (similar to DALL-E, Midjourney, etc.).

For basics I recommend one of my previous contributions on Artificial Intelligence:

  • Foundations for AI systems. ([1])
  • Question-Answer System with AI. ([1])
  • Current AI is a revolution and is not based primarily on statistics. ([1]) ([2])
  • Typical use cases, data protection, confidentiality, misunderstandings. ([1])
  • Configuration parameters of a language assistant. ([1])

The hardware requirements for the smaller models are feasible. The model size is determined by the number of parameters in the model. Parameters are, roughly speaking, the weighted connections between neurons, so the parameter count can be thought of as a measure of the size of the electronic brain.
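To make the notion of "parameters as connections" concrete, one can count the parameters of a small fully connected network: each connection between two neurons carries one weight, and each neuron one bias. The layer sizes below are made up for the example.

```python
# Counting parameters in a toy fully connected network: one weight per
# connection between adjacent layers, plus one bias per neuron.
def mlp_parameters(layer_sizes: list[int]) -> int:
    """Total weights + biases of a fully connected net with these layer sizes."""
    return sum(
        n_in * n_out + n_out  # weights between layers + biases of the next layer
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# A tiny net: 512 inputs, two hidden layers of 256 neurons, 10 outputs.
print(mlp_parameters([512, 256, 256, 10]))  # 199690
```

A model like LLaMA 2 is of course not a plain fully connected net, but the principle is the same: the billions of parameters are overwhelmingly the weights of connections inside the network.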

In AI models, parameter counts are abbreviated as follows (examples):

  • 7B = 7 billion
  • 13B = 13 billion
  • 70B = 70 billion

The "B" stands for the English "billion", which corresponds to the German "Milliarde" (an English "billion" is not a German "Billion"). Models with, for example, 200 million parameters are then called 200M. That is convenient: in German, the "M" for "Million" and the "M" for "Milliarde" would otherwise be easy to mix up.
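The naming convention above can be captured in a small helper that turns a raw parameter count into the usual "B"/"M" label. The function name and the threshold choice are my own for this sketch.

```python
# Format a parameter count using the convention from the text:
# "B" for billions (German: Milliarden), "M" for millions.
def param_label(count: int) -> str:
    """Abbreviate a parameter count, e.g. 7_000_000_000 -> '7B'."""
    if count >= 10**9:
        value, suffix = count / 10**9, "B"
    else:
        value, suffix = count / 10**6, "M"
    return f"{value:g}{suffix}"  # :g drops trailing ".0"

print(param_label(7_000_000_000))   # 7B
print(param_label(13_000_000_000))  # 13B
print(param_label(200_000_000))     # 200M
```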

The parameter count of a model is a good indicator of its language understanding: the higher the number, the better the model "speaks" or understands a language. But which language? Until recently, most models were trained almost exclusively on English. However, there was always some bycatch: fortunately, some texts on the internet happen to be in German rather than English, Chinese or Spanish, so an AI model with a sufficiently large parameter count can incidentally understand German as well. That is not meant ironically, even though it may sound that way.

The search engine Bing, which uses a GPT language model in the background, often provides false answers.

That is my opinion; see this post.

Essential for a model are therefore its parameter count and its training language. Among the large models, I am not aware of any that was trained specifically on German. That may be different next week. One can see very clearly how slowly some companies, authorities and lawmakers work: while they think in years or three-year periods, four weeks are a long time in the AI scene. Have fun in the future (which has already begun), when technological progress and its problems overwhelm us all. I am protecting myself more carefully and no longer waiting for laws or court rulings.

Also crucial for an AI model is its so-called context length. The context length indicates how large the text snippets are that the model can process at once; to achieve this, the model must be trained with snippets of that length. The larger the context, the better, but also the more computationally intensive. I read at Meta that numerous A100 graphics cards with 80 GB VRAM each were used for training, with a total computing time of 3.3 million graphics-card hours. An A100 is a very expensive graphics card: until recently, a single one cost up to 15,000 euros, and each card draws a maximum of 400 watts from the power outlet.
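The training figures above invite a quick back-of-envelope calculation. The GPU-hours and the per-card wattage come from the text; the electricity price is my own assumption, inserted purely for illustration.

```python
# Back-of-envelope energy estimate from the figures in the text.
gpu_hours = 3_300_000   # reported A100 compute time for training
watts_per_gpu = 400     # maximum draw per card, per the text

energy_kwh = gpu_hours * watts_per_gpu / 1000  # watt-hours -> kilowatt-hours
print(f"{energy_kwh:,.0f} kWh")  # 1,320,000 kWh, i.e. 1,320 MWh

price_per_kwh_eur = 0.25  # hypothetical electricity price, assumption only
print(f"{energy_kwh * price_per_kwh_eur:,.0f} EUR")  # 330,000 EUR
```

Even at an upper-bound power draw, the electricity is small compared with the hardware itself: 3.3 million GPU-hours on cards costing up to 15,000 euros apiece is a budget only very few organizations can afford.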

The LLaMA 2 model has a context length of 4096 tokens. That is clearly more than its predecessor, LLaMA version 1, which had 2048 tokens. Most models I am familiar with have so far had context lengths of only 1024 tokens. GPT-4 has a context length of 8192 tokens, but it is also extremely slow, judging by the chat interface and response times. There are now even models with context lengths of 128,000 tokens, although these currently have relatively few parameters.
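To get a feel for what these context lengths mean in plain text, one can apply the common rule of thumb of roughly four characters per token for English. That ratio is an assumption, not a figure from the source; German text tends to need more tokens per word.

```python
# Rough conversion from context length in tokens to text size.
CHARS_PER_TOKEN = 4  # assumed average for English text; German needs more

for model, tokens in [("LLaMA 1", 2048), ("LLaMA 2", 4096), ("GPT-4", 8192)]:
    print(f"{model}: {tokens} tokens ~ {tokens * CHARS_PER_TOKEN:,} characters")
```

By this estimate, LLaMA 2's context of 4096 tokens holds on the order of a few typed pages of text at once.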

So, how good is LLaMA 2?

Practice test of the LLaMA 2 model

My practice test gives an insight and a first impression, no more. As the use case I chose text generation: based on a question, the model should produce an answer from Dr. DSGVO articles. I asked all questions in German.

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
