
Artificial intelligence for the interpretation of legal texts


While inaccuracies are accepted or often insignificant in everyday language, a precise understanding of the meaning of a statement is fundamental for lawyers. Legal texts can be analyzed with an AI. Can this be done satisfactorily with generic AI systems such as ChatGPT? What alternatives are there?

Update

One useful application is the summarization of legal texts, optionally in formal language, in citizen-friendly language, or even in the "language of the street". Using our own AI language models running on our own AI servers, we have implemented this specifically for Hessian laws and for the GDPR.

Result for the GDPR regulation text.

Motivation

Microsoft's Bing search engine uses a language model from OpenAI, with which Microsoft recently entered into a partnership. Bing responds with false statements even though it has access to the best hardware and the best software. The reason is probably that Bing is meant to be universally usable rather than tailored to your company.

Microsoft Bing's highly developed language model answers a first question and a semantically identical, almost identically worded second question with opposite answers, and incorrectly in both cases.

See the following examples. At least Bing delivers its answer very quickly, though that is no real consolation.

Here is an example of the failure of Bing's advanced, unspecialized language model. The question is the kind that an expert witness might have to answer in court. Purely by chance, I have already done exactly that myself.

Can the location of a server be determined by its IP address?

Answer from Bing (as of 31.08.2023): Yes. By the way, newer versions of Bing or Copilot are also unable to answer reliably.

False answer from Bing on the question: can a server's location be determined by its IP address? (image was automatically translated).

This answer is incorrect. An IP address is not suitable for reliably determining the location of a server. In fact, the assignment of the IP address to a server can change at any time. To clarify: This is about servers, not about Internet connections of private PCs!

Now the same question is put to Bing again. This time, however, a single word is exchanged: "by" is replaced with "using".

The question now is: can the location of a server be determined using its IP address?

The answer should be the same, but it is not (in the truest sense of the word, because Bing answers "not").

False answer from Bing on the question: can the location of a server be determined using the IP address? (image was automatically translated).

This answer is also wrong, because the reason given after the short answer "not" is also wrong. Even with a court order, it is often impossible to determine which IP address a server was assigned to at time X. This is because, if we take Google as an example of an operator of hundreds of thousands of servers, Google would have to log the IP address of each server at all times. It is not clear whether this takes place. In any case, it seems unlikely. Due to massive load balancing, the server network of large operators is highly dynamic. In addition, Bing gives a reason that does not match the question in part. Furthermore, "not" as a short answer does not fit the reasoning.

Introduction

When using third-party systems such as those from Microsoft or OpenAI, questions of legality arise alongside questions about the quality of the results. Recently, legal action was brought against openJur because it had republished on its own website a judgment that had already been published elsewhere, and the judgment mistakenly contained a person's full name. Feeding such data, business secrets or other confidential data into a chatbot certainly does not increase legal certainty.

Data-friendly AI systems not only significantly increase legal certainty, but often also the quality of the results.

This refers to self-sufficient (autarkic) AI systems.

Lawyers have often discussed the extent to which artificial intelligence can help to understand judgments more quickly. The NLP task of text summarization, for example, is suitable for this purpose. NLP stands for "Natural Language Processing" and attempts to capture the meaning of natural language. NLP approaches have been around for a long time.

What's new is that with powerful language models (LLM = Large Language Model) complex texts can now be processed in unprecedented quality. This makes it possible, for example, to program a question-answer assistant for this blog. The results are astonishing. However, unwanted statements must be prevented by intervening in the system. Often, so-called hallucinations are responsible for undesirable results.

Hallucinations arise because the general knowledge of a language model is overlaid with specific knowledge from the context. The context here is, for example, all articles on Dr. DSGVO. A language model learns not only the grammar of a language such as German, but also acquires factual knowledge in the process. In doing so, it can also pick up false facts. A good example is the widespread but fundamentally false statement that cookies are text files.

The following section explains the difficulties involved in the machine analysis and understanding of legal texts. These difficulties apply to all types of text, but the legal field in particular demands the highest possible accuracy.

The question of whether general AI systems such as ChatGPT can be suitable for processing legal texts properly is then discussed.

How are texts processed by an AI?

Before we delve deeper into the AI-specific processes, we need to clarify how texts are processed in the first place. Even a long time ago, the task of text processing by machines was to capture the meaning.

The judgments of the Court of Justice of the European Union illustrate the complexity of the problem well. The Court makes its published judgments available online. For this example, a judgment was picked at random.

A ruling by the European Court of Justice is published as an HTML page. Besides pure text, HTML also contains layout instructions such as bold print, paragraphs, headings, automatic numbering, etc.

A piece of pure text from the judgment would be, for example, this sentence: "According to § 5a para. 2 DRiG, the subject of the university studies – of which at least two years must have been spent in Germany – are compulsory subjects and specializations with options".

To a human, this sentence appears to contain no special characters. Technically speaking, however, the character after the "§" symbol is already a special character. It is not a blank space in the technical sense, but rather a character that merely looks like one.
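The point about the invisible special character can be made concrete with a small sketch. It assumes that the character in question is a no-break space (U+00A0), which is a common choice after "§" but is not confirmed by the text above. The function names and the sample string are purely illustrative.

```python
import unicodedata

def find_invisible_specials(text: str) -> list[tuple[int, str]]:
    """Return (position, Unicode name) for space-like characters that are not a plain space."""
    hits = []
    for i, ch in enumerate(text):
        # Category "Zs" covers all space separators; U+0020 is the ordinary space.
        if unicodedata.category(ch) == "Zs" and ch != " ":
            hits.append((i, unicodedata.name(ch)))
    return hits

def normalize_spaces(text: str) -> str:
    """Replace every space separator with an ordinary space."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in text)

# Assumption: the judgment uses U+00A0 after the section sign and after "para."
sample = "According to §\u00a05a para.\u00a02 DRiG"
print(find_invisible_specials(sample))  # [(14, 'NO-BREAK SPACE'), (23, 'NO-BREAK SPACE')]
print(normalize_spaces(sample) == "According to § 5a para. 2 DRiG")  # True
```

Whether such characters should be normalized or kept depends on the later steps. A no-break space can be a useful hint that "§" and "5a" belong together.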

Another example from a judgment (this time AG Bonn) for a sentence that is not a sentence:

The sentence that isn't one. At the very least, the grammar is askew here (image was automatically translated).

Why is this important? To see why, you need to understand how an AI processes text. Essentially, the following steps are required for an AI to process texts and, for example, answer questions about them:

  1. Read in the text (here: an ECJ ruling in HTML format; PDF documents and other file formats are also conceivable).
  2. Extract the raw text.
  3. Break the text down into bite-sized chunks that fit into an AI model's input. Until recently, the best AI models could take only 1024 characters of input, a capacity that has since quadrupled. The example judgment has approximately 44,000 characters.
  4. Receive user input, such as a question, and convert it into a series of numbers that an AI model can understand.
  5. Compare the individual chunks from step 3 with the user input from step 4 and formulate an answer.

Questions about a specific document (here: an ECJ judgment) are answered by an AI by first determining the text snippet (or a few snippets) that best matches the question and then extracting the answer from it.
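Steps 4 and 5 can be illustrated with a deliberately crude sketch. Instead of the learned embeddings that real systems use, it turns texts into simple word counts and compares them by cosine similarity. The chunk texts and all function names are invented for the example, so this shows only the principle, not a usable retrieval component.

```python
import math
import re
from collections import Counter

def to_vector(text: str) -> Counter:
    """Turn text into word counts, a very crude stand-in for a learned embedding."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_chunks(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks that are most similar to the question."""
    q = to_vector(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, to_vector(c)), reverse=True)
    return ranked[:top_k]

# Hypothetical chunks, not taken from a real judgment
chunks = [
    "The Court is composed of the President and the judges.",
    "The applicant must have studied law for at least two years in Germany.",
    "The costs of the proceedings are a matter for the national court.",
]
print(best_chunks("How many years of study in Germany are required?", chunks))
# ['The applicant must have studied law for at least two years in Germany.']
```

Word counts only match identical words, so "study" and "studied" do not count as a match here. Closing gaps of this kind is exactly what better text representations are for.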

A document is processed by first dividing it into manageable chunks. A chunk ends at the end of a sentence.

Chunks can overlap, i.e. individual sentences can appear in more than one chunk.

The smallest meaningful semantic unit is a sentence. This is why the text is split into sentences in step 3 above. It would be very undesirable for a sentence to be cut in half and end up in two different chunks.
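A minimal sketch of such sentence-based chunking with overlap might look as follows. It assumes that the text has already been split into sentences, which is the hard part. The function name, the character limit and the overlap of one sentence are illustrative choices.

```python
def chunk_sentences(sentences: list[str], max_chars: int, overlap: int = 1) -> list[list[str]]:
    """Group whole sentences into chunks of at most max_chars characters.

    The last `overlap` sentences of a chunk are repeated at the start of the next one.
    A single sentence longer than max_chars still becomes its own chunk.
    """
    chunks: list[list[str]] = []
    current: list[str] = []
    for sentence in sentences:
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(current)
            # Carry over the tail of the finished chunk as context.
            current = current[-overlap:] if overlap > 0 else []
            # Drop carried-over context if it leaves no room for the new sentence.
            while current and len(" ".join(current + [sentence])) > max_chars:
                current = current[1:]
        current = current + [sentence]
    if current:
        chunks.append(current)
    return chunks

sentences = ["A b c.", "D e f.", "G h i.", "J k l."]
print(chunk_sentences(sentences, max_chars=13))
# [['A b c.', 'D e f.'], ['D e f.', 'G h i.'], ['G h i.', 'J k l.']]
```

With a limit of 1024 characters, the example judgment of roughly 44,000 characters would yield at least 43 chunks, and more once overlap is taken into account.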

Identifying sentences in texts

As shown, an AI should know which sentences a text consists of. Without knowledge of the individual, cleanly separated sentences, meaning is usually lost. Furthermore, AI models are trained on examples for specific tasks such as summarizing texts or general text understanding. For this purpose, sentences or statements are given as examples, together with the ideal answer devised by the human trainer.

What is a sentence? There is no simple answer to this question. A sentence usually ends with a punctuation mark, but often it does not. In addition, the same punctuation mark often does not mark the end of a sentence at all: in abbreviations, the period serves as an abbreviation mark. It becomes difficult when an abbreviation stands at the end of a sentence, so that the abbreviation mark and the end-of-sentence mark coincide in a single character.

Here is an example of a sentence from an ECJ judgment that most people fail to read to the end or to grasp correctly on first reading:

In Case C-358/08: reference for a preliminary ruling under Article 234 EC from the House of Lords (United Kingdom), made by decision of June 11, 2008, received at the Court on August 5, 2008, in the proceedings between Aventis Pasteur SA: and OB: THE COURT (Grand Chamber), composed of V. Skouris, President, A. Tizzano, J.N. Cunha Rodrigues, K. Lenaerts (Rapporteur) and E. Levits, Presidents of Chambers, C.W.A. Timmermans, A. Rosas, A. Borg Barthet, Judges Levits, Judges C.W.A. Timmermans, A. Rosas, A. Borg Barthet, M. Ilešič, J. Malenovský, U. Lõhmus, A. Ó Caoimh and J.-J. Kasel, Advocate General: V. Trstenjak, Registrar: L. Hewlett, Principal Administrator, having regard to the written procedure and further to the hearing on June 30, 2009, after hearing the observations of: – Aventis Pasteur SA, represented by G. Leggatt QC, assisted by P. Popat, Barrister, – OB, represented by S. Maskrey QC, assisted by H. Preston, Barrister, – the European Commission, represented by G. Wilms, acting as Agent, after hearing the Opinion of the Advocate General at the sitting on September 8, 2009, gives the following judgment:

Excerpt from a judgment of the Court of Justice of the European Union in Case C-358/08. The representation is compressed here. In its original formatting, this sentence would take up roughly one A4 page on screen.

A sentence can also end without any punctuation mark. That people have no problem with this is due to the markup (HTML code) used in ECJ rulings. Here is an example (excerpt from an arbitrary ECJ ruling):

View of an ECJ ruling in the browser (excerpt). Source: https://eur-lex.europa.eu/legal-content/DE/TXT/HTML/?uri=CELEX:62008CJ0345&qid=1693473655909 (image was automatically translated).

The word "judgment" is not followed by an end-of-sentence mark, nor is the word "reasons for the decision". On the other hand, the numbering uses a period, which only serves to indicate the numbering, but not the end of a sentence.

If you look at the HTML code for the text just shown, you will find the following:

HTML code of a judgment of the Court of Justice of the European Union (excerpt). Source: https://eur-lex.europa.eu/legal-content/DE/TXT/HTML/?uri=CELEX:62008CJ0345&qid=1693473655909.

The words "judgment" and "reasons for decision" are placed on separate lines by layout instructions. The HTML tag "<p>" produces a paragraph (p = paragraph) and the HTML tag "<h2>" produces a level 2 heading (h = heading). This is at least the widely accepted convention, because the appearance of HTML tags can be freely redefined on any website.
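One way to keep this structural information when extracting raw text is to treat block-level tags as unit boundaries. The following sketch uses Python's standard html.parser module. The set of block tags and the sample HTML are assumptions made for the example and are not taken from the EUR-Lex page.

```python
from html.parser import HTMLParser

# Assumption: these tags are rendered as separate blocks, as convention suggests.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "li", "div", "tr"}

class BlockTextExtractor(HTMLParser):
    """Collect text and start a new unit whenever a block-level tag opens or closes."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: list[str] = []
        self._buffer: list[str] = []

    def _flush(self) -> None:
        # Collapse runs of whitespace (including no-break spaces) into single spaces.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.blocks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()

def extract_blocks(html: str) -> list[str]:
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks

# Hypothetical markup in the style of a judgment
html = "<h2>Judgment</h2><p>1 The request concerns <b>Article 43</b> EC.</p>"
print(extract_blocks(html))  # ['Judgment', '1 The request concerns Article 43 EC.']
```

The heading "Judgment" becomes its own unit even though it has no end-of-sentence mark. The sketch relies entirely on the tag convention, so it breaks as soon as a site restyles its tags through CSS.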

Two small changes to the layout of the HTML page produce the following view, which still has exactly the same source code as just shown. Only the layout instructions (CSS instructions) for the tags "<p>" and "<h2>" have been minimally changed here for demonstration purposes:

Browser view of the same ECJ ruling as before, except that the line breaks for p and h2 have been turned off (CSS instruction float: left). Image was automatically translated.

A human could still work out with little effort which terms and sentences belong where in the sequence. For a computer, however, this is virtually impossible. You would have to simulate a browser and then cut out the text. But then either information is lost because only the raw text is kept, or useless information remains because the markup code you started with is retained.

Interim conclusion:

Extracting raw text from formatted text is a major challenge that has not been satisfactorily resolved. Formatted text is any kind of document that does not exist as raw text. Therefore, it is normal for preprocessing to cause great difficulties with an existing text.

Abbreviations, enumerations and the like

In the example above, an enumeration has already caused a naive algorithm that recognizes the end of a sentence by a period to fail. The statement "1. Here is the 1. list item." would lead to the following three sentences:

  1. "1."
  2. "Here is the 1."
  3. "list item."

Obviously this is nonsense. But it is only obvious to humans. Because we are all spoiled users of computer systems, we often get terribly upset about such machine inadequacies. But that doesn't change the fact that computer programs have these problems.
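The failure is easy to reproduce. The following sketch is one possible version of such a naive algorithm: it splits after every period that is followed by whitespace. The function name is illustrative.

```python
import re

def naive_split(text: str) -> list[str]:
    """Split after every period that is followed by whitespace."""
    parts = re.split(r"(?<=\.)\s+", text.strip())
    return [p for p in parts if p]

print(naive_split("1. Here is the 1. list item."))
# ['1.', 'Here is the 1.', 'list item.']
```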

Such simple cases can be handled, but not reliably.

What about this purely fictitious sentence? "The request of Mr. X. is based on para. 3 of Art. 4 GDPR." In order to interpret the sentence meaningfully with an AI, the abbreviations "para." and "Art." must be known. It should also be possible to recognize "X." as an abbreviation of a name (or a pseudonymization of the name).
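A somewhat less naive splitter could consult a list of known abbreviations and treat single initials and numbering as non-terminal. The sketch below does this for a variant of the fictitious sentence. The abbreviation list is a tiny, hypothetical sample, and the rules are assumptions for illustration rather than a complete solution.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (lowercase)
ABBREVIATIONS = {"mr.", "art.", "para.", "no.", "abs.", "abl.", "slg."}

def split_sentences(text: str) -> list[str]:
    """Split on sentence-final punctuation, skipping abbreviations, initials and numbering."""
    sentences: list[str] = []
    current: list[str] = []
    for token in text.split():
        current.append(token)
        if not token.endswith((".", "?", "!")):
            continue
        is_abbreviation = token.lower() in ABBREVIATIONS
        is_initial = re.fullmatch(r"[A-Za-z]\.", token) is not None  # e.g. "X."
        is_numbering = re.fullmatch(r"\d+\.", token) is not None      # e.g. "1."
        if is_abbreviation or is_initial or is_numbering:
            continue
        sentences.append(" ".join(current))
        current = []
    if current:
        sentences.append(" ".join(current))
    return sentences

text = "The request of Mr. X. is based on para. 3 of Art. 4 GDPR. It was rejected."
print(split_sentences(text))
# ['The request of Mr. X. is based on para. 3 of Art. 4 GDPR.', 'It was rejected.']
```

The sketch still fails in the case mentioned earlier, where an abbreviation or a number stands at the very end of a sentence, because the single period then has to serve both purposes.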

These little problems in the pre-processing of texts before they are fed into an AI model lead to incorrect answers. An example was given at the beginning of the article.

What does this mean for generic AI models like ChatGPT?

Basic text preprocessing can be done quite well by the ChatGPT engine, at least for standard formats and general topics. However, this is not sufficient for legal texts such as EU court rulings. While many people know what the (German) abbreviation "Abs." means, familiarity thins out with "ABl.", because data protection officers who are not lawyers, for example, often lack in-depth knowledge. I myself at least had to look up the meaning of "Slg.". Now my AI system knows it too, so that it can read and process EU court rulings (more on this soon).

General AI language systems therefore inevitably split sentences incorrectly. This may be different in five or umpteen years' time, but it is currently the case. The processing of specific HTML code can also be done better by a specific conventional program than by any general AI.

My self-sufficient, self-developed and data-friendly AI system can understand legal texts better than ChatGPT.

According to my tests with ECJ rulings and legal issues.

General AI systems like ChatGPT do not master domain-specific knowledge particularly well either. Hallucinations are bound to occur. In this context, it should be noted that uploading your own documents to ChatGPT in the paid model significantly increases costs (although only by a small amount per request), because each input document is billed according to its size (in tokens).

Other aspects cannot be discussed in detail here, but they also play a role and increase the problem when using general AI systems. Here are just a few examples:

  • Synonyms;
  • German language (most LLMs are mainly trained in English, Chinese or similar);
  • Contextual knowledge (example: "signatures" at the end of an ECJ judgment is not a semantically relevant element);
  • TF*IDF analyses for preprocessing texts for FAQ systems.

The blind enthusiasm of many will soon be replaced by partial disappointment, even though modern AI systems do amazing things. Even if some current achievements in text comprehension are significantly better than two years ago, they are not sufficiently reliable to be used as a solid basis for professional work.


Specific problems are best solved specifically. Nothing is free. Anyone who believes that an AI can do everything will soon be brought back down to earth. Currently, I am processing around 25,000 EU court rulings in order to analyze them more deeply and make them easily searchable. Numerous special optimizations come into play in the process, which significantly improve data quality. As they say: GIGO (Garbage In – Garbage Out), or even SISO (ask an AI if you can't figure it out yourself; best also ask about "Slg." while you are at it, if you belong to the majority of people who don't know this abbreviation).

The best alternative to ChatGPT

The best alternative to ChatGPT from my point of view, which can achieve more reliable results and above all is data-friendly, looks like this, for example:

  • Selection of a suitable language model that understands German very well.
  • Optimal pre-processing of the given documents using general-purpose libraries that are deployed and configured for the specific task.
  • Preparation of the user's question (prompt), for example to recognize synonymous questions and spelling mistakes.
  • Fine-tuning of the local language model to avoid hallucinations.
  • Intelligent search in the knowledge base to obtain the best results.
  • Combination of intelligent search with a conventional, but likewise intelligent, search.
  • User-friendly and appropriate presentation of results that guides the user without encouraging him or her to stop thinking.
  • Selection of suitable hardware, either in-house or rented from a German provider.

All of these points have been solved, so introducing a solution in your company requires little effort. Economical solutions with high benefit are thus possible. The intelligent search (vector search engine) plus conventional search (n-grams, TF*IDF, Soundex, edit distance, etc.) has already been implemented for this blog and complements the WordPress search for purely pragmatic reasons. WordPress finds no hits for queries with spelling mistakes or for more complex searches such as "What are IP aderses?" (intentionally misspelled), but my search does. The search runs on a very cheap server from a German provider and can be expanded further, e.g. into a question-answer system with abstractive answers. Abstractive means that the answers are formulated in the system's own words rather than given as quotes (that would be extractive).
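How a conventional search can tolerate spelling mistakes may be sketched with character n-grams. The example below compares words by the overlap of their trigrams (Jaccard similarity). It is only an illustration of the general idea, not the search used on this blog, and the vocabulary and function names are invented.

```python
def trigrams(word: str) -> set[str]:
    """Character trigrams of a word, padded so that word boundaries count too."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, between 0 and 1."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

def closest_term(query: str, vocabulary: list[str]) -> str:
    """Return the vocabulary term whose trigrams overlap most with the query."""
    return max(vocabulary, key=lambda term: similarity(query, term))

# Hypothetical index vocabulary
vocabulary = ["addresses", "cookies", "consent", "server"]
print(closest_term("aderses", vocabulary))  # addresses
```

An exact-match search finds nothing for "aderses", whereas the trigram overlap still points to "addresses". In practice, a minimum similarity threshold would be needed so that unrelated queries do not produce arbitrary hits.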

Conclusion

Accuracy can only be achieved through specific optimization for a given application. Artificial intelligence systems are no different from humans in this respect. A specialist can achieve more in his or her area of expertise than an Albert Einstein, who might still achieve very good results in an area he has not yet studied in depth.

An investment at the beginning gives you many degrees of freedom and fulfills your wishes. It pays off after a short time. Quality has its price. Lack of quality has a higher price. Because a bad solution keeps costing a little more money over time than a good one, it is more expensive in the medium term and much more expensive in the long term.

As always, the easiest path is, except for quite obvious activities such as breathing, almost always a mediocre choice and often a bad one. When it comes to reliability, a general chatbot cannot be taken seriously. Specialized systems, on the other hand, can be reliable. A trip to Mars is no longer necessary in order to have such a system. Figuratively speaking, only a short trip within Germany is necessary.

Key messages

Generic AI systems like ChatGPT are not reliable for interpreting legal texts because they can provide inaccurate and contradictory answers even when given similar questions.

Using AI, like ChatGPT, to process legal texts is complex and requires high accuracy because AI can sometimes generate incorrect information.

AI models need to understand the structure of text, including sentences, to process information and answer questions accurately.

General AI models like ChatGPT struggle to understand complex text formats like legal documents because they lack domain-specific knowledge and can't always accurately process formatting and abbreviations.

While ChatGPT and similar AI models are impressive, they are not yet reliable enough for professional use. Building a custom system with specific optimizations often yields better and more trustworthy results.

Investing in a specialized AI system, tailored to a specific task, yields better long-term results compared to a generic, less accurate solution.

About

About the author on dr-dsgvo.de
My name is Klaus Meffert. I have a doctorate in computer science and have been working professionally and practically with information technology for over 30 years. I also work as an expert in IT & data protection. I achieve my results by looking at technology and law. This seems absolutely essential to me when it comes to digital data protection. My company, IT Logic GmbH, also offers consulting and development of optimized and secure AI solutions.
