Searching clauses by similarity

Updated by Maarten Truyens

Searching by similarity is the coolest and the quickest way to search. It works as follows:

  • You select some text in your MS Word document — presumably a clause or subclause for which you want to find an alternative.
  • You select the right clause library and/or search folder to search in. 
  • You click on the blue Find similar clauses button. 
  • Optional: if multiple candidate folders with clauses are found, you can switch between them by clicking on them. One or more purple boxes indicate each folder's relative degree of similarity.

Understanding the magic

When performing a similarity search, ClauseBuddy looks for folders that contain clauses that are reasonably close to the selected clause. It performs this search in two steps:

  • In a first step, it will perform a word-by-word search — essentially putting all the unique words of the selected clause in a word-box, and then checking whether there are clauses that have a similar word-box. 
  • If the first step does not deliver any results, an Artificial Intelligence (AI) search will be performed. Essentially, ClauseBuddy will deconstruct your clause into its "raw ingredients", and check whether those ingredients are similar to one of the previously constructed piles of ingredients. With sufficient amounts of raw ingredients, the AI-engine may then be able to automatically infer that a clause talking about risk allocation may actually be close to a clause talking about a liability cap or an exposure limitation. This is the magic of AI: given enough raw ingredients, excellent results may turn up. (Or not, you never know.)
    Note that those raw ingredients are not drawn only from your own clauses, but may also include raw ingredients found elsewhere. This is why it is so important to assign a correct taxonomy label to your folders where possible: the similarity search can then "learn" how clauses are related, and becomes much smarter for clauses in properly tagged folders. 
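
To make the two steps more concrete, here is a minimal Python sketch. It is not ClauseBuddy's actual implementation, which is not documented here: the function names (word_box, jaccard, find_similar), the 0.3 threshold and the use of Jaccard overlap as the measure of "similar word-boxes" are all assumptions made for illustration, and the AI step is only represented by an optional fallback function that the caller supplies.

```python
from __future__ import annotations

import re
from typing import Callable, Optional

Library = dict[str, list[str]]          # folder name -> clauses in that folder
Results = list[tuple[str, float]]       # (folder name, similarity score)


def word_box(text: str) -> frozenset[str]:
    """Collect the unique lowercase words of a clause (its 'word-box')."""
    return frozenset(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Fraction of words shared by two word-boxes, between 0.0 and 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(
    selected: str,
    library: Library,
    threshold: float = 0.3,  # hypothetical cut-off, chosen for this example
    ai_search: Optional[Callable[[str, Library], Results]] = None,
) -> Results:
    """Step 1: word-by-word search. Step 2: fall back to ai_search if step 1 finds nothing."""
    box = word_box(selected)
    scores: dict[str, float] = {}
    for folder, clauses in library.items():
        # A folder scores as well as its best-matching clause.
        best = max((jaccard(box, word_box(c)) for c in clauses), default=0.0)
        if best >= threshold:
            scores[folder] = best
    if scores:
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if ai_search is not None:
        return ai_search(selected, library)  # placeholder for the AI step
    return []


library = {
    "Liability": ["The liability of the Supplier is limited to the fees paid."],
    "Confidentiality": ["Each party shall keep the other party's information confidential."],
}
results = find_similar("The liability of the Seller is limited to the price paid.", library)
print(results)  # only the "Liability" folder passes the threshold
```

In this sketch the fallback is only called when the word-by-word step returns nothing, mirroring the order described above. The ranked scores play a role comparable to the purple boxes: they say how close each folder is relative to the others, not whether a clause is legally equivalent.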


Similarity searches are cool and can very quickly lead you to appropriate alternative clauses, but they do come with quite a few caveats.

  • Obviously, you need a sample clause to start from. If your Microsoft Word document does not yet contain a clause of the kind you want to insert, the similarity search is useless. 
  • If the word-by-word search does not lead to any results, then the AI engine will try to perform its magic. However, you should be aware that AI needs hundreds, and preferably thousands, of sample clauses that it can break down into their constituent raw ingredients in order to "learn" what a typical clause of that kind is made of. Depending on your language, legal domain and jurisdiction, that many clauses may simply not exist. 
  • The AI engine does not "understand" what is written in your clauses: it only statistically correlates certain words and expressions. Accordingly, the same search may return both excellent and completely wrong results. 
  • Even when thousands of sample clauses of the same kind exist, you should be aware that AI can only learn from what is explicitly written in a clause. For many kinds of searches this is not a problem, but every legal professional knows that what is not written in a clause may be as important as, or even more important than, what is explicitly written. This is particularly true in continental European jurisdictions, where statutory provisions often act as "fallbacks" or standards that do not need to be written down in a contract. Accordingly, a tersely written clause of two lines may actually have the same legal effect as a clause of 25 lines. 
For example, unlike under UK or US law, most continental European jurisdictions do not formally require a contract to include a force majeure or good faith clause in order for the standard provisions on force majeure and good faith performance to apply. While there is a clear tendency in those jurisdictions to include such clauses anyway (e.g., in order to be exhaustive, or to deviate from the standard provisions of the Civil Code), this is not required per se. 
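
The last caveat can be illustrated with a small Python sketch. The two sample clauses and the helper functions below are invented for this example, and Jaccard word overlap is only an assumed stand-in for a word-by-word comparison, not ClauseBuddy's documented method. The point is that a short clause relying on statutory defaults and a long clause spelling everything out may share almost no words, even if a lawyer would consider them to cover similar ground.

```python
import re


def word_box(text: str) -> frozenset:
    """Collect the unique lowercase words of a clause."""
    return frozenset(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Fraction of words shared by two word-boxes, between 0.0 and 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical short clause that leans on the statutory force majeure rules.
short_clause = "Neither party is liable in case of force majeure."

# Hypothetical long clause that spells the same idea out in full.
long_clause = (
    "A party shall not be held responsible for any delay or failure to perform "
    "its obligations where such delay or failure results from events beyond its "
    "reasonable control, including natural disasters, war, strikes and "
    "governmental measures."
)

score = jaccard(word_box(short_clause), word_box(long_clause))
print(f"word overlap: {score:.3f}")  # a very low score despite the related subject
```

A purely textual comparison scores these two clauses as nearly unrelated, because everything the short clause leaves to the law is simply not there to be compared.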

When to use

Similarity searches work best to find alternatives for your own standard clauses. 

For example, if your company always submits the same sales agreement to a customer, and the customer complains about a certain clause, you can simply select that clause, hit the Find similar clauses button, and you can be sure that you will be immediately presented with the folder where that clause (and its alternatives) is located. 
