Bulk Operations
The Bulk Operations module allows you to perform multiple operations simultaneously across one or more Word-files or PDF-files. This means that you can let ClauseBuddy handle tedious editing tasks that would otherwise require many mouse clicks, or even several hours.
Possibilities
The Bulk Operations module seems quite simple, but has many different use cases. To whet your appetite, here are some examples of what you can achieve in less than 20 seconds with ClauseBuddy, that would otherwise take you many minutes:
Bundling eight different DOCX-files together into one single DOCX-file.
Bundling those same eight documents into a single PDF-file, preceded by a nice graphical PDF-cover document prepared by your firm's graphical department.
Creating a table with all 635 paragraphs (across 23 long documents) that contain certain keywords, e.g., the names of the entities in the company group.
Updating the same paragraph with some payment term across twenty different templates.
Making sure that, across a large set of documents that have been reviewed by many different lawyers, all the track changes refer to the same "author" (instead of having each lawyer's own name listed).
Replacing all yellow placeholders across ten different files: the commencement date with 23rd April 2025, the signature location with Paris and the vendor signatory with John LeCarré.
Within fifteen different annexes, accepting all track changes, removing metadata, and making sure that all footers say "Execution Copy" instead of "For discussion purposes only".
User interface
When you start the Bulk Operations module, you'll notice that there are three different steps to be undertaken:
Selecting the document(s) you want to include in the processing.
Selecting the operation(s) you want to include.
Executing the operations.
1. Selecting documents
In the first step, you have to select which documents you want to include in the processing operations. For example, you may want to replace yellow placeholders across twenty documents at the same time.
To add documents, drag them onto the Upload area in the upper-left corner, or click on that area and use the Windows/Mac file selection dialog box to navigate to the relevant files.
If you're using ClauseBuddy inside of MS Word, then by default your currently opened document will be included, as shown in the screenshot above. If you removed this document from the list, you can re-add it by clicking on the button that appears:
When you add PDF-files, ClauseBuddy will immediately convert them into DOCX-files. The reason is that most of the processing operations work on DOCX-files only, even though the end result can usually be exported to a PDF-file.
As discussed below, some processing operations — e.g., append or prepend documents — do allow you to add PDF-files as such, without any conversion taking place.
You can reorder files by using the up & down buttons. The order of the files matters only for some types of processing operations (e.g., when you replace headers & footers in twenty files and download the resulting files as a ZIP, the internal order is irrelevant).
2. Selecting operations
ClauseBuddy will apply zero, one or more operations to all your selected documents.
The operations will be executed sequentially, for all documents, in the order those documents were listed in the first step.
In most cases you will select one, two or three different operations — but in theory the number of operations is unlimited.
If you merely want to bundle all files into one single DOCX-document or PDF-file, then don't add any operations at all. Skip this second step, immediately go to the third step (Execute) and export to a single DOCX-document or PDF-file.
Loading existing operation sets
The easiest way to select certain operations is to choose a pre-existing set that was prepared by yourself or your colleagues.
Simply click on a box with a predefined set to load it into ClauseBuddy. You can then decide to either take that predefined set as-is, or to modify it to suit your needs.
Creating a custom operation set
Alternatively, you can create your own set of operations by clicking on the green Create your own set... button at the top. You will then be taken to the next screen, where you can add, rearrange and customise each of the operations.
You add a new operation by clicking on the green Add operation button, and selecting one of the many options. Click on the question mark icon to get some more information on what the operation does.
Each operation is different. Some have many settings that can be tweaked, others have no settings at all. For example, in the screenshot below, you can see that Extract definitions has no settings to configure, while Remove empty paragraphs has one setting, and Find & replace text has many different settings.
Operations fall into two general categories: some perform extractions (e.g., extract all paragraphs containing certain keywords, or extract the digital fingerprint of the file), while others perform modifications (e.g., changing headers or accepting track changes).
You can remove an operation by clicking on the cross-icon at the right.
You can change the order of operations by clicking on the up & down arrows; this affects the order in which ClauseBuddy executes the operations. The ordering is relevant only for some operations. For example, you may want to first delete certain paragraphs containing product name X, and only then extract all the paragraphs that contain product name Y. Or you may want to first have a Replace placeholders operation and only then a Proofreading operation, because otherwise the leftover placeholders will trigger various proofreading-warnings.
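Why the order matters can be illustrated with a small Python sketch. It is an illustration only, not ClauseBuddy's implementation: a document is assumed to be a plain list of paragraph strings, and `delete_containing`, `extract_containing` and `run_in_order` are hypothetical stand-ins for a delete operation, an extract operation and the sequential execution.

```python
# Hypothetical illustration only: a document is modelled as a list of paragraphs.
from typing import Callable, List, Tuple

Paragraphs = List[str]
# Each operation returns the (possibly modified) document plus any extracted rows.
Operation = Callable[[Paragraphs], Tuple[Paragraphs, List[str]]]


def delete_containing(keyword: str) -> Operation:
    """Modification: drop every paragraph that mentions the keyword."""
    def run(doc: Paragraphs) -> Tuple[Paragraphs, List[str]]:
        return [p for p in doc if keyword not in p], []
    return run


def extract_containing(keyword: str) -> Operation:
    """Extraction: leave the document untouched, collect matching paragraphs."""
    def run(doc: Paragraphs) -> Tuple[Paragraphs, List[str]]:
        return doc, [p for p in doc if keyword in p]
    return run


def run_in_order(doc: Paragraphs, operations: List[Operation]) -> Tuple[Paragraphs, List[str]]:
    """Apply the operations one after another, from top to bottom."""
    extracted: List[str] = []
    for operation in operations:
        doc, rows = operation(doc)
        extracted.extend(rows)
    return doc, extracted


document = [
    "Product X and Product Y are sold together.",
    "Product Y has a two-year warranty.",
]

# Delete first, then extract: the paragraph mentioning both products is already gone.
_, delete_first = run_in_order(
    document, [delete_containing("Product X"), extract_containing("Product Y")]
)
# Extract first, then delete: both paragraphs are still present during extraction.
_, extract_first = run_in_order(
    document, [extract_containing("Product Y"), delete_containing("Product X")]
)

print(len(delete_first), len(extract_first))  # 1 2
```

The same two operations produce one extracted paragraph in the first ordering and two in the second, which is why the up & down arrows can change the result.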
3. Executing the operations
As the final step, you must instruct ClauseBuddy to execute the operations across the selected documents. Depending on the number of documents and the type of processing operations, ClauseBuddy allows you to choose from different export formats:
The Single file options will only be available when you have either selected multiple documents in the first step (because you can then bundle all of them into one DOCX or PDF), or included at least one operation that performs a modification of the document.
The Replace currently opened document option will only be available when you are using ClauseBuddy inside of MS Word. Executing this option will completely replace the content of your currently opened document.
The Separate files (ZIP) option will only be available when you have selected multiple documents in the first step and have also included at least one operation that performs a modification of the document.
The Table export options will only be available when you have included at least one operation that performs an extraction.
All options other than Replace currently opened document are non-destructive, i.e. they don't affect the original files. They merely result in one or more copies of the files or information that you selected in the first step.
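The availability rules above can be summarised in a short Python sketch. This is only one reading of those rules, not ClauseBuddy's actual logic; the `Selection` class, its field names and the `available_exports` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Selection:
    """Hypothetical summary of what was chosen in steps 1 and 2."""
    document_count: int      # number of documents selected in step 1
    has_modification: bool   # at least one operation modifies the document
    has_extraction: bool     # at least one operation extracts information
    inside_ms_word: bool     # ClauseBuddy is running inside MS Word


def available_exports(selection: Selection) -> List[str]:
    """Return the export options that the rules described above would allow."""
    multiple_documents = selection.document_count > 1
    options: List[str] = []
    if multiple_documents or selection.has_modification:
        options.append("Single file")
    if selection.inside_ms_word:
        options.append("Replace currently opened document")
    if multiple_documents and selection.has_modification:
        options.append("Separate files (ZIP)")
    if selection.has_extraction:
        options.append("Table export")
    return options


# Bundling eight files without any operation: only the single-file export applies.
bundle_only = Selection(document_count=8, has_modification=False,
                        has_extraction=False, inside_ms_word=False)
print(available_exports(bundle_only))  # ['Single file']
```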
Available operations
Calculate digital fingerprint
This option is simple to use in practice: it calculates the digital fingerprint of all the DOCX-files you uploaded, as well as any other files (of any type) that you upload in this operation's upload-area.
It then exports all of them to a table in MS Word or Excel.
You can then include these fingerprints (a bunch of seemingly nonsensical numbers & letters) in, for example, a signed contract — e.g. preceded by the following language:
"The Parties agree to incorporate a list of technical schedules as part of this agreement. Due to their size and technical nature, these files are not physically printed and attached. Instead, each Party will hold a copy of those files. The SHA-256 hash of each file is set forth in the following table: ..."
Background explanation
Digital fingerprinting requires some explanation, because it will be unfamiliar territory for most lawyers.
Imagine that you're in the typical situation where you have negotiated a large contract that contains many different schedules, with hundreds or even thousands of pages. For reasons of legal certainty you would be inclined to print all pages and have them signed, but this would be cumbersome and time-consuming; furthermore, some types of documents (e.g., huge Excel-files) are not easily printable. Even e-signing won't work if you have to deal with huge files, a large number of documents, or documents that cannot be easily converted to PDF.
There are of course commercial data stores available where you can store files, but both parties have to trust the provider, the cost must be covered, and for long-running contracts there must be a guarantee that the vendor will stay around for many years.
The easiest way to deal with this, while ensuring that everything remains legally sound and tamper-proof, is to calculate a digital fingerprint of each schedule, and then include all those digital fingerprints in the main contract. That's exactly what ClauseBuddy offers with this operation.
The fingerprint is technically called a "hash", more specifically a SHA-256 hash, which is a secure and commonly used digital fingerprinting technique that every IT-expert will be familiar with. When even a single letter is modified in the file, the hash / digital fingerprint will, for all practical purposes, change completely (strictly speaking this is an overwhelming probability rather than an absolute mathematical guarantee).
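The sensitivity to small changes can be tried out with a few lines of Python, using the SHA-256 function from the standard `hashlib` module. This is a minimal sketch; the `fingerprint` helper and the two sample sentences are made up for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


# Two made-up sentences that differ in a single character ("3" versus "8").
original = b"The Supplier shall deliver within 30 days."
tampered = b"The Supplier shall deliver within 80 days."

original_hash = fingerprint(original)
tampered_hash = fingerprint(tampered)

# Count how many of the 64 hexadecimal characters differ between the two hashes.
differing = sum(a != b for a, b in zip(original_hash, tampered_hash))

print(original_hash)
print(tampered_hash)
print(f"{differing} of {len(original_hash)} characters differ")
```

Running the sketch shows two hashes that have nothing visibly in common, even though the inputs differ by one character.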
All this hashing sounds fancy, but is really basic technology that has been around for many years, even though this probably feels like science-fiction for lawyers. If you want to learn more about this, check out videos such as https://lnkd.in/edeaPRhk or introduction texts like https://lnkd.in/e-aKb8Uq
Both parties can then print/e-sign just the main body of the contract containing the hash, and store the files on their own systems. Due to the technical guarantees of the hashing algorithm, each party can rest assured that the other party cannot secretly change something in the files. If you suspect this behaviour from your counterparty, you can calculate the hash of the disputed file again (using ClauseBuddy or any of the many technical tools available to IT-experts): if the file was changed, a different hash will be generated.
The end-result is a mathematically sound, low-tech solution to the contract signature process.
See our LinkedIn post that contains a video of how to calculate a hash in Windows without ClauseBuddy. The gist of it is that you run the following commands:
In Windows, using the command prompt: certutil -hashfile filename.docx SHA256
On a Mac or Linux machine, using the terminal: shasum -a 256 filename.docx
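For readers who prefer a script over the command line, the same SHA-256 calculation can be sketched in Python with the standard `hashlib` module. The helper names `sha256_of_file` and `fingerprint_table` and the sample file name are hypothetical; the sketch is not how ClauseBuddy itself computes the table.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks, so that huge schedules do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint_table(folder):
    """Return (file name, SHA-256 hash) pairs for every file in a folder."""
    return [
        (path.name, sha256_of_file(path))
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    ]


# Demonstration on a temporary folder holding one small made-up file.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "schedule-1.txt").write_bytes(b"abc")
    table = fingerprint_table(folder)

for name, file_hash in table:
    print(name, file_hash)
```

The hash printed for a given file should be identical to what certutil or shasum report for that same file, because all of them implement the same SHA-256 algorithm.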
Extract text
This operation allows you to find one or more specific keywords and then extract the surrounding text, across entire documents. You can, for example, get an overview table of:
all paragraphs in a witness statement where a certain witness' name is mentioned
all clauses that mention "force majeure", across a bundle of documents received from a client
all fragments in an uploaded PDF-file that mention a certain legal entity, exported into an Excel file
You can add one or more "targets" (i.e. keywords or literal phrases that you want to search on), by clicking on the green plus-button. For example, when you want to search for paragraphs containing either liability, or damage, or compensation, you would use the following settings:
This would for example result in the following table in MS Word
When you check Case sensitive, ClauseBuddy will only take into account paragraphs that have exactly the same capitalisation in their text — e.g., when searching for Liability with an initial capital, words like liability in lowercase will not be found.
When you check Find whole words only, ClauseBuddy will ignore matches on part of a word. For example, when you search for confidential and this option is not checked, then words like confidentiality will also be found, which may not always be what you want.
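The effect of the Case sensitive and Find whole words only checkboxes can be approximated with regular expressions. The Python sketch below is an assumption about the general technique, not ClauseBuddy's actual matching code; `build_pattern` and `matching_paragraphs` are hypothetical names.

```python
import re


def build_pattern(target: str, case_sensitive: bool, whole_words: bool) -> re.Pattern:
    """Turn a literal target into a regular expression honouring both options."""
    pattern = re.escape(target)  # treat the target as literal text
    if whole_words:
        pattern = r"\b" + pattern + r"\b"  # require word boundaries on both sides
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(pattern, flags)


def matching_paragraphs(paragraphs, targets, case_sensitive=False, whole_words=False):
    """Return the paragraphs in which at least one target is found."""
    patterns = [build_pattern(t, case_sensitive, whole_words) for t in targets]
    return [p for p in paragraphs if any(pat.search(p) for pat in patterns)]


paragraphs = [
    "Confidentiality survives termination.",
    "All confidential information must be returned.",
    "Liability is capped at the fees paid.",
]

# Without "whole words only", "confidential" also matches "Confidentiality".
print(len(matching_paragraphs(paragraphs, ["confidential"])))                    # 2
# With "whole words only", just the second paragraph remains.
print(len(matching_paragraphs(paragraphs, ["confidential"], whole_words=True)))  # 1
# With "case sensitive", lowercase "liability" no longer matches "Liability".
print(len(matching_paragraphs(paragraphs, ["liability"], case_sensitive=True)))  # 0
```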
The options at the bottom allow you to configure what exactly is being extracted:
Extract paragraph will extract each paragraph in which one or more matching targets are found.
Extract clause will extract the entire clause or subsection (i.e. a title and one or more paragraphs) from the document. This will result in much longer fragments than Extract paragraph.
Extract fragment will extract a few hundred characters surrounding the target text.
The Extract fragment option is particularly useful if you're dealing with files that have a bad layout, e.g. with paragraphs that have unnecessary line-endings. This is often the case with converted PDF-files. For example, the following document (converted from a PDF-file) looks OK when you're opening it...
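When the text you search for (e.g., "obligations hereunder") is split over two paragraphs by such an unnecessary line-ending, ClauseBuddy will fail to extract any text with Extract paragraph or Extract clause, because these options work on a paragraph-by-paragraph basis and will therefore never see "obligations" and "hereunder" next to each other.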
Conversely, Extract fragment can see "across" paragraph endings; the downside is that its output does not nicely start at, or end at, paragraph boundaries.
Proofreading
This operation performs a proofreading of the submitted DOCX-files.
It is the equivalent of manually opening each of your DOCX-files in MS Word, opening ClauseBuddy and going to the Proofreading module of ClauseBuddy in order to see whether there are missing definitions, irrelevant definitions, hardcoded numbers, etc.
The output is a table in MS Word or Excel that is roughly equivalent to what you would see inside the Proofreading module, except for the obvious interactivity, i.e. the ability to click on an issue in ClauseBuddy and then go to the corresponding problematic paragraph.
The target audience of this operation obviously consists of transaction-oriented lawyers who have to frequently perform proofreadings across many different files.
Remove metadata
This operation primarily removes all the "metadata" from a DOCX file.
This operation is the equivalent of what you can manually achieve within MS Word by clicking several layers deep. Its main benefits are that it saves you some clicks (particularly if you create a predefined set of operations for your colleagues), and above all that you can perform this in bulk across many different files.
You can also enable the Remove personal information on save option. This causes MS Word to automatically remove any personal information from the DOCX-file upon each save.
Background explanation
"Metadata" consists of several types of information stored within a DOCX-file, such as the author, manager, company, template being used, storage location. In fact, you can even store any amount of custom data in a file.
Because metadata is hidden fairly deeply inside a DOCX-file, companies can inadvertently expose (semi)confidential information to outside parties. Unremoved metadata can also lead to embarrassing situations, e.g. when it is revealed that prestigious law firm X actually used a template from competitor Y...
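To get a feeling for what is stored, note that a DOCX-file is technically a ZIP archive in which properties such as the author are kept in small XML parts (docProps/core.xml and docProps/app.xml). The Python sketch below lists those properties using only the standard library; `read_docx_metadata` is a hypothetical helper, the sample package is made up, and the sketch only reads metadata rather than removing it as ClauseBuddy does.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# The two standard property parts; custom properties live in a separate part
# (docProps/custom.xml) that this simple sketch does not read.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_docx_metadata(docx):
    """Return {part name: {property name: value}} for a DOCX path or file object."""
    found = {}
    with zipfile.ZipFile(docx) as package:
        names = set(package.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            properties = {}
            for element in root:
                name = element.tag.split("}")[-1]  # strip the XML namespace
                if element.text and element.text.strip():
                    properties[name] = element.text.strip()
            found[part] = properties
    return found


# Made-up, minimal package holding only a core properties part, for demonstration.
CORE_XML = (
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:creator>Jane Doe</dc:creator>"
    "<cp:lastModifiedBy>John Smith</cp:lastModifiedBy>"
    "</cp:coreProperties>"
)
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as package:
    package.writestr("docProps/core.xml", CORE_XML)
sample.seek(0)

print(read_docx_metadata(sample))
```

Applied to a real DOCX-file, the same function would typically also show entries such as Company, Manager and Template from docProps/app.xml, if they are present in the file.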
To see the metadata in MS Word for Windows, you have to go to the File tab in the ribbon, subsection Info.
When you click on Advanced Properties in the top-right corner, you can see all information:
To remove the metadata manually, you have to click on the Inspect Document option in the Info panel of MS Word for Windows.
In the dialog box that appears, you must then click on Inspect at the bottom, and then finally click on Remove all.
The Remove personal information on save option can be found in a completely different location: under File > Options, Trust Center > Trust Center Settings, then Privacy Options:
Because all of this is buried very deeply, it's self-evident why most legal professionals are not aware of the metadata issue, and why there exists a cottage-industry of dedicated software applications that remove metadata automatically (e.g., when you're sending out emails).
ClauseBuddy can offer an interesting alternative here, that goes beyond these applications by allowing you to perform this in bulk.