Adding & removing documents

Updated by Maarten Truyens

You can upload documents to a locker, or remove documents from it, by clicking on the small triangle at the right of the locker selector and selecting the Manage... option.

You will then see the following dialog box, containing a list of all your lockers on the left side. First select the locker in which you want to upload or delete documents. For example, in the screenshot below, locker zzz was selected.

Enriching documents before upload

Before you actually upload a document, you may want to enrich it with additional data. The reason is that very little information can actually be extracted from a DOCX/DOC/PDF file, except for the filename and the date of the file.

Even though many files technically mention an author, that author is quite likely to be incorrect. In practice, many Word files contain the name of the initial author who created the document at some point in time. Subsequent edits by other persons do not cause the author's name to be updated in the Word file, so many legal documents contain embarrassing information, such as the name of a lawyer at another law firm whose document was reused.

In addition, many law firms scrub enrichment data from a DOC/DOCX file before transmission by email. This scrubbing is yet another reason why you may want to manually complete the author's name.

You can enrich the information regarding a document by clicking on the triangle next to Additional data to enrich uploaded files..., which reveals a table with additional fields:

When you complete these boxes (e.g., the category, client or dossier), that data will be associated with each document you upload until you clear the boxes.

Uploading files

You can upload one or more files by either dragging them over the Upload documents area, or by clicking on that area and selecting relevant files in the dialog box that appears.

This may sound trivial, but there are actually quite a few details to mention here:

  • You can only upload .DOC, .DOCX, .PDF and .RTF files. (In case you are wondering: .RTF is an editable document format roughly similar to .DOC and .DOCX, although it is no longer widely used.) PowerPoint, Excel, and other file types are not supported.
  • While you can drag/select hundreds of files simultaneously, only powerful computers can easily handle those files. So if your computer is already a few years old, you may want to limit yourself to about 100 files per upload.
  • Special considerations for PDF-files:
    • Only the readable text parts of PDF-files are supported. If a PDF contains scanned text, you will first need to convert that scan into editable text. Recent versions of Microsoft Word do this automatically, but if you want to do this in batch, you may want to use specialised software or services (e.g., Adobe Acrobat, or the Adobe Acrobat Export PDF online service).
    • Be aware that the layout of PDF-files will be minimal, and may — depending on the file — become somewhat chaotic. This is the nature of the standard PDF-conversions performed by ClauseBuddy; you may need to convert PDF to .DOC/.DOCX first, by using specialised software (such as Adobe Acrobat).
  • If you are using lockers on a private server, you should be aware that the "zero knowledge" confidentiality guarantees of ClauseBuddy only apply to .DOCX files, if uploaded from within MS Word.
    • While .DOCX files can be entirely handled by your copy of Microsoft Word in which ClauseBuddy is running, this is unfortunately not the case for .DOC, .RTF and .PDF-files. To process those files, they first need to be converted to .DOCX by an external service running on the ClauseBase server.
    • Similarly, when you are using ClauseBuddy in a browser (outside Microsoft Word), all uploaded files, even .DOCX files, need to be optimised and/or converted by an external service running on the ClauseBase server.
  • When .DOCX files are sent to private servers, your local copy of Microsoft Word will open them, optimise them, and then directly send them to the private server without involving any other software. (Conversely, in all other situations, the original file is sent to ClauseBase's servers, which handle the optimisation and/or conversion for you.)
    This involvement of Microsoft Word ensures that no information leaks anywhere, but you should be aware of a few caveats:
    • This may take one to five seconds per file, depending on the speed of your computer and the complexity of the .DOCX-file.
    • While you can have this running in the background (assuming your PC is sufficiently powerful), you should inspect the upload process from time to time, because Microsoft Word may ask for your input (e.g., to open a password-protected file, or to install language libraries when you upload files in a less common language).
    • It is advisable to restart Microsoft Word after uploading hundreds of files. From our internal tests, it seems that Word gets slower and slower after opening that many files. You may also encounter this problem with Word in day-to-day use, but the behaviour is exacerbated by forcing Word to upload hundreds of files.
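
The rules above can be summarised in a short Python sketch. This is purely illustrative and is not part of ClauseBuddy: the function names, the parameters `inside_word` and `private_server`, and the constant names are hypothetical, and the batch size of 100 is only the guideline mentioned above for older computers.

```python
from pathlib import PurePath
from typing import Iterable, Iterator

# File types accepted for upload, per the list above.
SUPPORTED_EXTENSIONS = {".doc", ".docx", ".pdf", ".rtf"}

# Suggested batch size for older computers (a guideline, not a hard limit).
MAX_FILES_PER_BATCH = 100


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the accepted types."""
    return PurePath(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def batches(filenames: Iterable[str],
            size: int = MAX_FILES_PER_BATCH) -> Iterator[list[str]]:
    """Yield the supported files in groups of at most `size` files."""
    supported = [name for name in filenames if is_supported(name)]
    for start in range(0, len(supported), size):
        yield supported[start:start + size]


def processed_locally(filename: str, inside_word: bool,
                      private_server: bool) -> bool:
    """Return True if the file is handled entirely by the local copy of Word.

    Per the text, this holds only for .DOCX files uploaded from within
    Microsoft Word to a locker on a private server. In all other situations
    the original file is sent to ClauseBase's servers.
    """
    is_docx = PurePath(filename).suffix.lower() == ".docx"
    return is_docx and inside_word and private_server
```

For example, `list(batches(my_files))` would split a large selection into uploads of at most 100 supported files each, and `processed_locally("contract.pdf", inside_word=True, private_server=True)` returns `False`, because a PDF must first be converted on the ClauseBase server.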

Inspecting files

While the contents of uploaded files should be primarily inspected by searching in them through Clause Hunt's search interface, you may from time to time want to inspect specific files. You can do so by clicking on the Inventory tab:

In this panel, you get an overview of all the files uploaded to the selected locker.

  • Click on the eye-icon to get a quick preview of the file.
  • Shift-click on the trash-can icon to permanently remove the file from the locker. (You will get a warning when you forget to hold down Shift.)

