Mastering contract efficiency: Reusing clauses and templates for document generation


The maturity model for contracts and document generation looks like this:

  1. Copy/paste an MS Word document and use find-and-replace to update the contract details (aka deal-points).
  2. Manage templates as separate artefacts, and use a software platform to automatically replace the template variables with values supplied by contract managers via web forms or read via API from external systems.
  3. Introduce conditional logic and calculations into the templates, which necessitates a type system for the variables.
  4. To tame the combinatorial explosion in the number of templates, modularise the templates using a set of included micro-templates, often managed as a clause library, and dynamically assemble templates based on incoming data.

Stage (4) is interesting because it poses some new challenges related to template governance, naming, semantics and modularity. These challenges are similar to ones that software developers have wrestled with since at least 1958 and the development of Algol. No, I’m not that old! 🙂

In this article I will dig deeper into these challenges, and potential solutions.

What Are Templates?

First, a little background on templates.

A template in its most basic form is natural language text with embedded variables. Here is an example micro-template (aka “clause template”), expressed using Accord Project TemplateMark syntax.

## Period of Insurance

From: {{from as "DD MMMM YYYY"}}  
To:	{{to as "DD MMMM YYYY"}}  
both dates inclusive, local standard time at the address of the Insured as stated above.

The data model for this template declares that the template references two variables “from” and “to” and that they are both of type DateTime.

concept InsurancePeriod {
    o DateTime from
    o DateTime to
}

So, templates bind a locale-neutral data model (in this case an InsurancePeriod concept, with a “from” and a “to” DateTime) to natural language (in this case English). We should not confuse the concept of an InsurancePeriod with how it has been bound to an English paragraph. In some cases we may need to also bind the InsurancePeriod to other natural languages, such as French — perhaps because we are operating in Canada and need to express this concept in two natural languages.

Template models are locale-independent and describe the shape of data (a schema), while template natural language references a template model and may (optionally) use any of the variables defined by the template model. The semantics of the template are carried primarily by the template model, while the natural language of the template expresses the syntax of the template and verbalises the concept for humans.
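To make the separation between model and language concrete, here is a minimal TypeScript sketch that declares the InsurancePeriod model once and binds it to two natural languages. The function names (formatDate, renderEnglish, renderFrench), the French wording, the hand-written month tables and the use of UTC dates are all assumptions made for illustration; none of this is TemplateMark or Accord Project tooling.

```typescript
// Locale-neutral model: mirrors the InsurancePeriod concept above.
interface InsurancePeriod {
  from: Date;
  to: Date;
}

// Hypothetical month tables, standing in for a real localisation library.
const MONTHS_EN: string[] = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];
const MONTHS_FR: string[] = [
  "janvier", "février", "mars", "avril", "mai", "juin",
  "juillet", "août", "septembre", "octobre", "novembre", "décembre",
];

// Formats a date as "DD MMMM YYYY". UTC fields are used (an assumption)
// so the output does not depend on the machine's time zone.
function formatDate(d: Date, months: string[]): string {
  const dayNumber = d.getUTCDate();
  const day = dayNumber < 10 ? "0" + dayNumber : String(dayNumber);
  return day + " " + months[d.getUTCMonth()] + " " + d.getUTCFullYear();
}

// English binding of the model.
function renderEnglish(p: InsurancePeriod): string {
  return "From: " + formatDate(p.from, MONTHS_EN) + "\nTo: " + formatDate(p.to, MONTHS_EN);
}

// French binding of the same model (illustrative wording only).
function renderFrench(p: InsurancePeriod): string {
  return "Du : " + formatDate(p.from, MONTHS_FR) + "\nAu : " + formatDate(p.to, MONTHS_FR);
}

const period: InsurancePeriod = {
  from: new Date(Date.UTC(2025, 0, 1)),
  to: new Date(Date.UTC(2025, 11, 31)),
};

console.log(renderEnglish(period));
console.log(renderFrench(period));
```

Both renderers consume the same InsurancePeriod value; only the verbalisation differs.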

How to Modularise Templates?

The Period of Insurance template shown above is clearly designed to be used as part of a larger document. It is not a document template itself.

The figure below shows three paragraphs in a document, with the first and the last providing boilerplate text (text without variables) while the middle paragraph is the Period of Insurance template, along with its two variables.

Period of Insurance document template with two variables

So far, so good! As long as the user supplies the values for “to” and “from”, the document can be generated from the template.

A more realistic example, however, would be one where several clause templates share variables. In the example below, the “from” variable is referenced in paragraphs 1 and 2, while the “to” variable is referenced in paragraphs 2 and 3.

Document template with shared variables

This now poses some interesting challenges:

  1. How do we know that the “from” in paragraph (1) is semantically the same as the “from” in paragraph (2)? It may be obvious if both clause templates (1) and (2) are created by the same person at the same time, but what if they are not?
  2. If we assume that things that are named the same ARE the same, then we are making a very strong statement about syntax defining semantics.
  3. What happens as we scale the system: dozens of clause template authors, creating hundreds or thousands of clause templates, potentially spread around the world?
  4. How do clause authors document and discover the semantics of the “from” variable, so that everyone creating a clause template is using it consistently?
  5. How do we enable a clause template author to not have to understand all the details of all the documents within which the clause template may appear?
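The first two challenges can be illustrated with a deliberately naive TypeScript sketch. It assumes a simplified `{{name}}` placeholder syntax, a hypothetical fillFromGlobals helper and two invented clauses written by different authors; it is not how any particular platform works.

```typescript
// Naive approach: one global table of values, matched to placeholders by name only.
function fillFromGlobals(text: string, globals: { [name: string]: string }): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, name: string): string => {
    const value = globals[name];
    // Unknown names are left untouched.
    return value === undefined ? match : value;
  });
}

// Author 1 uses "from" to mean the start date of cover.
const periodClause = "Cover runs from {{from}} to {{to}}.";

// Author 2 (hypothetical) uses "from" to mean the sender's address.
const noticeClause = "Notices must be sent from {{from}}.";

const globals = { from: "01 January 2025", to: "31 December 2025" };

console.log(fillFromGlobals(periodClause, globals));
console.log(fillFromGlobals(noticeClause, globals));
```

The second clause is filled in without any error, yet it now says that notices must be sent from a date. Nothing in the names alone reveals that the two authors meant different things.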

Lessons from Computer Science

Most computer programmers are (at least subconsciously!) familiar with these problems, as they are the basis for how we create modern software. The maturity model for software over the past 80 years is:

  1. Write a set of instructions for the computer.
  2. Externalise the data from the computer code, introducing variables into the code, and loading the data from punch cards, disk or memory.
  3. Modularise the code into procedures, subroutines or functions, calling them from each other, an entry point, or main procedure as necessary.
  4. Develop processes to ensure that subroutines can be reliably developed, versioned and published, independent of the programs that use them.

The key insight comes from mathematics and the lambda calculus: subroutines are mathematical functions that receive inputs and produce outputs.

The mathematical operator that applies a function to a set of inputs doesn’t know or care what the names of the inputs are outside of its scope.

Today, rather than creating programs that reference a fixed set of global variables, computer programmers define a set of functions that transform data, irrespective of where the data is coming from or how it is named. The semantic binding between the data and a function occurs WHEN THE FUNCTION IS CALLED, not when the global variables are defined, or even when the function is defined.

An example will, I hope, make this concrete.

// Returns the sum of x and y.
function Add(x: number, y: number): number {
   return x + y;
}

The semantics of the Add function are documented (by the function author) and clear: it adds its two numeric arguments together and returns their sum. WITHIN the Add function these arguments are referred to as x and y. The function author has created a useful function and does not need to be concerned with who is calling the function.

Somewhere else (in space and time) a programmer (who is not necessarily the function author) decides that the Add function would be useful in their program (because they read the documentation and liked the semantics and trusted the implementation). They instantiate two variables A and B and then call the function, assigning the result to variable C.

The caller of the function passed variables A and B to the function (not x and y). They do not need to know how the function is implemented, or how the variables are referred to internally.
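The call described above might look like the following TypeScript sketch. The Add function is repeated so the snippet runs on its own, and the values 2 and 3 are arbitrary.

```typescript
// The function as published by its author; x and y exist only inside it.
function Add(x: number, y: number): number {
  return x + y;
}

// The caller's own variables, named however the caller likes.
const A = 2;
const B = 3;

// The binding happens here, at the call site: A is bound to x, B to y.
const C = Add(A, B);

console.log(C);
```

Renaming x and y inside Add would not affect this caller, and renaming A and B would not affect Add.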

Back to Templates

Templates are functions that take data as input and return natural language text. Templates should be CALLED, just as functions are called in our example above. It is only the caller of the function that can map the data they have at hand to the data that is required by the function they would like to call. The onus is on the CALLER of the template to understand when to use the template and its data requirements, not vice versa. Once template authors have to be aware of WHERE their templates might be used, modularity and organisational scalability break down. A shared list of global variables and their semantics has the same effect.

In the figure below, the document template author has introduced two document template variables: startDate and endDate.

Document template with StartDate and EndDate variables

The author of the document template would like to use Clause Templates (1), (2) and (3). They understand that the semantics of the startDate variable correspond to those of the “from” variable in the templates, and that the semantics of the endDate variable correspond to those of the “to” variable. To use the clause templates, they must define the binding between the variables they have in their scope and the variables required by the clause templates they would like to use/call.
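One way to picture this binding is the TypeScript sketch below, in which the clause template is a function and the document template calls it. All names (PolicyDocumentData, periodOfInsuranceClause, policyDocument, isoDay), the boilerplate sentences and the ISO date format are assumptions for illustration, not part of any real template library.

```typescript
// Data required by the clause template: its "parameters".
interface InsurancePeriod {
  from: Date;
  to: Date;
}

// Data the document template author has in scope.
interface PolicyDocumentData {
  startDate: Date;
  endDate: Date;
}

// Helper: formats a date as YYYY-MM-DD in UTC.
function isoDay(d: Date): string {
  return d.toISOString().slice(0, 10);
}

// Hypothetical clause template as a function: data in, text out.
// Its author knows nothing about startDate or endDate.
function periodOfInsuranceClause(p: InsurancePeriod): string {
  return "From: " + isoDay(p.from) + " To: " + isoDay(p.to) + ", both dates inclusive.";
}

// The document template calls the clause and states the binding explicitly:
// startDate is bound to "from", endDate is bound to "to".
function policyDocument(doc: PolicyDocumentData): string {
  const clause = periodOfInsuranceClause({ from: doc.startDate, to: doc.endDate });
  return [
    "This policy is issued to the Insured.", // invented boilerplate
    clause,
    "Signed on behalf of the Insurer.", // invented boilerplate
  ].join("\n\n");
}

console.log(policyDocument({
  startDate: new Date(Date.UTC(2025, 0, 1)),
  endDate: new Date(Date.UTC(2025, 11, 31)),
}));
```

The only place where startDate meets “from” is the call inside policyDocument, which is exactly where the document author's knowledge of both sides lives.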

One can imagine this binding being more-or-less automatic, but we should never fall into the trap of assuming that variables that are NAMED THE SAME, ARE THE SAME. 

Tables have legs. People have legs. Sports events have legs. Not the same thing! 🙂

 

Author: Dan Selman, Distinguished Engineer, Smart Agreements