Solving Contract Analytics Using Machine Learning

Computers find contract analytics and extraction hard. The task isn’t as straightforward as recognizing that a legal provision is present. It’s about both identification and localization. For a human reading through a page, these two might seem the same, but for a machine, detecting which of the many possible combinations of words best fits a provision type is much more difficult than answering a simple yes-or-no question. Image recognition technology has a similar challenge. It’s harder to single out a cat in an image depicting many animals than to correctly distinguish a cat from a dog.
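To make the difference between identification and localization concrete, here is a minimal Python sketch. It is purely illustrative and is not how DocuSign Insight works: the cue-word list, the function names and the sample contract are all assumptions invented for this example, and a real system would learn its scoring from data rather than count keywords.

```python
import re

# Hypothetical cue words for one provision type (illustrative assumption).
TERMINATION_CUES = {"terminate", "termination", "notice"}

def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def score(text):
    """Count how many words in the text are cue words."""
    return sum(1 for t in tokens(text) if t in TERMINATION_CUES)

def contains_provision(document):
    """Identification: one yes/no answer for the whole document."""
    return score(document) > 0

def locate_provision(document):
    """Localization: character offsets (start, end) of the best-scoring
    sentence, or None if no sentence contains a cue word."""
    best_span, best_score = None, 0
    for match in re.finditer(r"[^.]+\.", document):
        s = score(match.group())
        if s > best_score:
            best_span, best_score = (match.start(), match.end()), s
    return best_span

# Made-up sample text, not taken from any real contract.
contract = (
    "This Agreement is governed by the laws of Ontario. "
    "Either party may terminate this Agreement on thirty days written notice. "
    "Fees are payable within sixty days of invoice."
)

print(contains_provision(contract))
span = locate_provision(contract)
print(contract[span[0]:span[1]].strip())
```

The yes-or-no check only has to find a single cue anywhere in the text, while the localization step has to compare every candidate span and commit to one of them, which is where most of the difficulty lies.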

To make matters worse, the main function of a solution like DocuSign Insight is to perform these extractions on data it has never seen before. The machine is expected to deal with novelty and uncertainty as a matter of course. These predictive abilities may come naturally and unconsciously to an expert lawyer but are far trickier to attain in a logical computer. What’s more, legal language itself is continually evolving, making contract analysis feel a lot like trying to hit a constantly moving target.

Machine learning and natural language processing

Machine learning provides very effective tools to deal with these challenges by mimicking the human approach to recognition and applying it to finding contract provisions. Just like a toddler who learns what a cat looks like from real-life examples vs. being told to look out for fur, whiskers or tails, the inner workings of machine learning are automatically deduced from examples, not manually designed. Over time, more examples don’t add complexity but generally improve accuracy, which makes these methods flexible and powerful. These machine learning algorithms also make use of the underlying uncertainties that are inherent to predictive extractions and are an exceptional match for the dynamic and elaborate task at the heart of contract analytics.
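As a rough illustration of learning from examples rather than hand-written rules, the sketch below trains a tiny naive Bayes text classifier in plain Python. This is a stand-in technique chosen for brevity, not the algorithm used in DocuSign Insight, and the labels and training sentences are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict

def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Learn per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        logp = math.log(label_counts[label] / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokens(text):
            if w not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            logp += math.log((word_counts[label][w] + 1) / denom)
        scores[label] = logp
    return max(scores, key=scores.get)

# Invented training examples (illustrative assumption).
examples = [
    ("Either party may terminate this agreement upon written notice", "termination"),
    ("This agreement terminates automatically upon breach", "termination"),
    ("Fees are payable within thirty days of invoice", "payment"),
    ("The customer shall pay all invoices in full", "payment"),
]

model = train(examples)
print(predict(model, "The supplier may terminate upon notice"))
print(predict(model, "Customer shall pay the invoice"))
```

Nothing in the code says what a termination or payment clause looks like. The behavior comes entirely from the labeled examples, so adding more examples changes the counts without changing the program.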

Natural language processing (generally defined as “understanding spoken and written language”) is a fast-paced field at a multidisciplinary crossroads, and it relies heavily on machine learning. A cohesive team with varied backgrounds is also pivotal to exploiting all the aspects of language processing.

At DocuSign, our team of computer scientists, data scientists, mathematicians and linguists shares knowledge and collaborates continuously on a variety of tasks, such as:

  • Designing an efficient computational framework to support the processing 
  • Mastering the intricate mathematical concepts that underpin the algorithms
  • Increasing our knowledge of advanced syntactic and semantic linguistic tools 

Humans and machines together

The accuracy of our systems depends as much on the quality of their design and inner workings as on the material they use to learn. In other words, a machine is only as smart as its teachers. 

To get the finest quality and required quantity of data to teach our systems, it’s crucial to team up with experienced legal experts. Their insights on the meaning and content of the extractions proved to be incredibly useful for our analytical designs. Perhaps the most exciting part of this collaboration, though, is exploring the ways to directly transfer the experts’ knowledge and understanding to our systems. By trying to create a symbiotic relationship between the lawyers and the machines based on feedback and iteration, we leverage both human and computer analytical powers in what has become a rewarding and game-changing experience.

Complex analysis with deep learning

We push the boundaries of contract analytics performance by using a combination of state-of-the-art industry algorithms and our own. Yet, when it comes to analyzing the complex structures of legal documents, one class of algorithms is king: deep learning.

Apart from being particularly trendy, deep learning is a disruptive branch of machine learning that harnesses the analytical power of large, multi-layered neural networks. This technology originated 40+ years ago when computer scientists tried to imitate the brain’s neural structure. Despite early promise, these first computations were too slow and costly to be effective. However, the tremendous advances in computer hardware of the last few years have sent neural networks back to the top of the AI pyramid, as deep learning systems are now capable of crunching large amounts of data and handling very sophisticated tasks. This explains why deep learning is both at the forefront of natural language processing and a perfect candidate for our contractual extractions.

Deep architectures allow the systems to capture the meaning of sentences rather than merely recognize sequences of words. Using deep learning brings us ever closer to understanding the intent of the contract’s author.

By using cutting-edge technology and developing our own, the DocuSign Insight R&D team is dedicated to making sure you get the most accurate analysis of your contracts.
