The purchase order looks legitimate, yet does it have all the proper approvals? Many lawyers reviewed this draft contract so is this the latest version? Can we prove that this essential document hasn’t been tampered with, before I sign it? Can we prove that these two versions of a document are absolutely identical?
Blockchain might be able to help solve these kinds of everyday trust issues related to documents, especially when they are PDFs—data files created using the Portable Document Format. Blockchain technology is best known for securing financial transactions, including powering new financial instruments such as Bitcoin. But blockchain’s ability to increase trust will likely find enterprise use cases solving common, non-financial information exchanges like these documents use.
Joris Schellekens, a software engineer and PDF expert at iText Software in Ghent, Belgium, recently presented his ideas for blockchain-supported documents at Oracle Code Los Angeles.Oracle Code is a series of free events around the world created to bring developers together to share fresh thinking and collaborate on ideas like these.
PDF’s Power and Limitations
The PDF file format was created in the early 1990s by Adobe Systems. PDF was a way to share richly formatted documents whose visual layout, text, and graphics would look the same, no matter which software created them or where they were viewed or printed. The PDF specification became an international standard in 2008.
Early on, Adobe and other companies implemented security features into PDF files. That included password protection, encryption, and digital signatures. In theory, the digital signatures should be able to prove who created, or at least who encrypted, a PDF document. However, depending on the hashing algorithm used, it’s not so difficult to subvert those protections to, for example, change a date/time stamp, or even the document content, says Schellekens. His company, iText Software, markets a software development kit and APIs for creating and manipulating PDFs.
“The PDF specification contains the concept of an ID tuple,” or an immutable sequence of data, says Schellekens. “This ID tuple contains timestamps for when the file was created and when it was revised. However, the PDF spec is vague about how to implement these when creating the PDF.”
Even in the case of an unaltered PDF, the protections apply to the entire document, not to various parts of it. Consider a document that must be signed by multiple parties. Since not all certificate authorities store their private keys with equal vigilance, you might lack confidence about who really modified the document (e.g. signed it), at which times, and in which order. Or, you might not be confident that there were no modifications before or after someone signed it.
A related challenge: Signatures to a digital document generally must be made serially, one at a time. The PDF specification doesn’t allow for a document to be signed in parallel by several people (as is common with contract reviews and signatures) and then merged together.