File Preservation Format

Validity & version checking, structured data and metadata binding


Codel is specifically interested in promoting the digital notary for protecting, preserving and proving the integrity of documents and data. It has therefore developed a methodology for ensuring that information remains both verifiable and readable for the foreseeable future. The intention is to support the long-term preservation of records and to allow authors and owners of intellectual property to store their content and prove ownership of it.

Codel's CDLxml format is based on the W3C standard for the digital signing of records, ‘XML DSig’. It provides a storage container that binds the Codelmarked file to the essential metadata components needed for viewing and verifying Codelmarked content. Customers can request extensions to the metadata tags. CDLxml can also support encryption so that only authorised users can view content. Codel keeps an audit trail of all verification events for full traceability. The viewing and verification code is available as open source on GitHub. When a CDLxml file is opened, a look-up is performed with the Codel service to check whether the file has been registered and whether the author has flagged any other status information.

Some CDLxml Metadata Components (not exhaustive)

1.       Character Encoding

 <?xml version="1.0" encoding="UTF-8"?>

The XML declaration states the character encoding (UTF-8), so that a parser decodes the file correctly and its characters display as intended.
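As a small illustration of why the declared encoding matters, the Python sketch below parses a UTF-8 document from raw bytes and compares it with the same bytes decoded under a mismatched encoding. The `<Title>` element and its text are hypothetical and are not part of CDLxml.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document; <Title> is not a CDLxml tag.
raw = '<?xml version="1.0" encoding="UTF-8"?><Title>Café</Title>'.encode("utf-8")

# The parser reads the declaration and decodes the bytes as UTF-8.
title = ET.fromstring(raw).text

# Decoding the same bytes with the wrong encoding garbles the accented letter.
garbled = raw.decode("latin-1")

print(title)
print("Café" in garbled)
```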

2.       Canonicalization Method

<CanonicalizationMethod Algorithm="http://www.w3.org/TR/xml-c14n"/>

The W3C canonicalization standard (Canonical XML) used to convert the XML into a single standard form before it is hashed, so that logically equivalent documents always produce the same hash when they are verified.
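Canonicalization, in the XML Signature sense, rewrites logically equivalent XML into one byte-for-byte form so that hashes are reproducible. The sketch below uses `xml.etree.ElementTree.canonicalize` from the Python standard library, which implements Canonical XML 2.0 rather than the version named in the `Algorithm` attribute, so it only illustrates the idea. The `<Item>` element is hypothetical.

```python
import xml.etree.ElementTree as ET

# Two serialisations of the same logical element (hypothetical example tag).
variant_a = '<Item b="2" a="1"/>'
variant_b = "<Item   a='1'   b='2'></Item>"

# Canonicalization sorts attributes, normalises quoting and whitespace inside
# tags, and expands empty elements into start/end tag pairs.
canonical_a = ET.canonicalize(variant_a)
canonical_b = ET.canonicalize(variant_b)

print(canonical_a)
print(canonical_a == canonical_b)
```

Because both variants reduce to the same text, a hash taken over the canonical form does not depend on how the XML happened to be written out.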

3.       Hashing Algorithm

<CodelDigestMethod Algorithm="http://www.codelmark.com/2007/01/xmlcodelmark#sha256"/>

The hashing algorithm used to create the hash, in Codel’s case SHA-256.

4.       File Name

 <Reference Type="http://www.codelmark.com/2007/01/xmlcodelmark#Object" URI="#filename.pdf">

The Uniform Resource Identifier given to the file that has been Codelmarked (hashed) and which is stored in the CDLxml file container. It identifies the file and enables interaction with representations of the file over the World Wide Web, using specific protocols.

5.       Transformation Algorithm

<Transforms><Transform Algorithm="http://www.codelmark.com/2007/01/xmlcodelmark#base64"/>

The transformation algorithm used to convert the file to Base64. Declaring it tells the recipient that the content is Base64 encoded, so that it can be decoded and displayed correctly.
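A minimal sketch of a Base64 transform, assuming the file's raw bytes are encoded to text for storage in the container and decoded again for viewing. The helper names and the sample bytes are hypothetical and do not come from Codel's code.

```python
import base64


def to_base64_text(file_bytes: bytes) -> str:
    """Encode arbitrary bytes as ASCII Base64 text that is safe inside XML."""
    return base64.b64encode(file_bytes).decode("ascii")


def from_base64_text(text: str) -> bytes:
    """Recover the original bytes from Base64 text."""
    return base64.b64decode(text)


sample = b"%PDF-1.4\x00\xff binary payload"  # hypothetical file content
encoded = to_base64_text(sample)
restored = from_base64_text(encoded)

print(encoded)
print(restored == sample)
```

The round trip is lossless, which is why binary files such as PDFs can be carried inside a text-based XML container.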

6.       The Base 64-transformed digest value (hash)

</Transforms><DigestValue>pGP4/vEuNC5MynkjfaJZppFUfY34k3ZVFWuz9d1V91g=</DigestValue>

The hash (digest value), recorded so that the integrity of the transformed file can be verified.
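The following sketch shows how a Base64 digest value of this shape could be produced and checked with SHA-256. It assumes the hash is taken over the raw file bytes. The text does not state whether Codel hashes the raw bytes or the Base64 text, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def digest_value(data: bytes) -> str:
    """Return the SHA-256 digest of data, Base64 encoded as ASCII text."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def matches_digest(data: bytes, recorded: str) -> bool:
    """Check data against a recorded Base64 digest value."""
    return hmac.compare_digest(digest_value(data), recorded)


original = b"example file content"  # hypothetical file content
recorded = digest_value(original)

print(recorded)
print(matches_digest(original, recorded))
print(matches_digest(original + b"!", recorded))
```

A SHA-256 digest is 32 bytes, so its Base64 form is always 44 characters long, the same length as the sample `DigestValue` shown above.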

7.       The Codelmark Value

</Reference></CodelmarkInfo><CodelmarkValue>WgVdpUmL63GvUoirP51AnoYVu8Hg+WGsxpKXPYifIR0=</CodelmarkValue>

8.       Object Identity

</CodelmarkValue><Object ID="filename.doc">

The object identity is…….

9.       Item Data Reference

</Object><CodelmarkItemData><ItemReference>20233</ItemReference>

This can be specified by the author, for example if the author wants file verification to point to an optional reference or location for their locally stored files.

10.   Item Owner/Author

<ItemOwner>617682308</ItemOwner>

This denotes the author's Codel user account. It is not the same as the author's identity, but it can be used to reference that identity.

11.   The URI for the Codel Service

<CodelURI>https://services.codelmark.com</CodelURI></CodelmarkItemData></Codelmark>

This points to the hosted site for the Codel registry, where the file verification is performed.
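To show how the components fit together, the sketch below reads a few of the metadata fields from a small sample document using the Python standard library. The sample is assembled from the fragments shown above. Its exact nesting, the omission of the other elements and the absence of an XML namespace are all assumptions, and `read_metadata` is a hypothetical helper rather than Codel's viewing and verification code.

```python
import xml.etree.ElementTree as ET

# Assumed layout, pieced together from the fragments in this section.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Codelmark>
  <CodelmarkInfo>
    <Reference URI="#filename.pdf">
      <DigestValue>pGP4/vEuNC5MynkjfaJZppFUfY34k3ZVFWuz9d1V91g=</DigestValue>
    </Reference>
  </CodelmarkInfo>
  <CodelmarkItemData>
    <ItemReference>20233</ItemReference>
    <ItemOwner>617682308</ItemOwner>
    <CodelURI>https://services.codelmark.com</CodelURI>
  </CodelmarkItemData>
</Codelmark>"""


def read_metadata(cdlxml: bytes) -> dict:
    """Extract selected metadata fields; a missing element yields None."""
    root = ET.fromstring(cdlxml)
    return {
        "digest": root.findtext("CodelmarkInfo/Reference/DigestValue"),
        "item_reference": root.findtext("CodelmarkItemData/ItemReference"),
        "owner": root.findtext("CodelmarkItemData/ItemOwner"),
        "service_uri": root.findtext("CodelmarkItemData/CodelURI"),
    }


metadata = read_metadata(SAMPLE)
for field, value in metadata.items():
    print(f"{field}: {value}")
```

A real CDLxml file may declare a namespace on its elements, in which case the lookup paths would need to be qualified accordingly.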