Bletchley Park To Digitise A Million War Records

The WWII codebreaking centre wants to preserve around a million documents in digital form, with some available online

Bletchley Park, the site credited with helping the Allies crack secret German codes, has announced a plan to digitise its archive of historical documents, which reference some of the key events of World War II.

Bletchley Park announced this week that it is working with tech company HP to digitise around 1 million documents, including communication transcripts, communiques, and photographs. Digitising the documents and then making them available online will give users outside the UK easy access to the information for the first time.

Fragile Materials

Commenting on the plan, Simon Greenish, Trust Director of Bletchley Park Trust, said that the project would help to preserve and increase access to the fragile materials. “There can be few archives which contain material that had such a profound impact on the world at the time and which is still relevant today,” he said. “The project represents a considerable technical challenge and without the help from HP and the top end technology now in place, this project would not be happening.”

The project is expected to take between three and five years and is being made possible in part thanks to scanning equipment and software provided by HP and its partner Digital Workplace. The first stage of the process is expected to take about a year, following which a selection of the documents will be made available online through free and paid-for access.

Norman Richardson, vice president and general manager, HP Imaging and Printing Group UK, said the Bletchley Park archive contains documents related to events that defined the outcome of World War II. “Our collaboration with Bletchley Park will not only ensure the preservation of this hugely significant archive but will also allow it to be made accessible and searchable digitally for the first time, untapping the value of this content for the benefit of audiences all over the world,” he said.

But while archivists will welcome plans to preserve the documents, the digitisation of information has attracted some controversy over which file formats are selected. The use of proprietary formats has been criticised by open source activists for potentially locking up data which should be freely available. In 2007, the UK’s National Archives ran into problems when it revealed that it had 580TB of data in formats that were no longer commercially supported.

Proprietary Data Formats?

HP was approached for comment on what data formats would be used in the digitisation project but did not respond in time for this article.

During World War II, Bletchley Park was the site of the UK’s main decryption establishment – the Government Code and Cypher School. One of the centre’s greatest achievements was decrypting German Enigma codes. Shortly before the war, Polish cryptographers had passed their Enigma-breaking techniques to French and British intelligence, and the British used this information as the foundation for their own early efforts to decrypt Enigma.

Bletchley Park also houses Colossus – the world’s first semi-programmable computer, which was constructed in order to break the German Lorenz cipher. In September 2009 it was also announced that the Harwell computer – the oldest “original functioning electronic stored program” machine of its kind – would be restored by a group of volunteers on the site.

More recently, Bletchley Park played host to a national partnership designed to encourage UK local authorities to work together to save £60 million a year across their educational ICT budgets through the use of open source solutions.