The Persistence Of Paper

Turning your archives into digital data ready for analysis isn’t as easy as you might think, finds Masibulele Lunika.

Converting paper archives into digital material that can make your business more competitive sounds like a straightforward exercise but, as many firms that have begun the task can testify, it is often far more difficult than expected.

Steven Ambrose, analyst and CEO at Strategy Worx, says the biggest problem with digitising hard-copy data is setting up the processes and platforms to decide what needs to be kept, and then putting that material into a meaningful and useful format.

“For example, if you have tons of financial data, what is it that you want to record and why?” he says, “and will it bring any value to your business?”

Herman Crowther, managing director at archive management firm Firstcoast Technologies, shares this sentiment, saying the biggest pitfall for organisations going digital with their documentation is applying the wrong technology for their needs.

“Purchasing equipment, software or services for document digitalisation is not an ‘out-the-box’ fit for all projects,” he says. “The key is to engage with a qualified and knowledgeable prospective service provider.”

The problem with paper

Clearing out a paper archive is a good opportunity to dispose of material that is no longer needed or could fall foul of privacy legislation. But, while scanning paper and turning it into a PDF file on a hard drive may reduce the amount of physical space required, it doesn’t make the captured data any easier to mine for business value. That requires making it machine-readable, which means performing optical character recognition (OCR), says Phumlani Nhlanganiso Khoza, associate lecturer in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand.

“There are numerous technologies that support OCR,” Khoza says. “In some cases, the process can be laborious, for example, when extracting data from original and valuable historical records. However, the process is economically viable when delicate handling is not a requirement. Heavily stained documents, or unclear text, cannot be readily handled.”

Is AI the answer?

Of course, archiving is getting more sophisticated. Melissa Jantjies, associate systems engineer at software multinational SAS, says that applying analytics and artificial intelligence (AI) can help sift information and make the digitalisation process smoother.

“While companies know that there is value in the data, one of the main challenges is how to analyse it and extract meaningful value for decision-making and insights,” says Jantjies. “When dealing with paper archives, text analytics is appropriate, especially when the volume of unstructured content is no longer economical to manage manually. Often human evaluation of text data can be inconsistent and error-prone.

“To truly digitally transform and leverage text analytics, automation is the key,” she continues. “Raw data can be transformed using Natural Language Processing (NLP) and machine learning. NLP is one of the key components in text analytics.”

All of which means that if you were hoping to get rid of paper with an intern and a scanner, it’s probably time to revisit your plans.

Image: ©iStock - 533981720
