Digital documents: free flow of information

17 Aug 2002
Any agency that claims to provide a news service ends up building a sizeable collection of news clippings, which keeps growing at an alarming rate. Given that newspapers have a limited shelf life, preserving the clippings is a difficult job; preserving them in a way that permits easy access to the information they contain is even more difficult. IT (information technology) offers several solutions to this problem, one of the easiest being to scan each clipping and store the image. Conventional indexing then provides the means of retrieving the images relevant to a given search: all the clippings on water harvesting, say, or transgenic plants, or electricity tariffs. However, this does not provide access to individual words or phrases within the text of a clipping, because the entire text is stored as a single image; as far as the technology is concerned, it could as well be the image of an intricately embroidered carpet or even a painting. The information contained in the clipping is accessible only through the few words chosen by the indexer to represent its substance.

Contrast this with a truly "digital" document, in which every single word, or even every single character, is retrievable. This makes narrowly focused searches possible: not just all the clippings on water harvesting but only those dealing with recharge of groundwater; not just those on transgenic plants but only those dealing with resistance to insects; and so on. Digital documents thus liberate information from the constraints imposed by form and set it free. If it was IT that brought about the split between the container and the content, it is IT again that can transcend the boundaries and make the free flow of information a reality.
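
The contrast can be made concrete with a small sketch in Python. Everything in it is hypothetical and for illustration only: the three sample clippings, their keywords, and the function names `search_by_keyword` and `search_full_text` are assumptions, not part of any real archive system. The first function sees only the indexer's keywords, as with scanned images; the second searches the words of the text itself, as a digital document allows.

```python
# Hypothetical clippings. "keywords" are the terms an indexer might assign;
# "text" is available only when the clipping is stored as a digital document.
clippings = [
    {"id": 1, "keywords": {"water harvesting"},
     "text": "Rooftop water harvesting schemes help recharge of groundwater in dry districts."},
    {"id": 2, "keywords": {"water harvesting"},
     "text": "Village tanks revived for water harvesting to irrigate winter crops."},
    {"id": 3, "keywords": {"transgenic plants"},
     "text": "Transgenic cotton shows resistance to insects in field trials."},
]


def search_by_keyword(keyword):
    """Scanned-image case: only the indexer's keywords can be matched."""
    return [c["id"] for c in clippings if keyword in c["keywords"]]


def search_full_text(*phrases):
    """Digital-document case: every phrase must occur somewhere in the text."""
    return [
        c["id"]
        for c in clippings
        if all(p.lower() in c["text"].lower() for p in phrases)
    ]


# The keyword index returns every clipping filed under the topic...
broad = search_by_keyword("water harvesting")
# ...while full text narrows the search to groundwater recharge alone.
narrow = search_full_text("water harvesting", "groundwater")
```

In this toy collection the keyword search cannot tell the two water-harvesting clippings apart, whereas the full-text search picks out the one that mentions groundwater.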