The analysis of data in clinical records could help epidemiologists to plan analytical studies and identify new research initiatives. This paper describes the development of a systematic, replicable technique that compresses many words of text into fewer content categories on the basis of explicit, user-defined coding rules, and that sorts a large volume of records accurately and reliably. The method was used to assign the reasons for retirement from racing in Hong Kong of 3727 thoroughbred racehorses between the 1992/93 and 2003/04 racing seasons to the categories of a user-defined dictionary. An automated process successfully categorised 95 per cent of the records; the remaining 5 per cent were assigned manually to one of the dictionary categories. The whole process, from initial screening to the categorisation of all the records, took approximately 100 man-hours.
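The general idea of dictionary-based categorisation with a manual fallback can be illustrated with a minimal Python sketch. It is not the authors' implementation: the category names, keywords, record texts and function names below are hypothetical, and the rule that a record matching no category or several categories is set aside for manual coding is an assumption made for illustration.

```python
import re

# Hypothetical user-defined dictionary: category -> keywords.
# These entries are illustrative only; they are not taken from the paper.
DICTIONARY = {
    "tendon injury": ["tendon", "tendinitis"],
    "fracture": ["fracture", "fractured"],
    "respiratory": ["bleeder", "epistaxis", "roarer"],
}


def categorise(text, dictionary=DICTIONARY):
    """Return the single matching category, or None when the text
    matches no category or more than one (assumed fallback rule)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = [cat for cat, keywords in dictionary.items() if words & set(keywords)]
    return hits[0] if len(hits) == 1 else None


def sort_records(records, dictionary=DICTIONARY):
    """Split records into automatically coded ones and those left
    for manual assignment."""
    coded, manual = {}, []
    for rec_id, text in records.items():
        category = categorise(text, dictionary)
        if category is None:
            manual.append(rec_id)
        else:
            coded[rec_id] = category
    return coded, manual


# Invented example records, for demonstration only.
records = {
    1: "Retired due to fractured sesamoid",
    2: "Old age",
    3: "Tendon injury and fracture",
}
coded, manual = sort_records(records)
automated_share = len(coded) / len(records)
```

Here `coded` holds the records the dictionary could classify unambiguously, `manual` lists the record identifiers left for a human coder, and `automated_share` is the proportion handled automatically, the quantity reported as 95 per cent in the study.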
© British Veterinary Association. All rights reserved.