Bulletin Spring‧Summer Autumn‧Winter 1999
Progress

Including gearing-up time for computer assistants, it has taken nine months (September 1989 to May 1990) for about two million characters from 27 texts, including Zhanguoce, Liji, and Zhouli, to be entered into the database. By 1992, a database consisting of eight million characters from 102 texts will have been fully established.

Proofreading is entrusted to researchers of the project and senior students of the Chinese Language and Literature Department. Each text will be proofread at least seven times after it has been entered into the computer to ensure a high degree of accuracy.

Concordances will be published when the computerization of each individual text has been finished. The first one, to appear by the end of 1990, will be the concordance to Zhanguoce, which will be divided into six sections: (i) Foreword, (ii) How to use the Concordance, (iii) Radical/Stroke Index, (iv) Index of Syllables in Hanyu pinyin and Wade-Giles Romanization, (v) Text (with textual notes), (vi) Concordance and Appendix.

Difficulties Encountered

The project team have come across three major technical difficulties:

(1) Special and archaic characters not available on commercial software have to be created 'by hand'. All the current Chinese software systems provide facilities for the creation of a limited number of new characters. The ET system, for example, allows for the creation of 5809 new characters, the highest figure to date. But will this be adequate for the database? The researchers cannot yet give an answer, because they do not know how many characters beyond the Big 5 internal code (the code used by the ET system and most other Chinese systems to compose individual characters) they will need.
So far, they have needed to create only about 300 new characters for the two-million-character database, but they believe that the software's capacity for creating new characters will be taxed to the limit when they come to the ancient Chinese dictionaries such as the Shuowen jiezi. They have therefore considered several emergency measures, such as establishing an individual character-making file for each text, in case the character-making capacity of the ET system proves to be insufficient.

(2) To meet the standard required for publication, operators need to create the 'high definition character pattern'. This is a task that requires sound knowledge of Chinese calligraphy.

(3) Most of the texts to be entered into the database were published in the Song or Ming dynasties, and the characters contained therein may be very different from the standardized forms used in modern Chinese software systems. Operators have frequently found it difficult to identify certain character forms. Whenever differences in shape occur, however minor, operators will need to consult the researchers, who in turn have to consult the dictionary. This procedure ensures accuracy but tends to slow down progress.

Conclusion

The establishment of the database of the entire body of extant Han and pre-Han traditional Chinese texts is a mammoth task and a novel attempt. The project team are aware that difficult problems may crop up from time to time, but they will strive to overcome any difficulty to bring the project to fruition, for they are fully convinced of the significant contribution the project can make to the study of ancient Chinese language and culture.

Project Team

Mr. Ho, Alan Y.S. (Associate Director of CSC): Convener; monitors progress of the project and makes decisions
Mr. Leung, Philip K.H. (Computer Officer): Programming consultant
Mr. Ho, Kwok Kit (Assistant Computer Officer): Programmer
Mr. Chu, Kwok Fan (Assistant Editor of ICS): Consultant on text management
Mr. Ho, Che Wah (Project Coordinator of ICS): Project coordinator
Mr. Ng, Chok Ki (Assistant Computer Officer): Programming assistant
Mr. Ng, Emie W.T. (resigned): Programming consultant