Innovative Algorithm Funding

12th June 2019

Congratulations to Dr Michael Schimmelpfennig on his Australian Centre on China in the World Grant!

Dr Michael Schimmelpfennig – Digital Text Analysis in Traditional Chinese Studies: Exploring Possibilities of the Computational Analysis of Multilayered Texts.

Digital Humanities (DH) is a growing field of study and specialisation worldwide, including in Chinese Studies. One exciting area to emerge is digital analysis of Chinese texts.

On the surface, it may seem a simple process to create a computer program to analyse texts. For example, if you had lines from an English-language poem, you could copy and paste them into a software program to analyse their composition or compare them with other poems of a similar kind or era, a task that would likely take only seconds.

During our discussion with Michael, we found out that traditional Chinese literature is not nearly that easy to decipher, even with the advances in modern technology such as computational text analysis.

In fact, conducting digital analysis on traditional Chinese texts is very challenging, as Michael explains, because Chinese is an “isolated language” and, though “every character stands for a word”, there is no indication of a character's word class, case or conjugation, and thus “no discernible way of how a sentence is put together”. This makes it particularly difficult to use algorithms to make meaning of a text. Without parsing, a computer would view the entirety of characters that constitute a traditional Chinese book as a single sentence.

To get around this problem, scholars in Chinese studies have focused their attention on recognisable entities within a text, such as names of persons and places, dates, and official titles. They try to retrieve the historical links that existed between these entities and so arrive at networks of persons and places that can even be mapped geographically.

Other approaches look at the particular use of certain characters within texts. One somewhat crude method counts how often certain characters appear within groups of Chinese texts, and uses the terminology most frequently employed by their authors to form hypotheses about the topical emphasis of those texts.
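As a rough illustration of the frequency method, the sketch below counts characters in a short passage. It is a minimal example under stated assumptions, not the project's own tooling: the function name `char_frequencies` is hypothetical, the sample is simply the opening lines of the Li Sao from the Songs of Chu, and only the basic CJK Unified Ideographs range is counted.

```python
from collections import Counter


def char_frequencies(text, top_n=5):
    """Return the top_n most frequent Chinese characters in text.

    Assumption: only the basic CJK Unified Ideographs block
    (U+4E00 to U+9FFF) is counted, so punctuation and whitespace
    are ignored. Rare characters outside this block would be missed.
    """
    chars = [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    return Counter(chars).most_common(top_n)


# Illustrative sample: the opening lines of the Li Sao.
sample = "帝高陽之苗裔兮，朕皇考曰伯庸。攝提貞于孟陬兮，惟庚寅吾以降。"

if __name__ == "__main__":
    for char, count in char_frequencies(sample, top_n=3):
        print(char, count)
```

Run over whole groups of texts rather than two lines, counts like these are the raw material for the hypotheses about topical emphasis described above.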

More refined are narrowing methods, such as searches that show a defined number of characters preceding or following the target expression. Such an approach makes it possible to compare the character strings surrounding a target expression across a sequence of different texts. It can be used, for example, to trace changes in the meaning and usage of the target expression over time.
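A search of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name `keyword_in_context` and the `width` parameter are hypothetical, and the text is treated as a plain string with punctuation already removed.

```python
def keyword_in_context(text, target, width=3):
    """Find every occurrence of target in text.

    Returns a list of (left, target, right) tuples, where left and
    right hold up to `width` characters before and after the match.
    """
    hits = []
    start = text.find(target)
    while start != -1:
        end = start + len(target)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        hits.append((left, target, right))
        # Continue searching just past the start of this match.
        start = text.find(target, start + 1)
    return hits


if __name__ == "__main__":
    # Illustrative sample: opening of the Li Sao, punctuation removed.
    line = "帝高陽之苗裔兮朕皇考曰伯庸"
    for left, hit, right in keyword_in_context(line, "兮", width=2):
        print(left, "[" + hit + "]", right)
```

Applying the same function to several texts in chronological order and setting the results side by side is one simple way to compare the surroundings of a target expression over time.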

Michael’s current project, ‘Interpretation as Impediment? The Impact of the Commentarial Tradition to the Songs of Chu (Chuci) on Contemporary Research in China’, attempts to develop such methods further to gain deeper insights into the mechanics of interpretation.

If we think of philosophy we normally think of a group of thinkers or teachers and disciples involved in debates that tackle problems of the human condition or existence. In China those kinds of discussions took on the form of written commentaries to an existing corpus of normative texts, i.e. ancient literature that provides values or ethical norms for a given culture. These commentaries not only provide strong clues to the understanding of such literature, but their influence on the culture becomes nearly as important as the literature itself.

Still, commentarial interpretations were contested over time and subsequent commentaries were written to adapt an interpretation to the requirements or the advance of knowledge of a certain era. However, during this process there are certain understandings of normative texts that tend to solidify and become paradigms.

Michael's grand hypothesis is that “in cultures with rich commentarial traditions like China, these paradigms ultimately become so ingrained that they not only direct but may limit or even forestall scientific advance.” Combining the methods of philology, reception history, and computational text analysis, Michael attempts to understand “the impact of traditional Chinese commentaries on research in the Humanities in modern and contemporary China.” The aim is to identify areas of contested interpretation, which Michael calls hot spots, and so to distinguish between “imperatives and scopes of commentarial understanding”.

The devil lies in the details, and not only where the computational approach is concerned. As Michael explains, “you have the main text which is a poem, and then you have an interlinear commentary that refers to certain characters in the main text and explains them. There are different techniques to do this, and my question is, would it be possible to use computational text analysis to compare different commentaries to the same main text? And one simple difficulty is that these commentaries are certainly not similar. For example one commentary says, my analysis goes through the poem line by line, one says, no it has couplets, so the immediate question arises how can you tag or indicate that the couplet commentary relates to that poetic line that the other commentary relates to?”

With funding from his initial Asia Pacific Innovation Program (APIP) grant, Michael was able to hire two Research Assistants to assist with the time-intensive work of taking “two commentaries and tag them, meaning entering the indicators between the Chinese characters that tell you that this line in the commentary refers to this character in the main text.” Using this method, researchers are able to “track how earlier commentaries are reused or modified by later commentators” to “monitor how uses develop”.
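One way to picture such tagging is as a small data structure in which every commentary note records which part of the main text it refers to. The sketch below is purely illustrative and makes several assumptions: the `Note` class, its fields, and the `notes_by_line` helper are hypothetical names, the commentaries "A" and "B" and their note texts are placeholders, and references are recorded per line rather than per character for brevity. It is not the tagging scheme used in the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    commentary: str  # label of the commentary (placeholder names here)
    lines: tuple     # indices of the main-text lines the note refers to
    text: str        # the commentary text itself


def notes_by_line(notes):
    """Group notes under every main-text line they refer to.

    A note on a couplet lists two line indices, so it is filed under
    both lines and can be compared with line-by-line notes.
    """
    index = {}
    for note in notes:
        for line_no in note.lines:
            index.setdefault(line_no, []).append(note)
    return index


# Commentary A proceeds line by line; commentary B treats a couplet.
notes = [
    Note("A", (0,), "placeholder note on line 0"),
    Note("A", (1,), "placeholder note on line 1"),
    Note("B", (0, 1), "placeholder note on the couplet"),
]
index = notes_by_line(notes)

if __name__ == "__main__":
    for line_no in sorted(index):
        print(line_no, [n.commentary for n in index[line_no]])
```

Because the couplet note is filed under both of its lines, a line-by-line commentary and a couplet-based commentary can be read side by side for any given poetic line. The same idea could be extended to character-level references by storing character offsets instead of line indices.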

With the funding from his China in the World grant, Dr Schimmelpfennig will host a three-day workshop in December to further explore the topic with invited international experts in the fields of Chinese studies and DH. It will be a platform for ANU scholars and students to discover the latest advances in the specialised field of computational analysis of traditional Chinese texts. In addition, it will be the first forum of its kind to examine possibilities of applying such techniques to multilayered texts, such as texts and commentaries.

Through the development of new networks and collaborations stemming from the workshop, this innovative project will lead to new lines of research, publications and potentially larger research funding opportunities such as ARC grants.

Additionally, Michael plans to use a portion of the CIW funding to give local and external students the opportunity to win one of eight $300 mini grants for the most creative and innovative DH posters. The winning students can present their posters at the workshop to receive valuable feedback from experts. The posters will be part of an exhibition in CIW. Information about the mini grants will be released in the coming months. The three-day workshop will also include a public lecture. Details about the lecture will be released in the near future.


