The basic elements in computational biology are strings or sequences describing DNA or proteins. A number of questions thus require the analysis of sets of strings and the extraction of (grammatical) rules and patterns.
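So if we convert Chinese characters into biological sequences, bioinformatics and Hanzi informatics could become disciplines that exchange ideas with each other. This could also be a new area of BioNLP.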
We are dealing with strings, trees and graphs. When dealing with strings, we typically manipulate a 4-letter alphabet {A,T,G,C}, or a 20-letter alphabet if we group the nucleotides three by three into amino acids in order to define proteins. Trees are involved in the patterns or in the secondary structure, and even graphs in the tertiary structure.
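As a rough illustration of moving from the 4-letter nucleotide alphabet to the amino-acid alphabet, the sketch below cuts a DNA string into consecutive triplets and looks each one up in a table. The function names `codons` and `translate` are made up for this note, and `CODON_TABLE` is deliberately incomplete: it holds only four entries of the standard genetic code, so unknown triplets come out as `?`.

```python
def codons(seq):
    """Split a nucleotide string into consecutive, non-overlapping triplets."""
    seq = seq.upper()
    if set(seq) - set("ATGC"):
        raise ValueError("sequence contains letters outside {A, T, G, C}")
    # Ignore a trailing remainder of 1 or 2 nucleotides that cannot form a triplet.
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Illustrative subset of the standard genetic code (NOT the full 64-entry table).
CODON_TABLE = {"ATG": "M", "GCT": "A", "TGG": "W", "AAA": "K"}


def translate(seq):
    """Map each triplet to a one-letter amino-acid code, '?' if not in the toy table."""
    return "".join(CODON_TABLE.get(c, "?") for c in codons(seq))


if __name__ == "__main__":
    print(codons("ATGGCTTGGAA"))
    print(translate("ATGGCTAAA"))
```

A real analysis would use a complete codon table and handle stop codons and reading frames, which this sketch leaves out.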
My research notes on Linguistic Ontology, Natural Language Processing and E-humanities
Showing posts with label HanziNet、Linguistic Ontology. Show all posts
Saturday, July 23, 2011
Saturday, August 23, 2008
Hanzi, Hanzi, Hanzi!
How should we deal with the Chinese character-word mystery in terms of (Western) linguistic morphology?
1. In principle, each character stands for a morpheme (morpho-semantically). Exceptions are 葡萄, 踟躕, ..., where two characters are inherently bound together. In these cases a single morpheme such as 葡萄 is represented by two characters, 葡 and 萄.
2. The types of morphemes that characters stand for vary. E.g., 今: bound root morpheme; 者: bound suffix; 打: free root morpheme; 了: inflectional morpheme (?).
3. A Chinese word is composed of combinations such as (1) morpheme + word: 辛苦; (2) word + morpheme: 打字機; (3) a single morpheme: 喝; (4) morpheme + morpheme: 前進.
4. A Chinese compound word is composed of (at least) two words or compound words. Formally, (1) w1 + w2 + ...: 電腦; (2) cw1 + w2 ...: 電腦螢幕.
5. Criteria for judging Chinese wordhood:
5.1 Can stand alone in a similar semantic context
5.2 Psychological reality
5.3 Proper names and fixed expressions
6. Operation:
7. Dubious/counter-intuitive cases: is 腳踏車 a word or a compound word?
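The formal part of point 4 (a compound word is a concatenation of at least two words or compound words) can be sketched as a small lexicon lookup. This is only a toy under stated assumptions: `TOY_LEXICON`, `splits` and `is_compound` are hypothetical names, the lexicon entries are picked by hand from the examples above, and the wordhood criteria in point 5 are not modelled at all.

```python
def splits(text, lexicon):
    """Return every way to cut `text` into a sequence of lexicon entries."""
    if not text:
        return [[]]  # exactly one way to split the empty string: no parts
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep `head` and recursively split whatever follows it.
            for rest in splits(text[end:], lexicon):
                results.append([head] + rest)
    return results


def is_compound(text, lexicon):
    """True if `text` can be written as two or more lexicon entries in a row."""
    return any(len(parts) >= 2 for parts in splits(text, lexicon))


# Hypothetical hand-made lexicon; a real one would come from a dictionary or corpus.
TOY_LEXICON = {"電", "腦", "電腦", "螢幕", "喝"}

if __name__ == "__main__":
    print(splits("電腦螢幕", TOY_LEXICON))
    print(is_compound("喝", TOY_LEXICON))
```

With such a lookup, a case like 腳踏車 stays open: whether it counts as a compound depends entirely on which substrings one is willing to list as words, which is exactly the judgement the criteria in point 5 are meant to settle.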