An auxiliary unicode Han character lookup service based on glyph shape similarity

Please use this identifier to cite or link to this item: http://140.128.103.80:8080/handle/310901/23005

Title:	An auxiliary unicode Han character lookup service based on glyph shape similarity
Authors:	Lin, J.-W.; Lin, F.-S.
Contributors:	Department of Information Management, Tunghai University
Keywords:	edit distance;glyph expression;Han character lookup;Hanzi;Unicode
Date:	2011
Issue Date:	2013-05-24T09:16:27Z (UTC)
Publisher:	Hangzhou, China
Abstract:	Most legacy computer systems fully support input and display of only the 20,902 Han characters (Hanzis for short) encoded in Unicode 1.0. By 2010, Unicode 6.0 had encoded 75,616 Hanzis. However, these newly encoded Hanzis are not easy to use, even on the latest computers. Most of them are rarely used in daily life; some appear only in ancient literature or in individual countries of the Sinosphere. Users may be unsure of their glyph shapes, pronunciations, meanings, and usages. Because most Chinese IMEs (input method editors) require good knowledge of the Hanzi being entered, users often cannot input these characters. We present an auxiliary Unicode Hanzi lookup service based on glyph shape similarity. A user keys in a similar-looking Hanzi with any IME to look up the wanted Hanzi. Each Unicode Hanzi is decomposed into a glyph expression, and the glyph shape similarity of two Hanzis is calculated from a derived edit distance on their glyph expressions. As a result, the system gives users a convenient way to look up unfamiliar Hanzis. © 2011 IEEE.
Relation:	11th International Symposium on Communications and Information Technologies, ISCIT 2011, Article number 6092155, Pages 489-492
Appears in Collections:	[Department of Information Management] Conference Papers
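
The lookup idea summarized in the abstract (decompose each Hanzi into a glyph expression, then rank candidates by an edit distance between expressions) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the record does not give the paper's glyph expression format or its derived edit distance, so the sketch uses a small hand-written table of component sequences (GLYPH_EXPRESSIONS, hypothetical) and the plain Levenshtein distance as a stand-in. The function names edit_distance and lookup_similar are likewise hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical, hand-written decompositions for illustration only.
# Each Hanzi maps to a sequence of layout operators and components;
# this is NOT the glyph expression format defined in the paper.
GLYPH_EXPRESSIONS: Dict[str, Tuple[str, ...]] = {
    "明": ("⿰", "日", "月"),
    "朋": ("⿰", "月", "月"),
    "昍": ("⿰", "日", "日"),
    "林": ("⿰", "木", "木"),
    "森": ("⿱", "木", "⿰", "木", "木"),
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Plain Levenshtein distance over token sequences (unit costs).

    A stand-in for the paper's derived edit distance, which is not
    described in this record.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lookup_similar(query: str, limit: int = 3) -> List[Tuple[str, int]]:
    """Return up to `limit` Hanzis closest to `query`, nearest first."""
    query_expr = GLYPH_EXPRESSIONS[query]
    scored = [
        (hanzi, edit_distance(query_expr, expr))
        for hanzi, expr in GLYPH_EXPRESSIONS.items()
        if hanzi != query
    ]
    # sorted() is stable, so ties keep the table's insertion order.
    return sorted(scored, key=lambda pair: pair[1])[:limit]


if __name__ == "__main__":
    # Key in a familiar Hanzi and list similar-looking candidates.
    for hanzi, distance in lookup_similar("明"):
        print(hanzi, distance)
```

With this toy table, a user who can type 明 is offered 朋 and 昍 first, since each differs from it by one component. A real service would need decompositions for every encoded Hanzi and, presumably, costs that reflect visual similarity between components rather than unit costs.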
