Chinese Document Analysis Based on a Hierarchical Word-Sense Network and Its Performance Evaluation


Please use this identifier to cite or link to this item: http://140.128.103.80:8080/handle/310901/3697

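Title:	Chinese Document Analysis Based on a Hierarchical Word-Sense Network and Its Performance Evaluation (以階層式詞義網路為基礎的中文文件分析及其效能評估)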
Authors:	施政瑋
Contributors:	Leu, Fang-Yie (呂芳懌); Department of Information Engineering, Tunghai University (東海大學資訊工程學系)
Keywords:	Information retrieval; Document retrieval; Synset; Similarity comparison
Date:	2003
Issue Date:	2011-04-27T06:56:25Z (UTC)
Abstract:	This thesis takes the concepts carried by Chinese terms as its basis and examines the relationships between terms. Chinese synonym sets (synsets) are organized by term concept into a thesaurus built on synonym structure. Combining this thesaurus with information retrieval techniques, including Chinese word segmentation and grammar rules, natural language processing (NLP), query processing and expansion, keyword index construction, and document clustering, we design a concept-based document analysis method. The method accepts natural-language queries, from which keywords are parsed, and can also perform full-text retrieval using lexical chains instead of keyword search, with the aim of improving retrieval accuracy. The vector space model serves as the baseline for comparing performance: using documents and terminology from a specific domain, we retrieve documents, rank them by similarity to the query, and use recall and precision to evaluate the efficiency and correctness of the two approaches.
Appears in Collections:	[Department of Information Engineering (資訊工程學系所)] Master's Theses (碩士論文)
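
The baseline and the evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes documents are already segmented into terms, uses raw term counts as vector weights, and runs on a hypothetical three-document toy collection. The function names `cosine`, `rank`, and `precision_recall` are illustrative only.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_terms, docs):
    """Return (doc_id, score) pairs sorted by descending similarity to the query."""
    q = Counter(query_terms)
    scored = [(doc_id, cosine(q, Counter(terms))) for doc_id, terms in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy collection (assumption: terms are already segmented).
docs = {
    "d1": ["information", "retrieval", "index"],
    "d2": ["synonym", "thesaurus", "concept"],
    "d3": ["document", "retrieval", "ranking"],
}

ranking = rank(["document", "retrieval"], docs)
top2 = [doc_id for doc_id, _ in ranking[:2]]

# Assume, for illustration, that only d3 is judged relevant to the query.
precision, recall = precision_recall(top2, relevant={"d3"})
print(top2, precision, recall)
```

A concept-based method such as the one the abstract describes would presumably replace the raw terms with synset or lexical-chain identifiers before comparison; that step is omitted here because the record gives no detail on how it is done.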

Files in This Item:

File	Size	Format
091THU00394007-001.pdf	2167 KB	Adobe PDF	466
