Tunghai University Institutional Repository: Item 310901/5035


    Please use this identifier to cite or link to this item: http://140.128.103.80:8080/handle/310901/5035


    Title: A Study on Ontology-based Document Summarization System for Chinese Stock News
    Original Title: 植基於本體論之文件摘要系統之研究-以中文股市新聞為例
    Authors: Hsu, Ming-Chung (徐銘忠)
    Contributors: Leu, Fang-Yie (呂芳懌)
    Department of Computer Science and Information Engineering, Tunghai University (東海大學資訊工程學系)
    Keywords: Document summarization; Extraction; Abstraction; Ontology
    Date: 2004
    Issue Date: 2011-05-19T07:32:41Z (UTC)
    Abstract: With the rapid growth of the Internet, people can obtain information more conveniently than ever. The result is information overload: users do not know how to cope with such a large volume of data, so obtaining the correct information efficiently has become an important issue in the information field. Document summarization technology filters out the redundant and less important content of a document and presents a more concise version, so readers can grasp its key meaning in a short time and then decide whether to read the full text. It has therefore become an important research direction in information retrieval in recent years. Document summarization techniques fall into two classes: extraction and abstraction. Most previous research adopted only one of them. This thesis uses ontologies to build domain knowledge for stock market news and proposes a two-stage summarization method for a specific domain that combines the two classes, named Abstraction From Extraction (AFE). Extraction is performed first: statistical methods compute a weight for each sentence in the document, the sentences are ranked by weight, and the highest-weighted ones are taken as the feature sentences. The phrases in the feature sentences and their parts of speech are then compared with sentence patterns prepared in advance from the characteristics of the domain, and those that match are recombined according to the patterns into new sentences that form the summary. The summary helps users decide whether to read the full text, so they can find and absorb the information they need more quickly.
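
    The two-stage idea in the abstract (extract feature sentences by statistical weight, then rebuild a summary from sentence patterns) can be illustrated with a minimal Python sketch. It is not the thesis implementation. Everything specific in it is an assumption made for illustration: the average term-frequency weighting, the stop-word list, the toy `DIRECTION` table standing in for a domain ontology, the single regular-expression sentence pattern, and the English sample text (the thesis works on Chinese stock news, which would also need word segmentation and part-of-speech tagging).

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "by", "for"}

# Toy stand-in for a domain ontology: maps surface verbs to one concept.
DIRECTION = {"rose": "up", "climbed": "up", "fell": "down", "dropped": "down"}

# Hypothetical sentence pattern: <company> shares <verb> <amount> percent.
PATTERN = re.compile(
    r"(?P<company>[A-Z][A-Za-z]+) shares (?P<verb>"
    + "|".join(DIRECTION)
    + r") (?P<amount>\d+(?:\.\d+)?) percent"
)


def split_sentences(text):
    """Split after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Lower-case word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower())
            if w not in STOP_WORDS]


def extract_feature_sentences(text, top_k=2):
    """Stage 1 (extraction): rank sentences by average term frequency."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    tf = Counter(w for words in tokenized for w in words)

    def weight(words):
        return sum(tf[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: weight(tokenized[i]), reverse=True)
    return [sentences[i] for i in ranked[:top_k]]


def summarize(text, top_k=2):
    """Stage 2 (abstraction): rebuild matched feature sentences from the pattern."""
    summary = []
    for sentence in extract_feature_sentences(text, top_k):
        match = PATTERN.search(sentence)
        if match:  # feature sentences that fit no pattern are skipped
            summary.append("{}: {} {}%".format(
                match["company"], DIRECTION[match["verb"]], match["amount"]))
    return summary


NEWS = ("Acme shares rose 3 percent on Monday. "
        "Analysts said Acme shares may keep rising. "
        "The weather in Taipei was sunny.")

print(summarize(NEWS))  # ['Acme: up 3%']
```

    In this sketch the first two sentences are selected as feature sentences because they share the frequent terms "acme" and "shares", and only the first fits the pattern, so the summary contains one regenerated line. A real system would replace the single regex with a set of domain sentence patterns and the lookup table with the stock-news ontology.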
    Appears in Collections: [Department of Computer Science and Information] Master's Theses

    Files in This Item:

    File                     Size     Format
    092THU00394012-001.pdf   2552Kb   Adobe PDF


    All items in THUIR are protected by copyright, with all rights reserved.


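    The digital content of the Tunghai University Institutional Repository on this website is provided free of charge for public-interest uses such as academic research and public education. Please use the content of this website reasonably and in moderation, out of respect for the rights of the copyright holders. For commercial use, please first obtain authorization from the copyright holder.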