한국해양대학교


Detailed Information


Feature Extraction in Document Classification Using Ontology (온톨로지를 적용한 문서 분류에서의 자질추출)

DC Field	Value	Language
dc.contributor.author	조희영	-
dc.date.accessioned	2017-02-22T06:44:40Z	-
dc.date.available	2017-02-22T06:44:40Z	-
dc.date.issued	2008	-
dc.date.submitted	56877-07-05	-
dc.identifier.uri	http://kmou.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002175515	ko_KR
dc.identifier.uri	http://repository.kmou.ac.kr/handle/2014.oak/9753	-
dc.description.abstract	With the rapid development of the Internet and information service techniques, a huge number of electronic documents are steadily produced on the Web. Documents such as newspaper articles are classified daily by trained people without delay, but this is very labor-intensive work that requires a lot of time and cost. Several studies on automatic document classification have been performed to lessen this burden. Studies using machine learning and natural language processing techniques have shown successful results in the Web mining field. The performance of a document classification system depends heavily on its feature set, although many other factors also affect performance. In this thesis, we propose methods for extracting good feature sets using an ontology. Terms in documents are transformed into terms in the ontology in order to reduce the size of the feature sets and to compress the information in the documents at the expense of some loss of meaning. This transformation can be performed after or before general feature selection. We use only the synonym and hypernym relations in U-WIN, a Korean ontology developed by Ulsan University. We experimented with the proposed methods on four classifiers and nine feature selectors in order to evaluate their performance objectively. The experiments show that the proposed ontology-based methods outperform existing feature selectors on most classifiers, the exception being the naïve Bayesian classifier, and that applying the ontology after feature selection outperforms applying it before feature selection on every classifier. We observed that the performance of feature selectors is very sensitive to the classifier, especially the Rocchio classifier. In the future, we will experiment with large-scale document collections from various fields and in other languages, such as English and Japanese, to obtain more objective results. The ambiguity among multiple hypernyms of a term will be tackled as a word sense disambiguation problem.	-
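dc.description.tableofcontents	Table of Contents; List of Tables; List of Figures; Abstract; Chapter 1; Chapter 2: Related; 2.1 Document Representation; 2.2 Feature Selection; 2.3 Document Classification; 2.4 Ontology; Chapter 3: Feature Extraction Using Ontology; 3.1 Preprocessing and Feature Generation; 3.2 Applying the Ontology; 3.3 Feature Selection; 3.4 Vector Representation; 3.5 Document Classification; Chapter 4: Experiments and Evaluation; 4.1 Experimental Environment; 4.2 Evaluation Method; 4.3 Performance Evaluation and Analysis; Chapter 5: Conclusion; References	-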
dc.language	kor	-
dc.publisher	한국해양대학교 대학원	-
dc.title	Feature Extraction in Document Classification Using Ontology (온톨로지를 적용한 문서 분류에서의 자질추출)	-
dc.type	Thesis	-
dc.date.awarded	2008-02	-
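
The abstract describes transforming document terms into ontology terms through synonym and hypernym relations so that the feature set shrinks. The following is only a minimal sketch of that idea under stated assumptions: the `SYNONYMS` and `HYPERNYMS` tables, the function names, and the `levels` parameter are hypothetical illustrations, not U-WIN data or the thesis's actual implementation. It also assumes each term has a single hypernym, whereas the abstract notes that choosing among multiple hypernyms is left as future work.

```python
from collections import Counter

# Hypothetical toy ontology; entries are illustrative and not taken from U-WIN.
SYNONYMS = {"automobile": "car", "auto": "car"}  # term -> canonical synonym
HYPERNYMS = {  # term -> single hypernym (assumed unambiguous here)
    "car": "vehicle",
    "truck": "vehicle",
    "apple": "fruit",
    "pear": "fruit",
}


def to_ontology_term(term, levels=1):
    """Map a term to its canonical synonym, then climb up to `levels` hypernym links."""
    term = SYNONYMS.get(term, term)
    for _ in range(levels):
        if term not in HYPERNYMS:
            break  # no hypernym known; keep the term as it is
        term = HYPERNYMS[term]
    return term


def ontology_features(tokens, levels=1):
    """Count features after mapping every token to its ontology term."""
    return Counter(to_ontology_term(t, levels) for t in tokens)


tokens = ["automobile", "truck", "car", "apple", "pear", "road"]
features = ontology_features(tokens)
```

In this toy input, six distinct surface terms collapse into three features, which illustrates the trade-off the abstract mentions: a smaller feature set at the cost of some loss of meaning.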

Appears in Collections: Department of Computer Engineering (컴퓨터공학과) > Thesis

Files in This Item: 000002175515.pdf


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.


ywm85@kmou.ac.kr Tel: 051-410-4085

KMOU Repository was built as part of the OAK Repository distribution project of the National Library of Korea.
