Korea Maritime and Ocean University

Detailed Information

Korean Eojeol Classification Using Instance-based Learning (사례기반 학습을 이용한 한국어 어절 분류)

DC Field	Value	Language
dc.contributor.author	朴浩珍	-
dc.date.accessioned	2017-02-22T06:17:00Z	-
dc.date.available	2017-02-22T06:17:00Z	-
dc.date.issued	2002	-
dc.date.submitted	56797-10-27	-
dc.identifier.uri	http://kmou.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002173883	ko_KR
dc.identifier.uri	http://repository.kmou.ac.kr/handle/2014.oak/9270	-
dc.description.abstract	Internet users generally rely on search engines to find the information they need. Such search engines require fast processing and, for Korean in particular, morphological analysis. A notorious problem in Korean morphological analysis is over-generation, which is caused by the lack of morphotactics. This paper describes eojeol classification as a way to lighten the burden of over-generation. In other words, we want to reduce the search space for morphological analysis using eojeol categories. We propose a method for eojeol classification using an instance-based learning technique. To evaluate the proposed system, we use two Korean part-of-speech-tagged test corpora (KAIST and ETRI). Because the test corpora are small, we use cross validation for training and evaluation. With 22 features, the average accuracies on the two corpora are 97% and 96.6%, respectively, but the average accuracy drops to 95.5% when the two corpora are combined. We believe this drop results from inconsistent tagging methods, despite the larger amount of training data. To select optimal features for our system, we employ backward sequential selection. As a result, we choose 16 features as the optimal features, and the performance of our system improves by about 0.2%. Furthermore, the reduction rate is 35% on average when our system is applied to Korean morphological analysis.	-
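dc.description.tableofcontents	Table of Contents: Chapter 1 Introduction; Chapter 2 Related Work; 2.1 Classification; 2.2 Instance-based Learning; 2.3 Decision Trees; 2.4 Transformation-based Learning; Chapter 3 Korean Eojeol Categories; 3.1 Korean Eojeols; 3.2 Korean Eojeol Categories; 3.2.1 Nominal Eojeols; 3.2.2 Predicate Eojeols; 3.2.3 Modifier Eojeols; 3.2.4 Interjections; 3.2.5 Symbols; Chapter 4 Korean Eojeol Classification System; 4.1 System Architecture; 4.2 Training Phase; 4.2.1 Preprocessor; 4.2.2 Eojeol Category Tagger; 4.2.3 Feature Extractor; 4.2.4 Instance-based Learning; 4.3 Execution Phase; 4.3.1 Feature Extractor and Eojeol Classifier; 4.3.2 Postprocessor; Chapter 5 Experiments and Evaluation; 5.1 Experimental Corpora; 5.2 Evaluation Method; 5.3 Eojeol Classifier Performance; 5.4 Feature Optimization; 5.5 Best Performance; 5.6 Error Analysis; 5.7 Morphological Analysis Reduction Rate; Chapter 6 Results and Future Research Directions; References	-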
dc.publisher	韓國海洋大學校	-
dc.title	사례기반 학습을 이용한 한국어 어절 분류	-
dc.title.alternative	Korean Eojeol Classification Using Instance-based Learning	-
dc.type	Thesis	-

Appears in Collections: Department of Computer Engineering > Thesis

Files in This Item: 000002173883.pdf

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

ywm85@kmou.ac.kr Tel: 051-410-4085

KMOU Repository was built as part of the National Library of Korea's OAK Repository distribution project.
