Korea Maritime and Ocean University (한국해양대학교)

Detailed Information

Web Structure Mining by Extracting Hyperlinks from Web Documents and Access Logs (웹 문서와 접근로그의 하이퍼링크 추출을 통한 웹 구조 마이닝)

DC Field Value Language
dc.contributor.author 박철현 -
dc.date.accessioned 2017-02-22T06:48:57Z -
dc.date.available 2017-02-22T06:48:57Z -
dc.date.issued 2007 -
dc.date.submitted 56850-02-09 -
dc.identifier.uri http://kmou.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002175608 ko_KR
dc.identifier.uri http://repository.kmou.ac.kr/handle/2014.oak/9856 -
dc.description.abstract Web structures are difficult to predict because they change rapidly as documents on the Web are frequently updated. Nevertheless, given the structures, information providers can discover users' behavior patterns and characteristics and supply better services, and users can find useful information easily and accurately. This paper proposes an improved method for extracting Web structures. The method consists of two steps. The first constructs a directed graph with Web documents as nodes and their hyperlinks as edges, using the depth-first search algorithm. The second supplements the directed graph by discovering hyperlinks that were not extracted in the first step, called hidden hyperlinks. They can be found by analyzing Web access logs, which contain click streams. The click streams do not include clicks on the 'Back' button because of the local cache of Web browsers, which makes it difficult to find correct hidden hyperlinks. To cope with this problem, this paper proposes an algorithm for searching for hidden hyperlinks. We simulated the discovery of hidden hyperlinks to evaluate the proposed method experimentally. Through the simulations, we observed that the proposed method discovers most of the hidden hyperlinks that appear in the click streams. In the future, we should develop tools for visualizing discovered Web structures and study how to discover hidden hyperlinks more accurately by improving the proposed algorithm. -
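dc.description.tableofcontents Chapter 1 Introduction = 1; Chapter 2 Related Work = 4; 2.1 Web Mining = 4; 2.2 Web Logs = 6; 2.3 Web Log Preprocessing = 10; 2.4 Click Streams = 14; Chapter 3 Structuring Web Documents = 15; 3.1 Structuring the Hyperlinks of Web Documents = 16; 3.2 Access Log Preprocessing = 23; 3.3 Collecting Click Streams = 25; 3.4 Adding Vertices and Edges from Access Logs = 28; Chapter 4 System Implementation and Experiments = 37; 4.1 System Implementation Environment = 37; 4.2 Simulation = 42; Chapter 5 Conclusion and Future Work = 51; References = 53 -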
dc.language kor -
dc.publisher Graduate School, Korea Maritime and Ocean University (한국해양대학교 대학원) -
dc.title 웹 문서와 접근로그의 하이퍼링크 추출을 통한 웹 구조 마이닝 -
dc.title.alternative Web Structure Mining by Extracting Hyperlinks from Web Documents and Access Logs -
dc.type Thesis -
dc.date.awarded 2007-02 -
dc.contributor.alternativeName Park, Chul-Hyun -
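
The two steps described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm, whose details this record does not give. Step 1 is a plain depth-first traversal. Step 2 uses an assumed heuristic: when a logged click has no known hyperlink from the previous page, the sketch first checks whether an unlogged 'Back' click could explain it, and only otherwise records a hidden hyperlink. The names `build_graph`, `add_hidden_links`, `links_of`, and the toy `SITE` data are hypothetical.

```python
def build_graph(start, links_of):
    """Step 1: depth-first traversal from `start`; each hyperlink becomes a directed edge.

    `links_of` is a hypothetical callable returning the hyperlinks found in a page.
    """
    graph = {}
    stack = [start]
    while stack:
        page = stack.pop()
        if page in graph:          # already visited
            continue
        graph[page] = set()
        for target in links_of(page):
            graph[page].add(target)
            if target not in graph:
                stack.append(target)
    return graph


def add_hidden_links(graph, clickstream):
    """Step 2 (assumed heuristic, not the thesis's method).

    Walks one session's click stream and returns the hidden edges it added.
    """
    hidden = []
    path = []  # pages the user could have returned to with 'Back'
    for page in clickstream:
        graph.setdefault(page, set())
        if path:
            # An unlogged 'Back' click may have returned the user to an
            # earlier page in the session that links to the current page.
            for i in range(len(path) - 1, -1, -1):
                if page in graph.get(path[i], ()):
                    del path[i + 1:]   # assume the user backed up to path[i]
                    break
            else:
                # No earlier page links here: record a hidden hyperlink
                # from the last logged page.
                source = path[-1]
                graph.setdefault(source, set()).add(page)
                hidden.append((source, page))
        path.append(page)
    return hidden


# Toy site: page -> hyperlinks found in its document (hypothetical data).
SITE = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
graph = build_graph("A", lambda page: SITE.get(page, []))

# The user goes A -> B -> D, presses 'Back' twice (not logged), opens C,
# then follows a link to E that the crawl did not find.
hidden = add_hidden_links(graph, ["A", "B", "D", "C", "E"])
print(hidden)  # [('C', 'E')]
```

In this toy session the jump from D to C is attributed to 'Back' clicks because A already links to C, so only the C to E transition is reported as a hidden hyperlink.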
Appears in Collections:
Department of Computer Engineering (컴퓨터공학과) > Thesis
Files in This Item:
000002175608.pdf Download

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
