A Study on a System for Extracting Core Patent Documents Using Machine Learning
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 류길수 | - |
dc.contributor.author | 윤병수 | - |
dc.date.accessioned | 2019-12-16T02:42:07Z | - |
dc.date.available | 2019-12-16T02:42:07Z | - |
dc.date.issued | 2017 | - |
dc.identifier.uri | http://repository.kmou.ac.kr/handle/2014.oak/11371 | - |
dc.identifier.uri | http://kmou.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002330737 | - |
dc.description.abstract | Governments and corporations alike are promoting research and development (R&D) of internationally competitive new growth engines to overcome the ongoing global economic downturn. To create valuable, internationally competitive patented technology, they use patent trend analysis to set the direction of R&D from the planning and evaluation stages onward. Patent trend analysis, however, has a serious problem: it is time-consuming and error-prone, because patent researchers must manually examine the extracted candidate patent documents one by one and rely on their own expertise to understand the patented technology. In this dissertation, we propose a method for extracting core patent documents using information retrieval and machine learning. The method consists of three steps: 1) extract valid patent documents from the patent documents retrieved by a patent search service; 2) classify the valid patent documents into sub-technology categories; and 3) extract core patent documents from the valid patent documents in each sub-technology category. The first step obtains valid patent documents for a queried technology by ranking the retrieved patent documents by the cosine similarity between the vector of each retrieved patent document and the vector of the technical summary that describes the queried technology. The second step classifies the valid patent documents into sub-technology categories using a five-layer neural network whose input is the TF-IDF weights and technology-related weights of each valid patent document. The final step ranks the valid patent documents within each sub-technology category by a linear combination of patent feature values (for instance, impact factor, number of family nations, and cosine similarity) and patent feature priorities. 
For the evaluation, we analyzed patent trends in radiopharmaceuticals as an example. The patent search service retrieved 4,603 candidate patent documents for a technical summary used as the queried technology. We compared the results of the proposed system with those obtained manually by a patent investigator in terms of time and accuracy. In execution time, the manual process took 13,095 minutes, whereas the proposed system performed the same operations in 134 minutes, making it more than 97 times as fast. The proposed system achieved an accuracy of 86.88% in extracting valid patent documents, 91.08% in classifying them into sub-technology categories, and 75.76% in extracting core patent documents. Consequently, we have shown that the proposed system is effective because it helps patent researchers save time and reduce errors. In future work, we will improve the accuracy of the proposed system using cutting-edge techniques such as deep learning and apply it to areas other than radiopharmaceuticals. | - |
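dc.description.tableofcontents | 1. Introduction; 1.1 Research Background; 1.2 Research Purpose and Scope; 1.3 Research Method; 2. Related Work; 2.1 Information Retrieval; 2.1.1 Index Term Extraction and Weighting; 2.1.2 Document Similarity; 2.2 Document Classification; 2.3 Neural Network Theory; 2.4 Characteristics of Patent Documents; 3. Method for Extracting Core Patent Documents; 3.1 Overview of the Core Patent Document Extraction System; 3.2 Extraction of Valid Patent Documents; 3.2.1 Feature Extraction; 3.2.2 Feature Extraction from the Technical Summary; 3.2.3 Feature Extraction from Patent Documents; 3.2.4 Similarity Measurement; 3.3 Classification of Patent Documents by Sub-Technology; 3.4 Extraction of Core Patent Documents; 4. Experiments and Evaluation; 4.1 Overview of the Experimental Procedure; 4.2 Experimental Environment; 4.3 Experimental Data; 4.4 Evaluation Method; 4.5 Performance Evaluation; 4.5.1 Extraction of Valid Patent Documents; 4.5.2 Classification of Patent Documents by Sub-Technology; 4.5.3 Extraction of Core Patent Documents; 4.6 Analysis of Results; 5. Conclusion and Future Work; References | - |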
dc.format.extent | 131 | - |
dc.language | kor | - |
dc.publisher | Graduate School, Korea Maritime and Ocean University | - |
dc.rights | Theses of Korea Maritime and Ocean University are protected by copyright. | - |
dc.title | A Study on a System for Extracting Core Patent Documents Using Machine Learning | - |
dc.type | Dissertation | - |
dc.date.awarded | 2017-02 | - |
dc.contributor.alternativeName | Yoon, Byung Soo | - |
dc.contributor.department | Graduate School, Department of Computer Engineering | - |
dc.contributor.affiliation | Graduate School, Korea Maritime and Ocean University | - |
dc.description.degree | Doctor | - |
dc.subject.keyword | Patent Trend Analysis, Machine Learning, Cosine Similarity, Neural Network | - |
dc.type.local | Text | - |
dc.title.translated | Extracting Core Patent Documents Using Machine Learning | - |
dc.identifier.holdings | 000000001979▲000000006780▲000002330737▲ | - |
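
The first and final steps described in the abstract (ranking retrieved patents by cosine similarity to the technical summary, then scoring them by a linear combination of feature values and priorities) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. The whitespace tokenizer, the smoothed IDF formula, the sample texts, the feature names, and the priority weights are all assumptions made for illustration, and the feature values are assumed to be pre-normalized to a common scale.

```python
import math
from collections import Counter


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per tokenized text."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        term_freq = Counter(tokens)
        # Assumed smoothed IDF, so a term found in every text keeps a nonzero weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_similarity(summary, patents):
    """Step 1: rank patents by cosine similarity to the technical summary."""
    ids = list(patents)
    texts = [summary] + [patents[i] for i in ids]
    vectors = tfidf_vectors([text.lower().split() for text in texts])
    scores = {i: cosine(vectors[0], vec) for i, vec in zip(ids, vectors[1:])}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def core_score(features, priorities):
    """Step 3: linear combination of feature values and their priorities."""
    return sum(priorities[name] * value for name, value in features.items())


# Hypothetical sample data, not taken from the dissertation.
summary = "radiolabeled compound for tumor imaging"
patents = {
    "P1": "radiolabeled peptide compound for tumor imaging with PET",
    "P2": "hull coating for marine vessels",
    "P3": "imaging agent kit containing a radiolabeled compound",
}
ranking = rank_by_similarity(summary, patents)

# Hypothetical priorities; the dissertation's actual weights are not given here.
priorities = {"impact_factor": 0.3, "family_nations": 0.2, "cosine_similarity": 0.5}
features = {"impact_factor": 0.5, "family_nations": 0.2, "cosine_similarity": 0.8}
score = core_score(features, priorities)
```

In this sketch a cutoff on the similarity ranking would separate valid from non-valid documents; how the dissertation chooses that cutoff is not stated in the abstract.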