Korea Maritime and Ocean University (한국해양대학교)

Detailed Information


Network Analysis of Maritime English Corpus with Multi-word Compounds

DC Field Value Language
dc.contributor.author 이성민 -
dc.date.accessioned 2017-02-22T02:24:47Z -
dc.date.available 2017-02-22T02:24:47Z -
dc.date.issued 2016 -
dc.date.submitted 2016-03-12 -
dc.identifier.uri http://kmou.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002236076 ko_KR
dc.identifier.uri http://repository.kmou.ac.kr/handle/2014.oak/8361 -
dc.description.abstract As an official language of the international maritime community, maritime English is a branch of English for Specific Purposes (ESP). However, corpus linguists have paid little attention to it. This thesis has two aims. The first is to compile a four-million-word maritime English corpus (MEC) drawn from academic texts, news, laws, and textbooks. The MEC contains tagged multi-word compounds, which serve as specific purpose terms in maritime English. Tagging multi-word compounds is essential for ESP research because maritime vocabulary includes a great variety of n-grams, such as ballast water, fore peak bulkhead, and container freight station charges. The second aim is to add explanatory depth to corpus linguistic data by adopting language network analysis and comparing keyword networks with collocation networks. The idea of combining corpus linguistics with language networks can be traced back to research published by Jones in 1971 and by Scott and Tribble in 2006. In her keyword retrieval study, Jones discussed four types of links between keyword nodes: strings, stars, cliques, and clumps. Building on Jones' work, Scott and Tribble hypothesized that keywords could be redrawn as a network of connections that gives a picture of a text or set of texts. By combining corpus linguistics and language networks, this thesis explores what the structures of keyword networks and collocation networks can tell us about maritime English, using centrality and cohesion algorithms. It addresses two research questions. First, how can we build a corpus of maritime English that represents specific purpose terms such as multi-word compounds? Second, if language network analysis can serve as an explanatory analysis that supplements existing corpus linguistic descriptions, what can keyword networks and collocation networks tell us about the MEC?
To pursue these research questions, I review previous studies on keyness, collocations, and language networks. I then discuss how to compile the MEC with attention to representativeness, balance, size, and sampling, and propose a method for tagging English multi-word compounds. In addition, I propose a language network analysis to give further explanatory power to descriptions of maritime English. I compare keyword networks with collocation networks in terms of network structure, using centrality and cohesion measures. In conclusion, the network analysis and critical evaluation confirm that centrality structures based on eigenvector and betweenness measures in collocation networks are better than those in keyword networks at finding general purpose terms. By contrast, cohesion community structures based on eigenvector and betweenness measures in keyword networks distinguish a group of specific purpose terms from a group of general purpose terms. More specifically, eigenvector centrality structures in collocation networks gave better results than betweenness centrality in identifying general purpose terms, while eigenvector cohesion community structures in keyword networks gave better results than betweenness in identifying specific purpose terms. -
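The pipeline the abstract describes (tag multi-word compounds, link co-occurring terms, rank nodes by centrality) can be illustrated with a minimal Python sketch. This is not the thesis's own implementation: the compound list, the underscore tagging scheme, the window size, and all function names are assumptions made for illustration, and the eigenvector centrality is a plain shifted power iteration rather than any particular tool's algorithm.

```python
import math
from collections import defaultdict

# Hypothetical compound list; the three terms are the examples named in the abstract.
COMPOUNDS = [
    ("ballast", "water"),
    ("fore", "peak", "bulkhead"),
    ("container", "freight", "station", "charges"),
]


def tag_compounds(tokens, compounds=COMPOUNDS):
    """Greedy longest match: join each known compound into one underscore-linked token."""
    by_length = sorted(compounds, key=len, reverse=True)
    out, i = [], 0
    while i < len(tokens):
        for comp in by_length:
            if tuple(tokens[i:i + len(comp)]) == comp:
                out.append("_".join(comp))
                i += len(comp)
                break
        else:
            # No compound starts here, so keep the single token.
            out.append(tokens[i])
            i += 1
    return out


def collocation_graph(tokens, window=2):
    """Undirected weighted graph; edge weight counts co-occurrences within `window` tokens."""
    graph = defaultdict(lambda: defaultdict(int))
    for i, node in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != node:
                graph[node][other] += 1
                graph[other][node] += 1
    return graph


def eigenvector_centrality(graph, iterations=100, tol=1e-9):
    """Power iteration on (A + I), L2-normalised; the shift avoids oscillation."""
    scores = {n: 1.0 for n in graph}
    for _ in range(iterations):
        new = {n: scores[n] + sum(w * scores[m] for m, w in graph[n].items())
               for n in graph}
        norm = math.sqrt(sum(v * v for v in new.values())) or 1.0
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - scores[n]) for n in graph) < tol:
            return new
        scores = new
    return scores


text = ("the ship discharged ballast water before the ballast water "
        "exchange near the fore peak bulkhead")
tagged = tag_compounds(text.split())
graph = collocation_graph(tagged)
scores = eigenvector_centrality(graph)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{term}\t{score:.3f}")
```

Tagging before building the graph means a compound such as ballast_water enters the network as a single node instead of being split across its component words, which is the effect the abstract attributes to compound tagging. Betweenness centrality and community detection would need additional code or a graph library and are not shown here.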
dc.description.tableofcontents
Chapter 1. Introduction
  1.1 Focus of Inquiry
  1.2 Outline of the Thesis
Chapter 2. Literature Review
  2.1 Introduction
  2.2 Maritime English as English for Specific Purposes
  2.3 Keywords in Text
    2.3.1 Strategies for a Reference Corpus
    2.3.2 Statistical Measures for Keyword Analysis
    2.3.3 Problems of Previous Keyword Analysis
  2.4 Collocations in Text
    2.4.1 Types of Collocations
    2.4.2 Statistical Measures for Window Collocations
    2.4.3 Problems of Previous Collocation Analysis
  2.5 Visualization in Corpus Linguistics
    2.5.1 Text Visualizations
    2.5.2 Collocation Networks
  2.6 Language Networks
    2.6.1 Basic Concepts
    2.6.2 Previous Studies
    2.6.3 Definitions
    2.6.4 Types of Language Network Constructions
Chapter 3. Maritime English Corpus
  3.1 Introduction
  3.2 Corpus Design
  3.3 Corpus Compilation
    3.3.1 Stratified Random Sampling
    3.3.2 Web Crawling and Cleansing
    3.3.3 Converting PDF to Texts
  3.4 Multi-word Compounds
  3.5 Critical Evaluation and Tagging for Multi-word Compounds
  3.6 Comparison of With and Without Compounds
    3.6.1 Comparison of Basic Statistics
    3.6.2 Comparison of Word Lists, N-gram Lists, and Keyword Lists
    3.6.3 Comparison of Visualizations
      3.6.3.1 Dispersion Plots
      3.6.3.2 GraphColl 1.0
  3.7 Summary and Implications
Chapter 4. Language Network Structure Analysis
  4.1 Introduction
  4.2 Frameworks of Network Analysis
    4.2.1 Source Nodes and Target Nodes
    4.2.2 Two Mode Structures and One Mode Structures
    4.2.3 Centrality and Cohesion Algorithms
  4.3 Comparison of Keyword Networks and Collocation Networks
    4.3.1 Centrality Structures: Eigenvector and Betweenness
    4.3.2 Cohesion Structures: Eigenvector and Betweenness
  4.4 Critical Evaluation
  4.5 Summary and Implications
Chapter 5. Conclusion
  5.1 Summary
  5.2 Findings and Implications
References -
dc.language eng -
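dc.publisher Korea Maritime and Ocean University, College of International Studies, Department of English Language and Literature (한국해양대학교 국제대학 영어영문학과) -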
dc.title Network Analysis of Maritime English Corpus with Multi-word Compounds -
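dc.title.alternative Language Network Analysis of a Maritime English Corpus Containing Multi-word Compounds: Keyword Networks and Collocation Networks (복수 단어로 구성된 합성어가 포함된 해사영어 코퍼스 언어네트워크 분석: 키워드 네트워크와 연어 네트워크) -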
dc.type Thesis -
dc.date.awarded 2016-02 -
dc.contributor.alternativeName Sung-Min Lee -
Appears in Collections:
Department of English Language and Literature (영어영문학과) > Thesis
Files in This Item:
000002236076.pdf

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
