Korea Maritime and Ocean University (한국해양대학교)

Detailed Information

OCR-Based Safety Check System of Packaged Food for Food Inconvenience Patients

DC Field Value Language
dc.contributor.advisor 신옥근 -
dc.contributor.author NURUL AZZAHRA PUTRI BINTI KAMIS -
dc.date.accessioned 2020-07-22T04:17:48Z -
dc.date.available 2020-07-22T04:17:48Z -
dc.date.issued 2020 -
dc.identifier.uri http://repository.kmou.ac.kr/handle/2014.oak/12342 -
dc.identifier.uri http://kmou.dcollection.net/common/orgView/200000283919 -
dc.description.abstract OCR and digital image processing technologies are developing rapidly and have many applications in research and industry. This thesis presents a method for producing more reliable recognition results by post-processing the output of OCR in domain-specific word recognition tasks. The OCR output is improved by two post-processing steps: tokenization and extraction of correct words using dictionaries. Tokenization separates the text retrieved by OCR into word tokens. The tokens are then compared with an English dictionary and a proprietary dictionary, in that order. The English dictionary is used to convert word tokens into candidate correct words, while the proprietary dictionary is used as a guide to select only the words that are meaningful in the domain-specific task. The practicality of the proposed approach is demonstrated on the task of recognizing the ingredients list printed on the packaging of packaged foods. Given an uploaded image of a packaged food, the system performs OCR to obtain editable text. The editable text is then tokenized into word tokens, which are converted into correct words through the dictionary-based steps. The combined approach gives reliable results: the system returns the ingredients accurately, without useless characters or nonessential ingredients. -
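The two post-processing steps described in the abstract (tokenization, then comparison against an English dictionary and a proprietary dictionary in sequence) can be sketched roughly as follows. This is an illustration only, not the thesis's implementation: the word lists, the function names, and the use of Python's `difflib.get_close_matches` for picking a candidate correct word are all assumptions made for this example, since the record does not say how the matching is done.

```python
import re
from difflib import get_close_matches

# Hypothetical stand-in for a general English dictionary.
ENGLISH_WORDS = ["contains", "flour", "ingredients", "milk", "salt", "sugar", "wheat"]
# Hypothetical stand-in for the proprietary (domain) dictionary of ingredient names.
DOMAIN_WORDS = {"flour", "milk", "salt", "sugar", "wheat"}


def tokenize(ocr_text):
    """Split raw OCR text into lowercase alphabetic word tokens."""
    return re.findall(r"[a-z]+", ocr_text.lower())


def correct(token, vocabulary, cutoff=0.75):
    """Return the closest vocabulary word, or None if nothing is similar enough.

    Similarity-based matching is an assumption of this sketch.
    """
    matches = get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def extract_ingredients(ocr_text):
    """Tokenize, correct against the English list, then keep only domain words."""
    found = []
    for token in tokenize(ocr_text):
        word = correct(token, ENGLISH_WORDS)
        if word in DOMAIN_WORDS and word not in found:
            found.append(word)
    return found


if __name__ == "__main__":
    print(extract_ingredients("INGREDIENTS: Whaet flour, suger, salt. Contains milk."))
```

In this sketch, misread tokens such as "whaet" are mapped to the nearest English word first, and words that are valid English but not in the domain list (for example "contains") are discarded in the second step.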
dc.description.tableofcontents
Abstract
List of Abbreviations
List of Tables
List of Figures
Chapter I: Introduction
  1.1 Background of Research
  1.2 Research Objectives
Chapter II: Literature Review
  2.1 Review of Research Topics
    2.1.1 Review of Optical Character Recognition (OCR)
    2.1.2 Review of The Tesseract OCR Engine
    2.1.3 Review of OCR Post-Processing
    2.1.4 Review of Food Intolerance, Allergies and Auto-immune Diseases
  2.2 Review of Related Work
    2.2.1 Eatable
    2.2.2 Food Allergy Scanner
Chapter III: System Design
  3.1 Overall Architecture of the System
  3.2 Get Image of Product’s Ingredients
    3.2.1 Client-Server Architecture
  3.3 Perform OCR
    3.3.1 OCR and Its Pre-Processing
  3.4 Post-Processing of OCR
    3.4.1 Tokenization
    3.4.2 Extract Correct Ingredients Using Dictionaries
  3.5 Search Harmful Ingredients for The User using Database
  3.6 Notify Result to the user
Chapter IV: System Implementation
  4.1 Overall Explanation
    4.1.1 Pre-Processing of OCR
    4.1.2 Post-Processing of OCR
  4.2 Database of the System
  4.3 System Prototype using Android Studio
Chapter V: Conclusion
References
Acknowledgement -
dc.format.extent 61 -
dc.language eng -
dc.publisher Graduate School of Korea Maritime and Ocean University -
dc.rights Theses of Korea Maritime and Ocean University are protected by copyright. -
dc.title OCR-Based Safety Check System of Packaged Food for Food Inconvenience Patients -
dc.type Dissertation -
dc.date.awarded 2020. 2 -
dc.contributor.department Graduate School, Department of Computer Engineering -
dc.description.degree Master -
dc.identifier.bibliographicCitation NURUL AZZAHRA PUTRI BINTI KAMIS. (2020). OCR-Based Safety Check System of Packaged Food for Food Inconvenience Patients. -
dc.subject.keyword OCR, Post-Processing, Proprietary Dictionary, Tokenization, Tesseract, Food allergy, packaged food, OCR post-processing -
dc.title.translated 식품 불내성 환자를 위한 포장 식품의 OCR 기반 안전 확인시스템 -
dc.identifier.holdings 000000001979▲200000001565▲200000283919▲ -
Appears in Collections:
Department of Shipping and Port Logistics > Thesis
Files in This Item:
200000283919.pdf Download

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
