Korea Maritime and Ocean University

Detailed Information

OCR-Based Safety Check System of Packaged Food for Food Inconvenience Patients

Title
OCR-Based Safety Check System of Packaged Food for Food Inconvenience Patients
Author(s)
NURUL AZZAHRA PUTRI BINTI KAMIS
Keyword
OCR, Post-Processing, Proprietary Dictionary, Tokenization, Tesseract, Food allergy, packaged food, OCR post-processing
Issued Date
2020
Publisher
Graduate School, Korea Maritime and Ocean University
URI
http://repository.kmou.ac.kr/handle/2014.oak/12342
http://kmou.dcollection.net/common/orgView/200000283919
Abstract
OCR and digital image processing technologies are developing rapidly and have many applications in research and industry. This thesis presents a method for producing more reliable recognition results in domain-specific word recognition tasks by post-processing the output of OCR. The OCR output is improved by two post-processing steps: tokenization and the extraction of correct words using dictionaries. Tokenization separates the text retrieved by OCR into word tokens. The tokens are then compared against an English dictionary and a proprietary dictionary, in that order. The English dictionary is used to convert word tokens into correct word candidates, while the proprietary dictionary serves as a guide for selecting only the words that are meaningful in the domain-specific task. The practicality of the proposed approach is demonstrated on the task of recognizing the ingredients list printed on the packaging of packaged foods. Given an uploaded image of a packaged food, the system performs OCR to obtain editable text. The editable text is then tokenized into word tokens, which the dictionary-based post-processing steps convert into correct words. The combined approach gives reliable results: the system returns an accurate list of ingredients, free of useless characters and nonessential ingredients.
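The two post-processing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the OCR step itself (the keywords name Tesseract) is assumed to have already produced a raw text string, the two word lists are tiny hypothetical stand-ins for the English and proprietary dictionaries, and fuzzy matching with the standard library's `difflib.get_close_matches` is an assumed substitute for whatever matching rule the thesis actually uses.

```python
import re
from difflib import get_close_matches

# Hypothetical stand-ins for the two dictionaries (not taken from the thesis).
ENGLISH_WORDS = ["ingredients", "wheat", "flour", "sugar", "milk", "salt", "water"]
PROPRIETARY_WORDS = {"wheat", "milk"}  # e.g. ingredients the user must avoid


def tokenize(ocr_text):
    """Split raw OCR text into lowercase tokens, dropping fragments shorter than 3 characters."""
    parts = re.split(r"[\s,.;:()]+", ocr_text.lower())
    return [p for p in parts if len(p) >= 3]


def correct(token, vocabulary, cutoff=0.7):
    """Return the closest vocabulary word for a token, or None if nothing is close enough."""
    matches = get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def extract_ingredients(ocr_text):
    """Tokenize, correct against the English list, then keep only proprietary-list words."""
    found = []
    for token in tokenize(ocr_text):
        word = correct(token, ENGLISH_WORDS)  # step 1: correct-word candidate
        if word in PROPRIETARY_WORDS and word not in found:  # step 2: domain filter
            found.append(word)
    return found


if __name__ == "__main__":
    # Simulated noisy OCR output; a real run would obtain this from an OCR engine.
    print(extract_ingredients("INGREDIENTS: Whcat flour, sugar, m1lk, sa1t #@ 12"))
```

In this sketch the English list repairs OCR noise such as `Whcat` or `m1lk`, and the proprietary list then discards words that are valid English but irrelevant to the user. The similarity cutoff of 0.7 is an arbitrary illustrative value and would need tuning against real OCR output.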
Appears in Collections:
Department of Shipping and Port Logistics > Thesis
Files in This Item:
200000283919.pdf

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
