Pipeline for Calculating Calories for Print Recipes with Minimal User Intervention

dc.contributor.advisor Susan McRoy
dc.contributor.committeemember Ethan Munson
dc.contributor.committeemember Tian Zhao
dc.creator Holten, Karl W
dc.date.accessioned 2025-01-16T18:51:53Z
dc.date.available 2025-01-16T18:51:53Z
dc.date.issued 2022-08-01
dc.description.abstract This thesis presents a pipeline for estimating calorie counts from print recipes. The pipeline takes scanned recipes from cookbooks and uses Optical Character Recognition (OCR) to convert the scanned images to text. Several OCR tools were tested for their accuracy on fractions using a sample of the data, and the most accurate tool was then used on the data. Next, a specially trained named entity recognition model identifies ingredients, quantities, and units. These ingredients are used to search a database of values from the FDA to compute a calorie count for the recipe. The thesis tests the effectiveness of this search by examining its performance on 100 of the most common ingredients in the corpus of recipes. Finally, the thesis tests the performance of the model on a set of recipes and finds that it estimates calorie counts at least as accurately as other automated approaches, such as those based on image recognition.
dc.identifier.uri http://digital.library.wisc.edu/1793/87508
dc.relation.replaces https://dc.uwm.edu/etd/3016
dc.subject BERT
dc.subject Dietary Self Monitoring
dc.subject Named Entity Recognition
dc.subject Optical Character Recognition
dc.title Pipeline for Calculating Calories for Print Recipes with Minimal User Intervention
dc.type thesis
thesis.degree.discipline Computer Science
thesis.degree.grantor University of Wisconsin-Milwaukee
thesis.degree.name Master of Science
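
The last stage described in the abstract, turning recognized ingredients, quantities, and units into a calorie total, might look roughly like the sketch below. It is an illustration only and is not taken from the thesis: the function names `parse_quantity` and `recipe_calories`, the `(ingredient, quantity, unit)` tuple layout, and the lookup table keyed by ingredient and unit are all assumptions. Fractions get explicit handling because the abstract singles them out as an OCR accuracy concern.

```python
from fractions import Fraction
import unicodedata


def parse_quantity(text: str) -> Fraction:
    """Parse quantities such as '2', '1/2', '1 1/2', '¾', or '1½' exactly."""
    pieces = []
    for ch in text:
        # Single-character fractions such as '½' have Unicode category 'No';
        # rewrite them as ' n/d ' so they can be summed with whole numbers.
        if unicodedata.category(ch) == "No":
            value = Fraction(unicodedata.numeric(ch)).limit_denominator(100)
            pieces.append(f" {value.numerator}/{value.denominator} ")
        else:
            pieces.append(ch)
    tokens = "".join(pieces).split()
    return sum((Fraction(token) for token in tokens), Fraction(0))


def recipe_calories(entities, kcal_per_unit) -> float:
    """Sum calories over (ingredient, quantity, unit) tuples.

    kcal_per_unit is a hypothetical lookup table mapping
    (ingredient, unit) to kilocalories per one unit of that ingredient.
    """
    total = Fraction(0)
    for ingredient, quantity, unit in entities:
        total += parse_quantity(quantity) * kcal_per_unit[(ingredient, unit)]
    return float(total)
```

A caller would pass the tuples produced by the entity recognition step together with a table built from the nutrition database. An ingredient missing from the table raises `KeyError` in this sketch, so a fuller version would need a fallback or a fuzzy search at that point.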

Files

Original bundle

Name: Holten_uwm_0263M_13322.pdf
Size: 956.97 KB
Format: Adobe Portable Document Format
Description: Main File