Bayesian Methods and Machine Learning for Processing Text and Image Data

Gu, Yingying

dc.contributor.advisor	Jun Zhang
dc.contributor.committeemember	Susan W McRoy
dc.contributor.committeemember	Tian Zhao
dc.contributor.committeemember	Yi Hu
dc.contributor.committeemember	Seyed H Hosseini
dc.creator	Gu, Yingying
dc.date.accessioned	2025-01-16T18:05:29Z
dc.date.available	2025-01-16T18:05:29Z
dc.date.issued	2017-08-01
dc.description.abstract	Classification/clustering is an important class of unstructured data processing problems. Classification (supervised, semi-supervised and unsupervised) aims to discover clusters and group similar data into categories for information organization and knowledge discovery. My work focuses on using Bayesian methods and machine learning techniques to classify free-text and image data, and addresses how to overcome the limitations of traditional methods. The Bayesian approach makes it possible to use more variations (numerical or categorical) and to estimate probabilities instead of relying on explicit rules, which helps in ambiguous cases. MAP (maximum a posteriori) estimation is used to deal with the local-maximum problems for which the ML (maximum likelihood) method gives inaccurate estimates. The EM (expectation-maximization) algorithm can be combined with MAP estimation for incomplete/missing data problems. Our proposed framework can be used in both supervised and unsupervised classification. For natural language processing (NLP), we applied machine learning techniques to sentence/text classification. For 3D CT image segmentation, a MAP EM clustering approach is proposed to automatically detect the number of objects in a 3D CT luggage image; the prior knowledge and constraints in MAP estimation are used to avoid or mitigate the local-maximum problems. The algorithm can automatically determine the number of classes and find the optimal parameters for each class. As a result, it can automatically detect the number of objects and produce a better segmentation of each object in the image. For segmented object recognition, we applied machine learning techniques to classify each object as a target or non-target. We achieved good results, with 90% PD (probability of detection) and 6% PFA (probability of false alarm).
For image restoration in X-ray imaging, scatter can produce noise, artifacts, and decreased contrast. In practice, hardware such as an anti-scatter grid is often used to reduce scatter. However, the remaining scatter can still be significant, and additional software-based correction is desirable. Furthermore, good software solutions can potentially reduce the amount of anti-scatter hardware needed, thereby reducing cost. In this work, scatter correction is formulated as a Bayesian MAP problem with a non-local prior, which leads to better preservation of textural detail during scatter reduction. The efficacy of our algorithm is demonstrated through experimental and simulation results.
dc.identifier.uri	http://digital.library.wisc.edu/1793/85974
dc.relation.replaces	https://dc.uwm.edu/etd/1633
dc.subject	Bayesian Method
dc.subject	Image Classification
dc.subject	Image Restoration
dc.subject	Image Segmentation
dc.subject	Machine Learning
dc.subject	Natural Language Processing
dc.title	Bayesian Methods and Machine Learning for Processing Text and Image Data
dc.type	dissertation
thesis.degree.discipline	Engineering
thesis.degree.grantor	University of Wisconsin-Milwaukee
thesis.degree.name	Doctor of Philosophy

Files

Original bundle

Name: Gu_uwm_0263D_11895.pdf
Size: 2.01 MB
Format: Adobe Portable Document Format
Description: Main File

Collections

UW Milwaukee Electronic Theses and Dissertations