Data Mining Revision Controlled Document History Metadata for Automatic Classification

dc.contributor.advisor: Ethan Munson
dc.contributor.committeemember: Susan McRoy
dc.contributor.committeemember: Dietmar Wolfram
dc.creator: Maass, Dustin
dc.date.accessioned: 2025-01-16T18:49:02Z
dc.date.available: 2025-01-16T18:49:02Z
dc.date.issued: 2013-12-01
dc.description.abstract: Version-controlled documents provide a complete history of their changes, including what was changed, who made each change, and more. Using cluster analysis on several sets of manipulated data, this research examines the revision history of Wikipedia in an attempt to find language-independent patterns that could assist automatic page classification software. Applying this cluster analysis to two sample data sets yielded no conclusive evidence that such patterns exist. Our work on the software, however, provides a foundation for additional types of data manipulation and refined clustering algorithms to be used in further research into finding such patterns.
dc.identifier.uri: http://digital.library.wisc.edu/1793/87444
dc.relation.replaces: https://dc.uwm.edu/etd/296
dc.subject: Classification
dc.subject: Clustering
dc.subject: History
dc.subject: Revision
dc.subject: Wikipedia
dc.title: Data Mining Revision Controlled Document History Metadata for Automatic Classification
dc.type: thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Wisconsin-Milwaukee
thesis.degree.name: Master of Science

Files

Original bundle

Name: Maass_uwm_0263m_10459.pdf
Size: 132.38 KB
Format: Adobe Portable Document Format
Description: Main File