On the Geometric and Statistical Interpretation of Data Augmentation

Feng, Zhili

dc.contributor.advisor
dc.contributor.author	Feng, Zhili
dc.date.accessioned	2019-05-28T16:47:08Z
dc.date.available	2019-05-28T16:47:08Z
dc.date.issued	2019-05-10
dc.description.abstract	Data augmentation (DA) is a common technique in training machine learning models. For example, in image classification, practitioners augment image datasets by random cropping, rotation, and adding random noise. Another popular technique is adversarial training, where the datasets are augmented with adversarial examples. Despite its empirical effectiveness, the theory behind DA is not well understood. In this thesis, we analyze why DA improves the generalization and robustness of models, from both geometric and statistical points of view. Geometrically, we provide both upper and lower bounds on the margins created by DA, via convex geometric arguments. The upper bound on the margin is distribution-independent, while the lower bound on the margin applies to a wide range of probability distributions. Statistically, we prove that DA helps generalization by controlling the stability of the learning algorithm, at very small cost, provided the training set is sufficiently large. In addition, with the same sample complexity, noise robustness is guaranteed.	en_US
dc.identifier.citation	TR1858	en_US
dc.identifier.uri	http://digital.library.wisc.edu/1793/79132
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	tech reports;TR1858
dc.subject	statistical learning theory	en_US
dc.subject	stability	en_US
dc.subject	robustness	en_US
dc.title	On the Geometric and Statistical Interpretation of Data Augmentation	en_US
dc.type	Technical Report	en_US

Files

Original bundle

Name: TR1858.pdf
Size: 325.43 KB
Format: Adobe Portable Document Format
Description: Master's Thesis

License bundle

Name: license.txt
Size: 1.92 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

CS Technical Reports