Evaluation of inverse reinforcement learning

dc.contributor.advisor: Nguyen, Hien
dc.contributor.advisor: Gunawardena, Athula
dc.contributor.advisor: Zhou, Jiazhen
dc.contributor.author: Schmitt, Anthony
dc.date.accessioned: 2020-01-15T20:34:41Z
dc.date.available: 2020-01-15T20:34:41Z
dc.date.issued: 2019-12
dc.description: This file was last viewed in Adobe Acrobat Pro.
dc.description.abstract: Inverse Reinforcement Learning (IRL) is a technique concerned with learning the intrinsic reward function of an expert by observing them perform a task. Many methods exist, from linear programming to deep learning, for efficiently computing the reward values. However, evaluation of these methods is often minimal: typically only whether a goal is reached is checked, which is not enough to establish that the expert's reward function has been successfully captured, and there is no standard for evaluating the outcome of IRL. This thesis proposes a method for measuring the accuracy of any IRL method. The measure requires the construction of two graphs: the first is built from the observation trajectories used for IRL, and the second is created by an agent acting in the environment on the results of the IRL. These graphs can then be compared using graph edit distance, which yields a discrete measure for evaluating the accuracy of any IRL method.
dc.identifier.uri: http://digital.library.wisc.edu/1793/79580
dc.language.iso: en_US
dc.publisher: University of Wisconsin--Whitewater
dc.subject: Reinforcement learning
dc.subject: Linear programming
dc.title: Evaluation of inverse reinforcement learning
dc.type: Thesis
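The evaluation idea in the abstract (build one graph from expert trajectories, another from an agent following the IRL-recovered reward, then compare them with graph edit distance) can be sketched as follows. This is a minimal illustration assuming the `networkx` library; the `trajectory_graph` helper and the toy trajectory data are hypothetical and not taken from the thesis itself.

```python
# Illustrative sketch of the proposed evaluation, not the thesis's code.
import networkx as nx

def trajectory_graph(trajectories):
    """Build a directed graph whose edges are the observed state transitions."""
    g = nx.DiGraph()
    for traj in trajectories:
        for s, s_next in zip(traj, traj[1:]):
            g.add_edge(s, s_next)
    return g

# Toy expert demonstrations: each trajectory is a sequence of state ids.
expert_trajs = [[0, 1, 2, 3], [0, 2, 3]]
# Toy trajectories from an agent acting on the IRL-recovered reward.
agent_trajs = [[0, 1, 2, 3], [0, 1, 3]]

g_expert = trajectory_graph(expert_trajs)
g_agent = trajectory_graph(agent_trajs)

# Graph edit distance: minimum number of node/edge insertions and
# deletions needed to turn one graph into the other. 0 means the
# observed behaviours induce identical transition structure.
distance = nx.graph_edit_distance(g_expert, g_agent)
print(distance)
```

A smaller distance indicates that the agent's behaviour under the recovered reward reproduces the expert's transition structure more closely, which is the accuracy measure the thesis proposes.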

Files

Original bundle
Name: TonySchmittThesis.pdf
Size: 2.64 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.92 KB
Format: Item-specific license agreed upon to submission