The Relationship Between Precision-Recall and ROC Curves

dc.contributor.author: Davis, Jesse
dc.contributor.author: Goadrich, Mark
dc.date.accessioned: 2012-03-15T17:20:00Z
dc.date.available: 2012-03-15T17:20:00Z
dc.date.created: 2006
dc.date.issued: 2006
dc.description.abstract: Receiver Operator Characteristic (ROC) curves and Precision-Recall (PR) curves are commonly used to present results for binary decision problems in machine learning. When the class distribution is close to uniform, ROC curves have many desirable properties. However, when dealing with a highly skewed dataset, PR curves give a more accurate picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space. We prove that a curve dominates in ROC space if and only if it dominates in PR space. An important corollary to this proof is the notion of an achievable PR curve, and we show an efficient algorithm for computing the achievable PR curve. While it cannot be called a convex hull, this curve has properties much like the convex hull in ROC space. Finally, we show that differences in the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, an algorithm that optimizes the area under the ROC curve is not guaranteed to optimize the area under the PR curve.
dc.format.mimetype: application/pdf
dc.identifier.citation: TR1551
dc.identifier.uri: http://digital.library.wisc.edu/1793/60482
dc.publisher: University of Wisconsin-Madison Department of Computer Sciences
dc.title: The Relationship Between Precision-Recall and ROC Curves
dc.type: Technical Report

Files

Original bundle

Name: TR1551.pdf
Size: 1.27 MB
Format: Adobe Portable Document Format