End-Biased Samples for Join Cardinality Estimation
| dc.contributor.author | Estan, Cristian | en_US |
| dc.contributor.author | Naughton, Jeffrey F. | en_US |
| dc.date.accessioned | 2012-03-15T17:19:38Z | |
| dc.date.available | 2012-03-15T17:19:38Z | |
| dc.date.created | 2005 | en_US |
| dc.date.issued | 2005 | en_US |
| dc.description.abstract | We present a new technique for using samples to estimate join cardinalities. This technique, which we term �end-biased samples,� is inspired by recent work in network traffic measurement. It improves on random samples by using coordinated pseudo-random samples and retaining the sampled values in proportion to their frequency. We show that end-biased samples always provide more accurate estimates than random samples with the same sample size. The comparison with histograms is more interesting � while end-biased histograms are somewhat better than end-biased samples for uncorrelated data sets, end-biased samples dominate by a large margin when the data is correlated. Finally, we compare end-biased samples to the recently proposed �skimmed sketches� and show that neither dominates the other, that each has different and compelling strengths and weaknesses. These results suggest that endbiased samples may be a useful addition to the repertoire of techniques used for data summarization. | en_US |
| dc.format.mimetype | application/pdf | en_US |
| dc.identifier.citation | TR1541 | en_US |
| dc.identifier.uri | http://digital.library.wisc.edu/1793/60464 | |
| dc.publisher | University of Wisconsin-Madison Department of Computer Sciences | en_US |
| dc.title | End-Biased Samples for Join Cardinality Estimation | en_US |
| dc.type | Technical Report | en_US |
Files
Original bundle
1 - 1 of 1