End-Biased Samples for Join Cardinality Estimation

dc.contributor.authorEstan, Cristianen_US
dc.contributor.authorNaughton, Jeffrey F.en_US
dc.date.accessioned2012-03-15T17:19:38Z
dc.date.available2012-03-15T17:19:38Z
dc.date.created2005en_US
dc.date.issued2005en_US
dc.description.abstractWe present a new technique for using samples to estimate join cardinalities. This technique, which we term �end-biased samples,� is inspired by recent work in network traffic measurement. It improves on random samples by using coordinated pseudo-random samples and retaining the sampled values in proportion to their frequency. We show that end-biased samples always provide more accurate estimates than random samples with the same sample size. The comparison with histograms is more interesting � while end-biased histograms are somewhat better than end-biased samples for uncorrelated data sets, end-biased samples dominate by a large margin when the data is correlated. Finally, we compare end-biased samples to the recently proposed �skimmed sketches� and show that neither dominates the other, that each has different and compelling strengths and weaknesses. These results suggest that endbiased samples may be a useful addition to the repertoire of techniques used for data summarization.en_US
dc.format.mimetypeapplication/pdfen_US
dc.identifier.citationTR1541en_US
dc.identifier.urihttp://digital.library.wisc.edu/1793/60464
dc.publisherUniversity of Wisconsin-Madison Department of Computer Sciencesen_US
dc.titleEnd-Biased Samples for Join Cardinality Estimationen_US
dc.typeTechnical Reporten_US

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
TR1541.pdf
Size:
2.31 MB
Format:
Adobe Portable Document Format