RDBMS Index Support for Sparse Data Sets

dc.contributor.authorBeckmann, Jenniferen_US
dc.contributor.authorChu, Ericen_US
dc.contributor.authorNaughton, Jeffreyen_US
dc.date.accessioned2012-03-15T17:20:33Z
dc.date.available2012-03-15T17:20:33Z
dc.date.created2006en_US
dc.date.issued2006en_US
dc.description.abstractMaintenance costs and storage overheads incurred by indexes often limit the number of indexes created per table in an RDBMS. For sparse data, where a table may have hundreds of attributes, indexing only a few attributes means that a vanishingly small percentage of attributes will have indexes, which unfortunately means that a table scan is the only evaluation plan for almost all selection queries on that table. This paper demonstrates that sparsity of the data actually enables index support for most, if not all, attributes in the data. Our approach leverages "sparse indexes", which are partial indexes that store only non-null values. Sparse indexes incur low maintenance costs and storage overheads because most values in a sparse table are null. Properties of the data lead us to two other contributions toward index support for sparse data; we show that sparse indexes benefit greatly from building all indexes in one-pass of the data; and we identify that multi-column sparse indexes are preferable as covering indexes when attributes in the data are correlated. We qualitatively evaluate our approaches with synthetic and real-world data to show that our suggestions significantly out-perform traditional indexing approaches designed for dense data.en_US
dc.format.mimetypeapplication/pdfen_US
dc.identifier.citationTR1566en_US
dc.identifier.urihttp://digital.library.wisc.edu/1793/60506
dc.publisherUniversity of Wisconsin-Madison Department of Computer Sciencesen_US
dc.titleRDBMS Index Support for Sparse Data Setsen_US
dc.typeTechnical Reporten_US

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
TR1566.pdf
Size:
2.1 MB
Format:
Adobe Portable Document Format