RDBMS Index Support for Sparse Data Sets

Beckmann, Jennifer; Chu, Eric; Naughton, Jeffrey

dc.contributor.author	Beckmann, Jennifer	en_US
dc.contributor.author	Chu, Eric	en_US
dc.contributor.author	Naughton, Jeffrey	en_US
dc.date.accessioned	2012-03-15T17:20:33Z
dc.date.available	2012-03-15T17:20:33Z
dc.date.created	2006	en_US
dc.date.issued	2006	en_US
dc.description.abstract	Maintenance costs and storage overheads incurred by indexes often limit the number of indexes created per table in an RDBMS. For sparse data, where a table may have hundreds of attributes, indexing only a few attributes means that a vanishingly small percentage of attributes will have indexes, which unfortunately means that a table scan is the only evaluation plan for almost all selection queries on that table. This paper demonstrates that sparsity of the data actually enables index support for most, if not all, attributes in the data. Our approach leverages "sparse indexes", which are partial indexes that store only non-null values. Sparse indexes incur low maintenance costs and storage overheads because most values in a sparse table are null. Properties of the data lead us to two other contributions toward index support for sparse data: we show that sparse indexes benefit greatly from building all indexes in one pass over the data, and we identify that multi-column sparse indexes are preferable as covering indexes when attributes in the data are correlated. We qualitatively evaluate our approaches with synthetic and real-world data to show that our suggestions significantly outperform traditional indexing approaches designed for dense data.	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.citation	TR1566	en_US
dc.identifier.uri	http://digital.library.wisc.edu/1793/60506
dc.publisher	University of Wisconsin-Madison Department of Computer Sciences	en_US
dc.title	RDBMS Index Support for Sparse Data Sets	en_US
dc.type	Technical Report	en_US
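
The abstract describes sparse indexes as partial indexes that store only non-null values. The sketch below illustrates that idea only; it is not the report's implementation. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in RDBMS, and the table name items, the column color, the index name idx_items_color, and both helper functions are hypothetical names chosen for this example.

```python
import sqlite3


def build_sparse_index_demo(num_rows=1000, non_null_every=50):
    """Create an in-memory table whose 'color' column is mostly NULL,
    then index only the non-null values with a partial index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, color TEXT)")
    # Only one row in every `non_null_every` rows has a value; the rest are NULL.
    rows = [
        (i, "red" if i % non_null_every == 0 else None)
        for i in range(num_rows)
    ]
    conn.executemany("INSERT INTO items (id, color) VALUES (?, ?)", rows)
    # Partial index: rows where color IS NULL get no index entry at all,
    # so index size and maintenance track the non-null values only.
    conn.execute(
        "CREATE INDEX idx_items_color ON items (color) "
        "WHERE color IS NOT NULL"
    )
    return conn


def query_plan(conn, sql):
    """Return SQLite's query plan for `sql` as a single string."""
    # The fourth column of each EXPLAIN QUERY PLAN row is the plan detail.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


if __name__ == "__main__":
    conn = build_sparse_index_demo()
    # An equality predicate implies color IS NOT NULL, so the planner
    # is allowed to answer it from the partial index.
    print(query_plan(conn, "SELECT id FROM items WHERE color = 'red'"))
```

In this sketch a selection on color can be served from the small index instead of a table scan, while inserts and updates that leave color null never touch the index.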

Files

Original bundle

Name: TR1566.pdf
Size: 2.1 MB
Format: Adobe Portable Document Format

Collections

CS Technical Reports