OLAP over Imprecise Data With Domain Constraints
Authors
Burdick, Doug
Doan, AnHai
Ramakrishnan, Raghu
Vaithyanathan, Shivakumar
Type
Technical Report
Publisher
University of Wisconsin-Madison Department of Computer Sciences
Abstract
Several recent works have focused on OLAP over imprecise data, where
each fact can be a region, instead of a point, in a multi-dimensional
space. They have provided a multiple-world semantics for such data,
and developed efficient solutions to answer OLAP aggregation queries
over the imprecise facts. These solutions, however, assume that the
imprecise facts can be interpreted independently of one another, a
key assumption that is often violated in practice. Indeed, imprecise
facts in real-world applications are often correlated, and such
correlations can be captured as domain integrity constraints (e.g.,
repairs with the same customer names and models took place in the same
city, or a text span can refer to a person or a city, but not both).
In this paper we provide a solution to answer OLAP aggregation queries
over imprecise data, in the presence of such domain constraints. We first
describe a relatively simple yet powerful constraint language, and define
what it means to take into account such constraints in query answering.
Next, we prove that OLAP queries can be answered efficiently given a
database D* of fact marginals. We then exploit the regularities in the
constraint space (captured in a constraint hypergraph) and the fact space
to efficiently construct D*. Extensive experiments over real-world and
synthetic data demonstrate the effectiveness of our approach.
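
As a rough illustration of the ideas in the abstract, the following Python sketch enumerates the possible worlds of a few imprecise repair facts, discards the worlds that violate a domain constraint (same customer and model implies same city), and derives per-fact marginals from which an expected COUNT can be read off. The fact data, the function names, the uniform weighting of valid worlds, and the brute-force enumeration are all assumptions made for this example; they are not the constraint language or the construction of D* described in the report.

```python
from itertools import product
from collections import defaultdict

# Hypothetical repair facts: (customer, model, candidate cities).
# A fact with more than one candidate city is imprecise.
FACTS = [
    ("smith", "F150", ["Madison", "Chicago"]),   # imprecise
    ("smith", "F150", ["Madison"]),              # precise
    ("jones", "Civic", ["Chicago", "Madison"]),  # imprecise, unconstrained
]


def satisfies(facts, world):
    """Illustrative constraint: same customer and model implies same city."""
    seen = {}
    for (customer, model, _), city in zip(facts, world):
        # Remember the first city seen for this (customer, model) pair
        # and reject the world if a later fact disagrees with it.
        if seen.setdefault((customer, model), city) != city:
            return False
    return True


def fact_marginals(facts):
    """Per-fact city distributions over the constraint-satisfying worlds.

    Assumes every valid world is equally likely (a simplification).
    """
    candidates = (cities for _, _, cities in facts)
    worlds = [w for w in product(*candidates) if satisfies(facts, w)]
    marginals = [defaultdict(float) for _ in facts]
    for world in worlds:
        for i, city in enumerate(world):
            marginals[i][city] += 1 / len(worlds)
    return marginals


def expected_count(marginals, city):
    """Expected number of repairs in `city`, computed from marginals alone."""
    return sum(m.get(city, 0.0) for m in marginals)


marginals = fact_marginals(FACTS)
print(expected_count(marginals, "Madison"))
print(expected_count(marginals, "Chicago"))
```

In this toy instance the constraint forces the first fact into Madison, because it shares a customer and model with a precise Madison fact. Interpreting the facts independently would instead split the first fact between the two cities, which is the kind of error the abstract attributes to the independence assumption.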
Citation
TR1595