Scalable Anonymization Algorithms for Large Data Sets

dc.contributor.author: LeFevre, Kristen (en_US)
dc.contributor.author: DeWitt, David (en_US)
dc.date.accessioned: 2012-03-15T17:21:29Z
dc.date.available: 2012-03-15T17:21:29Z
dc.date.created: 2007 (en_US)
dc.date.issued: 2007 (en_US)
dc.description.abstract: k-Anonymity is a widely-studied mechanism for protecting identity when distributing non-aggregate personal data. This basic mechanism can also be extended to protect an individual-level sensitive attribute. Numerous algorithms have been developed in recent years for generalizing, clustering, or otherwise manipulating data to satisfy one or more anonymity requirements. However, few have considered large-scale input data sets that do not fit in main memory. This paper proposes two techniques for incorporating (external) scalability into an existing algorithmic framework. The first technique is based on ideas from scalable decision tree construction, and the second technique is based on sampling. In both cases, the resulting algorithms are guaranteed to produce output data that satisfies the given anonymity requirements. We evaluate the performance of each algorithm both analytically and experimentally. (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: TR1590 (en_US)
dc.identifier.uri: http://digital.library.wisc.edu/1793/60548
dc.publisher: University of Wisconsin-Madison Department of Computer Sciences (en_US)
dc.title: Scalable Anonymization Algorithms for Large Data Sets (en_US)
dc.type: Technical Report (en_US)

Files

Original bundle

Name: TR1590.pdf
Size: 641.79 KB
Format: Adobe Portable Document Format