Clarinet: WAN-Aware Optimization for Analytics Queries
| dc.contributor.author | Viswanathan, Raajay | |
| dc.contributor.author | Ananthanarayanan, Ganesh | |
| dc.contributor.author | Akella, Aditya | |
| dc.date.accessioned | 2017-03-08T20:46:41Z | |
| dc.date.available | 2017-03-08T20:46:41Z | |
| dc.date.issued | 2017-03-08T20:46:41Z | |
| dc.description.abstract | Recent work has made the case for geo-distributed analytics, where data collected and stored at multiple datacenters and edge sites worldwide is analyzed in situ to drive operational and management decisions. A key issue in such systems is ensuring low response times for analytics queries issued against geo-distributed data. A central determinant of response time is the query execution plan (QEP). Current query optimizers do not consider the network when deriving QEPs, which is a key drawback because the geo-distributed sites are connected via WAN links with heterogeneous and modest bandwidths, unlike intra-datacenter networks. We propose Clarinet, a novel WAN-aware query optimizer. Deriving a WAN-aware QEP requires working jointly with the execution layer of analytics frameworks, which places tasks at sites and performs scheduling. We design efficient heuristic solutions in Clarinet to make such a joint decision on the QEP. Our experiments with a real prototype deployed across EC2 datacenters, and large-scale simulations using production workloads, show that Clarinet improves query response times by more than 50% compared to state-of-the-art WAN-aware task placement and scheduling. | en |
| dc.identifier.citation | TR1841 | en |
| dc.identifier.other | TR1841 | |
| dc.identifier.uri | http://digital.library.wisc.edu/1793/76122 | |
| dc.language.iso | en_US | en |
| dc.relation.ispartofseries | tech reports;TR1841 | |
| dc.subject | Geo-distributed analytics | en |
| dc.subject | WAN awareness | en |
| dc.subject | query optimization | en |
| dc.title | Clarinet: WAN-Aware Optimization for Analytics Queries | en |
| dc.type | Technical Report | en |