Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area
Author(s): Rabkin, Ariel; Arye, Matvey; Sen, Siddhartha; Pai, Vivek S; Freedman, Michael J
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1725z
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Rabkin, Ariel | - |
dc.contributor.author | Arye, Matvey | - |
dc.contributor.author | Sen, Siddhartha | - |
dc.contributor.author | Pai, Vivek S | - |
dc.contributor.author | Freedman, Michael J | - |
dc.date.accessioned | 2021-10-08T19:48:43Z | - |
dc.date.available | 2021-10-08T19:48:43Z | - |
dc.date.issued | 2014 | en_US |
dc.identifier.citation | Rabkin, Ariel, Matvey Arye, Siddhartha Sen, Vivek S. Pai, and Michael J. Freedman. "Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area." In 11th USENIX Symposium on Networked Systems Design and Implementation (2014): pp. 275-288. | en_US |
dc.identifier.uri | https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-rabkin.pdf | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1725z | - |
dc.description.abstract | We present JetStream, a system that allows real-time analysis of large, widely-distributed changing data sets. Traditional approaches to distributed analytics require users to specify in advance which data is to be backhauled to a central location for analysis. This is a poor match for domains where available bandwidth is scarce and it is infeasible to collect all potentially useful data. JetStream addresses bandwidth limits in two ways, both of which are explicit in the programming model. The system incorporates structured storage in the form of OLAP data cubes, so data can be stored for analysis near where it is generated. Using cubes, queries can aggregate data in ways and locations of their choosing. The system also includes adaptive filtering and other transformations that adjust data quality to match available bandwidth. Many bandwidth-saving transformations are possible; we discuss which are appropriate for which data and how they can best be combined. We implemented a range of analytic queries on web request logs and image data. Queries could be expressed in a few lines of code. Using structured storage on source nodes conserved network bandwidth by allowing data to be collected only when needed to fulfill queries. Our adaptive control mechanisms are responsive enough to keep end-to-end latency within a few seconds, even when available bandwidth drops by a factor of two, and are flexible enough to express practical policies. | en_US |
dc.format.extent | 275 - 288 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | 11th USENIX Symposium on Networked Systems Design and Implementation | en_US |
dc.rights | Final published version. This is an open access article. | en_US |
dc.title | Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area | en_US |
dc.type | Conference Article | en_US |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Jetstream.pdf | | 586.67 kB | Adobe PDF | View/Download |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.