Bootstrapping evolvability for inter-domain routing with D-BGP

Conference papers
Raja R. Sambasivan, David Tran-Lam, Aditya Akella, Peter Steenkiste
In Proceedings of SIGCOMM 2017
Publication year: 2017

The Internet’s inter-domain routing infrastructure, provided today by BGP, is extremely rigid and does not facilitate the introduction of new inter-domain routing protocols.  This rigidity has made it incredibly difficult to widely deploy critical fixes to BGP.  It has also depressed ASes’ ability to sell value-added services or replace BGP entirely with a more sophisticated protocol.  Even if operators undertook the significant effort needed to fix or replace BGP, it is likely the next protocol will be just as difficult to change or evolve.  To help, this paper identifies two features needed in the routing infrastructure (i.e., within any inter-domain routing protocol) to facilitate evolution to new protocols.  To understand their utility, it presents D-BGP, a version of BGP that incorporates them.

Principled workflow-centric tracing of distributed systems

Conference papers
Raja R. Sambasivan, Ilari Shafer, Jonathan Mace, Benjamin H. Sigelman, Rodrigo Fonseca, Gregory R. Ganger
In Proceedings of SoCC 2016
Publication year: 2016

Workflow-centric tracing captures the workflow of causally-related events (e.g., work done to process a request) within and among the components of a distributed system.  As distributed systems grow in scale and complexity, such tracing is becoming a critical tool for understanding distributed system behavior.  Yet, there is a fundamental lack of clarity about how such infrastructures should be designed to provide maximum benefit for important management tasks, such as resource accounting and diagnosis.  Without research into this important issue, there is a danger that workflow-centric tracing will not reach its full potential.  To help, this paper distills the design space of workflow-centric tracing and describes key design choices that can help or hinder a tracing infrastructure’s utility for important tasks.  Our design space and the design choices we suggest are based on our experiences developing several previous workflow-centric tracing infrastructures.
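
To make the propagation mechanism concrete, the sketch below (ours, not the paper’s) shows how a trace context carrying a trace ID and parent/child span IDs can follow causally-related work through a program; all names here are illustrative.

```python
# Minimal sketch of trace-context propagation; TraceContext and all
# field names are illustrative, not an API from the paper.
import contextvars
import uuid
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TraceContext:
    trace_id: str              # identifies the whole workflow (e.g., one request)
    span_id: str               # identifies this unit of work
    parent_id: Optional[str]   # causal predecessor; None at the workflow root

_current = contextvars.ContextVar("trace_context", default=None)

def start_span(name: str) -> TraceContext:
    """Record one causally-related event, linked to its parent."""
    parent = _current.get()
    ctx = TraceContext(
        trace_id=parent.trace_id if parent else uuid.uuid4().hex,
        span_id=uuid.uuid4().hex,
        parent_id=parent.span_id if parent else None,
    )
    _current.set(ctx)
    parent_repr = ctx.parent_id[:8] if ctx.parent_id else None
    print(f"{name}: trace={ctx.trace_id[:8]} span={ctx.span_id[:8]} parent={parent_repr}")
    return ctx

# One request's workflow: every component continues the same trace.
start_span("frontend.receive_request")
start_span("metadata_server.lookup")
```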

Bootstrapping evolvability for inter-domain routing with D-BGP

Technical reports
Raja R. Sambasivan, David Tran-Lam, Aditya Akella, Peter Steenkiste
Carnegie Mellon University Department of Computer Science Technical Report CMU-CS-16-117.
Publication year: 2016

Note: This technical report has been superseded by the SIGCOMM’17 paper.

It is extremely difficult to utilize new routing protocols in today’s Internet. As a result, the Internet’s baseline inter-domain protocol for connectivity (BGP) has remained largely unchanged, despite known significant flaws. The difficulty of using new protocols has also depressed opportunities for (currently commoditized) transit providers to provide value-added routing services. To help, this paper proposes Darwin’s BGP (D-BGP), a modified version of BGP that can support evolvability to new protocols. D-BGP modifies BGP’s advertisements and advertisement processing based on requirements imposed by key evolvability scenarios, which we identified via analyses of recently proposed routing protocols.

Bootstrapping evolvability for inter-domain routing

Workshop papers
Raja R. Sambasivan, David Tran-Lam, Aditya Akella, Peter Steenkiste
In Proceedings of HotNets 2015
Publication year: 2015

It is extremely difficult to deploy new inter-domain routing protocols in today’s Internet.  As a result, the Internet’s baseline protocol for connectivity, BGP, has remained largely unchanged, despite known significant flaws.  The difficulty of deploying new protocols has also depressed opportunities for (currently commoditized) transit providers to provide value-added routing services.  To help, we identify the key deployment models under which new protocols are introduced and the requirements each poses for enabling their usage goals.  Based on these requirements, we argue for two modifications to BGP that will greatly improve support for new routing protocols.

So, you want to trace your distributed system? Key design insights from years of practical experience

Technical reports
Raja R. Sambasivan, Rodrigo Fonseca, Ilari Shafer, Gregory R. Ganger
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-14-102
Publication year: 2014

Note: This technical report has been superseded by our SoCC’16 paper, Principled workflow-centric tracing of distributed systems.

End-to-end tracing captures the workflow of causally-related activity (e.g., work done to process a request) within and among the components of a distributed system. As distributed systems grow in scale and complexity, such tracing is becoming a critical tool for management tasks like diagnosis and resource accounting. Drawing upon our experiences building and using end-to-end tracing infrastructures, this paper distills the key design axes that dictate trace utility for important use cases. Developing tracing infrastructures without explicitly understanding these axes and choices for them will likely result in infrastructures that are not useful for their intended purposes. In addition to identifying the design axes, this paper identifies good design choices for various tracing use cases, contrasts them to choices made by previous tracing implementations, and shows where prior implementations fall short. It also identifies remaining challenges on the path to making tracing an integral part of distributed system design.

Visualizing request-flow comparison to aid performance diagnosis in distributed systems

Conference papers, Journal papers
Raja R. Sambasivan, Ilari Shafer, Michelle Mazurek, Gregory R. Ganger
IEEE Transactions on Visualization and Computer Graphics (Proc. Information Visualization 2013), Vol. 19, no. 12, Dec. 2013
Publication year: 2013

Distributed systems are complex to develop and administer, and performance problem diagnosis is particularly challenging. When performance degrades, the problem might be in any of the system’s many components or could be a result of poor interactions among them. Recent research efforts have created tools that automatically localize the problem to a small number of potential culprits, but research is needed to understand what visualization techniques work best for helping distributed systems developers understand and explore their results. This paper compares the relative merits of three well-known visualization approaches (side-by-side, diff, and animation) in the context of presenting the results of one proven automated localization technique called request-flow comparison. Via a 26-person user study, which included real distributed systems developers, we identify the unique benefits that each approach provides for different problem types and usage modes.

Visualizing request-flow comparison to aid performance diagnosis in distributed systems

Technical reports
Raja R. Sambasivan, Ilari Shafer, Michelle Mazurek, Gregory R. Ganger
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-13-104
Publication year: 2013

Note: This technical report has been superseded by our InfoVis’13 paper of the same name.

Distributed systems are complex to develop and administer, and performance problem diagnosis is particularly challenging. When performance degrades, the problem might be in any of the system’s many components or could be a result of poor interactions among them. Recent research efforts have created tools that automatically localize the problem to a small number of potential culprits, but effective visualizations are needed to help developers understand and explore their results. This paper compares side-by-side, diff, and animation-based approaches for visualizing the results of one proven automated localization technique called request-flow comparison. Via a 26-person user study, which included real distributed systems developers, we identify the unique benefits that each approach provides for different usage modes and problem types.

Visualizing request-flow comparison to aid performance diagnosis in distributed systems

Technical reports
Raja R. Sambasivan, Ilari Shafer, Michelle Mazurek, Gregory R. Ganger
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-12-102
Publication year: 2012

Note: This technical report has been superseded by our InfoVis’13 paper of the same name and CMU-PDL-13-104.

Distributed systems are complex to develop and administer, and performance problem diagnosis is particularly challenging. When performance decreases, the problem might be in any of the system’s many components or could be a result of poor interactions among them. Recent research has provided the ability to automatically identify a small set of most likely problem locations, leaving the diagnoser with the task of exploring just that set. This paper describes and evaluates three approaches for visualizing the results of a proven technique called “request-flow comparison” for identifying likely causes of performance decreases in a distributed system. Our user study provides a number of insights useful in guiding visualization tool design for distributed system diagnosis. For example, we find that both an overlay-based approach (e.g., diff) and a side-by-side approach are effective, with tradeoffs for different users (e.g., expert vs. not) and different problem types. We also find that an animation-based approach is confusing and difficult to use.

Specialized storage for big numeric time series

Workshop papers
Ilari Shafer, Raja R. Sambasivan, Gregory R. Ganger
In Proceedings of HotCloud 2013
Publication year: 2013

Numeric time series data has unique storage requirements and access patterns that can benefit from specialized support, given its importance in Big Data analyses. Popular frameworks and databases focus on addressing other needs, making them a suboptimal fit. This paper describes the support needed for numeric time series, suggests an architecture for efficient time series storage, and illustrates its potential for satisfying key requirements.
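
As a hedged illustration of what such specialized support might look like (the layout below is our assumption, not the paper’s architecture), a series can be stored as densely packed, time-ordered numeric arrays so that range scans become sequential reads:

```python
# Illustrative specialized layout for one numeric time series: dense,
# time-ordered arrays so range scans are sequential reads. This is an
# assumed design, not the architecture proposed in the paper.
from array import array
from bisect import bisect_left

class SeriesStore:
    def __init__(self):
        self._times = array("q")   # 64-bit timestamps, kept in order
        self._values = array("d")  # 64-bit floats, densely packed

    def append(self, ts: int, value: float) -> None:
        if self._times and ts < self._times[-1]:
            raise ValueError("appends must arrive in time order")
        self._times.append(ts)
        self._values.append(value)

    def scan(self, start: int, end: int):
        """Return all (timestamp, value) pairs in [start, end)."""
        lo = bisect_left(self._times, start)
        hi = bisect_left(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))

store = SeriesStore()
for t in range(10):
    store.append(1000 + t, 20.0 + 0.1 * t)
print(store.scan(1003, 1007))
```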

Diagnosing performance changes in distributed systems by comparing request flows

Dissertations & proposals
Raja R. Sambasivan
Carnegie Mellon University Parallel Data Lab Ph.D. Dissertation CMU-PDL-13-105
Publication year: 2013

Diagnosing performance problems in modern datacenters and distributed systems is challenging, as the root cause could be contained in any one of the system’s numerous components or, worse, could be a result of interactions among them. As distributed systems continue to increase in complexity, diagnosis tasks will only become more challenging. There is a need for a new class of diagnosis techniques capable of helping developers address problems in these distributed environments.

As a step toward satisfying this need, this dissertation proposes a novel technique, called request-flow comparison, for automatically localizing the sources of performance changes from the myriad potential culprits in a distributed system to just a few. Request-flow comparison works by contrasting how individual requests are serviced within and among the components of the distributed system between two periods: a non-problem period and a problem period. By identifying and ranking performance-affecting changes, request-flow comparison provides developers with promising starting points for their diagnosis efforts. Request workflows are obtained with less than 1% overhead using recently developed end-to-end tracing techniques.
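
The following sketch illustrates the comparison at its simplest, assuming each traced request has been reduced to a path signature and a latency; the scoring heuristic is ours, not the dissertation’s exact algorithm:

```python
# Illustrative request-flow comparison: group requests by the path they
# took, then rank categories by how much their frequency and timing
# changed between periods. The scoring is a simplification of
# Spectroscope's actual algorithms.
from collections import defaultdict
from statistics import mean

def by_category(requests):
    """requests: iterable of (path_signature, latency_ms) pairs."""
    cats = defaultdict(list)
    for path, latency_ms in requests:
        cats[path].append(latency_ms)
    return cats

def rank_changes(before, after):
    b, a = by_category(before), by_category(after)
    ranked = []
    for path in b.keys() | a.keys():
        freq_change = len(a.get(path, ())) - len(b.get(path, ()))
        lat_change = (mean(a[path]) - mean(b[path])) if path in b and path in a else 0.0
        # Larger combined shift = more promising starting point.
        score = abs(freq_change) + abs(lat_change)
        ranked.append((score, path, freq_change, lat_change))
    return sorted(ranked, reverse=True)

non_problem = [("read->cache->reply", 5.0)] * 90 + [("read->disk->reply", 40.0)] * 10
problem = [("read->cache->reply", 5.0)] * 50 + [("read->disk->reply", 45.0)] * 50
for score, path, df, dl in rank_changes(non_problem, problem):
    print(f"{path}: frequency change {df:+d}, mean latency change {dl:+.1f} ms")
```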

To demonstrate the utility of request-flow comparison in various distributed systems, this dissertation describes its implementation in a tool called Spectroscope and describes how Spectroscope was used to diagnose real, previously unsolved problems in the Ursa Minor distributed storage service and in select Google services. It also explores request-flow comparison’s applicability to the Hadoop File System. Via a 26-person user study, it identifies effective visualizations for presenting request-flow comparison’s results and further demonstrates that request-flow comparison helps developers quickly identify starting points for diagnosis. This dissertation also distills design choices that will maximize an end-to-end tracing infrastructure’s utility for diagnosis tasks and other use cases.

Automated diagnosis without predictability is a recipe for failure

Workshop papers
Raja R. Sambasivan, Gregory R. Ganger
In Proceedings of HotCloud 2012
Publication year: 2012

Automated management is critical to the success of cloud computing, given its scale and complexity. But, most systems do not satisfy one of the key properties required for automation: predictability, which in turn relies upon low variance. Most automation tools are not effective when variance is consistently high. Using automated performance diagnosis as a concrete example, this position paper argues that for automation to become a reality, system builders must treat variance as an important metric and make conscious decisions about where to reduce it. To help with this task, we describe a framework for reasoning about sources of variance in distributed systems and describe an example tool for helping identify them.
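
As an illustration of the kind of tool the paper motivates, the sketch below screens components by their squared coefficient of variation (C²), a standard low-variance check; the threshold and sample data are invented for the example:

```python
# Illustrative variance screen: flag components whose latency variance
# undermines predictability. C^2 is a standard metric; the threshold
# and sample data are invented for this example.
from statistics import mean, pvariance

def c_squared(samples):
    """Squared coefficient of variation: variance / mean^2."""
    mu = mean(samples)
    return pvariance(samples, mu) / (mu * mu)

component_latencies_ms = {
    "frontend": [10, 11, 10, 12, 11],
    "cache": [1, 1, 2, 1, 1],
    "disk": [5, 80, 6, 120, 7],  # erratic: a poor target for automation
}

for component, samples in component_latencies_ms.items():
    c2 = c_squared(samples)
    verdict = "HIGH VARIANCE" if c2 > 1.0 else "ok"
    print(f"{component}: C^2 = {c2:.2f} ({verdict})")
```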

Diagnosing performance changes by comparing request flows

Conference papers
Raja R. Sambasivan, Alice X. Zheng, Michael De Rosa, Elie Krevat, Spencer Whitman, Michael Stroucken, William Wang, Lianghong Xu, Gregory R. Ganger
In Proceedings of NSDI 2011
Publication year: 2011

The causes of performance changes in a distributed system often elude even its developers. This paper develops a new technique for gaining insight into such changes: comparing system behaviours from two executions (e.g., of two system versions or time periods). Building on end-to-end request flow tracing within and across components, algorithms are described for identifying and ranking changes in the flow and/or timing of request processing. The implementation of these algorithms in a tool called Spectroscope is described and evaluated. Six case studies are presented of using Spectroscope to diagnose performance changes in a distributed storage system caused by code changes, configuration modifications, and component degradations, demonstrating the value and efficacy of comparing request flows. Preliminary experiences of using Spectroscope to diagnose performance changes within Google are also presented.
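
As a hedged sketch of the timing side of this pipeline, the snippet below flags categories whose structure is unchanged but whose latency distribution shifted, using a two-sample Kolmogorov-Smirnov test (assuming SciPy is available; the significance level is illustrative):

```python
# Sketch of flagging "response-time mutations": request categories whose
# structure is unchanged but whose latency distribution shifted between
# periods. Assumes SciPy; the 0.05 significance level is illustrative.
from scipy.stats import ks_2samp

def response_time_mutations(before_cats, after_cats, alpha=0.05):
    """Each argument maps a path signature to a list of latencies."""
    mutations = []
    for path in before_cats.keys() & after_cats.keys():
        _, p_value = ks_2samp(before_cats[path], after_cats[path])
        if p_value < alpha:  # distributions differ: a candidate culprit
            mutations.append((p_value, path))
    return sorted(mutations)  # most significant changes first

before = {"write->log->reply": [10.0, 11.0, 10.5, 9.8, 10.2] * 10}
after = {"write->log->reply": [19.0, 21.0, 20.5, 19.8, 20.2] * 10}
print(response_time_mutations(before, after))
```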

Automation without predictability is a recipe for failure

Technical reports
Raja R. Sambasivan, Gregory R. Ganger
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-11-101
Publication year: 2011

Note: This technical report has been superseded by our HotCloud’12 paper, “Automated diagnosis without predictability is a recipe for failure.”

Automated management seems a must, as distributed systems and datacenters continue to grow in scale and complexity. But, automation of performance problem diagnosis and tuning relies upon predictability, which in turn relies upon low variance—most automation tools aren’t effective when variance is regularly high. This paper argues that, for automation to become a reality, system builders must treat variance as an important metric and make conscious decisions about where to reduce it. To help with this task, we describe a framework for understanding sources of variance and describe an example tool for helping identify them.

Managing execution of database queries

Patents
Stepan Krompass, Harumi Anne Kuno, Umeshwar Dayal, Janet Wiener, Raja Sambasivan
United States Patent Application US 20100082603 A1, April 2010
Publication year: 2010

One embodiment is a method to manage queries in a database. The method identifies a query that executes on the database for an elapsed time that is greater than a threshold and then implements a remedial action when the query executes on the database for an execution time that is greater than an estimated execution time.
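
A minimal sketch of the claimed control loop follows; the names, threshold, and remedial action (cancelling the query) are invented for illustration:

```python
# Illustrative sketch of the claimed method; names, the threshold, and
# the remedial action are invented for this example.
import time

ELAPSED_THRESHOLD_S = 5.0  # queries running longer than this are identified

def manage_queries(running, estimates, remediate):
    """running: query_id -> start time; estimates: query_id -> estimated seconds."""
    now = time.time()
    for qid, started in running.items():
        elapsed = now - started
        if elapsed <= ELAPSED_THRESHOLD_S:
            continue  # below the identification threshold
        if elapsed > estimates.get(qid, float("inf")):
            remediate(qid)  # e.g., cancel or deprioritize the query

running = {"q1": time.time() - 2.0, "q2": time.time() - 30.0}
estimates = {"q1": 10.0, "q2": 8.0}
manage_queries(running, estimates, remediate=lambda qid: print(f"cancelling {qid}"))
```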

Diagnosing performance problems by visualizing and comparing system behaviours

Technical reports
Raja R. Sambasivan, Alice X. Zheng, Elie Krevat, Spencer Whitman, Michael Stroucken, William Wang, Lianghong Xu, Gregory R. Ganger
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-103
Publication year: 2010

Note: This technical report has been superseded by our NSDI’11 paper, “Diagnosing performance changes by comparing request flows,” and CMU-PDL-10-107.

Spectroscope is a new toolset aimed at assisting developers with the long-standing challenge of performance debugging in distributed systems. To do so, it mines end-to-end traces of request processing within and across components. Using Spectroscope, developers can visualize and compare system behaviours between two periods or system versions, identifying and ranking various changes in the flow or timing of request processing. Examples of how Spectroscope has been used to diagnose real performance problems seen in a distributed storage system are presented, and Spectroscope’s primary assumptions and algorithms are evaluated.

Diagnosing performance changes by comparing system behaviors

Technical reports
Raja R. Sambasivan, Alice X. Zheng, Elie Krevat, Spencer Whitman, Michael Stroucken, William Wang, Lianghong Xu, Gregory R. Ganger
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-107
Publication year: 2010

Note: This technical report has been superseded by our NSDI’11 paper, “Diagnosing performance changes by comparing request flows.”

The causes of performance changes in a distributed system often elude even its developers. This paper develops a new technique for gaining insight into such changes: comparing system behaviours from two executions (e.g., of two system versions or time periods). Building on end-to-end request flow tracing within and across components, algorithms are described for identifying and ranking changes in the flow and/or timing of request processing. The implementation of these algorithms in a tool called Spectroscope is described and evaluated. Five case studies are presented of using Spectroscope to diagnose performance changes in a distributed storage system caused by code changes and configuration modifications, demonstrating the value and efficacy of comparing system behaviours.

A transparently-scalable metadata service for the Ursa Minor storage system

Conference papers
Shafeeq Sinnamohideen, Raja R. Sambasivan, James Hendricks, Likun Liu, Gregory R. Ganger
In Proceedings of ATC 2010
Publication year: 2010

The metadata service of the Ursa Minor distributed storage system scales metadata throughput as metadata servers are added. While doing so, it correctly handles operations that involve metadata served by different servers, consistently and atomically updating such metadata. Unlike previous systems, Ursa Minor does so by reusing existing metadata migration functionality to avoid complex distributed transaction protocols. It also assigns object IDs to minimize the occurrence of multi-server operations. This approach allows Ursa Minor to implement a desired feature with less complexity than alternative methods and with minimal performance penalty (under 1% in non-pathological cases).
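
A heavily simplified sketch of the migrate-then-execute idea: when an operation spans objects owned by different metadata servers, the objects are first migrated to one server so the operation becomes single-server and needs no distributed transaction. Ownership below is just a dict; the real system’s migration is far more involved.

```python
# Toy model of migrate-then-execute: ownership is a dict and migration
# is just reassignment; the real system is far more involved.
class MetadataService:
    def __init__(self):
        self.owner = {}  # object_id -> metadata server name

    def create(self, oid, server):
        self.owner[oid] = server

    def multi_object_op(self, oids, op):
        """Co-locate all objects on one server, then run op there."""
        target = self.owner[oids[0]]
        for oid in oids[1:]:
            if self.owner[oid] != target:
                # Reuse migration instead of a distributed transaction.
                print(f"migrating {oid}: {self.owner[oid]} -> {target}")
                self.owner[oid] = target
        return op(oids)  # now an ordinary single-server operation

svc = MetadataService()
svc.create("dirA", "ms1")
svc.create("fileB", "ms2")
print(svc.multi_object_op(["dirA", "fileB"],
                          op=lambda oids: f"rename over {oids} committed on one server"))
```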

Relative fitness modeling

Journal papers
Michael P. Mesnier, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, Gregory R. Ganger
Communications of the ACM 52, 4 (April 2009), 91-96
Publication year: 2009

Relative fitness is a new approach to modeling the performance of storage devices (e.g., disks and RAID arrays). In contrast to a conventional model, which predicts the performance of an application’s I/O on a given device, a relative fitness model predicts performance differences between devices. The result is significantly more accurate predictions.

Modeling the relative fitness of storage

Conference papers
Michael Mesnier, Matthew Wachs, Raja R. Sambasivan, Gregory R. Ganger
In Proceedings of SIGMETRICS 2007
Publication year: 2007

Relative fitness is a new black-box approach to modeling the performance of storage devices. In contrast with an absolute model that predicts the performance of a workload on a given storage device, a relative fitness model predicts performance differences between a pair of devices. There are two primary advantages to this approach. First, because a relative fitness model is constructed for a device pair, the application-device feedback of a closed workload can be captured (e.g., how the I/O arrival rate changes as the workload moves from device A to device B). Second, a relative fitness model allows performance and resource utilization to be used in place of workload characteristics. This is beneficial when workload characteristics are difficult to obtain or concisely express (e.g., rather than describe the spatio-temporal characteristics of a workload, one could use the observed cache behavior of device A to help predict the performance of B).

This paper describes the steps necessary to build a relative fitness model, with an approach that is general enough to be used with any black-box modeling technique. We compare relative fitness models and absolute models across a variety of workloads and storage devices. On average, relative fitness models predict bandwidth and throughput within 10–20% and can reduce prediction error by as much as a factor of two when compared to absolute models.
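
The contrast is easy to see on synthetic data. In the sketch below (assuming scikit-learn; the data and features are invented), the absolute model sees only workload characteristics, while the relative fitness model additionally sees device A’s observed performance:

```python
# Synthetic contrast between an absolute model and a relative fitness
# model, assuming scikit-learn; data and features are invented.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
workload = rng.uniform(size=(n, 3))  # e.g., request size, read %, queue depth
perf_a = 1.0 + workload @ np.array([2.0, 5.0, 1.0]) + rng.normal(0, 0.3, n)
perf_b = 0.8 * perf_a + workload[:, 1] + rng.normal(0, 0.3, n)  # B correlates with A

train, test = slice(0, 400), slice(400, n)

# Absolute model: workload characteristics -> device B performance.
absolute = RandomForestRegressor(random_state=0).fit(workload[train], perf_b[train])

# Relative fitness model: workload + observed device A performance -> device B.
rel_features = np.column_stack([workload, perf_a])
relative = RandomForestRegressor(random_state=0).fit(rel_features[train], perf_b[train])

for name, model, X in [("absolute", absolute, workload),
                       ("relative fitness", relative, rel_features)]:
    err = np.mean(np.abs(model.predict(X[test]) - perf_b[test]) / perf_b[test])
    print(f"{name} model: mean relative error {err:.1%}")
```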

Categorizing and differencing system behaviors

Workshop papers
Raja R. Sambasivan, Alice X. Zheng, Eno Thereska, Gregory R. Ganger
In Proceedings of HotAC 2007
Publication year: 2007

Making request flow tracing an integral part of software systems creates the potential to better understand their operation. The resulting traces can be converted to per-request graphs of the work performed by a service, representing the flow and timing of each request’s processing. Collectively, these graphs contain detailed and comprehensive data about the system’s behavior and the workload that induced it, leaving the challenge of extracting insights. Categorizing and differencing such graphs should greatly improve our ability to understand the runtime behavior of complex distributed services and diagnose problems. Clustering the set of graphs can identify common request processing paths and expose outliers. Moreover, clustering two sets of graphs can expose differences between the two; for example, a programmer could diagnose a problem that arises by comparing current request processing with that of an earlier non-problem period and focusing on the aspects that change. Such categorizing and differencing of system behavior can be a big step in the direction of automated problem diagnosis.
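
A small sketch of the clustering step, assuming scikit-learn: each per-request graph is reduced to a vector of edge counts, and k-means groups common processing paths while tiny clusters surface outliers. The featurization is one simple choice among many:

```python
# Sketch of categorizing per-request graphs: reduce each graph to a
# vector of edge counts, then cluster. Featurization and k are
# illustrative choices; assumes scikit-learn.
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def edge_vector(edges, vocab):
    counts = Counter(edges)
    return [counts[e] for e in vocab]

graphs = (
    [[("recv", "cache"), ("cache", "reply")]] * 8                      # common fast path
    + [[("recv", "cache"), ("cache", "disk"), ("disk", "reply")]] * 8  # miss path
    + [[("recv", "cache"), ("cache", "disk")] * 5]                     # outlier: retry loop
)
vocab = sorted({e for g in graphs for e in g})
X = np.array([edge_vector(g, vocab) for g in graphs])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for cluster, size in sorted(Counter(labels).items()):
    note = "  <- possible outlier" if size == 1 else ""
    print(f"cluster {cluster}: {size} request(s){note}")
```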