
Lily did a fabulous job presenting her early work on this research.  I’ve listed the abstract and video below.

Logging what matters: The Pythia just-in-time instrumentation framework (Lily Sturmann) (Slides): We will present our current work on Pythia, a just-in-time instrumentation framework for distributed systems that automatically enables instrumentation in the right areas to provide visibility into newly observed problems in a running system. The talk will discuss key challenges involved in creating such a framework: (1) understanding where in the distributed system (e.g., which components) additional instrumentation is needed, (2) understanding what instrumentation (e.g., log statements or information contained in logs, such as function parameter values) is needed, and (3) understanding how to limit the overhead of enabling too much instrumentation. It will discuss how end-to-end tracing, combined with statistical measures and machine-learning techniques, provides a foundation for addressing these challenges. The talk will conclude with our current progress building Pythia and applying it to problems in OpenStack.
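To make challenges (1) and (3) a little more concrete, here is a minimal sketch of one way a statistical measure over trace data could pick where to enable extra instrumentation. It is an illustration only and not Pythia's actual algorithm: the function name, the choice of coefficient of variation as the measure, the threshold, the budget, and the component names are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def pick_instrumentation_targets(spans, cv_threshold=0.5, budget=1):
    """Rank components by latency variability and return the top candidates.

    spans: iterable of (component, latency_ms) pairs taken from traces.
    cv_threshold: hypothetical cutoff on the coefficient of variation.
    budget: hypothetical cap on how many components get extra
            instrumentation, as a crude way to limit overhead.
    """
    by_component = defaultdict(list)
    for component, latency_ms in spans:
        by_component[component].append(latency_ms)

    scores = {}
    for component, latencies in by_component.items():
        avg = mean(latencies)
        # Skip components with too little data or a zero mean.
        if len(latencies) < 2 or avg == 0:
            continue
        # Coefficient of variation: standard deviation relative to the mean.
        scores[component] = pstdev(latencies) / avg

    ranked = sorted(scores, key=scores.get, reverse=True)
    return [c for c in ranked if scores[c] > cv_threshold][:budget]


# Hypothetical trace data: (component, latency in ms).
example_spans = [
    ("api", 10), ("api", 11), ("api", 10),
    ("scheduler", 5), ("scheduler", 50), ("scheduler", 6),
    ("db", 20), ("db", 21),
]

targets = pick_instrumentation_targets(example_spans)
```

In this made-up data only the scheduler component shows highly variable latency, so it is the only one selected. The budget parameter stands in for the overhead concern in challenge (3); a real framework would need a far more careful cost model.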


  Posts

December 20th, 2018

DOCC Lab celebrates its first end-of-semester dinner

My students and I celebrated the end of the Fall ’18 semester with dinner at the Q restaurant in downtown Boston last […]

December 10th, 2018

Lily presented, “Logging what matters: The Pythia just-in-time instrumentation framework for distributed applications,” at the 2018 Observability summit

Lily did a fabulous job presenting her early work on this research.  I’ve listed the abstract and video below. Logging […]

August 19th, 2018

DOCC-Lab students presented on diagnosis research at 2018 DevConf.US

Many of our students presented at Red Hat’s developer conference (DevConf.US) this year.  I’ve listed abstracts and talk videos below. […]

August 2nd, 2018

Our proposal, “A just-in-time, cross-layer instrumentation framework for diagnosing performance problems in distributed applications,” was funded by NSF

Diagnosing and fixing problems in distributed applications running in cloud environments is extremely challenging.  One key reason is a lack […]

April 29th, 2018

Attending CSR Aspiring PIs Workshop

Thanks to NSF for selecting me to attend this workshop and for funding my travel costs.  I’m looking forward to […]

October 1st, 2017

Harshal and Andrew named Siemens Research Competition semifinalists!

Harshal and Andrew’s project, Tarpan: a router that supports evolvability, involved implementing a robust version of D-BGP in Quagga. D-BGP […]

May 5th, 2017

Our paper, “Bootstrapping evolvability for inter-domain routing with D-BGP,” was accepted to SIGCOMM’17!

August 21st, 2016

Our paper, “Principled Workflow-centric tracing of distributed systems,” was accepted to SoCC’16

Workflow-centric tracing (also called end-to-end tracing or distributed-systems tracing) captures the work done within and among distributed-system components to service […]