Research
Flexible, High-Fidelity Memory Telemetry
Modern servers use complex memory hierarchies that require active OS management, but the OS lacks low-overhead, accurate visibility into how memory is actually used, which leads to poor policy decisions. Existing approaches rely either on software, which is flexible but incurs high overhead, or on hardware counters, which are efficient but too coarse and inflexible. Our system breaks this trade-off with HW/SW co-design. It introduces a small, programmable monitoring mechanism inside memory controllers that lets the OS directly track memory activities of interest, achieving both accuracy and flexibility with low overhead.
FlowGuard: Information Flow Control for Coding Agents
Coding agents powered by large language models are increasingly used to automate software development tasks. However, these agents can inadvertently leak sensitive information or execute unauthorized operations when interacting with external systems. FlowGuard addresses these security concerns by applying information flow control to coding agents, ensuring that sensitive data and operations are properly isolated and controlled throughout the agent's execution.
Masa: Microservice SLO Management
Microservice applications face challenges in maintaining consistent performance and meeting service-level objectives (SLOs) across services. Traditional approaches often optimize individual services in isolation, leading to suboptimal end-to-end performance. Masa improves microservice goodput by coordinating SLO enforcement across service boundaries, enabling end-to-end SLO guarantees that account for the complex interactions between microservices.
Loom: Efficient Telemetry for Production Systems
Debugging production systems today requires a difficult trade-off between how much data is collected and how quickly it can be queried: indexing for fast queries often slows ingestion and forces data to be dropped. Furthermore, data collection must not slow down the application being monitored. Loom breaks this trade-off by showing that fast queries do not require indexing every record: coarse summaries over small chunks are sufficient. With a small CPU and memory footprint in production, Loom minimizes application interference.
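To make the idea of coarse per-chunk summaries concrete, here is a minimal sketch under stated assumptions. It is not Loom's implementation: the names `Chunk`, `ChunkedLog`, and `query_at_least`, the chunk size, and the choice of a min/max summary are all hypothetical and chosen only for illustration. Records are appended to fixed-size chunks with no per-record index, and a query skips any chunk whose summary rules it out.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # hypothetical value, kept small so the example is easy to follow


@dataclass
class Chunk:
    """A small run of records plus a coarse min/max summary (an assumption)."""
    records: list = field(default_factory=list)
    lo: float = float("inf")
    hi: float = float("-inf")

    def append(self, value):
        # Ingestion only appends and updates two numbers; no per-record index.
        self.records.append(value)
        self.lo = min(self.lo, value)
        self.hi = max(self.hi, value)


class ChunkedLog:
    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = []

    def ingest(self, value):
        # Start a new chunk when there is none yet or the last one is full.
        if not self.chunks or len(self.chunks[-1].records) >= self.chunk_size:
            self.chunks.append(Chunk())
        self.chunks[-1].append(value)

    def query_at_least(self, threshold):
        """Return (matching values, number of chunks actually scanned)."""
        matches = []
        scanned = 0
        for chunk in self.chunks:
            if chunk.hi < threshold:
                continue  # the summary proves no record in this chunk matches
            scanned += 1
            matches.extend(v for v in chunk.records if v >= threshold)
        return matches, scanned


log = ChunkedLog()
for latency_ms in [1, 2, 3, 4, 5, 90, 6, 7, 8, 9, 10, 11]:
    log.ingest(latency_ms)

slow, scanned = log.query_at_least(50)
print(slow, scanned, len(log.chunks))
```

In this sketch the query for values of at least 50 scans only one of the three chunks, because the other two summaries exclude a match. The trade-off is that each surviving chunk is scanned in full, which stays cheap only while chunks are small.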
Quicksand: Unstrand Resources with Granular Computing
Resource stranding causes substantial underutilization in datacenters and is commonly addressed through hardware-based disaggregation, which depends on emerging hardware. Quicksand argues that stranded resources can instead be addressed on today's hardware via a new programming framework layer. Its key insight is to decompose applications into fine-grained, resource-specific units that can be scheduled onto stranded resources. Quicksand performs this decomposition transparently and dynamically adjusts it at millisecond timescales to adapt to changing resource availability.