Research
My research explores how to build better computer systems by co-designing across traditional boundaries: between the network and distributed systems, between hardware and software, and between theory and practice. I focus on distributed systems, networking, and transactions, and I often find that crossing these boundaries leads to systems that are faster, simpler, and more reliable.
Network-system co-design
The core thread of my research: what if distributed systems were designed together with the network, rather than built on top of it as an opaque layer? This has led to replication protocols that exploit network ordering to eliminate coordination, consensus mechanisms that run partly inside the network, and in-network acceleration for distributed ML training.
- Speculative Paxos / NOPaxos — consensus protocols that use network ordering to eliminate leader bottlenecks (NSDI ‘15, OSDI ‘16)
- Eris — coordination-free distributed transactions using in-network sequencing (SOSP ‘17)
- Pegasus — in-switch selective replication for skewed-workload KV stores (OSDI ‘20)
- Harmonia — near-linear scalability for replicated storage with in-switch conflict detection (VLDB ‘20)
- SwitchML — in-network aggregation for distributed ML training (NSDI ‘21)
- Hydra — serialization-free network ordering for strongly consistent applications (NSDI ‘23)
Network function acceleration
Building and accelerating network infrastructure using programmable switches and SmartNICs — from stateful in-network processing to hardware load balancing at datacenter scale. This line of work has led to ongoing collaborations with Azure Networking.
- RedPlane — fault-tolerant stateful in-switch applications (SIGCOMM ‘21)
- SwiSh — distributed shared state abstractions for programmable switches (NSDI ‘22)
- SlimeMold — hardware load balancer at datacenter scale (APNet ‘23)
Transactions and consistency
Building practical systems that provide strong guarantees — serializability, linearizability — without paying the traditional performance cost.
- TAPIR — transactional protocol that builds consistent transactions on top of inconsistent replication (SOSP ‘15)
- Meerkat — zero-coordination replicated transactions (EuroSys ‘20)
- Serializable Snapshot Isolation — making serializable transactions practical in PostgreSQL (VLDB ‘12)
- TxCache — transactionally consistent caching for web applications (OSDI ‘10)
- Beaver — practical partial snapshots for distributed cloud services (OSDI ‘24)
Datacenter architecture
Rethinking the systems stack for modern datacenter hardware — from RDMA interfaces to NVMe virtualization to CXL memory pooling.
- Oasis — pooling PCIe devices over CXL to boost utilization (SOSP ‘25)
- PRISM — rethinking the RDMA interface for distributed systems (SOSP ‘21)
- LeapIO — efficient and portable virtual NVMe storage on ARM SoCs (ASPLOS ‘20)
- Capybara — microsecond-scale live TCP migration (APSys ‘23)
Systems security and architecture
Earlier work on operating systems, security, and low-latency systems.
- Arrakis — operating system that bypasses the kernel for I/O, achieving near-hardware-speed networking (OSDI ‘14)
- Overshadow — protecting application data from a compromised OS using virtual machine techniques (ASPLOS ‘08)
- Tales of the Tail — diagnosing and eliminating tail latency in datacenter systems (SoCC ‘14)
- Aeolus — platform for building applications with information flow security (USENIX ATC ‘12)