Building Fast, CPU-Efficient Distributed Systems on Ultra-Low Latency, RDMA-Capable Networks
| Field | Value |
|---|---|
| Content Provider | Semantic Scholar |
| Author | Mitchell, Christopher |
| Copyright Year | 2015 |
| Abstract | Modern datacenters utilize traditional Ethernet networks to connect hundreds or thousands of machines. Although inexpensive and ubiquitous, Ethernet imposes design constraints on datacenter-scale distributed storage systems that use traditional client-server architectures. Round-trip latency around 100 μs means that minimizing per-operation round trips and maximizing data locality are vital to ensure low application latency. To maintain liveness, sufficient server CPU cores must be provisioned to satisfy peak load, or clients need to wait for additional server CPU resources to be spun up during load spikes. Recent technological trends indicate that future datacenters will embrace interconnects with ultra-low latency, high bandwidth, and the ability to offload work from servers to clients. Interconnects like InfiniBand found in traditional supercomputing clusters already provide these features, including <3 μs latency, 40-80 Gbps throughput, and Remote Direct Memory Access (RDMA), which allows access to another machine’s memory without involving its CPU. Future datacenter-scale distributed storage systems will need to be designed specifically to exploit these features. This thesis explores what these features mean for large-scale in-memory distributed storage systems, and derives two key insights for building RDMA-aware distributed systems. First, relaxing locality between data and computation is now practical: data can be copied from servers to clients for computation. Second, selectively relaxing data-computation locality makes it possible to optimally balance load between server and client CPUs to maintain low application latency. This thesis presents two in-memory distributed storage systems built around these two insights, Pilaf and Cell, that demonstrate effective use of ultra-low-latency, RDMA-capable interconnects. Pilaf is a distributed in-memory key-value store that achieves high performance with low latency. Clients perform read operations, which commonly dominate key-value store workloads, directly from servers’ memory via RDMA. By contrast, write operations are serviced by the server to simplify synchronizing memory accesses. Pilaf balances high performance with modest system complexity, disentangling read latency from server CPU load without forcing clients to complete difficult distributed RDMA-based write operations. Client RDMA reads can still conflict with concurrent server CPU writes, a problem Pilaf solves using self-verifying data structures that can detect read-write races without client-server coordination. Pilaf achieves low latency and high throughput while consuming few CPU resources. |
| File Format | PDF / HTML |
| Alternate Webpage(s) | http://cs.nyu.edu/media/publications/mitchell_christopher.pdf |
| Alternate Webpage(s) | http://cs.nyu.edu/web/Research/Theses/mitchell_christopher.pdf |
| Alternate Webpage(s) | http://www.cs.nyu.edu/media/publications/mitchell_christopher.pdf |
| Alternate Webpage(s) | https://cs.nyu.edu/media/publications/mitchell_christopher.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
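
The core mechanism in the abstract is that a client's one-sided RDMA read can race with a concurrent server write, and Pilaf detects this with self-verifying data structures: each entry carries a checksum, so a torn read fails verification and the client simply retries without coordinating with the server. The sketch below is a minimal, illustrative rendering of that idea in plain C++ with no RDMA verbs; the fixed-size `Entry` layout, the FNV-1a checksum, and the retry bound are assumptions made for this example, not the thesis's actual data structures.

```cpp
// Minimal sketch (not the thesis code) of a self-verifying key-value entry:
// the client reads the entry via a one-sided "RDMA read" (simulated here with
// memcpy), recomputes the checksum, and retries on mismatch, which indicates
// a read-write race with the server.
#include <cstdint>
#include <cstring>
#include <string>
#include <iostream>

// Fixed-size entry as it might sit in the server's RDMA-registered memory.
struct Entry {
    char     key[32];
    char     value[64];
    uint64_t checksum;   // covers key + value
};

// FNV-1a over the key and value bytes (illustrative checksum choice).
uint64_t entry_checksum(const Entry& e) {
    uint64_t h = 1469598103934665603ULL;
    auto mix = [&](const char* p, size_t n) {
        for (size_t i = 0; i < n; ++i) { h ^= (uint8_t)p[i]; h *= 1099511628211ULL; }
    };
    mix(e.key, sizeof(e.key));
    mix(e.value, sizeof(e.value));
    return h;
}

// Server-side write: update the entry in place, then publish a fresh checksum.
void server_put(Entry& slot, const std::string& k, const std::string& v) {
    std::memset(&slot, 0, sizeof(slot));
    std::strncpy(slot.key, k.c_str(), sizeof(slot.key) - 1);
    std::strncpy(slot.value, v.c_str(), sizeof(slot.value) - 1);
    slot.checksum = entry_checksum(slot);
}

// Client-side read: stand-in for a one-sided RDMA READ of the slot into a
// local snapshot. A checksum mismatch means a server write raced with the
// read, so the client retries.
bool client_get(const Entry& slot, std::string& value_out, int max_retries = 8) {
    for (int attempt = 0; attempt < max_retries; ++attempt) {
        Entry snapshot;
        std::memcpy(&snapshot, &slot, sizeof(Entry));   // simulated RDMA read
        if (entry_checksum(snapshot) == snapshot.checksum) {
            value_out.assign(snapshot.value);
            return true;                                // consistent snapshot
        }
        // Torn read detected; retry without contacting the server CPU.
    }
    return false;
}

int main() {
    Entry slot;
    server_put(slot, "user:42", "alice");
    std::string v;
    if (client_get(slot, v)) std::cout << "read value: " << v << "\n";
}
```

In a real concurrent deployment the server's write path would also need appropriate memory ordering so the checksum is only published after the data it covers; the single-threaded sketch above elides that detail.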