RDMA reads complicates the system and puts additional requirements on the application, at the extreme wasting
space and burning through some of the performance benefit it brings. In the future we might have it all: a fast
hashtable implemented in hardware could give us the performance of hardware and the flexibility of a key-value store,
but this will require overcoming a number of challenges.
References
[1] B. Atikoglu, Y. Xu, E. Frachtenberg, S. Jiang, M. Paleczny. Workload Analysis of a Large-scale Key-value Store.
SIGMETRICS 2012.
[2] V. Basant, P. Rahoul. Oracle’s Sonoma processor: Advanced low-cost SPARC processor for enterprise workloads.
HOTCHIPS 2015.
[3] A. Caulfield, E. Chung, A. Putnam, H. Angepat, J. Fowers, M. Haselman, S. Heil, M. Humphrey, P. Kaur, J.-Y. Kim,
D. Lo, T. Massengill, K. Ovtcharov, M. Papamichael, L. Woods, S. Lanka, D. Chiou, D. Burger. A Cloud-Scale
Acceleration Architecture. MICRO 2016.
[4] A. Dragojević, D. Narayanan, M. Castro, O. Hodson. FaRM: Fast Remote Memory. NSDI 2014.
[5] A. Dragojević, D. Narayanan, E. B. Nightingale, M. Renzelmann, A. Shamis, A. Badam, M. Castro. No Compromises:
Distributed Transactions with Consistency, Availability, and Performance. SOSP 2015.
[6] M. Flajslik, M. Rosenblum. Network Interface Design for Low Latency Request-Response Protocols. USENIX ATC
2013.
[7] S. Hudgens. Overview of Phase-Change Chalcogenide Nonvolatile Memory Technology. MRS Bulletin, 2004, vol
29, number 11.
[8] Z. István, D. Sidler, G. Alonso. Building a distributed key-value store with FPGA-based microservers. FPL 2015.
[9] Z. István, D. Sidler, G. Alonso, M. Blott, K. Vissers. A Flexible Hash Table Design For 10Gbps Key-value Stores on
FPGAs. FPL 2013.
[10] J. Ousterhout, A. Gopalan, A. Gupta, A. Kejriwal, C. Lee, B. Montazeri, D. Ongaro, S. J. Park, H. Qin,
M. Rosenblum, S. Rumble, R. Stutsman, S. Yang. The RAMCloud Storage System. TOCS, September 2015, vol 33,
number 3.
[11] A. Kalia, M. Kaminsky, D. G. Andersen. FaSST: Fast, Scalable, and Simple Distributed Transactions with Two-Sided
(RDMA) Datagram RPCs. OSDI 2016.
[12] A. Kalia, M. Kaminsky, D. G. Andersen. Using RDMA Efficiently for Key-Value Services. SIGCOMM 2014.
[13] C. Lee, S.J. Park, A. Kejriwal, S. Matsushita, J. Ousterhout. Implementing Linearizability at Large Scale and Low
Latency. SOSP 2015.
[14] C. Mitchell, Y. Geng, J. Li. Using one-sided RDMA reads to build a fast, CPU-efficient key-value store.
USENIX ATC 2013.
[15] R. Mittal, T. Lam, N. Dukkipati, E. Blem, H. Wassel, M. Ghobadi, A. Vahdat, Y. Wang, D. Wetherall, D. Zats.
TIMELY: RTT-based Congestion Control for the Datacenter. SIGCOMM 2015.
[16] S. Neuvonen, A. Wolski, M. Manner, V. Raatikka. Telecom Application Transaction Processing benchmark.
http://tatpbenchmark.sourceforge.net.
[17] D. Ongaro, S. M. Rumble, R. Stutsman, J. Ousterhout, M. Rosenblum. Fast Crash Recovery in RAMCloud. SOSP
2011.
[18] A. Putnam, A. Caulfield, E. Chung, D. Chiou, K. Constantinides, J. Demme, H. Esmaeilzadeh, J. Fowers, J. Gray,
M. Haselman, S. Hauck, S. Heil, A. Hormati, J.-Y. Kim, S. Lanka, E. Peterson, A. Smith, J. Thong, P. Y. Xiao, D.
Burger, J. Larus, G. P. Gopal, S. Pope. A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services.
ISCA 2014.
[19] S. M. Rumble, A. Kejriwal, J. Ousterhout. Log-structured Memory for DRAM-based Storage. FAST 2014.