Title:
DEDICATED MEMORY SERVER
Document Type and Number:
WIPO Patent Application WO/2016/122607
Kind Code:
A1
Abstract:
Example implementations relate to a dedicated memory server. In example implementations, virtual addresses may be allocated between a plurality of client servers, which may be communicatively coupled to a dedicated memory server. A virtual address specified by a remote direct memory access (RDMA) write command received by the dedicated memory server may be identified. A mapping of the identified virtual address may be updated in response to data associated with the RDMA write command being written to a memory on the dedicated memory server.

Inventors:
BROWNELL PAUL V (US)
OLSON DAVID M (US)
JUNG JASON (US)
WITKOWSKI MICHAEL L (US)
BENAVIDES JOHN A (US)
WALKER WILLIAM J (US)
Application Number:
PCT/US2015/013810
Publication Date:
August 04, 2016
Filing Date:
January 30, 2015
Assignee:
HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP (US)
International Classes:
G06F15/173
Foreign References:
US20140089585A1    2014-03-27
US20130275631A1    2013-10-17
JP2008217214A    2008-09-18
JP2005038218A    2005-02-10
Attorney, Agent or Firm:
HA, Miranda J. (3404 E. Harmony Road, Mail Stop 7, Fort Collins, CO, US)
Claims:
We claim:

1. A system comprising:

a plurality of client servers; and

a dedicated memory server communicatively coupled to the plurality of client servers, the dedicated memory server comprising:

a first memory;

a second memory, wherein latency of the first memory is lower than latency of the second memory;

a remote direct memory access (RDMA) communication module to receive RDMA commands from the plurality of client servers and transmit, using RDMA, data to the plurality of client servers;

an allocation module to allocate virtual addresses between the plurality of client servers; and

a mapping module to maintain mappings of the allocated virtual addresses to physical addresses on the first memory or the second memory.

2. The system of claim 1, wherein:

the dedicated memory server is communicatively coupled to the plurality of client servers over Ethernet; and

the RDMA communication module comprises an RDMA over Converged Ethernet (RoCE) engine.

3. The system of claim 1, wherein:

the dedicated memory server further comprises a memory controller communicatively coupled to the first memory;

the memory controller is to write data to a location in the first memory in response to an RDMA write command received by the RDMA communication module;

the mapping module is further to update, in response to the writing of data, a mapping of a virtual address specified by the RDMA write command; and

the updated mapping of the virtual address specified by the RDMA write command comprises a physical address corresponding to the location in the first memory to which data is written.

4. The system of claim 3, wherein:

the memory controller is further to:

monitor available memory space in the first memory;

identify, if the available memory space is less than an available memory threshold, a set of data in the first memory to transfer to the second memory; and

transfer the identified set of data from the first memory to the second memory; and

the mapping module is further to update, in response to the transfer of the set of data, a mapping of a virtual address corresponding to the transferred set of data, the updated mapping of the virtual address corresponding to the transferred set of data comprising a physical address corresponding to a location in the second memory to which the set of data is transferred.

5. The system of claim 1, wherein:

the RDMA communication module is further to identify, based on mappings maintained by the mapping module, a location, in the second memory, that corresponds to a virtual address specified by an RDMA read command received by the RDMA communication module;

the dedicated memory server further comprises a memory controller communicatively coupled to the second memory;

the memory controller is to transfer data from the location in the second memory to a location in the first memory; and

the mapping module is further to update, in response to the transfer of data, a mapping of the virtual address, the updated mapping comprising a physical address corresponding to the location in the first memory to which data is transferred.

6. The system of claim 1, wherein:

the first memory is a volatile memory;

the second memory is a non-volatile memory (NVM);

the mapping module is further to store a respective status indicator for each of the mappings, each status indicator indicating whether data associated with a virtual address of the respective mapping has been modified without being copied to the second memory; and

the allocation module is further to dynamically reallocate virtual addresses between the plurality of client servers.

7. The system of claim 1, further comprising:

an acceleration module to perform workload-specific processing for any of the plurality of client servers; and

a prioritization module to prioritize requests from the plurality of client servers.

8. The system of claim 1, wherein:

the dedicated memory server and a first client server of the plurality of client servers are at a first geographical site; and

a second client server of the plurality of client servers is at a second geographical site.

9. A machine-readable storage medium encoded with instructions executable by a processor, the machine-readable storage medium comprising:

instructions to allocate virtual addresses between a plurality of client servers, the plurality of client servers communicatively coupled to a dedicated memory server;

instructions to identify a virtual address specified by a remote direct memory access (RDMA) write command received by the dedicated memory server from a first of the plurality of client servers;

instructions to update, in response to data associated with the RDMA write command being written to a first memory on the dedicated memory server, a mapping of the identified virtual address, the updated mapping of the identified virtual address comprising a physical address corresponding to a first location in the first memory to which data is written; and

instructions to store a status indicator for the mapping of the identified virtual address, the status indicator indicating whether data associated with the identified virtual address has been modified without being copied to a second memory on the dedicated memory server, wherein latency of the first memory is lower than latency of the second memory.

10. The machine-readable storage medium of claim 9, further comprising:

instructions to monitor available memory space in the first memory;

instructions to identify, if the available memory space is less than an available memory threshold, a set of data in the first memory to transfer to the second memory;

instructions to transfer the identified set of data from the first memory to the second memory; and

instructions to update, in response to the transfer of the set of data, a mapping of a virtual address corresponding to the transferred set of data, the updated mapping of the virtual address corresponding to the transferred set of data comprising a physical address corresponding to a location in the second memory to which the set of data is transferred.

11. The machine-readable storage medium of claim 9, further comprising:

instructions to identify a location, in the second memory, that corresponds to a virtual address specified by an RDMA read command received by the dedicated memory server from a second of the plurality of client servers;

instructions to transfer data from the location in the second memory to a second location in the first memory; and

instructions to update, in response to the transfer of data, a mapping of the virtual address specified by the RDMA read command, the updated mapping of the virtual address specified by the RDMA read command comprising a physical address corresponding to the second location in the first memory.

12. A method comprising:

allocating virtual addresses between a plurality of client servers, the plurality of client servers communicatively coupled to a dedicated memory server;

identifying a virtual address specified by a remote direct memory access (RDMA) write command received by the dedicated memory server from a first of the plurality of client servers;

updating, in response to data associated with the RDMA write command being written to a first memory on the dedicated memory server, a mapping of the identified virtual address, the updated mapping of the identified virtual address comprising a physical address corresponding to a first location in the first memory to which data is written; and

transferring data from a location, in a second memory on the dedicated memory server, and that corresponds to a virtual address specified by an RDMA read command received by the dedicated memory server from a second of the plurality of client servers, to a second location in the first memory, wherein latency of the first memory is lower than latency of the second memory.

13. The method of claim 12, further comprising:

updating, in response to the transfer of data, a mapping of the virtual address specified by the RDMA read command, the updated mapping of the virtual address specified by the RDMA read command comprising a physical address corresponding to the second location in the first memory; and

storing a status indicator for the mapping of the virtual address specified by the RDMA read command, the status indicator indicating whether data associated with the virtual address specified by the RDMA read command has been modified without being copied to the second memory.

14. The method of claim 12, further comprising dynamically reallocating virtual addresses between the plurality of client servers.

15. The method of claim 12, further comprising:

monitoring available memory space in the first memory;

identifying, if the available memory space is less than an available memory threshold, a set of data in the first memory to transfer to the second memory;

transferring the identified set of data from the first memory to the second memory; and

updating, in response to the transfer of the set of data, a mapping of a virtual address corresponding to the transferred set of data, the updated mapping of the virtual address corresponding to the transferred set of data comprising a physical address corresponding to a location in the second memory to which the set of data is transferred.

Description:
DEDICATED MEMORY SERVER

BACKGROUND

[0001] An application server may store data in memory that is on another server. Data may be transferred from one server to another server using a remote direct memory access (RDMA). RDMAs may allow data to be transferred between two servers without involving either server's operating system.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] The following detailed description references the drawings, wherein:

[0003] FIG. 1 is a block diagram of an example system for expanding the amount of memory available to client servers;

[0004] FIG. 2 is a block diagram of an example system for transferring data between memories on a dedicated memory server;

[0005] FIG. 3 is a block diagram of an example device that includes a machine-readable storage medium encoded with instructions to update a mapping of a virtual address;

[0006] FIG. 4 is a block diagram of an example device that includes a machine-readable storage medium encoded with instructions to transfer a set of data from a first memory to a second memory;

[0007] FIG. 5 is a block diagram of an example device that includes a machine-readable storage medium encoded with instructions to enable transferring of data in response to an RDMA read command;

[0008] FIG. 6 is a flowchart of an example method for transferring data between memories on a dedicated memory server;

[0009] FIG. 7 is a flowchart of an example method for tracking data stored on a dedicated memory server; and

[0010] FIG. 8 is a flowchart of an example method for determining when to transfer data between memories on a dedicated memory server.

DETAILED DESCRIPTION

[0011] Remote servers may be used for memory capacity expansion in server systems. Memory in a particular server in a commodity-based server system may be available to be used by other servers in the server system. Such disaggregation of memory across a server system may reduce provisioning and power costs while enhancing performance. To reduce central processing unit (CPU) overhead in communications between servers, remote direct memory accesses (RDMAs) may be used. Commands transmitted using RDMA may be referred to herein as "RDMA commands".

[0012] Referring now to the figures, FIG. 1 is a block diagram of an example system 100 for expanding the amount of memory available to client servers. In FIG. 1, system 100 includes dedicated memory server 104, which is communicatively coupled to client servers 102a, 102b, and 102c. As used herein, the term "client server" should be understood to refer to a computing device, or a program running on a computing device, that performs tasks on behalf of, and/or otherwise fulfills requests received from, other computing devices. A client server may be, for example, a database server, application server, web server, and/or mail server. As used herein, the term "dedicated memory server" should be understood to refer to a server whose memory is available to be used by any client server communicatively coupled to the dedicated memory server.

[0013] Dedicated memory server 104 and client servers 102a-c may be directly coupled, or may be communicatively coupled via a network (e.g., Ethernet, InfiniBand). It should be understood that system 100 may include more or fewer client servers than are shown in FIG. 1, and that the concepts described herein may be applicable to any number of client servers communicatively coupled to a dedicated memory server. It should be understood that the concepts described herein may be applicable to any suitable form factors of dedicated memory servers and client servers; for example, dedicated memory server 104 and client servers 102a-c may include blade servers, tower servers, and/or rack servers.

[0014] Dedicated memory server 104 may include RDMA communication module 110, allocation module 112, mapping module 114, memory 106, and memory 108. As used herein, the terms "include", "have", and "comprise" are interchangeable and should be understood to have the same meaning. In some examples, RDMA communication module 110, allocation module 112, and/or mapping module 114 may be implemented in a field-programmable gate array (FPGA) or application-specific integrated circuit (ASIC). A module may include a set of instructions encoded on a machine-readable storage medium and executable by a processor. In addition or as an alternative, a module may include a hardware device comprising electronic circuitry for implementing the functionality described below.

[0015] RDMA communication module 110 may receive RDMA commands from client servers 102a-c and transmit, using RDMA, data to client servers 102a-c. RDMA communication module 110 may also receive data (e.g., data associated with RDMA commands) from client servers 102a-c. In implementations where dedicated memory server 104 is communicatively coupled to client servers 102a-c via an Ethernet network, RDMA communication module 110 may include Ethernet ports. In some implementations, RDMA communication module 110 may include network adapters and/or other mechanisms for maintaining/managing connections between dedicated memory server 104 and client servers 102a-c.

[0016] Allocation module 112 may allocate virtual addresses between client servers 102a-c. In some implementations, the total memory space of the allocated virtual addresses may exceed the amount of memory space of memory 106 on dedicated memory server 104, but be less than the total amount of memory space of memories 106 and 108. Latency of memory 106 may be lower than latency of memory 108, as discussed below. Allocation module 112 may allocate different amounts of virtual address space to different client servers. Allocation module 112 may determine how much virtual address space to allocate to a client server based on, for example, how many applications the client server supports, the importance of processes running on the client server compared to the importance of processes running on other client servers, and/or the maximum amount of memory the client server is expected to use. In some implementations, the allocated virtual addresses may be remote machine memory addresses (RMMAs).

[0017] In some implementations, allocation module 112 may dynamically reallocate virtual addresses between client servers 102a-c. In some implementations, the number and/or priority of tasks running on client servers 102a-c may be periodically assessed, and allocation module 112 may dynamically reallocate virtual addresses to a client server having a higher number and/or priority of tasks. For example, if allocation module 112 initially allocates equal amounts of virtual address space to client servers 102a-c and client server 102a is determined to be performing more high priority tasks than client servers 102b and 102c during runtime, allocation module 112 may identify some virtual addresses that were initially assigned to client servers 102b and 102c, and reallocate such virtual addresses to client server 102a. In some implementations, client servers 102a-c may be in the same enclosure, and when another client server is added to the enclosure, allocation module 112 may reallocate virtual addresses between client servers 102a-c and the added client server.
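
As a rough illustration of this allocation policy (the application itself contains no code), the following Python sketch divides a fixed range of virtual pages between clients in proportion to a per-client weight and recomputes the split when the weights change; all names, the page granularity, and the weighting scheme are assumptions.

```python
class AllocationModule:
    """Illustrative allocator: splits a virtual page range between
    clients in proportion to per-client weights (e.g., number or
    priority of running tasks)."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.allocations = {}  # client id -> range of virtual page numbers

    def allocate(self, weights):
        # Give each client a share of pages proportional to its weight.
        total = sum(weights.values())
        next_page = 0
        for client, weight in sorted(weights.items()):
            share = self.total_pages * weight // total
            self.allocations[client] = range(next_page, next_page + share)
            next_page += share

    def reallocate(self, new_weights):
        # Dynamic reallocation: recompute shares from fresh weights.
        # A real server would also migrate or invalidate mappings for
        # pages that change owner; that step is elided here.
        self.allocate(new_weights)


alloc = AllocationModule(total_pages=3 * 1024 * 1024)  # e.g., 12 GiB of 4 KiB pages
alloc.allocate({"102a": 1, "102b": 1, "102c": 1})      # equal initial split
alloc.reallocate({"102a": 2, "102b": 1, "102c": 1})    # 102a now runs more high-priority tasks
```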

[0018] Mapping module 114 may maintain mappings of the allocated virtual addresses to physical addresses on memory 106 or memory 108. As used herein, the term "mapping" should be understood to refer to an indication of a relationship between a virtual address and a physical address, the virtual address being used by a client server to refer to the physical address on a dedicated memory server. The use of virtual addresses by client servers may allow the changing of physical locations of data on a dedicated memory server (e.g., dedicated memory server 104) to be transparent to the client servers (e.g., client servers 102a-c). In some implementations, the mappings may be stored in a mapping table in a static random-access memory (SRAM) in an FPGA or ASIC on dedicated memory server 104.
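
In this sense a mapping is just a small record per virtual address. A minimal sketch with hypothetical names follows; the application describes the table only at the level above.

```python
class MappingModule:
    """Tracks which physical location backs each virtual address, so
    data can move between the two memories without client servers
    ever seeing a physical address change."""

    def __init__(self):
        self._table = {}  # virtual address -> (memory tier, physical address)

    def update(self, vaddr, tier, paddr):
        self._table[vaddr] = (tier, paddr)

    def resolve(self, vaddr):
        return self._table[vaddr]


mapping = MappingModule()
mapping.update(0x1000, tier="first", paddr=0x9F000)   # data lands in low-latency memory
mapping.update(0x1000, tier="second", paddr=0x4000)   # later migrated; clients see no change
```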

[0019] Latency of memory 106 may be lower than latency of memory 108. The term "latency", as used herein with respect to a memory, should be understood to refer to a length of time from when a command to obtain data from, move data out of, and/or move data into the memory, or perform any other manipulation of data in the memory, is issued, to when data associated with the command is available or the manipulation of data is complete. In some implementations, memory 106 may be a volatile memory, and memory 108 may be a non-volatile memory. For example, memory 106 may be a double data rate type three (DDR3) dynamic random-access memory (DRAM), and memory 108 may be a solid-state drive (SSD), hard disk drive (HDD), or memristor memory. In some implementations, memory 106 may be a first type of non-volatile memory and memory 108 may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory. For example, memory 106 may be a memristor memory and memory 108 may be an HDD.

[0020] In some implementations, mapping module 114 may store a respective status indicator for each of the mappings. Each status indicator may indicate whether data associated with a virtual address of the respective mapping has been modified without being copied to memory 108. For example, in response to RDMA write commands received by RDMA communication module 110, data associated with the RDMA write commands may be written to memory 106. Multiple RDMA write commands may alter data stored in a given location of memory 106. Data in memory 106 may be copied to memory 108, for example to create backup copies of the data or to make room for more data in memory 106 (e.g., least recently used data may be moved from memory 106 to memory 108). A status indicator may include a dirty bit that indicates whether data in a particular location in memory 106 has been modified since the last time data in that location was copied to memory 108. Such dirty bits may be used to ensure that the most current version of data in memory 106 is copied to memory 108.
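
The dirty-bit bookkeeping described here might be sketched as follows (names are illustrative): each RDMA write marks its entry dirty, and a backup pass copies and clears only the marked entries, so unchanged data is never recopied.

```python
class TrackedMapping:
    """Mapping entries extended with a dirty bit, per the paragraph above."""

    def __init__(self):
        self._table = {}  # vaddr -> {"paddr": physical address, "dirty": bool}

    def on_write(self, vaddr, paddr):
        # Every RDMA write to data in the first memory marks the entry dirty.
        self._table[vaddr] = {"paddr": paddr, "dirty": True}

    def copy_dirty_to_second_memory(self, copy_fn):
        # Copy only entries modified since their last backup, then clear
        # the bit so the most current version is always the one preserved.
        for vaddr, entry in self._table.items():
            if entry["dirty"]:
                copy_fn(vaddr, entry["paddr"])
                entry["dirty"] = False
```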

[0021] FIG. 2 is a block diagram of an example system 200 for transferring data between memories on a dedicated memory server. In FIG. 2, system 200 includes dedicated memory server 204, which is communicatively coupled to client servers 202a, 202b, and 202c via network 226. In some implementations, network 226 may be an Ethernet or InfiniBand network. In some implementations, dedicated memory server 204 and one of client servers 202a-c (e.g., client server 202a) may be at a first geographical site, and another one of client servers 202a-c (e.g., client server 202b) may be at a second geographical site.

[0022] Dedicated memory server 204 may include RDMA communication module 210, allocation module 212, mapping module 214, acceleration module 216, prioritization module 218, memory controller 222, memory controller 224, memory 206, and memory 208. Allocation module 212 and mapping module 214 of FIG. 2 may be analogous to (e.g., have functions and/or components similar to) allocation module 112 and mapping module 114, respectively, of FIG. 1. A module may include a set of instructions encoded on a machine-readable storage medium and executable by a processor. In addition or as an alternative, a module may include a hardware device comprising electronic circuitry for implementing the functionality described below.

[0023] RDMA communication module 210 may receive RDMA commands over network 226 from client servers 202a-c and transmit, using RDMA, data to client servers 202a-c via network 226. RDMA communication module 210 may also receive data (e.g., data associated with RDMA commands) from client servers 202a-c. In implementations where dedicated memory server 204 is communicatively coupled to client servers 202a-c over Ethernet, RDMA communication module 210 may include RDMA over Converged Ethernet (RoCE) engine 220. RoCE engine 220 may manage/maintain connections between dedicated memory server 204 and client servers 202a-c. In some implementations, RoCE engine 220 may be implemented in an FPGA or ASIC on dedicated memory server 204.

[0024] In some implementations, RDMA communication module 210 may identify, based on mappings maintained by mapping module 214, a location, in memory 208, that corresponds to a virtual address specified by an RDMA read command received by RDMA communication module 210. Latency of memory 206 may be lower than latency of memory 208, and dedicated memory server 204 may include memory controller 224 communicatively coupled to memory 208. In some examples, memory controller 224 may be implemented in an FPGA or ASIC on dedicated memory server 204, and the FPGA or ASIC may include a Peripheral Component Interconnect (PCI) Express link to configure an RDMA controller. In some implementations, memory 206 may be a volatile memory (e.g., DDR3 DRAM), and memory 208 may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, memory 206 may be a first type of non-volatile memory and memory 208 may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1. Memory controller 224 may transfer data from the identified location in memory 208 to a location in memory 206. Mapping module 214 may update, in response to the transfer of data, a mapping of the virtual address. The updated mapping may include a physical address corresponding to the location in memory 206 to which data is transferred. The data may be transmitted to a client server (e.g., the client server from which the RDMA read command was received).
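
Putting this read path together, a toy sketch might look like the following; it reuses the MappingModule sketch above, and the Memory class below is a stand-in for one memory plus its controller, not anything named in the application.

```python
class Memory:
    """Toy byte store standing in for one memory and its controller."""

    def __init__(self):
        self._cells = {}
        self._next = 0

    def write(self, data):
        paddr = self._next          # pick the next free physical address
        self._next += 1
        self._cells[paddr] = data
        return paddr

    def read(self, paddr):
        return self._cells[paddr]


def handle_rdma_read(vaddr, mapping, first_mem, second_mem):
    """Serve an RDMA read; if the data lives in the higher-latency
    memory, transfer it to the low-latency memory and remap first."""
    tier, paddr = mapping.resolve(vaddr)
    if tier == "second":
        data = second_mem.read(paddr)
        new_paddr = first_mem.write(data)           # transfer to the faster memory
        mapping.update(vaddr, "first", new_paddr)   # update mapping after the transfer
        return data
    return first_mem.read(paddr)
```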

[0025] In some implementations, system 200 may include memory controller 222 communicatively coupled to memory 206. In some examples, memory controller 222 may be implemented in an FPGA or ASIC on dedicated memory server 204. Memory controller 222 may write data to a location in memory 206 in response to an RDMA write command received by RDMA communication module 210. Mapping module 214 may update, in response to the writing of data, a mapping of a virtual address specified by the RDMA write command. The updated mapping of the virtual address specified by the RDMA write command may include a physical address corresponding to the location in memory 206 to which data is written.
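
The corresponding write path, in the same toy terms (a sketch, not the application's implementation), is shorter: data always lands in the low-latency memory, and the mapping is updated afterwards.

```python
def handle_rdma_write(vaddr, data, mapping, first_mem):
    """Write data into the low-latency memory, then remap the
    virtual address to the location that was just written."""
    paddr = first_mem.write(data)          # memory controller writes memory 206
    mapping.update(vaddr, "first", paddr)  # mapping module remaps the vaddr
    return paddr
```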

[0026] In some implementations, memory controller 222 may monitor available memory space in memory 206. For example, memory controller 222 may determine how many more bytes (or other suitable unit) of data may be stored in memory 206 before memory 206 is full. Memory controller 222 may identify, if the available memory space is less than an available memory threshold, a set of data in memory 206 to transfer to memory 208. For example, memory controller 222 may identify least recently used data, and/or data flagged to be discarded/transferred, in memory 206 to transfer to memory 208. Latency of memory 206 may be lower than latency of memory 208. For example, memory 206 may be a volatile memory, and memory 208 may be a non-volatile memory. Memory controller 222 may transfer the identified set of data from memory 206 to memory 208. Mapping module 214 may update, in response to the transfer of the set of data, a mapping of a virtual address corresponding to the transferred set of data. The updated mapping of the virtual address corresponding to the transferred set of data may include a physical address corresponding to a location in memory 208 to which the set of data is transferred.
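
A least-recently-used eviction pass along these lines might be sketched as follows; the threshold, the LRU structure, and all names are assumptions, since the application leaves the selection policy open beyond the examples above.

```python
from collections import OrderedDict


def evict_if_low(first_mem_lru, second_mem, mapping, free_bytes, threshold_bytes):
    """If free space in the first memory is below the threshold, move
    least recently used entries to the second memory and remap them.

    `first_mem_lru` is an OrderedDict of vaddr -> bytes, oldest first.
    """
    while free_bytes < threshold_bytes and first_mem_lru:
        vaddr, data = first_mem_lru.popitem(last=False)  # least recently used entry
        paddr = second_mem.write(data)                   # transfer to the slower memory
        mapping.update(vaddr, "second", paddr)           # clients keep the same vaddr
        free_bytes += len(data)                          # space reclaimed in first memory
    return free_bytes


# Reusing the toy Memory and MappingModule classes from the earlier sketches:
lru = OrderedDict({0x1000: b"oldest data", 0x2000: b"newer data"})
free = evict_if_low(lru, Memory(), MappingModule(), free_bytes=64, threshold_bytes=128)
```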

[0027] Acceleration module 216 may perform workload-specific processing for any of client servers 202a-c. In some implementations, one of client servers 202a-c may transmit to dedicated memory server 204 an RDMA command to execute a process related to a request that the client server has received from another computing device, and the process may be executed by acceleration module 216. For example, one of client servers 202a-c may transmit to dedicated memory server 204 an RDMA command to perform a table lookup or MapReduce operations. Thus, processing capabilities as well as memory on dedicated memory server 204 may be utilized by client servers 202a-c. In some examples, acceleration module 216 may be implemented in an FPGA or ASIC on dedicated memory server 204.

[0028] Prioritization module 218 may prioritize requests from client servers 202a-c. Requests may include RDMA read/write commands and/or RDMA commands to execute a process. If RDMA communication module 210 receives multiple RDMA commands, either from the same client server or from multiple client servers, that cannot be executed at the same time, prioritization module 218 may determine the order in which the RDMA commands should be executed on dedicated memory server 204. For example, prioritization module 218 may determine that requests received from a client server supporting a large number of mission-critical applications should be executed before requests received from other client servers. In some implementations, prioritization module 218 may determine which requests can be processed at the same time on dedicated memory server 204, and may process such requests first. In some examples, prioritization module 218 may be implemented in an FPGA or ASIC on dedicated memory server 204.
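
One plausible realization of such ordering is a per-client priority queue; the sketch below (priorities, tie-breaking, and all names are assumptions) executes commands from higher-priority clients first and keeps FIFO order within a priority level.

```python
import heapq
import itertools


class PrioritizationModule:
    """Orders pending RDMA commands so requests from clients running
    more critical work are executed first."""

    def __init__(self, client_priority):
        self._client_priority = client_priority  # client id -> smaller value = more urgent
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserves FIFO order per priority

    def submit(self, client, command):
        priority = self._client_priority[client]
        heapq.heappush(self._heap, (priority, next(self._seq), command))

    def next_command(self):
        return heapq.heappop(self._heap)[2]


pq = PrioritizationModule({"202a": 0, "202b": 1, "202c": 1})  # 202a hosts mission-critical apps
pq.submit("202b", "RDMA_READ vaddr=0x2000")
pq.submit("202a", "RDMA_WRITE vaddr=0x1000")
assert pq.next_command() == "RDMA_WRITE vaddr=0x1000"  # 202a's request runs first
```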

[0029] FIG. 3 is a block diagram of an example device 300 that includes a machine-readable storage medium encoded with instructions to update a mapping of a virtual address. Device 300 may be implemented, for example, in a dedicated memory server (e.g., dedicated memory server 104 or 204). In FIG. 3, device 300 includes processor 302 and machine-readable storage medium 304.

[0030] Processor 302 may include a central processing unit (CPU), microprocessor (e.g., semiconductor-based microprocessor), and/or other hardware device suitable for retrieval and/or execution of instructions stored in machine-readable storage medium 304. Processor 302 may fetch, decode, and/or execute instructions 306, 308, 310, and 312. As an alternative or in addition to retrieving and/or executing instructions, processor 302 may include an electronic circuit comprising a number of electronic components for performing the functionality of instructions 306, 308, 310, and/or 312.

[0031] Machine-readable storage medium 304 may be any suitable electronic, magnetic, optical, or other physical storage device that contains or stores executable instructions. Thus, machine-readable storage medium 304 may include, for example, a random-access memory (RAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a storage device, an optical disc, and the like. In some implementations, machine-readable storage medium 304 may include a non-transitory storage medium, where the term "non-transitory" does not encompass transitory propagating signals. As described in detail below, machine-readable storage medium 304 may be encoded with a set of executable instructions 306, 308, 310, and 312.

[0032] Instructions 306 may allocate virtual addresses between a plurality of client servers. The plurality of client servers (e.g., client servers 102a-c or 202a-c) may be communicatively coupled to a dedicated memory server. In some implementations, the total memory space of the allocated virtual addresses may exceed the amount of memory space of a memory on the dedicated memory server, but be less than the total amount of memory space of two memories (e.g., memories 106 and 108, or memories 206 and 208) on the dedicated memory server. In some implementations, instructions 306 may allocate different amounts of virtual address space to different client servers. Instructions 306 may determine how much virtual address space to allocate to a client server based on, for example, how many applications the client server supports, the importance of processes running on the client server compared to the importance of processes running on other client servers, and/or the maximum amount of memory the client server is expected to use. In some implementations, the allocated virtual addresses may be RMMAs.

[0033] Instructions 308 may identify a virtual address specified by an RDMA write command received by the dedicated memory server from one of the plurality of client servers. In some implementations, data associated with the RDMA write command may be written to the identified virtual address. In some implementations, the identified virtual address may be an RMMA.

[0034] Instructions 310 may update a mapping of a virtual address. For example, in response to data associated with the RDMA write command being written to a first memory on the dedicated memory server, instructions 310 may update a mapping of the identified virtual address. The updated mapping of the identified virtual address may include a physical address corresponding to a location in the first memory to which data is written.

[0035] Instructions 312 may store a status indicator for the mapping of the identified virtual address. The status indicator may indicate whether data associated with the identified virtual address has been modified without being copied to a second memory on the dedicated memory server. Latency of the first memory may be lower than latency of the second memory. In some implementations, the first memory may be a volatile memory (e.g., DDR3 DRAM), and the second memory may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, the first memory may be a first type of non-volatile memory and the second memory may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1.

[0036] In some implementations, in response to RDMA write commands received by the dedicated memory server, data associated with the RDMA write commands may be written to the first memory. Multiple RDMA write commands may alter data stored in a given location of the first memory. Data in the first memory may be copied to the second memory, for example to create backup copies of the data or to make room for more data in the first memory (e.g., least recently used data may be moved from the first memory to the second memory). A status indicator may include a dirty bit that indicates whether data in a particular location in the first memory has been modified since the last time data in that location was copied to the second memory. Such dirty bits may be used to ensure that the most current version of data in the first memory is copied to the second memory.

[0037] FIG. 4 is a block diagram of an example device 400 that includes a machine-readable storage medium encoded with instructions to transfer a set of data from a first memory to a second memory. Device 400 may be implemented, for example, in a dedicated memory server (e.g., dedicated memory server 104 or 204). In FIG. 4, device 400 includes processor 402 and machine-readable storage medium 404.

[0038] As with processor 302 of FIG. 3, processor 402 may include a CPU, microprocessor (e.g., semiconductor-based microprocessor), and/or other hardware device suitable for retrieval and/or execution of instructions stored in machine-readable storage medium 404. Processor 402 may fetch, decode, and/or execute instructions 406, 408, 410, 412, 414, 416, and 418. As an alternative or in addition to retrieving and/or executing instructions, processor 402 may include an electronic circuit comprising a number of electronic components for performing the functionality of instructions 406, 408, 410, 412, 414, 416, and/or 418.

[0039] As with machine-readable storage medium 304 of FIG. 3, machine-readable storage medium 404 may be any suitable physical storage device that stores executable instructions. Instructions 406, 408, 410, and 412 on machine-readable storage medium 404 may be analogous to instructions 306, 308, 310, and 312, respectively, on machine-readable storage medium 304. Instructions 414 may monitor available memory space in a first memory on a dedicated memory server. For example, instructions 414 may determine how many more bytes (or other suitable unit) of data may be stored in the first memory before the first memory is full.

[0040] Instructions 416 may identify, if the available memory space is less than an available memory threshold, a set of data in the first memory to transfer to a second memory on the dedicated memory server. For example, instructions 416 may identify least recently used data, and/or data flagged to be discarded/transferred, in the first memory to transfer to the second memory. Latency of the first memory may be lower than latency of the second memory. In some implementations, the first memory may be a volatile memory (e.g., DDR3 DRAM), and the second memory may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, the first memory may be a first type of non-volatile memory and the second memory may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1.

[0041] Instructions 418 may transfer the identified set of data from the first memory to the second memory. Instructions 410 may update, in response to the transfer of the set of data, a mapping of a virtual address corresponding to the transferred set of data. The updated mapping of the virtual address corresponding to the transferred set of data may include a physical address corresponding to a location in the second memory to which the set of data is transferred.

[0042] FIG. 5 is a block diagram of an example device 500 that includes a machine-readable storage medium encoded with instructions to enable transferring of data in response to an RDMA read command. Device 500 may be implemented, for example, in a dedicated memory server (e.g., dedicated memory server 104 or 204). In FIG. 5, device 500 includes processor 502 and machine-readable storage medium 504.

[0043] As with processor 302 of FIG. 3, processor 502 may include a CPU, microprocessor (e.g., semiconductor-based microprocessor), and/or other hardware device suitable for retrieval and/or execution of instructions stored in machine-readable storage medium 504. Processor 502 may fetch, decode, and/or execute instructions 506, 508, 510, 512, 514, and 516 to enable transferring of data in response to an RDMA read command, as described below. As an alternative or in addition to retrieving and/or executing instructions, processor 502 may include an electronic circuit comprising a number of electronic components for performing the functionality of instructions 506, 508, 510, 512, 514, and/or 516.

[0044] As with machine-readable storage medium 304 of FIG. 3, machine-readable storage medium 504 may be any suitable physical storage device that stores executable instructions. Instructions 506, 508, 510, and 512 on machine-readable storage medium 504 may be analogous to instructions 306, 308, 310, and 312, respectively, on machine-readable storage medium 304. Instructions 514 may identify a location, in a second memory on a dedicated memory server, that corresponds to a virtual address specified by an RDMA read command received by the dedicated memory server from one of a plurality of client servers communicatively coupled to the dedicated memory server. The second memory may have a higher latency than a first memory on the dedicated memory server. In some implementations, the first memory may be a volatile memory (e.g., DDR3 DRAM), and the second memory may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, the first memory may be a first type of non-volatile memory and the second memory may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1.

[0045] Instructions 516 may transfer data from the location in the second memory to a location in the first memory. In response to the transfer of data, instructions 510 may update a mapping of the virtual address specified by the RDMA read command. The updated mapping of the virtual address specified by the RDMA read command may include a physical address corresponding to the location in the first memory.

[0046] Methods related to dedicated memory server operations are discussed with respect to FIGS. 6-8. FIG. 6 is a flowchart of an example method 600 for transferring data between memories on a dedicated memory server. Although execution of method 600 is described below with reference to processor 502 of FIG. 5, it should be understood that execution of method 600 may be performed by other suitable devices, such as processors 302 and 402 of FIGS. 3 and 4, respectively. Method 600 may be implemented in the form of executable instructions stored on a machine-readable storage medium and/or in the form of electronic circuitry.

[0047] Method 600 may start in block 602, where processor 502 may allocate virtual addresses between a plurality of client servers. The plurality of client servers (e.g., client servers 102a-c or 202a-c) may be communicatively coupled to a dedicated memory server. In some implementations, the total memory space of the allocated virtual addresses may exceed the amount of memory space of a memory on the dedicated memory server, but be less than the total amount of memory space of two memories (e.g., memories 106 and 108, or memories 206 and 208) on the dedicated memory server. In some implementations, processor 502 may allocate different amounts of virtual address space to different client servers. Processor 502 may determine how much virtual address space to allocate to a client server based on, for example, how many applications the client server supports, the importance of processes running on the client server compared to the importance of processes running on other client servers, and/or the maximum amount of memory the client server is expected to use. In some implementations, the allocated virtual addresses may be RMMAs.

[0048] In block 604, processor 502 may identify a virtual address specified by an RDMA write command received by the dedicated memory server from a first of the plurality of client servers. In some implementations, data associated with the RDMA write command may be written to the identified virtual address. In some implementations, the identified virtual address may be an RMMA.

[0049] In block 606, processor 502 may update, in response to data associated with the RDMA write command being written to a first memory on the dedicated memory server, a mapping of the identified virtual address. The updated mapping of the identified virtual address may include a physical address corresponding to a first location in the first memory to which data is written.

[0050] In block 608, processor 502 may transfer data from a location, in a second memory on the dedicated memory server, and that corresponds to a virtual address specified by an RDMA read command received by the dedicated memory server from a second of the plurality of client servers, to a second location in the first memory. The data may be transmitted to a client server (e.g., the client server from which the RDMA read command was received). Latency of the first memory may be lower than latency of the second memory. In some implementations, the first memory may be a volatile memory (e.g., DDR3 DRAM), and the second memory may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, the first memory may be a first type of non-volatile memory and the second memory may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1.

[0051] FIG. 7 is a flowchart of an example method 700 for tracking data stored on a dedicated memory server. Although execution of method 700 is described below with reference to processor 502 of FIG. 5, it should be understood that execution of method 700 may be performed by other suitable devices, such as processors 302 and 402 of FIGS. 3 and 4, respectively. Some blocks of method 700 may be performed in parallel with and/or after method 600. Method 700 may be implemented in the form of executable instructions stored on a machine-readable storage medium and/or in the form of electronic circuitry.

[0052] Method 700 may start in block 702, where processor 502 may transfer data from a location, in a second memory on a dedicated memory server, and that corresponds to a virtual address specified by an RDMA read command received by the dedicated memory server (e.g., dedicated memory server 104 or 204) from one of a plurality of client servers (e.g., client servers 102a-c or 202a-c) communicatively coupled to the dedicated memory server, to a location in a first memory on the dedicated memory server. Latency of the first memory may be lower than latency of the second memory. In some implementations, the first memory may be a volatile memory (e.g., DDR3 DRAM), and the second memory may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, the first memory may be a first type of non-volatile memory and the second memory may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1.

[0053] In block 704, processor 502 may update, in response to the transfer of data, a mapping of the virtual address specified by the RDMA read command. The updated mapping of the virtual address specified by the RDMA read command may include a physical address corresponding to the location in the first memory. The data may be transmitted to a client server (e.g., the client server from which the RDMA read command was received).

[0054] In block 706, processor 502 may store a status indicator for the mapping of the virtual address specified by the RDMA read command. The status indicator may indicate whether data associated with the virtual address specified by the RDMA read command has been modified without being copied to the second memory. In some implementations, the status indicator may include a dirty bit, as discussed above with respect to FIG. 1.

[0055] In block 708, processor 502 may dynamically reallocate virtual addresses between the plurality of client servers. In some implementations, the number and/or priority of tasks running on the plurality of client servers may be periodically assessed, and processor 502 may dynamically reallocate virtual addresses to a client server having a higher number and/or priority of tasks. For example, if equal amounts of virtual address space are initially allocated to the plurality of client servers and one of the plurality of client servers is determined to be performing more high priority tasks than the other client servers during runtime, processor 502 may identify some virtual addresses that were initially assigned to the other client servers, and reallocate such virtual addresses to the one of the plurality of client servers. In some implementations, the plurality of client servers may be in the same enclosure, and when another client server is added to the enclosure, processor 502 may reallocate virtual addresses between the plurality of client servers and the added client server. Although block 708 is shown below blocks 702, 704, and 706 in FIG. 7, it should be understood that the elements of block 708 may be performed before or in parallel with the elements of blocks 702, 704, and/or 706.

[0056] FIG. 8 is a flowchart of an example method 800 for determining when to transfer data between memories on a dedicated memory server. Although execution of method 800 is described below with reference to processor 402 of FIG. 4, it should be understood that execution of method 800 may be performed by other suitable devices, such as processors 302 and 502 of FIGS. 3 and 5, respectively. Some blocks of method 800 may be performed in parallel with and/or after method 600 or 700. Method 800 may be implemented in the form of executable instructions stored on a machine-readable storage medium and/or in the form of electronic circuitry.

[0057] Method 800 may start in block 802, where processor 402 may monitor available memory space in a first memory on a dedicated memory server. The dedicated memory server (e.g., dedicated memory server 104 or 204) may be communicatively coupled (e.g., via an Ethernet or InfiniBand network) to a plurality of client servers (e.g., client servers 102a-c or 202a-c). In some implementations, processor 402 may determine how many more bytes (or other suitable unit) of data may be stored in the first memory before the first memory is full.

[0058] In block 804, processor 402 may determine whether the available memory space is less than an available memory threshold. If, in block 804, processor 402 determines that the available memory space is not less than the available memory threshold, method 800 may loop back to block 802. If, in block 804, processor 402 determines that the available memory space is less than the available memory threshold, method 800 may proceed to block 806, where processor 402 may identify a set of data in the first memory to transfer to a second memory on the dedicated memory server. For example, processor 402 may identify least recently used data, and/or data flagged to be discarded/transferred, in the first memory to transfer to the second memory. Latency of the first memory may be lower than latency of the second memory. In some implementations, the first memory may be a volatile memory (e.g., DDR3 DRAM), and the second memory may be a non-volatile memory (e.g., SSD, HDD, or memristor memory). In some implementations, the first memory may be a first type of non-volatile memory and the second memory may be a second type of non-volatile memory that has a higher latency than the first type of non-volatile memory, as discussed above with respect to FIG. 1.

[0059] In block 808, processor 402 may transfer the identified set of data from the first memory to the second memory. In block 810, processor 402 may update, in response to the transfer of the set of data, a mapping of a virtual address corresponding to the transferred set of data. The updated mapping of the virtual address corresponding to the transferred set of data may include a physical address corresponding to a location in the second memory to which the set of data is transferred.

[0060] The foregoing disclosure describes server systems having a dedicated memory server. Example implementations described herein enable disaggregation of memory across a server system and dynamic sharing of remote memory.