Title:
CLUSTERING SYSTEM
Document Type and Number:
WIPO Patent Application WO/2018/011425
Kind Code:
A1
Abstract:
The present invention relates to modular computing hardware and networking software. More specifically, it relates to a modular system in which a motherboard comprises several interface boards, and each interface board comprises multiple modules further comprising processing or storage hardware. Each interface board is networked such that it is capable of communicating directly with any other interface board in the system, which effectively allows any board of the system to communicate with any other board in the system. This provides a system which has large scalability and may use low power components to reduce cooling requirements in, for example, a server system.

Inventors:
BATCHELOR PETER (GB)
Application Number:
PCT/EP2017/067943
Publication Date:
January 18, 2018
Filing Date:
July 14, 2017
Assignee:
NEBRA MICRO LTD (GB)
International Classes:
G06F1/18
Domestic Patent References:
WO2011034900A1 (2011-03-24)
Foreign References:
US20060140179A1 (2006-06-29)
US20150169013A1 (2015-06-18)
US20070133188A1 (2007-06-14)
US20140173157A1 (2014-06-19)
CN201111027Y (2008-09-03)
CN202512473U (2012-10-31)
Attorney, Agent or Firm:
WITHERS & ROGERS LLP (GB)
Claims:
CLAIMS

1. A computing cluster (100), comprising:

a motherboard (200) including a plurality of interface board connectors (205);

a plurality of interface boards (300) arranged to be engaged with the interface board connectors (205), each interface board (300) including a plurality of module connectors (303); and

a plurality of modules (400,500), the plurality of modules (400, 500) being arranged to be engaged with the module connectors (303).

2. A computing cluster (100) according to claim 1, wherein the interface board connectors (205) are self-retaining connectors.

3. A computing cluster (100) according to either of claims 1 or 2, wherein the interface board connectors (205) are SODIMM connectors.

4. A computing cluster (100) according to any of claims 1 to 3, wherein the plurality of interface boards (300) are arranged to communicate with each other via a network pass through.

5. A computing cluster (100) according to any of claims 1 to 4, wherein at least one of the modules (400, 500) is engaged with a module connector (303) such that it is positioned substantially parallel to the interface board (300).

6. A computing cluster (100) according to any one of claims 1 to 5, wherein the motherboard (200) further includes a network controller (206).

7. A computing cluster (100) according to claim 6, wherein the network controller (206) is arranged to receive inputs from the plurality of interface boards (300).

8. A computing cluster (100) according to any one of claims 1 to 7, wherein the interface board connectors (205) comprise pins relating to power, Ethernet, serial data, enabling, power fault and grounding.

9. A computing cluster (100) according to any one of claims 1 to 8, wherein the module connectors (303) comprise pins relating to power, Ethernet, serial data, enabling, powering down, power faults, universal serial bus, serial peripheral interfacing and grounding.

10. A computing cluster (100) according to any one of claims 1 to 9, wherein each of the plurality of interface boards (300) comprises a hot swap controller (308) arranged to control power into the plurality of interface boards (300).

11. A cluster server system (700), comprising:

at least one computing cluster (100), in accordance with any of claims 1 to 8; and

a networking board (600), in data communication with the or each of the computing clusters (100).

12. An interface board (300), arranged to be inserted into and powered by a motherboard (200), comprising:

a plurality of connectors (303), arranged to receive a plurality of modules (400, 500); and

a network controller (306), arranged to control networking between the interface board (300) and the plurality of connectors (303),

wherein the network controller (306) is also arranged to act as a network pass through between one or more other interface boards (300).

13. An interface board (300) according to claim 12, wherein the interface board (300) is arranged to be inserted into and powered by the motherboard (200) using a self-retaining connector.

14. An interface board (300) according to either of claims 12 or 13, wherein the interface board (300) is arranged to be inserted into and powered by the motherboard (200) using a SODIMM connector.

15. An interface board (300) according to any of claims 12 to 14, wherein the plurality of connectors (303) are further arranged to hold one or each of the modules (400, 500) positioned substantially parallel to the interface board (300).

16. An interface board (300) according to any of claims 12 to 15, wherein the network controller (306) is further arranged to send inputs to the motherboard (200).

17. An interface board (300) according to any of claims 12 to 16, wherein the connectors (303) comprise pins relating to power, Ethernet, serial data, enabling, powering down, power faults, universal serial bus, serial peripheral interfacing and grounding.

18. An interface board (300) according to any one of claims 12 to 17, wherein the interface board (300) further comprises a hot swap controller (308) arranged to control power into the interface board (300).

Description:
CLUSTERING SYSTEM

FIELD OF THE INVENTION

The present invention relates to the networking of computer systems and components, in particular the networking of components for use in high density computer networks.

BACKGROUND OF THE INVENTION

The clustering system of the present invention applies generally to computing and should therefore be considered in this context. However, the invention will generally be discussed in relation to server systems, this being one of the most beneficial applications of the invention.

Servers are purpose-built computing devices used for performing particular tasks. They are normally mounted on standardised 19-inch server racks which provide a high density support structure in which to house multiple servers. Conventional server systems are run twenty-four hours a day in order to provide uninterrupted availability of resources. They can be used for several different computing purposes, for example, providing additional computing power over a network, storage, communications, mail, web content, printing and gaming.

The key issue surrounding current server systems is the cost in running server farms on an uninterrupted basis. Owing to the high density of computing power in a standard server farm, high power requirements must be met in order to maintain the system. Further, the high density and continuous usage leads to high temperatures which can damage server components, resulting in server downtime. As a result, high power cooling systems are a necessity within server farms, further increasing power requirements. One of the largest costs involved with maintenance of a server farm is the power required to run and cool the servers. As such, server systems are commonly rated on a performance per watt basis. Other rating systems are also used, such as performance per unit cost, depending on user requirements.

In order to increase the performance per watt value of server systems, blade servers provide stripped down server computers with modular designs arranged to make good use of space. Several servers can be housed within a single blade, which greatly increases the performance per unit volume, and decreases cooling requirements, increasing performance per watt. Due to the standardisation of server sizes, the size of a server is referred to in rack units, U, being 19 inches (480mm) wide and 1.75 inches (44mm) tall. Common server racks have a form-factor of 42U high, which limits the number of servers that can be placed in a single rack. Some high-end blade systems today can achieve a density of 2352 processing cores per 42U rack.

The present invention aims to reduce the density constraints of current systems by providing an ultra-high density modular cluster server system.

SUMMARY OF THE INVENTION

In order to reduce the limitations associated with the prior art, a computing cluster according to a first aspect of the present invention comprises: a motherboard comprising a plurality of interface board connectors; a plurality of interface boards arranged to be engaged with the interface board connectors, each interface board comprising a plurality of compute module connectors; and a plurality of compute modules, the plurality of compute modules being arranged to be engaged with the compute module connectors.

As will be appreciated, the present invention provides several advantages over the prior art. For example, due to the hierarchical structure of the system, the cluster system may use low power components with a relatively small footprint compared to typical server systems. This leads to a much higher density of modules than present systems, greatly increasing the amount of computing power or storage space available per unit area. Further, the use of lower electrical power modules reduces the cooling requirements of the system. As the components are smaller, self-retaining connectors can be used to allow freestanding interface boards, reducing the complexity of the structure and further increasing density.

The interface board connector may be a self-retaining connector. The interface board connector may be a SODIMM connector. The plurality of interface boards may be arranged to communicate with each other via a network pass through. At least one of the modules may be engaged with the module connector such that it is positioned substantially parallel to the interface board. The motherboard may further include a network controller. The network controller may be arranged to receive inputs from the plurality of interface boards. The interface boards may comprise pins relating to power, Ethernet, serial data, enabling, power fault and grounding. The module connectors may comprise pins relating to power, Ethernet, serial data, enabling, powering down, universal serial bus, serial peripheral interfacing and grounding. A cluster system may comprise at least one computing cluster, in accordance with the above; and a networking board, in data communication with the, or each of the, computing cluster(s).

An interface board according to a second aspect of the invention, arranged to be inserted into and powered by a motherboard, comprises: a plurality of connectors arranged to receive a plurality of compute modules; and a network controller arranged to control networking between the interface board and the plurality of connectors, wherein the network controller is also arranged to act as a network pass through between one or more other interface boards.

Due to the network pass through capabilities of the interface boards, any interface board may directly communicate with any other interface board within the server system without external overheads, allowing for much greater scalability beyond a single unit.

The interface board may be arranged to be inserted into and powered by the motherboard using a self-retaining connector. The interface board may be arranged to be inserted into and powered by the motherboard using a SODIMM connector. The plurality of connectors may be further arranged to hold one or each of the modules positioned substantially parallel to the interface board. The network controller may be further arranged to send inputs to the motherboard. The connectors may comprise pins relating to power, Ethernet, serial data, enabling, powering down, power faults, universal serial bus, serial peripheral interfacing and grounding.

BRIEF DESCRIPTION OF THE DRAWINGS

Other advantages and benefits of the present invention will become apparent from a consideration of the following description and accompanying drawings, in which:

FIGURE 1 shows an isometric view of a cluster, in accordance with the present invention;

FIGURE 2 shows a circuit diagram of a motherboard according to the present invention;

FIGURE 3 shows a circuit diagram of a daughterboard according to the present invention;

FIGURE 4 shows a circuit diagram of a compute module according to the present invention;

FIGURE 5 shows a circuit diagram of a storage module according to the present invention;

FIGURE 6 shows a circuit diagram of a networking board according to the present invention;

FIGURE 7 shows a schematic of the overall network structure according to the present invention; and

FIGURE 8 shows a schematic of a cluster structure according to the present invention.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

A clustering system in accordance with an exemplary embodiment of the present invention is shown in Figures 1 to 8.

Figure 1 shows a cluster 100 comprising a motherboard 200, designed to act as the structural base, as well as the networking and power core of each cluster 100. The motherboard includes four 80-pin slot connectors 205 and a modular connector 201 to allow for external Ethernet connections to the cluster 100. The cluster 100 further comprises four interface boards (also referred to as daughterboards) 300 individually attached to the motherboard via the 80-pin slot connectors 205. Each daughterboard 300 includes four M.2 slot connectors 303 and is connected to four modules 400, 500, via the M.2 slot connectors 303.

The purpose of the motherboard 200 is to distribute power and to control data and communications channels within the cluster 100. Referring now to Figure 2, the motherboard 200 has several communication interfaces as well as other components.

The modular connector 201 has an enhanced bandwidth capable of supporting data and control information for the motherboard 200 on a single channel. In this embodiment, the modular connector 201 is an RJ45 connector, however, it may, of course, be any type of modular connector.

In an alternative embodiment (not shown), there are two modular connectors 201 which provide two data channels to the motherboard 200, one channel being used as a command channel and the other being for data communication external to the local network.

In one embodiment, the motherboard includes Ethernet hubs 202 which control the data channel and allow access to the daughterboards 300. The Ethernet hubs 202 operate at 10/100/1000BASE-T speeds, allowing for a maximum data transfer rate of 1Gb per second per cluster 100. A microcontroller 206, or network controller 206, controls the Ethernet hubs 202, although the input for the microcontroller 206 comes from the daughterboards 300 owing to limitations in the Ethernet network. In an alternate embodiment, the microcontroller 206 on the motherboard 200 is Ethernet enabled. In this embodiment, the microcontroller 206 may receive inputs directly, allowing a cluster 100 to be monitored and controlled even if the interface boards 300 are in an off state. The motherboard 200 also includes power connectors 203 to supply a 12V power input to the motherboard 200. A power regulator 204 takes the 12V input from the power connectors 203 and provides a regulated 3.3V output to the electronics of the motherboard 200.

The 80-pin slot connectors 205 provide the communication interface between the motherboard 200 and the daughterboards 300. Each of the 80-pin slot connectors 205 has the following connections: four 12V power pins; eight Ethernet pins forming two channels; three I²C (or other serial bus) pins; an enable pin for each daughterboard; a power fault pin; and seven ground pins. The remaining pins of the connectors 205 are unallocated, available for use when configuring or customising the system.
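For illustration, the slot connector allocation described above can be captured as a simple tally; the following Python sketch is not part of the specification, and the assumption of a single enable pin per connector (one connector per daughterboard) is noted in the comments.

```python
# Illustrative tally of the 80-pin slot connector allocation described above.
# Assumption: one enable pin per connector, since each connector hosts one
# daughterboard; the text is not explicit on this point.

SLOT_PIN_ALLOCATION = {
    "12V power": 4,
    "Ethernet (two channels)": 8,
    "I2C / serial bus": 3,
    "enable": 1,
    "power fault": 1,
    "ground": 7,
}

TOTAL_PINS = 80

allocated = sum(SLOT_PIN_ALLOCATION.values())
unallocated = TOTAL_PINS - allocated

# Roughly 24 pins carry an allocated function, leaving around 56 pins free
# for configuring or customising the system.
print(f"allocated: {allocated} pins, unallocated: {unallocated} pins")
```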

Referring now to Figure 3, the daughterboard 300 is shown, which can be connected to the 80-pin slot connector 205 on the motherboard 200 using an 80-pin edge connector 301. The 80-pin edge connector 301 has corresponding specifications to the 80-pin slot connector 205 described above. The daughterboard 300 has further communication interfaces in four M.2 slot connectors 303, two on each face of the daughterboard 300, with B keying, and a 6-pin programming header 307. The M.2 slot connectors 303 are arranged with the slots at right angles such that the modules 400, 500, when connected, sit substantially parallel to the daughterboard 300. The B keying of the M.2 slot connectors 303 specifies where the orientation notch is placed.

Although M.2 slot connectors 303 have standardised pin formats, in the present embodiment they are used in a bespoke configuration, which is: two 12V power pins; four Ethernet pins for one channel; three I²C (or other serial bus) pins; an enable pin; a power down pin; a power fault pin; two universal serial bus (USB) pins; three serial peripheral interface bus (SPI) pins; and six ground pins. This pin configuration allows for the possibility of multiple alternate communication lines being utilised in alternate embodiments beyond those described here. It should be noted, however, that alternate keying arrangements may be used.

In one embodiment, the daughterboard 300 also comprises a 5-port Ethernet hub 302 for distributing the data channel received through the 80-pin edge connector 301 to each of the four M.2 slot connectors 303. In an alternate embodiment, the Ethernet hub 202 of the motherboard 200 communicates directly with the modules 400, 500, removing the need for an Ethernet hub 302 on the daughterboard 300.
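The bespoke M.2 allocation described above can be tallied in the same way; the grouping below is purely illustrative and does not assign physical pin positions.

```python
# Illustrative grouping of the bespoke M.2 slot connector allocation described
# above. This is only a count of pins per function in this embodiment, not a
# physical pin map.

M2_PIN_ALLOCATION = {
    "12V power": 2,
    "Ethernet (one channel)": 4,
    "I2C / serial bus": 3,
    "enable": 1,
    "power down": 1,
    "power fault": 1,
    "USB": 2,
    "SPI": 3,
    "ground": 6,
}

# 23 pins carry an allocated function; remaining M.2 pins are available for
# alternate communication lines in other embodiments.
print(sum(M2_PIN_ALLOCATION.values()), "pins carry an allocated function")
```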

It is to be noted that, in alternate embodiments, the 80-pin connectors 205, 301 may be replaced with other types of connector. For example, in one embodiment, the connector 205, 301 may be a small outline dual in-line memory module (SODIMM) connector, as this allows for orientation and retention control.

A SODIMM connector is one of several connectors with a self-retaining locking mechanism which locks a connected board in place. Other examples of self-retaining connectors are: DIMM connectors, screw connectors and multiple in-line pin connectors. Typically, computing clusters of this type would use external retention mechanisms, as the systems are generally too large and the components too heavy to rely on self-retaining mechanisms. However, the present invention uses smaller boards and, as such, self-retaining connectors may be used. The self-retaining nature of the connectors allows the interface boards to be freestanding, removing the need for any external retention mechanism. Therefore, the boards can be placed closer together, increasing cluster, and therefore computing, density. The freestanding nature of the interface boards also reduces the complexity of the structure, and facilitates the use of the M.2 connectors for parallel connection of the modules to the interface boards.

Preferably, of the self-retaining connectors available, the present invention uses a SODIMM connection. This is due to the size of the connector, the cost, and the nature of the self-retaining mechanism which, for SODIMM connectors, is a pair of simple-to-use lugs. The SODIMM connection therefore allows for the simplest means of adding/removing interface boards at low cost. The small size of SODIMM connectors makes them especially unsuited to large computing cluster systems; however, the present system relies on much smaller components, atypical of a computing cluster, which allows for the use of SODIMM connectors.

An Ethernet to SPI controller 305 uses the control channel from the motherboard 200 and converts it to the SPI communication standard to be used by a microcontroller 306 (or network controller 306). The microcontroller 306 then routes the control data to each M.2 slot connector 303 via the I²C communication lines. The microcontroller 306 is also capable of communicating with all other microcontrollers 306, 206 in the cluster 100 via I²C communication lines. As such, the microcontroller 306 of the daughterboard 300 can communicate external instructions to the microcontroller 206 of the motherboard 200, which, in one embodiment, does not have a direct external connection, and can transfer data to or from other daughterboards 300 in the cluster 100 or within the wider system. In this manner, the microcontroller 306 of the daughterboard 300 may act as a network pass through (or network interface), shunting data through to other daughterboards 300 or motherboards 200. With the microcontroller 306 acting as a network pass through, each daughterboard 300 is able to directly network with any other daughterboard 300 or motherboard 200, allowing for a simpler network structure.

The pass through is enabled using a direct Ethernet connection between each daughterboard 300. When a daughterboard 300 receives data through the pass through, it simply redirects it on to the intended destination. Therefore, a first module 400, 500 may directly communicate with a second module 400, 500 utilising the pass through connection of a daughterboard 300. In alternative embodiments, the pass through function may be enabled by Ethernet connected microcontrollers 206 in the motherboard 200, which would perform the same function as the Ethernet-enabled daughterboard microcontrollers 306 in acting as a point which facilitates inter-component communication at high speed. Furthermore, an external system, into which the motherboard 200 is inserted, may also perform the pass through function.
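As a rough illustration of this pass-through behaviour, the following Python sketch models daughterboards as nodes that either deliver a frame to a local module or redirect it to the destination board; the board and module identifiers and the frame format are invented for the example, and the real system forwards Ethernet frames in hardware and firmware rather than in software of this kind.

```python
# Minimal sketch of the pass-through behaviour described above. Identifiers
# and frame structure are invented for illustration only.

class Daughterboard:
    def __init__(self, board_id, module_ids):
        self.board_id = board_id
        self.module_ids = set(module_ids)

    def handle(self, frame, boards):
        """Deliver a frame to a local module, or pass it through to the owning board."""
        dest_board, dest_module = frame["dest"]
        if dest_board == self.board_id:
            print(f"board {self.board_id}: delivering to module {dest_module}")
        else:
            # Pass through: redirect the frame directly to the destination board.
            print(f"board {self.board_id}: passing through to board {dest_board}")
            boards[dest_board].handle(frame, boards)

# Four daughterboards, each carrying four modules.
boards = {i: Daughterboard(i, range(4)) for i in range(4)}

# A module on board 0 sends to a module on board 3: board 0 redirects the
# frame straight to board 3, with no external switch involved.
boards[0].handle({"dest": (3, 1), "payload": b"data"}, boards)
```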

The daughterboard 300 also includes a 12V to 3.3V power regulator 304 for the daughterboard's electronics. The daughterboard 300 also comprises hot swap controllers 308. These sit on the main power lines into each daughterboard 300 and act as a switch allowing power to be enabled and disabled. The controller 308 also provides over-voltage and over-current protection, as well as delayed start-up functionality. The hot swap controller 308 allows each daughterboard 300 to be fully turned off; this enables power saving when a daughterboard 300 is not required, and also allows the daughterboard 300 to be replaced without powering down the whole system. The hot swap controller 308 receives control inputs from the microcontroller 206 of the motherboard 200.
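The hot swap controller's role can be summarised as a small decision rule, sketched below in Python. The voltage and current limits and the start-up delay are invented placeholders, not values from the specification, and the real device implements this logic in hardware.

```python
# Simplified model of the hot swap controller's decision logic. Thresholds and
# delay are placeholders for illustration only.

import time

OVER_VOLTAGE_LIMIT_V = 13.0   # placeholder over-voltage threshold
OVER_CURRENT_LIMIT_A = 2.0    # placeholder over-current threshold
START_UP_DELAY_S = 0.1        # placeholder delayed start-up

def power_allowed(enable, bus_voltage_v, load_current_a):
    """Return True if the daughterboard power rail should be switched on."""
    if not enable:                              # enable input from the motherboard microcontroller
        return False
    if bus_voltage_v > OVER_VOLTAGE_LIMIT_V:    # over-voltage protection
        return False
    if load_current_a > OVER_CURRENT_LIMIT_A:   # over-current protection
        return False
    return True

def switch_on(enable, bus_voltage_v, load_current_a):
    if power_allowed(enable, bus_voltage_v, load_current_a):
        time.sleep(START_UP_DELAY_S)            # delayed start-up
        return "powered"
    return "isolated"

print(switch_on(True, 12.0, 1.3))   # powered
print(switch_on(True, 12.0, 2.5))   # isolated (over-current)
```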

Referring now to Figures 4a, 4b, 5a and 5b, modules 400, 500 comprise M.2 edge connectors 403, 503 arranged to connect with the M.2 slot connectors 303 of the daughterboard 300. A module 400, 500 may be used as a compute module or as a storage module. Figure 4 shows a compute module 400, and Figure 5 shows a storage module 500.

A compute module 400 allows the cluster 100 to operate as an ultra-high density compute platform. As mentioned above, this could be for server hardware, or it could be, for example, for desktop rendering. The specific processor used will change depending on the system requirements and, for example, may be a mobile phone/tablet CPU due to the high performance to cost ratio, a high spec GPU or an FPGA. The compute module 400 has a single communication interface, being the M.2 edge connector 403, the pin out for which corresponds to the pin out described for the daughterboard 300 above. The compute module 400 also comprises all the necessary electronics required for a processor 406 to function: a power regulator 404; RAM 401 for the processor; communications conversion 402; and storage 405 for firmware storage. In one embodiment, the communications conversion 402 may be an Ethernet to USB hub 402. An Ethernet to USB hub 402 may be used in conjunction with a quad core processor to facilitate communication. However, other processors do not require this form of communications conversion.

The storage module 500 acts as an ultra-high density storage array where, for example, each module may comprise one or more solid state drives (SSDs) associated with corresponding NAND Flash integrated chips 505. The storage module 500 has an M.2 edge connector 503 whose pin out corresponds to the pin out described for the daughterboard 300 above, and also includes all the necessary electronics for a functioning storage module: a power regulator 504; a CPU 501 to interface with the rest of the cluster 100; and one or more NAND Flash integrated chips 505 for storing data. The SSD chips associated with the NAND Flash integrated chips 505 may be up to 2TB in size with current technology. The CPU 501 need only be a basic storage controller or a low spec processor used to direct data toward the memory of the module 500.

Figure 6 shows a networking board 600 according to the present invention. The networking board 600 comprises modular connectors 601 for receiving data and control signals and cluster connectors 602 for transmitting the data and control signals to the clusters 100. The networking board 600 further comprises communication hubs 604 for controlling the flow of the signals.

The networking board 600 also comprises inter-board connectors 605 arranged to facilitate communication with other networking boards 600. In this manner, a number of identical networking boards 600 may be connected together to control networking of systems of any size. The networking boards 600 are arranged to communicate directly with any cluster 100 attached to the board 600, through the cluster connectors 602, and also with any other cluster 100 attached to any other board 600, through the inter-board connectors 605. Each communication hub 604 comprises four ports to facilitate communication within the networking board 600. One port supports a data feed to/from a cluster 100 via a cluster connector 602. One port supports an external communication feed via the modular connectors 601. Two ports support communication to/from other networking boards 600 via the inter-board connectors 605. In a preferred embodiment, the networking board 600 is directly connected to two clusters 100, and therefore requires two communication hubs 604 and two cluster connectors 602. However, in alternate embodiments, the networking board 600 may be arranged to directly connect to any number of clusters 100, with differing numbers of communication hubs 604 and cluster connectors 602 as required.
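Working from this port allocation (one cluster port and one external port per hub, two inter-board ports, two hubs per board in the preferred embodiment), the sketch below estimates how many identical networking boards a given number of clusters needs; the assumption that boards are simply chained through their inter-board connectors is ours, and other topologies are possible.

```python
# Back-of-the-envelope scaling of networking boards, based on the port
# allocation described above. Assumes each board directly serves two clusters
# (the preferred embodiment) and boards are chained via their two inter-board
# ports; this layout is an assumption, not taken from the specification.

import math

CLUSTERS_PER_BOARD = 2   # two communication hubs / cluster connectors per board

def networking_boards_needed(num_clusters):
    return math.ceil(num_clusters / CLUSTERS_PER_BOARD)

for clusters in (2, 10, 32):
    print(clusters, "clusters ->", networking_boards_needed(clusters), "networking boards")
# 2 clusters -> 1 board, 10 clusters -> 5 boards, 32 clusters -> 16 boards
```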

In a preferred embodiment, the data is transmitted via Ethernet using Ethernet hubs 604 through RJ45 connectors 601, 602. In one embodiment, some of the cluster connectors 602 are dual RJ45 connectors, arranged to transmit the data and control signals over a single channel.

The networking board 600 further comprises a power connector 606, a power regulator 607 and a hot swap controller 608. The hot swap controller 608 allows clusters 100 to be powered on/off and removed without affecting the rest of the system.

Figure 7 shows an example of a server system 700 comprising ten clusters 100 of the type described above. The server system 700 is designed to use the cluster 100 in a rack mounted format. Several clusters 100 are provided with additional infrastructure to operate in a server environment.

A power supply unit (PSU) 701 receives power from an external power input 702 and provides power directly to each cluster 100 through the power connectors 203 of the motherboards. The PSU 701 also supplies power directly to a networking board 600 (or a series of interconnected networking boards 600) and a fan and system management board 705.

The networking board(s) 600 directly interface(s) with each cluster 100 and the fan and system management board 705.

The fan and system management board 705 sends and receives data through the networking board(s) 600 in order to control the temperature of the server system 700. The fan and system management board 705 also provides controls for system power on/off as well as fan controlled temperature regulation. The server system further comprises a chassis to house the clusters.

The chassis is designed to fit in a standard rack mount cabinet. Therefore, each server system 700 has a width of 450mm, which includes mounting rails for mounting the system 700 within a cabinet. Each cluster is 80mm in height, and the server system can be standardised to a height of 2U, where U is a unit of height in a rack mount server (about 45mm).

As a result of the high density, modular nature and increased networking capabilities of the present invention, conventionally high power, high cost components can be replaced with low power, low cost components arranged to network with each other directly.

The depth of the system 700 is dictated by the number of clusters 100. Within a single server system 700, four clusters 100 can fit side by side along the width of the cabinet. As such, server systems 700 are designed in groups of four clusters 100 up to a maximum of 32 clusters 100, comprising a total of 512 modules 400, 500, in a single 2U chassis. Therefore, a 42U rack may contain as many as 672 clusters 100, comprising 10752 modules 400, 500.
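The density figures quoted above follow from simple arithmetic, reproduced below as a sanity check under the stated assumptions (16 modules per cluster, a maximum of 32 clusters per 2U chassis, and a 42U rack).

```python
# Arithmetic behind the density figures quoted above.

MODULES_PER_DAUGHTERBOARD = 4
DAUGHTERBOARDS_PER_CLUSTER = 4
MODULES_PER_CLUSTER = MODULES_PER_DAUGHTERBOARD * DAUGHTERBOARDS_PER_CLUSTER  # 16

CLUSTERS_PER_CHASSIS = 32   # maximum for a single 2U chassis
CHASSIS_HEIGHT_U = 2
RACK_HEIGHT_U = 42

modules_per_chassis = CLUSTERS_PER_CHASSIS * MODULES_PER_CLUSTER
chassis_per_rack = RACK_HEIGHT_U // CHASSIS_HEIGHT_U
clusters_per_rack = chassis_per_rack * CLUSTERS_PER_CHASSIS
modules_per_rack = clusters_per_rack * MODULES_PER_CLUSTER

print(modules_per_chassis)   # 512 modules in a single 2U chassis
print(clusters_per_rack)     # 672 clusters in a 42U rack
print(modules_per_rack)      # 10752 modules in a 42U rack
```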

Figure 8 shows a schematic of a cluster 100. The power connectors 203 of the motherboard 200 provide power to the components of the motherboard 200, to each daughterboard 300 and to each module 400, 500. The 12V input is regulated to a 3.3V output before being supplied to the board components.

Data and control signals are received at the modular connector 201 of the motherboard 200. In a preferred embodiment, these signals are transmitted through 10G Ethernet to the Ethernet hub 202. The Ethernet hub 202 sends control signals and data signals to the microcontroller 206. In one embodiment, the data signals are transmitted via an Ethernet to SPI controller. In this preferred embodiment, the control signals may be sent directly to the microcontroller 206 as it is Ethernet enabled.

Control signals are sent to the hot swap controller 308 of each daughterboard 300, from the microcontroller 206, to control the on/off state. Ethernet communications are sent directly to the daughterboards 300 and modules 400, 500 from the Ethernet hub 202 on the motherboard 200. As with the motherboard 200, the data and control signals are sent to the microcontroller 306, which further routes control data to the modules 400, 500 via I²C communication lines. Data signals are sent to the modules 400, 500 through 1G Ethernet.

In one embodiment of the present invention, a daughterboard 300 may comprise four quad core ARM® CPUs, each with 4GB of RAM, one CPU per compute module 400. Each cluster 100 of four daughterboards 300 requires roughly 16W of power. Taking all the periphery components into account, a system 700 comprising 32 clusters 100 will have a theoretical power envelope of about 600W. Therefore, a rack of 10752 compute modules 400 will have a theoretical total power envelope of 12.6kW, providing 43TB of RAM.
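The power and memory figures for this embodiment can be reproduced with the same kind of arithmetic; as stated above, the roughly 600W per-system envelope includes periphery components beyond the 32 × 16W cluster total.

```python
# Arithmetic behind the power and RAM figures for the embodiment described
# above (4GB RAM per module, ~16W per cluster, ~600W per 32-cluster system
# including periphery components).

WATTS_PER_CLUSTER = 16
CLUSTERS_PER_SYSTEM = 32
SYSTEM_POWER_ENVELOPE_W = 600      # includes periphery components
SYSTEMS_PER_RACK = 42 // 2         # 2U systems in a 42U rack
MODULES_PER_RACK = 10752
RAM_PER_MODULE_GB = 4

cluster_power_only = WATTS_PER_CLUSTER * CLUSTERS_PER_SYSTEM       # 512W before periphery
rack_power_kw = SYSTEM_POWER_ENVELOPE_W * SYSTEMS_PER_RACK / 1000  # 12.6kW
rack_ram_tb = MODULES_PER_RACK * RAM_PER_MODULE_GB / 1000          # ~43TB

print(cluster_power_only, "W of module power per system (before periphery)")
print(rack_power_kw, "kW theoretical rack power envelope")
print(round(rack_ram_tb), "TB of RAM per rack")
```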

Due to the modular nature of the modules 400, 500, and the accompanying networking structure, each cluster 100 may contain any combination of compute modules 400 and storage modules 500 according to a user's requirements. For example, a single daughterboard 300 might carry two compute modules 400 and two storage modules 500, four compute modules 400, or four storage modules 500.

It is to be understood that, although various specific connectors with particular pin formations are discussed in describing embodiments of the present invention, other connectors and pin formations could alternatively be used and, as such, the present invention should not be considered as limited to these specific connectors and pin formations. For example, although the present invention is described, in one embodiment, as using Ethernet to receive control and data channels, it is to be understood that an alternate network connection may be used.

Further, although the present invention has been described in relation to one particular embodiment in which one motherboard may comprise four interface boards, and one interface board may comprise four modules, it is to be understood that this is one example. In another embodiment, a daughterboard may comprise six storage modules. In a further embodiment, a motherboard may comprise five interface boards. The present invention should not be considered as limited to the specific structure described above.