Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
APPARATUS AND METHOD FOR DISTRIBUTING SPATIAL DATA TO A PLURALITY OF DATABASE NODES
Document Type and Number:
WIPO Patent Application WO/2022/111791
Kind Code:
A1
Abstract:
An apparatus (101) for distributing spatial data to a plurality of database nodes (107, 111a-c) is disclosed. Each data element is associated with a position in a m-dimensional space, with m ≤ 2, wherein each database node stores a subset of the plurality of data elements. The space is partitioned into a plurality of sub-spaces, wherein each sub-space comprises a central portion and a boundary portion, and the data elements associated with a position in a first sub-space are stored in a first database node. For one or more further sub-spaces neighbouring the first sub-space, the data elements associated with a position in the one or more further sub-spaces are stored in one or more further database nodes, and the data elements associated with a respective position in the boundary portion of the first sub-space are stored in at least one of the one or more further database nodes. Advantageously, this allows reducing the inter-node communication necessary for handling a spatial data query.

Inventors:
NOZDRIN ALEXANDER (DE)
ALVAREZ VICTOR (DE)
Application Number:
PCT/EP2020/083171
Publication Date:
June 02, 2022
Filing Date:
November 24, 2020
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
HUAWEI TECH CO LTD (CN)
NOZDRIN ALEXANDER (DE)
International Classes:
G06F16/27
Other References:
"Related work and GEOQL ED - Martin Atzmueller; Alvin Chin; Frederik Janssen; Immanuel Schweizer; Christoph Trattner", 9 September 2013, ICIAP: INTERNATIONAL CONFERENCE ON IMAGE ANALYSIS AND PROCESSING, 17TH INTERNATIONAL CONFERENCE, NAPLES, ITALY, SEPTEMBER 9-13, 2013. PROCEEDINGS; [LECTURE NOTES IN COMPUTER SCIENCE; LECT.NOTES COMPUTER], SPRINGER, BERLIN, HEIDELBERG, PAGE(S) 9 - 79, ISBN: 978-3-642-17318-9, XP047291495
Attorney, Agent or Firm:
KREUZ, Georg (DE)
Download PDF:
Claims:
CLAIMS

1. An apparatus (101) for distributing a plurality of data elements to a plurality of database nodes (107, 111a-c) of a distributed database system (100), wherein each data element is associated with a position (210) in a m-dimensional space (200), where m > 2, wherein each database node (107, 111a-c) is configured to store at least a subset of the plurality of data elements, wherein the apparatus (101) is configured to: partition the m-dimensional space (200) into a plurality of sub-spaces (201-209), wherein the position of each data element is associated with one of the plurality of subspaces (201-209) and each sub-space (201-209) comprises a central portion (205a) and a boundary portion (205b); store the data elements of the plurality of data elements associated with a position (210) in a first sub-space (205) of the plurality of sub-spaces (201-209) in a first database node of the plurality of database nodes (107, 111 a-c) ; for one or more further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) neighbouring the first sub-space (205), store the data elements of the plurality of data elements associated with a position in the one or more further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) of the plurality of sub-spaces (201-209) in one or more further database nodes of the plurality of database nodes (107, 111 a-c) ; and store the data elements associated with a respective position (210) in the boundary portion (205b) of the first sub-space (205) of the plurality of sub-spaces (201- 209) in at least one of the one or more further database nodes of the plurality of database nodes (107, 111a-c).

2. The apparatus (101) of claim 1 , wherein the apparatus (101) is further configured to store one or more of the data elements associated with a respective position in the boundary portion of the further sub-spaces neighbouring the first sub-space in the first database node of the plurality of database nodes (107, 111a-c).

3. The apparatus (101) of claim 1 or 2, wherein, for every data element associated with a respective position (210) in the boundary portion (205b) of the first sub-space (205) of the plurality of sub-spaces (201-209), the apparatus (101) is configured to: determine a coverage region (210a) centred on the position (210) associated with the data element; determine those of the one or more further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) neighbouring the first sub-space (205) that are at least partially covered by the coverage region (210) for determining one or more at least partially covered subspaces (201 , 202, 204); and store the data elements in those database nodes of the plurality of database nodes (107, 111 a-c) that store the data elements of the plurality of data elements associated with a position in the one or more at least partially covered sub-spaces (201 , 202, 204) of the plurality of sub-spaces.

4. The apparatus (101) of claim 3, wherein a size of the coverage region (210a) is based on a coverage region size parameter (210b), and the apparatus (101) is further configured to adjust the coverage region size parameter (210b).

5. The apparatus (101) of claim 4, wherein the coverage region (210a) is an n-sphere (210a), where n=m-1 , and the coverage region size parameter (210b) is a radius (210b) of the n-sphere.

6. The apparatus (101) of any one of the preceding claims, wherein a size of the boundary portion (205b) of each sub-space (201-209) is based on a boundary portion size parameter, and the apparatus (101) is further configured to adjust the boundary portion size parameter.

7. The apparatus (101) of any one of the preceding claims, wherein the plurality of sub-spaces (201-209) have the same size and shape.

8. The apparatus (101) of any one of the preceding claims, wherein the m- dimensional space (200) is a two-dimensional space (200) and the first sub-space (205) has a rectangular shape.

9. The apparatus (101) of claim 8, wherein the central portion (205a) of the first subspace (205) has a rectangular shape.

10. A distributed database system (100) comprising a plurality of database nodes (107, 111a-c) and an apparatus (101) according to any one of the preceding claims.

11. The distributed database system (100) of claim 10, wherein the apparatus (101) is configured to provide one database (107) of the plurality of database nodes (107, 111a-c).

12. A method (300) for distributing a plurality of data elements to a plurality of database nodes (107, 111a-c) of a distributed database system (100), wherein each data element is associated with a position in a m-dimensional space (200), where m > 2, wherein each database node (107, 111a-c) is configured to store at least a subset of the plurality of data elements, the method (300) comprising: partitioning (301) the m-dimensional space (200) into a plurality of sub-spaces (201-209), wherein the position (210) of each data element is associated with one of the plurality of sub-spaces (201-209) and each sub-space (201-209) comprises a central portion (205a) and a boundary portion (205b); storing (303) the data elements of the plurality of data elements associated with a position (210) in a first sub-space (205) of the plurality of sub-spaces in a first database node of the plurality of database nodes (107, 111 a-c) ; for one or more further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) neighbouring the first sub-space (205), storing (305) the data elements of the plurality of data elements associated with a position in the one or more further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) of the plurality of sub-spaces (201-209) in one or more further database nodes of the plurality of database nodes (107, 111 a-c); and storing (307) the data elements associated with a respective position (210) in the boundary portion (205b) of the first sub-space (205) of the plurality of sub-spaces (201- 209) in at least one of the one or more further database nodes of the plurality of database nodes (107, 111a-c).

13. The method (300) of claim 12, wherein the method (300) further comprises storing one or more of the data elements associated with a respective position in the boundary portion of the further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) neighbouring the first sub-space (205) in the first database node of the plurality of database nodes (107, 111a-c).

14. The method (300) of claim 12 or 13, wherein, for every data element associated with a respective position (210) in the boundary portion (205b) of the first sub-space (205) of the plurality of sub-spaces (201-209), the method (300) further comprises: determining a coverage region (210a) centred on the position (210) associated with the data element; determining those of the one or more further sub-spaces (201 , 202, 203, 204, 206, 207, 208, 209) neighbouring the first sub-space (205) that are at least partially covered by the coverage region (210) for determining one or more at least partially covered subspaces (201 , 202, 204); and storing the data elements in those database nodes of the plurality of database nodes (107, 111 a-c) that store the data elements of the plurality of data elements associated with a position in the one or more at least partially covered sub-spaces (201 , 202, 204) of the plurality of sub-spaces.

15. A computer program product comprising a non-transitory computer-readable storage medium carrying program code which causes a computer or a processor to perform the method (300) of any one of claims 12 to 14 when the program code is executed by the computer or the processor.

Description:
APPARATUS AND METHOD FOR DISTRIBUTING SPATIAL DATA TO A PLURALITY OF DATABASE NODES

TECHNICAL FIELD

The present disclosure relates to distributed database systems. More specifically, the present disclosure relates to an apparatus and method for distributing spatial data to a plurality of database nodes of a distributed database system. BACKGROUND

Distributed database systems (also referred to as distributed database management systems (DBMS)) allow storing and accessing data at a plurality of different database nodes. Distributed database systems can be coordinated (with dedicated coordination nodes) or uncoordinated (all database nodes are the same) depending on the way the metadata and the cluster state is maintained. In any case, a query is usually coordinated, i.e. orchestrated by a certain node (assigned either statically or dynamically). Both DBMS types usually distribute the whole dataset among the set of database nodes, where a single database node may be an instance of the distributed DBMS software running on a single (physical or virtual) computer node. Each database node stores a part of the whole dataset and handles client requests to update and retrieve data.

The notion of data partitioning is relevant in distributed DBMSs, since the whole data set is spread across multiple nodes. Data partitioning is a method that determines how partitions are spread across the data nodes in the distributed DBMS. Each partition may be spread over multiple nodes, and users at the nodes can perform local transactions on the partition. Locality significantly improves query performance. The partitioning is usually defined per table. The intention is usually to distribute the whole table data set among data nodes in the way to provide uniform load among all data nodes.

Distributed DBMSs may be used for storing spatial data, e.g. data associated with objects (e.g. locations, streets, points of interest) in 2D/3D space and, as such, has a specific set of (well-established) properties and operations. Spatial data usually has a write- once/read-intensive workload profile (used heavily in analytical workloads). The distribution of spatial data over the database nodes of a distributed database system has its challenges due to the following characteristics of spatial data. Spatial data usually has several dimensions (e.g. 2D, 3D, or even higher dimensions) and often has a highly non- uniform density distribution, e.g. there are very "hot", i.e. dense data regions (such as in city centers) and very "cold", i.e. "sparse" data areas (such as in deserts, city outskirts). Spatial data may have non-uniform access patterns in that areas of comparable density have different access ratios (which may further depend on the time). For instance, business areas may be crowded during rush hours and "cold" during off-hours. Newly constructed areas are usually "cold" all day long.

SUMMARY

It is an objective of the present disclosure to provide improved devices, systems and methods for distributing spatial data over a plurality of database nodes of a distributed database system.

The foregoing and other objectives are achieved by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.

Generally, embodiments are based on a combination of replication and distribution population strategies for spatial data elements. Embodiments allow replicating spatial data elements in a given neighbourhood of boundaries of sub-spaces of a space, while data elements not contained in any neighbourhood are distributed. Embodiments are suitable for any partitioning scheme, provided that no single database node has data points spread across the whole space covered by the spatial data elements.

More specifically, according to a first aspect, an apparatus for distributing a plurality of spatial data elements to a plurality of database nodes of a distributed database system is provided. Each data element is associated with a position in a m-dimensional space with m > 2, i.e. having at least two dimensions. For a two-dimensional space, each data element may be associated with its coordinates defined by a Global Navigation Satellite System (GNSS). Each database node is configured to store at least a subset of the plurality of data elements.

The apparatus is configured to partition the m-dimensional space into a plurality of m- dimensional sub-spaces, wherein the position of each data element is associated with one of the plurality of m-dimensional sub-spaces and wherein each m-dimensional sub-space comprises an inner central portion and an outer boundary portion. Moreover, the apparatus is configured to store the data elements of the plurality of data elements associated with a respective position in a first sub-space of the plurality of sub-spaces in a first database node of the plurality of database nodes. In other words, all data elements located in the first sub-space are stored in the first database node.

For one or more further sub-spaces neighbouring the first sub-space, i.e. having a common border with the first sub-space, the apparatus is further configured to store the data elements of the plurality of data elements associated with a respective position in the one or more further sub-spaces of the plurality of sub-spaces in one or more further database nodes of the plurality of database nodes. In other words, the data elements located in at least one neighbouring subspace are stored in a different database node.

Moreover, the apparatus is configured to store the data elements associated with a respective position in the outer boundary portion of the first sub-space of the plurality of sub-spaces in at least one of the one or more further database nodes of the plurality of database nodes. In other words, data elements in the boundary region are stored twice, i.e. duplicated in two database nodes. Advantageously, this allows reducing the inter-node communication necessary for handling a spatial data query.

In a further possible implementation form, the apparatus is further configured to store one or more of the data elements associated with a respective position in the outer boundary portion of the further sub-spaces neighbouring the first sub-space in the first database node of the plurality of database nodes.

In a further possible implementation form, for every data element associated with a respective position in the outer boundary portion of the first sub-space of the plurality of sub-spaces, the apparatus is configured to:

(i) determine a coverage region centred on the position associated with the data element;

(ii) determine those of the one or more further sub-spaces neighbouring the first subspace that are at least partially covered by the coverage region for determining one or more at least partially covered sub-spaces; and

(iii) store the data element in those database nodes of the plurality of database nodes that store the data elements of the plurality of data elements associated with a respective position in the one or more at least partially covered sub-spaces of the plurality of subspaces. In a further possible implementation form, a size of the coverage region is defined by a coverage region size parameter, wherein the apparatus is configured to adjust the coverage region size parameter. Advantageously, adjusting the coverage region size parameter allows adjusting the amount of data stored in at least two database nodes.

In a further possible implementation form, the coverage region is an n-sphere with n=m-1 , wherein the coverage region size parameter is a radius of the n-sphere. Thus, for the two- dimensional case the coverage region may be a circle.

In a further possible implementation form, a size of the outer boundary portion of each m- dimensional sub-space is defined by a boundary portion size parameter, wherein the apparatus is configured to adjust the boundary portion size parameter. Advantageously, adjusting the boundary portion size parameter allows adjusting the amount of data stored in at least two database nodes.

In a further possible implementation form, the plurality of m-dimensional sub-spaces have the same size and shape.

In a further possible implementation form, the m-dimensional space is a two-dimensional space, wherein the first sub-space has a rectangular, in particular quadratic shape.

In a further possible implementation form, the inner central portion of the first sub-space has a rectangular, in particular quadratic shape.

According to a second aspect, a distributed database system is provided comprising a plurality of database nodes and an apparatus according to the first aspect, wherein each database node is configured to store at least a subset of the plurality of data elements.

In a further possible implementation form of the second aspect, the apparatus is configured to provide one of the plurality of database nodes. In other words, the apparatus itself may operate on the plurality of database nodes.

According to a third aspect, a method for distributing a plurality of spatial data elements to a plurality of database nodes of a distributed database system is provided. Each data element is associated with a position in a m-dimensional space with m > 2, wherein each database node is configured to store at least a subset of the plurality of data elements.

The method comprises the steps of: partitioning the m-dimensional space into a plurality of m-dimensional sub-spaces, wherein the position of each data element is associated with one of the plurality of m- dimensional sub-spaces and wherein each m-dimensional sub-space comprises an inner central portion and an outer boundary portion; storing the data elements of the plurality of data elements associated with a respective position in a first sub-space of the plurality of sub-spaces in a first database node of the plurality of database nodes; for one or more further sub-spaces neighbouring the first sub-space, storing the data elements of the plurality of data elements associated with a respective position in the one or more further sub-spaces of the plurality of sub-spaces in one or more further database nodes of the plurality of database nodes; and storing the data elements associated with a respective position in the outer boundary portion of the first sub-space of the plurality of sub-spaces in at least one of the one or more further database nodes of the plurality of database nodes.

In a further possible implementation form of the third aspect, the method further comprises the step of storing one or more of the data elements associated with a respective position in the outer boundary portion of the further sub-spaces neighbouring the first sub-space in the first database node of the plurality of database nodes.

In a further possible implementation form of the third aspect, for every data element associated with a respective position in the outer boundary portion of the first sub-space of the plurality of sub-spaces, the method comprises the further steps of: determining a coverage region centred on the position associated with the data element; determining those of the one or more further sub-spaces neighbouring the first sub-space that are at least partially covered by the coverage region for determining one or more covered sub-spaces; and storing the data element in those database of the plurality of database nodes that store the data elements of the plurality of data elements associated with a respective position in the one or more covered sub-spaces of the plurality of sub-spaces.

The data distribution method according to the third aspect of the present disclosure can be performed by the apparatus according to the first aspect of the present disclosure and the distributed database system according to the second aspect of the present disclosure. Thus, further features of the data distribution method according to the third aspect of the present disclosure result directly from the functionality of the apparatus according to the first aspect of the present disclosure and/or the distributed database system according to the second aspect of the present disclosure as well as their different implementation forms described above and below.

According to a fourth aspect, a computer program product comprising a non-transitory computer-readable storage medium carrying program code which causes a computer or a processor to perform the method according to the third aspect, when the program code is executed by the computer or the processor, is provided.

Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, embodiments of the present disclosure are described in more detail with reference to the attached figures and drawings, in which:

Fig. 1 is a schematic diagram illustrating a distributed database system according to an embodiment, including an apparatus according to an embodiment for distributing spatial data to nodes of the distributed database system;

Fig. 2 is a schematic diagram illustrating different aspects implemented by an apparatus according to an embodiment for an exemplary two-dimensional data space; and

Fig. 3 is a flow diagram illustrating different steps of a method for distributing spatial data to the nodes of a distributed database system according to an embodiment.

In the following, identical reference signs refer to identical or at least functionally equivalent features. DETAILED DESCRIPTION OF THE EMBODIMENTS

In the following description, reference is made to the accompanying figures, which form part of the disclosure, and which show, by way of illustration, specific aspects of embodiments of the present disclosure or specific aspects in which embodiments of the present disclosure may be used. It is understood that embodiments of the present disclosure may be used in other aspects and comprise structural or logical changes not depicted in the figures. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims.

For instance, it is to be understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if one or a plurality of specific method steps are described, a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a specific apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.

Figure 1 is a schematic diagram illustrating a distributed database system 100 according to an embodiment. The distributed database system 100 comprises a plurality of database nodes (or short "nodes") 107, 111a-c configured to store data, wherein each database node 107, 111 a-c is configured to store a subset of the whole data to be stored. By way of example, four different database nodes 107, 111 a-c are shown in figure 1. As will be appreciated, however, the distributed database system 100 may comprise more or less than the four different database nodes 107, 111 a-c shown in figure 1. In an embodiment, the different database nodes 107, 111a-c of the distributed database system 100 may have different data storage capacities. For instance, a first node 111a may have a data storage capacity that is about twice as large as the data storage capacity of a second node 111b and/or a third node 111c.

The distributed database system 100 further comprises an apparatus 101 for distributing spatial data to the different database nodes 107, 111a-c of the distributed database system 100. In the embodiment shown in figure 1 , the apparatus 101 comprises a database node 107 itself, i.e. the apparatus 101 may distribute a portion of the data to itself. In another embodiment, the apparatus 101 may not comprise a database node 107 itself and may be configured to distribute the spatial data to the "remote" database nodes 111a-c.

As illustrated in figure 1 , the apparatus 101 may further comprise a processor 103 for processing data and a communication interface 105 for exchanging data with the database nodes 107, 111a-c of the distributed database system 100. The processor 103 may be implemented in hardware and/or software, which when executed causes the apparatus 101 to perform the functions and methods described herein. The hardware may comprise digital circuitry, or both analog and digital circuitry. Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors. The communication interface 103 may comprise one or more wired or wireless communication interfaces 103 for exchanging data with the database nodes 107, 111a-c of the distributed database system 100.

As will be described in more detail below, the spatial data comprises a plurality of data elements or records, wherein each data element is associated with a position in a space having two or more dimensions. An example for such a space is the two-dimensional space or map 200 shown in figure 2. Although a detailed embodiment will be described in the following in the context of the two-dimensional space 200 shown in figure 2, the person skilled in the art will appreciate that the concepts described herein can be applied to a space having more dimensions than the two-dimensional space 200 shown in figure 2.

In an embodiment, the exemplary two-dimensional space 200 shown in figure 2 may be, for instance, a square-shaped or rectangular portion of a map and each data element of the plurality of data elements may be associated with a respective position, such as the position 210 within the two-dimensional space 200. In an embodiment, each data element may be associated, for instance, with coordinates defining its respective position within the space 200 (i.e. on the map 200), such as x, y coordinates, or latitude and longitude coordinates.

As illustrated in figure 2, the apparatus 101 is configured to partition the two-dimensional space 200 into a plurality of two-dimensional sub-spaces 201-209, i.e. portions or cells of the space 200, such that each spatial data element by means of its position within the space 200 is associated with a respective one of the plurality of two-dimensional subspaces 201-209. For instance, a spatial data element having the position 210 within the space 200 shown in figure 2 is uniquely associated with the two-dimensional sub-space 205. As illustrated in figure 2, the plurality of two-dimensional sub-spaces 201-209 may have the same size and shape. In the embodiment shown in figure 2, the plurality of two- dimensional sub-spaces 201-209 have a quadratic shape. In the embodiment shown in figure 2, the plurality of two-dimensional sub-spaces 201-209 have the same shape as the two-dimensional space, i.e. the map 200, namely a square shape. In other embodiments, the shape of the space 200 and/or its sub-spaces 201-209 may be different, for instance, rectangular.

In the following, further features of the apparatus 101 for distributing spatial data among the plurality of database nodes 107, 111 a-c will be described in the exemplary context of the two-dimensional sub-space 205, i.e. the sub-space in the center of the space 200 shown in figure 2. As will be appreciated, however, the apparatus 101 may be configured to distribute the spatial data associated with the other two-dimensional sub-spaces 201- 204, 206-209 in the same way as the spatial data associated with the exemplary two- dimensional sub-space 205.

As illustrated in figure 2 for the exemplary two-dimensional sub-space 205, each two- dimensional sub-space 201-209 comprises an inner central portion, such as the exemplary inner central portion 205a, and an outer boundary portion, such as the exemplary outer boundary portion. In the embodiment shown in figure 2, the inner central portion 205a has a rectangular, in particular quadratic shape and the outer boundary portion 205b has the shape of a rectangular, in particular quadratic frame.

The apparatus 101 is configured to store the spatial data elements of the plurality of data elements associated with a respective position in the exemplary first sub-space 205 of the plurality of sub-spaces 201-209, such as the data element associated with the position 210 shown in figure 2, in a first database node of the plurality of database nodes 107,

111 a-c, such as the exemplary first database node 111a. In other words, the apparatus 101 is configured to store all data elements associated with positions located in the exemplary first sub-space 205, including the data element(s) associated with the position 210, in the exemplary first database node 111a.

With respect to the data elements associated with a respective position in the other subspace 201-204, 206-209, the apparatus 101 is configured to handle this data in the same way or a similar way as the data elements associated with a respective position in the exemplary first two-dimensional sub-space 205. More specifically, for one or more of the other sub-spaces 201-204, 206-209 neighbouring the first sub-space 205, i.e. having a common border with the first sub-space 205, the apparatus 101 is configured to store the data elements associated with a respective position in the one or more further sub-spaces 201-204, 206-209 in one or more of the further database nodes 107, 111 b,c other than the exemplary first database node 111a. In other words, the data elements associated with a respective position located in at least one neighbouring subspace 201-204, 206-209 are stored in a different database node than the data elements associated with a respective position in the exemplary first sub-space 205. For instance, the apparatus 101 may be configured to store the data elements associated with a respective position in the subspace 201 in the database node 111b, the data elements associated with a respective position in the sub-space 202 in the database node 111c and the data elements associated with a respective position in the sub-space 202 in the database 107. The data elements associated with respective positions in the sub-spaces 203 and 206-209 may be stored in one of these databases 111b, 111c, 107 as well or in other databases not shown in figure 1. In an embodiment, the apparatus may be configured to store the data elements associated with respective positions in each respective sub-space 201-209 in a separate database, i.e. each sub-space 201-209 may be assigned to a separate database node.

In addition to storing the data associated with positions in the different sub-spaces 201- 209 in the different database nodes 107, 111a-c, the apparatus 101 is configured to duplicate data, i.e. store certain data elements in more than one database node 107,

111 a-c. This will be described in more detail in the following in the context of data elements associated with a position in the exemplary sub-space 205 as an example for the other sub-spaces 201-204, 206-209. The apparatus 101 is configured to store the data elements associated with a respective position in the outer boundary portion 205b of the exemplary first sub-space 205, which already have been stored in the first database node 111a, in at least one of the further database nodes 107, 111b, 111c. In other words, the apparatus 101 is configured to store data elements associated with a respective position in the boundary region 205b, such as the position 210, in at least two, possibly mode database nodes. Advantageously, this allows reducing the inter-node communication necessary for handling a spatial data query.

In the following, more details will be described about how the apparatus 101 according to an embodiment is configured to determine the further database nodes for storing the spatial data element associated with the position 210 in the sub-space 205. In an embodiment, the apparatus is configured to determine a coverage region 210a centred on the position 210. In the embodiment shown in figure 2, the coverage region 210a has the shape of a circle 210a with a radius 210b and a circular boundary 210c. In other embodiments, the coverage region 210a may have a different shape, such as a rectangular shape. As will be appreciated, the size of the circular coverage region 210a is defined by the radius 210b, i.e. a coverage size parameter. In an embodiment, the apparatus 101 is configured to adjust the radius 210b, i.e. the coverage region size parameter for adjusting the number of further database nodes where the data element associated with the position 210 is stored.

The apparatus 101 is further configured to determine the neighbouring sub-spaces of the exemplary first sub-space 205 that are at least partially covered by the circular coverage region 210a centred on the position 201 for determining one or more at least partially covered sub-spaces. In the embodiment shown in figure 2 the circular coverage region 210a covers portions of the sub-spaces 201 , 202 and 204. Having determined the neighbouring sub-spaces being partially covered or touched by the circular coverage region 210a, namely the sub-spaces 201 , 202 and 204, the apparatus 101 stores the data element associated with the position 210 in those further database nodes storing the data associated with positions in the sub-spaces 201 , 202 and 204. In an embodiment, the apparatus 101 is configured to perform the same procedure with all of the data elements associated with a position in the outer boundary portion 205b of the exemplary sub-space 205. As will be appreciated, the position of a data element within the outer boundary portion 205b of the exemplary sub-space 205 determines in how many and in which further database nodes the data element will be stored by the apparatus 101. In an embodiment, the size of the outer boundary portion 205b of the exemplary subspace 205 is defined by a boundary portion size parameter, wherein the apparatus 101 is configured to adjust the boundary portion size parameter. Advantageously, adjusting the boundary portion size parameter allows adjusting the amount of data stored in at least two database nodes.

Embodiments of the apparatus 101 provide a reduced (or even avoid in some cases) remote inter database node data shipping in the process of executing a window query.

The query execution time decreases. Queries can be potentially answered by a single database node, which leads to less load on other database nodes and the network infrastructure. As data duplication may increase space consumption, embodiments of the apparatus 101 may be used as a cache.

Figure 3 is a flow diagram illustrating different steps of a method 300 for distributing spatial data to the nodes 107, 111a-c of the distributed database system 100 according to an embodiment.

The method 300 comprises a first step 301 of partitioning the m-dimensional space 200 into a plurality of sub-spaces 201-209, wherein the position 210 of each data element is associated with one of the plurality of sub-spaces 201-209 and wherein each sub-space 201-209 comprises a central portion 205a and a boundary portion 205b.

The method 300 comprises a further step 303 of storing the data elements of the plurality of data elements associated with a position in a first sub-space, by way of example subspace 205 shown in figure 2, of the plurality of sub-spaces 201-209 in a first database node of the plurality of database nodes 107, 111 a-c.

Furthermore, for one or more further sub-spaces 201 , 202, 203, 204, 206, 207, 208, 209 neighbouring the first sub-space 205, the method comprises a step 305 of storing the data elements of the plurality of data elements associated with a respective position in the one or more further sub-spaces 201 , 202, 203, 204, 206, 207, 208, 209 of the plurality of subspaces 201-209 in one or more further database nodes of the plurality of database nodes 107, 111a-c.

The method 300 comprises a further step 307 of storing the data elements associated with a respective position 210 in the boundary portion 205b of the first sub-space 205 of the plurality of sub-spaces 201-209 in at least one of the one or more further database nodes of the plurality of database nodes 107, 111 a-c.

The person skilled in the art will understand that the "blocks" ("units") of the various figures (method and apparatus) represent or describe functionalities of embodiments of the present disclosure (rather than necessarily individual "units" in hardware or software) and thus describe equally functions or features of apparatus embodiments as well as method embodiments (unit = step).

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described embodiment of an apparatus is merely exemplary. For example, the unit division is merely logical function division and may be another division in an actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, functional units in the embodiments of the invention may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.