A SYSTEM AND METHOD FOR A NETWORK DISTRUBUTED DATABASE

Title:

A SYSTEM AND METHOD FOR A NETWORK DISTRUBUTED DATABASE

Document Type and Number:

WIPO Patent Application WO/2008/006195

Kind Code:

Abstract:

The present invention provides a database that is distributed throughout a network with each node in the network able to both add information and search for information. This distributed database may sit on any network that provides a hierarchy or tree structure.

Inventors:

DAVIES CHRIS (CA)

Application Number:

PCT/CA2007/001201

Publication Date:

January 17, 2008

Filing Date:

July 10, 2007

Export Citation:

Click for automatic bibliography generation Help

Assignee:

DAVIES CHRIS (CA)

International Classes:

H04L12/28; G06F16/182; H04L69/14

Download PDF:

View/Download PDF PDF Help

Claims:

Claims

1. A self-organizing network where nodes in said network are able to organize into a hierarchy such as described in PCT/CA2005/000194 and use said hierarchy to store and search information as part of a distributed database.

Description:

A SYSTEM AND METHOD FOR A NETWORK DISTRUBTED DATABASE

Priority of Claims

[0010] The present invention claims priority from Canadian Patent Application

No. 2552451 filed on July 12, 2006.

Field Of Invention

[001 1] The field of the invention relates generally to electronic, telecommunication and computing devices that communicate with each other and more particularly to a distributed database therefore.

Background of Invention

[0012] Databases are extremely common. There are large numbers of free and commercial databases currently available such as MySQL, DB2, Oracle, etc. Distributed databases are less common and generally refer to several database servers connected through a network that mirror information between them. These multiple database copies can improve performance and uptime by providing multiple instances of the same information.

[0013] These prior architectures suffer in an environment where network connectivity to a particular database is not assured. An example would be wireless ad-hoc networks. No matter which node is a 'database' node, it is possible for a number of wireless nodes to form a network where none of them have access to this database.

[0014] If this database holds an 'address book ^* that stores node names mapped to ip addresses, if a node has no way to reach this database, it will be unable to resolve a node name to an IP address, even if the node with that IP address in the same network as said node.

[0015] The additional problem is one of scale. If there is a small, fixed size network it is trivial for every device to store a copy of the information it needs (in the

example of an 'address book' database). However, if a network contains many thousands or millions of devices, it is impractical, and in most cases impossible, for each node to have a complete, up to date copy of the entire database.

[0016] For example, if you consider a network that contains hundreds of thousands of low capacity devices (such as sensors), that only have enough capacity to know of a few hundred nodes, as well as a number of larger capacity nodes. The problem is how to distribute the address book information in order that every node is able to look up the name/iρ mapping for any other node in this network, while not overwhelming a particular nodes storage capacity.

Summary of Invention

[0017] It is an object of the present invention to provide a novel system and method for networking a distributed database that obviates or mitigates at least one of the above-identified disadvantages of the prior art.

|0018 ^" 1 This invention is based on the principles revealed in patent application

PCT CA2005'000] 94 and allows a distributed database in which every node plays a part in maintaining information. The amount of information each node will store and how current that information is kept is managed in large part the same way that node route information is managed.

[0019] In simple terms, information is attached to node route updates. This allows any node to create information, attach it to its node name, and then have that information pushed up the revealed hierarchical tree structure, and propagated in a limited fashion throughout the network.

[0020] This tree structure as described in one embodiment in patent application

PCT/CA2005/000194 is revealed by having each node pick one directly connected neighbor node that is the next best step to the most other nodes

[0021] Since information is attached to the node updates, this information is spread m exactly the same way as node route updates. Since the system and method described in application PCT/CA2005/000194 is able to precisely limit the spread of route information

in constrained environments, the attached information is also spread in this same limited way.

[0022] Querying this database is similar to requesting imile information about a particular node as described in patent application PCT/Cλ2005/000194. λ search query is sent from the requesting node up the revealed hierarchy. If it matches the information of any node, that node's priority is improved such that the matched node and its attached information are brought back down to the requesting node.

[0023] Utilizing this approach helps mitigate the problems limiting the spread of information such that nodes have as much information as they need, are not overwhelmed, are able to work within their bandwidth constraints and the network has a suitable level of information redundancy.

[0024] Since this invention builds on (in a non-interfering way) the invention disclosed in PCT/CA2005/000194 it should be seen as having the same or great scope as PCT/CA2005/000194, and as such is a system, method, a computer implemented process and a computer readable medium.

Detailed Description of the Invention

[0025] Exemplary embodiments thus follow in order to clarify understanding. These examples, when making specific reference to numbers, other parties' software or other specifics, are not meant to limit the generality of the method and system described herein. A person of skill in the art will be able to realize when two or more merged concepts could be separately implemented or useful, even if not explicitly described as such. Alternative embodiments should not b>; considered mutually exclusive unless specifically stated.

[0026] It is preferred (but not required) that every node in the network will take an active role in maintaining the distributed database. Ideally (although not limited to) each node will perform the same operations, only the amount of information and the speed of processing and propagation may vary.

[0027J The information that is attached to the node names can take almost any form or structure. λ preferred embodiment would be to have all information in the form of a 'description ^* and a "value'. This pair is called a tuple. This example is not intended to limit the scope of the invention to a tuple.

[0028] For example, tuples associated with a node might look like:

[0029] Each node is able to have as many or as few tuples as it requires. A node

(as described in application PCT/CA2005/000194) is able to have a number of node names associated with each node. Each of these separate node names might have a different set of tuples associated with it.

[0030] A route update (for example) that includes these tuples might look like this:

struct sRouteUpdate J

/ ^' / ^' the name of the node that this route update is about char *pNodeName;

'/ the cumulative hop cost to the specified node float fCumulativeHopCostToNode;

// the importance value of this route update float flmportaiiceValue;

' ^• / a series of tuples associated with this node char *pTuplel_Description; char *pTuplelJValue; char *pTuple2_Description; char *pTuple2_Value;

[0031 ] Tuples may be updated, added or removed as required. These changes will be propagated through the network in the same fashion and frequency as normal node route updates.

[0032] Since all nodes make use of a mechanism called a High Speed Propagation

Path (HSPP) to push route information up the tree any information associated with a node will be pushed up the tree.

[0033] By definition, the way the tree in any network is revealed those nodes with more capacity and more stability will be closer to the head/top of the hierarchy of the tree. This will cause sensor (and other low capacity nodes) to be in the periphery of the tree. Since these lower capacity nodes are further away from the head of the tree they will need to store correspondingly less information since there will be fewer nodes beneath them in the tree. Those nodes near the head of the tree will need to have the capacity to store information about most or all of the nodes in the network.

[0034] A node that wishes to request information from the distributed database will create a query. This query might be SQL, or some standard query language that is able to search information the way it is attached to a node name. In our example, a simple query might (for example) look like this:

Description = "Name " AND Value= "*Davies* " [0035] The query is sent up the revealed tree structure in the same fashion as a

'Request HSPP'. Any node on the route for this query that matches the queiy will have its importance improved in order that the node route update and associated information travel down to the requesting node as quickly as possible. The improvement of importance will also help ensure that if the matching node updates its information associated with its node name that the updated information is communicated to the querying node as quickly as possible.

[0036] A Request IISPP travels from parent node to parent node up the revealed tree structure. It does not spread in the way a route update spreads. This query will also

travel from parent node to parent node up the tree structure. A data structure that would hold the query might look like this:

struct sQuery

// the text of the query char *pQuery;

[0037] In large networks, not all nodes have enough storage to contain knowledge of all the nodes in the network. As described in patent application PCT ^/ CA2005/000194 the ordering of route updates allows a node to decide which are the 'most important' route updates to be stored. For example, if a node is part of a network that contains 100 nodes, but it only has enough storage to record information on K) nodes, it will store those 10 nodes that are considered 'most important'. Since queries that match a node will improve the importance of that node, they can cause node (and association information) to be stored on a node that would otherwise not store that information.

[0038] This importance value may be referred to as "Distance from Flow' in patent application PCT/CA2005/000194. Additionally, improving the importance in the language of patent PCT/CA2005/000194 means decreasing the 'Importance' or 'Distance from Flow' value. Since all route updates to be sent are ordered in ascending order by importance, those with a lower value will be sent first. The importance value is not necessarily related to hop cost.

[0039] When a node as described in application PCT/CA2005/000194 establishes a connection through the network to some other node, route updates associated with the communicating nodes are improved significantly in priority. This improvement will not only help ensure route updates are current (in order to maintain and converge the connection), it will also help ensure that any updates to the information associated with this node is kept current as well.

Query Alternative Embodiment

[004Oj This alternative embodiment describes how to improve the query language using the tuples described previously.

[0041 ] Many times it will be advantageous to query on an individual tuple as well as on all the tuples associated with the node. For example:

Search for a node where the name matches 'Davies ' and phone number has the three digit prefix 416.

[0042 j If we write out a query such as this, we might try:

Descriptiυn= "Name " AND VaIiIc= "*Davies* "AND Description- "Phone "AND Value= "4J6*"

[0043] The problem is there is no way to determine what "Description' goes with what "Value'. One possible alternative embodiment is to introduce the use of a particular type of bracket that indicates that all expression evaluation will be performed on one tuple at a time. For example, we could use '[' and ']' . The previous expression could be rewritten like this:

[Description^ "Name " AND VaI ue= "*Davies * "] AND

[Description^ "Phone " AND VaIw= "416 * "]

[0044] We can also use normal brackets "(' and ')' both inside and outside the square brackets. For example, we want to look for phone numbers that start with both 416 and 519.

[Description^ "Name " AND VaIm= "*Davies* "] AND [Description^ "Phone " AND (Valuer "416*"OR Value= "519*")]

[0045] The advantage of this embodiment is that it allows us to work with a list of

Description/Value tuples that may have almost infinite variability, in contrast to a conventional relational database that has a fixed set of table names and table fields.

Relational Database Alternative Embodiment

[0046] Although not ideal, some of the functionality of a relational database can be mimicked using these tuples. For example, a relational database uses table names and field names. The description value of the tuple could be a combination of:

Description = ^' TableName:FieldName:FieldName:Field\'aιne: ... ' Value = 'VahieID: VahieID: ValueID:.. '

[0047] For example:

Description- ^' PeopleTable:Name:Position:Phone ^' Value= "David Davies:Partner:5l9 941 4581 '

[0048] Every node name can support any number of tuples, and thus any number of rows in any number of tables.

[0049] A relational query that matches a position to an index in a position table might work like this. Two separate tuples (perhaps originating on separate nodes), look like this:

Tuple 1 :

Description^ ^' PeopleTable:Name:Position:Phone '

Valιιe= 'David Davies:[table:PositionTable;Field:PositionID]2:519 941 4581 ' Tuple 2:

Description = Po sition Table: PositionID: Position ' Value= '2.'Partner '

[0050] A node doing a lookup on 'David Davies' would then need to do another lookup on the 'PositionTable:PositionlD' in order to resolve the position id '2' referred to in Tuple 1 to the text value 'Partner'.

[0051 1 Someone skilled in the art would be aware of variations.

Service Oriented Architecture Alternative Embodiment

[0052] This distributed database can also be used to help support a Service

Oriented Architecture (SOA). SOA requires that devices and applications publish to a database what services they provide (for example, streaming video) and a method to access these services.

[0053] Tuples can be used to allow SOA capable nodes and applications to share their capabilities with other nodes in the network.

[0054] Each node would publish its SOA specifications as tuples. For example, a node capable of video streaming and monitoring temperature might publish two entries:

Description^ 'SOA Description 1 ^'

Value= 'type:Streaming_Video format: A VI access/node ^■ standard'

Descφtion = 'SOA Description 2 ' Valιιe= 'type ' Streaming ^Temperature format :Text accessmode: standard'

[0055] A node looking to find other nodes that provided a SOA streaming video service might create a query (that travels from parent to parent until it reaches the head of the network) that (for example) looks like this:

Query— ' [Description - 'SOA * ' AND Value= ' type:Streaming_Video '] '

[0056] This would return all nodes in the network that support streaming video with SOA.

[0057] Someone skilled in the art would be aware of variations.

Special Purpose Tuple Types Alternative Embodiment

[0058] Just as in regular databases, there are certain types of queries that require a particular format of information in order to execute.

[0059J For example, it may be desirable to have every node attach its current GPS co-ordinates as a tuple to its node name. This might look like:

Description= 'GPS Location ' Valiιe= ' N49.25903, Wl 22.77428 '

[0060] Whenever a node changed location, it would update its tuple, and this information would be propagated through the network database.

[0061] With this particular type of tuple there is the ability to run special queries. For example, a node in the network might want to know about all nodes within I kilometer radius of a particular GPS position. This query might look like this (for example):

Querγ= '[Description- 'GPS Location ' AND Value= '1000m radius of point N49.25903, Wl 22.77428 '] '

[0062] These queries could be combined with other queries, allowing a node to request (for example) all nodes within a l km radius of a certain point that can provide a streaming video feed.

[0063] Someone skilled in the art would be aware of variations.

Encryption Alternative Embodiment

[0064] In larger networks, or networks where information needs to be protected only for authorized users encryption may be applied to the tags.

|0065] For example (but not limited to this example), the description might include a component indicating the type or class of encryption being used.

Description^ ^' GPS Locatiυn:ENC:5521a ^" Value= ' XXXXXXenaypteJXXXXXXXXX'

[0066] In the previous eκamρle a node would only be able to decrypt the value if it had the correct encryption code specified by "5:>21a". This code would correspond to an

encryption method (ex: AF.S256) and a key (ex: 0x67a3829aal 0002al01). Only nodes that had this information would be able to read the value.

[0067J Someone skilled in the art would be aware of variations.

Historical Information Alternative Embodiment

[0068] An alternative embodiment would have the nodes store the information they have seen from other nodes in the network in local storage. This information would be able to be recalled based on time in order to provide each node a historical record of previous nodes and previous node states.

[0069] Someone skilled in the art would be aware of variations.

Possible Uses

[0070] This distributed database may be used in a variety of applications:

- sensors may add their sensor information to the network so that any other node in the network may instantly query that sensor information. For example, sensors that detect biological, chemical or other information may add this information to the network.

In a military or first-responder environment this database can allow the locations and status of nodes to be easily visible by all nodes in the network.

The search capability may be used to request the location of the specific assets. For example, 'show me the location of all medics within 5 km of my present location'

- Cities may use this technology to monitor water and electricity use by homes and businesses.

This invention may also be used by oil & gas, pharmaceutical and other industries that need to record and manage sensor information.

Service Oriented Architecture (SOA) applications can be supported by this distributed database.

This technology can enable robotic and automatic controls of machinery such as automobiles and aircraft.

- This can support visualization tools such as Common Operating Picture

(COP) and Situational Awareness Screens that enable monitoring of the nodes and participants in the network.

- By enabling optimized routing the software can be used by delivery and shipping services for route discovery and optimization.

- By being able to generate global knowledge of certain groups, such as aircraft networks, this technology can enable air traffic management for large areas.

- For automobiles this technology can enable networked vehicles to be automatically monitored, repaired, notified and optimized while being used.

Previous Patent: PURIFICATION OF PGM SPECIES FROM MIXTURES THEREOF

Next Patent: SYSTEMS AND METHODS FOR MANAGING NETWORKS