Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
SEARCH METHOD FOR AN UNSTRUCTURED P2P NETWORK
Document Type and Number:
WIPO Patent Application WO/2008/028319
Kind Code:
A1
Abstract:
The present invention provides a search method for an unstructured P2P network, comprising the steps of: selecting a subset of its neighbours adaptively in the unstructured P2P network by a querying peer; forwarding a Query message to the subset of its neighbours intelligently on the basis of a search criteria by the querying peer; and returning a QueryHit message to the querying peer in response to the Query message by a receiver peer, if the receiver peer in the subset of the querying peer's neighbours can provide a reply; otherwise, propagating the Query message to one of its neighbours having the most probability to respond to the Query message with a QueryHit message by the receiver peer.

Inventors:
XU YAN (CN)
MA XIAOJUN (CN)
WANG CHUANMING (CN)
Application Number:
PCT/CN2006/002185
Publication Date:
March 13, 2008
Filing Date:
August 25, 2006
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
THOMSON BROADBAND R & D BEIJIN (CN)
XU YAN (CN)
MA XIAOJUN (CN)
WANG CHUANMING (CN)
International Classes:
H04L29/08
Other References:
KALOGERAKI D., GUNOPULOS D., ZEINALIPOUR-YAZTI D.: "A Local Search Mechanism for Peer-to-Peer Networks", CIKM'02, November 2002 (2002-11-01), XP002242714
YANG B. AND GARCIA-MOLINA H.: "Improving Search in Peer-to-Peer Network", ICS, 2002
TSOUMAKOS D. AND ROUSSOPOULOS N.: "Adaptive Probabilistic Search (APS) for Peer-to-Peer Networks", TECHNICAL REPORT CS-TR-4451, UN OF MARYLAND, 2003
XIA Q.-Z. AND XIE G.-G.: "Search Methods and Its Improvements for Unstructured P2P Networks", INSTITUTE OF COMPUTING TECHNOLOGY, CHINESE ACADEMY OF SCIENCES, no. 9, 2005, pages 1 - 5
Attorney, Agent or Firm:
YU, Gang (P.C.Floor 16, Tower A,InDo Building, A48 Zhichun Road,Haidian District, Beijing 8, CN)
Download PDF:
Claims:

CLAIMS

1. A search method for an uns I; rue Lu rod P? P network, comprising the steps ol :

selecting a subset of -its neighbours adaptivcly in the unstructured P2P network by a querying peer.;

forwarding a Query message to the subset of its neighbours intelligently on the basis of " a search criteria by the querying peer; and

returning a Queryliit message to the querying peer in response to the Query message by a receiver peer, if the receiver peer in the subset of the querying peer's neighbours can provide a reply; otherwise, propagating the Query message to one of its neighbours having the most probabi IiLy Lo respond to the Query message wj th a QuoryHJ t message by the receiver peer.

2. The search method according to claim 1, further comprising the step of:

setting a limit on the search depth to terminate searching process by the querying peer.

3. The search method according to claims 1 or 2, further comprising the step of:

providing each of said peers Ln the unstructured P2P network with an information table for storing the statistics information of neighbour peers of the each peer.

4. The search method according to claim 3, wherein the statistics information comprises the number of shared files, the most recent succeeded queries, and the specific peer providing the response to the Query message.

5. The search method according to claim 4, wherein the information table comprises a f Ll e table for storing the number of the shared .P.i ' 1 es of the neighbour peers and a Query tab] e for storing the most recent succeeded queries of the neighbour peers.

6. The search method according to claim b, wherein the querying peer in the unstructured P2P network selects the subset of its neighbours adaptively by the steps of:

S40?. , setting the initial va.l ue of the number of the selected peers to be zero, wherein, the total number of peers to be selected Ls n, n<N, N is the total number of the neighbours;

S404, looking up the Query tables of the n peers to confirm whether the search criteria is in the Query tables;

S406, selecting the peers having the Query table including the search criteria; and

S416, repeating steps S404 to S406 unti-1 the number of the selected peers is equal to n.

7. The search method according to claim 6, further comprising the following steps between steps S406 and S416:

S408, determining whether the number of the selected peers is less than n, if: the number of the selected peers is not less than n, the selection for the subset is terminated;

S410, if the number of the seJ.ecl.ed peers is less than n, repeating steps S404 to S408 until Lt arrives at the end of its neighbours' Query tab Ie;

S412, " looking up the file table of the neighbour peers; and

S414, seJecting the peer which has the max number of the shared file, and add the number of the selected peers with 1.

8. The search method according to anyone of claims 1 to 7, wherein when forwarding a Query message, the querying peer also sends a ping message simultaneously, and the receiver peer responds to the ping message with a pong message.

9. The search method according to claim 8, wherein the pong message includes the address of the receiver peer and the number of the files that arc shared by the receiver peer on the network.

10. The search method according to claim 9, wherein the querying peer caches the received pong massage in which the information about the number of the shared files can be picked up.

11. The search method according to claim b, wherein the Query table includes the responder o C the QueryHit message and the corresponding search c ri teria .

12. The search method according to claim 11, wherein the descriptor_id field of the QueryHil. message contains the same value as that of the associated Query message, wh.i ch associates the QueryHit message with the corresponding Query message .

13. The search method according to any one of claims 4 to 12, wherein the most recent succeeded queries of the neighbour are obtained by the step of:

upon receipt of a QueryHit message, the querying peer records the QueryHit information, which includes the resporider of the Qucryllit message and the corresponding search criteria in the Query message .

14. The search method according to anyone of claims 4 to 12, wherein the most recent succeeded queries of the neighbour arc obtained by the step of:

upon receipt of a QueryHit message, the querying peer records the QueryHit information and also sends the QueryHit information to al 1 its neighbours .

15. The search method according to any one of claims 8 to 12, further comprising the steps of:

setting a rate limit of the rece.i ved messages for each peer in the unstructured P2P network on the basis of its capability, if the rece.i.ving rate of " the peer is below the rate limit, the peer receives the incoming message, if not, the peer discards the message .

16. The search method according Lo claim 15, wherein all the messages are classified into a plurality of levels on the basis of their priorities.

17. The search method according to claim 16, wherein the message having a higher priority is processed first.

18. The search method according to claim 17, wherein the messages are .listed as Col Lows with decreased priorities: QueryII.it message, Query message, Pong message, Ping message, and Push message .

19. The search method according to any one of claims 1 to 18, wherein the unstructured P2P network is a Gnutella network.

Description:

Search Method for αxi Unstructured P2P Network

L 11 J-KLp OF THE INVENTION

The present invenLion relaLes qene rally Lo network communication, and particularly to a search method for an unstructured P2P network.

BACKGROUND OF THE INVENTION

Peer-to-peer (P2P) systems are distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other. According to the data location and network topology, P2P systems can be classified .into unstructured and structured. In structured systems, contents (or point to the contents) are placed at a specific peer, while in unstructured systems contents cou.ld be placed at any peer.

One of the most challenging design aspects of a P2P system is efficient techniques for search and retrieval of data. Searching in highly structured systems such as Chord, Tapestry, Pastry and CAN, folJ ows the we! 1 -de Ti nod neighbouring links. For this reason, structured P2P systems provide guarantees on finding existing data and bounded data lookup efficiency in terms of the number of overlay hops; however, the strict network structure imposes high overhead for handling frequent node join and leave. Unstructured P2P systems are exfremcJ y rcs.i I ient to node join and leave, because no special network structure needs to be maintained. Searching in unstructured networks such as Gnutella is often based on Cl ooding or its variation because there is no control over data storage. Flooding causes large amount of

unnecessary traffic which degrades the performance of Lhe P2P system greatly.

The original Gnutella uses flooding which .is the

Breadth First Search (BFS) of the overlay network graph with depth limit D. In this approach, the querying node sends the query request to all its neighbours. J 1 Iach neighbour processes the query and returns the resu.l ts if the contents are found. Each neighbour then forwards the query request further to all its neighbours except for the querying node. Th.i s procedure continues untjl. the depth limit D is reached. This method generates a large number of messages and does not scale well.

Random walk is a well-known technique, which forwards a query message to a random].y chosen neighbour at each step unt.il the object is found. This message is called a "walker".

When using the standard random walk (with one walker), it cuts down the message overhead s i.gn i f i can tJ y . However, this efficiency comes at an order of magnitude increase in user-perceived delay of successful searches.

To decrease the delay, k-walker random walk is deployed. The requesting node (peer) sends out k query messages to an equal number of randomly chosen neighbours. FIach of these messages follows its own path, having intermediate nodes forward it to a randomly chosen neighbour at each step. There are two mechanisms to terminate the walks, TTL-based and checking. 'I 1 Tl, means each random walk terminates after a certain number of hops. "Checking" means a walker periodically checks with the original requester before walking to the next node. Simulations confirm that k walkers after T steps reach roughly the same number of nodes as 1 walker after k*T

stops. Therefore, by using k-walkcrs, we can cxpccl: Lo cut the delay down by a factor of k.

The most serious disadvantage of this algorithm is its highly variable performance. Success rales and number of hits vary greatly depending on Lhe random choices made .

Therefore, one object of the present invention is to provide an improved search method for an unstructured

P2P network which aims to reduce the number of " nodes that receive the query while still being able to obtain satisfactory query result.

In order to achieve the above object, the present invention proposes a search method based on the ;i nformat-i on of the neighbours to adaptive 1.y select a set of neighbours to forward the query, which could reduce the aggregate load generated by each query across the network and improve the performance of the network.

SUMMARY OF 'J 1 IIE INVENTION

In accordance with the present invention, a search method for an unstructured P2P network is provided.

The search method for an unstructured P2P network in accordance with the present invention comprises the following steps: a querying peer in the unstructured P2P network selects a subset of its neighbours adaptively; the querying peer forwards a Query message to the subset of its neighbours intelligently on the basis of a search criteria; and in response to the Query message, if a receiver peer in the subset of the querying peer's neighbours can provide a QueryH.it message, the receiver peer returns the QueryHi t message to the querying peer; otherwise, the receiver peer propagates Lhe Query message

Lo one of its neighbours having the roost probabi 1 i Ly Lo respond to the Query message wi.th a QuerylliL message.

In accordance with the present invention, the querying peer sets a limit on the search depth to terminate searching process .

In accordance with the present invention, the method further provides each peer in the unstructured P2P network with an information tabJ e for storing the statistics information of: neighbour peers of the each peer.

In accordance with the present invent.!.on, the statistics information comprises the number of shared files, the most recent succeeded queries, and the specific peer providing the response to the Query message.

In accordance with the present invention, the information table comprises a file table for storing the number of the shared files of the neighbour peers and a Query table for storing the most recent succeeded queries of the neighbour peers .

In accordance with the present invention, the querying peer in the unstructured P2P network selects the subset of its neighbours adaptively by the steps of ' :

S402, setting the initial value of the number of the selected peers to be zero, wherein the LoLa I number of peers to be selected is n, n<N, N is the total, number of the neighbours;

S404, looking up the Query tab.1 es of the n peers to confirm whether the search criteria is in the Query tables;

S406, selecting the poors having l;hc Query table including the search criteria; and

S416, repeating steps S404 to S406 until the number of the selected peers is equal to n .

In accordance w.ϊ Lh the present invention, the following steps could be added between steps S406 and S416:

S408, determining whether the number of the selected peers is less than n, if it is not less than n, the selection for the subset is terminated; and

S410, if the number of the selected peers is loss than n, repeating steps S404 to S408 until it arrives at the end of its neighbours' Query table;

S412, look.ing up the file table of the neighbour peers; and

S414, se.lect.lng the peer which has the max number of the shared file, and add the number of the selected peers with 1.

In accordance with the present .invention, when the querying peer forwards a Query message, Lt also sends a ping message simultaneously, and the receiver peer responds to the ping message with a pong message.

In accordance with the present invention, the pong message includes the address of the receiver peer and the number of the files that are shared by the receiver peer on the network.

In accordance with the present invention, the querying peer caches the received pong massage in which

the .Information about the number of the shared f i 1 cs can be picked up.

In accordance with the present invention, the Query tabie includes the respondcr oi the Queryl-Iit message and the corresponding search criteria.

In accordance with the present .invention, the descriptor_id field of: the Queryllil: message contains the same value as that of the associated Query message, which associates the Qucryllit message with Lhc corresponding Query message.

In accordance with the present .invention, the most recent succeeded queries of the neighbour arc obtained when the querying peer receives a QucryHit message, it records the QueryHit information, which includes the responder of the QuerγHit message and the corresponding search criteria in the Query message. It does not need any extra message. λs an alternative, the most recent succeeded queries of the neighbour can be obtained when the querying peer receives a QueryHit message, i t records the QueryHit information and also sends the Queryllit information to all its neighbours. It has more accurate history information of the peers, at the expense of extra messages .

In accordance with the present invention, a rate limit of the received messages is set for each peer in the unstructured P2P network on the basis of its capability, if the receiving rate of the peer is below the rate limit, the peer w i .1 ] receive the incoming message; if not, the peer wi Ii discard the message.

In accordance with the present invention, all the messages are classified into several levels on the basis

of their priorities. The message having higher priori Ly is processed first. The messages are 1 is Led as To 1 lows with decreased priority: QueryHit message, Query message, Pong message, Ping message, and Push message cLc.

In accordance w.i Lh the presenL invention, the unstructured P2P network is a Gnutella network.

Simulation results show that the improved search method outperforms the pure random search method in terms of query success rate, query response time and hit number per query.

Other objects, advantages and novel features of the present invention w.i 1.1 become more apparent from the following detailed description when taken in conjunction w.i th the accompanying drawings .

It is to be understood that both the foregoing general description and the following dotai led description, of the present invention arc exemplary and explanatory and are intended to provide further explanation of the invention as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which arc included to provide a further understanding of the invention and are incorporated in and constitute a part of th.i s application, illustrate embodiment { s) of the invention and together with the description serve to explain the principle of the invention. In the drawings:

Fig. 1 shows the flow chart of the search method according to the present invention;

Fig. 2 shows Lhe frame structure of the Pong message;

Fig. 3A shows the frame structure of lhe Query message;

Fig. 3B shows the frame structure o[ the Qucryllit message; and

Fig. 4 shows the flow chart of the neighbour selection process according to the present invcnUon.

DETAILED DESCRIPTION Oh' 'I 1 HF, PREFERRED EMBODIMENT

The technical features of the present .invention will be described further with reference to the embodiments. The embodiments are only preferable examples without being limited to the present invention. Tt will be well understood by the ski Lied person in the art upon reading the fo 1 lowing detailed description in conjunction with the accompanying drawings.

Locating a resource or service efficiently is one of the most important .issues related to unstructured decentralized peer-to-peer systems (networks) . This invention proposes an improved searching method for Gnutella systems, which is based on the random walk and with emphasis on the peers' selection criteria. The selection is based on several hints of its neighbours, including the number of the shared files, the most recent succeeded queries and the specific peer that provided the answer. By using these hints, a peer selection mechanism is proposed. The search source selects a set of neighbours according to the selection criteria. Ln addition, a flow control ' mechanism is adopted in order to avoid the overload of hotspots and message loss. The

C] ow control is based on the rate 1 i m i L and message priority.

Fig. 1 shows the flow chad, of the search method according to the present invention.

First a querying peer in the unstructured P2P network selects a subset of its neighbours adapt LvcJ y

(S102) . Then the querying peer forwards a Query message to the subset of its neighbours Intel M gent Iy on the basis of search criteria (S104) . in response to the Query message, if a receiver peer in the subset of the querying peer's neighbours can provide a Queryll i t message, the receiver peer returns the Queryllit message to the querying peer; otherwise, the receiver peer propagates the Query message to one of its neighbours having the most probability to respond to the Query message with a QueryHi t message (S106) .

In the above method, the querying peer sets a Limit on the search depth to terminate searching process, each peer in the unstructured P2P network is provided with an information table lor storing the statistics iniormation of neighbour peers of the each peer.

In the above method, the statistics information comprises the number of shared files, the most recent succeeded queries and the specific peer providing the response to the Query message.

In the above method, the information table comprises a file tabie for storing the shared fiJcs of the neighbour peers and a Query table for storing the most recent succeeded queries of the neighbour peers.

The technical solution in accordance with the present invention includes three components: search

mechanism, neighbour selection mechanism and Clow conl.ro L mechanism. TL wil.1 be discussed in detail as below.

1. Search Mechanism

The peer an.LLiaL.ing a search is referred Lo as querying peer. The peer receiving a query message is called the receiver peer. Our search proLocol uses a biased random walk: raLher than forwarding Incoming queries to randomly chosen neighbours, the querying peer forwards the query message to a subset of .its neighbours intelligently. Tf the receiver peer can provide an answer, it returns the result to the requesting querying peer; otherwise, it propagates the query message to one of its neighbour which has the most probab i J j ty to hit the query. The querying peer sets a Limit on the search depth to terminate the walk.

2 . Neighbour selec tion mecha n i sm

Tn order Lo select the appropriate neighbours, a node should maintain a table which contains the statistics information on its neighbours. Those statistics includes the number of the shared f i lcs, the most recent succeeded queries and the specific peer that provided the answer. With these statistics, a method can be developed to help selecting the best neighbour Lo send the query .

2.1. Information collection

These statistics arc collected as fol lows:

2.1.1. The number of the shared Li les

The Gnutella protocol defines the way in which peers communicate over the network. Pong descriptors are

only sent in response to an Incoming Ping descriptor. It contains the address of the peer and the amount of f i 1 cs it's sharing on the network (Pong descriptor is illustrated as Fig. 2) . λ node caches the Pong information it received and then it can pick up information about the number oC the shared fries of its neighbours from its cache. The shared files of. " the neighbours are stored in file table.

λ neighbour with .large number of shared Fi les implies having higher probability to hit a query at this node .

2.1.2. The most recent succeeded queries

The most recent succeeded qu.er.ies of the neighbours are stored in query table. The query table LncLudes the responder of the Queryl-lit and the correspond i nq search criteria .

The Descr.i ptor_Td field in the Descriptor Header of the Queryliit contains the same value as that of the associated Query descriptor. This associates the QueryHit descriptors with Query descriptors it qcneratcd. Thus, query table can be got ffroru the QueryHit message and corresponding Query message.

To obtain the most recent succeeded queries of the neighbour, two following methods may be appl icd.

1. When a peer receives a Que ryllj t message, it records the QueryHit information, which includes the responder of the QueryHit and the corresponding search criteria in the Query message. This method docs not need any extra message.

2. When a peer receives a QucryHil message, it records the QueryHit information and also sends it to all its neighbours. This method has more accurate history information of the peers, at the expense of extra messages.

The frame structures of Query message and QueryWit message are shown in Fig. 3A and 313.

2.2. Selection rule

When a peer initiates a query, it selects n (n<N, N is the total number of neighbours) peers from .its neighbours. The value of n is denoted as some percent of the total number of neighbours .

The peer looks up the current "search criteria" in the recent succeeded queries of its neighbour. If " one of its neighbours has the record of the "search criteria", then the neighbour is selected. This process continues until the number of the selected peers equals to n or it arrives at the end of its neighbours' query .list. After this process, if the selected number is less than n, it continues to select peers from the rest neighbours who have more shared files until the number of the selected peers equals n.

Fig. 4 shows the flow chart of the neighbour selection process according to the present Invention, which has the steps of:

S402, setting the initial value of the number of the selected peers to be zero, wherein, the total number of peers to be selected is n, n<N, N is the total, number of the neighbours;

S404, looking up the Query LabJes o[ Lhc n peers Lo confirm whether the search criteria is in the Query tables;

S406, selecting the peers having the Query table including the search criteria;

S408, determining whether the number of the selected peers is less than n, if " the number of the selected peers is not less than n, the selection for the subset is terminated;

S410, if the number of the selected poors is less than n, repeating steps S404 to S408 until it arrives at the end of its neighbours' " Query table;

S412, looking up the file table of the neighbour peers; and

S414, selecting the peer which has the max number o F the shared file, and add the number of the selected peers with 1; and

S416, repeating steps S412 to S416 until the number of the selected peers is equal to n.

3. Flow control mechanism

The flow control mechanism aims to avoid overload of hotspots and message loss. Each node sets a rate limit of the receiving messages based on its capability. If the receiving rate of the node is below the limit, the node will receive the incoming message, if not, it wi J L discard the message. Tn addition, all the messages are classified into several levels according to their priority. The messages are 1 is Led as follows with

decreased priority: QuerγH:i t, Query, Pong, P i nq , Push etc. The message with higher pr.io.rj Ly will be processed first.

It is applicable to peer-to-peer systems not only under wired environment but also under wireless environment .

Whilst there has been described in the f ' orqo.ing description preferred embodiments and aspects of the present invention, it will be understood by those skilled in the art that many variations in details of design or construction may be made without departing from the present invention. The present invention extends to a Ii features disclosed both individually, and in ail possible permutations and combinations.