METHOD AND APPARATUS FOR REDUCING DATA REDISTRIBUTION BASED ON QUERY SPLIT

Title:

METHOD AND APPARATUS FOR REDUCING DATA REDISTRIBUTION BASED ON QUERY SPLIT

Document Type and Number:

WIPO Patent Application WO/2018/100417

Kind Code:

Abstract:

The present invention discloses method and apparatus for reducing data redistribution based on query split. In contrast to the prior-art techniques, the present invention by method and apparatus when receives a query that has multilevel grouping/ aggregation, converts the query into join of two queries where one query will project operators which are simple and does not needs to be redistributed, whereas other will project the redistributable operator only. In such case the present invention enables only one aggregate function to be evaluated at multi level and all others can be just evaluated at one level.

Inventors:

KUMAR DILIP (IN)
RAMAMURTHI PRASANNA (IN)
SIVAKUMAR KALYAN (IN)

Application Number:

PCT/IB2016/057302

Publication Date:

June 07, 2018

Filing Date:

December 02, 2016

Export Citation:

Click for automatic bibliography generation Help

Assignee:

HUAWEI TECH INDIA PVT LTD (IN)
HUAWEI TECH CO LTD (CN)

International Classes:

G06F17/30

Foreign References:

US7984043B1

2011-07-19

Attorney, Agent or Firm:

MAJUMDAR, Subhatosh et al. (IN)

Download PDF:

View/Download PDF PDF Help

Claims:

CLAIMS

An apparatus , for providing execution of a query, in a distributed system, the apparatus comprising:

a memory that stores a plurality of instructions; and

a processor operatively coupled to the memory , the processor configured to execute the instructions to:

receive the query for execution;

optimize the query to generate a query plan for the execution of the query received; wherein optimizing the query comprises:

determining at least one first function executable at a node without using redistribution and at least one second function executable at one or more nodes using redistribution in the query received; and

splitting, the query in at least two sub-queries, wherein at least one first sub- query from the two sub-queries comprising the at least one first function executable at a node without using redistribution , and at least one second sub- query from the two sub-queries comprising the at least one second function executable at one or more nodes using redistribution; and

generate the query plan based on the at least two sub-queries for the execution.

The apparatus as claimed in claim 1 , wherein the at least one second function comprising multilevel grouping or at least one aggregation function.

The apparatus as claimed in claim 1, the processor further configured to execute the instructions to: join the query plan generated based on the at least two sub-queries by at least a JOIN function.

The apparatus as claimed in claim 1, the processor further configured to execute the instructions to:

execute the query plan generated based on the at least two sub-queries;

generate at least one tuple in response to the execution of the query plan; and transmit the tuple generated by the one or more nodes to the apparatus.

5. The apparatus as claimed in claim 4, the processor further configured to execute the instructions to:

comprising:

receive at least one tuple from the one or more nodes; group the at least one tuple to generate at least one result to the query received for execution; and

display the result of execution.

6. A method, implemented by an apparatus in a distributed system,, for providing execution of a query, the method comprising:receiving at least one query for execution;

optimizing the query to generate at a query plan for the execution of the query received; wherein optimizing the query comprises:

splitting, , the query in at least two sub-queries, wherein at least one first sub-query from the two sub-queries comprising the at least one first function executable at a node without using redistribution and at least one second sub-query from the two sub-queries comprising the at least one second function executable at one or more nodes using redistribution; and generating the query plan based on the at least two sub-queries for the execution..

7. The method as claimed in claim 6, wherein the at least one second function comprising multilevel grouping or at least one aggregation function.

8. The method as claimed in claim 6, further comprising: joining the query plan generated based on the at least by at least a JOIN function..

9. The method as claimed in claim 6, method further comprising: executing the query plan based on the at least two sub-queries; ;

generating at least one tuple in response to the execution of the query plan; and

transmitting the tuple generated by the one or more nodes to the apparatus.

10. The method as claimed in claim 9, further comprising:

receiving at least one tuple from the one or more nodes;

grouping the at least one tuple to generate at least result to the query received for execution; and

displaying the result of execution.

11. An apparatus, for providing execution of a query, in a distributed system, the apparatus comprising:

a receiving module configured to receive the query for execution;

an optimization module configured to optimize the query to generate a query plan for the execution of the query received; wherein the optimization module further comprises:

a determining module configured to determine at least one first function executable at a node without using redistribution and at least one second function executable at one or more nodes using redistribution in the query received; and

a convertor module configured to split the query in at least two sub- queries, wherein at least one first sub-query from the two sub-queries comprising the at least one first function executable at a node without using redistribution, and at least one second sub-query from the two sub- queries comprising the at least one second function executable at one or more nodes using redistribution;

and

a query plan generator configured to generate the query plan based on the at least two sub-queries for the execution.

12. The apparatus as claimed in claim 11, wherein the at least one second function comprising multilevel grouping or at least one aggregation function.

13. The apparatus as claimed in claim 11 further comprising : an executing module configured to join the query plan generated based on the at least two sub-queries by at least a JOIN function.

14. The apparatus as claimed in claim 11 or 13, the executing module further configured to :

execute the query plan generated based on the at least two sub-queries; generate at least one tuple in response to the execution of the query plan; and

transmit the tuple generated by the one or more nodes to the apparatus.

15. The apparatus as claimed in claim 14, wherein the receiving module further configured to receive at least one tuple from the one or more nodes;

the grouping module configured to group the at least one tuple together to generate at least one result; and

an interface configured to display the result of execution.

Description:

METHOD AND APPARATUS FOR REDUCING DATA REDISTRIBUTION

BASED ON QUERY SPLIT

TECHNICAL FIELD

[001] The present subject matter described herein, in general, relates to database technologies or database query aggregation operations, and more particularly, to methods and apparatuses for reducing data redistribution based on query split.

BACKGROUND

[002] As conventionally known, a database system provides a high-level view of data, but ultimately the data have to be stored as bits on one or more storage nodes. A vast majority of databases today store data on magnetic disk (and, increasingly, on flash storage) and fetch data into main memory for processing, or copy data onto tapes and other backup nodes for archival storage. The physical characteristics of storage nodes play a major role in the way data are stored, in particular because access to a random piece of data on disk is much slower than memory access: Disk access takes tens of milliseconds, whereas memory access takes a tenth of a microsecond.

[003] A database management system (DBMS) is generally system software for creating and managing databases. The DBMS provides users and programmers with a systematic way to create, retrieve, update and manage data. The DBMS is a collection of programs that enables you to store, modify, and extract information from a database. A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. It is usually specifically used to refer to either a distributed database where users store information on a number of nodes, or a computer network in which users store information on a number of peer network nodes. A query is an inquiry into the database using the SELECT statement. The data relating to a table is managed in multiple data nodes/databases/machine and fetched based on the query fired to the database. In distributed databases, the select query is sent to all the data nodes and the data nodes execute the plan independently and the results are merged to give a unified response to the query.

[004] Data Redistribution is the process of moving data replicates from one data site to another to meet business needs. It is a process that constantly balances data needs, data volumes, data usage, and the physical operating environment. When data items are redistributed between nodes, network I/O contributes significantly to the completion time of the query. In conventional distributed systems, in existence of any operator in the query which cannot be pushed down to lower node for execution, all aggregate function will be evaluated at intermediate level and lots of tuple will be redistributed across the nodes. Hence there is a need to reduce data redistribution/movement of large set of records over the network as a part of query execution. Especially in columnar system/databases, in spite of being faster, fetching independent column will have extra overhead. Further, in single node execution also all the aggregate function will be evaluated at intermediate level. Figure 1 illustrates an exemplary distributed system projecting data tuples on execution of a query with a single node performing all the aggregate functions. As shown in the figure 1, even if the final number of tuples projected are very less, there are some functions which cannot by calculated at lower levels and cause all the tuples to be projected from lower layer, and since aggregates are on many columns all the columns need to be fetched. The aggregate function mentioned in the example as shown in figure 1 is mere an example of the class the function that are referred to as ".=" functions that cannot be pushed down the query execution tree, thus resulting in a large number of records to move up the tree (in case of single node execution) or get redistributed in case of distributed databases.

[005] Hence, the problem with the existing distributed systems and/or the single node execution is that, due to presence of some operator which cannot be pushed down, all operators have to be executed at multi level. Further, even though final number of tuple projected are very less, all the tuples needs to be projected from lower layer, and since aggregates are on many columns all the columns need to be fetched (as shown in figure 1). Therefore, the large amount of data needs to move/redistribute during the query execution is not only in a distributed environment, but also profound in a local query execution in a single node as well.

SUMMARY

[006] This summary is provided to introduce concepts related to method and apparatus for reducing data redistribution based on query split, and the same are further described below in the detailed description. This summary is not intended to identify essential features of the claimed subject matter nor is it intended for use in determining or limiting the scope of the claimed subject matter.

[007] A main objective of the present invention is to solve the technical problem as recited above by providing method and apparatus for reducing data redistribution based on query split wherein when there is a query having multilevel grouping/ aggregation, the query is converted into join of two queries. One query of the two queries will project operators which are simple and does not need to be redistributed in the distributed system and other query of the two queries will project the redistributable operator only. Hence, only one aggregate function needs to evaluate at multi level and all others can be just evaluated at one level.

[008] After splitting the query into two, out of the two queries, if any one of the query having aggregate function, then the same needs to be redistributed in the distributed system and needs to be executed at multilevel. So, there is a relation between redistribution and multilevel evaluation i.e. the query having the function which can't be executed at lower level/node level, needs to be redistributed and execute the same at multilevel or intermediate level. Again, executing the function at multilevel or intermediate level is also dependent on the type of the aggregate function.

[009] Accordingly, one aspect of the invention, an apparatus, for providing execution of a query, in a distributed system, the apparatus comprising:

a memory that stores a plurality of instructions; and a processor operatively coupled to the memory , the processor configured to execute the instructions to:

receive the query for execution;

optimize the query to generate a query plan for the execution of the query received; wherein optimizing the query comprises:

splitting, the query in at least two sub-queries, wherein at least one first sub-query from the two sub-queries comprising the at least one first function executable at a node without using redistribution , and at least one second sub-query from the two sub-queries comprising the at least one second function executable at one or more nodes using redistribution; and

generate the query plan based on the at least two sub-queries for the execution.

[0010] According to another aspect of the invention, there is provided a method, implemented by an apparatus in a distributed system, for providing execution of a query, the method comprising: receiving at least one query for execution;

optimizing the query to generate at a query plan for the execution of the query received; wherein optimizing the query comprises:

splitting the query in at least two sub-queries, wherein at least one first sub-query from the two sub-queries comprising the at least one first function executable at a node without using redistribution and at least one second sub-query from the two sub-queries comprising the at least one second function executable at one or more nodes using redistribution; and

generating the query plan based on the at least two sub-queries for the execution. [0011] In one implementation, an apparatus is disclosed. The apparatus, for providing execution of a query, in a distributed system, the apparatus comprising:

a receiving module configured to receive the query for execution;

an optimization module configured to optimize the query to generate a query plan for the execution of the query received; wherein the optimization module further comprises: a determining module configured to determine at least one first function executable at a node without using redistribution and at least one second function executable at one or more nodes using redistribution in the query received; and

a convertor module configured to split the query in at least two sub-queries, wherein at least one first sub-query from the two sub-queries comprising the at least one first function executable at a node without using redistribution , and at least one second sub- query from the two sub-queries comprising the at least one second function executable at one or more nodes using redistribution; and

a query plan generator configured to generate the query plan based on the at least two sub-queries for the execution.

The apparatus further comprising an executing module configured to join the query plan generated based on the at least two sub-queries by at least a JOIN function

[0012] In one implementation, an apparatus is disclosed. The apparatus comprises a processor, and a memory coupled to the processor for executing a plurality of modules present in the memory. The plurality of modules comprises a receiving module, a grouping module, and an interface. The receiving module is configured to receive at least tuple form the one or more systems, the tuple is obtained on execution of a query plan generated for at least a join of at least two queries, wherein at least one query from the two queries contain at least one function executable at a single node and/or at least second query from the two queries contain at least other function executable using redistribution in the query received. The grouping module is configured to group the tuples received together to generate at least result to the query received for execution. The interface is configured to display the result of execution.

[0013] In one implementation, the apparatus is configured to receive the query for execution; optimize the query to generate at a query plan for execution of the query received; wherein optimizing the query comprises, few set of database functions which fall into either of the two categories: 1. Functions that can be executed at the lowest level, and 2. functions that cannot be executed at lower level, doing so will result in wrong result. These functions must be applied on the data set collected from various data nodes. The example for functions of first category are sum(), min() etc. that does not depend on any distribution. If a query contains both the types of the functions, then the first type of functions are forced to act as second type. This forcing causes a lot of records to unnecessarily move up the query chain and thus reducing the query's throughput. They are like the dead weight in the query chain. The effect is glorified by the presence of a grouping operator. According to the present invention, the cases where the query contains both the types of functions are identified, the query are split into two parts, the first part containing the 1 ^st type of functions and the other part containing the 2 ^nd type of query. The results are then unified by a join query. [0014] In contrast to the prior-art techniques, the present invention by method and apparatus converts a single table scan with "group by" to join of two scan with "group by" on same table by splitting the target list.

[0015] The technical benefit achieved by the present invention is that the query with multi level group by and lots of aggregate function can get huge performance benefit in calculating the aggregate (for column store it will be more visible as each column data needs to be fetched independently). In distributed databases (especially column store), when some of the aggregate operation cannot be executed locally and need data redistribution the present invention saves a huge cost.

[0016] The various options and preferred embodiments referred to above in relation to the first implementation are also applicable in relation to the other implementations . BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS

[0017] The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to refer like features and components.

[0018] Figure 1 illustrates grouping and aggregation of the data at intermediate level.

[0019] Figure 2 illustrates an example of a single table scan query converted to join of two scan (join on unique field) on same table and target list can be divided across two scans, in accordance with an embodiment of the present subject matter. [0020] Figure 3 illustrates a flowchart for optimizers, in accordance with an embodiment of the present subject matter.

[0021] Figure 4 illustrates a block diagram for plan execution, in accordance with an embodiment of the present subject matter.

[0022] Figure 5 illustrates an apparatus for execution of a query, in accordance with an embodiment of the present subject matter.

[0023] Figure 6 illustrates a method for execution of a query in distributed systems, in accordance with an embodiment of the present subject matter.

[0024] Figure 7 illustrates a generalized computer network arrangement in accordance with an embodiment of the present subject matter. [0025] It is to be understood that the attached drawings are for purposes of illustrating the concepts of the invention and may not be to scale.

DETAILED DESCRIPTION OF THE PRESENT INVENTION

[0026] The following clearly describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely a part rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.

[0027] The invention can be implemented in numerous ways, as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or electronic communication links. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention.

[0028] A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured. [0029] In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components, modules, units and/or circuits have not been described in detail so as not to obscure the invention.

[0030] Although embodiments of the invention are not limited in this regard, discussions utilizing terms such as, for example, "processing," "computing," "calculating," "determining," "establishing", "analyzing", "checking", or the like, may refer to operation(s) and/or process(es) of a computer, a computing platform, a computing system, or other electronic computing device, that manipulates and/or transforms data represented as physical (e.g., electronic) quantities within the computer's registers and/or memories into other data similarly represented as physical quantities within the computer's registers and/or memories or other information non-transitory storage medium that may store instructions to perform operations and/or processes.

[0031] Although embodiments of the invention are not limited in this regard, the terms "plurality" and "a plurality" as used herein may include, for example, "multiple" or "two or more". The terms "plurality" or "a plurality" may be used throughout the specification to describe two or more components, devices, elements, units, parameters, or the like. Unless explicitly stated, the method embodiments described herein are not constrained to a particular order or sequence. Additionally, some of the described method embodiments or elements thereof can occur or be performed simultaneously, at the same point in time, or concurrently.

[0032] Consider a query with a group by clause having lots of aggregates

(especially in distributed columnar system). If some of the aggregate function (such as, count (Distinct)) is present, at intermediate level such aggregate functions cannot be pushed down at lower level. This cause a lot of tuple to flow from lower nodes to upper node. [0033] Hence, the problem with the existing distributed systems is that, due to presence of some operator which cannot be pushed down, all operators have to be executed at multi level. Further, even though final number of tuple projected are very less, all the tuples needs to be projected from lower layer, and since aggregates are on many columns all the columns need to be fetched (as shown in figure 1).

[0034] Systems, methods and apparatuses for reducing data redistribution based on query split are disclosed.

[0035] While aspects are described for methods and apparatuses for reducing data redistribution based on query split, the present invention may be implemented in any number of different computing systems, environments, and/or configurations, the embodiments are described in the context of the following exemplary systems, apparatus, and methods.

[0036] Henceforth, embodiments of the present disclosure are explained with the help of exemplary diagrams and one or more examples. However, such exemplary diagrams and examples are provided for the illustration purpose for better understanding of the present disclosure and should not be construed as limitation on scope of the present disclosure.

[0037] In order to solve the technical problems as recited in the sections above, the present invention when receives a query that has multilevel grouping/ aggregation, converts the query into join of two queries where one query will project operators which are simple and does not need to be redistributed and other will project the redistributable operator only, in such cases only one aggregate function needs to evaluated at multi level and all others can be just evaluated at one level. [0038] In one implementation, any single table scan query can be converted to join of two scan (join on unique field) on same table and target list can be divided across two scans. Referring now to figure 2, an example showing a group by on Gl field is shown. The figure shows how the splitting the query into two parts and joining their results ends up saving the cost of the execution, according to the present invention. Consider, any SQL query of type:

Π (Fl(cl), F2(c2)...Fn(cn)) Group By gl (Tbl) => Π (Fl(cl), F2(c2)...Fk(ck)) Group By gl (Tbl) JOIN Π (Fk+l(ck+l)„ Fn(cn) ) Group By gl (Tbl)

[0039] Suppose per aggregate function cost is 1, the aggregate cost will increase with increasing the tuples. So suppose we have 1000 tuples and 10 aggregate function then cost is 1000*10 = 10000.

[0040] Referring now to figure 3 a flowchart for optimizers is illustrated, in accordance with an embodiment of the present subject matter. As shown in figure 3, whenever a new query enters/is received, the optimizer will analyze whether it has any grouping clause and one of the aggregate function is a count (Distinct). If the query has more than one aggregate then a single table scan will be converted to a join of two queries, where one query contains count (Distinct) and other contains remaining aggregate. After conversion a final cost will be compared and appropriate plan will be selected for execution. Now only plan with count(Distinct) cannot push the group by and needs redistribution, however, the other plan group by can be pushed down to lower level and no redistribution is required.

[0041] Referring now to figure 4, a block diagram for plan execution is illustrated, in accordance with an embodiment of the present subject matter. As shown in figure 4, is divided into two sub-diagrams (top and bottom) as a single table group may be converted to a Join of two groups by operators.

[0042] As shown in the top sub-diagram of figure 4 is the execution of Planl this plan looks same as the base plan only difference is projection is very small compared to base plan. The top sub-diagram shows that all the results needs to be sent coordinator (or any other centralized/specialized data node) to handle the results which contains a lot of records However, as shown in the bottom sub-diagram of figure 4 is Plan 2 wherein the number of tuple projected to coordinator are just aggregated results so number of tuple is very less.

[0043] Referring now to figure 5, an apparatus in the distributed system for execution of a query is illustrated, in accordance with an embodiment of the present subject matter. In one implementation, the apparatus 500 in the distributed system is disclosed. Although the present subject matter is explained considering that the present invention is implemented in the apparatus 500 in the distributed system, it may be understood that the present invention may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, and the like. It will be understood that the apparatus 500 in the distributed system may be accessed by multiple users, or applications residing on the database system. Examples of the apparatus 500 in the distributed system may include, but are not limited to, a portable computer, a personal digital assistant, a handheld node, sensors, routers, gateways and a workstation. The apparatus 500 distributed system / are communicatively coupled to each other and/or other nodes or a nodes or apparatuses to form a network (not shown). Examples of the apparatus 500 in the distributed system may include, but are not limited to, a portable computer, a personal digital assistant, a handheld node, sensors, routers, gateways and a workstation.

[0044] The apparatus 500 in the distributed system is communicatively coupled to each other and/or other nodes or a nodes or apparatuses to form a network (not shown). In one implementation, the network (not shown) may be a wireless network, a wired network or a combination thereof. The network can be implemented as one of the different types of networks, such as GSM, CDMA, LTE, UMTS, intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network may include a variety of network nodes, including routers, bridges, servers, computing nodes, storage nodes, and the like.

[0045] In one implementation, an apparatus 500 for execution of a query is disclosed. The apparatus 500 is configured to receive the query for execution; optimize the query to generate at a query plan for the execution of the query received; wherein optimizing the query comprises: determining at least one function executable at a single node and/or at least other function executable using redistribution in the query received; splitting, if at least one function and at least other function determined, the query in at least two sub-queries, at least a first sub-query from the two sub-queries contain the at least one function executable at a single node without using redistribution and at least the second sub-query from the two sub-queries contain the other function executable using redistribution; generate the query plan based on the two sub-queries for the execution, wherein sub-queries contain the other function executable using redistribution is redistributed to one or more other nodes of the distributed system for execution.

[0046] In one implementation, the one or more nodes are configured to execute the query plan generated by the apparatus; generate at least tuple in response to the execution of the query plan; and transmit the tuple generated by the one or more nodes to the apparatus.

[0047] In one implementation, the apparatus 500 is further configured to receive at least tuple from the one or more nodes; group the tuples together to generate at least result to the query received for execution; and display the result of execution.

[0048] In one implementation, an apparatus 500 is disclosed. The plurality of modules comprises an optimization module 504, a determining module 506, a convertor module 508, and a query plan generator module 510.. The optimization module 504 is configured to optimize the query to generate at a query plan for the execution of the query received. The determining module 506 is configured to determine at least one function executable at a single node and/or at least other function executable using redistribution in the query received. The converter module 508 is configured to split, if at least one function and at least other function determined, the query in at least two sub-queries, at least a first sub-query from the two sub-queries contain the at least one function executable at a single node without using redistribution and at least the second sub-query from the two sub-queries contain the other function executable using redistribution. The query plan generator module 510 configured to generate the query plan based on the two sub-queries for the execution, wherein sub-queries contain the other function executable using redistribution is redistributed to one or more other nodes of the distributed system for execution.

[0049] In one implementation, an apparatus 500 is disclosed. The plurality of modules comprises a receiving module 502, a grouping module 512, and an executing module 514. The executing module (514) configured to join the query plan generated based on the at least two sub-queries by at least a JOIN function. The executing module (514) further configured to execute the query plan generated based on the at least two sub-queries; generate at least one tuple in response to the execution of the query plan; and transmit the tuple generated by the one or more nodes to the apparatus. The receiving module 502 is configured to receive at least tuple form the one or more nodes the tuple is obtained on execution of a query plan generated for at least a join of at least two queries, wherein at least one query from the two queries contain at least one function executable at a single node and/or at least second query from the two queries contain at least other function executable using redistribution in the query received. The grouping module 512 is configured to group the tuples received together to generate at least result to the query received for execution. The interface 714 is configured to display the result of execution.

[0050] In one implementation, the join of two queries is the join on unique field on same table.

[0051] In one implementation, the query plan generated, by the apparatus 500, for at least a query containing at least a grouping clause, at least a distinct operator, and at least two aggregate operations. [0052] Referring now to figure 6, a method for execution of a query in distributed systems is illustrated, in accordance with an embodiment of the present subject matter. The method may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.

[0053] The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method or alternate methods. Additionally, individual blocks may be deleted from the method without departing from the protection scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof. However, for ease of explanation, in the embodiments described below, the method may be considered to be implemented in the above described apparatus 500 in the distributed system.

[0054] In one implementation, a method for execution of a query in distributed systems is disclosed. [0055] At block 602, at least one query is received for execution.

[0056] At block 604, a presence of at least one function executable at a single node and/or at least other function executable using redistribution is determined in the query received. [0057] At block 606, if at least one function and at least other function is determined, split the query in at least two sub-queries, at least a first sub-query from the two sub-queries contain the at least one function executable at a single node and at least the second sub-query from the two sub-queries contain the other function executable using redistribution.

[0058] At block 608, the query plan based on the two sub-queries is generated for the execution of the two queries. [0059] At block 610, sub-queries contain the other function executable using redistribution is redistributed to one or more other nodes of the distributed system for execution.

[0060] At block 612, the query plan is executed on the one or more nodes of the distributed system depending on the query plan generated.

[0061] At block 614, at least tuple from the one or more nodes is received.

[0062] At block 616, the tuples are grouped together to generate at least result to the query received for execution by the apparatus selected in the distributed system. The grouped tuples are further used to generate result and the result is displayed as a result of execution of the query.

[0063] In one implementation, the method further comprises joining the query plan generated based on the two sub-queries preferably by at least a join query based on unique field on same table.

[0064] With reference to FIG. 7, the computing environment 700 includes at least one processor 702 and memory 704. The processor 702 executes computer-executable instructions and may be a real or a virtual processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. The processor 702 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any nodes that manipulate signals based on operational instructions. Among other capabilities, the at least one processor is configured to fetch and execute computer-readable instructions or modules stored in the memory 706. The memory 704 may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two. In some embodiments, the memory 704 stores software implementing described techniques.

[0065] A computing environment may have additional features. For example, the computing environment 700 includes storage 706 one or more input devices 708, one or more output devices 710, and one or more communication connections 712 and an interfac 714. An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment 700. Typically, operating system, software (not shown) provides an operating environment for other software executing in the computing environment 700, and coordinates activities of the components of the computing environment 700. [0066] The storage 706 may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, CD-RWs, DVDs, or any other medium which may be used to store information and which may be accessed within the computing environment 700. In some embodiments, the storage 706 stores plurality of instructions or modules or applications to perform various functionalities. The memory includes routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types

[0067] The input device(s) 708 may be a touch input device such as a keyboard, mouse, pen, trackball, touch screen, or game controller, a voice input device, a scanning device, a digital camera, or another device that provides input to the computing environment 700. The output device(s) 710 may be a display, printer, speaker, or another device that provides output from the computing environment 700.

[0068] The communication connection(s) 712 enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, audio or video information, or other data in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other earner.

[0069] The interface (I/O interface) 714, may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface may allow the database system, the first node, the second node, and the third node to interact with a user directly. Further, the I/O interface may enable the distributed system / the apparatus 500 to communicate with other nodes or nodes, computing nodes, such as web servers and external data servers (not shown). The I/O interface can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, GSM, CDMA, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface may include one or more ports for connecting a number of nodes to one another or to another server. The I/O interface may provide interaction between the user and the distributed system / the apparatus 500 via, a screen provided for the interface.

[0070] implementations may be described in the general context of computer- readable media. Computer-readable media are any available media that may be accessed within a computing environment. By way of example, and not limitation, within the computing environment 700, computer-readable media include memory 704, storage 706, communication media, and combina tions of any of the above. [0071] Apart from what is discussed above, the present invention has some additional advantages and technical benefits, as provided below:

• The present invention enables to convert a single table scan with "group by" to a join of two scan with "group by" on same table by splitting the target list. · The present invention enables a query with multi level group by and lots of aggregate function a huge performance benefit in calculating the aggregate (further, for column store it will be more visible as each column data needs to be fetched independently).

• In case of the distributed database (especially, column store), when some of the aggregate operation cannot be executed locally and need data redistribution the present invention saves a huge cost.

• When a query have group by one column and count (distinct) on other column, the present invention using count distinct force to group at multi level and then execute.

· The present invention enables any single table scan query to be converted into a join of two scan (join on unique field) on same table and target list can be divided across two scans.

[0072] A person skilled in the art may understand that any known or new algorithms by be used for the implementation of the present invention. However, it is to be noted that, the present invention provides a method to be used during back up operation to achieve the above mentioned benefits and technical advancement irrespective of using any known or new algorithms. [0073] A person of ordinary skill in the art may be aware that in combination with the examples described in the embodiments disclosed in this specification, units and algorithm steps may be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether the functions are performed by hardware or software depends on the particular applications and design constraint conditions of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but it should not be considered that the implementation goes beyond the scope of the present invention.

[0074] It may be clearly understood by a person skilled in the art that for the purpose of convenient and brief description, for a detailed working process of the foregoing system, apparatus, and unit, reference may be made to a corresponding process in the foregoing method embodiments, and details are not described herein again.

[0075] In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the described apparatus embodiment is merely exemplary. For example, the unit division is merely logical function division and may be other division in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms. [0076] When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions may be implemented in a form of a software product. The computer software product is stored in a storage medium, and includes several instructions for instructing a computer node (which may be a personal computer, a server, or a network node) to perform all or a part of the steps of the methods described in the embodiment of the present invention. The foregoing storage medium includes: any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc. [0077] Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise. In addition, devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries.

[0078] When a single device or article is described herein, it will be readily apparent that more than one device/article (whether or not they cooperate) may be used in place of a single device/article. Similarly, where more than one device or article is described herein (whether or not they cooperate), it will be readily apparent that a single device/article may be used in place of the more than one device or article or a different number of devices/articles may be used instead of the shown number of devices or programs. The functionality and/or the features of a device may be alternatively embodied by one or more other devices which are not explicitly described as having such functionality/features. Thus, other embodiments of the invention need not include the device itself.

[0079] Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based here on. Accordingly, the disclosure of the embodiments of the invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

[0080] With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for sake of clarity. [0081] Although implementations for method and apparatus for reducing data redistribution based on query split have been described in language specific to structural features and/or methods, it is to be understood that the appended claims are not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations of the method and apparatus for reducing data redistribution based on query split.

Previous Patent: COMMUNICATION METHOD AND SYSTEM

Next Patent: ORTHOPEDIC LOCKING SCREW