

Title:
DYNAMIC PACKET AGGREGATION AND COMPRESSION
Document Type and Number:
WIPO Patent Application WO/2016/009254
Kind Code:
A1
Abstract:
In some embodiments, a data compression method performs an initial analysis on each of a plurality of candidate combinations of blocks containing inbound packets of network traffic data. The method determines an initial analysis result for each of the candidate combinations and compares the initial analysis results of the candidate combinations. The method determines a desired analysis result based on the comparison and selects a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result. The method forwards the selected subset of blocks for combining into outbound payload data.

Inventors:
BRAZEAU ALEXANDRE (CA)
CAVE GEORGE (CA)
Application Number:
PCT/IB2014/063193
Publication Date:
January 21, 2016
Filing Date:
July 17, 2014
Assignee:
ERICSSON TELEFON AB L M (SE)
International Classes:
H03M7/30; H04L47/36
Foreign References:
US6195024B1 (2001-02-27)
US20040103216A1 (2004-05-27)
Other References:
None
Attorney, Agent or Firm:
JUFFA, Kathryn A. (2001 Ross Avenue Suite 60, Dallas Texas, US)
Claims:
CLAIMS:

1. A data compression method comprising the steps of:

for a set of blocks, performing (208) an initial analysis on each of a plurality of candidate combinations of the blocks, each block comprising one or more inbound packets of network traffic data;

determining (210) an initial analysis result for each of the candidate combinations;

comparing (212) the initial analysis results associated with two or more of the candidate combinations;

determining (214) a desired analysis result based on the comparison;

selecting (216) a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result; and

forwarding (218) the selected subset of blocks for combining into outbound payload data.

2. The method of Claim 1, wherein the initial analysis comprises a preliminary compression.

3. The method of Claim 1, further comprising combining (220) the selected subset of blocks into the outbound payload data.

4. The method of Claim 3, further comprising:

determining whether compressing the outbound payload data increases the size of the outbound payload data; and

leaving the outbound payload data uncompressed if compressing the outbound payload data increases the size of the outbound payload data.

5. The method of Claim 3, further comprising compressing (224) the outbound payload data according to a compression technique associated with the initial analysis.

6. The method of Claim 5, wherein:

each inbound packet comprises an inbound packet header and an inbound packet payload; and

the compressing of the outbound payload data includes compressing the inbound packet headers and the inbound packet payloads of the inbound packets that are comprised within the outbound payload data.

7. The method of Claim 1, further comprising:

combining the selected subset of blocks into the outbound payload data; and

adding a header to the outbound payload data to form an outbound packet.

8. The method of Claim 1, wherein the method performs the initial analysis on a minimum number of the candidate combinations necessary to satisfy a desired analysis result such that fewer than all possible combinations of a pre-determined number of blocks from the set are analysed.

9. The method of Claim 1, wherein the method generates the candidate combinations according to a round-robin technique such that the initial analysis is performed on all possible combinations of a pre-determined number of blocks from the set.

10. The method of Claim 1, wherein the initial analysis is performed in parallel by a plurality of computing resources.

11. The method of Claim 10, wherein the number of computing resources is dynamically scaled relative to the rate at which the method receives the blocks.

12. The method of Claim 1, wherein the method further comprises forming the blocks according to the steps comprising:

aggregating (204) the inbound packets into a current one of the blocks if the current block can accommodate the inbound packets within a desired block size, wherein the aggregating occurs in real-time as the inbound packets are received;

aggregating the inbound packets into a next one of the blocks if the current block cannot accommodate the inbound packets within the desired block size.

13. The method of Claim 12, wherein the method determines the desired block size based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet.

14. The method of Claim 12, wherein the method determines the desired block size based on an amount of computing resources available.

15. The method of Claim 1, wherein the selected subset of blocks for combining into outbound payload data comprises a pre-determined number of blocks and the method determines the pre-determined number of blocks based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet.

16. The method of Claim 1, further comprising:

evaluating a first compression technique during the initial analysis of a current set of blocks;

determining that an amount of time from receipt of the inbound packets of the current set of blocks to transmission of a corresponding outbound packet exceeds a maximum delay; and

in response, evaluating a second, faster compression technique during the initial analysis of a next set of blocks.

17. A network node comprising one or more processors and memory, the memory containing instructions executable by the one or more processors, whereby the network node is operable to:

for a set of blocks, perform (208) an initial analysis on each of a plurality of candidate combinations of the blocks, each block comprising one or more inbound packets of network traffic data;

determine (210) an initial analysis result for each of the candidate combinations;

compare (212) the initial analysis results associated with two or more of the candidate combinations;

determine (214) a desired analysis result based on the comparison;

select (216) a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result; and

forward (218) the selected subset of blocks for combining into outbound payload data.

18. The network node of Claim 17, wherein the initial analysis comprises a preliminary compression.

19. The network node of Claim 17, further operable to combine (220) the selected subset of blocks into the outbound payload data.

20. The network node of Claim 19, further operable to:

determine whether compressing the outbound payload data increases the size of the outbound payload data; and

leave the outbound payload data uncompressed if compressing the outbound payload data increases the size of the outbound payload data.

21. The network node of Claim 19, further operable to compress (224) the outbound payload data according to a compression technique associated with the initial analysis.

22. The network node of Claim 21, wherein:

each inbound packet comprises an inbound packet header and an inbound packet payload; and

the compressing of the outbound payload data includes compressing the inbound packet headers and the inbound packet payloads of the inbound packets that are comprised within the outbound payload data.

23. The network node of Claim 17, further operable to:

combine the selected subset of blocks into the outbound payload data; and

add a header to the outbound payload data to form an outbound packet.

24. The network node of Claim 17, wherein the network node performs the initial analysis on a minimum number of the candidate combinations necessary to satisfy a desired analysis result such that fewer than all possible combinations of a pre-determined number of blocks from the set are analysed.

25. The network node of Claim 17, wherein the network node generates the candidate combinations according to a round-robin technique such that the initial analysis is performed on all possible combinations of a pre-determined number of blocks from the set.

26. The network node of Claim 17, wherein the initial analysis is performed in parallel by a plurality of computing resources.

27. The network node of Claim 26, wherein the number of computing resources is dynamically scaled relative to the rate at which the network node receives the inbound packets.

28. The network node of Claim 17, further operable to form the blocks according to the steps comprising:

aggregating (204) the inbound packets into a current one of the blocks if the current block can accommodate the inbound packets within a desired block size, wherein the aggregating occurs in real-time as the inbound packets are received;

aggregating the inbound packets into a next one of the blocks if the current block cannot accommodate the inbound packets within the desired block size.

29. The network node of Claim 28, wherein the network node determines the desired block size based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet.

30. The network node of Claim 28, wherein the network node determines the desired block size based on an amount of computing resources available.

31. The network node of Claim 17, wherein the selected subset of blocks for combining into outbound payload data comprises a pre-determined number of blocks and the network node determines the pre-determined number of blocks based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet.

32. The network node of Claim 17, further operable to:

evaluate a first compression technique during the initial analysis of a current set of blocks;

determine that an amount of time from receipt of the inbound packets of the current set of blocks to transmission of a corresponding outbound packet exceeds a maximum delay; and

in response, evaluate a second, faster compression technique during the initial analysis of a next set of blocks.

33. A system for data compression, comprising:

an analyzer (120) operable to:

for a set of blocks, perform (208) an initial analysis on each of a plurality of candidate combinations of the blocks, each block comprising one or more inbound packets of network traffic data;

determine (210) an initial analysis result for each of the candidate combinations;

compare (212) the initial analysis results associated with two or more of the candidate combinations;

determine (214) a desired analysis result based on the comparison;

select (216) a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result; and

forward (218) the selected subset of blocks for combining into outbound payload data;

an aggregation unit (130) operable to receive the selected subset of blocks from the analyzer (120) and combine (220) the selected subset of blocks into the outbound payload data; and

a compressor (140) operable to receive the outbound payload data from the aggregation unit (130) and compress the outbound payload data according to a compression technique associated with the initial analysis.

34. A non-transitory computer readable storage medium comprising logic, the logic, when executed by a processor, operable to:

for a set of blocks, perform (208) an initial analysis on each of a plurality of candidate combinations of the blocks, each block comprising one or more inbound packets of network traffic data;

determine (210) an initial analysis result for each of the candidate combinations;

compare (212) the initial analysis results associated with two or more of the candidate combinations;

determine (214) a desired analysis result based on the comparison;

select (216) a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result; and

forward (218) the selected subset of blocks for combining into outbound payload data.

Description:
DYNAMIC PACKET AGGREGATION AND COMPRESSION

TECHNICAL FIELD

The present disclosure relates, in general, to generating outbound packets from inbound packets of network traffic data and, more particularly, to methods for dynamic packet aggregation and compression.

BACKGROUND

Network nodes, such as routers or computers, communicate packets in a network. As an example, a router receives inbound packets of network traffic data from a first computer, generates outbound packets from the inbound packets, and transmits the outbound packets to a second computer. The process of generating the outbound packets may include compressing the data, in order to reduce its size. To decrease the size of data, certain compression techniques replace multiple sets of matching data with shorter symbols representing the matching data. When the data needs to be read, decompression techniques replace the symbols representing the matching data with the matching data itself.
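
As a concrete illustration of replacing matching data with shorter representations, the following uses Python's standard zlib module (which implements DEFLATE); the sample data is ours and is not part of the patent.

```python
import zlib

# A repetitive byte stream: DEFLATE replaces repeated sequences with
# short back-references, so matching data shrinks markedly.
data = b"HEADER|payload|" * 100

compressed = zlib.compress(data)
restored = zlib.decompress(compressed)   # decompression recovers the data
```

The compressed form is a small fraction of the original 1,500 bytes, and decompression restores the byte stream exactly, which is the lossless round trip the paragraph describes.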

Existing real-time solutions aggregate and compress inbound packets on a first-in, first-out basis regardless of whether the compression is optimal. As a result, in some cases, compression actually increases the size of the data. Some existing solutions aggregate and compress packet headers based on robust header compression (ROHC), specified in IETF standard RFC 6846. ROHC only compresses the header portion of the packet. ROHC does not compress the payload portion of the packet, which means packets with matching payloads will not be compressed. These existing real-time solutions do not generate outbound packets based on the best possible compression configuration.

SUMMARY

Disclosed is a data compression method that performs an initial analysis on each of a plurality of candidate combinations of blocks containing inbound packets of network traffic data.

In some embodiments, the method determines an initial analysis result for each of the candidate combinations and compares the initial analysis results of the candidate combinations. The method determines a desired analysis result based on the comparison and selects a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result. The method forwards the selected subset of blocks for combining into outbound payload data.

In some embodiments, the method also combines the selected subset of blocks into the outbound payload data. The selected subset of blocks may comprise a pre-determined number of blocks, and the number may be determined based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet. In some embodiments, the method adds a header to the outbound payload data to form an outbound packet.

In some embodiments, the initial analysis comprises a preliminary compression and the method compresses the outbound payload data according to a compression technique associated with the initial analysis. The compression may be performed on both the header and payload portions of the inbound packets contained within the outbound payload data. In some embodiments, prior to compression, the method determines if the compression would cause the size of the outbound payload data to increase and, if so, leaves the outbound payload data uncompressed.

In some embodiments, the method evaluates a first compression technique during the initial analysis of a current set of blocks and determines an amount of time from receipt of the inbound packets of the current set of blocks to transmission of a corresponding outbound packet. If the amount of time exceeds a maximum delay, the method evaluates a second, faster compression technique during the initial analysis of a next set of blocks.

In some embodiments, the method performs the initial analysis on a minimum number of the candidate combinations necessary to satisfy a desired analysis result such that fewer than all possible combinations of a pre-determined number of blocks from the set are analysed. In alternative embodiments, the method generates the candidate combinations according to a round-robin technique such that the initial analysis is performed on all possible combinations of a pre-determined number of blocks from the set.

In some embodiments, the initial analysis is performed in parallel by a plurality of computing resources. Optionally, the number of computing resources can be dynamically scaled relative to the rate at which the method receives the blocks.

In some embodiments, the method forms the blocks by aggregating inbound packets into a current one of the blocks if the current block can accommodate the inbound packets within a desired block size. The aggregating occurs in real-time as the inbound packets are received. The method aggregates the inbound packets into a next one of the blocks if the current block cannot accommodate the inbound packets within the desired block size. The method may determine the desired block size based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet, based on an amount of computing resources available, or both.

In some embodiments, a network node includes one or more processors and memory. The memory contains instructions executable by the one or more processors, whereby the network node is operable to perform an initial analysis on candidate combinations of blocks of inbound packets. The network node determines an initial analysis result for each of the candidate combinations and compares the initial analysis results associated with two or more of the candidate combinations to determine a desired analysis result. The network node selects a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result and forwards the selected subset of blocks for combining into outbound payload data.

In some embodiments, the initial analysis comprises a preliminary compression and the network node compresses the outbound payload data according to a compression technique associated with the initial analysis. The compression may be performed on both the header and payload portions of the inbound packets contained within the outbound payload data. In some embodiments, prior to compression, the network node determines if the compression would cause the size of the outbound payload data to increase and, if so, leaves the outbound payload data uncompressed.

In some embodiments, the network node evaluates a first compression technique during the initial analysis of a current set of blocks and determines an amount of time from receipt of the inbound packets of the current set of blocks to transmission of a corresponding outbound packet. If the amount of time exceeds a maximum delay, the network node evaluates a second, faster compression technique during the initial analysis of a next set of blocks.

In some embodiments, the network node performs the initial analysis on a minimum number of the candidate combinations necessary to satisfy a desired analysis result such that fewer than all possible combinations of a pre-determined number of blocks from the set are analysed. In alternative embodiments, the network node generates the candidate combinations according to a round-robin technique such that the initial analysis is performed on all possible combinations of a pre-determined number of blocks from the set.

In some embodiments, the initial analysis is performed in parallel by a plurality of computing resources. Optionally, the number of computing resources can be dynamically scaled relative to the rate at which the network node receives the inbound packets.

In some embodiments, the network node forms the blocks by aggregating inbound packets into a current one of the blocks if the current block can accommodate the inbound packets within a desired block size. The aggregating occurs in real-time as the inbound packets are received. The network node aggregates the inbound packets into a next one of the blocks if the current block cannot accommodate the inbound packets within the desired block size. The network node may determine the desired block size based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet, based on an amount of computing resources available, or both.

In some embodiments, a system for data compression includes an analyzer, an aggregation unit, and a compressor. The analyzer performs an initial analysis on candidate combinations of the blocks of inbound packets to determine an initial analysis result for each of the candidate combinations. The analyzer then compares the initial analysis results associated with two or more of the candidate combinations and determines a desired analysis result based on the comparison. The analyzer also selects a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result and forwards the selected subset of blocks to the aggregation unit. The aggregation unit combines the selected subset of blocks into outbound payload data and sends the outbound payload data to the compressor. The compressor compresses the outbound payload data according to a compression technique analyzed in the initial analysis.

In some embodiments, a non-transitory computer readable storage medium comprises logic. The logic, when executed by a processor, is operable to perform an initial analysis on each of a plurality of candidate combinations of the blocks, each block comprising one or more inbound packets of network traffic data. The logic determines an initial analysis result for each of the candidate combinations and compares the initial analysis results associated with two or more of the candidate combinations. The logic determines a desired analysis result based on the comparison, selects a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result, and forwards the selected subset of blocks for combining into outbound payload data.

Those skilled in the art will appreciate the scope of the present disclosure and realize additional aspects thereof after reading the following detailed description of the embodiments in association with the accompanying drawing figures.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the disclosure, and together with the description serve to explain the principles of the disclosure.

FIGURE 1 illustrates a block diagram of a system for dynamic packet aggregation and compression according to one embodiment of the present disclosure;

FIGURES 2A-2B show a flow diagram of a method for dynamic packet aggregation and compression according to one embodiment of the present disclosure;

FIGURE 3 illustrates generating candidate combinations of blocks for dynamic packet aggregation and compression according to one embodiment of the present disclosure; and

FIGURE 4 illustrates a block diagram of one or more modules described with respect to FIGURE 1, according to one embodiment of the present disclosure.

DETAILED DESCRIPTION

FIGURE 1 illustrates a block diagram of a system for dynamic packet aggregation and compression according to one embodiment of the present disclosure. The system includes modules comprising one or more controllers 110, one or more analyzers 120 (shown as 120a, 120b, and 120n), one or more aggregation units 130, one or more compressors 140 (shown as compressors 140a, 140n), and/or one or more encapsulation units 150. In general, controller 110 receives inbound packets of network traffic and creates blocks from the inbound packets. A current block may be created by aggregating inbound packets in real time as they are received. If the current block cannot accommodate additional inbound packets within a desired block size, the additional inbound packets may be aggregated in the next block.

For example, controller 110 may create block A from inbound packets P(1) through P(i), block B from inbound packets P(i+1) through P(j), block C from inbound packets P(j+1) through P(k), and so on. Controller 110 may begin filling the current block A with packet P(1) and continue filling block A as the packets are received. Upon receipt of a packet that cannot be accommodated within the desired block size of block A, such as packet P(i+1) in the example, controller 110 may begin filling the next block, block B, with packet P(i+1) and continue filling block B as the packets are received. Upon receipt of a packet that cannot be accommodated within the desired block size of block B, such as packet P(j+1) in the example, controller 110 may begin filling the next block, block C, with packet P(j+1) and continue filling block C as the packets are received. Controller 110 then sends a set of blocks, such as blocks A, B, C, and D, to analyzer 120.
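
The block-filling loop described above can be sketched as follows. This is a simplified illustration, not the patent's implementation: the `form_blocks` helper and the byte-string packets are hypothetical, and real inbound packets would carry headers and payloads.

```python
def form_blocks(packets, desired_block_size):
    """Aggregate inbound packets into blocks, starting a new block
    whenever the current one cannot accommodate the next packet
    within the desired block size."""
    blocks = []
    current, current_size = [], 0
    for pkt in packets:
        if current and current_size + len(pkt) > desired_block_size:
            blocks.append(current)          # current block is full
            current, current_size = [], 0   # begin filling the next block
        current.append(pkt)
        current_size += len(pkt)
    if current:
        blocks.append(current)              # flush the last partial block
    return blocks

# Packets of 4, 4, and 2 bytes with a desired block size of 8:
# the first two fill block A; the third starts block B.
blocks = form_blocks([b"aaaa", b"bbbb", b"cc"], desired_block_size=8)
```

With these sizes, `blocks` is `[[b"aaaa", b"bbbb"], [b"cc"]]`: the third packet would overflow the 8-byte limit, so it begins the next block, mirroring the hand-off from block A to block B above.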

Analyzer 120 receives the blocks from controller 110 and performs an initial analysis on candidate combinations of the blocks. The candidate combinations include two or more of the blocks. In the example, candidate combinations could include AB, BC, CD, ABC, BCD, or other suitable combinations of the blocks. In some embodiments, the candidate combinations include a pre-determined number of blocks. As an example, if the predetermined number of blocks is set to two, the candidate combinations may include AB, BC, CD, and/or other combinations of two blocks. If the pre-determined number is set to three, the candidate combinations may include ABC, BCD, and/or other combinations of three blocks.
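
Generating every candidate combination of a pre-determined number of blocks can be sketched with Python's standard `itertools`; this is merely one illustrative way to enumerate the combinations the paragraph describes, not a prescribed implementation.

```python
from itertools import combinations

blocks = ["A", "B", "C", "D"]

# Candidate combinations with the pre-determined number set to two...
pairs = ["".join(c) for c in combinations(blocks, 2)]
# ...and with the pre-determined number set to three.
triples = ["".join(c) for c in combinations(blocks, 3)]
```

Here `pairs` is `['AB', 'AC', 'AD', 'BC', 'BD', 'CD']` and `triples` is `['ABC', 'ABD', 'ACD', 'BCD']`, covering the AB, BC, CD and ABC, BCD combinations named in the example.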

Performing the initial analysis yields an initial analysis result that can be used to determine how good the candidate combination is relative to other candidate combinations. In some embodiments, the initial analysis is used to analyze a preliminary compression of the blocks in the candidate combination. The initial analysis uses any suitable compression technique, such as a dictionary-based lossless compression technique or a non-dictionary-based lossless compression technique (such as predictive compression or pattern-based compression), for example Huffman coding, adaptive Huffman coding, arithmetic coding, or another compression technique. In some embodiments, the initial analysis could evaluate multiple compression techniques for the same set of blocks to determine an optimal compression technique.

For an initial analysis of a preliminary compression, the initial analysis result could be a compression score, such as a compression ratio for the candidate combination. The compression ratio may refer to the ratio between the uncompressed size and the compressed size of the candidate combination. As an example, if candidate combination AB compresses from 20 MB to 2 MB, the compression ratio would be 20/2 = 10 (or 10:1). If candidate combination BC compresses from 20 MB to 5 MB, the compression ratio would be 20/5 = 4 (or 4:1).
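
The compression-ratio score can be sketched with a preliminary compression pass. In this illustration, zlib's DEFLATE stands in for whichever technique the initial analysis evaluates, and the `compression_ratio` helper and sample data are hypothetical.

```python
import zlib

def compression_ratio(candidate_blocks):
    """Ratio of uncompressed size to compressed size for a
    candidate combination, using a preliminary compression pass."""
    data = b"".join(candidate_blocks)
    return len(data) / len(zlib.compress(data))

# Two blocks holding highly repetitive traffic compress well
# together, so the candidate combination scores a high ratio.
ratio = compression_ratio([b"voice-frame " * 40, b"voice-frame " * 40])
```

A combination of blocks with much matching data yields a high ratio (well above 1), while incompressible blocks score close to 1 or even slightly below it, which is what the comparison step exploits.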

Analyzer 120 compares the initial analysis results of the candidate combinations to determine a desired analysis result. The desired analysis result allows for selecting an optimal combination of blocks. In the example, analyzer 120 compares candidate combination AB's compression ratio of 10 to candidate combination BC's compression ratio of 4 and determines that 10 is the desired analysis result for the set of blocks. That is, of the compression ratios that are possible from the analyzed candidate combinations, the compression ratio of 10 provides a more optimal reduction in data size than the compression ratio of 4. In some embodiments, analyzer 120 may generate a map or tree of compression scores to determine the desired analysis result. The desired analysis result could be set such that multiple candidate combinations meet the desired analysis result. As an example, if the compression ratios for a set of candidate combinations include the values 2, 4, 5, 8, 2, 10, and 9, a compression ratio greater than or equal to 8 could be set as the desired analysis result for the set.
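
Deriving the desired analysis result from the comparison can be as simple as taking the best score, or setting a threshold that several combinations may meet. The following sketch uses the example ratios from this section; the variable names are ours, not the patent's.

```python
# Compression ratios from the initial analysis of two candidates.
results = {"AB": 10.0, "BC": 4.0}

# The desired analysis result is the best ratio seen in the comparison,
# and the combinations satisfying it are kept.
desired = max(results.values())
selected = [combo for combo, r in results.items() if r >= desired]

# Alternatively, a threshold lets multiple combinations qualify,
# as with the set of scores 2, 4, 5, 8, 2, 10, 9 and a cutoff of 8.
scores = [2, 4, 5, 8, 2, 10, 9]
qualifying = [s for s in scores if s >= 8]
```

Here `selected` contains only `"AB"`, matching the example where AB's ratio of 10 beats BC's ratio of 4, while `qualifying` keeps the three scores of 8 and above.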

After performing the initial analysis, analyzer 120 forwards the blocks, the initial analysis results, and/or the desired analysis results to aggregation unit 130. Aggregation unit 130 selects a subset of blocks to combine out of the analyzed set of blocks A, B, C, and D. Aggregation unit 130 selects the subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result. In some embodiments, aggregation unit 130 navigates a score map to determine an optimal combination of blocks to combine. In the above example, aggregation unit 130 determines to combine block B with block A (rather than block C) because the candidate combination AB has a better compression ratio than the candidate combination BC. Aggregation unit 130 then combines the selected blocks AB into outbound payload data.

Aggregation unit 130 sends the outbound payload data to compressor 140 for compression. Compressor 140 compresses the outbound payload data using the compression technique analyzed by analyzer 120 during the initial analysis. Continuing with the preceding example, if the initial analysis analyzed Huffman coding to generate the compression ratio of 10 that led to the selection of AB as an optimal combination, then compressor 140 applies Huffman coding to the outbound payload data created from the combination of blocks A and B. As described above, the outbound payload data is created from blocks and the blocks are in turn created from inbound packets. The inbound packets include inbound packet headers and inbound packet payloads. In some embodiments, compressor 140 applies compression to both the inbound packet headers and the inbound packet payloads of the inbound packets comprised within the outbound payload data.
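
The hand-off from analysis to final compression might look like the following sketch, in which a technique name recorded during the initial analysis selects the compressor applied to the combined payload. The registry, the technique names, and the `compress_payload` helper are hypothetical; zlib and bz2 merely stand in for the techniques the analyzer might evaluate.

```python
import bz2
import zlib

# Hypothetical registry: a technique name recorded by the analyzer
# mapped to the compressor later applied to the outbound payload data.
TECHNIQUES = {
    "deflate": zlib.compress,
    "bzip2": bz2.compress,
}

def compress_payload(selected_blocks, technique):
    # The blocks already contain whole inbound packets, so the inbound
    # packet headers and payloads are compressed together here.
    payload = b"".join(selected_blocks)
    return TECHNIQUES[technique](payload)

out = compress_payload([b"hdr1|pay1", b"hdr2|pay2"], "deflate")
```

Because the same technique that produced the winning initial analysis result is applied to the final payload, the compression observed during analysis carries over to the outbound packet.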

Compressor 140 sends the compressed outbound payload data to encapsulation unit 150. Encapsulation unit 150 forms an outbound packet by adding a header to the compressed outbound payload data. As an example, the header may contain a destination address of a network node to which the outbound packet is being sent. Encapsulation unit 150 then transmits the outbound packet.

Modules 110, 120, 130, 140, and 150 may be configured together on dedicated hardware or separated across multiple pieces of hardware communicatively coupled by a network (e.g., in a cloud-based implementation). Although the preceding example describes an embodiment in which certain modules perform certain functionality, in other embodiments the functionality may be performed by any suitable module or combination of modules. Additionally, in some embodiments, the number of modules may be dynamically scaled to optimize the amount of time from receipt of an inbound packet to transmission of a corresponding outbound packet. As an example, as the rate of receipt of inbound packets increases, the system can be scaled to include more analyzer 120 modules, more compressor 140 modules, or more of the other modules (although, for simplicity, certain embodiments may include only one controller 110 and one encapsulation unit 150). The added modules may process the inbound packets in parallel to maintain an optimal time from receipt of an inbound packet to transmission of a corresponding outbound packet.

As described above, controller 110 may create blocks from inbound packets. A block is made by combining inbound packets together until the block reaches a desired block size or until combining an additional inbound packet would cause the block to exceed the desired block size. That is, the data collected may form blocks equal to or smaller than the desired block size, but not larger. As a result, incoming data that does not fit within the desired block size may have to span multiple blocks. The desired block size is divisible by the maximum transmission unit (MTU) or a number smaller than the MTU. The MTU relates to the transmission of outbound packets being generated from the inbound packets. Controller 110 places the block in a buffer of blocks waiting to be processed. Controller 110 controls the flow of data from the network to available analyzers 120. Once an analyzer 120 is available and the number of blocks required to form a set of blocks has been collected in the buffer, the set of blocks is sent to analyzer 120. The number of blocks in the set may be a multiple of the number of blocks that it takes to form an outbound packet. For example, for an analyzer to analyze 5 payloads of 5 blocks each, a total of 25 blocks would be sent.

Controller 110 may also be responsible for calculating the optimal block size, the values for other parameters that affect the amount of time it takes to analyze the optimal combination of blocks (e.g., parameters that affect the amount of time it takes to analyze the combination of blocks having an optimal compression), and the values for any other suitable parameters. For example, if blocks are taking too long to process, controller 110 might reduce the number of payloads being compared or decrease the number of blocks needed to form a payload. In some cases where build-up in the buffer causes excessive delays in the system, controller 110 may configure the parameters to simply aggregate and compress the blocks without reordering (optimizing) them.

Analyzer 120 takes the set of blocks collected in the buffer and attempts to find the optimal combination of blocks, such as the combination of blocks that yields the highest possible compression, or the best compression achievable within the given parameters. In some embodiments, all possible combinations of a set of blocks may be compared in order to determine the best combination. For example, if analyzer 120 analyzes combinations involving 2 blocks per combination, then comparing all possible combinations of a number (N) of blocks in a set would require a number (K) of comparisons, where K = (N*(N-1))/2. For example, suppose N is equal to three (blocks A, B, and C). Comparing all possible combinations of the three blocks (with two blocks per combination) would require three comparisons (i.e., (3*(3-1))/2 = 3). In the example, the three combinations (K) would be AB, BC, and AC.
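The K = (N*(N-1))/2 count can be checked directly; the snippet below is illustrative only, using Python's standard library to enumerate the pairs.

```python
from itertools import combinations

def pairwise_comparisons(n):
    """Number of comparisons for n blocks, two blocks per combination."""
    return n * (n - 1) // 2

# For N = 3 blocks, K = (3*(3-1))/2 = 3 comparisons: AB, AC, BC.
pairs = list(combinations(["A", "B", "C"], 2))
```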

This approach may be impractical if the number of combinations is high. In some cases, a better approach may be to compress all the blocks individually at the beginning and then compare blocks to each other by compressing them together and checking the resulting change in size. Because the comparison may still be time critical, further limits may need to be added to the search, such as the number of times a block can be compared to another. In some embodiments, the number of times a block can have a high compression ratio with any number of blocks may be limited. For example, if the number of blocks per payload is 2, there is no need to compare A to C if both AB and DA have already been determined to have a good compression ratio. Any other limits could be used to help make efficient use of resources when demand is high. This might entail building a common dictionary after comparing a block to several other blocks and using the common dictionary to search the remaining blocks for other collisions. Also, this process can be run on multiple threads, or across multiple PCs or processors, by comparing results across multiple starting points for each thread.

Analyzer 120 may generate a compression score (e.g., a compression ratio) for the analyzed candidate combinations of blocks. The compression score may be included in a map or tree of all or some of the other compression scores to facilitate identifying the candidate combinations with relatively good compression results. The map or tree could also be used to identify other candidate combinations to analyze. For example, if AB yields a high compression ratio and BC yields a high compression ratio, analyzer 120 may analyze whether A compressed with C would yield an even better ratio. Alternate methods might include comparing more than two blocks at once, such as comparing blocks A, B, and C together. Other methods might be used for comparing blocks and creating a form of scoring. Also, depending on the configuration or implementation, trees from multiple analyzers 120 may need to be combined into a single tree. For example, analyzers 120a and 120b may share the work of analyzing the combinations of blocks A, B, C, and D. Analyzer 120a may generate a first tree with combinations AB and CD. Analyzer 120b may generate a second tree with combinations BC and AD. The first and second trees could then be combined into a single tree.
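One way to picture the scoring described above is to give each candidate pair a compression ratio and keep the scores in a map keyed by the pair. The sketch below is a hypothetical illustration using zlib as a stand-in compression technique; the patent does not prescribe a particular compressor.

```python
import zlib
from itertools import combinations

def compression_score(a, b):
    """Score a candidate combination by its compression ratio (higher is better)."""
    combined = a + b
    return len(combined) / len(zlib.compress(combined))

blocks = {
    "A": b"abcabcabc" * 200,     # repetitive, similar to B
    "B": b"abcabcabc" * 200,
    "C": bytes(range(256)) * 7,  # different content
}
# Map each candidate pair to its score, as in the map/tree of compression scores.
scores = {pair: compression_score(blocks[pair[0]], blocks[pair[1]])
          for pair in combinations(blocks, 2)}
best = max(scores, key=scores.get)  # the pair with the best preliminary ratio
```

Similar blocks (A and B) compress best together, which is exactly the signal the map or tree is meant to surface.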

Aggregation unit 130 may walk through the map or tree to determine an optimal combination of blocks. In some embodiments, aggregation unit 130 walks through the map or tree using a tree search algorithm (such as Prim's algorithm) to find the best possible overall compression ratio or the best score for each outbound packet. The search depth is the number of blocks it takes to form the MTU. To form the MTU, the number of blocks (N) = (MTU Size - Header Size)/Block Size. Once the optimal combination of blocks has been found, aggregation unit 130 combines the blocks into output payload data. The output payload data size is ideally less than or equal to the MTU size minus the header size (e.g., the size of the header being added to the output payload data to form the outbound packet). Compressor 140 compresses the output payload data using the same compression technique as in the preceding analysis steps. If the compression increases the size of the payload, the payload should be sent without compression. Encapsulation unit 150 adds the header to the output payload data to form the outbound packet. Encapsulation unit 150 then sends the outbound packet to the next hop, either via controller 110, compressor 140, encapsulation unit 150 itself, or another suitable module.
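The payload-sizing arithmetic and the compress-or-passthrough rule above can be sketched as follows. The MTU, header size, and use of zlib are assumptions for illustration; the 64 kB MTU and 6 kB blocks echo the worked example later in this description.

```python
import zlib

MTU_SIZE = 64 * 1024   # assumed jumbo-frame MTU (64 kB)
HEADER_SIZE = 40       # assumed outbound header size
BLOCK_SIZE = 6000      # assumed desired block size

# N = (MTU Size - Header Size) / Block Size blocks per outbound payload.
blocks_per_payload = (MTU_SIZE - HEADER_SIZE) // BLOCK_SIZE

def make_outbound_payload(blocks):
    """Combine blocks; keep the compressed form only if it is actually smaller."""
    payload = b"".join(blocks)
    compressed = zlib.compress(payload)
    return compressed if len(compressed) < len(payload) else payload
```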

Once the outbound packet has been sent, controller 110 is notified of the total delay to process and send the outbound packet. Controller 110 can then adjust parameters to better optimize analyzer 120 or make analyzer 120 spend less time analyzing blocks (e.g., spend less time performing the preliminary compression and comparing compression scores of candidate combinations of blocks). This can be achieved by increasing the size of the blocks, decreasing the overall number of blocks compared, or changing any other parameters that analyzer 120 uses. Controller 110 can also adjust parameters to make compressor 140 more efficient. For example, during the times where the total delay exceeds a maximum delay threshold, controller 110 can instruct compressor 140 to use a faster compression technique, to pass through data uncompressed, or to share the workload with another compressor 140.

The decompression may be handled by a separate system (e.g., the destination node or another node along the path). The separate system may include a decompressor to decompress the jumbo frame and a router to re-assemble the packets in sequence.

FIGURES 2A-2B show a flow diagram of a method 200 for dynamic packet aggregation and compression according to one embodiment of the present disclosure. At step 202, controller 110 receives a plurality of inbound packets of network traffic data. At step 204, controller 110 aggregates the inbound packets into blocks. In some embodiments, controller 110 aggregates the inbound packets in real time as they are received. Controller 110 aggregates inbound packets into a current one of the blocks if the current block can accommodate the inbound packets within a desired block size. If the current block cannot accommodate the inbound packets within the desired block size, controller 110 aggregates the inbound packets into the next block.

Controller 110 adds the blocks to a block buffer that collects a set of blocks to be analyzed. At step 206, controller 110 determines whether resources (e.g., analyzers 120) are available to analyze a set of blocks. If resources are not available, controller 110 may hold the blocks in the buffer until resources become available or return to step 202 to reprocess the inbound packets of the unanalyzed blocks. If resources are available, controller 110 sends the set of blocks to analyzer 120 to perform the initial analysis in step 208. Analyzer 120 performs the initial analysis on each of a plurality of candidate combinations of the blocks. As described with respect to steps 202-204, each block comprises one or more inbound packets of network traffic data.

In some embodiments, analyzer 120 performs the initial analysis as described above with respect to FIGURE 1. Analyzer 120 may perform the initial analysis on the minimum number of candidate combinations necessary to satisfy a desired analysis result. For example, suppose that the set of blocks being analyzed by analyzer 120 corresponds to blocks A, B, C, D, and E and the pre-determined number of blocks per candidate combination is configured as two. Analyzer 120 potentially has 10 candidate combinations of blocks to analyze ((5*(5-1))/2 = 10). Suppose that the desired analysis result is a compression ratio greater than or equal to 4. Analyzer 120 may analyze candidate combination AB, determine that the compression ratio is equal to 2, and may therefore determine that the desired analysis result is not satisfied. Analyzer 120 may next analyze candidate combination AC, determine that the compression ratio is equal to 5, and may therefore determine that the desired analysis result is satisfied. Thus, analyzer 120 need not proceed with analyzing the other candidate combinations involving block A or block C, which means fewer than all 10 possible combinations of a pre-determined number of blocks (e.g., 2 blocks per candidate combination) from the set are analyzed. In some other embodiments, analyzer 120 may generate more combinations by working with larger sets of blocks (e.g., blocks A, B, C, D, E, and F, with 3 or more blocks combined into each candidate combination).
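The early-exit behavior in this example can be sketched as follows; zlib and the helper names are stand-ins, since the patent leaves the compression technique open.

```python
import hashlib
import zlib
from itertools import combinations

def preliminary_ratio(a, b):
    """Preliminary compression ratio of two blocks compressed together."""
    combined = a + b
    return len(combined) / len(zlib.compress(combined))

def first_satisfying_pair(blocks, desired_ratio):
    """Analyze candidate pairs in order and stop at the first one that
    satisfies the desired analysis result, rather than scoring all pairs."""
    for a, b in combinations(sorted(blocks), 2):
        if preliminary_ratio(blocks[a], blocks[b]) >= desired_ratio:
            return (a, b)
    return None

# Deterministic incompressible filler stands in for unrelated traffic.
noise = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(64))
blocks = {"A": b"hello" * 400, "B": noise, "C": b"hello" * 400}
pair = first_satisfying_pair(blocks, desired_ratio=4)  # AB fails, then AC succeeds
```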

In other embodiments, analyzer 120 may generate all 10 candidate combinations of blocks A, B, C, D, and E according to a round-robin technique and perform the initial analysis on all possible combinations of the pre-determined number of blocks (e.g., 2 blocks per candidate combination) from the set. A round-robin technique is further described with respect to FIGURE 3 below. Using a round-robin technique may allow for determining the best of all possible candidate combinations of the blocks, but the trade-off is that it may require more time and processing resources.

In some embodiments, the initial analysis is performed in parallel by a plurality of computing resources, such as 2, 3, 4, or n analyzers 120. The number of computing resources is dynamically scaled relative to the rate at which the inbound traffic is received. Thus, resources can be added when the system load is high and resources can be removed when the system load is low.

Analyzer 120 proceeds to step 210 to determine an initial analysis result for each of the candidate combinations being analyzed. As an example, the initial analysis may comprise a preliminary compression and the initial analysis result may be a compression score, such as a compression ratio. At step 212, analyzer 120 compares the initial analysis results associated with two or more of the candidate combinations and at step 214, analyzer 120 determines a desired analysis result based on the comparison. For example, if the candidate combination of blocks A and B yields a compression ratio of 10 and the candidate combination of blocks B and C yields a compression ratio of 4, comparing the AB compression ratio to the BC compression ratio results in a determination that the desired analysis result for the comparison is a compression ratio of 10.

At step 216, analyzer 120 selects a subset of blocks for which the initial analysis result of the candidate combination satisfies the desired analysis result. Continuing with the example: of the set of blocks A, B, C, and D, the initial analysis result of the candidate combination AB indicated that the subset of blocks A and B satisfies the desired analysis result (the compression ratio of 10 determined in step 214). At step 218, analyzer 120 forwards the selected subset of blocks to aggregation unit 130 for combining into outbound payload data. Aggregation unit 130 combines the selected subset of blocks (A and B) into the outbound payload data in step 220.

Aggregation unit 130 sends the outbound payload data to compressor 140. Compressor 140 compresses the outbound payload data at step 222. Compressor 140 uses the same compression technique that analyzer 120 used during the preliminary compression of the initial analysis. In some embodiments, before compressing the outbound payload data, compressor 140 may determine whether compressing the outbound payload data would increase its size. If so, compressor 140 leaves the outbound payload data uncompressed.

Compressor 140 may compress both the inbound packet headers and the inbound packet payloads of the inbound packets that are comprised within the outbound payload data. For example, if the outbound payload data was created by combining block A and block B, and block A was created by aggregating inbound packets P(1) through P(i) and block B was created by aggregating inbound packets P(i+1) through P(j), compressor 140 compresses the header and payload portions of inbound packets P(1) through P(i) and P(i+1) through P(j) comprised within the outbound payload data.

Compressor 140 sends the compressed packets to encapsulation unit 150. At step 226, encapsulation unit 150 encapsulates the outbound payload data in an outbound packet. For example, encapsulation unit 150 adds a header to the outbound payload data to form the outbound packet. Encapsulation unit 150 then transmits the outbound packet to the next hop at step 228. In some embodiments, encapsulation unit 150 transmits the outbound packets via controller 110.

At step 230, controller 110 determines a time delay from receiving the inbound packets (e.g., P(1) through P(i) and P(i+1) through P(j)) to transmitting the corresponding outbound packet (e.g., the outbound packet generated from inbound packets P(1) through P(i) and P(i+1) through P(j)). Controller 110 may use any suitable timing mechanism for determining the delay. Any suitable time delay may be used, such as an average delay of the packets, a median delay of the packets, a delay of the fastest packet, or a delay of the slowest packet. At step 232, controller 110 compares the calculated delay to a maximum delay threshold and/or a minimum delay threshold. If the calculated delay is less than the maximum delay and greater than the minimum delay, controller 110 returns to step 202 and applies the existing configuration to newly received inbound packets. If the calculated delay exceeds the maximum delay or is less than the minimum delay, controller 110 adjusts the configuration settings at step 234.

If the calculated delay exceeds the maximum delay, controller 110 may change the configuration settings to speed up processing. As an example, controller 110 may change the configuration from a slower compression technique to a faster compression technique according to the steps of: evaluating a first compression technique during the initial analysis of a current set of blocks; determining that the amount of time from receipt of the inbound packets of the current set of blocks to transmission of a corresponding outbound packet exceeds a maximum delay; and, in response, evaluating a second, faster compression technique during the initial analysis of a next set of blocks.

As another example, to speed up processing, controller 110 may increase the desired block size. The desired block size determines the number of inbound packets aggregated in sequence as they arrive at step 202. As another example, the number of blocks per set can be decreased such that the analyzer receives fewer blocks at step 208. With fewer blocks, there are fewer candidate combinations to analyze, so the delays can be reduced. As another example, the number of combinations to analyze can be reduced. This may be done by turning off a round-robin configuration or by relaxing the desired analysis result (e.g., if the compression ratio for a satisfactory combination is reduced from 10 to 5, analyzer 120 is more likely to find a satisfactory combination in fewer tries). As another example, controller 110 may increase the number of blocks per combination. That is, the number of blocks combined into outbound payload data may comprise a pre-determined number of blocks, such as two blocks (e.g., AB or BC) or three blocks (e.g., ABC or BCD), and the pre-determined number of blocks can be determined based on a time delay from receipt of a prior inbound packet to transmission of a corresponding outbound packet. As yet another example, controller 110 may increase the number of parallel computing resources by adding more analyzers 120, aggregation units 130, compressors 140, etc.
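The delay-driven adjustments in steps 230-234 can be sketched as a simple feedback rule. The parameter names and the specific step sizes below are hypothetical; the patent only requires that the configuration move toward faster processing when the maximum delay is exceeded and toward more thorough analysis when there is headroom.

```python
def adjust_config(cfg, delay_ms, max_delay_ms, min_delay_ms):
    """Return an adjusted copy of the configuration based on measured delay."""
    cfg = dict(cfg)
    if delay_ms > max_delay_ms:
        # Too slow: larger blocks, smaller sets, easier-to-satisfy target ratio.
        cfg["desired_block_size"] *= 2
        cfg["blocks_per_set"] = max(cfg["blocks_per_set"] // 2,
                                    cfg["blocks_per_payload"])
        cfg["desired_ratio"] = max(cfg["desired_ratio"] / 2, 1.0)
    elif delay_ms < min_delay_ms:
        # Headroom: search harder for better combinations.
        cfg["blocks_per_set"] *= 2
        cfg["desired_ratio"] *= 2
    return cfg

cfg = {"desired_block_size": 6000, "blocks_per_set": 50,
       "blocks_per_payload": 10, "desired_ratio": 10.0}
faster = adjust_config(cfg, delay_ms=8.0, max_delay_ms=5.0, min_delay_ms=1.0)
```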

If the calculated delay is less than the minimum delay, controller 110 may change the configuration settings to slow down processing by taking the opposite actions to those described with respect to speeding up processing in the preceding paragraphs. Reasons for slowing down processing include freeing up resources or allowing the system to spend more time finding a more optimal combination of blocks.

Controller 110 may also adjust the parameters based on the amount of computing resources available. For example, if more analyzers 120 are available, the controller could decrease the desired block size, configure a slower compression technique (which may be able to generate a more optimal compression than a faster compression technique), increase the number of blocks per set, increase the number of combinations to analyze, and/or decrease the number of blocks per combination.

After adjusting the configuration settings at step 234, controller 110 returns to step 202 to process new inbound packets according to the adjusted configuration.

FIGURE 3 illustrates generating candidate combinations of blocks for dynamic packet aggregation and compression according to one embodiment of the present disclosure. For a given set of blocks, the round-robin technique generates all possible candidate combinations involving a pre-determined number of blocks. For example, the set of blocks in FIGURE 3 corresponds to A, B, C, and D and the pre-determined number of blocks per combination is configured as 2. Applying the round-robin technique, all possible combinations are AB, AC, AD, BC, BD, and CD.

FIGURE 4 illustrates a block diagram of one or more modules described with respect to FIGURE 1, according to one embodiment of the present disclosure. The module includes processor 1020, memory 1030, and interface 1040. Processor 1020 executes instructions to provide some or all of the functionality described herein as provided by a module, and memory 1030 stores the instructions executed by processor 1020.

Processor 1020 may include any suitable combination of hardware and software implemented in one or more integrated circuits or modules to execute instructions and manipulate data to perform some or all of the described functions of a module, such as controller 110, analyzer 120, aggregation unit 130, compressor 140, or encapsulation unit 150. Memory 1030 may generally be operable to store computer executable code and data. Examples of memory 1030 include computer memory (for example, Random Access Memory (RAM) or Read Only Memory (ROM)), mass storage media (for example, a hard disk), removable storage media (for example, a Compact Disk (CD) or a Digital Video Disk (DVD)), and/or any other volatile or non-volatile, non-transitory computer-readable and/or computer-executable memory devices that store information. Other embodiments of a module can include additional components (beyond those shown in FIGURE 4) responsible for providing certain aspects of the module's functionality, including any of the functionality described above and/or any additional functionality (including any functionality necessary to support the solution described above).

Embodiments of the disclosure may have a variety of practical applications, including over data backhaul links and over wired or wireless point-to-point connections. In some embodiments, a single computer can perform all of the functionality described. The Central Processing Unit (CPU) would handle the division of data into blocks. The series of blocks would be passed to a set of Graphics Processing Units (GPU(s)) to perform the initial analysis and comparisons. The comparison results would be passed back to the CPU to aggregate the results and search for an optimal combination of blocks. Then each GPU (or group of GPU(s)) would finish the final compression and pass the output payload data back to the CPU for encapsulation and transmission onto the network.

In other embodiments, clusters of Accelerated Processing Units may be used to fill various roles. Although FIGURE 1 provides a basic configuration of the various roles (control, analysis, aggregation, compression, encapsulation/transmission), the configuration might vary from implementation to implementation. At least one of each of the parts is needed (although the different parts may share hardware or software resources in some embodiments). Multiples of the parts may be configured dynamically depending on the amount of data needing to be processed. The Controller, Aggregator, and Encapsulator are relatively simple processes and may not require significant amounts of processing power to achieve their tasks. So, in most cases, only one of each would be needed (possibly more for increased throughput or for redundancy). In some embodiments, a single machine could run all three of these parts without any foreseeable issues. Both the initial and final routing could be implemented on an existing router.

The remaining parts (the Analyzers and Compressors) require a relatively large amount of processing power and benefit from having multiple processors available to finish the comparisons and compression within an acceptable time for network traffic. Either a dedicated hardware solution could be used or multiple processors could be used across separate hardware (such as cloud-based processors communicatively coupled by a network). It may also be viable to run these functions on existing hardware or blades, depending on whether they possess enough free memory and processing power.

As used in the description, blocks and block sizes may provide a generic way of referring to packets or groups of packets that are smaller than a fraction of the MTU. In some embodiments, jumbograms are used for the output packets and the MTU is 64 kB. The ideal solution would be to compare each incoming packet with each other packet in a given time frame, running a comparison between each pair to find traffic that might be compressible. But when dealing with high speed connections, such as one million packets per second, even if only a small portion of that traffic is compared within a short window with no extra limits, such as 5 ms, around 850 packets would need to be compared and compressed in that window. That would mean that 360825 comparisons would need to be made in a short window of time, and additional time would be needed for the final compression and transmission. In this case, it might be practical to reduce the size of the window used to compare the packets to 2 ms. This would mean analyzing 300 packets and 44850 combinations (without limits).

Again, that can be further reduced by creating blocks of packets. Supposing that the MTU is 64 kB, the MTU would be able to accommodate a total of 10 blocks at 6 kB per block. Suppose each packet is 1500 bytes long or less. That would mean that 4 packets combine into one block. Therefore, 300 packets become 75 blocks with 2775 combinations (without limits). Although using blocks is optional, it provides a practical way to decrease the total number of comparisons needed at the cost of some potential compression.
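The comparison counts used in the two preceding paragraphs follow from K = (N*(N-1))/2 and can be verified directly:

```python
def comparisons(n):
    """Pairwise comparisons for n items, two per combination: (n*(n-1))/2."""
    return n * (n - 1) // 2

print(comparisons(850))  # 5 ms window of packets -> 360825
print(comparisons(300))  # 2 ms window of packets -> 44850
print(comparisons(75))   # same window after grouping 4 packets per block -> 2775
```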

By adding limits and other methods of comparisons, the number of times each packet needs to be fully compressed with another can be reduced greatly.

As an example, limits can be placed on the number of times that a packet can be compared with another. For example, if a limit is set such that a packet can only be compared to half the other packets, then only 1388 comparisons are needed (in the ongoing scenario). Such an approach saves time, but increases the risk of missing potential blocks that could be compressed.

As another example, a limit can be set on the number of times a packet can be compressed past a certain ratio, such as 2:1. This limit cannot be easily expressed because it depends largely on the amount and type of data being compared. This number should be greater than the number of blocks used to make a single MTU. In the ongoing example, it takes 10 blocks to form the MTU. Thus, a value of at least 10 would need to be set.

In some embodiments, GPUs could be used to perform the analysis and compression computations, with each GPU workgroup running either part of or a whole analysis. This method may allow a high number of comparisons to be processed in a relatively short window of time. However, any suitable components may be used to do the analysis and compression, as long as they take less than the desired delay to process and send. According to some embodiments, consideration is given to the fact that analyzers and compressors should not receive more data than they can process in a given time. Because of the large amount of data needing to be processed in a short amount of time, a running analyzer should not be interrupted with attempts to feed it new data, meaning that it may be in a substantially blocked state while it is comparing data. So, if a processing time of 5 ms is desired and the system cannot be backlogged, at least 5 analyzers are needed to have 100% uptime. In a more general case, either a greater number of slower analyzers or a lesser number of faster analyzers could be required.

The next portion is the map/tree search. All of the comparison results should be accessible from one central location. In some embodiments, a Prim's search is done on all of the data to find an optimal combination of blocks. The blocks can be grouped according to the number of blocks per MTU (10). Blocks can be removed or ignored from the search once a tree of at least 10 is formed. At this point, a header could be added and the payload sent for final compression, or the payload could be compressed and a header added afterwards.

Some embodiments of the disclosure may provide one or more technical advantages. Some embodiments may benefit from some, none, or all of these advantages. Other technical advantages may be readily ascertained by one of ordinary skill in the art. A technical advantage of some embodiments includes improved efficiency and better, more consistent performance. By taking the best matches and compressing them, the amount of network traffic can be minimized to allow for higher northbound speeds. Unlike implementations of ROHC with aggregation, certain embodiments compress not only the incoming packet headers, but also the incoming packet payload (i.e., the entire incoming packet is compressed). Some embodiments yield more consistently high compression ratios. Sending more blocks to a given analyzer increases the chances of having compressible data.

A technical advantage of some embodiments is that the system is scalable and dynamic. Resources can be added to it or taken away, and the system should be able to adapt. If the system does not have the required processing power to fully process all incoming traffic, it can change behavior to try to optimize both internal and external performance. All components can be provided by a single machine or spread out across multiple servers/machines, each with multiple processors. For instance, a single machine can perform all the steps, or the work can be spread out with one or more machines dedicated to each role.

Another technical advantage of some embodiments is compatibility with any type of lossless compression, including both dictionary-based and non-dictionary-based compression (such as predictive compression or pattern-based compression). In some embodiments, different types of compression can be analyzed on the same set of packets. Alternatively, different types of compression can be used at different times depending on the system load.

Certain embodiments may be well-suited for low bandwidth point-to-point connections (any point-to-point bandwidth limited connections) by allowing for optimal compression of data in order to meet data caps or reduce data volume through high traffic areas in the network.

Modifications, additions, or omissions may be made to the systems and apparatuses disclosed herein without departing from the scope of the disclosure. The components of the systems and apparatuses may be integrated or separated. Moreover, the operations of the systems and apparatuses may be performed by more, fewer, or other components. Additionally, operations of the systems and apparatuses may be performed using any suitable logic comprising software, hardware, and/or other logic. As used in this document, "each" refers to each member of a set or each member of a subset of a set. Modifications, additions, or omissions also may be made to the methods disclosed herein without departing from the scope of the disclosure. The methods may include more, fewer, or other steps. Additionally, steps may be performed in any suitable order.

The above description of the embodiments does not constrain this disclosure. Other changes, substitutions, and alterations are possible without departing from the spirit and scope of this disclosure, as defined by the following claims.