Title:
DIFFERENCE DATASETS
Document Type and Number:
WIPO Patent Application WO/2017/023332
Kind Code:
A1
Abstract:
In some examples, a system for generating full datasets from reduced datasets includes a communication interface, a data storage, and a processing system. The processing system may use the data storage to store a base dataset and a first difference dataset that includes data specifying differences between the base dataset and a first target dataset. The processing system may generate the first target dataset using the base dataset and the first difference dataset. The system may receive a second difference dataset via the communication interface. The second difference dataset may specify differences between a second target dataset and the base dataset modified in accordance with at least a portion of the first difference dataset. The processing system may generate the second target dataset using the base dataset, the first difference dataset, and the second difference dataset.

Inventors:
HANSON THOMAS WILLIAM (US)
YORK JUSTIN E (US)
Application Number:
PCT/US2015/044013
Publication Date:
February 09, 2017
Filing Date:
August 06, 2015
Assignee:
HEWLETT PACKARD ENTPR DEV LP (US)
International Classes:
G06F17/00; G06F15/16
Domestic Patent References:
WO2002085201A1 2002-10-31
Foreign References:
US20100169405A1 2010-07-01
US20080189350A1 2008-08-07
US20120158910A1 2012-06-21
US8082233B2 2011-12-20
Attorney, Agent or Firm:
COOK, Justin, M. (US)
Claims:
CLAIMS

1. A method comprising:

storing a base dataset and a first difference dataset, wherein the base dataset comprises device configuration data, wherein the first difference dataset comprises data specifying differences between a first target dataset and the base dataset;

generating the first target dataset using the base dataset and the first difference dataset;

receiving a second difference dataset, wherein the second difference dataset comprises data specifying differences between a second target dataset and the base dataset modified by at least a portion of the first difference dataset; and

generating the second target dataset using the base dataset, the first difference dataset, and the second difference dataset.

2. The method of claim 1, further comprising:

storing the second difference dataset.

3. The method of claim 1, wherein generating the first target dataset comprises populating respective portions of the first target dataset with values from either the base dataset or the first difference dataset.

4. The method of claim 1, wherein the differences specified by first difference dataset comprise: (i) a modification to a value in the base dataset, or (ii) a value to append to the base dataset.

5. The method of claim 1, wherein the second difference dataset comprises an indication of a portion of the first difference dataset to ignore, and wherein generating the second target dataset comprises populating respective portions of the second target dataset with values from either the base dataset, the second difference dataset, or a non-indicated portion of the first difference dataset, wherein the non-indicated portion of the first difference dataset comprises a remaining portion of the first difference dataset not indicated by the indication.

6. The method of claim 1, wherein the first target dataset and the second target dataset are configuration datasets for a first configurable component and a second configurable component, respectively, the method further comprising:

configuring the first configurable component using the first target dataset; and

configuring the second configurable component using the second target dataset.

7. A method comprising:

a source computing system generating a first difference dataset based on differences between a first target dataset and a base dataset, wherein the base dataset comprises device configuration data, wherein the first difference dataset comprises data specifying one or more of: (i) a modification to a value in the base dataset, or (ii) a value to append to the base dataset;

sending the first difference dataset to a target computing system;

generating a second difference dataset based on differences between a second target dataset and the base data set modified by at least a portion of the first difference dataset; and

sending the second difference dataset to the target computing system.

8. The method of claim 7, further comprising:

determining that a size of the second difference dataset is less than a threshold; and

sending the second difference dataset to the target computing system in response to determining that the size of the second difference dataset is less than the threshold.

9. The method of claim 7, further comprising:

generating a comparison dataset based on differences between the base dataset and the second target dataset;

determining that a size of the second difference dataset is less than a size of the comparison dataset; and

sending the second difference dataset to the target computing system in response to determining that the size of the second difference dataset is less than the size of the comparison dataset.

10. The method of claim 7, wherein generating the second difference dataset comprises:

identifying a particular portion of the first difference dataset in common with a corresponding portion of the second target dataset; and

generating the second difference dataset based on differences between the second target dataset and the base dataset modified in accordance with the particular portion of the first difference dataset, and wherein the second difference dataset comprises data indicative of the particular portion of the first difference dataset.

11. A system comprising:

a data storage;

a communication interface; and

a processing system communicatively coupled to the data storage and the communication interface, the processing system to:

store a base dataset and a first difference dataset in the data storage, wherein the first difference dataset comprises data specifying one or more of: (i) a modification to a value in the base dataset, or (ii) a value to append to the base dataset;

generate a first target dataset using the base dataset and the first difference dataset;

receive, via the communication interface, a second difference dataset, wherein the second difference dataset comprises data specifying: (a) a particular portion of the first difference dataset, and (b) differences between a second target dataset and the base dataset modified in accordance with the particular portion of the first difference dataset; and

generate the second target dataset using the base dataset, the second difference dataset, and the first difference dataset.

12. The system of claim 11, wherein the processing system is further configured to:

store the second difference dataset in the data storage.

13. The system of claim 11, wherein the processing system is to generate the first target dataset via populating respective portions of the first target dataset with values from either the base dataset or the first difference dataset.

14. The system of claim 11, wherein the second difference dataset comprises an indication of a portion of the first difference dataset to ignore, wherein the particular portion comprises a remaining portion of the first difference dataset not indicated by the indication, and wherein the data specifying the particular portion comprises the indication of the portion to ignore.

15. The system of claim 11, further comprising a first configurable component and a second configurable component; and

wherein the processing system is to:

configure the first configurable component using the first target dataset; and

configure the second configurable component using the second target dataset.

Description:
DIFFERENCE DATASETS

BACKGROUND

[0001] Some computing systems may be located in a data center or another facility while being monitored and managed by a remote computer. Such computing systems may include management interfaces to facilitate configuration of components by the remote computer. For instance, communication ports, adapters, and other components of the computing systems may be configured based on respective datasets provided to the computing systems by the remote computer. The management interface may then apply the received datasets to the components to complete the configuration process.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] The following detailed description references the drawings, wherein:

[0003] FIG. 1A is a block diagram of an example system with a source providing multiple difference datasets to a target for deriving corresponding full datasets;

[0004] FIG. 1B is a block diagram of an example arrangement in which a source provides multiple difference datasets to a target via a data connection within a computing system;

[0005] FIG. 1C is a block diagram of an example system in which a source provides multiple difference datasets to a target via a data connection that includes a communication network;

[0006] FIG. 1D is a block diagram of an example system in which a source provides configuration data to a target using multiple difference datasets;

[0007] FIG. 2A is a block diagram including pictographs representing example difference datasets generated by a source system and sent to a target system;

[0008] FIG. 2B is a block diagram including pictographs representing example difference datasets generated by a source system and sent to a target system;

[0009] FIG. 3A is a flowchart of an example process involving a source system generating difference datasets to communicate multiple datasets to a target system.

[0010] FIG. 3B is a flowchart of an example process involving a target system generating multiple datasets using difference datasets.

[0011] FIG. 4 is a flowchart of an example process involving a source system providing difference datasets to a target system for generation of full datasets;

[0012] FIG. 5 is a flowchart of an example process involving a source system providing difference datasets to a target system for generation of full datasets; and

[0013] FIG. 6 is a block diagram of an example system with a source providing multiple difference datasets to a target for deriving corresponding full datasets.

DETAILED DESCRIPTION

[0014] The following description makes reference to the accompanying drawings, in which similar symbols identify similar components, unless context dictates otherwise. The descriptions herein, as well as the drawings, present examples of the subject matter of the present disclosure and are in no way limiting in regard to the subject matter disclosed herein. Throughout the description, the singular forms of "a", "an", and "the" mean "one or more". Thus, various examples in which a component is described in singular form also apply to examples having multiple of those components. Moreover, some aspects of the examples presented herein may be modified, re-arranged, re-ordered, substituted, combined, and/or separated in a variety of different configurations without departing from the subject matter of the present disclosure.

[0015] Nodes in a communication network are coupled to one another by a communication path, which may include a combination of communication interfaces, data networks, cables, routers, repeaters, data busses, and so on. Data transmission over a given communication path may be limited by a component in the path that offers the lowest bandwidth. The lowest bandwidth link is sometimes referred to as the communication bottleneck. In some cases, bottlenecks may be due to components that are shared between multiple nodes, such as a common data bus that is used to provide communication between a central processing unit and multiple devices controlled by the central processing system. In some cases, bottlenecks may be due to bandwidth-limited communication channels and/or hardware components.

[0016] Some aspects of the present disclosure help to alleviate the demands on communication bottlenecks by reducing the total amount of data that is transmitted by nodes sending multiple datasets that are similar, but not identical. For example, the present disclosure describes a process for sending first, second, and third datasets from a source system to a target system via a bandwidth-limited communication path. The source may send the first dataset in its entirety. Before sending the second dataset, the source may generate a difference dataset, which represents all of the differences between the second dataset and the first dataset. For example, the difference dataset may include data indicating a modification to a value in the first dataset and/or a value to append to the first dataset. The source can then send the difference dataset to the target, rather than the full second dataset, and the target can use the difference dataset in combination with the previously received first dataset to generate the full second dataset. For example, the target may apply the modifications specified by the difference dataset to the first dataset to thereby recover the full second dataset. In cases where the second dataset includes portions that are similar and/or identical to corresponding portions of the first dataset, such a process can help reduce the total bandwidth required to communicate both datasets from the source to the target.
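The exchange described in paragraph [0016] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: datasets are modeled as flat dictionaries, the field names are invented, and the helper names `make_diff` and `apply_diff` are hypothetical rather than anything named in the disclosure. Deletion of fields is not handled here.

```python
def make_diff(first: dict, second: dict) -> dict:
    """Source side: record values to modify in, or append to, the first dataset."""
    diff = {"modify": {}, "append": {}}
    for key, value in second.items():
        if key not in first:
            diff["append"][key] = value      # value to append
        elif first[key] != value:
            diff["modify"][key] = value      # modification to an existing value
    return diff


def apply_diff(first: dict, diff: dict) -> dict:
    """Target side: recover the second dataset without mutating the first."""
    second = dict(first)
    second.update(diff["modify"])
    second.update(diff["append"])
    return second


# Hypothetical configuration fields, used only for illustration.
first_dataset = {"speed": "10G", "mtu": 1500, "vlan": 10}
second_dataset = {"speed": "10G", "mtu": 9000, "vlan": 10, "label": "uplink"}

difference = make_diff(first_dataset, second_dataset)
recovered = apply_diff(first_dataset, difference)
```

Only the changed and appended values travel from source to target; the unchanged fields are taken from the copy of the first dataset that the target already holds.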

[0017] Upon receipt, the target can locally store both the full first dataset and the difference dataset. As noted above, the target can use the difference dataset to generate the full second dataset by modifying the first dataset in accordance with the differences indicated in the difference dataset. For example, the target may generate the second dataset by modifying values in the first dataset and/or appending values to the first dataset.

[0018] In addition, the source can generate and send an additional difference dataset, which represents all of the differences between a third dataset and the first dataset as modified by the previously sent difference dataset. Thus, the additional difference dataset provides the target with information to enable the target to reconstruct the full third dataset. In particular, the target can refer to the locally stored first dataset and the difference dataset and then modify the first dataset as specified by the two difference datasets. Each subsequent communication of a dataset from the source to the target can then be carried out by communicating additional difference datasets that represent further differences in the conveyed dataset relative to some combination of previously transmitted datasets and/or difference datasets. Both the source and the target can store each of the previously transmitted difference datasets to use in generation of additional difference datasets relative to the stored difference datasets (by the source) and reconstruction of full datasets based on the stored difference datasets (by the target).
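The chaining in paragraph [0018] can be illustrated with a short Python sketch. The dictionary representation, the field names, and the helpers `diff_against` and `apply_all` are assumptions made for this example. Each difference dataset is reduced to a mapping of changed or added fields, and the third dataset is expressed relative to the first dataset as already modified by the earlier difference dataset.

```python
_MISSING = object()  # sentinel so that a stored None is not confused with "absent"


def diff_against(reference: dict, target: dict) -> dict:
    """Fields of target that are new or differ from the reference."""
    return {k: v for k, v in target.items() if reference.get(k, _MISSING) != v}


def apply_all(base: dict, diffs: list) -> dict:
    """Apply difference datasets to a copy of the base, in the order given."""
    result = dict(base)
    for d in diffs:
        result.update(d)
    return result


first = {"speed": "10G", "mtu": 1500, "vlan": 10}
second = {"speed": "10G", "mtu": 9000, "vlan": 10}
third = {"speed": "10G", "mtu": 9000, "vlan": 20}

# Source side: each diff is taken against what the target can already rebuild.
second_diff = diff_against(first, second)
third_diff = diff_against(apply_all(first, [second_diff]), third)
direct_diff = diff_against(first, third)  # for comparison only

# Target side: stored first dataset plus both stored difference datasets.
rebuilt_third = apply_all(first, [second_diff, third_diff])
```

In this example the chained `third_diff` carries one field, whereas a diff taken directly against the first dataset would carry two, which is the saving the paragraph describes.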

[0019] In some examples the datasets and difference datasets are not compressed. As such, it is not necessary for the datasets to be decompressed by the target, which may help conserve processing resources. Instead, in some examples, the target may reconstruct full datasets by modifying a full dataset via substituting of certain data values with new values and/or appending data values as indicated by a given received difference dataset. The generation of a full dataset using such an approach could be performed by a processor other than a central processing unit (CPU) of a computing system, such as a reduced instruction processor (e.g., ARM) of a board management controller (BMC) or a similar processor on a management interface. Due to the reduced processing demands, difference datasets may be delivered directly to such components of a computing system while bypassing a central processing system, which helps to conserve processing resources, and also reduces the amount of data communicated to and from the CPU via data busses and the like. For example, an interface module such as a BMC or management interface may be in communication with a remote computer that uses the interface module to both monitor the local computing system and to assert control over various aspects. Such an interface module may cause the local system to turn on, restart, reconfigure, etc. The remote computer may provide datasets to the interface module by sending a series of difference datasets directly to the interface module.

[0020] In some examples, an interface module of a computing system may receive configuration datasets for multiple components controlled by the interface module. The first configuration dataset may be transmitted in its entirety. Subsequent configuration datasets may be conveyed as difference datasets that indicate differences in the configuration dataset relative to the previously transmitted full dataset(s) and/or difference dataset(s). Upon reconstructing the full datasets at the interface module, the interface module may apply the configuration datasets to respective configurable components for which they were intended. For instance, multiple network interface cards (NICs) may be configured with nearly identical configuration data, except for their individual network identifiers (e.g., media access control address (MAC address)), which may be unique for each NIC. The full configuration datasets may therefore be similar to one another, but not identical. Transmitting only the differences between each subsequent configuration data set (e.g., MAC addresses) may help alleviate bandwidth constraints.
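As a rough illustration of the NIC scenario in paragraph [0020], the following Python sketch compares the bytes needed to send several full configuration datasets with the bytes needed to send one full dataset plus per-NIC difference datasets. The field names, the MAC addresses, and the use of JSON as an encoding are all assumptions for this example; the disclosure does not prescribe a wire format.

```python
import json

# Hypothetical configuration shared by every NIC except for the MAC address.
base_config = {
    "model": "example-nic",
    "speed": "10G",
    "mtu": 1500,
    "vlan": 10,
    "mac": "02:00:00:00:00:01",
}
other_macs = ["02:00:00:00:00:02", "02:00:00:00:00:03", "02:00:00:00:00:04"]


def encoded_size(dataset: dict) -> int:
    """Size in bytes of an assumed JSON encoding of the dataset."""
    return len(json.dumps(dataset, sort_keys=True).encode("utf-8"))


# One difference dataset per additional NIC: only the MAC address changes.
difference_datasets = [{"mac": mac} for mac in other_macs]

# What the target reconstructs from the stored base plus each difference dataset.
full_configs = [{**base_config, **diff} for diff in difference_datasets]

bytes_as_diffs = encoded_size(base_config) + sum(
    encoded_size(d) for d in difference_datasets
)
bytes_as_full = encoded_size(base_config) + sum(
    encoded_size(c) for c in full_configs
)
```

Because each difference dataset holds a single field, `bytes_as_diffs` is smaller than `bytes_as_full`, and the gap widens as more similar components are configured.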

[0021] The techniques described herein provide for enhanced performance of dataset communications without modifications to hardware components. By way of comparison, one alternative is to relieve communication bottlenecks by providing additional bandwidth in the data connection. Additional bandwidth may be provided by using larger or more numerous data cables in wired connections, utilizing a broader spectrum in wireless connections, and/or upgrading components such as switches, routers, repeaters, modems, etc. Another alternative is to relieve communication bottlenecks by utilizing compression techniques. Some compression techniques represent a given dataset with a reduced-size version based on patterns within the dataset, such as repeating patterns, blocks of identical values, etc. However, data compression requires more significant processing power both to generate the compressed datasets and to reconstruct the uncompressed dataset from a compressed version. To be effective, data compression therefore requires additional processing resources both at the source and the target. None of these alternatives can be achieved without expensive upgrades, such as upgrades to the data connection hardware itself or to the processing resources at both ends. The present disclosure, by contrast, provides an approach that alleviates communication bottlenecks without necessitating the same extent of additional expense.

[0022] FIG. 1A is a block diagram of an example system 100 with a source 102 providing multiple difference datasets to a target 112 for deriving corresponding full datasets. The source system 102 and the target system 112 may each be a computing system or computing device capable of processing datasets and storing such datasets in data storage. For example, both the source 102 and the target 112 may identify differences between multiple datasets, generate a difference dataset indicative of all such differences, and generate a full dataset by modifying a base dataset in accordance with a difference dataset.

[0023] The source system 102 can send datasets to the target system 112 via a communication path 110, which may include any combination of wireless networks, wired networks, busses, routers, repeaters, etc. The source system 102 includes a communication interface 104, a diffset evaluator 106, a diffset generator 107, and data storage 108. The target system 112 includes a communication interface 114, a diffset handler 116, and data storage 118. In some examples, the source system 102 and/or the target system 112 may be a computing system having processor(s), memory, and instructions such as software and/or firmware features stored in the memory that define processes performed by the computing system upon execution of such features by the processor(s). In some examples, the systems 102, 112 may include hardware features to perform processes described herein, such as logical circuit(s), application specific integrated circuit(s), etc.

[0024] The source 102 and target 112 communicate over the communication path 110 using respective communication interfaces 104 and 114. The communication interface 104 may send transmissions indicative of outgoing data over the communication path 110. The communication interface 114 may receive transmissions indicative of incoming data over the communication path 110. In some examples, transmissions may include packet data accompanied by associated overhead information. The communication interfaces 104, 114 and/or associated components may establish, maintain, and/or terminate communication links in accordance with a variety of protocols.

[0025] In the source system 102, the diffset evaluator 106 may evaluate a dataset to be sent and determine whether the dataset has similarities with available reference datasets known to the target 112. For example, the diffset evaluator 106 may compare a given dataset with a set of previously-sent datasets and identify a portion of the previously-sent dataset that is identical to a corresponding portion of the given dataset. The diffset evaluator 106 may further determine whether a difference dataset should be used to communicate the given dataset based on the extent to which the given dataset shares identical portion(s) with the previously-sent dataset(s). For example, if the identified identical portion(s) represent a small fraction of the given dataset, the diffset evaluator 106 may determine that the full dataset should be sent rather than a difference dataset.
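One way the evaluation in paragraph [0025] might look is sketched below in Python. The function names, the dictionary representation, and in particular the `min_shared` threshold of 0.5 are assumptions for illustration; the disclosure says only that a full dataset may be sent when the identical portion is a small fraction of the given dataset.

```python
def shared_fraction(candidate: dict, reference: dict) -> float:
    """Fraction of the candidate's fields that are identical in the reference."""
    if not candidate:
        return 0.0
    same = sum(
        1 for key, value in candidate.items()
        if key in reference and reference[key] == value
    )
    return same / len(candidate)


def choose_transmission(candidate: dict, sent_datasets: dict, min_shared: float = 0.5):
    """Return ("diff", reference_id) or ("full", None) for the candidate dataset."""
    best_id, best_fraction = None, 0.0
    for dataset_id, reference in sent_datasets.items():
        fraction = shared_fraction(candidate, reference)
        if fraction > best_fraction:
            best_id, best_fraction = dataset_id, fraction
    if best_id is None or best_fraction < min_shared:
        return ("full", None)          # too little in common: send everything
    return ("diff", best_id)           # enough in common: send only differences


# Hypothetical previously-sent dataset and two candidates.
sent = {"ds-1": {"speed": "10G", "mtu": 1500, "vlan": 10}}
similar = {"speed": "10G", "mtu": 1500, "vlan": 20}
unrelated = {"speed": "1G", "mtu": 576, "vlan": 99}

decision_similar = choose_transmission(similar, sent)
decision_unrelated = choose_transmission(unrelated, sent)
```

A size-based test, such as comparing the encoded size of the difference dataset against a threshold or against an alternative difference dataset as in claims 8 and 9, could be substituted for the fraction test.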

[0026] The diffset generator 107 may generate a difference dataset indicative of the differences between a given full dataset and available reference dataset(s). The difference dataset output from the diffset generator 107 may then be sent to the target 112 and used to generate a copy of the original full dataset. Some processes that may be performed by the diffset evaluator 106 and/or the diffset generator 107 are further described in connection with FIGS. 3-4, for example.

[0027] The data storage 108 may include volatile and/or non-volatile memory modules. Data storage 108 may store full datasets to be conveyed to the target 112 and copies of difference datasets generated by the diffset generator 107 and previously sent to the target 112. The data storage 108 may also include machine-readable instructions that, when executed, cause the source 102 to carry out processes described herein in connection with the diffset evaluator 106 and the diffset generator 107.

[0028] In the target system 112, the diffset handler 116 may receive a difference dataset and generate a corresponding full dataset in accordance with the processes described herein. For example, the diffset handler 116 may identify locally stored copies of datasets indicated in a received difference dataset. The diffset handler 116 may generate a full dataset by modifying a portion of the identified dataset according to a modification specified in the difference dataset, such as a replacement value. In some examples, the diffset handler 116 may append additions to the identified dataset according to the difference dataset. Generation of the full dataset may involve initializing a data structure according to a schema indicated by the received difference dataset and/or a dataset referenced therein. Generation may also include populating values in the data structure using values from the referenced datasets and/or the difference dataset as specified by the difference dataset. The data storage 118 may include volatile and/or non-volatile memory modules. The data storage 118 may store local copies of datasets received from the source 102 as well as machine-readable instructions that, when executed, cause the target 112 to perform processes described herein in connection with the diffset handler 116.
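The initialize-then-populate approach mentioned in paragraph [0028] might be sketched as follows. This Python example assumes that a schema is simply an ordered list of field names and that datasets are flat dictionaries; the function name `build_full_dataset` and all field names are hypothetical.

```python
def build_full_dataset(schema: list, referenced: dict, diff: dict) -> dict:
    """Initialize a structure from the schema, then fill each field.

    A field takes its value from the difference dataset when present there,
    otherwise from the referenced (locally stored) dataset.
    """
    full = dict.fromkeys(schema)  # every schema field starts as None
    for field_name in schema:
        if field_name in diff:
            full[field_name] = diff[field_name]
        elif field_name in referenced:
            full[field_name] = referenced[field_name]
        else:
            raise KeyError(f"no value available for field {field_name!r}")
    return full


# Hypothetical stored dataset, received difference dataset, and target schema.
stored = {"speed": "10G", "mtu": 1500, "vlan": 10}
received_diff = {"mtu": 9000, "label": "uplink"}
schema = ["speed", "mtu", "label"]

full = build_full_dataset(schema, stored, received_diff)
```

Because the result is built from the schema rather than by copying the stored dataset, fields that the schema omits (here `vlan`) simply do not appear in the generated dataset.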

[0029] In some examples, a full dataset may be communicated to the target 112 by the source 102 sending a difference dataset representing differences between that full dataset and a base dataset and/or other datasets that are already stored at the target 112. The target 112 may also receive an indication of the particular base dataset and/or any other difference dataset(s) that a given difference dataset is intended to modify. In some cases, a difference dataset communication from the source 102 may include both a set of difference values (e.g., values to replace and/or values to append), and indicator(s) that identify the datasets to be modified according to the set of difference values. In some cases, a full dataset may be communicated by sending a difference dataset that itself refers to another difference dataset already stored at the target 112. For example, each difference dataset may specify: (i) a base dataset to modify, and (ii) how to modify that base dataset (e.g., a set of values to modify and/or append to the base dataset). In addition, some difference datasets may further specify: (iii) another difference dataset stored at the target 112, and (iv) how to modify the base dataset in accordance with the specified other difference dataset (or portion(s) thereof). The various base datasets and/or difference datasets may be referred to using unique identifiers or the like.
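The items (i) through (iv) in paragraph [0029] suggest a record structure along the following lines. This Python sketch is an assumption-laden illustration: the class `DiffSet`, its field names, the string identifiers, and the `ignore` set (modeled on the "portion to ignore" recited in claims 5 and 14) are all invented here, and only a single level of reference to another difference dataset is followed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiffSet:
    diff_id: str                         # unique identifier of this difference dataset
    base_id: str                         # (i) which base dataset to modify
    changes: dict                        # (ii) values to modify and/or append
    ref_diff_id: Optional[str] = None    # (iii) another stored difference dataset
    ignore: frozenset = frozenset()      # (iv) fields of the referenced diff to skip


def reconstruct(diff: DiffSet, bases: dict, stored_diffs: dict) -> dict:
    """Target side: rebuild the full dataset that the given DiffSet describes."""
    result = dict(bases[diff.base_id])
    if diff.ref_diff_id is not None:
        referenced = stored_diffs[diff.ref_diff_id]
        for key, value in referenced.changes.items():
            if key not in diff.ignore:   # apply only the non-ignored portion
                result[key] = value
    result.update(diff.changes)          # this diff's own values are applied last
    return result


# Hypothetical stored state at the target.
bases = {"base-1": {"speed": "10G", "mtu": 1500, "vlan": 10}}
d1 = DiffSet("d1", "base-1", {"mtu": 9000, "vlan": 20})
d2 = DiffSet("d2", "base-1", {"label": "uplink"},
             ref_diff_id="d1", ignore=frozenset({"vlan"}))

first_target = reconstruct(d1, bases, {})
second_target = reconstruct(d2, bases, {"d1": d1})
```

Here `d2` reuses the `mtu` change from `d1` but ignores its `vlan` change, so the second target keeps the base value for `vlan` while still avoiding retransmission of `mtu`.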

[0030] In practice, some components of the source system 102 and/or target system 112 may be implemented as multiple components that operate in combination with one another, such as a computing system with multiple data storage elements and/or multiple processors. Moreover, some components may be implemented by multiple computing systems operating in coordination, such as virtual functions provided by a set of networked computing systems.

[0031] FIG. 1B is a block diagram of an example arrangement in which a source 122 provides multiple difference datasets to a target 126 via a data connection 124 within a computing system 120. The source 122 and target 126 may perform similar processes as the source 102 and target 112, respectively, described in connection with FIG. 1A. Both the source 122 and the target 126 may be components within the single computing system 120, and the data connection 124 is an internal communication path within the computing system 120. The data connection 124 may include wired connections such as a shared data bus or the like. For instance, the source 122 may include a central processing unit (CPU), and the target 126 may include a management interface, a peripheral, or another component of the computing system 120 that receives instructions from the CPU.

[0032] The source 122 may convey multiple datasets to the target 126 via the data connection 124. A given dataset may be communicated by sending a difference dataset representing differences between the full dataset and a base dataset that is already stored at the target 126. Some difference datasets may be based on differences between the full dataset and a base dataset that is itself modified by another difference dataset. The target 126 may include a diffset handler 127, which may be similar to the diffset handler 116 described above in connection with FIG. 1A. Similar to the description of target 112 above, the target 126 may recover the full datasets by modifying the base datasets in accordance with difference dataset(s). Thus, to the extent that the datasets communicated from the source 122 to the target 126 are similar to one another, the size of the difference datasets transmitted over the data connection 124 may be less than the size of the full datasets. Using difference datasets can thereby help mitigate communication bottlenecks between the source 122 and the target 126.

[0033] FIG. 1C is a block diagram of an example system 130 in which a source 132 provides multiple difference datasets to a target 136 via a data connection that includes a communication network 134. The source 132 and target 136 may perform similar processes as the source 102 and target 112, respectively, described in connection with FIG. 1A. The target 136 may include a diffset handler 137, which may be similar to the diffset handler 116 described above in connection with FIG. 1A. In system 130, the source 132 may be remote from the target 136, and the communication path between the source 132 and target 136 may include the communication network 134. For instance, the source 132 may be a remote management computer and the target 136 may be a server situated in a data center. The communication network 134 may be a local area data exchange network and/or a wide area packet exchange network that directs data packets from a sender to a recipient, such as the internet. Similar to the discussions in connection with FIGS. 1A and 1B, the system 130 may enable the source 132 to communicate multiple datasets to the target 136 using difference datasets so as to use less bandwidth than would be required to send the full datasets. The system 130 therefore helps mitigate the performance impact of communication bottlenecks in the communication network 134.

[0034] FIG. 1D is a block diagram of an example system 140 in which a source 142 provides configuration data to a target 152 using multiple difference datasets. The source 142 and target 152 may perform similar processes as the source 102 and target 112, respectively, described in connection with FIG. 1A. The source system 142 includes a communication interface 144, a diffset evaluator 146, a diffset generator 147, and data storage 148. The target system 152 includes a management interface 154 having a communication interface 156, a diffset handler 158, data storage 160, and config manager 162. The config manager 162 may monitor and/or manage configurable components 164. For example, the config manager 162 may use respective configuration datasets to configure the configurable components 164. The source 142 and the management interface 154 may communicate via a data connection 150 using respective communication interfaces 144 and 156.

[0035] In operation, the source 142 may communicate multiple datasets to the target 152 using difference datasets to represent at least some of the communicated datasets. In some examples, the configurable components 164 may include multiple network interface cards (NICs). In an example in which the components 164 are all the same model, each of the configurable components 166, 168, 170 (e.g., NICs) may be configured using configuration datasets that are similar yet not identical to one another. For example, the NICs may have configuration datasets that assign each NIC a unique MAC address, but are otherwise identical. In some examples, one NIC may be a different model than another one, and the two may have configuration datasets that differ with respect to their size, structure (e.g., schema), and/or content. For difference datasets that refer to a base dataset with significant differences, the difference dataset may specify modifications to the base dataset that involve creating and populating new fields and/or deleting entire fields from the data structure such that the resulting modified dataset is organized in accordance with a schema of the desired dataset.

[0036] FIG. 2A is a block diagram including pictographs representing example difference datasets generated by a source system and sent to a target system. The difference datasets represented in FIG. 2A are described, for example purposes, in connection with the source 102 and target 112 of FIG. 1A. The pictographs in FIG. 2A represent full datasets 210 stored in the data storage 108 and evaluated by the diffset evaluator 106 of the source 102 and datasets 220 generated by the diffset generator 107 to be transmitted to the target 112. The transmitted datasets 220 may include difference datasets and may be sent to the target system 112 via the communication path 110. At the target 112, the diffset handler 116 may use the received datasets 230 to generate full datasets 240, thereby completing communication of the full datasets. In addition, the source 102 and the target 112 may store local copies of the transmitted datasets 220 and received datasets 230, respectively. At the source 102, the local copies (220) may be used to generate a difference dataset based on differences between a given full dataset and the previously transmitted dataset(s) 220. Similarly, at the target 112, the local copies (230) may be used to reconstruct a full dataset from an incoming difference dataset that references the previously received datasets 230.

[0037] The example datasets 210, 220, 230, and 240 are illustrated by pictographs in which each dataset is represented as a block. Differing portions of the various datasets are represented by differently shaded regions of the blocks. The first full dataset 212 is represented by a uniformly unshaded block. The second full dataset 214 includes a portion that is identical to a corresponding portion of the first dataset 212. The identical portions are represented by unshaded regions of the two blocks. The second full dataset 214 also includes a portion 215 that is not identical to the first dataset 212, which is represented by a shaded region. The third full dataset 216 includes a portion that is identical to the first dataset 212 (represented by the unshaded region) and two portions 217a, 217b that are different from the first dataset (represented by the two shaded regions). In addition, one of the differing portions 217b in the third dataset 216 is identical to a corresponding portion of the second dataset 214. A common shading pattern is used to represent the portion 217b of the third dataset 216 and the identical portion 215 of the second dataset 214. The fourth dataset 218 includes a portion that is identical to a corresponding portion of the first dataset 212 (represented by the unshaded region); a portion 219c that is identical to the second dataset 214 (represented by the region shaded with the same pattern as portion 215); and portions 219a, 219b that are different from any corresponding portions of the first, second, or third datasets 212, 214, 216 (represented by the region shaded with a cross-hatch pattern); and a portion 219d that does not correspond to portions of the first, second, or third datasets 212, 214, 216. The portion 219d is represented by the region at the bottom of block 218, which causes block 218 to be greater in size than blocks 212, 214, 216.

[0038] The diffset evaluator 106 may analyze the full datasets 210 and identify portions of the datasets 210 with identical content at corresponding portions of the reference datasets already stored at the target system 112. In particular, the diffset evaluator 106 may refer to the local copies of the previously-transmitted datasets 220 stored at the source 102 to identify any portions of a given dataset that are identical to a previously transmitted dataset. The target system 112 may also retain local copies of received datasets 230 upon receipt of the transmitted datasets 220. Thus, at a given time, the transmitted datasets 220 stored at the source 102 and the received datasets 230 stored at the target 112 may be identical to one another. As such, the diffset evaluator 106 at the source 102 may refer to the transmitted datasets 220 to use as a basis for generation of a given difference dataset. Similarly, the diffset handler 116 at the target 112 may refer to the previously received datasets 230 to use for generating a full dataset from a given difference dataset.

[0039] Some of the datasets referred to at the source system (220) and at the target system (230) are referred to herein as “previously-sent” or “previously-received”. Previously-sent is used herein to describe a dataset that is available to the source 102 for use in generating a difference dataset that represents a full dataset. Previously-received is used herein to describe a dataset that is available to the target 112 for use in generating a full dataset in accordance with a difference dataset. However, the temporal order of dataset transmission from the source 102 to the target 112 may be altered in a number of ways that provide for a given dataset to be made available to the target 112 at the time the target 112 analyzes a difference dataset that references that given dataset. For instance, it is understood that the group of difference datasets 220 may be sent as a series of communications or as a single communication, and need not be communicated in any particular order. For a given difference dataset, the target 112 can generate a corresponding full dataset once the base dataset(s) and/or difference dataset(s) referenced by such difference dataset are available at the target 112, even if they arrive subsequent to that difference dataset.
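
One way a target might tolerate out-of-order arrival is to hold a difference dataset until the dataset it references becomes available. The Python sketch below illustrates that idea; the `Target` class, the message fields ("kind", "id", "base", "replace", "values"), and the choice to store reconstructed full datasets are all hypothetical simplifications.

```python
class Target:
    """Toy target that defers difference datasets with missing references."""

    def __init__(self):
        self.full = {}      # dataset id -> reconstructed full dataset
        self.pending = []   # difference datasets waiting on a reference

    def receive(self, message):
        if message["kind"] == "full":
            self.full[message["id"]] = list(message["values"])
        else:
            self.pending.append(message)
        self._drain()

    def _drain(self):
        # Keep resolving until no pending difference dataset can progress.
        progressed = True
        while progressed:
            progressed = False
            for msg in list(self.pending):
                base = self.full.get(msg["base"])
                if base is None:
                    continue            # reference not available yet
                rebuilt = list(base)
                for index, value in msg["replace"].items():
                    rebuilt[index] = value
                self.full[msg["id"]] = rebuilt
                self.pending.remove(msg)
                progressed = True

target = Target()
# Difference datasets arrive before the dataset they ultimately depend on.
target.receive({"kind": "diff", "id": 3, "base": 2, "replace": {2: "C"}})
target.receive({"kind": "diff", "id": 2, "base": 1, "replace": {1: "B"}})
waiting_before_base = len(target.pending)
target.receive({"kind": "full", "id": 1, "values": ["a", "b", "c"]})
```

Once the first dataset arrives, both deferred difference datasets can be resolved in dependency order.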

[0040] In some examples, the diffset evaluator 106 may first determine whether to use a difference dataset to transmit a given dataset from the source 102 to the target 112. Such a determination may be based at least in part on the extent of identical portions between the given dataset and the previously-transmitted datasets 220. For example, as shown in FIG. 2A, the diffset evaluator 106 may determine that the first dataset 212 should be transmitted in its entirety (i.e., dataset 222). This determination may be due to the target system 112 not having any locally stored difference datasets to refer to. In some examples, the diffset evaluator 106 may determine that a given dataset can be communicated efficiently using a difference dataset that refers to previously-transmitted datasets. Upon such a determination, the diffset generator 107 may generate a corresponding difference dataset that references modification(s) and/or addition(s) to apply to a previously-transmitted dataset. Thus, a difference dataset generated by the diffset generator 107 may include data that identifies a previously-transmitted dataset and a set of values to replace and/or append to generate the desired full dataset.

[0041] The diffset evaluator 106 may determine that the remaining full datasets 214, 216, 218 can be represented by difference datasets 224, 226, 228, respectively. When analyzing the second dataset 214, the diffset evaluator 106 may refer to the transmitted datasets 220 and determine that the second dataset 214 can be represented by a difference dataset 224 that references the first dataset 222. The difference dataset 224 is represented in FIG. 2A by a pictogram having a dashed outline corresponding to the size of the reference dataset it is intended to modify. The unfilled region of the pictogram represents portions of the second dataset 214 that are identical to the first dataset 222. The shaded region represents a portion 225 that is different from the first dataset 222. Thus, the transmitted difference dataset 224 may include data indicative of replacement values (represented by the shaded portion 225) and data indicating the first full dataset 232 stored for reference at the target system 112. In some examples, the difference dataset 224 that is transmitted via the communication path 110 may have a significantly reduced size compared to the full second dataset 214.

[0042] In the pictograph of FIG. 2A, the notation“2'1” on difference dataset 224 represents a reference to a particular dataset to modify. The notation indicates that the specified modifications (corresponding to the shaded portion 225) are intended to modify a corresponding portion of the first dataset and that the resulting modified dataset will be the second dataset. That is, the second dataset (2) results from modifying (') the first dataset (1) as specified by the difference dataset 224. At the target 112, the diffset handler 116 can use the locally stored copy of the first dataset 232 and the received difference dataset 234 to reconstruct the full dataset 244. The diffset handler 116 may generate dataset 244 by copying the portion of the first dataset 232 that is not modified by the difference dataset 234 and populating the remaining portion(s) of dataset 244 with the values specified by the received difference dataset 234.
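
The exchange described for the second dataset can be sketched in Python as a round trip: the source records only the differing positions, and the target copies the stored first dataset and overwrites those positions. Representing a dataset as a list and a difference dataset as a dictionary with "base" and "replace" keys is an assumption of this sketch, as are the function names.

```python
def make_diff(base_id, base, target):
    """Record only the positions where `target` differs from `base`."""
    if len(base) != len(target):
        raise ValueError("this sketch only handles equal-length datasets")
    replace = {i: t for i, (b, t) in enumerate(zip(base, target)) if b != t}
    return {"base": base_id, "replace": replace}

def apply_diff(stored, diff):
    """Copy the referenced dataset, then fill in the replacement values."""
    full = list(stored[diff["base"]])
    for index, value in diff["replace"].items():
        full[index] = value
    return full

# Source side: the second dataset differs from the first in its last part.
first = ["a", "b", "c", "d", "e", "f"]
second = ["a", "b", "c", "d", "X", "Y"]
diff_2 = make_diff(1, first, second)       # plays the role of "2'1"

# Target side: a local copy of the first dataset is already stored.
received = {1: list(first)}
rebuilt_second = apply_diff(received, diff_2)
```

Here the difference dataset carries two values and one reference instead of six values.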

[0043] When analyzing the third full dataset 216, the diffset evaluator 106 may refer to the previously transmitted datasets 220 and determine that the third dataset 216 can be represented by a difference dataset 226 that references the previously transmitted first dataset 222 and the difference dataset 224. The difference dataset 226 is represented in FIG. 2A by a pictogram having a dashed outline and shaded regions indicating modification(s) to the referenced dataset(s). The third difference dataset 226 specifies replacement values for the portions of the third full dataset 216 that differ from the second full dataset 214 (i.e., the first transmitted dataset 222 as modified by difference dataset 224 or equivalently, at the target 112, the first received dataset 232 as modified by difference dataset 234). In addition, the third difference dataset 226 may include indicators that specify whether to apply the entirety of the modifications specified by the second difference dataset 224 to the first dataset 222. In practice, such indicators may take the form of an indication that a portion of the difference dataset 224 should be ignored when modifying the first dataset 222. Other techniques may be used, such as associating a binary indication with each predefined portion of a given difference dataset that indicates, for each such portion, whether the modifications specified therein should be applied or ignored.

[0044] Generation of a full dataset using a difference dataset that indicates that some portions should be applied, and others ignored, involves three (or more) datasets: a base dataset, an intermediate difference dataset, and a current difference dataset. The base dataset may be a reference dataset or an initially transmitted dataset. The intermediate difference dataset may be a previously transmitted difference dataset (e.g., corresponding to a previous full dataset). The current difference dataset may specify differences between a current full dataset and the base dataset modified according to a not ignored portion of the intermediate difference dataset. Thus, the current full dataset is referenced to a dataset generated when the base dataset is modified according to some, but not all, portions of the intermediate difference dataset. The portions of the intermediate difference dataset to be applied or ignored may be specified by the current difference dataset. As such, when combined with the base dataset and the intermediate difference dataset, the current difference dataset may include information for the target to generate the corresponding full current dataset.

[0045] For example, in FIG. 2A, the portion 227c of the difference dataset 226 may include an indicator that the corresponding portion of the difference dataset 224 (a subset of portion 225) should be ignored. Thus, the portion 227c may include an indicator to be interpreted by the target 112 that specifies that upon generation of the full third dataset 246 at the target 112, the portion corresponding to 227c should be copied from the first full dataset 232 rather than the difference dataset 234. However, the target 112 may still copy the remainder of portion 225 from difference dataset 234 (the portion corresponding to 217b), because such remainder is not modified by portion 227c. With this technique, a difference dataset may specify particular portions of previous difference datasets to be applied during generation of a subsequent dataset. In some examples, a given difference dataset representing a full dataset may specify all differences between that full dataset and a reference dataset that is itself modified according to portion(s) of additional difference dataset(s), and those additional difference dataset(s) and portion(s) thereof may be specified by data included in the given difference dataset.
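
The ignore indicator can be illustrated with a short Python sketch in which the current difference dataset lists positions of the intermediate difference dataset that should not be applied. The "replace" and "ignore" keys and the function name `rebuild_with_ignore` are hypothetical; the description does not prescribe an encoding.

```python
def rebuild_with_ignore(base, intermediate, current):
    """Rebuild a full dataset from a base and two difference datasets.

    Positions listed in current["ignore"] keep the base value even if the
    intermediate difference dataset would have modified them.
    """
    full = list(base)
    ignored = set(current.get("ignore", []))
    for index, value in intermediate["replace"].items():
        if index not in ignored:
            full[index] = value
    for index, value in current["replace"].items():
        full[index] = value
    return full

first = ["a", "b", "c", "d", "e"]
diff_2 = {"replace": {2: "X", 3: "Y"}}          # analogous to portion 225
diff_3 = {"replace": {0: "P"}, "ignore": [2]}   # index 2 plays the role of 227c

third = rebuild_with_ignore(first, diff_2, diff_3)
```

In this example position 2 is copied from the first dataset, position 3 still takes its value from the intermediate difference dataset, and position 0 takes a new replacement value.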

[0046] In the example shown in FIG. 2A, the difference dataset 226 references a portion of the second transmitted dataset 224. The difference dataset 226 is labeled with the notation “3'(2'1)”, which indicates that the modifications specified therein generate the third dataset (3) when applied to a base dataset formed by modifying the first dataset (1) using the second difference dataset (2) in the manner specified. Upon receipt of the dataset 236 at the target system 112, the diffset handler 116 may generate the third full dataset 246 by referring to the previously received datasets 232, 234. For example, the diffset handler 116 may generate the portions of the third full dataset 246 that correspond to the unfilled dashed outline of the difference dataset 236 by copying the corresponding portions from a combination of the previously received datasets 232, 234. For the two remaining portions of the difference dataset 236 (e.g., the shaded regions), the diffset handler 116 may populate corresponding portions of the third full dataset 246 with replacement values specified by the difference dataset 236 (e.g., for the replacement block) or with values from the corresponding portion of the first dataset 232 (e.g., for the ignore block).

[0047] In some examples, the portion 227c of the transmitted difference dataset 226 that references the first dataset 222 rather than the second dataset 224 may require a lesser amount of data than would be required to include a listing of all those values in the difference dataset 226. Thus, allowing the difference dataset 226 to specify particular portions of previously transmitted difference datasets to be applied, rather than the entirety of such datasets, may help reduce the bandwidth demands on communication path 110.

[0048] When analyzing the fourth full dataset 218, the diffset evaluator 106 may refer to the previously transmitted datasets 220 and determine that the fourth dataset 218 can be represented by a difference dataset 228 that references the previously transmitted first dataset 222 and the difference dataset 224. The difference dataset 228 is represented in FIG. 2A by a pictogram having a dashed outline and shaded regions indicating modification(s) and additions to be applied to the referenced dataset(s) to generate the fourth dataset 218. The difference dataset 228 is labeled with the notation“4'(2'1)”, which indicates that the modifications specified therein generate the fourth dataset (4) when applied to the first dataset (1) modified by the second dataset (2).

[0049] The difference dataset 228 may include portions 229a and 229b that specify replacement values for corresponding portions 219a and 219b of the fourth dataset. In addition, the difference dataset 228 may include a portion 229d that corresponds to portion 219d of the fourth dataset 218. The portion 229d may include a set of values to be appended to the referenced dataset. The remaining portions of difference dataset 228 are represented by the unfilled dashed outline, which indicates that those portions are identical to corresponding portions of the second dataset 214 (e.g., the first transmitted dataset 222 as modified by the second dataset 224). The difference dataset 228 may include indicator(s) specifying particular dataset(s) for the modifications specified therein to be applied to, such as a unique identifier for the first dataset 232 and/or the second dataset 234 stored at the target system 112.

[0050] Upon receipt of difference dataset 238 at the target system 112, the diffset handler 116 may generate the fourth full dataset 248 by referring to the previously received datasets 232, 234. For example, the diffset handler 116 may generate the portions of the fourth full dataset 248 that correspond to the unfilled dashed outline of the difference dataset 238 by copying the corresponding portions from a combination of the previously received datasets 232, 234. For the remaining portions of the difference dataset 238 (e.g., the shaded regions), the diffset handler 116 may populate corresponding portions of the fourth full dataset 248 with replacement values and/or appended values specified by the difference dataset 238 in accordance with information provided by the difference dataset 238.
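
A difference dataset that both replaces and appends values, as described for the fourth dataset, might be handled as in the Python sketch below. The "replace" and "append" keys and the `rebuild` helper are assumptions for illustration.

```python
def rebuild(base, *diffs):
    """Apply difference datasets in order; each may replace and/or append."""
    full = list(base)
    for diff in diffs:
        for index, value in diff.get("replace", {}).items():
            full[index] = value
        full.extend(diff.get("append", []))   # grows past the base size
    return full

first = ["a", "b", "c", "d"]
diff_2 = {"replace": {3: "Y"}}
# Analogous to "4'(2'1)": two replaced portions plus an appended portion.
diff_4 = {"replace": {0: "Q", 1: "R"}, "append": ["z1", "z2"]}

fourth = rebuild(first, diff_2, diff_4)
```

The rebuilt fourth dataset is longer than the first dataset, mirroring block 218 being larger than blocks 212, 214, 216.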

[0051] FIG. 2B is a block diagram including pictographs representing example difference datasets generated by a source system 102 and sent to a target system 112. In the example of FIG. 2A, the target 112 is not preloaded with a base dataset, such as a default reference dataset. FIG. 2B illustrates an example in which both the source 102 and the target 112 are preloaded with a reference dataset 290. Similar to the diagram in FIG. 2A, FIG. 2B illustrates data stored at the source 102 and the target 112 in data storages 108 and 118, respectively. Initially, the source stores original full datasets 250 and a reference dataset 290, which are used to generate transmitted datasets 260 that are sent to the target 112 via the communication path 110. The target 112 uses the received datasets 270 and its own copy of the reference dataset 291 to generate the full datasets 280. As in FIG. 2A, each dataset is represented by a pictograph of a rectangular block and differently shaded regions of the blocks are used to represent non-identical portions of the various datasets. Processing and communicating four different datasets 252, 254, 256, 258 are described below in turn.

[0052] The diffset evaluator 106 may compare the first full dataset 252 with the transmitted datasets 260 and the reference dataset 290 and determine that the first dataset 252 can be represented by difference dataset 262, which specifies differences between the first dataset 252 and the reference dataset 290. The diffset generator 107 may then generate the difference dataset 262, which can then be sent to the target 112 via the communication path 110 using the communication interfaces 104, 114. In addition, the generated difference dataset 262 may be stored locally at the source 102 for future reference. At the target system 112, the diffset handler 116 may analyze the received difference dataset 272 and generate the full dataset 282 by modifying the reference dataset 291 in accordance with the difference dataset 272 (e.g., by replacing values corresponding to the shaded region).

[0053] For the second full dataset 254, the diffset evaluator 106 may compare dataset 254 with the transmitted datasets 260 and the reference dataset 290 and determine that the extent of similarity between the dataset 254 and the available reference datasets (e.g., 262, 290) is less than a threshold. As a consequence, the diffset evaluator 106 may determine that the second full dataset 254 should be transmitted in its entirety. Thus, the transmitted dataset 264 sent to the target 112 may be a copy of the second full dataset 254. Upon receipt at the target 112, the diffset handler 116 may analyze the received dataset 274 and determine that it is a full dataset and not one that is used to modify other dataset(s) to generate a full dataset. Such a determination may be made, for example, based on header information or the like that is included with the received dataset 274. The second full dataset 284 may then be generated (e.g., by copying the received dataset 274) and the received dataset 274 may be stored for future reference.
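
A target might distinguish a full dataset from a difference dataset by inspecting header information, as in the Python sketch below. The `Message` fields (`kind`, `base_id`, `keep_copy`, and so on) are hypothetical header fields chosen for this sketch, and for simplicity the target stores reconstructed full datasets rather than the received payloads.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Message:
    dataset_id: str
    kind: str                                     # "full" or "diff"
    values: list = field(default_factory=list)    # payload of a full dataset
    base_id: Optional[str] = None                 # dataset a diff modifies
    replace: dict = field(default_factory=dict)   # index -> replacement value
    keep_copy: bool = True                        # store for later reference

def handle(message, stored):
    """Produce the full dataset that `message` represents."""
    if message.kind == "full":
        full = list(message.values)               # copy the payload as-is
    elif message.kind == "diff":
        full = list(stored[message.base_id])
        for index, value in message.replace.items():
            full[index] = value
    else:
        raise ValueError(f"unknown kind: {message.kind!r}")
    if message.keep_copy:
        stored[message.dataset_id] = full
    return full

stored = {"R": [0, 0, 0, 0]}                      # preloaded reference dataset
first = handle(Message("1", "diff", base_id="R", replace={1: 7}), stored)
second = handle(Message("2", "full", values=[9, 9, 9, 9]), stored)
```

The first message is resolved against the preloaded reference dataset, while the second is recognized as a full dataset and copied directly.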

[0054] For the third full dataset 256, the diffset evaluator 106 may compare dataset 256 with the transmitted datasets 260 and the reference dataset 290 and determine that the third dataset 256 can be represented by difference dataset 266. Difference dataset 266 specifies differences between the third dataset 256 and the first dataset, i.e., the reference dataset 290 as modified by difference dataset 262. As shown in FIG. 2B, difference dataset 266 is labeled “3'(1'R)”, which indicates that the third full dataset (3) results from modifying the dataset formed by the first transmitted dataset 262 (1) modifying the reference dataset 290 (R). In some examples, the difference dataset 266 may be selected to represent the third full dataset 256 after considering multiple candidate difference datasets. The diffset evaluator 106 may select the previously-sent dataset that is most similar to the presently analyzed dataset to use as a reference (e.g., the one that yields the smallest difference dataset). In some examples, the diffset evaluator 106 may consider multiple candidate difference datasets generated based on multiple permutations of the reference dataset 290 and the second dataset 264. In the labeling convention used herein, such candidate difference datasets may include, for example, 3'2, 3'R, 3'(1'R), 3'(1'2). The diffset evaluator 106 may then select the smallest one of the candidate difference datasets. Accordingly, the dataset 266 that is transmitted over the communication path 110 may be the smallest difference dataset that can be generated from the combination of previously-sent datasets 260 and the reference dataset 290.

[0055] When analyzing the third full dataset 256, the diffset evaluator 106 may determine that a difference dataset that references the second full dataset 264, 3'2, is greater in size than the difference dataset 266 that references the first dataset, 3'(1'R). The diffset generator 107 may then generate the difference dataset 266, which can then be sent to the target 112. The generated difference dataset 266 may be stored locally at the source 102 for future reference. At the target system 112, the diffset handler 116 may analyze the received difference dataset 276 and generate the third full dataset 286. For example, the diffset handler 116 may copy the portions of the first dataset that correspond to the unfilled regions of the received dataset 276. The remaining portion of the third full dataset 286 may be populated by replacement values specified by the difference dataset 276.
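
Selecting among candidate references can be sketched in Python by measuring how many positions would need replacement values against each candidate and choosing the smallest. Counting differing positions as a stand-in for difference dataset size, and all of the values below, are assumptions of this sketch.

```python
def apply_diff(base, diff):
    full = list(base)
    for index, value in diff["replace"].items():
        full[index] = value
    return full

def count_differences(reference, target):
    """Number of positions that would need a replacement value."""
    return sum(1 for r, t in zip(reference, target) if r != t)

reference = [0, 0, 0, 0, 0, 0]            # preloaded reference dataset R
diff_1 = {"replace": {2: 5, 3: 5}}        # previously sent "1'R"
second = [9, 9, 9, 9, 9, 9]               # previously sent in full
third = [0, 1, 5, 5, 0, 0]                # dataset to communicate now

# Candidate references, labeled with the convention used in the text.
candidates = {
    "3'R": reference,
    "3'2": second,
    "3'(1'R)": apply_diff(reference, diff_1),
    "3'(1'2)": apply_diff(second, diff_1),
}
sizes = {label: count_differences(ref, third)
         for label, ref in candidates.items()}
best = min(sizes, key=sizes.get)
```

With these values the candidate that references the first dataset needs the fewest replacement values, so it would be chosen.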

[0056] For the fourth full dataset 258, diffset evaluator 106 may compare dataset 258 with the transmitted datasets 260 and the reference dataset 290 and determine that the fourth dataset 258 can be represented by difference dataset 268. The difference dataset 268 may specify differences between the fourth dataset 258 and the second dataset 264 as modified by the first transmitted dataset 262, which is labeled 1'2. The reference dataset 1'2 may be a dataset that was not itself previously transmitted, but is formed from a combination of previously-transmitted datasets 262 and 264. In some examples, the diffset evaluator 106 may compare difference datasets generated based on multiple ones of the available previously-sent datasets 260 (and combinations thereof) and select 1'2 because it yields the smallest difference dataset 268 to represent the fourth full dataset 258. At the target system 112, the diffset handler 116 may analyze the received difference dataset 278 and generate the full fourth dataset 288. For example, the diffset handler 116 may populate portions of dataset 288 corresponding to the unshaded portions of dataset 278 with values from the second dataset 274 modified by dataset 272, and populate the remaining portion of dataset 288 with the values specified by the difference dataset 278.

[0057] In some cases, the communication sent to the target 112 may include both dataset(s) and information about the dataset(s), such as header information. A given communication may indicate: (1) a schema for the received dataset, (2) which dataset the presently conveyed dataset should modify, (3) uses for the received dataset, (4) whether to save a local copy of the difference dataset, and/or (5) other information useful for generating and/or making use of the full datasets. For instance, with reference to FIG. 2A, when sending the first dataset 222, the communication sent to the target 112 may include an indication that the dataset payload is a full dataset, i.e., that it is not to be used in modifying any previously received datasets. The diffset handler 116 may recognize such an indication and determine that the full first dataset 242 should be copied from the received dataset 232. The diffset handler 116 may also determine that the received dataset 232 should be stored at the target 112 for use in generating subsequent datasets. Such determination may be based on information from the source 102 (e.g., header information) and/or based on criteria evaluated by the diffset handler 116 (e.g., available memory, dataset(s) already stored, etc.).

[0058] FIGS. 3A and 3B are flowcharts of example processes 300 and 310, respectively. The processes 300, 310 may be described below as being executed or performed by a system. For example, the processes may be performed by the source 102 and/or target 112 of system 100 described in connection with FIG. 1A or the source 142 and/or target 152 of system 140 described in connection with FIG. 1D. Other suitable systems and/or computing devices may be used as well. Processes 300, 310 may be implemented in the form of executable instructions stored on a machine-readable storage medium of the system and executed by a processor of the system. Processes 300, 310 may be implemented in the form of electronic circuitry (e.g., hardware). Some steps of these processes may be executed concurrently or in a different order than shown in FIGS. 3A and 3B. Moreover, processes 300, 310 may include more or fewer steps than are shown in FIGS. 3A and 3B. In some examples, steps may be ongoing and/or may repeat.

[0059] FIG. 3A is a flowchart of an example process 300 involving a source system generating difference datasets to communicate multiple datasets to a target system. At block 302, the source system may generate difference dataset A specifying differences between dataset A and a reference dataset. The reference dataset may be a dataset with copies stored at both the source and the target to allow both the source and the target to refer to the reference dataset. At block 304, difference dataset A generated in block 302 may be sent to the target. For example, the source 102 may send such a difference dataset to the target 112 over communication path 110 using the communication interfaces 104 and 114.

[0060] At block 306, the source system may generate difference dataset B that specifies differences between another dataset B and the reference dataset modified by at least a portion of difference dataset A. Thus, dataset B may have some portions that are identical to corresponding portions of the reference dataset and some portions that are identical to corresponding portions of difference dataset A. Moreover, a particular portion of the reference dataset that is modified by difference dataset A may be identical to a corresponding portion of dataset B. Difference dataset B may specify that while some portions of difference dataset A should be applied to the reference dataset (namely, those portions identical to dataset B), the portion of the reference dataset that is identical to dataset B should not be modified. In practice, difference dataset B may include an indicator that specifies that a portion of difference dataset A that modifies the portion of the reference dataset identical to dataset B should be ignored. For example, difference dataset B may include an indicator similar to indicator 227c of difference dataset 226 described above in connection with FIG.2A. Thus, difference dataset B may specify differences between dataset B and another dataset generated based on specified portion(s) of available reference dataset(s) and/or difference dataset(s) known to both the source and the target.
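
Block 306 can be illustrated from the source side with a Python sketch that decides, position by position, whether dataset B matches difference dataset A, matches the unmodified reference dataset (so that portion of A should be ignored), or needs its own replacement value. The function name `make_diff_b` and the "replace" and "ignore" keys are hypothetical, and the sketch assumes equal-length datasets.

```python
def make_diff_b(reference, diff_a, dataset_b):
    """Build difference dataset B against the reference as modified by A."""
    replace, ignore = {}, []
    for index, wanted in enumerate(dataset_b):
        if index in diff_a["replace"]:
            if diff_a["replace"][index] == wanted:
                continue                 # A's modification already matches B
            if reference[index] == wanted:
                ignore.append(index)     # tell the target to skip A here
                continue
        elif reference[index] == wanted:
            continue                     # untouched reference value matches B
        replace[index] = wanted          # B needs its own value
    return {"replace": replace, "ignore": ignore}

reference = ["a", "b", "c", "d", "e"]
diff_a = {"replace": {2: "X", 3: "Y"}}
dataset_b = ["P", "b", "c", "Y", "e"]

diff_b = make_diff_b(reference, diff_a, dataset_b)
```

In this example difference dataset B carries one replacement value and one ignore indicator rather than restating the reference value at position 2.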

[0061] At block 308, difference dataset B generated in block 306 may be sent to the target. For example, source 102 may send such a difference dataset to target 112 via communication path 110.

[0062] In some examples, features of the source 102, such as the diffset evaluator 106 and/or diffset generator 107, may evaluate a variety of factors to determine whether to include a reference to another dataset in a given difference dataset. In some examples, difference dataset B may be generated so as to include all references to other datasets that yield a net reduction in the size of difference dataset B. For example, a reference may be included if the reference to the dataset portion requires less data/bandwidth than the referenced dataset portion itself. In some examples, the determination of whether to include a given reference may be further based on considerations of processing time and/or resources at the source and/or target. For instance, a reference to another dataset may be included in a difference dataset based on both the net bandwidth savings and processing resource requirements. In some examples, references that yield a bandwidth savings below some threshold may not be included. In some examples, references that yield a bandwidth savings below a threshold may be included only if the processing requirements to process that reference at the target are below another threshold.

[0063] In some examples, features of the source 102 may analyze multiple candidate difference datasets that represent a given full dataset and select from among such candidates to transmit to the target 112. This may involve selecting the candidate that uses the least data (i.e., the one with the fewest bytes of data). In some cases, when analyzing a given full dataset to transmit, the source 102 may generate multiple candidate difference datasets based on differences between the given full dataset and multiple reference datasets known to both the source and the target. Features of the source 102, such as the diffset evaluator 106, may select from among such candidate difference datasets based on factors such as the size of the candidate difference datasets and/or the processing resources required to regenerate the full dataset at the target 112.
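
The reference-inclusion considerations described above might be combined as in the Python sketch below. The function name, the cost units, and both threshold values are hypothetical; the description does not fix any particular values or policy.

```python
def include_reference(literal_bytes, reference_bytes, processing_cost,
                      min_savings=16, max_cost_for_small_savings=5):
    """Decide whether to encode a portion as a reference to another dataset.

    literal_bytes:   size of the portion if its values are sent directly
    reference_bytes: size of the reference that would replace those values
    processing_cost: estimated cost for the target to resolve the reference
    Both thresholds are illustrative assumptions.
    """
    savings = literal_bytes - reference_bytes
    if savings <= 0:
        return False                     # no net reduction in size
    if savings >= min_savings:
        return True                      # savings are large enough
    # Small savings: include only if the reference is cheap to process.
    return processing_cost < max_cost_for_small_savings

large_saving = include_reference(100, 8, processing_cost=50)
small_saving_costly = include_reference(20, 8, processing_cost=50)
small_saving_cheap = include_reference(20, 8, processing_cost=1)
no_saving = include_reference(4, 8, processing_cost=0)
```

Other policies, such as weighting bandwidth savings against processing cost in a single score, would fit the description equally well.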

[0064] Further, even upon selecting a difference dataset to represent a given full dataset, some features of the source 102 may evaluate a variety of factors to determine whether or not to generate and transmit the difference dataset rather than the entire full dataset. As described in connection with analysis and transmission of the second dataset 254 in FIG.2B, in the event the cumulative differences between a given full dataset and available reference dataset(s) satisfy some criteria (e.g., exceed a threshold), the source 102 may determine to transmit the entire full dataset rather than a difference dataset. For example, upon analysis of a particular full dataset at the source 102, the diffset evaluator 106 may compare the sizes (e.g., amount of bytes) of the full dataset and a difference dataset that represents that full dataset based on difference(s) with reference dataset(s). The source 102 may evaluate the net reduction in data provided by the difference dataset (e.g., the difference between the sizes) and/or the relative reduction in data provided by the difference dataset (e.g., the ratio between the sizes). In some examples, the determination whether to generate and transmit the difference dataset or to transmit the full dataset may be based on comparing the net data reduction and/or relative data reduction with threshold(s). In some examples, such evaluations may be based on factors related to the data size of the difference dataset, such as the amount of data represented by referring to reference dataset(s) and/or the amount of data represented by replacement values included in the difference dataset.
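
The comparison of net and relative data reduction against thresholds can be sketched in Python as follows. The threshold values and the function name `should_send_diff` are assumptions for illustration, and requiring both criteria is only one of the combinations the description allows.

```python
def should_send_diff(full_size, diff_size,
                     min_net_saving=64, max_ratio=0.5):
    """Return True to send the difference dataset, False to send in full.

    full_size and diff_size are sizes in bytes. The thresholds are
    illustrative: the net reduction must be at least `min_net_saving`
    bytes and the diff must be at most `max_ratio` of the full size.
    """
    if full_size <= 0:
        raise ValueError("full_size must be positive")
    net_reduction = full_size - diff_size
    ratio = diff_size / full_size
    return net_reduction >= min_net_saving and ratio <= max_ratio

mostly_identical = should_send_diff(1000, 100)   # large net and relative gain
mostly_different = should_send_diff(1000, 900)   # ratio too high
tiny_dataset = should_send_diff(100, 40)         # net saving too small
```

A dataset such as the second dataset 254 of FIG. 2B would fall into the "mostly different" case and be transmitted in its entirety.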

[0065] FIG.3B is a flowchart of an example process 310 involving a target system generating multiple datasets using difference datasets. At block 312, the target system may maintain a reference dataset and difference dataset A. Difference dataset A may specify a difference between dataset A and the reference dataset. The reference dataset may be a dataset with copies stored at both the source and the target to allow both the source and the target to refer to the reference dataset. In some examples, the target 112 may receive the reference dataset and difference dataset A, for example via the communication interface 114. In some cases, the datasets may be provided to the target 112 by another system or process. For example, the datasets could be loaded, transferred, created, updated, read, stored, and/or maintained by a variety of systems such that the datasets become available for analysis by the target system 112.

[0066] At block 314, dataset A may be generated based on the reference dataset and difference dataset A. For example, the diffset handler 116 may generate dataset A by replicating the reference dataset, and populating the replicated data structure with values from the reference dataset or difference dataset A as specified by difference dataset A.

[0067] At block 316, the target system may receive difference dataset B. Difference dataset B may specify differences between dataset B and the reference dataset modified by at least a portion of difference dataset A. Thus, dataset B may have some portions that are identical to corresponding portions of the reference dataset and some portions that are identical to corresponding portions of difference dataset A. Moreover, a particular portion of the reference dataset that is modified by difference dataset A may be identical to a corresponding portion of dataset B. Difference dataset B may specify that some portions of difference dataset A should be applied to the reference dataset (namely, those portions identical to dataset B) to generate dataset B. Difference dataset B may further specify that the particular portion of the reference dataset should not be modified by difference dataset A, and that instead the particular portion of the reference dataset should be copied into dataset B.

[0068] In practice, difference dataset B may include data that indicates that the portion of difference dataset A that modifies the particular portion of the reference dataset should be ignored. For example, difference dataset B may include an indicator similar to indicator 227c of difference dataset 226 described above in connection with FIG.2A. Thus, difference dataset B may specify differences between dataset B and another dataset generated based on specified portion(s) of available reference dataset(s) and/or difference dataset(s) known to both the source and the target.

[0069] At block 318, dataset B may be generated using difference dataset B received in block 316 as well as difference dataset A and the reference dataset. For example, the diffset handler 116 may generate dataset B by establishing a data structure corresponding to the reference dataset and populating portions of the data structure with values from the reference dataset, difference dataset A, or difference dataset B as specified by difference dataset B.
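As a non-limiting sketch, blocks 314 and 318 may be illustrated as follows. Representing each dataset as a key-value mapping is an assumption made for this example. The `ignore_from_a` and `replace` fields are a hypothetical encoding of difference dataset B and are not the format of indicator 227c. The function names and sample configuration values are likewise illustrative.

```python
def apply_diff(reference, replacements):
    """Block 314: replicate the reference dataset, then overlay replacement values."""
    result = dict(reference)      # replicate the reference dataset
    result.update(replacements)   # populate with values from the difference dataset
    return result


def generate_dataset_b(reference, diff_a, diff_b):
    """Block 318: build dataset B from the reference, difference A, and difference B."""
    # Keep only the portions of difference dataset A that difference dataset B
    # does not mark as ignored.
    kept_from_a = {k: v for k, v in diff_a.items() if k not in diff_b["ignore_from_a"]}
    result = apply_diff(reference, kept_from_a)
    # Overlay the values that difference dataset B itself supplies.
    result.update(diff_b["replace"])
    return result


reference = {"mtu": 1500, "vlan": 10, "speed": "1G"}
diff_a = {"mtu": 9000, "vlan": 20}
# Difference dataset B reuses A's "mtu" change, ignores A's "vlan" change so the
# reference value is copied instead, and supplies a new "speed" value.
diff_b = {"ignore_from_a": {"vlan"}, "replace": {"speed": "10G"}}

dataset_a = apply_diff(reference, diff_a)
dataset_b = generate_dataset_b(reference, diff_a, diff_b)
print(dataset_a)
print(dataset_b)
```

In this sketch, difference dataset B carries a single replacement value and one ignore indicator, while the remaining content of dataset B is recovered from data the target already stores.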

[0070] FIG.4 is a flowchart of an example process 400 involving a source system using difference datasets to provide a target system with multiple datasets. In some examples, the process 400 may be performed by the source system 142 described in connection with FIG. 1D. Other suitable systems and/or computing devices may be used as well. Process 400 may be implemented in the form of executable instructions stored on at least one machine-readable storage medium of the system and executed by at least one processor of the system. Process 400 may be implemented in the form of electronic circuitry (e.g., hardware). Some steps of these processes may be executed concurrently or in a different order than shown in FIG.4. Moreover, process 400 may include more or fewer steps than are shown in FIG.4. In some examples, steps may be ongoing and/or may repeat.

[0071] At block 402, a source system may load a configuration dataset for a component in a target system. For example, the source 142 may load a configuration dataset for a network interface card or another component managed via management interface 154.

[0072] At block 404, the source may determine whether reference dataset(s) are stored at the target system. Copies of some datasets may be stored locally at both the source 142 and the target 152 (e.g., in the data storages 148 and 160, respectively) to be used as reference datasets by the source 142 when generating difference datasets and by the target 152 when creating full datasets using difference datasets. For example, the source 142 may query its local data storage 148 to determine whether reference dataset(s) are stored at the target 152, such as previously-sent dataset(s). If it is determined, at block 404, that no reference datasets are stored at the target, process 400 may proceed with block 414 by sending the full configuration dataset to the target system. Thus, in the event that the target does not have any reference datasets to refer to, such as may occur when there are no previously-sent datasets, the full dataset may be sent to the target to thereby enable the target to generate the full dataset without reference to any other datasets.

[0073] On the other hand, if at block 404 it is determined that reference dataset(s) are stored at the target system, process 400 may proceed with block 406 to identify portions of the reference dataset(s) that are identical to corresponding portions of the configuration dataset. For example, the diffset evaluator 146 may refer to local copies of the reference dataset(s), which mirror the reference dataset(s) at the target, and identify portions therein that are common with the presently analyzed configuration dataset.

[0074] At block 408, the source system may determine whether the common portions (e.g., identical portions) are cumulatively greater than a threshold. For example, the diffset evaluator 146 may determine whether the cumulative amount of the identical data (e.g., number of bytes, etc.) is greater than a threshold. The threshold may be an absolute threshold (e.g., number of bytes) or a proportionate threshold (e.g., 40% of the total size of the dataset). If it is determined, at block 408, that the common portions are not cumulatively greater than the threshold, process 400 may proceed with block 414 by sending the full configuration dataset to the target system. Thus, in the event that the amount of data in the common portions is not greater than the threshold, the full dataset may be sent in its entirety.

[0075] On the other hand, if the amount of the data in the common portions exceeds the threshold, then process 400 may proceed with block 410. At block 410, the source system may generate a difference dataset based on the configuration dataset to be transmitted and the reference dataset(s) that include common portions with the configuration dataset. For example, the diffset generator 147 in the source 142 may generate a difference dataset that specifies all differences between the configuration dataset to be transmitted and some combination of reference dataset(s) stored at the target 152. As described above, the difference dataset may include replacement values and/or values to append to the reference dataset(s).

[0076] At block 412, the difference dataset may be sent to the target. For example, the communication interface 144 may transmit the difference dataset generated by the diffset generator 147 over communication path 150 to the target 152, where it is received via the communication interface 156. Following transmission of the difference dataset, at block 412, or transmission of the full configuration dataset, at block 414, process 400 may proceed with block 416. At block 416, the source system may store a local copy of the dataset sent to the target system. For example, the source 142 may store a local copy of the transmitted dataset in the data storage 148. Because the target 152 also stores a local copy of the received dataset, the locally stored copy may be a mirror of the datasets available at the target 152. In some examples, the dataset stored in the data storage 148 of the source 142 may be used for reference in generating subsequent difference datasets.
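For illustration only, the source-side decisions of blocks 404 through 414 may be sketched as follows. This sketch assumes that datasets are mappings from keys to byte strings, that a proportionate threshold of 40% is used, and that the most recently stored reference dataset is the one consulted. The function name `choose_transmission` is hypothetical, and values to be removed from a reference dataset are not modeled.

```python
def choose_transmission(config, references, threshold=0.4):
    """Return ("full", dataset) or ("diff", difference) for a configuration dataset.

    config maps keys to bytes values. references is a list of locally mirrored
    reference datasets in the same form. threshold is a proportionate threshold.
    """
    if not references:                 # block 404: no reference datasets at the target
        return "full", config          # block 414: send the full dataset
    reference = references[-1]         # assumption: consult the latest reference only
    # Block 406: identify portions identical to the reference dataset.
    common = {k for k, v in config.items() if k in reference and reference[k] == v}
    common_size = sum(len(config[k]) for k in common)
    total_size = sum(len(v) for v in config.values())
    # Block 408: common portions must be cumulatively greater than the threshold.
    if total_size == 0 or common_size / total_size <= threshold:
        return "full", config          # block 414
    # Block 410: the difference dataset holds only the non-identical values.
    diff = {k: v for k, v in config.items() if k not in common}
    return "diff", diff                # block 412


config = {"a": b"xxxx", "b": b"yyyy", "c": b"zz"}
mostly_same = {"a": b"xxxx", "b": b"yyyy", "c": b"qq"}
mostly_different = {"a": b"0000", "b": b"1111", "c": b"zz"}

print(choose_transmission(config, []))
print(choose_transmission(config, [mostly_same]))
print(choose_transmission(config, [mostly_different]))
```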

[0077] At block 418, the target system may use the received dataset to generate configuration data and use the configuration data to configure an associated component. For example, the management interface 154 of the target 152 may first generate a full dataset (e.g., if the received dataset is a difference dataset). Once the full configuration dataset is generated, the config manager 162 may apply the configuration dataset to one of the configurable components 164.

[0078] At block 420, the target system may store a local copy of the received dataset as a reference dataset for use in generating subsequent datasets. For example, the target 152 may store copies of at least some received datasets in data storage 160. The stored datasets at the target 152 may mirror the stored datasets at the source 142 so that each may use the same set of reference datasets when generating difference datasets or in reconstructing full datasets using difference datasets, respectively. In some examples, a given dataset may include an indication of whether the dataset should be stored at the source and/or target for future reference. Such an indication may then be used as a basis for determining whether to store a local copy of the dataset. In some examples, local copies of a given dataset may be stored based on an analysis by the source and/or target, such as available data storage capacity, size of the given dataset, etc.

[0079] FIG.5 is a flowchart of an example process 500 involving a source system using difference datasets to provide a target system with multiple datasets. In some examples, the process 500 may be performed by the source system 142 described in connection with FIG. 1D. Other suitable systems and/or computing devices may be used as well. Process 500 may be implemented in the form of executable instructions stored on at least one machine-readable storage medium of the system and executed by at least one processor of the system. Process 500 may be implemented in the form of electronic circuitry (e.g., hardware). Some steps of these processes may be executed concurrently or in a different order than shown in FIG.5. Moreover, process 500 may include more or fewer steps than are shown in FIG.5. In some examples, steps may be ongoing and/or may repeat.

[0080] At block 502, the source system may maintain a base dataset, a first full dataset, and difference dataset A. In some examples, the base dataset and difference dataset A may each be reference datasets with identical copies accessible to both the source system and a target system and the first full dataset may be a dataset for the source to communicate to the target. For example, the source 102 may maintain each of the base dataset, difference dataset A, and first full dataset in data storage 108. In some cases, difference dataset A may be a previously-transmitted dataset whereas the first dataset may be a dataset to be communicated presently.

[0081] At block 504, the source may generate difference dataset B. Difference dataset B may specify all differences between the full first dataset and the base dataset. For example, the diffset evaluator 106 and/or diffset generator 107 in source 102 may generate difference dataset B by identifying all differences between the first dataset and the base dataset. Such differences may include values that differ from corresponding ones in the base and/or values that should be appended or excised from the base to arrive at the first dataset. Indicators of the identified differences (e.g., replacement values) may then be included in difference dataset B along with a reference to the copy of the base dataset at the target. The information in difference dataset B may then be used by the target, in combination with its own copy of the base dataset, to generate a copy of the full first dataset.

[0082] At block 506, the source may generate difference dataset C. Difference dataset C may specify all differences between the full first dataset and the base dataset modified by at least a portion of difference dataset A. The process to generate difference dataset C may be similar to the process of identifying differences and including those differences in the difference dataset described above in connection with block 504. However, difference dataset C specifies differences relative to a modified version of the base dataset, namely a version which is modified by at least a portion of difference dataset A. For example, if there is any portion of difference dataset A that is identical to a corresponding portion of the full first dataset, the base dataset may be modified in accordance with that portion. In some examples, the base dataset may be modified in accordance with the entirety of difference dataset A, and difference dataset C may specify differences between the so-modified dataset and the full first dataset. Further, difference dataset C may include indicator(s) that allow the target to identify its own local copies of the base dataset and difference dataset A, and/or portions of those datasets to allow the target to generate the full first dataset using difference dataset C and its local copies of the indicated datasets.

[0083] At block 508, the sizes of the two difference datasets generated in blocks 504 and 506 may be compared with a threshold. For example, the diffset evaluator 106 may compare a measure of the amount of data (e.g., number of bytes, etc.) of difference dataset B and difference dataset C. The sizes compared may incorporate both indications of replacement and/or appended values as well as indicators of referenced datasets and instructions of how to reassemble the corresponding full dataset (i.e., indications of which portions to replace, which to append, which to copy from referenced datasets, etc.). The threshold may be an absolute data threshold, such as an amount of data bytes. In some examples, the threshold may be a relative threshold, such as a proportion of the amount of bytes in the full first dataset. For instance, the threshold may be 40% (e.g., 40% of the size of the full first dataset).

[0084] At block 510, the source may determine whether the two difference datasets are both above the threshold. If so, the process 500 may proceed with block 516. If not, the process 500 may proceed with block 512. Thus, block 512 may be performed if one or both of the two difference datasets B and C are below the threshold. At block 512, the source may determine which of the two difference datasets has a smaller size. If difference dataset B is smaller, then process 500 may proceed with block 518. If difference dataset C is smaller, then process 500 may proceed with block 520.

[0085] At block 516, which may be performed if the sizes of both candidate difference datasets are not below the threshold, as determined in block 510, the full first dataset may be transmitted to the target. For example, the source 102 may transmit the full first dataset to the target 112 via communication path 110 and forgo transmission of either of the two difference datasets. In some examples, the threshold may be a relative measure of the size of the candidate difference datasets B and C. For example, the threshold may be 40% of the size of the full first dataset that the two candidate difference datasets represent. For instance, if the full first dataset includes X bytes of data, the threshold may be 0.4 × X bytes.

[0086] At block 518, which may be performed if the size of difference dataset B is both less than the threshold, as determined in block 510, and smaller than difference dataset C, as determined in block 512, difference dataset B may be transmitted to the target. For example, the source 102 may transmit difference dataset B to the target 112 via communication path 110.

[0087] At block 520, which may be performed if the size of difference dataset C is both less than the threshold, as determined in block 510, and smaller than difference dataset B, as determined in block 512, difference dataset C may be transmitted to the target. For example, the source 102 may transmit difference dataset C to the target 112 via communication path 110.

[0088] Following performance of one of blocks 516, 518, or 520, the source may maintain a local copy of the transmitted dataset, such as by storing the transmitted dataset in local data storage. If the transmitted dataset is also maintained at the target, then the transmitted dataset may then be used as a reference dataset during generation of subsequent difference datasets. In some examples, each transmitted dataset may be maintained by both the source and the target for use in generating subsequent difference datasets (at the source) and full datasets (at the target). As such, each transmitted dataset may help reduce the size of subsequent datasets, because each subsequent full dataset may be represented by an increasing number of reference datasets. By increasing the number of reference datasets, a given full dataset is increasingly likely to be represented efficiently by a difference dataset that refers to some combination of such reference datasets.

[0089] Note that a variety of different techniques may be used to perform blocks 510 and 512 so that one of multiple courses of action is selected based at least in part on the size(s) of multiple candidate difference datasets that represent a given full dataset. In particular, as shown in FIG. 5, the sizes of difference datasets B and C are used as a basis to determine how to communicate the content of the full first dataset from the source to the target. The source may determine whether to: (a) transmit the full first dataset and forgo transmission of either difference dataset, as in block 516; (b) transmit difference dataset B, as in block 518; or (c) transmit difference dataset C, as in block 520.
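The selection among courses of action (a), (b), and (c) may be sketched, purely as an example, as follows. The function name `select_dataset` is hypothetical. Treating a size exactly equal to the threshold as not below it, and preferring difference dataset B when the two candidates are the same size, are assumptions of this sketch rather than features of process 500.

```python
def select_dataset(full_size, size_b, size_c, threshold_ratio=0.4):
    """Pick which dataset to transmit, following blocks 510 through 520.

    Sizes are in bytes. threshold_ratio is a relative threshold expressed as a
    proportion of the size of the full first dataset (0.4 × X bytes for X bytes).
    """
    threshold = threshold_ratio * full_size
    # Block 510: are both candidate difference datasets not below the threshold?
    if size_b >= threshold and size_c >= threshold:
        return "full"   # block 516: transmit the full first dataset
    # Block 512: at least one candidate is below the threshold; pick the smaller.
    # Assumption: ties go to difference dataset B.
    return "B" if size_b <= size_c else "C"   # block 518 or block 520


print(select_dataset(1000, 500, 450))
print(select_dataset(1000, 300, 450))
print(select_dataset(1000, 500, 200))
```

Because the smaller candidate is necessarily below the threshold whenever either candidate is, a single comparison at block 512 suffices once block 510 has ruled out the full dataset.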

[0090] Moreover, in some examples, the process 500 may be modified so as to determine which dataset to transmit without first generating each candidate dataset. For example, rather than generating the candidate difference datasets B and C in blocks 504 and 506, some examples may involve the source determining size(s) of multiple candidate difference datasets. Such size determinations may be based on the extent to which a given full dataset includes portions identical with corresponding portions of available reference dataset(s) (e.g., base dataset(s) and/or previously transmitted difference dataset(s)). For example, the diffset evaluator 106 may identify all portions that are not identical to corresponding portions of available reference datasets. The diffset evaluator 106 may then estimate the size of a corresponding difference dataset based on the size(s) of such identified non-identical portions. With such size information, the source may make comparisons with a threshold and among the candidates, and select, for transmission to the target, between multiple candidate difference datasets and the corresponding full dataset itself.
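One possible way to estimate the size of a candidate difference dataset without generating it is sketched below. The sketch assumes datasets are mappings from keys to byte strings. The `header_overhead` and `per_entry_overhead` values are hypothetical placeholders for the cost of the reference indicator and of each replacement instruction. The description does not specify such figures.

```python
def estimate_diff_size(full, reference, header_overhead=8, per_entry_overhead=4):
    """Estimate the byte size of a difference dataset without building it.

    full and reference map keys to bytes values. The overhead parameters are
    assumed costs for the reference indicator and for each replacement entry.
    """
    # Identify all portions that are not identical to the corresponding
    # portions of the reference dataset (missing keys count as non-identical).
    non_identical = [v for k, v in full.items() if reference.get(k) != v]
    payload = sum(len(v) + per_entry_overhead for v in non_identical)
    return header_overhead + payload


full = {"a": b"xxxx", "b": b"yy", "c": b"zzz"}
reference = {"a": b"xxxx", "b": b"qq"}

# "b" differs and "c" is absent from the reference, so both count toward the estimate.
print(estimate_diff_size(full, reference))
```

An estimate of this kind could then be compared with the threshold and with estimates for other candidates in the same manner as the sizes of generated difference datasets.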

[0091] FIG.6 is a block diagram of an example source system 600 for generating and sending difference datasets and an example target system 630 for using difference datasets to generate full datasets. System 600 may be similar to systems 102 and 142 described in connection with FIGS.1-5, for example. System 630 may be similar to systems 112 and 152 described in connection with FIGS. 1-5, for example. In FIG. 6, system 600 includes a processor 610 and a non-transitory machine-readable storage medium 620. Similarly, system 630 includes a processor 640 and a non-transitory machine-readable storage medium 650. Although the following descriptions refer to a single processor and a single machine-readable storage medium, the descriptions may also apply to a system with multiple processors and/or multiple machine-readable storage mediums. In such examples, the instructions may be distributed (e.g., stored) across multiple machine-readable storage mediums and the instructions may be distributed (e.g., executed by) across multiple processors.

[0092] Processors 610 and 640 each may incorporate central processing units (CPUs), microprocessors, and/or other hardware devices suitable for retrieval and execution of instructions stored in non-transitory machine-readable storage medium 620 and 650, respectively. In the example shown in FIG.6, processor 610 may fetch, decode, and execute instructions 622, 624; similarly, processor 640 may fetch, decode, and execute instructions 652, 654. In some examples, processor 610 and processor 640 may include electronic circuits having electronic components for performing the processes specified by the instructions in machine-readable storage medium 620 and 650, respectively. With respect to the executable instruction representations (e.g., boxes) described and shown herein, it should be understood that part or all of the executable instructions and/or electronic circuits included within one box may, in some examples, be included in a different box shown in the figures or in a different box not shown.

[0093] Machine-readable storage mediums 620 and 650 may be any electronic, magnetic, optical, or other physical storage device that stores executable instructions. Thus, machine-readable storage mediums 620 and 650 may be, for example, Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disc, and the like. Machine-readable storage medium 620 may be disposed within system 600, and machine-readable storage medium 650 may be disposed within system 630, as shown in FIG.6. In this situation, the executable instructions may be “installed” on the system 600 and 630, respectively. In some examples, machine-readable storage medium 620 and/or 650 may be a portable, external or remote storage medium, for example, that allows system 600 and/or 630 to download the instructions from the portable/external/remote storage medium. In this situation, the executable instructions may be part of an “installation package”. As described herein, machine-readable storage medium 620 may be encoded with executable instructions for evaluating a given full dataset to determine a difference dataset corresponding to that dataset, generate the difference dataset, and transmit the generated difference dataset to the target 630. Machine-readable storage medium 650 may be encoded with executable instructions for receiving an incoming difference dataset from the source 600 and using the difference dataset to generate a corresponding full dataset.

[0094] Referring to system 600, dataset evaluation instructions 622, when executed by a processor (e.g., 610), may cause system 600 to evaluate a given dataset and determine a difference dataset that represents that dataset through reference to reference dataset(s) available to both the source and the target. Difference dataset generation instructions 624, when executed by a processor (e.g., 610), may cause system 600 to generate a difference dataset that indicates all differences between a given dataset and identified reference dataset(s) available to both the source and the target (e.g., by providing value(s) for replacement and/or addition to the identified reference dataset(s)). Dataset storage instructions 652, when executed by a processor (e.g., 640), may cause system 630 to store local copies of received difference datasets and/or full datasets for reference in generating subsequent datasets at the target 630. Dataset generation instructions 654, when executed by a processor (e.g., 640), may cause system 630 to generate a full dataset corresponding to a given difference dataset (e.g., by referencing dataset(s) referenced in the given difference dataset and populating the full dataset with values from a specified combination of the difference dataset and the referenced dataset(s)).