


Title:
METHOD TO PARALLELIZE THE RESTORATION OF CONTINUOUS DATA PROTECTION (CDP) REPLICATION USING DELTA COMPRESSION
Document Type and Number:
WIPO Patent Application WO/2023/006168
Kind Code:
A1
Abstract:
Provided is a method for data management in a data storage (202A-N). The method includes, at each of a series of snapshot time points, replicating a plurality of data blocks from the data storage in a backup storage. The method includes, at each of a series of intermediate time points, (i) determining one or more changed data blocks, and (ii) replicating the changed data blocks in the backup storage. The method includes, in response to a restore request for a selected intermediate time point, generating a dependency graph by processing each intermediate time point between the selected time point and a preceding snapshot time point, in reverse time order starting with the selected time point. The method includes performing a unique path analysis to determine one or more independent paths (502). The method includes restoring the plurality of data blocks replicated at the preceding snapshot time point from backup storage.

Inventors:
GOODMAN DANIEL (DE)
OFEK ITAMAR (DE)
Application Number:
PCT/EP2021/070784
Publication Date:
February 02, 2023
Filing Date:
July 26, 2021
Assignee:
HUAWEI TECH CO LTD (CN)
GOODMAN DANIEL (DE)
International Classes:
G06F11/14; G06F11/10
Foreign References:
US20170060449A1 (2017-03-02)
US20100100698A1 (2010-04-22)
Attorney, Agent or Firm:
KREUZ, Georg (DE)
Claims:
CLAIMS

1. A computer-implemented method for data management in a data storage (202A-N), the method comprising: at each of a series of snapshot time points, replicating a plurality of data blocks from the data storage (202A-N) in a backup storage; at each of a series of intermediate time points: determining one or more changed data blocks that have been updated in the data storage (202A-N); and replicating the changed data blocks in the backup storage, wherein if one or more of the changed data blocks are similar to a reference block that has previously been replicated, the similar data blocks are represented by an address of the reference block, a time point of the reference block and a delta value; and in response to a restore request for a selected time point, generating a dependency graph by: processing each intermediate time point between the selected time point and a preceding snapshot time point, in reverse time order starting with the selected time point, by: for each data block replicated at the processed time point: adding a node corresponding to the address of the data block, if it does not already exist in the graph; checking whether the node corresponding to the address of the data block has a sub-node corresponding to the processed time point; and in the case where a sub-node corresponding to the processed time point does not already exist for the node in the graph, and if the processed time point of the data block is more recent than a time point associated with any existing sub-node of the node: adding a sub-node corresponding to the processed time point to the node, and identifying any reference block associated with the data block; in the case where a sub-node corresponding to the processed time point already exists for the node in the graph: identifying any reference block associated with the data block; and for each identified reference block: adding a node corresponding to the address of the reference block, if it does not already exist in the graph; adding a sub-node corresponding to the time point of the reference block to the node, if it does not already exist in the graph; and mapping the association with the reference block as a path between the respective sub-nodes; performing a unique path analysis to determine one or more independent paths (504) covering all sub-nodes in the dependency graph; and restoring the plurality of data blocks replicated at the preceding snapshot time point from backup storage and following each of the independent paths to propagate a plurality of changes made up to the selected time point.

2. The method of claim 1, wherein the reference block addresses and delta values are compressed.

3. The method of claim 1 or claim 2, wherein a new intermediate time point is made after a predetermined time or a predetermined volume of changed data.

4. The method of claim 3, wherein the predetermined time is 5 seconds and the predetermined volume of changed data is 10 megabytes.

5. The method of any preceding claim, wherein each reference block address is stored with a checksum value for the reference block.

6. The method of claim 5, wherein restoring each reference block comprises generating a checksum value for the reference block and validating the generated checksum value against the stored checksum value.

7. The method of claim 6, wherein if the validation of the reference block fails, the method further comprises restarting the independent path (502) corresponding to the restored block.

8. The method of any preceding claim, wherein restoring the data blocks from the backup storage comprises sending the independent paths (502) to an external device for execution.

9. The method of any preceding claim, wherein restoring the data blocks comprises identifying one or more snapshot data blocks corresponding to the preceding snapshot time point, and caching the snapshot data blocks.

10. The method of any preceding claim, wherein restoring the plurality of data blocks comprises, while processing a data block, pre-fetching a following data block on the same independent path (502).

11. The method of any preceding claim, wherein restoring the plurality of data blocks comprises following a plurality of the independent paths (502) in parallel.

12. The method of claim 11, further comprising determining a maximum number of data blocks that can be restored simultaneously and setting a corresponding maximum number of independent paths (502) that can be followed in parallel.

13. A computer readable medium configured to store instructions which, when executed by a processor, cause the processor to execute the method of any preceding claim.

14. A data protection module (100, 204) for a data storage (202A-N), the data protection module (100, 204) comprising one or more processors (102A-N) configured to execute the method of any one of claims 1 to 12.

15. A data storage system (200) comprising: one or more data storages (202A-N); and the data protection module (100, 204) of claim 14.

Description:
METHOD TO PARALLELIZE THE RESTORATION OF CONTINUOUS DATA PROTECTION (CDP) REPLICATION USING DELTA COMPRESSION

TECHNICAL FIELD

The disclosure relates generally to a restoration of continuous data protection (CDP) to a recovery point in parallel, and more particularly, the disclosure relates to a computer-implemented method for data management in a data storage. Moreover, the disclosure also relates to a data protection module for the data storage and a data storage system including the data protection module for data management in the data storage.

BACKGROUND

Continuous data protection (CDP) is a form of data replication that copies data from a source system to a target system. In CDP replication, sets of data extents that change in the same time period are grouped into datasets and sent to the target system. A dataset (DS) may contain a fully compressed data extent or a differential data extent. The dataset is significant because it is a stepping stone from one crash-recovery point to another. Typically, each dataset may include 5 seconds' worth of data. CDP also means that a recovery/restore is made to a crash-consistent point, which is a boundary between the datasets.

Typically, restoring to a traditional CDP restore point involves restoring a target data volume to a snapshot and then applying CDP datasets in order until the recovery point dataset is applied. Each CDP dataset includes all the data extents and metadata describing the writes to the source system over a dataset period. Traffic reduction refers to a process in which compression is performed by referencing data that already exists at the target system, and calculating and sending a representation of the difference in the data.

Restoring data to a point in time from continuous data protection (CDP) replication may involve traffic reduction. Traffic reduction makes restoration complex because, in order to restore a traffic reduction extent/data block, the reference extent/data block has to be obtained so that the difference transform can be applied. If the reference data block does not exist, it is important to determine how to obtain its value. The reference data block may itself be compressed using traffic reduction, and thus its reference also has to be resolved. Long dependency chains and multiple dependency chains typically arise when restoring data to a time point using traffic reduction.

An existing solution provides a method to generate a restoration plan that only restores a minimum number of CDP datasets and extents/data blocks and restores the CDP datasets to a target disk, dataset by dataset. This leads to many wasted inputs/outputs (IOs) to the target disk and is also extremely slow to execute. This method cannot be performed in parallel when restoring the continuous data protection replication. This method has limitations as it is unable to recover from errors and may need to be restarted from a fresh disk. Further, this method does not control the resource requirements and usage of a restoration engine. The CDP datasets include traffic-reduction compression. Traffic reduction requires logic for generating the recovery plan, as restoring data blocks requires access to reference data blocks, to which a transform is applied in order to generate the contents of the current extent/data block. The recovery plan ensures that the reference data block is available when needed. This method is fast and simple to calculate. However, it has performance limitations as it cannot be executed in parallel and does not enable resource limits during the restoration process. Thus, restoring CDP datasets that include traffic reduction is currently significantly slower than restoring CDP datasets without traffic reduction. Further, resolving the traffic reduction of a data block is computationally complex, as the reference data block could itself be a traffic reduction data block with its own dependencies. Additionally, there may exist multiple dependencies on the same reference data block, where the depending data blocks may be separated over time in different datasets, making it necessary to determine (a) how long to cache each reference data block and (b) when it is safe to remove it. A further limitation of the existing solution is that if any anomaly is detected during a restoration process, the entire process has to start again.

Further limitations of the existing method are that (i) every reference data block has to be written to the target disk and then read back multiple times, (ii) the datasets have to be restored one by one and cannot be parallelized; even if the data blocks can be read in parallel from a dataset, when the dataset contains traffic reduction blocks, all the reference blocks must first be read and only then can the new blocks be written, (iii) it is slow, and (iv) if there is an error, the restore session has to be destroyed and restarted. Therefore, there arises a need to address the aforementioned technical problems/drawbacks in restoring an extent/data block in a data storage to a recovery point.

SUMMARY

It is an object of the disclosure to provide a computer-implemented method for data management in a data storage, a data protection module for the data storage, and a data storage system including the data protection module for data management in the data storage while avoiding one or more disadvantages of prior art approaches.

This object is achieved by the features of the independent claims. Further, implementation forms are apparent from the dependent claims, the description, and the figures.

The disclosure provides a computer-implemented method for data management in a data storage, a data protection module for the data storage, and a data storage system including the data protection module for data management in the data storage.

According to a first aspect, there is provided a computer-implemented method for data management in a data storage. The method includes, at each of a series of snapshot time points, replicating a plurality of data blocks from the data storage in a backup storage. The method includes, at each of a series of intermediate time points, (i) determining one or more changed data blocks that have been updated in the data storage, and (ii) replicating the changed data blocks in the backup storage. If one or more of the changed data blocks are similar to a reference block that has previously been replicated, the similar data blocks are represented by an address of the reference block, a time point of the reference block and a delta value. The method includes, in response to a restore request for a selected time point, generating a dependency graph by processing each intermediate time point between the selected time point and a preceding snapshot time point, in reverse time order starting with the selected time point, by: (a) for each data block replicated at the processed time point: (i) adding a node corresponding to the address of the data block, if it does not already exist in the graph, (ii) checking whether the node corresponding to the address of the data block has a sub-node corresponding to the processed time point, and (iii) in the case where a sub-node corresponding to the processed time point does not already exist for the node in the graph, and if the processed time point of the data block is more recent than a time point associated with any existing sub-node of the node: adding a sub-node corresponding to the processed time point to the node, and identifying any reference block associated with the data block; or in the case where a sub-node corresponding to the processed time point already exists for the node in the graph: identifying any reference block associated with the data block; and (b) for each identified reference block: (i) adding a node corresponding to the address of the reference block, if it does not already exist in the graph, (ii) adding a sub-node corresponding to the time point of the reference block to the node, if it does not already exist in the graph, and (iii) mapping the association with the reference block as a path between the respective sub-nodes. The method includes performing a unique path analysis to determine one or more independent paths covering all sub-nodes in the dependency graph. The method includes restoring the plurality of data blocks replicated at the preceding snapshot time point from backup storage and following each of the independent paths to propagate a plurality of changes made up to the selected time point.

The method enables the generation of the dependency graph and efficiently restores the data blocks replicated at the preceding snapshot time point in the data storage from the backup storage. The method enables restoration of the data blocks and improves the performance of the traffic reduction. For regular continuous data protection (CDP) (i.e. without the traffic reduction), the method maximizes the parallelism in the restoration process by restoring the data blocks in parallel and also enables pre-fetching of the data blocks. The method minimizes unneeded read and write inputs/outputs (IOs) to a target disk, and it is fast to execute. The method controls the performance and resource usage by a data protection module while restoring the data blocks in parallel. The method enables fine-grained recovery from data errors, thereby eliminating the need for restarting the entire restoration process. The dependency graph generated by the method is compact and has a tree structure that replaces metadata. The method enables the generation of discrete restoration trees which improves performance through parallelized execution and error recovery.

The reference block addresses and delta values may be compressed. Optionally, a new intermediate time point is made after a predetermined time or a predetermined volume of changed data. Optionally, the predetermined time is 5 seconds and the predetermined volume of changed data is 10 megabytes.

Each reference block address may be stored with a checksum value for the reference block. Optionally, restoring each reference block includes generating a checksum value for the reference block and validating the generated checksum value against the stored checksum value. If the validation of the reference block fails, the method may further include restarting the independent path corresponding to the restored block.

Optionally, restoring the data blocks from the backup storage includes sending the independent paths to an external device for execution. Optionally, restoring the data blocks includes identifying one or more snapshot data blocks corresponding to the preceding snapshot time point, and caching the snapshot data blocks. Restoring the plurality of data blocks may include, while processing a data block, pre-fetching a following data block on the same independent path. Optionally, restoring the plurality of data blocks includes following a plurality of the independent paths in parallel. Optionally, the method further includes determining a maximum number of data blocks that can be restored simultaneously and setting a corresponding maximum number of independent paths that can be followed in parallel.

According to a second aspect, there is provided a computer readable medium configured to store instructions which, when executed by a processor, cause the processor to execute the above method.

According to a third aspect, there is provided a data protection module for a data storage. The data protection module includes one or more processors configured to execute the above method.

The data protection module enables the generation of the dependency graph and efficiently restores the data blocks replicated at the preceding snapshot time point in the data storage from the backup storage. The data protection module enables restoration of the data blocks and improves the performance of the traffic reduction. For regular continuous data protection (CDP) (i.e. without the traffic reduction), the data protection module maximizes the parallelism in the restoration process by restoring the data blocks in parallel and also enables pre-fetching of the data blocks. The data protection module minimizes unneeded read and write inputs/outputs (IOs) to a target disk, and it is fast to execute. The data protection module controls the performance and resource usage while restoring the data blocks in parallel. The data protection module can enable fine-grained recovery from data errors, thereby eliminating the need for restarting the entire restoration process. The data protection module can enable the generation of discrete restoration trees which can improve the performance through parallelized execution and error recovery.

According to a fourth aspect, there is provided a data storage system. The data storage system includes one or more data storages and the data protection module as described above.

Therefore, in contradistinction to the existing solutions, the method enables the generation of the dependency graph and efficiently restores the data blocks replicated at the preceding snapshot time point in the data storage from the backup storage. The method enables restoration of the data blocks and improves the performance of the traffic reduction. For regular continuous data protection (CDP) (i.e. without the traffic reduction), the method maximizes the parallelism in the restoration process by restoring the data blocks in parallel and also enables pre-fetching of the data blocks. The method minimizes unneeded read and write inputs/outputs (IOs) to a target disk, and it is fast to execute. The method controls the performance and resource usage by a data protection module while restoring the data blocks in parallel. The method enables fine-grained recovery from data errors, thereby eliminating the need for restarting the entire restoration process. The dependency graph generated by the method is compact and has a tree structure that replaces metadata. The method enables the generation of discrete restoration trees which improves performance through parallelized execution and error recovery.

These and other aspects of the disclosure will be apparent from the implementation(s) described below.

BRIEF DESCRIPTION OF DRAWINGS

Implementations of the disclosure will now be described, by way of example only, with reference to the accompanying drawings, in which:

FIG. 1 is a block diagram of a data protection module for a data storage in accordance with an implementation of the disclosure;

FIG. 2 is a block diagram of a data storage system in accordance with an implementation of the disclosure;

FIG. 3 is an exemplary map of dependencies for extents required in restoring to a point in time in accordance with an implementation of the disclosure;

FIG. 4 illustrates an exemplary dependency graph that is generated for a data storage using a data protection module in accordance with an implementation of the disclosure;

FIG. 5 illustrates an exemplary process of determining one or more independent paths using a data protection module in accordance with an implementation of the disclosure;

FIG. 6 illustrates an exemplary process of extracting one or more independent paths covering all sub-nodes from a dependency graph into one or more execution trees in accordance with an implementation of the disclosure;

FIG. 7 illustrates an exemplary view of execution of one or more execution trees combined with resource management in accordance with an implementation of the disclosure;

FIGS. 8A-8C are flow diagrams that illustrate a method for data management in a data storage in accordance with an implementation of the disclosure; and

FIG. 9 is an illustration of an exemplary data protection module, a data storage system, or a computer system in which the various architectures and functionalities of the various previous implementations may be implemented.

DETAILED DESCRIPTION OF THE DRAWINGS

Implementations of the disclosure provide a computer-implemented method for data management in a data storage. The disclosure also relates to a data protection module for the data storage and to a data storage system including the data protection module for data management in the data storage.

To make solutions of the disclosure more comprehensible for a person skilled in the art, the following implementations of the disclosure are described with reference to the accompanying drawings.

Terms such as "a first", "a second", "a third", and "a fourth" (if any) in the summary, claims, and foregoing accompanying drawings of the disclosure are used to distinguish between similar objects and are not necessarily used to describe a specific sequence or order. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the implementations of the disclosure described herein are, for example, capable of being implemented in sequences other than the sequences illustrated or described herein. Furthermore, the terms "include" and "have" and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, a method, a system, a product, or a device that includes a series of steps or units, is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.

Definitions:

Delta compression: Delta compression or data difference is a way of storing or transmitting data in the form of differences (i.e. deltas) between blocks of data rather than the complete blocks. The differences are recorded discretely, and may be called "deltas" or "diffs". By comparing data, a reference block of data can be found such that the differences between a block of data and the reference block are small. The delta compression greatly reduces data redundancy. Collections of unique deltas are substantially more space-efficient than their non-encoded equivalents.
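For illustration only, the following Python sketch shows one way such a difference could be computed and applied for fixed-size blocks of equal length; the (offset, bytes) representation and the function names are assumptions of this sketch, not the encoding used by the disclosure.

```python
# Minimal delta-compression sketch: record only the byte ranges in which a
# block differs from a reference block, and rebuild the block by patching the
# reference. Assumes blocks of equal, fixed length (e.g. 4 kB extents).

def delta_encode(block: bytes, reference: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, differing bytes) runs where `block` differs from `reference`."""
    deltas, start = [], None
    for i in range(len(block)):
        if block[i] != reference[i]:
            if start is None:
                start = i                      # a differing run begins here
        elif start is not None:
            deltas.append((start, block[start:i]))
            start = None
    if start is not None:
        deltas.append((start, block[start:]))  # run extends to the end of the block
    return deltas

def delta_decode(reference: bytes, deltas: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the block by applying the recorded differences to the reference."""
    out = bytearray(reference)
    for offset, data in deltas:
        out[offset:offset + len(data)] = data
    return bytes(out)

# A block that differs from its reference in a few bytes is stored as a short
# list of deltas instead of the full block contents.
assert delta_decode(b"abcdefgh", delta_encode(b"abXdefYh", b"abcdefgh")) == b"abXdefYh"
```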

Extent: An extent is a contiguous area of physical storage allocated by a file system. An extent may be referred to as a block of data, or may include a range of one or more data blocks.

Snapshot: The snapshot is a state of a storage system (e.g. a data storage system) captured at a given point in time. Preserving the storage system state not only allows data to be recovered in the event of failure but also allows it to be restored to known working points.

Data Storage: The data storage is essential because it backs up critical data to a central location. Users can then easily access this data. The data storage units are data storage devices/systems that allow storage and retrieval of data from a central location for authorized network users.

FIG. 1 is a block diagram of a data protection module 100 for a data storage in accordance with an implementation of the disclosure. The data protection module 100 includes one or more processors 102A-N. The one or more processors 102A-N are configured, at each of a series of snapshot time points, to replicate one or more data blocks from the data storage in a backup storage. The one or more processors 102A-N are configured, at each of a series of intermediate time points, to (i) determine one or more changed data blocks that have been updated in the data storage, and (ii) replicate the changed data blocks in the backup storage. If one or more of the changed data blocks are similar to a reference block that has previously been replicated, the similar data blocks are represented by an address of the reference block, a time point of the reference block, and a delta value.

The one or more processors 102A-N are configured, in response to a restore request for a selected time point, to generate a dependency graph by processing each intermediate time point between the selected time point and a preceding snapshot time point, in reverse time order starting with the selected time point, by: (a) for each data block replicated at the processed time point: (i) adding a node corresponding to the address of the data block, if it does not already exist in the graph, (ii) checking whether the node corresponding to the address of the data block has a sub-node corresponding to the processed time point, and (iii) in the case where a sub-node corresponding to the processed time point does not already exist for the node in the graph, and if the processed time point of the data block is more recent than a time point associated with any existing sub-node of the node: adding a sub-node corresponding to the processed time point to the node, and identifying any reference block associated with the data block; or in the case where a sub-node corresponding to the processed time point already exists for the node in the graph: identifying any reference block associated with the data block; and (b) for each identified reference block: (i) adding a node corresponding to the address of the reference block, if it does not already exist in the graph, (ii) adding a sub-node corresponding to the time point of the reference block to the node, if it does not already exist in the graph, and (iii) mapping the association with the reference block as a path between the respective sub-nodes. The one or more processors 102A-N are configured to determine one or more independent paths covering all sub-nodes in the dependency graph by performing a unique path analysis. The one or more processors 102A-N are configured to restore the one or more data blocks replicated at the preceding snapshot time point from backup storage and follow each of the independent paths to propagate one or more changes made up to the selected time point.
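The following Python sketch (not part of the disclosure; all names are illustrative) implements the graph-building rule just described, assuming each replicated entry records its block address, its time point (dataset id), and an optional reference given as an (address, time point) pair.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    address: str                                  # data block address, e.g. "A"
    time_point: int                               # dataset id, e.g. 69
    ref: Optional[tuple[str, int]] = None         # (reference address, reference time point)

def build_dependency_graph(entries_by_time: dict[int, list[Entry]],
                           selected: int, snapshot: int):
    """Process the intermediate time points from the selected time point back to
    (but excluding) the preceding snapshot time point, and return:
      nodes: address -> set of sub-node time points
      edges: (reference address, reference time) -> dependent (address, time) sub-nodes
    """
    nodes: dict[str, set[int]] = defaultdict(set)
    edges: dict[tuple[str, int], list[tuple[str, int]]] = defaultdict(list)

    for t in range(selected, snapshot, -1):                 # reverse time order
        for e in entries_by_time.get(t, []):
            subs = nodes[e.address]                         # creates the node if missing
            refs = []
            if t in subs:
                refs.append(e.ref)                          # sub-node exists: only check references
            elif not subs or t > max(subs):
                subs.add(t)                                 # newer than any existing sub-node
                refs.append(e.ref)
            for ref in refs:
                if ref is None:                             # raw block, no reference
                    continue
                ref_addr, ref_t = ref
                nodes[ref_addr].add(ref_t)                  # node and sub-node for the reference
                edges[(ref_addr, ref_t)].append((e.address, t))   # path between the sub-nodes
    return nodes, edges
```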

The data protection module 100 enables the generation of the dependency graph and efficiently restores the data blocks replicated at the preceding snapshot time point in the data storage from the backup storage. The data protection module 100 enables restoration of the data blocks and improves the performance of the traffic reduction. For regular continuous data protection (CDP) (i.e. without the traffic reduction), the data protection module 100 maximizes the parallelism in the restoration process by restoring the data blocks in parallel and also enables pre-fetching of the data blocks. The data protection module 100 minimizes unneeded read and write inputs/outputs (IOs) to a target disk, and it is fast to execute. The data protection module 100 controls the performance and resource usage while restoring the data blocks in parallel. The data protection module 100 enables fine-grained recovery from data errors, thereby eliminating the need for restarting the entire restoration process. The dependency graph generated by the data protection module 100 is compact and has a tree structure that replaces metadata. The data protection module 100 enables the generation of discrete restoration trees which improves performance through parallelized execution and error recovery.

The one or more data blocks may be referred to as extents of the data storage. Each data block or extent may be associated with an address, e.g. an extent may have the form <start_address, length> where the length may be fixed at, for example, 4kB or 1MB.

The continuous data protection (CDP) may include a snapshot of a state of the data storage at the beginning of the session and a set of datasets (e.g. the selected time point, the intermediate time point, etc.) each representing a few seconds of input/output (IO) to the data storage being replicated. Each dataset may include contents of the changed extents/data blocks during that period. Optionally, a new intermediate time point is made after a predetermined time or a predetermined volume of changed data. Optionally, the predetermined time is 5 seconds and the predetermined volume of changed data is 10 megabytes. Optionally, the extent/data block in the dataset may be stored as raw, compressed, or transformed by the traffic reduction. When the data block is sent using the traffic reduction method, the extent/data block is compared to a cache of the unique contents of the previous datasets. If the data block that is sent is similar to a cached extent/data block, the difference is calculated, and the reference to the source data block and the difference data are sent instead of the raw or compressed data block.
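As a sketch only of how a changed block might be recorded at an intermediate time point (the dataclass fields, the `similar` and `diff` callbacks, and the zlib-based compression are assumptions of this sketch, not the disclosure's format):

```python
import zlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class DatasetEntry:
    address: str                                  # extent/data block address
    time_point: int                               # dataset id of this entry
    payload: bytes = b""                          # compressed full block, or compressed delta
    ref: Optional[tuple[str, int]] = None         # (reference address, reference time point)
    ref_checksum: Optional[int] = None            # checksum stored with the reference address

def replicate_changed_block(address, time_point, block, reference_cache, similar, diff):
    """Represent a changed block by a reference plus a delta when a previously
    replicated block is similar; otherwise store it fully compressed.
    `reference_cache` maps (address, time point) -> contents of previously
    replicated blocks; `similar` and `diff` stand in for whatever similarity
    test and delta encoding an implementation chooses."""
    for (ref_addr, ref_t), ref_block in reference_cache.items():
        if similar(block, ref_block):
            return DatasetEntry(address, time_point,
                                payload=zlib.compress(diff(block, ref_block)),
                                ref=(ref_addr, ref_t),
                                ref_checksum=zlib.crc32(ref_block))
    return DatasetEntry(address, time_point, payload=zlib.compress(block))
```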

Each of the one or more data blocks may be a fully compressed data block or the delta value (i.e. a differential data block). The data protection module 100, at each of the series of snapshot time points, builds a snapshot for the data storage, i.e. a full disk image of the data storage. The snapshot may contain each data block in the data storage at the snapshot time point. Each data block in the snapshot may be a fully compressed data block.

The differential data blocks may include the address of the reference block, the time point of the reference block, the delta value, or checksum. The reference block addresses and delta values may be compressed. The reference block may be a data block that has been previously replicated from the data storage in the backup storage.

The data protection module 100 may store each reference block address with a checksum value for the reference block. Optionally, the data protection module 100 is configured to restore each reference block by generating a checksum value for the reference block and validating the generated checksum value against the stored checksum value. Optionally, the data protection module 100 adds the checksum value for the reference block to ensure that the differential transform is being applied to the correct reference block contents and to detect errors early and fail out with the correct error. If the validation of the reference block fails, the data protection module 100 may restart the independent path corresponding to the restored block. Optionally, the data protection module 100 restores the data blocks from the backup storage by sending the independent paths to an external device for execution.

If there is no match in the checksum value of the reference block, the recovery plan fails to restore. If there is a match in the checksum value for the reference block, the data protection module 100 may apply the difference to the reference block and update the block contents in memory.
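Building on the DatasetEntry sketch above, the checksum handling might look like the following simplified Python sketch; `fetch_root`, `write_block` and `decode` are assumed callbacks, and a real independent path may also write intermediate blocks that are final for their address.

```python
import zlib

class ChecksumMismatch(Exception):
    """Raised when a reference block's contents do not match the stored checksum."""

def apply_delta(entry, reference_block, decode):
    """Validate the reference block against the checksum stored with the entry,
    then apply the delta transform (`decode` is the assumed delta decoder)."""
    if zlib.crc32(reference_block) != entry.ref_checksum:
        raise ChecksumMismatch(entry.address)
    return decode(reference_block, zlib.decompress(entry.payload))

def restore_path(path, fetch_root, write_block, decode, max_retries=3):
    """Follow one independent path from its raw or snapshot root to its leaf.
    On a checksum failure only this path is restarted, not the whole session."""
    for _ in range(max_retries):
        try:
            block = fetch_root(path[0])                 # fully materialised root block
            for entry in path[1:]:                      # every later entry is a delta
                block = apply_delta(entry, block, decode)
            write_block(path[-1].address, block)        # only the final content hits the disk
            return
        except ChecksumMismatch:
            continue                                    # restart this independent path
    raise RuntimeError("independent path could not be restored")
```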

Optionally, the data protection module 100 restores the data blocks by identifying one or more snapshot data blocks corresponding to the preceding snapshot time point, and caching the snapshot data blocks. The data protection module 100 may restore the one or more data blocks, while processing a data block, by pre-fetching a following data block on the same independent path. Optionally, the data protection module 100 restores the one or more data blocks by following one or more independent paths in parallel. Optionally, the data protection module 100 determines a maximum number of data blocks that can be restored simultaneously and sets a corresponding maximum number of independent paths that can be followed in parallel.

FIG. 2 is a block diagram of a data storage system 200 in accordance with an implementation of the disclosure. The data storage system 200 includes one or more data storages 202A-N and a data protection module 204. Optionally, the one or more data storages 202A-N are communicatively connected to the data protection module 204. The data protection module 204 is configured, at each of a series of snapshot time points, to replicate one or more data blocks from the data storage in a backup storage. The data protection module 204 is configured, at each of a series of intermediate time points, to (i) determine one or more changed data blocks that have been updated in the data storage, and (ii) replicate the changed data blocks in the backup storage. If one or more of the changed data blocks are similar to a reference block that has previously been replicated, the similar data blocks are represented by an address of the reference block, a time point of the reference block and a delta value. The data protection module 204 is configured, in response to a restore request for a selected time point, to generate a dependency graph by processing each intermediate time point between the selected time point and a preceding snapshot time point, in reverse time order starting with the selected time point, by: (a) for each data block replicated at the processed time point: (i) adding a node corresponding to the address of the data block, if it does not already exist in the graph, (ii) checking whether the node corresponding to the address of the data block has a sub-node corresponding to the processed time point, and (iii) in the case where a sub-node corresponding to the processed time point does not already exist for the node in the graph, and if the processed time point of the data block is more recent than a time point associated with any existing sub-node of the node: adding a sub-node corresponding to the processed time point to the node, and identifying any reference block associated with the data block; or in the case where a sub-node corresponding to the processed time point already exists for the node in the graph: identifying any reference block associated with the data block; and (b) for each identified reference block: (i) adding a node corresponding to the address of the reference block, if it does not already exist in the graph, (ii) adding a sub-node corresponding to the time point of the reference block to the node, if it does not already exist in the graph, and (iii) mapping the association with the reference block as a path between the respective sub-nodes. The data protection module 204 is configured to determine one or more independent paths covering all sub-nodes in the dependency graph by performing a unique path analysis. The data protection module 204 is configured to restore the one or more data blocks replicated at the preceding snapshot time point from backup storage and follow each of the independent paths to propagate one or more changes made up to the selected time point. Optionally, a new intermediate time point is made after a predetermined time or a predetermined volume of changed data. Optionally, the predetermined time is 5 seconds and the predetermined volume of changed data is 10 megabytes.

The backup storage may be a designated area of the data storage adjacent to a disk or area that is being backed-up. Alternatively, the backup storage may be located in a second data storage e.g. a cloud storage or backup server. In some implementations, the snapshots may be stored in a different location to the changed data blocks from each intermediate time point. For example, the snapshots may be stored in a designated area of the data storage and the changed data blocks may be stored in a second data storage.

The data storage system 200 is a term used to describe the data storage, or a group of data storages 202A-N, that a network uses to store copies of one or more data items across high-speed connections. The one or more data storages 202A-N are configured to back up critical data items/files and other data to a central location. The one or more data storages 202A-N enable users to access these data items/files. The data storages 202A-N are data storage devices/systems that are connected to a network that allows storage and retrieval of data from a central location for authorized network users.

FIG. 3 is an exemplary map of dependencies for extents required in restoring to a point in time in accordance with an implementation of the disclosure. FIG. 3 depicts a restoration of continuous data protection (CDP) for a data storage to a point in time (e.g. a selected time point). Optionally, a data protection module restores from a snapshot time point to a CDP restore point (i.e. an intermediate time point, here dataset (DS) 69). The top row (i.e. snap) may represent the data blocks from the snapshot time point that are needed. Optionally, each row (e.g. 50-69) represents a dataset (DS)/intermediate time point and its relevant raw and traffic reduction data blocks and their dependencies. Optionally, each row (50-69) represents the relevant content of each dataset or snapshot. Optionally, each column (A-J) represents a data block for each intermediate time point (i.e. DS50-DS69) on a backup storage or a target disk. The typical implementation depicts a raw data block 302 and a dependent data block 304. The raw data block 302 may represent the data block that has no dependencies (i.e. the data blocks that are not compressed using traffic reduction). The dependent data block 304 may represent the data block that is dependent on a reference block (i.e. the data blocks that are compressed using the traffic reduction). An arrow 306 may represent the dependent extents/reference blocks of the dependent data block 304.

In the typical implementation, the datasets 69 to 50 are processed. Optionally, the processing of the datasets 69 to 50 includes (i) for each data block (A-J) that is not already in the recovery plan, adding that data block to the recovery plan, (ii) if that data block is a traffic reduction data block, adding the reference block to a fetch list under the data block's dataset, and (iii) as each dataset is processed, adding any data block for that dataset to the recovery plan and removing it from the list.

In the typical implementation, the recovery plan may not resolve dependency chains, which adds complexity to generating the recovery plan. Optionally, the typical implementation depicts two types of chains: a long dependency chain and a fan-out chain. The long dependency chain of the data blocks is as follows: DS50:G - DS51:E - DS52:C - DS57:F - DS60:H - DS65:E. Optionally, the fan-out chain of the data blocks is as follows: DS64:C - DS69:A and DS64:C - DS66:F - DS67:I. A recovery plan that caches values must determine when to remove unneeded reference data blocks. The typical implementation depicts an inefficient recovery plan that includes the following data blocks:

1. DS50: D, G
2. DS51: E
3. DS52: C
4. DS54: J
5. DS57: F
6. DS60: H
7. DS64: C
8. DS65: E
9. DS66: A, F
10. DS67: B, I
11. DS68: G
12. DS69: A, C

The recovery plan fetches the data blocks from the intermediate time points (DS50-DS69) and includes the metadata describing data block type, dependencies, etc. The recovery plan may store a minimum number of data blocks required for restoring. The typical implementation also shows the execution of the recovery plan. The recovery plan is executed as follows:

1. For DS50: PRE: fetch Bsnap and EXC: write Bsnap»D, G
2. For DS51: PRE: fetch G50 and EXC: write G50»E
3. For DS52: PRE: fetch E51 and EXC: write E51»C
4. For DS54: PRE: fetch Isnap and EXC: write Isnap»J
5. For DS57: PRE: fetch C52 and EXC: write C52»F
6. For DS60: PRE: fetch F57 and EXC: write F57»H
7. For DS64: PRE: -- and EXC: write C
8. For DS65: PRE: fetch H60 and EXC: write H60»E
9. For DS66: PRE: fetch C64 and EXC: write A, C64»F
10. For DS67: PRE: fetch A66, F66 and EXC: write A66»B, F66»I
11. For DS68: PRE: -- and EXC: write G
12. For DS69: PRE: fetch C64 and EXC: write C64»A, C

FIG. 4 illustrates an exemplary dependency graph that is generated for a data storage using a data protection module in accordance with an implementation of the disclosure. FIG. 4 depicts the dependency graph of the map shown in FIG. 3. In this example, to restore a selected time point (e.g. a selected time point/dataset 69), a dependency graph is generated by processing each intermediate time point (i.e. the datasets 50-68) between the selected time point (dataset 69) and a preceding snapshot time point (i.e. from dataset 69 to 50), in reverse time order starting with the selected time point. Optionally, the data protection module processes the datasets 69 to 50. Optionally, for each extent/data block (A-J) that is not already in the dependency graph (i.e. not already in the recovery plan), the data protection module (i) adds a node (e.g. A-J) corresponding to the address of the data block to the dependency graph, (ii) checks whether the node (A-J) corresponding to the address of the data block has a sub-node (i.e. a child node) corresponding to the processed time point, and (iii) adds a sub-node corresponding to the processed time point to the node, and identifies any reference block associated with the node. Optionally, the node is the data block id/address and the node value is the dataset id. Optionally, if the data block is a traffic reduction data block, the data protection module adds the reference block to the dependency graph, where the node is the data block id/address and the node value is the dataset id, and the data protection module adds a directed connection from this reference block to the data block node.
Optionally, the data protection module processes the dependency graph to determine independent paths/trees and to determine the oldest node with no dependencies. When restoring, the data blocks may collect any reference extents/blocks from the snapshot or generate a secondary snapshot as a reference block, and each tree is processed independently; if any errors exist, the independent path is then restarted.

For example, for each dataset 69 to 50, the data protection module processes each node (A-J) and checks if the node corresponding to the address of the data block has a sub-node for the datasets (50-69). If a sub-node corresponding to the processed time point does not already exist for the node in the dependency graph, the data protection module adds a sub-node corresponding to the processed time point to the node and identifies any reference block associated with the data block. If a sub-node corresponding to the processed time point already exists for the node in the dependency graph, the data protection module identifies any reference block associated with the data block. If the node is a traffic reduction data block, then the data protection module adds a node and sub-node for the reference block (if it does not exist in the dependency graph) and adds a directed link from this sub-node to the sub-node associated with the traffic reduction data block.

Optionally, when processing a reference block, the reference dataset id is a real dataset id, not a snapshot (i.e. ‘snap’). Optionally, when a reference to a dataset id is < 50, it may be the snapshot (i.e. ‘snap’).

For dataset 69:A, the nodes are added as follows:

• A does not exist, add node A and sub-node A:69 (A new node)

• Reference to C:64, add C:64: C does not exist, create C and sub-node 64 (creating a reference C:64 to a node that does not already exist)

• Add link from A:69 to C:64

For dataset 69:C, the nodes are added as follows:

• C does exist

• If 69 > max(sub-nodes), i.e. (69 > 64), then add sub-node 69 (adding a sub-node to an existing node, but the existing sub-node is from a reference and a lower dataset, so add it)

• There are no references

For dataset 68:A, the nodes are added as follows:

• A does exist

• If 68 > max(sub-nodes), i.e. (68 > 69) == FALSE, then skip (a node and sub-node from a previous dataset, so skip), i.e. the address is already in the recovery plan.

For dataset 68:G, the nodes are added as follows:

• G does not exist, create node G and sub-node 68

For dataset 67:B, the nodes are added as follows:

• B does not exist, create B and sub-node 67

• Reference to A:66, create A:66: node A exists, add sub-node 66

• Add link from B:67 to A:66

For dataset 67:I, the nodes are added as follows:

• I does not exist, create I and sub-node 67

• Reference to sub-node F:66, create F:66

• Add link from I:67 to F:66

For dataset 66:A, the nodes are added as follows:

• A exists and sub-node 66 already exists, no need to add

• A:66 exists in the recovery plan, need to check for references

• There is a reference to C:64, add C:64 (which already exists)

• Add link from A:66 to C:64

For dataset 66:C, the nodes are added as follows:

• Node C exists

• If 66 > max(sub-nodes), i.e. (66 > 69) == FALSE, then skip

For dataset 66:F, the nodes are added as follows:

• Node F and sub-node 66 already exist (created as the reference from I:67)

• Add reference C:64 (which already exists)

• Add link F:66 to C:64.
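Reusing the Entry and build_dependency_graph sketch given after FIG. 1, the following hypothetical snippet replays datasets 66-69 of the worked example above and checks a few of the resulting nodes and links (datasets 50-65 are omitted for brevity):

```python
# Hypothetical replication log covering only datasets 66-69 of the example.
entries_by_time = {
    69: [Entry("A", 69, ref=("C", 64)), Entry("C", 69)],
    68: [Entry("A", 68), Entry("G", 68)],
    67: [Entry("B", 67, ref=("A", 66)), Entry("I", 67, ref=("F", 66))],
    66: [Entry("A", 66, ref=("C", 64)), Entry("C", 66), Entry("F", 66, ref=("C", 64))],
}
nodes, edges = build_dependency_graph(entries_by_time, selected=69, snapshot=49)

assert nodes["A"] == {66, 69}          # 68:A was skipped because 68 < 69
assert nodes["C"] == {64, 69}          # C:64 added as a reference, C:69 as a newer write
assert ("A", 69) in edges[("C", 64)]   # link from A:69 to C:64
assert ("B", 67) in edges[("A", 66)]   # link from B:67 to A:66
```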

FIG. 5 illustrates an exemplary process of determining one or more independent paths 502 using a data protection module in accordance with an implementation of the disclosure. The data protection module determines one or more independent paths 502 covering all sub-nodes in a dependency graph by performing a unique path analysis. The unique path analysis may be a shortest/longest path analysis or any other graph algorithm known to a person skilled in the art that is suitable for finding optimal paths in a graph.

Optionally, the data protection module performs the unique path analysis to determine a shortest independent path and a longest independent path. Either the longest or the shortest path analysis may be used, as the dependency graph is unweighted. Optionally, the data protection module detects and calculates multiple independent paths 502 until all the nodes are covered. Optionally, when building the independent paths 502, the data protection module follows directed paths to sub-nodes that contain dataset ids. As each node is added to the dependency graph, the data protection module records the data block address so that the full information about the data block can be added.

Optionally, the data protection module processes the dependency graph to determine independent paths/trees 502 and to determine the oldest node (e.g. node 69:C and node 68:G) with no dependencies. The nodes 69:C and 68:G have no dependencies and no independent paths/trees. The independent paths/trees 502, as shown in FIG. 5, are as follows: (1) 66:A - 67:B, (2) Snap:B - 50:D, (3) Snap:I - 54:J, (4) 64:C - 69:A, (5) 64:C - 66:F - 67:I and (6) 50:G - 51:E - 52:C - 57:F - 60:H - 65:E. In this example, the datasets are 50-69 and the data blocks are A-J.
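One simple way to enumerate covering paths is sketched below for the graph produced by the build_dependency_graph sketch above; because each delta block has exactly one reference, the graph is a forest, so root-to-leaf enumeration covers every sub-node. Note that the disclosure's unique path analysis may split shared prefixes differently (in FIG. 5, for example, 66:A - 67:B is listed as its own path even though A:66 itself references C:64).

```python
def independent_paths(nodes, edges):
    """Enumerate root-to-leaf paths of the (unweighted) dependency forest so
    that every sub-node is covered by at least one path."""
    all_subs = {(addr, t) for addr, times in nodes.items() for t in times}
    dependents = {d for deps in edges.values() for d in deps}
    roots = all_subs - dependents                    # oldest sub-nodes with no dependency

    paths = []
    def walk(sub, prefix):
        children = edges.get(sub, [])
        if not children:
            paths.append(prefix + [sub])             # a leaf ends one execution path
        for child in children:
            walk(child, prefix + [sub])              # fan-out: one path per branch
    for root in sorted(roots):
        walk(root, [])
    return paths
```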

FIG. 6 illustrates an exemplary process of extracting one or more independent paths covering all sub-nodes from a dependency graph into one or more execution trees in accordance with an implementation of the disclosure. In this example, the one or more execution trees are as follows: (1) Snap:B - 50:D - WA:D, (2) Snap:I - 54:J - WA:J, (3) 50:G - 51:E - 52:C - 57:F - 60:H - WA:H, (4) 65:E - WA:E, (5) 69:C - WA:C, (6) 68:G - WA:G, (7) 66:A - 67:B - WA:B, (8) 64:C - 69:A - WA:A, (9) 66:F - WA:F and (10) 67:I - WA:I. In this example, the datasets are 50-69 and the data blocks are A-J. WA represents the data block written to a target disk (e.g. WA:D represents the data block D written to the target disk). Optionally, when processing a reference block, the reference dataset id is a real dataset id, not a snapshot (i.e. 'snap'). Optionally, when a reference to a dataset id is < 50, it may be the snapshot (i.e. 'snap').

Optionally, each execution tree may be checked to determine if there are any dependencies on data blocks in the snapshot; these may be read first and cached. When restoring, the data blocks may collect any reference extents/blocks from the snapshot (i.e. snap) or generate a secondary snapshot as a reference block, and each execution tree is processed independently; if any errors exist, the independent path is then restarted. The secondary snapshot may be accessed at any time during the execution. Further, since continuous data protection (CDP) only references recent CDP objects, if the previous datasets still exist, then all the reference blocks may be resolved from CDP objects and there is no need to read the data blocks from the snapshot. All the execution trees may start asynchronously as there are no common dependencies. Each execution tree may be executed independently, in parallel to other execution trees. During pre-fetching, when the CDP datasets are stored on secondary or object storage with high latency, the next extents/data blocks in the independent path may be fetched while the current data blocks are being processed, and the dependency graph, as shown in FIG. 4, may determine if any other data blocks need to be fetched from the same dataset and collect all data blocks in a single input/output (IO) operation.

Optionally, all data blocks in a single non-branch path are fetched at once (i.e. fetched in parallel). For example, 50:G, 51:E, 52:C, 57:F, 60:H are fetched at once to minimize the latency of fetching data from datasets. Even if a whole path is not fetched, the next nodes may be pre-fetched to ensure that there is no wait time.

As an execution tree is directed, there is no need to fetch the metadata from the dataset describing the reference block, only the differential value needs to be obtained, thereby reducing the data to be restored from the dataset. The location in the execution tree may be used to determine the type of the data block (i.e. first may be a raw data block and subsequent may be a traffic reduction data block).
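A minimal sketch of this pre-fetching is given below, assuming callbacks `fetch_payload` (read an entry's data from the backup storage), `apply_entry` (decompress a raw root or apply a delta) and `write_block` (write the final content to the target disk); batching several blocks of one dataset into a single IO, as described above, is left out for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

def restore_path_with_prefetch(path, fetch_payload, apply_entry, write_block):
    """Walk one independent path from its root to its leaf, pre-fetching the
    next entry's payload while the current one is being applied, which hides
    the latency of slow secondary or object storage."""
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        pending = prefetcher.submit(fetch_payload, path[0])
        block = None
        for i, entry in enumerate(path):
            payload = pending.result()                    # fetched in the background
            if i + 1 < len(path):                         # start the next fetch early
                pending = prefetcher.submit(fetch_payload, path[i + 1])
            block = apply_entry(entry, block, payload)    # raw at the root, delta afterwards
        write_block(path[-1], block)                      # only the final content is written
```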

FIG. 7 illustrates an exemplary view of execution of one or more execution trees in combination with resource management in accordance with an implementation of the disclosure. In this example, there are 7 execution trees, which are as follows: (1) Snap:B - 50:D - WA:D, (2) Snap:I - 54:J - WA:J, (3) 50:G - 51:E - 52:C - 57:F - 60:H - WA:H, (4) 68:G - WA:G, (5) 64:C - 69:A - WA:A, (6) 66:F - WA:F and (7) 67:I - WA:I. Optionally, the 7 execution trees are run in parallel. During the execution of the one or more execution trees, two data blocks are read and cached from the snapshot (B and I), which may be released once the corresponding execution tree is completed.

Optionally, the data protection module has control over the resources by determining a number of concurrent execution trees that are executed and determining a maximum number of data blocks being processed at any time point. For example, FIG. 7 shows that the number of data blocks is limited to 3 concurrent data block buffers. Further, the scheduling of a recovery plan may be optimized for maximum performance or it may be scheduled to limit the number of resources that are required. Traffic reduction may only reference the last n datasets. If the previous dataset exists, then the reference extents are collected directly from that dataset.

FIGS. 8A-8C are flow diagrams that illustrate a method for data management in a data storage in accordance with an implementation of the disclosure. At a step 802, at each of a series of snapshot time points, one or more data blocks are replicated from the data storage in a backup storage. At a step 804, at each of a series of intermediate time points, one or more changed data blocks that have been updated in the data storage are determined, and the changed data blocks are replicated in the backup storage. If one or more of the changed data blocks are similar to a reference block that has previously been replicated, the similar data blocks are represented by an address of the reference block, a time point of the reference block, and a delta value. At a step 806, in response to a restore request for a selected time point, a dependency graph is generated by processing each intermediate time point between the selected time point and a preceding snapshot time point, in reverse time order starting with the selected time point, by: (a) for each data block replicated at the processed time point: (i) adding a node corresponding to the address of the data block, if it does not already exist in the graph, (ii) checking whether the node corresponding to the address of the data block has a sub-node corresponding to the processed time point, and (iii) in the case where a sub-node corresponding to the processed time point does not already exist for the node in the graph, and if the processed time point of the data block is more recent than a time point associated with any existing sub-node of the node: adding a sub-node corresponding to the processed time point to the node, and identifying any reference block associated with the data block; or in the case where a sub-node corresponding to the processed time point already exists for the node in the graph: identifying any reference block associated with the data block; and (b) for each identified reference block: (i) adding a node corresponding to the address of the reference block, if it does not already exist in the graph, (ii) adding a sub-node corresponding to the time point of the reference block to the node, if it does not already exist in the graph, and (iii) mapping the association with the reference block as a path between the respective sub-nodes. At a step 808, a unique path analysis is performed to determine one or more independent paths covering all sub-nodes in the dependency graph. At a step 810, the one or more data blocks replicated at the preceding snapshot time point are restored from backup storage and each of the independent paths is followed to propagate one or more changes made up to the selected time point.
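Referring back to the resource management described for FIG. 7, the limits on concurrent execution trees and block buffers could be enforced along the lines of the following sketch; `restore_one_tree` is an assumed callback and the default limits are illustrative only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def restore_all(execution_trees, restore_one_tree, max_trees=4, max_block_buffers=3):
    """Run the execution trees in parallel while capping both the number of
    trees in flight and the number of block buffers held at any one time
    (FIG. 7 limits the example to 3 concurrent block buffers)."""
    buffers = threading.BoundedSemaphore(max_block_buffers)
    with ThreadPoolExecutor(max_workers=max_trees) as pool:
        futures = [pool.submit(restore_one_tree, tree, buffers)
                   for tree in execution_trees]
        for future in futures:
            future.result()                  # surface any error from an individual tree
```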

The method enables the generation of the dependency graph and efficiently restores the data blocks replicated at the preceding snapshot time point in the data storage from the backup storage. The method enables restoration of the data blocks and improves the performance of the traffic reduction. For regular continuous data protection (CDP) (i.e. without the traffic reduction), the method maximizes the parallelism in the restoration process by restoring the data blocks in parallel and also enables pre-fetching of the data blocks. The method minimizes unneeded read and write inputs/outputs (IOs) to a target disk, and it is fast to execute. The method controls the performance and resource usage by a data protection module while restoring the data blocks in parallel. The method enables fine-grained recovery from data errors, thereby eliminating the need for restarting the entire restoration process. The dependency graph generated by the method is compact and has a tree structure that replaces metadata. The method enables the generation of discrete restoration trees which improves performance through parallelized execution and error recovery.

The reference block addresses and delta values may be compressed. Optionally, a new intermediate time point is created after a predetermined time or a predetermined volume of changed data. Optionally, the predetermined time is 5 seconds and the predetermined volume of changed data is 10 megabytes.
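As a non-limiting illustration of the optional thresholds above, the following Python sketch cuts a new intermediate time point when either 5 seconds have elapsed or 10 megabytes of changed data have accumulated; the class and method names are assumptions made for the example only.

import time

MAX_INTERVAL_S = 5                        # predetermined time
MAX_CHANGED_BYTES = 10 * 1024 * 1024      # predetermined volume of changed data

class IntermediatePointTrigger:
    """Decides when a new intermediate time point should be created."""
    def __init__(self):
        self.last_cut = time.monotonic()
        self.changed_bytes = 0

    def record_write(self, num_bytes):
        """Return True when a new intermediate time point is due."""
        self.changed_bytes += num_bytes
        elapsed = time.monotonic() - self.last_cut
        if elapsed >= MAX_INTERVAL_S or self.changed_bytes >= MAX_CHANGED_BYTES:
            self.last_cut = time.monotonic()
            self.changed_bytes = 0
            return True
        return False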

Each reference block address may be stored with a checksum value for the reference block. Optionally, restoring each reference block includes generating a checksum value for the reference block and validating the generated checksum value against the stored checksum value. If the validation of the reference block fails, the method further includes restarting the independent path corresponding to the restored block.
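A minimal sketch of this checksum check is given below; for simplicity it validates every block on an independent path and restarts only that path on a mismatch, up to a bounded number of retries. The callables read_block and apply_block, and the use of SHA-256, are assumptions for the example only.

import hashlib

def restore_path(path, read_block, apply_block, stored_checksums, max_retries=3):
    """Follow one independent path; restart the path on a checksum mismatch."""
    for _attempt in range(max_retries):
        try:
            for sub_node in path:                      # sub_node = (address, time point)
                data = read_block(sub_node)
                digest = hashlib.sha256(data).hexdigest()
                expected = stored_checksums.get(sub_node)
                if expected is not None and digest != expected:
                    raise ValueError("checksum mismatch at %r" % (sub_node,))
                apply_block(sub_node, data)
            return True                                # path restored successfully
        except ValueError:
            continue                                   # restart only this path
    return False                                       # give up after max_retries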

Optionally, restoring the data blocks from the backup storage includes sending the independent paths to an external device for execution. Optionally, restoring the data blocks includes identifying one or more snapshot data blocks corresponding to the preceding snapshot time point, and caching the snapshot data blocks. Restoring the plurality of data blocks may include, while processing a data block, pre-fetching a following data block on the same independent path. Optionally, restoring the plurality of data blocks includes following a plurality of the independent paths in parallel. Optionally, the method further includes determining a maximum number of data blocks that can be restored simultaneously and setting a corresponding maximum number of independent paths that can be followed in parallel.
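The sketch below illustrates, under the same assumptions as above, how a fixed number of independent paths could be followed in parallel while each path worker pre-fetches the next block on its own path before applying the current one; ThreadPoolExecutor is used only as a convenient stand-in for the data protection module's own scheduler.

from concurrent.futures import ThreadPoolExecutor

def follow_path_with_prefetch(path, read_block, apply_block):
    """Apply blocks along one path, pre-fetching the next block on the same path."""
    if not path:
        return
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        pending = prefetcher.submit(read_block, path[0])
        for i, sub_node in enumerate(path):
            data = pending.result()
            if i + 1 < len(path):                      # pre-fetch the following block
                pending = prefetcher.submit(read_block, path[i + 1])
            apply_block(sub_node, data)

def restore_all(paths, read_block, apply_block, max_parallel_paths=3):
    """Follow up to max_parallel_paths independent paths concurrently."""
    with ThreadPoolExecutor(max_workers=max_parallel_paths) as pool:
        futures = [pool.submit(follow_path_with_prefetch, p, read_block, apply_block)
                   for p in paths]
        for f in futures:
            f.result()                                 # propagate any errors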

The method provides a solution to execute the restoration of a dataset and the traffic reduction in parallel, and also enables prefetching and error recovery. The method reduces IOs to a disk to only the final content. The method maximizes throughput and minimizes RAM usage for buffers. The method enables the execution to be distributed across multiple data protection modules/restore engines, e.g. in the cloud, across multiple PODs/lambda functions.

In an implementation, a computer-readable medium is provided that is configured to store instructions which, when executed by a processor, cause the processor to execute the above method.

FIG. 9 is an illustration of an exemplary data protection module, a data storage system, or a computer system in which the various architectures and functionalities of the various previous implementations may be implemented. As shown, the computer system 900 includes at least one processor 904 that is connected to a bus 902, wherein the bus 902 may be implemented using any suitable protocol, such as PCI (Peripheral Component Interconnect), PCI-Express, AGP (Accelerated Graphics Port), HyperTransport, or any other bus or point-to-point communication protocol(s). The computer system 900 also includes a memory 906.

Control logic (software) and data are stored in the memory 906, which may take the form of random-access memory (RAM). In the disclosure, a single semiconductor platform may refer to a sole unitary semiconductor-based integrated circuit or chip. It should be noted that the term single semiconductor platform may also refer to multi-chip modules with increased connectivity which simulate on-chip operation, and make substantial improvements over utilizing a conventional central processing unit (CPU) and bus implementation. Of course, the various modules may also be situated separately or in various combinations of semiconductor platforms per the desires of the user.

The computer system 900 may also include a secondary storage 910. The secondary storage 910 includes, for example, a hard disk drive and a removable storage drive, representing a floppy disk drive, a magnetic tape drive, a compact disk drive, a digital versatile disk (DVD) drive, a recording device, or a universal serial bus (USB) flash memory. The removable storage drive reads from and/or writes to a removable storage unit in a well-known manner.

Computer programs, or computer control logic algorithms, may be stored in at least one of the memory 906 and the secondary storage 910. Such computer programs, when executed, enable the computer system 900 to perform various functions as described in the foregoing. The memory 906, the secondary storage 910, and any other storage are possible examples of computer-readable media. In an implementation, the architectures and functionalities depicted in the various previous figures may be implemented in the context of the processor 904, a graphics processor coupled to a communication interface 912, an integrated circuit (not shown) that is capable of at least a portion of the capabilities of both the processor 904 and a graphics processor, a chipset (namely, a group of integrated circuits designed to work and be sold as a unit for performing related functions), and so forth.

Furthermore, the architectures and functionalities depicted in the various previously described figures may be implemented in the context of a general computer system, a circuit board system, a game console system dedicated for entertainment purposes, or an application-specific system. For example, the computer system 900 may take the form of a desktop computer, a laptop computer, a server, a workstation, a game console, or an embedded system.

Furthermore, the computer system 900 may take the form of various other devices including, but not limited to, a personal digital assistant (PDA) device, a mobile phone device, a smartphone, a television, and so forth. Additionally, although not shown, the computer system 900 may be coupled to a network (for example, a telecommunications network, a local area network (LAN), a wireless network, a wide area network (WAN) such as the Internet, a peer-to-peer network, a cable network, or the like) for communication purposes through an I/O interface 908.

It should be understood that the arrangement of components illustrated in the described figures is exemplary and that other arrangements may be possible. It should also be understood that the various system components (and means) defined by the claims, described below, and illustrated in the various block diagrams represent components in some systems configured according to the subject matter disclosed herein. For example, one or more of these system components (and means) may be realized, in whole or in part, by at least some of the components illustrated in the arrangements of the described figures.

In addition, while at least one of these components is implemented at least partially as an electronic hardware component, and therefore constitutes a machine, the other components may be implemented in software that, when included in an execution environment, constitutes a machine, hardware, or a combination of software and hardware. Although the disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope of the disclosure as defined by the appended claims.