BLOCK STORAGE METHOD AND SYSTEM FOR REDUCING DATA IN DATA DEDUPLICATION

Title:

BLOCK STORAGE METHOD AND SYSTEM FOR REDUCING DATA IN DATA DEDUPLICATION

Document Type and Number:

WIPO Patent Application WO/2022/069041

Kind Code:

Abstract:

A block storage method for storing an incoming I/O write request in a form of data blocks in a block storage, reduces an amount of write amplification in the block storage. The method includes dividing the incoming I/O write request into a first and a second incoming chunk using content based variable chunking method. The method further includes calculating a first hash value for the first incoming chunk and comparing the first hash value to the hash values in a map. If there is a match, the method includes identifying a matching prestored chunk and storing at least one pointer to a prestored first data block in the prestored chunk data, for efficient data deduplication.

Inventors:

SCHNEIDER ZVI (DE)
NATANZON ASSAF (DE)

Application Number:

PCT/EP2020/077440

Publication Date:

April 07, 2022

Filing Date:

October 01, 2020

Export Citation:

Click for automatic bibliography generation Help

Assignee:

HUAWEI TECH CO LTD (CN)
SCHNEIDER ZVI (DE)

International Classes:

G06F3/06

Foreign References:

US9465808B1	2016-10-11
US20130046733A1	2013-02-21

Attorney, Agent or Firm:

KREUZ, Georg (DE)

Download PDF:

View/Download PDF PDF Help

Claims:

CLAIMS

1. A block storage method (100) for storing an incoming I/O write request in a form of data blocks in a block storage (206), said block storage (206) comprising previously stored data for which hash values have been calculated based on a content based variable chunking method and a map between said hash values and pointers to logical addresses of corresponding prestored chunks of the previously stored data, the method (100) comprising the steps of dividing the incoming I/O write request into a first and a second incoming chunk using the content based variable chunking method, calculating a first hash value for the first incoming chunk, comparing the first hash value to the hash values in the map and if there is a match, o identifying a matching prestored chunk based on the first hash value, o store at least one pointer to a prestored first data block in the prestored chunk data having at least partially identical data to a first incoming data block in the first incoming chunk, instead of storing the first incoming data block in the block storage (206), and if there is not a match, o storing the first incoming chunk in the block storage (206), and storing the first hash value in the map together with a pointer to a first incoming chunk’s logical address in the block storage (206).

2. The method (100) according to claim 1, comprising the step of, if the prestored first data block has fully identical data to the first incoming data block, storing one pointer to the prestored first data block instead of storing the first incoming data block.

3. The method (100) according to claim 1 or 2, comprising the step of, if the prestored first data block has partially identical data to the first incoming data block, storing one pointer to the prestored first data block and one pointer to an adjacent block having identical data to a remainder of the first incoming data block instead of storing the first incoming data block.

4. The method (100) according to any one of the preceding claims, comprising the steps of, if the first incoming chunk comprises a first incomplete data block in addition to one

29 or more complete fixed sized incoming data blocks, a remainder of the first incomplete data block being stored in an adjacent incoming chunk to the first incoming chunk, identifying a second prestored data block in a first prestored data chunk, having identical data to a first portion of the first incomplete data block, identifying an adjacent incoming chunk to the first incoming chunk and obtaining a second hash value for the adjacent incoming chunk, if there is a match for the second hash value in the block storage (206), identifying a second prestored data chunk corresponding to the second hash value, identifying a third prestored data block in the second prestored data chunk having identical data to a remainder of the first incomplete data block, and storing a pointer to the second prestored data block and a pointer to the third prestored data block instead of storing the first incomplete data block if there is no match for the second hash value in the block storage (206), storing a pointer to the second prestored data block together with the remainder of the first incomplete data block instead of storing the first incomplete data block. The method (100) according to any one of the preceding claims, wherein the block storage (206) is divided into at least a first section (212A) in which I/O write requests are expected to have a size above a threshold and a second section (212B) in which I/O write requests are expected to have a size below the threshold, comprising the step examining the size of the incoming I/O write request and performing the steps of any one of the preceding claims only if the size is above the threshold, and if the size is below the threshold, performing fixed size deduplication for the first data block. The method (100) according to claim 5, wherein the expected size of I/O write requests is determined based on statistics on the sizes of I/O write requests stored in the block storage (206) in conjunction with their addresses in the block storage (206). The method (100) according to any one of the preceding claims, wherein the content based variable chunking method is a Rabin hash method. A block storage device (202) comprising a disk (204) for storing I/O write requests, the block storage device (202) comprising logical circuitry (208) for performing, for each incoming I/O write request, the method (100) according to any one of the preceding claims.

30 A block storage system (200A) comprising a block storage device (202) according to claim 8 and a control system (214) comprising the logical circuitry (214A). A control system (214) for controlling the writing of data to a block storage device (202), said control system (214) being arranged to perform, for each incoming I/O write request, the method (100) according to any one of the claims 1 - 7. A computer program product for use in a control system (214) for controlling the writing of data to a block storage system (200A), said computer program product comprising computer-readable instructions which, when executed in the control system (214) cause the control system (214) to perform the method (100) according to any one of the claims 1 - 7.

Description:

BLOCK STORAGE METHOD AND SYSTEM FOR REDUCING DATA IN DATA DEDUPLICATION

TECHNICAL FIELD

The present disclosure relates generally to the field of data protection and backup; and more specifically, to a block storage method, a block storage device, and a block storage system for reducing data in data deduplication.

BACKGROUND

Generally, data backup is used to protect and recover data in an event of data loss in a primary storage system (e.g. a host server). For safety reasons, a separate backup system or a storage system is extensively used to store a backup of the data present in the primary storage system. Typically, with time, storage space of the storage system becomes occupied as changes in data or any new data occupy a large storage space in the conventional storage systems. This is undesirable as it causes reduction in performance of the storage systems. Moreover, the cost of data storage, with all the associated costs including cost of storage hardware, continues to be a burden. Typically, data deduplication is a process widely used by the storage system to eliminate duplicate or redundant data stored on the storage system without compromising the fidelity of the original data. In storage systems, generally data is stored in block devices, where a given block device is a computer data storage device that supports reading and optionally, writing data in fixed-size blocks, sectors, or clusters. Conventional block storage systems (or devices) usually use only fixed size deduplication. Such conventional block storage systems usually have an initial inline phase which does some fixed size deduplication using entries which are in the cache and later, offline processing which sometimes apply stronger data reduction techniques like differential compression. The primary issue is that the offline data reduction technique requires significant amount of computational processing power (e.g. extensive use of central processing unit) and also cause read and write amplification, since data has to be processed again after being written, which is not desirable.

Currently, there are many techniques that may be employed for data reduction, for example, fixed size data deduplication. The fixed size deduplication technique divides the data stored into aligned blocks of a fixed size, such as aligned blocks of sizes 4 KB, 8KB or 16 KB. For each aligned block, a strong hash value is generated. If the aligned block that is to be written has the same hash value as an already stored block in the storage system, the aligned block is considered identical. However, the fixed size deduplication technique provides somewhat reliable data duplication for data blocks of small size. For data blocks of large size, the issue is that deduplication ratios are not efficient (i.e. manifests inefficient deduplication). There is another conventional technique for employed for data reduction known as similarity compression or differential compression. The differential compression technique uses an offline process in which the data is processed after being written in the storage system for data deduplication thus, the differential compression technique increases write amplification of the data and requires significant amount of CPU for data deduplication process. Further, while reading the data which is compressed with differential compression, decompression needs to be applied on larger data pieces which also requires more CPU cycles and increases time to complete an operation. In certain scenarios, conventional block storage system uses a two-phase approach for data reduction. In the two-phase approach, the conventional block storage system attempts data deduplication and similarity compression during the inline phase and again later, during the offline phase. The inline deduplication removes some redundancies from data before the data is being written to the conventional block storage system. However, compression of the data during the inline phase uses fixed size deduplication technique. Further, the offline processing requires significant amount of CPU and cause read and write amplification, since data must be processed again after being written.

Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with conventional methods and systems for data deduplication.

SUMMARY

The present disclosure seeks to provide a block storage method, a block storage device, and a block storage system for reducing data in data deduplication. The present disclosure seeks to provide a solution to the existing problem of inefficient data duplication techniques, that is how to further reduce write amplification and computations with higher compression ratios for data deduplication in a storage system. An aim of the present disclosure is to provide a solution that overcomes at least partially the problems encountered in prior art, and provides improved block storage method, device, and system with improved performance and efficiency in data reduction, where the amount of write amplification is significantly reduced.

The object of the present disclosure is achieved by the solutions provided in the enclosed independent claims. Advantageous implementations of the present disclosure are further defined in the dependent claims.

In an aspect, the present disclosure provides a block storage method for storing an incoming I/O write request in the form of data blocks in a block storage. The block storage comprises previously stored data for which hash values have been calculated based on a content based variable chunking method and a map between said hash values and pointers to logical addresses of corresponding prestored chunks of the previously stored data. The method comprises the steps of dividing the incoming I/O write request into a first and a second incoming chunk using the content based variable chunking method. The method further comprises calculating a first hash value for the first incoming chunk and comparing the first hash value to the hash values in the map and if there is a match. The method further comprises identifying a matching prestored chunk based on the first hash value and store at least one pointer to a prestored first data block in the prestored chunk data having at least partially identical data to a first incoming data block in the first incoming chunk, instead of storing the first incoming data block in the block storage. If there is not a match, the method comprises storing the first incoming chunk in the block storage and storing the first hash value in the map together with a pointer to the first incoming chunk’s logical address in the block storage.

The method of the present disclosure enables efficient data deduplication of the incoming I/O write request and achieves improved performance and efficiency in terms of data reduction. The method reduces the amount of write amplification and reduces required computations by the block storage. In a case where a size of the incoming I/O write request is greater than a defined threshold, a variable length deduplication technique is employed on the incoming I/O write request (as in this case). The method uses the variable length deduplication technique for deduplication of the incoming I/O write request (e.g. large I/O write request that is greater than the defined threshold is separated from small I/O write requests less than the defined threshold). The variable length deduplication technique allows better performance, higher deduplication ratios and less write amplification for the I/O write requests of large sizes. The first hash value for the first incoming chunk is compared to the hash values in the map to check if the first incoming chunk is already present in the block storage. Thus, data deduplication is implemented by comparing the just the first hash value and the hash values in the map without any need to first store the first incoming chunk in the block storage.

In an implementation form, if the prestored first data block has fully identical data to the first incoming data block, the method further comprises storing one pointer to the prestored first data block instead of storing the first incoming data block.

In a scenario where the data in the first incoming data block is same as the data in the prestored first data block, only one pointer is required to be assigned to the prestored first block in the block storage. The pointer indicates a physical storage location of the prestored first block on the block storage and thus, prevents further duplication of the first incoming data block in the block storage (i.e. reduces redundancy).

In an implementation form, if the prestored first data block has partially identical data to the first incoming data block, the method comprises storing one pointer to the prestored first data block and one pointer to an adjacent block having identical data to a remainder of the first incoming data block instead of storing the first incoming data block.

In another scenario where the data in the first incoming data block is only partially identical (i.e. some portions of data are same while some different or similar but not same) to the prestored first data block, two pointers can be assigned instead of storing actual first incoming data block, which reduces the amount of data to be stored in the block storage.

In an implementation form, if the first incoming chunk comprises a first incomplete data block in addition to one or more complete fixed sized incoming data blocks, a remainder of the first incomplete data block being stored in an adjacent incoming chunk to the first incoming chunk, the method comprises identifying a second prestored data block in the first prestored data chunk, having identical data to a first portion of the first incomplete data block. The method further comprises identifying an adjacent incoming chunk to the first incoming chunk and obtaining a second hash value for the adjacent incoming chunk. If there is a match for the second hash value in the block storage, the method comprises identifying a third prestored data chunk corresponding to the second hash value, identifying a second prestored data block in the second prestored data chunk having identical data to a remainder of the first incomplete data block, and storing a pointer to the second prestored data block and a pointer to the third prestored data block instead of storing the first incomplete data block. If there is no match for the second hash value in the block storage, the method comprises storing a pointer to the second prestored data block together with the remainder of the first incomplete data block instead of storing the first incomplete data block.

Each incoming chunk (i.e. a variable chunk) has an arbitrary data size. The method enables to provide efficient data reduction even if the first incoming chunk has incomplete data blocks, such as the first incomplete data block. Thus, data deduplication is implemented by comparing hash values in the map without any need to store the entire first incoming chunk and the adj acent incoming chunk in the block storage.

In an implementation form, the block storage is divided into at least a first section in which I/O write requests are expected to have a size above a threshold and a second section in which I/O write requests are expected to have a size below the threshold. The method comprises the step examining the size of the incoming VO write request and performing the steps (above described steps) if the size is above the threshold. If the size is below the threshold, the method comprises performing fixed size deduplication for the first data block.

The method employs both fixed size deduplication and variable sized duplication in such a way that performance and efficiency in terms of data reduction is improved. The fixed size deduplication method is performed on the incoming VO write request when size of the VO write request is within the defined threshold. As the incoming I/O write request below the defined threshold, the amount of write amplification and computations is very small therefore, fixed size deduplication method is implemented. However, the method implements the variable size deduplication when the I/O write request is larger than the defined threshold. The variable size deduplication executed on the large incoming I/O write request reduces the amount of write amplification and computations required to be performed on the incoming I/O write request. The variable size deduplication further increases compression ratio and thus, improves overall performance of the block storage.

In an implementation form, the expected size of I/O write requests is determined based on statistics on the sizes of I/O write requests stored in the system in conjunction with their addresses in the block storage system.

The statistics on the sizes of VO write requests stored in the system in conjunction with their addresses in the block storage system enables to determine appropriate threshold to decide if the incoming I/O write request is a large incoming VO write request or a small incoming I/O write request. Thus, enables to decide if the fixed size deduplication or the variable size deduplication technique is needed to be implemented on the incoming I/O write request.

In an implementation form, the content based variable chunking method is a Rabin hash method.

In another aspect, the present disclosure provides a block storage device. The block storage device comprises a disk for storing I/O write requests. The device comprises a logical circuitry for performing, for each incoming I/O write request, the method.

The block storage device of this aspect achieves all the advantages and effects of the method of the present disclosure.

In another aspect, the present disclosure provides a block storage system. The block storage system comprises a block storage device and a control system comprising a logical circuitry.

The block storage system of this aspect achieves all the advantages and effects of the method of the present disclosure.

In another aspect, the present disclosure provides a control system for controlling the writing of data to a block storage device, said control system being arranged to perform, for each incoming I/O write request, the method.

The control system of this aspect achieves all the advantages and effects of the method of the present disclosure.

In an aspect, a computer program product for use in a control system for controlling the writing of data to a block storage device. The said computer program product comprising computer- readable instructions which, when executed in the control system cause the control system to perform the method.

The computer program product of this aspect achieves all the advantages and effects of the method of the present disclosure.

It has to be noted that all devices, elements, circuitry, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof. It will be appreciated that features of the present disclosure are susceptible to being combined in various combinations without departing from the scope of the present disclosure as defined by the appended claims.

Additional aspects, advantages, features, and objects of the present disclosure would be made apparent from the drawings and the detailed description of the illustrative implementations construed in conjunction with the appended claims that follow.

BRIEF DESCRIPTION OF THE DRAWINGS

The summary above, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the present disclosure, exemplary constructions of the disclosure are shown in the drawings. However, the present disclosure is not limited to specific methods and instrumentalities disclosed herein. Moreover, those in the art will understand that the drawings are not to scale. Wherever possible, like elements have been indicated by identical numbers.

Embodiments of the present disclosure will now be described, by way of example only, with reference to the following diagrams wherein:

FIG. 1 is a flowchart of a block storage method for data reduction in data deduplication, in accordance with an embodiment of the present disclosure;

FIG. 2A is an illustration of a block storage system with a block storage device and a control system, in accordance with an embodiment of the present disclosure;

FIG. 2B is an illustration of an exemplary variable length deduplication, in accordance with an embodiment of the present disclosure; FIGs. 3A, 3B, 3C, 3D, and 3E are exemplary illustrations of various operations for data reduction in data deduplication, in accordance with an embodiment of the present disclosure; and

In the accompanying drawings, an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent. A non-underlined number relates to an item identified by a line linking the nonunderlined number to the item. When a number is non-underlined and accompanied by an associated arrow, the non-underlined number is used to identify a general item at which the arrow is pointing.

DETAILED DESCRIPTION OF EMBODIMENTS

The following detailed description illustrates embodiments of the present disclosure and ways in which they can be implemented. Although some modes of carrying out the present disclosure have been disclosed, those skilled in the art would recognize that other embodiments for carrying out or practicing the present disclosure are also possible.

FIG. l is a flowchart of a block storage method 100 for data reduction in data deduplication, in accordance with an embodiment of the present disclosure. The method 100 is executed at a block storage device described, for example, in Fig. 2A. The method 100 includes steps 102 to 114

The method 100 is used for storing the incoming I/O write request in the form of data blocks in a block storage. The block storage refers to a hardware storage that supports reading and writing data in fixed-size blocks, sectors, or clusters. The block storage is described in detail, for example, in FIG. 2A. The incoming I/O write request (input/output write request) refers to a block write command. The block storage stores the incoming I/O write request in a form of the data blocks. Generally, one data block corresponds to a specific number of bytes occupied in physical storage space on a block storage (of a disk), such as 4 kilobytes (kB) data block, 8 kB data block, and the like. The data block may be also referred to as a logical block. In an example, a data block is potentially the smallest unit of data (i.e. a defined amount of data) used by a database. For example, the incoming I/O write request may be stored in the block storage in the form of the data blocks of a defined (or fixed) size, for example, 8 kB. The block storage device is configured to receive the incoming I/O write request and create chunks of data which is stored in a plurality of data blocks of the block storage (e.g. of the block storage device). In an example, the incoming I/O write request may be backup data received from a primary storage system (e.g. a host server) during backup. In the block storage, the writes are block writes with offset and length of the data. When the user requests the needed data, from the block storage, the block storage device potentially retrieves and reassembles data in the plurality of data blocks via offset and presents the requested data to a user. For example, retrieve 8 blocks of data starting at offset 8100.

The block storage comprises previously stored data for which hash values have been calculated based on a content based variable chunking method and a map between said hash values and pointers to logical addresses of corresponding prestored chunks of the previously stored data. The map may be a complex data structure that allows finding the location (e.g. of the prestored chunks) based on a hash (e.g. in practice, the fact that a large block is received with many variable chunks that are near each other is what allows to have an efficient data structure.). The previously stored data potentially refers to backup data that is backed up (i.e. stored) previously in the block storage (e.g. from the primary storage system). At the time of receipt, the data (i.e. the previously stored data) is split into portions of smaller size (known as chunks) using the content based variable chunking method. The content based variable chunking method splits the data into variable length chunks. For example, the data may be split into three chunks of 7 kilobytes, 19 kilobytes and 28 kilobytes using the content based variable chunking method. Typically, in practice, the chunking is byte aligned and not block aligned. Unlike fixed length chunking method, the content based variable chunking method is more resistant to byte shifting thus, increases the reliability of determining duplicate chunks between the previously stored data and the incoming I/O write request. Such data chunks are also stored in the form of data blocks of a defined size (e.g. a fixed size of 8 kB data block) in the block storage.

In an example, the hash value is generated by the block storage device for each chunk of the previously stored data using a hashing algorithm. The hash value refers to a fixed sized value for original data of arbitrary sizes generated using a hash function (i.e. a hashing algorithm). Examples of the hashing algorithm include but are not limited to SHA-2, MD5. Beneficially, the hash values enable identification of the respective chunk of the previously stored data in the block storage and thus, prevents further duplication of data of the data blocks in the block storage. Each of the prestored chunks of the previously stored data are assigned with the corresponding logical address (or virtual address) by the block storage device. The logical address is used as a reference to access the physical address of the chunk in the block storage of the block storage device. Further, pointers are assigned to the respective location address of each chunks of the previously stored data. Each pointer allows to physically locate and verify each chunk (spread across one or more data blocks) of the previously stored data on the block storage. Furthermore, in an example, the map refers to a data structure generated by the block storage device between the hash values and pointers assigned to the logical address of each chunk in the block storage. The map facilitates provisioning of an access to each chunk in the block storage and enables efficient deduplication for future incoming I/O write requests. It is to be understood that the map may not be a simple hash table, but in practice may be a complex data structure that allows finding the location (e.g. of the prestored chunks) based on a hash (e.g. in practice, the fact that a large block is received with many variable chunks that are near each other is what allows to have an efficient data structure.

In accordance with an embodiment, an expected size of I/O write requests is determined based on statistics on the sizes of I/O write requests stored in the system in conjunction with their addresses in the block storage. The block storage device is configured to acquire statistics on the sizes of I/O write requests in the system (e.g. the block storage) in conjunction with their addresses in the block storage. Specifically, the block storage device collects statistics on the I/O write requests sizes versus logical block addresses in the block storage. This indicates one or more storage areas (or sections) where I/O write requests are expected to be larger and storage areas where I/O write requests are expected to be comparatively smaller. Thereafter, the block storage is divided into at least a first section in which I/O write requests are expected to have a size above a threshold and a second section in which I/O write requests are expected to have a size below the threshold. In such a case, the method 100 comprises the step examining the size of the incoming I/O write request and performing the steps 102 to 114 of method 100 if the size is above the threshold, and if the size is below the threshold, performing fixed size deduplication for the first data block. The block storage is divided into sections (i.e. storage areas) of a fixed size. For example, the block storage may be divided into a first section and a second section. The first section and the second section may be, for example, of 100 MB each. Additionally, the threshold is decided (or set) to check if the incoming I/O write request is large or not. For example, in a case where the threshold of a defined size, for example, 256 KB is set, the incoming I/O write request of a size greater than 256 KB may be considered as large I/O write request. Similarly, the incoming I/O write requests less than the threshold may be considered as comparatively small I/O write requests. For the incoming I/O write request arriving at a section which is mapped as the second section (i.e. when the size of the incoming I/O write request is below the threshold), the incoming I/O write request is chunked to fixed size aligned chunks (e.g. 8KB) and fixed size deduplication is applied by the block storage device. As the size of incoming I/O write request arriving at the second section is below the threshold, the write amplification even in case of application of fixed size deduplication method is minimal or negligible. Moreover, for such sections (or storage areas), the block storage device applies a differential compression scheme for further compression and data reduction. The fixed size deduplication method is further described, in an example, in FIG. 2A. For the incoming I/O write request arriving at a section mapped as the first section (i.e. when the size of the incoming I/O write request is above the threshold), various data reduction operations are applied as described in the steps 102 to 114 of the method 100. The various data reduction operations correspond to variable size deduplication.

At step 102, the method 100 comprises dividing the incoming I/O write request into a first and a second incoming chunk using the content based variable chunking method. Each incoming chunk may be of arbitrary size. The incoming I/O write request is divided into a plurality of chunks of variable length (i.e. size), such as the first incoming chunk and the second incoming chunk, using the content based variable chunking algorithm. In other words, the size of the first incoming chunk and the second incoming chunk is different from each other. The chunk size of the first incoming chunk and the second incoming chunk is larger than a given data block that is of fixed size. For example, the size of each data block of fixed size may be 8 kilobytes, whereas the chunk size of the first incoming chunk and the second incoming chunk is arbitrary. The content based variable chunking method allows processing of the incoming I/O write request before being written into the block storage. Thus, the content based variable chunking method reduces the amount of write amplification and computations in the block storage. After execution of the content based variable chunking, each incoming chunk, such as the first incoming chunk and the second incoming chunk has an amount of data that is potentially equal to one or more incomplete data blocks (e.g. less than 8 KB) and one or more complete fixed sized incoming data blocks (i.e. fully aligned and equal of the size of the data block of fixed size, such as 8 KB). It is to be understood that there may be much more than two chunks in a single I/O write request, and that the first incoming chunk and the second incoming chunk are used for explanation purpose, for the sake of brevity.

In accordance with an embodiment, the content based variable chunking method is a Rabin hash method. Alternatively stated, the incoming I/O write request is divided (chunked) into variable size chunks with average chunk size significantly larger than the fixed size data block, for example, greater than 8KB data block using some known chunking algorithm, such as Rabin hash.

At step 104, the method 100 further comprises calculating a first hash value for the first incoming chunk. In practice, the hashes of all incoming chunks may be calculated, where the search for the first hash locations may also depend on the other hashes. The first hash value of the first incoming chunk is calculated by the block storage device using a hashing function. The first hash value refers to fixed sized value that represents original data of the first incoming chunk. Examples of the hashing function include, but are not limited to SHA-1, SHA-2, or MD5. Beneficially, the first hash value enables identification of the first incoming chunk that allows comparison of the first hash value of the first incoming chunk with the hash values of the previously stored data to prevent duplication of the first incoming chunk in the block storage.

At step 106, the method 100 further comprises comparing the first hash value to the hash values in the map (i.e. the data structure). The first hash value for the first incoming chunk is searched in the data structure (referred to as the map) and it is checked whether the first hash value can be found in the data structure at the block storage. In a case where match is found (i.e. the first hash value is found in the data structure, then the control moves to step 108. However, in a case where match is not found, then the control moves to step 112.

At step 108, the method 100 further comprises identifying the matching prestored chunk based on the first hash value. In a case where the match is found for the first hash value in the data structure, it implies that data corresponding to the first incoming chunk is already stored in the block storage. Hence, the prestored chunk is identified in the block storage to prevent storage of the first incoming chunk and thus, reduces duplication of data in the block storage. In another example, the prestored chunk is identified by the block storage device by comparing the first hash value to the previously calculated hash values of the previously stored data stored in the map. At step 110, the method 100 further comprises storing at least one pointer to a prestored first data block in the prestored chunk data having at least partially identical data to a first incoming data block in the first incoming chunk, instead of storing the first incoming data block in the block storage. In an example, the first incoming chunk may include data that is equal to one incomplete data block (e.g. of 4 KB) which is not fully aligned to the given block (e.g. of 8 KB) and two complete fixed sized incoming data blocks fully aligned to the consecutive blocks (of 8 KB each). In such a case, if the first hash value of the first incoming chunk exists in the current map, each of the two complete fixed sized incoming data blocks fully contained in the alignment is stored by the block storage device as a corresponding pointer block. Pointer block do not store actual data and only stores a pointer to the one or more consecutive blocks in the first section of the block storage where the data exists (e.g. pointers to prestored two data blocks in the prestored chunk data). An example is described in FIG. 3D. On the other hand, if the prestored first data block has partially identical data to the first incoming data block (i.e. the incomplete data block that is not fully aligned or contained in a given data block), the block storage device stores at least one pointer to the prestored first data block in the prestored chunk data instead of storing the first incoming data block (i.e. newly incoming data portion) in the block storage. The pointer allows to locate the prestored first data block on the block storage. Hence, the method 100 prevents duplication of data without prior storing the first incoming data block in the block storage that reduces the write amplification and significantly reduces the computation time. Moreover, in some cases, the block storage device potentially stores one more pointer to an adjacent block having identical data to a remainder of the first incoming data block (if the prestored first data block has partially identical data to the first incoming data block).

In accordance with an embodiment, if the prestored first data block has fully identical data to the first incoming data block, storing one pointer to the prestored first block instead of storing the first incoming data block. In a scenario, if the data in the first incoming data block completely matches the data in the prestored first data block, only one pointer is required to be assigned and stored by the block storage device to the prestored first block to locate address of the prestored first block in the block storage.

In accordance with an embodiment, if the prestored first data block has partially identical data to the first incoming data block, storing one pointer to the prestored first data block and one pointer to an adjacent block having identical data to a remainder of the first incoming data block instead of storing the first incoming data block. In another scenario, the data in the first incoming data block may partially match the data in the prestored first data block and partially match a data block adjacent to the prestored first data block. In other words, only a data portion (not all data) in the first incoming data block is same as data in the prestored first data block, whereas remaining data portion of the first incoming data block is potentially identical with a prestored second data block which is adjacent to the prestored first data block. Then, two pointers are assigned by the block storage device to locate the prestored first block on the block storage. The one pointer is assigned to the prestored first data block and the other pointer is assigned to the data block adjacent to the prestored first data block (that is the prestored second data block) to physically locate the prestored first block and the prestored second block on the block storage, without the need to store the first incoming data block. In this case, for example, if the prestored first data block has partially identical data to the first incoming data block (i.e. the incomplete data block that is not fully aligned or contained in a given data block), the block storage device stores at least one pointer to the prestored first data block in the prestored chunk data and another pointer to the adjacent block.

At step 112, the method 100 further comprises storing the first incoming chunk in the block storage and storing the first hash value in the map (i.e. the data structure) together with a pointer to the first incoming chunk’s logical address in the block storage. If there is no match in the data structure for the first hash value, it implies that the first incoming chunk is not present in the block storage. Therefore, the first incoming chunk is potentially stored in the block storage. Further, the first hash value is generated by the block storage device for the first incoming chunk for future identification of the first incoming chunk in the block storage and thus, prevents duplication of the first data chunk in the block storage. Furthermore, the first incoming chunk is assigned with the logical address (or virtual address) by the block storage device that is used as a reference to access the physical address of the chunk in the block storage. Additionally, the pointer is assigned by the block storage device to the logical address of the first incoming chunk. The pointer allows to locate and verify the first incoming chunk on the block storage. In an example, the data structure establishes a correspondence between the hash value, the first incoming chunk, and the pointer assigned to the logical address of the first incoming chunk in the block storage. The map provides an access to the first incoming chunk in the block storage using corresponding hash value and eliminates unnecessary data-to-data comparison for future deduplication operations and thus, reduces complexity.

In accordance with an embodiment, the method 100 further comprises the steps of, if the first incoming chunk comprises a first incomplete data block in addition to one or more complete fixed sized incoming data blocks, a remainder of the first incomplete data block being stored in an adjacent incoming chunk to the first incoming chunk, identifying a second prestored data block in the first prestored data chunk, having identical data to a first portion of the first incomplete data block. The first incomplete data block potentially have a size lesser than a given data block of a fixed size. The remaining portion of the first incomplete data block (for example, the remaining 4 kilobytes of the first incomplete data block) if found to be stored in the adjacent incoming chunk (e.g. second incoming chunk or chunk 2), then the second prestored data block in the first prestored data chunk is identified. This is done to determine if the first incomplete data block is already stored in the block storage and prevent duplication of data of the first incoming chunk.

The method 100 further comprises identifying an adjacent incoming chunk to the first incoming chunk and obtaining a second hash value for the adjacent incoming chunk. The adjacent incoming chunk to the first incoming chunk that has remaining data of the first incomplete data block is identified by the block storage device. Identification of the adjacent incoming chunk prevents duplication of data in the block storage. The second hash value of the adjacent incoming chunk is obtained in order to compare the second hash value of the adjacent incoming chunk with the hash value of the previously stored data to prevent duplication of the adjacent incoming chunk in the block storage.

The method 100 further comprises if there is a match for the second hash value in the block storage, identifying a second prestored data chunk corresponding to the second hash value, identifying a third prestored data block in the second prestored data chunk having identical data to a remainder of the first incomplete data block, and storing a pointer to the second prestored data block and a pointer to the third prestored data block instead of storing the first incomplete data block. The second hash value is also searched and checked if it is found in the data structure at the block storage to determine if remaining data of the first incomplete data block is already present in the block storage. In a case where the match is found for the second hash value, it implies that data corresponding to the adjacent incoming chunk is already stored in the block storage. The second prestored chunk comprises the remainder data of the first incomplete data block being stored in an adjacent incoming chunk.

In an example, since a data portion corresponding to the first incomplete data block and the remaining data of the first incomplete data block is already present in the previously stored data in the block storage, the first incomplete data block and the remaining data of the first incomplete data block is not stored in the block storage. Instead, one pointer is assigned by the block storage device to the second prestored data block in the first prestored data chunk that comprises data corresponding to the data of first incomplete data block. Further, another one pointer is assigned by the block storage to the third prestored data block of the second prestored data chunk that comprises data corresponding to the remaining data of the first incomplete data block that is stored in the adjacent incoming chunk. The pointers eliminate the need to store actual data, and significantly reduces the write amplification and associated computations.

If there is no match for the second hash value in the block storage, storing a pointer to the second prestored data block together with the remainder of the first incomplete data block instead of storing the first incomplete data block. No match for the second hash value in the block storage implies that the remaining data of the first incomplete data block being is not previously stored in the block storage. In such scenario, the one pointer is assigned by the block storage to the second prestored data block to locate the second prestored data block in the block storage. Further, the remaining data of the first incomplete data block is stored in the block storage. In an example, the remaining data of the first incomplete data block may be stored by the block storage device in one more fixed sized data blocks. In an example, alternatively, the remaining data of the first incomplete data block may be compressed as differential from the beginning (start) of the current chunk.

Thus, the method 100 enables performing the fixed size deduplication method on the incoming I/O write request when size of the I/O write request is within the pre-determined threshold and the variable size deduplication method when the VO write request is larger than the predetermined threshold. The incoming I/O write request within the pre-determined threshold does not involve adverse amount of write amplification and computations therefore, fixed size deduplication method can be implemented with affecting performance. However, the variable size deduplication method on the large incoming I/O write request reduces significant amount of write amplification and computations required to be performed on the incoming I/O write requests to prevent duplication. The variable size deduplication method further increases deduplication ratio and thus, reduces duplication of the data of the block storage.

FIG. 2A is a block diagram that illustrates various exemplary components of a block storage system, in accordance with another embodiment of the present disclosure. FIG. 2A is described in conjunction with FIG. 1A and IB. With reference to FIG. 2A, there is shown the block storage system 200A. The block storage system 200A comprises a block storage device 202 and a control system 214. The block storage device 202 comprises a disk 204 having a block storage 206. The block storage device 202 further comprises a logical circuitry 208 and a first network interface 210. The block storage 206 comprises a first section 212A and a second section 212B. The control system 214 comprises a logical circuitry 214A, a second network interface 214B, and a memory 214C.

The block storage device 202 refers to a secondary storage device used for back-up. The block storage device 202 includes suitable logic, circuitry, interfaces, and/or code that is configured to process the incoming I/O write requests. In an implementation, the block storage device 202 is a part of storage area networks (SANs) or a cloud-based storage environments. Other examples of the block storage device 202 include, but is not limited to a secondary storage server, a block-storage based computing device in a computer cluster (e.g. massively parallel computer clusters), an block-storage based electronic device, or a supercomputer. The block storage device 202 is communicatively coupled with the control system 214.

The block storage device 202 comprising a disk 204 for storing I/O write requests. The disk 204 herein refers to a hardware or physical storage of the block storage device 202. The disk 204 is configured to store the I/O write request and the instructions executable by the logical circuitry 208. In an example, the disk 204 includes the block storage 206 that defines the storage space for data blocks. The disk 204 may also include other known components, such as disk head, and the like, used to read and write data (not shown for the sake of brevity). Examples of implementation of the disk 204 may include, but are not limited to, a Hard Disk Drive (HDD), Solid-State Drive (SSD), a back-up storage disk, a block storage unit, or other computer storage media. The disk 204 may store an operating system and/or other program products (including one or more operation algorithms) to operate the block storage device 202. The block storage 206 is generally used for high-performing applications that require consistent input/output performance and low latency, for example, in storage-area network (SAN) environments, cloud-based storage, virtual machine file systems, and the like. Data is saved in the block storage 206 in fixed-sized chunks called blocks or data blocks.

The logical circuitry 208 is configured to execute instructions stored in the disk 204 to control storing of the incoming I/O write request in the form of the data blocks in the block storage device 202. Examples of the logical circuitry 208 includes, but are not limited to a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. Moreover, the logical circuitry 208 may refer to one or more individual processors, processing devices, a processing unit that is part of a machine, such as the block storage device 202.

The first network interface 210 is an arrangement of interconnected programmable and/or nonprogrammable components that are configured to facilitate data transmission between one or more electronic devices. The first network interface 210 may support communication protocols for Internet Small Computer Systems Interface (iSCSI), fibre channel, or Fibre Channel over Ethernet (FCoE) protocols. The first network interface 210 may also support communication protocols for one or more of peer-to-peer network, a hybrid peer-to-peer network, local area networks (LANs), metropolitan area networks (MANS), wide area networks (WANs), all or a portion of a public network such as the global computer network known as the Internet, a private network, or any other communication system or systems at one or more locations. Additionally, the first network interface 210 supports wired or wireless communication that can be carried out via any number of known protocols, including, but not limited to, Internet Protocol (IP), Wireless Access Protocol (WAP), Frame Relay, or Asynchronous Transfer Mode (ATM). Moreover, any other suitable protocols using voice, video, data, or combinations thereof, can also be employed and supported by the first network interface 210.

The first section 212A and the second section 212B are the sections logically divided within block storage 206. The first section 212A are meant for I/O write requests that are expected to have size above the threshold. For example, the first section 212A is used for the I/O write requests of size above 256 kilobytes, in an example. The second section 212B is configured for I/O write requests that are expected to have a size below the threshold. For example, the second section 212B is used for the I/O write requests of size below 256 kilobytes, in an example.

The control system 214 includes suitable logic, circuitry, and/or interfaces that is configured to control the writing of the data to the block storage device 202. In an implementation, the control system 214 is configured to execute control instructions stored in the memory 214C for controlling writing of the incoming I/O write request. The control system 214 is a part of the block storage system 200A. For example, the control system 214 may be a backup server that control data backup from the primary storage system (e.g. host server) to the secondary storage system (e.g. the block storage device 202). In another example, optionally, the control system 214 may be a primary storage system (e.g. host server) that controls writing in the secondary storage system, such as the block storage device 202. In yet another example, the control system 214 may be a control device that is communicatively coupled to the block storage device 202, for example, via a communication medium (e.g. iSCSI, fibre channel, or FCoE), to control the writing of the data to the block storage device 202.

The control system 214 comprises the logical circuitry 214A, the second network interface 214B, and the memory 214C. The logical circuitry 214A of the control system 214 is configured to execute instructions stored in the memory 214C to control writing of the incoming I/O write request in the disk 204 of the block storage device 202. Examples of the logical circuitry 214A includes, but are not limited to a microprocessor, a microcontroller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. Examples of the implementation of the second network interface 214B is same or similar as that of the first network interface 210 of the block storage device 202. Examples of implementation of the memory 214C of the control system 214 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory.

In operation, the logical circuitry 208 of the block storage device 202 is configured to divide the block storage into at least a first section 212A in which I/O write requests are expected to have a size above a threshold and a second section 212B in which I/O write requests are expected to have a size below the threshold. In such a case, the logical circuitry 208 is configured to examine the size of the incoming I/O write request and perform various data reduction operations using variable size deduplication (e.g. as described in the steps 102 to 114 of method 100 in FIG. 1) if the size is above the threshold, and if the size is below the threshold, perform fixed size deduplication for the first data block. For the incoming I/O write request arriving at a section which is mapped as the second section 212B (i.e. when the size of the incoming I/O write request is below the threshold), the logical circuitry 208 applies the fixed size deduplication. For the incoming I/O write request arriving at a section mapped as the first section 212A (i.e. when the size of the incoming I/O write request is above the threshold), the logical circuitry 208 applies variable size deduplication. The logical circuitry 208 is further configured to divide the incoming I/O write request into a first and a second incoming chunk using the content based variable chunking method (in practice, there may be more chunks of arbitrary size). The incoming I/O write request is divided into a plurality of chunks of variable length (i.e. size), such as the first incoming chunk and the second incoming chunk, using the content based variable chunking algorithm. The logical circuitry 208 is further configured to calculate a first hash value for the first incoming chunk. The first hash value of the first incoming chunk is calculated using a hashing function.

The logical circuitry 208 is further configured to compare the first hash value to the hash values in the map (i.e. the data structure). The first hash value for the first incoming chunk is searched in the data structure and it is checked whether the first hash value can be found in the data structure (possibly leveraging other hashes to make the find in the data structure while searching for the first hash value at the block storage.

The logical circuitry 208 is further configured to identify the matching prestored chunk based on the first hash value. In a case where the match is found for the first hash value, it implies that data corresponding to the first incoming chunk is already stored in the block storage.

The logical circuitry 208 is further configured to store at least one pointer to a prestored first data block in the prestored chunk data having at least partially identical data to a first incoming data block in the first incoming chunk, instead of storing the first incoming data block in the block storage. The pointer allows to locate the prestored first data block on the block storage. If the prestored first data block has fully identical data to the first incoming data block, the logical circuitry 208 is configured to store one pointer to the prestored first block instead of storing the first incoming data block. In a scenario, if the data in the first incoming data block completely matches the data in the prestored first data block, only one pointer is assigned by the logical circuitry 208. If the prestored first data block has partially identical data to the first incoming data block, the logical circuitry 208 is configured to store one pointer to the prestored first data block and one pointer to an adjacent block having identical data to a remainder of the first incoming data block instead of storing the first incoming data block. In another scenario, the data in the first incoming data block may partially match the data in the prestored first data block and partially match a data block adjacent to the prestored first data block. Then, two pointers are assigned by the logical circuitry 208 to physically locate the prestored first block on the block storage. The logical circuitry 208 is further configured to store the first incoming chunk in the block storage and store the first hash value in the map together with a pointer to the first incoming chunk’s logical address in the block storage. If there is no match for the first hash value in the data structure , it implies that the first incoming chunk is not present in the block storage. Therefore, the first incoming chunk is potentially stored in the block storage.

Moreover, if the first incoming chunk comprises an amount of data that is equal to a first incomplete data block in addition to the one or more complete fixed sized incoming data blocks, where a remainder of the first incomplete data block is stored in an adjacent incoming chunk to the first incoming chunk, the logical circuitry 208 is further configured to identify a second prestored data block in the first prestored data chunk. The first prestored data chunk has identical data to a first portion of the first incomplete data block. The logical circuitry 208 is further configured to identify an adjacent incoming chunk to the first incoming chunk and obtain a second hash value for the adjacent incoming chunk. If there is a match for the second hash value in the block storage, the logical circuitry 208 is further configured to: identify a third prestored data chunk corresponding to the second hash value, identify a second prestored data block in the second prestored data chunk having identical data to a remainder of the first incomplete data block, and store a pointer to the second prestored data block and a pointer to the third prestored data block instead of storing the first incomplete data block. If there is no match for the second hash value in the block storage, the logical circuitry 208 is further configured to store a pointer to the second prestored data block together with the remainder of the first incomplete data block instead of storing the first incomplete data block.

The control system 214 for controlling the writing of data to a block storage device 202, said control system 214 being arranged to perform, for each incoming I/O write request, the method 100 (described in FIG. 1). In operation, the logical circuitry 214A of the control system 214 is configured to control the writing of data to the block storage device 202. In one aspect, the logical circuitry 214A of the control system 214 executes various operations described, for example, in the FIGs. 1 and 2A (e.g. operations of the logical circuitry 208 of the block storage device 202). Alternatively stated, the various embodiments, operations, and variants disclosed above apply mutatis mutandis to the control system 214.

A computer program product for use in a control system 214 for controlling the writing of data to the block storage system 200A, said computer program product comprising computer- readable instructions which, when executed in the control system 214 cause the control system 214 to perform the method 100. The computer program product may include, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. In yet another aspect, the present disclosure provides a computer program adapted to perform the method 100 by a device (e.g. the block storage device 202 or the control system 214). In another aspect, the present disclosure provides a computer program product comprising a non-transitory computer-readable storage medium having computer-readable instructions being executable by a processor to execute the method 100. Examples of implementation of the non-transitory computer-readable storage medium include, but is not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, a Secure Digital (SD) card, Solid-State Drive (SSD), a computer readable storage medium, and/or CPU cache memory.

FIG. 2B is an exemplary illustration of an exemplary variable length deduplication, in accordance with an embodiment of the present disclosure. With reference to FIG. 2B, there is shown two exemplary I/O write requests, such as the first I/O write request 218 and the second I/O write request 220. Each of the first I/O write request 218 and the second I/O write request 220 undergoes content-based variable chunking (as shown in the representative block 216). For example, the first I/O write request 218 based on its content is divided into variable sized five data chunks (represented as A, C, D, B, and D in the the first I/O write request 218 and as IIA, he, ho, hu, and ho in the representative block 216). Similarly, the second I/O write request 220 based on its content is divided into variable sized three data chunks (represented as D, C and B in the the first I/O write request 218 and as ho, he, and 11B). It is to be noted that the chunks D, C and B in the second VO write request 220 is same as of the data chunks D, C and B in the first I/O write request 218.

In accordance with an embodiment, the first VO write request 218 and the second VO write request 220 both comprises duplicate the data chunks B, C and D. Further, the first incoming object comprises two duplicate copies of the data chunk D. When the first VO write request 218 and the second I/O write request 220 is to be backed up in a secondary storage (i.e. the block storage 206), deduplication is applied where the duplicate data chunks of the first VO write request 218 and the second VO write request 220 are not stored in the block storage 206. Hence, the data chunks A, B, C and D are only retained (backed up) in the block storage 206 and duplicate data chunks are removed. The above example describes a typical example of deduplication using variable length deduplication. However, it is to be understood that the variable deduplication (or the variable length deduplication) applied (or executed) by the block storage device 202 in the present disclosure is different as all data at the end is stored in fixed size block and the variable size deduplication (also simply referred to as dedup) only allows to understand where in other blocks the data is located.

FIGs. 3A, 3B, 3C, 3D, and 3E are exemplary illustrations of various operations for data reduction in data deduplication, in accordance with an embodiment of the present disclosure. FIG. 3A to 3E are described in conjunction with elements from FIGs. 1, 2A, and 2B. With reference to FIG. 3 A, there is shown a fixed size chunking arrangement 300A of an incoming I/O write request 302 using a fixed size deduplication technique. The fixed size deduplication technique is performed when the incoming VO write request 302 is below the defined threshold. For example, in a case where the threshold of a defined size, for example, 256 KB is set, the incoming I/O write request 302 is chunked to fixed size aligned chunks (e.g. of 8KB) and fixed size deduplication is applied by the block storage device 202. Additionally, the hash value is generated for each data block of the incoming I/O write request 302 and compared with the hash value of previously stored data in the block storage device 202. If a match is found between the hash value of previously stored data in the block storage device 202 and the hash value generated for the data block of the incoming VO write request 302, it implies that the data block of the incoming I/O write request is already present in the block storage device 202. Therefore, instead of storing the data again, the pointer is stored to the prestored data block. However, if no match is found between the hash value of previously stored data in the block storage device 202 and the hash value generated for each data block of the incoming I/O write request 302, it implies that the data of the incoming I/O write request 302 is not present in the block storage device 202. Therefore, the incoming I/O write request 302 is stored in the block storage device 202. Further, the pointer is assigned and stored as metadata to the data blocks of the incoming I/O write request 302 to map the data blocks on the block storage 206.

With reference to FIG. 3B, there is shown a variable size chunking arrangement 300B of an incoming I/O write request 304 that is divided using a content based variable chunking technique. The content based variable chunking technique is performed when the incoming I/O write request 304 is above the threshold. For example, in a case where the threshold of a defined size, for example, 256 KB is set, the incoming I/O write request 304 is chunked to variable size chunks and the content based variable chunking is applied on the incoming I/O write request 304. In the content based variable chunking, the incoming I/O write request 304 is divided into the data chunks of variable sizes. For example, the incoming I/O write request 304 is divided into a first incoming chunk 304A, a second incoming chunk 304B and a third incoming chunk 304C. The size of the first incoming chunk 304 A, the second incoming chunk 304B and the third incoming chunk 304C are different from each other. Further, the chunk size of the first incoming chunk 304A, the second incoming chunk 304B and the third incoming chunk 304C is larger than a given data block that is of fixed size. For example, if the size of the data block of fixed size is 8 kilobytes, then the determined chunk size of the first incoming chunk 304A, the second incoming chunk 304B and the third incoming chunk 304C is different and greater than 8 kilobytes. After execution of the content based variable chunking, each incoming chunk, such as the first incoming chunk 304A, the second incoming chunk 304B and the third incoming chunk 304C comprises an amount of data that equals to one or two incomplete data blocks (e.g. less than 8 KB) and one or more complete fixed sized incoming data blocks (i.e. fully aligned and equal of the size of the data block of fixed size, such as 8 KB).

With reference to FIG. 3C, there is shown an exemplary alignment of the first incoming chunk 304A with respect to a set of fixed size data blocks 306 (e.g. in the block storage). In this example, the first incoming chunk 304A comprises a first incomplete data block 308A, three complete fixed sized incoming data blocks 308B, 308C, and 308D, and a second incomplete data block 308E. The first incomplete data block 308A and the second incomplete data block 308E have a size lesser than a given data block of a fixed size. For example, the size of three complete fixed sized incoming data blocks 308B, 308C, and 308D may be 8 KB, whereas the size of the first incomplete data block 306B and the second incomplete data block 306C is lesser than 8 KB, such as 2142 bytes and 3968 bytes respectively (it is to be understood that for size in bytes is considered, that is the sizes are byte aligned not aligned in kilobytes). The block storage device 202 is configured to calculate a first hash value for the first incoming chunk 304A. Beneficially, the first hash value enables identification of the first incoming chunk 304A that allows to search the first hash value of the first incoming chunk 304A in the data structure and if found can prevent duplication of the first incoming chunk 304A in the block storage 206.

With reference to FIG. 3D, there is shown an exemplary illustration of a scenario when there is a match for the first hash value of the first incoming chunk 304A in the block storage 206. With reference to FIG. 3D, there is shown the first prestored data chunk 310 and the first incoming chunk 304A. In this scenario, the first prestored data chunk 310 is prestored in the form of four data blocks of a defined size and a partial block in the block storage 206 such as a prestored first data block 310A, a prestored second data block 310B, a prestored third data block 310C, and a prestored fourth data block 310D (followed by a partial chunk at the end, which is a partial block 310E, say of size 1525 bytes).

In a case where the prestored first data block 310A has partially identical data to the incoming data block 308B of the first incoming chunk 304 A and the prestored second data block 310B comprises identical data to a remainder of the incoming data block 308B, two pointers 312A and 312B are potentially assigned by the block storage device 202 to physically locate the prestored first data block 310A and the prestored second data block 310B on the block storage 206. The pointer 312A is assigned to the prestored first data block 310A and the other pointer 312B is assigned to the data block adjacent to the prestored first data block 310A (that is the prestored second data block 310B) without the need to store the incoming data block 308B.

Alternatively, in a case where the first hash value of the first incoming chunk 304 A exists in the current map (i.e. the data structure), each of the two complete fixed sized incoming data blocks (e.g. the incoming data blocks 308B, 308C, and 308D) fully contained in the alignment is stored by the block storage 206 as a corresponding pointer block. Pointer block do not store actual data and only stores a pointer to the one or more consecutive blocks in the block storage 206 where the data exists (e.g. pointers to prestored two data blocks in the prestored chunk data, for example, the first prestored data chunk 310).

With reference to FIG. 3E, there is shown an exemplary illustration of a scenario when the first incoming chunk 304A comprises the first incomplete data block 308A which is not fully aligned (contained) with respect to a data bock of a set of fixed size data blocks 306. In an example, a remainder of the first incomplete data block 308A may be stored in an adjacent incoming chunk 316 to the first incoming chunk 304A. With reference to FIG. 3E, there is shown the first prestored data chunk 310 and a second prestored data chunk 314. The second prestored data chunk 314 comprises prestored data blocks, such as a prestored first data block 314A, a prestored second data block 314B, a prestored third data block 314C, a prestored fourth data block 314D and a prestored fifth prestored data block 314E. There is further shown the adjacent incoming chunk 316 to the first incoming chunk 308. The adjacent incoming chunk 316 comprises incoming data blocks such as, a first incoming data block 316A, a second incoming data block 316B and a third incoming data block 316C.

In accordance with an embodiment, as a data portion corresponding to the first incomplete data block 308A and the remaining data of the first incomplete data block 308A (i.e. data corresponding to the third incoming data block 316C of the adjacent incoming chunk 316) is already present in the previously stored data in the block storage 206, the first incomplete data block 308A and the remaining data of the first incomplete data block is not stored in the block storage 206. Instead, a pointer 318A is assigned by the block storage device 202 to the prestored first data block 310A in the first prestored data chunk 310 that comprises data corresponding to the data of first incomplete data block 308A. Further, another pointer 318B is assigned by the block storage device 202 to the fifth prestored data block 314E of the second prestored data chunk 314 that comprises data corresponding to the remaining data of the first incomplete data block 308A that is stored in the adjacent incoming chunk 316. The pointers 318A and 318B eliminates the need to store actual data, and significantly reduces the write amplification and associated computations.

FIGs. 4A, 4B, and 4C collectively is a flowchart of a method for reducing data in data deduplication of incoming I/O write requests in a block storage device, in accordance with another embodiment of the present disclosure. FIGs. 4A, 4B, and 4C are described in conjunction with elements from FIGs. 1, 2A, 2B, and 3 A to 3E. The method 400 is executed by the block storage device 202 (of Fig. 2A). The method 400 includes steps 402 to 432.

At step 402, the method 400 comprises examining the size of the incoming I/O write request 302 and dividing the block storage 206 into at least a first section 212A in which I/O write requests are expected to have a size above a threshold and a second section 212B in which I/O write requests are expected to have a size below the threshold. In other words, the variable chunking is performed based on if an area (e.g. like the first section 212A) has large IOs more of the time (i.e. frequently large VO write request), or based of if specific IO is large.

At step 404, the method 400 further comprises determining if the incoming I/O write request 302 has size above or below threshold. In a case where the incoming I/O write request (e.g. the incoming I/O write request 302) does not has size above the threshold, then the control moves to step 406. However, in a case where the incoming I/O write request (e.g. the incoming I/O write request 304) has size above the threshold, then the control moves to step 410. At step 406, the method 400 further comprises performing fixed size deduplication for the first data block.

At step 408, the method 400 further comprises dividing the incoming I/O write request 302 into a first incoming chunk 304A and a second incoming chunk 304B using the content based variable chunking method. At step 410, the method 400 further comprises calculating a first hash value for the first incoming chunk 304A.

At step 412, the method 400 further comprises determining if the first incoming chunk 304A comprises the first incomplete data block 308A. In a case where the first incoming chunk 304A does not comprises the first incomplete data block 308A, then the control moves to step 414. However, in a case where the first incoming chunk 304A comprises the first incomplete data block 308 A, then the control moves to step 422. At step 414, the method 400 further comprises comparing the first hash value to the hash values in the map. In a case where no match is found between the first hash value to the hash values in the map, then the control moves to step 416. However, in a case where match is found between the first hash value to the hash values in the map, then the control moves to step 418.

At step 416, the method 400 further comprises storing the first incoming chunk 304 A in the block storage 206 and storing the first hash value in the map together with a pointer to the first incoming chunk’s logical address in the block storage 206. At step 418, the method 400 further comprises identifying the matching prestored chunk based on the first hash value.

At step 420, the method 400 further comprises storing at least one pointer 312A to a prestored first data block 310A in the prestored chunk data having at least partially identical data to a first incoming data block in the first incoming chunk 304A, instead of storing the first incoming data block in the block storage 206. At step 422, the method 400 further comprises identifying a second prestored data block in the first prestored data chunk 310, having identical data to a first portion of the first incomplete data block 308A.

At step 424, the method 400 further comprises identifying an adjacent incoming chunk 316 to the first incoming chunk 304 A and obtaining a second hash value for the adjacent incoming chunk 316. At step 426, the method 400 further comprises determining if there is a match for the second hash value in the block storage 206. In a case where no match is found for the second hash value in the block storage 206, then the control moves to step 428. However, in a case where match is found for the second hash value in the block storage 206, then the control moves to step 430.

At step 428, the method 400 further comprises storing a pointer 318A to the second prestored data block together with the remainder of the first incomplete data block 308A instead of storing the first incomplete data block 308A. At step 430, the method 400 further comprises identifying a second prestored data chunk 314 corresponding to the second hash value, identifying a second prestored data block in the second prestored data chunk 314 having identical data to a remainder of the first incomplete data block 308A, and storing a pointer 318A to the second prestored data block and a pointer 318B to the third prestored data block instead of storing the first incomplete data block 308A. Alternatively, in some cases each portion (e.g. a given portion of the first incomplete data block 308A) may comprise of two pointers.

Modifications to embodiments of the present disclosure described in the foregoing are possible without departing from the scope of the present disclosure as defined by the accompanying claims. Expressions such as "including", "comprising", "incorporating", "have", "is" used to describe and claim the present disclosure are intended to be construed in a non-exclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural. The word "exemplary" is used herein to mean "serving as an example, instance or illustration". Any embodiment described as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments. The word "optionally" is used herein to mean "is provided in some embodiments and not provided in other embodiments". It is appreciated that certain features of the present disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable combination or as suitable in any other described embodiment of the disclosure.

Previous Patent: DATA STORAGE SYSTEM WITH BUILT-IN REDUNDANCY AND METHODS OF RECOVERING AND STORING DATA

Next Patent: DATA STORAGE ARRANGEMENT AND METHOD FOR ANONYMIZATION AWARE DEDUPLICATION