

Title:
REGENERATING LOCALLY REPAIRABLE CODES FOR DISTRIBUTED STORAGE SYSTEMS
Document Type and Number:
WIPO Patent Application WO/2018/029212
Kind Code:
A1
Abstract:
Disclosed herein is a method for generating local and global parity nodes for source nodes, the method comprising: generating a first set of parity nodes for source nodes by applying a systematic encoding technique to the source nodes, wherein the first set of parity nodes comprises a plurality of parity nodes and each of the parity nodes in the first set of parity nodes is a global parity node; generating, in dependence on at least one of the parity nodes in the first set, a plurality of local parity nodes; generating a second set of parity nodes for the source nodes, wherein the second set of parity nodes comprises said plurality of local parity nodes and all of the parity nodes in the first set except for said at least one of the parity nodes in the first set that the plurality of local parity nodes were generated in dependence on; and using the second set of parity nodes as the parity nodes of the source data. Advantageously, embodiments combine the beneficial aspects of exact regenerating systematic codes and locally repairable codes.

Inventors:
SIMONSEN PER (NO)
GLIGOROSKI DANILO (NO)
JENSEN RUNE ERLEND (NO)
KRALEVSKA KATINA (NO)
Application Number:
PCT/EP2017/070110
Publication Date:
February 15, 2018
Filing Date:
August 08, 2017
Assignee:
MEMOSCALE AS (NO)
International Classes:
H03M13/37; G06F11/10; H03M13/03; H03M13/13
Foreign References:
GB201507934A (2015-05-08)
GB201522869A (2015-12-24)
EP2016082656W (2016-12-23)
Other References:
KAMATH GOVINDA M ET AL: "Codes With Local Regeneration and Erasure Correction", IEEE TRANSACTIONS ON INFORMATION THEORY, IEEE PRESS, USA, vol. 60, no. 8, 1 August 2014 (2014-08-01), pages 4637 - 4660, XP011553834, ISSN: 0018-9448, [retrieved on 20140710], DOI: 10.1109/TIT.2014.2329872
CHENG HUANG ET AL: "Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems", PROC., SIXTH IEEE INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS, NCA 2007, 1 July 2007 (2007-07-01), pages 79 - 86, XP031119281, ISBN: 978-0-7695-2922-6
KRALEVSKA KATINA ET AL: "General Sub-Packetized Access-Optimal Regenerating Codes", IEEE COMMUNICATIONS LETTERS, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 20, no. 7, 1 July 2016 (2016-07-01), pages 1281 - 1284, XP011616340, ISSN: 1089-7798, [retrieved on 20160708], DOI: 10.1109/LCOMM.2016.2561287
AGARWAL GAURAV KUMAR ET AL: "An alternate construction of an access-optimal regenerating code with optimal sub-packetization level", PROC., IEEE TWENTY FIRST NATIONAL CONFERENCE ON COMMUNICATIONS, NCC, 27 February 2015 (2015-02-27), pages 1 - 6, XP032763750, DOI: 10.1109/NCC.2015.7084873
DANILO GLIGOROSKI ET AL: "Locally Repairable and Locally Regenerating Codes Obtained by Parity-Splitting of HashTag Codes", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 23 January 2017 (2017-01-23), XP080751036
RASHMI K V ET AL: "A piggybacking design framework for read-and download-efficient distributed storage codes", PROC., IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, ISIT 2013, 7 July 2013 (2013-07-07), pages 331 - 335, XP032496947, ISSN: 2157-8095, [retrieved on 20131003], DOI: 10.1109/ISIT.2013.6620242
A. DIMAKIS; P. GODFREY; Y. WU; M. WAINWRIGHT; K. RAMCHANDRAN: "Network coding for distributed storage systems", IEEE TRANSACTIONS ON INFORMATION THEORY, vol. 56, no. 9, September 2010 (2010-09-01), pages 4539 - 4551, XP011316796
P. GOPALAN; C. HUANG; H. SIMITCI; S. YEKHANIN: "On the locality of codeword symbols", IEEE TRANSACTIONS ON INFORMATION THEORY, vol. 58, no. 11, 2012, pages 6925 - 6934, XP011468632, DOI: doi:10.1109/TIT.2012.2208937
F. E. OGGIER; A. DATTA: "Self-repairing homomorphic codes for distributed storage systems", INFOCOM, 2011, pages 1215 - 1223, XP031953290, DOI: doi:10.1109/INFCOM.2011.5934901
D. PAPAILIOPOULOS; J. LUO; A. DIMAKIS; C. HUANG; J. LI: "Simple Regenerating Codes: Network Coding for Cloud Storage", IEEE INFOCOM, 2012, pages 2801 - 2805
K.V. RASHMI ET AL.: "A ''Hitchhiker's'' Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centres", SIGCOMM, August 2014 (2014-08-01)
G.K. AGARWAL; B. SASIDHARAN; P. VIJAY KUMAR: "An alternate construction of an access-optimal regenerating code with optimal sub-packetisation level", NATIONAL CONFERENCE ON COMMUNICATIONS (NCC, February 2015 (2015-02-01), pages 1 - 6
K. KRALEVSKA; D. GLIGOROSKI; H. ØVERBY: "General sub-packetized access-optimal regenerating codes", IEEE COMMUNICATIONS LETTERS, vol. 20, no. 7, July 2016 (2016-07-01), pages 1281 - 1284, XP011616340, DOI: doi:10.1109/LCOMM.2016.2561287
K. KRALEVSKA; D. GLIGOROSKI; R. E. JENSEN; H. ØVERBY: "HashTag Erasure Codes: From theory to practice", ARXIV PREPRINT, vol. abs/1609.02450, 2016
C. HUANG; H. SIMITCI; Y. XU; A. OGUS; B. CALDER; P. GOPALAN; J. LI; S. YEKHANIN: "Erasure coding in windows azure storage", USENIX ANNUAL TECHNICAL CONFERENCE, 2012, pages 15 - 26
M. SATHIAMOORTHY; M. ASTERIS; D. S. PAPAILIOPOULOS; A. G. DIMAKIS; R. VADALI; S. CHEN; D. BORTHAKUR: "XORing Elephants: Novel Erasure Codes for Big Data", vol. 6, no. 5, 2013, pages 325 - 336
G. M. KAMATH; N. PRAKASH; V. LALITHA; P. V. KUMAR: "Codes with local regeneration and erasure correction", IEEE TRANS. ON INF. THEORY, vol. 60, no. 8, August 2014 (2014-08-01), pages 4637 - 4660, XP011553834, DOI: doi:10.1109/TIT.2014.2329872
Attorney, Agent or Firm:
J A KEMP (GB)
Claims:

1. A method for generating local and global parity nodes for source nodes, the method comprising: generating a first set of parity nodes for source nodes by applying a systematic encoding technique to the source nodes, wherein the first set of parity nodes comprises a plurality of parity nodes and each of the parity nodes in the first set of parity nodes is a global parity node; generating, in dependence on at least one of the parity nodes in the first set, a plurality of local parity nodes; generating a second set of parity nodes for the source nodes, wherein the second set of parity nodes comprises said plurality of local parity nodes and all of the parity nodes in the first set except for said at least one of the parity nodes in the first set that the plurality of local parity nodes were generated in dependence on; and using the second set of parity nodes as the parity nodes of the source data.

2. The method according to claim 1, wherein each of the global parity nodes is generated in dependence on all of the source nodes.

3. The method according to claim 1 or 2, wherein none of the local parity nodes is generated in dependence on all of the source nodes.

4. The method according to any preceding claim, wherein the at least one of the parity nodes in the first set that the plurality of local parity nodes are generated in dependence on comprise the plurality of local parity nodes.

5. The method according to any preceding claim, wherein the second set of parity nodes of the source nodes consists of said plurality of local parity nodes and all of the parity nodes in the first set of parity nodes except for said at least one of the global parity nodes that the plurality of local parity nodes were generated in dependence on.

6. The method according to any preceding claim, wherein the first set of parity nodes are generated by an MDS coding technique.

7. The method according to any preceding claim, wherein each of the nodes comprises a plurality of sub-stripes of data; and each of the nodes comprises the same number of sub-stripes.

8. The method according to claim 7, wherein the number of sub-stripes in each node is 2 or more; and/or the number of local parity nodes is 2 or more; and/or the number of global parity nodes in the second set is 1 or more.

9. The method according to any preceding claim, wherein for each of the global parity nodes of the source nodes in the second set of parity nodes, the global parity node is generated in dependence on a sub-stripe of a source node that one or more of the other of the global parity nodes in the second set of parity nodes is also generated in dependence on.

10. The method according to any preceding claim, wherein the applied systematic encoding technique is a coding technique for generating HashTag codes.

11. The method according to any preceding claim, wherein the local parity nodes are generated in dependence on only one of the parity nodes in the first set.

12. The method according to claim 7 or any claim dependent thereon, wherein for one of the at least one parity nodes in the first set of parity nodes, each sub-stripe is generated in dependence on a combination of only one sub-stripe of source data in each of the source nodes.

13. The method according to claim 12, wherein the local parity nodes are generated in dependence on the parity node in the first set of parity nodes for which each sub-stripe is generated in dependence on a combination of only one sub-stripe of source data in each of the source nodes.

14. The method according to claim 12 or 13, wherein the sub-stripes are combined by XOR operations.

15. The method according to any preceding claim, further comprising obtaining the code parameters n, k, r, α, δ and l, wherein: n is the total number of source nodes and the nodes in the second set of parity nodes;

k is the total number of source nodes;

r is the total number of nodes in the first set of parity nodes;

α is the number of sub-stripes in each node;

δ and l are parameters that define the number of local parity nodes and global parity nodes in the second set of parity nodes.

16. The method according to claim 15, wherein the number of local parity nodes in the second set of parity nodes is the product of l and (δ - 1).

17. The method according to claim 15 or 16, wherein the number of global parity nodes in the second set of parity nodes is (r - δ + 1).

18. The method according to any of claims 15 to 17, wherein each of the code parameters is received as an input, determined from one or more other code parameters, or is a default value.

19. The method according to any of claims 15 to 18, wherein the number of sub-stripes in each node, α, is the same as the total number of global parity nodes in the second set of parity nodes; each of the global parity nodes in the second set is named as one of {g1, ..., gα}; for each global parity node in the second set, gi, the sub-stripes of the node are denoted as:

an array of all of the sub-stripes of all of the global parity nodes in the second set is denoted as:

the method comprises using the array to compute a new array of all of the sub-stripes of all of the global parity nodes, where:

determining a third set of parity nodes that comprises all of the local parity nodes comprised by the second set of parity nodes, wherein the global parity nodes of the third set of parity nodes are determined in dependence on the new array; and

using the third set of parity nodes as the parity nodes of the source data instead of the second set of parity nodes.

20. The method according to claim 19, further comprising: generating a further array by performing permutations on the rows and/or columns of the new array; determining a fourth set of parity nodes that comprises all of the local parity nodes comprised by the second set of parity nodes, wherein the global parity nodes of the fourth set of parity nodes are determined in dependence on the further array; and using the fourth set of parity nodes as the parity nodes of the source data instead of the second set of parity nodes.

21. The method according to claim 7 or any claim dependent thereon, wherein each of the sub-stripes of each of the local parity nodes is generated in dependence on a combination of the same number of sub-stripes from source nodes.

22. The method according to claim 7 or any claim dependent thereon apart from claim 21, wherein the number of sub-stripes of source data that at least one of the local parity nodes is generated in dependence on is different from the number of sub-stripes of source data that at least one other of the local parity nodes is generated in dependence on.

23. The method according to any preceding claim apart from claim 11, wherein the local parity nodes are generated in dependence on more than one of the parity nodes in the first set; and the number of local parity nodes generated in dependence on one of the parity nodes in the first set is different from the number of local parity nodes generated in dependence on one of the other parity nodes in the first set.

24. The method according to any preceding claim, wherein the method is computer implemented.

25. The method according to any preceding claim, wherein the code is generated over a Finite Field such as GF(8), GF(16), GF(32), GF(64), GF(128) or GF(256).

26. The method according to any preceding claim, further comprising performing a mapping operation on source data, for storing in one or more of the source nodes, and encoded source data, for storing as parity data in one or more of the parity nodes, such that one or more of the source nodes stores both source data and parity data.

27. A method of coding source data, the method comprising: obtaining source data; and encoding the source data to generate redundant data of the source data in accordance with the method of any preceding claim.

28. The method according to any preceding claim, further comprising transferring the source data and redundant data over a network.

29. A coding technique for generating coded data from source data, the coding technique being equivalent to generating the coded data in dependence on the method of any preceding claim.

30. A generator matrix for defining how to generate coded data from source data, the generator matrix defining a code that is equivalent to a systematic code with parity nodes generated in accordance with the method of any preceding claim.

31. A generator matrix, G, for defining how to generate coded data from source data, wherein G has (α x n) rows and (α x k) columns with elements in a finite field in accordance with claim 15, where T is the total number of local parity nodes and global parity nodes, and G defines a code that is equivalent to a code that has been generated by the method of any preceding claim, and where G has the form G = [I; P] (I stacked above P), where I is an identity matrix with dimensions (α x k) x (α x k) and where P is a matrix with dimensions (α x T) x (α x k).

32. A computing system configured to perform the method of any of claims 1 to 31.

33. A computer program that, when executed by a computing system, causes the computing system to perform the method of any of claims 1 to 31.

34. A method for storing data in a data storage system, the method comprising:

obtaining source data; generating parity data for the source data in accordance with the method of any preceding claim; and storing the source data and the parity data in the data storage nodes of the data storage system.

35. A data storage system configured to store data in accordance with claim 34.

36. A method of using a plurality of nodes, that have been generated in accordance with the method of any of claims 1 to 29, to recover a source node, the method comprising performing a local recovery operation; wherein the local recovery operation comprises recovering the source node in dependence on one of the local parity nodes and the source nodes that, in addition to the source node being recovered, said local parity node was generated in dependence on; and the local recovery operation does not require data from any other nodes to recover the lost source node.

37. A method of using a plurality of nodes, that have been generated in accordance with the method of any of claims 1 to 29, to recover a source node, the method comprising performing a global recovery operation; wherein the global recovery operation comprises recovering the source node in dependence on data in one or more of the global parity nodes and all of the source nodes apart from the source node being recovered.

38. A method of using a plurality of nodes, that have been generated in accordance with the method of any of claims 1 to 29, to recover a source node, the method comprising: determining either to recover the source node in dependence on a global recovery operation or a local recovery operation; and, in dependence on the determination, either recovering the source node in dependence on the method of claim 36; or recovering the source node in dependence on the method of claim 37.

39. A method of using a plurality of nodes, that have been encoded in dependence on the method of any of claims 1 to 29, to recover a local parity node, the method comprising: recovering the local parity node in dependence on the source nodes that the local parity node was generated in dependence on.

40. A method of using a plurality of nodes, that have been encoded in dependence on the method of any of claims 1 to 29, to recover a global parity node, the method comprising: recovering the global parity node in dependence on all of the other global parity nodes and all of the local parity nodes.

41. A method of using a plurality of nodes, that have been encoded in dependence on the method of any of claims 1 to 29, to recover a global parity node, the method comprising: recovering the global parity node in dependence on all of the other global parity nodes and all of the source nodes.

42. The method according to claim 41, the method comprising recovering the global parity node in dependence on one sub-stripe in each of the source nodes and one sub-stripe in each of the other global parity nodes only.

43. A computing system configured to perform the method of any of claims 35 to 42.

44. A computer program that, when executed by a computing system, causes the computing system to perform the method of any of claims 35 to 42.

Description:
REGENERATING LOCALLY REPAIRABLE CODES FOR

DISTRIBUTED STORAGE SYSTEMS

Field

The field of the invention is systematic coding of data with erasure codes. Particularly preferred applications for coded data according to embodiments are data storage systems. The codes according to embodiments advantageously provide a lower average repair bandwidth for recovery of a single erased fragment of data than known Locally Repairable Codes. Another advantageous property of the codes according to embodiments is that fewer fragments of data need to be read in order to reconstruct a single lost fragment of data than with known Exact Regenerating Codes.

Background

An important feature in distributed storage systems is the efficiency of the repair of a failed data fragment. The efficiency is measured with two metrics: 1. the repair bandwidth; and 2. the repair locality. The repair bandwidth is the amount of transferred data during a repair, i.e. recovery, of a failed data fragment, while the repair locality is the number of data nodes contacted during the repair process. Two types of erasure codes that address each of these metrics have emerged: 1. Regenerating codes; and 2. Locally Repairable Codes (LRCs). Regenerating codes were defined in the paper 'Network coding for distributed storage systems' by A. Dimakis, P. Godfrey, Y. Wu, M. Wainwright, and K. Ramchandran, IEEE Transactions on Information Theory, vol. 56, no. 9, Sept 2010, pp. 4539-4551. Locally Repairable Codes were introduced independently in 3 papers: 1. 'On the locality of codeword symbols' by P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, IEEE Transactions on Information Theory, vol. 58, no. 11, 2012, pp. 6925-6934; 2. 'Self-repairing homomorphic codes for distributed storage systems' by F. E. Oggier and A. Datta, INFOCOM, 2011, pp. 1215-1223; 3. 'Simple regenerating codes: Network coding for cloud storage' by D. Papailiopoulos, J. Luo, A. Dimakis, C. Huang, and J. Li, IEEE INFOCOM, 2012, pp. 2801-2805.

Regenerating codes aim to minimize the repair bandwidth, while LRCs seek to minimize the repair locality. It is known for regenerating codes to use sub-packetization: each data node is divided into sub-packets, also referred to herein as sub-stripes, of data. The recovery of a failed data node is performed by calculating the lost data node in dependence on sub-packets from all n-1 non-failed data nodes. A problem with such codes is that the reconstruction of a lost data node requires the transfer of data from all n-1 non-failed data nodes, and this results in high I/O, i.e. a large number of data nodes need to be accessed.

An advantage of LRCs over regenerating codes is that fewer data nodes need to be accessed in a data recovery operation. Contacting fewer data nodes is particularly beneficial for storage applications that preferably have a low I/O. A problem with LRCs is that the amount of transferred data during data recovery is larger than for regenerating codes.

An [n, k]q Maximum Distance Separable (MDS) code C has to transfer k symbols to recover one lost symbol, where k is the number of source nodes and n is the total number of nodes. When a systematic coding technique is used, the k source nodes comprise source data. There are (n - k) parity nodes, also referred to as redundant nodes, that store parity data, also referred to as redundant data, for use in reconstructing a failed node. The parity data is generated by combining source data using operations in a Finite Field. An exact regenerating systematic code in a finite field GF(q) is denoted by [n, k]q. Regenerating codes are a class of codes for distributed storage that permit data recovery from any k out of n nodes, and also have the capability of repairing a failed node by connecting to d nodes and downloading an amount of data that is significantly less than the file size. Under an [n, k]q regenerating code, a file of M symbols from a finite field GF(q) is divided into k fragments, each of size α = M/k symbols, which are further encoded into n fragments using an [n, k]q MDS (Maximum-Distance Separable) code. The data from a failed node is recovered by transferring β symbols from each of d non-failed nodes. Thus, the repair bandwidth γ is equal to dβ, where α ≤ dβ ≪ M.

The data in each node can further be fragmented into α fragments. The reconstruction of unavailable data is performed by operations on fragments of data within available nodes. The sub-packetization level represents the minimum dimension over which all operations are performed. When the sub-packetization level is 1, as it is for standard Reed-Solomon (RS) codes, each node is recovered by transferring data of size M/k from k nodes, i.e., the total amount of the transferred data is M. Accordingly, the repair bandwidth is equal to the file size when RS codes are used. By using a sub-packetization level α > 1, the repair bandwidth can be less than M.
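
The bandwidth arithmetic above can be sketched numerically. The following Python snippet is an illustration only (the function names and parameter values are my own, not from the application); it compares the repair bandwidth of an RS code (sub-packetization 1) with a minimum-storage regenerating code at the standard point β = M/(k(d - k + 1)):

```python
# Illustrative comparison of repair bandwidths (hypothetical parameters).

def rs_repair_bandwidth(M: int, k: int) -> float:
    # With sub-packetization 1, a lost node is rebuilt from k nodes,
    # each sending M/k symbols, so the whole file is transferred.
    return k * (M / k)

def msr_repair_bandwidth(M: int, k: int, d: int) -> float:
    # Each of the d helper nodes sends beta = M / (k * (d - k + 1)) symbols.
    beta = M / (k * (d - k + 1))
    return d * beta

M, k, n = 1200, 6, 10          # example parameters, not from the patent
d = n - 1                      # contact all surviving nodes
print(rs_repair_bandwidth(M, k))       # 1200.0 -> the entire file
print(msr_repair_bandwidth(M, k, d))   # 450.0  -> well below M
```

Note that 450 symbols still satisfies α ≤ dβ ≪ M, since α = M/k = 200 here.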

In the paper 'A "Hitchhiker's" Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centres' by K.V. Rashmi et al., SIGCOMM 2014, August 2014, a technique for improving on RS coding by using a sub-packetization level of 2 is presented.

In UK patent application number GB1507934.6, referred to herein as the [MSR1] codes, a more general exact regenerating systematic [n, k]q code is presented where the sub-packetization is α = r, where r = n - k.

A general erasure coding technique is presented in a paper by G.K. Agarwal, B. Sasidharan, and P. Vijay Kumar, titled 'An alternate construction of an access-optimal regenerating code with optimal sub-packetisation level', National Conference on Communications (NCC), pages 1-6, Feb 2015, referred to herein as the Agarwal paper. In the Agarwal paper, the data in each node is split into α fragments. The reconstruction of unavailable data is performed by operations on fragments of data within available nodes. The Agarwal paper discloses the use of a sub-packetisation level of α = r^m where m = k/r. An essential condition to be able to construct the codes disclosed in the Agarwal paper is that m is an integer.
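
The integrality condition can be checked directly. The helper below is a small illustration of the parameter constraint only (the function name and example values are my own, not from the Agarwal paper):

```python
# Returns the Agarwal sub-packetisation level alpha = r**m with m = k/r,
# or None when r does not divide k (the construction is unavailable).

def agarwal_alpha(n: int, k: int):
    r = n - k
    m, rem = divmod(k, r)
    if rem != 0:
        return None            # m = k/r is not an integer
    return r ** m

print(agarwal_alpha(12, 8))    # r=4, m=2 -> 16
print(agarwal_alpha(12, 9))    # r=3, m=3 -> 27
print(agarwal_alpha(12, 7))    # r=5 does not divide 7 -> None
```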

Referred to throughout the present document as [MSR2] is a patent application family comprising UK patent application number GB1522869.5 and International patent application number PCT/EP2016/082656, as well as the publications:

K. Kralevska, D. Gligoroski, and H. Øverby, "General sub-packetized access-optimal regenerating codes," IEEE Communications Letters, vol. 20, no. 7, pp. 1281-1284, July 2016; and

K. Kralevska, D. Gligoroski, R. E. Jensen, and H. Øverby, "HashTag Erasure Codes: From theory to practice," arXiv preprint, vol. abs/1609.02450, 2016.

In the codes defined by [MSR2], a more general exact regenerating systematic code [n, k, d]q is presented where the sub-packetization level α can be any value in a range of permitted values. Codes generated in accordance with [MSR2] are referred to herein as HashTag codes.

Definition 1:

A Locally Repairable Code (LRC) C, denoted as (k, l, g), is a code that divides k source nodes into l groups, with k/l source nodes in each group. It computes one local parity node for each group. It also computes g global parity nodes from all source nodes. For LRCs the total number of nodes n is n = k + l + g. With an LRC, any single lost source node can be recovered by transferring the data from k/l data nodes (i.e. source and parity nodes).
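
Definition 1 can be made concrete with a toy XOR construction. This is an illustration of the definition only; the group layout, byte values, and helper names are assumptions, not the patent's construction:

```python
# Toy (k, l, g) LRC source/local-parity layout: k source nodes split into
# l groups of k/l, with one XOR local parity per group. A lost source node
# is rebuilt from the other members of its group plus the group's local
# parity, i.e. from k/l data nodes in total.

def xor_blocks(blocks):
    # byte-wise XOR of equally sized blocks
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

k, l = 4, 2
source = [b"\x01\x02", b"\x0a\x0b", b"\x10\x20", b"\x55\xaa"]
groups = [source[:k // l], source[k // l:]]       # l groups of k/l nodes
local_parity = [xor_blocks(g) for g in groups]    # one local parity each

# Lose source[1]; repair touches only its local group (2 nodes here),
# not all k nodes as an RS/MDS repair would.
repaired = xor_blocks([local_parity[0], source[0]])
assert repaired == source[1]
```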

Two practical LRCs have been implemented in Windows Azure Storage, presented in the paper 'Erasure coding in windows azure storage' by C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li, and S. Yekhanin, USENIX Annual Technical Conference, 2012, pp. 15-26, referred to herein as LRC Azure, and HDFS-Xorbas by Facebook, presented in the paper 'XORing elephants: Novel erasure codes for big data' by M. Sathiamoorthy, M. Asteris, D. S. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, vol. 6, no. 5, 2013, pp. 325-336, referred to herein as Xorbas LRC. Both implementations reduce the repair bandwidth and the I/O for reconstructing a single source node by introducing l local and g global parity nodes. Any single source node can be recovered from k/l data nodes within its local group. However, reconstruction of any global parity node (in Windows Azure) or of double node failures is performed in the same way as with RS codes, i.e., data from k nodes has to be transferred.

There is a general need to improve known coding techniques, in particular for data storage applications.

Summary

According to a first aspect of the invention, there is provided a method for generating local and global parity nodes for source nodes, the method comprising: generating a first set of parity nodes for source nodes by applying a systematic encoding technique to the source nodes, wherein the first set of parity nodes comprises a plurality of parity nodes and each of the parity nodes in the first set of parity nodes is a global parity node; generating, in dependence on at least one of the parity nodes in the first set, a plurality of local parity nodes; generating a second set of parity nodes for the source nodes, wherein the second set of parity nodes comprises said plurality of local parity nodes and all of the parity nodes in the first set except for said at least one of the parity nodes in the first set that the plurality of local parity nodes were generated in dependence on; and using the second set of parity nodes as the parity nodes of the source data.
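
A minimal sketch of the parity-splitting idea in the first aspect, using XOR only: one first-set parity node, whose every sub-stripe combines one sub-stripe from each source node, is split into local parity nodes over disjoint source-node groups. The function names, group layout, and values are assumptions for illustration; the actual embodiments use a finite-field systematic code such as a HashTag code for the first set.

```python
# Parity-splitting sketch: build a first-set (global) parity node, split
# it into l local parity nodes, and note that the discarded global parity
# is the XOR of the locals, so no information is lost by the split.

def xor_rows(nodes, alpha):
    # one parity sub-stripe per row: the XOR of that row across all nodes
    out = []
    for s in range(alpha):
        v = 0
        for node in nodes:
            v ^= node[s]
        out.append(v)
    return out

alpha, l = 2, 2
source = [[1, 2], [3, 4], [5, 6], [7, 8]]          # k = 4 nodes, alpha = 2

p_global = xor_rows(source, alpha)                 # a first-set parity node
groups = [source[:2], source[2:]]                  # l disjoint local groups
locals_ = [xor_rows(g, alpha) for g in groups]     # l local parity nodes

# The second set would keep locals_ plus the remaining first-set parities;
# p_global itself can be dropped because it is recoverable from the locals.
assert xor_rows(locals_, alpha) == p_global
```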

Preferably, each of the global parity nodes is generated in dependence on all of the source nodes.

Preferably, none of the local parity nodes is generated in dependence on all of the source nodes.

Preferably, the at least one of the parity nodes in the first set that the plurality of local parity nodes are generated in dependence on comprise the plurality of local parity nodes.

Preferably, the second set of parity nodes of the source nodes consists of said plurality of local parity nodes and all of the parity nodes in the first set of parity nodes except for said at least one of the global parity nodes that the plurality of local parity nodes were generated in dependence on.

Preferably, the first set of parity nodes are generated by an MDS coding technique.

Preferably, each of the nodes comprises a plurality of sub-stripes of data; and each of the nodes comprises the same number of sub-stripes.

Preferably, the number of sub-stripes in each node is 2 or more; and/or the number of local parity nodes is 2 or more; and/or the number of global parity nodes in the second set is 1 or more.

Preferably, for each of the global parity nodes of the source nodes in the second set of parity nodes, the global parity node is generated in dependence on a sub-stripe of a source node that one or more of the other of the global parity nodes in the second set of parity nodes is also generated in dependence on.

Preferably, the applied systematic encoding technique is a coding technique for generating HashTag codes.

Preferably, the local parity nodes are generated in dependence on only one of the parity nodes in the first set.

Preferably, for one of the at least one parity nodes in the first set of parity nodes, each sub-stripe is generated in dependence on a combination of only one sub-stripe of source data in each of the source nodes.

Preferably, the local parity nodes are generated in dependence on the parity node in the first set of parity nodes for which each sub-stripe is generated in dependence on a combination of only one sub-stripe of source data in each of the source nodes.

Preferably, the sub-stripes are combined by XOR operations.

Preferably, the method further comprises obtaining the code parameters n, k, r, α, δ and l, wherein: n is the total number of source nodes and the nodes in the second set of parity nodes; k is the total number of source nodes; r is the total number of nodes in the first set of parity nodes; α is the number of sub-stripes in each node; and δ and l are parameters that define the number of local parity nodes and global parity nodes in the second set of parity nodes.

Preferably, the number of local parity nodes in the second set of parity nodes is the product of l and (δ - 1).

Preferably, the number of global parity nodes in the second set of parity nodes is (r - δ + 1).
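
The two counts above can be sketched together; the parameter values below are hypothetical examples, not from the application:

```python
# Bookkeeping for the second set of parity nodes: l*(delta - 1) local
# parities and (r - delta + 1) global parities; n then counts the source
# nodes plus the whole second set.

def second_set_counts(r: int, delta: int, l: int):
    local = l * (delta - 1)       # number of local parity nodes
    global_ = r - delta + 1       # number of global parity nodes
    return local, global_

k, r, delta, l = 10, 4, 2, 2      # hypothetical parameters
local, global_ = second_set_counts(r, delta, l)
print(local, global_)             # 2 3
n = k + local + global_
print(n)                          # 15
```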

Preferably, each of the code parameters is received as an input, determined from one or more other code parameters, or is a default value.

Preferably, the number of sub-stripes in each node, α, is the same as the total number of global parity nodes in the second set of parity nodes; each of the global parity nodes in the second set is named as one of {g1, ..., gα}; for each global parity node in the second set, gi, the sub-stripes of the node are denoted as:

an array of all of the sub-stripes of all of the global parity nodes in the second set is denoted as:

the method comprises using the array to compute a new array of all of the sub-stripes of all of the global parity nodes, where:

determining a third set of parity nodes that comprises all of the local parity nodes comprised by the second set of parity nodes, wherein the global parity nodes of the third set of parity nodes are determined in dependence on the new array; and using the third set of parity nodes as the parity nodes of the source data instead of the second set of parity nodes.

Preferably, the method further comprises: generating a further array by performing permutations on the rows and/or columns of the new array; determining a fourth set of parity nodes that comprises all of the local parity nodes comprised by the second set of parity nodes, wherein the global parity nodes of the fourth set of parity nodes are determined in dependence on the further array; and using the fourth set of parity nodes as the parity nodes of the source data instead of the second set of parity nodes.

Preferably, each of the sub-stripes of each of the local parity nodes is generated in dependence on a combination of the same number of sub-stripes from source nodes.

Preferably, the number of sub-stripes of source data that at least one of the local parity nodes is generated in dependence on is different from the number of sub-stripes of source data that at least one other of the local parity nodes is generated in dependence on.

Preferably, the local parity nodes are generated in dependence on more than one of the parity nodes in the first set; and the number of local parity nodes generated in dependence on one of the parity nodes in the first set is different from the number of local parity nodes generated in dependence on another of the parity nodes in the first set.

Preferably, the method is computer implemented.

Preferably, the code is generated over a Finite Field such as GF(8), GF(16), GF(32), GF(64), GF(128) or GF(256).

Preferably, the method comprises performing a mapping operation on source data, for storing in one or more of the source nodes, and encoded source data, for storing as parity data in one or more of the parity nodes, such that one or more of the source nodes stores both source data and parity data.

According to a second aspect of the invention, there is provided a method of coding source data, the method comprising: obtaining source data; and encoding the source data to generate redundant data of the source data in accordance with the method of the first aspect.

Preferably, the method further comprises transferring the source data and redundant data over a network.

According to a third aspect of the invention, there is provided a coding technique for generating coded data from source data, the coding technique being equivalent to generating the coded data in dependence on the method of the first or second aspect.

According to a fourth aspect of the invention, there is provided a generator matrix for defining how to generate coded data from source data, the generator matrix defining a code that is equivalent to a systematic code with parity nodes generated in accordance with the method of any of the first to third aspects.

According to a fifth aspect of the invention, there is provided a generator matrix, G, for defining how to generate coded data from source data, wherein G has (α x n) rows and (α x k) columns with elements in a finite field, where T is the total number of local parity nodes and global parity nodes, and G defines a code that is equivalent to a code that has been generated by the method of any of the first to fourth aspects, and where G has the form G = [I; P], where I is an identity matrix with dimensions (α x k) x (α x k) and P is a matrix with dimensions (α x T) x (α x k).

According to a sixth aspect of the invention, there is provided a computing system configured to perform the method of any of the first to fourth aspects.

According to a seventh aspect of the invention, there is provided a computer program that, when executed by a computing system, causes the computing system to perform the method of any of the first to fourth aspects.

According to an eighth aspect of the invention, there is provided a method for storing data in a data storage system, the method comprising: obtaining source data; generating parity data for the source data in accordance with the method of any of the first to seventh aspects; and storing the source data and the parity data in the data storage nodes of the data storage system.

According to a ninth aspect of the invention, there is provided a data storage system configured to store data in accordance with the eighth aspect.

According to a tenth aspect of the invention, there is provided a method of using a plurality of nodes, that have been generated in accordance with the method of any of the first to third aspects, to recover a source node, the method comprising performing a local recovery operation;

wherein the local recovery operation comprises recovering the source node in dependence on one of the local parity nodes and the source nodes that, in addition to the source node being recovered, said local parity node was generated in dependence on; and the local recovery operation does not require data from any other nodes to recover the lost source node.

According to an eleventh aspect of the invention, there is provided a method of using a plurality of nodes, that have been generated in accordance with the method of any of the first to third aspects, to recover a source node, the method comprising performing a global recovery operation; wherein the global recovery operation comprises recovering the source node in dependence on data in one or more of the global parity nodes and all of the source nodes apart from the source node being recovered.

According to a twelfth aspect of the invention, there is provided a method of using a plurality of nodes, that have been generated in accordance with the method of any of the first to third aspects, to recover a source node, the method comprising: determining either to recover the source node in dependence on a global recovery operation or a local recovery operation; and, in dependence on the determination, either recovering the source node in dependence on the method of the tenth aspect or recovering the source node in dependence on the method of the eleventh aspect.

According to a thirteenth aspect of the invention, there is provided a method of using a plurality of nodes, that have been encoded in dependence on the method of any of the first to third aspects, to recover a local parity node, the method comprising: recovering the local parity node in dependence on the source nodes that the local parity node was generated in dependence on.

According to a fourteenth aspect of the invention, there is provided a method of using a plurality of nodes, that have been encoded in dependence on the method of any of the first to third aspects, to recover a global parity node, the method comprising: recovering the global parity node in dependence on all of the other global parity nodes and all of the local parity nodes.

Preferably, the recovery process of the global parity node does not require any of the source nodes.

According to a fifteenth aspect of the invention, there is provided a method of using a plurality of nodes, that have been encoded in dependence on the method of any of the first to third aspects, to recover a global parity node, the method comprising: recovering the global parity node in dependence on all of the other global parity nodes and all of the source nodes.

Preferably, the method comprises recovering the global parity node in dependence on one sub-stripe in each of the source nodes and one sub-stripe in each of the other global parity nodes only.

According to a sixteenth aspect of the invention, there is provided a computing system configured to perform the method of any of the tenth to fifteenth aspects.

According to a seventeenth aspect of the invention, there is provided a computer program that, when executed by a computing system, causes the computing system to perform the method of any of the tenth to fifteenth aspects.

Advantageously, embodiments combine the beneficial aspects of exact regenerating systematic codes and locally repairable codes.

List of Figures

Figure 1 shows the construction of parity nodes according to an embodiment;

Figure 2 shows the construction of parity nodes according to an embodiment;

Figures 3A to 3C show the construction of linear expressions for the parity parts of a code according to an embodiment;

Figure 4 is a flowchart of a process according to an embodiment; and

Figure 5 shows the construction of parity nodes according to an embodiment.

Description

Embodiments of the invention provide a new family of Regenerating - Locally Repairable Codes (R-LRC). Codes according to embodiments have the same locality as known LRCs, but they are constructed with sub-packetization levels as disclosed for the codes in [MSR1] and [MSR2]. Advantageously, the codes can have a lower repair bandwidth than known LRC codes.

The entire contents of [MSR1], which is UK patent application number GB1507934.6, are incorporated herein by reference.

The entire contents of [MSR2] are incorporated herein by reference. [MSR2] includes both UK patent application number GB1522869.5 and International patent application number PCT/EP2016/082656, as well as the publications:

K. Kralevska, D. Gligoroski, and H. Øverby, "General sub-packetized access-optimal regenerating codes," IEEE Communications Letters, vol. 20, no. 7, pp. 1281-1284, July 2016; and

K. Kralevska, D. Gligoroski, R. E. Jensen, and H. Øverby, "HashTag Erasure Codes: From theory to practice," arXiv preprint, vol. abs/1609.02450, 2016.

The codes generated according to the techniques in [MSR2] are known as HashTag codes.

Provided below are definitions from at least [MSR2].

Definition 2 (as defined in [MSR2]):

There are k source nodes {d_1, d_2, ..., d_k} where each source node d_j comprises an indexed set of α substripes {a_{1,j}, a_{2,j}, ..., a_{α,j}}. The source nodes can be represented as a two-dimensional array Data with α rows and k columns such that Data = [a_{i,j}] for 1 ≤ i ≤ α and 1 ≤ j ≤ k.

Definition 3 (as defined in [MSR2]):

There are r redundant nodes {p_1, p_2, ..., p_r} where each redundant node p_j, where 1 ≤ j ≤ r, comprises an indexed set of α substripes {p_{1,j}, p_{2,j}, ..., p_{α,j}}.

Definition 4 (as defined in [MSR2]):

There are r redundant two-dimensional index arrays P_1, ..., P_r. The index array P_1 has α rows and k columns, where each cell in P_1 is a pair of indexes whose values are determined by the algorithm disclosed in [MSR2].

The index arrays P_2, ..., P_r have α rows and k + m columns, where each cell in P_j, where 2 ≤ j ≤ r, is a pair of indexes whose values are determined by the algorithm disclosed in [MSR2].

Definition 5 (as defined in [MSR2]):

Each of the substripes p_{i,j} in the redundant nodes {p_1, p_2, ..., p_r}, where 1 ≤ i ≤ α and 1 ≤ j ≤ r, is computed as a linear combination of the substripes whose index pairs exist in the i-th row of the index array P_j, where each coefficient is a nonzero element in the finite field GF(q). The plurality of coefficients is such that the [MSR2] code is an MDS code.

Embodiments include using any of the techniques disclosed in [MSR2] to determine how to generate coded data. The following is an algorithm according to an embodiment that generates a Regenerating - Locally Repairable Code (R-LRC) that advantageously has a low repair locality and a low recovery bandwidth.

Algorithm 1 Generating Regenerating - Locally Repairable Codes

Input: Parameters for a Locally Repairable Code: (k, l, g) and α, where α is the number of substripes in each node.

Output: A Regenerating - Locally Repairable Code (k, l, g) with α substripes in each node. The following steps are performed by Algorithm 1:

1. An MDS code is obtained that has k source nodes, r = g + l - 1 parity nodes and α substripes. In the present embodiment, the techniques disclosed in [MSR2] are used to obtain the code. However, embodiments also include using other techniques for obtaining a code so long as the generated code is consistent with the input parameters k, l, g and α. In some circumstances, the techniques according to [MSR1] may therefore be used.

2. Obtain the corresponding index arrays P_1, ..., P_r using the techniques disclosed in [MSR2].

3. The global parity nodes are defined as the redundant nodes {p_2, ..., p_r} as generated by the [MSR2] code.

4. The local parity nodes are defined in dependence on the index array P_1. According to the present embodiment, P_1 is split into l sub-arrays P_{1,1}, ..., P_{1,l}.

5. The local parities are p_{1,1}, ..., p_{1,l}. The substripes of the local parities are computed as linear combinations of the substripes whose index pairs exist in the i-th row of the sub-array P_{1,j}, where each coefficient is the corresponding nonzero element in the finite field GF(q) determined by the [MSR2] code.

6. The output of Algorithm 1 is a code with parameters [k + g + l - 1, k]_q with α substripes that has the properties of a code according to [MSR2]. Advantageously, the code is also a (k, l, g) Locally Repairable Code.

The implementation of Algorithm 1 as presented above generates sub-arrays of P_1 that all have the same dimensions. However, embodiments also include the sub-arrays that P_1 is divided into having different dimensions. Preferably, the nodes that store data that has to be recovered fast are grouped in groups with a small locality, i.e. sub-arrays with small dimensions.

P_1 defines which sub-stripes of source data from source nodes are combined to form a sub-stripe of redundant data. The l sub-arrays of P_1 also define which sub-stripes of source data from source nodes are combined to form a sub-stripe of redundant data. The combination of the sub-stripes of source data to generate a sub-stripe of redundant data can be performed by any combination technique over a finite field. However, the combination is preferably performed by XOR operations as these are computationally efficient.

The arrays that define the parity nodes correspond to a generator matrix that defines the construction of the code. In accordance with the known use of a generator matrix, the product of the source data with the generator matrix generates the redundant data for the code. Constructing a generator matrix for a code from arrays that define parity nodes is a known technique.
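By way of illustration, the sketch below shows how an index array of (row, column) cells over the Data array can define the sub-stripes of one parity node. The function and variable names are hypothetical, and the combination is plain XOR (all field coefficients taken as 1), which the paragraph above notes is the computationally efficient choice; a full implementation would also multiply by the nonzero GF(q) coefficients.

```python
# Data[i][j] holds sub-stripe i of source node j; each sub-stripe is an int here.
def parity_from_index_array(data, index_array):
    """index_array[i] lists the (row, col) cells of Data that are
    combined into sub-stripe i of the parity node."""
    parity = []
    for cells in index_array:
        s = 0
        for (row, col) in cells:
            s ^= data[row][col]  # XOR combination over the sub-stripes
        parity.append(s)
    return parity

# Toy example: 2 sub-stripes per node, 4 source nodes.
data = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
# Each parity sub-stripe i combines row i of all source nodes.
p1 = parity_from_index_array(data, [[(0, j) for j in range(4)],
                                    [(1, j) for j in range(4)]])
```

Splitting the index array into sub-arrays, as in step 4, simply restricts each parity sub-stripe to the cells of one group of source-node columns.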

Figure 1 shows the generation of local and global parity nodes according to an embodiment.

In Algorithm 1 as described above, the local parity nodes are generated in dependence on only one of the parity nodes initially generated by the coding technique, such as the techniques in [MSR2]. The initially generated parity node that the local parity nodes are generated in dependence on may be any one of the initially generated parity nodes.

Presented below are algorithms according to embodiments for recovering a node when the node has been generated according to the coding technique of embodiments. The algorithms are presented for the recovery of a lost source node when the local parity nodes have all been generated from only one of the initial parity nodes, as in Algorithm 1 . However, embodiments include applying corresponding techniques for recovering a lost source node when the local parity nodes have been generated from more than one of the initial parity nodes.

Algorithm 2 Recovery of one lost source node of an R-LRC

Algorithm 2 is a technique for recovering a lost source node according to an embodiment. The technique is preferable in applications for which it is advantageous for only a low number of source nodes to be read during a recovery operation.

Input: A code that has the parameters [k + g + l - 1, k]_q with α substripes and is also a (k, l, g) Locally Repairable Code. The input code may have been generated by Algorithm 1.

Also received as an input is the index f of a failed source node d_f that needs to be recovered.

Output: The data from the failed source node d_f.

1. Determine the local parity node p_{1,F}, where 1 ≤ F ≤ l, that corresponds to the lost source node d_f.

2. Repair d_f from p_{1,F} and the source nodes, other than d_f, used in the definition of p_{1,F}.
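The two steps of Algorithm 2 can be sketched as follows, under the simplifying assumption that each local-parity sub-stripe is the plain XOR of the corresponding sub-stripes of its group (the GF(q) coefficients are omitted); the function name is illustrative only.

```python
def recover_lost_node(local_parity, surviving_group_nodes):
    """XOR the local parity with the surviving nodes of the local
    group to recover the lost node, sub-stripe by sub-stripe."""
    recovered = list(local_parity)
    for node in surviving_group_nodes:
        recovered = [r ^ s for r, s in zip(recovered, node)]
    return recovered

# A local group {d1, d2, d3} with parity p = d1 ^ d2 ^ d3 per sub-stripe.
d1, d2, d3 = [1, 9], [2, 10], [3, 11]
p = [a ^ b ^ c for a, b, c in zip(d1, d2, d3)]
assert recover_lost_node(p, [d2, d3]) == d1  # repair d1 locally
```

The same call repairs any single member of the group given the parity and the other members.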

Algorithm 3 Recovery of one lost source node of an R-LRC with low bandwidth

Algorithm 3 is another technique for recovering a lost source node according to an embodiment. The technique is preferable when it is desirable for the repair bandwidth to be low.

Input: A code that has the parameters [k + g + l - 1, k]_q with α substripes and is also a (k, l, g) Locally Repairable Code. The input code may have been generated by Algorithm 1.

Also received as an input is the index f of a failed source node d_f that needs to be recovered.

Output: The data from the failed source node d_f.

1. Use the repair algorithm for the code with parameters [k + g + l - 1, k]_q and α substripes to repair the lost source node d_f. The repair algorithm may use the node recovery techniques as disclosed in [MSR2] by treating all of the local parity nodes together as a single global parity node.

Advantageously, Algorithms 2 and 3 provide two options for recovering a lost source node. By using Algorithm 2, the number of nodes that are accessed in the recovery operation is minimised. If Algorithm 3 is used, the repair bandwidth is minimised. The decoding technique can therefore be flexibly chosen according to whether the number of accessed nodes or the repair bandwidth is the greater constraint on the application at the time of the source node recovery.

Another example of the generation of an R-LRC according to an embodiment is presented below.

A code is desired with the LRC parameters (6, 2, 2). The techniques disclosed in [MSR2] are used to produce a code with parameters:

[6 + 2 + 2 - 1, 6]_q = [9, 6]_q with α = 9 substripes in each node

The index arrays P_1, P_2 and P_3 for the parity nodes, as generated by the [MSR2] coding technique, are presented as:

The source nodes are represented as d_1, d_2, ..., d_6. The data in each source node is split into 9 substripes of data as follows:

Accordingly, the source node d_1 is divided into the nine substripes 1, 2, ..., 9; the source node d_2 into the nine substripes 10, ..., 18; and the other source nodes are similarly divided into nine substripes.
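The numbering described above can be expressed as a small sketch (the helper name is hypothetical; α = 9 as in the example):

```python
ALPHA = 9  # sub-stripes per node in this example

def substripes_of(j):
    """Global sub-stripe numbers held by source node d_j:
    (j-1)*ALPHA + 1 up to j*ALPHA inclusive."""
    return list(range((j - 1) * ALPHA + 1, j * ALPHA + 1))

assert substripes_of(1) == [1, 2, 3, 4, 5, 6, 7, 8, 9]   # node d_1
assert substripes_of(2)[0] == 10 and substripes_of(2)[-1] == 18  # node d_2
```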

The global parity nodes are defined as the redundant nodes {p_2, p_3} as generated by the coding technique disclosed in [MSR2]. Accordingly, the global parity nodes are produced from the index arrays P_2 and P_3 and the corresponding coefficients in the finite field GF(q).

The local parities p_{1,1} and p_{1,2} are computed by splitting the index array P_1 into two arrays P_{1,1} and P_{1,2}:

Accordingly, each of the 9 substripes of the local parity p_{1,j}, where 1 ≤ i ≤ 9, is computed as a linear combination of the substripes whose index pairs exist in the i-th row of the index array P_{1,j}, where each coefficient is the corresponding nonzero element in the finite field GF(q) determined by the [MSR2] code.

Presented below are examples of recovering a node of source data.

Source node recovery option 1 :

If there is a need to recover one lost source node and for the number of source nodes that are contacted to be low, then Algorithm 2 is used to recover the lost source node.

For example, if the source node d_1 is lost then d_1 can be recovered by contacting the local parity node p_{1,1} and the nodes d_2 and d_3. Each substripe in p_{1,1} is a combination of the corresponding substripes in d_1, d_2 and d_3, and so there is sufficient information to recover the corresponding lost substripe in d_1.

During the recovery of the source node, the total number of contacted nodes is 3, which is lower than if all of the source nodes had been contacted. However, the total amount of transferred data, i.e. the bandwidth requirement, is 3 times the size of the lost node d_1.

Source node recovery option 2:

If there is a need to recover one lost source node with the amount of transferred data being low, i.e. with low bandwidth, then Algorithm 3 is used to recover the lost source node.

For example, if the source node d_1 is lost then it can be recovered using the data recovery techniques disclosed in [MSR2]. The recovery of d_1 requires the following steps:

1. From the local parity node p_{1,1}, read 3 data fragments, i.e. substripes;

2. From the source nodes d_2, ..., d_6, read 15 substripes.

Accordingly, the recovery of source node d_1 requires the transfer of 24 substripes of data. Each substripe is 1/9 of the size of the lost data node, and so the total amount of data required for repairing the source node d_1 is 24/9 ≈ 2.67 node sizes.

This demonstrates an advantage of node recovery option 2 over node recovery option 1, namely that the repair bandwidth is reduced from 3 to 2.67. However, node recovery option 2 requires obtaining data from all of the source nodes.
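The bandwidth figures quoted for the two recovery options follow from simple arithmetic, sketched below with the node size normalised to 1; the sub-stripe totals (27 and 24) are the ones stated in the text.

```python
ALPHA = 9  # sub-stripes per node in the (9, 6) example

# Option 1 (Algorithm 2): read every sub-stripe of 3 whole nodes
# (the local parity plus two surviving source nodes): 27 sub-stripes.
bw_local = (3 * ALPHA) / ALPHA   # 3.0 node sizes

# Option 2 (Algorithm 3): 24 sub-stripes transferred in total, as stated.
bw_global = 24 / ALPHA           # about 2.67 node sizes

assert bw_local == 3.0
assert round(bw_global, 2) == 2.67
```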

Embodiments also include the generation of codes that have the additional advantageous property that the efficiency of the recovery of one of the global parity nodes is increased, as shown in Algorithm 4 below.

Additional steps are included in the previously presented coding techniques according to embodiments. The additional steps are presented below as additional steps in Algorithm 1. However, embodiments include the additional steps being applied in coding techniques according to embodiments other than Algorithm 1.

Algorithm 4 Generating Regenerating - Locally Repairable Codes with efficient repair of the global parity nodes

Input: Parameters for a Locally Repairable Code: (k, l, g) and α. A property of the code is that α, which is the number of substripes in each node, is the same as g. Accordingly, either one of α or g may be received as an input parameter and the other input parameter determined from the relation α = g.

Output: A Regenerating - Locally Repairable Code (k, l, g) with g substripes in each node.

1. An MDS code is obtained that has k data nodes, r = g + l - 1 parity nodes and g substripes. In the present embodiment, the techniques disclosed in [MSR2] are used to obtain the code. However, embodiments also include using other techniques for obtaining a code so long as the generated code is consistent with the input parameters k, l, g and/or α.

2. Obtain the corresponding index arrays P_1, ..., P_r.

3. The global parity nodes are defined as the redundant nodes {p_2, ..., p_r} as generated by the [MSR2] code.

The global parity nodes are used to construct an array as follows. Each of the global parity nodes {p_2, ..., p_r} is respectively re-named as g_1, ..., g_{r-1}. The substripes of each node g_t are indexed accordingly, and an array of all of the substripes of all of the global parity nodes is formed.

4. The local parity nodes are generated in dependence on the index array P_1, which is split into l sub-arrays.

5. The local parity nodes are p_{1,1}, ..., p_{1,l}. The substripes of the local parity nodes are computed as linear combinations of the substripes whose index pairs exist in the i-th row of the corresponding sub-array of P_1, where each coefficient is the corresponding nonzero element in the finite field GF(q) determined by the [MSR2] code.

6. From the array of the substripes of the global parity nodes, compute a new array of all of the substripes of all of the global parity nodes.

7. The output of Algorithm 4 is a code with parameters [k + g + l - 1, k]_q with α = g substripes. The global parity nodes are defined according to the new array computed in step 6. The code has the properties of a code according to [MSR2].

Advantageously, the code is also a (k, l, g) Locally Repairable Code.

Embodiments also include generating a further matrix/array that defines the substripes of all of the global parity nodes of a code according to an embodiment, and that is obtainable by performing permutations on the rows and/or columns of the array computed in step 6 of Algorithm 4.

The implementation of Algorithm 4 as presented above generates sub-arrays of P_1 that all have the same dimensions. However, embodiments also include the sub-arrays that P_1 is divided into having different dimensions. Preferably, the nodes that store data that has to be recovered fast are grouped in groups with a small locality, i.e. sub-arrays with small dimensions.

Due to the generation of the global parity nodes in accordance with steps 6 and 7 of Algorithm 4, the code has the property of any one of the global parity nodes being advantageously recoverable in dependence on the first substripes in all k source nodes and the first substripes in all of the non-failed global parity nodes.

An example of the generation of a code in accordance with Algorithm 4 is provided in Figure 5.

Presented below is an algorithm according to an embodiment for recovering a parity node when the parity node has been generated according to Algorithm 4. Embodiments include applying corresponding techniques for recovering a lost node of redundant data when the local parity nodes have been generated from more than one of the initial parity nodes.

Algorithm 5 Recovery of one lost redundancy node with R-LRC

Algorithm 5 is a technique for recovering a lost parity node according to an embodiment.

Input: A Regenerating - Locally Repairable Code (k, l, g) with g substripes in each node. The input code may have been generated by Algorithm 4.

Also received as an input is the index f of a failed parity node p_f that needs to be recovered.

Output: The data from the failed node p_f.

If p_f is a local parity node, then:

1. Determine the corresponding local group of source nodes that p_f depends on.

2. Repair p_f from all source nodes in that group;

else:

3. Repair p_f in dependence on the first substripes in all k source nodes and the first substripes of the g - 1 non-failed global parity nodes.

Embodiments include a number of modifications and variations to the implementations described above. Embodiments also include more than one of the initially generated parity nodes being used to generate the local parity nodes. This is illustrated in Figure 2, which shows an embodiment in which a systematic (n, k) MDS code, with a sub-packetisation level α, is used to generate r parity nodes. Of the r parity nodes, δ-1 are used to generate local parity nodes by splitting each of the δ-1 parity nodes into a plurality of local parity nodes. Embodiments also include two or more of the δ-1 parity nodes being used to generate a different number of local parity nodes. The parity nodes that are not used to generate local parity nodes are global parity nodes. Accordingly, there are (r - δ + 1) global parity nodes. All of the nodes comprise α substripes.

This general construction of codes according to embodiments is provided below in Definition 6 to Definition 12 and Algorithm 6, Algorithm 7 and Algorithm 8.
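The node counts in this general construction can be sketched as follows (the helper name is hypothetical; per the statements above, there are l x (δ - 1) local parity nodes in total and (r - δ + 1) global parity nodes):

```python
def rlrc_node_counts(k, r, l, delta):
    """Total, local-parity and global-parity node counts of the R-LRC
    built from a (k + r, k) MDS code with locality (l, delta)."""
    local = l * (delta - 1)        # local parity nodes
    global_ = r - delta + 1        # global parity nodes
    total = k + local + global_    # all nodes of the resulting code
    return total, local, global_

# A (9, 6) base code (r = 3) with locality (l = 2, δ = 2): 10 nodes total.
assert rlrc_node_counts(6, 3, 2, 2) == (10, 2, 2)
# With (l = 3, δ = 2): 11 nodes total.
assert rlrc_node_counts(6, 3, 3, 2) == (11, 3, 2)
```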

Definition 6:

A F_q-linear vector code of block length n is a code having a symbol alphabet F_q^α for some α > 1, i.e. each code symbol is a vector of length α over F_q, and satisfying the additional property that, for a given codeword (c_1, ..., c_n) in C and any scalar a in F_q, (a c_1, ..., a c_n) also belongs to C, where a c_i is scalar multiplication of the vector c_i.

Each vector c_i represents a node.

Definition 7:

Let C be a [n, K, d_min, α, κ] vector code with a generator matrix G. The code C is said to have (l, δ) information locality if there exists a set of punctured codes with respective supports {S_i}, i ∈ L, satisfying the locality conditions set out in the reference below.

Definitions 6 and 7 above use known terminology in the art and are consistent with definitions provided in at least G. M. Kamath, N. Prakash, V. Lalitha, and P. V. Kumar, "Codes with local regeneration and erasure correction," IEEE Trans. on Inf. Theory, vol. 60, no. 8, pp. 4637-4660, Aug 2014, the entire contents of which are incorporated herein by reference.

Definition 8:

A (n, k, d)_q HashTag linear code is a vector systematic code defined over an alphabet F_q^α for some α > 1. It encodes a vector x of source symbols to a codeword (c_1, ..., c_n), where the systematic parts c_1, ..., c_k and the parity parts c_{k+1}, ..., c_n are computed by linear expressions over the source symbols, where each index pair is defined in the corresponding row of the corresponding index array. The r index arrays are defined such that the values of the indexes are determined by a scheduling algorithm that guarantees the code is MDS, i.e. the entire information x can be recovered from any k out of the n vectors c_i.

Definition 9 (as defined in [MSR2]): There are k source nodes {d_1, d_2, ..., d_k} where each source node d_j comprises an indexed set of α substripes {a_{1,j}, a_{2,j}, ..., a_{α,j}}. The source nodes can be represented as a two-dimensional array Data with α rows and k columns such that Data = [a_{i,j}] for 1 ≤ i ≤ α and 1 ≤ j ≤ k.

Definition 10 (as defined in [MSR2]):

There are r redundant nodes {p_1, p_2, ..., p_r} where each redundant node p_j, where 1 ≤ j ≤ r, comprises an indexed set of α substripes {p_{1,j}, p_{2,j}, ..., p_{α,j}}.

Definition 1 1 (as defined in [MSR2]):

There are r redundant two-dimensional index arrays P_1, ..., P_r. The index array P_1 has α rows and k columns, where each cell in P_1 is a pair of indexes whose values are determined by the algorithm disclosed in [MSR2].

The index arrays P_2, ..., P_r have α rows and k + m columns, where each cell in P_j, where 2 ≤ j ≤ r, is a pair of indexes whose values are determined by the algorithm disclosed in [MSR2].

Definition 12 (as defined in [MSR2]):

Each of the substripes p_{i,j} in the redundant nodes {p_1, p_2, ..., p_r}, where 1 ≤ i ≤ α and 1 ≤ j ≤ r, is computed as a linear combination of the substripes whose index pairs exist in the i-th row of the index array P_j, where each coefficient is a nonzero element in the finite field GF(q). The plurality of coefficients is such that the [MSR2] code is an MDS code.

Embodiments include using any of the techniques disclosed in [MSR2] to determine how to generate coded data.

The following is an algorithm according to an embodiment that generates a Regenerating - Locally Repairable Code (R-LRC) that advantageously has a low repair locality and a low recovery bandwidth.

Algorithm 6 Generating Regenerating - Locally Repairable Codes

Input:

1. A [n, k] HashTag MDS code with a sub-packetization level α;

2. Information locality (l, δ) for the R-LRC code.

Output:

A [n, K = kα, d_min, α, κ] vector code with information locality (l, δ). The following steps are performed by Algorithm 6:

1) Split the k systematic nodes into l disjunctive subsets, where every subset has the same number of nodes. While embodiments include this splitting being arbitrary, the present example of an embodiment takes the canonical splitting in which consecutive source nodes are grouped together.

2) Split each of the α linear equations for the first (δ - 1) parity expressions into l sub-summands, where the variables in each sub-summand correspond to the elements from one of the disjunctive subsets.

3) Associate the obtained (α x l x (δ - 1)) sub-summands to (l x (δ - 1)) new local parity nodes.

4) Define, and rename, the remaining (r - δ + 1 ) parity nodes that were not split in Steps 1 ) to 3) as new global parity nodes.

5) Obtain a new systematic generator matrix G' from the local and global parity nodes.

6) Return G' as a generator matrix of a (n, k) code with information locality (l, δ).

The implementation of Algorithm 6 as presented above in Step 1 generates a canonical splitting with splits S_1, ..., S_l that all have the same dimensions. However, embodiments also include splits that have different dimensions. Preferably, the nodes that store data that has to be recovered fast are grouped in groups with a small locality, i.e. splits with small dimensions.

The splitting in Step 1 of Algorithm 6 corresponds to a splitting of the index array P_1. Accordingly, S_1, ..., S_l comprise l corresponding sub-arrays of P_1. The combination of the sub-stripes of source data to generate a sub-stripe of redundant data can be performed by any combination technique over a finite field. However, the combination is preferably performed by XOR operations as these are computationally efficient. This corresponds to Step 2 in Algorithm 6, where the linear expressions are split and combined by the operation +, which corresponds to an XOR operation in finite fields with characteristic 2.
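Step 2 of Algorithm 6 can be sketched as follows; representing a parity expression as a list of (coefficient, node) terms, and the function name, are illustrative assumptions:

```python
def split_expression(terms, subsets):
    """terms: list of (coefficient, node_index) pairs in one linear
    parity expression; subsets: list of sets of node indices.
    Returns one sub-summand (a list of terms) per subset."""
    return [[t for t in terms if t[1] in s] for s in subsets]

# k = 6 source nodes split canonically into l = 2 disjunctive subsets.
subsets = [{1, 2, 3}, {4, 5, 6}]
expr = [(c, j) for c, j in zip("abcdef", range(1, 7))]  # one parity row
sub1, sub2 = split_expression(expr, subsets)
assert [j for _, j in sub1] == [1, 2, 3]  # first local sub-summand
assert [j for _, j in sub2] == [4, 5, 6]  # second local sub-summand
```

Applying this to all α rows of the first (δ - 1) parity expressions yields the (α x l x (δ - 1)) sub-summands of Step 3.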

The arrays that define the parity nodes correspond to a generator matrix that defines the construction of the code (as determined in Steps 5 and 6 in Algorithm 6). In accordance with the known use of a generator matrix, the product of the source data with the generator matrix generates the redundant data for the code. Constructing a generator matrix for a code from arrays that define parity nodes is a known technique.

Figure 3A shows the linear expressions for the parity parts of a (9, 6) code with a sub-packetization level, α, of 9. The code has been generated according to the techniques disclosed in [MSR2]. The coefficients are from the finite field F_32 with the irreducible polynomial x^5 + x^3 + 1. The repair bandwidth for any of the source nodes is 2.67.

In an example according to an embodiment, the code shown in Figure 3A is used to generate a code with information locality (l = 2, δ = 2).

Following step 1), the 6 source nodes {c_1, c_2, c_3, c_4, c_5, c_6} are split into the (l = 2) disjunctive subsets {c_1, c_2, c_3} and {c_4, c_5, c_6}.

Following step 2), the first global parity in Figure 3A, which is c_7, is split into two local parities as shown in Figure 3B.

The remaining global parity nodes, which are c_8 and c_9, remain as shown in Figure 3A and are renamed as the global parity nodes of the new code. The overall code can be referred to as an (n, k) = (10, 6) code, or, with terminology that gives the information locality, a (6, 2, 2) code.

In another example according to an embodiment, the code shown in Figure 3A is used to generate a code with information locality (l = 3, δ = 2). The resulting local parities are shown in Figure 3C. The overall code can be referred to as an (n, k) = (11, 6) code, or, with terminology that gives the information locality, a (6, 3, 2) code.

The above examples demonstrate that the information locality of codes according to embodiments can advantageously be flexibly set so as to be appropriate for the specific requirements of an application.

The codes according to [MSR2] are particularly preferable for generating the initial parity nodes from which the local parity nodes and global parity nodes are derived, due to the low repair bandwidth properties of the [MSR2] codes.

Presented below are algorithms according to embodiments for recovering a node when the node has been generated according to the coding technique of embodiments. The algorithms are presented for the recovery of a lost source node when the local parity nodes have all been generated from only one of the initial parity nodes, as in Algorithm 1 , i.e. for δ = 2. When δ = 2, Definition 1 for LRC can be used. However, embodiments include applying corresponding techniques for recovering a lost source node when the local parity nodes have been generated from more than one of the initial parity nodes, i.e. for δ > 2. When δ > 2, the more general Definition 7 for LRC with information locality (l, δ) should be used.

Algorithm 7 Recovery of one lost source node of an R-LRC

Algorithm 7 is a technique for recovering a lost source node according to an embodiment. The technique is preferable in applications for which it is advantageous for only a low number of source nodes to be read during a recovery operation.

Input: A code that has the parameters [k + g + l - 1, k] q with α substripes and is also a (k, l, g) Locally Repairable Code. The input code may have been generated by Algorithm 6.

Also received as an input is the index f, where 1 ≤ f ≤ k, of a failed source node c f that needs to be recovered.

Output: The data from the failed source node c f .

1. Determine the local parity node c F , where 1 ≤ F ≤ l, corresponding to the lost source node c f .

2. Repair c f from c F and the remaining source nodes whose indexes are determined by the split S F used in the definition of c F .
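A minimal sketch of step 2, assuming XOR-based local parities over single-byte substripes; the function and values are illustrative and are not the [MSR2] construction itself.

```python
# Illustrative sketch of step 2 with XOR-based local parities over
# single-byte substripes; not the [MSR2] construction itself.
from functools import reduce

def repair_local(local_parity, surviving_in_split):
    """Rebuild the lost split member by XOR-ing the local parity with
    the surviving members of the same split."""
    return reduce(lambda a, b: a ^ b, surviving_in_split, local_parity)

split = [0x11, 0x22, 0x33]                     # S_F = {c1, c2, c3}; c1 is lost
local_parity = split[0] ^ split[1] ^ split[2]  # c_F for this split

recovered = repair_local(local_parity, split[1:])
assert recovered == split[0]   # only |S_F| nodes were contacted
```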

Algorithm 8 Recovery of one lost source node of an R-LRC with low bandwidth

Algorithm 8 is another technique for recovering a lost source node according to an embodiment. The technique is preferable when it is desirable for the repair bandwidth to be low.

Input: A code that has the parameters [k + g + l - 1, k] q with α substripes and is also a (k, l, g) Locally Repairable Code. The input code may have been generated by Algorithm 6.

Also received as an input is the index f, where 1 ≤ f ≤ k, of a failed source node c f that needs to be recovered.

Output: The data from the failed source node c f .

2. Use the repair algorithm for the code with parameters [k + g + l - 1, k] q and α substripes to repair the lost source node c f . The repair algorithm may use the node recovery techniques as disclosed in [MSR2] by using all of the local parity nodes as a single global parity node.

Advantageously, Algorithms 7 and 8 provide two options for recovering a lost source node. By using Algorithm 7, the number of nodes that are accessed in the recovery operation is minimised. If Algorithm 8 is used, the repair bandwidth is minimised. The decoding technique can therefore be flexibly chosen to be the most appropriate one, depending on whether the number of accessed nodes or the repair bandwidth is the greater constraint at the time of the source node recovery.

Another example of the generation of an R-LRC according to an embodiment is presented below.

A code is desired with the LRC parameters (6, 2, 2). The techniques disclosed in [MSR2] are used to produce a code with parameters:

[6 + 2 + 2 - 1, 6] q = [9, 6] q with α = 9 substripes in each node.

The index arrays P 1 , P 2 and P 3 for the parity nodes, that are generated by the [MSR2] coding technique, are presented as:

The source nodes are represented as C 1 , C 2 , ..., C 6 . The data in each source node is split into 9 substripes of data as follows:

Accordingly, the source node C 1 is divided into the nine substripes X (1,1) , X (2,1) , ..., X (9,1) ; the source node C 2 into the nine substripes X (1,2) , X (2,2) , ..., X (9,2) ; and the other source nodes are similarly divided into nine substripes.
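The division of a node's data into substripes can be sketched as follows; the zero-padding policy and function name are assumptions for illustration only.

```python
# Illustrative only: divide one node's data into alpha = 9 equal substripes,
# giving X(1,j) ... X(9,j) for node C_j. Zero-padding is an assumption here.
def split_into_substripes(data: bytes, alpha: int = 9):
    stripe = -(-len(data) // alpha)              # ceiling division
    padded = data.ljust(stripe * alpha, b"\0")
    return [padded[i * stripe:(i + 1) * stripe] for i in range(alpha)]

substripes = split_into_substripes(b"example source node data", 9)
assert len(substripes) == 9
```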

The global parity nodes g 1 , g 2 are defined as the redundant nodes {p 2 , p 3 }, i.e. g 1 = p 2 and g 2 = p 3 , as generated by the coding technique as disclosed in [MSR2]. Accordingly, the global parity nodes are produced from the index arrays P 2 and P 3 and the corresponding coefficients in the finite field GF(q).

The local parities l 1 and l 2 are computed by splitting the index array P 1 into two arrays P 1,1 and P 1,2 :

Accordingly, each of the 9 substripes of the local parity l j is a linear combination of the source substripes whose indexes are given by the t-th row of the index array P 1,j , where each coefficient is the corresponding nonzero element in the finite field GF(q) determined by the [MSR2] code.

Presented below are examples of recovering a node of source data according to embodiments.

Source node recovery option 1 :

If there is a need to recover one lost source node and for the number of source nodes that are contacted to be low, then Algorithm 7 is used to recover the lost source node.

For example, if the source node C 1 is lost then C 1 can be recovered by contacting the local parity node l 1 and the nodes C 2 and C 3 . Each substripe in l 1 is a combination of the corresponding substripes in C 1 , C 2 and C 3 and so there is sufficient information to recover the corresponding lost substripe in C 1 .

During the recovery of the source node, the total number of contacted nodes is 3, which is lower than if all of the source nodes had been contacted. However, the total amount of transferred data, i.e. the bandwidth requirement, is 3 times the size of the lost node C 1 .

Source node recovery option 2:

If there is a need to recover one lost source node with the amount of transferred data being low, i.e. with low bandwidth, then Algorithm 8 is used to recover the lost source node. For example, if the source node C 1 is lost, then it can be recovered using the data recovery techniques disclosed in [MSR2]. The recovery of C 1 requires the following steps:

8. From the local parity node l 1 , read the 3 data fragments, i.e. substripes; and

9. From the source nodes C 2 , ..., C 6 , read the following 15 substripes:

Accordingly, the recovery of source node C 1 requires the transfer of 24 substripes of data. Each substripe is 1/9 of the size of the lost data node and so the total amount of data required for repairing the source node C 1 is 24/9 ≈ 2.67 times the size of the lost node.

This demonstrates an advantage of node recovery option 2 over node recovery option 1, namely that the repair bandwidth is reduced from 3 to 2.67. However, node recovery option 2 requires obtaining data from all of the source nodes.

Figure 4 is a flowchart of a process for generating codes for recovering a source node according to embodiments.

In step 401, the process begins.

In step 403, a first set of parity nodes for source nodes is generated by applying a systematic encoding technique to the source nodes, wherein the first set of parity nodes comprises a plurality of parity nodes and each of the parity nodes in the first set of parity nodes is a global parity node.

In step 405, a plurality of local parity nodes are generated in dependence on at least one of the parity nodes in the first set.

In step 407, a second set of parity nodes for the source nodes is generated, wherein the second set of parity nodes comprises said plurality of local parity nodes and all of the parity nodes in the first set except for said at least one of the parity nodes in the first set that the plurality of local parity nodes were generated in dependence on.

In step 409, the second set of parity nodes are used as the parity nodes of the source data.

In step 411, the process ends.
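The flow of steps 401 to 411 can be sketched as follows, using a plain XOR parity as a stand-in for the systematic encoder of step 403; a real embodiment would use an MDS or regenerating code and would generate further global parities, so all names and values here are illustrative.

```python
# Illustrative sketch of steps 401-411; a single XOR parity stands in for
# the systematic encoder, so all names and values are hypothetical.
from functools import reduce

def xor_all(values):
    """Finite-field addition of a list of values (XOR in characteristic 2)."""
    return reduce(lambda a, b: a ^ b, values)

def make_r_lrc_parities(source, num_splits=2):
    # Step 403: generate a first set of parity nodes, all global. Here a
    # single XOR parity stands in for the systematic encoder's output.
    first_set = [xor_all(source)]
    # Step 405: generate local parities in dependence on first_set[0],
    # one per disjunctive split of the source nodes.
    size = len(source) // num_splits
    local_parities = [xor_all(source[i * size:(i + 1) * size])
                      for i in range(num_splits)]
    # Step 407: the second set comprises the local parities plus every
    # first-set parity except the one the locals were derived from.
    second_set = local_parities + first_set[1:]
    # Step 409: use the second set as the parity nodes of the source data.
    return second_set

parities = make_r_lrc_parities([1, 2, 3, 4, 5, 6])
# The local parities still combine to the global parity they replaced.
assert xor_all(parities) == xor_all([1, 2, 3, 4, 5, 6])
```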

Embodiments include using other codes than [MSR1] codes and [MSR2] codes to generate a first set of parity nodes from which the local and global parity nodes are generated. Any systematic coding technique, such as Reed-Solomon coding, can be used to generate the first set of parity nodes from which the local and global parity nodes are generated. The coding technique that generates the first set of parity nodes is preferably MDS. In particular, the coding technique used to generate the codes in embodiments is preferably a regenerative coding technique for recovering source nodes and/or parity nodes with less recovery traffic than Reed-Solomon coding.

Embodiments provide an advantageous structure of code that is both locally repairable and globally repairable. Embodiments are not limited to the construction of codes according to the specific algorithms disclosed herein and embodiments extend to any systematic erasure coding technique that determines how to generate sub-stripes of parity nodes in a way that provides the same structure of codes according to embodiments. Embodiments therefore include any coding technique that has a generator matrix that is the same as the generator matrix of a code that has been generated according to any of the techniques according to the embodiments disclosed herein.

Embodiments also include using any method of decoding as appropriate for the way in which the code has been generated. The codes according to embodiments are particularly advantageous when implemented in the application of data storage where data is stored in nodes of a data storage system.

The actual generation of coded data in dependence on source data according to embodiments can be performed with known techniques and using known hardware. The processes required to use the techniques according to embodiments to generate a plurality of source nodes and redundant/parity nodes in a data storage system/network of a data centre storing the coded data would be a straightforward task for the skilled person. The skilled person would also be able to use known hardware to reconstruct one or more source nodes and/or parity nodes in order to implement embodiments.

The nodes according to embodiments include single data disks or drives, or groups of data disks or drives. A node includes any form of data storage element, a part of such an element or multiple such elements. In particular, a node can be any logical entity where data can be stored, and can be anything from a whole, a group of or parts of physical storage devices or locations including but not limited to memory based storage devices such as RAM and SSDs, hard drives, tape storage, optical storage devices, servers and data centers. The method according to embodiments may be performed within a single SSD disk. The method according to embodiments may be performed between chips inside an SSD, or between banks inside (flash) chips.

The storage of the data in a data storage system is not limited to the data storage system having nodes, i.e. data drives or sections of a data drive, that are only for use as a store of source data or redundant data. The systematic property may be maintained but a mapping introduced so that a data drive may store redundant data within a source data node and vice-versa. This interleaving of data changes the mapping of coded data to stored data and can be used to control the read operations from a data storage system, for example to ensure that the network traffic is balanced across the data storage system.

Although data storage is a particularly preferable application for the coding techniques disclosed herein embodiments include the generation of codes for any application, such as data transmission. For example, the nodes may correspond to data packets for transmission over a network. The systematic coding techniques according to embodiments include erasure resilient systematic coding techniques.

The codes according to embodiments can be constructed over any GF field of size GF(2 n ) so long as n is large enough for an MDS erasure code to be generated. A typical implementation would have a GF field size of GF(32) or GF(256).
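As a sketch of such finite-field arithmetic, the following illustrative Python implements multiplication in GF(32) = GF(2 5 ) with the irreducible polynomial x 5 + x 3 + 1 mentioned in connection with Figure 3A; the function name and bitmask encoding are assumptions.

```python
# Sketch of multiplication in GF(32) = GF(2^5) with the irreducible
# polynomial x^5 + x^3 + 1 (bitmask 0b101001); illustrative only.
def gf32_mul(a: int, b: int) -> int:
    # Carry-less (polynomial) multiplication of two 5-bit field elements.
    result = 0
    for i in range(5):
        if (b >> i) & 1:
            result ^= a << i
    # Reduce: for each set bit of degree 8 down to 5, substitute
    # x^5 = x^3 + 1 shifted to that degree.
    for degree in range(8, 4, -1):
        if (result >> degree) & 1:
            result ^= 0b101001 << (degree - 5)
    return result

# x * x^4 = x^5, which reduces to x^3 + 1:
assert gf32_mul(0b00010, 0b10000) == 0b01001
```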

The flow charts and descriptions thereof herein should not be understood to prescribe a fixed order of performing the method steps described therein. Rather, the method steps may be performed in any order that is practicable. Although the present invention has been described in connection with specific exemplary embodiments, it should be understood that various changes, substitutions, and alterations apparent to those skilled in the art can be made to the disclosed embodiments without departing from the spirit and scope of the invention as set forth in the appended claims.

Methods and processes described herein can be embodied as code (e.g., software code) and/or data. Such code and data can be stored on one or more computer-readable media, which may include any device or medium that can store code and/or data for use by a computer system. When a computer system reads and executes the code and/or data stored on a computer-readable medium, the computer system performs the methods and processes embodied as data structures and code stored within the computer-readable storage medium. In certain embodiments, one or more of the steps of the methods and processes described herein can be performed by a processor (e.g., a processor of a computer system or data storage system). It should be appreciated by those skilled in the art that computer-readable media include removable and non-removable structures/devices that can be used for storage of information, such as computer-readable instructions, data structures, program modules, and other data used by a computing system/environment. A computer-readable medium includes, but is not limited to, volatile memory such as random access memories (RAM, DRAM, SRAM); and non-volatile memory such as flash memory, various read-only-memories (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM), and magnetic and optical storage devices (hard drives, magnetic tape, CDs, DVDs); network devices; or other media now known or later developed that is capable of storing computer-readable information/data. Computer-readable media should not be construed or interpreted to include any propagating signals.

Embodiments also include the following numbered clauses:

1. A method for generating parity nodes of source nodes, the method comprising: applying a systematic encoding technique to source nodes in order to generate a plurality of global parity nodes of the source nodes; using one of the generated global parity nodes to generate a plurality of local parity nodes such that the local parity nodes combine to form said one of the generated global parity nodes; and determining the parity nodes of the source nodes to consist of said plurality of local parity nodes and all of the global parity nodes except for said one of the generated global parity nodes used to generate the plurality of local parity nodes.

2. The method according to clause 1, wherein said applied systematic encoding technique is an MDS coding technique, and preferably a Reed-Solomon coding technique.

3. The method according to clause 1 or 2, wherein each node comprises a plurality of sub-stripes of data.

4. The method according to any preceding clause, wherein the determined parity nodes of the source nodes are generated such that they satisfy the condition that their combination in Finite Field mathematics is zero.

5. The method according to clause 3, the method further comprising: determining, for each of said determined global parity nodes, to combine all but one of the sub-stripes in the node with one sub-stripe in another of said determined global parity nodes such that each of said determined global parity nodes is generated in dependence on data in all of the other of said determined global parity nodes.

6. A method of using a plurality of nodes, that have been encoded in dependence on the method of any preceding clause, to recover a source node, the method comprising: determining either to recover the source node in dependence on a global recovery operation or a local recovery operation; and, in dependence on the determination, either using the local parity nodes to recover the source node without using the global parity nodes to recover the source node; or using both the local parity nodes and the global parity nodes to recover the source node.

7. The method according to clause 6, wherein one or more of the source nodes apart from the source node being recovered are used for recovering said source node being recovered.

8. The method according to clause 6, wherein none of the other source nodes are used to recover said source node being recovered.

9. A method of using a plurality of nodes, that have been encoded in dependence on the method of any preceding clause, to recover one of the local parity nodes of said plurality of local parity nodes, the method comprising: using one or more of the other of said plurality of local parity nodes, apart from the local parity node being recovered, to recover the local parity node being recovered.

10. The method according to clause 9, wherein all of the other of said plurality of local parity nodes, apart from the local parity node being recovered, are used to recover the local parity node being recovered.

11. The method according to clause 9 or 10, wherein none of the global parity nodes are used to recover the local parity node.

12. The method according to any of clauses 9 to 11 , wherein none of the source nodes are used to recover the local parity node.

13. The method according to clause 9 or 10, wherein one or more of the global parity nodes and/or source nodes are used to recover the local parity node.

14. A method of using a plurality of nodes, that have been encoded in dependence on the method of any preceding clause, to recover a global parity node, the method comprising: using one or more of the other global parity nodes, apart from the global parity node being recovered, to recover the global parity node being recovered.

15. The method according to clause 14, wherein all of the other of said plurality of global parity nodes, apart from the global parity node being recovered, are used to recover the global parity node being recovered.

16. The method according to clause 14 or 15, wherein none of the local parity nodes are used to recover the global parity node.

17. The method according to any of clauses 14 to 16, wherein none of the source nodes are used to recover the global parity node.

18. The method according to clause 14 or 15, wherein one or more of the local parity nodes and/or source nodes are used to recover the global parity node.

19. A computing system configured to perform the method of any preceding clause.