


Title:
FILE INTEGRITY PRESERVATION
Document Type and Number:
WIPO Patent Application WO/2016/120328
Kind Code:
A1
Abstract:
In one embodiment of file integrity preservation in accordance with the present description, a file is subdivided into a plurality of subfiles, and a write update originally targeted for a portion of that file contained within one of the subfiles, is instead directed to a temporary copy subfile. As a consequence, the temporary copy subfile which is updated with the write data, may be scanned for viruses or other malware separately from the original file and its corresponding original subfile. If the temporary copy subfile passes the scanning test, the originally targeted file may be updated with the updated contents of the clean temporary copy subfile. Conversely, in the event that the write update introduced malicious software to the temporary copy subfile, the original file and its corresponding original subfile remain uncontaminated by the write update. Other aspects are also described.

Inventors:
MARTINEZ LISA (US)
CORONADO SARA MEGAN (US)
LARA CHRISTINA ANN (US)
CORONADO JUAN (US)
Application Number:
PCT/EP2016/051704
Publication Date:
August 04, 2016
Filing Date:
January 27, 2016
Assignee:
IBM (US)
IBM UK (GB)
International Classes:
G06F21/56; G06F21/57
Domestic Patent References:
WO2013014033A1 2013-01-31
Foreign References:
US6088803A 2000-07-11
US8220053B1 2012-07-10
US20140059687A1 2014-02-27
US8856927B1 2014-10-07
Other References:
None
Attorney, Agent or Firm:
PYECROFT, Justine (Intellectual Property Law, Hursley Park, Winchester, Hampshire SO21 2JN, GB)
Claims:
CLAIMS

1. A method for updating a file, the method comprising operations by a processor, the operations comprising:

receiving update data for updating a first subfile of a file;

creating a first temporary copy subfile corresponding to the first subfile of the file; updating the first temporary copy subfile with the update data instead of updating the first subfile with the update data;

scanning the updated first temporary copy subfile; and

if the updated first temporary copy subfile passes the scan, updating the first subfile with the scanned update of the first temporary copy subfile.

2. The method of claim 1 wherein the operation of creating a first temporary copy subfile comprises:

obtaining a temporary location for the first temporary copy subfile from a pool of available temporary locations, the operations further comprising releasing the temporary location for the first temporary copy subfile back to the pool of available temporary locations after updating the first subfile with the scanned contents of the first temporary copy subfile.

3. The method of claim 1 wherein the contents of the first subfile are at a first location and the contents of the first temporary copy subfile are at a first temporary location, and wherein the updating the first subfile with the contents of the first temporary copy subfile comprises copying the scanned update data from the first temporary location of the updated temporary copy subfile to the first location of the first subfile.

4. The method of claim 1 wherein the contents of the first subfile are at a first original location and the first subfile has a first location pointer identifying the first original location of the first subfile, wherein the contents of the first temporary copy subfile are at a first original temporary location and the first temporary copy subfile has a first temporary location pointer identifying the first original temporary location of the first temporary copy subfile and wherein the updating the first subfile with the scanned contents of the first temporary copy subfile comprises updating the first location pointer for the first subfile to identify the first original temporary location of the first temporary copy subfile as the location of the first subfile instead of identifying the first original location as the location of the first subfile, and wherein the operations further comprise updating the first temporary location pointer for the first temporary copy subfile to identify the temporary location of the first temporary copy subfile as the first original location instead of identifying the first original temporary location of the first temporary copy subfile.

5. The method of claim 1, wherein the operations further comprise:

if the updated first temporary copy subfile fails the scanning, repairing the updated first temporary copy subfile;

rescanning the updated first temporary copy subfile; and

if the updated first temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the first temporary copy subfile.

6. The method of claim 5 wherein a first host is the source of the update data for the first subfile, and the operations further comprise:

if the updated first temporary copy subfile fails the rescanning, quarantining the updated first temporary copy subfile and blocking write data access by the first host to the first subfile.

7. The method of claim 6 wherein the operations further comprise:

if the updated first temporary copy subfile is quarantined, requesting and receiving a resending of the update data for updating the first subfile of the file;

creating a second temporary copy subfile corresponding to the first subfile of the file; updating the second temporary copy subfile with the resent update data instead of updating the first subfile with the resent update data;

scanning the updated second temporary copy subfile; and

if the updated second temporary copy subfile passes the scan, updating the first subfile with the scanned contents of the second temporary copy subfile and removing the blocking of write data access to the first subfile by the first host.

8. The method of claim 7, wherein the operations further comprise:

if the updated second temporary copy subfile fails the scanning, repairing the updated second temporary copy subfile;

rescanning the updated second temporary copy subfile; and

if the updated second temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the second temporary copy subfile.

9. The method of claim 8 wherein the operations further comprise:

if the updated second temporary copy subfile fails the rescanning, quarantining the updated second temporary copy subfile.

10. The method of claim 6 wherein the operations further comprise:

if the updated first temporary copy subfile is quarantined, requesting and receiving a resending of the update data from a second host for updating the first subfile of the file;

creating a second temporary copy subfile corresponding to the first subfile of the file; updating the second temporary copy subfile with the resent update data instead of updating the first subfile with the resent update data;

scanning the updated second temporary copy subfile; and

if the updated second temporary copy subfile passes the scan, updating the first subfile with the scanned contents of the second temporary copy subfile and removing the blocking of write data access to the first subfile by the first host.

11. The method of claim 10, wherein the operations further comprise:

if the updated second temporary copy subfile fails the scanning, repairing the updated second temporary copy subfile;

rescanning the updated second temporary copy subfile; and

if the updated second temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the second temporary copy subfile.

12. The method of claim 11 wherein the operations further comprise:

if the updated second temporary copy subfile fails the rescanning, quarantining the updated second temporary copy subfile.

13. A system, comprising: at least one storage system including at least one storage unit adapted to store a file having a subfile of the file, and at least one storage controller adapted to access and control storage units of the at least one storage system; and at least one computer readable storage medium having computer readable program instructions embodied therewith, the program instructions executable by the storage system to cause the storage system to perform operations, the operations comprising:

receiving update data for updating a first subfile of a file;

creating a first temporary copy subfile corresponding to the first subfile of the file; updating the first temporary copy subfile with the update data instead of updating the first subfile with the update data;

scanning the updated first temporary copy subfile; and

if the updated first temporary copy subfile passes the scan, updating the first subfile with the scanned update of the first temporary copy subfile.

14. The system of claim 13 wherein at least one storage unit has a pool of available temporary locations, and wherein the operation of creating a first temporary copy subfile comprises obtaining a temporary location for the first temporary copy subfile from the pool of available temporary locations, the operations further comprising releasing the temporary location for the first temporary copy subfile back to the pool of available temporary locations after updating the first subfile with the scanned contents of the first temporary copy subfile.

15. The system of claim 13 wherein the contents of the first subfile are at a first location and the contents of the first temporary copy subfile are at a first temporary location, and wherein the updating the first subfile with the contents of the first temporary copy subfile comprises copying the scanned update data from the first temporary location of the updated temporary copy subfile to the first location of the first subfile.

16. The system of claim 13 wherein the contents of the first subfile are at a first original location and the first subfile has a first location pointer identifying the first original location of the first subfile, wherein the contents of the first temporary copy subfile are at a first original temporary location and the first temporary copy subfile has a first temporary location pointer identifying the first original temporary location of the first temporary copy subfile and wherein the updating the first subfile with the scanned contents of the first temporary copy subfile comprises updating the first location pointer for the first subfile to identify the first original temporary location of the first temporary copy subfile as the location of the first subfile instead of identifying the first original location as the location of the first subfile, and

wherein the operations further comprise updating the first temporary location pointer for the first temporary copy subfile to identify the temporary location of the first temporary copy subfile as the first original location instead of identifying the first original temporary location of the first temporary copy subfile.

17. The system of claim 13, wherein the operations further comprise:

if the updated first temporary copy subfile fails the scanning, repairing the updated first temporary copy subfile;

rescanning the updated first temporary copy subfile; and

if the updated first temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the first temporary copy subfile.

18. The system of claim 17 further comprising at least one of a first host and a second host in which the first host is the source of the update data for the first subfile, and wherein the operations further comprise:

if the updated first temporary copy subfile fails the rescanning, quarantining the updated first temporary copy subfile and blocking write data access by the first host to the first subfile; if the updated first temporary copy subfile is quarantined, requesting and receiving a resending of the update data from at least one of the first host and second host for updating the first subfile of the file;

creating a second temporary copy subfile corresponding to the first subfile of the file; updating the second temporary copy subfile with the resent update data instead of updating the first subfile with the resent update data;

scanning the updated second temporary copy subfile;

if the updated second temporary copy subfile passes the scan, updating the first subfile with the scanned contents of the second temporary copy subfile and removing the blocking of write data access to the first subfile by the first host;

if the updated second temporary copy subfile fails the scanning, repairing the updated second temporary copy subfile;

rescanning the updated second temporary copy subfile;

if the updated second temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the second temporary copy subfile; and

if the updated second temporary copy subfile fails the rescanning, quarantining the updated second temporary copy subfile.

19. A computer program product for use with at least one storage system including at least one storage unit adapted to store a file having a subfile of the file, and at least one storage controller adapted to access and control storage units of the at least one storage system, the computer program product comprising at least one computer readable storage medium having computer readable program instructions embodied therewith, the program instructions executable by the storage system to cause the storage system to perform operations, the operations comprising:

receiving update data for updating a first subfile of a file;

creating a first temporary copy subfile corresponding to the first subfile of the file; updating the first temporary copy subfile with the update data instead of updating the first subfile with the update data;

scanning the updated first temporary copy subfile; and

if the updated first temporary copy subfile passes the scan, updating the first subfile with the scanned update of the first temporary copy subfile.

20. The computer program product of claim 19 wherein the at least one storage unit has a pool of available temporary locations, and wherein the operation of creating the first temporary copy subfile comprises obtaining a temporary location for the first temporary copy subfile from the pool of available temporary locations, the operations further comprising releasing the temporary location for the first temporary copy subfile back to the pool of available temporary locations after updating the first subfile with the scanned contents of the first temporary copy subfile.

21. The computer program product of claim 19 wherein the contents of the first subfile are at a first location and the contents of the first temporary copy subfile are at a first temporary location, and wherein the updating the first subfile with the contents of the first temporary copy subfile comprises copying the scanned update data from the first temporary location of the updated temporary copy subfile to the first location of the first subfile.

22. The computer program product of claim 19 wherein the contents of the first subfile are at a first original location and the first subfile has a first location pointer identifying the first original location of the first subfile, wherein the contents of the first temporary copy subfile are at a first original temporary location and the first temporary copy subfile has a first temporary location pointer identifying the first original temporary location of the first temporary copy subfile and wherein the updating the first subfile with the scanned contents of the first temporary copy subfile comprises updating the first location pointer for the first subfile to identify the first original temporary location of the first temporary copy subfile as the location of the first subfile instead of identifying the first original location as the location of the first subfile, and wherein the operations further comprise updating the first temporary location pointer for the first temporary copy subfile to identify the temporary location of the first temporary copy subfile as the first original location instead of identifying the first original temporary location of the first temporary copy subfile.

23. The computer program product of claim 19 wherein the operations further comprise: if the updated first temporary copy subfile fails the scanning, repairing the updated first temporary copy subfile;

rescanning the updated first temporary copy subfile; and

if the updated first temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the first temporary copy subfile.

24. The computer program product of claim 23 further comprising at least one of a first host and a second host in which the first host is the source of the update data for the first subfile, and wherein the operations further comprise:

if the updated first temporary copy subfile fails the rescanning, quarantining the updated first temporary copy subfile and blocking write data access by the first host to the first subfile; if the updated first temporary copy subfile is quarantined, requesting and receiving a resending of the update data from at least one of the first host and second host for updating the first subfile of the file; creating a second temporary copy subfile corresponding to the first subfile of the file;

updating the second temporary copy subfile with the resent update data instead of updating the first subfile with the resent update data; scanning the updated second temporary copy subfile;

if the updated second temporary copy subfile passes the scan, updating the first subfile with the scanned contents of the second temporary copy subfile and removing the blocking of write data access to the first subfile by the first host;

if the updated second temporary copy subfile fails the scanning, repairing the updated second temporary copy subfile;

rescanning the updated second temporary copy subfile;

if the updated second temporary copy subfile passes the rescan, updating the first subfile with the rescanned contents of the second temporary copy subfile; and

if the updated second temporary copy subfile fails the rescanning, quarantining the updated second temporary copy subfile.

25. A computer program comprising program code means adapted to perform the method of any of claims 1 to 12 when said program is run on a computer.
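
Claims 3 and 4 recite two ways of updating the first subfile from the scanned temporary copy: copying the data to the original location, or exchanging location pointers so that the subfile points at the former temporary location and the temporary copy points at the former original location. The following is only an illustrative sketch of that difference, not the claimed implementation; `SubfileEntry`, the `storage` dictionary, and the location names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SubfileEntry:
    """Hypothetical metadata record: a pointer to where the contents live."""
    location: str


def commit_by_copy(storage: Dict[str, bytes],
                   subfile: SubfileEntry, temp: SubfileEntry) -> None:
    """Claim 3 style: copy scanned data from the temporary location
    to the original location. Both pointers stay unchanged."""
    storage[subfile.location] = storage[temp.location]


def commit_by_pointer_swap(subfile: SubfileEntry, temp: SubfileEntry) -> None:
    """Claim 4 style: no data is moved. The subfile now identifies the
    former temporary location, and the temporary copy now identifies
    the former original location."""
    subfile.location, temp.location = temp.location, subfile.location


# Usage: after the swap, reading through the subfile pointer yields the update.
storage = {"loc_a": b"old contents", "tmp_1": b"scanned update"}
sub, tmp = SubfileEntry("loc_a"), SubfileEntry("tmp_1")
commit_by_pointer_swap(sub, tmp)
current = storage[sub.location]
```

Under these assumptions the pointer exchange avoids moving subfile data, while the copy variant keeps every subfile at its original location.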

Description:
FILE INTEGRITY PRESERVATION

BACKGROUND OF THE INVENTION

FIELD OF THE INVENTION

[0001] The subject matter disclosed herein relates to data updates to files and to anti-virus file scanning.

DESCRIPTION OF THE RELATED ART

[0002] Files are often scanned for computer viruses and other malicious software frequently referred to as "malware." Such malicious software includes a variety of forms of hostile or intrusive software. Examples of malicious software include computer viruses, worms, trojan horses, and ransomware. Still other examples include spyware, adware, scareware, and other malicious programs. Malicious software can take the form of executable program code, scripts, active content, and other software. Malicious software is often disguised as, or embedded in, non-malicious files to facilitate the spread and to increase the difficulty in detecting the malicious software.

[0003] In some systems, upon a write operation to update a file, the write data is committed to the file to update the file, and an anti-virus scan is initiated on the updated file. Also in some systems, in order to facilitate the scanning process, a file to be scanned is subdivided into subfiles which are scanned separately by one or more scan servers. If the last write command introduces malicious software, the anti-virus scan can frequently detect it, and an attempt may be made to repair the infected file. If the repair of the infected file fails, the entire file is typically quarantined to prevent subsequent read operations to the infected file which can spread the malicious software. Hence, users are typically denied access to a quarantined file. However, a read operation directed to an infected file which has not been quarantined, may permit spread of the malicious software.

SUMMARY

According to a first aspect, there is provided a method for updating a file, the method comprising operations by a processor, the operations comprising: receiving update data for updating a first subfile of a file; creating a first temporary copy subfile corresponding to the first subfile of the file; updating the first temporary copy subfile with the update data instead of updating the first subfile with the update data; scanning the updated first temporary copy subfile; and if the updated first temporary copy subfile passes the scan, updating the first subfile with the scanned update of the first temporary copy subfile.

[0004] According to a second aspect, there is provided a system, comprising: at least one storage system including at least one storage unit adapted to store a file having a subfile of the file, and at least one storage controller adapted to access and control storage units of the at least one storage system; and at least one computer readable storage medium having computer readable program instructions embodied therewith, the program instructions executable by the storage system to cause the storage system to perform operations, the operations comprising: receiving update data for updating a first subfile of a file; creating a first temporary copy subfile corresponding to the first subfile of the file; updating the first temporary copy subfile with the update data instead of updating the first subfile with the update data; scanning the updated first temporary copy subfile; and if the updated first temporary copy subfile passes the scan, updating the first subfile with the scanned update of the first temporary copy subfile.

[0005] According to a third aspect, there is provided a computer program product for use with at least one storage system including at least one storage unit adapted to store a file having a subfile of the file, and at least one storage controller adapted to access and control storage units of the at least one storage system, the computer program product comprising at least one computer readable storage medium having computer readable program instructions embodied therewith, the program instructions executable by the storage system to cause the storage system to perform operations, the operations comprising: receiving update data for updating a first subfile of a file; creating a first temporary copy subfile corresponding to the first subfile of the file; updating the first temporary copy subfile with the update data instead of updating the first subfile with the update data; scanning the updated first temporary copy subfile; and if the updated first temporary copy subfile passes the scan, updating the first subfile with the scanned update of the first temporary copy subfile.

[0006] According to a preferred embodiment, there is provided a method for preserving file integrity in connection with a write operation to update a file, in which a temporary copy subfile corresponding to the originally targeted portion of the file is created. Instead of committing the write update data to the originally targeted portion of the file, the write data is directed to update the temporary copy subfile. The updated temporary copy subfile may be scanned for malicious software, and if the updated temporary copy subfile passes the scan, the originally targeted portion of the file may be updated with the scanned update data contained by the temporary copy subfile which was determined to be free of malicious software.
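
The sequence described for the preferred embodiment (create a temporary copy subfile, apply the write to it, scan it, and update the original only on a passing scan) may be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `SubdividedFile`, `apply_update`, and the stand-in scanner `toy_scan` with its made-up signature are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical scanner signature: returns True when the bytes look clean.
Scanner = Callable[[bytes], bool]


@dataclass
class SubdividedFile:
    """A file held as numbered subfiles (illustrative structure only)."""
    subfiles: Dict[int, bytes] = field(default_factory=dict)


def apply_update(file: SubdividedFile, index: int, offset: int,
                 data: bytes, scan: Scanner) -> bool:
    """Stage a write in a temporary copy, scan it, commit only if clean."""
    # Create the temporary copy subfile from the originally targeted subfile.
    temp = bytearray(file.subfiles[index])
    # Direct the write to the temporary copy instead of the original.
    temp[offset:offset + len(data)] = data
    staged = bytes(temp)
    # Scan the updated temporary copy separately from the original file.
    if not scan(staged):
        return False  # the original subfile is left untouched
    # Update the originally targeted subfile with the scanned contents.
    file.subfiles[index] = staged
    return True


def toy_scan(content: bytes) -> bool:
    """Stand-in scanner that flags a made-up signature."""
    return b"EVIL" not in content


# Usage: a clean write is committed, an infected write is not.
f = SubdividedFile({0: b"aaaaaaaa", 1: b"bbbbbbbb"})
clean_ok = apply_update(f, 1, 2, b"XY", toy_scan)
infected_ok = apply_update(f, 0, 0, b"EVIL", toy_scan)
```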

[0007] In one embodiment, as a consequence of updating the temporary copy subfile instead of the original file, the temporary copy subfile after it has been updated with the write data, may optionally be scanned for viruses or other malware separately from the original file or its original subfile. Accordingly, in one embodiment, read access to the original file including the corresponding original subfile, may optionally be permitted while the temporary copy subfile is updated and scanned.

[0008] Conversely, in the event that the write update introduced malicious software to the temporary copy subfile, the original file and its corresponding original subfile remain uncontaminated by the write update. Accordingly, in one embodiment, access to the original file and its corresponding original subfile may optionally continue since they remain uncontaminated and their integrity has been preserved.

[0009] Still further, should the contaminated temporary copy subfile, according to one embodiment, be quarantined, the original file and its corresponding original subfile may optionally remain free of quarantine since their integrity has been preserved. Accordingly, access to the original file and its corresponding original subfile may optionally continue since they remain uncontaminated and unquarantined.

[0010] In one embodiment, a location for the temporary copy subfile may optionally be obtained from a pool of available temporary subfile locations. In another embodiment, one or more attempts may optionally be made to repair the temporary copy subfile to eliminate the malicious software before the temporary copy subfile is quarantined. In another embodiment, the update data may optionally be resent one or more times to update one or more additional temporary copy subfiles instead of the original file.
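
The optional features above (a pool of temporary locations, a repair attempt before quarantine, and resending of the update data) may be combined as in the sketch below. It is an assumption-laden illustration only: `TempLocationPool`, `stage_with_retries`, the callback parameters, and the limit of two sends are hypothetical, and a quarantined location is simply kept out of the pool.

```python
from typing import Callable, List, Tuple


class TempLocationPool:
    """Fixed pool of temporary subfile locations (hypothetical identifiers)."""

    def __init__(self, locations: List[str]) -> None:
        self._free = list(locations)

    def acquire(self) -> str:
        if not self._free:
            raise RuntimeError("no temporary locations available")
        return self._free.pop()

    def release(self, location: str) -> None:
        self._free.append(location)

    def available(self) -> int:
        return len(self._free)


def stage_with_retries(pool: TempLocationPool,
                       fetch_update: Callable[[], bytes],
                       scan: Callable[[bytes], bool],
                       repair: Callable[[bytes], bytes],
                       commit: Callable[[bytes], None],
                       max_sends: int = 2) -> Tuple[bool, List[str]]:
    """Return (committed?, locations of quarantined temporary copies)."""
    quarantined: List[str] = []
    for _ in range(max_sends):
        location = pool.acquire()
        staged = fetch_update()           # first send, then any resend
        clean = scan(staged)
        if not clean:
            staged = repair(staged)       # one repair attempt, then rescan
            clean = scan(staged)
        if clean:
            commit(staged)                # update the original subfile
            pool.release(location)        # release only after the update
            return True, quarantined
        quarantined.append(location)      # quarantined copy keeps its location
    return False, quarantined


# Usage: the first send is infected and unrepairable, the resend is clean.
pool = TempLocationPool(["t0", "t1", "t2"])
sends = iter([b"bad-EVIL", b"good"])
committed: List[bytes] = []
ok, held = stage_with_retries(pool, lambda: next(sends),
                              lambda b: b"EVIL" not in b,
                              lambda b: b, committed.append)
```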

[0011] Other embodiments are directed to systems, apparatus and computer program products. Still other aspects are described.

DETAILED DESCRIPTION OF THE DRAWINGS

[0012] Preferred embodiments of the present invention will now be described, by way of example only, and with reference to the following drawings:

FIG. 1 is a schematic block diagram illustrating one embodiment of a data processing system employing file integrity preservation in accordance with the present description;

FIG. 2 is a schematic block diagram illustrating one embodiment of a file subdivided for file integrity preservation in accordance with the present description;

FIG. 3 is a schematic block diagram illustrating one embodiment of updating a temporary copy subfile for file integrity preservation in accordance with the present description;

FIG. 4 is a schematic block diagram illustrating one embodiment of creating a temporary copy subfile for file integrity preservation in accordance with the present description;

FIG. 5 is a schematic block diagram illustrating one embodiment of updating the originally targeted subfile for file integrity preservation in accordance with the present description;

FIG. 6 is a schematic block diagram illustrating another aspect of updating an originally targeted subfile for file integrity preservation in accordance with the present description;

FIG. 7 is a schematic block diagram illustrating one embodiment of quarantining of a temporary copy subfile for file integrity preservation in accordance with the present description;

FIG. 8 is a schematic block diagram illustrating one embodiment of creating a second temporary copy subfile for file integrity preservation in accordance with the present description;

FIG. 9 is a schematic block diagram illustrating one embodiment of an antivirus control file which may be used in connection with file integrity preservation in accordance with the present description;

FIG. 10 is a schematic block diagram illustrating one embodiment of a computer which may be used for file integrity preservation in accordance with the present description;

FIG. 11 is a schematic block diagram illustrating one embodiment of a file integrity preservation apparatus in accordance with the present description;

FIG. 12 depicts one embodiment of operations for file integrity preservation in accordance with the present description;

FIG. 13 depicts another embodiment of operations for file integrity preservation in accordance with the present description; and

FIG. 14 depicts still another embodiment of operations for file integrity preservation in accordance with the present description.

DETAILED DESCRIPTION

[0013] In one embodiment of file integrity preservation in accordance with the present description, a file is subdivided into a plurality of subfiles, and a write update originally targeted for a portion of that file contained within one of the subfiles, is instead directed to a temporary copy subfile. In this example, the temporary copy subfile contains a copy of the originally targeted subfile of the file and thus corresponds to the originally targeted subfile of the original file. As a consequence of updating the temporary copy subfile instead of the original file, the temporary copy subfile after it has been updated with the write data, may optionally be scanned for viruses or other malware separately from the original file or its original subfile. Accordingly, in one embodiment, read access to the original file including the corresponding original subfile, may optionally be permitted while the temporary copy subfile is updated and scanned.

[0014] In another embodiment of the present invention, if the temporary copy subfile passes the scanning test, the originally targeted file may be updated with the updated contents of the clean temporary copy subfile. Conversely, in the event that the write update introduced malicious software to the temporary copy subfile, the original file and its corresponding original subfile remain uncontaminated by the write update. Accordingly, access to the original file and its corresponding original subfile may optionally continue since they remain uncontaminated and their integrity has been preserved.

[0015] Thus, in those instances in which a read command is executed before a write update targeted for the same file has been completed and scanned for malicious software, spread of the malicious software may be avoided since the read operation may optionally be directed to the original file or subfile while the write update which may be carrying malicious software is directed to the temporary copy subfile. Accordingly, should the write update data be infected with malicious software, the read operation does not come into contact with the infected update data.
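
The read isolation described in this paragraph may be illustrated with the following sketch, in which reads are always served from the committed subfile while an unscanned write sits in a temporary copy. The class `IsolatedSubfileStore` and its method names are hypothetical, and dropping a staged copy that fails the scan is a simplification of this sketch only.

```python
from typing import Callable, Dict


class IsolatedSubfileStore:
    """Reads see committed subfiles; writes wait in temporary copies."""

    def __init__(self, subfiles: Dict[int, bytes]) -> None:
        self._committed = dict(subfiles)
        self._pending: Dict[int, bytes] = {}

    def write(self, index: int, new_content: bytes) -> None:
        # The update lands in a temporary copy, not in the committed subfile.
        self._pending[index] = new_content

    def read(self, index: int) -> bytes:
        # Reads never touch unscanned update data.
        return self._committed[index]

    def finish_scan(self, index: int, scan: Callable[[bytes], bool]) -> bool:
        staged = self._pending.pop(index)
        if scan(staged):
            self._committed[index] = staged
            return True
        # Simplification: the failed copy is dropped here, whereas the
        # description contemplates repairing or quarantining it.
        return False


# Usage: a read issued before the scan completes returns the original data.
store = IsolatedSubfileStore({0: b"original"})
store.write(0, b"EVIL payload")
seen_during_scan = store.read(0)
passed = store.finish_scan(0, lambda b: b"EVIL" not in b)
seen_after_scan = store.read(0)
```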

[0016] Still further, should the contaminated temporary copy subfile be quarantined, the original file and its corresponding original subfile may remain free of quarantine since their integrity has been preserved. Accordingly, access to the original file and its corresponding original subfile may optionally continue since they remain uncontaminated and unquarantined.

[0017] As used herein, the terms "scan", "anti-virus (AV) scan" and "anti-virus (AV) program" refer to scans and programs for detecting any malicious software including but not limited to computer viruses. The term "repair" refers to processing an infected file detected to be infected with malicious software, to eliminate or render harmless the malicious software. The term "quarantining" refers to restricting or completely blocking access to an infected file which has been quarantined to eliminate or inhibit the spread of the malicious software from the infected file.

[0018] FIG. 1 is a schematic block diagram illustrating one embodiment of a data processing system 100 which provides for file integrity preservation in accordance with one embodiment of the present invention. The system 100 includes a plurality of servers 110 as represented by the servers 110a-110d that may scan files or provide a host function, or both. In addition, the system 100 includes a network 120 and a storage system. The network 120 may be the Internet, a router, a wide area network, a local area network, or the like. The storage system includes a first bus 125, a second bus 150, and one or more storage servers 130 as represented by the servers 130a, 130b, which provide a data storage function in connection with one or more storage subsystems 140 as represented by the storage subsystems 140a, 140b, 140c. In one embodiment, one or more servers 110 as represented by the servers 110e, 110f are included in the storage subsystem.

[0019] One or more servers as represented by the servers 110a, 110b, for example, may provide a host function to store data to and retrieve data from the storage system 180. In some storage systems, an anti-virus (AV) program runs external to the servers performing the storage function. Thus, the anti-virus software can be run on one or more dedicated servers such as the servers 110c and 110d, for example, which are external to the storage system 180, or servers 110e, 110f, for example, which are internal to the storage system 180, to validate that the data contained within a storage unit of the storage system 180 is virus free. To speed the scanning of files and to provide for continued use of files, particularly large files while they are being scanned, it is known to subdivide a file into subfiles and to distribute the scanning of the subfiles to different servers so that the various subfiles of a particular file may be scanned by different servers operating in parallel or at different times. In addition, subfiles of a file may be accessed while other subfiles of the file are being scanned.

[0020] Previously, a storage system typically provided real time scan "on write" operations. For example, in connection with a write operation, the write data provided by a host server 110a, 110b was previously committed directly to the targeted file, and an AV Scan was initiated on the updated targeted file in which typically the entire file was scanned after the write operation. If the last write command introduced malicious software, and the AV Scan detected it, a repair of the infected file was attempted. If the repair of the infected file failed, the infected file was typically quarantined, blocking access to the quarantined file. In some prior systems, an entire file, which may be a terabyte in size or larger, may be quarantined notwithstanding that only a relatively small portion of the file is actually infected.

[0021] As previously mentioned, in accordance with one embodiment of the present invention, file integrity may be preserved by subdividing a file into a plurality of subfiles, and directing a write update intended for a portion of that file to a temporary copy subfile instead. The temporary copy subfile is a copy of the original portion of the file which was the target of the write operation. As a result, malicious software if contained within the write update data would contaminate the temporary copy subfile rather than the original targeted file or its subfiles. In this manner, quarantining of either the original file or its original subfile may be avoided.

[0022] In addition, it is recognized herein that previously an anti-virus scan may have been insufficient to protect a file in the event that a "read" operation and a "write" operation occurred at the same time. For example, in many prior systems, an AV Scan had typically been initiated only on an "open for read" operation or a "close after a write" operation. Accordingly, an AV Scan was frequently not initiated on every read operation. As a result, if a process opened a file for a read operation while another process was writing to the same file and the write operation introduced malicious software, the read process in a prior storage system might have read that introduced virus before the AV scan and any subsequent repair or quarantine were completed.

[0023] As previously mentioned, in accordance with one embodiment of the present invention, file integrity may be preserved by subdividing a file into a plurality of subfiles, and directing a write update originally targeting a portion of that file to a temporary copy subfile instead of to the original file itself or its subfile. Accordingly, if a process opens a file for a read operation while another process is writing infected data which had been originally targeted for the same file, as a result of file integrity preservation in accordance with one embodiment of the present invention, the read process would not encounter that malicious software in the original file or its subfiles since any malicious software would be introduced to the temporary copy subfile rather than to the original file which is being read.

[0024] Each storage subsystem 140 of FIG. 1 may include one or more controllers 160 that control one or more storage devices 170. The storage devices 170 may be hard disk drives, optical storage devices, micromechanical storage devices, semiconductor storage devices, and the like. Storage servers 130 may manage and control the storage system 180. The storage servers 130 may communicate with the network 120 and the storage subsystems 140 through the first bus 125 and second bus 150 respectively.

[0025] The storage devices 170 may store files, directory information, metadata, and the like, referred to hereafter as files. The servers 110e, 110f may scan the files for the purpose of detecting and mitigating any malware that may be stored in a file. The servers 110 may be external to the storage system 180 and/or internal to the storage system 180 as described above.

[0026] Files in the storage system 180 can grow to various sizes; very small to very large file sizes can exist. Scanning such large files with a single server 110 in some systems may require an inordinate amount of time. In addition, a large file that is being scanned may be inaccessible during the long scan time. Having a file inaccessible for such a long period of time is burdensome for important files. As previously mentioned, to speed the scanning of files and to provide for continued use of files, particularly large files, it has been known to subdivide a file into subfiles and to distribute the scanning of the subfiles to different servers so that the various subfiles of a particular file may be scanned by different servers operating in parallel or at different times.

[0027] FIG. 2 is a schematic block diagram illustrating one embodiment of a file 200, the integrity of which may be preserved in accordance with one embodiment of the present invention. The file 200 may be stored in the storage system 180 of FIG. 1, for example. As previously mentioned, a file such as the file 200 may be quite large. For example, in one embodiment, the file 200 may have a size in excess of 1 Terabyte (TB). Here, the file 200 is divided into a plurality of subfiles 205 as represented by the subfiles 205a, 205b, 205c, 205d ... . In one embodiment, each subfile 205 is no larger than a specified size. The specified size may vary in range, such as 1 Megabyte (MB) to 1 Gigabyte (GB), for example. In another example, the file 200 may be divided so that each subfile 205 is no larger than a specified size of 10 GB. It is appreciated that the size of a subfile may vary, depending upon the particular application. This subdivision process may be initiated multiple times until the entire file scan is completed by the prior subdivided file scanning procedure.
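The subdivision described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that subfiles are contiguous byte ranges and that every subfile except possibly the last has exactly the specified maximum size; the function name `subdivide` and the range representation are hypothetical and do not appear in the description.

```python
from typing import List, Tuple

GB = 1024 ** 3
TB = 1024 ** 4


def subdivide(file_size: int, max_subfile_size: int) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges, end exclusive, covering the file.

    Each range is no larger than max_subfile_size (hypothetical layout:
    contiguous ranges, only the last one may be shorter).
    """
    if max_subfile_size <= 0:
        raise ValueError("max_subfile_size must be positive")
    return [
        (start, min(start + max_subfile_size, file_size))
        for start in range(0, file_size, max_subfile_size)
    ]


# A 1 TB file divided into subfiles of at most 10 GB each.
ranges = subdivide(TB, 10 * GB)
```

Under these assumptions, a 1 TB file with a 10 GB limit yields 103 subfiles, the last of which covers the remaining 4 GB.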

[0028] In accordance with one embodiment of the present invention, division of files into subfiles may be utilized for preservation of file integrity by redirecting a write update targeting, that is intended for, a portion of a particular file, to a temporary copy subfile containing a copy of the targeted portion of the original file. FIG. 3 shows an example of such file integrity preservation in connection with a write update operation for write update data sent by a host server such as the host server 110a. The write update data from the host server 110a targets data contained within a subfile 205b of the file 200. Instead of immediately committing the write update data to the targeted original subfile 205b of the file 200, a first temporary copy subfile 205b1 corresponding to the first original subfile 205b is created and the contents of the original subfile 205b are copied over to the temporary copy subfile 205b1.

[0029] In one embodiment, a temporary copy subfile such as the temporary copy subfile 205b1 (FIG. 3) may be created by obtaining a temporary memory location for the particular temporary copy subfile from a pool 210 (FIG. 4) of available temporary locations 210a, 210b, 210c ... . In this example, the pool of available temporary locations 210a, 210b, 210c are provided by disk drive storage locations. However, it is appreciated that in other embodiments, the pool of available temporary locations 210a, 210b, 210c may be provided by volatile or nonvolatile memory or by storage locations provided by other types of storage devices, depending upon the particular application.

[0030] In one embodiment, file integrity preservation in accordance with the present description may be invoked with a command line interface (CLI) command having a suitable name such as "Preserve File Integrity on Write" for example. Upon invoking this command, to enable the file integrity preservation process for a particular file such as the file 200, the file integrity preservation process creates storage space as represented by pool 210 (FIG. 4) of available temporary locations 210a, 210b, 210c ... using a storage controller 160 and the storage devices 170 to contain temporary copy subfiles for the file 200. In one embodiment, the size of the pool 210 may be dependent upon the size of the file 200 for which the file integrity preservation command was invoked, and the frequency of write updates to the file 200. It is appreciated that the size of the pool 210 may vary, depending upon the particular application.
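As one way to picture the pool 210 of temporary locations, the following Python sketch models the pool as a simple free list. The class name `TempLocationPool`, its methods, and the string identifiers are assumptions made for illustration; the description does not specify how the pool is implemented or in what order locations are handed out.

```python
class TempLocationPool:
    """Hypothetical free-list model of the pool 210 of temporary locations."""

    def __init__(self, location_ids):
        self._free = list(location_ids)   # locations available for temporary copies
        self._in_use = set()              # locations currently holding a temporary copy

    def acquire(self):
        """Hand out an available location for a new temporary copy subfile."""
        if not self._free:
            raise RuntimeError("no temporary location available")
        location = self._free.pop(0)
        self._in_use.add(location)
        return location

    def release(self, location):
        """Return a location to the pool once its temporary copy is no longer needed."""
        self._in_use.remove(location)
        self._free.append(location)

    def free_count(self):
        return len(self._free)


pool = TempLocationPool(["210a", "210b", "210c"])
location = pool.acquire()
```

In this sketch the pool size is fixed at construction; an implementation following paragraph [0030] might instead size the pool from the size of the file 200 and the frequency of write updates.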

[0031] In this example, the temporary copy subfile 205b1 is created using an available temporary copy subfile location 210b of the pool 210 of available temporary locations. Accordingly a data structure for the temporary copy subfile 205b1 has a file location pointer (as represented by an arrow 212a) pointing to the temporary copy subfile location 210b of the pool 210 of available temporary locations, as the location of the temporary copy subfile 205b1. The contents of the targeted original subfile 205b are copied over to the location of the temporary copy subfile 205b1 so that the temporary copy subfile 205b1 corresponds to the targeted original subfile 205b.

[0032] When a host sends a "write command" to update a file, and the CLI command "Preserve File Integrity on Write" is enabled on the file, the write data associated with the "write command" is committed to the temporary copy subfile in the storage location instead of the original file. Thus, in this example, once the temporary copy subfile 205b1 corresponding to the targeted original subfile 205b is available, the write update data received for the write operation and intended for the original subfile 205b, is committed to update the temporary copy subfile 205b1 as indicated by the Write Data Update process arrow of FIG. 3, instead of being committed to update the original subfile 205b. As a consequence, the temporary copy subfile 205b1 which is updated with the write data, may be scanned for malicious software as indicated by the Anti-Virus Scan process arrow of FIG. 3, separately from the original file 200 and its corresponding original subfile 205b. Accordingly, in one embodiment, access to the original file 200 including the corresponding original subfile 205b, may be permitted while the temporary copy subfile 205b1 is updated and scanned.

[0033] If the temporary copy subfile 205b1 passes the scanning test, the original file 200 may be updated with the scanned and updated contents of the clean temporary copy subfile 205b1 as indicated by the Scanned Write Data Update If not Infected process arrow of FIG. 5. In one embodiment, the original file 200 may be updated by copying the scanned, updated contents from the temporary location 210b (FIG. 4) of the temporary copy subfile 205b1 to the location of the targeted original subfile 205b. Upon successful updating of the original file 200 with the scanned, updated contents of the temporary copy subfile 205b1, the temporary memory or storage space utilized by the temporary copy subfile 205b1 may be released for use by other processes. Thus, the temporary copy subfile location 210b may be released and returned to the pool 210 of temporary copy subfile locations.
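The redirect-then-commit behavior of the preceding paragraphs can be sketched in Python as follows. This is a simplified in-memory model under stated assumptions: subfiles are held as byte strings, the scanner is a caller-supplied function returning True for clean data, and the names `ProtectedFile`, `write`, `read`, `commit`, and `toy_scan` are all hypothetical. The copy-based commit corresponds to the copying technique of paragraph [0033].

```python
class ProtectedFile:
    """Hypothetical model: writes go to a temporary copy, reads see the original."""

    def __init__(self, subfiles):
        self.subfiles = dict(subfiles)   # subfile id -> bytes (original contents)
        self.temp_copies = {}            # subfile id -> bytearray (pending, unscanned)

    def write(self, subfile_id, offset, data):
        # Create the temporary copy from the original on first write.
        if subfile_id not in self.temp_copies:
            self.temp_copies[subfile_id] = bytearray(self.subfiles[subfile_id])
        # Commit the write data to the temporary copy, not to the original.
        self.temp_copies[subfile_id][offset:offset + len(data)] = data

    def read(self, subfile_id):
        # Reads are served from the original, which the pending write has not touched.
        return self.subfiles[subfile_id]

    def commit(self, subfile_id, scan):
        # Update the original only if the temporary copy passes the scan.
        candidate = bytes(self.temp_copies[subfile_id])
        if not scan(candidate):
            return False                 # temporary copy stays pending (repair or quarantine)
        self.subfiles[subfile_id] = candidate
        del self.temp_copies[subfile_id]  # temporary space can be released
        return True


def toy_scan(data):
    """Stand-in scanner for illustration only: flags a fixed marker string."""
    return b"EVIL" not in data


f = ProtectedFile({"205b": b"hello world"})
f.write("205b", 0, b"HELLO")
before_commit = f.read("205b")
committed = f.commit("205b", toy_scan)
```

Until `commit` succeeds, a concurrent reader in this model continues to receive the original contents of the subfile.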

[0034] Another example of a technique for updating the original file 200 with the scanned, updated contents of the temporary copy subfile 205b1 is referred to herein as a switch subfile pointer process and is described in connection with FIG. 6 below. It is appreciated that the original file 200 may be updated with clean update data from the temporary copy subfile 205b1 using other techniques, depending upon the particular application.

[0035] As previously mentioned in connection with FIG. 4, the temporary copy subfile 205b1 has a file location pointer (as represented by an arrow 212a) pointing to the temporary copy subfile location 210b of the pool 210 of available temporary locations, as the location of the temporary copy subfile 205b1. Similarly, the targeted original subfile 205b has a file location pointer (as represented by an arrow 212b) pointing to the original subfile location within the file 200, as the location of the targeted original subfile 205b. Instead of copying the data of the updated and scanned temporary subfile 205b1 from the temporary copy subfile location 210b of the pool to the targeted original subfile 205b at its original location within the file 200, the file pointers of the temporary copy subfile 205b1 and the targeted original subfile 205b may be updated by switching them as depicted in FIG. 6.

[0036] Once switched, the targeted original subfile 205b has a file location pointer (as represented by an arrow 212c) pointing to the temporary copy subfile location 210b of the pool 210 of available temporary locations, as the location of the targeted original subfile 205b, and the temporary copy subfile 205b1 has a file location pointer (as represented by an arrow 212d) pointing to the original subfile location within the file 200 as the location of the temporary copy subfile 205b1. It is the temporary copy subfile 205b1 which contains the updated data. The switch is only made once the temporary copy subfile 205b1 has been scanned and confirmed as free from malicious software as described above. In this manner, the targeted original subfile 205b of the file 200 may be updated with the scanned write update data without actually copying it from the temporary copy subfile 205b1 to the original subfile 205b.

[0037] In this manner, updating the original targeted subfile 205b with the updated and scanned contents of the temporary copy subfile 205b1 includes updating a file pointer for the original targeted subfile 205b to identify the temporary copy subfile location 210b of the temporary copy subfile 205b1 as the location of the original targeted subfile 205b instead of identifying the original location within the file 200 as the location of the original targeted subfile 205b. Furthermore, the updating includes updating the pool pointer for the temporary copy subfile 205b1 to identify the location of the temporary copy subfile 205b1 as the original location of the targeted subfile 205b within the file 200 instead of identifying the original temporary copy subfile location 210b of the temporary copy subfile 205b1. In some embodiments, updating file pointers in accordance with the process described herein may be achieved more quickly and efficiently as compared to copying the scanned updated data from the temporary copy subfile to the original subfile.
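The switch subfile pointer process can be illustrated with a short Python sketch. The descriptor type, its field names, and the location strings are assumptions for illustration; the description only states that each subfile has a file location pointer and that the two pointers are exchanged once the temporary copy has been scanned and found clean.

```python
from dataclasses import dataclass


@dataclass
class SubfileDescriptor:
    """Hypothetical data structure holding a subfile's location pointer."""
    name: str
    location: str


def switch_pointers(original, temp, scan_passed):
    """Exchange location pointers so the original subfile refers to the clean, updated data."""
    if not scan_passed:
        # The switch is only made once the temporary copy is confirmed clean.
        raise ValueError("temporary copy has not passed the scan")
    original.location, temp.location = temp.location, original.location


original = SubfileDescriptor("205b", "file200/slot-b")   # pointer 212b before the switch
temp = SubfileDescriptor("205b1", "pool210/210b")         # pointer 212a before the switch
switch_pointers(original, temp, scan_passed=True)
```

After the call, the descriptor for subfile 205b points at location 210b, which holds the scanned update, and no subfile data has been copied.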

[0038] Conversely, in the event that the AV scan reveals that the write update introduced malicious software to the temporary copy subfile 205b1, the original file 200 and its corresponding targeted original subfile 205b remain uncontaminated by the write update. Accordingly, access to the original file 200 and its corresponding original subfile 205b may continue since they remain uncontaminated and their integrity has been preserved.

[0039] An attempt may be made to repair the infected temporary copy subfile 205b1 as indicated by the Attempt Repair if Infected process arrow of FIG. 3. The temporary copy subfile 205b1 which had been updated with the write data, may be rescanned for malicious software following the repair attempt as represented by the Anti-Virus Scan process arrow of FIG. 3, again separately from the original file 200 and its corresponding original subfile 205b.

[0040] If the repaired temporary copy subfile 205b1 passes the rescanning test, the original file 200 may be updated with the rescanned and updated contents of the clean temporary copy subfile 205b1 as indicated by the Scanned Write Data Update If not Infected process arrow of FIG. 5 using update techniques such as those described above. Conversely, in the event that the AV rescan reveals that the repair of the temporary copy subfile 205b1 failed such that the temporary copy subfile 205b1 remains contaminated from the write update, the temporary copy subfile 205b1 may be quarantined as represented in FIG. 7. As a result, the storage space 210b (FIG. 4) occupied by the quarantined temporary subfile 205b1 is marked unavailable for use. It is appreciated that the number of repair attempts and failed rescans before the temporary copy subfile is quarantined may vary, depending upon the particular application.

[0041] Still further, should the contaminated temporary copy subfile 205b1 be quarantined, the original file 200 and its corresponding original subfile 205b may remain free of quarantine as shown in FIG. 7 since their integrity has been preserved because the subfile 205b1 in the temporary location is in a quarantined state and has not been committed to a location within the file 200. Accordingly, access to the original file 200 and its corresponding original subfile 205b continues since they remain uncontaminated and unquarantined.

[0042] However, upon quarantining the temporary copy subfile 205b1, the file 200 will not contain the latest updates represented by the quarantined write update data. In one embodiment, the original host server 110a which provided the original write update data may be requested to resend the write update data. In another embodiment of the present invention, in the event that the first temporary copy subfile 205b1 is quarantined, a second temporary copy subfile as represented by the temporary copy subfile 205b2 (FIG. 8) may be created. In one embodiment, a second temporary copy subfile such as the temporary copy subfile 205b2 (FIG. 8) may be created by obtaining a temporary storage location in a manner similar to that described above in connection with temporary copy subfile 205b1. In this example, the temporary copy subfile 205b2 is created using an available temporary copy subfile location 210d of the pool 210 of available temporary locations. Accordingly a data structure for the temporary copy subfile 205b2 has a file location pointer (as represented by an arrow 212e) pointing to the temporary copy subfile location 210d of the pool 210 of available temporary locations, as the location of the temporary copy subfile 205b2. The contents of the targeted original subfile 205b are copied over to the location of the temporary copy subfile 205b2 so that the temporary copy subfile 205b2 corresponds to the targeted original subfile 205b in the same manner as the first temporary copy subfile 205b1.

[0043] Once the temporary copy subfile 205b2 corresponding to the targeted original subfile 205b is available, the write update data resent by the original host server 110a for the write operation intended for the original subfile 205b, is committed to update the temporary copy subfile 205b2 in the same manner as described above in connection with temporary copy subfile 205b1. Accordingly, the temporary copy subfile 205b2 which is updated with the write data, may be scanned for malicious software in the same manner as described above in connection with temporary copy subfile 205b1.

[0044] If the temporary copy subfile 205b2 passes the scanning test, the original file 200 may be updated with the scanned and updated contents of the clean temporary copy subfile 205b2 in the manner described above in connection with temporary copy subfile 205b1. Upon successful updating of the original file 200 with the scanned, updated contents of the temporary copy subfile 205b2, the temporary memory or storage space utilized by the temporary copy subfile 205b2 may be released and returned to the pool 210 of temporary copy subfile locations.

[0045] Conversely, in the event that the AV scan reveals that the resent write update again introduced malicious software, this time to the temporary copy subfile 205b2, an attempt may be made to repair and rescan the infected temporary copy subfile 205b2 one or more times as described above in connection with temporary copy subfile 205b1. If the write update data that was resent is once again quarantined, then the file 200 may be marked with a suitable indication such as "not up to date," for example, to indicate that the particular file area (in this example, subfile 205b) was not updated. In this example, the user may also be informed that the particular file area (in this example, subfile 205b) was not updated, and that the temporary copy subfiles 205b1 and 205b2 have been quarantined. In addition, the original host server (host server 110a in this example) may be requested to not resend the particular write update data which was found to contain malicious software and which could not be repaired as discussed above. Further in one embodiment, any subsequent write updates from the same host (host server 110a in this example) to the same subfile (in this example, subfile 205b) may be rejected.

[0046] In another embodiment of the present invention, once the user has been informed that subfile 205b has not been updated and has been informed of the quarantining of the temporary copy subfiles 205b1, 205b2, the user may select to delete the quarantined subfile data of the quarantined subfiles 205b1, 205b2. If so, the infected storage locations 210b, 210d are cleared, and another AV Scan is performed on those areas. If a storage location which previously contained a quarantined temporary copy subfile is found to be free of malicious software, the storage location may be returned to the pool 210 of temporary storage locations. In addition, in one embodiment, the subsequent write updates from the same host (host server 110a in this example) to the subfile 205b for which temporary copy subfiles 205b1, 205b2 were previously quarantined, may be accepted.

[0047] It is seen from the above that a host that sends an "open read" command for the file receives the file data that is virus free. As previously stated, the host will be notified that the file contains an area (subfile in this example) that did not get updated due to a virus detection. As in current art, since the file was not updated, an "open read" command from a host will not cause an AV scan if the AV scan engines have not been updated with new AV software.

[0048] As previously mentioned, upon quarantining the first temporary copy subfile 205b1, the file 200 will not contain the latest updates represented by the quarantined write update data. In another embodiment, the original host server 110a which provided the original write update data may be requested to not send write update data to the subfile 205b. Instead, a second host such as the host server 110b, for example, may be requested to provide the write update data targeted to update the subfile 205b. In this example, the write update data provided by the second host server 110b may be the same as that provided by the first host server 110a, but may be free of malicious software.

[0049] Accordingly, in this example, in the event that the first temporary copy subfile 205b1 is quarantined, a second temporary copy subfile as represented by the temporary copy subfile 205b2 (FIG. 8) may be created to receive the write update data from the second host server 110b in a manner similar to that described above in connection with the first host server 110a. If the temporary copy subfile 205b2 containing the write update data from the second host server 110b passes the scanning test, the original file 200 may be updated with the scanned and updated contents of the clean temporary copy subfile 205b2 in the manner described above in connection with temporary copy subfile 205b1. Upon successful updating of the original file 200 with the scanned, updated contents of the temporary copy subfile 205b2, the temporary memory or storage space utilized by the temporary copy subfile 205b2 may be released and returned to the pool 210 of temporary copy subfile locations. In addition, the first host server 110a may be permitted to resume sending subsequent write updates targeted for the subfile 205b.

[0050] Conversely, if the write update data that was sent by the second host server 110b is also quarantined, then the file 200 may be marked with a suitable indication such as "not up to date," for example, to indicate the particular file area (in this example, subfile 205b) which was not updated. In this example, the user may also be informed that the particular file area (in this example, subfile 205b) was not updated, and that the temporary copy subfiles 205b1 and 205b2 have been quarantined. In addition, the second host server (host server 110b in this example) may be requested to not resend the particular write update data which was found to contain malicious software and which could not be repaired as discussed above. Further in one embodiment, any subsequent write updates from the same host (host server 110b in this example) to the same subfile (in this example, subfile 205b) may be rejected.

[0051] In another embodiment of the present invention, once the user has been informed that subfile 205b has not been updated and has been informed of the quarantining of the temporary copy subfiles 205b1, 205b2, the user may in this example as well select to delete the quarantined subfile data of the quarantined subfiles 205b1, 205b2. If so, the infected storage locations 210b, 210d are cleared, and another AV Scan is performed on those areas. If a storage location which previously contained a quarantined temporary copy subfile is found to be free of malicious software, the storage location may be returned to the pool 210 of temporary storage locations. In addition, in one embodiment, the subsequent write updates from the same hosts (host servers 110a, 110b in this example) targeted for the subfile 205b for which temporary copy subfiles 205b1, 205b2 were quarantined, would be accepted.
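The rejection and later acceptance of write updates from a particular host to a particular subfile, as described in the preceding paragraphs, could be tracked with a per-host, per-subfile block list. The Python sketch below is one hypothetical way to do so; the class `WriteBlockList` and its method names are assumptions and are not part of the description.

```python
class WriteBlockList:
    """Hypothetical record of (host, subfile) pairs whose write updates are rejected."""

    def __init__(self):
        self._blocked = set()

    def block(self, host, subfile):
        # Called after the host's write update for this subfile was quarantined.
        self._blocked.add((host, subfile))

    def unblock_subfile(self, subfile):
        # Called once the quarantined data for this subfile has been deleted
        # and the storage locations rescanned, so writes may be accepted again.
        self._blocked = {(h, s) for (h, s) in self._blocked if s != subfile}

    def accepts(self, host, subfile):
        return (host, subfile) not in self._blocked


blocks = WriteBlockList()
blocks.block("110a", "205b")
blocks.block("110b", "205b")
rejected = not blocks.accepts("110a", "205b")
other_subfile_ok = blocks.accepts("110a", "205c")
blocks.unblock_subfile("205b")
```

Blocking is scoped to the affected subfile in this sketch, so the same host may still update other subfiles of the file 200.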

[0052] In one embodiment, the size of each subfile 205 may be selected to be proportional to the capacity of a server 110 scanning the subfiles such as the original subfile 205b and its corresponding temporary copy subfiles 205b1, 205b2, for example. The size may be fixed or may be dynamically assigned. It is appreciated that other sizes and other techniques for choosing the sizes of the subfiles may be utilized depending upon the particular application.

[0053] FIG. 9 is a schematic block diagram illustrating one embodiment of an anti-virus control file 302. The anti-virus control file 302 includes an entry 220 for each subfile, original or temporary, to be scanned. In one embodiment, each entry 220 includes a status 230, a server identifier 235, and a subfile address 240.

[0054] The status 230 may be selected from the group consisting of in-queue, quarantined, and cleared statuses. The in-queue status may indicate that an original or temporary copy subfile 205 is scheduled to be scanned by a server 110, but has not yet been found to be clear of malicious software. In one embodiment, subfiles 205 with the in-queue status may be accessed. Alternatively, subfiles 205 with the in-queue status may not be accessed. As used herein, accessed refers to a subfile 205 being read from and/or written to by an application, an operating system, or the like.

[0055] The quarantined status may indicate that malicious software has been found in the subfile 205. In one embodiment, subfiles 205 with a quarantined status may not be accessed. Subfiles 205 with the quarantined status may be scheduled for mitigation, deletion or other processing. The mitigation may include repair to delete malicious software from the subfile 205, overwriting the subfile 205 with a backup copy, rebuilding the subfile 205 using error codes and/or redundant data, and the like.

[0056] The cleared status may indicate that the subfile 205 has been scanned and that no malicious software has been found. In one embodiment, subfiles 205 with a cleared status may be accessed. For example, if the first subfile 205a of a large database file 200 has been scanned and has a cleared status, the first subfile 205a may be accessed.

[0057] The server identifier 235 may identify the server 110 assigned to scan the subfile 205. In one embodiment, the server identifier 235 is a network address. Alternatively, the server identifier 235 may be a logical name.

[0058] The subfile address 240 may include a start address and an end address for the subfile 205. In one embodiment, the subfile address 240 includes start addresses and end addresses for a plurality of segments that make up the subfile.
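An entry 220 of the anti-virus control file, with its status 230, server identifier 235, and subfile address 240, can be modeled as a small record. The following Python sketch is illustrative only: the type and field names are assumptions, and the `allow_in_queue` flag reflects the two alternative embodiments for in-queue subfiles described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Status(Enum):
    IN_QUEUE = "in-queue"
    QUARANTINED = "quarantined"
    CLEARED = "cleared"


@dataclass
class ControlEntry:
    """Hypothetical model of one entry 220 in the anti-virus control file."""
    status: Status                      # status 230
    server_id: str                      # server identifier 235 (network address or logical name)
    segments: List[Tuple[int, int]]     # subfile address 240 as (start, end) address pairs


def may_access(entry, allow_in_queue=True):
    """Decide whether a subfile may be read from or written to."""
    if entry.status is Status.CLEARED:
        return True
    if entry.status is Status.IN_QUEUE:
        return allow_in_queue           # embodiment-dependent
    return False                        # quarantined subfiles are not accessed


entry = ControlEntry(Status.IN_QUEUE, "scan-server-110c", [(0, 4096), (8192, 12288)])
```

A subfile stored as several segments simply carries more than one address pair in `segments`.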

[0059] FIG. 10 is a schematic block diagram illustrating one embodiment of a computer 300. The computer 300 may be the server 110. Alternatively, the computer 300 may be a storage server 130, a controller 160, or the like. The computer 300 may include a processor 305, a memory 310, and communication hardware 315. The memory 310 may be a semiconductor storage device, a hard disk drive, or combinations thereof. The memory 310 may store computer readable program code. The processor 305 may execute the computer readable program code. The computer 300 may communicate with external devices through the communication hardware 315.

[0060] FIG. 11 is a schematic block diagram illustrating one embodiment of a file integrity preservation apparatus 350. The apparatus 350 may be embodied in the computer 300. The apparatus 350 includes an anti-virus control file 320, a division module 325, an access module 330 and a subfile update module 360.

[0061] In one embodiment, the anti-virus control file 320, the division module 325, the access module 330 and the subfile update module 360 may be embodied in a computer-readable storage medium storing computer readable program code. The computer readable storage medium may be the memory 310. The processor 305 may execute the computer readable program code to perform the functions of the anti-virus control file 320, the division module 325, the access module 330 and the subfile update module 360.

[0062] The division module 325 may divide the file 200 into a plurality of subfiles 205 and create the temporary copy subfiles. The access module 330 may maintain a status of each subfile 205. In addition, the access module 330 may scan each subfile 205 with a separate server 110 as described herein. If the subfile passes the scan, the subfile update module 360 may update the subfile with the scanned update data.

[0063] FIG. 12 shows one embodiment of operations for file integrity preservation in accordance with the present description. Upon the initiation of a write data update (block 400) in which a host provides write data targeted to update a portion of a file, the write data update is instead used to update (block 404) a temporary copy subfile corresponding to a subfile of the file containing the targeted portion of the write data operation. Upon updating the temporary copy subfile of the file with the write update, the updated temporary copy subfile is scanned (block 408) for malicious software. If the scanned, updated temporary copy subfile passes (block 412) the scan, the file or its original subfile may be updated (block 416) with the scanned, updated contents of the temporary copy subfile. In addition, any blocks applied to prior sources of infected write data for the subfile of the file may be removed (block 420) to permit resumption of access to the subfile for the previously blocked sources.

[0064] Conversely, if the updated, scanned temporary copy subfile fails (block 412) the scan such that the temporary copy subfile was found to be infected with malicious software, an attempt (block 434) may optionally be made to repair the scanned, updated temporary copy subfile found to be infected with malicious software. Upon completion of the repair attempt, the temporary copy subfile may be rescanned (block 436) to determine if the repair attempt was successful. If the temporary copy subfile fails the scan again, that is, the repair attempt was unsuccessful (block 436), the temporary copy subfile may be quarantined (block 440). In one embodiment, a determination may be made (block 448) as to whether to request a resending of the write update data. The request to resend the write update data may be made to the original source of the write update data or to a different source. If a resending of the write update data is requested and received, the resent write update data may be used to update (block 404) another temporary copy subfile and the operations of blocks 404-448 may be repeated.

[0065] If it is determined (block 448) that the resending of the write update data is not to be requested, the source of the infected write update data may be temporarily blocked (block 450) from further access to the subfile which was targeted by the write data update. In one embodiment, further operations may be performed as explained in greater detail in connection with FIG. 13 below.

[0066] If the repair attempt (block 434) allows the updated temporary copy subfile to pass (block 436) the scan, indicating that the repair was successful, the file or its original subfile may be updated (block 416) with the scanned, updated contents of the temporary copy subfile. In addition, any blocks applied to prior sources of infected write data for the subfile of the file may be removed (block 420) to permit resumption of access to the subfile for the previously blocked sources.
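The operations of FIG. 12 described above may be sketched, under stated assumptions, as follows. The function apply_write_update, the toy scanner toy_scan, the made-up signature b"EVIL", and the in-memory structures for blocked sources and quarantined copies are all hypothetical and are introduced only for illustration. The sketch also assumes that no resend of the write update data is requested (block 448), so that a failed update leads directly to blocking the source (block 450).

```python
from typing import Callable, Dict, List, Optional, Set


def toy_scan(data: bytes) -> bool:
    """Hypothetical scanner: passes unless a made-up signature is present."""
    return b"EVIL" not in data


def apply_write_update(
    subfiles: List[bytes],
    index: int,
    offset: int,
    update: bytes,
    source: str,
    blocked: Dict[int, Set[str]],
    quarantine: List[bytes],
    scan: Callable[[bytes], bool] = toy_scan,
    repair: Optional[Callable[[bytes], bytes]] = None,
) -> bool:
    # Block 404: update a temporary copy instead of the original subfile.
    temp = bytearray(subfiles[index])
    temp[offset:offset + len(update)] = update
    candidate = bytes(temp)

    # Blocks 408 and 412: scan the updated temporary copy.
    if not scan(candidate):
        # Blocks 434 and 436: optional repair attempt followed by a rescan.
        repaired = repair(candidate) if repair is not None else None
        if repaired is None or not scan(repaired):
            quarantine.append(candidate)                  # block 440
            blocked.setdefault(index, set()).add(source)  # block 450
            return False
        candidate = repaired

    subfiles[index] = candidate  # block 416: commit the clean contents
    blocked.pop(index, None)     # block 420: lift earlier blocks for this subfile
    return True
```

In this sketch the entry subfiles[index] is overwritten only after the temporary copy passes a scan, so a failed update leaves the original subfile unchanged.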

[0067] As previously mentioned, if a determination (block 448) is made to request no further resends of the write update data, further operations may optionally be performed. FIG. 13 depicts one example of operations which may be initiated (block 500) subsequent to quarantining (block 440, FIG. 12) a temporary copy subfile. In one embodiment, further operations may include deleting (block 504) the contents of the quarantined temporary copy subfiles and scanning (block 508) the locations of the deleted temporary copy subfiles to ensure that they are free of malicious software. If so, the locations of the temporary copy subfiles may be returned (block 516) to a pool of temporary copy subfiles for use by other processes.

Alternatively, if the scanning (block 508) of the locations of the deleted temporary copy subfile indicates that malicious software remains, the quarantining of the locations of the temporary copy subfile may continue (block 520). In some embodiments, one or more additional attempts may be made to clean the temporary copy subfile locations found to harbor malicious software.
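The operations of FIG. 13 may be illustrated by the following minimal sketch. The TempLocation type, the clean_up_quarantine function, the zero-fill used to model deletion, and the caller-supplied location_is_clean check are assumptions made for this example only; the description above does not specify how deletion or location scanning is carried out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TempLocation:
    location_id: int  # hypothetical identifier of a storage location
    data: bytearray   # contents currently held at that location


def clean_up_quarantine(
    quarantined: List[TempLocation],
    pool: List[TempLocation],
    location_is_clean: Callable[[TempLocation], bool],
) -> List[TempLocation]:
    """Return the locations that must remain quarantined (block 520)."""
    remaining: List[TempLocation] = []
    for loc in quarantined:
        # Block 504: delete the contents (modeled here as a zero-fill).
        loc.data[:] = bytes(len(loc.data))
        # Block 508: scan the location after deletion.
        if location_is_clean(loc):
            pool.append(loc)       # block 516: return the location to the pool
        else:
            remaining.append(loc)  # block 520: continue the quarantine
    return remaining
```

A caller may invoke clean_up_quarantine again on the returned list to model the additional cleaning attempts mentioned above.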

[0068] As previously mentioned in connection with FIG. 12, in the event that a temporary copy subfile is quarantined (block 440, FIG. 12), a determination may be made (block 448) as to whether to request a resending of the write update data. FIG. 14 is directed to an embodiment in which the request to resend the write update data is made to a source other than the original source of the write update data.

[0069] In this embodiment, the resent write update data from the second source may be used to update (block 404, FIG. 12) another temporary copy subfile instead of the original targeted subfile 205b (FIG. 8) of the original file 200 and the operations of blocks 404-448 (FIG. 12) may be repeated with respect to the second temporary copy subfile. Accordingly, upon receipt (block 600) of the resent write update data from a second host, the resent update data intended to update the subfile 205b, is instead used to update (block 604) a second temporary copy subfile 205b2 corresponding to the subfile 205b of the file 200 containing the targeted portion of the write data operation. Upon updating the second temporary copy subfile of the file with the write update, the updated second temporary copy subfile is scanned (block 608) for malicious software. If the scanned, updated second temporary copy subfile passes (block 612) the scan, the file 200 or its original subfile 205b may be updated (block 616) with the scanned, updated contents of the second temporary copy subfile. In addition, any blocks applied to prior sources such as the original source of the infected write data for the subfile of the file may be removed (block 620) to permit resumption of access to the subfile for the previously blocked sources.

[0070] Conversely, if the updated, scanned second temporary copy subfile 205b2 fails (block 612) the scan such that the second temporary copy subfile was found to be infected with malicious software, an attempt (block 634) may be made to repair the scanned, updated second temporary copy subfile found to be infected with malicious software. Upon completion of the repair attempt, the second temporary copy subfile may be rescanned (block 636) to determine if the repair attempt was successful. If the second temporary copy subfile fails the scan again, that is, the repair attempt was unsuccessful (block 636), the second temporary copy subfile may be quarantined (block 640). In one embodiment, a determination may be made (block 648) as to whether to request a resending of the write update data from the second source or another source. If a resending of the write update data is requested and received, the resent write update data may be used to update (block 604) another (such as a third) temporary copy subfile and the operations of blocks 604-648 may be repeated.

[0071] If it is determined (block 648) that the resending of the write update data is not to be requested again, the second source of the infected write update data may be temporarily blocked (block 650) from further access to the subfile which was targeted by the write data update. In one embodiment, further operations may be performed for the quarantined second temporary copy subfile as explained in greater detail in connection with FIG. 13 above.

[0072] If the repair attempt (block 634) allows the updated second temporary copy subfile to pass (block 636) the scan, indicating that the repair was successful, the file 200 or its original subfile 205b may be updated (block 616) with the scanned, updated contents of the second temporary copy subfile. In addition, any blocks applied to prior sources of infected write data for the subfile of the file may be removed (block 620) to permit resumption of access to the subfile for the previously blocked sources.
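The resend behavior of FIG. 14 may be sketched as a loop over successive sources, each of which receives its own temporary copy subfile. This is a simplified illustration rather than a definitive implementation: the function update_from_sources and its parameters are hypothetical, the repair attempt (blocks 634 and 636) is omitted for brevity, the update is assumed to start at offset zero of the subfile, and each failing source is assumed to be blocked while the next source is tried.

```python
from typing import Callable, List, Set, Tuple


def update_from_sources(
    subfile: bytes,
    attempts: List[Tuple[str, bytes]],
    scan: Callable[[bytes], bool],
    blocked: Set[str],
    quarantine: List[bytes],
) -> bytes:
    """Try each (source, update) pair in turn and return the resulting subfile."""
    for source, update in attempts:
        # Blocks 404 and 604: a fresh temporary copy is updated for every attempt.
        temp = bytearray(subfile)
        temp[:len(update)] = update
        candidate = bytes(temp)

        # Blocks 408-412 and 608-612: scan the updated temporary copy.
        if scan(candidate):
            blocked.clear()           # blocks 420 and 620: lift earlier blocks
            return candidate          # blocks 416 and 616: commit the clean update

        quarantine.append(candidate)  # blocks 440 and 640
        blocked.add(source)           # assumed policy: block each failing source

    # No source supplied clean data, so the original subfile is returned unchanged.
    return subfile
```

Because every attempt starts from the unmodified subfile, an infected update from one source cannot contaminate the temporary copy used for the next source.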

[0073] The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

[0074] The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

[0075] Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

[0076] Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

[0077] Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

[0078] These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

[0079] The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

[0080] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

[0081] The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.