Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
METHOD AND DEVICES FOR SECURE DELETION OF DATA IN A LOG STRUCTURED FILE SYSTEM
Document Type and Number:
WIPO Patent Application WO/2012/167392
Kind Code:
A2
Abstract:
For ensuring secure deletion of data in a log structured file system (12), shortened is the deletion latency between deletion of the data by a user and secure deletion of the data from memory by an application (11) writing (S4) and deleting (S3) junk files to expedite reallocation of memory. For that purpose, the application (11) determines (S1, S2) periodically a number of free chunks of memory, deletes (S3) a junk file for cases where the number of free chunks is below a defined lower memory threshold, and writes (S4) a junk file for cases where the number of free chunks (131) is above a defined upper memory threshold.

Inventors:
CAPKUN SRDJAN (CH)
BASIN DAVID (CH)
REARDON JOEL (CH)
MARFORIO CLAUDIO (CH)
Application Number:
PCT/CH2012/000123
Publication Date:
December 13, 2012
Filing Date:
June 05, 2012
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
ETH ZUERICH (CH)
CAPKUN SRDJAN (CH)
BASIN DAVID (CH)
REARDON JOEL (CH)
MARFORIO CLAUDIO (CH)
International Classes:
G06F17/30
Foreign References:
US20090172265A12009-07-02
Other References:
None
Download PDF:
Claims:
A method for secure deletion of data in a log structured file system (12) wherein deletion latency between deletion of the data by a user and secure deletion of the data from memory (13) is shortened by an application (11) writing and deleting junk files to expedite reallocation of memory.

The method of claim 2, wherein the application (11) determines periodically a number of free chunks of memory (131), deletes a junk file for cases where the number of free chunks (131) is below a defined lower memory threshold, and writes a junk file for cases where the number of free chunks (131) is above a defined upper memory threshold.

The method of claim 2, wherein the junk files are set to a defined number of one or more memory erase blocks ( 130) including junk data, a memory erase block (130) comprising a plurality of memory chunks (131), and the defined lower memory threshold and the defined upper memory threshold are set to obtain a minimum amount of available memory in a range of 10 to 250 memory erase blocks (130).

The method of one of claims 1 to 3, wherein the application deletes from memory (13) older junk files prior to newer junk files.

The method of one of claims 1 to 4, wherein the log structured file system (12) is implemented on a communication terminal, and the application (1 T) is assigned permission to run on the communication terminal while the communication terminal is in a locked state. The method of one of claims 1 to' 5, wherein upon user deletion of sensitive data the application (11) fills the available memory with junk data.

A computerized device (1) comprising a data memory (13) and a log structured file system (12), wherein the device (1) further comprises an application module (11) configured to shorten deletion latency between deletion of the data by a user and secure deletion of the data from the memory ( 13) by writing and deleting junk files to expedite reallocation of memory.

The device (1) of claim 7, wherein the application module (11) is further configured to determine periodically a number of free chunks of memory ( 131 ), to delete a junk file for cases where the number of free chunks (131) is below a defined lower memory threshold, and to write a junk file for cases where the number of free chunks ( 131 ) is above a defined upper memory threshold.

The device (1 ) of claim 8, wherein the junk files each include a defined number of one or more memory erase blocks (130) including junk data, a memory erase block (130) comprising a plurality of memory chunks (131), and the defined lower memory threshold and the defined upper memory threshold are set to obtain a minimum amount of available memory in a range of 10 to 250 memory erase blocks (130).

10. The device (1) of one of claims 7 to 9, wherein the application module (11) is further configured to delete from the memory older junk files prior to newer junk files.

11. The device ( 1 ) of one of claims 7 to 10, wherein the device ( 1 ) is a communication terminal, and the application module ( 11 ) is set up with permissions to run on the communication terminal while the communication terminal is in a locked state.

12. A computer program product comprising a computer readable medium having stored thereon computer program code which directs a computerized device (1), having a data memory (13) and a log structured file system (12), to execute a method according to one of claims T to 7.

1 . A method of deleting data in a log structured file system (12) associated with flash memory (13) of an electronic device (1), wherein, upon deletion of a chunk of memory (131) by the file system (12), a kernel function of the communication terminal overwrites the entire deleted chunk (131) with zeros.

14. An electronic device (1) comprising flash memory (1 ) and an associated log structured file system (12), wherein the device (1) further comprises a kernel function configured, upon deletion of a chunk of memory (131 ) by the file system ( 1 ), to overwrite the entire deleted chunk (131) with zeros.

15. A computer program product comprising a computer readable medium having stored thereon computer program code which directs an electronic device (1), having a flash memory (13) and an associated log structured file system (12), to execute a method according to claim 13.

6. A method for secure deletion on a file system, comprising the following steps:

- deleting the data from the device through normal means;

- writing new data until the capacity of the device is full, thereby ensuring that all available space has since been repurposed for new data; and - deleting the new file to reclaim the space it consumes. 7. A method for the encryption and secure deletion of data, using a storage medium capable of partitioning, where data storage comprises the following steps:

- partitioning the device into an encryption key area and a storage area;

- associating each file in the storage area with an encryption key in the key area; - encrypting any data written to a file with its corresponding key; and data deletion compromises the following steps:

- deleting the file from the storage area;

- deleting the key for the file from the key area;

- applying claim 1 6 to the capacity of storage space.

A method for achieving improvements to the expected latency on log -structured file systems without requiring special privileges, comprising the following steps: - crating one or more unneeded file(s) whenever the file system has free space above a threshold;

- deleting an unneeded file whenever the file system has free space below a threshold; and

- periodically checking the file system's free space and managing the unneeded files.

A method for secure deletion involving a remapping layer, such that any file system mounted on top of the remapping layer can implement an equivalent system as claim 1 6, where instead of filling the entire storage capacity of the device, the remapping layer withholds blocks from the file system, consisting of the following steps:

- the remapping layer is informed that a secure deletion operation is occurring;

- the user writes unneeded data to a single file;

- during this process, whenever a block is erased, the remapping layer uses block metadata to prevent its reallocation;

- the unneeded file is then deleted and the remapping layer is told that;

- the secure deletion operation has completed; and the block remapping layer uses metadata to again allow allocation of erased blocks.

20. A realization of claim 1 9 as applied to the UBI block remapping layer and any flash file system, including YAFF5 and YAFFS2.

21 . A realization of claim 1 9 using the bad block flag as metadata to prevent the file system from allocating a block.

5 22. A hybrid system of claims 20 and 21 .

23. A method for secure deletion involving a remapping layer, such that any file system mounted on top of it can implement an equivalent system as Claim 1 8, where instead of filling unneeded files, the remapping layer withholds blocks from the file system; this consist of the following steps:

10 - the desired free space (cf. the thresholds on free space) is told to the remapping layer;

- when a block is erased, the remapping layer may indicate via metadata that the file system should not use that block (cf. writing an unneeded file); and

- the remapping layer uses metadata to also indicate when the file system can i s again use a block (cf. deleting an unneeded file).

24. A realization of claim 23 using the UBI remapping layer and any flash file system, including UBIFS, YAFFS and YAFFS2.

25. A realization of claim 23 using the bad block flag as metadata to prevent the file system from allocating a block.

26. A hybrid system of claims 24 and 25

A method for secure deletion making use of a remapping layer that provides atomic update, with a change to any file system that operates on the remapping layer, where . secure deletion occurs in a similar way as claim 1 7, consisting of the following steps:

- one or more erase block(s) at fixed logical positions is/are dedicated solely to storage of keys;

- each ile is associated with a key;

- any data written to a file is encrypted with its corresponding key before writing it to storage;

- when data is deleted, its corresponding key is also deleted; and

- periodically the block(s) containing keys are atomically updated using the remapping layer's atomic update function to securely remove all old keys, thus securely deleting (in a computational sense) the corresponding data. 28. A realization of claim 27 with UBI remapping layer and any file system that can function with UBI, including UBIFS, YAFFS and YAFFS2.

Description:
METHOD AND DEVICES FOR SECURE DELETION OF DATA IN A LOG STRUCTURED FILE SYSTEM

Field of the Invention

The present invention relates to a method and devices for secure deletion of data in a log structured file system. Particularly, the present invention relates to a method and devices for secure deletion of data stored by a log structured file system in flash memory.

Background of the Invention

Deleting a file from a storage medium has two main purposes: it reclaims storage and ensures that any sensitive information contained in the file is no longer accessible. When done for the latter purpose, it is critical that the file is securely deleted, meaning that its content does not persist on the storage media after its deletion.

Largely for performance reasons, secure deletion of data is almost always ignored in file system design [Z, 3, 6, 1 0, 1 2, 1 5]. Typically, deletion is implemented as a rapid operation where a file is unlinked, meaning its metadata states that it is no longer present while the contents of the file remain in storage until overwritten by new data [4]. While many users expect that deleting messages will delete them, clearing the browser's history will clear it, and changing their location will overwrite their previous location, in reality this information remains on their devices without any guarantees of actual deletion from memory. Surveys of repurposed hard drives found that many contained private financial or medical data that could be recovered with trivial forensic cost and effort [4]. However, secure deletion is not only important when media is repurposed; it also enables users to protect the confidentiality of their data, if their devices are compromised, stolen, or confiscated under a subpoena. In the case of a subpoena, the user may also be forced to disclose all passwords, keys, or other credentials that enable access to the data stored on the device; in such a scenario, the users can only use secure deletion to protect their data before the device is seized.

To protect their privacy, users should be able to securely delete the data stored on their devices. This requirement is particularly important for modern smartphones, as they are increasingly storing personal data such as the owner's private conversations, browsing history, and location history. Mobile phones further store business data, which for company policy or legal reasons should be deleted after some time elapses or should not be available at some geographic locations, e.g. , should be deleted before a phone owner leaves a certain jurisdiction. Secure deletion mechanisms have been proposed for some widely-used block-structured and journalled file systems, like ext2 aYid ext3 [ 1 , 7, 8], These mechanisms typically modify the kernel and enforce that when the file is marked for deletion it is overwritten with arbitrary data. Another mechanism is full-drive encryption, where the decryption key is generated from user's password upon system boot. This mechanism is effective but has limitations: users can be coerced to disclose their credentials, or might be legally obliged to delete data after some time or at some locations.

Log-structured file systems differ from a traditional block-based file system, e.g. FAT or ext2, in that the entire file system is stored as a chronological record of changes from the initial empty state. As files are written, new fixed-size chunks are appended to the log indicating the resulting change; a chunk can store either the file headers or the files' data, and is always added to the end of the log. The file system maintains in RAM the information regarding which chunks are valid, which have been superseded since writing, and where the headers for each file can be found. Log-structured file systems complicate secure deletion because the traditional approach of overwriting a file with new content simply appends a second version of the file, while the first still remains in the log's history. Similarly, encrypting a file will also just append a new encrypted version of that file, while the plaintext remains. The mechanism through which data is removed from a log-structured file system is called garbage collection. It operates at the erase block level, which has a larger granularity than a chunk. The file system examines both the wasted space in an erase block and the storage medium's total remaining free space when deciding whether to garbage collect the erase block. YAFFS, "Yet Another Flash File System", is a log-structured file system developed specifically for flash memory storage. The flash-based log-structured file system is notably used for the internal memory of Android mobile phones. YAFFS allocates memory by selecting an unused erase block and. allocating sequentially the numbered chunks in that block. When the block contains no more empty chunks, a new block is selected for allocation by searching for an empty block. YAFFS searches sequentially, wrapping cyclically when necessary, by the erase block number as defined by the physical layout of memory on the storage medium.

Garbage collection in YAFFS is either initiated by a timer thread that performs system maintenance, or takes place after write operations. Usually, only a few chunks are copied at a time, whereby the work to copy a block is amortized over many write operations. If the file system contains too few free blocks then a more aggressive garbage collection is performed, in this case, blocks with less deleted space will be collected, and the procedure will continue until the entire block can be reclaimed. Figure 1 shows the lifetime for stored data. After being allocated at time t 0 , a block is reallocated at time t 2 , when all deleted data it contained becomes securely deleted. The difference between t 2 and the time when the user deletes the data, t, , is called the deletion latency (t 2 -t, ). This is always less than the block reallocation period - defined as the time between two subsequent allocations of the same block (t 2 -t 0 ).

Flash Memory Flash memory is a non-volatile storage medium consisting of an array of electrical components that store information. The contents of flash memory cannot be altered in place, but rather an erase procedure must be first performed on a larger granularity than reading or writing. Erasure on flash memory is costly: the increased power requirements of erasure eventually wear out the medium, and erasure blocks can only handle a finite number of erasure operations - roughly 1 04 to 105 [ 1 3] - before becoming unusable.

Flash file systems are typically log-structured for two reasons. First, the large erase granularity of flash memory maps exactly to the garbage collector's erase blocks in a log- structured file system. Second, log-structured file systems do not require in-place updates; new data is always appended to the end of the log.

When considering how deletion is performed in the YAFFS file system, it becomes clear that that log-structured file systems provide no temporal guarantees on data deletion; deleted data persists for around 44 hours with average phone use and indefinitely, if the phone is not used after the deletion. Furthermore, mechanisms such as file overwriting or encryption of individual files, proposed for data deletion in block-structured file systems [ 1 1 , 5] do not ensure data deletion in log-structured file systems such as YAFFS. Namely, overwriting a file in log-structured file systems simply writes a new version of a file, but does not remove the original copy. Similarly, when a file is encrypted, the ciphertext will be written to a new location, but the plaintext will remain on the flash drive until storage space is needed and garbage collection is invoked. Only changes to the file system to ensure that no plaintext data is ever written to the drive would protect the user.

Summary of the Invention

It is an object of this invention to provide a method and devices for secure deletion of data in a log structured file system, which method and devices do not have the disadvantages of the prior art. In particular, it is an object of the present invention to provide a method and devices for secure deletion of data stored by a log structured file system in flash memory.

According to the present invention, these objects are achieved particularly through the features of the independent claims. In addition, further advantageous embodiments follow from the dependent claims and the description.

According to the present invention, the above-mentioned objects are particularly achieved in that, for secure deletion of data in a log structured file system, the deletion latency between deletion of the data by a user and secure deletion of the data from memory, e.g. flash memory, is shortened by an application (module) writing and deleting junk files to expedite reallocation of memory.

In a preferred embodiment, the application determines periodically a number of free chunks of memory, deletes a junk file for cases where the number of free chunks is below a defined lower memory threshold, and writes a junk file for cases where the number of free chunks is above a defined upper memory threshold. For example, the junk files are set to a defined number of one or more memory erase blocks including junk data, a memory erase block comprising a plurality of memory chunks, and the defined lower memory threshold and the defined upper memory threshold are set to obtain a minimum amount of available memory in a range of 1 0 to 250 memory erase blocks.

In another preferred embodiment, the application deletes from memory older junk files prior to newer junk files. In an embodiment, the log structured file system is implemented on a computerized device, particularly a communication terminal, and the application is assigned permission to run on the communication terminal while the communication terminal is in a locked state.

In another embodiment, upon user deletion of sensitive data, the application fills the available memory with junk data.

In addition to the method and computerized device for secure deletion of data in a log structured file system, the present invention also relates to a computer program product comprising a (tangible, non-transitory) computer readable medium having stored thereon computer program code which directs a computerized device, e.g. a communication terminal, having a data memory, e.g. flash memory, and a log structured file system, to execute the method for secure deletion of data in the log structured file system.

In another aspect, the invention relates to a method and a device for deleting data in a log structured file system associated with flash memory of a communication terminal, whereby, upon deletion of a chunk of memory by the file system, a kernel function of the communication terminal overwrites the entire deleted chunk with zeros. I n a third aspect, the invention relates to a method for secure deletion on a file system, comprising the steps: deleting the data from the device through normal means; writing new data until the capacity of the device is full, thereby ensuring that all available space has since been repurposed for new data; and deleting the new file to reclaim the space it consumes.

In a fourth aspect, the invention relates to a method for the encryption and secu re deletion of data, using a storage medium capable of partitioning , where data storage comprises the steps: partitioning the device into an encryption key area and a storage area; associating each file in the storage area with an encryption key in the key area; encrypting any data written to a file with its corresponding key; and where data deletion compromises the steps: deleting the file from the storage area; deleting the key for the file from the key area; applying the third aspect to the capacity of storage space.

In a fifth aspect, the invention relates to a method for achieving improvements to the expected latency on log-structured file systems without requiring special privileges, comprising the following steps: crating one or more u nneeded file( s) whenever the file system has free space above a threshold; deleting an unneeded file whenever the file system has free space below a threshold; and periodically checking the file system's free space and managing the unneeded files. In another aspect, the invention relates to a method for secure deletion involving a remapping layer, such that any file system mounted on top of the rema pping layer can implement an equivalent system as the thi rd aspect, where instead of filling the entire storage capacity of the device, the remapping layer withholds blocks from the file system, consisting of the following steps: the remapping layer is informed that a secu re deletion operation is occurring; the user writes unneeded data to a single file; during this process, whenever a block is erased, the remapping layer uses block metadata to prevent its reallocation; the unneeded file is then deleted and the remapping layer is told that; the secure deletion operation has completed; and the block remapping layer uses metadata to again allow allocation of erased blocks. A first realization is applied to the UBI block remapping layer and any flash file system, including YAFFS and YAFFS2. A second realization uses the bad block flag as metadata to prevent the file system from allocating a block. A hybrid system combines these first and second realizations.

In a further aspect, the invention relates to a method for secure deletion involving a remapping layer, such that any file system mounted on top of it can implement an equivalent system as the fifth aspect, where instead of filling unneeded files, the remapping layer withholds blocks from the file system; this consist of the following steps: the desired free space (cf. the thresholds on free space) is told to the remapping layer; when a block is erased, the remapping layer may indicate via metadata that the file system should not use that block (cf. writing an unneeded file); and the remapping layer uses metadata to also indicate when the file system can again use a block (cf. deleting an unneeded file). In a first realization used is the UBI remapping layer and any flash file system, including UBIFS, YAFFS and YAFFS2. A second realization uses the bad block flag as metadata to prevent the file system from allocating a block. A hybrid system combines these first and second realizations.

In another aspect, the invention relates to a method for secure deletion making use of a remapping layer that provides atomic update, with a change to any file system that operates on the remapping layer, where secure deletion occurs in a similar way as the fourth aspect, consisting of the following steps: one or more erase block(s) at fixed logical positions is/are dedicated solely to storage of keys; each file is associated with a key; any data written to a file is encrypted with its corresponding key before writing it to storage; when data is deleted, its corresponding key is also deleted; and periodically the block(s) containing keys are atomically updated using the remapping layer's atomic update function to securely remove all old keys, thus securely deleting (in a computationa-l sense) the corresponding data. A realization uses UBI remapping layer and any file system that can function with UBI, including UBIFS, YAFFS and YAFFS2.

Brief Description of the Drawings

The present invention will be explained in more detail, by way of example, with reference to the drawings in which: Figure 1 : shows a time diagram illustrating the lifetime for stored data.

Figure 2a: shows a graph which illustrates a snapshot of block allocation for a browser application.

Figure 2b: shows a graph which illustrates a snapshot of block allocation for a maps application. Figure 3: shows a graph which illustrates YAFFS' block allocations over time on an

Android phone.

Figure 4: shows a flow diagram illustrating an exemplary sequence of steps performed by an application for secure deletion of data in a log structured file system.

Figure 6: shows a scatter plot of average block allocations per hour and median secret erasure time in different application simulations with a variety of parameters. Figure 7 : shows a graph which illustrates block allocation times on a simulated system not running the application, where, at the end of the simulation, the drive was completely filled to guarantee all deleted data was removed.

Figure 8: illustrates the state of an erase block at different times.

Figure 9: shows a block diagram illustrating schematically a computerized device having a data memory, a log structured files system and an application for secure deletion of data in the log structured file system.

Detailed Description of the Preferred Embodiments

In this section, investigated is the data persistence on a commercially available mobile radio phone, particularly on an Android phone, which uses the YAFFS file system for its internal memory. Measured is the time that it takes for one erase block to be reclaimed after being marked for deletion.

Although erasure does not happen deterministically, it is possible to measure average and worst-case data deletion latency for specific devices, application configurations and usage patterns. To measure the average time it takes for blocks to be reallocated, which implies the deletion of any data previously stored on them, the file system is instrumented at the kernel level to log block allocation information. The results show clearly the need for solutions that guarantee secure, i.e. complete and permanent, and timely deletion of data.

A modified version of the YAFFS Linux kernel module was built that logs data about block allocations and chunk writes. It was logged whenever a new block is allocated, which signals that the block is now empty and that whatever data was previously on the block has been erased (or moved). Also logged was every write operation: both file headers and file data. This made it possible to determine how often writes occur, in which blocks and chunks they occur, and when files are deleted.

During block allocations, logged was the system time in microseconds, the unique physical block number (in the present case, ranging from 1 to 1 570), the sequence number of the block, and the number of free chunks and erased blocks according to YAFFS's statistics. Also logged was the device name to demultiplex the data, as the Android phone has multiple YAFF5 partitions.

Logging every chunk write gives a fine-grained view of the writing behaviour on a mobile device. Logged was the system time in microseconds, the physical location of the chunk, the operating system's owner of the file, the block on which it is written, the type of data being written (i.e. a file, a folder, a header, etc.), the file id, and where in the file the data is being written.

With the collected information, it is possible to determine how much data is written to the file system, and the timing and frequency of block erasures. It is also possible to log the time when a particular file is written to the file system, which is cross-referenced in the logs to determine the block number on which it resides. Given this information, along with the time when each block is next erased and the time the data was deleted, it is possible to compute the extra time the data spends on the device after being deleted by the user. By logging the ownership of chunks, it is also possible to determine the distinct writing patterns of different running applications. This will later be used to construct profiles that model different scenarios in the simulated environment. To understand the severity of the existing problem with current implementations, it is examined in detail, the time required for data deletion on an Android phone. First the focus is on a subset of applications that could be used daily on a smart phone to determine how long data, even after deletion, lives on the system when the user uses only such applications. Then the phone is used continuously throughout the daily routine to find out, on average, how long data remains "alive" on the system before being erased. This information will later be used to model the writing behaviour of the Android OS and its applications to simulate a system with a specifically tailored simulator.

The system tested is a Nexus One running the latest Android OS ( 2.2.1 ) under what can be considered normal daily use. browsing the web, saving images, taking photographs with the integrated camera, listening to music, writing and receiving SMS messages, and making phone calls.

To understand the writing patterns of some commonly used applications, a user used the phone's browser, maps and gallery applications plus a popular game found on the Android Market. The user used the phone without any knowledge of the test, thereby eliminating any bias which could be introduced by knowing which system properties were examined.

For the browser test, the user surfed the web for approximately 8 minutes, performing various web surfing activities, such as logging into a university website, getting weather forecasts, and searching for images. For the maps case, the user interacted with the application for approximately 6 minutes, searching for a particular destination, looking at its "street view" and calculating a route to it. The game and gallery examples were ran for approximately 4-5 minutes each, which can be considered normal usage. Browser Maps Game Gallery Overall

Running time (s) 504 395 300 240 1439

Allocated blocks 185 87 1 2 247

Never deleted 168 75 1 2 199

Mean reallocation period (s) 54 75 WA N/A 271

Table 1 : Average reallocation period for different commonly used applications.

The test ran for 23 minutes and allocated 304 blocks.

Statistics for each test are summarized in Table 1 and example traces are plotted in Figure 2. The absence of lines after allocation of some blocks indicates that their content is still present on the system after their deletion time. This short usage scenario, which was executed for 23 minutes, gives an idea of how commonly used applications behave with respect to writes to disk.

To give a better idea of how long data remains on the device after being deleted, the instrumented phone was used daily. The experiment lasted 670 hours, roughly 27.9 days. In total, throughout the experiment, collected were 20345 block allocations made by 73 different writers. A writer could be any application but also the Android OS itself or one of its services (i.e. GPS, DHCP, compass, etc.) . Subsequent analysis of the experiment's logs yielded the result that blocks are reclaimed, on average, every 44.7 hours (the median being 44.5 hours). The worst case for block reallocation time for the experiment is 327.7 hours. This is not surprising, given the YAFFS implementation, but underscores the critical need for secure deletion solutions.

In Figure 9, reference numeral 1 refers to a computerized device, e.g. a communication terminal such as a mobile radio phone, e.g. an Android phone, or another mobile computerized device such as a laptop, palmtop or PDA computer, for example. As illustrated schematically in Figure 9, computerized device 1 comprises data memory 1 3, preferably flash memory, a log . structured file system 1 2, and an application module 1 1 running on a processor of the computerized device 1 . As illustrated in Figure 9, the data memory 1 3 comprises a plurality of memory erase blocks 130, each memory erase block 1 30 comprising a defined number of one or more memory chunks 1 31 .

The application module 1 1 works at the user level and is designed for the scenario where 5 security conscious users of the computing device 1 , e.g. mobile phone users, want to install a secure deletion application from an application marketplace (app store), but are unwilling to install a new operating system on their computing device 1 . This means that only user-level permissions on the storage medium can be provided to the application.

A user-level application has limited access to the flash device. The application cannot I D force the file-system to perform block erasures, prioritize garbage collection of particular areas in memory, or even know where on the computing device 1 the user's data is stored. The interface to the file system for such applications consists of the creation, modification, and deletion of the user's own local files.

The application module 1 1 keeps the storage medium close to capacity. In a so called 15 "ballooning" process, the application module 1 1 creates and removes insensitive junk files (including meaningless junk data only), resulting in more frequent garbage collection. This reduces the average block reallocation period, thereby speeding up the secure deletion of any sensitive information contained in the blocks.

The block reallocation period in a log-structured file system is the expected time that will 0 elapse between allocations of a block in the file system (cf. Figure 1 ). This is based mainly on two factors: the write frequency on the medium, and the expected number of other blocks that will be allocated before the particular block is reallocated. The type of contents on the block also has an effect: blocks containing long-term operating system files will tend not to be reallocated, simply because their contents are never deleted; however, blocks containing data that is not deleted is necessarily not a concern for secure deletion. The cyclic behaviour of block allocations in YAFFS is evident in Figure 3, which shows the sequence of block allocations from the collected data from an Android mobile phone. The X-axis corresponds to the time in hours for the experiment, and the Y-axis shows the numbered erase blocks. A small square in the graph indicates when each erase block was allocated. For clarity, only the allocations for every fifteenth block are shown. While some noise exists, it can be seen that block allocation numbers generally increase over time and wrap cyclically, and so the block reallocation period is dependent on both the number of blocks and the system-wide time between block allocations.

According to the invention, the capacity of the file system is reduced with junk content, which reduces the block reallocation period due to fewer blocks being available for allocation, and thus reduces the deletion latency. As a result, YAFFS is forced to employ more frequent garbage collection, as the file system will perpetually believe that it is in a state of reduced capacity. The junk files will be deleted when the drive requires more space, and be regenerated whenever there is too much free space.

The operation of the application module 1 1 is illustrated in Figure 4. The application module 1 1 runs periodically on the computing device 1 , examining the file system 1 2 (using stat) to determine the number of free chunks. As illustrated, in step S1 , the application module 1 1 requests from the log structured file system 1 2 information about the amount of free space, which is returned by the file system 1 2, in step 52, as a number ( n ) of free memory chunks 1 3 1 . Based on the current free capacity, the application module 1 1 either creates or deletes junk files using parameterized thresholds. As can be seen in Figure 4, if the number of free chunks 1 31 is below the defined lower memory threshold (n < min_threshold), in step S3, the application module 1 1 deletes one or more junk files from memory 1 3. On 5 the other hand, if the number of free chunks 1 31 is above the defined upper memory threshold (n > max_threshold ), in step S4, the application module 1 1 writes a junk file into memory 1 3.

In addition to the lower and upper memory thresholds, the junk files' exact size is also parameterizable and defined in multiples of erase blocks. Deleting one junk file will free at T O least one erase block for new data.

The application module 1 1 always deletes the oldest junk file before more recent ones to ensure that the desired load-balancing of wear on the flash memory is preserved. Long- lived junk files can also be erased, with new ones written, to ensure that their corresponding erase blocks will be used for more active data storage. The new ones are i s preferably written, before deleting the old ones to ensure that they reside on different erase blocks.

As illustrated schematically in Figure. 4, the application module 1 1 runs and operates within the limits imposed by user-space (i.e. outside the kernel space). To run the application module 1 1 on the computerized device 1 , it is given the permission to run 20 while the device 1 is in a locked state; the application module 1 1 also specifies that it will run as a service, meaning execution occurs even when the application is not in the foreground. Besides running the application module 1 1 on the computerized device 1 , e.g. the Android phone, also simulated was its behaviour on a simulated flash drive mounted as a YAFFS file system to collect more statistics on its effectiveness. This drive was implemented in RAM using the kernel module nandsim. Nandsim creates a virtual flash device that can be mounted as any flash based file system. A discrete event simulator was programmed that writes, overwrites, and deletes files on the phone's storage, which is simply a mounted directory on the simulation computer.

All the distributions that were used for file creation, modification, and deletion were taken from real-life Android phone usage, as described above. After a week of logging all write activities on the phone, the following two distributions were computed for each application: the time between successive creations of two new files, and the type of file to create. A file type is defined by its lifetime, a distribution over the period between opening a file for write, a distribution over the number of chunks to write to a file each time it is opened, and a distribution over the chunks of a file that indicates where writes will occur.

Additionally, implemented was a secret writer that operated alongside the simulated writers. It infrequently wrote a one-chunk secret message, waited until a new block was allocated, and deleted the secret message. Logged was the time before and after the file was opened to write the secret message, and the time it was deleted. By cross- referencing with the collected block allocation information, it was determined to which blocks the secret was written, and the time when those blocks were reclaimed thus erasing the secret.

In the YAFFS implementation, measured was the rate of block allocations. This provided information about the rate that data is written to the flash device. Data can be written from two sources: the actual data written by the simulator, and data copied by YAFFS's garbage collection mechanism. Since fixed write distributions are used, the expected rate of writes from the simulator is identical between experiments. Therefore, the observed disparity in block allocation rates reflects exactly the additional writes that are required by 5 the space filling application module 1 1 to achieve secure deletion.

Figure 5 (cf. Figure 3 ) shows YAFFS block allocations when using the application module 1 1 . It can be seen that as the range of possible block allocations shrinks considerably, the sequential allocation strategy becomes much more erratic, and the block reallocation period decreases. Erase blocks that are not being allocated (i.e. , the white space) T O correspond to erase blocks that have now been assigned junk files.

To quantify the benefit of the application module 1 1 - that is, how promptly the secure deletion of sensitive data occurs - measured is the expected time sensitive data remains on the storage medium. Calculated is this measurement using the secret writer that periodically writes one block secrets onto the medium and then deletes them, i s Subsequently it is computed how long the written secrets remained on the device.

Parameters for the application module 1 1 include the size of the junk files, and the free space thresholds when junk files are created and destroyed. These variables affect the total expected free space on the partition during execution, which will be in the range defined by the thresholds. This is typically, though not always, between the lower 20 threshold and the size of one junk file. The amount of free space on the drive is what affects both deletion time and the block allocation rate. To get an idea of how these parameters are affected, the simulation was run for different parameters and computed was the median erasure time and block allocation rate. Figure 6 shows the result of this experiment, which is a scatter plot with the median deletion time on the Y-axis and block allocations per hour on the X-axis; each point on the plot shows the results from one of the simulations. It can be observed in Figure 6 that these two quantities are inversely proportional. As the block allocations rate increases - due to less free space and thus more frequent garbage collection - the time decreases during which secrets remain on the device. -

Some representative configuration parameters were selected and investigated further. Table 1 shows measurements of the deletion time in hours for different amounts of free space (measured in erase blocks), as affected by using different parameters for the application module 1 1 . Each simulation was run eight times; the results were averaged and the 95% confidence intervals were computed. l- ' rec space Percentile Measurements

t ' er;ise Mocks ) Isi 50lh yoih V? Hi l Oih

No Ballooning 40.1S ± 3.93 48.76 ± 1.32 55.88 ± 2.24 56.99 ± 2.37 58.93 ± 2.32

250 7.92 ± 1.62 15.06 ± 1.51 22.82 ± 1.68 23.77 ± 1.61 25.03 ± 1.29

•50 0.08 ± 0.03 4.51 ± 0.57 8.37 ± 0.69 9.28 ± 1.03 10.54 ± 1.41

25 0.06 ± 0.05 2.3G ± 0.32 5.24 ± 0.81 6.25 ± 1.20 9.59 ± 1.79

10 0.02 ± 0.01 1.26 ± 0.18 3.14 ± 0.27 3.81 ± 0.43 17.42 ± 11.29

Table 2; Deletion time in hours for different configuration parameters.

It is observed that by filling all but 1 5% of the drive's capacity, much better secret erasure times are obtained than if the application module 1 1 is not used, and in the extreme case of ten free blocks, half the secrets are deleted in an hour and a quarter. From the measurements it was observed that assuring that all data is deleted in a timely fashion remains a challenge.

Since the simulated data being written is identical in expectation, it is observed that limiting the drive's spare capacity must result in more frequent and less optimal garbage collections. This is measured as the rate of block allocations, and in particular, the ratio between the expected rate and the observed rate represents the scale of the additional cost of application module 1 1 . Table 3 shows the results for block allocations (abbreviated as allocs) per hour using the same selected parameters for application module 1 1 as with Table 2.

5 It can be seen in Table 3 that limiting the available space has a significant impact on the block allocation rate. The first step, at 250 blocks free, has only a 61 % increase in block allocations and reasonably faster deletion. However, achieving deletion in less than an hour requires much more frequent block allocation. In the next section, consideration is given on how increased block allocations affect the wear on the device in terms of flash io memory and battery consumption.

Free space Block allocs Ratio

(erase blocks) per hour

No Ballooning 32.57 ± 1.13 Γ

250 52.64 ± 4.43 1.61

50 137.47 ± 26.92 422

25 196.00 ± 19.03 6.02

10 325.37 ± 36.84 9.99

Table 3: Block allocations per hour for different configuration parameters. The ratio column is the proportion with regards to not using the application module 1 1 .

The potential drawback of the application module 1 1 is the additional wear that i s increased erasures put on the device 1 , both in terms of damage to the flash memory and power consumption. If this approach significantly reduces the phone's lifetime or battery life, then it would be a concern for adoption. Therefore, experiments were performed to investigate this.

The additional wear is directly proportional to the increase in the block allocation rate, 20 and inversely proportional to the lifespan. The block allocation rate was converted into an expected lifetime in years using the conservative estimate of 1 04 erasures per block and 1 571 erase blocks available on the Android's data partition. A typical flash erase block can handle between 104 and 105 erasures [ 1 3]).

Table 4 shows the plot of the expected minimum lifespan of an Android phone running continuously at varying block allocation rates. It can be seen that device wear is not a concern even with the conservative estimate of block life time. A computerized device 1 running without the application module 1 1 can expect a lifespan of six decades - well beyond the replacement period of mobile phones. Even running the application module 1 1 with the most aggressive parameters will still result a lifetime of five years. It is observed that there exists a trade off between wear on the device and secure deletion, and so the user is enabled select his/her desired device lifespan and tune the security parameters accordingly.

Free space Expected minimum

(erase blocks) life time (years)

No Ballooning 55.1

250 34.1

50 1 1.7

f. O V. 11

10 5.5

Table 4: Expected minimum device lifetime at various block allocation rates.

To test if the application module 1 1 provides a feasible solution, the power consumption of write operations was checked. Measured was the battery level through the Android API, which gives its current charge as a percentage of its capacity. The experiment consisted of continuously writing data to the flash memory of the phone in a background service while monitoring the battery level in the foreground and measuring how much data must be written to drain ten percent of the total battery capacity. The experiment was run four times and the result was averaged. The resulting mean is within the range of 1 1 .0Ί +/-0.22 GB with a confidence of 95%, corresponding to 90483 full erase blocks worth of data. Since this well exceeds the total of 1 570 erase blocks on the Android's data partition, it can be assured that the experiment must have erased the blocks as well as written to them, Even using the most aggressive strategy by the application module 1 1 , where 325.37 blocks are allocated an hour, it will still take 1 1 .5 days for the application module 1 1 to consume ten percent of the battery. Furthermore, by looking at the built-in "Battery use" section in the phone's 'About' menu it could be seen that, after the experiment, the testing application was responsible for only 3% of battery usage, while the Android system accounted for 1 0% and the display for 87%. It was thus concluded that power consumption is not a concern for the use of the application module 1 1 . .

So far, it was shown that the use of the application module 1 1 reduces significantly the time that deleted sensitive data remains accessible on a log-structured file system, but it is still not possible to guarantee that a particular piece of sensitive data has been deleted at a certain time. Indeed, as was outlined for the maximum time measurement from Table 2, a secret can remain on the medium significantly longer than even the 95th percentile measurement. It would be useful to have a single user-space operation that guarantees the secure deletion of all previously erased data.

Ensuring secure deletion of all sensitive data on a log structured file system from user- space requires first deleting the sensitive data and then completely filling the drive with new data. Such an operation is always possible, provided the user is not subjected to disk-quota limitations, but it requires .garbage collecting nearly every block in the file system. This is because deleted chunks can occur in any erase block that sees active use, resulting in small data gaps throughout the file system. In Figure 7, simulated was a YAFFS drive without running the application module 1 1 , and at the end of the simulation the drive was filled completely. This is illustrated by the near immediate reallocation of every block on the medium after approximately 1 40 hours into the simulation.

By using the application module 1 1 , it is possible to reduce the number of blocks being actively used for writing data. This means that the set of blocks needing to be erased to achieve immediate deletion will be reduced. Any reduction would be an improvement over garbage collecting every erase block, and so performed was the following experiment to quantify how many blocks would need to be garbage collected when using the application module 1 1 . After executing the simulation, recorded was the exact number of block allocations. Also deleted was the current secret file, if it existed. To ensure that some secrets remained stored on the medium, the drive was unmounted and, using dd and grep, it was confirmed that the secrets still remained on the device. Then the drive was remounted and its remaining capacity was filled with the contents of /dev/zero until no more data could be written. Then the difference was taken between the current number of block allocations and the previously recorded value to determine how much additional work was performed. It was also verified that the secrets were in fact deleted by again unmounting the drive and unsuccessfully searching for the sensitive content.

Table 5 shows the results of this experiment. The average number of additional block allocations that was required to guarantee secure deletion, along with their 95% confidence intervals, is presented. The tests on the Android phone show that this operation occurs in less than a minute, and requires no significant power consumption. Free space Block

(erase blocks) allocations

No Ballooning 1557.20 ± 0.42

250 569.55 ± 1.63

50 406.00 ± 79.20

25 356.10 ± 65.92

10 187.12 ± 41.68

Table 5: Block allocations required to purge all secrets for different configurations.

The results show how many blocks were being actively used to store data at a particular time. The operation of filling the drive generally requires the compaction of every active erase block used by the file system. The application module 1 1 reduces the number of erase blocks being actively used, and thus reduces the overhead to perform this operation. The lack of variance in the large measurements suggests that the number of active blocks well exceeded the total number required by the file system to store data, thus producing a spread of data over all available blocks. The smaller measurements with greater variance suggest that the data was more frequently garbage collected and reorganized - since YAFFS performs no data lifetime analysis to collocate data, the variance is likely a result of the disparity between deletion times.

A further solution for secure deletion of data in a log structured file system is implemented at the kernel layer. In the kernel-based approach, the YAFFS file system was modified to achieve secure deletion. This models the scenario where a user of the computerized device is willing to install a custom kernel for their device and has already gained super-user access to the hardware. Thereby, provided is a simple, easily auditable, and small change to the file system to achieve secure deletion.

The principle behind NAND flash programming is that an erasure sets all bits to the value of binary 1 , and programming simply selects some bit positions to instead have the value of binary 0. It is not possible to program a zero into a one, as this operation requires erasing the corresponding erase block. Reprogramming an area of memory to contain only zeros is possible at a smaller granularity than erase blocks, and so researchers have proposed zero overwriting the flash memory [1 4]. The kernel-based approach requires both super-user permissions to install a new YAFFS version, and a non-standard use of flash. This kernel-based approach is attractive because it requires only a small change to YAFFS to enable guaranteed immediate secure deletion without causing any additional wear on the device. However, it requires programming a flash chunk multiple times, which may not be recommended in all specifications for NAND memory [3] due to the unreliable state of a chunk after multiple programming. Since there is no intention to reread the page that was overwritten with zeros, the data integrity concerns about reprogramming a flash page are somewhat mitigated. Figure 8 shows an example of how zero overwriting can remove sensitive information. In fact the original version of YAFFS ( YAFFS 1 ) used zero overwriting to mark an "is deleted flag" in chunks. During garbage collection, YAFFS 1 reread the reprogrammed chunk to determine if it needed to be copied or it could be deleted. This reprogramming was removed in YAFFS2 to allow operation on flash memory that forbids multiple overwrites, and a more complicated garbage collection involving memory lookups was used. The present kernel-based approach is similar, but the entire page is overwritten to zero to prevent the information from being available.

This kernel-based approach may still leak information through advanced forensic techniques, perhaps allowing an observer to determine how recently a gate was set to zero, thus indicating which bits were simultaneously reprogrammed. However, regardless of possible forensic leakage, zero-overwriting guarantees that the data is information theoretically deleted against software forensics.

The benefit of this kernel-based approach is that it requires no block erasures to delete the information. All data is promptly removed when it is no longer needed, and the flash hardware is not required to erase any additional blocks to achieve secure deletion. As such, there is no additional wear on the device itself - except for the minuscule power consumption required to program the chunk to contain only zeros.

In the kernel-based approach, the change to YAFFS is less than a dozen lines of C code, and is contained mostly in the YAFFS chunk del() function. This function handles the deletion, of a chunk from internal memory structures, and is invoked whenever a chunk is no longer needed by the file system, such as deleting a file, truncating it, or overwriting a part of it. The method was enhanced to overwrite the entire deleted chunk with zeros, using the same technique used to set the deleted flag in the tags - which is also implemented in the same function and used when the device is mounted as a YAFFS 1 drive. The other change is in the flash write function, where a kernel oops was removed - the kernel equivalent of a segmentation fault - that prohibited empty chunk tags. Alternatively, a second version of this function can be created which is called only to perform zero-overwriting, and the kernel oops is kept in place.

The kernel-based approach was implemented and its correctness tested by writing information, deleting it, and then searching the raw memory for the information using grep and hexdump. Raw data was collected from the NAND simulator by unmounting /dev/mtdblockO (the device that corresponds to the simulator) and using the program dd to copy the full contents. Raw data was collected from the Android phone by logging in as root, unmounting the flash drive, and copying the raw data using cat from /dev/mtd/mtd5 (the device that corresponds to the phone's data partition) to the phone's external memory. The resulting file was then copied to a PC and examined using grep and hexdump.

The deletion tests consisted of creating a file with some sensitive information and then erasing it in different ways. The following items were tested: a deleted file, a file completely overwritten, a file partially overwritten, a file completely truncated, and a file partially truncated. The tests using partial truncation and overwriting always erased the entire sensitive part of the file. Tests were done using block-aligned and block-unaligned overwriting and truncation. The tests were first run using the standard version of YAFFS, ensuring that the data was still recoverable. In each test, the sensitive data is completely erased from the file system, but remains accessible by unmounting it and reading the raw data. Using the modified version of YAFFS, it was found that the information was irrecoverable from both the file system and the underlying flash medium immediately after running the deletion tests. A trade off with this kernel-based approach is that deleting or truncating large files is not immediate, as the zero-overwriting happens in the foreground. It is also wasteful when a chunk is overwritten shortly before the entire erase block is erased. It would be possible to write a larger YAFFS patch that maintains a list of blocks that need to be zero overwritten, with a sanitization daemon running in the background; this approach is used to provide secure deletion to the ext2 file system [ 1 ]. Care would be needed to ensure that sanitization is not performed, if the erased block has been erased (and new data added) since it was queued for sanitization.

It should be noted that, in the description, the computer program code has been associated with specific functional modules and the sequence of the steps has been presented in a specific order, one skilled in the art will understand, however, that the computer program code may be structured differently and that the order of at least some of the steps could be altered, without deviating from the scope of the invention.

Secure Deletion from Flash Translation Layers

We have presented techniques to securely delete data from flash file systems. However, many consumer flash storage devices— such as 5D cards, solid state drives, and USB sticks— allow common block-structured file systems (e.g., FAT) to be used on the device without any concern for erase blocks, bad blocks, and garbage collection. This is accomplished through the use of a flash translation layer (FTL): a layer of indirection between the raw flash memory and the file system. An FTL handles all the nuances of flash memory's erase blocks internally, allowing any block-based file system to be mounted upon it.

Our approach of purging as a user-space mechanism to ensure the immediate secure deletion of data works for FTL devices as well as MTD devices. Once all the space on the device is consumed with live data, any previously deleted data will be securely removed. However, depending on the size of the device, filling the drive to capacity can be a time- consuming process. MicroSD cards, for example, are available with sizes of 32 GB and can be quite slow during writes. While it may be useful to perform a secure deletion operation whenever such devices are unmounted from a host computer, the cost in execution time may be too great.

If the FTL's erase blocks can be partitioned such that some of the erase blocks are assigned to one partition and the remaining erase blocks are assigned to another, then the following solution can achieve more effective secure deletion. The device is first partitioned into a large storage area and a small key area. When a file that is created, an encryption key is generated and stored on the smaller partition. The data for the file is encrypted using this key and the resulting encrypted data is stored on the large partition. No data is ever written to the large partition unless it is encrypted.

When deleting a file with this method, both the file's data and its secret key are deleted. To securely delete the file, the user can apply the additional step of performing purging on the key partition. Since the key partition is much smaller than the entire storage medium, the operation will be much quicker to perform than purging the entire drive. The resulting wear on the erase blocks assigned to that partition means that this area will likely develop bad blocks earlier. However, by creating a few small partitions, the user can migrate among them after they become unwritable, and set up the device initially with an expected lifetime that is sufficiently large.

This system can be implemented as a small change to the FAT file system driver. After partitioning the drive, logical volume managers can be used to amalgamate the two physical partitions into one logical partition, where any writes to the largest addresses on the logical partition will occur in the small key space. The logical partition is then formatted with our modified FAT file system, which is aware of the size and location of the key space. When data is written to or read from the file system, it will perform the necessary encryption and decryption operations using the correct key from the key partition, and store the data only in the data partition. When the drive is unmounted, or after the expiration of some period of time during which files were deleted, the file system will internally and automatically perform the purging operation by filling the blocks in the logical partition that correspond to the complete set of blocks in the physical key storage area.

UBI YAFFS and other flash file systems interact directly with the flash memory controller, through an interface called the Memory Technology Device (MTD) layer. This interface exposes functions to read and write areas in an erase block, erase an entire erase block, check if an erase block is bad, and mark an erase block as bad. It is a low-level interface and does not provide wear-levelling or bad block detection on its memory.

Recently, a new flash interface that allows in-place updates and removes the concerns of both wear-levelling and bad block detection has been developed. The new flash interface, called Unsorted Block Images ( U BI ), is a simple remapping layer above the MTD layer. Logical blocks at the U BI layer are mapped to physical blocks at the MTD layer.

U BI exposes the following interface: read and write to a logical erase blocks ( LEB ) , erase an entire LEB, and atomically update the contents of an entire LEB (i.e. , in-place edits at the erase block level) . It also allows dynamic creation of UBI partitions using unallocated LEBs. It is neither possible for an LEB to become bad, nor is wear-levelling a concern for LEBs. UBI permits more organization in the design of flash file systems, as there are no longer wear-levelling and bad block concerns for statically-placed super-blocks with specific file system metadata and the assignment of different regions of the storage medium to different purposes.

Underlying this interface is a simple mapping from LEBs to physical erase blocks ( PEBs), where PEBs correspond to erase blocks at the MTD layer. Wear monitoring is handled by accounting the erasures at the PEB level, and a transparent remapping of LEBs occurs when necessary. Remapping also occurs when a bad block is detected. Despite remapping, a LEB ' s number will remain constant regardless of changes to its corresponding PEB. UBI is primarily used by the UBI File System (UBIF5). UBIF5 significantly differs from previous flash-based file systems, making use of UBI's wear-levelling and atomic updates to create a more structured file system, with superblocks, storage areas, and a journal. It also has a checkpoint and replay feature where necessary data structures in memory are periodically written to a fixed location. The replay feature plays back changes made since the checkpoint, improving mounting time after an unsafe unmounting,

UBI-aware Flash File Systems

Regular flash file systems can be mounted on top of the UBI interface, which will act like an MTD interface with the exception that blocks will not become bad and wear levelling is effortlessly improved. With small modifications to the flash file system, the atomic update feature can also be used by the file system.

In this section we present methods to achieve secure deletion in flash file systems: the first being the trivial solution of atomicaily updating blocks, the second being a small change to UBI to use the bad block mechanism to achieve more efficient purging and ballooning.

UBI Atomic Update

UBI's admittance of atomic updates allows for a simple means to achieve secure deletion; when data is deleted, the file system atomicaily updates each erase block in which the data was stored so that it no longer contain the deleted data. UBI's implementation of the atomic update mechanism ensures that once the new block is written, the old one is erased in a timely fashion.

This approach suffers from the onerous frequency of updates. It causes an erase block to be erased each time any data is removed or overwritten, although this can be improved by batching deletions to be periodically purged. This approach also does not update the deleted space to contain new useful data. A more complicated solution to atomically update blocks with new data evolves towards an unnecessary reimplementation of the existing garbage collection mechanism.

Despite these concerns, this approach can be useful to allow guaranteed immediate secure deletion for any data from a file marked with a sensitive attribute. A file containing encryption keys, for instance, would be a good candidate for such a flag, and the user is assured that whenever any data is removed from the file, it will be purged from the storage medium.

UBI Purging and Ballooning

Purging achieved secure deletion by filling the drive to capacity, requiring writing to many locations on the disk that may have otherwise been empty. Ballooning achieved secure deletion by artificially reducing the size of the flash partition, thus reducing the period between a block's allocations. Both these solutions make use of the file system's garbage collection mechanism, which generally favours allocating empty blocks before compacting the available space on partially-empty blocks.

UBI exposes the ability to dynamically create partitions from unused logical blocks in its block pool. It is theoretically possible for UBI to dynamically grow or shrink the size of a partition--were it to know which blocks are not currently being used by the file system above it, and were the file system to know that its size is volatile.

We modified the UBI layer to withhold blocks from the flash file system mounted upon it. This achieved the same effect as ballooning's unnecessary files, and it also prevents the writing and deleting of these files. Similarly, when the file system is being purged, our withholding of blocks reduces the number of allocation that are made and thus reduces both the time it takes for the operation to complete and the wear on the device to achieve it.

This approach requires minimal changes to any MTD-based file system using this technique: the blocks are simply withheld and the file system is effectively given a smaller partition to write data. It only needs to be modified to test when a block is no longer being withheld.

Our solution uses the bad block feature of flash file systems as the method to withhold blocks. Recall that UBI automatically remaps bad blocks, so a naive flash file system (i.e., one using the UBI layer as an MTD layer) will make enquiries about bad blocks and always receive a negative response. We modify UBI to have the response to such enquiries reflect UBI's desire to withhold blocks from the file system. The flash file system needs a trivial change to allow the file system to permit the notion of a block once again becoming good.

For purging, UBI can be warned of this operation using its procfs interface, so it can withhold every block from being allocated to reduce the amount of data that needs to be written to the device during purging— a file system containing many empty blocks will not need to write to them, only the blocks containing data will be garbage collected. After the purging operation completes, the system returns to its previous state where all blocks are once again assignable.

For ballooning, UBI is told the optimal size of the empty space on the file system---this is equivalent to the aggressiveness in our ballooning strategy. When a block is erased, a flash file system will sequentially enquire if the erasure was successful or resulted in a destroyed erase block. If there is too much space available, then UBI will mark the LEB as withheld and indicate to the file system that it is unwritable; otherwise it will report a success. When UBI is asked if a block is bad, it returns false if the file system is in need of more space; otherwise it returns the current withholding status of the LEB.

UBIFS

The UBI file system ( UBIFS) is a sophisticated file system designed for flash memory accessed through the UBI device abstraction. UBIFS is an improvement over existing flash-based file systems, but was still not designed to provide secure deletion. In this section we examine methods to add secure deletion to UBIFS.

Secure Deletion at the File Level

One approach is to use a designated key area, which will be updated during the periodic checkpoint operation performed by UBIFS. Each file will be encrypted with a different key, and the file's inode gives the location of the key in the checkpoint area. Keys are only preserved in the checkpoint when their corresponding files have not been deleted; therefore a key for a deleted file that remains in the old checkpoint is not preserved into the new one. Since old checkpoints are immediately considered wasted space, the corresponding erase blocks that contain their data can be quickly reclaimed. This securely deletes the key and thus deletes the data in a computationally-secure manner.

Additional cryptographically-secure random data would be stored in the checkpoint, which will be used as keys for files that are created between checkpoint operations. This protects the availability of the data if the device crashes between checkpoints and the contents of RAM are lost— the keys are written to the device before they are used, and during the replay operation the file system can reunite the file with its key. The wear-levelling concerns of this technique are handled automatically by UBI, and this approach leverages the existing checkpoint and replay feature of the file system to easily add secure deletion. Since the checkpoint consists of the data normally stored in RAM, the keys for the files will always remain quickly accessible.

This solution is limited because truncations and overwrites of a file will not result in the secure deletion of the removed data. This is because the key for the file will remain accessible until the entire file is erased, regardless of how much data is deleted from the file. Long-lived files, such as media databases, will not be able to use this technique to securely delete data that has been removed from them. The next section describes a method to achieve secure deletion by encrypting each block separately.

Secure Deletion at the Block Level

The major concern with using per-file encryption keys is that long-lived files will not see their keys deleted despite data being deleted from the file. In this section we present a solution to associate each block with its own key, thus ensuring secure deletion of all deleted data.

Our proposal is to encrypt every block with a different key. Using 1 28-bit keys for 2048- byte blocks, this results in less than one percent of the drive's capacity being used for storing keys. When a block is no longer needed, then its corresponding key is marked as unneeded. Periodically, the key storage area is atomically updated to purge old keys, thus achieving secure deletion of the data they encrypted. Wear-levelling is not a concern as the LEBs that are assigned to store keys are not rigidly assigned to particular PEBs.

Encryption is handled entirely in the file system, so no unencrypted data is ever written to disk. Encryption occurs immediately after UBIFS applies optional compression to the contents of a data node. Similarly, decryption occurs immediately before UBIFS applies the decompression operation.

A small part of the file system is dedicated solely to the storage of keys. This data area is itself not encrypted, and so only by securely deleting the data are we assured that it is no longer accessible. The key space is written in advance: a full erase block of random data is written, and the keys are then later assigned out from the random data as they are needed.

Such a design requires a bidirectional map: data blocks must be able to find their corresponding encryption keys, and it should be simple to determine if a particular encryption key is being used or not.

Mapping a data block to a key is relatively straightforward. We need a simple function that takes the data available and returns a position in the key storage area. Each block has index data, which for data blocks indicates the file's inode and the offset in the file where the data is located. UBIFS is designed to support optionally larger index keys---an extra 64 bits---which is ample to use to reference the offset in the key space where the block's key is located. There is also 32-bits of zeros left as padding that can also be used for this purpose.

The reverse mapping is a function that, given a stored key, returns whether the key is unassigned, assigned, or should be deleted. It does not need to map to the block that uses it— orphaned keys can be found when doing a consistency check on the index data of each block. Such a map can be stored in memory with two bits for each block; as flash blocks are often 2048 bytes large, this storage is nominal. There are two challenges with implementing the reverse mapping. First, the key data is already written onto the blocks before it is actually used. Each block in the key space can store many keys. We cannot update the key's value after it is assigned or otherwise changes state; the mapping of keys to their states must be stored elsewhere. Second, we want to be able to reconstruct the mapping during the drive's mounting without necessitating a complete scan of every data block on the file system. UBIFS's has an advantage over other flash file systems in that its checkpoint and replay mechanisms allow the file system to be quickly mounted even if it was unsafely unmounted. Our solution cannot obviate this advantage.

We implement this mapping as a 2-bit map over the space of key positions. Each key is given two bits to record its state, and this mapping is kept in memory and updated as appropriate: when fresh random data is written, then the key is marked as unassigned; when a new data block is written, an unassigned key is assigned to it and marked as assigned; when a block is marked as deleted, then the corresponding key is set to deleted to indicate that it should be purged. If an erase block containing deleted data blocks is erased, then any keys that are marked as deleted can be returned to unassigned as the data that they encrypted has been securely removed and so the entropy can once again be used for new keys.

The reverse mapping is written, along with the other in-memory data structures, during the periodic checkpoint operation. This means that the checkpoint contains the key states for all data blocks at the time of checkpointing. This is loaded into memory when the file system is mounted. In the event of power loss, the replay mechanism can be used along with the checkpoint to restore the map to its correct state. Each new block that is written will indicate the key position it uses, and blocks that were erased since checkpointing are also determined by replaying the sequence of updates. References

[I ] 5. Bauer and N. B. Priyantha. Secure Data Deletion for Linux File Systems. Usenix Security Symposium, 2001 .

[2] R ' emy Card, Theodore Ts'o, and Stephen Tweedie. Design and Implementation of the Second Extended Filesystem. Proceedings of the First Dutch International Symposium, 1 993.

[3] Charles Manning. How YAFFS Works.

[4] 5imson L. Garfinkel and Abhi Shelat. Remembrance of Data Passed: A Study of Disk

Sanitization Practices. IEEE Security and Privacy, 2003.

[5] GN U Privacy Guard. gpg( 1 ) - Linux man page.

[5] Adrian Hunter. A Brief Introduction to the Design of U BIFS. 2008.

[7] Nikolai Joukov, Harry Papaxenopoulos, and Erez Zadok. Secure Deletion Myths,

Issues, and Solutions. ACM workshop on Storage security and survivability, 2006.

[8] Nikolai Joukov and Erez Zadokstony. Adding Secure Deletion to Your Favorite File

System. IEEE Security In Storage Workshop, 2005.

[9] Jaeheung Lee, Sangho Yi, Junyoung Heo, and Hyungbae Park. An Efficient Secure

Deletion Scheme for Flash File Systems. Journal of Information Science and

Engineering, 201 0.

[ 1 0] Avantika Mathur, Mingming Cao, Suparna Bhattacharya, Andreas Dilger, Alex Tomas, and Laurent Vivier. The new ext4 filesystem: current status and future plans.

[ I I ] Colin Plumb. shred( 1 ) - linux man page.

[ 1 2] Mendel Rosenblum and John K. Ousterhout. The Design and Implementation of a

Log-Structured File System. ACM Transactions on Computer Systems, 1 992.

[ 1 3] Radu Stoica, Manos Athanassoulis, Ryan Johnson, and Anastasia Ailamaki.

Evaluating and Repairing Write Performance on Flash Devices. In Workshop on

Data Management on New Hardware, 2009.

[ 1 4] Kyoungmoon Sun, Jongmoo Choi, Donghee Lee, and S.H. Noh. Models and Design of an Adaptive Hybrid Scheme for Secure Deletion of Data in Consumer Electronics.

IEEE Transactions on Consumer Electronics, 2008.

[ 1 5] David Woodhouse. JFFS : The Journalling Flash File System. 2001 .