Title:
NON-VOLATILE MEMORY HAVING 3D ARRAY ARCHITECTURE WITH STAIRCASE WORD LINES AND VERTICAL BIT LINES AND METHODS THEREOF
Document Type and Number:
WIPO Patent Application WO/2013/188563
Kind Code:
A1
Abstract:
In a 3D nonvolatile memory with memory elements arranged in a three-dimensional pattern defined by rectangular coordinates having x, y and z-directions and with a plurality of parallel planes from a bottom plane to a top plane stacked in the z-direction over a semiconductor substrate; a plurality of local bit lines elongated in the z-direction through the plurality of layers and arranged in a two-dimensional rectangular array of bit line pillars having rows in the x-direction and columns in the y-direction; the 3D nonvolatile memory further having a plurality of staircase word lines spaced apart in the y-direction and between and separated from the plurality of bit line pillars at a plurality of crossings, individual staircase word lines each having a series of alternating steps and risers elongated respectively in the x-direction and z-direction traversing across the plurality of planes in the z-direction with a segment in each plane.

Inventors:
CERNEA, Raul-Adrian (889 Agnew Road, Santa Clara, California, 95054, US)
SAMACHISA, George (6858 Castlerock Drive, San Jose, California, 95120, US)
Application Number:
US2013/045466
Publication Date:
December 19, 2013
Filing Date:
June 12, 2013
Assignee:
SANDISK 3D LLC (951 SanDisk Drive, Milpitas, California, 95035, US)
International Classes:
H01L27/24; G11C13/00; H01L45/00
Domestic Patent References:
WO2010117914A1 2010-10-14
Foreign References:
US20120147644A1 2012-06-14
US20110115049A1 2011-05-19
US20030027419A1 2003-02-06
US20060184720A1 2006-08-17
US5570315A 1996-10-29
US5774397A 1998-06-30
US6046935A 2000-04-04
US6373746B1 2002-04-16
US6456528B1 2002-09-24
US6522580B2 2003-02-18
US6643188B2 2003-11-04
US6771536B2 2004-08-03
US6781877B2 2004-08-24
US7342279B2 2008-03-11
US7023739B2 2006-04-04
US7177191B2 2007-02-13
US20090001344A1 2009-01-01
US5172338A 1992-12-15
US6222762B1 2001-04-24
US6538922B1 2003-03-25
US6678192B2 2004-01-13
US7237074B2 2007-06-26
US7324393B2 2008-01-29
US20120147650A1 2012-06-14
Other References:
KOZICKI ET AL.: "Multi-bit Memory Using Programmable Metallization Cell Technology", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ELECTRONIC DEVICES AND MEMORY, GRENOBLE, FRANCE, 12 June 2005 (2005-06-12), pages 48 - 53
SCHROGMEIER ET AL.: "Symposium on VLSI Circuits", 2007, article "Time Discrete Voltage Sensing and Iterative Programming Control for a 4F2 Multilevel CBRAM"
Attorney, Agent or Firm:
YAU, Philip et al. (Davis Wright Tremaine LLP, 505 Montgomery Street Suite 80, San Francisco California, 94111, US)
Claims:
IT IS CLAIMED:

1. A non-volatile memory, comprising: memory elements arranged in a three-dimensional pattern defined by rectangular coordinates having x, y and z-directions and with a plurality of parallel planes from a bottom plane to a top plane stacked in the z-direction over a semiconductor substrate; a plurality of local bit lines elongated in the z-direction through the plurality of parallel planes and arranged in a two-dimensional rectangular array of bit line pillars having rows in the x-direction and columns in the y-direction; a plurality of staircase word lines spaced apart in the y-direction and between and separated from the plurality of bit line pillars at a plurality of crossings, individual staircase word lines each having a series of alternating steps and risers elongated respectively in the x-direction and z-direction traversing across the plurality of planes in the z-direction with a segment in each plane; and a plurality of non-volatile re-programmable memory elements individually connected through circuits between the bit line pillars and the staircase word lines adjacent the crossings thereof.

2. The non-volatile memory as in claim 1, wherein: adjacent staircase word lines of said plurality of staircase word lines are staggered and spaced apart in the x-direction by a predetermined offset.

3. The non-volatile memory as in claim 1, wherein: said predetermined offset is a spacing between two bit line pillars of the row of bit line pillars.

4. The non-volatile memory as in claim 1, wherein: said predetermined offset is half of a spacing between two bit line pillars of the row of bit line pillars.

5. The non-volatile memory as in claim 1, wherein: each segment of the staircase word line crosses a segment of bit line pillars; and the segment of staircase word line has at least one riser raising the segment prior to crossing a last one of the bit line pillars in the segment.

6. The non-volatile memory as in claim 1, wherein: each segment of the staircase word line crosses a segment of bit line pillars; and the segment of staircase word line has a riser after crossing each of the bit line pillars in the segment.

7. The non-volatile memory as in claim 1, further comprising an associated word line driver above the top plane connected to each staircase word line.

8. The non-volatile memory as in claim 7, wherein: the associated word line driver has a size along the x-direction fitting within a segment.

9. The non-volatile memory as in claim 1, further comprising an associated word line driver below the bottom plane connected to each staircase word line.

10. The non-volatile memory as in claim 9, wherein: the associated word line driver has a size along the x-direction fitting within a segment.

11. A method of operating a non-volatile memory, comprising: providing memory elements arranged in a three-dimensional pattern defined by rectangular coordinates having x, y and z-directions and with a plurality of parallel planes stacked in the z-direction over a semiconductor substrate; providing a plurality of local bit lines elongated in the z-direction through the plurality of parallel planes and arranged in a two-dimensional rectangular array of bit line pillars having rows in the x-direction and columns in the y-direction; providing a plurality of staircase word lines spaced apart in the y-direction and between and separated from the plurality of bit line pillars at a plurality of crossings, individual staircase word lines each having a series of alternating steps and risers elongated respectively in the x-direction and z-direction traversing across the plurality of planes in the z-direction with a segment in each plane; providing a plurality of non-volatile re-programmable memory elements individually connected through circuits between the bit line pillars and the staircase word lines adjacent the crossings thereof; and wherein voltages are applied through the plurality of local bit lines and the plurality of staircase word lines to effect memory operations on a selected group of non-volatile re-programmable memory elements.

12. The method as in claim 1, wherein: adjacent staircase word lines of said plurality of staircase word lines are staggered and spaced apart in the x-direction by a predetermined offset.

13. The method as in claim 1, wherein: said predetermined offset is a spacing between two bit line pillars of the row of bit line pillars.

14. The method as in claim 1, wherein: said predetermined offset is half of a spacing between two bit line pillars of the row of bit line pillars.

15. The method as in claim 1, wherein: each segment of the staircase word line crosses a segment of bit line pillars; and the segment of staircase word line has at least one riser raising the segment prior to crossing a last one of the bit line pillars in the segment.

16. The method as in claim 1, wherein: each segment of the staircase word line crosses a segment of bit line pillars; and the segment of staircase word line has a riser after crossing each of the bit line pillars in the segment.

17. The method as in claim 1, further comprising an associated word line driver above the top plane connected to each staircase word line.

18. The method as in claim 17, wherein: the associated word line driver has a size along the x-direction fitting within a segment.

19. The method as in claim 1, further comprising an associated word line driver below the bottom plane connected to each staircase word line.

20. The method as in claim 19, wherein: the associated word line driver has a size along the x-direction fitting within a segment.

Description:
NON-VOLATILE MEMORY HAVING 3D ARRAY ARCHITECTURE WITH STAIRCASE WORD LINES AND VERTICAL BIT LINES AND METHODS THEREOF

BACKGROUND

[0001] The subject matter of this application is the structure, use and making of reprogrammable non-volatile memory cell arrays, and, more specifically, three-dimensional arrays of memory storage elements formed on and above semiconductor substrates.

[0002] Uses of re-programmable non-volatile mass data storage systems utilizing flash memory are widespread for storing data of computer files, camera pictures, and data generated by and/or used by other types of hosts. A popular form of flash memory is a card that is removably connected to the host through a connector. There are many different flash memory cards that are commercially available, examples being those sold under trademarks CompactFlash (CF), the MultiMediaCard (MMC), Secure Digital (SD), miniSD, microSD, Memory Stick, Memory Stick Micro, xD-Picture Card, SmartMedia and TransFlash. These cards have unique mechanical plugs and/or electrical interfaces according to their specifications, and plug into mating receptacles provided as part of or connected with the host.

[0003] Another form of flash memory system in widespread use is the flash drive, which is a hand held memory system in a small elongated package that has a Universal Serial Bus (USB) plug for connecting with a host by plugging it into the host's USB receptacle. SanDisk Corporation, assignee hereof, sells flash drives under its Cruzer, Ultra and Extreme Contour trademarks. In yet another form of flash memory system, a large amount of memory is permanently installed within host systems, such as within a notebook computer in place of the usual disk drive mass data storage system. Each of these three forms of mass data storage systems generally includes the same type of flash memory arrays. Each also usually contains its own memory controller and drivers, but there are also some memory-only systems that are instead controlled at least in part by software executed by the host to which the memory is connected. The flash memory is typically formed on one or more integrated circuit chips and the controller on another circuit chip. But in some memory systems that include the controller, especially those embedded within a host, the memory, controller and drivers are often formed on a single integrated circuit chip.

[0004] There are two primary techniques by which data are communicated between the host and flash memory systems. In one of them, addresses of data files generated or received by the system are mapped into distinct ranges of a continuous logical address space established for the system. The extent of the address space is typically sufficient to cover the full range of addresses that the system is capable of handling. As one example, magnetic disk storage drives communicate with computers or other host systems through such a logical address space. The host system keeps track of the logical addresses assigned to its files by a file allocation table (FAT) and the memory system maintains a map of those logical addresses into physical memory addresses where the data are stored. Most memory cards and flash drives that are commercially available utilize this type of interface since it emulates that of magnetic disk drives with which hosts have commonly interfaced.

[0005] In the second of the two techniques, data files generated by an electronic system are uniquely identified and their data logically addressed by offsets within the file. These file identifiers are then directly mapped within the memory system into physical memory locations. Both types of host/memory system interfaces are described and contrasted elsewhere, such as in patent application publication no. US 2006/0184720 A1.

[0006] Flash memory systems typically utilize integrated circuits with arrays of memory cells that individually store an electrical charge that controls the threshold level of the memory cells according to the data being stored in them. Electrically conductive floating gates are most commonly provided as part of the memory cells to store the charge but dielectric charge trapping material is alternatively used. A NAND architecture is generally preferred for the memory cell arrays used for large capacity mass storage systems. Other architectures, such as NOR, are typically used instead for small capacity memories. Examples of NAND flash arrays and their operation as part of flash memory systems may be had by reference to United States patents nos. 5,570,315, 5,774,397, 6,046,935, 6,373,746, 6,456,528, 6,522,580, 6,643,188, 6,771,536, 6,781,877 and 7,342,279.

[0007] The amount of integrated circuit area necessary for each bit of data stored in the memory cell array has been reduced significantly over the years, and the goal remains to reduce this further. The cost and size of the flash memory systems are therefore being reduced as a result. The use of the NAND array architecture contributes to this but other approaches have also been employed to reduce the size of memory cell arrays. One of these other approaches is to form, on a semiconductor substrate, multiple two-dimensional memory cell arrays, one on top of another in different planes, instead of the more typical single array. Examples of integrated circuits having multiple stacked NAND flash memory cell array planes are given in United States patents nos. 7,023,739 and 7,177,191.

[0008] Another type of re-programmable non-volatile memory cell uses variable resistance memory elements that may be set to either conductive or non-conductive states (or, alternately, low or high resistance states, respectively), and some additionally to partially conductive states, and that remain in that state until subsequently re-set to the initial condition. The variable resistance elements are individually connected between two orthogonally extending conductors (typically bit and word lines) where they cross each other in a two-dimensional array. The state of such an element is typically changed by proper voltages being placed on the intersecting conductors. Since these voltages are necessarily also applied to a large number of other unselected resistive elements because they are connected along the same conductors as the selected elements being programmed or read, diodes are commonly connected in series with the variable resistive elements in order to reduce leakage currents that can flow through them. The desire to perform data reading and programming operations with a large number of memory cells in parallel results in reading or programming voltages being applied to a very large number of other memory cells. An example of an array of variable resistive memory elements and associated diodes is given in patent application publication no. US 2009/0001344 A1.

SUMMARY OF THE INVENTION

3D ARRAY ARCHITECTURE WITH STAIRCASE WORD LINES

[0009] In a 3D nonvolatile memory with memory elements arranged in a three-dimensional pattern defined by rectangular coordinates having x, y and z-directions and with a plurality of parallel planes from a bottom plane to a top plane stacked in the z-direction over a semiconductor substrate; a plurality of local bit lines elongated in the z-direction through the plurality of layers and arranged in a two-dimensional rectangular array of bit line pillars having rows in the x-direction and columns in the y-direction; the 3D nonvolatile memory further having a plurality of staircase word lines spaced apart in the y-direction and between and separated from the plurality of bit line pillars at a plurality of crossings, individual staircase word lines each having a series of alternating steps and risers elongated respectively in the x-direction and z-direction traversing across the plurality of planes in the z-direction with a segment in each plane.

[0010] In another embodiment, adjacent staircase word lines are offset in the x-direction by the pitch of the local bit line.

[0011] In yet another embodiment, adjacent staircase word lines are offset in the x-direction by half the pitch of the local bit line.

[0012] In yet another embodiment, an associated word line driver is provided above the top plane to connect to each staircase word line.

[0013] In yet another embodiment, an associated word line driver is provided below the bottom plane to connect to each staircase word line.

[0014] In yet another embodiment, the associated word line driver has a size along the x-direction fitting within a segment.

[0015] In yet another embodiment, each segment of the staircase word line crosses a segment of bit line pillars; and the segment of staircase word line has at least one riser raising the segment prior to crossing a last one of the bit line pillars in the segment.

[0016] In yet another embodiment, each segment of the staircase word line crosses a segment of bit line pillars; and the segment of staircase word line has a riser after crossing each of the bit line pillars in the segment.
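
As a rough illustration of the staircase geometry summarized above, the following Python sketch computes which plane a staircase word line occupies at each bit line pillar position along the x-direction. The function names, the zero-based indexing, the parameter `pillars_per_segment`, and the choice that the word line starts in the bottom plane and rises by exactly one plane per riser are assumptions made for illustration only; they are not taken from the application.

```python
def staircase_plane(x, offset, pillars_per_segment, num_planes):
    """Return the plane index (0 = bottom) that a staircase word line
    occupies when it crosses bit line pillar position x, or None if the
    word line does not cross that pillar.

    Assumptions (hypothetical, for illustration only):
      - the word line enters the bottom plane at pillar position `offset`;
      - each step crosses `pillars_per_segment` pillars within one plane;
      - a riser then lifts the word line to the next plane up.
    """
    if x < offset:
        return None  # the word line has not started yet
    plane = (x - offset) // pillars_per_segment
    if plane >= num_planes:
        return None  # the word line has already left the top plane
    return plane


def staircase_profile(offset, pillars_per_segment, num_planes):
    """List the (x, plane) crossings of one staircase word line."""
    length = pillars_per_segment * num_planes
    return [
        (x, staircase_plane(x, offset, pillars_per_segment, num_planes))
        for x in range(offset, offset + length)
    ]


# Two adjacent word lines staggered in x by one bit line pitch.
wl0 = staircase_profile(offset=0, pillars_per_segment=2, num_planes=4)
wl1 = staircase_profile(offset=1, pillars_per_segment=2, num_planes=4)
```

Under these assumptions, setting `pillars_per_segment=1` would model the variant in which a riser follows every bit line pillar crossing, and a half-pitch stagger would need a finer position grid than the integer pillar index used here.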

[0017] The advantage of the staircase word line architecture is that word lines from different memory layers can be accessed easily as each eventually can be accessed and selected from either the top or bottom of the 3D memory array. In this way, interconnect and decoding are much simplified and there is no need to have multiple word lines connected in parallel to share limited resources. This will avoid the individual word line drivers having to drive a large load in the form of a large number of word lines in parallel and the ICC can be reduced. Within a given ICC budget, the performance is improved. Drivers implemented by devices with weaker drive power than CMOS devices may be used. Also, by avoiding multiple word lines connected in parallel, disturb is reduced.

[0018] Various aspects, advantages, features and details of the innovative three-dimensional variable resistive element memory system are included in a description of exemplary embodiments thereof that follows, which description should be taken in conjunction with the accompanying drawings.

[0019] All patents, patent applications, articles, other publications, documents and things referenced herein are hereby incorporated herein by this reference in their entirety for all purposes. To the extent of any inconsistency or conflict in the definition or use of terms between any of the incorporated publications, documents or things and the present application, those of the present application shall prevail.

BRIEF DESCRIPTION OF THE DRAWINGS

[0020] FIG. 1 illustrates schematically an architecture of a three-dimensional memory in the form of an equivalent circuit of a portion of such a memory.

[0021] FIG. 2 is a block diagram of an illustrative memory system that can use the three-dimensional memory of FIG. 1.

[0022] FIG. 3 provides plan views of the two planes and substrate of the three-dimensional array of FIG. 1, with some structure added.

[0023] FIG. 4 is an expanded view of a portion of one of the planes of FIG. 3, annotated to show effects of programming data therein.

[0024] FIG. 5 is an expanded view of a portion of one of the planes of FIG. 3, annotated to show effects of reading data therefrom.

[0025] FIG. 6 illustrates an example memory storage element.

[0026] FIG. 7 illustrates the read bias voltages and current leakage across multiple planes of the 3D memory shown in FIG. 1 and FIG. 3.

[0027] FIG. 8 illustrates schematically a single-sided word line architecture.

[0028] FIG. 9 illustrates one plane and substrate of the 3D array with the single-sided word line architecture.

[0029] FIG. 10 illustrates the elimination of leakage currents in the single-sided word-line architecture 3-D array of FIG. 8 and FIG. 9.

[0030] FIG. 11A illustrates the local bit line LBL11 is coupled to the sense amplifier via a segment of global bit line GBL1 having a length y1.

[0031] FIG. 11B illustrates the local bit line LBL13 is coupled to the sense amplifier via a segment of global bit line GBL1 having a length y2.

[0032] FIG. 12 illustrates the resistance along a circuit path of a selected cell M between a word line driver and a sense amplifier.

[0033] FIG. 13 illustrates a bit line control circuit that keeps the bit line voltage fixed to a reference voltage.

[0034] FIG. 14 is an isometric view of a portion of the 3D array with a structure having staircase word lines.

[0035] FIG. 15 illustrates a cross-section view of the 3D array along the y-direction according to an embodiment in which the word line step to the next memory layer is made in between the bit lines.

[0036] FIG. 16 illustrates a cross-section view of the 3D array along the y-direction according to an embodiment in which the various staggered word line steps are stacked as close as possible.

[0037] FIG. 17 illustrates from top to bottom a series of process steps to fabricate a 3D array with staircase word lines.

[0038] FIG. 18 illustrates a word line driver formed as a vertical structure on top of the 3D array of memory layers.

[0039] FIG. 19A is a schematic illustration of a cross-section view of the efficient 3D array projected on the x-z plane.

[0040] FIG. 19B illustrates the device structure of the efficient 3D array shown schematically in FIG. 19A.

[0041] FIG. 20 is a schematic illustration of a cross-section view of the efficient 3D array projected on the x-z plane according to another embodiment.

[0042] FIG. 21 is an isometric view of a portion of the efficient 3D array shown in FIG. 19.

[0043] FIG. 22A illustrates one of a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where a masking layer is laid over the word line layer to enable trenches to be etched in the word line layer.

[0044] FIG. 22B illustrates one of a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where trenches are etched in the word line layer.

[0045] FIG. 22C illustrates a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where an oxide layer is deposited on top of the word line layer, followed by a masking layer.

[0046] FIG. 22D illustrates a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where trenches are etched in the oxide layer.

[0047] FIG. 22E illustrates a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where a second word line layer is formed on top of the oxide layer and making connection with the lower word line layer through the trenches in the oxide layer.

[0048] FIG. 22F illustrates a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where trenches are etched in the second word line layer.

[0049] FIG. 22G illustrates a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where the process repeats itself as in that shown in FIG. 22C for the next layer of oxide and masking layer to build up the staircase structure of the word line.

[0050] FIG. 22H illustrates a series of process steps to fabricate the efficient 3D array shown in FIG. 19, including where the process repeats itself as in that shown in FIG. 22D where trenches are etched in the oxide layer in order to build up progressively the staircase structure of the word line.

[0051] FIG. 23 illustrates the biasing condition for setting or resetting a R/W element.

DESCRIPTION OF EXEMPLARY EMBODIMENTS

[0052] Referring initially to FIG. 1, an architecture of a three-dimensional memory 10 is schematically and generally illustrated in the form of an equivalent circuit of a portion of such a memory. This is a specific example of the three-dimensional array summarized above. A standard three-dimensional rectangular coordinate system 11 is used for reference, the directions of each of vectors x, y and z being orthogonal with the other two.

[0053] A circuit for selectively connecting internal memory elements with external data circuits is preferably formed in a semiconductor substrate 13. In this specific example, a two-dimensional array of select or switching devices Qxy is utilized, where x gives a relative position of the device in the x-direction and y its relative position in the y-direction. The individual devices Qxy may be a select gate or select transistor, as examples. Global bit lines (GBLx) are elongated in the y-direction and have relative positions in the x-direction that are indicated by the subscript. The global bit lines (GBLx) are individually connectable with the source or drain of the select devices Q having the same position in the x-direction, although during reading and also typically programming only one select device connected with a specific global bit line is turned on at a time. The other of the source or drain of the individual select devices Q is connected with one of the local bit lines (LBLxy). The local bit lines are elongated vertically, in the z-direction, and form a regular two-dimensional array in the x (row) and y (column) directions.

[0054] In order to connect one set (in this example, designated as one row) of local bit lines with corresponding global bit lines, control gate lines SGy are elongated in the x-direction and connect with control terminals (gates) of a single row of select devices Qxy having a common position in the y-direction. The select devices Qxy therefore connect one row of local bit lines (LBLxy) across the x-direction (having the same position in the y-direction) at a time to corresponding ones of the global bit lines (GBLx), depending upon which of the control gate lines SGy receives a voltage that turns on the select devices to which it is connected. The remaining control gate lines receive voltages that keep their connected select devices off. It may be noted that since only one select device (Qxy) is used with each of the local bit lines (LBLxy), the pitch of the array across the semiconductor substrate in both x and y-directions may be made very small, and thus the density of the memory storage elements large.
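
The row selection described above can be sketched in a few lines of Python. This is only an illustrative model: the function name `connect_row`, the zero-based indices, and the representation of the connections as a dictionary are assumptions, not part of the application.

```python
def connect_row(selected_y, num_x, num_y):
    """Model which local bit lines are connected to which global bit lines
    when the control gate line at position `selected_y` turns its select
    devices on and every other control gate line keeps its devices off.

    Returns a dict mapping each global bit line index x to the (x, y)
    position of the single local bit line connected to it.
    Indices are zero-based (an assumption made for this sketch).
    """
    if not 0 <= selected_y < num_y:
        raise ValueError("selected_y is outside the array")
    # One select device per local bit line: global bit line x reaches only
    # the local bit line at (x, selected_y).
    return {x: (x, selected_y) for x in range(num_x)}


# Example: a 3-wide, 2-deep array with the second row selected.
connections = connect_row(selected_y=1, num_x=3, num_y=2)
```

In this model each global bit line is connected to exactly one local bit line at a time, which mirrors the statement that only one select device per global bit line is turned on during reading and, typically, programming.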

[0055] Memory storage elements Mzxy are formed in a plurality of planes positioned at different distances in the z-direction above the substrate 13. Two planes 1 and 2 are illustrated in FIG. 1 but there will typically be more, such as 4, 6 or even more. In each plane at distance z, word lines WLzy are elongated in the x-direction and spaced apart in the y-direction between the local bit lines (LBLxy). The word lines WLzy of each plane individually cross two adjacent local bit lines LBLxy on either side of the word lines. The individual memory storage elements Mzxy are connected between one local bit line LBLxy and one word line WLzy adjacent these individual crossings. An individual memory element Mzxy is therefore addressable by placing proper voltages on the local bit line LBLxy and word line WLzy between which the memory element is connected. The voltages are chosen to provide the electrical stimulus necessary to cause the state of the memory element to change from an existing state to the desired new state. The levels, duration and other characteristics of these voltages depend upon the material that is used for the memory elements.
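
A minimal Python sketch of this addressing scheme follows. It assumes, for illustration, that the element at plane z, row position x and column position y sits between the local bit line at (x, y) and the word line at (z, y), as the subscripts in the paragraph above suggest. The names `Selection` and `select_element` are hypothetical, and the voltage arguments are placeholders, since the text leaves the actual levels dependent on the memory element material.

```python
from typing import NamedTuple, Tuple


class Selection(NamedTuple):
    local_bit_line: Tuple[int, int]  # (x, y) position of the local bit line
    word_line: Tuple[int, int]       # (z, y) position of the word line
    voltage_across: float            # bit line voltage minus word line voltage


def select_element(z, x, y, v_bit_line, v_word_line):
    """Identify the two conductors that address one memory element and the
    voltage that appears across it for the given conductor voltages.

    The voltage values are caller-supplied placeholders; suitable levels
    and durations depend on the memory element material.
    """
    return Selection(
        local_bit_line=(x, y),
        word_line=(z, y),
        voltage_across=v_bit_line - v_word_line,
    )


# Example with arbitrary placeholder voltages.
sel = select_element(z=1, x=2, y=3, v_bit_line=1.0, v_word_line=0.25)
```

The sketch covers only the selected element; it does not model the voltages seen by unselected elements that share the same word line or local bit line.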

[0056] Each "plane" of the three-dimensional memory cell structure is typically formed of at least two layers, one in which the conductive word lines WLzy are positioned and another of a dielectric material that electrically isolates the planes from each other. Additional layers may also be present in each plane, depending for example on the structure of the memory elements Mzxy. The planes are stacked on top of each other on a semiconductor substrate with the local bit lines LBLxy being connected with storage elements Mzxy of each plane through which the local bit lines extend.

[0057] FIG. 2 is a block diagram of an illustrative memory system that can use the three-dimensional memory 10 of FIG. 1. Sense amplifier and I/O circuits 21 are connected to provide (during programming) and receive (during reading) analog electrical quantities in parallel over the global bit lines GBLx of FIG. 1 that are representative of data stored in addressed storage elements Mzxy. The circuits 21 typically contain sense amplifiers for converting these electrical quantities into digital data values during reading, which digital values are then conveyed over lines 23 to a memory system controller 25. Conversely, data to be programmed into the array 10 are sent by the controller 25 to the sense amplifier and I/O circuits 21, which then program that data into addressed memory elements by placing proper voltages on the global bit lines GBLx. For binary operation, one voltage level is typically placed on a global bit line to represent a binary "1" and another voltage level to represent a binary "0". The memory elements are addressed for reading or programming by voltages placed on the word lines WLzy and select gate control lines SGy by respective word line select circuits 27 and local bit line circuits 29. In the specific three-dimensional array of FIG. 1, the memory elements lying between a selected word line and any of the local bit lines LBLxy connected at one instance through the select devices Qxy to the global bit lines GBLx may be addressed for programming or reading by appropriate voltages being applied through the select circuits 27 and 29.
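
For binary operation, the mapping from data bits to global bit line voltage levels can be pictured with the short Python sketch below. The function name `program_voltages` and the two voltage parameters are hypothetical; the application does not state specific levels, so the caller supplies them.

```python
def program_voltages(bits, v_one, v_zero):
    """Map each data bit to the voltage level placed on its global bit line.

    bits   : sequence of 0/1 values, one per global bit line
    v_one  : placeholder voltage level representing a binary "1"
    v_zero : placeholder voltage level representing a binary "0"
    """
    levels = []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("binary operation expects bits of 0 or 1")
        levels.append(v_one if bit == 1 else v_zero)
    return levels


# Example with arbitrary placeholder levels, one entry per global bit line.
levels = program_voltages([1, 0, 1], v_one=2.0, v_zero=0.0)
```

Because the levels are applied in parallel over the global bit lines, the list has one entry per global bit line taking part in the operation.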

[0058] The memory system controller 25 typically receives data from and sends data to a host system 31. The controller 25 usually contains an amount of random-access memory (RAM) 34 for temporarily storing such data and operating information. Commands, status signals and addresses of data being read or programmed are also exchanged between the controller 25 and host 31. The memory system operates with a wide variety of host systems. They include personal computers (PCs), laptop and other portable computers, cellular telephones, personal digital assistants (PDAs), digital still cameras, digital movie cameras and portable audio players. The host typically includes a built-in receptacle 33 for one or more types of memory cards or flash drives that accepts a mating memory system plug 35 of the memory system but some hosts require the use of adapters into which a memory card is plugged, and others require the use of cables therebetween. Alternatively, the memory system may be built into the host system as an integral part thereof.

[0059] The memory system controller 25 conveys to decoder/driver circuits 37 commands received from the host. Similarly, status signals generated by the memory system are communicated to the controller 25 from the circuits 37. The circuits 37 can be simple logic circuits in the case where the controller controls nearly all of the memory operations, or can include a state machine to control at least some of the repetitive memory operations necessary to carry out given commands. Control signals resulting from decoding commands are applied from the circuits 37 to the word line select circuits 27, local bit line select circuits 29 and sense amplifier and I/O circuits 21. Also connected to the circuits 27 and 29 are address lines 39 from the controller that carry physical addresses of memory elements to be accessed within the array 10 in order to carry out a command from the host. The physical addresses correspond to logical addresses received from the host system 31, the conversion being made by the controller 25 and/or the decoder/driver 37. As a result, the circuits 29 partially address the designated storage elements within the array 10 by placing proper voltages on the control elements of the select devices Qxy to connect selected local bit lines (LBLxy) with the global bit lines (GBLx). The addressing is completed by the circuits 27 applying proper voltages to the word lines WLzy of the array.

[0060] Although the memory system of FIG. 2 utilizes the three-dimensional memory element array 10 of FIG. 1, the system is not limited to use of only that array architecture. A given memory system may alternatively combine this type of memory with another type of memory, including flash memory (such as flash having a NAND memory cell array architecture), a magnetic disk drive or some other type of memory. The other type of memory may have its own controller or may in some cases share the controller 25 with the three-dimensional memory cell array 10, particularly if there is some compatibility between the two types of memory at an operational level.

[0061] Although each of the memory elements M zxy in the array of FIG. 1 may be individually addressed for changing its state according to incoming data or for reading its existing storage state, it is certainly preferable to program and read the array in units of multiple memory elements in parallel. In the three-dimensional array of FIG. 1, one row of memory elements on one plane may be programmed and read in parallel. The number of memory elements operated in parallel depends on the number of memory elements connected to the selected word line. In some arrays, the word lines may be segmented (not shown in FIG. 1) so that only a portion of the total number of memory elements connected along their length may be addressed for parallel operation, namely the memory elements connected to a selected one of the segments.

[0062] Previously programmed memory elements whose data have become obsolete may be addressed and re-programmed from the states in which they were previously programmed. The states of the memory elements being re-programmed in parallel will therefore most often have different starting states among them. This is acceptable for many memory element materials but it is usually preferred to re-set a group of memory elements to a common state before they are re-programmed. For this purpose, the memory elements may be grouped into blocks, where the memory elements of each block are simultaneously reset to a common state, preferably one of the programmed states, in preparation for subsequently programming them. If the memory element material being used is characterized by changing from a first to a second state in significantly less time than it takes to be changed from the second state back to the first state, then the reset operation is preferably chosen to cause the transition taking the longer time to be made. The programming is then done faster than resetting. The longer reset time is usually not a problem since resetting blocks of memory elements containing nothing but obsolete data is typically accomplished in a high percentage of the cases in the background, therefore not adversely impacting the programming performance of the memory system.

[0063] With the use of block re-setting of memory elements, a three-dimensional array of variable resistive memory elements may be operated in a manner similar to current flash memory cell arrays. Resetting a block of memory elements to a common state corresponds to erasing a block of flash memory cells to an erased state. The individual blocks of memory elements herein may be further divided into a plurality of pages of storage elements, wherein the memory elements of a page are programmed and read together. This is like the use of pages in flash memories. Of course, when programming, those memory elements that are to store data that are represented by the reset state are not changed from the reset state. Those of the memory elements of a page that need to be changed to another state in order to represent the data being stored in them have their states changed by the programming operation.

[0064] An example of use of such blocks and pages is illustrated in FIG. 3, which provides plan schematic views of planes 1 and 2 of the array of FIG. 1. The different word lines WL zy that extend across each of the planes and the local bit lines LBL xy that extend through the planes are shown in two dimensions. Individual blocks are made up of memory elements connected to both sides of one word line, or one segment of a word line if the word lines are segmented, in a single one of the planes. There are therefore a very large number of such blocks in each plane of the array. In the block illustrated in FIG. 3, each of the memory elements M114, M124, M134, M115, M125 and M135 connected to both sides of one word line WL12 form the block. Of course, there will be many more memory elements connected along the length of a word line but only a few of them are illustrated, for simplicity. The memory elements of each block are connected between the single word line and different ones of the local bit lines, namely, for the block illustrated in FIG. 3, between the word line WL12 and respective local bit lines LBL12, LBL22, LBL32, LBL13, LBL23 and LBL33.

[0065] A page is also illustrated in FIG. 3. In the specific embodiment being described, there are two pages per block. One page is formed by the memory elements along one side of the word line of the block and the other page by the memory elements along the opposite side of the word line. The example page marked in FIG. 3 is formed by memory elements M114, M124 and M134. Of course, a page will typically have a very large number of memory elements in order to be able to program and read a large amount of data at one time. Only a few of the storage elements of the page of FIG. 3 are included, for simplicity in explanation.
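The block and page grouping of FIG. 3 can be illustrated with a minimal sketch. The indexing rule used here is an assumption inferred only from the element names given for the example block (elements M114 to M135 on word line WL12); it is not stated as a general rule in this description, and the function name is hypothetical.

```python
def block_pages(z, y, num_x):
    """Return the two pages of the block on word line WL(z, y).

    Assumed naming, inferred from the FIG. 3 example: an element named
    M<z><x><k> with k = 2*y lies on the side of the word line facing the
    local bit lines LBL(x, y), and k = 2*y + 1 on the side facing
    LBL(x, y + 1). Single-digit indices are assumed.
    """
    page_a = [f"M{z}{x}{2 * y}" for x in range(1, num_x + 1)]
    page_b = [f"M{z}{x}{2 * y + 1}" for x in range(1, num_x + 1)]
    return page_a, page_b


# Block on word line WL12 with three bit line columns, as drawn in FIG. 3.
page_a, page_b = block_pages(z=1, y=2, num_x=3)
print(page_a)  # one page of the block
print(page_b)  # the page on the opposite side of the word line
```

Under that assumed naming, the first list corresponds to the page marked in FIG. 3 and the two lists together make up the block.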

[0066] Example resetting, programming and reading operations of the memory array of FIGs. 1 and 3, when operated as array 10 in the memory system of FIG. 2, will now be described. For these examples, each of the memory elements M zxy is taken to include a non-volatile memory material that can be switched between two stable states of different resistance levels by impressing voltages (or currents) of different polarity across the memory element, or voltages of the same polarity but different magnitudes and/or duration. For example, one class of material may be placed into a high resistance state by passing current in one direction through the element, and into a low resistance state by passing current in the other direction through the element. Or, in the case of switching using the same voltage polarity, one element may need a higher voltage and a shorter time to switch to a high resistance state and a lower voltage and a longer time to switch to a lower resistance state. These are the two memory states of the individual memory elements that indicate storage of one bit of data, which is either a "0" or a "1", depending upon the memory element state.

[0067] To reset (erase) a block of memory elements, the memory elements in that block are placed into their high resistance state. This state will be designated as the logical data state "1", following the convention used in current flash memory arrays but it could alternatively be designated to be a "0". As shown by the example in FIG. 3, a block includes all the memory elements that are electrically connected to one word line WL or segment thereof. A block is the smallest unit of memory elements in the array that are reset together. It can include thousands of memory elements. If a row of memory elements on one side of a word line includes 1000 of them, for example, a block will have 2000 memory elements from the two rows on either side of the word line.

[0068] The following steps may be taken to reset all the memory elements of a block, using the block illustrated in FIG. 3 as an example:

1. Set all of the global bit lines (GBL1, GBL2 and GBL3 in the array of FIGs. 1 and 3) to zero volts, by the sense amplifier and I/O circuits 21 of FIG. 2.

2. Set at least the two select gate lines on either side of the one word line of the block to H' volts, so that the local bit lines on each side of the word line in the y-direction are connected to their respective global bit lines through their select devices and therefore brought to zero volts. The voltage H' is made high enough to turn on the select devices Q xy , for example a voltage in the range of 1-3 volts, typically 2 volts. The block shown in FIG. 3 includes the word line WL12, so the select gate lines SG2 and SG3 (FIG. 1) on either side of that word line are set to H' volts, by the circuits 29 of FIG. 2, in order to turn on the select devices Q12, Q22, Q32, Q13, Q23 and Q33. This causes each of the local bit lines LBL12, LBL22, LBL32, LBL13, LBL23 and LBL33 in two adjacent rows extending in the x-direction to be connected to respective ones of the global bit lines GBL1, GBL2 and GBL3. Two of the local bit lines adjacent to each other in the y-direction are connected to a single global bit line. Those local bit lines are then set to the zero volts of the global bit lines. The remaining local bit lines preferably remain unconnected and with their voltages floating.

3. Set the word line of the block being reset to H volts. This reset voltage value is dependent on the switching material in the memory element and can be between a fraction of a volt and a few volts. All other word lines of the array, including the other word lines of selected plane 1 and all the word lines on the other unselected planes, are set to zero volts. In the array of FIGs. 1 and 3, word line WL12 is placed at H volts, while all other word lines in the array are placed at zero volts, all by the circuits 27 of FIG. 2.

[0069] The result is that H volts are placed across each of the memory elements of the block. In the example block of FIG. 3, this includes the memory elements M114, M124, M134, M115, M125 and M135. For the type of memory material being used as an example, the resulting currents through these memory elements place any of them not already in a high resistance state into that re-set state.
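The three reset steps above can be sketched as a small bias calculation. This is only an illustration under stated assumptions: the value chosen for H, the array dimensions, the rule that word line y touches local bit line rows y and y + 1, and all function names are hypothetical and are not taken from the circuits 21, 27 or 29.

```python
H = 1.5       # assumed reset voltage in volts (hypothetical value)
FLOAT = None  # marker for an unconnected, floating local bit line


def reset_bias(planes, wl_per_plane, num_x, sel_z, sel_y):
    """Return word line and local bit line voltages for resetting one block."""
    # Step 3: selected word line at H, every other word line at zero volts.
    wl = {(z, y): (H if (z, y) == (sel_z, sel_y) else 0.0)
          for z in range(1, planes + 1)
          for y in range(1, wl_per_plane + 1)}
    # Steps 1 and 2: the two rows of local bit lines flanking the selected
    # word line are tied to grounded global bit lines; the rest float.
    lbl = {(x, row): (0.0 if row in (sel_y, sel_y + 1) else FLOAT)
           for x in range(1, num_x + 1)
           for row in range(1, wl_per_plane + 2)}
    return wl, lbl


def stressed_elements(wl, lbl):
    """List elements on driven local bit lines that see a non-zero voltage."""
    stressed = {}
    for (z, y), v_wl in wl.items():
        for (x, row), v_lbl in lbl.items():
            # Assumed geometry: word line y touches bit line rows y and y + 1.
            if row in (y, y + 1) and v_lbl is not FLOAT and v_wl != v_lbl:
                stressed[(z, x, y, row)] = v_wl - v_lbl
    return stressed


# Reset the block on word line WL12 in a small 2-plane example array.
wl, lbl = reset_bias(planes=2, wl_per_plane=3, num_x=3, sel_z=1, sel_y=2)
stressed = stressed_elements(wl, lbl)
print(len(stressed))
```

In this sketch the only elements with a voltage across them are the six on the selected word line, each seeing H volts. Elements on floating local bit lines are left out of the calculation because nothing in the bias pattern drives those lines.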

[0070] It may be noted that no stray currents will flow because only one word line has a non-zero voltage. The voltage on the one word line of the block can cause current to flow to ground only through the memory elements of the block. There is also nothing that can drive any of the unselected and electrically floating local bit lines to H volts, so no voltage difference will exist across any other memory elements of the array outside of the block. Therefore no voltages are applied across unselected memory elements in other blocks that can cause them to be inadvertently disturbed or reset.

[0071] It may also be noted that multiple blocks may be concurrently reset by setting any combination of word lines and the adjacent select gates to H or H' respectively. In this case, the only penalty for doing so is an increase in the amount of current that is required to simultaneously reset an increased number of memory elements. This affects the size of the power supply that is required.

[0072] The memory elements of a page are preferably programmed concurrently, in order to increase the parallelism of the memory system operation. An expanded version of the page indicated in FIG. 3 is provided in FIG. 4, with annotations added to illustrate a programming operation. The individual memory elements of the page are initially in their reset state because all the memory elements of its block have previously been reset. The reset state is taken herein to represent a logical data "1". For any of these memory elements to store a logical data "0" in accordance with incoming data being programmed into the page, those memory elements are switched into their low resistance state, their set state, while the remaining memory elements of the page remain in the reset state.

[0073] For programming a page, only one row of select devices is turned on, resulting in only one row of local bit lines being connected to the global bit lines. This connection alternatively allows the memory elements of both pages of the block to be programmed in two sequential programming cycles, which then makes the number of memory elements in the reset and programming units equal.

[0074] Referring to FIGs. 3 and 4, an example programming operation within the indicated one page of memory elements M114, M124 and M134 is described, as follows:

1. The voltages placed on the global bit lines are in accordance with the pattern of data received by the memory system for programming. In the example of FIG. 4, GBL1 carries logical data bit "1", GBL2 the logical bit "0" and GBL3 the logical bit "1." The bit lines are set respectively to corresponding voltages M, H and M, as shown, where the M level voltage is high but not sufficient to program a memory element and the H level is high enough to force a memory element into the programmed state. The M level voltage may be about one-half of the H level voltage, between zero volts and H. For example, an M level can be 0.7 volt and an H level can be 1.5 volt. The H level used for programming is not necessarily the same as the H level used for resetting or reading. In this case, according to the received data, memory elements M114 and M134 are to remain in their reset state, while memory element M124 is being programmed. Therefore, the programming voltages are applied only to memory element M124 of this page by the following steps.

2. Set the word line of the page being programmed to 0 volts, in this case selected word line WL12. This is the only word line to which the memory elements of the page are connected. Each of the other word lines on all planes is set to the M level. These word line voltages are applied by the circuits 27 of FIG. 2.

3. Set one of the select gate lines below and on either side of the selected word line to the H' voltage level, in order to select a page for programming. For the page indicated in FIGs. 3 and 4, the H' voltage is placed on select gate line SG2 in order to turn on select devices Q12, Q22 and Q32 (FIG. 1). All other select gate lines, namely lines SG1 and SG3 in this example, are set to 0 volts in order to keep their select devices off. The select gate line voltages are applied by the circuits 29 of FIG. 2. This connects one row of local bit lines to the global bit lines and leaves all other local bit lines floating. In this example, the row of local bit lines LBL12, LBL22 and LBL32 are connected to the respective global bit lines GBL1, GBL2 and GBL3 through the select devices that are turned on, while all other local bit lines (LBLs) of the array are left floating.

[0075] The result of this operation, for the example memory element material mentioned above, is that a programming current IPROG is sent through the memory element M124, thereby causing that memory element to change from a reset to a set (programmed) state. The same will occur with other memory elements (not shown) that are connected between the selected word line WL12 and a local bit line (LBL) that has the programming voltage level H applied.
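The data-dependent biasing of the programming steps above can be sketched as follows, using the example levels of 0.7 volt for M and 1.5 volt for H. The function names, the assumption that the select devices drop no voltage, and the simple rule that an element programs only when the full H level appears across it are illustrative assumptions, not part of the described circuits.

```python
H = 1.5  # example programming level from the description, in volts
M = 0.7  # example intermediate level from the description, in volts


def gbl_voltages(bits):
    """Map incoming data bits to global bit line voltages.

    A "1" is represented by the reset state, so its bit line is held at M;
    a "0" requires programming, so its bit line is raised to H.
    """
    return [M if bit == 1 else H for bit in bits]


def selected_page_drops(bits, v_wl_selected=0.0):
    """Voltage across each element of the selected page.

    Assumes ideal select devices, so each local bit line sits at the
    voltage of its global bit line.
    """
    return [v - v_wl_selected for v in gbl_voltages(bits)]


# Data pattern of FIG. 4: GBL1 carries "1", GBL2 "0" and GBL3 "1".
bits = [1, 0, 1]
drops = selected_page_drops(bits)
# Assumed switching rule: only a drop of at least H programs an element.
programmed = [drop >= H for drop in drops]
print(gbl_voltages(bits))
print(programmed)
```

With this pattern only the middle element of the page sees the full H level, which matches the case described above in which only one element of the page is programmed.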

[0076] An example of the relative timing of applying the above-listed programming voltages is to initially set all the global bit lines (GBLs), the selected select gate line (SG), the selected word line and two adjacent word lines on either side of the selected word line on the one page all to the voltage level M. After this, selected ones of the GBLs are raised to the voltage level H according to the data being programmed while simultaneously dropping the voltage of the selected word line to 0 volts for the duration of the programming cycle. The word lines in plane 1 other than the selected word line WL12 and all word lines in the unselected other planes can be weakly driven to M, some lower voltage or allowed to float in order to reduce power that must be delivered by word line drivers that are part of the circuits 27 of FIG. 2.

[0077] By floating all the local bit lines other than the selected row (in this example, all but LBL12, LBL22 and LBL32), voltages can be loosely coupled to outer word lines of the selected plane 1 and word lines of other planes that are allowed to float through memory elements in their low resistance state (programmed) that are connected between the floating local bit lines and adjacent word lines. These outer word lines of the selected plane and word lines in unselected planes, although allowed to float, may eventually be driven up to voltage level M through a combination of programmed memory elements.

[0078] There are typically parasitic currents present during the programming operation that can increase the currents that must be supplied through the selected word line and global bit lines. During programming there are two sources of parasitic currents, one to the adjacent page in a different block and another to the adjacent page in the same block. An example of the first is the parasitic current IP1 shown on FIG. 4 from the local bit line LBL22 that has been raised to the voltage level H during programming. The memory element M123 is connected between that voltage and the voltage level M on its word line WL11. This voltage difference can cause the parasitic current IP1 to flow. Since there is no such voltage difference between the local bit lines LBL12 or LBL32 and the word line WL11, no such parasitic current flows through either of the memory elements M113 or M133, a result of these memory elements remaining in the reset state according to the data being programmed.

[0079] Other parasitic currents can similarly flow from the same local bit line LBL 22 to an adjacent word line in other planes. The presence of these currents may limit the number of planes that can be included in the memory system since the total current may increase with the number of planes. The limitation for programming is in the current capacity of the memory power supply, so the maximum number of planes is a tradeoff between the size of the power supply and the number of planes. A number of 4-8 planes may generally be used in most cases.
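The dependence of these parasitic currents on the number of planes can be illustrated with a rough upper-bound count. The sketch assumes, as an inference from the geometry of FIGs. 1 and 4 rather than a statement in this description, that each local bit line pillar touches two word lines in every plane, and that all unselected word lines are held at M. The resistance value and function names are hypothetical.

```python
def stressed_neighbors(num_planes, lbls_at_h=1):
    """Count elements that see H - M from local bit lines raised to H.

    Assumption: each pillar touches two word lines per plane, and exactly
    one of those elements (the selected one) sits on the word line at 0 V.
    """
    return lbls_at_h * (2 * num_planes - 1)


def worst_case_parasitic(num_planes, v_h, v_m, r_on, lbls_at_h=1):
    """Upper bound on current, if every stressed element were at r_on ohms."""
    return stressed_neighbors(num_planes, lbls_at_h) * (v_h - v_m) / r_on


# Compare 4 and 8 planes with the example levels and an assumed ON resistance.
for planes in (4, 8):
    count = stressed_neighbors(planes)
    current = worst_case_parasitic(planes, v_h=1.5, v_m=0.7, r_on=1e5)
    print(planes, count, current)
```

The count grows linearly with the number of planes, which is consistent with the tradeoff between power supply capacity and plane count noted above. The actual current depends on how many of the stressed elements are in their low resistance state.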

[0080] The other source of parasitic currents during programming is to an adjacent page in the same block. The local bit lines that are left floating (all but those connected to the row of memory elements being programmed) will tend to be driven to the voltage level M of unselected word lines through any programmed memory element on any plane. This in turn can cause parasitic currents to flow in the selected plane from these local bit lines at the M voltage level to the selected word line that is at zero volts. An example of this is given by the currents IP2, IP3 and IP4 shown in FIG. 4. In general, these currents will be much less than the other parasitic current IP1 discussed above, since these currents flow only through those memory elements in their conductive state that are adjacent to the selected word line in the selected plane.

[0081] The above-described programming techniques ensure that the selected page is programmed (local bit lines at H, selected word line at 0) and that adjacent unselected word lines are at M. As mentioned earlier, other unselected word lines can be weakly driven to M or initially driven to M and then left floating. Alternately, word lines in any plane distant from the selected word line (for example, more than 5 word lines away) can also be left uncharged (at ground) or floating because the parasitic currents flowing to them are so low as to be negligible compared to the identified parasitic currents since they must flow through a series combination of five or more ON devices (devices in their low resistance state). This can reduce the power dissipation caused by charging a large number of word lines.

[0082] While the above description assumes that each memory element of the page being programmed will reach its desired ON value with one application of a programming pulse, a program-verify technique commonly used in NOR or NAND flash memory technology may alternately be used. In this process, a complete programming operation for a given page includes a series of individual programming operations in which a smaller change in ON resistance occurs within each program operation. Interspersed between each program operation is a verify (read) operation that determines whether an individual memory element has reached its desired programmed level of resistance or conductance consistent with the data being programmed in the memory element. The sequence of program/verify is terminated for each memory element as it is verified to reach the desired value of resistance or conductance. After all of the memory elements being programmed are verified to have reached their desired programmed value, programming of the page of memory elements is then completed. An example of this technique is described in United States patent no. 5,172,338.
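The program-verify sequence can be sketched as a loop over pulses, with each memory element dropping out as soon as it verifies. The element model here is purely hypothetical: each pulse is assumed to scale the resistance by a fixed factor, which is not a property of any particular memory material, and the function and parameter names are illustrative.

```python
def program_verify_page(resistances, bits, target, step=0.8, max_pulses=50):
    """Program a page with interleaved program and verify operations.

    resistances: starting resistance of each element (reset state).
    bits:        data to store; a 0 means the element must be programmed.
    target:      resistance at or below which an element verifies.
    step:        assumed per-pulse resistance scaling (hypothetical model).
    Returns the final resistances and the number of pulses applied.
    """
    r = list(resistances)
    # Only elements storing "0" that have not yet verified receive pulses.
    pending = {i for i, bit in enumerate(bits) if bit == 0 and r[i] > target}
    pulses = 0
    while pending and pulses < max_pulses:
        for i in pending:          # program operation: one small pulse each
            r[i] *= step
        pulses += 1
        # Verify operation: elements that reached the target stop here.
        pending = {i for i in pending if r[i] > target}
    return r, pulses


final, pulses = program_verify_page([1000.0, 1000.0, 500.0], [1, 0, 0], 100.0)
print(final, pulses)
```

In this example the element storing "1" is never pulsed, and the element that starts closer to the target stops receiving pulses earlier than the other, so the page completes when the slowest element verifies.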

[0083] With reference primarily to FIG. 5, the parallel reading of the states of a page of memory elements, such as the memory elements M114, M124 and M134, is described. The steps of an example reading process are as follows:

1. Set all the global bit lines GBLs and all the word lines WL to a voltage VR. The voltage VR is simply a convenient reference voltage and can be any number of values but will typically be between 0 and 1 volt. In general, for operating modes where repeated reads occur, it is convenient to set all word lines in the array to VR in order to reduce parasitic read currents, even though this requires charging all the word lines. However, as an alternative, it is only necessary to raise the selected word line (WL12 in FIG. 5), the word line in each of the other planes that is in the same position as the selected word line and the immediately adjacent word lines in all planes to VR.

2. Turn on one row of select devices by placing a voltage on the control line adjacent to the selected word line in order to define the page to be read. In the example of FIGs. 1 and 5, a voltage is applied to the control line SG2 in order to turn on the select devices Q12, Q22 and Q32. This connects one row of local bit lines LBL12, LBL22 and LBL32 to their respective global bit lines GBL1, GBL2 and GBL3. These local bit lines are then connected to individual sense amplifiers (SA) that are present in the sense amplifier and I/O circuits 21 of FIG. 2, and assume the potential VR of the global bit lines to which they are connected. All other local bit lines LBLs are allowed to float.

3. Set the selected word line (WL12) to a voltage of VR ± Vsense. The sign of Vsense is chosen based on the sense amplifier and has a magnitude of about 0.5 volt. The voltages on all other word lines remain the same.

4. Sense current flowing into (VR + Vsense) or out of (VR - Vsense) each sense amplifier for time T. These are the currents IR1, IR2 and IR3 shown to be flowing through the addressed memory elements of the example of FIG. 5, which are proportional to the programmed states of the respective memory elements M114, M124 and M134. The states of the memory elements M114, M124 and M134 are then given by binary outputs of the sense amplifiers within the sense amplifier and I/O circuits 21 that are connected to the respective global bit lines GBL1, GBL2 and GBL3. These sense amplifier outputs are then sent over the lines 23 (FIG. 2) to the controller 25, which then provides the read data to the host 31.

5. Turn off the select devices (Q12, Q22 and Q32) by removing the voltage from the select gate line (SG2), in order to disconnect the local bit lines from the global bit lines, and return the selected word line (WL12) to the voltage VR.
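The sensing portion of the read steps above can be sketched with a simple resistive model. The resistance values, the threshold and the function name are hypothetical; the sketch takes the VR - Vsense sign choice, ignores parasitic currents and select device drops, and models each sense amplifier as a current comparison.

```python
V_SENSE = 0.5  # sense voltage magnitude from the description, in volts


def read_page(resistances, r_threshold, v_r=0.5):
    """Return the data bits of a page from the sensed element currents.

    Selected local bit lines sit at v_r and the selected word line at
    v_r - V_SENSE, so each element sees V_SENSE across it. An element in
    the low resistance (set) state reads as "0" and one in the high
    resistance (reset) state as "1".
    """
    v_wl = v_r - V_SENSE
    currents = [(v_r - v_wl) / r for r in resistances]
    # Assumed sense amplifier rule: compare against a threshold current.
    i_threshold = V_SENSE / r_threshold
    return [0 if current > i_threshold else 1 for current in currents]


# Hypothetical page: outer elements reset (1 Mohm), middle element set (10 kohm).
data = read_page([1e6, 1e4, 1e6], r_threshold=1e5)
print(data)
```

Under these assumed values the page reads back as the pattern written in the programming example, with only the middle element in its set state.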

[0084] Parasitic currents during such a read operation have two undesirable effects. As with programming, parasitic currents place increased demands on the memory system power supply. In addition, it is possible for parasitic currents to exist that are erroneously included in the currents through the addressed memory elements that are being read. This can therefore lead to erroneous read results if such parasitic currents are large enough.

[0085] As in the programming case, all of the local bit lines except the selected row (LBL12, LBL22 and LBL32 in the example of FIG. 5) are floating. But the potential of the floating local bit lines may be driven to VR by any memory element that is in its programmed (low resistance) state and connected between a floating local bit line and a word line at VR, in any plane. A parasitic current comparable to IP1 in the programming case (FIG. 4) is not present during data read because the selected local bit lines and the adjacent non-selected word lines are both at VR. Parasitic currents may flow, however, through low resistance memory elements connected between floating local bit lines and the selected word line. These are comparable to the currents IP2, IP3 and IP4 during programming (FIG. 4), indicated as IP5, IP6 and IP7 in FIG. 5. Each of these currents can be equal in magnitude to the maximum read current through an addressed memory element. However, these parasitic currents are flowing from the word lines at the voltage VR to the selected word line at a voltage VR ± Vsense without flowing through the sense amplifiers. These parasitic currents will not flow through the selected local bit lines (LBL12, LBL22 and LBL32 in FIG. 5) to which the sense amplifiers are connected. Although they contribute to power dissipation, these parasitic currents therefore do not introduce a sensing error.

[0086] Although the neighboring word lines should be at VR to minimize parasitic currents, as in the programming case it may be desirable to weakly drive these word lines or even allow them to float. In one variation, the selected word line and the neighboring word lines can be pre-charged to VR and then allowed to float. When the sense amplifier is energized, it may charge them to VR so that the potential on these lines is accurately set by the reference voltage from the sense amplifier (as opposed to the reference voltage from the word line driver). This can occur before the selected word line is changed to VR ± Vsense but the sense amplifier current is not measured until this charging transient is completed.

[0087] Reference cells may also be included within the memory array 10 to facilitate any or all of the common data operations (erase, program, or read). A reference cell is a cell that is structurally as nearly identical to a data cell as possible in which the resistance is set to a particular value. They are useful to cancel or track resistance drift of data cells associated with temperature, process non-uniformities, repeated programming, time or other cell properties that may vary during operation of the memory. Typically they are set to have a resistance above the highest acceptable low resistance value of a memory element in one data state (such as the ON resistance) and below the lowest acceptable high resistance value of a memory element in another data state (such as the OFF resistance). Reference cells may be "global" to a plane or the entire array, or may be contained within each block or page.

[0088] In one embodiment, multiple reference cells may be contained within each page. The number of such cells may be only a few (less than 10), or may be up to several percent of the total number of cells within each page. In this case, the reference cells are typically reset and written in a separate operation independent of the data within the page. For example, they may be set one time in the factory, or they may be set once or multiple times during operation of the memory array. During a reset operation described above, all of the global bit lines are set low, but this can be modified to only set the global bit lines associated with the memory elements being reset to a low value while the global bit lines associated with the reference cells are set to an intermediate value, thus inhibiting them from being reset. Alternately, to reset reference cells within a given block, the global bit lines associated with the reference cells are set to a low value while the global bit lines associated with the data cells are set to an intermediate value. During programming, this process is reversed and the global bit lines associated with the reference cells are raised to a high value to set the reference cells to a desired ON resistance while the memory elements remain in the reset state. Typically the programming voltages or times will be changed to program reference cells to a higher ON resistance than when programming memory elements.

[0089] If, for example, the number of reference cells in each page is chosen to be 1% of the number of data storage memory elements, then they may be physically arranged along each word line such that each reference cell is separated from its neighbor by 100 data cells, and the sense amplifier associated with reading the reference cell can share its reference information with the intervening sense amplifiers reading data. Reference cells can be used during programming to ensure the data is programmed with sufficient margin. Further information regarding the use of reference cells within a page can be found in United States patents nos. 6,222,762, 6,538,922, 6,678,192 and 7,237,074.
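The 1% arrangement can be illustrated with a short sketch of one possible layout along a word line, in which one reference cell is followed by 100 data cells and each data cell uses the nearest reference cell. The repeating pattern, the nearest-neighbor sharing rule and the function names are assumptions made for illustration; the description only states the spacing and that reference information is shared with the intervening sense amplifiers.

```python
PERIOD = 101  # assumed pattern: one reference cell, then 100 data cells


def is_reference(position):
    """True if the cell at this position along the word line is a reference."""
    return position % PERIOD == 0


def nearest_reference(position, length):
    """Position of the reference cell whose information a cell would share.

    Picks the closer of the reference cells on either side, falling back
    to the lower one when no reference cell exists above within `length`.
    """
    lower = (position // PERIOD) * PERIOD
    upper = lower + PERIOD
    if upper >= length or position - lower <= upper - position:
        return lower
    return upper


# A word line segment with 1000 data cells and 10 reference cells.
length = 1010
num_refs = sum(is_reference(p) for p in range(length))
print(num_refs, length - num_refs)
print(nearest_reference(50, length), nearest_reference(51, length))
```

With this layout the reference cells amount to 1% of the data cells, and no data cell is more than 50 positions away from the reference cell it shares, except near the end of the segment.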

[0090] In a particular embodiment, reference cells may be used to approximately cancel parasitic currents in the array. In this case the value of the resistance of the reference cell(s) is set to that of the reset state rather than a value between the reset state and a data state as described earlier. The current in each reference cell can be measured by its associated sense amplifier and this current subtracted from neighboring data cells. In this case, the reference cell is approximating the parasitic currents flowing in a region of the memory array that tracks and is similar to the parasitic currents flowing in that region of the array during a data operation. This correction can be applied in a two-step operation (measure the parasitic current in the reference cells and subsequently subtract its value from that obtained during a data operation) or simultaneously with the data operation. One way in which simultaneous operation is possible is to use the reference cell to adjust the timing or reference levels of the adjacent data sense amplifiers. An example of this is shown in United States patent no. 7,324,393.

[0091] In conventional two-dimensional arrays of variable resistance memory elements, a diode is usually included in series with the memory element between the crossing bit and word lines. The primary purpose of the diodes is to reduce the number and magnitudes of parasitic currents during resetting (erasing), programming and reading the memory elements. A significant advantage of the three-dimensional array herein is that resulting parasitic currents are fewer and therefore have a reduced negative effect on operation of the array than in other types of arrays.

[0092] Diodes may also be connected in series with the individual memory elements of the three-dimensional array, as currently done in other arrays of variable resistive memory elements, in order to reduce further the number of parasitic currents but there are disadvantages in doing so. Primarily, the manufacturing process becomes more complicated. Added masks and added manufacturing steps are then necessary. Also, since formation of the silicon p-n diodes often requires at least one high temperature step, the word lines and local bit lines cannot then be made of metal having a low melting point, such as aluminum that is commonly used in integrated circuit manufacturing, because it may melt during the subsequent high temperature step. Use of a metal, or composite material including a metal, is preferred because of its higher conductivity than the conductively doped polysilicon material that is typically used for bit and word lines because of being exposed to such high temperatures. An example of an array of resistive switching memory elements having a diode formed as part of the individual memory elements is given in patent application publication no. US 2009/0001344 A1.

[0093] Because of the reduced number of parasitic currents in the three-dimensional array herein, the total magnitude of parasitic currents can be managed without the use of such diodes. In addition to the simpler manufacturing processes, the absence of the diodes allows bi-polar operation; that is, an operation in which the voltage polarity to switch the memory element from its first state to its second memory state is opposite of the voltage polarity to switch the memory element from its second to its first memory state. The advantage of the bi-polar operation over a unipolar operation (same polarity voltage is used to switch the memory element from its first to second memory state as from its second to first memory state) is the reduction of power to switch the memory element and an improvement in the reliability of the memory element. These advantages of the bi-polar operation are seen in memory elements in which formation and destruction of a conductive filament is the physical mechanism for switching, as in the memory elements made from metal oxides and solid electrolyte materials.

[0094] The level of parasitic currents increases with the number of planes and with the number of memory elements connected along the individual word lines within each plane. But since the number of word lines on each plane does not significantly affect the amount of parasitic current, the planes may individually include a large number of word lines. The parasitic currents resulting from a large number of memory elements connected along the length of individual word lines can further be managed by segmenting the word lines into sections of fewer numbers of memory elements. Erasing, programming and reading operations are then performed on the memory elements connected along one segment of each word line instead of the total number of memory elements connected along the entire length of the word line.

[0095] The re-programmable non-volatile memory array being described herein has many advantages. The quantity of digital data that may be stored per unit of semiconductor substrate area is high. It may be manufactured with a lower cost per stored bit of data. Only a few masks are necessary for the entire stack of planes, rather than requiring a separate set of masks for each plane. The number of local bit line connections with the substrate is significantly reduced over other multi-plane structures that do not use the vertical local bit lines. The architecture eliminates the need for each memory cell to have a diode in series with the resistive memory element, thereby further simplifying the manufacturing process and enabling the use of metal conductive lines. Also, the voltages necessary to operate the array are much lower than those used in current commercial flash memories.

[0096] Since at least one-half of each current path is vertical, the voltage drops present in large cross-point arrays are significantly reduced. The reduced length of the current path due to the shorter vertical component means that there are approximately one-half the number of memory cells on each current path and thus the leakage currents are reduced, as is the number of unselected cells disturbed during a data programming or read operation. For example, if there are N cells associated with a word line and N cells associated with a bit line of equal length in a conventional array, there are 2N cells associated or "touched" with every data operation. In the vertical local bit line architecture described herein, there are n cells associated with the bit line (n is the number of planes and is typically a small number such as 4 to 8), or N+n cells are associated with a data operation. For a large N this means that the number of cells affected by a data operation is approximately one-half as many as in a conventional three-dimensional array.
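The cell-count comparison in the preceding paragraph can be checked with a short calculation. The following is a minimal sketch only; the function names and the example value N = 1024 are illustrative assumptions and do not appear in the specification.

```python
def cells_touched_conventional(n_per_line: int) -> int:
    """Conventional array: N cells on the word line plus N on the bit line."""
    return 2 * n_per_line


def cells_touched_vertical(n_word_line: int, n_planes: int) -> int:
    """Vertical local bit line array: N cells on the word line plus n planes."""
    return n_word_line + n_planes


if __name__ == "__main__":
    N = 1024  # hypothetical number of cells per word line
    for planes in (4, 8):  # the "4 to 8" planes mentioned in the text
        ratio = cells_touched_vertical(N, planes) / cells_touched_conventional(N)
        print(f"n={planes}: {cells_touched_vertical(N, planes)} cells, "
              f"ratio to conventional = {ratio:.4f}")
```

For large N and small n the ratio approaches one-half, which is the approximation stated in the text.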

Materials Useful for the Memory Storage Elements

[0097] The material used for the non-volatile memory storage elements Mzxy in the array of FIG. 1 can be a chalcogenide, a metal oxide, or any one of a number of materials that exhibit a stable, reversible shift in resistance in response to an external voltage applied to or current passed through the material.

[0098] Metal oxides are characterized by being insulating when initially deposited. One suitable metal oxide is a titanium oxide (TiOx). A previously reported memory element using this material is illustrated in FIG. 6. In this case, near-stoichiometric TiO2 bulk material is altered in an annealing process to create an oxygen deficient layer (or a layer with oxygen vacancies) in proximity of the bottom electrode. The top platinum electrode, with its high work function, creates a high potential Pt/TiO2 barrier for electrons. As a result, at moderate voltages (below one volt), a very low current will flow through the structure. The bottom Pt/TiO2-x barrier is lowered by the presence of the oxygen vacancies (O+2) and behaves as a low resistance contact (ohmic contact). (The oxygen vacancies in TiO2 are known to act as n-type dopant, transforming the insulating oxide into an electrically conductive doped semiconductor.) The resulting composite structure is in a non-conductive (high resistance) state.

[0099] But when a large negative voltage (such as 1.5 volt) is applied across the structure, the oxygen vacancies drift toward the top electrode and, as a result, the potential barrier Pt/TiO2 is reduced and a relatively high current can flow through the structure. The device is then in its low resistance (conductive) state. Experiments reported by others have shown that conduction is occurring in filament-like regions of the TiO2, perhaps along grain boundaries.

[00100] The conductive path is broken by applying a large positive voltage across the structure of FIG. 6. Under this positive bias, the oxygen vacancies move away from the proximity of the top Pt/TiO2 barrier, and "break" the filament. The device returns to its high resistance state. Both of the conductive and non-conductive states are non-volatile. Sensing the conduction of the memory storage element by applying a voltage around 0.5 volts can easily determine the state of the memory element.

[00101] While this specific conduction mechanism may not apply to all metal oxides, as a group, they have a similar behavior: transition from a low conductive state to a high conductive state occurs when appropriate voltages are applied, and the two states are non-volatile. Examples of other materials include HfOx, ZrOx, WOx, NiOx, CoOx, CoAlOx, MnOx, ZnMn2O4, ZnOx, TaOx, NbOx, HfSiOx, HfAlOx. Suitable top electrodes include metals with a high work function (typically > 4.5 eV) capable of gettering oxygen in contact with the metal oxide to create oxygen vacancies at the contact. Some examples are TaCN, TiCN, Ru, RuO, Pt, Ti rich TiOx, TiAlN, TaAlN, TiSiN, TaSiN, IrO2. Suitable materials for the bottom electrode are any conducting oxygen rich material such as Ti(O)N, Ta(O)N, TiN and TaN. The thicknesses of the electrodes are typically 1 nm or greater. Thicknesses of the metal oxide are generally in the range of 5 nm to 50 nm.

[00102] Another class of materials suitable for the memory storage elements is solid electrolytes but since they are electrically conductive when deposited, individual memory elements need to be formed and isolated from one another. Solid electrolytes are somewhat similar to the metal oxides, and the conduction mechanism is assumed to be the formation of a metallic filament between the top and bottom electrode. In this structure the filament is formed by dissolving ions from one electrode (the oxidizable electrode) into the body of the cell (the solid electrolyte). In one example, the solid electrolyte contains silver ions or copper ions, and the oxidizable electrode is preferably a metal intercalated in a transition metal sulfide or selenide material such as Ax(MB2)1-x, where A is Ag or Cu, B is S or Se, and M is a transition metal such as Ta, V, or Ti, and x ranges from about 0.1 to about 0.7. Such a composition minimizes oxidizing unwanted material into the solid electrolyte. One example of such a composition is Agx(TaS2)1-x. Alternate composition materials include α-AgI. The other electrode (the indifferent or neutral electrode) should be a good electrical conductor while remaining insoluble in the solid electrolyte material. Examples include metals and compounds such as W, Ni, Mo, Pt, metal silicides, and the like.

[00103] Examples of solid electrolyte materials are: TaO, GeSe or GeS. Other systems suitable for use as solid electrolyte cells are: Cu/TaO/W, Ag/GeSe/W, Cu/GeSe/W, Cu/GeS/W, and Ag/GeS/W, where the first material is the oxidizable electrode, the middle material is the solid electrolyte, and the third material is the indifferent (neutral) electrode. Typical thicknesses of the solid electrolyte are between 30 nm and 100 nm.

[00104] In recent years, carbon has been extensively studied as a non-volatile memory material. As a non-volatile memory element, carbon is usually used in two forms, conductive (or graphene-like carbon) and insulating (or amorphous carbon). The difference in the two types of carbon material is the content of the carbon chemical bonds, so called sp2 and sp3 hybridizations. In the sp3 configuration, the carbon valence electrons are kept in strong covalent bonds and as a result the sp3 hybridization is non-conductive. Carbon films in which the sp3 configuration dominates are commonly referred to as tetrahedral-amorphous carbon, or diamond-like. In the sp2 configuration, not all the carbon valence electrons are kept in covalent bonds. The weakly bound electrons (pi bonds) contribute to the electrical conduction, making the mostly sp2 configuration a conductive carbon material. The operation of the carbon resistive switching nonvolatile memories is based on the fact that it is possible to transform the sp3 configuration to the sp2 configuration by applying appropriate current (or voltage) pulses to the carbon structure. For example, when a very short (1-5 ns) high amplitude voltage pulse is applied across the material, the conductance is greatly reduced as the sp2 material changes into an sp3 form ("reset" state). It has been theorized that the high local temperatures generated by this pulse cause disorder in the material and, if the pulse is very short, the carbon "quenches" in an amorphous state (sp3 hybridization). On the other hand, when in the reset state, applying a lower voltage for a longer time (~300 nsec) causes part of the material to change into the sp2 form ("set" state). The carbon resistance switching non-volatile memory elements have a capacitor-like configuration where the top and bottom electrodes are made of high melting point metals like W, Pd, Pt and TaN.
[00105] There has been significant attention recently to the application of carbon nanotubes (CNTs) as a non-volatile memory material. A (single walled) carbon nanotube is a hollow cylinder of carbon, typically a rolled and self-closing sheet one carbon atom thick, with a typical diameter of about 1-2 nm and a length hundreds of times greater. Such nanotubes can demonstrate very high conductivity, and various proposals have been made regarding compatibility with integrated circuit fabrication. It has been proposed to encapsulate "short" CNT's within an inert binder matrix to form a fabric of CNT's. These can be deposited on a silicon wafer using a spin-on or spray coating, and as applied the CNT's have a random orientation with respect to each other. When an electric field is applied across this fabric, the CNT's tend to flex or align themselves such that the conductivity of the fabric is changed. The switching mechanism from low-to-high resistance and the opposite is not well understood. As in the other carbon based resistive switching non-volatile memories, the CNT based memories have capacitor-like configurations with top and bottom electrodes made of high melting point metals such as those mentioned above.

[00106] Yet another class of materials suitable for the memory storage elements is phase-change materials. A preferred group of phase-change materials includes chalcogenide glasses, often of a composition GexSbyTez, where preferably x=2, y=2 and z=5. GeSb has also been found to be useful. Other materials include AgInSbTe, GeTe, GaSb, BaSbTe, InSbTe and various other combinations of these basic elements. Thicknesses are generally in the range of 1 nm to 500 nm. The generally accepted explanation for the switching mechanism is that when a high energy pulse is applied for a very short time to cause a region of the material to melt, the material "quenches" in an amorphous state, which is a low conductive state. When a lower energy pulse is applied for a longer time such that the temperature remains above the crystallization temperature but below the melting temperature, the material crystallizes to form poly-crystal phases of high conductivity. These devices are often fabricated using sub-lithographic pillars, integrated with heater electrodes. Often the localized region undergoing the phase change may be designed to correspond to a transition over a step edge, or a region where the material crosses over a slot etched in a low thermal conductivity material. The contacting electrodes may be any high melting metal such as TiN, W, WN and TaN in thicknesses from 1 nm to 500 nm.

[00107] It will be noted that the memory materials in most of the foregoing examples utilize electrodes on either side thereof whose compositions are specifically selected. In embodiments of the three-dimensional memory array herein where the word lines (WL) and/or local bit lines (LBL) also form these electrodes by direct contact with the memory material, those lines are preferably made of the conductive materials described above. In embodiments using additional conductive segments for at least one of the two memory element electrodes, those segments are therefore made of the materials described above for the memory element electrodes.

[00108] Steering elements are commonly incorporated into controllable resistance types of memory storage elements. Steering elements can be a transistor or a diode. Although an advantage of the three-dimensional architecture described herein is that such steering elements are not necessary, there may be specific configurations where it is desirable to include steering elements. The diode can be a p-n junction (not necessarily of silicon), a metal/insulator/insulator/metal (MIIM), or a Schottky type metal/semiconductor contact but can alternately be a solid electrolyte element. A characteristic of this type of diode is that for correct operation in a memory array, it must be switched "on" and "off" during each address operation. Until the memory element is addressed, the diode is in the high resistance state ("off" state) and "shields" the resistive memory element from disturb voltages. To access a resistive memory element, three different operations are needed: a) convert the diode from high resistance to low resistance, b) program, read, or reset (erase) the memory element by application of appropriate voltages across or currents through the diode, and c) reset (erase) the diode. In some embodiments one or more of these operations can be combined into the same step. Resetting the diode may be accomplished by applying a reverse voltage to the memory element including a diode, which causes the diode filament to collapse and the diode to return to the high resistance state.

[00109] For simplicity the above description has considered the simplest case of storing one data value within each cell: each cell is either reset or set and holds one bit of data. However, the techniques of the present application are not limited to this simple case. By using various values of ON resistance and designing the sense amplifiers to be able to discriminate between several of such values, each memory element can hold multiple bits of data in a multiple-level cell (MLC). The principles of such operation are described in United States patent no. 5,172,338 referenced earlier. Examples of MLC technology applied to three dimensional arrays of memory elements include an article entitled "Multi-bit Memory Using Programmable Metallization Cell Technology" by Kozicki et al., Proceedings of the International Conference on Electronic Devices and Memory, Grenoble, France, June 12-17, 2005, pp. 48-53 and "Time Discrete Voltage Sensing and Iterative Programming Control for a 4F2 Multilevel CBRAM" by Schrogmeier et al. (2007 Symposium on VLSI Circuits).
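As an illustration of how a sense amplifier might discriminate between several ON resistance values, the following sketch maps a sensed cell current onto one of four levels and then onto two bits. It is not taken from the specification: the threshold currents, the function names and the level-to-bit assignment are all hypothetical assumptions chosen for the example.

```python
from bisect import bisect_right

# Hypothetical read-current thresholds in microamps, sorted ascending.
# Three thresholds separate four levels (two bits per cell).
THRESHOLDS_UA = [1.0, 3.0, 6.0]


def decode_level(current_ua: float, thresholds_ua: list) -> int:
    """Return the level index (0 = highest resistance) for a sensed current."""
    return bisect_right(thresholds_ua, current_ua)


def level_to_bits(level: int, bits_per_cell: int = 2) -> str:
    """Assumed mapping: the level index written as a fixed-width binary string."""
    return format(level, f"0{bits_per_cell}b")


if __name__ == "__main__":
    for current in (0.2, 2.0, 4.0, 9.0):
        level = decode_level(current, THRESHOLDS_UA)
        print(f"{current} uA -> level {level} -> bits {level_to_bits(level)}")
```

A practical design would also need margins between levels and a programming scheme that places each cell within its target window; neither is modeled here.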

[00110] Conventionally, diodes are commonly connected in series with the variable resistive elements of a memory array in order to reduce leakage currents that can flow through them. The highly compact 3D reprogrammable memory described in the present invention has an architecture that does not require a diode in series with each memory element while still keeping the leakage currents reduced. (Of course, using a diode will further control the leakage currents at the expense of more processing and possibly more occupied space.) This is possible with short local vertical bit lines which are selectively coupled to a set of global bit lines. In this manner, the structures of the 3D memory are necessarily segmented and couplings between the individual paths in the mesh are reduced.

[00111] Even if the 3D reprogrammable memory has an architecture that allows reduced current leakage, it is desirable to reduce the leakage further. As described earlier and in connection with FIG. 5, parasitic currents may exist during a read operation and these currents have two undesirable effects. First, they result in higher power consumption. Secondly, and more seriously, they may occur in the sensing path of the memory element being sensed, causing erroneous reading of the sensed current.

[00112] FIG. 7 illustrates the read bias voltages and current leakage across multiple planes of the 3D memory shown in FIG. 1 and FIG. 3. FIG. 7 is a cross-sectional view across 4 planes along the x-direction of a portion of the perspective 3D view of the memory shown in FIG. 1. It should be clear that while FIG. 1 shows the substrate and 2 planes, FIG. 7 shows the substrate and 4 planes to better illustrate the effect of current leakage from one plane to another.

[00113] In accordance with the general principle described in connection with FIG. 5, when the resistive state of a memory element 200 in FIG. 7 is to be determined, a bias voltage is applied across the memory element and its element current IELEMENT sensed. The memory element 200 resides on Plane 4 and is accessible by selecting the word line 210 (Sel-WLi) and the local bit line 220 (Sel-LBLj). For example, to apply the bias voltage, the selected word line 210 (Sel-WLi) is set to 0V and the corresponding selected local bit line 220 (Sel-LBLj) is set to a reference such as 0.5V via a turned on select gate 222 by a sense amplifier 240. With all other unselected word lines in all planes also set to the reference 0.5V and all unselected local bit lines also set to the reference 0.5V, the current sensed by the sense amplifier 240 will be just the IELEMENT of the memory element 200.

[00114] The architecture shown in FIG. 1 and FIG. 7 has the unselected local bit lines (LBLj+1, LBLj+2, ...) and the selected local bit line (Sel-LBLj) all sharing the same global bit line 250 (GBLi) to the sense amplifier 240. During sensing of the memory element 200, the unselected local bit lines can only be isolated from the sense amplifier 240 by having their respective select gate such as gate 232 turned off. In this way, the unselected local bit lines are left floating and will couple to the reference 0.5V by virtue of adjacent nodes which are at 0.5V. However, the adjacent nodes are not exactly at the reference 0.5V. This is due to a finite resistance in each word line (perpendicular to the plane in FIG. 7) which results in a progressive voltage drop away from one end of the word line at which 0.5V is applied. This ultimately results in the floating, adjacent unselected local bit lines coupling to a voltage slightly different from the reference 0.5V. In this instance, there will be leakage currents between the selected and unselected local bit lines as illustrated by broken flow lines in FIG. 7. The sensed current is then IELEMENT + leakage currents instead of just IELEMENT. This problem becomes worse with increasing word line length and resistivity.

[00115] Another 3D memory architecture includes memory elements arranged in a three-dimensional pattern defined by rectangular coordinates having x, y and z-directions and with a plurality of parallel planes stacked in the z-direction. The memory elements in each plane are accessed by a plurality of word lines and local bit lines in tandem with a plurality of global bit lines. The plurality of local bit lines are in the z-direction through the plurality of planes and arranged in a two-dimensional rectangular array of rows in the x-direction and columns in the y-direction. The plurality of word lines in each plane are elongated in the x-direction and spaced apart in the y-direction between and separated from the plurality of local bit lines in the individual planes. A non-volatile, reprogrammable memory element is located near a crossing between a word line and local bit line and accessible by the word line and bit line and wherein a group of memory elements are accessible in parallel by a common word line and a row of local bit lines. The 3D memory has a single-sided word line architecture with each word line exclusively connected to one row of memory elements. This is accomplished by providing one word line for each row of memory elements instead of sharing one word line between two rows of memory elements and linking the memory element across the array across the word lines. While the row of memory elements is also being accessed by a corresponding row of local bit lines, there is no extension of coupling for the row of local bit lines beyond the word line.

[00116] A double-sided word line architecture has been described earlier in that each word line is connected to two adjacent rows of memory elements associated with two corresponding rows of local bit lines, one adjacent row along one side of the word line and another adjacent row along the other side. For example, as shown in FIG. 1 and FIG. 3, the word line WL12 is connected on one side to a first row (or page) of memory elements (M114, M124, M134, ...) associated respectively with local bit lines (LBL12, LBL22, LBL32, ...) and also connected on another side to a second row (or page) of memory elements (M115, M125, M135, ...) associated respectively with local bit lines (LBL13, LBL23, LBL33, ...).

[00117] FIG. 8 illustrates schematically a single-sided word line architecture. Each word line is connected to an adjacent row of memory elements associated with one row of local bit lines on only one side.

[00118] The 3D memory array with the double-sided word line architecture illustrated in FIG. 1 can be modified to the single-sided word line architecture where each word line except ones at an edge of the array will be replaced by a pair of word lines. In this way, each word line is connected exclusively to one row of memory elements. Thus, the word line WL12 shown in FIG. 1 is now replaced in FIG. 8 by the pair of word lines WL13 and WL14. It will be seen that WL13 is connected to one row of memory elements (M114, M124, M134, ...) and WL14 is connected to one row of memory elements (M115, M125, M135, ...). As described before, a row of memory elements constitutes a page which is read or written to in parallel.

[00119] FIG. 9 illustrates one plane and substrate of the 3D array with the single-sided word line architecture. Going from the double-sided word line architecture of FIG. 3, similarly, WL12 in FIG. 3 would be replaced by the pair WL13, WL14 in FIG. 9, etc. In FIG. 3, a typical double-sided word line (e.g., WL12) is connected to two rows of memory elements (on both sides of the word line). In FIG. 9, each single-sided word line (e.g., WL13) is connected to only one row of memory elements.

[00120] FIG. 9 also illustrates a minimum block of memory elements that is erasable as a unit to be defined by two rows of memory elements (M113, M123, M133, ...) and (M114, M124, M134, ...) sharing the same row of local bit lines (e.g., LBL12, LBL22, LBL32, ...).

[00121] FIG. 10 illustrates the elimination of leakage currents in the single-sided word-line architecture 3-D array of FIG. 8 and 9. The analysis of leakage current is similar to that described with respect to FIG. 7. However, with the single-sided word-line architecture, the selected local bit line 220 (Sel-LBLj) is not coupled to an adjacent bit line 230 across the separated word lines 210 and 212. Thus there is no leakage current between adjacent local bit lines and the sense current in the sense amplifier 240 via the global bit line 250 and the local bit line 220 will be just the current of the memory element, IELEMENT.

[00122] The single-sided word-line architecture doubles the number of word lines in the memory array compared to the architecture shown in FIG. 1. However, this disadvantage is offset by providing a memory array with lower leakage currents among the memory elements.

[00123] The single-sided word-line architecture is disclosed in PCT International Publication No. WO 2010/117914 A1, and United States Patent Application Publication No. 20120147650, the entire disclosures of which are incorporated herein by reference.

SENSING ERROR DUE TO LOCAL BIT LINE VOLTAGE VARIATIONS

[00124] As described in the embodiments of FIG. 1 and FIG. 8, a selected R/W element, M, is accessed by a pair of selected word line WL and local bit line LBL. The local bit line LBL is one among a 2D array of bit line pillars. Each bit line pillar LBL is switchably connected by a bit line pillar switch to a node on a corresponding global bit line GBL. In a read operation, the current through the R/W element is sensed by a sense amplifier via the global bit line GBL coupled to the selected local bit line LBL. The examples given in FIG. 7 and FIG. 10 have the R/W element, M, connected between a selected local bit line and a selected word line. The selected local bit line is set to 0.5V and the selected word line is set to 0V. The voltages on the word lines are driven by a set of word line drivers. All other word lines and local bit lines are preferably set to the same voltage as the selected local bit line to eliminate current leakage.

[00125] The voltage on a local bit line is sourced from a bit line driver associated with a sense amplifier, typically located on one end of a global bit line. The voltage established on a local bit line could be highly variable dependent on the position of a connection node the local bit line makes along the global bit line as well as the resistive state of the cell (R/W element) it is accessing.

[00126] The voltages of the individual local bit lines are dependent on the positions of the respective local bit lines or connection nodes on the global bit line relative to the bit line driver. A local bit line LBL is relatively short, as it only traverses the layers in the z-direction, so the voltage drop along it is insignificant. However, the global bit line is long in comparison, and due to the finite resistance of the global bit line, an IR voltage drop along it can cause the bit line driver to supply reduced voltage to the local bit line. Furthermore, the reduced voltage is dependent on the position of the connection node the local bit line makes with the global bit line.

[00127] FIGs. 11A and 11B respectively illustrate the different path lengths of two local bit lines to their sense amplifiers. A voltage VDD is supplied to the global bit line GBLi via the sense amplifier 240. In FIG. 11A, the local bit line LBL11 260-11 is coupled to the sense amplifier 240 via a segment 270-y1 of global bit line GBLi having a length y1. Thus the IR drop in the path due to the segment 270-y1 is I·RGBL(y1). In FIG. 11B, the local bit line LBL13 260-13 is coupled to the sense amplifier 240 via a segment 270-y2 of global bit line GBLi having a length y2. Thus the IR drop in the path due to the segment 270-y2 is I·RGBL(y2).
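The position dependence of the local bit line voltage can be illustrated numerically. The sketch below assumes, purely for illustration, that the global bit line has a uniform resistance per unit length so that its segment resistance grows linearly with the segment length y; the drive voltage, cell current, resistance per micron and segment lengths are hypothetical values, not figures from the specification.

```python
def lbl_voltage(v_drive: float, current_a: float,
                r_per_um: float, length_um: float) -> float:
    """Voltage at the connection node after the IR drop along a GBL segment.

    Assumes a uniform global bit line, so the segment resistance is
    r_per_um * length_um (an illustrative simplification).
    """
    return v_drive - current_a * r_per_um * length_um


if __name__ == "__main__":
    V_DRIVE = 0.5      # volts at the driver end (hypothetical)
    I_CELL = 5e-6      # amps through the selected cell (hypothetical)
    R_PER_UM = 2.0     # ohms per micron of global bit line (hypothetical)
    for name, y in (("y1", 100.0), ("y2", 1000.0)):
        v = lbl_voltage(V_DRIVE, I_CELL, R_PER_UM, y)
        print(f"segment length {name} = {y} um -> node voltage {v:.4f} V")
```

With these assumed values the node reached through the longer segment sits about 9 mV below the nearer one, which is the kind of position-dependent spread the text describes.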

[00128] The problem is further exacerbated if the bit line driver is sensitive to the serial resistance of the circuit path during sensing, as is the case with a source-follower configuration. The bit line voltage in this case depends upon the current flowing through the transistor of the source follower. Thus, the various bit lines could be driven to different voltages depending on the serial resistance in the respective circuit paths.

[00129] FIG. 12 illustrates the resistance along a circuit path of a selected cell M between a word line driver and a sense amplifier. The sense amplifier also acts as a bit line driver. The resistance includes the resistance of a segment of the selected word line (RWL(x)), the resistance of the R/W element (RM) which is state dependent, the resistance of the segment of the global bit line (RGBL(y)) and the resistance of the sense amplifier (RSA).

[00130] The cell's actual current value and cell's current reading by the sense amplifier are both affected by cell position, sense amplifier resistance, data pattern of neighboring cells and word line resistivity. In an ideal situation, if a cell is close to the sense amplifier, RGBL(y=0) = 0. If the sense amplifier is emulated by VDD, RSA = 0. If the word line is ideally conductive, RWL(x) = 0.
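The effect of the series resistances on the cell current can be sketched with a simple Ohm's-law model of the path of FIG. 12. This is only a first-order illustration: it treats the four resistances as a single series path, ignores the alternative paths through neighboring cells, and uses hypothetical resistance values and names.

```python
def cell_current(v_bias: float, r_wl: float, r_m: float,
                 r_gbl: float, r_sa: float) -> float:
    """Current through one series path: word line, R/W element, GBL, sense amp."""
    return v_bias / (r_wl + r_m + r_gbl + r_sa)


if __name__ == "__main__":
    V_BIAS = 0.5   # volts across the whole path (hypothetical)
    R_M = 100e3    # ohms, R/W element in a conductive state (hypothetical)

    # Ideal case from the text: all parasitic resistances are zero.
    ideal = cell_current(V_BIAS, r_wl=0.0, r_m=R_M, r_gbl=0.0, r_sa=0.0)
    # Non-ideal case with assumed parasitic resistances.
    real = cell_current(V_BIAS, r_wl=5e3, r_m=R_M, r_gbl=2e3, r_sa=3e3)

    print(f"ideal current: {ideal * 1e6:.3f} uA")
    print(f"with parasitics: {real * 1e6:.3f} uA")
```

Under these assumptions the parasitic resistances reduce the sensed current by roughly nine percent relative to the ideal case.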

[00131] In general, these resistances all contribute to reducing the cell current. With the cell farther away from the bit line driver and a real sense amplifier, and more conductive neighboring cells, alternative paths become more and more significant. Thus, the sense amplifier will read a cell current reduced from its actual one.

[00132] The non-constant voltages among the local bit lines will exacerbate current leakage in the network of the 3D array. For example, the adjacent unselected word lines are biased to the same voltage as the selected local bit line to avoid leakage, and the correct bias will be uncertain if the local bit line voltage is variable.

[00133] Worse still, during read the non-uniform local bit line voltage will lead to loss of margin between the different resistive states of the R/W elements and cause memory states to overlap and become indistinguishable.

BIT LINE VOLTAGE CONTROL

[00134] According to one aspect of the invention, each local bit line is switchably connected to a node on a global bit line having first and second ends, and the voltage on the local bit line is maintained at a predetermined reference level in spite of being driven by a bit line driver from a first end of the global bit line that constitutes variable circuit path length and circuit serial resistance. This is accomplished by a feedback voltage regulator comprising a voltage clamp at the first end of the global bit line controlled by a bit line voltage comparator at the second end of the global bit line. The bit line voltage is sensed accurately from the second end of the global bit line since there is no current flow to incur an IR drop. The comparator compares the sensed bit line voltage with the predetermined reference level and outputs a control voltage. The voltage clamp is controlled by the control voltage as part of the feedback circuit. In this way the voltage at the local bit line is regulated at the reference voltage.

[00135] FIG. 13 illustrates a bit line control circuit that keeps the bit line voltage fixed relative to a reference voltage. A sense amplifier 240 is connected to a first end 271 of the global bit line GBL 270. The local bit line LBL 260 is coupled to the sense amplifier via a first segment 270-1 of the GBL 270. The sense amplifier serves as a bit line driver to drive the local bit line LBL 260 to a given voltage as well as sensing a current in the local bit line. The remaining portion of the GBL forms the second segment 270-2 of the GBL 270. A voltage clamp (BL Clamp) 280 operates with a supply voltage from the sense amplifier to clamp the voltage at the LBL 260. The actual voltage VLBL at the LBL 260 can be detected from a second end 273 of the GBL 270 via the second segment 270-2. Since no current flows in the second segment 270-2, there is no IR drop in the second segment. This actual voltage is compared to a predetermined reference voltage 286 by a comparator 284 such as an op amp. The output of the comparator 284 feeds a control voltage Vc to control the BL clamp 280. For example, the BL clamp 280 can be implemented by a transistor, with the output of the comparator Vc supplied to the gate 282 of the transistor. In order to maintain a predetermined local bit line voltage of VLBL, the predetermined reference voltage is set to VLBL so that the comparator 284 outputs a feedback control voltage Vc = VLBL + VT + ΔV, where VT is the threshold of the transistor and ΔV is the feedback adjustment. In this way, the voltage of the local bit line 260 can be set to a predetermined value irrespective of the variable resistance RGBL(y) of the first segment 270-1 of the global bit line GBL 270 to the voltage supply (via the sense amplifier).
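The feedback action of the bit line control circuit can be illustrated with a small discrete-time model. The sketch below is not the circuit of FIG. 13; it assumes an idealized clamp transistor that passes its gate voltage minus a threshold, a constant cell current, and a comparator that nudges the control voltage in proportion to the sensed error. The loop gain, iteration count and all numerical values are hypothetical.

```python
def regulate(v_ref: float, v_t: float, i_cell: float, r_gbl: float,
             gain: float = 0.5, steps: int = 50):
    """Iterate a simple feedback loop; return (control voltage, LBL voltage).

    Model assumptions (illustrative only):
      - the clamp output is vc - v_t (ideal source follower),
      - the first GBL segment drops i_cell * r_gbl,
      - the second GBL segment carries no current, so the comparator
        sees the true local bit line voltage.
    """
    vc = v_ref + v_t  # start with no feedback adjustment
    for _ in range(steps):
        v_lbl = (vc - v_t) - i_cell * r_gbl   # voltage at the local bit line
        vc += gain * (v_ref - v_lbl)          # comparator corrects the error
    v_lbl = (vc - v_t) - i_cell * r_gbl
    return vc, v_lbl


if __name__ == "__main__":
    vc, v_lbl = regulate(v_ref=0.5, v_t=0.7, i_cell=5e-6, r_gbl=2e3)
    print(f"control voltage Vc = {vc:.4f} V, local bit line = {v_lbl:.4f} V")
```

In this model the loop settles with the control voltage equal to the reference plus the threshold plus an adjustment of i_cell * r_gbl, so the adjustment term absorbs the IR drop of the first segment regardless of where the local bit line connects.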

[00136] One implementation of the bit line voltage control circuit is to have the sense amplifiers 240 located at the first end 271 of the global bit line 270 in the 3D array and the comparator 284 located at the second end 273 of the global bit line. A conducting line 283 connects the output of the comparator 284 to the voltage clamp 280 across and under the 3D array. The bit line voltage control circuit can be implemented as another layer below the 3D array.

[00137] When the voltages of the local bit lines in the 3D array are well controlled during read and programming, the problems of leakage and loss of margin mentioned above are reduced.

3D ARRAY ARCHITECTURE WITH STAIRCASE WORD LINES

[00138] According to one aspect of the invention, a nonvolatile memory is provided with a 3D array of read/write (R/W) memory elements accessible by an x-y-z framework of an array of local bit lines or bit line pillars in the z-direction and word lines in multiple memory planes or layers in the x-y plane perpendicular to the z-direction. An x-array of global bit lines in the y-direction is switchably coupled to individual ones of the local bit line pillars along the y-direction.

[00139] Furthermore, the switchable coupling of a local bit line pillar to a corresponding global bit line is accomplished by a select transistor. The select transistor is a pillar select device that is formed as a vertical structure, switching between a local bit line pillar and a global bit line. The pillar select devices are not formed within a CMOS layer, but are formed in a separate layer (pillar select layer) above the CMOS layer, along the z-direction between the array of global bit lines and the array of local bit lines.

[00140] Furthermore, each word line has multiple segments in a staircase structure traversing the multiple memory layers in which each segment of the staircase word line lies in a memory plane or layer. Thus each word line has a segment in each memory layer and ultimately rises to the top of the 3D array as an exposed word line segment to be connected to a word line driver.

[00141] In a 3D nonvolatile memory with memory elements arranged in a three-dimensional pattern defined by rectangular coordinates having x, y and z-directions and with a plurality of parallel planes from a bottom plane to a top plane stacked in the z-direction over a semiconductor substrate; a plurality of local bit lines elongated in the z-direction through the plurality of layers and arranged in a two-dimensional rectangular array of bit line pillars having rows in the x-direction and columns in the y-direction; the 3D nonvolatile memory further having a plurality of staircase word lines spaced apart in the y-direction and between and separated from the plurality of bit line pillars at a plurality of crossings, individual staircase word lines each having a series of alternating steps and risers elongated respectively in the x-direction and z-direction traversing across the plurality of planes in the z-direction with a segment in each plane.

[00142] FIG. 14 is an isometric view of a portion of the 3D array 300 with a structure having staircase word lines 310. The gross structure is a 3D array of memory cells M at crossings between a 2D array of local bit lines LBLs in the z-direction and segments of word lines WLs in each memory layer in the x-y plane.

[00143] In this embodiment, the local bit lines LBL 320 are in the form of bit line pillars LBLs. A row of LBL in the x-direction is switched to corresponding global bit lines GBLs 330 by a set of pillar switches 340 at their base. As will be described later, the set of pillar switches is preferably implemented with a NAND selection using two select gates SGs 341-1, 341-2.

[00144] For ease of illustration, FIG. 14 shows each word line segment 312 in a memory layer able to select 4 local bit lines. In practice, each word line segment 312 can select a different number of local bit lines in each memory layer. For example, if there are 8 memory layers, then each word line has 8 segments. If each segment can select 16 bit lines in each memory layer, then each word line can select 16x8=128 local bit lines in parallel across all 8 memory layers. The top segment is also connected to a word line driver 350 formed by a source 352 and drain 354 controlled by a WL select gate 351.

Adjacent Staircase Word lines offset in the x-direction by the Pitch of the Local Bit Line

[00145] FIG. 15 illustrates a cross-section view of the 3D array along the y-direction according to an embodiment in which the word line 310 step to the next memory layer is made in between the bit lines. In this example, each word line segment 312 selects 8 local bit lines in each step or memory layer. There are 4 memory layers. Each step rises to the next layer in between a pair of adjacent local bit lines 320. Adjacent staircase word lines on adjacent layers are staggered so that their risers 314 have an offset 315 along the x-direction given by the pitch of the local bit lines. The R/W element cross-section, defined by the bit line and the word line, is constant.

Adjacent Staircase Word lines offset in the x-direction by Half the Pitch of the Local Bit Line

[00146] FIG. 16 illustrates a cross-section view of the 3D array along the y-direction according to an embodiment in which the various staggered word line steps are stacked as close as possible. In this example, each word line segment 312 selects 8 local bit lines in each step or memory layer. There are 4 memory layers. Each step rises to the next layer in between a pair of adjacent local bit lines 320. Two adjacent staircase word lines on adjacent layers are staggered so that their risers 314 have an offset 315 given by half the pitch of the local bit lines. The R/W element cross section differs. However, this configuration yields better array efficiency and has shorter word lines.

[00147] The advantage of the staircase word line architecture is that word lines from different memory layers can be accessed easily as each eventually can be accessed and selected from either the top or bottom of the 3D memory array. In this way, interconnect and decoding are much simplified and there is no need to have multiple word lines connected in parallel to share limited resources. This will avoid the individual word line drivers having to drive a large load in the form of a large number of word lines in parallel and the ICC can be reduced. Within a given ICC budget, the performance is improved. Drivers implemented by devices with weaker drive power than CMOS devices may be used. Also, by avoiding multiple word lines connected in parallel, disturb is reduced.

[00148] According to a first implementation of forming a slab of multi-plane memory with staircase word lines, a word line layer and an oxide layer are alternately formed on top of each other. After a word line layer is formed, trenches are cut in the word line layer with a first mask to create word line segments having first and second ends. After an oxide layer is formed, trenches are cut in the oxide layer with a second mask to expose the second end of each word line segment for connection to a first end of each word line segment in the next plane to create the staircase structure. With each memory plane constituted from a word line layer and an oxide layer, this method requires two masking steps to form each memory plane.

[00149] According to a first embodiment, the staircase word line is formed such that each segment in a plane crosses more than one vertical bit line. Thus, the formation of the alternate word lines and bit lines is accomplished by offsetting a same mask each time by a width of the trench.

[00150] FIG. 17 illustrates from top to bottom a series of process steps to fabricate a 3D array with staircase word lines. In this example, each memory layer is formed by two masking applications.

(1) A word line layer is deposited on a base surface. For example, a 3nm layer of word line material is deposited by Atomic Layer Deposition ("ALD"). A masking layer with a first mask is laid over the word line layer to enable trenches to be etched in the word line layer.

(2) Trenches are etched in the word line layer through the openings of the first mask to the base surface. An anisotropic word line etch is performed using Reactive Ion Etch ("RIE").

(3) An oxide layer is deposited on top of the word line layer. For example, a 10nm to 20nm oxide is deposited by ALD. This is followed by a second mask process. The second mask is identical to the first mask except for an offset by the width of a trench to enable trenches to be etched in the oxide layer. The current trenches are aligned adjacent to the previous trenches.

(4) Trenches are etched in the oxide layer through the openings of the second mask. An anisotropic oxide etch is performed using RIE.

(5) A second word line layer is formed on top of the oxide layer and makes connection with the lower word line layer through the trenches in the oxide layer. This is followed by the first mask being laid over the second word line layer but offset from the last mask by the width of a trench.

(6) Trenches are etched in the second word line layer through the openings of the first mask. An anisotropic word line etch is performed using RIE. And so the process repeats itself as in (3) for the next layer.
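The masking sequence of steps (1) to (6) can be summarized as a schedule in which each successive mask application is shifted by one trench width. The following is only a sketch of that bookkeeping: the function name `mask_schedule` and the 10nm trench width used in the example are hypothetical, since the description above does not specify a trench width.

```python
def mask_schedule(num_planes, trench_width_nm):
    """List the mask applications needed to build num_planes memory planes.

    Each memory plane takes two maskings: one for the word line layer and
    one for the oxide layer. Every mask application is offset from the
    previous one by one trench width (a hypothetical value supplied by the
    caller; the source text gives no number).
    Returns tuples of (plane index, layer material, cumulative offset in nm).
    """
    steps = []
    offset = 0
    for plane in range(num_planes):
        for material in ("word line", "oxide"):
            steps.append((plane, material, offset))
            offset += trench_width_nm   # shift the same mask by one trench width
    return steps


if __name__ == "__main__":
    for plane, material, offset in mask_schedule(num_planes=2, trench_width_nm=10):
        print(f"plane {plane}: etch {material} layer, mask offset {offset} nm")
```

The schedule makes the cost of this approach explicit: the number of mask applications grows as twice the number of memory planes.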

[00151] Once the multi-layer slab is formed with the staircase word lines, the word line layers can be isolated in the y-direction by cutting trenches in the slab and forming vertical local bit lines in the trenches. An example of isolation and forming of the vertical local bit lines in a 3D memory slab is described in United States Patent Publication No. 2012/0147650 A1, the entire disclosure of which is incorporated herein by reference.

3D ARRAY ARCHITECTURE WITH WORD LINE DRIVERS ON TOP

[00152] According to yet another aspect of the invention, the word line driver is implemented as a word line driver layer on the top end of the 3D array. This is accomplished by forming a TFT device in contact with a top segment of a staircase word line.

[00153] This is distinct from conventional implementation where the word line driver is formed as CMOS devices on the substrate layer and contact is made with a word line among the multiple memory layers by means of vertical interconnects such as zias.

[00154] FIG. 18 illustrates a word line driver formed as a vertical structure on top of the 3D array of memory layers. The word line driver 350 is preferably implemented by a TFT transistor similar to the pillar select device between the local bit line and the global bit line. The TFT (Thin Film Transistor) device is a transistor formed with its NPN junction as three thin layers on top of each other so that it is oriented in the z-direction. The word line driver 350 can then switch between an exposed word line segment and a word line power source (not shown). The word line driver can have a width as wide as the segment of a word line. FIG. 18 shows two adjacent word line drivers 350-Even and 350-Odd, respectively switching two adjacent segments 312-Even and 312-Odd from two adjacent word lines across the y-direction. It will be seen that the even WL access line 355-Even along the y-direction accesses the even banks of word lines along the y-direction. Similarly, the odd WL access line 355-Odd along the y-direction accesses the odd banks of word lines along the y-direction. Each of these access lines accesses only alternate word line segments because these segments are not isolated by an oxide layer 404.

[00155] The staircase word line architecture enables each word line to be accessed from either the top or the bottom of the 3D memory array. In one embodiment, since the CMOS layer at the bottom is already crowded with metal lines and other active devices such as sense amplifiers and op amps, it is advantageous to locate the word line drivers at the top of the 3D memory array. Even though TFT transistors are not as powerful as CMOS devices, it is possible to use them to drive the staircase word lines because the individual word lines are easier to drive when they are not extended and the driver can be as wide as a segment of a word line.

EFFICIENT 3D ARRAY ARCHITECTURE WITH STAIRCASE WORD LINES

[00156] The embodiments of 3D arrays with staircase word lines shown in FIG. 15 and FIG. 16 each has L memory layers with each word line traversing the layers in the form of a staircase. At each layer, a step of the staircase crosses a segment of R local bit lines. Then the word line rises via a riser to the next layer to cross another segment therein. Thus, the array, as seen in the x-z plane, comprises a bank of local bit lines. The bit lines are oriented in the z-axis direction across all memory layers and the bank is extended in the x-axis direction. Each flight of staircase word line traverses the memory layers in the bank from a bottom edge to a top edge. In the x-z plane, the bank is superimposed with multiple flights of staircase word line closely stacked in the x-axis direction, so that each memory layer is accessible by a segment from a different flight of staircase word line.

[00157] Since the segments in each memory layer are all aligned along the same horizontal baseline, they cannot be stacked too tightly in the x-axis direction, in order to prevent shorting among them. The embodiment shown in FIG. 15 has an offset of the pitch of a bit line between two segments. The embodiment shown in FIG. 16 has an offset of half the pitch of a bit line between two segments.

[00158] In general, if there are L layers, ideally each local bit line that runs across all L layers should have an independent word line crossing at each layer. However, it can be seen that in both the embodiment of FIG. 15 and the embodiment of FIG. 16, not all the bit lines are crossed by a word line at every layer. This results from the finite offsets in these embodiments and leads to inefficient utilization of space and resources. For example in FIG. 15, at the top layer, there is one bit line not crossed by a word line for every R+1 bit lines. In this case, where R=8, it can be seen that for every 9 local bit lines, there is an idle one on the top memory layer. Similarly, the same is true for the bottom memory layer. For a total of 4 layers, and with a waste of 1 in 9 per top or bottom layer, this amounts to a loss in density of 2/(9x2), or a total of 11%. If the number of layers is increased, the loss in density is reduced. However, this will require more layers and also longer word lines, which could lead to drawing excessive currents.

[00159] An efficient 3D array architecture with staircase word lines is implemented with no offset between segments along each memory layer. Essentially this is accomplished by raising an end portion of each segment away from the horizontal baseline. In this way, there will be room along the horizontal baseline at the end of the segment for the next segment to be placed there.

[00160] FIG. 19A is a schematic illustration of a cross-section view of the efficient 3D array projected on the x-z plane. An example in this embodiment has the number of memory layers to be L=4 and the number of local bit lines LBL crossed by a segment of a word line in each memory layer to be R=4. At each memory layer, instead of having each word line WL segment crossing horizontally the R=4 local bit lines, it is made to ramp up as it crosses the 4 local bit lines so that the crossing with the next bit line is higher in the z-axis direction compared to that of the current bit line. In this embodiment, the ramping up is uniform across the bit lines. In the example shown where R=4, the ramp up of the segment consists of stepping up after crossing each of the four local bit lines in the segment. In this way, for each memory layer, towards the end of each segment, the bulk of the segment will be raised away from the horizontal baseline. This allows room for the next segment to follow immediately along the horizontal baseline.

[00161] In this example, essentially each word line has 4 ramping segments, one for each memory layer, and each segment crosses 4 local bit lines. Thus, each word line crosses 16 local bit lines across the 4 memory layers. It is possible to have different word line lengths. For example, a shortest ramping word line segment can be crossing just 2 local bit lines per memory layer (i.e., R=2). In that case, the pitch of the word line drivers will be 2 local bit lines. For a memory with 4 layers (i.e., L=4), each word line will cross RxL=2x4=8 local bit lines.
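The relationship between R, L and the bit lines crossed by each word line can be checked with a small model of the bank in the x-z plane. This is a sketch under stated assumptions, not a description of FIG. 19A: it assumes that successive word lines start R bit lines apart (the word line driver pitch mentioned above), that layers are numbered from 0 at the start of each word line, and that the step-up within a segment can be ignored for counting purposes. The names `staircase_crossings` and `core_bit_lines` are hypothetical.

```python
from collections import defaultdict


def staircase_crossings(num_word_lines, R, L):
    """Return {bit_line_index: [(layer, word_line), ...]} for one bank."""
    crossings = defaultdict(list)
    for wl in range(num_word_lines):
        start = wl * R                  # assumed: word lines start R bit lines apart
        for i in range(R * L):          # each word line crosses R*L bit lines
            crossings[start + i].append((i // R, wl))  # layer = segment index
    return crossings


def core_bit_lines(num_word_lines, R, L):
    """Indices of bit lines away from the left and right edges of the bank."""
    return range(R * (L - 1), num_word_lines * R)


if __name__ == "__main__":
    R, L, n = 4, 4, 10
    c = staircase_crossings(n, R, L)
    full = all(sorted(layer for layer, _ in c[b]) == list(range(L))
               for b in core_bit_lines(n, R, L))
    print("bit lines crossed per word line:", R * L)
    print("every core bit line crossed once per layer:", full)
```

In this model every bit line in the core of the bank is crossed exactly once in each of the L layers, and only the bit lines near the two edges of the bank are crossed in fewer layers.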

[00162] FIG. 19B illustrates the device structure of the efficient 3D array shown schematically in FIG. 19A. In one embodiment, the local bit lines and word lines are formed from doped polysilicon.

[00163] FIG. 20 is a schematic illustration of a cross-section view of the efficient 3D array projected on the x-z plane according to another embodiment. In this embodiment, the word line segment in a layer crosses the local bit lines horizontally similar to that shown in FIG. 15 and FIG. 16 but rises up to cross the last one or two bit lines near the end of the segment. This allows room for the next segment to follow immediately along the horizontal baseline without skipping a bit line.

[00164] The efficient 3D array architecture avoids the wastage associated with the embodiments shown in FIG. 15 and FIG. 16. As can be seen from FIG. 19, the bank of local bit lines is essentially traversed by uniform flights of staircase word lines, except for the ones (shown in grey-out shade) near the left and right edges of the bank. These edge exceptions can be ignored or not formed at all. Even though a small number of local bit lines are depicted in the bank shown in FIG. 19 for ease of illustration, in practice there are many more local bit lines. Given the vast majority of regular bit lines in the core of the bank, the wastage due to the edges diminishes.

[00165] FIG. 21 is an isometric view of a portion of an efficient 3D array similar to that shown in FIG. 19A. The gross structure is a 3D array of memory cells (R/W material) at crossings between a 2D array of local bit lines LBLs 320 in the z-direction and segments 312 of word lines WLs 310 in each memory layer in the x-y plane. For ease of illustration, FIG. 21 shows one row of LBLs 320 crossed by flights of staircase word lines along the x-direction. Each staircase word line has one step per bit line. The WL segment in this case is of staircase form and is able to select 6 local bit lines. In this embodiment, the word line driver 450 is located at the bottom of the 3D array on the substrate side. This allows an alternative architecture of the global bit lines GBL 330 being positioned on the top of the 3D array with corresponding switches 440 to access the local bit lines LBL 320.

[00166] A first embodiment of a first implementation of forming a slab of multi-plane memory with staircase word lines has been described in connection with FIG. 17(1) to FIG. 17(6).

[00167] According to a second embodiment, the staircase word line is formed such that each segment in a plane crosses one vertical bit line. Thus, the formation of the alternate word lines and bit lines is accomplished by offsetting a mask that creates trenches that are separated by a width of a trench and by offsetting the mask each time by half a width of the trench.

[00168] FIG. 22A - FIG. 22H illustrate a series of process steps to fabricate the efficient 3D array shown in FIG. 19.

[00169] In FIG. 22A, a masking layer 402 is laid over the word line layer 410 to enable trenches to be etched in the word line layer.

[00170] In FIG. 22B, trenches are etched in the word line layer.

[00171] In FIG. 22C, an oxide layer 404 is deposited on top of the word line layer, followed by the same masking layer but offset by half a trench length to the left to enable trenches to be etched in the oxide layer. The current trenches are offset to the previous trenches by half a trench length each.

[00172] In FIG. 22D, trenches are etched in the oxide layer.

[00173] In FIG. 22E, a second word line layer is formed on top of the oxide layer and makes connection with the lower word line layer through the trenches in the oxide layer. This is followed by the same masking layer but offset by yet another half a trench length to the left over the second word line layer.

[00174] In FIG. 22F, trenches are etched in the second word line layer.

[00175] In FIG. 22G, the process repeats itself as in that shown in FIG. 22C for the next layer of oxide and masking layer to build up the staircase structure of the word line.

[00176] In FIG. 22H, the process repeats itself as in that shown in FIG. 22D where trenches are etched in the oxide layer in order to build up progressively the staircase structure of the word line.

[00177] FIG. 23 illustrates the biasing condition for setting or resetting a R/W element. For simplicity of illustration, the biasing voltages are 0V (no bias), 1V (half bias) and 2V (full bias). An R/W element is selected for setting or resetting when it is exposed to the full bias voltage of 2V. This is arranged with the selected local bit line LBL set to full bias of 2V and the selected word line WL set to no bias of 0V in order to develop maximum potential difference across the selected R/W element. To prevent the other R/W elements from changing state, all other unselected WLs and LBLs are set to half bias of 1V so that each will see a maximum of 1V potential difference. It will be seen that the unselected bit lines and word lines are still drawing current at half bias. As explained above, a short word line is preferable as it will allow ICC current consumption to be under control.

Conclusion

[00178] Although the various aspects of the present invention have been described with respect to exemplary embodiments thereof, it will be understood that the present invention is entitled to protection within the full scope of the appended claims.