

Title:
TRANSISTORLESS ALL-MEMRISTOR NEUROMORPHIC CIRCUITS FOR IN-MEMORY COMPUTING
Document Type and Number:
WIPO Patent Application WO/2020/226740
Kind Code:
A1
Abstract:
A circuit for multiplying a number N of first operands each by a corresponding second operand, and for adding the products of the multiplications, with N ≥ 2; the circuit comprising: N input conductors; N programmable conductance circuits connected each between one of the input conductors and at least one output conductor; each programmable conductance circuit being arranged to be programmable at a value depending in a known manner from one of the first operands; each input conductor being arranged to receive from an input circuit an input train of voltage spikes having a spike rate that derives in a known manner from one of the second operands; and at least one output circuit arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on the at least one output conductor.

Inventors:
YI WEI (US)
CRUZ-ALBRECHT JOSE (US)
Application Number:
PCT/US2020/021561
Publication Date:
November 12, 2020
Filing Date:
March 06, 2020
Assignee:
HRL LAB LLC (US)
International Classes:
G06N3/063; G11C11/54; G11C13/00
Foreign References:
US20140344201A1 (2014-11-20)
KR20150091186A (2015-08-07)
Other References:
FAN CHEN et al., ‘EMAT: An Efficient Multi-Task Architecture for Transfer Learning using ReRAM’, In: 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 08 November 2018 sections 2.3, 4.1-4.3; and figure 3
SONG LINGHAO; QIAN XUEHAI; LI HAI; CHEN YIRAN: "PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning", 2017 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), IEEE, 4 February 2017 (2017-02-04), pages 541 - 552, XP033094169, DOI: 10.1109/HPCA.2017.55
CHEN FAN; SONG LINGHAO; CHEN YIRAN: "ReGAN: A pipelined ReRAM-based accelerator for generative adversarial networks", 2018 23RD ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), IEEE, 22 January 2018 (2018-01-22), pages 178 - 183, XP033323784, DOI: 10.1109/ASPDAC.2018.8297302
See also references of EP 3966745A4
Attorney, Agent or Firm:
LUSINCHI, Laurent P. et al. (US)
Claims:
CLAIMS

1. A circuit for multiplying a number N of first operands each by a corresponding second operand, and for adding the products of the multiplications, with N ≥ 2; the circuit comprising:

N input conductors;

N programmable conductance circuits connected each between one of said input conductors and at least one output conductor; each programmable conductance circuit being arranged to be programmable at a value depending in a known manner from one of said first operands;

each input conductor being arranged to receive from an input circuit an input train of voltage spikes having a spike rate that derives in a known manner from one of said second operands;

at least one output circuit arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on said at least one output conductor.

2. The circuit of claim 1, wherein said at least one output circuit is arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on said at least one output conductor by generating a first potential deriving in a known manner from the integration over time of the current received on said at least one output conductor and, when said first potential goes beyond a first predetermined threshold, outputting a voltage spike of first predetermined value and duration and reinitializing said first potential.

3. The circuit of claim 1, wherein each input circuit comprises an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate deriving in a known manner from an input current deriving in a known manner from a second operand.

4. The circuit of claim 3, wherein each excitatory tonic neuron circuit is provided for generating a second potential depending in a known manner from the integration over time of said input current and, when said second potential goes beyond a second predetermined threshold, outputting a voltage spike of second predetermined value and duration and reinitializing said second potential.

5. The circuit of claim 3, wherein:

said N input conductors comprise N row line conductors;

said at least one output conductor comprises a plurality of column line conductors, said plurality of programmable conductance circuits being connected each between a different row line conductor and a different column line conductor; and

said at least one output excitatory tonic neuron circuit comprises a plurality of output excitatory tonic neuron circuits having each an input connected to a different column line conductor.

6. The circuit of claim 1, wherein each of said plurality of programmable conductance circuits comprises a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

7. The neuromorphic crossbar array circuit of claim 6, wherein each memory cell comprises a memristor.

8. The neuromorphic crossbar array circuit of claim 3, wherein each excitatory tonic neuron circuit comprises first and second negative differential resistance devices biased each with opposite polarities, said first and second negative differential resistance devices being coupled to first and second grounded capacitors.

9. The neuromorphic crossbar array circuit of claim 8, wherein:

said first negative differential resistance device has a first node connected to an input node of the excitatory tonic neuron circuit by a first load resistor and a second node connected to a first voltage source; said first node of said first negative differential resistance device being coupled to said first grounded capacitor; and said second negative differential resistance device has a first node connected to said first node of said first negative differential resistance device by a second load resistor and a second node connected to a second voltage source; said first node of said second negative differential resistance device being coupled to said second grounded capacitor; said first node of said second negative differential resistance device forming an output node of the neuron circuit.

10. The neuromorphic crossbar array circuit of claim 9, comprising a biasing circuit for supplying the at least one output excitatory tonic neuron circuit with a bias current arranged to fine-tune a resting potential of the at least one output excitatory tonic neuron circuit.

11. A method for calculating a multiplication-accumulation operation comprising

multiplying a number N of first operands each by a corresponding second operand, and adding the products of the multiplications, with N ≥ 2; the method comprising:

providing N input conductors;

connecting N programmable conductance circuits each between one of said input conductors and a unique output conductor; programming each programmable conductance circuit with a conductance proportional to one of said first operands;

inputting on the input conductor connected to each conductance circuit programmed with a first operand an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand;

generating an output train of voltage spikes proportional to a sum over time of the spikes received on said output conductor.

12. The method of claim 11, wherein said generating an output train of voltage spikes proportional to the sum over time of the spikes received on said output conductor comprises integrating the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

13. The method of claim 11, comprising providing each said input train of voltage spikes having a spike rate proportional to a second operand in output of an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate proportional to an input current itself proportional to said second operand.

14. The method of claim 13, wherein each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

15. The method of claim 13, comprising transforming an input voltage proportional to said second operand into said current proportional to said second operand.

16. A method for calculating a number M of multiplication-accumulation operations each of N first operands by N corresponding second operands, and adding the products of the multiplications, with N ≥ 2 and M ≥ 2; the method comprising: providing N input conductors;

connecting N programmable conductance circuits each between one of said input conductors and one of M output conductors;

programming each programmable conductance circuit with a conductance proportional to one of said first operands;

inputting, on the input conductor connected to each conductance circuit programmed with a first operand, an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand;

generating M output trains of voltage spikes each proportional to a sum over time of the spikes received on the corresponding output conductor.

17. The method of claim 16, wherein said generating an output train of voltage spikes proportional to the sum over time of the spikes received on each output conductor comprises integrating the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

18. The method of claim 16, comprising providing each said input train of voltage spikes having a spike rate proportional to a second operand in output of an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate proportional to an input current itself proportional to said second operand.

19. The method of claim 18, wherein each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

20. The method of claim 18, comprising transforming an input voltage proportional to said second operand into said current proportional to said second operand.

21. The method of claim 16, wherein each of said plurality of programmable conductance circuits comprises a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

AMENDED CLAIMS

received by the International Bureau on 25 August 2020 (25.08.2020)

1. A circuit for multiplying a number N of first operands each by a corresponding second operand, and for adding the products of the multiplications, with N ≥ 2; the circuit comprising:

N input conductors;

N programmable conductance circuits connected each between one of said input conductors and at least one output conductor; each programmable conductance circuit being arranged to be programmable at a value depending in a known manner from one of said first operands;

each input conductor being arranged to receive from an input circuit an input train of voltage spikes having a spike rate that derives in a known manner from one of said second operands;

at least one output circuit arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on said at least one output conductor.

2. The circuit of claim 1, wherein said at least one output circuit is arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on said at least one output conductor by generating a first potential deriving in a known manner from the integration over time of the current received on said at least one output conductor and, when said first potential goes beyond a first predetermined threshold, outputting a voltage spike of first predetermined value and duration and reinitializing said first potential.

3. The circuit of claim 1, wherein each input circuit comprises an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate deriving in a known manner from an input current deriving in a known manner from a second operand.

4. The circuit of claim 3, wherein each excitatory tonic neuron circuit is provided for generating a second potential depending in a known manner from the integration over time of said input current and, when said second potential goes beyond a second predetermined threshold, outputting a voltage spike of second predetermined value and duration and reinitializing said second potential.

5. The circuit of claim 3, wherein:

said N input conductors comprise N row line conductors;

said at least one output conductor comprises a plurality of column line conductors, said plurality of programmable conductance circuits being connected each between a different row line conductor and a different column line conductor; and

said at least one output excitatory tonic neuron circuit comprises a plurality of output excitatory tonic neuron circuits having each an input connected to a different column line conductor.

6. The circuit of claim 1, wherein each of said plurality of programmable conductance circuits comprises a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

7. The circuit of claim 6, wherein each memory cell comprises a memristor.

8. The circuit of claim 3, wherein each excitatory tonic neuron circuit comprises first and second negative differential resistance devices biased each with opposite polarities, said first and second negative differential resistance devices being coupled to first and second grounded capacitors.

9. The circuit of claim 8, wherein:

said first negative differential resistance device has a first node connected to an input node of the excitatory tonic neuron circuit by a first load resistor and a second node connected to a first voltage source; said first node of said first negative differential resistance device being coupled to said first grounded capacitor; and said second negative differential resistance device has a first node connected to said first node of said first negative differential resistance device by a second load resistor and a second node connected to a second voltage source; said first node of said second negative differential resistance device being coupled to said second grounded capacitor; said first node of said second negative differential resistance device forming an output node of the neuron circuit.

10. The circuit of claim 9, comprising a biasing circuit for supplying the at least one output excitatory tonic neuron circuit with a bias current arranged to fine-tune a resting potential of the at least one output excitatory tonic neuron circuit.

11. A method for calculating a multiplication-accumulation operation comprising multiplying a number N of first operands each by a corresponding second operand, and adding the products of the multiplications, with N ≥ 2; the method comprising:

providing N input conductors;

connecting N programmable conductance circuits each between one of said input conductors and a unique output conductor;

programming each programmable conductance circuit with a conductance proportional to one of said first operands;

inputting on the input conductor connected to each conductance circuit programmed with a first operand an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand;

generating an output train of voltage spikes proportional to a sum over time of the spikes received on said output conductor.

12. The method of claim 11, wherein said generating an output train of voltage spikes proportional to the sum over time of the spikes received on said output conductor comprises integrating the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

13. The method of claim 11, comprising providing each said input train of voltage spikes having a spike rate proportional to a second operand in output of an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate proportional to an input current itself proportional to said second operand.

14. The method of claim 13, wherein each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

15. The method of claim 13, comprising transforming an input voltage proportional to said second operand into said current proportional to said second operand.

16. A method for calculating a number M of multiplication-accumulation operations each of N first operands by N corresponding second operands, and adding the products of the multiplications, with N ≥ 2 and M ≥ 2; the method comprising:

providing N input conductors;

connecting N programmable conductance circuits each between one of said input conductors and one of M output conductors;

programming each programmable conductance circuit with a conductance proportional to one of said first operands;

inputting, on the input conductor connected to each conductance circuit programmed with a first operand, an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand;

generating M output trains of voltage spikes each proportional to a sum over time of the spikes received on the corresponding output conductor.

17. The method of claim 16, wherein said generating an output train of voltage spikes proportional to the sum over time of the spikes received on each output conductor comprises integrating the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

18. The method of claim 16, comprising providing each said input train of voltage spikes having a spike rate proportional to a second operand in output of an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate proportional to an input current itself proportional to said second operand.

19. The method of claim 18, wherein each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

20. The method of claim 18, comprising transforming an input voltage proportional to said second operand into said current proportional to said second operand.

21. The method of claim 16, wherein each of said plurality of programmable conductance circuits comprises a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

Description:
Transistorless All-Memristor Neuromorphic Circuits for In-Memory Computing

CROSS REFERENCE TO RELATED APPLICATIONS

This application is related to and claims priority from U.S. Provisional Application Serial No. 62/844,611, filed May 7, 2019, U.S. Provisional Application Serial No. 62/860,915, filed June 13, 2019, and U.S. Non-Provisional Application Serial No. 16/808,222, filed March 3, 2020, which are incorporated by reference herein as though set forth in full. This application is also related to U.S. Non-Provisional Application Serial No. 16/808,227, filed March 3, 2020, and a PCT Application (Ref. No. 632560-2), which is being filed concurrently.

STATEMENT REGARDING FEDERAL FUNDING

[0001] This invention was made under U.S. Government contract FA8650-18-C-7869. The U.S. Government has certain rights in this invention.

FIELD OF THE INVENTION

[0002] Embodiments of the present technology relate generally to a device and method to conduct fast and energy-efficient Multiply-Accumulate (MAC) arithmetic operations, in particular Vector-Matrix Multiplication (VMM) operations, in a parallel fashion.

BACKGROUND

[0003] One way to implement VMM operations in hardware is an architecture known as a crossbar array accelerator. In known crossbar array accelerators that use analog input and output, the conductance of memory cells at each cross point is used as one of the MAC operands and Ohm's law is utilized to realize multiplications. To date, such crossbar array accelerators have been constructed at least partially, if not entirely, using complementary metal-oxide-semiconductor (CMOS) circuitry.

[0004] Hybrid CMOS-passive memristor crossbar circuits have been developed recently. Passive memristors (aka resistive random-access memory, ReRAM or RRAM) are a category of two-terminal circuit elements with an electrically reprogrammable device resistance; thereby they can in principle be used as nonvolatile memory cells. Passive memristors operating in the analog domain are well suited for implementing the massive MAC arithmetic operations required in convolutional neural networks (CNNs) and other in-memory computing (IMC) application scenarios. In an N by M memristor crossbar array, where N is the number of rows (word lines) and M is the number of columns (bit lines), a passive memristor device connects each crosspoint between a word line (row) and a bit line (column). At an arbitrary crosspoint (i, j) between the i-th word line and the j-th bit line, a multiplication operation can be achieved by the Ohm's law relationship between voltage and current, I_ij = V_i × G_ij, where I_ij is the output current flowing through the memristor, V_i is the input voltage applied on the i-th word line (with bit lines set at the ground level), and G_ij is the memristor device conductance. The accumulation can be realized by instantaneous current summation along the shared bit line by Kirchhoff's current law, I_j = Σ_i I_ij, where I_j is the total summed current at the j-th bit line. Thereby the MAC computation (aka dot-product or inner-product operation) I_j = v · g_j (where v = [V_1 V_2 ... V_N] is the N-dimension input voltage vector and g_j = [G_1j G_2j ... G_Nj] is the N-dimension device conductance vector at the j-th bit line) can be achieved. With many bit lines in parallel, the MAC computations on each bit line are performed at the same time, and thereby VMM operations i = v · G (here i = [I_1 I_2 ... I_M] is the M-dimensional output current vector and G = [g_1 g_2 ... g_M] is the N by M device conductance matrix) can be realized in an analog and parallel fashion. The computation outputs can be saved in a separate passive memristor crossbar array if needed. Advantages of a passive memristor analog dot-product circuit are its high area density, high speed, and power efficiency. Importantly, if the application allows repeated usage of one of the two operands (e.g. the device conductance) for MAC operations, then MAC workloads are performed in parallel at the location of the data, i.e. in situ computation, without wasting the time and energy of repeatedly shuffling operand values between processing cores and remote memories as in the case of digital processors. This is especially true for CNN-based deep learning inference applications.

[0005] Figure 1A illustrates a prior-art, mixed-signal implementation of a circuit 10 forming a portion of a VMM operator, comprising two analog nonvolatile memory elements (memristors) 12_1,1 and 12_2,1 that couple two input rows 14_1 and 14_2 of circuit 10 to a single output column 16_1. Memristors 12_1,1 and 12_2,1 are each programmable to have a conductance, respectively G_1,1 and G_2,1. The values of G_1,1 and G_2,1 are each proportional to the value of a first input operand and can be different. Input rows 14_1, 14_2 are provided for receiving input voltages V1, V2. The values of V1 and V2 are each proportional to a second input operand and can be different. The total summed output current I passing through the output column 16_1 will be equal to V1·G_1,1 + V2·G_2,1, and thus to the sum of the respective multiplications of the first input operands by the second input operands.

[0006] Figure 1B illustrates a more complete implementation of circuit 10 comprising N input rows 14_i (i = 1 to N) and M output columns 16_j (j = 1 to M), with a memristor 12_i,j connecting each input row 14_i to each output column 16_j. As illustrated in Figure 1B, each input row 14_i is connected to the output of a digital-to-analog converter (DAC) circuit 18_i; each DAC circuit 18_i being arranged to receive in its input one of the second operands in digital format and to convert it to an analog voltage proportional to the received digital operand. As also illustrated in Figure 1B, each output column 16_j is connected, through a sample-and-hold circuit 20_j, to the input of an analog-to-digital converter (ADC) circuit 22, itself having an output connected to a shift-and-add circuit 24.
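The role of the shift-and-add stage can be illustrated under one common assumption (not stated explicitly here, but typical of such mixed-signal designs): multi-bit digital inputs are fed to the DACs one bit slice at a time, and the per-slice ADC results are recombined digitally. The function below is a hypothetical sketch of that recombination:

```python
import numpy as np

# Hypothetical bit-sliced pipeline: each pass models DAC -> crossbar ->
# sample-and-hold -> ADC for one input bit plane; the shift-and-add stage
# recombines the per-slice results into full-precision dot products.
def bit_sliced_vmm(x_digital, G, bits=4):
    x = np.asarray(x_digital, dtype=np.int64)
    acc = np.zeros(np.asarray(G).shape[1])
    for b in range(bits):
        plane = (x >> b) & 1           # 1-bit DAC input for this slice
        i_col = plane @ np.asarray(G)  # idealized analog pass (then sampled/digitized)
        acc += i_col * (1 << b)        # "shift and add" recombination
    return acc

G = [[2.0, 1.0],
     [1.0, 3.0]]
y = bit_sliced_vmm([5, 3], G)          # equals [5, 3] @ G = [13, 14]
```

The loop over bit planes is why each ADC conversion can stay low-resolution while the final digital result carries the full input precision.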

[0007] In circuit 10, the data representation is analog (voltages and currents) inside the crossbar array and digital (typically binary) outside the array.

[0008] A representative work of known circuits such as shown in Figures 1A and 1B is the publication: "ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars", by Shafiee, A., Nag, A., Muralimanohar, N., Balasubramonian, R., Strachan, J.P., Hu, M., Williams, R.S., and Srikumar, V., in Proceedings of the International Symposium on Computer Architecture, Seoul, Korea, 18-22 June 2016, pp. 14-26, hereby incorporated by reference.

[0009] Although hybrid CMOS-memristor dot-product accelerators such as illustrated in Figure 1B can realize analog-domain computation in a parallel fashion, they face several challenges. One challenge is that interfacing and communicating with the external digital circuitry necessitates the use of ADCs and DACs, as illustrated, which incur area and energy overheads. In some designs the ADC/DAC can consume 85% to 98% of the total area and power. To reduce this area/energy overhead, in the design of the publication cited above each ADC is shared by multiple memristor crossbar arrays. Moreover, most of the demonstrations still require access transistors (1T1M/1T1R, or one-transistor one-memristor/resistor cells) to block the sneak paths, and hence preclude multilayer stackability.

[0010] A different known example of a VMM circuit (not illustrated) comprises a full-CMOS implementation of a spiking neuromorphic circuit that mimics a spike-domain data representation. This circuit uses CMOS-built neurons arranged to input spike-encoded data into synapses comprising variable-gain amplifiers provided for processing the spike-encoded data. The circuit further uses CMOS-built neurons to gather the outputs of the variable-gain amplifier synapses.

[0011] Despite the existence of the prior-art circuits outlined above, there remains a need for an even smaller, more energy-efficient, and faster circuit for calculating multiplication-accumulation operations.

SUMMARY OF THE INVENTION

[0012] Embodiments of this presentation comprise methods for calculating a multiplication-accumulation operation comprising supplying a crossbar array made of programmable conductance circuits, programmed as a function of first operands, with input spike-train encoded second operands. Embodiments of this presentation comprise as well circuits for implementing such methods.

[0013] Embodiments of this presentation relate to the following concepts:

[0014] Concept 1: A crossbar array comprising a plurality of row lines and a plurality of column lines; each row line connected to each column line by at least one passive memristor synapse; an input active-memristor neuron supplying an input to each row line; and an output active-memristor neuron receiving an output from each column line.

[0015] Concept 2. The crossbar array of concept 1, wherein each said active-memristor neuron supplying an input to each row line supplies a train of spikes or pulses.

[0016] Concept 3. The crossbar array of concept 1 or 2, wherein each input active-memristor neuron is a circuit having at least two VO2 Mott memristors and a plurality of resistors and capacitors.

[0017] Concept 4. The crossbar array of any of concepts 1 to 3, wherein a small bias current is supplied to each output active-memristor neuron to fine-tune the resting potential of the output active-memristor neuron.

[0018] Concept 5. A method to perform multiply-accumulate arithmetic operations, the method comprising: encoding data into a plurality of trains of spike pulses; supplying each train of spike pulses to an input active-memristor neuron; supplying the output of each input active-memristor neuron to a corresponding row of passive-memristor synapses; forming a sum of the output currents of all the corresponding passive-memristor synapses on a same column line; supplying the current sum to an output active-memristor neuron connected to said same column line; wherein the output active-memristor neuron converts the current sum into a train of spike pulses encoding the multiply-accumulate operations; and optionally decoding data from the train of spike pulses from the output active-memristor neuron.
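The steps of Concept 5 can be sketched in simulation, assuming regular (tonic) spike trains, unit-amplitude spikes, and an ideal non-leaky integrate-and-fire output neuron; the function name, time base, and constants are illustrative, not taken from the patent:

```python
import numpy as np

# Rate-coded MAC: the output spike count tracks sum_i rate_i * g_i, because each
# input spike injects charge proportional to its synaptic conductance, and the
# output neuron fires once per accumulated threshold's worth of charge.
def rate_coded_mac(rates, conductances, t_steps=1000, threshold=1.0):
    rates = np.asarray(rates, dtype=float)     # encoded input rates in [0, 1]
    g = np.asarray(conductances, dtype=float)  # programmed synaptic weights
    membrane, out_spikes = 0.0, 0
    for t in range(t_steps):
        # Deterministic tonic encoding: line i fires ~rates[i] spikes per step.
        spikes = (np.floor((t + 1) * rates) - np.floor(t * rates)) > 0
        membrane += float(spikes @ g)          # summed column-line charge
        if membrane >= threshold:              # integrate-and-fire output neuron
            out_spikes += 1
            membrane -= threshold
    return out_spikes

n = rate_coded_mac([0.5, 0.25], [0.4, 0.8])    # ~0.5*0.4 + 0.25*0.8 = 0.4/step
```

Over 1000 steps this accumulates about 400 thresholds of charge, so the output train carries roughly 400 spikes; the decode step of Concept 5 is simply dividing the output count by the observation window.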

[0019] Concept 6. A neuromorphic crossbar array circuit comprising: a plurality of row line conductors and at least one column line conductor; a plurality of programmable conductance circuits connected each between a different row line conductor and the at least one column line conductor; a plurality of input excitatory tonic neuron circuits having each an output connected to a different row line conductor; and at least one output excitatory tonic neuron circuit having an input connected to the at least one column line conductor.

[0020] Concept 7. The neuromorphic crossbar array circuit of concept 6, wherein the at least one column conductor comprises a plurality of column line conductors; said plurality of programmable conductance circuits being connected each between a different row line conductor and a different column line conductor; said at least one output excitatory tonic neuron circuit comprising a plurality of output excitatory tonic neuron circuits having each an input connected to a different column line conductor. According to an embodiment, each programmable conductance circuit is a memory cell; and the plurality of programmable conductance circuits arranged along rows and columns forms a memory array.

[0021] Concept 8. The neuromorphic crossbar array circuit of concept 6 or 7, wherein each excitatory tonic neuron circuit comprises first and second negative differential resistance devices biased each with opposite polarities, said first and second negative differential resistance devices being coupled to first and second grounded capacitors.

[0022] Concept 9. The neuromorphic crossbar array circuit of concept 8, wherein: said first negative differential resistance device has a first node connected to an input node of the neuron circuit by a first load resistor and a second node connected to a first voltage source; said first node of said first negative differential resistance device being coupled to said first grounded capacitor; and said second negative differential resistance device has a first node connected to said first node of said first negative differential resistance device by a second load resistor and a second node connected to a second voltage source; said first node of said second negative differential resistance device being coupled to said second grounded capacitor; said first node of said second negative differential resistance device forming an output node of the neuron circuit.

[0023] Concept 10. The neuromorphic crossbar array circuit of any one of concepts 6 to 9, comprising a biasing circuit for supplying the at least one output excitatory tonic neuron circuit with a bias current arranged to fine-tune a resting potential of the at least one output excitatory tonic neuron circuit.

[0024] Concept 11. The neuromorphic crossbar array circuit of concept 7, comprising a plurality of biasing circuits for supplying each of the plurality of output excitatory tonic neuron circuits with a bias current arranged to fine-tune a resting potential of said each of the plurality of output excitatory tonic neuron circuits.

[0025] Concept 12. The neuromorphic crossbar array circuit of any one of concepts 6 to 11, wherein each excitatory tonic neuron circuit is arranged for outputting a train of voltage spikes having a rate that depends on an input current.

[0026] Concept 13. A neuromorphic crossbar array circuit comprising: a plurality of row line conductors and at least one column line conductor; a plurality of programmable conductance circuits connected each between a different row line conductor and the at least one column line conductor; a plurality of input neuron circuits having each an output connected to a different row line conductor; and at least one output neuron circuit having an input connected to the at least one column line conductor; wherein each neuron circuit comprises first and second negative differential resistance devices biased each with opposite polarities, said first and second negative differential resistance devices being coupled to first and second grounded capacitors.

[0027] Concept 14. The neuromorphic crossbar array circuit of concept 13, wherein the at least one column conductor comprises a plurality of column line conductors; said plurality of programmable conductance circuits being connected each between a different row line conductor and a different column line conductor; said at least one output neuron circuit comprising a plurality of output neuron circuits having each an input connected to a different column line conductor.

[0028] Concept 15. The neuromorphic crossbar array circuit of concept 14, wherein each programmable conductance circuit is a memory cell; and the plurality of programmable conductance circuits arranged along rows and columns forms a memory array.

[0029] Concept 16. The neuromorphic crossbar array circuit of concept 15, wherein: said first negative differential resistance device has a first node connected to an input node of the neuron circuit by a first load resistor and a second node connected to a first voltage source; said first node of said first negative differential resistance device being coupled to said first grounded capacitor; and said second negative differential resistance device has a first node connected to said first node of said first negative differential resistance device by a second load resistor and a second node connected to a second voltage source; said first node of said second negative differential resistance device being coupled to said second grounded capacitor; said first node of said second negative differential resistance device forming an output node of the neuron circuit.

[0030] Concept 17. The neuromorphic crossbar array circuit of concept 16, comprising a biasing circuit for supplying the at least one output neuron circuit with a bias current arranged to fine-tune a resting potential of the at least one output neuron circuit.

[0031] Concept 18. The neuromorphic crossbar array circuit of any one of concepts 13 to 17, wherein each neuron circuit is an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a rate that depends on an input current.

[0032] Concept 19. The neuromorphic crossbar array circuit of concept 18, wherein the input current must be larger than a predetermined threshold (or "supra threshold") to be taken into account by the neuron circuit.

[0033] Concept 20. A spike-domain multiplier-accumulator circuit for multiplying each of a plurality of data by each of a plurality of coefficients and for adding the multiplication products, the circuit comprising: a plurality of input excitatory tonic neuron circuits arranged each for:

receiving a voltage proportional to one of said plurality of data; and outputting a series of spikes having a frequency proportional to the received voltage; a plurality of programmable conductance circuits having each a first end connected to an output of one of said plurality of input excitatory tonic neuron circuits; each programmable conductance circuit being arranged to have a conductance proportional to one of said plurality of coefficients; and at least one output excitatory tonic neuron circuit connected to a second end of each programmable conductance circuit. According to an embodiment, each programmable conductance circuit is a memory cell; and the plurality of programmable conductance circuits forms a memory array.

[0034] Concept 21. The spike-domain multiplier-accumulator of concept 20, wherein said plurality of programmable conductance circuits is one plurality of a set of pluralities of programmable conductance circuits, and wherein said output excitatory tonic neuron circuit is one of a set of output excitatory tonic neuron circuits connected each to a second end of each programmable conductance circuit of one plurality of programmable conductance circuits of the set of pluralities of programmable conductance circuits.

[0035] Concept 22. A circuit for multiplying a number N of first operands each by a corresponding second operand, and for adding the products of the multiplications, with N ≥ 2; the circuit comprising: N input conductors; N programmable conductance circuits connected each between one of said input conductors and at least one output conductor; each programmable conductance circuit being arranged to be programmable at a value depending in a known manner from one of said first operands; each input conductor being arranged to receive from an input circuit an input train of voltage spikes having a spike rate that derives in a known manner from one of said second operands; and at least one output circuit arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on said at least one output conductor.

[0036] Concept 23. The circuit of concept 22, wherein said at least one output circuit is arranged to generate an output train of voltage spikes having a spike rate that derives in a known manner from a sum over time of the spikes received on said at least one output conductor by generating a first potential deriving in a known manner from the integration over time of the current received on said at least one output conductor and, when said first potential goes beyond a first predetermined threshold, outputting a voltage spike of first predetermined value and duration and reinitializing said first potential.

[0037] Concept 24. The circuit of concept 22 or concept 23, wherein each input circuit comprises an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate deriving in a known manner from an input current deriving in a known manner from a second operand.

[0038] Concept 25. The circuit of concept 24, wherein each excitatory tonic neuron circuit is provided for generating a second potential depending in a known manner from the integration over time of said input current and, when said second potential goes beyond a second predetermined threshold, outputting a voltage spike of second predetermined value and duration and reinitializing said second potential.

[0039] Concept 26. The circuit of concept 24, wherein: said N input conductors comprise N row line conductors; said at least one output conductor comprises a plurality of column line conductors, said plurality of programmable conductance circuits being connected each between a different row line conductor and a different column line conductor; and said at least one output circuit comprises a plurality of output excitatory tonic neuron circuits having each an input connected to a different column line conductor.

[0040] Concept 27. The circuit of any one of concepts 22 to 26, wherein each of said plurality of programmable conductance circuits comprises a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

[0041] Concept 28. The neuromorphic crossbar array circuit of concept 27, wherein each memory cell comprises a memristor.

[0042] Concept 29. The neuromorphic crossbar array circuit of concept 24, wherein each excitatory tonic neuron circuit comprises first and second negative differential resistance devices biased each with opposite polarities, said first and second negative differential resistance devices being coupled to first and second grounded capacitors.

[0043] Concept 30. The neuromorphic crossbar array circuit of concept 29, wherein: said first negative differential resistance device has a first node connected to an input node of the excitatory tonic neuron circuit by a first load resistor and a second node connected to a first voltage source; said first node of said first negative differential resistance device being coupled to said first grounded capacitor; and said second negative differential resistance device has a first node connected to said first node of said first negative differential resistance device by a second load resistor and a second node connected to a second voltage source; said first node of said second negative differential resistance device being coupled to said second grounded capacitor; said first node of said second negative differential resistance device forming an output node of the neuron circuit.

[0044] Concept 31. The neuromorphic crossbar array circuit of concept 30, comprising a biasing circuit for supplying the at least one output excitatory tonic neuron circuit with a bias current arranged to fine-tune a resting potential of the at least one output excitatory tonic neuron circuit.

[0045] Concept 32. A method for calculating a multiplication-accumulation operation comprising multiplying a number N of first operands each by a corresponding second operand, and adding the products of the multiplications, with N ≥ 2; the method comprising: providing N input conductors; connecting N programmable conductance circuits each between one of said input conductors and a unique output conductor; programming each programmable conductance circuit with a conductance proportional to one of said first operands; inputting, on the input conductor connected to each conductance circuit programmed with a first operand, an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand; and generating an output train of voltage spikes proportional to a sum over time of the spikes received on said output conductor.

[0046] Concept 33. The method of concept 32, wherein said generating an output train of voltage spikes proportional to the sum over time of the spikes received on said output conductor comprises integrating the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

[0047] Concept 34. The method of concept 32 or 33, comprising providing each said input train of voltage spikes having a spike rate proportional to a second operand in output of an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate proportional to an input current itself proportional to said second operand.

[0048] Concept 35. The method of concept 34, wherein each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

[0049] Concept 36. The method of concept 34 or 35, comprising transforming an input voltage proportional to said second operand into said current proportional to said second operand.

[0050] Concept 37. A method for calculating a number M of multiplication-accumulation operations, each of N first operands by N corresponding second operands, and adding the products of the multiplications, with N ≥ 2 and M ≥ 2; the method comprising: providing N input conductors; connecting N programmable conductance circuits each between one of said input conductors and one of M output conductors; programming each programmable conductance circuit with a conductance proportional to one of said first operands; inputting, on the input conductor connected to each conductance circuit programmed with a first operand, an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand; and generating M output trains of voltage spikes each proportional to a sum over time of the spikes received on the corresponding output conductor.

[0051] Concept 38. The method of concept 37, wherein said generating an output train of voltage spikes proportional to the sum over time of the spikes received on each output conductor comprises integrating the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

[0052] Concept 39. The method of concept 37 or 38, comprising providing each said input train of voltage spikes having a spike rate proportional to a second operand in output of an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate proportional to an input current itself proportional to said second operand.

[0053] Concept 40. The method of concept 39, wherein each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

[0054] Concept 41. The method of concept 39, comprising transforming an input voltage proportional to said second operand into said current proportional to said second operand.

[0055] Concept 42. The method of any one of concepts 37 to 41, wherein each of said plurality of programmable conductance circuits comprises a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

BRIEF DESCRIPTION OF THE DRAWINGS

[0056] Aspects of the present invention are illustrated by way of example, and not by way of limitation, in the accompanying drawings.

[0057] Figures 1A and 1B are schematic views of known MAC and VMM operations circuits.

[0058] Figures 2A and 2B are schematic views of MAC and VMM operations circuits according to embodiments of this presentation.

[0059] Figure 3 is a schematic view of a neuron circuit of Figures 2A and 2B.

[0060] Figures 4A and 4B illustrate two rate-based convolution calculations in a circuit according to embodiments of this presentation.

[0061] Figures 5A to 5D illustrate the operation of a circuit according to embodiments of this presentation with respect to the calculations of Figures 4A and 4B.

[0062] Figures 6A and 6B illustrate the operation of a circuit according to embodiments of this presentation.

[0063] Figures 7A and 7B illustrate exemplary input and output image data.

[0064] Figures 8A, 8B and 8C illustrate an exemplary network architecture of customized MATLAB image classification feedforward CNN model with an example simulation.

[0065] Figure 9A illustrates a circuit according to embodiments of this presentation.

[0066] Figures 9B and 9C illustrate the operation of embodiments of the circuit of Figure 9A.

[0067] Figure 10 is a table summarizing the simulated performance metrics of a circuit according to embodiments of this presentation.

[0068] Figure 11 illustrates a method according to embodiments of this presentation.

[0069] Figure 12 illustrates a method according to embodiments of this presentation.

[0070] The drawings referred to in this description should be understood as not necessarily being drawn to scale.

DESCRIPTION OF EMBODIMENTS

[0071] The detailed description set forth below in connection with the appended drawings is intended as a description of various embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. Each embodiment described in this disclosure is provided merely as an example or illustration of the present invention, and should not necessarily be construed as preferred or advantageous over other embodiments. In some instances, well-known methods, procedures, objects, and circuits have not been described in detail so as to not unnecessarily obscure aspects of the present disclosure.

[0072] By contrast with the above-described prior art circuits, embodiments of this presentation relate to an all-memristor neuromorphic MAC accelerator that can be entirely analog, which can employ nonvolatile passive-memristor synapses for row-to-column coupling, and spiking active-memristor neurons for generating input and output signals. Embodiments of this presentation naturally utilize a spike-domain data representation analogous to that of mammal brains. Embodiments of this presentation employ a fully analog neuromorphic architecture and a spike-domain data representation for in-memory MAC acceleration.

[0073] Figure 2A illustrates a circuit 30 according to embodiments of this presentation, for N multiplication operations (N ≥ 2, and N = 25 in Figure 2A) of first operands each by a corresponding second operand, and for adding the products of the multiplications. According to embodiments of this presentation, circuit 30 comprises N input conductors 32 i (i = 1 to N) and N programmable conductance circuits W i,1 connected each between one of the input conductors 32 i and at least one output conductor 34 1 . According to an embodiment of this presentation, each programmable conductance circuit W i,1 is arranged to be programmable at a value deriving in a first known manner from one of the first operands. According to embodiments of this presentation, the language "X depending in a known manner from Y" can mean "X depending in a known linear manner from Y" (e.g. "X proportional to Y" with a known ratio) or "X depending in a known non-linear manner from Y", such that the knowledge of X allows finding the value of Y and reciprocally. According to an embodiment of this presentation, each input conductor 32 i is arranged to receive from an input circuit 36 i an input train in i of voltage spikes having a spike rate that derives in a second known manner from one of said second operands. Figure 2A illustrates two input trains of voltage spikes in 1 and in 2, each having a spike rate that derives in a third known manner from (e.g. is proportional to) a respective one of the second operands. As illustrated in Figure 2A, each second operand can itself derive in a fourth known manner from e.g. a grayscale intensity of a pixel of an image patch that is to be processed in a convolution. According to an embodiment of this presentation, the at least one output conductor 34 1 can be connected to an output circuit 38 1 arranged to generate an output train out 1 of voltage spikes having a spike rate that derives in a fifth known manner from a sum over time of the area of the spikes received on said at least one output conductor.

[0074] The circuit illustrated in Figure 2A has a total of N (= 25) input neurons that can for example convolute a (5x5) input image patch with a pre-trained (5x5) convolution kernel (or convolution filter in the CNN terminology) in one shot. Each of the N input neurons 36 i is connected to the same output neuron 38 1 through a memristor "synapse" W i,1 which functions as a programmable resistor to store the convolution weight. For CNN inference applications, only feedforward connections are implemented, i.e. data only flows from input circuits to output circuits without a feedback (recurrent) connection. In other words, the output data at out 1 does not get fed back to any of the inputs. The resistive connection between each input-output neuron pair performs a scaling operation (by Ohm's law) in addition to transmitting the data (in the form of spikes). The scaling value is determined by the device conductance, which represents the weight element of the 5x5 convolution kernel (filter).
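
The scaling-and-summing behavior described above can be sketched numerically. The following Python sketch is not part of this presentation; all names and numeric values (spike amplitude, spike width, conductances, rates) are illustrative assumptions. It models each synapse as a conductance and each input as a spike rate, so that, by Ohm's law, the mean column current is proportional to the multiply-accumulate of weights and rates:

```python
# Hypothetical sketch: a rate-coded MAC. Each input neuron i fires at
# rate r_i (spikes/s) and drives the shared output column through a
# synapse of conductance g_i (siemens). Each spike of amplitude V_SPIKE
# and width W_SPIKE injects a charge g_i * V_SPIKE * W_SPIKE, so the
# mean column current is proportional to sum_i(g_i * r_i).
# V_SPIKE and W_SPIKE are illustrative assumptions, not patent values.

V_SPIKE = 0.95   # assumed spike amplitude (V)
W_SPIKE = 1e-8   # assumed spike width (s)

def mac_mean_current(conductances, rates):
    """Mean current (A) into the column for synapse conductances (S)
    and input spike rates (Hz)."""
    return sum(g * r * V_SPIKE * W_SPIKE
               for g, r in zip(conductances, rates))

# 25 synapses storing a flattened 5x5 kernel, driven by 25 input rates
kernel = [1e-6] * 25              # uniform 1 uS weights (illustrative)
rates = [1000.0] * 25             # 1 kHz input spike trains
i_mean = mac_mean_current(kernel, rates)  # proportional to the MAC result
```

The output neuron then converts this mean current back into a spike rate, closing the spike-domain data path.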

[0075] Figure 2B illustrates a more complete view of circuit 30, showing that circuit 30 can comprise more than one column, wherein: said N input conductors comprise N row line conductors (N = 25 in Fig. 2B); the "at least one output conductor" 34 1 described in relation with Figure 2A actually comprises a plurality M (M = 10 in Fig. 2B) of column line conductors 34 j , with j = 1 to M; and the plurality of programmable conductance circuits described in relation with Figure 2A actually comprises N x M programmable conductance circuits W i,j connected each between a different row line conductor 32 i and a different column line conductor 34 j . As illustrated in Figure 2B, the "at least one output excitatory tonic neuron circuit" 38 1 described in relation with Figure 2A can actually comprise a plurality M of output excitatory tonic neuron circuits 38 j having each an input connected to a different column line conductor 34 j .

[0076] As outlined above, the all-memristor (i.e. transistor-less) VMM circuit of Figure 2B can be used for simultaneous convolution of a (5x5) input image with ten (5x5) convolution kernels (filters). This is realized by a (25x10) passive memristor crossbar array, which consists of 10 distinctive copies of the MAC circuit of Fig. 2A. Each copy receives the same input data, but encodes weights from different convolution kernels (filters). In this implementation, the 25 input neurons are shared among the 10 convolution kernels. There are 10 output neurons and 250 (25x10) synapses to store 250 weight values. In the simplest method of implementation, each weight can be stored as an analog conductance value of a nonvolatile passive memristor. It should be noted that since the weight, in embodiments of this presentation, is derived from the resistance, which is a positive value, the weight is a positive value too. More sophisticated implementations may stack multiple memristor cells to increase the bit precision of the weight.
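
Viewed mathematically, the crossbar performs a vector-matrix multiply. The following sketch (names and values are ours, not the patent's) expresses the (25x10) array of Fig. 2B in that form, with nonnegative weights as noted above:

```python
# Illustrative sketch: the (25x10) crossbar as a vector-matrix multiply.
# Row i carries the spike rate of input neuron 36_i; synapse W[i][j] is
# a nonvolatile memristor conductance (hence nonnegative) coupling row i
# to column j; column j accumulates the weighted rates, which sets the
# firing rate of output neuron 38_j.

def crossbar_vmm(rates, weights):
    """rates: length-N list of input spike rates; weights: N x M matrix
    of nonnegative synaptic weights. Returns the M column accumulations
    sum_i(rates[i] * weights[i][j])."""
    n, m = len(weights), len(weights[0])
    return [sum(rates[i] * weights[i][j] for i in range(n))
            for j in range(m)]

rates = [1.0] * 25                         # 25 input rows, shared by all kernels
weights = [[0.1] * 10 for _ in range(25)]  # 250 stored (positive) weights
columns = crossbar_vmm(rates, weights)     # 10 outputs, one per kernel
```

Because all 250 products are formed in parallel by Ohm's law and summed by Kirchhoff's current law on the column lines, the physical array evaluates this loop in one shot.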

[0077] According to an embodiment of this presentation, each of said plurality of programmable conductance circuits W i,j can comprise a memory cell; said plurality of programmable conductance circuits forming a memory array arranged along rows and columns, and comprising programming circuits (not shown) for storing desired values in each memory cell, for example as a conductance that derives in a known manner from the values to be stored. Such an embodiment advantageously allows storing the first operands in the memory cells just after calculating them, thus allowing the first operands to be used in a VMM immediately after their calculation, without having to transfer them from a memory to a distinct VMM circuit.

[0078] According to embodiments of this presentation, each memory cell W i,j can comprise a passive memristor. According to an embodiment of this presentation, a passive memristor can comprise a Pt/(Ta2O5:Ag)/Ru (silver-doped tantalum pentoxide switching layer sandwiched by platinum and ruthenium metal electrodes) passive memristor such as described in the publication: "Yoon, Jung Ho, et al. "Truly electroforming-free and low-energy memristors with preconditioned conductive tunneling paths." Advanced Functional Materials 27 (2017): 1702010", hereby incorporated by reference.

[0079] According to embodiments of this presentation, the at least one output circuit 38 1 is arranged to generate an output train out 1 of voltage spikes having a rate that derives in a fifth known manner from the sum over time of the area of the spikes received on said at least one output conductor 34 1 by generating a first potential deriving in a sixth known manner from the integration over time of the current received on said at least one output conductor 34 1 and, when said first potential goes beyond a first predetermined threshold, outputting a voltage spike of first predetermined value and duration and reinitializing said first potential. According to an embodiment, to reduce noise, the input current must be larger than a predetermined threshold (or "supra threshold") to be taken into account by the output neuron circuit.

[0080] According to embodiments of this presentation, the output circuit 38 1 can be an excitatory tonic neuron circuit such as the neuron circuit 40 illustrated in Figure 3, comprising first and second negative differential resistance devices 42, 44, biased each with opposite polarities, said first 42 and second 44 negative differential resistance devices being coupled to first 46 and second 48 grounded capacitors. In more detail, according to embodiments of this presentation, first negative differential resistance device 42 has a first node 50 connected to an input node 52 of the excitatory tonic neuron circuit 40 by a first load resistor 56 and a second node 58 connected to a first voltage source 60; said first node 50 of said first negative differential resistance device 42 being coupled to said first grounded capacitor 46. Further, in said embodiments the second negative differential resistance device 44 has a first node 62 connected to the first node 50 of the first negative differential resistance device 42 by a second load resistor 64 and a second node 66 connected to a second voltage source 68; said first node 62 of said second negative differential resistance device 44 being coupled to said second grounded capacitor 48; said first node 62 of said second negative differential resistance device 44 forming an output node of the neuron circuit 40.

[0081] According to embodiments of this presentation, the input circuits 36 i can each comprise an excitatory tonic neuron circuit arranged for outputting a train of voltage spikes having a spike rate that depends in a seventh known manner from an input current itself depending in an eighth known manner from a second operand. According to embodiments of this presentation, each excitatory tonic neuron circuit can be provided for generating a second potential depending in a ninth known manner from the integration over time of said input current and, when said second potential goes beyond a second predetermined threshold, outputting a voltage spike of second predetermined value and duration and reinitializing said second potential.

[0082] According to embodiments of this presentation, each input circuit 36 i can be an excitatory tonic neuron circuit such as the neuron circuit 40 of Figure 3.

[0083] Details of the operation of memristor neuron circuit 40 can for example be found in US Patent Application No. 15/976,687. It is noted that the active memristor neuron circuit 40 in Fig. 3 is an excitatory tonic neuron, which fires a continuous spike train in response to a supra threshold d.c. current stimulus (i.e. a d.c. current stimulus that is above a predetermined threshold).

[0084] For example, the resistors, capacitors and voltage sources illustrated in Figure 3 can have the following values: RL1 (56) = RL2 (64) = 14 kΩ, C1 (46) = 0.2 pF, C2 (48) = 0.1 pF, -V1 (60) = -0.95 V, +V2 (68) = +0.95 V. A small resistance of 150 Ω (not shown) in series with each VO2 device (X1 (42) and X2 (44)) can be used to model the electrode wire resistance. For both X1 (42) and X2 (44), the cylindrical VO2 conduction channel can have a radius of 36 nm and a length of 50 nm (film thickness). An actual VO2 device can have an effective area of 100 x 100 nm and a 100 nm film thickness. For the aforementioned circuit parameters, the simulated dynamic spiking energy use can be about 0.4 pJ/spike.

[0085] Embodiments of this presentation use a spike rate based data representation/coding scheme, wherein the input image pixel data (e.g. grayscale intensity) are first converted into continuously firing spike trains as the outputs of input neurons. This data conversion can be realized by different methods.

[0086] A first method can comprise converting the pixel data to a d.c. current level, then feeding the current to an input neuron. The neuron can be viewed as a nonlinear processing node that performs a "leaky integrate-and-fire" (LIF) operation, i.e. that integrates the input charges (over a characteristic time window) across a membrane capacitor (in the presence of a leak through it), thus increasing the membrane potential. Once the membrane potential goes beyond a predetermined threshold, the neuron fires an output spike, resets the membrane potential and starts over again. A d.c. input current is hence converted into a repetitive spike train with a characteristic frequency that depends on the input current level.
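
This first conversion method can be sketched as a minimal leaky integrate-and-fire model. All parameter values below are illustrative assumptions, not values taken from this presentation:

```python
# Minimal LIF sketch of the first conversion method: a d.c. input
# current charges a membrane capacitor against a small leak; when the
# potential crosses a threshold, a spike is counted and the potential
# is reset, so a constant current maps to a repetitive spike train
# whose rate grows with the current. All values are illustrative.

def lif_spike_count(i_in, t_total, dt=1e-6, c_mem=0.2e-12,
                    g_leak=1e-9, v_th=0.5):
    """Count spikes produced by a d.c. current i_in (A) over t_total (s)."""
    v, spikes = 0.0, 0
    for _ in range(int(t_total / dt)):
        v += (i_in - g_leak * v) * dt / c_mem  # integrate with leak
        if v >= v_th:                          # threshold crossed:
            spikes += 1                        # fire an output spike...
            v = 0.0                            # ...and reset the membrane
    return spikes

low = lif_spike_count(2e-9, 1e-3)
high = lif_spike_count(4e-9, 1e-3)  # larger input current, higher rate
```

The monotonic current-to-rate mapping is what lets the spike rate stand in for the pixel value throughout the MAC pipeline.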

[0087] A second method can comprise using a separate (e.g. CMOS) circuit to convert the pixel data to a train of digital pulses, which are fed to the input neurons (this is the case shown in Fig. 2A). Each digital pulse is then supra threshold and arranged to elicit an output spike.

[0088] A third method can comprise using hardware such as event-based vision sensors (e.g. dynamic vision sensors, DVS) where output data (image data for event-based vision sensors) are already in the form of spike trains, which saves the size/power overheads for data conversion.

[0089] As outlined above, embodiments of this presentation utilize active and passive memristor devices and circuits developed at HRL and described for example in: US Patent Application No. 16/005,529; US Patent Application No. 15/976,687; and US Patent Application No. 15/879,363, hereby incorporated by reference.

[0090] A crossbar architecture according to embodiments of this presentation reduces the energy and time required to perform MAC operations. In particular, the plurality of programmable conductance circuits programmed with the first operands can actually each be a memory cell used to store the first operands. This way, the first operands can be locally saved and reused, thus greatly reducing the energy used in moving data between the processor and the memory. Also, passive-memristor synapses can be programmed and addressed in an analog (continuous) fashion, without the necessity of a system clock (as is the case for digital systems). In deep learning applications and in applications that do not require a high-precision data representation, power-hungry, high-precision 32-bit or 64-bit data representations are not needed, as has been the case in prior-art implementations realized in CMOS electronics.

[0091] According to embodiments of this presentation, spike-domain data representation enables energy-efficient MAC operations with minimal current draw, since neuronal spike trains can be thought of as digital pulse trains with ultra-low duty cycles.
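
The low-duty-cycle argument can be made concrete with a back-of-envelope calculation. The spike energy below is the ~0.4 pJ/spike simulated value quoted in paragraph [0084]; the spike rate and spike width are our illustrative assumptions:

```python
# Back-of-envelope sketch: average dynamic power of a spiking neuron is
# simply rate x energy-per-spike, so sparse spike trains draw very
# little current on average. E_SPIKE is the simulated value quoted
# above; RATE and T_SPIKE are illustrative assumptions.

E_SPIKE = 0.4e-12   # J/spike (simulated value quoted in [0084])
RATE = 10e3         # assumed spike rate (Hz)
T_SPIKE = 1e-8      # assumed spike width (s)

avg_power = RATE * E_SPIKE   # average dynamic power per neuron (W)
duty_cycle = RATE * T_SPIKE  # fraction of time the line is active
```

Under these assumptions the neuron dissipates on the order of nanowatts and its output line is active only a small fraction of the time, which is the sense in which spike trains behave as ultra-low-duty-cycle digital pulse trains.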

[0092] Further, unlike prior-art implementations realized in CMOS electronics, in which all devices and circuits are constrained to the 2D topology of the Si substrate, the present invention is amenable to a 3D, stacked topology such as detailed in the above-cited HRL patent applications.

[0093] In an all-memristor design according to embodiments of this presentation, functions replacing those of ADCs and DACs are effectively realized by memristor spiking neurons. Besides the savings of size and energy overhead, thanks to the scalable and energy-efficient memristor building blocks, a main difference between embodiments of this presentation and some of the known art lies in the data representation that is sent to the programmable conductances for applying Ohm's law as a data multiplier. According to embodiments of this presentation, a spiking memristor neuron operates on the "integrate and fire" principle, wherein the neuron membrane potential drifts higher by integrating an input current over time. Once the membrane potential goes beyond a threshold, a spike (also known as an action potential, a narrow electrical impulse) is output (for details, see US Patent Application No. 15/976,687).

[0094] Active memristor neurons have superior energy efficiency compared to prior-art CMOS neurons, thanks to both their simple circuit topology (typically consisting of just two active memristors and 4 to 5 passive R, C elements) and the energy-efficient memristor switching operations. For a more detailed analysis of the size/energy scaling of active memristor neurons that can be used in embodiments of this presentation, see for example the publication: "Biological plausibility and stochasticity in scalable VO2 active memristor neurons," by Yi, W., Tsang, K. K., Lam, S. K., Bai, X., Crowell, J. A., and Flores, E. A., Nature Communications 9, 4661 (2018), hereby incorporated by reference.

The dynamic switching energy of VO2 Mott memristors in such memristor neuron circuits is extremely low, only ~1 to 10 fJ/device at a VO2 channel radius of 7 to 20 nm and a thickness of 20 nm (see Supplementary Fig. 9a in the above-mentioned Yi publication), and thus contributes only a small fraction of the neuron spiking energy. The neuron circuit dynamic spike energy is dominated by the energy stored in the two membrane capacitors (C1 and C2; 46 and 48 in Figure 3), and hence scales linearly with the user-selectable capacitance values. The simulated VO2 neuron dynamic spike energy can be scaled down to 30 fJ/spike or even lower (see Supplementary Fig. 36a in the above-mentioned Yi publication). In IC designs, the neuron spike energy scaling will ultimately be limited by parasitic capacitance from interconnects. Supplementary Fig. 2 in the above-mentioned publication compares the scaling of energy efficiency vs. neuron area for simulated VO2 neurons, prior-art CMOS neurons, and biological neurons; it shows that VO2 neurons (at a specific membrane capacitance of 1 fF/µm²) can possibly surpass the estimated human brain energy efficiency of 1.8×10¹⁴ spikes/J (or 5.6 fJ/spike energy use) at neuron sizes smaller than 3 µm². The above-mentioned Yi publication also simulated the static power consumption in VO2 neurons contributed by the membrane leakage: a d.c. leakage current flowing through the two active memristors ("ion channels") in the resting state, due to the finite resistance of the insulating VO2 phase (see Supplementary Fig. 37 of the Yi publication for more details).
At sufficiently high spike rates (100 MHz to 400 MHz), the static power makes only an insignificant contribution (less than 10%) to the total power consumption, and is not expected to be of major concern for the overall energy efficiency (see Supplementary Fig. 37 in the above-mentioned Yi publication).

[0095] As outlined above, another advantage of circuits according to embodiments of this presentation is that, unlike CMOS-only or CMOS-memristor hybrid approaches, the all-memristor architecture can be stacked vertically to replicate the multilayered cerebral cortex of mammalian brains, for unprecedented neural network density and connectivity.

[0096] According to embodiments of this presentation, an analog passive memristor crossbar array performs VMM computations similar to those of hybrid CMOS-memristor circuits (such as illustrated in Figure 1B). A main difference lies in the active memristor neurons, which replace the roles of the ADCs and DACs (and the supporting sample & hold circuits, etc.), as well as in the energy-efficient spike-domain data representation.

[0097] According to embodiments of this presentation, the size of the memristor crossbar array needs to be carefully considered. For imperfect nanoscale memristors, the array scalability is limited by factors including the device characteristic variability, the device on/off resistance ratio, and the parasitic interconnect wire resistance, all of which impact the readout margin. The Inventors used a crossbar of 25x10 memristors as a prototype example for simulation and analysis purposes, but the same principle applies to arbitrary array sizes. According to embodiments of this presentation, a memristor crossbar array such as described in the following reference can be used: "A Functional Hybrid Memristor Crossbar-Array/CMOS System for Data Storage and Neuromorphic Applications," by Kuk-Hwan Kim, Siddharth Gaba, Dana Wheeler, Jose M. Cruz-Albrecht, Tahir Hussain, Narayan Srinivasa, and Wei Lu, Nano Lett. 2012, 12, 389-395, dx.doi.org/10.1021/nl203687n, hereby incorporated by reference.

Figures 4A and 4B show examples of rate-based convolutions by the all-memristor MAC circuit 30 of Figure 2A, simulated with VO2 active memristor model neurons and resistor synapses. According to embodiments of this presentation, resistors are used instead of passive memristors for the sake of simplicity. For the image pixel data, the rule is that "1" encodes the maximum spike rate of 200 MHz, and "0" encodes the minimum spike rate of 100 MHz. For the convolution weight (conductance of the memristors), "1" represents the maximum conductance of (570 kΩ)⁻¹, and "0" represents the minimum conductance of (2 MΩ)⁻¹.

[0098] Figure 4A illustrates a case where the input spike rate vector is parallel to the weight vector. The input vector has a high spike rate of "1" for the top 10 input neurons, and a low spike rate of "0" for the bottom 15 input neurons. Similarly, the synaptic weight vector has a high value of "1" for the top 10 synapses, and a low value of "0" for the bottom 15 synapses. The ideal convolutional output in this case, as given by the (normalized) dot product of the input vector and the synaptic weight vector, is 10/25 = "0.4". In a linear rate-based encoding, this corresponds to an intermediate output spike rate of 140 MHz.

[0099] Figure 4B illustrates a case where the input spike rate vector is perpendicular to the weight vector. The input vector has a low spike rate of "0" for the top 15 input neurons, and a high spike rate of "1" for the bottom 10 input neurons. The synaptic weight vector remains the same as in the previous case. The ideal convolutional output in this case, as given by the dot product of the two perpendicular vectors, is "0", corresponding to the minimum output spike rate of 100 MHz.

[00100] Figures 5A, 5B, 5C and 5D each show a SPICE simulation result for the cases shown in Figures 4A and 4B, using an all-memristor MAC circuit 30 as in Figure 2A, with 25 VO2 input neurons 36i (i = 1 to 25) connected via 25 resistor synapses Wi1 to one VO2 output neuron 381. In the illustrated embodiment, digital pulse trains encoding "0" with 100 MHz frequency (Figure 5B) and "1" with 200 MHz frequency (Figure 5A) are fed to the input neurons as illustrated in Figures 4A and 4B, to generate spike trains at the same frequencies (rates). For the case in Figure 4A, illustrated in Figure 5C, the simulated output spike train shows a spike rate of 135 MHz. This spike rate corresponds to an encoded value of ["0" + ("1" - "0")·(135 - 100)/(200 - 100)] = "0.35", which is very close to the ideal dot-product value of "0.4" (140 MHz). For the case in Figure 4B, illustrated in Figure 5D, the simulated output spike train shows a spike rate of 100 MHz ("0"), exactly matching the ideal output rate.
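The linear rate-based encoding and decoding used above can be written out explicitly. The sketch below only restates the numbers given in the text (100 MHz for "0", 200 MHz for "1"); the function names are illustrative assumptions:

```python
# Linear rate-based encoding from the text: "0" -> 100 MHz, "1" -> 200 MHz.
# Function names are hypothetical, for illustration only.
R0, R1 = 100.0, 200.0  # spike rates in MHz for values "0" and "1"

def encode(value):
    """Map a normalized value in [0, 1] to a spike rate in MHz."""
    return R0 + (R1 - R0) * value

def decode(rate_mhz):
    """Map a spike rate in MHz back to a normalized value."""
    return (rate_mhz - R0) / (R1 - R0)

# Parallel case of Figure 4A: 10 of 25 matched "1" entries give an ideal
# normalized dot product of 10/25 = 0.4, i.e. an ideal rate of 140 MHz.
ideal_rate = encode(10 / 25)
# The simulated 135 MHz output decodes to 0.35, close to the ideal 0.4.
simulated_value = decode(135.0)
```

The same decoding applied to the perpendicular case of Figure 4B gives decode(100.0) = 0, matching the ideal output exactly.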

[00101] With this encouraging result of a preliminary all-memristor convolutional operation, the Inventors performed further verifications of all-memristor spike-rate based convolutions using MNIST handwritten digit images (included as part of the MATLAB neural network toolbox). The Inventors also replaced the resistor synapses with a Ta2O5:Ag passive memristor SPICE model provided by the authors of the following reference paper: "Truly electroforming-free and low-energy memristors with preconditioned conductive tunneling paths," by Yoon, J. H., Zhang, J., Ren, X., Wang, Z., Wu, H., Li, Z., Barnell, M., Wu, Q., Lauhon, L. J., Xia, Q., and Yang, J. J.; Advanced Functional Materials 27, 1702010 (2017).

[00102] According to embodiments of this presentation, the pre-trained convolutional kernel weights can have both positive and negative values, but the converted synaptic conductance values can only have positive values. Having both positive and negative synaptic weights can however be useful for some applications.

[00103] To simulate convolutions of MATLAB MNIST images in the spike domain, a linear transformation can be used to convert the pre-trained weights of the convolutional kernels (filters) in the convolution layer of a customized MATLAB CNN model into synapse conductance (resistance) values. The CNN model training can be performed in MATLAB using the stochastic gradient descent (SGD) method, and can be validated by a statistical MNIST image classification accuracy of 93.7% over 250 test images.
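A linear weight-to-conductance transformation of this kind can be sketched as follows. The endpoint conductances reuse the resistance values quoted earlier in this presentation (2 MΩ for "0", 570 kΩ for "1"); the weight range and function name are illustrative assumptions, not the actual MATLAB transformation of Figure 6B:

```python
# Hedged sketch of a linear weight-to-conductance mapping. Endpoint
# conductances follow the resistances quoted in the text; the kernel
# weight range [-1, 1] is an illustrative assumption.
G_MIN = 1.0 / 2.0e6    # siemens: minimum conductance, encodes "0"
G_MAX = 1.0 / 570.0e3  # siemens: maximum conductance, encodes "1"

def weight_to_conductance(w, w_min, w_max):
    """Linearly map a pre-trained kernel weight into [G_MIN, G_MAX]."""
    frac = (w - w_min) / (w_max - w_min)  # normalize weight to [0, 1]
    return G_MIN + frac * (G_MAX - G_MIN)

# The most negative weight maps to the minimum conductance, the most
# positive weight to the maximum conductance.
g_low = weight_to_conductance(-1.0, -1.0, 1.0)
g_high = weight_to_conductance(1.0, -1.0, 1.0)
```

Because conductances are strictly positive, this shifted mapping is one simple way to accommodate pre-trained kernels that contain both positive and negative weights, as noted in paragraph [00102].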

[00104] Figures 6A and 6B show an example MATLAB pre-trained convolutional kernel that the Inventors have studied, and its translation to synapse conductance (resistance) values using the linear transformation. Figure 6A illustrates an example of the ten pre-trained MATLAB convolutional kernels visualized as a (5x5) color scale map; and Figure 6B illustrates the linear transformation relationship between the MATLAB convolutional kernel weights and the memristor synaptic conductance (resistance) values.

[00105] Figure 7A shows a 28x28 (=784) pixel input image from a MATLAB MNIST image set. For clarity, the grayscale (255 levels) values of the image pixels are color coded as illustrated in the figure. In this example image, five example patches were extracted along the diagonal direction, each of these patches having 5x5 pixels. These five patches were then used as input examples in LTSpice to simulate all-memristor convolutions using a crossbar circuit consisting of 25 input active-memristor neurons, 25 passive-memristor synapses and one output active-memristor neuron (see FIGs. 4A and 4B).

[00106] Figure 7B shows an "ideal case" output image convoluted by a pre-trained MATLAB CNN 5x5 convolutional kernel (filter), using synapse conductance values converted from the original kernel weights in FIG. 6A by the linear transformation in FIG. 6B. In Figure 7B, the pixels are calculated by numerical convolutions in MATLAB, followed by a final conversion of the output grayscale data into spike rates for a comparison with LTSpice simulations of all-memristor convolutional circuits. The size of the convoluted image is 24x24 pixels after removing the edge pixels (two rows on each side). The values of the pixel elements are color coded as illustrated. According to an embodiment of this presentation, a linear rate-based data encoding scheme was used, in which the lowest value of a pixel (value 0) corresponds to a spike rate of 100 MHz, and the highest value of a pixel (value 255) corresponds to a spike rate of 200 MHz. According to an embodiment of this presentation, an all-memristor circuit such as shown in Figures 4A or 4B can be used to convolute a 5x5 input image patch in one shot.
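The "valid" convolution geometry described above (a 28x28 image and a 5x5 kernel yielding a 24x24 output after edge-pixel removal) can be sketched in plain code. This is only a numerical illustration of the geometry and MAC count, not the SPICE-simulated circuit:

```python
# Plain-Python sketch of a "valid" 2D convolution: each output pixel is
# the sum of 25 multiply-accumulate (MAC) products of a 5x5 image patch
# with the 5x5 kernel. A 28x28 input yields a 24x24 output.
def conv2d_valid(image, kernel):
    H, W = len(image), len(image[0])
    K = len(kernel)
    out = []
    for r in range(H - K + 1):
        row = []
        for c in range(W - K + 1):
            acc = 0.0
            for i in range(K):
                for j in range(K):
                    acc += image[r + i][c + j] * kernel[i][j]  # one MAC
            row.append(acc)
        out.append(row)
    return out

image = [[1.0] * 28 for _ in range(28)]      # dummy 28x28 image
kernel = [[1.0 / 25] * 5 for _ in range(5)]  # illustrative averaging kernel
out = conv2d_valid(image, kernel)            # 24x24 output image
```

In the all-memristor circuit, each such 25-term MAC is performed in one shot by the 25 synapses of the crossbar, with the spike-rate encoding standing in for the pixel values.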

[00107] Figures 8A, 8B and 8C illustrate an exemplary network architecture of a customized MATLAB image classification feedforward CNN model, with an example simulation.

[00108] Figure 8A illustrates an exemplary benchmarking MNIST database image set, and Figure 8B is a histogram that schematically illustrates a benchmarking method such as detailed in Figure 8C. Figure 8C illustrates in detail a network architecture of a customized MATLAB image classification feedforward CNN model with an example simulation. Shown at the left side of Figure 8C is a box representing an input image (a handwritten digit image from the MNIST image set) consisting of 28x28 (=784) grayscale pixels. The convolutional operations of Layer 1 (having ten 5x5 convolutional kernels/filters) convert the input image into ten activation images of 24x24 pixels (with two rows of edge pixels removed after convolution). These convolutional calculations were performed by SPICE simulations of an all-memristor convolutional circuit, with the pre-trained MATLAB convolutional kernel weights linearly transformed into synaptic conductances (see FIG. 6B for the transformation relationship). The SPICE-simulated memristor-based convolutional outputs were then fed back to the MATLAB CNN model pipeline for the rest of the numerical processing. In this example, the CNN model produced a proper classification of the input image as digit "0", since the classifier output value for Class "0" is the largest among the ten classes.

[00109] Figures 9A, 9B and 9C illustrate an example case of a simulated all-memristor convolution according to embodiments of this presentation. Figure 9A shows the schematic for part of an all-memristor VMM circuit, specifically the convolution kernel in the circuit 30 of Figure 2A, wherein the 25 resistor synapses Wi1 are replaced with 25 model Ta2O5:Ag passive memristors. The conductance of each of these passive memristors is determined by its internal state variable, which can be trained by a separate pulse-mode write circuit (not shown) prior to the convolutional operations. Note that to avoid accidental switching of the Ta2O5:Ag passive memristors during convolution operations, their switching threshold voltage can be designed to be significantly higher than a typical neuron spike amplitude of the order of 1 V. In practical implementations, such voltage matching needs to be considered, and can be achieved by properly designed active memristor process parameters (e.g. the VO2 conduction channel geometries). For clarity, the 25 VO2 input neurons are not illustrated in Figure 9A.

[00110] Similar to the case of CNNs, the Inventors found that a small convolution filter bias, in the form of a small d.c. biasing current for the output neurons, can be needed for optimized convolution accuracy. This is based on the observation of a systematic redshift in the simulated output neuron spike rates without a bias, which can be eliminated by a fine tuning of the neuron resting potential through a biasing current (2 µA in this case, provided through a 500 kΩ resistor and a 1 V voltage supply). Thus, a neuromorphic crossbar array circuit according to embodiments of this presentation can comprise a biasing circuit 70 for supplying the at least one output excitatory tonic neuron circuit 381 with a bias current arranged to fine-tune a resting potential of the at least one output excitatory tonic neuron circuit.

[00111] Figures 9B and 9C show an example simulated convolution of a (5x5) image patch from a sample image in a MATLAB MNIST image set; specifically, patch 2 in the image of digit "0" as outlined in Figure 7A. Figure 9C compares the relative errors in the SPICE-simulated convolution output spike rates for the 10 convolutional kernels, with and without the introduction of a small bias current of 2 µA. The Inventors noticed that after introducing the bias current, the average relative convolutional error is less than 0.8%. This result is even better than the simulated case using resistor synapses (~1.8%). Similar results were achieved with two different SPICE simulators (LTSpice and Cadence Verilog-AMS).

[00112] The table in Figure 10 summarizes the simulated performance metrics for an all-memristor VMM circuit according to embodiments of this presentation, with 35 VO2 active memristor neurons and 250 (25x10) passive memristor synapses (see Fig. 2B), running convolutional processing of grayscale MNIST handwritten digit images. At the chosen circuit parameters (neuron spike energy 0.4 pJ/spike, synapse resistance 588 kΩ to 2 MΩ, spike rate of the order of 100 MHz, 40 spikes per MAC operation), the simulated energy efficiency and throughput are already very competitive versus prior-art CMOS designs. The simulated convolution energy use per input bit is 0.3 nJ/bit (at 10 convolutions per image pixel and 3 bits/pixel). The convolution throughput is 7.5 Mbit/s (or 2.5M pixels/s) for such a compact crossbar circuit. It must be kept in mind that both the VO2 neuron spike energy and the number of spikes per MAC operation can be further reduced, which translates into even better energy efficiency. The passive memristor crossbar array size and the number of neurosynaptic cores can be further scaled up for much higher convolution throughput. The Inventors project that a high-definition (HD) video rate convolution throughput of the order of 8 Gbit/s, with an energy use of the order of 30 pJ/bit, is feasible for all-memristor neuromorphic technology, which would enable disruptive, ultra-high-volume data analysis and sensemaking applications.
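The throughput and energy figures above can be cross-checked with simple arithmetic. This sketch uses only the numbers stated in the text; the per-convolution energy is an inference from the stated per-bit figure, not a value from Figure 10:

```python
# Back-of-the-envelope check of the quoted performance metrics, using
# only numbers stated in the text.
pixels_per_s = 2.5e6       # stated convolution throughput, pixels/s
bits_per_pixel = 3         # stated grayscale data representation
throughput_bits = pixels_per_s * bits_per_pixel  # expected 7.5 Mbit/s

# 0.3 nJ/bit at 3 bits/pixel and 10 convolutions per pixel implies
# 0.9 nJ/pixel, i.e. 90 pJ per single 5x5 convolution (inferred).
energy_per_bit = 0.3e-9    # J/bit, stated
convs_per_pixel = 10       # stated
energy_per_conv = energy_per_bit * bits_per_pixel / convs_per_pixel
```

The 7.5 Mbit/s figure is thus consistent with 2.5M pixels/s at 3 bits/pixel, as stated in the table of Figure 10.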

[00113] Many computationally intensive applications can use a circuit or method according to this presentation, for example: space-domain data processing, such as image classification and image processing (e.g. compression) based on the fast Fourier transform (FFT) and the discrete cosine transform (DCT); time-domain signal processing based on FFT computations; and security-related applications that require high-throughput and energy-efficient convolutions.

[00114] Figure 11 illustrates a method 90 according to embodiments of this presentation, for calculating a multiplication-accumulation operation comprising multiplying a number N of first operands each by a corresponding second operand, and adding the products of the multiplications, with N ≥ 2; the method comprising: providing (92) N input conductors;

connecting (94) N programmable conductance circuits each between one of said input conductors and a unique output conductor;

programming (96) each programmable conductance circuit with a conductance proportional to one of said first operands;

inputting (98), on the input conductor connected to each conductance circuit programmed with a first operand, an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand;

generating (100) an output train of voltage spikes having a spike rate proportional to a sum over time of the spikes received on said output conductor.
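The multiplication-accumulation performed by method 90 can be sketched under a simple rate model: each programmed conductance multiplies (by Ohm's law) the spikes arriving on its input conductor, and the output spike rate tracks the summed current on the shared output conductor. The function name and the unit proportionality constant are illustrative assumptions:

```python
# Rate-model sketch of method 90: output spike rate proportional to the
# sum over the N branches of (conductance x input spike rate).
# Names and the `gain` normalization are illustrative assumptions.
def mac_output_rate(conductances, input_rates, gain=1.0):
    """Return an output spike rate proportional to sum_i G_i * r_i,
    where G_i is the programmed conductance (first operand) and r_i is
    the input spike rate (second operand) on input conductor i."""
    total = sum(g * r for g, r in zip(conductances, input_rates))
    return gain * total

# Three branches: each product G_i * r_i is accumulated on the single
# output conductor, yielding one MAC result per output neuron.
rate = mac_output_rate([1.0, 2.0, 3.0], [10.0, 10.0, 10.0])
```

Method 110 of Figure 12 generalizes this to M output conductors, one such accumulated rate per column of the crossbar.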

[00115] According to an embodiment of this presentation, said generating (100) an output train of voltage spikes proportional to the sum over time of the spikes received on said output conductor comprises integrating (102) the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

[00116] According to an embodiment of this presentation, the method further comprises providing each said input train of voltage spikes having a spike rate proportional to a second operand at the output of an excitatory tonic neuron circuit (for example such as illustrated in Figure 3) arranged for outputting a train of voltage spikes having a spike rate proportional to an input current that is itself proportional to said second operand.

[00117] According to an embodiment of this presentation, each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

[00118] According to an embodiment of this presentation, the method further comprises transforming an input voltage proportional to said second operand into said current proportional to said second operand.

[00119] Figure 12 illustrates a method 110 according to embodiments of this presentation, for calculating a number M of multiplication-accumulation operations, each multiplying N first operands by N corresponding second operands and adding the products of the multiplications, with N ≥ 2 and M ≥ 2; the method comprising: providing (112) N input conductors;

connecting (114) N programmable conductance circuits each between one of said input conductors and one of M output conductors;

programming (116) each programmable conductance circuit with a conductance proportional to one of said first operands;

inputting (118), on the input conductor connected to each conductance circuit programmed with a first operand, an input train of voltage spikes having a spike rate proportional to the second operand corresponding to said first operand;

generating (120) M output trains of voltage spikes, each having a spike rate proportional to a sum over time of the spikes received on a corresponding output conductor.

[00120] According to embodiments of this presentation, said generating (120) M output trains of voltage spikes comprises, for each output conductor, integrating (122) the current received on said output conductor over time as a potential and, when said potential goes beyond a predetermined threshold, outputting a voltage spike of predetermined value and duration and reinitializing said potential.

[00121] According to embodiments of this presentation, the method further comprises providing each said input train of voltage spikes having a spike rate proportional to a second operand at the output of an excitatory tonic neuron circuit (such as for example illustrated in Figure 3) arranged for outputting a train of voltage spikes having a spike rate proportional to an input current that is itself proportional to said second operand.

[00122] According to embodiments of this presentation, each excitatory tonic neuron circuit is provided for integrating said input current over time as a potential and, when said potential goes beyond a predetermined threshold, outputting said voltage spike of predetermined value and duration and reinitializing said potential.

[00123] According to embodiments of this presentation, the method further comprises transforming an input voltage proportional to said second operand into said current proportional to said second operand.

[00124] According to embodiments of this presentation, each of said plurality of programmable conductance circuits comprises a memory cell, said plurality of programmable conductance circuits forming a memory array arranged along rows and columns.

[00125] Modifications, additions, or omissions may be made to the systems, apparatuses, and methods described herein without departing from the scope of the inventive concepts. The components of the systems and apparatuses may be integrated or separated. Moreover, the operations of the systems and apparatuses may be performed by more, fewer, or other components. The methods may include more, fewer, or other steps. Additionally, steps may be performed in any suitable order. As used in this document, "each" refers to each member of a set or each member of a subset of a set.

[00126] To aid the Patent Office, and any readers of any patent issued on this application in interpreting the claims appended hereto, applicants wish to note that they do not intend any of the appended claims or claim elements to invoke paragraph 6 of 35 U.S.C. Section 112 as it exists on the date of filing hereof unless the words "means for" or "step for" are explicitly used in the particular claim.