GRADIENT FLOW EMULATION USING DRIFT DIFFUSION PROCESSES - ECOLE POLYTECHNIQUE FED LAUSANNE EPFL

Title:

GRADIENT FLOW EMULATION USING DRIFT DIFFUSION PROCESSES

Document Type and Number:

WIPO Patent Application WO/2021/116743

Abstract:

The present invention concerns a method of emulating gradient flow for solving a given problem as a charge distribution in a device (1) comprising: first type charge carrier regions (5) interfacing a second type charge carrier region (11) thereby forming charge-flow barriers (20); separating regions (7) for separating the first type charge carrier regions (5) from each other; input terminals (17) connected to the first type charge carrier regions (5); and one or more output terminals (19) connected to the second type charge carrier region (11), and for measuring output signals, and for receiving biasing signals to bias the charge-flow barriers (20) during a measurement phase.

Inventors:

DIWALE SANKET (CH)

Application Number:

PCT/IB2019/060782

Publication Date:

June 17, 2021

Filing Date:

December 13, 2019

Assignee:

ECOLE POLYTECHNIQUE FED LAUSANNE EPFL (CH)

International Classes:

G06G7/122; G06F17/11; H01L29/78

Foreign References:

US6949401B2	2005-09-27
US20120187498A1	2012-07-26
US4300151A	1981-11-10
US2704818A	1955-03-22

Other References:

HUANG YIPENG ET AL: "Hybrid Analog-Digital Solution of Nonlinear Partial Differential Equations", 2017 50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), ACM, 14 October 2017 (2017-10-14), pages 665 - 678, XP033536666
HUANG YIPENG ET AL: "Analog Computing in a Modern Context: A Linear Algebra Accelerator Case Study", IEEE MICRO, vol. 37, no. 3, 14 June 2017 (2017-06-14), pages 30 - 38, XP011652837, ISSN: 0272-1732, [retrieved on 20170614], DOI: 10.1109/MM.2017.55
S. Z. M. A. I. H. S. BETTAYEB: "Embedding grids into hypercubes", JOURNAL OF COMPUTER AND SYSTEM SCIENCES, vol. 45, no. 3, 1992, pages 340 - 366
FENG, X., ZHAO, X., YANG, L. ET AL.: "All carbon materials pn diode", NAT COMMUN, vol. 9, 2018, pages 3750
IVAN VLASSIOUK, SERGEI SMIRNOV, ZUZANNA SIWY: "Nanofluidic Ionic Diodes. Comparison of Analytical and Numerical Solutions", ACS NANO, vol. 2, no. 8, 2008, pages 1589 - 1602

Attorney, Agent or Firm:

LUMI IP LLC (CH)

Claims:

CLAIMS

1. A device (1) for emulating gradient flow for solving a given problem as a charge distribution in the device (1) comprising:

• at least three first type charge carrier regions (5) comprising predominantly first type charge carriers, and interfacing a continuous, second type charge carrier region (11) comprising predominantly second type charge carriers having an opposite electric charge to the first type charge carriers, thereby forming at least three charge-flow barriers (20) forming boundary regions between the first and second type charge carrier regions (5, 11), the charge-flow barriers (20) preventing a flow of the first type charge carriers from the first type charge carrier regions (5) to the second type charge carrier region (11), and preventing a flow of the second type charge carriers from the second type charge carrier region (11) to the first type charge carrier regions (5) in the absence of an external force to help the first and second type charge carriers to overcome the charge-flow barriers (20), the charge-flow barriers (20) being arranged along one or more imaginary charge-flow barrier reference line(s) (RL1) extending across the device (1);

• separating regions (7) for separating the at least three first type charge carrier regions (5) from each other such that a respective first type charge carrier region (5) is separated from other first type charge carrier regions (5) by respective separating regions (7);

• input terminals (17) for applying input signals to the device (1), the input terminals (17) being connected to the at least three first type charge carrier regions (5);

• one or more output terminals (19) connected to the second type charge carrier region (11), and for measuring output signals, and for receiving biasing signals.

2. A device (1) according to claim 1, wherein different input signals are configured to be applied to different input terminals (17) for forming a desired voltage profile along the charge-flow barrier reference line(s) (RL1).

3. A device (1) according to claim 1 or 2, wherein the first type charge carrier regions (5) are p-type regions, while the second type charge carrier region (11) is an n-type region, or vice versa.

4. A device (1) according to any one of the preceding claims, wherein the second type charge carrier region (11) extends across the device (1) along the one or more imaginary charge-flow barrier reference line(s) (RL1).

5. A device (1) according to any one of the preceding claims, wherein during an emulation phase when no biasing signals are applied to the output terminals (19), the device (1) is devoid of any closed circuit configuration involving the first and second type carrier charge regions (5, 11) to prevent any current flowing out of the device (1).

6. A device (1) according to any one of the preceding claims, wherein the first type charge carrier regions (5) are arranged at least in two rows extending along at least two of the one or more imaginary first type charge carrier region reference lines (RL2).

7. A device (1) according to any one of the preceding claims, wherein the separating regions (7) are formed by an insulating element or by the second type charge carrier region (11).

8. A device (1) according to any one of the preceding claims, wherein the device (1) is a semiconductor device, and wherein the charge-flow barriers (20) are semiconductor junctions.

9. A device (1) according to any one of the preceding claims, wherein the one or more imaginary charge-flow barrier reference line(s) (RL1) and/or the one or more imaginary first type charge carrier region reference line(s) (RL2) is/are straight lines.

10. A device (1) according to any one of the preceding claims, wherein the output terminals (19) are arranged along one or more imaginary output terminal reference line(s) (RL3), wherein the one or more imaginary output terminal reference line(s) (RL3) is/are straight lines.

11. A device system (21) comprising at least two devices (1) according to any one of the preceding claims, wherein the device system (21) comprises one or more channels (23) such that two respective devices (1) are connected to each other with a respective channel (23).

12. A device system (21) according to claim 11, wherein the device system (21) comprises one or more switches (29) for dynamically connecting one or more devices (1) to each other.

13. A device system (21) according to claim 11 or 12, wherein the one or more channels (23) are semiconductor channels, and wherein the respective channel (23) comprises a first type semiconductor channel region and a second type semiconductor channel region, both extending longitudinally along a respective channel axis.

14. A device system (21) according to any one of claims 11 to 13, wherein the one or more channels (23) comprise control terminals (27) arranged longitudinally along a respective channel axis for providing channel biasing signals.

15. A device system (21) according to any one of claims 11 to 14, wherein the device system (21) comprises one or more feedback loops so that a respective feedback loop is connected between at least some of the output terminals (19) of a respective device and at least some of the input terminals (17) and/or at least some of the control terminals (27) of the channel (23) associated with the respective device (1) to dynamically adjust the input signals and/or the biasing signals based on the output signals.

16. A method of emulating gradient flow for solving a given problem as a charge distribution in a device (1) or a device system (21), a respective device (1) comprising:

• first type charge carrier regions (5) interfacing a second type charge carrier region (11) thereby forming charge-flow barriers (20) restricting movement of electric charge carriers between the first type charge carrier regions (5) and the second type charge carrier region (11) in the absence of an external force to help the electric charge carriers to overcome the charge-flow barriers (20);

• separating regions (7) for separating the first type charge carrier regions (5) from each other;

• input terminals (17) connected to the first type charge carrier regions (5);

• one or more output terminals (19) connected to the second type charge carrier region (11),

the method comprising:

• determining (125) input signal values for the input terminals (17);

• applying (127) the determined input signal values at least to the input terminals (17) for allowing electric charges to be redistributed in the second type charge carrier region (11);

• applying (129) biasing signals to the one or more output terminals (19); and

• measuring (129) output signals from the one or more output terminals (19) while applying the biasing signals to the output terminals (19) to determine electric charge distributions around the output terminals (19).

17. A method according to claim 16, wherein the method further comprises converting (131) the output signals into charge distributions, and converting the charge distributions into variable distributions for solving the given problem characterised by problem variables.

18. A method according to claim 16 or 17, wherein the method further comprises:

• defining (101) one or more problem variables;

• defining (103) a problem variable space defining a computational space;

• defining (105) one or more problem functions to be used for determining the input signal values; and

• mapping locations of at least the input terminals (17) to points in the computational space to be used for determining the input signal values.

19. A method according to claim 18, wherein prior to the mapping, the method further comprises:

• partitioning (107) the computational space into a set of polytopes, wherein a respective polytope comprises a respective set of line segments (31) defining edges of the respective polytope, and a respective set of vertices (33);

• mapping (109) the sets of line segments (31) and the sets of vertices (33) into the device (1) so that at least some of the vertices (33) overlap with at least some of the input terminals (17).

20. A method according to claim 19, wherein at least two line segments (31) are mapped into one single device (1) if their vertices (33) overlap with at least some of the input terminals (17) of the device (1).

21. A method according to any one of claims 16 to 20, wherein the biasing signals reverse or forward bias the charge-flow barriers (20) during a measurement phase.

22. A method according to any one of claims 16 to 21, wherein at least some of the determined input signal values are spatially non-constant across the input terminals (17).

Description:

GRADIENT FLOW EMULATION USING DRIFT DIFFUSION PROCESSES

TECHNICAL FIELD

The present invention relates to a hardware accelerator for solving, for instance, optimal transport and/or Bayesian inference problems using drift-diffusion of charge carriers in the accelerator. The invention also relates to a method of operating the accelerator and to a computer program product.

BACKGROUND OF THE INVENTION

Bayesian inference and optimal transport problems play a central role in mathematical modelling of various real-world phenomena. Traditionally the computational tractability of such problems is limited to special cases for exact computation due to the requirement to compute intractable integrals. Even in the cases where exact computation is possible, the computational cost is exponential in the number of integral variables, making the problem NP-hard. Most practical applications resort to iterative numerical approximation of the solution to such problems via numerical optimisation or stochastic sampling algorithms, such as Monte Carlo approaches. More recently, methods stemming from the differential form of the optimisation step in optimal transport (and Bayesian inference, as its special case) have been investigated. These methods require solving a partial differential equation (sometimes referred to as a gradient flow equation or gradient flow PDE) to find the optimal solution. The differential form of the gradient flow PDE avoids the computation of the intractable integrals in the original problem. However, the PDE itself is expensive to solve numerically, because traditional PDE solvers operate over a discretised domain. All the above approaches, namely the numerical optimisation approximations, Monte Carlo approaches, exact computations or the PDE approach, require some form of parallelisation and hardware acceleration to obtain practical solve times. Such acceleration is currently provided by specialised software implementations of these algorithms on a graphics processing unit (GPU), tensor processing unit (TPU) or cluster computing systems. However, the performance in terms of energy efficiency and acceleration capability of currently available acceleration techniques is not satisfactory.

SUMMARY OF THE INVENTION

It is an object of the present invention to overcome at least some of the above problems related to solving gradient flow related problems for modelling real-world phenomena. More specifically, the present invention aims to increase the speed and energy efficiency of solving optimal transport and/or Bayesian inference problems.

According to a first aspect of the invention, there is provided a device as recited in claim 1.

The proposed device can be used to emulate the solution to the optimal transport and/or Bayesian inference problems. The emulation using device or semiconductor physics makes solving the gradient flow PDE extremely fast and energy efficient. The emulation happens in the order of microseconds, while a numerical solution to the same PDE can take tens of minutes to hours on a multicore processor. The device emulation is also a zero current consuming process and thus does not bear any resistive energy loss. Any energy consumed comes from the capacitive losses that may occur during reconfiguration of the hardware to different voltage levels, or from resistive losses during the output current measurement for the time duration of the measurement. Compared with existing solutions, the proposed solution also has a lower cost and allows increased miniaturisation.

The proposed device operating as a hardware accelerator differs significantly from state-of-the-art hardware accelerators that rely on emulating computer algorithms by processing the required computations in parallel using floating point operations on binary digital data. The proposed device on the other hand uses the inherent physical processes occurring within a semiconductor or charge carrying material to emulate a PDE using an analogue process.

According to a second aspect of the invention, there is provided a device system comprising a set of devices according to the first aspect of the present invention.

According to a third aspect of the invention, there is provided a method of operating the device for emulating gradient flow for solving a given problem as a charge distribution in the device as recited in claim 16. Other aspects of the invention are recited in the dependent claims attached hereto.

BRIEF DESCRIPTION OF THE DRAWINGS

Other features and advantages of the invention will become apparent from the following description of a non-limiting example embodiment, with reference to the appended drawings, in which:

Figure 1 is a simplified schematic illustration of the device according to an embodiment of the present invention;

Figure 2 is a simplified schematic illustration of the device according to a variant of the present invention;

Figure 3 shows a network of devices of Figure 1 connected together;

Figures 4a to 4c are flow charts illustrating method steps for solving a given gradient flow related problem;

Figure 5 is a schematic illustration of a computational space according to an example of the present invention; and

Figure 6 is a schematic illustration showing how line segments together with their related vertices of a computational space can be mapped into a device space according to an example of the present invention.

DETAILED DESCRIPTION OF AN EMBODIMENT OF THE INVENTION

An embodiment of the present invention will now be described in detail with reference to the attached figures. This embodiment is described in the context of a semiconductor accelerator device which may be used for solving the gradient flow PDE and thus to obtain solutions to the optimal transport and/or Bayesian inference problems as electric charge distributions in the semiconductor device. The gradient flow emulation can be used to accelerate Bayesian inference tasks for instance in machine learning, signal processing and estimation. Bayesian inference is also used to approximate neural networks and other such machine learning models. It is also used to establish probabilistic numerical methods to solve problems in linear algebra, optimisation, and approximation of linear operators like integration, differentiation, convolution etc. These are fundamental operations in mathematics that find applications across many fields including machine learning, optimisation, decision making, control theory, operations research, signal processing, computer graphics and more. The proposed solution provides an energy efficient hardware accelerator for these computations, addressing the needs of all such fields and provides a superior alternative to energy hungry computations using GPUs or similar units. The embodiment describing the semiconductor accelerator device is also extended to cover a reconfigurable semiconductor accelerator device system comprising a set of interconnected semiconductor devices. However, the teachings of the invention are not limited to the above applications. The teachings of the invention are equally applicable in various other technical fields. Identical or corresponding functional and structural elements which appear in the different drawings are assigned the same reference numerals.

As will be described below in more detail, the present invention proposes a semiconductor accelerator and an in-hardware physical process to solve the gradient flow PDE for obtaining solutions to any one of the above-mentioned problems as charge distributions in a semiconductor. The invention thus covers the design of the physical hardware device, and a process of operating the device, which involves the physical process (drift-diffusion) occurring in the semiconductor as a means to solve the gradient flow PDE, a process to configure the device or device system to solve different optimal transport and/or Bayesian inference problems, and/or a measurement technique to retrieve the solution from the physical process. A new approach was also developed to solve an N-dimensional gradient flow using a collection of R-dimensional gradient flow equations combined using additional consensus terms, where R<N. This approach was used to emulate the N-dimensional gradient flow using a collection of one- to three-dimensional gradient flow equations using the physical device.

Figure 1 shows an example semiconductor apparatus or device 1 that can be used to emulate gradient flow for solving various problems, such as optimal transport and Bayesian inference, as detailed later. The device comprises in a bottom or lower region or on a first side 3, which in this case is the bottom side, a set of first type charge carrier, doped or semiconductor regions 5 (indicated with a backward oriented pattern in Figure 1) separated from each other by a respective separating or isolating element or region 7 (indicated with a forward oriented pattern in Figure 1), which in this example is an insulating region. The separating regions may be made of any suitable insulating material or they could be second type doped or semiconductor regions or any combination of them. The first type doped regions 5 and the separating regions 7 are in this example both longitudinal regions extending substantially orthogonally from the first side 3 towards a second, opposing side 9 of the device, which in this example is the top or upper side. The first type doped regions 5 and the separating regions 7 thus extend between a first region end and a second, opposing region end, which in this example is located substantially in a centre region of the device. This centre region extends longitudinally along an imaginary reference line extending across the device or along a longitudinal axis of the device (which does not necessarily cross the centre of the device). In this example the longitudinal axis is straight, but it could be curved instead. The first type doped regions 5 and the separating regions 7, which are arranged between the first type doped regions 5 to isolate the first type doped regions from each other, are arranged along an imaginary first type semiconductor region reference line extending across the device 1 between its extremities.
It is to be noted that the first type doped regions 5 and the separating regions 7 do not have to form a 90-degree angle with respect to the longitudinal device axis, but these regions could be instead angled with respect to the longitudinal device axis. The device length along the longitudinal axis of the device is typically between 10 µm and 1000 µm, or more specifically between 30 µm and 300 µm, or between 50 µm and 200 µm. The device dimension along an axis perpendicular to the longitudinal axis is typically between 3 µm and 300 µm, or more specifically between 10 µm and 100 µm, or between 15 µm and 70 µm.

The first type doped regions 5 and the separating regions 7 face at their second ends a second type charge carrier, doped or semiconductor region 11, which in this example forms one continuous region and extends along the longitudinal device axis across the entire device between a third side 13 and a fourth side 15 of the device, which in this example are the lateral sides of the device 1. The region is continuous, connected or uninterrupted in the sense that it allows charge carriers to flow or move within the continuous region. In the configuration shown in Figure 1, the separating regions 7 are longer than the first type doped regions 5 and thus protrude into the second type doped region 11 to ensure that the first type doped regions are properly insulated from each other. In this specific example, the first type doped regions 5 are p-type regions, while the second type doped region 11 is an n-type region. However, it is to be noted that the doping types in these regions could be easily reversed without affecting the functionality of the device 1. The device is thus split into two different types of semiconductor regions, one which has a dominant concentration of n-type charge carriers and another which has a dominant concentration of p-type charge carriers. In the upper part of the device 1, the second type doped region 11 forms a set or series of pockets, which in this example accommodate another set of first type doped regions. These are separated from the first type doped regions in the lower part of the device by the second type doped region 11. As can be seen in Figure 1, the p-type regions 5 at the bottom are separated into several semiconductor channels by an insulating material, while the p-type regions 5 at the top are separated into channels by a respective channel of n-type material. The number of p-type regions (or n-type regions in an inverted arrangement) may be between 2 and 1000, or more specifically between 20 and 1000, or between 3 and 500. 
However, it has been discovered that reliable results can be obtained if the number of p-type regions is between 10 and 300. It is to be noted that typically fewer p-type regions make manufacturing easier at the cost of lower granularity of the input voltage distribution.

The device 1 further comprises a set of input terminals 17 forming conductive metal-semiconductor interfaces, which can be used to apply bias or input signals, which in this example are voltages, at various points. It is to be noted that the word “signal” is used in the present description in its broad sense and does not imply that any information would be coded in the signal. The input terminals have a conductive interface with the semiconductor to allow current to flow in or out of the device. The input terminals are in this example located at the first end of a respective p-type region. However, other alternative terminal locations would also be possible. As shown in Figure 1, the input terminals 17 are in this example placed both in the lower and upper parts of the device 1, although it would be possible to have them e.g. only in the lower part of the device. The advantage of placing the input terminals both in the lower and upper regions of the device is that more input terminal locations can be obtained compared with a situation where the input terminals would be only located either in the lower or upper region of the device 1. This increases the granularity of the device when it comes to the possible locations where input signals could be applied to the device 1. This means that it is possible to apply different input signals to different input terminals. The input signal may thus be considered to be a spatially varying signal. The device also comprises a set or series of output terminals 19 connected to the n-type region 11. The output terminals 19 are in this example structurally substantially identical to the input terminals 17, and are arranged along the longitudinal axis of the device. The output terminals are located close to the p-n interface, the location of which is however not well defined. 
Each bottom p-channel is thus in this example terminated at the first end with an input terminal, which can be used to apply input signals (which may be considered to function as input biasing signals) as required for the emulation process. The output terminals are placed along each p-type channel in the n-type material where output voltages (which may also be considered to function as output biasing signals) can be applied and the output signals (in this example currents) from these terminals can be used to characterise the solution for the emulated problem. The input signals and the output biasing signals are in the present example voltage signals. In the example of Figure 1, the first type doped regions 5, the input terminals 17 and the output terminals 19 are evenly spread along the device between the two lateral sides. However, it is to be noted that the positions or the number of the output terminals 19 does not depend on the number of the input terminals 17 and/or the positions or number of the p-type regions. It is also to be noted that the arrangement of the input and output terminals remains unaffected if the p-type and n-type regions are inverted.

The interface of the p-type and n-type regions forms a series or set of semiconductor p-n-junctions or unidirectional charge-flow barriers 20 within the device 1 that allow for n-type carriers to be contained within the device 1. A p-n junction can be understood to be a boundary or interface between two types of semiconductor materials, namely p-type and n-type, inside a single crystal of semiconductor. The p-side (positive side) contains an excess of holes, while the n-side (negative side) contains an excess of electrons in the outer shells of the electrically neutral atoms there. This allows electric current to pass through the junction 20 in one direction only (assuming a closed-circuit configuration). The charge-flow barriers 20 thus prevent the flow of first type charge carriers from the first type doped regions 5 to the second type doped region 11, and prevent the flow of second type charge carriers from the second type doped region 11 to the first type doped regions 5 in the absence of an external force to help the first and second type charge carriers to overcome the charge-flow barrier. Here the first type doped regions 5 comprise predominantly first type charge carriers (i.e., the first type charge carrier regions comprise more first type charge carriers than second type charge carriers) while the second type doped region 11 comprises predominantly second type charge carriers (i.e., the second type charge carrier region comprises more second type charge carriers than first type charge carriers) having an opposite electric charge to the first type charge carriers.

Various imaginary reference lines may be defined in the device 1. As shown in Figure 1, one or more charge-flow barrier reference lines RL1 are defined such that each of them crosses the p-n junctions and extends across the device 1. One or more first type charge carrier region reference lines RL2 are defined such that the p-type regions are arranged along the first type charge carrier region reference lines RL2. An output terminal reference line RL3 is defined such that the output terminal reference line RL3 crosses the output terminals 19. In the example of Figure 1, all three reference lines are parallel. Furthermore, in this example all these reference lines are straight lines.

During an emulation phase, during which input signals are applied to the device 1 but no direct signals are applied to the n-region, the charge carriers rearrange in the device, and some transient current flows, i.e., positive charge carriers move into the n-region across the p-n junction 20. Under such a configuration, no n-type charge carriers can flow in or out of the device 1, or in or out of the device system if multiple devices are connected through consensus channels as explained later. During this phase, the emulation takes place in the n-region, but close to the p-n junctions 20. In other words, the distribution of the n-type carriers in such a closed system, while input signals are applied to the input terminals 17, allows for emulation of the required gradient flow equation. At the end of the emulation phase, the device reaches a steady state or equilibrium during which substantially no charge carrier rearrangement occurs.
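For orientation, the closed-system behaviour described above can be written in the standard textbook drift-diffusion form for electrons. This is only an illustrative sketch and not necessarily the formulation used elsewhere in this document: the symbols n (electron density), φ (electrostatic potential set by the input signals), D_n, μ_n and V_T are assumptions introduced here, and the potential is treated as fixed, i.e., its self-consistent coupling to the charge through Poisson's equation is neglected.

```latex
% Continuity equation with drift-diffusion flux and zero-flux (closed-system) boundary
\frac{\partial n}{\partial t}
  = \nabla \cdot \bigl( D_n \nabla n - \mu_n \, n \, \nabla \varphi \bigr),
\qquad
\bigl( D_n \nabla n - \mu_n \, n \, \nabla \varphi \bigr) \cdot \hat{\boldsymbol{\nu}} = 0
\quad \text{on the boundary}.

% Steady state: the flux vanishes everywhere, which gives a Boltzmann-type profile
D_n \nabla n_\infty = \mu_n \, n_\infty \nabla \varphi
\;\;\Longrightarrow\;\;
n_\infty(x) \propto \exp\!\left( \frac{\varphi(x)}{V_T} \right),
\qquad
V_T = \frac{D_n}{\mu_n} = \frac{k_B T}{q}.
```

Under these assumptions the equation has the form of a Fokker-Planck equation, and the zero-flux boundary condition expresses that no n-type carriers enter or leave the device during the emulation phase, so the total charge is conserved while its spatial distribution relaxes to the steady state.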

During a measurement phase, during which output biasing signals are applied to the output terminals 19 for a short duration of time (e.g., a duration between 1 µs and 100 µs), and input signals are continuously applied to the input terminals 17, output signals are measured to obtain the emulation result. It is to be noted that the measurement phase may even be repeated several times so that it repeatedly and briefly interrupts the emulation phase. However, the final solution to the problem can be obtained only when the steady state has been reached. During the measurement phase(s), the respective output signal, which in this example is current, is proportional to the n-type charge density near the respective output terminal 19 and can thus be used to indirectly infer the charge distribution in the device, which in turn gives the solution to the emulated problem.

The device of Figure 1 can be used to emulate gradient flow problems in one dimension. Figure 2 shows a variant of the device of Figure 1. The device 1 of Figure 2 is able to emulate two-dimensional flows as two spatial dimensions are available. If a further dimension is added, then the device would be able to emulate three-dimensional flows. Thus, a single device would be able to emulate at most three-dimensional flows as at most three spatial device dimensions are available.
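The conversion of measured output currents into a charge distribution, as described for the measurement phase, can be sketched as follows for a one-dimensional device. This is a minimal illustration under stated assumptions: the function name `currents_to_density`, the assumption that every output terminal has the same current-to-density proportionality constant, and the trapezoidal normalisation are all hypothetical choices made here, not details taken from this document.

```python
import numpy as np

def currents_to_density(currents, positions):
    """Turn output-terminal currents into a normalised 1-D density estimate.

    Assumes (hypothetically) that each current is proportional to the local
    n-type charge density with the SAME unknown constant at every terminal,
    so that normalising the profile removes the constant.
    """
    i = np.asarray(currents, dtype=float)
    x = np.asarray(positions, dtype=float)
    if i.ndim != 1 or i.shape != x.shape or i.size < 2:
        raise ValueError("need two or more terminals, one current per position")
    if np.any(i < 0):
        raise ValueError("currents are expected to be non-negative")
    if np.any(np.diff(x) <= 0):
        raise ValueError("positions must be strictly increasing")
    # Trapezoidal integral of the current profile along the device axis.
    area = np.sum(0.5 * (i[1:] + i[:-1]) * np.diff(x))
    if area <= 0:
        raise ValueError("current profile integrates to zero")
    return i / area

# Example with made-up readings at three equally spaced output terminals.
density = currents_to_density([1.0, 2.0, 1.0], [0.0, 1.0, 2.0])
```

The returned values integrate to one over the terminal positions, so they can be read as samples of a probability density over the mapped problem variable.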

Figure 3 shows an arrangement that can be used to solve multi-dimensional gradient flow problems. More specifically, multiple devices 1 can be interconnected as shown in Figure 3 to emulate the gradient flow for such problems. The example device system 21 of Figure 3 includes five devices 1. As shown in Figure 3, multiple devices can be dynamically connected together by using a bus or channels or interconnectors 23, and more specifically semiconductor channels. These channels are in the following description called consensus channels denoting their mathematical function as explained later. The consensus channels comprise an n-type element and a p-type element in direct contact with each other and form one or more longitudinal channels for connecting various devices 1 together. Any given device may thus be connected to one or more consensus channels 23. The channels are thus used to connect the various devices at one or more consensus points or terminals 25 to one or more other devices 1. The consensus terminals 25 may thus be considered to be a subset of the input terminals 17. In the example of Figure 3, the consensus channels further comprise control terminals 27, operating as biasing terminals for applying biasing signals, which in this example are voltages, to the consensus channel 23. The control terminals are used in a consensus control process as explained later. Figure 3 also shows multiplexers or switches 29 for dynamically changing the configuration of the channel connections to configure the device system 21 to emulate different problems.

The purpose of the channels 23 is to equalise the carrier concentration in all the connected devices 1 near the points of interconnection (i.e., near the consensus terminals 25) to bring all the connected devices in charge carrier consensus with each other. A series of biasing voltages can be applied at the control terminals 27 placed along the consensus channel. These biasing voltages can be used to control the flow of carriers across the consensus channel to drive the devices into consensus. According to an active consensus process, spatially varying biasing signals are applied along a respective consensus channel to actively and quickly bring the devices into consensus. According to the active consensus process, the biasing signals also advantageously vary over time. Alternatively, the control terminals 27 may all be connected to equal voltage potential or substantially equal voltage potential to allow for passive equalisation of the carrier concentrations using the inherent carrier diffusion mechanism of a semiconductor. According to this passive consensus process, the equal voltage potential along a respective consensus channel 23 would be the same potential as applied to the consensus terminal 25 of that channel.
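By way of a non-limiting illustration, the passive consensus process can be pictured with a simple numerical sketch. The function name `passive_consensus`, the `coupling` parameter and the update rule are assumptions introduced here for illustration only; they are an idealised stand-in for carrier diffusion along a consensus channel, not a model of the actual semiconductor physics.

```python
def passive_consensus(concentrations, coupling=0.2, steps=100):
    """Relax carrier concentrations at terminals sharing one channel.

    Hypothetical toy model: at each step every terminal moves a fraction
    `coupling` of the way toward the current mean, which conserves the
    total amount of carriers while equalising the individual values.
    """
    c = list(concentrations)
    for _ in range(steps):
        mean = sum(c) / len(c)
        c = [ci + coupling * (mean - ci) for ci in c]
    return c


# Three devices joined at one consensus point, starting out of consensus.
levels = passive_consensus([1.0, 2.0, 6.0])
print(levels)
```

In this sketch the three values approach their common mean while their sum stays constant, which mirrors the equalisation of carrier concentrations near the consensus terminals 25.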

The device design and interconnections as shown in Figure 3 allow the emulation of the gradient flow PDE for N-dimensional problems using drift-diffusion of charges in a network of such semiconductor devices. The term drift-diffusion refers to the movement of charges due to two separate influences, (i) drift of charges, due to the influence of a spatially varying voltage on a charged particle, and (ii) diffusion, which refers to the net movement of particles from a region of higher concentration to a region of lower concentration of the particles. In the configuration of Figure 3, each device 1 is used to emulate a one-dimensional gradient flow PDE. Then the device system 21 is used to emulate a general N-dimensional gradient flow by allowing each device to emulate the flow for a different dimension and/or part of the computational space, and by making interconnections between the devices to enforce consensus between the devices to obtain the global N-dimensional solution. However, it would be possible to use each device to emulate two-dimensional or three-dimensional flows as well and to make similar networks. A new consensus-based algorithm will be explained later in more detail to solve an N-dimensional gradient flow problem using a collection of one-dimensional (or R < N dimensional) gradient flow problems. This allows numerical PDE solvers to solve the problem using meshes in lower dimensional spaces, making the numerical solver faster than directly solving on an N-dimensional mesh. It also allows the above emulation using a network of semiconductor devices.

It is to be noted that semiconductor physics restricts the kind of electric fields that can exist within the substrate. This makes it impossible to use the drift-diffusion part of the physics in a bulk substrate to emulate a gradient flow PDE by applying an appropriate electric field/potential field. Instead, a sequence of p-n junctions 20 is used in the substrate according to the present invention. The p-n bulk regions act as a reserve for excess charge carriers that can supply p or n carriers as required to support any desired electric field at the p-n junction interface. Thus, at the junction, a desired electric field for the gradient flow emulation can be established and the resultant equilibrium of carrier densities at the junction provides the solution to the gradient flow PDE. Given such charge densities, a measurement process is needed to read the solution back into traditional computing systems. Since drift currents (i.e., in this case the output currents during the measurement phase(s)) are proportional to charge densities, a probing electric field (vertical in the configuration of Figure 1) generated by the applied output biasing signals, and which is orthogonal to the gradient flow emulation electric field (horizontal in the configuration of Figure 1), can be applied to measure the output current and thus indirectly infer the charge density at the given point.

Another challenge for emulating the gradient flow PDE is in imposing a zero-current flow at the external boundaries of the device (the current flow during the measurement is only a small perturbation to the process, which is removed after a short duration). Applying an electric field to a substrate requires electric contacts to be made (by means of the input and output terminals) in order to apply appropriate spatially varying voltages to the input and output terminals. If the input terminals are directly connected to the substrate that emulates the gradient flow (in this case the n-region), this would result in a current flow through the substrate and thus violate the required boundary condition for the gradient flow PDE. Instead, the p-n junction 20 is used to prevent such current flow. More specifically, thanks to the built-in electric field at the p-n junction, any current flow out of the device can be prevented. The process of applying the input signals (or voltages) to the input terminals applies the same input voltage at the p-n junction. In the above statements p and n are interchangeable depending on the choice of which type of carrier is to be used for the gradient flow emulation.

The method of operating the device system 21 of Figure 3 is next explained in more detail with reference to the flow charts of Figures 4a to 4c. At the beginning of the process, the user of the device system describes or specifies the problem to be solved. This description contains three parts in this example, namely steps 101, 103 and 105. In step 101, problem variables are defined. In other words, a definition is obtained for the variables whose value is to be solved for, henceforth called the problem variables. If the teachings of the invention are used in the context of an artificial intelligence system (e.g., an artificial neural network), then the problem variables could be the weights of the artificial intelligence system, or an output of the artificial intelligence system. If on the other hand the teachings of the present invention are applied to linear algebra problems, then the problem variables could be a parameter X whose value depends on matrix A and vector B, where A and B are in this example external data. In step 103, a problem variable space, referred to as a computational space, is defined. In other words, a definition is obtained for the collection of all possible values taken by the variables. This collection is assumed to be representable using a continuous subset of numeric values in $\mathbb{R}^N$, henceforth called the computational space. $\mathbb{R}^N$ is understood to be the N-dimensional real number space, or the Euclidean space. In step 105, problem functions are defined. In other words, a set of functions is obtained that can be evaluated for any given value of the variables, henceforth called problem functions. 
A problem function takes as inputs values for the problem variables and optionally can take inputs for a distribution of the problem variables, previous output values of the problem function and external problem related data, like the matrices A, B in the linear algebra problems mentioned above or external data used for training an artificial intelligence system.

A computer programming language may be used to provide such a description. Other ways to provide the description may include pen-paper descriptions or hardware implementations of the functions. Apart from the main problem description, the user can also specify a few hardware configuration options, described in connection with steps 107 to 113.
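By way of a non-limiting illustration, a problem description covering steps 101, 103 and 105 could be written in a programming language as sketched below. The class `ProblemDescription`, the helper `make_linear_problem` and the choice of a squared-residual problem function for the linear algebra example are assumptions made here for illustration; the description above only states that the problem variable depends on the matrix A and the vector B.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class ProblemDescription:
    """Hypothetical container for the three parts of a problem description."""
    variable_names: Tuple[str, ...]                        # step 101: problem variables
    bounds: Tuple[Tuple[float, float], ...]                # step 103: box-shaped computational space
    problem_function: Callable[[Sequence[float]], float]   # step 105: problem function


def make_linear_problem(A, B):
    """Build a description whose problem function is 0.5 * ||A x - B||^2 (assumed form)."""
    def residual(x):
        total = 0.0
        for row, b in zip(A, B):
            r = sum(a * xi for a, xi in zip(row, x)) - b
            total += 0.5 * r * r
        return total

    n = len(A[0])
    names = tuple(f"x{i + 1}" for i in range(n))
    bounds = tuple((-5.0, 5.0) for _ in range(n))
    return ProblemDescription(names, bounds, residual)


# External data A and B; the exact solution of A x = B is x = (1, 2).
problem = make_linear_problem([[2.0, 0.0], [0.0, 4.0]], [2.0, 8.0])
print(problem.variable_names, problem.problem_function([1.0, 2.0]))
```

Keeping the external data inside a closure is one possible design choice; it lets the problem function be evaluated at any point of the computational space without passing A and B again.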

The hardware configuration is described next with reference to steps 107 to 113. The problem description taken from the user input is used to create a hardware configuration that will emulate the solution finding process for the defined problem. In step 107, the computational space in $\mathbb{R}^N$ is partitioned using a collection of n-dimensional polytopes, where the partition may cover the entire computational space or be an approximate cover that maximises the coverage of the space according to a given metric (for example a computational space of a sphere shape could be approximated by using a set of cubes). A wide range of choices for partitioning methods and coverage metrics is available from the state of the art in computational geometry, including methods for partitioning subsets in $\mathbb{R}^N$ using simplexes (generalisation of triangles in $\mathbb{R}^N$) and gridding using hypercubes (for instance according to the teachings of S. Bettayeb, Z. Miller and I. H. Sudborough, “Embedding grids into hypercubes”, Journal of Computer and System Sciences, vol. 45, no. 3, pp. 340-366, 1992), as examples. A coverage metric is understood to be a function that gives an absolute value as an output describing the difference between the real computational space and its approximation. The partitioning provides a collection of edges 31, referred to as line segments, and vertices 33 in $\mathbb{R}^N$, belonging to the partitioning polytopes, which are used in steps 109 and 111 to map the partitioned computational space to a collection or set of devices, and to configure interconnections between the devices, respectively. A vertex is understood to be a point where two or more curves, lines or edges meet. Figure 5 shows an example of a partitioned square in a two-dimensional computational space as an example of the output of step 107. The method and coverage metric for partitioning can be specified using default settings or user input.
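By way of a non-limiting illustration, the output of step 107 for a square in a two-dimensional computational space can be sketched as a regular grid of vertices and line segments. The function `grid_partition` and its parameters are assumptions introduced here; gridding with squares is only one of the partitioning methods mentioned above.

```python
def grid_partition(x0, x1, y0, y1, nx, ny):
    """Partition the rectangle [x0, x1] x [y0, y1] into nx * ny cells.

    Returns (vertices, edges): vertices is a list of (x, y) points and
    edges is a list of (index_a, index_b) pairs referring to vertices.
    """
    xs = [x0 + (x1 - x0) * i / nx for i in range(nx + 1)]
    ys = [y0 + (y1 - y0) * j / ny for j in range(ny + 1)]
    vertices = [(x, y) for y in ys for x in xs]

    def index(i, j):
        return j * (nx + 1) + i

    edges = []
    for j in range(ny + 1):
        for i in range(nx + 1):
            if i < nx:  # horizontal line segment to the right-hand neighbour
                edges.append((index(i, j), index(i + 1, j)))
            if j < ny:  # vertical line segment to the upper neighbour
                edges.append((index(i, j), index(i, j + 1)))
    return vertices, edges


vertices, edges = grid_partition(0.0, 1.0, 0.0, 1.0, 2, 2)
print(len(vertices), len(edges))
```

For a 2 x 2 grid this gives 9 vertices and 12 line segments, which would then be handed to the mapping of step 109.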

The partition of the computational space from step 107 provides a collection of line segments 31 and vertices 33 that in step 109 can be mapped to a collection of devices 1 of the type described earlier with reference to Figure 1 or 2, for instance. The mapping takes each line segment 31 in the partition and assigns to it (i) a physical device 1, and (ii) a continuous region or segment (which has not yet been assigned to any line segment in the computation space) within the physical device along a device solution axis 35. The device solution axis may be defined by an imaginary (straight or substantially straight) reference line passing through the p-n junctions 20 in a given device, and thus it coincides with the charge-flow barrier reference line RL1. The physical space spanned by the device solution axes in a device, in which length scales and orientation are determined by the physical dimensions and geometry of the device, is referred to as the device space. Thus, the device solution axis 35 is defined in the device space. Geometrical objects like points, lines or curves can be mapped from the computational space to a representation in the device space with the help of mathematical transformations like projection, rotation, translation and scaling. These transformations can also be inverted to map geometrical objects given in the device space to geometrical objects in the computational space. Depending on the device configuration, one single device may comprise several parallel device solution axes. Each device solution axis 35 may thus be understood to define one geometrical object in the device space. A given device solution axis representing a line (in this example a straight line) also represents a line in the computational space, through its inverted transformation. For example, the device configuration of Figure 2 defines six vertical and six horizontal device solution axes in addition to 9 + 9 diagonal device solution axes.

The mapping in step 109 is done by first assigning to each line segment obtained from step 107, a new device and one complete or substantially complete device solution axis of that device. As the individual line segments can be freely scaled, translated and/or rotated when doing the mapping, it is possible to map each line segment from the computational space into a given device 1. Thus, when first carrying out the mapping, two consecutive vertices are mapped to two most distant terminals of a respective device. A mapping table is advantageously created to store the mapping result. Further optimisation of such an assignment can be done to reduce the number of devices in the configuration as follows: i. A collection of collinear and connected line segments 31 from the computational space can be assigned to the same device 1 and the same device solution axis if the total length of the connected segments can be scaled by multiplying with a numeric constant, such that the vertices 33 of the segments overlap (or coincide) with the positions of the input terminals 17 and/or consensus terminals 25 (each position of a respective terminal has a given length along the respective device) on the device solution axis. It is to be noted that the positions of the input and consensus terminals have already been defined earlier when the devices were manufactured or optimised for a particular problem description in a separate step before the manufacturing. ii. For non-collinear segments and using multi-axis devices, such as shown in Figure 2, the optimisation from (i) can be extended as follows: A collection of connected segments can be assigned to the same device 1 if all the segments can be scaled by multiplying with a numeric constant (maintaining congruency of the angles between the segments) and the vertices of the segments overlap with positions of the input terminals 17 and/or consensus terminals 25 of the device. This is thus an additional condition to condition (i).

If a line segment 31 cannot be grouped together with other line segments 31 according to optimisations (i) or (ii), the line segment retains its initial assignment of a separate device and its complete device solution axis. Furthermore, grouping together larger collections of line segments on the same device solution axis leads to a reduction in the spatial resolution with which points can be represented using a device solution axis. An additional condition can thus optionally be checked before applying optimisations from (i) or (ii); i.e., if the grouping of a collection of segments onto a single axis due to optimisation with (i) or (ii) reduces the spatial resolution below a minimum threshold (specified via a default setting or user input), then the optimisation is not applied to that collection. The final collection of devices obtained after the optimisations from (i) and (ii) is referred to as a device group or system 21 for the specified problem. As an outcome of step 109, it is thus guaranteed that each line segment 31 is assigned to a device 1 and a device solution axis, and the vertices 33 of the line segments 31 are represented by some input terminals and/or consensus terminals on that device. It is to be noted that additional input terminals and/or consensus terminals not coinciding with the vertices may be present. These terminals are mapped back to the computational space in step 111. Some input terminals and/or consensus terminals in a device have a fixed representation in the computational space, corresponding to the line segment vertices. Also, each device solution axis has been assigned to a collection of collinear line segments in step 109 by scaling the collection with a constant multiplication factor. Thus, in step 111, any input or consensus terminals on a given axis can be mapped back to a point on the collinear line segments in the computational space by an inverse transformation by using the constant multiplication factor or its inverse value. 
This process is also illustrated in Figure 6. Thus, all consensus and input terminals are assigned a corresponding point in the computational space, which is used in step 113 to configure the interconnections between devices 1. It is to be noted that in the present description, all or at least some of the constant factors may be replaced by constant transformation matrices.
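By way of a non-limiting illustration, the inverse transformation of step 111 for a straight line segment can be sketched as follows. The function `terminal_to_computational` and its parameters are assumptions introduced here; the sketch supposes that step 109 placed the start of the line segment at device position `s_start` and scaled lengths by the constant factor `scale`.

```python
import math


def terminal_to_computational(s, s_start, scale, start_point, end_point):
    """Map a terminal position s on a device solution axis back to the computational space.

    Assumed forward mapping (step 109): s = s_start + scale * t, where t is the
    distance from start_point measured along the line segment.
    """
    length = math.dist(start_point, end_point)
    unit = [(e - a) / length for a, e in zip(start_point, end_point)]
    t = (s - s_start) / scale  # invert the constant multiplication factor
    return tuple(a + t * u for a, u in zip(start_point, unit))


# A unit-length segment mapped onto a 10-unit-long axis; terminal at position 2.5.
point = terminal_to_computational(2.5, 0.0, 10.0, (0.0, 0.0), (1.0, 0.0))
print(point)
```

If a constant transformation matrix is used instead of a scalar factor, the division by `scale` would be replaced by multiplication with the inverse matrix.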

In step 113, the device interconnections are configured. The consensus terminals 25 on two or more devices are configured to be connected together using a consensus channel 23 if the consensus terminals 25 represent the same point in the computational space. Such connections can be made using a permanent consensus channel interconnection between the devices if the chip containing the devices is to be permanently configured to solve a single defined problem. Alternatively, when the chip is meant to be reconfigurable to find solutions for different problem definitions, the interconnections can be implemented via a reconfigurable bus interconnection system that uses the switches 29 to connect or disconnect the consensus channels 23 to different consensus terminals 25.
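By way of a non-limiting illustration, the rule of step 113 can be sketched as a grouping of consensus terminals by the point they represent in the computational space. The function `consensus_channels`, the dictionary layout and the tolerance `tol` are assumptions introduced here for illustration.

```python
from collections import defaultdict


def consensus_channels(terminal_points, tol=1e-9):
    """Group consensus terminals that represent the same computational point.

    terminal_points maps (device_id, terminal_id) to a point (tuple of floats).
    Points are compared after rounding to multiples of tol. Only groups that
    span at least two different devices need a consensus channel.
    """
    groups = defaultdict(list)
    for (device, terminal), point in terminal_points.items():
        key = tuple(round(coordinate / tol) for coordinate in point)
        groups[key].append((device, terminal))
    return [sorted(group) for group in groups.values()
            if len({device for device, _ in group}) >= 2]


channels = consensus_channels({
    (0, "c0"): (0.0, 0.0),
    (1, "c0"): (0.0, 0.0),
    (1, "c1"): (1.0, 0.0),
    (2, "c0"): (1.0, 0.0),
    (3, "c0"): (5.0, 5.0),  # no partner on another device, hence no channel
})
print(channels)
```

In a reconfigurable system the resulting groups would determine which switches 29 are closed; in a permanently configured chip they would determine the fixed interconnections.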

In step 115, it is determined whether or not an updated distribution of variables is available from a feed-back loop (between the output terminals of the device system 21 and the input terminals 17). If this distribution is available, then in step 117, this distribution is obtained and is set as a distribution of variables for the subsequent processing. In device level operations, involving feedback from output measurements, a distribution of variables can be set using the measured output distribution, after a first or any subsequent solution iteration. If on the other hand, no distribution is available from the feed-back loop, for example if no feed-back loop exists, then in step 119, an initial distribution of variables is defined. An initial distribution of variables in the computational space can be defined for example as a uniform distribution over the entire computational space.

In step 121, external problem data, such as a learning data set, is obtained. The defined problem functions can depend on the external data to be fed into the problem solver (i.e., the system solving the defined problem) during execution. The data can be collected at once, made available in batches or be streamed as a continuous time signal. In step 123, the values of the problem functions at device terminal points (in this example the locations of the input terminals 17 and the consensus terminals 25 collectively defining the terminal points) are determined or computed. The computation of the values of the problem functions thus, in this example, takes the external data, the terminal points representation in the computational space from step 111, and the current distribution of the problem variables, and evaluates the problem functions obtained in step 105, giving an output stream P of problem function values evaluated at the input and consensus terminals.

The signal from step 123, denoted by P, is in step 125 transformed to an input signal, and more specifically to a voltage signal, denoted by V, to be applied to the input terminals 17 and optionally to the consensus terminals 25 by applying an affine transformation, such that V = wP + c, for some fixed numeric constant w, defined using a default settings parameter or an optional user input, and a constant c obtained as an external signal input or as a feedback signal (shown by the symbol D in Fig. 4b) from the device hardware or the preprocessing software. It is to be noted that steps 115 to 125 describe the preprocessing operation of the device system 21. However, steps 115 to 121 are optional.

The device level operation taking place in steps 127 and 129 is next explained. In step 127, the input voltage signals obtained for all the device input terminals 17 and optionally also for one or more consensus terminals 25 from step 125 are applied at the corresponding device terminals. This drives the drift-diffusion processes inside the interconnected devices 1. The drift-diffusion processes in the device group are evolved for k seconds before taking an output measurement, where k is a constant, greater than or equal to 0, specified using a default settings value or a user input. After the k second wait, in step 129, the charge distribution in the device group is measured. More specifically, in this step the output biasing signals are applied to the output terminals 19 to bias the p-n junctions 20 and the output current is measured. Since the output current driven by a voltage difference between the input and output terminals is proportional to the charge density between the terminals, the output currents provide a means to estimate the charge distribution at various points in the device. The p-n junctions 20 for the measurement can then take advantage of forward or reverse biasing, where a p-n junction is said to be forward biased when the voltage applied to the p-region is greater than the voltage applied to the n-region, and reverse biased when the voltage applied to the n-region is greater than the voltage applied to the p-region. The measurement process is next explained in more detail.

The output terminals 19 are assigned to one or more device solution axes during the manufacturing process of the device and stored in the form of a table. The table is accessible to the preprocessing and postprocessing operations. In step 129, for each output terminal, one device solution axis assigned to the terminal is randomly selected. The device solution axis also had one or more line segments assigned to it in step 109. Next, the position of the output terminal 19 on the line segments assigned to that device solution axis is determined, and the translation, rotation and scaling operations applied for that axis in step 109 are inverted, to recover the position of the output terminals 19 in the computational space.

For the purpose of taking output measurements from a device 1, the maximum voltage value V_max (or more broadly maximum signal amplitude) applied at any input or consensus terminals in the device is found. The measurement offset voltage V_offset (or output biasing signal value) is obtained from a default settings value for the device or from a user input. To take an output measurement at one or more output terminals from the device, a voltage V_offset+V_max is applied at all the output terminals 19 from which a measurement is desired. Now the current flowing in or out of the output terminals 19 can be measured, giving the output signals at those terminals.
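By way of a non-limiting illustration, the signal path from the affine transformation V = wP + c of step 125 to the output biasing voltage V_offset+V_max can be sketched as follows. The function names and the numeric values are assumptions introduced here for illustration.

```python
def to_input_voltages(P, w, c):
    """Step 125: affine map from problem function values P to voltages V = w * P + c."""
    return [w * p + c for p in P]


def output_bias(input_voltages, v_offset):
    """Bias applied to the output terminals: V_offset plus the largest applied input voltage."""
    v_max = max(input_voltages)
    return v_offset + v_max


P = [0.0, 1.0, 4.0]                        # problem function values at three terminals
V = to_input_voltages(P, w=0.5, c=0.25)    # hypothetical constants w and c
bias = output_bias(V, v_offset=0.5)        # hypothetical offset
print(V, bias)
```

With these example values the input voltages are 0.25, 0.75 and 2.25, and the output terminals would be biased at 2.75.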

The postprocessing operation is next explained with reference to steps 131 and 133. In step 131, the measured charge distribution at the output terminals 19 is transformed into a variable distribution in the computational space. This distribution along with the corresponding applied input signal is made available to the preprocessing block or operation as feedback signals for future iterations. For the purpose of estimating the value of charge and variable distributions at the position of a given output terminal, the p nearest input and/or consensus terminals to the output terminal are determined (p is an integer constant taken from a default value or from a user input). The voltage differences V1, V2, ..., Vp between each of the p nearest terminals and the output terminal are determined. The distances in the device d1, d2, ..., dp between each of the p nearest terminals and the output terminal are determined. Let I_meas denote the measured current flowing out from the terminal. The charge density at the output terminal is then estimated as Q_est from I_meas, these voltage differences and distances, and m, where m represents the mobility constant for the predominant charge carrier in the continuous region (in this case the n-type region, with the predominant charge carrier being the n-type carrier, i.e., m being the mobility constant for the n-type carrier). The value for m is known from physical properties of the semiconductor material and can be obtained as a default settings value or as a user input. The value for the variable distribution at the output terminal location, denoted as q_variable, is obtained using Q_est and Q_nominal, where Q_nominal represents the charge density for the n-type region when no input, consensus or output terminal voltages are applied to the device. Typically Q_nominal is approximately equal to the doping density in the continuous region (n-type region in this case). The value for Q_nominal can be taken from a default settings value or as a user input. Let there be a total of N output terminals 19, each labelled with a unique number between 1 and N. Let us denote the estimate for q_variable obtained from an output terminal labelled with a number i as q_variable,i. Once the estimates q_variable,i are obtained from all the output terminals 19, the values can be normalised at each output terminal to obtain a probability distribution value at the output terminal i, denoted p_variable,i, as p_variable,i = q_variable,i / (q_variable,1 + q_variable,2 + ... + q_variable,N).
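By way of a non-limiting illustration, the normalisation of the estimates obtained from the output terminals can be sketched as follows. The sketch assumes that normalising means dividing each estimate by the sum over all output terminals, so that the resulting values add up to one; the function name `normalise` is introduced here for illustration.

```python
def normalise(q_values):
    """Turn non-negative per-terminal estimates into probability values that sum to one."""
    total = sum(q_values)
    if total <= 0:
        raise ValueError("estimates must have a positive sum")
    return [q / total for q in q_values]


p_values = normalise([1.0, 3.0])
print(p_values)
```

For the two example estimates 1.0 and 3.0 the sketch returns 0.25 and 0.75.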

The output of the postprocessing block or operation is the variable distribution in the computational space and is taken as the final output when the device system 21 is observed to have reached a steady state, i.e., when its change from one iteration to another over a predefined number of iterations, or a predefined length of time, is seen to be smaller than a threshold value (defined using a default setting or user input).
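By way of a non-limiting illustration, the steady state criterion can be sketched as a check on the most recent measured distributions. The function `reached_steady_state`, the use of the largest absolute change as the measure of change, and the parameter names are assumptions introduced here for illustration.

```python
def reached_steady_state(history, window, threshold):
    """Return True if the last `window` iteration-to-iteration changes are all below threshold.

    history is a list of measured distributions (lists of equal length),
    oldest first. The change between two iterations is taken to be the
    largest absolute difference over all output terminals (assumed metric).
    """
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    for previous, current in zip(recent, recent[1:]):
        change = max(abs(a - b) for a, b in zip(previous, current))
        if change >= threshold:
            return False
    return True


history = [[0.5, 0.5], [0.4, 0.6], [0.4, 0.6], [0.4, 0.6]]
print(reached_steady_state(history, window=2, threshold=1e-6))
print(reached_steady_state(history, window=3, threshold=1e-6))
```

In this example the last two changes are zero, so a window of two iterations reports a steady state, whereas a window of three still includes the earlier change of 0.1.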

The following paragraphs explain in more detail gradient flow PDEs and how they relate to Bayesian inference and optimal transport problems and how the emulation using charge carriers in the device system 21 provides solutions for the corresponding PDEs.

Let $\mathbb{R}^N$ denote an N-dimensional space of real numbers as commonly defined by the Euclidean space. Let $X$ denote a subset in $\mathbb{R}^N$ and $\theta: [0,\infty) \times X \to \mathbb{R}$ denote a function describing the evolution of a distribution in time such that at any time $t \in [0,\infty)$, the function $\theta(t,\cdot): X \to \mathbb{R}$ is a distribution on $X$, taking points from $X$ and mapping them to a real number in $\mathbb{R}$.

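Let us denote by $\partial\theta/\partial t$ the partial derivative of the function $\theta$ with respect to time and with $\nabla\theta$ the gradient of the function $\theta$ with respect to $X$. More explicitly, with each component of $X$ denoted by a variable $x_1, x_2, \ldots, x_N$, the gradient is a collection of partial derivatives along each component as given by $\nabla\theta = \left(\partial\theta/\partial x_1, \partial\theta/\partial x_2, \ldots, \partial\theta/\partial x_N\right)$.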

Since $\theta$ is a function taking values from $[0,\infty) \times X$ to values in $\mathbb{R}$, each of the partial derivatives represents a function taking values from $[0,\infty) \times X$ to values in $\mathbb{R}$ as well. Thus, $\nabla\theta$ represents a function from $[0,\infty) \times X$ to values in $\mathbb{R}^N$. For briefness of notation in equations we also denote as

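Let us denote by $\nabla\cdot$ the divergence operator that takes a function $f: [0,\infty) \times X \to \mathbb{R}^N$, with the N components of $f$ denoted as $f_1, f_2, \ldots, f_N$, and maps it to a function $\nabla\cdot f: [0,\infty) \times X \to \mathbb{R}$ given by $\nabla\cdot f = \sum_{i=1}^{N} \partial f_i/\partial x_i$ (see C. H. Edwards, “Advanced Calculus of Several Variables”, Dover Books on Mathematics for further details on the divergence operator).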

Let $X_{\text{boundary}}$ denote the boundary of $X$. Let a further function, defined for any point $x \in X_{\text{boundary}}$, map $x$ to a vector in $\mathbb{R}^N$ perpendicular to $X_{\text{boundary}}$ at $x$. For more technical details of the mathematics involved, we refer the reader to the teachings of Nicolas Garcia Trillos and Daniel Sanz-Alonso, “The Bayesian update: variational formulations and gradient flows”, Bayesian Analysis (2018), referred to also as Trillos. The intention here is to show the form taken by the gradient flow PDEs without any of the rigorous mathematical details, for which the above reference can be consulted.

It has been shown in Trillos, and references therein, that a Bayesian inference problem on the computational space $X$ can be posed as the problem of finding a function $\theta: [0,\infty) \times X \to \mathbb{R}$ that satisfies the PDE (PDE-I) for a problem function $V: [0,\infty) \times X \to \mathbb{R}$ satisfying some additional properties, with a boundary condition (BC-I) and an initial time condition (IC-I) for some given distribution $\theta_0: X \to \mathbb{R}$. It is shown in Trillos that the evolution of distributions converges to a steady state exponentially fast, i.e., there exist positive constants $c$ and $k$ such that the solution $\theta$ satisfies $\left|\partial\theta/\partial t\right| \le c\,e^{-kt}$, or in other words the absolute value of $\partial\theta/\partial t$ converges to 0 exponentially fast in time. The steady state distribution $\theta_{SS}: X \to \mathbb{R}$ is the value taken by the evolving distributions $\theta(t,\cdot)$ in the limit as $t$ goes to $\infty$ (infinity). However, since the rate of change of $\theta$ as given by $\partial\theta/\partial t$ goes to 0 exponentially fast, for any arbitrary constant $\epsilon > 0$ we can find a finite time $T_\epsilon \in [0,\infty)$ such that the Wasserstein distance between the distributions $\theta(T_\epsilon,\cdot)$ and $\theta_{SS}$ is less than $\epsilon$. Thus, we can get an approximation for $\theta_{SS}$ using the evolution $\theta$ in a finite time $T_\epsilon$ for an arbitrarily small approximation error $\epsilon > 0$.
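For orientation only, a commonly used form of such a gradient flow is the Fokker-Planck equation with a zero-flux boundary condition, sketched below. This form is an assumption taken from the standard literature on gradient flows and is not reproduced from the equations of the present description; in particular the sign conventions and the symbol $n(x)$ for the vector perpendicular to the boundary are assumed here.

```latex
\begin{align*}
\frac{\partial \theta}{\partial t}
  &= \nabla \cdot \left( \nabla \theta + \theta \, \nabla V \right)
  && \text{on } X, \\
\left( \nabla \theta + \theta \, \nabla V \right) \cdot n(x)
  &= 0
  && \text{for } x \in X_{\text{boundary}}, \\
\theta(0, \cdot)
  &= \theta_0 .
\end{align*}

% Under this assumed form, a distribution proportional to e^{-V} has zero flux
% everywhere and is therefore a steady state:
\[
\nabla \theta_{SS} + \theta_{SS} \, \nabla V = 0
\quad \Longleftarrow \quad
\theta_{SS} \propto e^{-V} .
\]

% Any quantity bounded by c e^{-kt} falls below a tolerance \epsilon (with 0 < \epsilon < c) once
\[
c \, e^{-k t} \le \epsilon
\quad \Longleftrightarrow \quad
t \ge \frac{1}{k} \ln \frac{c}{\epsilon} .
\]
```

The last relation indicates why a finite waiting time suffices for an exponentially converging evolution: the required time grows only logarithmically as the tolerance is reduced.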

The solution to the Bayesian inference problem is given by the steady state distribution $\theta_{SS}$ satisfying the above PDE with its boundary and initial conditions. The conditions on the problem function $V$ also ensure that the steady state distribution does not depend, up to a constant multiplication factor, on the initial distribution $\theta_0$ used for the initial time condition. Thus, for any initial condition $\theta_0$ we can obtain an approximation to the solution of the Bayesian inference problem using the evolution $\theta$ in finite time, as described above, so that no explicit choice of $\theta_0$ needs to be imposed. The resulting approximation $\theta$ is then a distribution on $X$ proportional to the solution (probability) distribution of the Bayesian inference problem, and the proportionality constant can be deduced by integrating $\theta$ over the set $X$.

It is to be noted that the boundary condition BC-I involves the vector dot product of the gradient terms with the vector perpendicular to the boundary.

The Bayesian inference problem and its system of PDEs and boundary conditions can be viewed as a special case of the more general optimal transport problem, which has been shown in the book by Luigi Ambrosio, Nicola Gigli and Giuseppe Savaré, “Gradient flows: in metric spaces and in the space of probability measures”, Springer Science & Business Media, 2008, to be solvable using a PDE of the form PDE-II along with boundary and initial conditions as given by BC-I and IC-I respectively. $F$ is a given function of $\theta$, derived from the optimal transport problem desired to be solved. The Bayesian inference is a special case of such a problem, with $F(\theta) = \theta \log\theta + V$, giving PDE-I upon expansion of the terms.

The system of devices (1) according to the present invention, as a result of its specialised physical design, device interconnections and input signal configuration, allows the emulation of gradient flow PDEs of the form described in PDE-I and PDE-II with boundary and initial time conditions as described in BC-I and IC-I, respectively. During steady state (i.e., once the equilibrium has been reached), the charge distribution at the p-n junctions 20 in the device system 21 approximates the steady state distribution $\theta_{SS}$ for the problem the system is configured to solve, and thus provides a solution to the above problems by simply measuring the spatial distribution of charges along the various imaginary reference lines passing through the p-n junctions, as described above.

In order to solve a PDE of the form PDE-I, the input signals to the devices 1 are used to establish a spatial distribution of the voltages along the imaginary reference lines of the device, corresponding to the values of V computed at the corresponding points of the imaginary line in the computational space and multiplied by some constant of choice Z.

As a result, the charge movement along this imaginary line is dictated by the device drift-diffusion physics equation, in which $D$ is the diffusion constant for the charge carriers in the n-type region and $A$ is the thermal voltage of the device (both depend on the physical material used to build the device and on environmental factors like temperature, and are known from experimental data for many semiconductor materials). By choosing $Z = 1/A$, the device equations then emulate PDE-III.

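During steady state, the time derivative of the charge distribution vanishes, so the right-hand side of PDE-III is zero, and it remains zero when divided by the constant $D$. Thus any $\pi_{ss}$ that is a steady state solution for PDE-III is also a steady state solution for PDE-IV, i.e., the constant multiplication factor $D$ does not affect the steady state solution itself.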

Since PDE-IV and PDE-I have the same form, and the steady state solution for PDE-IV is the same as the steady state solution for PDE-III that is emulated by the device, the steady state charge distribution given by $\pi_{ss}$ is an approximation to the steady state solution $\theta_{SS}$ that we sought from PDE-I. Thus, the device charges emulate the gradient flow PDE (PDE-I) that was used to solve the Bayesian inference problem.

By measuring the steady state charge distribution π _ss at various points in the device given by the output terminals, we get a discretised approximation to the continuous charge distribution that can be used to represent the output of the device and Bayesian inference process using finitely many points.

To tackle the more general optimal transport problem with the PDE of the form PDE-II, the input signals are used to create a spatial distribution of voltages in the system of the form given by Z F(θ _k(x)) at various points x in the computational space corresponding to the input and/or consensus terminals, where θ _k is the current distribution estimate for the variables as obtained from blocks 117 or 119 in Fig. 4b. The system of devices then emulates a drift-diffusion PDE for the charge carriers in which the drift is driven by the applied voltages Z F(θ _k). Choosing a large constant Z, such that the voltage-induced drift term is much larger (say, 10 or more times larger) than an estimate for the gradient of the charge distribution provided by a previous measurement of the charge distribution in the device system, ensures that the system approximately evolves the charge distribution according to PDE-V.

Evolving the system for a short time duration Δt, taking a new measurement of the charge distribution, denoted π _k+1, and updating the voltages in the device system to the values given by Z F(π _k+1) allows us to emulate the finite difference method of solving a PDE of the form PDE-II, where the partial derivative with respect to time is replaced by a finite difference approximation, and the solution for π _k+1 is sought, given a value for π _k, from the finite difference equation FD-I.

The smaller the value for Δt, the closer the evolution of the finite difference equation FD-I is to the evolution of the actual PDE which has the same steady state solution as that for PDE-II. Thus using such an iterative process of measuring π _k, applying the input signals, evolving for time Δt, measuring the updated distribution as π _k+1 and repeating with correspondingly updated input signals, provides a means to approximate the solution to PDE-V and thus in turn approximate the steady state solution of PDE-II.
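The iterative process above can be sketched in software. This is a minimal illustration under stated assumptions, not the device behaviour: it assumes that the drift-dominated evolution has the form ∂π/∂t = ∂/∂x(π ∂F(π_k)/∂x) with the field frozen during each interval Δt, replaces the physical evolution by a single first-order upwind step, and uses the hypothetical choice F(π) = V + log π with a quadratic V (for which the steady state is proportional to exp(-V)). The function names are illustrative.

```python
import numpy as np

def measure_update_evolve(pi_k, F, dx, dt):
    """One cycle: measure pi_k, freeze the drift field at F(pi_k), evolve for dt."""
    F_k = F(pi_k)                                  # applied voltages ~ Z * F(pi_k), held fixed
    v = -np.diff(F_k) / dx                         # drift velocity at cell interfaces
    upwind = np.where(v > 0, pi_k[:-1], pi_k[1:])  # density carried by the flow
    flux = np.concatenate(([0.0], v * upwind, [0.0]))  # zero-flux ends
    return pi_k - dt * np.diff(flux) / dx          # updated measurement pi_{k+1}

x = np.linspace(-3.0, 3.0, 41)
dx = x[1] - x[0]
V = 0.5 * x**2                                     # illustrative potential (assumption)

def F(pi):
    return V + np.log(pi)                          # hypothetical F, not from the source

pi = np.full(x.size, 1.0 / (x.size * dx))          # initial estimate pi_0
dt = 0.1 * dx**2                                   # short evolution time per cycle
for k in range(4000):
    pi = measure_update_evolve(pi, F, dx, dt)

target = np.exp(-V) / (np.exp(-V).sum() * dx)
print(np.abs(pi - target).max())
```

The time step is kept small relative to the grid spacing so that the upwind update stays positive, which mirrors the requirement in the text that Δt be short.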

The consensus process in the device system to solve multidimensional gradient flows is next explained in more detail. As mentioned earlier, an interconnection of multiple devices is used to emulate the gradient flow for any N-dimensional space (N ≥ 1). According to an aspect of the invention, groups of line segments 31 from the computational space are assigned to a group of devices 1 as described earlier. Along each such group of line segments, the device emulates a one-dimensional gradient flow PDE as described above. When the consensus terminals 25 in one or more devices 1 represent the same point in the computational space, we connect those consensus terminals using a consensus channel 29 (which can be a multi-branch channel). The system of interconnected devices thus formed has one or more consensus channels, each consensus channel corresponding to a unique point in the computational space. Each consensus terminal 25 is connected to the device 1 along a physical direction that is, in this example, orthogonal to every imaginary reference line passing through the p-n junctions 20 in the device. By making such an orthogonal connection, the flow of the charge carriers along the consensus channel is not affected by the electric field induced by the spatial voltage distribution input along the reference lines. The objective of connecting two or more devices using consensus channels is to equalise the values of the charge distributions at the locations of the consensus terminals 25 connected using the channel. Each consensus channel achieves this objective using one of two possible mechanisms, which we denote as the ‘passive’ and ‘active’ mechanisms. The passive mechanism applies a constant voltage throughout the consensus channels 29 using one or more consensus control terminals 27.
As a result of the constant voltage application over the length of the consensus channels, the electric field along the path of the consensus channel is zero and the flow of charges along the consensus channel length is dictated only by the diffusion of charges. The movement of charges in the channel is thus dictated by a diffusion current (J-1) that is proportional to the spatial gradient of the charge density along the channel, with a proportionality factor set by q, the charge on one charge carrier, and D, the diffusion constant, both of which are physical, known (non-zero) constants for a given material and charge carrier. During steady state, by definition, there is no change in the distribution of charges, implying that the current flow (flow of charges) within the channel is zero. Since q and D are non-zero, the gradient of the charge density along the channel must then vanish. Thus, the diffusion of charges across the channel ensures that the values of the charge distribution at the various consensus terminals 25 of multiple devices 1 that are connected using a consensus channel 29 become equal during steady state, thus achieving the objective of the consensus channel.
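The passive mechanism can be illustrated with a small numerical sketch. It is a simplification under stated assumptions: the consensus channel is modelled as a closed one-dimensional segment with zero electric field, the two ends stand in for two consensus terminals that start at different charge densities, and the function name `diffuse` and all numerical values are illustrative.

```python
import numpy as np

def diffuse(n0, D, dx, dt, steps):
    """Pure diffusion (zero electric field) in a closed channel."""
    n = np.array(n0, dtype=float)
    for _ in range(steps):
        flux = -D * np.diff(n) / dx                   # carrier flux between neighbouring cells
        flux = np.concatenate(([0.0], flux, [0.0]))   # no carriers leave through the ends
        n = n - dt * np.diff(flux) / dx
    return n

n_start = np.linspace(1.0, 3.0, 21)   # one end starts at density 1.0, the other at 3.0
dx = 0.05
n_end = diffuse(n_start, D=1.0, dx=dx, dt=0.2 * dx**2, steps=4000)
print(n_end[0], n_end[-1])
```

Because only diffusion acts, the density gradient decays and the two end values approach the common mean of the initial profile, while the total charge in the channel is conserved.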

The active mechanism of consensus uses, in addition to the diffusion currents of (J-1), the drift currents induced by an applied voltage difference across two or more consensus control terminals 27 in a consensus channel 29. The active mechanism uses prior estimates of the charge density at the locations of the consensus terminals 25 connected using the channel. Let the consensus channel be connected to M consensus terminals 25 and let n = (n ₁, ..., n _M) (in general continuous time signals) denote previously observed estimates of the charge distribution values at these terminals. Let there be a total of K consensus control terminals 27 attached to the consensus channel. The active mechanism then specifies a function H that takes the previous estimates n and computes K voltage values, H(n), to be applied to the consensus control terminals (one value for each terminal). The applied voltages then drive an electric current in the consensus channel given by the voltage differences and thus drive charges from one point in the channel to another. A feedback loop, which may have one or more associated processing elements, to actively control the voltages at the consensus control terminals 27 can then be obtained by repeating the following steps: (i) measure the charge distribution values in the vicinity of the M consensus terminals 25 to get n, (ii) compute and apply the voltages H(n) to the K consensus control terminals 27, (iii) wait for a time Δt to allow the charge distributions to evolve, (iv) go to (i) and repeat. A respective feedback loop is thus connected between at least some of the output terminals 19 of a respective device and at least some of the input terminals 17 and/or at least some of the control terminals 27 of the channel 23 associated with the respective device 1 to dynamically adjust the input signals and/or the biasing signals based on the output signals.
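Steps (i) to (iv) of the feedback loop can be sketched as follows. This is a toy model only: the source does not specify the function H or the channel dynamics, so the proportional control law in `H`, the drift model in `channel_step` (negative carriers moving towards terminals at higher voltage, with one control terminal per consensus terminal) and all numerical values are hypothetical.

```python
import numpy as np

def H(estimates, gain=0.5):
    """Hypothetical control law: lower the voltage where the density is above average."""
    estimates = np.asarray(estimates, dtype=float)
    return -gain * (estimates - estimates.mean())

def channel_step(n, voltages, dt, mobility=1.0):
    """Toy drift model: net carrier inflow at a terminal grows with its voltage
    relative to the other terminals; the total charge is conserved."""
    net_inflow = mobility * (voltages[:, None] - voltages[None, :]).sum(axis=1)
    return n + dt * net_inflow

n = np.array([1.0, 2.0, 6.0])        # densities at M = 3 consensus terminals
for _ in range(200):
    estimates = n.copy()             # (i) measure the charge distribution values
    voltages = H(estimates)          # (ii) compute and apply the K voltages
    n = channel_step(n, voltages, dt=0.1)   # (iii) wait for a time dt
                                     # (iv) repeat
print(n)
```

In this toy model each cycle shrinks the deviation of every terminal from the average by a fixed factor, so the measured values are driven towards a common value without changing their sum.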
The control mechanism is repeated until the final output measurement for the solution from the output terminals is made. By connecting multiple devices 1, each emulating a one-, two- or three-dimensional PDE, at various vertices given by the partition of the computational space, we can emulate a system of n coupled PDEs, one for each distribution θ ^j on X _j, where X ₁, X ₂, ..., X _n are n grouped line segments 31 from the partition of the computational space X as provided by step 109 of the flow chart of Fig. 4a, and L is some constant determined by the constants involved in the active or passive consensus mechanisms, which emulate the coupling term between θ ^j and θ ^k by pushing θ ^k to be equal to θ ^j at all points that are common to X _j and X _k. By adding the n PDEs above, we note that the coupling terms sum to zero, and thus a summation of the n processes emulates the sum of the n individual gradient flows without any coupling term.

As θ ^j is driven towards θ ^k at the connecting points of the X _j by the consensus process, and since the X _j each model a PDE flow over disjoint subsets of X, we can write a common distribution θ over the entire space X, such that for each j = 1, ..., n, θ = θ ^j for all points in X _j. The above summation process can then be written as a single PDE over the entire space X, which thus emulates a PDE of the form PDE-I in a general N-dimensional space X.
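The cancellation used in the summation argument can be written out as a short worked step. This is a sketch under an assumption: the coupling between the devices is taken to be the pairwise term L(θ^k − θ^j), and G_j is a placeholder symbol (not from the source) for the one-dimensional gradient flow operator emulated along X_j.

```latex
% Assumed form of the coupled system (G_j is a placeholder operator):
\frac{\partial \theta^{j}}{\partial t}
  = G_j(\theta^{j}) + L \sum_{k \neq j} \left(\theta^{k} - \theta^{j}\right),
  \qquad j = 1, \dots, n.

% Summing over j, each pair (j,k) appears twice with opposite signs:
\sum_{j=1}^{n} \sum_{k \neq j} L \left(\theta^{k} - \theta^{j}\right)
  = L \sum_{j < k} \left[ \left(\theta^{k} - \theta^{j}\right)
                        + \left(\theta^{j} - \theta^{k}\right) \right]
  = 0,

% so the summed process contains no coupling term:
\sum_{j=1}^{n} \frac{\partial \theta^{j}}{\partial t}
  = \sum_{j=1}^{n} G_j(\theta^{j}).
```

The antisymmetry of the pairwise term is what makes the sum vanish, independently of the value of the constant L.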

The above arguments also work for PDEs of the form PDE-II, by simply replacing the potential V with F(θ) in the above equations.

Thus, the consensus mechanism allows for a system of interconnected devices existing in an at-most three-dimensional physical space to emulate PDEs in any general N-dimensional space, where N may even be greater than three.

While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive, the invention being not limited to the disclosed embodiment. Other embodiments and variants are understood, and can be achieved by those skilled in the art when carrying out the claimed invention, based on a study of the drawings, the disclosure and the appended claims. For example, the invention also relates to a computer program product comprising instructions for implementing at least some of the steps of the method when loaded and run on computing means of a computing device. It is also to be noted that the voltage configuration (i.e., all or some of the applied signals to the device system) can be changed over time as the problem input changes or as part of a controlled feedback scheme where the output is measured and the voltage configuration is changed accordingly in order to emulate the solution process for certain optimal transport problems. It is further to be noted that instead of using semiconductor materials to build the device, there exist other mechanisms and materials, such as ionic solutions, that could implement the device. For example, two types of charge carrier regions with a charge-flow barrier interface as described before and used for the invention can be implemented in other materials as well, as done with all-carbon materials from the teachings of Feng, X., Zhao, X., Yang, L. et al., “All carbon materials pn diode.” Nat Commun 9, 3750 (2018), or using teachings from a point-contact diode where a metal-semiconductor junction is used to create a unidirectional charge-flow barrier junction using the teachings of H. Q. North, “Asymmetrically Conductive Device”, U.S. patent 2,704,818, or in ionic solutions as demonstrated in the teachings of Ivan Vlassiouk, Sergei Smirnov, and Zuzanna Siwy, “Nanofluidic Ionic Diodes. Comparison of Analytical and Numerical Solutions.” ACS Nano 2008, 2 (8), 1589-1602.

In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used. Any reference numerals in the claims should not be construed as limiting the scope of the invention.
