
Title:
ALL-OPTICAL IMPLEMENTATION OF MULTIPLE OPTICAL TRANSFORMATIONS THROUGH A POLARIZATION-ENCODED DIFFRACTIVE NETWORK
Document Type and Number:
WIPO Patent Application WO/2023/183859
Kind Code:
A2
Abstract:
A polarization multiplexed diffractive processor is disclosed that all-optically performs multiple, arbitrarily-selected transformations (e.g., linear) through a single diffractive network trained using deep learning. In this framework, an array of pre-selected linear polarizers is positioned between trainable transmissive diffractive materials that are isotropic, and different target linear transformations (complex-valued) are uniquely assigned to different combinations of input/output polarization states. The transmission layers of this polarization multiplexed diffractive network are trained and optimized via deep learning and error-backpropagation by using thousands of examples of the input/output fields corresponding to each one of the complex-valued linear transformations assigned to different input/output polarization combinations. This polarization-multiplexed all-optical diffractive processor can find various applications in optical computing and polarization-based machine vision tasks.

Inventors:
OZCAN AYDOGAN (US)
LI JINGXI (US)
Application Number:
PCT/US2023/064837
Publication Date:
September 28, 2023
Filing Date:
March 22, 2023
Assignee:
UNIV CALIFORNIA (US)
International Classes:
G06E1/04
Attorney, Agent or Firm:
DAVIDSON, Michael S. (US)
Claims:
What is claimed is:

1. A polarization-encoded diffractive network comprising: one or more optically transmissive and/or reflective substrate layer(s) arranged in an optical path having one or more polarizer arrays interposed between the one or more optically transmissive and/or reflective substrate layer(s) along the optical path, each of the optically transmissive and/or reflective substrate layer(s) comprising a plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) and having different transmission and/or reflection properties as a function of the lateral coordinates across each substrate layer, wherein the one or more optically transmissive and/or reflective substrate layer(s) and the one or more polarizer arrays collectively define or approximate a plurality of distinct optical transformations between a vectorized input optical field input to the polarization-encoded diffractive network and a vectorized output optical field output from the polarization-encoded diffractive network; and wherein the one or more optically transmissive and/or reflective substrate layer(s) are designed during a training phase with the one or more polarizer arrays to define or optimize the plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) to perform or approximate the plurality of distinct optical transformations between the input and the output.

2. The polarization-encoded diffractive network of claim 1, wherein the one or more polarizer array(s) include linear polarizers at a plurality of orientations.

3. The polarization-encoded diffractive network of claim 1, wherein the one or more polarizer array(s) include circular and/or elliptical polarizers at a plurality of orientations.

4. The polarization-encoded diffractive network of claim 1, wherein the one or more distinct optical transformations include at least one of: object classification, object segmentation, object imaging, object detection, object image filtering, object image compression, object image encryption, and an analog logical operation.

5. The polarization-encoded diffractive network of claim 1, wherein the plurality of distinct optical transformations are implemented or approximated at ultra-violet or visible or infrared or terahertz or microwave parts of the electromagnetic spectrum.

6. The polarization-encoded diffractive network of claim 1, wherein the vectorized input optical field comprises narrowband or broadband radiation.

7. The polarization-encoded diffractive network of claim 1, wherein the one or more polarizer arrays have individual polarizer elements with orientations and locations that are fixed or controllable/programmable/reconfigurable.

8. The polarization-encoded diffractive network of claim 1, wherein the one or more polarizer arrays in the polarization-encoded diffractive network are physically integrated on or within the one or more optically transmissive and/or reflective substrate layer(s).

9. The polarization-encoded diffractive network of claim 1, further comprising one or more image sensors or detector arrays configured to receive the vectorized output optical field from the polarization-encoded diffractive network.

10. The polarization-encoded diffractive network of claim 1, further comprising respective polarizers located along the optical path before and after the one or more optically transmissive and/or reflective substrate layer(s).

11. The polarization-encoded diffractive network of claim 1, wherein the one or more polarizer arrays are located on or adjacent to a surface of the one or more optically transmissive and/or reflective substrate layer(s).

12. The polarization-encoded diffractive network of claim 1, wherein the plurality of distinct optical transformations are implemented or approximated sequentially.

13. The polarization-encoded diffractive network of claim 1, wherein the plurality of distinct optical transformations are implemented or approximated simultaneously.

14. The polarization-encoded diffractive network of claim 1, wherein the plurality of distinct optical transformations comprise complex-valued transformations of the vectorized input optical field.

15. A method of performing a plurality of optical transformations of a vectorized input optical field using a polarization-encoded diffractive network comprising: providing a polarization-encoded diffractive network comprising one or more optically transmissive and/or reflective substrate layer(s) arranged in an optical path having one or more polarizer arrays interposed between the one or more optically transmissive and/or reflective substrate layer(s) along the optical path, each of the optically transmissive and/or reflective substrate layer(s) comprising a plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) and having different transmission and/or reflection properties as a function of the lateral coordinates across each substrate layer, wherein the one or more optically transmissive and/or reflective substrate layer(s) and the one or more polarizer arrays collectively define or approximate a plurality of distinct optical transformations between the vectorized input optical field input to the polarization-encoded diffractive network and a vectorized output optical field output from the polarization-encoded diffractive network, wherein the one or more optically transmissive and/or reflective substrate layer(s) are designed during a training phase with the one or more polarizer arrays to define the plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) to perform or approximate the plurality of distinct optical transformations; inputting the vectorized input optical field to the polarization-encoded diffractive network; and capturing the vectorized output optical field from the polarization-encoded diffractive network with one or more image sensors or detector arrays.

16. The method of claim 15, wherein the one or more polarizer array(s) include linear polarizers at a plurality of orientations.

17. The method of claim 15, wherein the one or more polarizer array(s) include circular and/or elliptical polarizers at a plurality of orientations.

18. The method of claim 15, wherein the plurality of distinct optical transformations include at least one of: object classification, object segmentation, object imaging, object detection, object image filtering, object image compression, object image encryption, and an analog logical operation.

19. The method of claim 15, wherein the plurality of distinct optical transformations are implemented or approximated at ultra-violet or visible or infrared or terahertz or microwave parts of the electromagnetic spectrum.

20. The method of claim 15, wherein the vectorized input optical field comprises narrowband or broadband radiation.

21. The method of claim 15, wherein the one or more polarizer arrays have individual polarizer elements with orientations and locations that are fixed or controllable/programmable/reconfigurable.

22. The method of claim 15, wherein the one or more polarizer arrays in the polarization-encoded diffractive network are physically integrated on or within the one or more optically transmissive and/or reflective substrate layer(s).

23. The method of claim 15, wherein the vectorized input optical field comprises a two-dimensional complex data field.

24. The method of claim 15, wherein the vectorized input optical field and the vectorized output optical field each comprises an array of pixel values.

25. The method of claim 15, wherein the vectorized input optical field and the vectorized output optical field are passed through respective polarizers before and after the one or more optically transmissive and/or reflective substrate layer(s).

26. The method of claim 15, wherein the plurality of distinct optical transformations comprise complex-valued optical transformations of the vectorized input optical field.

27. The method of claim 15, wherein the vectorized input optical field comprises a first optical field having a first polarization and a second optical field having a second polarization, wherein the first optical field and the second optical field are simultaneously input to the polarization-encoded diffractive network.

28. The method of claim 27, wherein the first optical field having a first polarization and a second optical field having a second polarization are combined to generate the vectorized input optical field.

29. The method of any of claims 27-28, wherein the vectorized output optical field from the polarization-encoded diffractive network is split into a first path and a second path, wherein the first path contains a polarizer with the first polarization and a first image sensor and wherein the second path contains a polarizer with the second polarization and a second image sensor.

Description:
ALL-OPTICAL IMPLEMENTATION OF MULTIPLE OPTICAL TRANSFORMATIONS THROUGH A POLARIZATION-ENCODED DIFFRACTIVE NETWORK

Related Application

[0001] This Application claims priority to U.S. Provisional Patent Application No. 63/323,446 filed on March 24, 2022, which is hereby incorporated by reference. Priority is claimed pursuant to 35 U.S.C. § 119 and any other applicable statute.

Technical Field

[0002] The technical field generally relates to diffractive optical networks used to perform multiple complex-valued, arbitrary optical transformations using polarization multiplexing. The polarization-multiplexed all-optical diffractive processor can find various applications in optical computing and polarization-based machine vision tasks.

Statement Regarding Federally Sponsored Research and Development

[0003] This invention was made with government support under Grant Number FA9550-21-1-0324, awarded by the U.S. Air Force, Office of Scientific Research. The government has certain rights in the invention.

Background

[0004] With the increasing global demand for machine learning and computing in general, using light to perform computation has been a rapidly growing focus area of optics and photonics. The research on optical computing has a long history spanning decades of research and development efforts. Motivated by the massive success of artificial intelligence and deep learning in particular, a myriad of new hardware designs for optical computing have been reported recently, including, e.g., on-chip integrated photonic circuits, free-space optical platforms, and others. Among these different optical computing systems, the integration of successive transmissive diffractive layers (forming an optical network) has been demonstrated for optical information processing, such as object classification, image reconstruction, all-optical phase recovery and quantitative phase imaging, and logic operations. A diffractive network is first trained using deep learning and error-backpropagation methods implemented in a digital computer, after which the resulting transmissive layers are fabricated to form a physical network that computes based on the diffraction of the input light through these spatially-engineered transmissive layers. Because the computational task is completed as the light passes through thin and passive optical elements, this approach is very fast, and the inference process does not consume power except for the illumination light. It is also scalable since an increase in the input field-of-view (FOV) can be handled by fabricating larger transmissive layers and/or deeper diffractive designs with more successive layers positioned one after another. Furthermore, both the phase and the amplitude information channels of the input scene/FOV can be processed by a diffractive optical network, without the need for phase retrieval or for digitizing and vectorizing an image of the scene, which makes diffractive computing highly desirable for machine vision applications. Harnessing light-matter interactions using engineered diffractive surfaces also enabled the inverse design of optical elements for, e.g., spatially-controlled wavelength demultiplexing, pulse engineering, and orbital angular momentum multiplexing/demultiplexing. It has also been shown that a diffractive network can be trained by optimizing its diffractive layers to perform an arbitrary complex-valued linear transformation between its input and output fields-of-view, demonstrating its computing capability for complex-valued matrix-vector operations at the speed of light propagation through a passive diffractive system.

[0005] All these results highlight the unique capabilities of diffractive networks to manipulate various physical properties of light, including e.g., its amplitude and phase distribution, spatial frequency, spectral bandwidth, orbital angular momentum, for performing specific computational tasks that are desired. As another important physical property of light, polarization specifies the geometrical orientation of electromagnetic wave oscillations. Utilizing the polarization state of light has played a pivotal role in numerous applications, including telecommunications, imaging, sensing, computing, and displays. For example, polarization-division multiplexing (PDM) has been used in telecommunication systems to permit two channels of information to be simultaneously transmitted using orthogonal polarization states over a single wavelength.

Summary

[0006] In one embodiment, polarization-multiplexed diffractive optical networks are disclosed that perform a group of arbitrary linear transformations using a common set of diffractive layers that are jointly-optimized to all-optically perform each one of the target complex-valued linear transformations at a different combination of input/output polarization states. In an earlier work by Kulce et al., it was shown that a diffractive optical network composed of spatially-engineered layers could all-optically perform an arbitrary complex-valued linear transformation between an input and output field-of-view with a negligible error when the number of trainable diffractive elements/neurons ($N$) approaches $N_i N_o$, where $N_i$ and $N_o$ represent the number of pixels at the input and output FOVs, respectively. See Kulce O, Mengu D, Rivenson Y, Ozcan A., All-optical synthesis of an arbitrary linear transformation using diffractive surfaces. Light Sci Appl 2021; 10: 196. Here, polarization multiplexing was used between the input and output FOVs of a diffractive network to increase the capacity of diffractive computing and all-optically perform a group of arbitrary linear transformations that are complex-valued. The polarization multiplexed diffractive network designs are not based on birefringent, anisotropic or polarization-sensitive materials; instead, the designs utilize standard diffractive surfaces where the phase and amplitude transmission properties (e.g., coefficients) (or reflection properties/coefficients in a reflection configuration) of each trainable diffractive feature are independent of the polarization state of the input light. Using a network design solely based on standard isotropic diffractive materials makes the designs simpler in terms of material selection, fabrication and scale-up; however, it also makes the diffractive network insensitive to different polarization states, and therefore, polarization multiplexed all-optical computation of different transformations becomes impossible with such layers alone.

[0007] To overcome this challenge, a non-trainable, pre-determined array of linear polarizers (at 0°, 45°, 90° and 135°) was placed within the diffractive network to act as polarization seeds for the trainable isotropic diffractive layers, allowing them to all-optically execute different linear transformations through input/output polarization multiplexing (see FIG. 1A). Stated differently, data-driven training and optimization of isotropic diffractive layers was used to encode different linear transformations into different input/output polarization combinations, and this encoding is made possible by the polarization mode diversity introduced by a non-trainable, pre-determined array of linear polarizers within the diffractive volume.
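The role of the polarizer array as a "polarization seed" can be illustrated with Jones calculus. The following is a minimal sketch, not the disclosed design: it models a single isotropic feature as a scalar times the identity and an ideal linear polarizer by its standard Jones matrix, and it ignores free-space diffraction between elements. The function names and the transmission value are illustrative assumptions.

```python
import numpy as np

def linear_polarizer(theta_deg):
    """Jones matrix of an ideal linear polarizer with its axis at theta_deg."""
    t = np.deg2rad(theta_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c * c, c * s],
                     [c * s, s * s]])

def isotropic_feature(transmission):
    """An isotropic feature applies the same complex coefficient to x and y."""
    return transmission * np.eye(2)

x_in = np.array([1.0, 0.0])  # x (0 deg) linearly polarized input
y_in = np.array([0.0, 1.0])  # y (90 deg) linearly polarized input

# Hypothetical complex transmission coefficient of one diffractive feature.
t = 0.8 * np.exp(1j * 0.3)

# Isotropic features alone: x stays x, y stays y, both scaled identically.
iso_x = isotropic_feature(t) @ x_in
iso_y = isotropic_feature(t) @ y_in

# With polarizers at the four orientations named in the text, the 45 and
# 135 degree elements transfer part of an x-polarized field into y.
coupled_y = {angle: abs((linear_polarizer(angle) @ x_in)[1])
             for angle in (0, 45, 90, 135)}
print(coupled_y)
```

In this simplified picture the isotropic feature never mixes the two polarization components, while the 45° and 135° polarizers do (with opposite signs), which is the kind of polarization diversity the trainable layers can then exploit.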

[0008] In a first implementation, two different, arbitrarily-selected linear transformations (i.e., $N_p = 2$) were performed using a diffractive network composed of four transmissive layers that are jointly optimized using deep learning, where the first target linear transformation was assigned to x (0°) linear input and x linear output polarization combination, and the second target linear transformation was assigned to y (90°) linear input and y linear output polarization combination. For this case of $N_p = 2$, there are two different schemes (FIG. 1B) to all-optically access/implement the desired linear transformations: sequential (x and y input polarization states encode the input information sequentially, one after another) or simultaneous (x and y input polarizations encode the input information at the same time within the input FOV). The numerical results (FIGS. 2A-2E through 5A-5D) reveal that one can successfully train a diffractive network under each one of these operation modes (sequential vs. simultaneous) to approximate the two target, arbitrarily-selected linear transformations with a negligible error when the number of trainable diffractive neurons $N$ approaches $N_p N_i N_o = 2 N_i N_o$.
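One way to picture the difference between the sequential and simultaneous access schemes is to treat the trained network, together with its input and output polarizers, as a single linear operator acting on stacked x and y field components. The sketch below is an illustrative assumption rather than the disclosed formulation: the block-matrix view, the pixel counts, and the small cross-coupling blocks `t_xy` and `t_yx` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_i, n_o = 4, 4  # hypothetical pixel counts at the input and output FOVs

def random_complex(rows, cols, scale=1.0):
    """Random complex matrix used as a stand-in for fields and transforms."""
    return scale * (rng.standard_normal((rows, cols))
                    + 1j * rng.standard_normal((rows, cols)))

a1 = random_complex(n_o, n_i)  # target for x input -> x output
a2 = random_complex(n_o, n_i)  # target for y input -> y output

# Hypothetical residual cross-polarization coupling of a trained network.
t_xy = random_complex(n_o, n_i, scale=0.01)  # y input -> x output
t_yx = random_complex(n_o, n_i, scale=0.01)  # x input -> y output
system = np.block([[a1, t_xy],
                   [t_yx, a2]])

def run(in_x, in_y):
    """Apply the stacked operator and split the output into x and y parts."""
    out = system @ np.concatenate([in_x, in_y])
    return out[:n_o], out[n_o:]

i1 = random_complex(1, n_i)[0]
i2 = random_complex(1, n_i)[0]
zeros = np.zeros(n_i, dtype=complex)

# Sequential access: one polarization channel is illuminated at a time.
seq_x, _ = run(i1, zeros)
_, seq_y = run(zeros, i2)

# Simultaneous access: both channels are illuminated together.
sim_x, sim_y = run(i1, i2)

print(np.linalg.norm(seq_x - a1 @ i1))  # no contribution from the idle channel
print(np.linalg.norm(sim_x - a1 @ i1))  # equals the cross-coupled term
```

Under this assumption, simultaneous access additionally requires the cross-coupling blocks to be small, because both inputs are present at once.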

[0009] In a second implementation (FIG. 6), four different, arbitrary linear transformations (i.e., $N_p = 4$) were performed using a diffractive network composed of eight transmissive layers that are jointly optimized using deep learning and examples of input/output fields corresponding to the selected complex-valued linear transformations (ground truth). In this case, the first target transformation was assigned to x linear input and 45° linear output polarization combination, the second target transformation was assigned to y linear input and 135° linear output polarization combination, the third target transformation was assigned to x linear input and 135° linear output polarization combination and finally the fourth target transformation was assigned to y linear input and 45° linear output polarization combination. The analyses of this 4-channel polarization multiplexed diffractive system show that when $N > N_p N_i N_o = 4 N_i N_o$, all the target linear transformations can be successfully approximated, following a similar conclusion as in the first implementation case ($N_p = 2$).

[0010] Without the use of a non-trainable, pre-determined array of linear polarizers acting as polarization seeds within the network, none of these multiplexing results could be achieved using isotropic diffractive materials, no matter how they are trained or optimized, since they would normally perform the same transformation under different input polarization states.
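The neuron-count condition stated for the two implementations is simple arithmetic, sketched below. The pixel counts are illustrative assumptions, not values given in this text, and the even split of neurons across layers is likewise hypothetical. The channel table restates the four input/output polarization assignments of the second implementation.

```python
def neuron_threshold(n_p, n_i, n_o):
    """Number of trainable neurons N_p * N_i * N_o named in the text."""
    return n_p * n_i * n_o

# Hypothetical FOV sizes: 8 x 8 pixels at both the input and the output.
n_i = n_o = 8 * 8

two_channel = neuron_threshold(2, n_i, n_o)
four_channel = neuron_threshold(4, n_i, n_o)
print(two_channel, four_channel)  # 8192 16384

# If the four-channel budget were split evenly over eight layers:
per_layer = four_channel // 8
print(per_layer)  # 2048

# Channel assignments of the second implementation:
# (input polarization, output polarization angle in degrees).
channels = {
    1: ("x", 45),
    2: ("y", 135),
    3: ("x", 135),
    4: ("y", 45),
}
```

Each channel pairs one of two input states with one of two output states, so the four combinations are all distinct.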

[0011] It should be appreciated that these results should not be confused with polarization multiplexed (or wavelength/illumination multiplexed) projection of a set of desired complex fields at the output of a metamaterial design; such multiplexed metamaterial systems do not implement an arbitrary matrix multiplication operation. Each input-output polarization combination in this diffractive design represents an all-optical implementation of a unique linear transformation between the input and output FOVs. Therefore, for each input-output polarization combination, infinitely many different target complex fields can be all-optically synthesized by the trained diffractive network in response to different input field distributions; and this capability accurately defines the corresponding complex-valued linear transformation at the output FOV for all the possible and infinitely many combinations of phase and amplitude distributions at the input FOV.

[0012] A polarization multiplexed diffractive network can perform an arbitrary set of target linear transformations using the same diffractive layers that all-optically implement a distinct complex-valued linear transformation at a selected input/output polarization combination. It is believed that this unique framework will be valuable in developing high-throughput optical processors and polarization-based machine vision systems operating at different parts of the electromagnetic spectrum. Moreover, the presented diffractive computing platform and the underlying concepts can be used to develop polarization-aware optical information processing systems for, e.g., detection, localization, and statistical inference of objects with unique polarization properties.

[0013] In one embodiment, a polarization-encoded diffractive network includes one or more optically transmissive and/or reflective substrate layer(s) arranged in an optical path having one or more polarizer arrays interposed between the one or more optically transmissive and/or reflective substrate layer(s) along the optical path, each of the optically transmissive and/or reflective substrate layer(s) having a plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) and having different transmission and/or reflection properties as a function of the lateral coordinates across each substrate layer, wherein the one or more optically transmissive and/or reflective substrate layer(s) and the one or more polarizer arrays collectively define or approximate a plurality of distinct optical transformations between a vectorized input optical field input to the polarization-encoded diffractive network and a vectorized output optical field output from the polarization-encoded diffractive network; and wherein the one or more optically transmissive and/or reflective substrate layer(s) are designed during a training phase with the one or more polarizer arrays to define or optimize the plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) to perform or approximate the plurality of distinct optical transformations between the input and the output.

[0014] In another embodiment, a method of performing or approximating a plurality of transformations of a vectorized input optical field using a polarization-encoded diffractive network includes: (1) providing a polarization-encoded diffractive network having one or more optically transmissive and/or reflective substrate layer(s) arranged in an optical path having one or more polarizer arrays interposed between the one or more optically transmissive and/or reflective substrate layer(s) along the optical path, each of the optically transmissive and/or reflective substrate layer(s) having a plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) and having different transmission and/or reflection properties as a function of the lateral coordinates across each substrate layer, wherein the one or more optically transmissive and/or reflective substrate layer(s) and the one or more polarizer arrays collectively define or approximate a plurality of distinct optical transformations between the vectorized input optical field input to the polarization-encoded diffractive network and a vectorized output optical field output from the polarization-encoded diffractive network, wherein the one or more optically transmissive and/or reflective substrate layer(s) are designed during a training phase with the one or more polarizer arrays to define the plurality of physical features formed on or within the one or more optically transmissive or reflective substrate layer(s) to perform or approximate the plurality of distinct optical transformations; (2) inputting the vectorized input optical field to the polarization-encoded diffractive network; and (3) capturing the vectorized output optical field from the polarization-encoded diffractive network with one or more image sensors or detector arrays.

Brief Description of the Drawings

[0015] FIGS. 1A and 1B schematically illustrate embodiments of polarization multiplexed all-optical diffractive computing. FIG. 1A illustrates the optical layout of the polarization-encoded diffractive network, where four isotropic diffractive layers and one array of linear polarizers are jointly used to perform two distinct, complex-valued linear transformations between the input field i and the output field o by using polarization encoding/decoding at the input/output FOVs. FIG. 1B illustrates a schematic for the sequential polarization access (SeqPA, left - Operating Mode 1) mode and the simultaneous polarization access (SimPA, right - Operating Mode 2) mode that can be used to operate the 2-channel polarization multiplexed diffractive network.

[0016] FIGS. 2A-2E illustrate diffractive all-optical transformation results for 2-channel polarization multiplexing using the sequential polarization access (SeqPA) mode. FIG. 2A shows amplitude and phase of the arbitrarily generated matrices $A_1$ and $A_2$, which serve as the ground truth (target) for the diffractive all-optical transformations. FIG. 2B illustrates curves representing the normalized mean-squared error between the ground truth transformation matrices ($A_1$ and $A_2$) and the all-optical transforms ($A'_1$ and $A'_2$) resulting from the trained diffractive networks as a function of the number of diffractive neurons $N$. The solid curves are achieved by the polarization multiplexed diffractive networks trained using the SeqPA mode, which are compared with the dashed curves achieved by the regular diffractive networks trained with the same set of $N$ but without any polarization multiplexing. For the polarization multiplexed models, the results for the two polarization channels (1) and (2), corresponding to transforms $A'_1$ and $A'_2$, are shown in separate curves that are labeled with “SeqPA (1)” and “SeqPA (2)”, respectively. For the regular diffractive models without polarization multiplexing, the results for all-optical approximation of $A_1$ and $A_2$ are shown in separate curves labeled with “No pol. $A_1$” and “No pol. $A_2$”, respectively. The space between the simulation data points is linearly interpolated. FIG. 2C: same as FIG. 2B but the cosine similarity between the all-optical transforms and their ground truth shown in FIG. 2A is reported. FIG. 2D: same as FIG. 2B but the mean-squared error between the diffractive network output fields and their ground truth is reported. FIG. 2E: diffraction efficiency of the presented diffractive networks.
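The two matrix-level figures of merit named for FIGS. 2B and 2C can be sketched as follows. The exact normalization used for the reported curves is not specified in this text, so the sketch assumes one common choice: a single complex least-squares scale factor applied to the estimated transform before the error is computed, so that an overall loss or global phase does not count as error. The function names are illustrative.

```python
import numpy as np

def cosine_similarity(a_true, a_est):
    """Magnitude of the normalized complex inner product of two matrices."""
    a, b = a_true.ravel(), a_est.ravel()
    return np.abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

def normalized_mse(a_true, a_est):
    """Mean-squared error after an assumed least-squares complex rescaling."""
    a, b = a_true.ravel(), a_est.ravel()
    scale = np.vdot(b, a) / np.vdot(b, b)  # minimizes ||a - scale * b||
    return np.mean(np.abs(a - scale * b) ** 2)

rng = np.random.default_rng(1)
a_true = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

# A lossy, globally phase-shifted copy is treated as a perfect match.
a_scaled = 0.1 * np.exp(0.7j) * a_true

# Adding a perturbation lowers the similarity and raises the error.
noise = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
a_noisy = a_scaled + 0.05 * noise

print(cosine_similarity(a_true, a_scaled), normalized_mse(a_true, a_scaled))
print(cosine_similarity(a_true, a_noisy), normalized_mse(a_true, a_noisy))
```

Both metrics are insensitive to a global complex scale under this assumption, which separates transformation accuracy from diffraction efficiency.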

[0017] FIG. 3 illustrates all-optical transformation matrices estimated by the 2-channel polarization multiplexed diffractive designs trained using the SeqPA mode with $N = 44^2$, $92^2$ and $180^2$, and their differences from the ground truth matrices.

[0018] FIG. 4 illustrates examples of input/output complex fields for the ground truth transformations presented in FIGS. 2 and 3 along with the output fields computed by the 2-channel polarization multiplexed diffractive designs trained with the SeqPA mode using $N = 44^2$, $92^2$ and $180^2$. Note that $|\angle o - \angle o'|$ indicates the wrapped phase difference between the ground truth output field $o$ and the normalized diffractive network output field $o'$.
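The wrapped phase difference mentioned for FIG. 4 can be computed as in the following sketch. It assumes the difference is wrapped into the interval from 0 to π; how the network output field is normalized beforehand is not specified here, so that step is left out.

```python
import numpy as np

def wrapped_phase_difference(o_true, o_est):
    """Absolute phase difference between two complex fields, wrapped to [0, pi]."""
    delta = np.angle(o_true) - np.angle(o_est)
    # Re-wrapping through the complex exponential maps delta into (-pi, pi].
    return np.abs(np.angle(np.exp(1j * delta)))

# Phases of +3 rad and -3 rad differ by 6 rad on the real line, but the two
# phasors are only 2*pi - 6 rad apart on the unit circle.
example = wrapped_phase_difference(np.exp(3j), np.exp(-3j))
print(example, 2 * np.pi - 6)
```

The function also accepts arrays, so it can be applied pixel by pixel to a pair of output fields.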

[0019] FIGS. 5A-5D illustrate the diffractive all-optical transformation results for 2-channel polarization multiplexing using the simultaneous polarization access (SimPA) mode. FIG. 5A: curves representing the normalized mean-squared error between the ground truth transformation matrices ($A_1$ and $A_2$) and the all-optical transforms ($A'_1$ and $A'_2$) resulting from the trained diffractive networks as a function of the number of diffractive neurons $N$. The solid curves are achieved by the polarization multiplexed diffractive systems trained using the SimPA mode, which are compared with the dashed curves achieved by the regular diffractive networks trained with the same set of $N$ but without any polarization multiplexing. For the polarization multiplexed models, the results for the two polarization channels (1) and (2), corresponding to transforms $A'_1$ and $A'_2$, are shown in separate curves that are labeled with “SimPA (1)” and “SimPA (2)”, respectively. For the regular models without polarization multiplexing, the results for all-optical approximation of $A_1$ and $A_2$ are shown in separate curves labeled with “No pol. $A_1$” and “No pol. $A_2$”, respectively. The space between the simulation data points is linearly interpolated. FIG. 5B: same as FIG. 5A but the cosine similarity between the all-optical transforms and their ground truth is reported. FIG. 5C: same as FIG. 5A but the mean-squared error between the diffractive network output fields and their ground truth is reported. FIG. 5D: diffraction efficiency of the presented diffractive networks.

[0020] FIGS. 6A-6B are schematic illustrations of a 4-channel polarization multiplexed all-optical diffractive computing framework for performing four unique linear transformations through a single diffractive network. FIG. 6A is an optical layout of the polarization-encoded diffractive network, where eight trained diffractive layers and two arrays of linear polarizers are jointly used to perform four distinct, complex-valued linear transformations between the input field i and the output field o by using polarization encoding/decoding at the input/output FOVs. FIG. 6B is a schematic for the operation of the 4-channel polarization multiplexed all-optical computing framework, where the four polarization channels, (1), (2), (3) and (4), are formed by sequentially connecting one of the two input polarization states with one of the two output polarization states.

[0021] FIGS. 7A-7E illustrate the diffractive all-optical transformation results for 4-channel polarization multiplexing of four distinct arbitrary linear transforms (depicted in FIG. 6B). FIG. 7A: amplitude and phase of the arbitrarily generated matrices $A_1$, $A_2$, $A_3$ and $A_4$, which serve as the ground truth (target) for the diffractive all-optical transformations. FIG. 7B: curves representing the normalized mean-squared error between the ground truth transformation matrices ($A_1$, $A_2$, $A_3$ and $A_4$) and the all-optical transforms ($A'_1$, $A'_2$, $A'_3$ and $A'_4$, examples shown in FIG. 11) resulting from the trained diffractive networks as a function of the number of diffractive neurons $N$. The solid curves are achieved by the 4-channel polarization multiplexed diffractive systems, which are compared with the dashed curves achieved by the regular diffractive networks trained with the same set of $N$ but without polarization multiplexing. For the polarization multiplexed models, the results for the four polarization channels (1), (2), (3) and (4) are shown in separate curves but jointly labeled with “Pol. (1)/(2)/(3)/(4)” due to the spatial overlap of these curves. For the regular diffractive models without polarization multiplexing, the results for all-optical approximation of $A_1$, $A_2$, $A_3$ and $A_4$ (individually) are shown in separate curves but jointly labeled with “No pol. $A_1$/$A_2$/$A_3$/$A_4$” due to the spatial overlap of these curves. The space between the simulation data points is linearly interpolated. FIG. 7C: same as FIG. 7B but the cosine similarity between the all-optical transforms and their ground truth is reported. FIG. 7D: same as FIG. 7B but the mean-squared error between the diffractive network output fields and their ground truth is reported. FIG. 7E: diffraction efficiency of the presented diffractive networks.

[0022] FIG. 8: Phase and amplitude differences between the linear transformation matrices $A_1$, $A_2$, $A_3$ and $A_4$.

[0023] FIGS. 9A-9D: Diffractive all-optical transformation results of blindly testing the SeqPA-trained 2-channel polarization multiplexed diffractive networks under the SimPA mode. FIG. 9A: curves representing the normalized mean-squared errors between the ground truth transformation matrices (A_1 and A_2, shown in FIGS. 2A and 3) and their all-optical transforms (A'_1 and A'_2) resulting from the trained diffractive networks as a function of N. The dash-dotted curves (labeled with “Trained with SeqPA but tested with SimPA”) are achieved by the 2-channel polarization multiplexed diffractive network trained under the SeqPA mode but then immediately tested under the SimPA mode, which are compared with the solid curves achieved by the same diffractive networks both trained and tested under the SeqPA mode (labeled with “SeqPA (1)/(2)”), and the long dashed curves achieved by different diffractive networks both trained and tested under the SimPA mode (labeled with “SimPA (1)/(2)”). Note that the results for the two polarization channels (1) and (2), corresponding to transforms A'_1 and A'_2, are shown in separate curves but jointly labeled with “...”, respectively, due to the spatial overlap of these curves. The space between the simulation data points is linearly interpolated. FIG. 9B: same as FIG. 9A, but the cosine similarity between the all-optical transforms and their ground truth is reported. FIG. 9C: same as FIG. 9B, but the mean-squared error between the diffractive network-estimated output fields and their ground truth is reported. FIG. 9D: diffraction efficiency of the presented diffractive networks.

[0024] FIGS. 10A-10E: Crosstalk analysis of the 2-channel polarization multiplexed diffractive networks trained using the SeqPA and SimPA modes. Note that o'_12 represents the crosstalk component from the polarization channel (1) to (2) measured at the output field of the diffractive network, while o'_21 represents the crosstalk component from the polarization channel (2) to (1) measured at the output field of the diffractive network.

[0025] FIG. 11: All-optical transformation matrices estimated by the 4-channel polarization multiplexed diffractive designs with N ≈ 16.3k and 65.5k, and their differences from the ground truth matrices.

[0026] FIG. 12: Examples of input/output complex fields for the ground truth transformations presented in FIGS. 7 and 11 along with the output fields computed by the 4-channel polarization multiplexed diffractive designs using N ≈ 16.3k and 65.5k. Note that |∠o − ∠o'| indicates the wrapped phase difference between the ground truth output field o and the normalized diffractive network output field o'.

[0027] FIGS. 13A-13D: Diffractive all-optical transformation results for 4-channel polarization multiplexing of four distinct arbitrary linear transforms (depicted in FIG. 6) using phase-only diffractive networks. FIG. 13A: curves representing the normalized mean-squared error between the ground truth transformation matrices (A_1, A_2, A_3 and A_4) and the all-optical transforms (A'_1, A'_2, A'_3 and A'_4) resulting from the trained diffractive networks as a function of N. The solid curves are achieved by the 4-channel polarization multiplexed, phase-only diffractive networks, which are compared with the dashed curves achieved by the 4-channel polarization multiplexed, complex-valued diffractive networks, and the dash-dotted curves achieved by the regular diffractive networks (without polarization multiplexing). For the polarization multiplexed phase-only diffractive models, the results for the four polarization channels (1), (2), (3) and (4) are shown in separate curves but jointly labeled with “Phase-only pol. (1)/(2)/(3)/(4)” due to the spatial overlap of these curves. For the polarization multiplexed complex-valued diffractive models, the results for the four polarization channels (1), (2), (3) and (4) are shown in separate curves but jointly labeled with “Complex-valued pol. (1)/(2)/(3)/(4)” due to the spatial overlap of these curves. For the regular diffractive models without polarization multiplexing, the results for all-optical approximation of A_1, A_2, A_3 and A_4 (individually) are shown in separate curves but jointly labeled with “No pol. A_1/A_2/A_3/A_4” due to the spatial overlap of these curves. The space between the simulation data points is linearly interpolated. FIG. 13B: same as FIG. 13A but cosine similarity between the all-optical transforms and their ground truth is reported. FIG. 13C: same as FIG. 13A but the mean-squared error between the diffractive network output fields and their ground truth is reported. FIG. 13D: diffraction efficiency of the presented diffractive networks.

[0028] FIGS. 14A-14E: Diffractive all-optical transformation results for 4-channel polarization multiplexing of four distinct arbitrarily-selected real-valued linear transforms using phase-only diffractive networks. FIG. 14A: arbitrarily-selected real-valued matrices R_1, R_2, R_3 and R_4, which serve as the ground truth (target) for the diffractive all-optical transformations. FIG. 14B: curves representing the normalized mean-squared error between the ground truth transformation matrices (R_1, R_2, R_3 and R_4) and the all-optical transforms (R'_1, R'_2, R'_3 and R'_4) resulting from the trained phase-only diffractive networks as a function of N. The space between the simulation data points is linearly interpolated. FIG. 14C: same as FIG. 14B but cosine similarity between the all-optical transforms and their ground truth is reported. FIG. 14D: same as FIG. 14B but the mean-squared error between the diffractive network output fields and their ground truth is reported. FIG. 14E: diffraction efficiency of the presented diffractive networks.

[0029] FIG. 15 illustrates a single substrate layer of the polarization-encoded diffractive network. The substrate layer may be made from a material that is optically transmissive (for transmission mode) or an optically reflective material (for reflective mode). The substrate layer, which may be formed as a substrate or plate in some embodiments, has surface features formed across the substrate layer. The surface features form a patterned surface (e.g., an array) having different valued transmission (or reflection) properties as a function of lateral coordinates across each substrate layer. These surface features act as artificial “neurons” that connect to other “neurons” of other substrate layers of the optical neural network through optical diffraction (or reflection) and alter the phase and/or amplitude of the light wave.

[0030] FIG. 16 schematically illustrates a cross-sectional view of a single substrate layer of a polarization-encoded diffractive network according to one embodiment. In this embodiment, the surface features are formed by adjusting the thickness of the substrate layer that forms the optical neural network. These different thicknesses may define peaks and valleys in the substrate layer that act as the artificial “neurons.”

[0031] FIG. 17 schematically illustrates a cross-sectional view of a single substrate layer of a polarization-encoded diffractive network according to another embodiment. In this embodiment, the different surface features are formed by altering the material composition or material properties of the single substrate layer at different lateral locations across the substrate layer. This may be accomplished by doping the substrate layer with a dopant or incorporating other optical materials into the substrate layer. Metamaterials or plasmonic structures may also be incorporated into the substrate layer.

[0032] FIG. 18 schematically illustrates a cross-sectional view of a single substrate layer of a polarization-encoded diffractive network according to another embodiment. In this embodiment, the substrate layer is reconfigurable in that the optical properties of the various artificial neurons may be changed, for example, by application of a stimulus (e.g., electrical current or field). An example includes spatial light modulators (SLMs) which can change their optical properties. In this embodiment, the neuronal structure is not fixed and can be dynamically changed or tuned as appropriate. This embodiment, for example, can provide a learning diffractive network or a changeable diffractive network that can be altered on-the-fly (e.g., over time) to improve the performance, compensate for aberrations, or even switch to another task.

[0033] FIGS. 19A-19B schematically illustrate an embodiment of the polarization-encoded diffractive network operating in “sequential polarization access” (SeqPA) mode.

[0034] FIG. 20 schematically illustrates an embodiment of the polarization-encoded diffractive network operating in the “simultaneous polarization access” (SimPA) mode.

[0035] FIG. 21 illustrates a polarization-encoded diffractive network with a holder that holds the substrate layers and the polarizers.

[0036] FIGS. 22A-22D illustrate diffractive all-optical transformation results for 4-channel polarization multiplexing of four distinct arbitrary linear transforms using orthogonal circular polarization states at the input field-of-view. Left- and right-hand circular polarization states (i.e., LHCP and RHCP) are used at the input of a polarization-multiplexed diffractive network to encode the input information, and x and y linear polarization states are used at the output of the diffractive network. Four different, arbitrarily-selected, complex-valued linear transformations were each assigned to one combination of circular-linear polarization. FIG. 22A: curves representing the normalized mean-squared error between the ground truth transformation matrices (A_1, A_2, A_3 and A_4, shown in FIG. 7A) and the all-optical transforms (A'_1, A'_2, A'_3 and A'_4) resulting from the trained diffractive networks as a function of N. The space between the simulation data points is linearly interpolated. FIG. 22B: same as FIG. 22A but the cosine similarity between the all-optical transforms and their ground truth is reported. FIG. 22C: same as FIG. 22B but the mean-squared error between the diffractive network output fields and their ground truth is reported. FIG. 22D: diffraction efficiency of the presented diffractive networks.

[0037] FIGS. 23A-23C illustrate the analysis of the impact of the polarizer array parameters on the computational performance of polarization multiplexed diffractive networks. FIG. 23A illustrates the transformation error as a function of the period of each linear polarizer unit on the polarizer arrays. FIG. 23B illustrates the transformation error as a function of the overall size of each polarizer array. FIG. 23C illustrates the transformation error as a function of the number and position of the polarizer arrays in the diffractive network. Note that for FIGS. 23A-23C each histogram has data presented for the different polarizer arrays ((1), (2), (3), (4)) arranged from left to right.

[0038] FIGS. 24A-24D illustrate the polarization extinction ratio (PER)-related analysis of polarization multiplexed diffractive networks. Note that for FIGS. 24A-24C each histogram has data presented for the different polarizer arrays ((1), (2), (3), (4)) arranged from left to right.

Detailed Description of Illustrated Embodiments

[0039] With reference to FIGS. 1A, 1B, 6A, 6B, 19A, 19B, 20, and 21, the polarization-encoded diffractive network 10 contains a plurality of substrate layers 12 that are physical layers which may be formed as a physical substrate or matrix of optically transmissive material (for transmission mode) or optically reflective material (for reflective mode). In transmission mode, light or radiation passes through the substrate layers 12 (as seen in FIGS. 1A, 1B, 6A, 6B, 19A, 19B, 20, and 21). Conversely, in reflective mode, light or radiation reflects off the substrate layer(s) 12. Exemplary materials that may be used for the substrate layers 12 include polymers and plastics (e.g., those used in additive manufacturing techniques such as 3D printing) as well as semiconductor-based materials (e.g., silicon and oxides thereof, gallium arsenide and oxides thereof), crystalline materials or amorphous materials such as glass and combinations of the same. Metal coated materials may be used for reflective substrate layers 12.

[0040] The input 14 to the polarization-encoded diffractive network 10 is a vectorized optical field as best illustrated in FIGS. 1A, 1B, 6A, 6B, 19A, 19B, and 20, and the output 16 of the polarization-encoded diffractive network 10 is also a vectorized optical field that has been subject to the linear transformation(s) encoded by the polarization-encoded diffractive network 10. The input 14 and the output 16 may each comprise an array of pixel values. The input 14 may include a two-dimensional complex data field in one embodiment. The output vectorized optical field 16 is captured by one or more image sensors 20 (seen in FIGS. 19A, 19B, and 20). With reference to FIGS. 19A, 19B, 20, and 21, polarizers 18 are located on the input and output sides of the polarization-encoded diffractive network 10. That is to say, a first polarizer 18 is located before the first substrate layer 12 of the polarization-encoded diffractive network 10 while a second polarizer 18 is located after the last substrate layer 12. The output polarizers 18 or “analyzers” act as a filter so that the output optical field 16 seen by the one or more image sensors 20 contains the desired polarization. As seen in FIGS. 19A, 19B, and 20, the different polarizations of the output polarizers 18 are used for different channels.

[0041] The polarization-encoded diffractive network 10 operates as follows. Light or radiation that includes a vectorized optical field as the input 14 enters the optical path that contains one or more optically transmissive and/or reflective substrate layer(s) 12 arranged in the optical path and having one or more polarizer arrays 24 interposed between the one or more optically transmissive and/or reflective substrate layer(s) 12. An output 16 is generated that is captured by one or more image sensors 20. The output 16 that is captured with the one or more image sensors 20 has undergone a plurality of distinct optical transformations as compared to the input 14. The substrate layer(s) 12 together with the polarizer array(s) 24 perform or approximate the plurality of distinct optical transformations.

[0042] With reference to FIGS. 15-18, each substrate layer 12 of the polarization-encoded diffractive network 10 has a plurality of physical features 22 formed on the surface of the substrate layer 12 or within the substrate layer 12 itself that collectively define a pattern of physical locations along the length and width of each substrate layer 12 that have varied transmission properties (or varied reflection properties). The physical features 22 formed on or in the substrate layers 12 thus create a pattern of physical locations on or within the substrate layers 12 that have different valued transmission properties as a function of lateral coordinates (e.g., length and width and in some embodiments depth) across each substrate layer 12 (or reflective properties for the reflective mode operation). In some embodiments, each separate physical feature 22 may define a discrete physical location on the substrate layer 12 while in other embodiments, multiple physical features 22 may combine or collectively define a physical region with a particular transmission (or reflection) property. The one or more substrate layers 12 are arranged along the optical path along with the one or more polarizer arrays 24 interposed between the one or more substrate layers 12 and collectively define or approximate a transformation between the vectorized input optical field 14 to the one or more optically transmissive and/or reflective substrate layer(s) 12 and the vectorized output optical field 16 created by optical diffraction through and/or optical reflection from the one or more optically transmissive and/or reflective substrate layer(s) 12 within the polarization-encoded diffractive network 10, reflecting the different transformations to be performed/implemented or approximated by the polarization-encoded diffractive network 10. In one embodiment, the transformation that is implemented or approximated is a linear transformation. For example, the polarizer array(s) 24 may include linear polarizers at a plurality of orientations. In another embodiment, the polarizer array(s) 24 may include circular and/or elliptical polarizers at a plurality of orientations. A single polarization-encoded diffractive network 10 is able to perform a plurality of unique linear transformations assigned to different input/output polarization combinations.

[0043] The pattern formed by the physical features 22 on or within the substrate layer 12 may define, in some embodiments, an array located across the surface of the substrate layer 12. With reference to FIG. 15, the substrate layer 12 in one embodiment is a two-dimensional generally planar substrate having a length (L), width (W), and thickness (t) that all may vary depending on the particular application. In other embodiments, the substrate layer 12 may be non-planar such as, for example, curved. In addition, while FIG. 15 illustrates a rectangular or square-shaped substrate layer 12, different geometries are contemplated. The physical features 22 and the physical regions formed thereby act as artificial “neurons” that connect to other “neurons” of other substrate layers 12 of the polarization-encoded diffractive network 10 through optical diffraction (and/or reflection) and alter the phase and/or amplitude of the light wave. The particular number and density of the physical features 22 or artificial neurons that are formed in each substrate layer 12 may vary depending on the type of application. In some embodiments, the total number of artificial neurons may only need to be in the hundreds or thousands while in other embodiments, hundreds of thousands or millions of neurons or more may be used. Likewise, the number of substrate layers 12 that are used in a particular polarization-encoded diffractive network 10 may vary, although it typically ranges from at least two (2) substrate layers 12 to less than ten (10) substrate layers 12.

[0044] After a last substrate layer 12 in the optical path, one or more image sensors 20 (which may include detector array(s)) is/are provided that captures the vectorized output optical field 16. It should be appreciated that in some embodiments a polarizer 18 is interposed between the last substrate layer 12 and the one or more image sensors 20 and filters the vectorized output optical field 16 so that it contains the desired polarization. The image sensor 20 may include, for example, a CMOS image sensor or image chip such as CCD, although the image sensor(s) 20 may also include photodetectors (e.g., photodiode such as avalanche photodiode detector (APD)), photomultiplier (PMT) device, detector array(s), and the like.

[0045] FIG. 16 illustrates one embodiment of how different physical features 22 are formed in the substrate layer 12. In this embodiment, a substrate layer 12 has different thicknesses (t) of material at different lateral locations along the substrate layer 12. In one embodiment, the different thicknesses (t) modulate the phase of the light passing through the substrate layer 12. This type of physical feature 22 may be used, for instance, in the transmission mode embodiment of FIGS. 1A, 1B, 6A, 6B, 19A, 19B, 20, and 21. The different thicknesses of material in the substrate layer 12 form a plurality of discrete “peaks” and “valleys” that control the transmission properties of the neurons formed in the substrate layer 12. The different thicknesses of the substrate layer 12 may be formed using additive manufacturing techniques (e.g., 3D printing) or lithographic methods utilized in semiconductor processing. For example, the design of the substrate layers 12 may be stored in a stereolithographic file format (e.g., .stl file format) which is then used to 3D print the substrate layers 12. Other manufacturing techniques include well-known wet and dry etching processes that can form very small lithographic features on a substrate layer 12. Lithographic methods may be used to form very small and dense physical features 22 on the substrate layer 12 which may be used with shorter wavelengths of the light. As seen in FIG. 16, in this embodiment, the physical features 22 are fixed in a permanent state (i.e., the surface profile is established and remains the same once complete).
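As an illustration of how a thickness profile can modulate phase, the following is a minimal sketch that assumes a lossless layer surrounded by air and uses the standard thin-element relation, phase = 2π(n − n_ambient)t/λ. The function names, the refractive index, and the wavelength are hypothetical placeholders and are not taken from this disclosure.

```python
import numpy as np

def thickness_to_phase(thickness, wavelength, n_material, n_ambient=1.0):
    """Phase delay (radians) of a feature of given thickness relative to
    the same path length in the ambient medium. Absorption is ignored."""
    return 2.0 * np.pi * (n_material - n_ambient) * thickness / wavelength

def phase_to_thickness(phase, wavelength, n_material, n_ambient=1.0):
    """Smallest thickness that produces the requested phase (wrapped to [0, 2*pi))."""
    wrapped = np.mod(phase, 2.0 * np.pi)
    return wrapped * wavelength / (2.0 * np.pi * (n_material - n_ambient))

# Placeholder values for illustration only (same length unit for both).
wavelength = 0.75   # hypothetical illumination wavelength
n_material = 1.7    # hypothetical refractive index of the layer material

target_phase = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
heights = phase_to_thickness(target_phase, wavelength, n_material)
recovered = thickness_to_phase(heights, wavelength, n_material)
```

In such a sketch, a trained phase map would be converted to a height map before export to a fabrication file; a practical design would also account for material absorption and a base thickness for mechanical support.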

[0046] FIG. 17 illustrates another embodiment in which the physical features 22 are created or formed within the substrate layer 12. In this embodiment, the substrate layer 12 may have a substantially uniform thickness, but different regions of the substrate layer 12 have different optical properties. For example, the refractive (or reflective) index of the substrate layers 12 may be altered by doping the substrate layers 12 with a dopant (e.g., ions or the like) to form the regions of neurons in the substrate layers 12 with controlled transmission and/or reflection properties (or absorption and/or spectral features). In still other embodiments, optical nonlinearity can be incorporated into the network design using various optical non-linear materials (e.g., crystals, polymers, semiconductor materials, doped glasses, polymers, organic materials, semiconductors, graphene, quantum dots, carbon nanotubes, and the like) that are incorporated into the substrate layer 12. A masking layer or coating that partially transmits or partially blocks light in different lateral locations on the substrate layer 12 may also be used to form the neurons on the substrate layers 12.

[0047] Alternatively, the transmission function of the physical features 22 or neurons can also be engineered by using metamaterial or plasmonic structures. Combinations of all these techniques may also be used. In other embodiments, non-passive components may be incorporated into the substrate layers 12 such as spatial light modulators (SLMs). SLMs are devices that impose spatially varying modulation of the phase, amplitude, or polarization of light. SLMs may include optically addressed SLMs and electrically addressed SLMs. Electric SLMs include liquid crystal-based technologies that are switched by using thin-film transistors (for transmission applications) or silicon backplanes (for reflective applications). Another example of an electric SLM includes magneto-optic devices that use pixelated crystals of aluminum garnet switched by an array of magnetic coils using the magneto-optical effect. Additional electronic SLMs include devices that use nanofabricated deformable or moveable mirrors that are electrostatically controlled to selectively deflect light.

[0048] FIG. 18 schematically illustrates a cross-sectional view of a single substrate layer 12 of a polarization-encoded diffractive network 10 according to another embodiment. In this embodiment, the substrate layer 12 is reconfigurable in that the optical properties of the various physical features 22 that form the artificial neurons may be changed, for example, by application of a stimulus (e.g., electrical current or field). An example includes spatial light modulators (SLMs) discussed above which can change their optical properties. In other embodiments, the layers may use the DC electro-optic effect to introduce optical nonlinearity into the substrate layers 12 of the polarization-encoded diffractive network 10 and require a DC electric-field for each substrate layer 12 of the polarization-encoded diffractive network 10. This electric-field (or electric current) can be externally applied to each substrate layer 12 of the polarization-encoded diffractive network 10. Alternatively, one can also use poled materials with very strong built-in electric fields as part of the material (e.g., poled crystals or glasses). In this embodiment, the neuronal structure is not fixed and can be dynamically changed or tuned as appropriate (i.e., changed on demand). This embodiment, for example, can provide a learning polarization-encoded diffractive network 10 or a changeable polarization-encoded diffractive network 10 that can be altered on-the-fly to improve the performance, compensate for aberrations, or even switch to another task.

[0049] As explained herein, a computerized or digital model of the polarization-encoded diffractive network 10 is first digitally trained using a computing device. Here, the digital model of the polarization-encoded diffractive network 10 is trained to all-optically perform a plurality of arbitrarily-selected linear transformations. Software that is executed by a computing device digitally trains a model or mathematical representation of the multi-layer diffractive and/or reflective substrate layers 12 used within the polarization-encoded diffractive network 10 to perform the selected linear transformations. This training establishes the particular transmission and/or reflection properties of the physical features 22 and/or neurons formed in the substrate layers 12 to perform the different linear transformations.

[0050] Next, using the established or trained model and design for the physical embodiment of the polarization-encoded diffractive network 10, the actual substrate layers 12 used in the physical embodiment (FIGS. 19A, 19B, 20, 21) of the polarization-encoded diffractive network 10 are then manufactured in accordance with the model or design. The design, in some embodiments, may be embodied in a software format (e.g., SolidWorks, AutoCAD, Inventor, or other computer-aided design (CAD) program or lithographic software program) and may then be manufactured into a physical embodiment that includes the plurality of substrate layers 12 having the tailored physical features 22 formed therein/thereon. The physical substrate layers 12, once manufactured, may be mounted or disposed in a holder 26 as seen in FIG. 21 to maintain the appropriate spacing between the substrate layers 12. The holder 26, for example, may include a number of slots formed therein to hold the individual substrate layers 12, the polarizer array(s) 24, and/or the polarizers 18 in the required sequence and with the required spacing between adjacent substrate layers 12 and the polarizer array(s) 24. The components held in the holder 26 may be removable. The physical substrate layers 12 and the polarizer array(s) 24 may also be integrated into a monolithic structure in other embodiments. The substrate layers 12 may also be incorporated into a waveguide like an optical fiber.

[0051] FIGS. 19A and 19B schematically illustrate an embodiment of the polarization-encoded diffractive network 10 operating in “sequential polarization access” (SeqPA) mode. In this mode, two different polarization channels of the same polarization-encoded diffractive network 10 are accessed sequentially. Specifically, for the first channel, the input optical field 14 is input to the polarization-encoded diffractive network 10 with the polarizers 18 in a first configuration (e.g., 0°). This is followed by inputting the input optical field 14 for the second channel to the same polarization-encoded diffractive network 10 with the polarizers 18 in a second polarization configuration (e.g., 90°). The output optical fields 16 are captured by the image sensor 20. The different polarization states may be accomplished by swapping out different polarizers 18 or repositioning the polarizers 18.

[0052] FIG. 20 schematically illustrates an embodiment of the polarization-encoded diffractive network 10 operating in the “simultaneous polarization access” (SimPA) mode. In this mode, two different polarization channels of the same polarization-encoded diffractive network 10 are accessed simultaneously. A beam combiner 28 and beam splitter 30 are used, respectively, at the front and back ends of the polarization-encoded diffractive network 10 for simultaneous polarization along two different channels. Specifically, for the first channel, the input optical field 14 (e.g., input optical field 1) is input to the polarization-encoded diffractive network 10 with a corresponding set of polarizers 18 in a first configuration (e.g., 0°). At the same time, the input optical field 14 (e.g., input optical field 2) is input to the same polarization-encoded diffractive network 10 with a corresponding second set of polarizers 18 in a second polarization configuration (e.g., 90°). Separate image sensors 20, along with respective polarizers 18 interposed in the optical path between the last substrate layer 12 and the respective image sensors 20, are used to capture the respective output optical fields 16 (e.g., output optical field 1 and output optical field 2).
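The difference between the two access modes can be sketched with a generic linear, polarization-resolved system response. This is only an illustrative model under stated assumptions: channel 1 is taken to be x-polarized input to x-polarized output and channel 2 is taken to be y-polarized input to y-polarized output, and the blocks in `T` are random placeholders rather than a trained diffractive network. All names (`T`, `rand_complex`, `o1_seq`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # small placeholder field size

def rand_complex(shape):
    """Random complex array used only as a stand-in for real data."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Hypothetical polarization-resolved response of a linear optical system:
# T[(out_pol, in_pol)] maps an in_pol-polarized input field to the
# out_pol-polarized part of the output field.
T = {(a, b): rand_complex((n, n)) for a in "xy" for b in "xy"}

i1 = rand_complex(n)  # input field of channel 1 (assumed x-polarized)
i2 = rand_complex(n)  # input field of channel 2 (assumed y-polarized)

# Sequential access: only one input polarization is present at a time,
# and the analyzer selects the matching output polarization.
o1_seq = T[("x", "x")] @ i1
o2_seq = T[("y", "y")] @ i2

# Simultaneous access: both inputs are present, so each analyzer also
# passes the cross-polarized contribution of the other channel.
o1_sim = T[("x", "x")] @ i1 + T[("x", "y")] @ i2
o2_sim = T[("y", "y")] @ i2 + T[("y", "x")] @ i1

# Crosstalk is the part of each simultaneous output caused by the other channel.
crosstalk_2_to_1 = o1_sim - o1_seq
crosstalk_1_to_2 = o2_sim - o2_seq
```

Under these assumptions the crosstalk terms vanish only if the cross-polarized blocks of the system response are zero, which is why a network trained in one access mode need not behave identically in the other.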

[0053] The distinct optical transformations performed by the polarization-encoded diffractive network 10 may include a number of operations or tasks. For example, the one or more distinct optical transformations may include at least one of: object classification, object segmentation, object imaging, object detection, object image filtering, object image compression, object image encryption, and an analog logical operation.

[0054] The light or radiation that is passed through the polarization-encoded diffractive network 10 may include light emitted from a number of different types of light sources. This may include narrowband or broadband sources. In addition, different portions of the electromagnetic spectrum may be used. This includes ultra-violet, visible, infrared, terahertz, or microwave parts of the electromagnetic spectrum.

[0055] Experimental

[0056] Results

[0057] Throughout this Experimental section, the terms “diffractive optical network,” “diffractive network,” and “diffractive processor” are interchangeably used and describe the polarization-encoded diffractive network 10. The schematic of the framework for 2-channel polarization multiplexed all-optical computing (N_p = 2) is shown in FIG. 1A. A polarization-encoded diffractive neural network 10, composed of four (4) trainable substrate layers 12, is trained to all-optically perform two (2) distinct, complex-valued linear transformations between the input and output FOVs through two (2) orthogonal polarization channels. The pre-determined polarizer array 24 (which is treated as non-trainable) consists of multiple repeating linear polarizer units with four different polarization directions: 0°, 45°, 90° and 135°. This non-trainable polarizer array 24 is positioned close to the center of the diffractive volume (i.e., between the 2nd and 3rd trainable diffractive layers 12) so that the resulting polarization modulation does not directly dominate the output field; the former and latter diffractive layers are jointly optimized to effectively communicate with the polarizer array 24 and all-optically implement the desired group of linear transformations. More details about the architecture, optical forward model and training details of the polarization diffractive network 10 can be found in the Methods section.
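To make the role of the four polarizer directions concrete, the following is a minimal sketch using standard Jones calculus for ideal linear polarizers. The 2 × 2 arrangement of the four directions within one repeating unit, and the names `linear_polarizer`, `unit_angles` and `polarizer_array_angles`, are assumptions made for illustration; the actual unit layout is described in the Methods section.

```python
import numpy as np

def linear_polarizer(theta_deg):
    """Jones matrix of an ideal linear polarizer whose transmission axis
    makes an angle theta_deg with the x axis."""
    t = np.deg2rad(theta_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c * c, c * s],
                     [c * s, s * s]])

# One repeating unit holding the four directions named in the text.
# The spatial arrangement within the unit is an assumption.
unit_angles = np.array([[0.0, 45.0],
                        [90.0, 135.0]])

def polarizer_array_angles(rows, cols):
    """Tile the unit over rows x cols polarizer elements (both must be even)."""
    return np.tile(unit_angles, (rows // 2, cols // 2))

angles = polarizer_array_angles(4, 4)

# An x-polarized Jones vector passing through the 45-degree element
# leaves with equal x and y components of amplitude 0.5.
e_in = np.array([1.0, 0.0])
e_out = linear_polarizer(45.0) @ e_in
```

Because the 45° and 135° elements couple the x and y field components, a fixed array of this kind gives the surrounding isotropic layers a way to treat the two input polarizations differently.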

[0058] i and o are used to denote the complex-valued, vectorized versions of the 2D input and output complex fields 14, 16 located at the input and output FOVs of the diffractive network 10, respectively, as presented in FIG. 1A. Based on the scalar diffraction theory, here i_x and o_x represent the column vectors of the complex fields generated by sampling the x-polarized optical fields within the input and output FOVs, respectively, and vectorizing the resulting 2D matrices in a column-major order. Similar to i_x and o_x, i_y and o_y are their counterparts generated by sampling the y-polarized optical fields within the input and output FOVs, respectively. Based on this notation, (i_x, i_y) and (o_x, o_y) can be considered to represent the input and output channels of the polarization multiplexed diffractive network 10, respectively. In the analyses performed, the number of pixels in the input and output FOVs are both taken as N_i = N_o = 8^2 = 64, such that each target linear transformation matrix has 64^2 complex-valued entries.
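The column-major vectorization described above can be sketched as follows. The field values are random placeholders and the variable names are hypothetical; only the 8 × 8 size and the column-major ordering come from the text.

```python
import numpy as np

# Hypothetical 8 x 8 complex field sampled within a FOV (placeholder values).
rng = np.random.default_rng(0)
field_2d = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))

# Column-major ("Fortran") order stacks the columns of the 2D matrix
# one after another into a single vector of length 64.
i_x = field_2d.flatten(order="F")

# The same ordering must be used to map a vector back to a 2D field.
restored = i_x.reshape((8, 8), order="F")
```

With this ordering, element k of the vector corresponds to row k mod 8 and column k // 8 of the sampled field.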

[0059] In this first implementation with N_p = 2, two complex-valued matrices, A_1 and A_2, were randomly generated, each with a size of N_i × N_o = 64^2, to serve as two unique, arbitrary linear transformations to be all-optically implemented using a single polarization diffractive network 10. Visualized in FIG. 2A with their amplitude and phase components, these two matrices were independently generated using different random seeds, and the difference between the two matrices can be found in FIG. 8. Two training sets of complex-valued vectors, {i_1} and {i_2}, each of length N_i = 64, were randomly generated as input fields, and the corresponding sets of output field vectors, {o_1} and {o_2}, were constructed using o_1 = A_1 i_1 and o_2 = A_2 i_2, respectively. For each one of these training sets, {i_1} and {i_2}, 55,000 randomly generated complex fields were used in the training process. A further increase in the size of this training dataset (to e.g., >100,000 randomly generated complex fields) could improve the transformation approximation accuracy of the trained diffractive networks, but would not change the general conclusions and is therefore left as future work.

[0060] Based on the given inputs {i_1} and {i_2}, the ultimate goal of training the polarization-encoded diffractive network 10 is to simultaneously compute all-optical output fields {o_1'} and {o_2'} that come close to the output ground truth (target) fields {o_1} and {o_2}; this way, the all-optical transformations A_1' and A_2' performed by the trained single diffractive system represent an accurate approximation to their ground truth (target) transformation matrices A_1 and A_2. It should be emphasized that the aim was not to train the diffractive network to implement the correct linear transformations for only a few input-output field pairs. Instead, despite the limited number of input/output field patterns used during the training process, the goal is to generalize to any pairs of (i_1, o_1) and (i_2, o_2) that satisfy o_1 = A_1 i_1 and o_2 = A_2 i_2. More details about the training data generation can be found in Methods.
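The construction of the target matrices and the input/output training pairs can be sketched as follows. This is a minimal sketch under stated assumptions: the uniform amplitude and phase distributions, the seed values, the helper names random_complex and make_dataset, and the reduced sample count (1,000 instead of 55,000) are all hypothetical choices for illustration; the actual data generation is described in Methods.

```python
import numpy as np

N_I = N_O = 64  # pixels in the input and output FOVs


def random_complex(shape, seed):
    """Random complex array (assumed: uniform amplitude and phase)."""
    rng = np.random.default_rng(seed)
    amplitude = rng.uniform(0.0, 1.0, size=shape)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=shape)
    return amplitude * np.exp(1j * phase)


def make_dataset(transform, n_samples, seed):
    """Return input vectors (one per row) and targets o = A i."""
    inputs = random_complex((n_samples, transform.shape[1]), seed)
    outputs = inputs @ transform.T  # row k equals transform @ inputs[k]
    return inputs, outputs


# Two independently seeded target transformations.
A1 = random_complex((N_O, N_I), seed=1)
A2 = random_complex((N_O, N_I), seed=2)

# One dataset per transformation (small sample count for illustration).
i1, o1 = make_dataset(A1, n_samples=1000, seed=11)
i2, o2 = make_dataset(A2, n_samples=1000, seed=12)
```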

[0061] To form two unique diffractive information processing pipelines in the same diffractive network 10 for performing the linear transformations given by A_1 and A_2, as shown in FIG. 1A, the input fields and the diffractive output pairs, {(i_1, o_1')} and {(i_2, o_2')}, were matched with the input and output polarization channels of the diffractive system, i.e., i_x = i_1, o_x = o_1', i_y = i_2 and o_y = o_2'. That is to say, the A_1 transformation is performed by encoding the corresponding input field data i_1 into the x-polarized optical field within the input FOV, using e.g., an x-aligned linear polarizer 18, and decoding (sampling) the x-polarized component of the field within the output FOV as the computed output field o_1', using e.g., an x-polarized analyzer 18. One can denote this diffractive information processing channel as the channel ① (see FIG. 1B). The A_2 transformation is handled similarly, except that this time the y-polarization is employed at the input and output FOVs, and this diffractive information processing channel is denoted as the channel ②. With this polarization encoding scheme, there are potentially two modes to perform the data inference through the same diffractive network 10: (1) in two sequential, successive accesses to the diffractive system, each time feeding the input data using its assigned polarization channel and obtaining the corresponding output (see FIG. 1B, left); and (2) in a single access to the diffractive system, by feeding the input data of both polarization channels in parallel and obtaining the two corresponding outputs simultaneously (see FIG. 1B, right). The former and latter approaches were termed the “sequential polarization access” (SeqPA) mode and the “simultaneous polarization access” (SimPA) mode, respectively.

[0062] It should be emphasized that the fundamental difference between these two modes of operation lies in the input information: the SimPA mode can simultaneously accept both of the input polarization states (e.g., x and y polarization) for encoding two different channels of input information, while the SeqPA mode can accept only a single polarization state as its input, so that only one channel of input information is encoded at a given time. Therefore, if the input FOV simultaneously encodes the data to be processed in two different polarization states, or if the time lag caused by switching between different input polarization states is unacceptable (such as e.g., an input FOV that includes a rapidly changing dynamic scene with specific polarization information), then only the SimPA mode would be suitable to process the input encoding. Conversely, if the system is only required to compute a single linear transformation at a given time, or if the time lag caused by switching back and forth between two different input polarization states is acceptable, then the SeqPA mode can be used. Detailed analyses of these two modes of operation are presented in the following subsections.

[0063] 2-channel polarization multiplexed all-optical diffractive computing using the sequential polarization access (SeqPA) mode

[0064] As shown in FIG. 1B, left, with the input data i_1 and i_2 being separately and sequentially fed into the polarization channels ① and ②, respectively, the all-optical computed outputs o_1' and o_2' are also collected successively using the same diffractive network hardware. By employing this SeqPA strategy, polarization multiplexed diffractive networks were trained with different numbers of trainable diffractive neurons, i.e., N = {32^2, 44^2, 64^2, 92^2, 128^2, 180^2, 256^2}, all using the same training datasets {(i_1, o_1)} and {(i_2, o_2)} and the same number of epochs. To benchmark the performance of these multiplexed diffractive networks, for each transformation dataset and N, regular diffractive networks were trained without the polarizer array or any polarization encoding/decoding at the input/output FOVs; these constitute the baseline. These regular diffractive networks, denoted as “No pol.” in the analyses, are trained to approximate only one linear transformation (i.e., either A_1 or A_2), and therefore they are referred to as N_p = 1 (no polarization multiplexing).

[0065] FIGS. 2B-2E present the quantitative comparison of the all-optical transformation results obtained using the trained diffractive networks described above. Three different metrics were used to quantify the transformation accuracy and generalization performance of these diffractive networks: (1) the normalized transformation mean-squared error (MSE_Transformation), (2) the cosine similarity (CosSim) between the all-optical transforms and the target transforms, and (3) the mean-squared error between the diffractive network output fields and their ground truth (MSE_Output). These performance metrics are reported in FIGS. 2B, 2C and 2D as a function of the number of diffractive neurons (N) used in each design. Note that the transformation error of the polarization multiplexed diffractive systems is calculated per polarization channel. More details about the formulations of these performance metrics can be found in Methods. In FIG. 2B, it can be seen that the transformation errors of all the trained diffractive models monotonically decrease as N increases, which is expected due to the increased degrees of freedom in the diffractive processor. In the standard diffractive networks without polarization multiplexing (dash-dotted curves labeled with “No pol. A_1” or “No pol. A_2”), the transformation errors for implementing A_1 or A_2 are almost the same (which indicates that these randomly selected matrices, A_1 and A_2, represent similar computational complexity; also see FIG. 8). The approximation errors of these standard diffractive networks, No pol. A_1 and No pol. A_2, both approach 0 as N approaches N_i N_o = 64^2 ≈ 4.1k. In the polarization multiplexed diffractive models (solid curves labeled with “SeqPA ①” or “SeqPA ②”), the transformation errors MSE_Transformation for the two distinct transforms computed through the two polarization channels are also very close to each other for all values of N, demonstrating no bias toward any specific polarization channel or transform. The approximation errors of these polarization multiplexed models approach 0 as N approaches N_p N_i N_o = 2 N_i N_o ≈ 92^2 ≈ 8.2k. This finding indicates that, compared with the baseline diffractive models that can only perform a single transform, performing two unique transforms using polarization multiplexing through the same diffractive model requires the number of trainable neurons N to double. This conclusion is further supported by the results of the other two performance metrics, CosSim (FIG. 2C) and MSE_Output (FIG. 2D), which both show the same trends as in FIG. 2B: for the baseline diffractive models, CosSim and MSE_Output approach 1 and 0, respectively, as N approaches N_i N_o, while for the polarization multiplexed diffractive models, the two metrics approach 1 and 0 as N approaches N_p N_i N_o = 2 N_i N_o. Apart from the metrics that are used to evaluate the transformation performance, the output diffraction efficiencies (η) of these diffractive models are reported in FIG. 2E, which reveals that the diffraction efficiencies of the polarization multiplexed diffractive models trained using the SeqPA mode reach a level similar to that of the baseline diffractive networks (No pol.).
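The kind of computation behind such metrics can be sketched as follows. The exact formulations are given in Methods and are not reproduced in this section; the definitions below (a least-squares complex scaling factor before the error is computed, and a cosine similarity based on the magnitude of the complex inner product) are assumptions made for illustration, and the function names scaled_mse and cosine_similarity are hypothetical.

```python
import numpy as np


def scaled_mse(target, estimate):
    """Mean-squared error after an assumed least-squares complex scaling.

    The scalar s minimizing ||target - s * estimate||^2 is applied first,
    so that a global amplitude/phase factor is not counted as an error.
    """
    t, e = target.ravel(), estimate.ravel()
    scale = np.vdot(e, t) / np.vdot(e, e)
    return float(np.mean(np.abs(t - scale * e) ** 2))


def cosine_similarity(target, estimate):
    """Magnitude of the normalized complex inner product (assumed form)."""
    t, e = target.ravel(), estimate.ravel()
    return float(np.abs(np.vdot(t, e)) / (np.linalg.norm(t) * np.linalg.norm(e)))


rng = np.random.default_rng(0)
A = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))

# An estimate that differs only by a global complex factor.
A_scaled = 0.3 * np.exp(0.7j) * A
# An estimate corrupted by additive complex noise.
A_noisy = A + 0.5 * (rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64)))

mse_scaled = scaled_mse(A, A_scaled)
cos_scaled = cosine_similarity(A, A_scaled)
mse_noisy = scaled_mse(A, A_noisy)
cos_noisy = cosine_similarity(A, A_noisy)
```

Under these assumed definitions, the same scaled_mse helper could be applied either to transformation matrices or to output field vectors.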

[0066] To further demonstrate the performance of the polarization-encoded diffractive networks 10, examples of the ground truth transformation matrices (i.e., A_1 and A_2) and their all-optical counterparts (i.e., A_1' and A_2') resulting from the diffractive designs with N = {44^2, 92^2, 180^2} are shown in FIG. 3, along with the amplitude and phase absolute errors. Exemplary complex-valued input-output fields from the same set of diffractive designs are also presented in FIG. 4. FIGS. 3 and 4 reveal that, for both of the polarization channels, when N ≥ N_p N_i N_o = 2 N_i N_o, the all-optical transformation matrices and the output complex fields match their ground truth targets very well, with negligible absolute errors, which is also in line with the observations made in FIG. 2.

[0067] 2-channel polarization multiplexed all-optical diffractive computing using the simultaneous polarization access (SimPA) mode

[0068] As an alternative to the sequential polarization access (SeqPA) used earlier, the use of the simultaneous polarization access (SimPA) mode in the all-optical computing framework was explored. As shown in FIG. 1B, right, in a single access to the diffractive system, the input complex-valued data i_1 and i_2 are fed into the polarization channels ① and ②, respectively, and the all-optical diffractive outputs o_1' and o_2' are collected at the same time through two orthogonal polarization states at the output FOV. Before a new polarization multiplexed diffractive network was trained from scratch using the SimPA mode, the earlier diffractive designs trained using the SeqPA mode were tested directly using the SimPA mode by inputting both polarization channels ① and ② at the same time, deviating from their training scheme, which only used SeqPA. The results of blindly testing the SeqPA-trained diffractive networks under the SimPA mode are shown in FIGS. 9A-9D, which reveal inference results with significantly higher values of MSE_Transformation and MSE_Output and decreased values of CosSim, all of which indicate a performance degradation when one operates a SeqPA-trained diffractive network using the SimPA mode. As shown in FIG. 10, this performance degradation is due to the “cross-talk” between the two transformation channels when both of the input polarization states are present at the same time, which was not considered during the SeqPA-based training process. These results highlight the necessity of training the diffractive system from scratch under the SimPA mode, so that the impact of this cross-talk can be taken into account and minimized during the iterative design process. A related mathematical analysis that supports the same conclusion is reported in Supplementary Note 1 herein.

[0069] After training the digital diffractive models from scratch using the SimPA mode, their blind testing results are reported in FIG. 5 using the solid curves labeled with “SimPA ①” and “SimPA ②”. The results of the new diffractive designs trained using the SimPA mode demonstrate the success of all-optically performing two different linear transformations in parallel using polarization multiplexing. The analysis (FIG. 5) also reveals the same conclusions discussed earlier for the models trained using the SeqPA mode: the all-optical transformations performed by the polarization multiplexed diffractive networks match the desired ground truth transformations very well as N approaches N_p N_i N_o = 2 N_i N_o. Furthermore, as shown in FIG. 5D, the diffraction efficiencies achieved by the polarization multiplexed diffractive networks reach a level similar to that of their baseline counterparts that use the same number of diffractive layers, but without the linear polarizer array.

[0070] The blind testing results of these two different modes of operation (SeqPA vs. SimPA) were compared and a cross-talk field analysis was performed (see FIGS. 10A-10E). It was found that the amount of transformation cross-talk in the diffractive models trained using the SimPA mode (shown in the right column of FIGS. 10C-10D) is ~300-fold lower when compared with the amplitude values of the cross-talk observed in the diffractive designs trained using the SeqPA mode (shown in the left column of FIGS. 10C-10D). During the diffractive model training, these cross-talk fields are gradually eliminated (penalized) using the SimPA mode of operation to better approximate the ground truth fields. However, for the diffractive models trained under the SeqPA mode, such cross-talk fields are ignored (i.e., remain non-penalized during the training phase), since the SeqPA operation assumes successive access to the diffractive network, one input polarization state at a time. Stated differently, SeqPA-trained diffractive networks successfully approximate the target transformations only when they are tested under the same SeqPA mode of operation, and fail due to the field cross-talk when tested under the SimPA mode.
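The origin of this cross-talk can be illustrated with a toy linear model. The sketch below assumes that the trained network acts as a linear operator with four blocks (T_xx, T_xy, T_yx, T_yy) mapping the input polarization components to the output polarization components; the block names, the random cross-talk blocks and the helper forward are hypothetical and are not taken from the disclosure. It models a SeqPA-trained network in which the co-polarized blocks match A_1 and A_2 while the cross-polarized blocks were left unconstrained.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64


def crandn(*shape):
    """Complex Gaussian random array (illustrative data only)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


A1, A2 = crandn(n, n), crandn(n, n)

# Hypothetical SeqPA-trained system: diagonal blocks match the targets,
# off-diagonal (cross-talk) blocks were never penalized during training.
T_xx, T_yy = A1, A2
T_xy, T_yx = 0.5 * crandn(n, n), 0.5 * crandn(n, n)


def forward(i_x, i_y):
    """Output x/y fields of the toy linear system."""
    o_x = T_xx @ i_x + T_xy @ i_y
    o_y = T_yx @ i_x + T_yy @ i_y
    return o_x, o_y


i1, i2 = crandn(n), crandn(n)
zero = np.zeros(n, dtype=complex)

# SeqPA: only the x channel is illuminated; the x output is exact.
o_x_seq, _ = forward(i1, zero)
seq_err = np.linalg.norm(o_x_seq - A1 @ i1)

# SimPA: both channels are illuminated; T_xy @ i2 leaks into the x output.
o_x_sim, _ = forward(i1, i2)
sim_err = np.linalg.norm(o_x_sim - A1 @ i1)
```

In this toy model, training under the SimPA mode corresponds to additionally driving T_xy and T_yx toward zero.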

[0071] 4-channel polarization multiplexed all-optical diffractive computing

[0072] So far, all-optical computing with 2-channel polarization multiplexing through a single diffractive network 10 has been demonstrated. To further exploit the polarization multiplexing capability of this diffractive computing framework, a 4-channel polarization multiplexed design was explored next for performing four (4) different, arbitrarily-selected linear transformations through a single diffractive network (i.e., N_p = 4). FIGS. 6A and 6B illustrate the schematics of this framework. As depicted in FIG. 6B, by sequentially connecting one of the two input polarization states with one of the two output polarization states, four transformation channels, ①, ②, ③ and ④, can be formed to all-optically perform N_p = 4 distinct complex-valued transforms using the same diffractive processor. This 4-channel polarization multiplexed design operates in a similar way to the SeqPA mode, where the different input data are separately and sequentially fed into different input polarization channels. Using this SeqPA operation mode, the diffractive system can accurately perform 4 different complex-valued linear transformations using the same passive diffractive layers, in a single optical network. For example, when only one polarization state (e.g., i_x) is utilized to encode the input data (i.e., i_x = i_1 = i_3), one can measure the output field at two orthogonal polarization states and simultaneously read out two computed outputs, each corresponding to the result of a uniquely different linear transformation (i.e., A_1 and A_3) computed based on the same input; this capability enables parallel optical information processing through the same polarization-encoded diffractive network. The overall design of this 4-channel diffractive system can be considered to utilize the remaining degrees of freedom in the cross-talk channels of the 2-channel system. Additional analysis that supports the same conclusions can be found in Supplementary Note 1.

[0073] Further, compared to the 2-channel polarization multiplexed system reported earlier, the polarization states for the output field sampling in this 4-channel system are selected to be 45° and 135° linear polarization. This design choice is made to balance out the diffraction efficiencies of the resulting four (4) different linear transformations that are all-optically performed by the diffractive network 10. Stated differently, this design choice introduces symmetry to all the input/output polarization combinations that are each assigned to a different linear transformation. In FIGS. 6A and 6B, the two output fields corresponding to the linear polarization directions at 45° and 135° are denoted as o_α and o_β, respectively.

[0074] In light of the earlier findings that point to the need for more diffractive neurons in the case of N_p = 2 when compared to N_p = 1, here eight (8) successive trainable diffractive layers were employed to increase the degrees of freedom for the N_p = 4 design (see FIG. 6A). Also, compared to the earlier 2-channel polarization multiplexed design, an additional linear polarizer array 24 was included with the same configuration as before (with polarization orientations of 0°, 45°, 90° and 135°) to further enhance the spatial diversity of polarization modes within the diffractive processor. These two linear polarizer arrays are positioned after the 3rd and 5th diffractive layers, respectively. As in the N_p = 2 diffractive designs, these linear polarizer arrays 24 are pre-determined (i.e., non-trainable) and act as “polarization seeds” within the trained diffractive network 10.
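The symmetry introduced by sampling the output at 45° and 135° can be seen from a simple projection argument, sketched below. This sketch considers only an ideal linear analyzer acting on a uniform x- or y-polarized field and ignores the diffractive layers and polarizer arrays entirely; the helper names analyzer_axis and transmitted_power are hypothetical.

```python
import numpy as np


def analyzer_axis(theta_deg):
    """Unit vector along the transmission axis of an ideal linear analyzer."""
    theta = np.deg2rad(theta_deg)
    return np.array([np.cos(theta), np.sin(theta)])


def transmitted_power(jones_in, theta_deg):
    """Fraction of power passed by an analyzer at angle theta_deg."""
    return float(np.abs(analyzer_axis(theta_deg) @ jones_in) ** 2)


x_pol = np.array([1.0, 0.0])
y_pol = np.array([0.0, 1.0])

# All four input/output combinations see the same projected power (1/2).
balanced = {
    (name, angle): transmitted_power(state, angle)
    for name, state in (("x", x_pol), ("y", y_pol))
    for angle in (45, 135)
}

# With 0/90 degree analyzers the combinations are not symmetric:
# co-polarized pairs pass all of the power, cross-polarized pairs none.
co_polarized = transmitted_power(x_pol, 0)
cross_polarized = transmitted_power(x_pol, 90)
```

In this idealized picture, each of the four channels starts from the same projected power, whereas an x/y output basis would treat co-polarized and cross-polarized channels differently.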

[0075] Next, random data was generated to train and test diffractive networks under N_p = 4. In addition to the two randomly-generated ground truth transforms A_1 and A_2 that were used earlier for the 2-channel models, two additional complex-valued transforms A_3 and A_4 were randomly generated, and the training and testing datasets consisting of the input and ground truth output fields were constructed accordingly. These four (4) ground truth (target) transforms are visualized in FIG. 7A, and their differences can be found in FIG. 8. Following the training of the polarization multiplexed diffractive networks with different N, their transformation performance for N_p = 4 is analyzed in FIGS. 7B-7D based on the same set of performance metrics that were used earlier. These results reveal that, when N approaches N_p N_i N_o = 4 N_i N_o ≈ 16.4k, the MSE_Transformation and MSE_Output of all four diffractive transformations approach 0, while the CosSim approaches 1, demonstrating that all the target linear transformations (A_1, A_2, A_3 and A_4) can be successfully approximated by a single diffractive processor with a negligible error if N ≥ N_p N_i N_o. This is the same conclusion that was reached earlier for N_p = 2.
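The neuron-count threshold N ≥ N_p N_i N_o discussed above reduces to simple arithmetic, which the following sketch evaluates for N_p = 1, 2 and 4. The candidate list of N values is the one given for the 2-channel study; reusing it for N_p = 4 is an assumption made for illustration, and the helper names are hypothetical.

```python
N_I = N_O = 8 ** 2  # 64 pixels in each of the input and output FOVs

# Neuron counts listed for the 2-channel study (assumed here for all N_p).
CANDIDATE_N = [32**2, 44**2, 64**2, 92**2, 128**2, 180**2, 256**2]


def neuron_threshold(n_p, n_i=N_I, n_o=N_O):
    """Number of diffractive neurons at which the errors approach zero."""
    return n_p * n_i * n_o


def smallest_sufficient(n_p, candidates=CANDIDATE_N):
    """Smallest listed N that meets or exceeds the threshold."""
    threshold = neuron_threshold(n_p)
    return min(n for n in candidates if n >= threshold)


for n_p in (1, 2, 4):
    print(n_p, neuron_threshold(n_p), smallest_sufficient(n_p))
```

The thresholds evaluate to 4,096, 8,192 and 16,384 neurons, and the smallest listed designs that satisfy them are 64^2, 92^2 and 128^2, respectively.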

[0076] To further demonstrate the success of these 4-channel polarization-multiplexed diffractive systems, FIG. 11 presents the ground truth transformation matrices (i.e., A_1, A_2, A_3 and A_4) and their diffractive counterparts (i.e., A_1', A_2', A_3' and A_4') designed with N = {14.3k, 66.5k}, along with the amplitude and phase errors made in each case. Furthermore, exemplary complex-valued output fields achieved by these diffractive systems are shown in FIG. 12, all of which confirm the success of the presented 4-channel polarization-multiplexed diffractive designs. Finally, the output diffraction efficiencies of these diffractive models were analyzed and are reported in FIG. 7E. The results show that, compared to their counterparts without polarization encoding (N_p = 1), the polarization multiplexed diffractive models with N_p = 4 turn out to be less power efficient (per transformation), with an efficiency decrease of ~6 dB at the output FOV.

[0077] This relatively small difference in the output diffraction efficiencies mainly stems from the different number of diffractive layers used in these two systems: the baseline diffractive systems without polarization encoding use 4 diffractive layers, whereas the 4-channel polarization multiplexed systems are much deeper, utilizing 8 diffractive layers. Since the optical field within a deeper system with more diffractive layers propagates and spreads over a longer axial distance, such a system exhibits a relatively lower diffraction efficiency. Therefore, these results do not contradict the previous conclusion that the diffraction efficiency of the polarization multiplexed diffractive network is similar to that of the baseline diffractive system when using the same number of diffractive layers.

[0078] The results and analyses presented so far demonstrate that a single polarization-multiplexed diffractive network can all-optically compute four different complex-valued, arbitrarily-selected linear transformations between its input and output FOVs by using orthogonal linear polarization states. In addition to linear polarization, other polarization states can also be used, without loss of generality, to perform the same multiplexed computational tasks. To demonstrate this capability, two orthogonal circular polarization states (i.e., left- and right-hand circular polarization) were used at the input of a polarization-multiplexed diffractive network to encode the input information; the output channels in this case included x and y linear polarization states, i.e., the 4 different, arbitrarily-selected linear transformations were each assigned to one combination of circular-linear polarization. The results, reported in FIGS. 22A-22D, revealed that circular input polarization-multiplexed diffractive processors successfully approximated the target complex-valued linear transformations when N approaches N_p N_i N_o = 4 N_i N_o ≈ 16.4k, arriving at the same conclusion made for linear input polarization states.
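The circular-linear channel assignment can be illustrated with Jones vectors, as sketched below. The handedness sign convention (which of the two vectors is called left- or right-hand circular), the dictionary layout and the variable names are assumptions made for illustration; the sketch only verifies that the two circular input states are orthogonal and that each of them projects equally onto the x and y output states.

```python
import numpy as np

# Linear basis states (Jones vectors).
X = np.array([1.0, 0.0], dtype=complex)
Y = np.array([0.0, 1.0], dtype=complex)

# Circular states written as superpositions of the linear basis.
# Which one is "left" or "right" depends on the sign convention (assumed).
LCP = (X + 1j * Y) / np.sqrt(2)
RCP = (X - 1j * Y) / np.sqrt(2)

# The two circular input states are orthogonal.
overlap = np.vdot(LCP, RCP)

# Four input/output combinations, one per target transformation;
# each value is the power projected from the input onto the output state.
inputs = {"LCP": LCP, "RCP": RCP}
outputs = {"x": X, "y": Y}
channel_power = {
    (i_name, o_name): float(np.abs(np.vdot(o_vec, i_vec)) ** 2)
    for i_name, i_vec in inputs.items()
    for o_name, o_vec in outputs.items()
}
```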

[0079] In this diffractive design, the same linear polarizer array (i.e., the seed) was used within the diffractive network volume to communicate between the circular polarization states at the input FOV and the linear polarization states at the output FOV, all-optically performing 4 different complex-valued transformations through the same diffractive network. A mathematical analysis of this design and its relationship to the earlier diffractive designs with linear input/output polarization states is also provided in Supplementary Note 1. Since any arbitrary polarization state can be expressed through a superposition of orthogonal linear or circular polarization states, the same diffractive design can be extended to different input/output combinations of other polarization states. As detailed in Supplementary Note 1, a polarization multiplexed diffractive processor with N_p = 4 can be designed by using input-output combinations of 2 orthogonal input polarization states (e.g., linear, circular or elliptical) and 2 orthogonal output polarization states (e.g., linear, circular or elliptical), where each input-output polarization combination all-optically performs one of the target complex-valued transformations (A_1, A_2, A_3, A_4). Supplementary Note 1 further proves that any additional transformation matrix A_a that can be assigned to a new combination of input-output polarization states of the diffractive network can be written as a linear combination of A_1, A_2, A_3 and A_4.

[0080] Discussion

[0081] The results and analysis demonstrate that, using polarization multiplexing in a single diffractive network 10, one can all-optically perform a group of complex-valued, arbitrary linear transformations at the same output FOV of the diffractive network 10. In practical applications, these different transformations can cover, for example, various machine vision tasks, such as detection, classification, and localization of objects, which can be programmed into different input/output polarization states. These different tasks could potentially also be performed by employing multiple, separately-optimized diffractive networks, each of which is dedicated to performing a single computational task. However, such an approach would require the precise optical projection of an input FOV (while preserving its phase and amplitude distribution and polarization information) onto separately positioned, individual diffractive networks, and would naturally suffer from additional optical losses and aberrations, misalignment issues, a much larger device footprint and higher manufacturing/alignment-related costs. In contrast, integrating multiple tasks to be all-optically performed within the same diffractive network and a common input FOV provides a much simpler and better design, offering unique advantages such as e.g., speed, compactness, resilience to misalignments and aberrations, power efficiency and cost-effectiveness.

[0082] Also note that it is not practical to spatially superimpose multiple diffractive subsystems, each one separately designed for a unique transformation, using e.g., phase-composite metasurfaces or other metamaterials to create a polarization-multiplexed diffractive processor. First, in the design of each diffractive meta-unit, the cross-talk between the meta-atoms for the two orthogonal polarization states cannot be neglected. Therefore, the direct superposition of two or more different metasurface designs separately trained/designed for each one of the complex transformations would not work due to the cross-talk between the polarization channels of the different metasurface designs. Stated differently, different metasurface designs, when put together in order to achieve multiplexed linear transformations in the same optical unit, will compromise each other’s transformation accuracy. In addition to this, there will be field cross-talk between the adjacent meta-units that are merged together on the same layer due to the in-plane propagating waves. Although increasing the lateral distance between two adjacent meta-units (from different designs, each targeting one transformation) can weaken the impact of this field cross-talk problem, it will then lead to lower diffraction efficiencies at the output and sacrifice the lateral density of the meta-units at each diffractive layer, thus degrading the computational performance and accuracy of the system. Furthermore, the desired phase response of such polarization-encoded meta-units in general covers a small angular range, leading to a low numerical aperture (NA) that fundamentally limits the connectivity between the diffractive layers. In the diffractive solutions discussed herein, each isotropic feature of the diffractive network communicates with the following diffractive layer(s) with an NA of n (n = 1 in air). However, metasurface-based designs would fall short of offering such high numerical apertures, because the high spatial frequency components for the orthogonal input polarizations would deviate from the ideal phase response of the meta-unit, introducing errors to the multiplexed linear transformations that are targeted. Due to some of the challenges outlined above, metasurface or metamaterial-based diffractive surfaces have not yet been demonstrated as a solution to the universal, all-optical implementation of an arbitrary linear transformation or a group of transformations.

[0083] In addition to polarization multiplexing, it should be noted that other degrees of freedom can be used to implement multiple computational tasks through a single diffractive network 10. For example, one can divide the input/output FOVs of the diffractive network 10 into multiple regions, where each region is assigned to a unique computing task through spatial-division multiplexing. It is also possible to achieve wavelength-division multiplexing by assigning different wavelengths or spectral bands to independent computing tasks and employing dispersive elements in the diffractive computing system. In contrast to these other possible methods of information multiplexing, the polarization-based multiplexing reported here requires solely the addition of polarizers to a diffractive network without changing its architecture. Such polarizers are readily available (e.g., polarizing films), even integrated with the individual pixels of polarization-based imaging systems, and can be adapted to a wide range of wavelengths. The polarizers or polarizer arrays 24 may be located on, within, or adjacent to the surface of the one or more optically transmissive and/or reflective substrate layer(s) 12. The polarizer arrays 24 may be physically integrated on or within the one or more optically transmissive and/or reflective substrate layer(s) 12. In addition, the polarizer arrays 24 may have individual polarizer elements with orientations and locations that are fixed (as used herein) or controllable/programmable/reconfigurable. Furthermore, polarization multiplexing can be flexibly coupled with other multiplexing methods (such as spectral and/or spatial multiplexing) to further increase the computing capacity of the diffractive network.

[0084] Unlike the substrate layers 12, where the transmission properties/coefficients are trained and optimized to all-optically perform the target transformations, the design and arrangement of the seed polarizer arrays 24 between the substrate layers 12 are treated as hyperparameters that are pre-determined and non-trainable. Therefore, the parameters of the embedded polarizers, including their number, size, and orientation, are fixed during the training process. The polarization modulation induced by these polarizer arrays remains unchanged and was not used as learnable degrees of freedom for the diffractive computing system to approximate the target transformations. Furthermore, their total number is small, i.e., 6×6 = 36 linear polarizers per array were used, which is negligible when compared to N. An increase in the number of linear polarizers per plane would not improve the approximation power of the diffractive network to perform arbitrary linear transformations. However, the topology of such polarizer seeds could potentially impact the performance of the polarization multiplexed diffractive computing system. To explore this, several key parameters of the linear polarizer array used in the diffractive processor designs were adjusted, including e.g., 1) the period of each polarizer unit, 2) the overall size of each polarizer array, and 3) the number and position of the polarizer arrays within the diffractive network. For this comparative analysis, the 4-channel polarization multiplexed diffractive system was used as the test-bed with N = N_p N_i N_o = 16.3k and the same complex-valued target linear transforms (i.e., A_1, A_2, A_3 and A_4), the results of which are summarized in Supplementary Note 2. Based on these analyses, it was observed that: (1) a better approximation accuracy can be achieved when the period of each linear polarization unit on the polarizer array is ≤ 4λ, and a period of ~4λ empirically appears as an optimal choice, also providing an improved output diffraction efficiency (see FIGS. 23A-23C); (2) the linear transformation accuracy and the diffraction efficiency of the system can be optimized by using polarizer arrays with a sufficiently large size, i.e., at least matching the size of the neighboring diffractive layers; (3) using two polarizer arrays and placing them apart with an axial distance of ~8λ within the diffractive volume can provide improved results for the all-optical transformation accuracy and diffraction efficiency of N_p = 4 designs; and (4) using too many (e.g., >6) polarizer arrays within a diffractive network can lead to severe degradation in the computational accuracy of the system (unless more diffractive layers are added to the design).

[0085] It should also be emphasized that the reported polarization-multiplexed diffractive networks can be directly applied to 2D arrays of phase and amplitude input data. Compared to other optical computing systems, e.g., those based on integrated photonics, which require 1D inputs and phase recovery if the information is represented in the phase channel, the capability to directly process and analyze raw 2D complex fields makes the framework highly advantageous for visual computing tasks. On the other hand, unless spatial light modulators (SLMs) are employed as part of the diffractive system, each physically fabricated diffractive network 10 is fixed and would need to be retrained and fabricated again as the target transformations change, which is a limitation of passive diffractive systems.

[0086] There are additional limitations of the presented diffractive computing framework. First, polarization multiplexed diffractive computing systems present lower diffraction efficiencies at their output FOV compared to regular diffractive networks without polarization multiplexing (see FIGS. 2E and 5D). Several remedies can be used to improve the output diffraction efficiency, such as adding a diffraction-efficiency-related penalty term to the training loss function and/or restricting the diffractive layers to perform phase-only modulation. The efficacy of using these approaches in a regular diffractive network design (without polarization multiplexing) to improve the output diffraction efficiency has already been demonstrated. To exemplify the performance of a phase-only diffractive design and how it can be used to improve the output diffraction efficiency, phase-only diffractive networks were trained from scratch for the 4-channel polarization multiplexing case ($N_p = 4$), the results of which are summarized in FIGS. 13A-13D. This analysis revealed that phase-only diffractive designs can achieve significantly better output diffraction efficiencies (FIG. 13D), improved on average by ~12 dB, while still successfully approximating the target linear transformations ($A_1$, $A_2$, $A_3$ and $A_4$). As a trade-off, however, these phase-only diffractive designs also exhibit reduced degrees of freedom compared to their complex-valued counterparts. As a result, it was observed that all the target linear transformations were successfully approximated by a single phase-only diffractive processor when $N$ approached $2 N_p N_i N_o = 8 N_i N_o$. This 2-fold "threshold increase" in the number of diffractive features (i.e., $2 N_p N_i N_o$ vs. $N_p N_i N_o$) is a direct reflection of the reduced number of trainable transmission parameters per diffractive layer due to the phase-only operation, which is a limitation of phase-only diffractive networks, despite their enhanced output diffraction efficiency. To further validate this conclusion, another set of four (4) target linear transformations was selected by changing the matrix elements to be real-valued, and these were used as ground truth to train phase-only polarization multiplexed diffractive networks with $N_p = 4$. As shown in FIGS. 14A-14E, the results reveal that these phase-only diffractive networks can successfully approximate the real-valued target linear transforms when $N \geq N_p N_i N_o = 4 N_i N_o$, demonstrating a similar approximation performance, with significantly higher output diffraction efficiency compared to their complex-valued diffractive counterparts. These findings emphasize the value of phase-only diffractive network designs as a photon-efficient solution in polarization multiplexed diffractive computing, also providing an important rationale for planning the diffractive neuron budget ($N$) for a given computational task.
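The neuron-budget thresholds discussed above reduce to simple arithmetic. The following is a minimal sketch, assuming the thresholds stated in the text for complex-valued targets ($N_p N_i N_o$ for complex-valued layers, twice that for phase-only layers); the function name and its arguments are illustrative and not part of the disclosure.

```python
def neuron_budget(n_channels, n_in, n_out, phase_only=False):
    """Approximate number of diffractive features N for complex-valued targets."""
    base = n_channels * n_in * n_out           # N_p * N_i * N_o
    # Phase-only layers carry one trainable parameter per feature instead of
    # two, so the threshold quoted in the text doubles.
    return 2 * base if phase_only else base

# Four polarization channels with 8x8-pixel input/output FOVs (N_i = N_o = 64).
print(neuron_budget(4, 64, 64))                    # 16384
print(neuron_budget(4, 64, 64, phase_only=True))   # 32768
```

For real-valued targets, the text reports that the phase-only threshold stays at $N_p N_i N_o$, so the doubling in this sketch would not apply in that case.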

[0087] Other practical concerns that need to be discussed include potential fabrication and alignment errors, surface reflections, material absorption and non-ideal polarization modulation within the diffractive network, which may altogether limit the performance and accuracy of diffractive computing. Some of these errors can be mitigated by selecting appropriate fabrication methods, e.g., high-precision lithography, and using less absorptive materials. Moreover, previous results showed that some of these uncontrolled physical errors and imperfections did not lead to a significant discrepancy between the experimental results and the numerically expected results, indicating the correctness of the assumptions involved in the optical forward model and training procedures. Even if these errors and imperfections become considerable, the performance degradation of a diffractive network caused by some of these experimental factors can be compensated by incorporating them as random variables into the physical forward model of the diffractive network during the training process. One example of this has been demonstrated previously, where the destructive impact of the lateral and axial misalignments of diffractive layers was mitigated by randomly misaligning the diffractive network during its training process. Following a similar strategy, the imperfect polarization extinction ratio (PER) of the polarizer arrays/seeds can also be included as part of the physical forward model using a modified form of the Jones matrices for linear polarizers. This modeling of the imperfect PER of linear polarizers during the training phase can mitigate a potential performance degradation in the computational power of a polarization multiplexed diffractive processor. Supporting this conclusion, Supplementary Note 3 and FIGS. 24A-24D report the mathematical analysis and simulation results for using imperfect linear polarizer arrays/seeds in the diffractive network designs. In the same Supplementary Note 3, the overall PER of SimPA-based polarization multiplexed diffractive designs was quantified, considering each diffractive network as a monolithic polarization optical element. The analysis reveals that the SimPA-based 2-channel polarization multiplexed diffractive design exhibits a very high PER of >51,000. In fact, such a high PER is expected since the SimPA mode is designed to simultaneously perform two different linear transformations using two orthogonal polarization states, and therefore undesired polarization cross-talk at the output field-of-view was penalized during the training phase, successfully leading to a high-PER diffractive network. For the SeqPA mode of operation, however, PER is not a meaningful figure-of-merit since only one orthogonal polarization state is read/measured at a given time due to the sequential access of each target transformation through the diffractive network; stated differently, the SeqPA mode of operation does not penalize the leakage of power into an orthogonal polarization state at the output, as this leakage does not impact the accuracy of each all-optical transformation that is sequentially performed.

[0088] In addition to performing multiple arbitrarily-selected linear transformations through polarization encoding, the presented framework can also be used for polarization-aware optical imaging and sensing tasks. Polarization-based optical imaging has been used in many biomedical applications, such as the diagnosis of diseases including gout, malaria infection, squamous cell carcinoma, and cerebral amyloid. The presented polarization-multiplexed diffractive computing framework exhibits translational potential for some of these biomedical applications, including, e.g., the all-optical detection and classification of birefringent crystals in bodily fluids for diagnosing various forms of crystal arthropathy.

[0089] A diffractive network-based all-optical computing framework is disclosed that can perform multiple complex-valued, arbitrary linear transformations using polarization multiplexing. This framework is very compact; for instance, the system depicted in FIGS. 1A-1B has a total length of only $20\lambda$ in depth, where $\lambda$ is the illumination wavelength. The results show that when the number of diffractive elements/neurons, $N$, in a given diffractive network design approaches $N_p N_i N_o$, a group of $N_p$ arbitrarily-selected linear transforms can be all-optically computed at the output FOV of the network with negligible error. This polarization multiplexed diffractive computing framework can be used to build all-optical, passive processors that can execute multiple inference tasks in parallel. Artificially engineered materials with polarization manipulation capabilities can also be combined with advanced diffractive surface fabrication techniques (e.g., high-precision 3D additive manufacturing and photolithography) to allow the use of the diffractive computing framework in different parts of the electromagnetic spectrum.

[0090] Materials and Methods

[0091] Forward model of the polarization multiplexed diffractive optical network. Using Jones calculus, the complex-valued, polarization-multiplexed electrical field $E$ at a spatial location $(x_m, y_m, z_m)$ can be represented as:

$$E(x_m, y_m, z_m) = \begin{bmatrix} E_x(x_m, y_m, z_m) \\ E_y(x_m, y_m, z_m) \end{bmatrix} \tag{1}$$

[0092] In the implementation, $E_x$ and $E_y$ are computed in parallel throughout the entire diffractive system. Since the trainable diffractive layers (i.e., substrate layers 12) are not polarization-sensitive, the complex-valued modulation generated by these thin diffractive layers is the same for the two orthogonal polarization states. The diffractive layers are assumed to be thin optical modulation elements, where the $m$-th feature on the $k$-th diffractive layer at location $(x_m, y_m, z_m)$ represents a complex-valued transmission coefficient, $t^k$, given by:

$$t^k(x_m, y_m, z_m) = a^k(x_m, y_m, z_m) \exp\left(j \varphi^k(x_m, y_m, z_m)\right) \tag{2}$$

[0093] In Eq. 2, $a$ and $\varphi$ denote the amplitude and phase coefficients, respectively. The amplitude and phase coefficients of the diffractive neurons, $a^k$ and $\varphi^k$ ($k \in \{1, 2, \dots, K\}$), are both trainable, with a permitted range of 0 to 1 and 0 to $2\pi$, respectively. Before the training starts, $a^k$ and $\varphi^k$ are randomly initialized with a uniform ($U$) distribution of $U[0, 1]$ and $U[0, 2\pi)$, respectively. For a phase-only diffractive design, $a^k = 1$. The size of each diffractive neuron on the transmissive layers and the width of the pixels of the input/output fields are both chosen as $\lambda/2$.
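The initialization described above can be sketched in a few lines of NumPy. This is only an illustration, assuming the transmission coefficient has the form $t = a \exp(j\varphi)$; the function name, the layer count, and the layer size in the usage lines are hypothetical choices rather than values taken from the disclosure.

```python
import numpy as np

def init_transmission(n_layers, n_side, phase_only=False, seed=0):
    """Random initial transmission coefficients t = a * exp(j * phi)."""
    rng = np.random.default_rng(seed)
    shape = (n_layers, n_side, n_side)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=shape)        # phi ~ U[0, 2*pi)
    if phase_only:
        amplitude = np.ones(shape)                           # a = 1 for phase-only designs
    else:
        amplitude = rng.uniform(0.0, 1.0, size=shape)        # a ~ U[0, 1]
    return amplitude * np.exp(1j * phase)

t_complex = init_transmission(n_layers=4, n_side=32)
t_phase = init_transmission(n_layers=4, n_side=32, phase_only=True)
```

In a trainable model, `amplitude` and `phase` would be the optimized variables; how their permitted ranges are enforced during training is not specified in the text and is left out of this sketch.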

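[0094] The diffractive layers are connected to each other by free-space wave propagation, which is modeled through the Rayleigh-Sommerfeld diffraction equation:

$$w_m^k(x, y, z, \lambda) = \frac{z - z_m}{r^2} \left( \frac{1}{2\pi r} + \frac{1}{j\lambda} \right) \exp\left( \frac{j 2\pi r}{\lambda} \right) \tag{3}$$

[0095] where $w_m^k(x, y, z, \lambda)$ is the complex-valued field on the $m$-th neuron of the $k$-th layer at $(x, y, z)$ with a wavelength of $\lambda$, which can be viewed as a secondary wave generated from the source at $(x_m, y_m, z_m)$. Here, $r = \sqrt{(x - x_m)^2 + (y - y_m)^2 + (z - z_m)^2}$ and $j = \sqrt{-1}$.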
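As an illustration of the free-space propagation step, the sketch below evaluates the Rayleigh-Sommerfeld secondary wave in its commonly used form, $w = \frac{z - z_m}{r^2}\left(\frac{1}{2\pi r} + \frac{1}{j\lambda}\right)\exp\left(\frac{j 2\pi r}{\lambda}\right)$, and sums the contributions of all source pixels directly. The function names, the 9×9 grid, and the omission of any constant pixel-area factor are assumptions made for this example; a practical implementation would more likely use an FFT-based method for larger layers.

```python
import numpy as np

def rs_kernel(dx, dy, dz, wavelength):
    """Secondary wave at offset (dx, dy, dz) from a source neuron."""
    r = np.sqrt(dx**2 + dy**2 + dz**2)
    amplitude = (dz / r**2) * (1.0 / (2.0 * np.pi * r) + 1.0 / (1j * wavelength))
    return amplitude * np.exp(1j * 2.0 * np.pi * r / wavelength)

def propagate(field, pitch, dz, wavelength):
    """Propagate a square complex field over dz by a direct sum over source pixels."""
    n = field.shape[0]
    coords = (np.arange(n) - (n - 1) / 2.0) * pitch
    x, y = np.meshgrid(coords, coords, indexing="ij")
    x, y = x.ravel(), y.ravel()
    dx = x[:, None] - x[None, :]      # rows: target pixels, columns: source pixels
    dy = y[:, None] - y[None, :]
    w = rs_kernel(dx, dy, dz, wavelength)
    return (w @ field.ravel()).reshape(field.shape)

wavelength = 1.0                      # lengths expressed in units of the wavelength
field = np.zeros((9, 9), dtype=complex)
field[4, 4] = 1.0                     # a single on-axis source pixel
out = propagate(field, pitch=wavelength / 2, dz=4 * wavelength, wavelength=wavelength)
```

The pixel pitch of half a wavelength and the axial distance of four wavelengths follow the values given in the text; the direct summation scales quadratically with the number of pixels, so it is only practical for small grids.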

For the $k$-th layer ($k \geq 1$, treating the input plane as the 0-th layer), the modulated optical field $E_p^k$ at location $(x_m, y_m, z_m)$ with a polarization state of $p$ ($p \in \{x, y\}$) is given by:

$$E_p^k(x_m, y_m, z_m) = t^k(x_m, y_m, z_m) \sum_{n \in S} E_p^{k-1}(x_n, y_n, z_n) \, w_n^{k-1}(x_m, y_m, z_m, \lambda) \tag{4}$$

[0096] where $S$ denotes all the pixels on the previous diffractive layer. For all the diffractive networks trained in this paper, the axial distances $d_0, \dots, d_K$ are all chosen as $4\lambda$.

[0097] When modeling the polarizer elements in the diffractive system, Jones matrices were used to represent the modulation of the complex field brought by the input polarizer, output analyzer, or the polarizer array at location $(x, y, z)$, the process of which can be written as:

$$E^{out}(x, y, z) = J(x, y, z) \, E^{in}(x, y, z) \tag{5}$$

[0098] where $E^{in}$ and $E^{out}$ are the vectors denoting the input and output complex fields before and after the polarization modulation, each containing two orthogonal components along the $x$ and $y$ directions, i.e., $E^{in} = [E_x^{in}, E_y^{in}]^T$ and $E^{out} = [E_x^{out}, E_y^{out}]^T$.

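[0099] $J_{linear}(x, y, z)$ represents the Jones matrix of a linear polarizer element, which is given by:

$$J_{linear}(x, y, z) = \begin{bmatrix} \cos^2\theta & \sin\theta \cos\theta \\ \sin\theta \cos\theta & \sin^2\theta \end{bmatrix}, \quad \theta = \theta(x, y, z) \tag{6}$$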

[00100] where $\theta(x, y, z)$ is the angle between the $x$-axis and the polarizing axis of the linear polarizer located at $(x, y, z)$. The non-trainable, pre-determined polarizer array 24 is composed of multiple square-shaped linear polarizers; in total, four (4) types of linear polarizer units were used, with four (4) different polarizing axis directions, $\theta \in \{0, 0.25\pi, 0.5\pi, 0.75\pi\}$. As illustrated in FIG. 1A, these four (4) different types of linear polarizers are spatially binned to have a 2×2 period and repeated with three periods in each direction, extending into a square region. The side length of each linear polarizer array 24 is $24\lambda$. The residual space surrounding the polarizer array 24 is filled with air, without any polarization modulation. For all the diffractive network designs presented herein, the axial distances (i.e., $d_p$, $d_{p1}$ and $d_{p2}$) between the pre-determined polarizer arrays 24 and the adjacent substrate layer 12 in front of them are all empirically chosen as 0; stated differently, each linear polarizer array 24 is attached to the isotropic diffractive layer in front of it.
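The polarizer array layout and its Jones-matrix action can be sketched as follows. This example assumes the standard Jones matrix of an ideal linear polarizer, $[[\cos^2\theta, \sin\theta\cos\theta], [\sin\theta\cos\theta, \sin^2\theta]]$, a pixel width of $\lambda/2$, and polarizer units of $4\lambda$ (8 pixels) so that 6×6 units span $24\lambda$. The ordering of the four angles inside each 2×2 period is a guess, since it is only shown in FIG. 1A, and all function names are illustrative.

```python
import numpy as np

def polarizer_angle_map(unit_px=8, periods=3, cell=((0.0, 0.25), (0.5, 0.75))):
    """Per-pixel polarizing-axis angle of the array (cell ordering is an assumption)."""
    cell = np.pi * np.asarray(cell)                       # one 2x2 period of angles
    units = np.tile(cell, (periods, periods))             # 6x6 polarizer units
    return np.kron(units, np.ones((unit_px, unit_px)))    # expand each unit to pixels

def jones_linear(theta):
    """Jones matrices of ideal linear polarizers, shape theta.shape + (2, 2)."""
    c, s = np.cos(theta), np.sin(theta)
    row_x = np.stack([c * c, s * c], axis=-1)
    row_y = np.stack([s * c, s * s], axis=-1)
    return np.stack([row_x, row_y], axis=-2)

def apply_polarizers(e_x, e_y, theta):
    """Apply the per-pixel Jones matrix to the x and y field components."""
    j = jones_linear(theta)
    out_x = j[..., 0, 0] * e_x + j[..., 0, 1] * e_y
    out_y = j[..., 1, 0] * e_x + j[..., 1, 1] * e_y
    return out_x, out_y

theta = polarizer_angle_map()                             # 48x48 pixels = 24 wavelengths
e_x = np.ones(theta.shape, dtype=complex)                 # x-polarized plane wave
e_y = np.zeros(theta.shape, dtype=complex)
out_x, out_y = apply_polarizers(e_x, e_y, theta)
```

Because the isotropic diffractive layers modulate both polarization components identically, a step like `apply_polarizers` is the only place in such a forward model where the two components mix.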

[00101] Preparation of the linear transformation datasets. In the diffractive network designs, the input and output FOVs have the same size of 8 × 8 pixels, i.e., $i_c, o_c \in \mathbb{C}^{8 \times 8}$ ($c \in \{1, 2, 3, 4\}$). The size of the transformation matrices is equal to 64 × 64, i.e., $A_c \in \mathbb{C}^{64 \times 64}$ ($c \in \{1, 2, 3, 4\}$). The amplitude and phase components of the complex-valued transformation matrices $A_c$ used in this paper were generated with a uniform ($U$) distribution of $U[0, 1]$ and $U[0, 2\pi)$, respectively, using the pseudo-random number generation function random.uniform() built into NumPy. Different random seeds were used to generate these transformation matrices to ensure they were uniquely different (see FIG. 8). Next, the amplitude and phase components of the input fields $i_c$ ($c \in \{1, 2, 3, 4\}$) were also randomly generated with a uniform ($U$) distribution of $U[0, 1]$ and $U[0, 2\pi)$, respectively. The ground truth (target) fields $o_c$ ($c \in \{1, 2, 3, 4\}$) were generated by calculating $o_c = A_c i_c$. For each $A_c$ ($c \in \{1, 2, 3, 4\}$), a total of 70,000 input/output complex fields were generated to form a dataset, which was divided into three parts, training, validation, and testing, containing 55,000, 5,000, and 10,000 complex-valued field pairs, respectively.
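The dataset preparation described above can be sketched with NumPy as follows. This is an illustrative reconstruction rather than the original script: it uses NumPy's `default_rng` generator instead of the legacy `random.uniform()` call named in the text, works on vectorized 64-element fields, and uses a reduced sample count in the usage lines to keep the example light. The function names and seed values are hypothetical.

```python
import numpy as np

def random_complex(rng, shape):
    """Complex samples with amplitude ~ U[0, 1] and phase ~ U[0, 2*pi)."""
    amplitude = rng.uniform(0.0, 1.0, size=shape)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=shape)
    return amplitude * np.exp(1j * phase)

def make_dataset(seed, n_samples, n_in=64, n_out=64):
    """Return a target matrix A_c with input fields i_c and outputs o_c = A_c i_c."""
    rng = np.random.default_rng(seed)
    A = random_complex(rng, (n_out, n_in))
    i = random_complex(rng, (n_samples, n_in))     # one vectorized 8x8 field per row
    o = i @ A.T                                    # row k equals A @ i[k]
    return A, i, o

def split_pairs(i, o, n_train, n_val):
    """Split into (training, validation, testing) pairs; the remainder is testing."""
    cuts = [n_train, n_train + n_val]
    return list(zip(np.split(i, cuts), np.split(o, cuts)))

# A different seed per polarization channel gives distinct target matrices.
datasets = {c: make_dataset(seed=c, n_samples=700) for c in (1, 2, 3, 4)}
A1, i1, o1 = datasets[1]
train, val, test = split_pairs(i1, o1, n_train=550, n_val=50)
```

With `n_samples=70_000`, `n_train=55_000` and `n_val=5_000`, the same calls reproduce the split sizes quoted in the text.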

[00102] Training loss function. For training of the diffractive networks, the mean-squared-error (MSE) loss function was used, which is defined as:

[00103] where $E[\cdot]$ denotes the average across the current batch, $c$ stands for the $c$-th polarization channel that is being accessed, and $[n]$ indexes the $n$-th element of the vector. $\sigma_c$ and $\sigma_c'$ are the coefficients used to normalize the energy of the ground truth (target) field $o_c$ and the diffractive-network output field $o_c'$, respectively, which are given by:

[00104] During the training of the diffractive networks using the SeqPA mode, each polarization channel of the diffractive network is accessed and evaluated cyclically based on the order of the channel number. For instance, for the 2-channel polarization multiplexed design illustrated in FIG. 1B, left, the access sequence during the training is set to be {(1), (2), (1), (2), ...}; for the 4-channel polarization multiplexed design illustrated in FIG. 6, the access sequence is {(1), (2), (3), (4), (1), (2), (3), (4), ...}. During the access of a certain polarization channel, the diffractive network is fed with one batch of the training input/output complex fields corresponding to the transformation matrix assigned to this channel, and then trained based on the average loss across this batch. Thus, the loss function for training the diffractive designs through the $c$-th polarization channel using the SeqPA mode, $\mathcal{L}_{Seq,c}$, can be simply written as:

$$\mathcal{L}_{Seq,c} = \mathcal{L}_{MSE,c} \tag{10}$$
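The cyclic channel access used in the SeqPA training mode can be illustrated with a small scheduling helper. This is a sketch of the access order only, assuming channels are numbered from 1 and visited one batch at a time; the function name is hypothetical and no optimizer step is shown.

```python
from itertools import cycle, islice

def seqpa_access_sequence(n_channels, n_batches):
    """Channel index accessed for each successive training batch."""
    return list(islice(cycle(range(1, n_channels + 1)), n_batches))

print(seqpa_access_sequence(2, 6))   # [1, 2, 1, 2, 1, 2]
print(seqpa_access_sequence(4, 6))   # [1, 2, 3, 4, 1, 2]
```

In a training loop, each entry of this sequence would select which channel's batch and loss are used for the next update, so every channel receives an equal share of the updates.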

[00105] During the training of the diffractive networks using the SimPA mode, as illustrated in FIG. 1B, right, all the polarization channels of the diffractive network are accessed simultaneously, and the training data are fed into the channels at the same time. For this SimPA mode, the diffractive network is trained based on the loss averaged across the different polarization channels and complex-valued fields in the current batch, where the loss function $\mathcal{L}_{Sim}$ can be written as:

[00106] Performance metrics used for the quantification of all-optical transformation errors. To quantitatively evaluate the transformation results of the polarization-multiplexed diffractive networks, four performance metrics were calculated per polarization channel of the diffractive designs using the testing data set: (1) the normalized transformation mean-squared error ($MSE_{Transformation}$), (2) the cosine similarity ($CosSim$) between the all-optical transforms and the target transforms, (3) the normalized mean-squared error between the diffractive network output fields and their ground truth ($MSE_{Output}$), and (4) the output diffraction efficiency ($\eta$). The transformation error for the $c$-th polarization channel of the diffractive network, $MSE_{Transformation,c}$, is defined as:

[00107] where $a_c$ is the vectorized version of the ground truth transformation matrix assigned to the $c$-th polarization channel, $A_c$, i.e., $a_c = \mathrm{vec}(A_c)$. $a_c'$ is the vectorized version of $A_c'$, which is the all-optical transformation matrix computed using the optimized diffractive transmission coefficients. $m_c$ is a scalar normalization coefficient used to eliminate the effect of the diffraction-efficiency-related scaling mismatch between $A_c$ and $A_c'$, i.e.,

[00108] The cosine similarity between the all-optical transform and its target transform for the $c$-th polarization channel, $CosSim_c$, is defined as:

[00109] The normalized mean-squared error between the diffractive network outputs and their ground truth for the $c$-th polarization channel, $MSE_{Output,c}$, is defined using the same formula as in Eq. 7 (the loss function used during the training process), except that $E[\cdot]$ is calculated across the entire testing set.

[00110] The mean diffraction efficiency $\eta_c$ for the $c$-th polarization channel of the diffractive system is defined as:
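The exact metric formulas are not legible in this text, so the sketch below uses common definitions as stand-ins and should be read as an assumption rather than the disclosed equations. It takes the scalar $m_c$ as the least-squares scaling between the vectorized matrices, the cosine similarity as the normalized magnitude of their complex inner product, and the diffraction efficiency as the mean ratio of output to input optical power. All function names are illustrative.

```python
import numpy as np

def transformation_metrics(A_target, A_optical):
    """Scaled transformation MSE and cosine similarity (assumed definitions)."""
    a = A_target.ravel()
    a_hat = A_optical.ravel()
    # Least-squares scalar m minimizing ||a - m * a_hat||^2 (assumption).
    m = np.vdot(a_hat, a) / np.vdot(a_hat, a_hat)
    mse = np.mean(np.abs(a - m * a_hat) ** 2)
    cos_sim = np.abs(np.vdot(a, a_hat)) / (np.linalg.norm(a) * np.linalg.norm(a_hat))
    return mse, cos_sim

def diffraction_efficiency(inputs, outputs):
    """Mean output-to-input power ratio over a set of field pairs (assumed definition)."""
    p_in = np.sum(np.abs(inputs) ** 2, axis=1)
    p_out = np.sum(np.abs(outputs) ** 2, axis=1)
    return np.mean(p_out / p_in)

rng = np.random.default_rng(0)
A = rng.uniform(size=(64, 64)) * np.exp(1j * rng.uniform(0, 2 * np.pi, size=(64, 64)))
A_scaled = 0.1 * np.exp(1j * 0.3) * A          # same transform, weaker and phase-shifted
mse, cos_sim = transformation_metrics(A, A_scaled)
```

Under these definitions, a network output that differs from its target only by a global complex scale factor yields a transformation error of zero and a cosine similarity of one, which is the behavior the scalar normalization is meant to provide.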

[00111] Training-related details. All the polarization-encoded diffractive networks discussed herein were simulated and trained using Python (v3.8.11) and TensorFlow (v2.6.0, Google Inc.). The Adam optimizer was selected for training all the models, and its parameters were taken as the default values in TensorFlow and kept identical in each model. The batch size and learning rate were set as 8 and 0.001, respectively. The training of the diffractive network models using the SimPA mode was performed with 50 epochs. For training the diffractive models using the SeqPA mode, the 2-channel and 4-channel polarization multiplexed designs were trained for 100 and 200 epochs, respectively, so that equivalently 50 epochs are dedicated to training each polarization channel of these designs. The best models were selected based on the MSE loss calculated on the validation data set. For the training of the diffractive models, a desktop computer with a GeForce GTX 1080Ti graphical processing unit (GPU, NVidia Inc.), an Intel® Core™ i7-8700 central processing unit (CPU, Intel Inc.) and 64 GB of RAM was used, running the Windows 10 operating system (Microsoft Inc.). The typical time to train a diffractive network model using the SeqPA mode with 2 and 4 polarization channels is ~7 and ~14 hours, respectively. The training time for a diffractive model using the SimPA mode with 2 polarization channels is ~4 hours.

[00112] It should be appreciated that various modifications and alternative embodiments of the polarization-encoded diffractive network 10 are contemplated. For example, the polarization-encoded diffractive network 10 may include substrate layers 12 that both transmit and reflect light or radiation. In addition, the polarizer arrays 24 may be integrated on one or more substrate layers 12. The polarizers 18, in some embodiments, may be omitted, but this may increase crosstalk and negatively impact the output vectorized optical field 16.

[00113] Supplementary Note 1: Mathematical analysis of different polarization encoding schemes. Here, the different polarization encoding schemes that are used in the main text are analyzed. The optical fields are assumed to be fully polarized.

[00114] (1) For the 2-channel polarization multiplexed diffractive system that uses $x$ or $y$ polarization at both its input and output fields-of-view (FOVs), one can describe the transformation relationship between the inputs $i_x$, $i_y$ and the outputs $o_x$, $o_y$ using:

$$\begin{bmatrix} o_x \\ o_y \end{bmatrix} = \begin{bmatrix} A_{x-x} & A_{y-x} \\ A_{x-y} & A_{y-y} \end{bmatrix} \begin{bmatrix} i_x \\ i_y \end{bmatrix} \tag{S1}$$

[00116] where $A_{x-x}$, $A_{y-x}$, $A_{x-y}$ and $A_{y-y}$ are the desired/target transformation matrices represented by the polarization multiplexed diffractive networks.

[00117] When a polarization multiplexed diffractive network is operated using the SeqPA mode, we sequentially and separately force $i_y = 0$ and $i_x = 0$ and read out the corresponding outputs:

$$o_x = A_{x-x} i_x \quad (i_y = 0) \tag{S2}$$

$$o_y = A_{y-y} i_y \quad (i_x = 0) \tag{S3}$$

[00120] So by training the system to approximate $o_x = A_{x-x} i_x \to A_1 i_x$ and $o_y = A_{y-y} i_y \to A_2 i_y$, the diffractive network model converges to satisfy $A_{x-x} \to A_1$ and $A_{y-y} \to A_2$.

[00121] When a polarization multiplexed diffractive system is operated using the SimPA mode, where both inputs $i_x$ and $i_y$ are fed into the diffractive network simultaneously, the system is trained to satisfy

[00124] so that the system converges to approximate $A_{x-x} \to A_1$, $A_{y-y} \to A_2$, and also $A_{y-x}, A_{x-y} \to 0$ (zero cross talk). Therefore, after the training, the transformations performed by the diffractive systems using the SimPA mode can be represented using the following formula:

$$\begin{bmatrix} o_x \\ o_y \end{bmatrix} = \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \end{bmatrix} \begin{bmatrix} i_x \\ i_y \end{bmatrix} \tag{S6}$$

[00126] Compared to the SeqPA mode, the SimPA mode presents more constraints to the system (i.e., penalizing $A_{y-x}$ and $A_{x-y}$ to approach 0). Equation (S6) only holds for the SimPA mode of operation since $A_{y-x}$ and $A_{x-y}$ are not used in sequential polarization access, i.e., they are ignored in the SeqPA mode (without any penalties).
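The difference between the two access modes can be made concrete with a small numerical sketch. It assumes the 2-channel input/output relationship has the block form $o_x = A_{x-x} i_x + A_{y-x} i_y$ and $o_y = A_{x-y} i_x + A_{y-y} i_y$, with randomly chosen matrices standing in for a trained network; all variable and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64

def rand_c(shape):
    """Random complex array with uniform amplitude and phase."""
    return rng.uniform(size=shape) * np.exp(1j * rng.uniform(0, 2 * np.pi, size=shape))

# First subscript: input polarization; second subscript: output polarization.
A_xx, A_yx, A_xy, A_yy = (rand_c((n, n)) for _ in range(4))

def network_output(i_x, i_y):
    """Block-matrix model of the 2-channel polarization multiplexed network."""
    o_x = A_xx @ i_x + A_yx @ i_y
    o_y = A_xy @ i_x + A_yy @ i_y
    return o_x, o_y

i_x, i_y = rand_c(n), rand_c(n)
zero = np.zeros(n, dtype=complex)

# SeqPA: one input polarization at a time, so cross terms never reach the readout.
o_x_seq, _ = network_output(i_x, zero)
_, o_y_seq = network_output(zero, i_y)

# SimPA: both inputs at once, so any non-zero cross term appears as cross-talk.
o_x_sim, o_y_sim = network_output(i_x, i_y)
crosstalk_x = o_x_sim - A_xx @ i_x
```

Here `crosstalk_x` equals `A_yx @ i_y`, which is the term the SimPA training has to drive toward zero, while the SeqPA readouts are unaffected by it.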

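[00127] (2) For the 4-channel polarization multiplexed diffractive network, the transformation relationship between the inputs $i_x$, $i_y$ and the outputs $o_\alpha$, $o_\beta$ can be described using the following equation:

$$\begin{bmatrix} o_\alpha \\ o_\beta \end{bmatrix} = \begin{bmatrix} A_{x-\alpha} & A_{y-\alpha} \\ A_{x-\beta} & A_{y-\beta} \end{bmatrix} \begin{bmatrix} i_x \\ i_y \end{bmatrix} \tag{S7}$$

[00129] One can sequentially and separately force $i_x = 0$ or $i_y = 0$ and read out the corresponding $o_\alpha$ and $o_\beta$, which can be written as:

[00132] By training the system so that each input/output polarization combination approximates its assigned target, e.g., $o_\alpha = A_{x-\alpha} i_x \to A_1 i_x$ and $o_\beta = A_{y-\beta} i_y \to A_2 i_y$, with the remaining two combinations assigned to $A_3$ and $A_4$, the diffractive network model converges to satisfy these four target relationships. To directly access the four diffractive linear transformation matrices (i.e., $A_1$, $A_2$, $A_3$ and $A_4$), we set either $i_x$ or $i_y$ as 0 and read the output fields of $o_\alpha$ and $o_\beta$ in each case (based on Eqs. (S8-S9)).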

[00133] (3) The input polarization channels of the diffractive network can also use left and right-hand circular polarization states (LHCP and RHCP):

[00136] Based on Equation (S10), if one uses LHCP and RHCP as the input polarization channels of the system, the four linear transformation matrices that are encoded through the same diffractive network can be represented by their counterparts measured using the $x$ and $y$ polarization channels. Also see FIGS. 22A-22D, which reveal that circular input polarization-multiplexed diffractive processors can successfully approximate the target, complex-valued linear transformations when $N$ approaches $N_p N_i N_o = 4 N_i N_o = 16.4$k, arriving at the same conclusion as for linear input polarization states. Since this type of Jones unitary transformation can be applied between any combination of orthogonal polarization states, it can be further inferred that a polarization multiplexed diffractive processor with $N_p = 4$ can be designed by using input-output combinations of 2 orthogonal input polarization states (e.g., linear, circular or elliptical) and 2 orthogonal output polarization states (e.g., linear, circular or elliptical), where each input-output polarization combination all-optically performs one of the target complex-valued linear transformations ($A_1$, $A_2$, $A_3$, $A_4$).

[00137] (4) Here, the case of $N_p > 4$ under the SeqPA mode of operation is explored. Suppose that there exists a new linear transformation $A_{\delta-\gamma}$ independent of the other four target linear transformations ($A_1$, $A_2$, $A_3$, $A_4$). $A_{\delta-\gamma}$ operates between $i_\delta$ and $o_\gamma$ at the input and output FOVs of the polarization multiplexed diffractive system (i.e., $i_\delta$ and $o_\gamma$ are defined by the optical fields linearly polarized at the angles of $\delta$ and $\gamma$, respectively, from the $x$-axis):

$$o_\gamma = A_{\delta-\gamma} i_\delta \tag{S11}$$

[00139] Since $i_\delta$ and $o_\gamma$ share the same input and output FOVs as $(i_x, i_y)$ and $(o_x, o_y)$, we can decompose them into $x$ and $y$ polarization states:

[00140] $$o_\gamma = \cos\gamma \, o_x + \sin\gamma \, o_y \tag{S12}$$

[00141]

[00143] Based on these we can write:

[00149] Based on Eq. (S16), $A_{\delta-\gamma}$ cannot represent an independent linear transformation, and it linearly depends on $A_{x-x}$, $A_{y-x}$, $A_{x-y}$ and $A_{y-y}$ of the polarization multiplexed diffractive network. Stated differently, an additional transformation matrix $A_5 = A_{\delta-\gamma}$ that can be assigned to a new combination of input-output polarization states of the diffractive network can be written as a linear combination of $A_1$, $A_2$, $A_3$ and $A_4$. In this case, depending on the target linear transformation set ($A_1$, $A_2$, $A_3$, $A_4$, $A_5$) and the acceptable error threshold for each transformation, a diffractive design with $N_p > 4$ can be optimized using polarization multiplexing to provide approximate solutions to the target linear transformations. However, the approximation error and the computational accuracy for this case of $N_p > 4$ will depend on the Euclidean distances among the target complex-valued transformation matrices, and for certain sets of target linear transformations, the trained diffractive network might fail to achieve an acceptable error threshold in its approximation of $A_1$, $A_2$, $A_3$, $A_4$ and $A_5$.
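The linear dependence stated above can be checked numerically. The sketch assumes that an input field linearly polarized at angle $\delta$ decomposes as $i_x = \cos\delta \, i_\delta$ and $i_y = \sin\delta \, i_\delta$, that the analyzer at angle $\gamma$ reads $o_\gamma = \cos\gamma \, o_x + \sin\gamma \, o_y$, and that the network follows the block relationship between $(i_x, i_y)$ and $(o_x, o_y)$; under these assumptions the effective matrix is a fixed linear combination of the four $x$/$y$ matrices. Function names, angles, and matrix sizes are illustrative.

```python
import numpy as np

def combined_transform(A_xx, A_yx, A_xy, A_yy, delta, gamma):
    """Effective matrix between input angle delta and output angle gamma."""
    return (np.cos(gamma) * (np.cos(delta) * A_xx + np.sin(delta) * A_yx)
            + np.sin(gamma) * (np.cos(delta) * A_xy + np.sin(delta) * A_yy))

rng = np.random.default_rng(1)
n = 16
A_xx, A_yx, A_xy, A_yy = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
                          for _ in range(4))
delta, gamma = 0.4, 1.1                       # arbitrary polarizer/analyzer angles (rad)

# Direct simulation: decompose the input, apply the x/y model, project the output.
i_delta = rng.normal(size=n) + 1j * rng.normal(size=n)
i_x, i_y = np.cos(delta) * i_delta, np.sin(delta) * i_delta
o_x = A_xx @ i_x + A_yx @ i_y
o_y = A_xy @ i_x + A_yy @ i_y
o_gamma = np.cos(gamma) * o_x + np.sin(gamma) * o_y

A_dg = combined_transform(A_xx, A_yx, A_xy, A_yy, delta, gamma)
```

Since `A_dg @ i_delta` reproduces `o_gamma` for any input, a fifth target matrix can only be matched exactly if it happens to equal this particular combination of the other four.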

[00150] Supplementary Note 2. Here, the impact of some physical parameters of the polarizer arrays on the computational performance of polarization multiplexed diffractive networks is analyzed. As the testbed, the 4-channel polarization multiplexed system described herein was used with $N = N_p N_i N_o = 16.3$k and the same four target linear transforms (i.e., $A_1$, $A_2$, $A_3$ and $A_4$), i.e., $N_p = 4$. As shown in FIGS. 23A-23C, three different comparisons were explored to perform this analysis by varying (A) the period of each linear polarizer unit on the polarizer array, (B) the overall size of each polarizer array, and (C) the number and position of these polarizer arrays within the diffractive network. All the parameter tuning used here is performed based on the $N_p = 4$ system that is reported in the Methods section.

[00151] Since the number and position of the polarizer arrays within a diffractive network can be arranged in various combinations, the following seven designs were selected for comparison, each of which was assigned a unique notation/number in FIG. 23C:

[00152] "(0)" refers to the diffractive model used in the main text ($N_p = 4$), which has 2 polarizer arrays placed after the 3rd and 5th diffractive layers;

[00153] "(1)" has 2 polarizer arrays placed after the 4th and 5th diffractive layers;

[00154] "(2)" has 2 polarizer arrays placed after the 3rd and 6th diffractive layers;

[00155] "(3)" has 4 polarizer arrays placed after the 3rd to 6th diffractive layers;

[00156] "(4)" has 6 polarizer arrays placed after the 2nd to 7th diffractive layers;

[00157] "(5)" has 8 polarizer arrays placed after all the diffractive layers;

[00158] "(6)" has only one polarizer array placed after the 5th diffractive layer.

[00159] Based on this comparative analysis reported in FIGS. 23A-23C, the observations can be summarized as follows:

[00160] 1. A better all-optical approximation accuracy can be achieved when the period of each linear polarization unit on the polarizer array is $\leq 4\lambda$, and a period of ~$4\lambda$ empirically appears as an optimal choice, also providing an improved output diffraction efficiency;

[00161] 2. The transformation accuracy and the diffraction efficiency of the system can be optimized by using polarizer arrays with a sufficiently large size, i.e., at least matching the size/width of the neighboring trainable diffractive layers;

[00162] 3. Using two polarizer arrays and placing them apart with an axial distance of ~$8\lambda$ within the diffractive volume can provide improved results for the all-optical transformation accuracy and the diffraction efficiency of $N_p = 4$ designs;

[00163] 4. Using too many (e.g., >6) or too few (e.g., 1) polarizer arrays will considerably deteriorate the computational accuracy of the diffractive processor for $N_p = 4$.

[00164] Supplementary Note 3. Here, the 4-channel polarization multiplexed diffractive processor (depicted in FIGS. 6A-6B) was used to analyze how the modeling of the PER ($\kappa$) of the polarizer arrays would affect the computational performance of the diffractive system. Two different PERs were used, i.e., $\kappa = 10^5$ and $\kappa = 338.64$, where these PER values reflect (1) the typical PER of commercial linear polarizers and (2) the PER of a microscopic polarizer filter array used in polarization-based CMOS image sensors, respectively. One can model the linear polarizer elements on each polarizer array/seed with a finite PER in the following form:

[00165] Using Eq. (S17) as part of the optical forward model, the computational performance of the polarization multiplexed diffractive processors ($N_p = 4$) as a function of different PER values is reported in FIGS. 24A-24D. It can be seen that even for a small PER value of 338.64, by appropriately training the polarization multiplexed diffractive network using $\kappa_{train} = 338.64$, the all-optical transformation accuracy of the diffractive network can approximately match the ideal case of $\kappa_{train} = \infty$, $\kappa_{test} = \infty$ (FIGS. 24A-24D). Furthermore, the results also reveal that a relatively large PER of $\kappa = 10^5$ has a negligible impact on the linear transformation performance of the diffractive network compared to the ideal case of ($\kappa_{train} = \infty$, $\kappa_{test} = \infty$).
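Because the finite-PER Jones matrix itself is not legible in this text, the sketch below uses one common way to model it and should be treated as an assumption: the polarizer fully transmits the field component along its axis and leaks the orthogonal component with amplitude $1/\sqrt{\kappa}$, so that the intensity extinction ratio equals $\kappa$. The function name and the test angle are illustrative.

```python
import numpy as np

def jones_linear_per(theta, per=np.inf):
    """Jones matrix of a linear polarizer at angle theta with finite PER (assumed model)."""
    leak = 0.0 if np.isinf(per) else 1.0 / np.sqrt(per)   # amplitude leakage
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])                     # rotates the x-axis onto the polarizing axis
    return rot @ np.diag([1.0, leak]) @ rot.T

theta = 0.3
J = jones_linear_per(theta, per=338.64)
pass_axis = np.array([np.cos(theta), np.sin(theta)])
block_axis = np.array([-np.sin(theta), np.cos(theta)])

# Ratio of transmitted intensities along and across the polarizing axis.
measured_per = np.sum(np.abs(J @ pass_axis) ** 2) / np.sum(np.abs(J @ block_axis) ** 2)
```

With `per=np.inf` the matrix reduces to the ideal linear-polarizer projector, so the same function could serve both the ideal and the imperfect case in a forward model; using one PER value during training and another at test time mirrors the $\kappa_{train}$ / $\kappa_{test}$ comparison described in the text.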

[00166] (2) One can also consider each polarization multiplexed diffractive processor as a monolithic polarization optical element and quantify the overall PER of the diffractive network after its training. It should be noted that for the SeqPA mode of operation, PER is not a meaningful figure-of-merit for a polarization multiplexed diffractive network since only one orthogonal polarization state is read/measured at a given time due to the sequential access of each target transformation through the diffractive network. In other words, the SeqPA mode of operation does not penalize the leakage of power into an orthogonal polarization state at the output field-of-view, as this leakage does not impact the accuracy of each all-optical transformation that is sequentially performed. Therefore, this PER analysis for the entire diffractive network treated as a single polarization optical element was only applied to the SimPA-based 2-channel polarization multiplexed diffractive design, which resulted in an overall PER of 51726.596 at the output field-of-view of the diffractive network.

[00167] Such a high PER is expected since the SimPA mode is designed to simultaneously perform two different linear transformations using two orthogonal polarization states, and therefore undesired polarization cross-talk at the output field-of-view was penalized during the training phase, successfully leading to the observed high PER value.

[00168]

[00169] While embodiments of the present invention have been shown and described, various modifications may be made without departing from the scope of the present invention. The invention, therefore, should not be limited, except to the following claims, and their equivalents.