Title:
BUILDING COMPUTATIONAL TRANSFER FUNCTIONS ON 3D LIGHT MICROSCOPY IMAGES USING DEEP LEARNING
Document Type and Number:
WIPO Patent Application WO/2021/067507
Kind Code:
A1
Abstract:
An image transformation facility is described. The facility accesses a machine learning model trained to transform three-dimensional microscopy images in an imaging domain of a source image type into three-dimensional images in an imaging domain of a target image type. The facility also accesses a subject microscopy image in the imaging domain of the source image type. The facility applies the trained machine learning model to the subject microscopy image to obtain a transfer result image in the imaging domain of the target image type.

Inventors:
CHEN JIANXU (US)
VIANA MATHEUS PALHARES (US)
RAFELSKI SUSANNE MARIE (US)
WANG HONGXIAO (US)
CHEN DANNY ZIYI (US)
FRICK CHRISTOPHER LEE (US)
SLUZEWSKI FILIP (US)
Application Number:
PCT/US2020/053644
Publication Date:
April 08, 2021
Filing Date:
September 30, 2020
Assignee:
ALLEN INST (US)
UNIV NOTRE DAME DU LAC (US)
International Classes:
G06T3/00; G06N3/08; G06T7/30; G06T15/50
Foreign References:
US20190221313A12019-07-18
US20190244347A12019-08-08
US20190087939A12019-03-21
US20190251330A12019-08-15
Other References:
WU YICHEN, LUO YILIN, CHAUDHARI GUNVANT, RIVENSON YAIR, CALIS AYFER, DE HAAN KEVIN, OZCAN AYDOGAN: "Bright-field holography: cross-modality deep learning enables snapshot 3D imaging with bright-field contrast using a single hologram", LIGHT: SCIENCE & APPLICATIONS, vol. 8, no. 1, XP055751396, DOI: 10.1038/s41377-019-0139-9
Attorney, Agent or Firm:
LAWRENZ, Steven, D. et al. (US)
Claims:
CLAIMS

1. A method in a computing system for transforming a subject three-dimensional image in an imaging domain from a source image type to a target image type, the method comprising: accessing a plurality of three-dimensional images in the imaging domain of the source image type; accessing a plurality of three-dimensional images in the imaging domain of the target image type; using the accessed three-dimensional images to construct a transfer function for transforming source three-dimensional images in the imaging domain in the source image type to target three-dimensional images in the imaging domain in the target image type; and applying the constructed transfer function to the subject three-dimensional image in the imaging domain in the source image type to obtain a corresponding subject three-dimensional image in the imaging domain in the target image type.

2. The method of claim 1 wherein constructing the transfer function comprises training a generative adversarial network with cycle consistency.

3. The method of claim 1 wherein constructing the transfer function comprises training a conditional generative adversarial network.

4. The method of claim 3 wherein the conditional generative adversarial network is trained using an enhanced deep residual network.

5. The method of claim 3 wherein the conditional generative adversarial network is trained using a U-net.

6. The method of claim 1 wherein constructing the transfer function further comprises training an auto-alignment module.

7. The method of claim 6 wherein the auto-alignment module is trained using a spatial-transformer network.

8. One or more memories collectively having contents configured to cause a computing system to perform a method, the method comprising: accessing a plurality of three-dimensional images in an imaging domain of a source image type; accessing a plurality of three-dimensional images in an imaging domain of a target image type; using the accessed three-dimensional images to train a machine learning model to perform a transfer function for transforming source three-dimensional images in the imaging domain in the source image type to target three-dimensional images in the imaging domain in the target image type; and storing the trained machine learning model.

9. The one or more memories of claim 8, the method further comprising: applying the transfer function to a subject three-dimensional image in the imaging domain in the source image type to obtain a corresponding subject three-dimensional image in the imaging domain in the target image type.

10. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to magnification level.

11. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to resolution.

12. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to microscope modality.

13. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to microscopy objective.

14. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to illumination level.

15. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to signal-to-noise ratio.

16. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to the extent to which fluorescent light is captured.

17. The one or more memories of claim 8 wherein the imaging domain of the source image type differs from the imaging domain of the target image type with respect to at least two imaging attributes among modality attributes, magnification level attributes, resolution attributes, microscopy objective attributes, illumination level attributes, signal-to-noise ratio attributes, and the extent to which fluorescent light is captured.

18. The one or more memories of claim 8 wherein each of the accessed plurality of three-dimensional images in the imaging domain of the source image type is captured from the same scene as one of the accessed plurality of three-dimensional images in the imaging domain of the target image type.

19. The one or more memories of claim 18, the method further comprising: using a proper subset of the three-dimensional images in the imaging domain of the source image type and the three-dimensional images in the imaging domain of the destination image type to train an image alignment network to predict misalignment of a three-dimensional image in the imaging domain of the source image type with respect to a three-dimensional image in the imaging domain of the target image type that is captured from the same scene; and for each of at least a portion of the three-dimensional images in the imaging domain of the source image type: applying the image alignment network to predict misalignment of the three-dimensional image in the imaging domain of the source image type with respect to the three-dimensional image in the imaging domain of the target image type that is captured from the same scene; and before the three-dimensional image in the imaging domain of the source image type and the three-dimensional image in the imaging domain of the target image type that is captured from the same scene are used to train the machine learning model to perform the transfer function, realigning the three-dimensional image in the imaging domain of the source image type with respect to the three-dimensional image in the imaging domain of the target image type that is captured from the same scene in accordance with the predicted misalignment.

20. The one or more memories of claim 18, the method further comprising: for each of at least a portion of the three-dimensional images in the imaging domain of the source image type: before the three-dimensional image in the imaging domain of the source image type and the three-dimensional image in the imaging domain of the target image type that is captured from the same scene are used to train the machine learning model to perform the transfer function, performing a registration process upon the three-dimensional image in the imaging domain of the source image type and the three-dimensional image in the imaging domain of the target image type that is captured from the same scene.

21. The one or more memories of claim 8 wherein the trained machine learning model is a conditional generative adversarial network.

22. The one or more memories of claim 21 wherein the conditional generative adversarial network is implemented using a U-Net.

23. The one or more memories of claim 21 wherein the conditional generative adversarial network is implemented using an enhanced deep super resolution network.

24. The one or more memories of claim 8 wherein the trained machine learning model is a generative adversarial network with cycle consistency.

25. One or more memories collectively storing a transfer function data structure, the data structure comprising: information comprising a trained state of a machine learning model trained to transform three-dimensional microscopy images in an imaging domain of a source image type into three-dimensional microscopy images in an imaging domain of a target image type, such that the trained state can be loaded into a machine learning model to cause the machine learning model into which it is loaded to transform a subject three-dimensional microscopy image in the imaging domain of the source image type into a three-dimensional microscopy image in the imaging domain of the target image type.

26. The one or more memories of claim 25 wherein the source image type is microscopy image, and target image type is segmentation mask.

27. The one or more memories of claim 25 wherein the source image type is segmentation mask, and target image type is microscopy image.

28. The one or more memories of claim 25, the data structure further comprising: information comprising a trained state of a second machine learning model trained to transform three-dimensional microscopy images in the imaging domain of the target image type into three-dimensional microscopy images in an imaging domain of a further target image type, such that the trained state can be loaded into a second machine learning model to cause the second machine learning model into which it is loaded to transform the subject three-dimensional microscopy image in the imaging domain of the target image type into a three-dimensional microscopy image in the imaging domain of the further target image type.

29. A method in a computing system, the method comprising: accessing a machine learning model trained to transform three-dimensional microscopy images in an imaging domain of a source image type into three-dimensional images in an imaging domain of a target image type; accessing a subject microscopy image in the imaging domain of the source image type; and applying the trained machine learning model to the subject microscopy image to obtain a transfer result image in the imaging domain of the target image type.

30. The method of claim 29, further comprising causing the transfer result image to be displayed.

31. The method of claim 29, further comprising storing the transfer result image.

32. The method of claim 29, further comprising: accessing a second machine learning model trained to transform a non-fluorescent image into a fluorescent image; and applying the second machine learning model to the transfer result image to obtain a fluorescent version of the transfer result image.

33. The method of claim 29 wherein target image type is fluorescent.

34. The method of claim 29 wherein source image type is not fluorescent.

35. A method in a computing system, the method comprising: receiving information identifying a plurality of three-dimensional microscopy source images captured in a source imaging domain and a plurality of three-dimensional microscopy target images captured in a target imaging domain different from the source imaging domain, wherein the source and target images are organized in pairs such that the source and target image of each pair show the same scene; receiving information specifying a quantitative biological measure and a minimum similarity level; dividing the image pairs into a training set and a validation set; using the image pairs of the training set to train a first model based on a first set of training parameters, the first model being trained to predict the target image of each training pair from the source image; for each of the image pairs of the validation set: applying the first model to the source image of the image pair to obtain a predicted image; assessing the specified measure for the obtained predicted image; assessing the specified measure for the target image of the image pair; determining a level of similarity of the specified measure assessed for the obtained predicted image to the specified measure assessed for the target image of the image pair; aggregating the levels of similarity determined for the image pairs of the validation set to obtain an aggregated level of similarity; determining whether the aggregated level of similarity satisfies the specified minimum similarity level; and in response to determining that the aggregated level of similarity does not satisfy the specified minimum similarity level, using the image pairs of the training set to train a second model based on a second set of training parameters different from the first set of training parameters, the second model being trained to predict the target image of each training pair from the source image.

Description:
BUILDING COMPUTATIONAL TRANSFER FUNCTIONS ON 3D LIGHT MICROSCOPY IMAGES USING DEEP LEARNING

STATEMENT OF GOVERNMENT INTEREST

This invention was made with Government support under CCF1617735 awarded by the National Science Foundation. The Government has certain rights in the invention.

CROSS-REFERENCE TO RELATED APPLICATIONS

This Application claims the benefit of provisional U.S. Application No. 62/908,316 filed September 30, 2019 and entitled “BUILDING COMPUTATIONAL TRANSFER FUNCTIONS ON 3D LIGHT MICROSCOPY IMAGES USING DEEP LEARNING” which is hereby incorporated by reference in its entirety.

In cases where the present application conflicts with a document incorporated by reference, the present application controls.

BACKGROUND

Modern microscopy has enabled biologists to gain new insights on many different aspects of cell and developmental biology. To do so, biologists choose the appropriate microscopy settings for their specific research purposes, including different microscope modalities, magnifications, resolution settings, laser powers, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram showing some of the components typically incorporated in at least some of the computer systems and other devices on which the facility operates.

Figure 2 is a data flow diagram illustrating the paired workflow used by the facility in some embodiments.

Figure 3 is a data flow diagram illustrating the unpaired workflow used by the facility in some embodiments.

Figure 4 is an image diagram showing, in an x, y view, Lamin B1 transferred to a higher magnification by the facility using the paired workflow.

Figure 5 is an image diagram showing the isolated nuclei from the images in Figure 4 at higher magnification.

Figure 6 is an image diagram showing results of the inventors' four different experiments.

Figure 7 is an image diagram showing further results of the four experiments.

Figure 8 is a chart diagram showing the results of application-specific validation for Lamin B1 magnification transfer in the four experiments.

Figure 9 is a chart diagram showing the additional results of application-specific validation for Lamin B1 magnification transfer in the four experiments.

Figure 10 is an image diagram showing predictions made for the additional intracellular structures by the facility.

Figure 11 is an image diagram showing additional results of the facility’s predictions with respect to the additional intracellular structures.

Figure 12 is an image diagram comparing prediction results for the additional intracellular structures to the actual target images for mEGFP-fibrillarin.

Figure 13 is a chart diagram showing the comparison of predicted images to real images for the additional intracellular structures.

Figure 14 is an image diagram showing the facility's coarse segmentation of mTagRFP-T-nucleophosmin.

Figure 15 is a chart diagram showing a comparison of the volume of nucleophosmin segmented by the facility based upon actual target and predicted images.

Figure 16 is a chart diagram showing further analysis of coarse segmentation of the additional intracellular structures by the facility.

Figure 17 is an image diagram showing the facility’s fine segmentation of fibrillarin.

Figure 18 is a chart diagram showing the facility's segmentation of fibrillarin based upon predicted images versus actual target images.

Figure 19 is a chart diagram showing further analysis of the facility's fine segmentation.

Figure 20 is an image diagram showing the derivation of information from SMC1A images for assessing texture metrics.

Figure 21 is a chart diagram showing texture contrast correlation between predicted and target images for SMC1A.

Figure 22 is a chart diagram showing image texture measures for SMC1A and H2B.

Figure 23 is an image diagram showing sample results of a cross modality transfer function on Lamin B1 images.

Figure 24 shows sample results of a cross-modality transfer function on H2B images.

Figure 25 is an image diagram showing sample results of an SNR transfer function.

Figure 26 is an image diagram showing sample results of a binary-to-realistic microscopy image transfer function.

Figure 27 is an image diagram showing an example of transferring microscopy images into binary masks on the same 3D mitochondrial images as in Application 5, but with the source type and target type swapped.

Figure 28 is an image diagram showing the result of a composite transfer function.

Figure 29 is a data flow diagram showing multiple approaches available to predict high-magnification fluorescent images from lower-magnification brightfield images.

Figure 30 is an image diagram showing actual images captured of a sample.

Figure 31 is an image diagram showing higher-magnification images of the same sample.

Figure 32 is an image diagram showing images predicted by the facility in path 1 shown in Figure 29.

Figure 33 is an image diagram showing images predicted by the facility in path 2 shown in Figure 29.

Figure 34 is an image diagram showing an image predicted by the facility in path 3 shown in Figure 29.

DETAILED DESCRIPTION

Introduction

The inventors have recognized that it can be difficult to make an optimal, balanced choice among all of the potentially important factors. For instance, enhanced-resolution microscopy (such as using a ZEISS LSM 880 with AiryScan FAST in super-resolution mode) may provide images of increased resolution, but generally does so while compromising either speed or the size of the field of view (FOV). Also, a lower-magnification air objective may permit acquisition of long-duration timelapse imaging with decreased photodamage and with a larger FOV that can capture, for example, entire colonies of cells (instead of a handful of cells) in the image, but does so with reduced resolution and magnification.

The inventors have recognized that if it were possible to computationally transform the images in a long timelapse movie of a large FOV at low magnification/resolution into a resolution comparable to the images acquired by the enhanced-resolution microscopy modality, this would be a powerful approach to generating optimal data with much less need for compromise in microscopy settings. This would permit users to obtain long-duration temporal data of entire cell colonies at high resolution for subsequent analysis.

In response to this recognition, the inventors have conceived and reduced to practice a software and/or hardware facility that uses deep learning techniques to construct computational transformation models (“transfer functions”) for 3D light microscopy images (“the facility”). Such transfer functions each transform a “source type image” into a “target type image,” and include, but are not limited to, those that transfer images between different magnifications, different microscope objectives, different resolutions, different laser powers, different light microscope modalities, different signal to noise ratios (SNRs) and even between binary simulated images and realistic microscopy images and between microscopy images and their binary masks (sometimes referred to as segmentation).

In some embodiments, the facility trains and applies a separate instance of its models for each of one or more imaging domains. In various embodiments, these imaging domains are defined at varying levels of specificity, such as genetic material; cells; animal cells and plant cells; etc.

Embodiments of the facility use two main workflows to create this collection of deep learning-based 3D light microscopy transfer functions: a paired workflow based on a Conditional Generative Adversarial Network (cGAN); and an unpaired workflow based on a Generative Adversarial Network with Cycle Consistency (CycleGAN). The paired workflow uses specialized biological experiments to acquire matched training images of the two types to be transferred between, which are either inherently aligned or aligned computationally by the facility. The unpaired workflow does not rely on such alignment, but is instead limited by a lack of direct physical correspondence between samples. Collecting specialized data for creating the aligned training samples may not always be straightforward, which gives the unpaired workflow an advantage in some scenarios because it avoids this experimental step. But, as a consequence of the unpaired nature, the transfer results from the unpaired workflow exhibit less spatial correspondence, which may lead to less biological validity than the paired workflow. For example, transferring an image of a nucleus with a textured DNA dye from low resolution into enhanced resolution with the unpaired workflow may generate very realistic-looking general nuclear texture patterns that appear similar to the enhanced-resolution images, but the predicted nuclear texture could show little spatial correspondence to its corresponding lower-resolution image. On the other hand, the predicted nuclear texture using the paired workflow would have both realistic and "spatially correct" nuclear texture, i.e., a high correspondence between images in a voxel-by-voxel manner when compared to the target enhanced-resolution image.

The inventors have explored different methodologies for biology-driven validation of the prediction results. The prediction from the source type image will rarely be identical to the real target type image. First, the real source type images and the real target type images may not capture exactly the same actual physical locations, especially along z, due to different imaging modalities or z-steps. Therefore, the actual z positions in the biological sample represented in the prediction and in the target type image might not be identical. The target type images can only be used as a reference. In other words, there is often no absolute ground truth. (The term "ground truth" is used herein to refer to the target type images used as reference, even though they are not the truth in the absolute sense.) In addition, no machine learning model is perfect; each will typically incur some discrepancy between the prediction and the ground truth. If the prediction has non-trivial differences from the ground truth, this raises the question of whether any downstream analysis and biological interpretation will be valid when using the prediction as input. To this end, the inventors seek to systematically evaluate the prediction results, to (1) understand the accuracy in the biological context and make appropriate interpretations in any downstream analysis based on predictions, and (2) inspire users on how to validate the results when using the facility on their own data.

By operating in some or all of the ways described above, the facility makes available three-dimensional microscopy results that are improved in a variety of ways.

Figure 1 is a block diagram showing some of the components typically incorporated in at least some of the computer systems and other devices on which the facility operates. In various embodiments, these computer systems and other devices 100 can include server computer systems, cloud computing platforms or virtual machines in other configurations, desktop computer systems, laptop computer systems, netbooks, mobile phones, personal digital assistants, televisions, cameras, automobile computers, electronic media players, hand-held devices, etc. In various embodiments, the computer systems and devices include zero or more of each of the following: a processor 101 for executing computer programs and/or training or applying machine learning models, such as a CPU, GPU, TPU, NNP, FPGA, or ASIC; a computer memory 102 for storing programs and data while they are being used, including the facility and associated data, an operating system including a kernel, and device drivers; a persistent storage device 103, such as a hard drive or flash drive for persistently storing programs and data; a computer-readable media drive 104, such as a floppy, CD-ROM, or DVD drive, for reading programs and data stored on a computer-readable medium; and a network connection 105 for connecting the computer system to other computer systems to send and/or receive data, such as via the Internet or another network and its networking hardware, such as switches, routers, repeaters, electrical cables and optical fibers, light emitters and receivers, radio transmitters and receivers, and the like. While computer systems configured as described above are typically used to support the operation of the facility, those skilled in the art will appreciate that the facility may be implemented using devices of various types and configurations, and having various components.

Figure 2 is a data flow diagram illustrating the paired workflow used by the facility in some embodiments. First, the facility collects images of two different types (“source type” and “target type”), but of the identical cells 201 in identical samples (e.g., fixed samples). In some applications, the images are inherently fully aligned, while in other applications one or more computational steps are performed to create aligned images. This creates a set of image pairs, where each pair is two images (one in source type and one in target type) of the same FOV of the same cells. In particular, images 213 of the source type are collected by a computer or other interface device 212 from a first microscope or other imaging device 211 having source characteristics, while the images 223 of the target type are collected by a computer or other interface device 222 from a second microscope or other imaging device 221 having target characteristics. The facility then feeds the image pairs into the deep learning module 230 to train the transfer function. Finally, the facility applies the trained model 240 (the “transfer function”) to transfer (or "transform") images of the source type to predicted images comparable to the target type.

In some embodiments, the deep learning module of the paired workflow has two parts: a conditional GAN (cGAN) and an Auto-Align module. The cGAN generates target type images based on the features of the input source type images. In some embodiments, the cGAN uses two common network backbones: U-Net and EDSR. In some embodiments, the facility uses U-Nets as described in Ronneberger O., Fischer P., Brox T. U-net: Convolutional networks for biomedical image segmentation, International Conference on Medical image computing and computer-assisted intervention, 2015 Oct 5 (pp. 234-241), Springer, Cham., which is hereby incorporated by reference in its entirety. In some embodiments, the facility uses EDSRs as described in Lim B., Son S., Kim H., Nah S., Mu Lee K., Enhanced deep residual networks for single image super-resolution, Proceedings of the IEEE conference on computer vision and pattern recognition workshops 2017 (pp. 136-144), which is hereby incorporated by reference in its entirety.

When the facility computationally aligns the image pairs prior to training, the Auto-Align module estimates the misalignment at the sub-pixel level to further align the training pairs. This allows the pixel-wise loss normally used in cGAN training to be calculated between more highly aligned training pairs. The pixel-wise correspondence between the source and target types is thus improved, which encourages the cGAN to generate more biologically valid images (e.g., for quantitative analysis).

In some embodiments, the cGAN network used by the facility in the paired workflow is adapted from processing 2D images to processing 3D images. First, the cGAN's 2D convolutional kernel is modified to process 3D images. Because 3D microscope images are typically not isotropic, that is, the Z-resolution is lower than the resolution in the X and Y dimensions, the facility employs anisotropic operations in the neural network. For convolutional layers, a kernel size of 3 pixels with a stride of 1 is used in the Z dimension, and a kernel size of 4 pixels with a stride of 2 is used in the X and Y dimensions. In some embodiments, max-pooling is performed only in the X and Y dimensions. Using a cGAN network organized in this way provides comparable theoretical receptive fields in all dimensions.
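
As an illustration of the anisotropic design described above, the following is a minimal PyTorch sketch (not taken from the patent) of a 3D convolution block that uses a kernel size of 3 with stride 1 along Z and a kernel size of 4 with stride 2 along X and Y, with optional pooling restricted to X and Y; the class name, channel counts, normalization, and activation are illustrative assumptions.

```python
# Minimal sketch (assumption, not the patent's exact code): an anisotropic
# 3D convolution block that downsamples only in X and Y while preserving Z.
import torch
import torch.nn as nn

class AnisotropicDownBlock(nn.Module):
    """Conv3d with kernel 3 / stride 1 along Z and kernel 4 / stride 2 along Y, X."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            # PyTorch Conv3d expects (N, C, D, H, W); D corresponds to Z.
            nn.Conv3d(in_ch, out_ch,
                      kernel_size=(3, 4, 4),   # (Z, Y, X)
                      stride=(1, 2, 2),
                      padding=(1, 1, 1)),
            nn.BatchNorm3d(out_ch),
            nn.LeakyReLU(0.2, inplace=True),
        )
        # Optional pooling restricted to the lateral (X, Y) dimensions.
        self.pool = nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2))

    def forward(self, x: torch.Tensor, pool: bool = False) -> torch.Tensor:
        x = self.block(x)
        return self.pool(x) if pool else x

# Example: a 16 x 128 x 128 training patch (Z x Y x X) with one channel.
x = torch.randn(1, 1, 16, 128, 128)
y = AnisotropicDownBlock(1, 32)(x)
print(y.shape)  # torch.Size([1, 32, 16, 64, 64]): Z preserved, X/Y halved
```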

The Auto-Align module has two parts: a misalignment estimation part and an alignment part. The misalignment estimation part takes predicted target type images from the cGAN and ground truth target type images as input, which it concatenates and passes into a 3-layer convolution network that is followed by a fully connected layer. The outputs are three scalar values: misalignment offsets along the z-axis (axial direction), y-axis, and x-axis (lateral directions).

The alignment part shifts the target type image predicted by the cGAN by the estimated offsets using the differentiable image sampling introduced by the Spatial Transformer Network. The value at coordinate (n, m, l) of the shifted version V of the cGAN-predicted target type image U can be calculated as

V_{n,m,l} = \sum_{i=1}^{N} \sum_{j=1}^{M} \sum_{r=1}^{L} U_{i,j,r} \cdot k(n + o_x - i) \cdot k(m + o_y - j) \cdot k(l + o_z - r),

where (n, m, l) is the coordinate in the cGAN-predicted target type image U; (o_x, o_y, o_z) are the estimated misalignment offsets along the x, y, and z directions; k(d) = max(0, 1 - |d|) is the bilinear interpolation kernel; and N, M, L are the dimensions of the cGAN-predicted target type image along x, y, and z. The shift is implemented by bilinear resampling.
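
The following is a hedged sketch, in PyTorch, of how an Auto-Align-style module could be implemented: a small network estimates the (z, y, x) offsets from the concatenated predicted and ground truth volumes, and the prediction is shifted by those offsets with differentiable resampling in the spirit of the Spatial Transformer Network. The layer sizes, the use of grid_sample, and the sign convention of the shift are assumptions for illustration, not the patent's exact implementation.

```python
# Illustrative sketch of a misalignment estimator plus differentiable shift.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MisalignmentEstimator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(  # 3-layer convolution network
            nn.Conv3d(2, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Linear(64, 3)  # offsets (oz, oy, ox) in pixels

    def forward(self, predicted, target):
        x = torch.cat([predicted, target], dim=1)  # (N, 2, D, H, W)
        return self.fc(self.features(x).flatten(1))

def shift_volume(volume, offsets):
    """Shift a (N, C, D, H, W) volume by per-sample (oz, oy, ox) pixel offsets
    using differentiable trilinear resampling."""
    n, _, d, h, w = volume.shape
    oz, oy, ox = offsets[:, 0], offsets[:, 1], offsets[:, 2]
    # Translation-only affine; grid_sample uses normalized coords in [-1, 1],
    # ordered (x, y, z) in the last grid dimension.
    theta = torch.zeros(n, 3, 4, device=volume.device, dtype=volume.dtype)
    theta[:, 0, 0] = theta[:, 1, 1] = theta[:, 2, 2] = 1.0
    theta[:, 0, 3] = 2.0 * ox / (w - 1)   # sign convention is illustrative
    theta[:, 1, 3] = 2.0 * oy / (h - 1)
    theta[:, 2, 3] = 2.0 * oz / (d - 1)
    grid = F.affine_grid(theta, volume.shape, align_corners=True)
    return F.grid_sample(volume, grid, mode='bilinear',
                         padding_mode='border', align_corners=True)

# Usage: estimate offsets, shift the prediction, and compute a pixel-wise loss
# against the ground truth while the cGAN weights are held fixed.
pred, gt = torch.randn(1, 1, 16, 128, 128), torch.randn(1, 1, 16, 128, 128)
offsets = MisalignmentEstimator()(pred, gt)
aligned = shift_volume(pred, offsets)
loss = F.l1_loss(aligned, gt)
```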

The Auto-Align module optimizes the pixel-wise loss between the shifted version of the cGAN-predicted target type image and the ground truth target type image. When training the Auto-Align module, the facility fixes the GAN parameters. The pixel-wise loss tends to attain its minimum only when the shifted version of the cGAN-predicted target type image and the ground truth target type image are optimally aligned. When this minimum is attained, the facility estimates the misalignment offset between the ground truth target type image and the input source type image from the shift estimated by the Auto-Align module.

In some embodiments, the facility trains the cGAN and Auto-Align networks in three main training stages: first, the cGAN is pre-trained on training pairs (which may not be fully aligned); second, the misalignment is estimated; and finally, the better-aligned image pairs are used to train the model. After the cGAN is pre-trained in the first stage, the cGAN and Auto-Align module are trained alternately, each for one epoch at a time. After each Auto-Align module training epoch, the facility obtains the misalignment values for each image. The mean misalignment across all the images is subtracted from the misalignment of each image, in order to offset the universal misalignment introduced by the cGAN. The misalignment values at different training epochs will have a similar mean value. In the final stage, for every image, the facility uses the average misalignment value across epochs to align the image pair. The facility then trains the cGAN with the updated image pairs, which are now much better aligned, leading to an improved prediction result.

In training, in some embodiments the facility uses the Adam optimizer with a learning rate of 0.0002 when using U-Net as the backbone, or a learning rate of 0.00002 when using EDSR as the backbone. The facility sets the batch size to 1 for training. In each epoch, the facility randomly crops 100 patches from each training pair, making the training input patch size 16 x 128 x 128 voxels, while the training target patch size can be calculated accordingly based on the backbone network. The facility can often obtain reasonable results after 20 training epochs.
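
As a concrete illustration of this training configuration, the sketch below sets up an Adam optimizer with the backbone-dependent learning rate and crops random 16 x 128 x 128 source patches with corresponding target patches; the function names and the `scale` parameter (the target-to-source patch size ratio) are assumptions for illustration, not the facility's exact code.

```python
# Illustrative sketch of the training configuration described above.
import numpy as np
import torch

def random_paired_crop(source, target, patch=(16, 128, 128), scale=(1, 1, 1)):
    """Crop a random source patch and the corresponding target patch.

    `scale` is the per-axis size ratio between target and source (e.g. the
    target patch is larger for a magnification transfer); exact values depend
    on the backbone and data, so they are parameters here."""
    z = np.random.randint(0, source.shape[0] - patch[0] + 1)
    y = np.random.randint(0, source.shape[1] - patch[1] + 1)
    x = np.random.randint(0, source.shape[2] - patch[2] + 1)
    src = source[z:z + patch[0], y:y + patch[1], x:x + patch[2]]
    tz, ty, tx = (int(round(c * s)) for c, s in zip((z, y, x), scale))
    pz, py, px = (int(round(p * s)) for p, s in zip(patch, scale))
    tgt = target[tz:tz + pz, ty:ty + py, tx:tx + px]
    return src, tgt

def make_optimizer(model, backbone="unet"):
    # Backbone-dependent learning rates from the text: U-Net vs. EDSR.
    lr = 2e-4 if backbone == "unet" else 2e-5
    return torch.optim.Adam(model.parameters(), lr=lr)
```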

The inventors performed an initial evaluation of the effectiveness of the Auto-Align module using a semi-simulated dataset for magnification transfer. When image pairs are not inherently aligned, the computational alignment step cannot theoretically generate fully aligned images. For example, when the two types of images have different axial (Z) dimension sizes, the two images may not be captured in identical spatial locations. Also, because optical microscopy images generally have a lower axial (z) than lateral (x, y) resolution, the computational alignment along the z direction could suffer more inaccuracy than along x and y. For this reason, the inventors focused on simulating the z-misalignment issue in this evaluation. To do this, the inventors imaged hiPS cells expressing mEGFP-Lamin B1, which labels the nuclear envelope, on a ZEISS spinning-disk confocal microscope with a 100x, 1.25NA water immersion objective (voxel size = 0.108 x 0.108 x 0.290 µm). More details about the cells and data can be found in the Allen Cell Image Data Collection (available at www.allencell.org). Then, the inventors computationally down-sampled the 100x images to the voxel size of a 20x image (0.271 x 0.271 x 0.530 µm). To simulate the z-misalignment, the inventors injected a shift of a few pixels (a random number between -4 and +4) along the z axis into the down-sampled 20x images. To evaluate the results, the inventors held five images out from training. The inventors manually selected 10 patches of about 20x20 pixels within ten different nuclei in each of the five hold-out images. The inventors then estimated the nuclear heights based on the intensities within these 50 patches along z. The differences in the estimated nuclear heights between ground truth and predicted images (also comparing the two different backbone networks in the cGAN) are reported together with the standard image quality metrics, peak signal to noise ratio (PSNR) and structural similarity index measure (SSIM), as shown in Table 1 below. The Auto-Align module consistently yielded a considerable improvement, regardless of the backbone used in the cGAN.
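
A hedged sketch of how such a semi-simulated pair could be generated is shown below: the 100x volume is resampled to the 20x voxel size and a random integer z-shift in [-4, +4] pixels is injected; the function name, interpolation order, and boundary handling are illustrative assumptions rather than the inventors' exact procedure.

```python
# Illustrative sketch of the semi-simulated data generation described above.
import numpy as np
from scipy.ndimage import zoom, shift

def simulate_20x_with_z_misalignment(img_100x, rng=np.random.default_rng()):
    # Voxel sizes in micrometers, ordered (z, y, x).
    vox_100x = (0.290, 0.108, 0.108)
    vox_20x = (0.530, 0.271, 0.271)
    factors = [a / b for a, b in zip(vox_100x, vox_20x)]  # < 1 => downsample
    img_20x = zoom(img_100x, factors, order=1)
    dz = int(rng.integers(-4, 5))  # random shift of a few pixels along z
    misaligned = shift(img_20x, (dz, 0, 0), order=1, mode='nearest')
    return misaligned, dz
```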

Table 1: Quantitative evaluation of the Auto-Align module with two different backbones in the cGAN, on data with and without z-misalignment. Correct Prediction is the number of nuclei for which the nuclear height estimated from the predicted image is within (-0.5, 0.5) pixels of the nuclear height estimated from the ground truth.

The inventors also performed further evaluation of the effectiveness of the Auto-Align module, described below in the Application 1 section.

Figure 3 is a data flow diagram illustrating the unpaired workflow used by the facility in some embodiments. First, the facility collects images of two different types to transfer between, but not necessarily of the identical cells or samples. Then, the facility feeds the two sets of images into the deep learning module to learn the transfer function from the source type to the target type. Images of cells 310 are collected from a microscope or other imaging device 311 having source characteristics via a computer or other interface device 312 as source type images 313, and images of cells 320 are collected by a microscope or other imaging device 321 having target characteristics and received through a computer or other interface device 322 as target type images 323. In the deep learning module 330, the facility uses a version of CycleGAN extended to 3D to tackle the unpaired microscopy image transfer problem. Given a set of source type images and a set of target type images, the facility can produce a transfer function capable of transferring images drawn from the source type into new images that fit the "distribution" of the target type. Here, the "distribution" can be interpreted as the set of underlying characteristics of the target type images in general, not specific to any particular image. The generated images will have a similar general appearance to the target type, while still maintaining the biological attributes of the input image.
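
For reference, the cycle-consistency term at the heart of a CycleGAN-style unpaired workflow can be sketched as follows; the generator names, loss weight, and use of an L1 penalty are assumptions for illustration rather than the facility's exact formulation.

```python
# Minimal sketch of the cycle-consistency term used by a CycleGAN-style
# unpaired workflow: G maps source -> target, F maps target -> source, and
# both round trips should reproduce the input.
import torch
import torch.nn.functional as F_nn

def cycle_consistency_loss(G, F, source_batch, target_batch, lam=10.0):
    recon_source = F(G(source_batch))   # source -> fake target -> source
    recon_target = G(F(target_batch))   # target -> fake source -> target
    return lam * (F_nn.l1_loss(recon_source, source_batch) +
                  F_nn.l1_loss(recon_target, target_batch))
```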

It is worth mentioning that CycleGAN is commonly used as a bi-directional transfer. However, in some embodiments, the facility imposes directionality on the unpaired workflow (i.e., source type and target type). The inventors found that the bi-directional transfer did not show similar accuracy between the two transfer directions. In other words, the model was able to transfer the target type back to the source type, but did so with greatly decreased performance compared to the transfer from source type to target type. In some embodiments, the capability of transferring from target type images back to source type images as part of the cycle consistency is utilized as an effective means to build a computational transfer function from source to target without the need for paired training images.

Some advantages provided by the facility in some embodiments:

• To the best of the inventors’ knowledge, the facility’s Deep Transfer Function Suite is the first deep learning toolbox with a wide spectrum of applicability to different microscopy image transfer problems, including creating composites of multiple transfer functions to achieve much higher transfer power.

• In the paired workflow, when image pairs are not inherently aligned, the inevitable misalignment (even if sometimes “only” by a few pixels) is a critical challenge faced by all existing methods, especially if one purpose of the application is quantitative interpretation based on the predicted images. In many conventional approaches, the model is either trained on simulated/semi-simulated fully aligned images or simply trained on unaligned real images. The inventors found that subtle misalignment, especially along z, always caused significant degradation in the accuracy of the predicted images. The Auto-Align module is a novel solution to explicitly tackle this problem.

• Much existing work was built for 2D microscopy images. The inventors made a non-trivial extension from 2D cGAN or CycleGAN to 3D. Key difficulties include, for example, anisotropic dimensions and handling large 3D images (especially when dealing with super-resolution images) within restricted GPU memory.

A number of illustrative applications in which the facility is applied to transform 3D microscopy images are discussed below. Figures 3-10 are image diagrams showing results of applying the facility to various microscopy samples.

Application 1: Microscopy Objective Transfer for Lamin B1 Images

Different microscopy objectives could be useful in different biological applications. The inventors demonstrate an application of building a computational transfer function from 3D fluorescent images acquired with a 20x, 0.8NA objective (Air/Dry, pixel sizes = 0.271 x 0.271 µm, z-step = 0.530 µm) to 3D fluorescent images acquired with a 100x, 1.25NA objective (Water Immersion, pixel sizes = 0.108 x 0.108 µm, z-step = 0.290 µm) using a paired workflow. This includes a transfer in both magnification (voxel size) and resolution (different NAs).

Lamin B1 decorates the nuclear lamina located just inside the nuclear envelope, which appears as a shell with almost no texture (during interphase). Lamin B1 at the nuclear envelope thus has a very simple topological structure and can be used as a good approximation of the nuclear shape, so it is relatively easy to conduct a biology-driven validation via nuclear shape analysis.

To collect the data, the inventors first fixed cells and imaged the identical cells twice with two different objectives on a ZEISS spinning-disk confocal microscope. The 20x images (source type) have a much larger FOV, but less resolution compared to the 100x images (target type). The two sets of original images were then computationally aligned. From the same set of experimental data, the inventors conducted four different experiments:

v1. Training with roughly aligned pairs and with basic cGAN

v2. Training with roughly aligned pairs and with basic cGAN + Auto-Align module

v3. Training with more accurately aligned pairs and with basic cGAN

v4. Training with more accurately aligned pairs and with basic cGAN + Auto-Align module

The roughly aligned pairs used in experiments v1 and v2 were obtained by segmenting both the 20x and 100x images and finding the maximum overlapping position. The more accurately aligned pairs used in experiments v3 and v4 were obtained by the 3D registration workflow. Because the 20x images are of much lower resolution than the 100x images, the segmented nuclear shapes from the 20x images are much less accurate. As a result, overlaying the two segmentations creates only a rough alignment. The goal of including a roughly aligned version is two-fold: to demonstrate the importance of accurately aligned training pairs (by comparing experiments v1 and v3); and to show how much the Auto-Align module improves prediction accuracy when accurately aligned pairs are not available (by comparing experiments v1 and v2).

Figure 4 is an image diagram showing, in an x, y view, Lamin B1 transferred to a higher magnification by the facility using the paired workflow. Image 410 shows the subject Lamin B1 nuclei imaged in accordance with a source type, in a 20x magnification image captured using a first objective on a ZEISS spinning-disk confocal microscope. The plane shown is the plane at which the xy-cross-sectional area of the nuclei in the target 100x image is maximal. The real 20x image is upsampled via bilinear interpolation to match the resolution of the target and prediction images. The scale bars are 10 µm. Image 420 of the target type is captured of the same nuclei, at a 100x magnification using a second objective on a ZEISS spinning-disk confocal microscope. Image 430 shows the facility's transformation of the source image 410 into the target type, i.e., magnification 100x, from the source type, magnification 20x. A portion 411 of image 410 isolates a particular nucleus in the source image; corresponding portions 421 and 431 of images 420 and 430, respectively, isolate the same nucleus in those images.

Figure 5 is an image diagram showing the isolated nuclei from the images in Figure 4 at higher magnification. In particular: image 511 shows the isolated nucleus in the source image in x,y view; image 512 shows the isolated nucleus in the source image in side view; image 521 shows the isolated nucleus in the target image in x,y view; image 522 shows the isolated nucleus in the target image in side view; image 531 shows the isolated nucleus in the prediction image in x,y view; and image 532 shows the isolated nucleus in the prediction image in side view. In particular, images 511, 521, and 531 are xy-cross-sections from the z-plane at which the xy-cross-sectional area of the nuclei is maximal; images 512, 522, and 532 are xz-cross-sections from the y-plane at which the xz-cross-sectional area of the nucleus in the target 100x image is maximal. The real 20x image is upsampled via bilinear interpolation to match the resolution of the target and prediction images. Scale bar is 5 µm. By comparing image 531 to image 521, and comparing image 532 to image 522, it can be seen that the predicted image improves on the resolution of the Lamin B1 shell relative to the actual target image, particularly in the z dimension.

Figure 6 is an image diagram showing results of the inventors' four different experiments. Image 601 is the x,y view of the isolated nucleus in the actual target image, while image 602 is a side view of the isolated nucleus in the actual target image. Thus, images 601 and 602 are the same as images 521 and 522, respectively. Image 611 is the x,y view and image 612 is the side view of the same nucleus as predicted by the facility in experiment v1. Image 621 is the x,y view and image 622 is the side view of the same nucleus as predicted by the facility in experiment v2. Image 631 is the x,y view and image 632 is the side view of the same nucleus as predicted by the facility in experiment v3. Image 641 is the x,y view and image 642 is the side view of the same nucleus as predicted by the facility in experiment v4.

Figure 7 is an image diagram showing further results of the four experiments. In particular, image 711 is the x,y view and 712 is the side view of a comparison of the green actual target image with the magenta predicted image in experiment v1. Image 721 is the x,y view and 722 is the side view of a comparison of the green actual target image with the magenta predicted image in experiment v2. Image 731 is the x,y view and 732 is the side view of a comparison of the green actual target image with the magenta predicted image in experiment v3. Image 741 is the x,y view and 742 is the side view of a comparison of the green actual target image with the magenta predicted image in experiment v4.

These images show that the differences between the predictions from the four different versions are subtle, and all exhibit high visual fidelity compared to the real image. However, visually realistic does not necessarily mean biologically realistic, and biological validity is furthermore application-specific. Quantitatively, it can be helpful to validate the fidelity of the prediction by analyzing its impact in downstream analysis. For example, in other work the inventors have applied a spherical harmonics-based parameterization to quantify nuclear shape and found that the first five spherical harmonic coefficients are useful shape descriptors for downstream applications. These features are an example of a specific application in which the inventors would use a Lamin B1 20x to 100x transfer function. Thus the accuracy with which these features can be derived from a predicted 100x image compared to a real 100x image is an example of application-specific validation that can ensure that the facility is used in a way that is helpful.

The inventors used a semi-automatic seeded watershed algorithm to segment Lamin B1 from both predicted images and real images. The inventors extracted the nuclear outline from these segmentations and used spherical harmonics parametrization to quantify the nuclear shape in both predicted images and real images. The inventors calculated the spherical harmonic shape features of the 182 nuclei from 15 fields of view, from both real images and four different versions of predictions.

Figure 8 is a chart diagram showing the results of application-specific validation for Lamin B1 magnification transfer in the four experiments. Chart 810 is a scatter plot of the quantifications of the spherical harmonic shape features of the 182 nuclei from 15 fields of view for experiment v1. The feature quantified is the L0M0 coefficient of the predicted nuclei (y-axis) and the real, target nuclei (x-axis). The L0M0 coefficient is compared to nuclear volume.

The coefficient of determination (R2) and percent bias are labeled in the figure. Charts 820, 830, and 840 show the same information for experiments v2, v3, and v4, respectively.

Figure 9 is a chart diagram showing additional results of application-specific validation for Lamin B1 magnification transfer in the four experiments. These charts compare the coefficient of determination and percent bias of the first five spherical harmonic coefficients when comparing the four predictions against real images. In particular, charts 911, 921, 931, 941, and 951 are bar graphs that show the accuracy with which the different experimental transfer functions predict the quantified spherical harmonic shape features for the first five spherical harmonic coefficients as measured by coefficient of determination. Charts 912, 922, 932, 942, and 952 show percent bias for the four experiments for the same five spherical harmonic coefficients. Error bars in these charts represent the 90% confidence interval determined by 200 iterations of bootstrap resampling of the statistic. The results show that (1) more accurate alignment yields more accurate prediction (comparing version 1 and version 3), (2) the Auto-Align module in the model can effectively reduce the impact of misalignment in training pairs (comparing version 1 and version 2), and (3) when training pairs can be more accurately aligned, the Auto-Align module is not necessarily useful (comparing version 3 and version 4).
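
The validation statistics reported here (coefficient of determination, percent bias, and a bootstrap confidence interval from 200 resamples) can be computed along the lines of the following sketch; the exact definitions and libraries used by the inventors are not given, so the conventions below are assumptions for illustration.

```python
# Illustrative sketch of R^2, percent bias, and a 90% bootstrap confidence
# interval over 200 resamples, comparing a per-nucleus shape feature measured
# on predicted vs. real images.
import numpy as np
from sklearn.metrics import r2_score

def percent_bias(real, pred):
    return 100.0 * (np.mean(pred) - np.mean(real)) / np.mean(real)

def bootstrap_ci(real, pred, stat, n_boot=200, alpha=0.10, seed=0):
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(real), len(real))  # resample with replacement
        vals.append(stat(real[idx], pred[idx]))
    return np.quantile(vals, [alpha / 2, 1 - alpha / 2])

# Usage with numpy arrays of per-nucleus feature values (e.g., a spherical
# harmonic coefficient):
# r2 = r2_score(real_vals, pred_vals)
# lo, hi = bootstrap_ci(real_vals, pred_vals, lambda r, p: r2_score(r, p))
```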

In the rest of this study, experiments follow the setup of version 3, where more accurately aligned pairs are used to train the basic cGAN model without the Auto-Align module. It is worth mentioning that the image registration algorithm cannot guarantee that every pair of images can be accurately aligned. The reason is that optimal registration based on images may not mean best alignment of the biological objects in the image, due to the difference in resolution, noise, etc. The inventors manually curated the registration results to remove poorly aligned image pairs. The percentage of removed images due to poor alignment varies between individual datasets but is usually less than 20%. The Auto-Align module is still recommended when the registration step cannot yield enough accurately aligned image pairs.

Application 2: Microscopy objective transfer functions for additional intracellular structures of various complexity

Beyond the structures with simple shell-like shapes, the next step is to evaluate the performance of the facility's microscopy objective transfer functions on intracellular structures with higher "complexity" in shape. So, the inventors extended their experiments to four additional cell lines: mEGFP-tagged fibrillarin, mEGFP-tagged nucleophosmin, mEGFP-tagged histone H2B type 1-J, and mEGFP-tagged SMC protein 1A, which represent two different types of "complexity". Fibrillarin and nucleophosmin mark the dense fibrillar component and granular component of the nucleolus, respectively. Morphologically, they present slightly more complexity than the simple shell morphology of Lamin B1. On the other hand, histone H2B type 1-J and SMC protein 1A are two nuclear structures, which mark histones and chromatin, respectively. Visually, they are within the nucleus and display very different textures, different from both the Lamin B1 shells and the nucleolar structure morphologies. Histone H2B type 1-J and SMC protein 1A also provide another means for approximating the nuclear shape. Texture-wise, histone H2B type 1-J features a more complex and uneven texture throughout the nucleus, while SMC protein 1A exhibits a smoother texture with puncta. As in the four-part experiment discussed above in connection with Figures 5-9, the inventors performed an experiment with these four additional intracellular structures in which their images were transformed from 20x magnification to 100x magnification using the paired workflow.

Figure 10 is an image diagram showing predictions made for the additional intracellular structures by the facility. For mEGFP-fibrillarin, image 1011 shows a source image at 20x magnification, image 1021 shows an actual target image at 100x magnification, and image 1031 shows the facility's prediction at 100x magnification based upon the source image. Images 1012, 1022, and 1032 show the same contents for mTagRFP-T-nucleophosmin. Images 1013, 1023, and 1033 show the same contents for mEGFP-H2B.

Images 1014, 1024, and 1034 show the same contents for mEGFP-SMC1A.

Figure 11 is an image diagram showing additional results of the facility's predictions with respect to the additional intracellular structures. For mEGFP-fibrillarin, image 1111 shows a single z-plane image (above) and a single y-plane image (below) of the structures in an individual nucleus that are boxed in image 1011, the source image for this structure. The top image is a single z-plane (at the maximal xy-cross-sectional area of the structure segmentation) and the bottom image is the y-plane (at the maximal xz-cross-sectional area of the structure). The yellow line in the top image represents the plane of the bottom image and vice versa. The scale bar is 5 µm. Image 1121 has the same contents for this structure's target image, and image 1131 for its prediction image. Images 1112, 1122, and 1132 show the same contents for the mTagRFP-T-nucleophosmin structure. Images 1113, 1123, and 1133 show the same contents for the mEGFP-H2B structure. Images 1114, 1124, and 1134 show the same contents for the mEGFP-SMC1A structure.

Figure 12 is an image diagram comparing prediction results for the additional intracellular structures to the actual target images. For mEGFP-fibrillarin, image 1210 shows the predicted image for the isolated nucleus in magenta, compared to the actual target image in green (in other words, a comparison of image 1131 to image 1121). Images 1220, 1230, and 1240 have the same contents for mTagRFP-T-nucleophosmin, mEGFP-H2B, and mEGFP-SMC1A, respectively. In the results, one can observe that the 100x images predicted from 20x images show significant visual similarity to the real 100x images.

Figure 13 is a chart diagram showing the comparison of predicted images to real images for the additional intracellular structures. Chart 1310 shows the Pearson correlation metric for the additional intracellular structures. Chart 1320 similarly shows the peak signal to noise ratio (PSNR) metric, and chart 1330 the structural similarity index measure (SSIM). The legend 1340 maps the colors used in these charts to the different structure types. These metrics show strong correlation and high similarity between predicted and actual target images for these structures.

The inventors performed quantitative application-specific validation for each of these structures. The evaluation of the predicted fibrillarin and nucleophosmin from 20x to 100x is based on segmentation-based quantification, such as the total volume, the number of pieces, and the mean volume per piece. Both fibrillarin and nucleophosmin are key structures in the nucleolus, which is the nuclear subcompartment where ribosome biogenesis occurs. Specifically, the inventors segment fibrillarin and nucleophosmin from both the real and predicted 100x images using the classic segmentation workflow in the Allen Cell Structure Segmenter. Both fibrillarin and nucleophosmin are segmented at two different granularities: the coarse segmentation captures the overall shape of the nucleolus, while the fine segmentation discussed below delineates finer details about the structure visible in the image.
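
A minimal sketch of this kind of segmentation-based quantification, assuming a binary 3D segmentation mask and reporting voxel counts, is shown below; the connectivity choice and function names are illustrative assumptions rather than the Segmenter's exact workflow.

```python
# Illustrative sketch: given a binary 3D segmentation, report total volume,
# number of connected pieces, and mean volume per piece (in voxels; multiply
# by the voxel volume for physical units).
import numpy as np
from skimage.measure import label

def segmentation_stats(mask_3d: np.ndarray) -> dict:
    labeled, n_pieces = label(mask_3d > 0, return_num=True, connectivity=3)
    total_voxels = int(np.count_nonzero(labeled))
    mean_voxels_per_piece = total_voxels / n_pieces if n_pieces else 0.0
    return {"total_volume": total_voxels,
            "num_pieces": n_pieces,
            "mean_volume_per_piece": mean_voxels_per_piece}
```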

Figure 14 is an image diagram showing the facility's coarse segmentation of mTagRFP-T-nucleophosmin. Images 1411 and 1412 show the facility's 3D segmentation of this structure based upon the actual target image in yellow, superimposed over the actual target image. Images 1421 and 1422 similarly show the facility's 3D segmentation based upon the predicted image in yellow, superimposed over that predicted image.

Figure 15 is a chart diagram showing a comparison of the volume of nucleophosmin segmented by the facility based upon actual target and predicted images. In particular, chart 1500 is a scatter plot that compares the total volume of nucleophosmin segmented based upon the predicted image to the total volume of nucleophosmin segmented based upon the actual target image. The chart also shows the coefficient of determination (R2) and percent bias. This information confirms that the segmented nucleolus sizes from nucleophosmin in real 100x images and predicted 100x images are very consistent.

Figure 16 is a chart diagram showing further analysis of coarse segmentation of the additional intracellular structures by the facility. Chart 1600 shows the coefficient of determination for coarse segmentation measures.

The statistic computed for total volume is shown for all structures, while for piece counts and volume per piece it is shown only for those structures where more than one piece is segmented per nucleus (fibrillarin and nucleophosmin). Error bars are the 90% confidence interval determined by 200 iterations of bootstrap resampling of the statistic. This information shows a non-negligible discrepancy in the number of pieces in the segmentation and in the volume per piece in the coarse segmentation when comparing results between real and predicted images. This is reasonable in the sense that a single-pixel difference in segmentation may alter the connectivity of the structure and thus yield a different number of pieces. As a result, quantitative analyses based on predicted images need to be carefully designed, as certain measurements are less accurate than others.
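A minimal sketch of the 200-iteration bootstrap used for these error bars, assuming the measurement pairs are resampled together (the exact resampling unit is not stated in the text):

```python
import numpy as np

def bootstrap_ci(target_vals, predicted_vals, statistic, n_boot=200, alpha=0.10, seed=0):
    """90% confidence interval of a paired statistic (e.g., R^2) by bootstrap
    resampling of the measurement pairs."""
    rng = np.random.default_rng(seed)
    y = np.asarray(target_vals, dtype=float)
    yhat = np.asarray(predicted_vals, dtype=float)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))   # resample pairs with replacement
        stats.append(statistic(y[idx], yhat[idx]))
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper
```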

Figure 17 is an image diagram showing the facility’s fine segmentation of fibrillarin. In a manner similar to Figure 14 discussed above, images 1711 and 1712 show the fine segmentation of fibrillarin based upon the actual target image in yellow, superimposed over the actual target image. Images 1721 and 1722 show, in yellow, the fine segmentation of fibrillarin based upon the predicted image, overlaid over the predicted image. The green arrows in these images indicate an area where the target image segments one object, while the predicted image segments two objects.

Figure 18 is a chart diagram showing the facility’s segmentation of fibrillarin based upon predicted images versus actual target images. Chart 1800 is a scatter plot that, for each of a number of fibrillarin images, compares the number of pieces of fibrillarin segmented in the predicted image to the number of pieces of fibrillarin segmented in the actual target image. The chart also shows the coefficient of determination and percent bias.

Figure 19 is a chart diagram showing further analysis of the facility’s fine segmentation. Chart 1900 shows the coefficient of determination for fine segmentation measures for fibrillarin and nucleophosmin. Error bars are the 90% confidence interval of the metric.

From Figures 18 and 19, it can be seen that fine segmentation also shows a non-negligible discrepancy in the number of segmented pieces and the volume per piece.

For histone H2B type 1-J and SMC protein 1A, the inventors evaluated two aspects of the accuracy of transfer function prediction: nuclear shape and texture. The segmentations were obtained by using a deep learning model trained with the iterative deep learning workflow in the inventors’ Segmenter on a separate set of data. The accuracy of the total measured nuclear volume is very high. In addition, texture is an important indicator of the localization patterns of proteins, which ties to their biological function. Therefore, a goal of transfer functions is to maintain texture features with high fidelity. The inventors selected and calculated Haralick texture features including contrast, entropy, and variation. The inventors first looked at correlations between textures computed for different gray levels and pixel windows to determine the best workflow using only real images. The inventors then applied that workflow and compared texture features between the real and the predicted images.

Figure 20 is an image diagram showing the derivation of information from SMC1A images for assessing texture metrics. Images 2011, 2012, and 2013 are the source, target, and predicted images for SMC1A. Shown is a single z-plane at which the xy-cross-sectional area is maximum. Below are the gray-level co-occurrence matrices (GLCMs) 2021-2023 for the corresponding images used for the texture computations. The GLCMs were computed with a symmetric pixel offset of 1 in all directions, using 4-bit (16 gray level) normalized images.

Figure 21 is a chart diagram showing texture contrast correlation between predicted and target images for SMC1A. Chart 2100 is a scatter plot of the computed Haralick contrast of SMC1A fluorescence from individual nuclei. The coefficient of determination (R²) and percent bias are labeled in the figure.
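A sketch of how such GLCM-based texture features could be computed for a single z-plane with scikit-image, following the quantization and offset described above; the inventors’ exact feature code is not reproduced here, and the entropy definition below is one common convention.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_texture(zplane: np.ndarray, levels: int = 16) -> dict:
    """Haralick-style contrast and entropy from a GLCM computed with a
    symmetric pixel offset of 1, on a 4-bit (16 gray level) normalized image."""
    img = zplane.astype(np.float64)
    img = (img - img.min()) / max(img.max() - img.min(), 1e-12)   # normalize to [0, 1]
    quantized = np.clip((img * levels).astype(np.uint8), 0, levels - 1)

    glcm = graycomatrix(quantized, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    contrast = graycoprops(glcm, "contrast").mean()     # average over the four directions
    p = glcm.mean(axis=(2, 3))                          # direction-averaged GLCM
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return {"contrast": float(contrast), "entropy": float(entropy)}
```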

Figure 22 is a chart diagram showing image texture measures for SMC1A and H2B. Chart 2200 shows the coefficient of determination for image texture measures for SMC1A and H2B. Error bars are the 90% confidence interval of the metric.

The inventors found that some texture features are predicted more accurately than others, and that overall H2B texture was predicted more accurately than the finer SMC1A texture.

Application 3: Cross-Modality Resolution Transfer

Different microscopy modalities may have certain advantages and disadvantages, and may each require different instances of a particular segmentation or quantification method, creating technical challenges in comparing image data taken on different microscope modalities. To take advantage of different modalities, both for their particular specialized applications and for streamlining image processing and analysis, it would be beneficial to build a cross-modality transfer function. The inventors present an application of the paired workflow to build a transfer function from images acquired on a spinning-disk microscope to images with ~1.7-fold enhanced resolution acquired on a super-resolution microscope. To collect the training data, the inventors fixed the cells and then collected 3D images of the identical cells using two different imaging modalities. The inventors first imaged the cells on a ZEISS spinning-disk confocal microscope (100x, 1.25 NA water-immersion objective; detector: CMOS camera; pixel sizes = 0.108 x 0.108 µm; z-step = 0.290 µm). The second modality is a separate microscope, a ZEISS LSM 880 with AiryScan FAST (40x, 1.2 NA water-immersion objective; detector: AiryScan; pixel sizes = 0.05 x 0.05 µm; z-step = 0.222 µm). This second modality achieves sub-diffraction-limit resolution that is 1.7x better (both laterally and axially) than regular fluorescence microscopy on an identical system by utilizing a built-in post-processing algorithm. The inventors present results of these transfer function experiments on hiPS cells expressing mEGFP-Lamin B1 and hiPS cells expressing mEGFP-H2B, confirming the feasibility of this approach and the overall validity of the prediction results.

Figure 23 is an image diagram showing sample results of a cross-modality transfer function on Lamin B1 images. Image 2311 is a source image from 100x spinning-disk microscopy. Image 2312 is a target image from enhanced-resolution AiryScan FAST microscopy. Image 2313 is a prediction by the facility using a UNet backbone, and image 2314 is a predicted image produced by the facility using an EDSR backbone. Charts 2321-2324 show an intensity profile along the yellow lines marked in the images above. The predicted images have much less blur compared to the input image. Signal-to-noise ratios (SNRs), with the ground truth super-resolution image as the reference, are also shown. Image 2311 is upsampled to a scale comparable to the super-resolution image for visualization purposes. All images are presented at a comparable intensity scale.

Figure 24 shows sample results of a cross-modality transfer function on H2B images. Image 2410 is a source image of H2B from 100x spinning-disk microscopy. Portion 2411 of image 2410 is shown at greater magnification as image 2411'. Image 2420 is a target image from ZEISS LSM 880 AiryScan super-resolution microscopy. Portion 2421 of image 2420, corresponding to portion 2411, is shown at greater magnification as image 2412'. Image 2430 is predicted by the facility using a UNet backbone. Portion 2431 of image 2430, corresponding to portions 2411 and 2421, is shown at greater magnification as image 2413'. Image 2410 is upsampled to a scale comparable to the super-resolution image for visualization purposes. All images are presented at a comparable intensity scale.

Application 4: SNR Transfer

For fluorescence microscopy images, using increased laser power can generate higher signal-to-noise ratios (SNRs), but at the cost of higher phototoxicity and photobleaching. Thus, for certain experiments (e.g., long time-lapse experiments), the inventors prefer to set the laser power as low as possible, which inevitably produces low-SNR images. In these scenarios, the capability of transferring images from low SNR to high SNR would be beneficial for many applications, such as obtaining more accurate quantitative measurements of intracellular structures. Here, the inventors demonstrate applying the paired workflow to transfer low-SNR images to higher-SNR images.

The inventors fixed hiPS cells expressing mEGFP-Lamin B1 and acquired 3D time-lapse images with a 20x, 0.8 NA objective (air/dry; pixel sizes = 0.271 x 0.271 µm; z-step = 0.530 µm) on a ZEISS spinning-disk confocal microscope. Each FOV was imaged in 3D 60 times consecutively with no time interval. The average of the 60 images of a FOV can be used as the high-SNR ground truth target, while each individual one of the 60 images can be used as the model input. For this problem, the deep learning module the inventors used was a cGAN (with UNet as the backbone) without the auto-align module, because the two samples were already inherently aligned. The trained transfer function model is able to successfully transfer a low-SNR image to a much higher SNR. This specific test in the inventors’ work is meant to demonstrate the general applicability and effectiveness of the facility.
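As an illustration of how such training pairs might be assembled (a sketch; the array layout and names are assumptions), each single acquisition serves as a low-SNR input and the 60-frame average serves as the shared high-SNR target:

```python
import numpy as np

def make_snr_training_pairs(fov_stack: np.ndarray):
    """Given 60 consecutive 3D acquisitions of one FOV with shape (60, Z, Y, X),
    return (low-SNR input, high-SNR target) training pairs."""
    high_snr_target = fov_stack.mean(axis=0)             # average of the 60 acquisitions
    return [(acquisition, high_snr_target) for acquisition in fov_stack]
```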

Figure 25 is an image diagram showing sample results of applying an SNR transfer function. Image 2510 is an image of Lamin B1 acquired in a single acquisition using 20x spinning-disk microscopy. Portion 2511 of image 2510 is shown at greater magnification as image 2511'. Image 2520 is the result of repeating the acquisition that produced image 2510 60 times, and averaging these 60 images for use as the high-SNR ground truth. Portion 2521 of image 2520, corresponding to portion 2511, is shown at greater magnification as image 2512'. Image 2530 is predicted from source image 2510 using a transfer function trained with image 2520, in which the SNR is greatly boosted relative to the input image. Portion 2531 of image 2530, corresponding to portions 2511 and 2521, is shown at greater magnification as image 2513'. These images are all presented at comparable intensity scales.

Application 5: Binary to Realistic Microscopy Image Transfer

Generating a realistic microscopy image from a binary mask has found many applications in image analysis for quantitative cell biology, e.g., validating the absolute accuracy of a segmentation algorithm or augmenting the data for training a machine learning-based segmentation model. In some embodiments, absolute accuracy validation of a segmentation algorithm was performed in accordance with Viana MP, Lim S, Rafelski SM, Quantifying mitochondrial content in living cells, Methods in Cell Biology 2015 Jan 1 (Vol. 125, pp. 77-93), Academic Press, which is hereby incorporated by reference in its entirety. In some embodiments, data augmentation for training a machine learning-based segmentation model was performed in accordance with Bailo O, Ham D, Min Shin Y., Red blood cell image generation for data augmentation using Conditional Generative Adversarial Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2019, which is hereby incorporated by reference in its entirety.

To demonstrate the facility's transfer of binary images into realistic-looking predicted microscopy images, the inventors take 3D microscopy images of mitochondria and their segmentations (hiPS cells expressing mEGFP-tagged Tom20, which marks the mitochondrial outer membrane). More details about the data can be found in the Allen Cell Image Data Collection (available at https://www.allencell.org/). The segmentation images (source type) and their corresponding microscopy images (target type) are passed into the paired workflow to train the transfer function. The inventors observe that the predicted images look so realistic that they are practically indistinguishable from the true images, demonstrating one more type of application of the inventors’ methodology and its general usefulness. Figure 26 is an image diagram showing sample results of a binary-to-realistic microscopy image transfer function. Image 2610 is a binary segmentation mask representing the overall shape of mitochondria, in which white pixels correspond to the mitochondria. Portion 2611 of image 2610 is shown at greater magnification as image 2611'. Image 2620 is a 100x spinning-disk image of mitochondria corresponding to the binary mask in image 2610.

For example, the binary mask may have been produced by segmenting the microscopy image 2620. Portion 2621 of image 2620, corresponding to portion 2611, is shown at greater magnification as image 2621'. The facility uses image 2610 as the source image, and image 2620 as the target image. A transfer function trained on these images can be applied to transform a binary segmentation mask into a predicted microscopy image, such as image 2630. Portion 2631 of image 2630, corresponding to portions 2611 and 2621, is shown at greater magnification as image 2631'.

Application 6: Microscopy Image to Binary Mask Via Unpaired Workflow

Transferring microscopy images to binary masks is often referred to as semantic image segmentation. Both classic methods and deep learning-based methods have been widely studied for this problem. A common challenge is the scale-up step: a specific method developed and evaluated on a specific set of images may not work well on a larger set, for example due to image-to-image variations. The unpaired workflow can be used to address this issue. Given a set of microscopy images and their segmentations (which may be error-prone but have fair accuracy), the facility treats the microscopy images as the source type and the binary mask images (i.e., the segmentations) as the target type in the unpaired workflow. The unpaired workflow does not directly learn any pixelwise correspondence between the images. Instead, it concentrates on learning the correlation between the intrinsic properties and high-level semantic information of the two types of images. As a result, the large percentage of good segmentations in the training data will drive the model to learn what a “good segmentation” should look like in general and ignore the small percentage of “outliers” (the errors in the current version). After training, the model can transfer the original microscopy images into their binary masks with higher accuracy, demonstrating operation of a sample microscopy-image-to-binary-mask transfer function, that is, a segmentation transfer function.
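For readers unfamiliar with how an unpaired workflow can learn without pixelwise correspondence, the following is a heavily simplified PyTorch sketch of one direction of a CycleGAN-style generator objective. The Conv3d layers stand in for the actual generator and discriminator networks, which are not specified here, and the loss weighting is illustrative rather than the inventors' configuration.

```python
import torch
import torch.nn as nn

# Placeholders standing in for real 3D generator and discriminator networks.
G_src2tgt = nn.Conv3d(1, 1, kernel_size=3, padding=1)   # microscopy -> binary mask domain
G_tgt2src = nn.Conv3d(1, 1, kernel_size=3, padding=1)   # binary mask -> microscopy domain
D_tgt = nn.Conv3d(1, 1, kernel_size=3, padding=1)       # patch discriminator on target domain

l1_loss = nn.L1Loss()
mse_loss = nn.MSELoss()

def generator_loss_one_direction(real_src, lambda_cycle=10.0):
    """Adversarial term plus cycle-consistency term for the source->target generator."""
    fake_tgt = G_src2tgt(real_src)                       # translate source to target domain
    reconstructed_src = G_tgt2src(fake_tgt)              # translate back (cycle)
    d_out = D_tgt(fake_tgt)
    adversarial = mse_loss(d_out, torch.ones_like(d_out))  # try to fool the discriminator
    cycle = l1_loss(reconstructed_src, real_src)            # cycle-consistency penalty
    return adversarial + lambda_cycle * cycle

# Example call on a dummy 3D volume with shape (batch, channel, Z, Y, X).
loss = generator_loss_one_direction(torch.randn(1, 1, 8, 64, 64))
```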

Figure 27 is an image diagram showing an example of transferring microscopy images into binary masks on the same 3D mitochondrial images (from hiPS cells expressing mEGFP-Tom20) as in Application 5, but with the source type and target type swapped. The facility simply takes the microscopy images as the source type and their current segmentations as the target type, and trains a transfer function model using the unpaired workflow via the CycleGAN model. The original segmentations are already of overall very high quality, but do have some minor issues, for example with the robustness of the segmentation over the entire image for a large number of images. For example, there exist a small percentage of false negative errors due to, e.g., uneven illumination throughout the field of view. As shown in Figure 27, after training, the transfer function model is able to retain the overall original accuracy of the segmentation and further reduce these false negative errors that arose due to uneven illumination. In particular, the predicted binary mask 2730 achieves an overall accuracy comparable to the existing segmentation 2720, and additionally reduces the false negative errors in the existing segmentation that were due to uneven illumination, thus improving the overall quality of the segmentation mask. The regions 2712, 2722, and 2732 marked by the yellow boxes highlight an area where the already-good results of the existing segmentation are retained in the prediction. The regions 2711, 2721, and 2731 marked by the green boxes highlight an area where errors in the existing segmentation, indicated by the orange arrows in image 2721, are reduced in the CycleGAN-based prediction.

Application 7: Transfer Function Composition

The applications described above are examples of the wide spectrum of the general applicability of the facility. Another important capability of the facility is to create composites of multiple transfer functions (also described as “daisy-chaining” them) to permit a wider range of transfers. Two earlier applications are used as an example. In Application 1, the inventors trained a transfer function from 20x, 0.8 NA spinning-disk microscopy images to 100x, 1.25 NA spinning-disk microscopy images of hiPSCs expressing mEGFP-H2B. This transfer function can be denoted F1. In Application 3, the inventors trained a transfer function from 100x, 1.25 NA spinning-disk microscopy to a super-resolution microscopy modality with ~1.7x increased resolution. The inventors can call this transfer function F2. F2 and F1 are combined in a composite as follows:

F2(F1(I_20x)) = I_SuperRes

Figure 28 is an image diagram showing the result of a composite transfer function. Image 2810 is a source image of H2B obtained by 20x spinning-disk microscopy. Image 2820 shows an intermediate prediction produced by applying transfer function F1 to the source image, increasing its magnification to 100x. Image 2830 is produced by applying transfer function F2 to intermediate prediction image 2820, increasing its resolution to the super-resolution level.
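Composition itself is straightforward once both models are trained; a minimal sketch follows, where f1 and f2 are placeholders for the trained F1 and F2 models:

```python
def compose_transfer_functions(f1, f2):
    """Daisy-chain two trained transfer functions, e.g., F1: 20x -> 100x and
    F2: 100x -> super-resolution, into a single composite transfer function."""
    return lambda image: f2(f1(image))

# Usage: composite = compose_transfer_functions(F1, F2)
#        predicted_super_res = composite(image_20x)
```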

Comparing the quality of these composite-predicted enhanced-resolution images with the real enhanced-resolution microscopy images in Figure 24 shows that the facility can successfully combine individual transfer function models to extend the range of applications of the inventors’ methodology and add to its general usefulness for transfer function applications.

Application 8: Incorporating Transfer Functions into Different Paths to Predict Fluorescent Images from Brightfield Images

Recently, brightfield images have been widely used in the label-free technique, where fluorescent images of different intracellular structures may be directly predicted from brightfield images in order to see different parts of a cell without any fluorescent marker. The label-free technique is discussed in greater detail in Ounkomol, C., Seshamani, S., Maleckar, M.M. et al., Label-free prediction of three-dimensional fluorescent images from transmitted-light microscopy, Nat Methods 15, 917-920 (2018), available at doi.org/10.1038/s41592-018-0111-2, which is hereby incorporated by reference in its entirety.

Combining transfer functions with the label-free technique permits localizing different intracellular structures directly, without any dyes or labels, even in images of low magnification and resolution, and has the potential to change imaging assays traditionally used in cell biology. The label-free method can be viewed as a special type of “transfer function,” from brightfield to fluorescent images under the same magnification and resolution. Following the same concept as daisy-chaining models from 20x to enhanced-resolution microscopy images via 100x, the facility daisy-chains a transfer function model to a label-free model to achieve a wider range of transfer. Results on Lamin B1 cells demonstrate the potential of integrating transfer function models and label-free models to achieve transfer from 20x brightfield images to 100x fluorescent images.

Figure 29 is a data flow diagram showing multiple approaches available to predict high-magnification fluorescent images from lower-magnification brightfield images. In particular, Figure 29 shows three possible paths from a 20x brightfield image 2901 to a 100x fluorescent image 2940.

First, in Path 1, the facility applies a 20x/100x microscopy objective transfer model 2910 to 20x brightfield images to predict 100x brightfield images 2920, and then applies a 100x label-free model 2930 to predict 100x fluorescent images. In Path 2, the facility first applies a label-free model 2950 to predict 20x fluorescent images from 20x brightfield images, and then applies a 20x/100x microscopy objective transfer function 2970 to the predicted fluorescent images to predict the final 100x fluorescent images. Finally, in Path 3, the facility can also use a direct transfer function 2980 from 20x brightfield images to 100x fluorescent images. The results of all three paths demonstrate the potential for combining label-free models and transfer functions, especially with some further model optimization.

Figure 30 is an image diagram showing actual images captured of a sample. Image 3010 is a brightfield image at 20x captured from the sample. Image 3020 is an x,y view of a fluorescent image captured at 20x, while image 3021 is a side view of this fluorescent image.

Figure 31 is an image diagram showing higher-magnification images of the same sample. Image 3110 is a brightfield image captured at 100x. Image 3120 is an x,y view of a fluorescent image captured at 100x, and image 3121 is a side view of the 100x fluorescent image. Image 3130 is an x,y view of a 100x fluorescent image predicted by the label-free process from the captured 100x brightfield image 3110. Image 3131 is a side view of this predicted 100x fluorescent image.

Figure 32 is an image diagram showing images predicted by the facility in Path 1 shown in Figure 29. Image 3210 is a 100x brightfield image predicted by transfer function 2910 from the captured 20x brightfield image 3010. Images 3220 and 3221 are a 100x fluorescent image predicted by the label-free process 3230 from the predicted 100x brightfield image 3210.

Figure 33 is an image diagram showing images predicted by the facility in Path 2 shown in Figure 29. Images 3310 and 3311 are a 20x fluorescent image predicted by the label-free process from the captured 20x brightfield image 3010. Images 3320 and 3321 are a 100x fluorescent image predicted by transfer function 3370 from the predicted 20x fluorescent images 3310 and 3311.

Figure 34 is an image diagram showing an image predicted by the facility in Path 3 shown in Figure 29. Images 3410 and 3411 are a 100x fluorescent image predicted by transfer function 2980 from the captured 20x brightfield image 3010.

Thus, images 3220 and 3221 shown in Figure 32 are the result produced by the facility in Path 1; images 3320 and 3321 shown in Figure 33 are the result produced by the facility in Path 2; and images 3420 and 3421 shown in Figure 34 are the result produced by the facility in Path 3. By comparing these three predicted result images to the actually-captured 100x fluorescent images 3120 and 3121 shown in Figure 31, it can be seen that all of the predicted results are of reasonable quality, and that the result predicted by Path 3 is of better quality in many respects than the actually-captured 100x fluorescent image.

Computational Techniques

Because the inventors employ a pixel-wise loss in the objective function when training the deep learning model, it is important to have the source type image and the target type image in each training pair well aligned. One of the challenges in registering images across different resolutions and modalities is the pixel- and sub-pixel-level discrepancy that can exist between these images. These discrepancies can be caused by a variety of factors, including inherent differences in the methods of acquisition and differences in the precise depths at which each z-layer of the 3D image stack was taken. One option is to use elastic registration methods to align these images with sub-pixel accuracy. However, the series of interpolations that are performed during these registration processes may lead to loss of important local information and thus create training data that deviates from the true information collected by the microscope. Instead, the inventors employed a three-stage, rigid registration workflow to generate the image pairs for training and validation of the transfer function models. This approach ensures that the original information collected by the microscope is maintained.

The goal of the registration workflow is to crop either the source or the target type image, or both, so that the two images align with one another in 3D when interpolated to the same voxel dimensions. The inventors’ method assumes that either the source or target type image has a wider FOV and contains the entire other image within this FOV. If needed, one of the images can be pre-cropped to ensure it is fully contained in the other. The image with the smaller FOV serves as the “moving image” during registration and the image with the larger FOV serves as the “fixed image,” irrespective of which is the source or target. Several intermediate interpolation results are generated for accurate registration, but the actual cropping is only applied to the output image files to avoid information loss.

First, the moving image (denoted I_mov) undergoes several preprocessing steps so that the preprocessed version of the moving image (denoted I_mov_p) matches the voxel dimensions and orientation of the fixed image (denoted I_fix). For example, if the moving image (I_mov) was imaged with a voxel size of 0.108 µm x 0.108 µm x 0.290 µm and is rotated 90 degrees clockwise relative to a fixed image (I_fix) with a voxel size of 0.049 µm x 0.049 µm x 0.222 µm, then it is upsampled in x, y, and z by factors of 2.204, 2.204, and 1.306, respectively, using linear interpolation, and then rotated 90 degrees counterclockwise to produce I_mov_p.
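A sketch of this preprocessing step under the stated assumptions (voxel sizes taken from the example above, arrays in z, y, x order, scipy for the linear interpolation; not the inventors' released implementation):

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_moving_image(i_mov: np.ndarray,
                            mov_voxel_zyx=(0.290, 0.108, 0.108),
                            fix_voxel_zyx=(0.222, 0.049, 0.049)) -> np.ndarray:
    """Match the moving image's voxel dimensions and orientation to the fixed
    image: linear upsampling by the ratio of voxel sizes, then a 90-degree
    counterclockwise rotation in the xy-plane."""
    factors = tuple(m / f for m, f in zip(mov_voxel_zyx, fix_voxel_zyx))
    i_mov_p = zoom(i_mov, factors, order=1)              # order=1 -> linear interpolation
    i_mov_p = np.rot90(i_mov_p, k=1, axes=(1, 2))        # rotate within each z-slice
    return i_mov_p
```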

Next, the two images (I_mov_p and I_fix) are registered in the x and y dimensions. This is done by matching ORB features in the maximum intensity projections of the images and estimating a Euclidean transformation matrix that aligns the matched features. The elements in the transformation matrix corresponding to translation are used to calculate the offset between the two images in x and y. This offset is then applied to the full 3D moving image (I_mov), with zero-padding used to match the fixed image’s dimensions.
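A sketch of this 2D registration step with scikit-image; parameter values such as the keypoint count and the RANSAC thresholds are assumptions, not values from the text.

```python
import numpy as np
from skimage.feature import ORB, match_descriptors
from skimage.measure import ransac
from skimage.transform import EuclideanTransform

def estimate_xy_offset(i_mov_p: np.ndarray, i_fix: np.ndarray):
    """Estimate the (row, column) translation between the preprocessed moving
    image and the fixed image from ORB matches on their maximum intensity
    projections, using a RANSAC-fitted Euclidean transform."""
    mip_mov = i_mov_p.max(axis=0).astype(np.float64)     # maximum intensity projections
    mip_fix = i_fix.max(axis=0).astype(np.float64)

    orb_mov, orb_fix = ORB(n_keypoints=500), ORB(n_keypoints=500)
    orb_mov.detect_and_extract(mip_mov)
    orb_fix.detect_and_extract(mip_fix)

    matches = match_descriptors(orb_mov.descriptors, orb_fix.descriptors, cross_check=True)
    src = orb_mov.keypoints[matches[:, 0]]
    dst = orb_fix.keypoints[matches[:, 1]]

    model, inliers = ransac((src, dst), EuclideanTransform,
                            min_samples=3, residual_threshold=2, max_trials=1000)
    return model.translation                             # translation elements of the transform
```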

Finally, the images are aligned in the z dimension. The moving image is incrementally shifted in 3D, using stochastic gradient descent, until the Mattes Mutual Information between the images is maximized. The 2D shift calculated in the previous step, together with an initial z-offset of 0, is used as the starting transformation for this optimization. The combined rigid offsets in x, y, and z calculated during registration and the scaling factors are used to calculate the crop amounts, generating an aligned pair of source and target type images.

The entire image registration workflow was implemented in Python. The scikit-image implementations of ORB feature detection, matching, and RANSAC-based transformation estimation were used for the initial 2D registration. Registration in z was performed using the Insight Toolkit (ITK) image registration module, with a maximum of 50 iterations of gradient descent and a 25% random sampling scheme for calculating mutual information. The implementation will be released together with the toolkit. Manual inspection of the overlaid images with Fiji was used to curate the results and ensure the quality of these 3D registrations.
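The z-alignment described above could be expressed along the following lines with SimpleITK (a sketch only; the inventors used the ITK registration module directly, and the learning rate, histogram bins, and image-to-physical-space handling here are assumptions):

```python
import numpy as np
import SimpleITK as sitk

def refine_alignment_in_z(i_fix: np.ndarray, i_mov: np.ndarray, xy_offset):
    """Refine a 3D translation, starting from the 2D offset and a z-offset of 0,
    by maximizing Mattes Mutual Information with gradient descent
    (50 iterations, 25% random sampling)."""
    fixed = sitk.GetImageFromArray(i_fix.astype(np.float32))
    moving = sitk.GetImageFromArray(i_mov.astype(np.float32))

    dy, dx = xy_offset                                     # (row, column) offset from the 2D step
    initial = sitk.TranslationTransform(3, (float(dx), float(dy), 0.0))

    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
    reg.SetMetricSamplingStrategy(reg.RANDOM)
    reg.SetMetricSamplingPercentage(0.25)
    reg.SetOptimizerAsGradientDescent(learningRate=1.0, numberOfIterations=50)
    reg.SetInterpolator(sitk.sitkLinear)
    reg.SetInitialTransform(initial, inPlace=False)

    final_transform = reg.Execute(fixed, moving)
    return final_transform.GetParameters()                 # final (x, y, z) translation
```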

Conclusion

The various embodiments described above can be combined to provide further embodiments. All of the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet are incorporated herein by reference, in their entirety. Aspects of the embodiments can be modified, if necessary to employ concepts of the various patents, applications and publications to provide yet further embodiments. These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.