Report

Ghost cytometry


Science  15 Jun 2018:
Vol. 360, Issue 6394, pp. 1246-1251
DOI: 10.1126/science.aan0096

Seeing ghosts

In fluorescence-activated cell sorting, characteristic target features are labeled with a specific fluorophore, and cells displaying different fluorophores are sorted. Ota et al. describe a technique called ghost cytometry that allows cell sorting based on the morphology of the cytoplasm, labeled with a single-color fluorophore. The motion of cells relative to a patterned optical structure provides spatial information that is compressed into temporal signals, which are sequentially measured by a single-pixel detector. Images can be reconstructed from this spatial and temporal information, but this is computationally costly. Instead, using machine learning, cells are classified directly from the compressed signals, without reconstructing an image. The method was able to separate morphologically similar cell types in an ultrahigh-speed fluorescence imaging–activated cell sorter.

Science, this issue p. 1246

Abstract

Ghost imaging is a technique used to produce an object’s image without using a spatially resolving detector. Here we develop a technique we term “ghost cytometry,” an image-free ultrafast fluorescence “imaging” cytometry based on a single-pixel detector. Spatial information obtained from the motion of cells relative to a static randomly patterned optical structure is compressively converted into signals that arrive sequentially at a single-pixel detector. Combinatorial use of the temporal waveform with the intensity distribution of the random pattern allows us to computationally reconstruct cell morphology. More importantly, we show that applying machine-learning methods directly on the compressed waveforms without image reconstruction enables efficient image-free morphology-based cytometry. Despite a compact and inexpensive instrumentation, image-free ghost cytometry achieves accurate and high-throughput cell classification and selective sorting on the basis of cell morphology without a specific biomarker, both of which have been challenging to accomplish using conventional flow cytometers.

Imaging and analyzing many single cells holds the potential to substantially increase our understanding of heterogeneous systems involved in immunology (1), cancer (2), neuroscience (3), hematology (4), and development (5). Many key applications in these fields require accurate and high-throughput isolation of specific populations of cells according to information contained in the high-content images. This raises several challenges. First, despite recent developments (6–10), simultaneously meeting the needs of high sensitivity, polychromaticity, high shutter speed and high frame rates, continuous acquisition, and low cost remains difficult. Second, ultrafast and continuous image acquisition subsequently requires computational image reconstruction and analysis that is costly in terms of both time and money (11). Given this, fluorescence imaging–activated cell sorting, for instance, has not yet been realized. Here we show that directly applying machine-learning methods to compressed imaging signals measured with a single-pixel detector enables ultrafast, sensitive, and accurate image-free (without image production), morphology-based cell analysis and sorting in real time, which we call “ghost cytometry” (GC).

In GC, as an object passes through a pseudorandom static optical structure, each randomly arranged spot in the structure sequentially excites fluorophores at different locations of the object (Fig. 1). These encoded intensities from each fluorophore are multiplexed and measured compressively and continuously as a single temporal waveform measured with a single-pixel detector (Fig. 1, bottom graphs), which, in this work, was a photomultiplier tube (PMT). Assuming the object is in constant unidirectional motion with velocity v, the signal acquisition is mathematically described as

g(t) = ∬ H(x + vt, y) I(x, y) dx dy    (1)

where g(t) is the multiplexed temporal waveform, H(x, y) is the intensity distribution of the optical structure, and I(x, y) is the intensity distribution of the moving object. Note that H, acting as a spatial encoding operator, is static, so that no scanning or sequential light projection is needed in GC. We designed a binary random pattern for the optical structure as a simple implementation (figs. S1 to S3). In the measurement process of GC, the object is convolved with the optical structure along the x direction, and the resultant signals are integrated along the y direction. In the compressive sensing literature, randomized convolutions are regarded as imaging modalities (12). Given Eq. 1 as a forward model, the image-reconstruction process amounts to solving the inverse problem. This solution can be iteratively estimated by minimizing an objective function that is computed by combinatorial use of the multiplexed temporal waveform, g(t), and the intensity distribution of the optical structure, H. For sparse events in a regularization domain, we can reasonably estimate the moving object from the measured signal, g(t), by adopting a compressed-sensing algorithm, which, in this work, was two-step iterative shrinkage/thresholding (TwIST) (13) (as detailed in the supplementary methods).
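The measurement model of Eq. 1 can be sketched numerically: sliding a toy object across a binary random mask and summing the overlap at each time step yields the single-pixel waveform, which equals a per-row cross-correlation summed over rows. All arrays below are illustrative, not the paper's calibrated H.

```python
import numpy as np

# Toy instance of Eq. 1: H is a static binary random mask, I a small
# fluorescent object; both are illustrative, not the calibrated H of the paper.
rng = np.random.default_rng(0)
H = rng.integers(0, 2, size=(16, 64)).astype(float)   # optical structure H(x, y)
I = np.zeros((16, 24))
I[4:12, 6:18] = rng.random((8, 12))                   # object intensity I(x, y)

def forward_waveform(H, I):
    """As the object translates along x, each time sample is the overlap
    integral of I with the part of H it currently covers, summed over y
    (a single-pixel detector records only the total)."""
    ny, nx_H = H.shape
    nx_I = I.shape[1]
    n_t = nx_H + nx_I - 1                             # full sweep across the mask
    H_pad = np.pad(H, ((0, 0), (nx_I - 1, nx_I - 1))) # let the object enter/exit
    g = np.zeros(n_t)
    for t in range(n_t):
        g[t] = np.sum(H_pad[:, t:t + nx_I] * I)       # multiplexed waveform g(t)
    return g

g = forward_waveform(H, I)
# Equivalent view: a 1D cross-correlation per row, then a sum over rows.
g_check = sum(np.correlate(H[y], I[y], mode="full") for y in range(H.shape[0]))
assert np.allclose(g, g_check)
```

The equivalence to a row-wise correlation is what makes the forward model linear in I, which is the property the reconstruction step exploits.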
This reconstruction process shares its concept with ghost imaging, in which the original image is computationally recovered after sequentially projecting many random optical patterns onto the object and recording the resultant signals with a single-pixel detector (14–19). Although ghost imaging has attracted considerable attention in the scientific community, the sequential projection of light patterns makes it slow and has hampered its practical use. Even when compressive sensing was used to reduce the time required for the light projections, the method was still slower than conventional arrayed-pixel cameras (18). By contrast, GC does not require any movement of equipment, and the speed of image acquisition increases with the object’s motion, up to the high bandwidth of single-pixel detectors. The use of motion thus transforms slow ghost imaging into a practical, ultrafast, and continuous imaging procedure—GC is 10,000 times faster than existing fluorescence ghost imaging (19–21).
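The reconstruction step can be illustrated with plain ISTA (iterative shrinkage/thresholding), a simpler relative of the TwIST algorithm the paper uses, on a 1D toy version of the randomized-convolution model. Sizes, noise level, and the regularization weight are assumed for illustration.

```python
import numpy as np

# 1D toy inverse problem: recover a sparse object from its randomized-
# convolution measurement. The paper uses TwIST; plain ISTA is shown for
# brevity. Sizes, noise level, and the weight `lam` are illustrative.
rng = np.random.default_rng(1)
n, m = 40, 64
x_true = np.zeros(n)
x_true[[7, 19, 30]] = [1.0, 0.6, 0.8]          # sparse fluorescent object

h = rng.integers(0, 2, size=m).astype(float)   # binary random mask
T = m + n - 1                                  # waveform length
A = np.zeros((T, n))                           # sensing matrix: g = A @ x
for j in range(n):
    A[j:j + m, j] = h                          # column j = mask shifted by j

g = A @ x_true + 0.01 * rng.standard_normal(T) # noisy single-pixel waveform

def soft(z, tau):                              # soft-thresholding operator
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
lam = 0.05                                     # sparsity weight (assumed)
x = np.zeros(n)
for _ in range(500):                           # ISTA iterations
    x = soft(x + A.T @ (g - A @ x) / L, lam / L)
# The largest entries of x should sit at indices 7, 19, and 30.
```

TwIST follows the same gradient-plus-shrinkage pattern but uses a two-step (momentum-like) update for faster convergence on ill-conditioned problems.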

Fig. 1 Schematic of the compressive sensing process in GC.

The relative motion of an object across a static, pseudorandom optical structure, H(x, y), is used for compressively mapping the object’s spatial information into a train of temporal signals. F1 and F2 are the representative fluorescent features in the object. According to the object’s motion, the spatial modulation of H is encoded into the temporal modulation of emission intensity from each fluorophore in the object, and their sum g(t) is recorded with a single-pixel detector, as shown in the bottom graph. In the imaging mode of GC, the object’s 2D image can be computationally reconstructed by a combinatorial use of the multiplexed temporal waveform, g(t), and the intensity distribution of the optical structure, H. In the image-free mode of GC, directly applying machine-learning methods to the compressive temporal waveform yields high-throughput, highly accurate, image-free morphology-based cell classification. Schematics are not to scale.

As an experimental proof of concept of GC in the imaging mode, we imaged fluorescent beads mounted on a glass coverslip and moved them across a random pattern using an electronic translational stage (Fig. 2, A and B). The beads were kept in focus as they moved in the direction parallel to the row direction of H. Incorporating the random optical structure in the optical path before or after the sample is mathematically equivalent. This means that GC-based imaging can be experimentally realized by either random structured illumination (SI, shown in Fig. 2A) or random structured detection (SD, shown in Fig. 2B). These configurations correspond to computational ghost imaging (15) and single-pixel compressive imaging (22), respectively. The encoding operator H in Eq. 1 in the SI mode can be experimentally measured as the excitation intensity distribution in the sample plane; the operator H in the SD mode can be measured as the pointwise product of the excitation intensity distribution in the sample plane and the transmissibility distribution of the photomask in the conjugated plane between the sample and detector. We experimentally calibrated the exact operator, H, by placing a thin sheet of a fluorescent polymer in the sample plane and measuring its intensity distribution with a spatially resolving multipixel detector (fig. S1). A blue light-emitting diode (LED) was used as an excitation light source. During the object’s motion, the PMT collected the photons that were emitted from the object as temporally modulated fluorescence intensity (Fig. 2, C and E, for the SI and SD modes, respectively). Figure 2, D and F, shows the computationally recovered fluorescence images of multiple beads for each waveform. For comparison, Fig. 2G shows the image acquired with an arrayed detector-based scientific complementary metal-oxide semiconductor (CMOS) camera. The morphological features of the beads are clear, validating GC imaging in both the SI and SD modes.

Fig. 2 Demonstration of motion-based compressive fluorescence imaging.

(A and B) Optical setups for the SI and SD modes, respectively. We moved aggregates of fluorescent beads, represented as a red sphere, on a glass coverslip (not shown) across the optical structure in the direction of the red arrow using an electronic translational stage. (C) In the SI mode, the beads go through the structured illumination according to the motion, resulting in the generation of the temporal waveform of the fluorescence intensity. (D) From this acquired temporal signal, a 2D fluorescence image is computationally reconstructed. (E) In the SD mode, the beads are illuminated by uniform light. A conjugate fluorescence image of the beads then passes through structured pinhole arrays, resulting in the generation of the temporal waveform. (F) From this temporal signal, a 2D fluorescence image is computationally reconstructed. (G) A fluorescence image of the same aggregated beads, acquired with an arrayed-pixel camera. Scale bars, 20 μm.

The simple design of the GC optics means that adding multiple light sources, dielectric mirrors, and optical filters enables multicolored fluorescence imaging with a single photomask (Fig. 3A). To validate GC for cell imaging, we stained MCF-7 cells (a human breast adenocarcinoma cell line) in three colors: The membranes, nuclei, and cytoplasm were stained with red (EpCAM PE-CF594), blue [4′,6-diamidino-2-phenylindole (DAPI)], and green [fixable green (FG)] dyes, respectively. Stained cells were mounted on a glass coverslip. We used a blue continuous-wave laser and ultraviolet LED light sources for exciting the fluorophores. We adopted the SD mode and experimentally estimated the operator, H, for each excitation light source (fig. S2). In the experiment, we moved the stage on which the coverslip was mounted and measured the temporal signals from each color channel using three PMTs, respectively (Fig. 3, A and B). Figure 3C, (i) to (iii), shows the computationally reconstructed fluorescence images for each color, clearly revealing the fine features of cellular membranes, cytoplasm, and nuclei. For comparison, Fig. 3C, (iv), shows an overlaid color image, and Fig. 3C, (v), shows a fluorescence image acquired with a conventional color camera (Wraycam SR 130, WRAYMER, Inc., Japan). The average of peak signal-to-noise ratios—which are calculated as eq. S3—of the red, green, and blue channels is 26.2 dB between the GC’s reconstructed image, Fig. 3C, (iv), and the camera image, Fig. 3C, (v). These results demonstrate a good performance of multicolor GC imaging in delineating the morphological features of cells.
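For reference, the peak signal-to-noise ratio underlying the 26.2-dB figure can be computed per channel with the conventional definition (the precise form, eq. S3, is in the supplement); the images below are random stand-ins, not the GC and camera images.

```python
import numpy as np

# Conventional peak signal-to-noise ratio, computed per channel and averaged
# across R, G, B as in the text (the precise eq. S3 is in the supplement).
def psnr(ref, img, peak=255.0):
    mse = np.mean((np.asarray(ref, float) - np.asarray(img, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Random stand-in images; the paper compares the GC reconstruction against
# an arrayed-pixel camera image.
rng = np.random.default_rng(2)
ref = rng.integers(0, 256, size=(64, 64, 3))
noisy = np.clip(ref + rng.normal(0.0, 10.0, ref.shape), 0, 255)
avg = np.mean([psnr(ref[..., c], noisy[..., c]) for c in range(3)])
```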

Fig. 3 Multicolor and high-throughput fluorescence cell imaging by GC.

(A) An optical setup for the multicolor motion-based compressive fluorescence imaging (SD mode). The setup utilizes a 473-nm-wavelength blue laser and a 375-nm-wavelength ultraviolet LED as excitation light sources coupled by a dichroic mirror (fig. S2) to create a relatively uniform illumination, shown as a purple rectangle through which a cell moves in the direction of the arrow. Cultured MCF-7 cells were used in the experiment, with membranes, cytoplasm, and nuclei fluorescently stained red, green, and blue, respectively. When the labeled cells moved with an electronic translational stage, their conjugated images passed through the optical encoder and generated temporal waveforms. The signals were then split into three-color channels of (i) red, (ii) green, and (iii) blue with dichroic mirrors, and finally recorded by different PMTs. (B) The representative traces recorded by the PMTs. (C) From the temporal signals for each of the three PMTs, fluorescence images of labeled cells were computationally recovered in (i) red, (ii) green, and (iii) blue, respectively. (iv) A pseudocolored multicolor image combined from (i), (ii), and (iii). (v) A multicolor fluorescence image acquired with an arrayed-pixel color camera. (D) Multicolor submillisecond fluorescence imaging of the cells under flow at the throughput rate above 10,000 cells/s. In the experiment, 488-nm-wavelength blue and 405-nm-wavelength violet lasers passed through diffractive optical elements to generate the random structured illumination to the cell stream (fig. S3, SI mode). (i) and (ii) show the green and blue fluorescence signals from the cytoplasm and the nucleus, respectively. From the temporal signals for each PMT, fluorescence images of the labeled cells were computationally recovered in (iii) green and (iv) blue, respectively. (v) The reconstructed multicolor fluorescence image. Scale bars, 20 μm.

We also show that GC can achieve fast multicolor continuous fluorescence imaging of flowing cells. We used a flow cell assembly (Hamamatsu Photonics, Japan) for focusing a stream of flowing fluorescent cells in the three-dimensional (3D) space, so that the cells are focused in the plane perpendicular to the flow that is aligned parallel to the length direction of the encoder H. Using diffractive optical elements that generate structured illuminations inside the flow cell (fig. S3), we performed continuous spread optical point scans of 100 pixels perpendicular to the flow, corresponding to the image size in the y direction in the computational reconstruction. Figure 3D, (i) and (ii), shows the temporal waveforms from each color channel of a single MCF-7 cell, with its cytoplasm labeled by FG and its nucleus labeled by DAPI. Fluorescence images were computationally reconstructed for each waveform in (iii) and (iv), respectively. Figure 3D, (v), shows the computationally reconstructed multicolor fluorescence image, with clearly resolved cellular morphological features. The cells were flowed at a throughput higher than 10,000 cells/s, a rate at which arrayed-pixel cameras such as charge-coupled device (CCD) and CMOS sensors produce completely motion-blurred images. The total input excitation intensities of the structured illuminations after the objective lens were ~58 and ~14 mW for the 488- and 405-nm lasers, where those assigned to individual random spots were <43 and <10 μW on average, respectively. By designing and adopting the appropriate diffractive optical elements, we created the light pattern with minimal loss; the resulting sensitivity of GC imaging, calculated as the minimal number of detectable fluorophores (as detailed in the supplementary methods), approaches single-molecule levels.

Using the object’s motion across the static light pattern H for optical encoding and sparse sampling, we achieve blur-free, high–frame rate imaging with high signal-to-noise ratio. The frame rate r and pixel scan rate p are defined as

r = v / (width of H + width of I)    (2)

p = v / (single-spot size in H)    (3)

where I is the final image and p is the inverse of the time taken for a fluorophore to pass over each excitation spot in H. First, compressive encoding reduces the number of sampling points, defined by the length of H and required for one frame acquisition, such that we effectively lower the bandwidth required for achieving high frame rates. This reduction is important, especially in ultrafast imaging with a small number of photons, because shot noise increases as the bandwidth increases. This feature allows us to effectively reduce the excitation power needed for the fluorescence signals to overcome noise. Second, at a sufficient signal-to-noise ratio, GC can take full advantage of the high bandwidth of single-pixel detectors because H is temporally static, unlike in other techniques that temporally modulate the excitation intensity while the object passes through a pixel of excitation light. Consequently, GC yields blur-free images as long as the pixel scan rate does not exceed twice the bandwidth of the PMT. For example, for an H spot size of 500 nm and a PMT bandwidth of 100 MHz, GC provides blur-free images up to a flow speed of 100 m/s.
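Plugging the quoted numbers into Eqs. 2 and 3 reproduces the 100 m/s blur-free limit; the mask width, object width, and flow speed below are assumed for illustration.

```python
# Eqs. 2 and 3 with the numbers quoted in the text; the mask width, object
# width, and flow speed below are assumed for illustration.
spot = 500e-9                  # single-spot size in H (m)
bandwidth = 100e6              # PMT bandwidth (Hz)

# Blur-free condition: pixel scan rate p = v / spot must stay within twice
# the detector bandwidth, capping the flow speed at v_max = 2 * B * spot.
v_max = 2 * bandwidth * spot   # 100 m/s, matching the text

width_H, width_I = 200e-6, 20e-6   # assumed widths of H and the image I (m)
v = 1.0                            # assumed flow speed (m/s)
r = v / (width_H + width_I)        # frame rate, Eq. 2 (frames/s)
p = v / spot                       # pixel scan rate, Eq. 3 (pixels/s)
```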

Beyond serving as a powerful imager, direct analysis of the GC’s compressively generated signals enables high-throughput and accurate classification of the cell’s morphology at considerably lower computational cost, thus leading to the realization of ultrafast fluorescence “imaging”–activated cell sorting (FiCS) and analysis. This can be achieved because compressive sensing in GC substantially reduces the size of the imaging data while retaining sufficient information for reconstructing the object image. Although human recognition is not capable of classifying the waveforms directly, machine-learning methods can analyze the waveforms without image recovery. Here we show that supervised machine learning directly applied to waveforms measured at the rate of ~10,000 cells/s classifies fluorescently labeled cells with high performance, surpassing that of existing flow cytometers and human image recognition (Fig. 4).

Fig. 4 High-throughput and highly accurate fluorescence image-free “imaging” cytometry by GC via direct machine-learning of compressive signals.

(A) The procedure for training a classifier model in GC. (i) Different but morphologically similar cell types (MCF-7 and MIA PaCa-2 cells) were fluorescently labeled: For both cell types, the cytoplasm was stained in green with FG, whereas the membranes of only the MCF-7 cells were stained in blue with BV421-EpCAM. Scale bars, 20 μm. (ii) By separately flowing the different cell types through the encoding optical structure used in Fig. 3D at the throughput rate of >10,000 cells/s, (iii) compressive waveforms of each cell type were collectively extracted from the temporally modulated signals of fluorescence intensity. (iv) A library of waveforms labeled with each cell type was used as a training dataset to build a cell classifier. A support vector machine (SVM) model was used in this work. (B) Procedure for testing the classifier model. (i) The different types of cells were experimentally mixed at a variety of concentration ratios before analysis. (ii) When flowing the cell mixture through the encoder at the same throughput rate, (iii) we applied the trained model directly to the waveform for classifying the cell type. (C) In (i), blue data points are the concentration ratios of MCF-7 cells in each sample estimated by applying the trained SVM-based classification directly on the waveforms of FG intensity (iv), compared with those obtained by measuring the total intensity of BV421 (ii). Red data points are the concentration ratios of MCF-7 cells estimated by applying the same procedure of SVM-based classification to the total intensity of FG (iii), which we obtained by integrating each GC waveform over time, compared with the results from measurement with BV421 (ii). Seventy samples were measured for a variety of concentration ratios, with each sample comprising 700 tests of randomly mixed cells.
The image-free GC results shown with blue data points in (C) reveal a small RMSD of 0.046 from y = x and an AUC [ROC curve shown in (D)] of 0.971 over about 50,000 cells, even though the morphologies of these cells appear similar to the human eye. (D) Each point on the ROC curve corresponds to a threshold value applied to scores from the trained SVM model, wherein red and green colors in the histogram are labeled according to the intensity of BV421 (inset derived from eq. S4). By contrast, the red data points in (C) reveal inferior classification results, with a large RMSD of 0.289 and poor ROC-AUC of 0.596. (E) When classifying the model cancer (MCF-7) cells against a complex mixture of PBMCs, the ultrafast image-free GC recorded high values of AUC ~0.998, confirming its robust and accurate performance in a case of practical use.

Fig. 5 Demonstration of machine learning–based FiCS.

(A) A microfluidic device consists of three functional sites: A flow stream of cells is first focused by 3D hydrodynamic flow-focusing (top left), then experiences the random structured-light illumination (right), and finally arrives at a sorting area (bottom left). Upon sorting action, a PZT actuator driven by input voltages bends to transversely displace a fluid toward the junction for sorting the targeted cells into a collection outlet. For real-time classification and selective isolation of the cells (right), analog signals measured at PMTs are digitized and then analyzed at a FPGA in which we implemented the trained SVM-based classifier. When a classification result is positive, the FPGA sends out a time-delayed pulse that consequently actuates the PZT device. Experiments were performed at a throughput rate of ~3000 cells/s. The cytoplasm of all MIA PaCa-2 cells, MCF-7 cells, and PBMCs was labeled in green with FG. The membranes of MCF-7 cells in (B) and the cytoplasm of MIA PaCa-2 cells in (C) were labeled in blue with BV421-conjugated EpCAM antibodies and anti–pan cytokeratin primary and AF405-conjugated secondary antibodies, respectively. (B) Accurate isolation of MIA PaCa-2 cells against morphologically similar MCF-7 cells. GC directly classified the green fluorescence waveforms without image reconstruction. (i) is a histogram of maximum blue fluorescence intensity measured for the original cell mixture, showing a purity of 0.626 for the MIA PaCa-2 cells, whereas (ii) is a histogram for the same mixture after we applied FiCS, showing a purity of 0.951. A dashed line corresponding to a threshold value of 0.05 was used to distinguish the populations of the two cell types. (C) Accurate isolation of model cancer (MIA PaCa-2) cells against a complex mixture of PBMCs.
(i) is a histogram of maximum blue fluorescence intensity measured for the original cell mixture, showing a purity of 0.117 for the MIA PaCa-2 cells, whereas (ii) is a histogram for the same mixture after we applied FiCS, showing a purity of 0.951. A dashed line corresponding to a threshold value of 40 was used to distinguish the populations of the two cell types.

Our image-free GC consists of two steps: (i) training and (ii) testing a model of cell classification (Fig. 4, A and B). We first built the model based on the support vector machine (SVM) algorithm (23) by computationally mixing the waveforms of fluorescence signals from different cell types. This training data of waveforms was collected by experimentally passing each cell type separately through the optical encoder (Fig. 4A and table S5). We then tested this model by flowing experimentally mixed cells and classifying the cell types (Fig. 4B). Before the experiment, two different types of cells were cultured, fixed, and fluorescently labeled: MCF-7 cells and MIA PaCa-2 cells (a human pancreatic carcinoma cell line) [Fig. 4A, (i)]. For both cell types, the cytoplasm was labeled in green with FG for classification by image-free GC [Fig. 4A, (i)], whereas the membranes were labeled in blue with BV421-EpCAM (BV421 mouse anti-human CD326 clone EBA-1) only in MCF-7 cells. MIA PaCa-2 cells had only low autofluorescence in the blue channel, providing easily distinguishable contrast in this channel (fig. S5, A to C). We used this to validate the GC’s classification that relied on similar cytoplasmic labeling in both cell types. Using the same GC imaging setup (fig. S3), we used both violet and blue continuous-wave lasers for exciting these fluorophores while a digitizer (M4i.4451, Spectrum, Germany) recorded the resultant signals from the PMTs. We developed a microfluidic system to spatially control the position of the stream of cells with respect to the random optical structure, corresponding to the cells’ positions in the reconstructed images (fig. S4). Using this optofluidic platform, we collected waveforms of green fluorescence intensity from the cytoplasm for each cell type. From this training dataset, we then built SVM-based classifiers with no arbitrary feature extraction. 
To test this trained classifier, we introduced a series of solutions containing a combination of different cell types mixed at various concentration ratios. Each classifier then identified ~700 waveforms of the mixed cells as a single dataset and estimated the concentration ratios for each. We used a combination of MCF-7 and MIA PaCa-2 cells, so that the classification results could be quantitatively scored by measuring the total fluorescence intensity of BV421 at the membrane of MCF-7 cells [Fig. 4C, (ii), and fig. S5].
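The train/test workflow above can be sketched with scikit-learn, training an SVM directly on raw waveforms with no feature extraction. The two "cell types" are synthetic surrogates (different envelope widths times fixed pseudorandom masks); sizes and SVM settings are illustrative, not the paper's.

```python
import numpy as np
from sklearn.svm import SVC

# Train an SVM directly on raw waveforms, with no feature extraction, then
# test on held-out waveforms. The two "cell types" are synthetic surrogates,
# not real GC recordings.
rng = np.random.default_rng(3)
n_per, T = 300, 120                    # waveforms per class, samples each

def fake_waveforms(width, n):
    """Toy cell type: a Gaussian envelope of a given width times a fixed
    pseudorandom mask, plus noise -- stands in for cytoplasm morphology."""
    t = np.arange(T)
    mask = rng.integers(0, 2, T).astype(float)       # class-specific pattern
    env = np.exp(-((t - T / 2) ** 2) / (2 * width ** 2))
    return env * mask + 0.05 * rng.standard_normal((n, T))

X = np.vstack([fake_waveforms(8.0, n_per), fake_waveforms(16.0, n_per)])
y = np.array([0] * n_per + [1] * n_per)

idx = rng.permutation(len(y))
train, test = idx[:400], idx[400:]
clf = SVC(kernel="linear").fit(X[train], y[train])   # the classifier model
acc = clf.score(X[test], y[test])                    # well above chance
```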

A plot of the concentration ratio of MCF-7 and MIA PaCa-2 cells measured by the blue fluorescence intensity versus that measured by applying the model to the green fluorescence waveform gives a line on the diagonal, with a small root mean square deviation (RMSD) of 0.046 from y = x. Using the BV421 measurement to evaluate the GC-based classification of ~49,000 mixed cells gave an AUC [area under a receiver operating characteristic (ROC) curve] of 0.971 (Fig. 4D), confirming that cell classification by GC is accurate. Each point on the ROC curve corresponds to a threshold value applied to the score obtained from the trained SVM model (eq. S4), where red and green colors in the histogram are labeled according to the intensity of BV421 (Fig. 4D, inset, and fig. S5). To confirm that the high performance of GC is due to the spatial information encoded in the waveforms, we applied the same procedure of SVM-based classification to the total green fluorescence intensity obtained by integrating each GC waveform over time [Fig. 4C, (iii), and fig. S5D]. The results, shown as red data points in Fig. 4C, gave a poor ROC-AUC of 0.596 and a large RMSD of 0.289 from y = x and thus show little contribution of the total fluorescence intensity to the high performance of GC. In addition, by computing a simple linear fit (fig. S6), we confirmed that the SVM-based classification consistently retains its accuracy over a wide range of concentration ratios. Therefore, image-free GC is an accurate cell classifier even when the targeted cells are similar in size, total fluorescence intensity, and apparent morphological features and are present in mixtures at different concentrations. Indeed, in the absence of molecule-specific labeling, classifying such similar cell types has been a considerable challenge for existing cytometers and even for human recognition.
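The two evaluation statistics quoted here can be computed as follows: the RMSD of estimated versus reference concentration ratios from the line y = x, and the ROC-AUC via its rank interpretation (the probability that a randomly chosen positive outscores a randomly chosen negative). All scores and ratios in the sketch are synthetic stand-ins, not the paper's data.

```python
import numpy as np

# RMSD of estimated vs. reference concentration ratios from the line y = x.
est = np.array([0.10, 0.28, 0.52, 0.75, 0.90])   # SVM-estimated ratios (made up)
ref = np.array([0.12, 0.30, 0.50, 0.70, 0.93])   # BV421-measured ratios (made up)
rmsd = np.sqrt(np.mean((est - ref) ** 2))

# ROC-AUC as a rank statistic: P(random positive score > random negative score),
# equivalent to integrating the ROC curve over all thresholds.
rng = np.random.default_rng(4)
pos = rng.normal(1.0, 1.0, 500)    # classifier scores of BV421-positive cells
neg = rng.normal(-1.0, 1.0, 500)   # classifier scores of BV421-negative cells
auc = np.mean(pos[:, None] > neg[None, :])
```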

Besides classifying two cell types that share similar morphology, GC can accurately classify a specific cell type from a complex cell mixture at high throughput. Such technology is important, for example, in detecting rare circulating tumor cells in the peripheral blood of patients (2427). Here we applied the workflow of the image-free GC for classifying model cancer cells (MCF-7) from peripheral blood mononuclear cells (PBMCs; Astarte Biologics, Inc.), a heterogeneous population of blood cells including lymphocytes and monocytes. Again, the cytoplasm of all the cells was labeled in green with FG for classification by image-free GC, whereas the membranes of only MCF-7 cells were labeled in blue with BV421-EpCAM to validate the GC classification result. We first trained the classifier SVM model by experimentally collecting the green fluorescence waveforms for labeled MCF-7 and PBMC cells and computationally mixing them. We then tested this model by flowing experimentally mixed cells and classifying their cell types one by one. All signals were measured at a throughput of greater than 10,000 cells/s by using the same experimental setup as used previously (fig. S3). For training the model, 1000 MCF-7 cells and 1000 PBMCs were used. For testing the model, 1000 cells from a random mixture of MCF-7 cells and PBMCs were used. After performing cross-validations 10 times, the AUC recorded 0.998 for the SVM-based classifier of the GC waveforms, and Fig. 4E shows one of the ROC curves, proving the ability of ultrafast and accurate detection of the specific cell type from the complex cell mixture.

Reducing the data size by compressive sensing and avoiding image reconstruction in GC shortens the calculation time required for classifying the single waveform. By combining this efficient signal processing with a microfluidic system, we finally realized ultrafast and accurate cell sorting on the basis of real-time analysis of imaging data (Fig. 5A). Here we demonstrate the ability of FiCS to isolate a specific cell population from another population of morphologically similar cells as well as a complex cell mixture at high throughput and high accuracy. In a microfluidic device made of polydimethylsiloxane (Fig. 5A, left), we designed three functional sites (fig. S7A). First, a flow stream of cells was focused into a tight stream by a 3D hydrodynamic flow-focusing (28, 29) structure (fig. S7B). The cells then experienced the random structured-light illumination of GC and finally arrived at a junction where sorting occurs. A piezoelectric (PZT) actuator was connected to this junction through a channel directed in a direction perpendicular to the main flow stream (fig. S7C). For sorting action, this actuator, driven by an input voltage, bends and transversely displaces a fluid toward the junction to sort the targeted cells into a collection outlet. As shown in Fig. 5A and fig. S7D, for the real-time classification and selective isolation of the cells, their fluorescence signals, recorded as analog voltages by PMTs, were digitized by an analog-to-digital converter and then analyzed by a field-programmable gate array (FPGA) in which we implemented the SVM-based classifier in advance. When the FPGA classifies the cell of interest as positive, it sends out a time-delayed pulse that consequently drives the PZT actuator in the chip. The computation time in the FPGA for classifying each compressive waveform was short enough (<10 μs) to enable reproducible sorting. 
Throughout the experiment, the width of GC’s waveform was maintained for about 300 μs, corresponding to a throughput of ~3000 cells/s. After measuring the green fluorescence waveforms of positive and negative cells with their labels made by the maximum blue fluorescence intensity, we built the classifier model in a computer offline and implemented it in the FPGA. In the experiment, the cytoplasm of all the MIA PaCa-2 cells, MCF-7 cells, and PBMCs was labeled in green with FG; in addition, the membranes of MCF-7 cells in Fig. 5B and the cytoplasm of MIA PaCa-2 cells in Fig. 5C were labeled in blue with BV421-conjugated EpCAM antibodies and anti–pan cytokeratin primary and AF405-conjugated secondary antibodies, respectively.
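For a linear SVM, the real-time decision path described above reduces to one dot product plus a bias per waveform, followed by a fixed trigger delay to the PZT actuator. A minimal sketch, with all constants (weights, bias, delay) assumed rather than exported from the paper's trained model:

```python
import numpy as np

# In-FPGA decision per waveform for a *linear* SVM: one dot product plus a
# bias, then a fixed trigger delay. Weights, bias, and delay are assumed.
T = 128
rng = np.random.default_rng(5)
w = rng.standard_normal(T)     # weights of the offline-trained linear SVM
b = -0.2                       # bias term (assumed)
TRIGGER_DELAY_US = 150.0       # encoder-to-junction transit time (assumed)

def classify_and_schedule(waveform, t_end_us):
    """Return (is_target, trigger_time_us or None) for one digitized waveform."""
    score = float(np.dot(w, waveform) + b)        # SVM decision function
    if score > 0.0:
        return True, t_end_us + TRIGGER_DELAY_US  # fire the PZT after the delay
    return False, None
```

Because the decision is a fixed-length multiply-accumulate, it maps naturally onto FPGA logic and easily fits the <10-μs budget the text reports.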

We first show that integrated FiCS enables accurate isolation of MIA PaCa-2 cells from MCF-7 cells, which are similar in size, total fluorescence intensity, and apparent morphology. Two hundred waveforms of MIA PaCa-2 cells and 200 of MCF-7 cells were used for training the SVM model. When we mixed the two cell types and then measured their maximum blue fluorescence intensity with a homebuilt flow cytometer (analyzer), two distinct peaks corresponding to the two cell types appeared in the histogram [Fig. 5B, (i)]. After we applied the machine learning–driven FiCS to the same cell mixture by classifying the green fluorescence waveforms, we measured the maximum blue fluorescence intensity of the sorted mixture in the same manner. As a result, the peak at stronger intensity, corresponding to MCF-7 cells, disappeared, and the purity of MIA PaCa-2 cells increased from 0.625 to 0.951 [compare Fig. 5B, (i) and (ii)]. We thus confirmed that, with just the use of cytoplasmic staining (FG) alone, which does not specifically label the targeted molecules, FiCS can recognize and physically isolate the apparently similar cell types on the basis of their morphologies with high accuracy and throughput.

Finally, we show that FiCS can accurately enrich MIA PaCa-2 cells from a complex mixture of PBMCs. Two hundred waveforms of MIA PaCa-2 cells and 200 of PBMCs were used for training the SVM model. When we mixed the two cell types and then measured their maximum blue fluorescence intensity with a homebuilt flow cytometer (analyzer), the peak at stronger intensity, corresponding to the population of MIA PaCa-2 cells, was relatively small [Fig. 5C, (i)]. After we applied FiCS to the same cell mixture by classifying the green fluorescence waveforms, we measured the maximum blue fluorescence intensity of the sorted mixture in the same manner. As a result, the purity of MIA PaCa-2 cells increased from 0.117 to 0.951 [compare Fig. 5C, (i) and (ii)]. We thus confirmed that FiCS can substantially enrich the model cancer cells against the background of a complex cell mixture, without any specific biomarker, with high accuracy and throughput.

Recent research has extensively used imaging flow analyzers for the detection and/or characterization of critical cells in various fields, including oncology, immunology, and drug screening (30, 31). GC’s ability to notably increase the analysis throughput and to selectively isolate cell populations according to high-content information in real time will enable integration of morphology-based analysis with comprehensive, downstream “-omics” analyses at the single-cell level. Beyond conventional image generation and processing, which rely on the limited knowledge and capability of humans, we anticipate that the idea of applying machine-learning methods directly to compressive modalities will have broad applicability for real-time analysis of high-volume, high-dimensional data.

Supplementary Materials

www.sciencemag.org/content/360/6394/1246/suppl/DC1

Materials and Methods

Figs. S1 to S7

References (33–40)

Movie S1

References and Notes

Acknowledgments: We thank H. Suzuki, T. Amemiya, and K. Nakagawa for their kind support in the material production. Funding: This work was mainly supported by JST-PRESTO, Japan, grant numbers JPMJPR14F5 to S.O., JPMJPR17PB to R.H., and JPMJPR1302 to I.S. and partially supported by funds of a visionary research program from Takeda Science Foundation, and the Mochida Memorial Foundation for Medical and Pharmaceutical Research. The work is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO). S.Y., K.F., and K.W. are members of the Department of Ubiquitous Health Informatics, which is engaged in a cooperative program between the University of Tokyo and NTT DOCOMO, Inc. Author contributions: S.O., R.H., Y.K., and M.U. contributed equally to this work. R.H. and S.O. conceived and designed the concepts, experiments, data analysis, and overall research. S.O., M.U., and K.H. developed the setups and performed experiments of optical imaging. R.H. developed algorithms for the imaging recovery and data analysis, and M.U., R.K., K.S., and S.O. modified and used them for the cell analysis with the strong support of I.S. Y.K. developed and performed microfluidic cell sorting with the support of R.K. and S.O. M.U. and S.O. developed and performed experiments of image-free GC analysis. S.O., S.Y., Y.K., K.F., I.S., and K.W. designed the experiments of detecting cancer cells in blood. S.O., R.H., I.S., and H.N. supervised the work. S.O., R.H., Y.K., and M.U. wrote the manuscript with the input of the other authors. Competing interests: S.O., R.H., and I.S. are the founders and shareholders of Thinkcyte, Inc., a company engaged in the development of the ultrafast imaging cell sorter. S.O., R.H., Y.K., I.S., K.H., S.Y., K.F., and K.W. 
are inventors on patent applications submitted by the University of Tokyo and Osaka University covering the motion-based ghost imaging as well as image-free morphology analysis. Data and materials availability: Original measurement data and codes for analysis are available in the supplementary materials and are deposited in Zenodo (32), a repository open to the public.

Correction (15 June 2018): An acknowledgment was inadvertently omitted during revision. It has been added to the beginning of the Acknowledgments section in the PDF and HTML.
