Quantum-inspired computational imaging


Science  17 Aug 2018:
Vol. 361, Issue 6403, eaat2298
DOI: 10.1126/science.aat2298

More to imaging than meets the eye

Traditional imaging techniques involve peering down a lens and collecting as much light from the target scene as possible. That requirement can set limits on what can be seen. Altmann et al. review some of the most recent developments in the field of computational imaging, including full three-dimensional imaging of scenes that are hidden from direct view (e.g., around a corner or behind an obstacle). High-resolution imaging can be achieved with a single-pixel detector at wavelengths for which no cameras currently exist. Such advances will lead to the development of cameras that can see through fog or inside the human body.

Science, this issue p. eaat2298

Structured Abstract


Imaging technologies, which extend human vision capabilities, are such a natural part of our current everyday experience that we often take them for granted. However, the ability to capture images with new kinds of sensing devices that allow us to see more than what can be seen by the unaided eye has a relatively recent history.

In the early 1800s, the first ever photograph was taken: an unassuming picture that required days of exposure to obtain a very grainy image. In the late 1800s, a photograph was used for the first time to see the movement of a running horse that the human eye alone could not see. In the following years, photography played a pivotal role in recording human history, ranging from influencing the creation of the first national parks in the United States all the way to documenting NASA’s Apollo 11 mission to put a man on the Moon. In the 1990s, roughly 10 billion photographs were taken per year. Facilitated by the explosion in internet usage since the 2000s, this year we will approach 2 trillion images per year—nearly 1000 images for every person on the planet. This upsurge is enabled by considerable advances in sensing and data storage and communication. At the same time, it is driving the desire for imaging technology that can further exceed the capabilities of human vision and incorporate higher-level aspects of visual processing.


Beyond consumer products, research labs are producing new forms of imaging that look quite different from anything we were used to and, in some cases, do not resemble cameras at all.

Light is typically detected at relatively high intensities, in the spectral range and with frame rates comfortable to the human eye. However, emerging technologies are now relying on sensors that can detect just one single photon, the smallest quantum out of which light is made. These detectors provide a “click,” just like a Geiger detector that clicks in the presence of radioactivity. We have now learned to use these “click” detectors to make cameras that have enhanced properties and applications. For example, videos can be created at a trillion frames per second, making a billion-fold jump in speed with respect to standard high-speed cameras. These frame rates are sufficient, for example, to freeze light in motion in the same way that previous photography techniques were able to freeze the motion of a bullet—although light travels a billion times faster than a supersonic bullet. By fusing this high temporal resolution together with single-photon sensitivity and advanced computational analysis techniques, a new generation of imaging devices is emerging, together with an unprecedented technological leap forward and new imaging applications that were previously difficult to imagine.

For example, full three-dimensional (3D) images can be taken of a scene that is hidden behind a wall, the location of a person or car can be precisely tracked from behind a corner, or images can be obtained from a few photons transmitted directly through an opaque material. Inspired by quantum techniques, it is also possible to create cameras that have just one pixel or that combine information from multiple sensors, providing images with 3D and spectral information that was not otherwise possible to obtain.


Quantum-inspired imaging techniques combined with computational approaches and artificial intelligence are changing our perspective of what constitutes an image and what can or cannot be imaged. Steady progress is being made in the direction of building cameras that can see through fog or directly inside the human body with groundbreaking potential for self-driving cars and medical diagnosis. Other cameras are being developed that can form 3D images from information with less than one photon per pixel. Single-photon cameras have already made their way into widely sold smartphones where they are currently used for more mundane purposes such as focusing the camera lens or detecting whether the phone is being held close to one’s ear. This technology is already out of the research laboratories and is on the way to delivering fascinating imaging systems.

Quantum-based imaging systems are being developed to image through opaque media (e.g., fog or human tissue) that scatter light in all directions.



Computational imaging combines measurement and computational methods with the aim of forming images even when the measurements are weak, few in number, or highly indirect. The recent surge in quantum-inspired imaging sensors, together with a new wave of algorithms allowing on-chip, scalable and robust data processing, has induced an increase of activity with notable results in the domain of low-light flux imaging and sensing. We provide an overview of the major challenges encountered in low-illumination (e.g., ultrafast) imaging and how these problems have recently been addressed for imaging applications in extreme conditions. These methods provide examples of the future imaging solutions to be developed, for which the best results are expected to arise from an efficient codesign of the sensors and data analysis tools.

Computational imaging is the fusion of computational methods and imaging techniques with the aim of producing better images, where “better” has a multiplicity of meanings. The development of new imaging sensors and, in particular, instruments with single-photon sensitivity, combined with a new wave of computational algorithms, data handling capability, and deep learning, has resulted in a surge of activity in this field.

One clear trend is a shift away from increasing the number of megapixels and toward fusing camera data with computational processing and, if anything, decreasing the number of pixels, potentially to a single pixel. The incoming data may therefore not actually look like an image in the conventional sense but are transformed into one after a series of computational steps and/or modeling of how the light travels through the scene or the camera. This additional layer of computational processing frees us from the chains of conventional imaging techniques and removes many limitations in our imaging capability.

We briefly describe some of the most recent developments in the field, including full three-dimensional (3D) imaging of scenes that are hidden (e.g., around a corner or behind an obstacle), high-resolution imaging with a single-pixel detector at wavelengths for which no cameras exist, cameras that can see through fog or inside the human body, and cameras that mimic the human eye by creating detail only in areas of interest. We will also discuss how multispectral imaging with single-photon detectors can improve 3D reconstruction and provide richer information about a scene.

We discuss how single-photon detection technologies are transforming imaging capabilities with single-photon–sensitive cameras that can take pictures at the lowest light levels and with the ability to create videos reaching a trillion frames per second. This improvement has enabled the capture of images of light beams traveling across a scene and provided opportunities to observe image distortions and peculiar relativistic temporal inversion effects that are due to the finite speed of light. The ability to capture light in flight also underpins some of the applications mentioned above—for example, the ability to view a 3D scene from around a corner. Probabilistic modeling of the particlelike nature of light when using single-photon detectors has stimulated the birth of new computational techniques such as “first-photon imaging,” which hints at the ultimate limits of information to be gained from detecting just one photon.

Single-pixel and ghost imaging

Although most imaging techniques that have emerged recently are based on classical detectors and cameras, some of these approaches have been inspired by or have a tight connection with similar ideas in quantum imaging. A prime example is ghost imaging (Fig. 1) (1), originally thought to be based purely on quantum principles but now recognized as being dependent on spatial correlations that can arise from both quantum and classical light (2). The realization that this technique does not require quantum light led to a merging of the fields of computational ghost imaging (3) and work on single-pixel cameras (4), as well as to an overall increase of activity in the field. In its quantum version, ghost imaging refers to the use of parametric down-conversion to create pairs of photons with correlated positions. If we detect the position of one photon with a standard camera and illuminate an object with the other position-correlated photon, it is sufficient to detect only the reflectance or transmittance of the object with a single-pixel detector—i.e., to measure the correlation count between the beams to then reconstruct a full image by repeating the measurement with many different photon pairs (each of which will be randomly distributed because of the random nature of the correlated photon pair generation process) (5, 6). It is now acknowledged that the quantum properties of the correlated photons play no role in the image reconstruction process: Thermal light split into two beams using a beam splitter can be used equally effectively, albeit at a higher photon flux (7). Rather than using a beam splitter, it is possible to use a spatial light modulator to create a pattern where the copy is simply the computer memory. 
This approach therefore no longer requires a camera of any kind in the setup: The computer-generated pattern is already known and the image, $I$, can be retrieved by multiplying the single-pixel readout, $a_i$, with the corresponding pattern, $H_i$, and then summing over all patterns, i.e., $I = \sum_i a_i H_i$. This opens the route to so-called compressed single-pixel imaging, in which assumptions about the spatial correlations of the image allow the patterns to be chosen judiciously so that far fewer are needed than the final number of image pixels, with compression factors as high as 80 to 90%. This concept is not dissimilar from standard JPEG compression, which assumes that typical images are concentrated in their spatial frequencies, with the difference that now the compression is applied at the image acquisition stage. By this compression, single-pixel imaging is therefore transformed from a slow, relatively inefficient process into a highly efficient imaging technique that can operate at video frame rates in full color (8). More recent developments include extensions to lensless imaging (9) and to full 3D images for which depth information is obtained by also using time-of-flight information (10, 11); that is, in addition to object reflectivity, the imaging system also estimates the light travel distance, $d$, from the temporal shift, $\tau$, of the detected signal. In free space the two are related by $d = c\tau$, where $c$ is the speed of light. In general, this single-pixel technique suffers from having low resolution and providing poor-quality images even when compared with a cell-phone camera. This limitation may be partly overcome by taking inspiration from nature and implementing computational algorithms so that the system increases the density of the projected spatial patterns only in areas of interest, therefore increasing the spatial resolution in regions where it is needed and leaving the surrounding areas relatively less defined (12).
This is just one example of computational techniques being combined with detection technology to provide more efficient sensing solutions. Another example is the first-photon imaging approach that emerged from a codesign of hardware and computational algorithms, built around the concept of single-photon detection.
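The correlation step described above (multiply each single-pixel readout by its known pattern, then sum over patterns) can be sketched in a few lines of Python. This is a minimal simulation, not the authors' implementation: the function names `ghost_image` and `simulate`, the binary random patterns, and the mean subtraction are all assumptions made for illustration.

```python
import numpy as np

def ghost_image(patterns, readouts):
    """Correlate single-pixel readouts a_i with known patterns H_i."""
    patterns = np.asarray(patterns, dtype=float)   # shape (n_patterns, height, width)
    readouts = np.asarray(readouts, dtype=float)   # shape (n_patterns,)
    # Assumption: subtract the mean readout so that the constant offset produced
    # by non-negative patterns does not swamp the image (not part of the bare sum).
    centered = readouts - readouts.mean()
    return np.tensordot(centered, patterns, axes=1) / len(readouts)

def simulate(obj, n_patterns, seed=0):
    """Illuminate a transmissive object with random binary patterns (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    patterns = rng.integers(0, 2, size=(n_patterns,) + obj.shape).astype(float)
    # The single-pixel detector only sees the total transmitted light per pattern.
    readouts = (patterns * obj).sum(axis=(1, 2))
    return patterns, readouts

obj = np.zeros((8, 8))
obj[2:6, 3:5] = 1.0                      # a small bright rectangle
patterns, readouts = simulate(obj, 4000)
recon = ghost_image(patterns, readouts)
similarity = np.corrcoef(recon.ravel(), obj.ravel())[0, 1]
```

Here 4000 random patterns are used for a 64-pixel object, which is far from compressive; the compressed schemes mentioned in the text rely on carefully chosen patterns and image priors that this sketch does not attempt.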

Fig. 1 Ghost imaging.

Random spatial patterns, $R_n$, illuminate an object and only the total transmitted (or reflected) light, $A_n$, is measured. This intensity reading is then computationally combined with the information of the pattern, $I_n(x, y)$ (either measured separately or known if generated by a computer), to form an image of the object. CCD, charge-coupled device.

First-photon imaging

An important legacy of recent interest in the field of quantum information science is the development of a series of detector technologies for single photons. The workhorse for most laboratories is the single-photon avalanche diode (SPAD). SPADs are, in essence, semiconductor diodes that are reverse-biased beyond their breakdown threshold: A single photon (or even a thermally generated charge in the diode) is sufficient to lead to the rapid charge multiplication process (or avalanche) that creates a spike in the output current. A quenching mechanism stops the avalanche process before the diode is irreversibly damaged, leading also to a dead time during which the diode is insensitive to incident photons before being reactivated. The particlelike nature of a photon is revealed through the very short burst in the SPAD output current that can then be very precisely timed when a reference signal is also provided. The ability to precisely detect the photon arrival time can be used for long-distance, high-precision light detection and ranging (LIDAR). A distant object is illuminated with a pointlike pulsed laser beam. Each outgoing pulse starts a counter, which is then stopped at time τ when a return photon is detected; accounting for the two directions of the light travel, the distance of the illuminated object is simply cτ/2. Scanning the scene using this time-correlated single-photon counting (TCSPC) technique can therefore provide a full 3D image (or depth image) of the scene (13–15). However, TCSPC-based imaging can require very long acquisition times, in particular when photons return to the detector at a low rate.
Conventional processing techniques require (i) operation in the photon-starved regime (i.e., 10% or less of the outgoing laser pulses should give rise to a detected return photon so that bias from detector dead times is negligible) and (ii) measurement over many illumination repetition periods so that 100 to 1000 photons or more are detected for each position. Under these conditions, a faithful measurement of the photon arrival time is obtained. This approach can easily lead to acquisition times of a complex scene that can be on the order of many seconds or even minutes.
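As a rough illustration of the conventional TCSPC processing described above, the sketch below histograms photon arrival times for one scan position and converts the peak delay τ into a range cτ/2. The bin width, time window, jitter, and photon numbers are assumed values chosen for the example, and `depth_from_timestamps` is a hypothetical helper name.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def depth_from_timestamps(timestamps_s, bin_width_s=50e-12, max_time_s=100e-9):
    """Estimate target range from photon arrival times via the TCSPC histogram peak."""
    n_bins = int(round(max_time_s / bin_width_s))
    hist, edges = np.histogram(timestamps_s, bins=n_bins, range=(0.0, max_time_s))
    peak = np.argmax(hist)
    tau = 0.5 * (edges[peak] + edges[peak + 1])  # centre of the most populated bin
    return C * tau / 2.0                          # halve the round-trip path

# Simulated data (assumption): 300 signal photons with 100 ps timing jitter
# plus 300 background photons spread uniformly over the 100 ns window.
rng = np.random.default_rng(1)
true_range_m = 3.0
tau_true = 2.0 * true_range_m / C
signal = rng.normal(tau_true, 100e-12, size=300)
background = rng.uniform(0.0, 100e-9, size=300)
estimated_range_m = depth_from_timestamps(np.concatenate([signal, background]))
```

With hundreds of signal photons the histogram peak stands well clear of the background, which is the regime the conventional approach needs; the photon-efficient methods discussed next are aimed at cases where far fewer detections are available.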

The computational imaging philosophy enables a marked reduction in the number of detected photons needed for 3D imaging (16). In the first-photon imaging approach, only the very first detected photon at each scan location is used, so the acquisition time is limited primarily by the speed of scanning, and any detector dead time coincides with the scanning (17). Using the number of pulses until a photon is detected as an indirect measurement of reflectivity, along with a piecewise-smooth assumption for both reflectivity and depth, a 3D image of a scene is produced after several computational steps, as shown in Fig. 2. This approach builds a strong link between the computational steps and the detailed mechanism of single-photon detection, with various aspects (such as the noise background and the particlelike nature of the photons and their detection) built into the information used to retrieve high-quality 3D images. Similar extreme photon efficiency can be achieved with a fixed dwell time at each scene position (18), and principled statistical techniques for adapting the local spatial resolution to characteristics of the data enable background noise 25 times as strong as the back-reflected signal to be mitigated (19). Additional performance improvements have been obtained with deep learning (20). Using an array of SPADs parallelizes the data acquisition and thus can increase imaging speed, though an array has coarser time resolution, translating to coarsened longitudinal distance measurements (21). Methods for arrays can also be highly photon efficient (22).
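The link between pulse count and reflectivity can be made concrete with a small simulation. The sketch below assumes a Poisson detection model with no background light, in which the number of pulses fired before the first detection is geometrically distributed; the pooling over a patch is only a crude stand-in for the piecewise-smooth regularization used in first-photon imaging, and all names and parameter values are hypothetical.

```python
import numpy as np

def simulate_pulse_counts(reflectivity, signal_per_pulse=0.05, seed=0):
    """Number of pulses fired at each pixel until the first photon is detected."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(reflectivity, dtype=float)
    # Assumed model: P(detection per pulse) = 1 - exp(-S * alpha), no background.
    p_detect = 1.0 - np.exp(-signal_per_pulse * alpha)
    return rng.geometric(p_detect)

def pointwise_reflectivity(pulse_counts, signal_per_pulse=0.05):
    """Initial per-pixel estimate, inversely proportional to the pulse count."""
    return 1.0 / (signal_per_pulse * pulse_counts)

def pooled_reflectivity(pulse_counts, signal_per_pulse=0.05):
    """Maximum-likelihood estimate for a patch assumed to share one reflectivity."""
    p_hat = pulse_counts.size / pulse_counts.sum()   # detections per pulse fired
    p_hat = min(p_hat, 1.0 - 1e-12)                  # guard against log(0)
    return -np.log1p(-p_hat) / signal_per_pulse

flat_scene = np.full((64, 64), 0.5)                  # uniform reflectivity of 0.5
counts = simulate_pulse_counts(flat_scene)
noisy_map = pointwise_reflectivity(counts)           # one photon per pixel: very noisy
patch_estimate = pooled_reflectivity(counts)         # sharing information across pixels
```

The per-pixel map built from a single photon each is extremely noisy, whereas sharing information across neighbouring pixels recovers the underlying value, which is the intuition behind exploiting spatial smoothness.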

Fig. 2 First-photon imaging.

(A) Each photon detection can be mapped to a 3D position, which is often far from correct because half of the detections are due to background light. The number of illumination pulses until a photon is detected is inversely proportional to an initial estimate of reflectivity. (B) Exploiting piecewise smoothness yields improved reflectivity estimates. (C) Approximate noise censoring removes most detections due to background light. (D) The final estimate exploits piecewise smoothness of depth. [Adapted from figure 2 of (16)]

Non–line-of-sight imaging

Photon counting has strongly affected the field of non–line-of-sight (NLOS) imaging, i.e., the imaging of objects that are, for example, hidden behind a wall, corner, or obstacle (23–32). Access to very high temporal resolution imaging systems allows reconstruction of a full 3D image of the hidden scene, as conceptually explained in Fig. 3A. A short laser pulse is sent to a scattering surface chosen so as to scatter light behind the obstacle and thus illuminate the hidden scene. The hidden scene will reflect a return echo that will once again hit the first scattering wall and return to the imaging system. An intuitive understanding of the hidden object reconstruction is based on the fact that the locus of points that can give rise to a backscattered signal from a laser spot at a position $\mathbf{r}_l = (x_l, y_l)$ and measured at a given point $\mathbf{r}_i = (x_i, y_i)$ on the wall is given by $|\mathbf{r} - \mathbf{r}_l| + |\mathbf{r} - \mathbf{r}_i| = c\tau$, where $\mathbf{r} = (x, y, z)$ is a point in the hidden scene, the wall is taken as the plane $z = 0$, and $\tau$ is the photon arrival time. This equation describes an ellipsoid of points that can be recalculated for each detection point on the wall: each of these ellipsoids will overlap only at the points of origin of the (hidden object) scattering. Therefore, by summing over all ellipsoids, one obtains a high “intensity” (proportional to the overlap) in correspondence to the hidden object. With sufficient temporal resolution and additional processing to sharpen the retrieval, it is possible to reconstruct full 3D shapes: for example, 100 ps is sufficient to resolve centimeter-sized features. Practically, most retrieval techniques aim at iteratively finding an estimate of $\rho(x, y, z)$, which represents the physical distribution of the hidden object, from the measured transient image intensity $I(x_i, y_i, \tau) = \iiint \frac{\rho(x, y, z)}{|\mathbf{r} - \mathbf{r}_l|^2\,|\mathbf{r} - \mathbf{r}_i|^2}\,\delta\left(|\mathbf{r} - \mathbf{r}_l| + |\mathbf{r} - \mathbf{r}_i| - c\tau\right)\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z$, where $\delta$ represents Dirac’s delta function. The first demonstration of this technique was obtained with the use of a streak camera (23) that provides very high 1- to 10-ps temporal resolution but at the expense of relatively long acquisition times (Fig. 3, B and C).
Successive demonstrations resorted to single-photon counting to reconstruct 3D images (24) and to perform tasks such as tracking of moving objects (27) and humans, even over very large distances [more than 50 m between the scattering wall and the imaging system (28)]. Recent improvements have demonstrated acquisition times on the order of seconds for a full 3D scene reconstruction by modifying the acquisition scheme. Specifically, photons are collected coaxially [i.e., along the same trajectory as the outgoing laser beam (31)], and, as a result, the measurement integral is simplified to $I(x, y, \tau) = \iiint \frac{1}{r^4}\,\rho(x', y', z')\,\delta(2r - c\tau)\,\mathrm{d}x'\,\mathrm{d}y'\,\mathrm{d}z'$, where $r = \sqrt{(x' - x)^2 + (y' - y)^2 + z'^2}$ is the distance between the wall point $(x, y)$ and the hidden-scene point $(x', y', z')$, and the radiometric factor $1/r^4$ is now only a function of $\tau$ and can thus be removed from the integral. Overall, the result of this is that $I(x, y, \tau)$ reduces to a convolution that substantially decreases the computational retrieval times, paving the way to real-time reconstruction of 3D scenes. This is an example of how imaging hardware and computational techniques have coevolved to create a new imaging capability. It is worth pointing out that recent measurements have shown not only real-time capability but also the capacity for long-distance and full-daylight operation (28, 31), thus moving from proof-of-principle studies to first steps toward deployment in real-world applications in just a few years. An interesting challenge for this field of research starts from the observation that much of the technology involved in NLOS imaging is shared with standard, direct line-of-sight LIDAR (i.e., 3D imaging of environments using laser pulse time-of-flight measurements). In this sense, NLOS imaging has the potential to become a natural extension of LIDAR. In this context, there are clear applications for NLOS imaging, when combined with LIDAR, for urban safety and unmanned vehicles.
Additionally, future NASA missions will employ SPAD arrays for LIDAR mapping of planet surfaces, and studies are currently under way to evaluate the potential of NLOS imaging to remotely (e.g., from a satellite) assess the internal structure of underground caves on planets, with a view toward future human colonization activities (33).
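The "sum over ellipsoids" picture of NLOS reconstruction can be sketched as a simple backprojection. The example below is a toy under stated assumptions, not any of the cited reconstruction algorithms: the wall is the plane z = 0, the hidden scene is a single point, arrival times are noise-free, and the time origin is taken at the wall so that only the laser spot, hidden point, and detection spot path is timed. The helper names `path_length` and `backproject` are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def path_length(points, laser, detector):
    """Laser spot -> point -> detection spot distance, for one point or an array of points."""
    return (np.linalg.norm(points - laser, axis=-1)
            + np.linalg.norm(points - detector, axis=-1))

def backproject(voxels, laser, detectors, arrival_times, time_resolution):
    """Count, per voxel, how many detection points place it on their ellipsoid."""
    score = np.zeros(len(voxels))
    for detector, tau in zip(detectors, arrival_times):
        mismatch = np.abs(path_length(voxels, laser, detector) - C * tau)
        # A voxel is "on" the ellipsoid if it matches within the timing resolution.
        score += mismatch < C * time_resolution
    return score

# Hypothetical geometry: 21 detection spots along a line on the wall.
laser = np.array([0.0, 0.0, 0.0])
detectors = np.array([[x, 0.0, 0.0] for x in np.linspace(-0.5, 0.5, 21)])
hidden = np.array([0.2, 0.0, 0.5])
arrival_times = np.array([path_length(hidden, laser, d) / C for d in detectors])

# Candidate positions on a 2.5 cm grid in the plane y = 0 behind the wall.
xs = np.linspace(-0.5, 0.5, 41)
zs = np.linspace(0.1, 1.0, 37)
voxels = np.array([[x, 0.0, z] for x in xs for z in zs])
score = backproject(voxels, laser, detectors, arrival_times, time_resolution=50e-12)
best = voxels[np.argmax(score)]
```

Only the voxel at the hidden point lies on every ellipsoid, so it collects the highest score. A real reconstruction would additionally weight by the radiometric falloff and sharpen the result, as the text notes.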

Fig. 3 Non–line-of-sight imaging.

(A) Basic geometry of the problem. A laser illuminates a scattering surface and scatters light around a wall that hides an object from the direct line of sight. The return signal backscattered from the hidden object is detected at a point “i” on the scattering surface. This geometry, with a single observation point, defines an ellipsoid of possible locations for the object. Detection of the time-resolved transient images at multiple points on the surface allows reconstruction of the location or even the full 3D shape of the object. (B) An example of a hidden object with its reconstruction shown in (C). [Panels (B) and (C) adapted from (21)]

Enhanced SPAD arrays for imaging in scattering media

Over the past several years, a number of industrial and academic research groups have been developing a new generation of cameras in which each individual pixel consists of a SPAD. A relatively widespread version of these sensors, often referred to as “quanta imaging sensors,” is operated in binary mode; that is, the sensor pixel generates a “1” when the number of photons is larger than a certain threshold (typically set to just one photon) and generates a “0” otherwise (34–38). Each frame therefore has a single-bit depth. To build an image, multiple frames must be added together. This operation mode brings some advantages: Aside from the single-photon sensitivity, one can add as many frames as desired so as to achieve very high bit depths (dynamic ranges) that are not attainable with standard complementary metal-oxide semiconductor cameras. Moreover, the single-bit nature of the acquisition permits very high frame acquisition rates (rates up to 100 kHz have been reported) (39).
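The binary operating mode lends itself to a short simulation. The sketch below assumes Poisson photon arrivals and a threshold of one photon; the function names and flux values are illustrative only. It sums single-bit frames into a multi-bit image and inverts the saturation of the binary response to recover the mean photon flux per frame.

```python
import numpy as np

def simulate_binary_frames(flux, n_frames, seed=0):
    """Single-bit frames: a pixel reads 1 if at least one photon arrived (assumed Poisson)."""
    rng = np.random.default_rng(seed)
    flux = np.asarray(flux, dtype=float)              # mean photons per pixel per frame
    photons = rng.poisson(flux, size=(n_frames,) + flux.shape)
    return (photons >= 1).astype(np.uint8)

def sum_frames(frames):
    """Adding N binary frames gives an image with roughly log2(N + 1) bits of depth."""
    return frames.sum(axis=0, dtype=np.uint32)

def estimate_flux(frames):
    """Invert P(pixel = 1) = 1 - exp(-flux) to undo the saturation of the binary response."""
    n_frames = frames.shape[0]
    fraction_ones = sum_frames(frames) / n_frames
    fraction_ones = np.clip(fraction_ones, 0.0, 1.0 - 0.5 / n_frames)  # avoid log(0)
    return -np.log1p(-fraction_ones)

true_flux = np.array([[0.1, 0.5], [1.0, 2.0]])
frames = simulate_binary_frames(true_flux, n_frames=20000)
summed = sum_frames(frames)
flux_estimate = estimate_flux(frames)
```

The logarithmic inversion matters mainly for bright pixels, where many frames contain more than one photon and the raw sum underestimates the flux.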

Progress has also been made in the full on-chip integration of TCSPC electronics, thus providing the additional functionality of temporal resolutions as low as 50 ps (21, 40–43). This implies that, when combined with a high repetition rate laser for the active illumination of the scene, video rates reaching up to 20 billion frames per second can be achieved (44). This remarkable performance can be better appreciated when expressed in terms of the actual imaging capability. At such frame rates, light propagates just 1.5 cm between successive frames, which implies that it is possible to actually freeze light in motion in much the same way that standard high-speed cameras can freeze the motion of a supersonic bullet. The first images of light in flight were shown in the late 1960s via nonlinear optical gating methods (45–47), but the first camera-based measurements were only recently demonstrated with the use of a streak camera (48). More recent measurements based on SPAD arrays have allowed the first capture of light pulses propagating in free space with total acquisition times on the order of seconds or less (44). SPAD array cameras have also been used to directly image laser pulse propagation through optical fibers: Beyond their direct applications (e.g., estimation of physical parameters of optical fibers), these measurements combined a fusion of single-photon data with hyperspectral imaging over several different wavelengths (discussed below) and computational processing through which the 32-pixel–by–32-pixel resolution was successfully up-sampled by using the temporal axis to re-interpolate the pulse trajectory in the (x, y) spatial plane (49).
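The two figures quoted above follow directly from the 50-ps time bin, as this short check shows. It assumes that one time bin corresponds to one video frame; the function names are illustrative.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def frames_per_second(time_bin_s):
    """Equivalent frame rate if each TCSPC time bin is treated as one video frame."""
    return 1.0 / time_bin_s

def light_travel_per_frame_m(time_bin_s):
    """Distance light covers in free space during one time bin."""
    return C * time_bin_s

time_bin_s = 50e-12                                   # 50 ps temporal resolution
rate = frames_per_second(time_bin_s)                  # about 2e10, i.e. 20 billion frames per second
step_cm = 100.0 * light_travel_per_frame_m(time_bin_s)  # about 1.5 cm between frames
```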

The ability to capture simultaneously spatial and high-resolution temporal information at very low light levels with SPAD cameras has recently been applied to other difficult imaging problems, such as imaging and sensing through scattering and turbid media. For example, Pavia et al. have applied inverse retrieval methods in combination with spatial and temporal information from a linear SPAD array for tomographic reconstruction of objects hidden in murky water (50). More recently, Heshmat and co-workers have acquired data with a planar SPAD array and reconstructed various shapes of objects obscured by a thick tissue phantom (51). Their technique was referred to as “All Photons Imaging,” directly underlining the importance of the photon time-of-flight information that is recorded by the single-photon camera. We note that such approaches do not explicitly account for the physical origins of the data. For example, temporal and spatial information are placed on equal footing and enter in the retrieval process without incorporation of statistical models for timing jitter or surface reflectivity of the objects. Future SPAD imaging will benefit from incorporating accurate spatiotemporal statistical models for sources, photon transport medium, and photon detectors. In the broad regime of strong scattering, the camera will typically record an indistinct, diffuse illumination transmitted through the medium or reflected from the scene with little or no obvious information about any objects hidden behind or inside the scattering medium. Computational techniques are thus required to actually retrieve details about the object. This field of research is of particular interest for a number of applications such as medical diagnostics and imaging, as well as sensing and imaging through fog.

With the emergence of imaging systems for autonomous underwater vehicles, unmanned aerial vehicles, robots, or cars, rain and fog present important challenges that must be overcome. Sonar is a well-established technology for long range underwater imaging, but it can suffer from low spatial resolution limited by the physics of sound propagation in the medium. Although high-power optical solutions can be used for short-range imaging in relatively clear water, the presence of underwater scatterers between the active imaging system and the scene (e.g., the seabed) usually produces large quantities of reflected photons that can mask the returns from the scene of interest. By using a pulsed illumination source combined with sensitive single-photon detectors, it is possible to discriminate the photons reflected because of scattering in the water from those (an extremely small fraction) that actually reach the surfaces of interest. For instance, Maccarone et al. have demonstrated the ability to image underwater up to eight attenuation lengths (52). When combining this cutting-edge technology with advanced statistical methods inspired by our previous work (53), substantial performance improvements could be achieved in terms of 3D reconstruction and estimation of the surface reflectivity by accounting for the distance-induced signal loss (54). On a related note, efforts for 3D reconstruction of terrestrial scenes at long distances suffer from limitations similar to those described above. Even if the measurements are performed under favorable (e.g., dry) conditions, the recorded signals can be considerably affected by atmospheric turbulence (55–57) and solar illumination (58, 59). Again, marked improvements in detection accuracy (60) and maximal observable range (61) have been obtained via the use of adapted computational tools. The problem becomes even more acute in the presence of fog, which is a major concern for the next generation of automated cars.
It has been demonstrated that it is technically possible to detect and analyze fog patches over long distances, provided that the laser power is sufficient to ensure a nonzero probability of photon reflection and a long enough acquisition time (62, 63). In the automotive context, where the acquisition time is intrinsically limited by the vehicle displacement velocity, more robust and computationally efficient strategies have been recently proposed (51, 64), and it is clear that future imaging systems will incorporate computational models for both the propagation physics of the medium and physical properties of the detector.

Multispectral single-photon imaging

Multispectral and hyperspectral imaging, which are extensions of classical color (RGB) imaging, consist of imaging a scene using multiple wavelengths (from four to several hundreds or even thousands in hyperspectral images). These modalities have benefited from a robust body of research spanning more than 35 years, from the data collection community (65–67), and, more importantly, from the data processing and analysis community (68–73). Indeed, such modalities can be associated with a wide variety of computational problems, ranging from image acquisition (compressive sampling), restoration (denoising/deblurring, superresolution), segmentation (classification) to source separation (spectral unmixing), object/anomaly detection, and data fusion (e.g., pansharpening). Though the main applications using (passive) multispectral imaging are in Earth and space observation, the proven benefits of imaging with multiple wavelengths simultaneously have enabled its application in the food industry (66, 74) and a broader range of applications such as diagnostic dermatology (75, 76). Active multispectral imaging is less sensitive to ambient illumination than passive imaging, which requires data acquisition under daylight conditions (e.g., for Earth observation). Without fast timing capabilities, however, multi- and hyperspectral imagers are only able to provide 2D intensity profiles and are thus poorly adapted to analysis of multilayered 3D structures such as forest canopies. Multispectral LIDAR is a promising modality that allows for joint extraction of geometric (as single-wavelength LIDAR) and spectral (as passive multispectral images) information from the scene while avoiding data registration issues potentially induced by the fusion of heterogeneous sensors. Wallace et al. have demonstrated that it is possible to use multispectral single-photon LIDAR (MSPL) to remotely infer the spatial composition (leaves and branches) and the health of trees using only four wavelengths (77).
More recently, new experiments have been designed to image up to 33 wavelengths (500 to 820 nm) in free space (78) and 16 wavelengths underwater (79). As a consequence, we have witnessed the development of algorithms inspired from passive hyperspectral imagery (3D datacubes) for analysis of MSPL data (4D datacubes).

For instance, Bayesian methods have been proposed to cluster, in an unsupervised manner, spectrally similar objects while estimating their range from photon-starved MSPL data (80). This work was further developed (81, 82) to classify pixels on the basis of their spectral profiles in photon-starved regimes down to one photon per pixel and per spectral band, on average (Fig. 4). Such results are possible only by efficiently combining a highly sensitive raster-scanning single-photon system that allows for submillimeter range resolution with hierarchical Bayesian models able to capture the intrinsic, yet faint, structures (e.g., spatial and spectral correlations) of the data. A notable improvement has been demonstrated by using simulation methods (see next section) to reconstruct scenes (range and reflectivity profiles) with as few as four photons per pixel (with four spectral bands and one photon per pixel, on average) (81).
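To give a feel for why classification remains feasible at roughly one photon per pixel and per band, the sketch below classifies simulated photon counts by maximum Poisson likelihood against known spectral signatures. It is a deliberately simplified stand-in for the hierarchical Bayesian models cited above: it ignores range, background, and spatial correlations, and the two signatures are invented for the example.

```python
import numpy as np

def classify(counts, signatures):
    """Pick, per pixel, the class whose signature maximises the Poisson likelihood."""
    counts = np.asarray(counts, dtype=float)          # (n_pixels, n_bands) photon counts
    signatures = np.asarray(signatures, dtype=float)  # (n_classes, n_bands) mean photons per band
    # Poisson log-likelihood, dropping the term that does not depend on the class.
    log_lik = counts @ np.log(signatures).T - signatures.sum(axis=1)
    return np.argmax(log_lik, axis=1)

# Hypothetical signatures over 33 bands, averaging about one photon per band.
n_bands = 33
band = np.arange(n_bands)
signatures = np.vstack([
    np.where(band < 17, 1.5, 0.5),   # class 0: brighter in the first bands
    np.where(band < 17, 0.5, 1.5),   # class 1: brighter in the last bands
])

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)                 # true class of each pixel
counts = rng.poisson(signatures[labels])              # photon-starved measurements
accuracy = np.mean(classify(counts, signatures) == labels)
```

Although each band is individually almost uninformative, pooling the evidence across all bands separates the two classes reliably in this toy setting.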

Fig. 4 Computational inverse probability methods to spectrally classify and depth-resolve objects in a scene from photon-starved multispectral LIDAR images.

The scene (A) was composed of 14 clay materials of different colors. The recorded images consist of a 200-pixel–by–200-pixel area (scanned target areas were approximately 50 mm by 50 mm), and the targets were placed 1.85 m from the system. In (B), the first column depicts the estimated depth profile (in millimeters), the reference range being arbitrarily set to the backplane range. The second column shows color classification maps, and the third column depicts the spectral signatures of the most prominent classes, projected onto the first and second axes obtained by using principal components analysis. Each of these subplots illustrates the similarity between the estimated spectral signatures. Rows (i) and (ii) in (B) depict results obtained with an average of one detected photon per pixel, for each spectral band, with 33 and 4 bands, respectively. [Adapted with permission from (73)]

Spectral unmixing presents another challenging problem in the use of multi- and hyperspectral data for identification and quantification of materials or components present in the observed scene. Spectral unmixing can lead to improved classification by accounting for the fact that several mixed materials can be observed in a given pixel. Spectral unmixing methods allow for subpixel material quantification, which is particularly important for long-range imaging scenarios in which the divergence of the laser beam cannot be neglected. We developed a computational method for quantifying and locating 15 known materials from MSPL data consisting of 33 spectral bands while detecting additional (potentially unknown) materials present in the scene (78). Again, this work demonstrated the possibility of material quantification and anomaly detection with as few as one photon per pixel and per spectral band, on average. It also illustrated how Bayesian modeling can be used for uncertainty quantification (e.g., for providing confidence intervals associated with estimated range profiles). As mentioned above, although the most recent single-photon detectors are very attractive because of their high temporal resolution, their application to information extraction from wide-area scenes is hindered by long acquisition times associated with raster scanning strategies. This is particularly limiting when several wavelengths are acquired in a sequential manner. To address this problem, compressive sampling strategies have been investigated to achieve faster MSPL data acquisition (83, 84). 
Although computational methods for image scanning systems have been proposed, whereby a random number of spectral bands can be probed in a given pixel, the most promising results have been obtained with a simulated mosaic filter (four wavelengths) whose implementation within a SPAD array could allow for the simultaneous acquisition of multiple pixels and fast reconstruction of range and reflectivity profiles. These results show how advanced computational methods can be used to enhance information extraction from imaging systems and also improve the design of future detectors and detector arrays.
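
As a rough illustration of subpixel material quantification, the sketch below unmixes a single pixel under a linear mixing model, in which the observed spectrum is assumed to be a convex combination of two known material spectra. The linear model, the function names, and the reflectance values are illustrative assumptions and are not taken from the work cited above, which handles many more materials, photon noise, and range estimation.

```python
# Illustrative two-material linear unmixing for a single pixel.
# Assumption: the observed spectrum equals t * spectrum_a + (1 - t) * spectrum_b
# for some fraction t in [0, 1]. All spectra below are made-up values.

def unmix_two_materials(pixel, spectrum_a, spectrum_b):
    """Return the fraction t of material A that minimizes the squared error
    between `pixel` and t * spectrum_a + (1 - t) * spectrum_b."""
    # With d = a - b, the unconstrained least-squares solution is
    # t = <pixel - b, d> / <d, d>.
    diff = [a - b for a, b in zip(spectrum_a, spectrum_b)]
    num = sum((p - b) * d for p, b, d in zip(pixel, spectrum_b, diff))
    den = sum(d * d for d in diff)
    if den == 0.0:
        raise ValueError("the two material spectra are identical")
    # Clip so that the abundance stays physically meaningful.
    return min(1.0, max(0.0, num / den))

# Hypothetical reflectances in four spectral bands.
LEAF = [0.05, 0.10, 0.04, 0.50]
BARK = [0.20, 0.25, 0.30, 0.35]

def mix(fraction_leaf):
    """Simulate a noise-free mixed pixel with the given leaf fraction."""
    return [fraction_leaf * l + (1.0 - fraction_leaf) * b
            for l, b in zip(LEAF, BARK)]

observed = mix(0.3)
leaf_fraction = unmix_two_materials(observed, LEAF, BARK)
print(f"estimated leaf fraction: {leaf_fraction:.3f}")
```

With more than two materials the same idea becomes a constrained least-squares (or, for photon counts, a constrained likelihood) problem over a vector of abundances.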

Computational methods in the photon-starved regime

From a mathematical perspective, computational imaging is formulated as finding a mapping that reconstructs a set of parameters x, which may have a physical meaning (or not), from a set of measurements y recorded by an imaging or sensing device. These parameters can take continuous values (e.g., light field intensities, object positions, and velocities) or discrete values (e.g., the number of objects, binary values representative of the presence or absence of objects). Two main families of methods can be adopted to design algorithms for data analysis, namely supervised machine learning approaches and statistical signal processing approaches, although hybrid methods can also be used. The choice of the most suitable approach depends primarily on the complexity of the computational model, as well as the computational budget available (i.e., the expected processing time, data storage limitations, and the desired quality of the information extracted). Supervised machine learning (including deep learning) approaches are particularly well suited for applications where a sufficient quantity of ground truth data or reference data is available (85–88). Such methods rely on a two-stage process, which consists of the training stage and the test stage. Starting from a forward model $y \approx g(x)$, relating the measured data y to the unknown source parameters x, the training stage uses a set of measurements and corresponding parameters to learn the inverse mapping $g^{-1}$ between the measurements and the set of parameters to be recovered; i.e., it fits an inverse model $x \approx g^{-1}(y)$. In contrast to model-based statistical methods, data-driven machine learning approaches do not rely on the knowledge of a forward model $g(\cdot)$. Thus, these methods can often be applied to complex problems where the forward model is unknown or too complicated to be derived analytically but for which plenty of training data are available. 
The training stage controls the quality of the estimation of the inverse mapping $g^{-1}$, which in turn depends on the representational power of the machine learning algorithm and on the quality and diversity of the training samples. Machine learning approaches have been successfully applied to imaging applications such as imaging through multimode fibers (85), lensless imaging of phase objects (86), and identification of a human pose from behind a diffusive screen (87). SPAD cameras have been specifically applied to identifying both the positions and identities of people hidden behind a wall (88), imaging through partially transmissive (89) and scattering media (51), and profiling camouflaged targets (90). The design of reliable machine learning approaches is currently limited by high generalization error (i.e., machines must be retrained for different acquisition scenarios) and a lack of ground truth information about the sources or the medium.
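
The two-stage process described above can be sketched with a deliberately simple learner. In this toy example, a k-nearest-neighbor regressor learns an inverse mapping from pairs of measurements and ground-truth parameters, without ever being given the forward model. The saturating response `forward`, the noise level, and the choice of k are assumptions made for illustration; the applications cited above rely on far richer models such as deep networks.

```python
import math
import random

def forward(x):
    """Hypothetical saturating sensor response. It is used only to simulate
    data; the learner below never calls it."""
    return 1.0 - math.exp(-x)

def make_training_set(n, noise_std, rng):
    """Simulate n (measurement, ground truth) pairs with additive noise."""
    pairs = []
    for _ in range(n):
        x = rng.uniform(0.0, 3.0)
        y = forward(x) + rng.gauss(0.0, noise_std)
        pairs.append((y, x))
    return pairs

def knn_inverse(pairs, y_new, k=15):
    """Test stage: estimate x for a new measurement as the mean ground truth
    of the k training measurements closest to y_new."""
    nearest = sorted(pairs, key=lambda p: abs(p[0] - y_new))[:k]
    return sum(x for _, x in nearest) / len(nearest)

# Training stage: for this learner, "training" amounts to storing the pairs.
rng = random.Random(0)
training_pairs = make_training_set(2000, 0.01, rng)

# Test stage on a measurement generated from a known parameter value.
x_true = 1.0
x_hat = knn_inverse(training_pairs, forward(x_true))
print(f"true x = {x_true:.2f}, estimated x = {x_hat:.2f}")
```

The estimate is only as good as the coverage of the training pairs: a measurement produced by a parameter outside the sampled range [0, 3] would be mapped back into that range, which is a small-scale version of the generalization problem noted above.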

Statistical model-based methods can thus be more attractive than data-driven machine learning approaches for photon-limited imaging, as a mathematical forward model $g(\cdot)$ can be combined with a statistical noise model to better fit the data. Physical considerations, such as light transport theory through the medium and the detector, can often guide model choice, although non–physically inspired approximations can be used to make the model fitting algorithm more computationally tractable. When there is measurement uncertainty and noise, the forward model can be better characterized by the conditional probability distribution $f(y|x)$, which describes the statistical variation of the measurements y for a given source parameter value x. For fixed value y, the function $x \mapsto f(y|x)$, called the likelihood function, quantifies the likelihood that the source value x generated the observed value y. The maximum likelihood principle forms an estimate of x from y by maximizing the likelihood over x. However, the maximum likelihood estimate is often not unique in high-dimensional inverse problems such as imaging. Fortunately, additional information about x (e.g., a priori knowledge about positivity, smoothness, or sparsity) is often available and can be used to improve on the maximum likelihood estimate. For instance, suppose ϕ is a regularization function such that $\phi(x)$ is small when x complies with a priori knowledge and is large otherwise. Then it is possible to recover x by minimizing the cost function $-\log f(y|x) + \phi(x)$. If $\phi(x)$ can be associated with a proper density $f(x) \propto \exp[-\phi(x)]$, called the prior distribution, this penalized likelihood estimation strategy can be interpreted in the Bayesian formalism as a maximum a posteriori estimation procedure. In other words, the above minimization is equivalent to maximizing the posterior density of x, $f(x|y) = f(y|x) f(x) / f(y)$, where f(y) is a density that does not depend on x. 
These and related likelihood-based approaches have been adopted by many researchers studying low photon imaging (17, 18, 83).
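
To make the penalized likelihood idea concrete, the following sketch estimates a single nonnegative intensity x from a handful of photon counts. It assumes independent Poisson counts for the likelihood and a regularizer ϕ(x) = βx − (α − 1) log x, which corresponds to a Gamma(α, β) prior. These modeling choices, the function names, and the numerical values are illustrative assumptions rather than the models used in the cited studies. The cost (negative log-likelihood plus ϕ) is minimized numerically and compared with the closed-form maximum a posteriori estimate available for this conjugate pair.

```python
import math

def cost(x, counts, alpha, beta):
    """Negative log-likelihood plus regularizer, up to constants in x.
    Likelihood: counts[i] ~ Poisson(x), independent across i.
    Regularizer: phi(x) = beta * x - (alpha - 1) * log(x)."""
    n, total = len(counts), sum(counts)
    neg_log_lik = n * x - total * math.log(x)
    phi = beta * x - (alpha - 1.0) * math.log(x)
    return neg_log_lik + phi

def minimize_golden(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c = c, d
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

# Photon-starved data: roughly one detected photon per measurement.
counts = [0, 1, 0, 2, 1]
alpha, beta = 2.0, 1.0

x_ml = sum(counts) / len(counts)  # maximum likelihood estimate
x_map = minimize_golden(lambda x: cost(x, counts, alpha, beta), 1e-6, 10.0)
x_closed = (sum(counts) + alpha - 1.0) / (len(counts) + beta)
print(f"ML = {x_ml:.4f}, MAP (numerical) = {x_map:.4f}, MAP (closed form) = {x_closed:.4f}")
```

In imaging problems x is a full image and ϕ typically couples neighboring pixels, so the minimization calls for iterative convex or stochastic solvers rather than a one-dimensional search.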

The Bayesian formalism allows for the observed data to be combined with additional information in a principled manner. This also allows so-called a posteriori measures of uncertainty to be derived. However, such measures cannot be computed analytically in most practical applications because of the complexity of accurate spatial correlation models, and they often must be approximated using high-dimensional integration. A considerable advantage may be gained from computationally simple pixel-by-pixel adaptation (91), and a classical approach thus consists of approximating these measures (e.g., a posteriori variances/covariances or confidence intervals) using variational approximations or simulation techniques. Markov chain Monte Carlo methods are particularly well adapted for inference in difficult scenarios for which the cost function or posterior distribution of interest has multiple modes and multiple solutions potentially admissible. For instance, such methods have been successfully applied to object detection (60), and joint material identification and anomaly detection (78) from low-flux single-photon LIDAR measurements.
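
A minimal sketch of the simulation approach is given below. A random-walk Metropolis-Hastings sampler draws from the posterior of a single intensity x given a few photon counts, and the samples are then used to approximate a posterior mean and a 95% credible interval. The Poisson likelihood, the Gamma(α, β) prior, the step size, and the function names are assumptions chosen so that the result can be checked against a known posterior; the hierarchical models in the cited work are far larger and need more elaborate samplers.

```python
import math
import random

def log_post_u(u, counts, alpha, beta):
    """Unnormalized log posterior of u = log(x) for independent Poisson(x)
    counts and a Gamma(alpha, beta) prior on x. The term alpha * u (rather
    than (alpha - 1) * u) includes the Jacobian of the change of variable."""
    n, total = len(counts), sum(counts)
    return (total + alpha) * u - (n + beta) * math.exp(u)

def metropolis_hastings(counts, alpha, beta, n_samples, step, rng, burn_in=1000):
    """Random-walk Metropolis-Hastings on u = log(x); returns samples of x."""
    u = 0.0
    logp = log_post_u(u, counts, alpha, beta)
    samples = []
    for i in range(burn_in + n_samples):
        u_new = u + rng.gauss(0.0, step)
        logp_new = log_post_u(u_new, counts, alpha, beta)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            u, logp = u_new, logp_new
        if i >= burn_in:
            samples.append(math.exp(u))
    return samples

counts = [0, 1, 0, 2, 1]
alpha, beta = 2.0, 1.0
samples = metropolis_hastings(counts, alpha, beta, 20000, 0.5, random.Random(1))
samples.sort()

post_mean = sum(samples) / len(samples)
ci = (samples[int(0.025 * len(samples))], samples[int(0.975 * len(samples))])
exact_mean = (alpha + sum(counts)) / (beta + len(counts))
print(f"posterior mean: {post_mean:.3f} (exact value {exact_mean:.3f})")
print(f"approximate 95% credible interval: [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Sampling in log(x) keeps every proposal positive without rejection at the boundary, which is one common way to handle positivity constraints in such samplers.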


Considering the rapid advance in imaging cameras and sensors together with a leap forward in computational capacity, we see enormous potential for innovation over the next several years. The main challenge (or the main opportunity) at this stage is the codevelopment of sensors and computational algorithms built around the physical processes of the photon transport and detection mechanisms. We have provided examples showing progress in this direction, ranging from first-photon imaging techniques to photon-starved hyperspectral imaging. The trend seen in commercial cameras between 2000 and 2015, characterized by a constant drive toward higher pixel counts, has slowly subsided, giving way to a different approach whereby both performance and functionality are increased by combining multiple sensors through computational processing. Obvious examples are recent advances in cell-phone technology, arguably one of the main drivers behind imaging technology, which now boasts multiple cameras and lenses providing depth perception, improved signal-to-noise ratio, and other functionalities such as 3D face recognition. With SPAD cameras also gradually making their appearance on the commercial scene, single-photon imaging and computational techniques offer a promising avenue for future innovation in situations where imaging was previously thought to be impossible. We have briefly discussed examples such as imaging through denser scattering media (the human body or fog) and full 3D imaging of scenes around a corner or beyond the direct line of sight. These examples are particularly relevant in demonstrating the progress that can be made when the photon transport models and computational approaches are integrated with the new generation of photon detectors. 
The expectation is that over the next several years we will witness substantial growth of computational imaging methods, driven by and also driving new technologies such as single-photon SPAD arrays that will revolutionize nearly every aspect of human activity, ranging from medical diagnostics to urban safety and space missions.

References and Notes

Acknowledgments: We thank the Royal Society for support and hosting the Theo Murphy Scientific meeting on “Light transport and imaging through complex media.” Funding: Y.A. acknowledges support from the UK Royal Academy of Engineering under the Research Fellowship Scheme (RF201617/16/31). S.McL. acknowledges financial support from the UK Engineering and Physical Sciences Research Council (grant EP/J015180/1). V.G. acknowledges support from the U.S. Defense Advanced Research Projects Agency (DARPA) InPho program through U.S. Army Research Office award W911NF-10-1-0404, the U.S. DARPA REVEAL program through contract HR0011-16-C-0030, and U.S. National Science Foundation through grants 1161413 and 1422034. A.H. acknowledges support from U.S. Army Research Office award W911NF-15-1-0479, U.S. Department of the Air Force grant FA8650-15-D-1845, and U.S. Department of Energy National Nuclear Security Administration grant DE-NA0002534. D.F. acknowledges financial support from the UK Engineering and Physical Sciences Research Council (grants EP/M006514/1 and EP/M01326X/1). Competing interests: None declared.
