Report

Multidimensional Drug Profiling By Automated Microscopy

See allHide authors and affiliations

Science  12 Nov 2004:
Vol. 306, Issue 5699, pp. 1194-1198
DOI: 10.1126/science.1100709

Abstract

We present a method for high-throughput cytological profiling by microscopy. Our system provides quantitative multidimensional measures of individual cell states over wide ranges of perturbations. We profile dose-dependent phenotypic effects of drugs in human cell culture with a titration-invariant similarity score (TISS). This method successfully categorized blinded drugs and suggested targets for drugs of uncertain mechanism. Multivariate single-cell analysis is a starting point for identifying relationships among drug effects at a systems level and a step toward phenotypic profiling at the single-cell level. Our methods will be useful for discovering the mechanism and predicting the toxicity of new drugs.

High-throughput methods for describing cell phenotype such as transcriptional and proteomic profiling allow broad, quantitative, and machine-readable measures of the responses of cell populations to perturbation (14). Automated microscopy has the potential to complement these profiling approaches by allowing fast and cheap collection of data describing protein behaviors and biological pathways within individual cells (59). Accessing these data to produce useful profiles of cell phenotype will require new image analysis methods, the development of which has so far lagged behind the adoption of high-throughput imaging technologies.

In the context of drug discovery, profiling technologies are useful in measuring both drug action on a desired target in the cellular milieu and drug action on other targets. Ideally, such profiling should be performed as a function of drug concentration, because several factors make the effects of drugs highly dose dependent. For example, the degree to which a primary target is perturbed may affect different downstream pathways differently, and drugs can bind to multiple targets with different affinities. In some cases, the therapeutic mechanism may involve binding to more than one target with differing affinity (10, 11). To date, drug effects have been broadly profiled with transcript analysis, proteomics, and measurement of cell line dependence of toxicity (1121). In these studies, multidimensional profiling methods were only applied at a single-drug concentration. The only studies in which drug dose has been explicitly considered as a variable used the degree of cell proliferation, an essentially one-dimensional (1D) readout of phenotype (12, 13). Two recent reviews have highlighted the possibility of using combinations of targeted phenotypic imaging screens to generate profiles of drug activity (6, 22). Here, we suggest that large sets of unbiased measurements might serve as high-dimensional cytological profiles analogous to transcriptional profiles. We present a method based on hypothesis-free molecular cytology that provides multidimensional single-cell phenotypic information yet is simple and inexpensive enough to allow extensive dose-response profiles for many drugs.

We assembled a test set of 100 compounds (table S1). Of these, 90 were drugs of known mechanism of action, six were blinded alternate titrations from this set of known drugs, one (didemnin B) was a toxin reported to have multiple biological targets (23), and three were drugs of unknown mechanism. The known drug set was chosen to cover common mechanisms of toxicity or therapeutic action in cancer and other diseases and to include several groups with a common target (macromolecule or pathway) but unrelated structures. We analyzed 13 threefold dilutions of each drug, covering a final concentration range on cells from micromolar to picomolar [table S2 and supporting online material (SOM) text A]. HeLa (human cancer) cells were cultured in 384-well plates to near confluence, treated with drugs for 20 hours, fixed, and stained with fluorescent probes for various cell components and processes. We chose 11 distinct probes that covered a range of cell biology, multiplexing a DNA stain and two antibodies per well [the probe sets are SC35, anillin; α-tubulin, actin; phospho-p38, phospho–extracellular signal–regulated kinase (ERK); p53, c-Fos; phospho–adenosine 3′,5′-monophosphate response element–binding protein (CREB), calmodulin]. Using automated fluorescence microscopy, we collected images of up to ∼8000 cells from each well. On each plate, 26 wells were treated only with dimethyl sulfoxide (DMSO) to generate a control population (SOM text A). The experiment was performed twice in parallel to provide a replicate data set. Image segmentation procedures were used to automatically identify nuclei and nuclear organelles, and cytoplasmic regions were approximated as an annulus surrounding each identified nucleus (Fig. 1A and SOM text B). For each cell, region, and probe, a set of descriptors was measured. These included measures of size, shape, and intensity, as well as ratios of intensities between regions (93 descriptors total, table S3). In all, ∼7 × 107 individual cells were identified from >600,000 images, yielding ∼109 data points.

Fig. 1.

Key steps in algorithm for reducing image data to compound profile. (A) Image segmentation. For each image [examples show DNA (blue), SC35 (red), and anillin (green)], we generated a nuclear region (blue) and a set of associated regions [shown here are cytoplasmic annulus (yellow) and SC35 speckles (red)]. For each defined nuclear region, we measure multiple descriptors. (B) Quantification of population response. For a given compound, titration, and descriptor, we generated a population histogram and related cumulative distribution function (cdf) (black) to be compared with the control population (blue). Shown is a threefold dilution series ranging from 65 pM to 35 μM camptothecin. We reduced each experimental cdf to a single dependent variable through comparison with a control population with the nonparametric KS statistic against a control population (SOM text C). Each vertical red or green line indicates the position and sign of the maximal height difference between the curves; this height is the KS statistic. (C) Heat map of compound profile. A z score is calculated for each KS statistic (SOM text C), and the vector of z scores for all descriptors and all titrations is displayed for rapid visual assessment. Increased scores are represented in red and decreased in green, with intensity encoding magnitude. Arrowheads to the right indicate descriptors shown in (B), and the arrowhead at the bottom indicates the dose shown in (A).

We can examine the population response of each descriptor to increasing concentrations of a given drug, which we show with the genotoxic compound camptothecin (24) (Fig. 1B). At low concentrations, the histogram for the total DNA content has the characteristic bimodal shape reflecting a mixture of G1, S, and G2/M cell populations. G2 and M populations may be distinguished by 2D display of total DNA signal against nuclear area (25). As drug concentration increases, the cells arrest with S/G2 DNA content (24). The measured DNA content distribution shifts leftward as dose increases, and at the highest concentrations apoptosis is widely induced. Anillin, a cytokinesis protein whose levels reflect cell cycle progression (26), shows marked nuclear accumulation in the G2 arrested state (Fig. 1A). p53, a transcription factor that is part of the genotoxic response pathway, is strongly induced at high camptothecin concentrations, but much less so at concentrations sufficient to promote G2 arrest (Fig. 1B).

For profiling studies, it is useful to reduce each population of descriptor values to a single number. Our study made several demands of this reduction: It must be able to compare distributions of arbitrary shape (Fig. 1B); it must be robust to variation in dynamic range and noise levels among different descriptors; it must convert different types of measurement into a common unit for comparison; it must be descriptor parameterization independent (e.g., an intensity ratio should behave the same as its reciprocal); and it must be insensitive to the precise quantitative relationship between antibody-staining intensity and antigen density. We devised a measure based on the Kolmogorov-Smirnov (KS) statistic, allowing nonparametric comparison of experimental and control distributions from the same plate (Fig. 1B, fig. S1, and SOM text C). Dividing by a measure of the variability within the control population yielded a z score, which can be displayed as a function of descriptor and drug concentration in a heat plot to allow rapid visual comparison of compound response profiles (Fig. 1C). These plots represent a family of dose-response curves for a single drug but differ from traditional curves reflecting changes in a biochemical measurement. In particular, the relationship between z score and the original physical measure may be nonlinear. For example, the statistically significant responses of p53 to low doses of camptothecin (Fig. 1C) reflect subtle effects not easily discerned by eye in the source images.

The heat plots typically have a sharp transition, reflecting a concentration at which many descriptors become different from control values. We will refer to this as the primary effective concentration (PEC) for the drug. The isolated responses observed at some low concentrations represent noise that could be reduced by increasing replicates, improving experimental procedures, and normalizing for local variation in cell density. For 39 drugs, we saw no strong effect, leaving a heat plot dominated by noise. Those drugs either lack a target in HeLa cells, were used at inactive dosages, or effected changes not detectable with our antibody set. For almost all of the 61 drugs that showed a strong response, some descriptors responded at concentrations other than the PEC (Fig. 2). This may reflect varying biological consequences of low and high saturation of a single target, or it may reflect interactions with multiple targets with different affinities. For example, camptothecin binds primarily to DNA complexes with topoisomerase I, promoting DNA strand breaks and S-phase arrest at low concentrations, but it also blocks transcription and a number of other cellular processes at higher concentrations (24). Other drugs in our test set are known to have multiple targets, such as histone deacetylase inhibitors (27) and the general kinase inhibitor staurosporine (28) and were thus expected to show complex dose-response behavior. Such phenotypic complexity may help explain why toxicity at high doses is common even for therapeutic drugs that are apparently highly selective at the level of target binding.

Fig. 2.

Comparison of compound profiles. As in Fig. 1C, the x axis shows increasing dose and the y axis encodes descriptors. Dose ranges are shown from 65 pM to 35 μM for all drugs except epothilone B, which is shown from .65 pM to .35 μM. Color scale is as in Fig. 1C. For ease of visualization, descriptors in all profiles are sorted in decreasing order of camptothecin response. (A) Compounds of similar mechanism show similar profiles. Shown are representative compound profiles. HDAC, histone deacetylase; ALLN, N-acetyl-Leu-Leu-norleucinal. (B) Compound profiles can distinguish differences between drugs with similar mechanisms. Wells with too few cells for analysis are represented in white.

Drugs with common reported targets but diverse chemical structures often showed similar profiles readily distinguished from those of drugs of different mechanism (Fig. 2A). In other cases, markedly different profiles were evident within a family, most notably the protein synthesis inhibitors (Fig. 2B). This may reflect different cell responses to alternative biochemical mechanisms of poisoning ribosomes (29) or perhaps the existence of alternate targets (23).

When comparing drug mechanism, changes in specificity (and thus phenotype) are relevant, but changes in affinity (and thus PEC) are not. Two different dosage series of the same drug should result in similar heat plots shifted along the concentration axis. We developed a titration-invariant similarity score (TISS) to allow comparison between dose-response profiles independent of starting dose (SOM text C). TISS values were generated for the 61 compounds that showed significant signal, and these were used for unsupervised clustering (Fig. 3). TISS was successful at grouping compounds with similar reported targets (Table 1). As expected, clustering reflected biological mechanism rather than chemical similarity. For example, kinase inhibitors, most of which are adenosine 5′-triphosphate–mimetic compounds, did not cluster as a group. Clustering was poor even within a set of kinase inhibitors with overlapping targets [cyclin-dependent kinase (CDK) inhibitors], perhaps reflecting variable inhibition of other kinases. The CDK inhibitors related by structure and reported target (purvalanol, roscovitine, and olomoucine) did cluster.

Fig. 3.

Hierarchical clustering of the 61 most responsive compound profiles by TISS values. Compound stock concentrations (μM) are in parentheses (fig. S3). Left panel shows mechanism of compound as described in literature. In blue are compounds that were blinded or are of unknown mechanism. Middle panel shows matrix of P values derived from pairwise TISS values (SOM text C). Dendrogram at top shows degree of association.

Table 1.

Assessment of TISS by literature categories. For each category that has more than two compounds, we computed two sets of TISS scores: pairwise TISS comparisons between members of the category (intrapair) and comparisons in which only one element of the pair is in the category (interpair). As a crude in silico comparison to other cell-based assays such as fluorescence-activated cell sorting (single-cell based) and cytoblots (whole-population based), we repeated this procedure with a descriptor set consisting of only total intensity measures and compared it with either our KS-based TISS values or a mean-based TISS values (SOM text C). P values (columns 2 to 4) describe the probability that the rank ordering of the two sets of TISS values would have been seen by random draws from the same distribution (SOM text C). KS, KS-based TISS (P value); mean, mean-based TISS (P value).

Category All descriptors Total intensity descriptors No. pairwise TISS comparisons
KS KS Mean Intrapair Interpair
Actin 0.025 0.776 0.327 6 218
DNA replication 0.011 0.057 0.007 3 168
Histone deacetylase 0.001 0.024 0.489 10 265
Kinase 0.223 0.746 0.902 3 168
Kinase CDK 0.057 0.221 0.050 6 218
Microtubule 3.86 × 10-20 9.81 × 10-6 0.295 55 484
Protein synthesis 6.02 × 10-5 0.004 0.180 15 309
Topoisomerase 0.005 0.011 0.693 3 168
Vesicle trafficking 0.206 0.314 0.514 3 168

Of the blinded alternate titrations of known drugs, scriptaid, hydroxyurea, emetine, and two alternate series of nocodazole showed significant responses. These clustered closely with their unblinded counterparts and compounds of similar reported mechanism. Didemnin B, for which the reported range of activities includes inhibition of protein synthesis (23), clustered with ribosome inhibitors (Fig. 2B). Two of the three poorly characterized compounds showed strong responses. One, concentramide, is difficult to interpret. The other, austocystin, clusters with transcription and translation inhibitors. Preliminary experiments suggest that this compound inhibits transcription in vitro (25). Thus, our methods can group compounds of like mechanism and thereby suggest mechanism for new drugs.

Extensions of cytological profiling to reflect dependencies among descriptors will allow more sophisticated analysis of drug responses at a systems level. For example, both p53 and c-Fos, a transcription factor involved in mitogen-activated protein kinase (MAPK) signaling, are involved in cell stress responses, but the interrelationship of the p53 and MAPK pathways is poorly understood (30). Single-cell profiling reveals that different drug mechanisms induce different relative patterns of response by these two pathways (Fig. 4). The proteasome inhibitor MG132 causes increased correlated induction in these pathways, whereas responses to camptothecin are anticorrelated. Anticorrelated responses observed in fixed-time images may reflect switching of mutually exclusive cell states in response to different degrees of stress or might reflect a dynamic temporal response, such as oscillation, that is not synchronized among cells (31). These data help establish a concentration and time window, but live imaging will be required to distinguish between these hypotheses.

Fig. 4.

Single-cell analysis shows differing patterns of dose-dependent p53 and c-Fos responses to different drugs. (A) Scatter plot of average nuclear p53 intensity versus average c-Fos intensity in a typical control well and representative image. The bright cells at the top of the image are in mitosis. (B) Dose-dependent increases in response to MG132 shown in heat maps are correlated in scatter plots and images (orange nuclei), shown for the four highest concentrations. (C) Dose-dependent increases in response to camptothecin shown in heat maps are anticorrelated in scatter plots and images. The black (c-Fos) and green (p53) heat map values for the highest dose reflect the contribution of apoptotic cells with negligible p53 and c-Fos nuclear staining.

Cytometric dose-response profiling is a fast and cheap method for quantitatively surveying broad ranges of individual cell responses. We have used our methods to assign mechanism to blinded and uncharacterized drugs and to suggest systems-level relationships between signaling pathways. The complex dose-response curves and large cell-to-cell variability we frequently observed reinforce the utility of unbiased multidimensional characterization of drug effects over wide ranges of doses.

Many improvements and extensions of this work are possible. These include better lab automation, broader drug reference sets, different types of perturbation (such as RNA interference), improved strategies for cell segmentation, more sophisticated feature extraction (5, 9), different sets of antibody probes and cells, the inclusion of more time points and live cell imaging, and the integration of complementary profiling strategies. Additionally, our methods may be extended to allow the characterization of responses by subpopulations defined by such variables as cell cycle state, cell density, or neighboring environment. This analysis, extended to work in tissues or clinical samples, offers the potential to speed the identification of toxic compounds during therapeutic drug development and the targeting of drug effects to specific subtypes of cells.

Supporting Online Material

www.sciencemag.org/cgi/content/full/306/5699/1194/DC1

SOM Text

Figs. S1 to S5

Tables S1 to S4

Database S1

References and Notes

View Abstract

Navigate This Article