Recently, excitement has surrounded the application of null-hypothesis approaches for identifying evolutionary design principles in biological, technological, and social networks (*1*–*13*) and for classifying diverse networks into distinctive superfamilies (*2*). Here, we argue that the basic method suggested by Milo *et al*. (*1*, *2*) often has limitations in identifying evolutionary design principles.

The technique is relevant for any network that can be represented schematically as a directed graph of *N* nodes (for example, representing neurons) and a set of edges or links between pairs of nodes (for example, synaptic connections). In particular, the approach is able to identify unusually recurring “network motifs,” that is, patterns of interconnections among a small number of nodes (typically three to five) that are significantly more common in real networks than expected by chance (*1*–*13*). Overabundance has been taken to mean that the motifs are the manifestation of evolutionary design principles favored by selection in biological or synthetic systems (*1*–*8*).

In statistical parlance, the basic method [which has a long history in theoretical biology (*10*–*13*)] tests a “random null hypothesis” by statistically comparing the distribution of motifs in an observed network with that found in a computer-generated ensemble of appropriately randomized networks. Over and above the realistic constraint that the degree distribution of incoming and outgoing links to every node must be maintained (*14*), the edges in the randomized network are connected between nodes completely at random and without preference. Such randomized networks are considered null in that their structure is generated by a process free of any type of evolutionary selection acting on the network's constituent motifs. Rejection of the null hypothesis has thus, in many studies, been taken to represent evidence of functional constraints and design principles that have shaped network architecture at the level of the motifs through selection (*1*–*13*).
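The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in (*1*, *2*): the function names are hypothetical, only the three-node feedforward loop is counted, occurrences are counted whether or not extra edges exist among the three nodes, and the randomization preserves each node's in- and out-degree but not the number of mutual edges.

```python
import random
from statistics import mean, pstdev

def count_ffl(edges):
    """Count feedforward loops a->b, b->c, a->c (a, b, c distinct)."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    total = 0
    for a, outs in succ.items():
        for b in outs:
            for c in succ.get(b, ()):
                if c != a and c in outs:
                    total += 1
    return total

def randomize(edges, n_swaps, rng):
    """Degree-preserving randomization by repeated edge swaps.

    A swap turns (a->b, c->d) into (a->d, c->b), so every node keeps
    its in-degree and out-degree.
    """
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        # Reject swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def ffl_z_score(edges, n_random=100, seed=0):
    """Return the observed FFL count and its z-score against the ensemble."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    counts = [count_ffl(randomize(edges, 10 * len(edges), rng))
              for _ in range(n_random)]
    sd = pstdev(counts)
    # Assumption: an ensemble with zero spread yields an undefined score.
    z = (observed - mean(counts)) / sd if sd > 0 else float("nan")
    return observed, z
```

Calling `ffl_z_score(edges)` on a list of `(source, target)` pairs returns the observed count and the number of ensemble standard deviations by which it departs from the randomized mean. The ensemble size and the number of swaps per network (ten times the edge count) are arbitrary choices made for this sketch.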

However, the method outlined above can lead to wrong interpretations if the underlying null hypothesis is not posed carefully. For example, using this approach, Milo *et al*. (*1*) identified several significant network motifs in the neural-connectivity map of the nematode *Caenorhabditis elegans*. In *C. elegans*, though, neurons are spatially aggregated and connections among neurons have a tendency to form in local clusters (*15*). Two neighboring neurons have a greater chance of forming a connection than two distant neurons at opposite ends of the network. This feature of local clustering is not reflected in the baseline randomized networks used by Milo *et al*. (*1*, *2*), in which the probability of two neurons connecting is completely independent of their relative positions in the network (Fig. 1). The test is not null to this form of localized aggregation and will thus misclassify a completely random but spatially clustered network as one that is nonrandom and that has significant network motifs.

Analysis of a “toy network” (Fig. 1) illustrates what can go wrong. In this network, each node is connected at random, preferentially to nearby neighbors, with a connection probability that falls off with distance (a Gaussian distribution is used). Although the toy network is built devoid of any rule selecting particular motifs for their functions, we find that the same network motifs identified by Milo *et al*. (*1*) for *C. elegans* are present, and the random null hypothesis must be rejected (Fig. 1). Thus, the statistically significant motifs found in *C. elegans* (*1*) may well be the result of the inherently localized partitioning of the nematode's connectivity network rather than a property that emerges from the action of evolutionary forces selecting particular motifs for their specific functions. It is not our goal in this case to construct a model that realistically captures the distribution of motifs as found in *C. elegans*, but merely to explore the implications of choosing an incomplete null model. Having said that, it is still somewhat surprising that the simple “toy model” reproduces the distribution (significance profile) of all three-node motifs with reasonable realism.
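A spatially clustered random network of this kind might be generated as follows. The sketch is illustrative only. The text specifies a Gaussian fall-off but not the layout or the parameters, so the one-dimensional arrangement of nodes, the function name `spatial_toy_network`, and the parameters `sigma` and `p0` are all assumptions.

```python
import math
import random

def spatial_toy_network(n, sigma, p0=1.0, seed=0):
    """Random directed network whose link probability decays with distance.

    Assumption: nodes sit at integer positions 0..n-1 on a line, and a
    link i->j is drawn independently with probability
    p0 * exp(-(i - j)**2 / (2 * sigma**2)).
    """
    rng = random.Random(seed)
    edges = []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops
            p = p0 * math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
            if rng.random() < p:
                edges.append((i, j))
    return edges
```

No rule in this generator refers to any motif: each link is drawn independently, and only the distance between its endpoints matters. Any motif excess measured against a position-blind randomized ensemble would therefore reflect spatial clustering alone.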

Many biological and synthetic networks, such as the metabolic and transcription networks (*9*) and the World Wide Web (*16*), are characterized by a scale-free distribution of links to every node. In scale-free networks, the probability of a node having *k* connections obeys the power law *p(k)*∼*k*^{–γ} (with γ > 2); that is, most nodes have few connections and a few nodes have many connections. It has been argued (*16*) that some biological scale-free networks are generated by the rule of preferential attachment, a rule that in itself does not include any type of selection for or against particular motifs. We have used two variants of the preferential-attachment rule (*17*) to generate toy networks, and have then analyzed their motif structure. Using the first variant, we find that the feedforward loop (FFL, shown schematically in Fig. 1C) is always significantly overrepresented (>2σ from the mean) compared with the randomized null networks, which might be taken to imply that the motif has been favored by evolution. In contrast, for the second variant, the FFL is significantly underrepresented, which might be taken to imply that the motif has been disfavored. As such, the actual process by which a network is generated, even if it is free of selection for or against particular motif functions, can strongly bias an analysis that seeks to determine the quantitative significance of motifs.
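For readers unfamiliar with preferential attachment, a generic growth rule can be sketched as below. It is not either of the two variants of (*17*) used in the analysis above, whose details are not given here. The function name, the seed graph, the parameter `m`, and the convention that links point from the new node to existing nodes are all assumptions made for illustration.

```python
import random

def preferential_attachment(n, m=2, seed=0):
    """Grow a directed network of n nodes by preferential attachment.

    Assumption: each new node sends m links to distinct existing nodes,
    chosen with probability proportional to their current total degree.
    """
    rng = random.Random(seed)
    # Seed graph: m + 1 nodes, each linked to all lower-numbered nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in the pool once per link end, so a uniform draw
    # from the pool is a degree-proportional draw over nodes.
    pool = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(pool))
        for t in sorted(targets):
            edges.append((new, t))
            pool.extend((new, t))
    return edges
```

The growth rule refers only to node degree and never to motifs, which is the sense in which such a generator is free of selection for or against particular motif functions.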

Similar problems arise when applying the approach to studying complex ecological food webs (*10*–*13*). In these systems, each node represents an organism, and an edge between two organisms indicates that one feeds on the other. Food webs are nonrandom structures largely governed by trophic relationships; randomizing feeding links in a food-web network and testing the random null hypothesis serves at best only to trivially prove this point. Unsurprisingly, Milo *et al*. (*1*) find nonrandom overrepresented network motifs that are consistent with simple trophic relationships such as predator–prey–resource interactions. From an ecological perspective, little can be learned from rejecting the possibility that the food web is random. It may be worthwhile in the future to seek ways of posing the null hypothesis in a more sophisticated ecological framework (*10*–*13*).

In summary, for all of these examples, the null hypothesis test suggested the involvement of evolutionary design principles in random toy networks that were generated without the involvement of any fitness-based selection process. One possible resolution to this problem is to reformulate the test in a manner that is able to identify functional constraints and design principles in networks and to discriminate them clearly from other likely origins, such as spatial clustering.

There is no denying that the network randomization approach has a certain charm in facilitating diverse and multidisciplinary cross-system comparisons in the search for common universal network motifs, design principles, and characteristics defining distinctive network superfamilies (*1*, *2*). Indeed, this approach has stimulated theoretical and experimental work that has demonstrated the utility of certain motifs in tasks such as information processing (*18*, *19*). However, given the dangers sketched above, any cross-system analysis may be very fragile and will be prone to comparing network motifs that are found to be statistically significant because of an ill-posed null hypothesis. Moreover, the method described in (*2*) forces a common reference frame for comparing motif significance profiles (distribution and significance of all possible motifs) of networks, even if they are of different origins—for example, neural networks, for which a null model based on spatial clustering may be justified, versus transcription networks, for which such a null model would be unsuitable. Thus, comparisons mediated through a common but inappropriate reference frame may give the wrong impression that different networks are in fact similar with respect to their motif significance profile. Clearly, these techniques need to be developed further before design principles can be deduced with confidence (*20*).