## Abstract

Analyses of firm sizes have historically used data that included limited samples of small firms, data typically described by lognormal distributions. Using data on the entire population of tax-paying firms in the United States, I show here that the Zipf distribution characterizes firm sizes: the probability a firm is larger than size *s* is inversely proportional to *s*. These results hold for data from multiple years and for various definitions of firm size.

Firm sizes in industrial countries are highly skew, such that small numbers of large firms coexist alongside larger numbers of smaller firms. Such skewness has been robust over time, being insensitive to changes in political and regulatory environments, immune to waves of mergers and acquisitions (1), and unaffected by surges of new firm entry and bankruptcies. It has even survived large-scale demographic transitions within work forces (e.g., women entering the labor market in the United States) and widespread technological change. The firm size distribution within an industry indicates the degree of industrial concentration, a quantity of particular interest for antitrust policy.

Beginning with Gibrat (2), firm sizes have often been described by lognormal distributions. This distribution is a consequence of the “law of proportional effect,” also known as Gibrat's law, whereby firm growth is treated as a random process and growth rates are independent of firm size (3). Such distributions are skew to the right, meaning that much of the probability mass lies to the right of the modal value. Thus, the modal firm size is smaller than the median size, which, in turn, is smaller than the mean.
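
As a worked illustration of that ordering, consider a lognormal size distribution whose log size has mean μ and standard deviation σ (these two symbols are introduced here for illustration and are not used elsewhere in the text). Its three location measures are

$$\text{mode} = e^{\mu - \sigma^{2}}, \qquad \text{median} = e^{\mu}, \qquad \text{mean} = e^{\mu + \sigma^{2}/2},$$

so mode < median < mean for any σ > 0, and the gaps widen as σ grows.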

The upper tail of the firm size distribution has often been described by the Yule (1) or Pareto (also known as power law, or scaling) distributions (4, 5). For a discrete Pareto-distributed random variable, *S*, the tail cumulative distribution function (CDF) is

$$\Pr[S \geq s_i] = \left(\frac{s_0}{s_i}\right)^{\alpha}, \qquad s_i \geq s_0,\ \alpha > 0 \tag{1}$$

where *s*_{0} is the minimum size (6). Recent analysis of data on the largest 500 U.S. firms gives α as ∼1.25, whereas it is closer to 1 for many other countries (7). The special case of α = 1 is known as the Zipf distribution and has somewhat unusual properties insofar as its moments do not exist (8). This distribution describes surprisingly diverse natural and social phenomena, including percolation processes (9), immune system response (10), frequency of word usage (4), city sizes (4, 11), and aspects of Internet traffic (12).
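
The following is a minimal sketch of the Pareto tail and of why the Zipf case has no finite mean. It assumes the tail takes the form Pr[S ≥ s] = (s_0/s)^α with s_0 = 1, and it uses the identity that, for a positive-integer-valued variable, the mean equals the sum of Pr[S ≥ s] over s = 1, 2, …. The function names `tail_cdf` and `truncated_mean` are hypothetical, chosen only for this illustration.

```python
import math


def tail_cdf(s, s0=1.0, alpha=1.0):
    """Pr[S >= s] for a Pareto tail with minimum size s0 and exponent alpha."""
    if s < s0:
        return 1.0
    return (s0 / s) ** alpha


def truncated_mean(s_max, alpha=1.0):
    """Sum of Pr[S >= s] for s = 1..s_max, taking s0 = 1.

    For a positive-integer-valued S this sum tends to E[S] as s_max grows,
    so its behavior shows whether the mean exists.
    """
    return sum(tail_cdf(s, 1.0, alpha) for s in range(1, s_max + 1))


# With alpha = 1 (Zipf), doubling the size halves the tail probability.
print(tail_cdf(10), tail_cdf(20))

# With alpha = 1 the partial sums keep growing roughly like log(s_max);
# with alpha = 2 they level off at a finite value.
for s_max in (10**2, 10**3, 10**4):
    print(s_max, truncated_mean(s_max, 1.0), truncated_mean(s_max, 2.0))
```

For α = 1 the partial sums are harmonic numbers, which grow without bound. That is the sense in which the mean of the untruncated Zipf distribution does not exist.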

From an analysis using a sample of firms in Standard & Poor's COMPUSTAT, a commercially available data set, it has been reported that U.S. firm sizes are approximately lognormally distributed (13). The COMPUSTAT data cover nearly all publicly traded firms in the United States, some 10,776 firms in 1997, almost 4300 of which had more than 500 employees. Firms covered by COMPUSTAT collectively employed over 52 million people, approximately one-half of the U.S. work force. However, these data are unrepresentative of the overall population of U.S. firms. Data from the U.S. Census Bureau put the total number of firms that had employees sometime during 1997 at about 5.5 million, including over 16,000 having more than 500 employees. Furthermore, the Census data have a qualitatively different character than the COMPUSTAT data. Census data display monotonically increasing numbers of progressively smaller firms, a shape that the lognormal distribution cannot reproduce and that suggests a power law distribution may apply. As shown in Table 1 (14), the mean firm size in the COMPUSTAT data is 4605 employees (6349 for firms larger than 0), whereas in the Census data it is 19.0 (21.8 for firms larger than 0). Clearly, the COMPUSTAT data are heavily censored with respect to small firms. Such firms play important roles in the economy (15, 16).

For further analysis, I used a tabulation from Census in which successive bins are of increasing size in powers of three. The modal firm size is 1, whereas the median is 3 (4 if size 0 firms are not counted). These data are approximately Zipf-distributed (α = 1.059), as determined by ordinary least squares (OLS) regression in log-log coordinates (Fig. 1). There are too few very small and very large firms with respect to the Zipf fit, presumably due to finite size effects, yet the power law distribution describes the data well over nearly six decades of firm size (from 10^{0} to 10^{6} employees). This result suggests both that a common mechanism of firm growth operates on firms of all sizes, and that the fundamental unit of analysis is the individual employee.
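
The log-log OLS step can be sketched as follows. This is an illustration under stated assumptions, not the procedure behind Fig. 1. The data are synthetic, generated from an exact power law rather than taken from the Census tabulation. The regression is run on tail probabilities, whereas a fit to binned frequencies would have a slope of roughly −(α + 1) instead of −α. The names `ols_loglog`, `bin_edges`, and `synthetic_tail` are hypothetical.

```python
import math


def ols_loglog(sizes, tail_probs):
    """Fit log10(p) = intercept - alpha * log10(s) by ordinary least squares.

    Returns (alpha, intercept), where alpha is minus the fitted slope.
    """
    xs = [math.log10(s) for s in sizes]
    ys = [math.log10(p) for p in tail_probs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return -slope, intercept


# Hypothetical bin edges growing in powers of three: 1, 3, 9, ..., 3**12.
bin_edges = [3 ** k for k in range(13)]

# Synthetic tail probabilities from an exact power law (NOT Census data).
synthetic_tail = [s ** -1.059 for s in bin_edges]

alpha_hat, intercept_hat = ols_loglog(bin_edges, synthetic_tail)
print(alpha_hat, intercept_hat)
```

Because the synthetic points lie exactly on a line in log-log coordinates, the fit recovers the exponent used to generate them. Real tabulations would scatter around the line, especially in the smallest and largest bins.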

But firms having a single employee are not the smallest economic entities in the U.S. economy. Although there were some 5.5 million firms that had at least one employee at some time during 1997, there were another 15.4 million business entities in that year with no employees. These are predominantly self-employed individuals and partnerships, and are called “nonemployer” firms by Census. These smallest of firms accounted for nearly $600 billion in receipts in 1997. Yet, if these firms are included in the overall firm size distribution, the Zipf distribution still fits the data well. To see this, Eq. 1 must be modified to accommodate firms having no employees (Eq. 2). Here, OLS yields an estimate of α = 1.098 (SE = 0.064), and the adjusted R^{2} = 0.977. Including self-employment drives the average firm size down to 5.0 employees/firm, and makes the median number of employees 0.

An interesting property of firm size distributions noted in previous studies of large firms is that the qualitative character of such distributions is independent of how size is defined (1). Although the position of individual firms in a size distribution does depend on the definition of size, the shape of the distribution does not. This also holds for the Census data. Basing firm size on receipts, a Zipf distribution describes the data (α = 0.994) (Fig. 2). Here, modal and median firm revenues are each less than $100,000, and the average is $173,000/firm.

As a further test of the robustness of these results, I repeated these analyses for Census data from 1992. Average firm size was slightly smaller then, at 20.9 employees/firm (excluding size 0 firms). But overall, the Zipf fit is just as strong (Table 2).

Virtually all U.S. firms experienced significant changes in revenue and work force from 1992 to 1997. Thus, individual firms migrated up and down the Zipf distribution, but economic forces seem to have rendered any systematic deviations from it short-lived. Even the substantial merger and acquisition activity of this period seemed to have little effect on the overall firm size distribution.

There are a variety of stochastic growth processes that converge to Pareto and Zipf distributions (1, 5, 17, 18). Empirically, there is support for Gibrat-like processes in which average growth rates are independent of size (19, 20) and growth rate variance declines with size (21, 22). Consider a variation of the Gibrat process known as the Kesten process (23-25), in which sizes are bounded from below; i.e.,

$$s_i(t+1) = \max\left[s_0,\ \gamma(t)\, s_i(t)\right] \tag{3}$$

where γ is a random growth rate. For nearly any growth rate distribution, this process yields Pareto distributions that have the exponent α defined implicitly by (26)

$$N = \frac{\alpha - 1}{\alpha}\left[\frac{(s_0/A)^{\alpha} - 1}{(s_0/A)^{\alpha} - (s_0/A)}\right] \tag{4}$$

where *N* is the total number of firms and *A* is the number of employees. For *N* = 5.5 × 10^{6} and *A* = 105 × 10^{6}, as in 1997 (excluding self-employment), *s*_{0} = 1 implies α ≈ 0.997, a value close to my empirical finding. Similar results are obtained for each year back through 1988 (Table 3).
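
The implicit relation for α has no closed-form solution, so it has to be solved numerically. The sketch below uses bisection and assumes that the relation takes the form N = ((α − 1)/α) · [((s_0/A)^α − 1) / ((s_0/A)^α − s_0/A)]. That form is an assumption of this example, as are the function names `implied_firm_count` and `solve_alpha` and the search bracket. The inputs are the rounded 1997 figures quoted in the text.

```python
def implied_firm_count(alpha, employees, s0=1.0):
    """Right-hand side of the assumed implicit relation, for alpha != 1."""
    x = s0 / employees
    xa = x ** alpha
    return ((alpha - 1.0) / alpha) * ((xa - 1.0) / (xa - x))


def solve_alpha(firms, employees, s0=1.0, lo=0.5, hi=0.9999, iters=100):
    """Find alpha in [lo, hi] with implied_firm_count(alpha) == firms.

    Plain bisection; the bracket stays below 1 because the expression
    is a 0/0 form at alpha = 1.
    """
    f_lo = implied_firm_count(lo, employees, s0) - firms
    f_hi = implied_firm_count(hi, employees, s0) - firms
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket; widen [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = implied_firm_count(mid, employees, s0) - firms
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Rounded 1997 figures from the text: 5.5 million firms, 105 million employees.
alpha_1997 = solve_alpha(5.5e6, 105e6)
print(alpha_1997)
```

With these rounded inputs the root falls just below 1, close to the value quoted in the text. The exact digits depend on the precision of *N* and *A*.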

The Zipf distribution is an unambiguous target that any empirically accurate theory of the firm must hit. This result, taken together with those in (21) and (27), places important limits on models of firm dynamics. That is, (i) firm growth rates follow a Laplace distribution, (ii) the standard deviation in growth rates falls with initial firm size according to a power law, and (iii) large firms pay higher wages for the same job according to yet another power law (the so-called wage-size effect). Because the Zipf distribution obtains all the way down to the smallest sizes, it should be possible to derive Kesten-type processes and, hence, the Zipf distribution from a microeconomic model in which individual agents interact to form productive teams. Although today no analytically tractable models of this type exist, agent-based computational results have achieved significant success according to these criteria (28).

The Zipf distribution may describe firm sizes in other countries as well, a conjecture that can only be tested once individual governments make available—and in some cases gather for the first time—data that purport to be comprehensive.