Special Viewpoints

Limits on Silicon Nanoelectronics for Terascale Integration


Science  14 Sep 2001:
Vol. 293, Issue 5537, pp. 2044-2049
DOI: 10.1126/science.293.5537.2044

Abstract

Throughout the past four decades, silicon semiconductor technology has advanced at exponential rates in both performance and productivity. Concerns have been raised, however, that the limits of silicon technology may soon be reached. Analysis of fundamental, material, device, circuit, and system limits reveals that silicon technology has an enormous remaining potential to achieve terascale integration (TSI) of more than 1 trillion transistors per chip. Such massive-scale integration is feasible assuming the development and economical mass production of double-gate metal-oxide-semiconductor field effect transistors with gate oxide thickness of about 1 nanometer, silicon channel thickness of about 3 nanometers, and channel length of about 10 nanometers. The development of interconnecting wires for these transistors presents a major challenge to the achievement of nanoelectronics for TSI.

Silicon technology has advanced at exponential rates in both performance and productivity throughout the past four decades. From 1960 to 2000, the energy transfer associated with a binary switching transition—the canonical digital computing operation—decreased by about five orders of magnitude and the number of transistors per chip increased by about nine orders of magnitude. Such exponential advances must eventually come to a halt imposed by a hierarchy of physical limits. The five levels of this hierarchy are defined as fundamental, material, device, circuit, and system (1). A coherent analysis of the key limits at each of these levels reveals that silicon technology has an enormous remaining potential to achieve TSI of more than 1 trillion transistors per chip, with critical device dimensions or channel lengths in the 10-nm range. This potential represents more than a three-decade increase in the number of transistors per chip and more than a one-decade reduction in minimum transistor feature size compared with the state of the art in 2001. Fundamental physical limits that are independent of the characteristics of any particular material, device structure, circuit configuration, or system architecture are virtually impenetrable barriers to future advances of TSI.

Binary switching transitions implemented with transistors are indispensable to performing computation in a digital system. The energy transfer per binary transition is a revealing metric for comparing the performance of switching operations at all levels of the hierarchy. Consider the power-delay plane, where the ordinate is the average power transfer, $P$, during a binary transition and the abscissa is the time interval of the transition, $t_d$. The use of logarithmic scales on both axes results in a diagonal line (or locus) where the switching energy, $E = P t_d$, remains constant. During the past four decades, constant switching energy loci have migrated continuously toward the lower left corner of the power-delay plane, reflecting a monotonically decreasing binary switching energy (1). The prime cause of this migration has been the scaling down of the dimensions of transistors and their binary signal voltage swing, typically equal to the supply voltage. Supply voltage is reduced to maintain a nearly constant electric field (in V/cm) or electrical stress on the transistor. Scaling of transistors reduces their energy dissipation per binary transition, their intrinsic switching delay, their area, and therefore their cost.

The second indispensable function performed in a digital system is communication, implemented by interconnects or wires. The primary purpose of an interconnect is communication between distant points with small latency. Interconnect performance can be elucidated at all levels of the hierarchy by plotting the square of the reciprocal interconnect length, $L^{-2}$, against latency, $\tau$. In the $L^{-2}$ versus $\tau$ plane, with logarithmic scales on both axes, a diagonal line is a locus of constant value of $L^{-2}\tau = r_{int} c_{int}$ (in s/cm²), that is, of constant distributed resistance-capacitance product. This product is the prime figure of merit for interconnects. During the past four decades, constant distributed resistance-capacitance loci have migrated continuously toward the upper right corner of the $L^{-2}$-$\tau$ plane, reflecting a continuously increasing distributed resistance-capacitance product and consequently a larger latency for communication between two fixed points. Larger latency cannot be avoided because the cross-sectional dimensions of interconnects must be scaled down to provide the dense wiring required by smaller and smaller transistors. Consequently, during the past decade, interconnect latency (as well as energy dissipation) has become a primary constraint on current gigascale integration. Exploring key limits at each of the five levels of the hierarchy in the power-delay and reciprocal length squared-latency planes elucidates future opportunities for TSI.

Fundamental Limits

The three key fundamental limits on TSI are derived from thermodynamics, quantum mechanics, and electromagnetics (1, 2). The fundamental limit on signal energy transfer during a binary switching transition is E(min) = (ln2)kT, where k is Boltzmann's constant and T is absolute temperature. This limit is characterized as fundamental because its value is independent of the properties of any particular material, device, or circuit that may be used to implement the binary transition (3). Its importance as a constraint on nanoelectronics for TSI is unsurpassed. In simple physical terms, the limit reveals that a single electron undergoing a binary transition must have an energy comparable to its thermal energy, (3/2)kT, to satisfy the quintessential requirement of binary signal discrimination.
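To give a feel for the magnitude of this limit, the following minimal sketch evaluates E(min) = (ln2)kT numerically. The temperature of 300 K is an assumed room-temperature value, and the function name is illustrative only.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23    # Boltzmann's constant k (exact SI value)
ELECTRON_VOLT_J = 1.602176634e-19   # one electron volt in joules (exact SI value)


def min_switching_energy(temperature_k: float) -> float:
    """Return the thermodynamic limit E(min) = (ln 2) k T in joules."""
    return math.log(2) * BOLTZMANN_J_PER_K * temperature_k


# 300 K is an assumption made for this example, not a value fixed by the limit.
e_min = min_switching_energy(300.0)
print(f"E(min) at 300 K: {e_min:.3e} J ({e_min / ELECTRON_VOLT_J * 1e3:.1f} meV)")
```

At 300 K the result is roughly 2.9 × 10^-21 J, or about 18 meV, which sets the scale against which the switching energies at the higher levels of the hierarchy can be compared.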

The first statement of this limit known to the authors is attributed to John von Neumann, who “computed the thermodynamical minimum of energy per elementary act of information from the formula $kT \log_e N$” where $N = 2$ for a binary act (4, p. 183). Keyes observes, however, that “the report of von Neumann's ideas fails to provide any justification of this assertion or explanation of the reasoning underlying it” (5). Landauer derived the same result by analyzing a hypothetical binary device consisting of a particle in a bistable potential well (5). On the basis of earlier work of Swanson and Meindl (6), the minimum switching energy of an ideal transistor operating in the simplest digital circuit, an inverter, is E(min) = (ln2)kT (3). Precisely the same result is derived (3) by treating an isolated interconnect as a communication channel described by Shannon's classical theorem for channel capacity (7). This fundamental limit receives further support from the observation that on the basis of a Boltzmann probability density function, the probability of error is 0.5 for a binary transition with signal energy transfer E(min) = (ln2)kT (8).

Quantum mechanics and, more specifically, the Heisenberg uncertainty principle (9) define the second fundamental limit, which requires a signal switching energy transfer $\Delta E \ge h/t_d$, where $h$ is Planck's constant and $t_d$ is the transition time. This limit results from the wave nature of the electron and the resulting uncertainty in its position-momentum and energy-time relations (9).

The fundamental limits based on thermodynamics and quantum mechanics result in a “forbidden region” in the power-delay plane (red region, Fig. 1). In this region, no binary transition can operate, regardless of the materials, devices, or circuits used for its implementation.
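The boundary of this forbidden region can be sketched numerically. The example below assumes that the thermodynamic limit translates to P ≥ (ln2)kT/t_d and the quantum limit to P ≥ h/t_d², both obtained by dividing the limiting energy by the transition time; the temperature of 300 K and the function names are assumptions made for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann's constant k
PLANCK_J_S = 6.62607015e-34        # Planck's constant h


def min_power(t_d: float, temperature_k: float = 300.0) -> float:
    """Smallest average power (W) permitted for a transition time t_d (s)."""
    thermal = math.log(2) * BOLTZMANN_J_PER_K * temperature_k / t_d  # E(min) / t_d
    quantum = PLANCK_J_S / t_d**2                                    # (h / t_d) / t_d
    return max(thermal, quantum)


def is_forbidden(power_w: float, t_d: float, temperature_k: float = 300.0) -> bool:
    """True if the point (power_w, t_d) lies below both fundamental limits' envelope."""
    return power_w < min_power(t_d, temperature_k)


# Transition time at which the two limits coincide: h / ((ln 2) k T).
crossover_s = PLANCK_J_S / (math.log(2) * BOLTZMANN_J_PER_K * 300.0)
print(f"limits cross at t_d = {crossover_s:.2e} s")
print(is_forbidden(1e-12, 1e-9), is_forbidden(1e-6, 1e-9))
```

Under these assumptions the thermodynamic limit dominates for transition times longer than a few tenths of a picosecond, and the quantum limit dominates for shorter ones.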

Figure 1

Average power transfer during a binary transition, $P$, versus transition time, $t_d$, for the first three levels of the hierarchy. The red, orange, and green zones are forbidden by fundamental, silicon material, and 50-nm channel length transistor device level limits, respectively.

The third fundamental limit from electromagnetics simply expresses the fact that the time of flight, $\tau$, of an electromagnetic wave traveling along any metallic interconnect or optical fiber of length $L$ is strictly limited by the velocity of light in free space, $c_0$, according to $\tau \ge L/c_0$ (Fig. 2) (1). The red region is again a forbidden zone of operation for any interconnect regardless of the materials or structure used for its implementation.

Figure 2

Reciprocal interconnect length squared, $L^{-2}$, versus latency, $\tau$, for the first three levels of the hierarchy. The red, orange, and green zones are forbidden by fundamental, material ($\varepsilon_r = 2.0$), and 250-nm-wide interconnect device level limits, respectively.

Material Limits

Material limits are determined by the properties of the particular semiconductor, dielectric, and metallic materials used but must be essentially independent of the structural features and dimensions of particular devices (1, 10). There are five key material limits. Silicon imposes four of them: a switching energy, a transit time, a thermal conductance, and a dopant fluctuation limit. The dielectric constant of the insulator of a multilevel interconnect network imposes the final material limit.

The switching energy limit is determined by the amount of energy $E$ that must be stored in a cube of semiconductor material to support a selected binary transition voltage, $V_o$. This is the voltage applied between two opposite faces of the cube in the direction of current flow. The expression for this limiting energy is given by $E = \varepsilon V_o^3 / (2\mathcal{E}_c)$, where $\varepsilon$ is the permittivity and $\mathcal{E}_c$ is the breakdown electric field strength of the semiconductor material. The transit time limit $t_d$ is defined by the smallest time interval required for an electron to be transported through the cube. This limiting time is expressed as $t_d = V_o / (v_s \mathcal{E}_c)$, where $v_s$ is the electron saturation velocity in a particular material (the largest possible electron velocity, whose value is $10^7$ cm/s in silicon). The thermal conductance limit defines the maximum amount of power, $P$, that may be dissipated in a single transistor within a particular semiconductor chip. $P$ must equal the rate of heat removal under steady state conditions. The power dissipation limit is given by $P = \pi K v_s \Delta T\, t_d$, where $K$ is the thermal conductivity, $v_s$ is the saturation velocity of the semiconductor material, $\Delta T$ is the temperature difference between the transistor and an ideal heat sink for the semiconductor chip, and $t_d$ is the device transit time.

The minimum binary transition voltage $V_o$ needed for high-performance devices and circuits for TSI is believed to be 0.5 V. The orange region defined by the switching energy, transit time, and thermal conductance limits (Fig. 1) is a second forbidden zone of operation, imposed by the material limits of silicon. No silicon transistor regardless of its structural features can operate in this orange forbidden region. It is especially notable that the three expressions defining the material limits are essentially independent of the structural features and dimensions of any particular device. A rare exception may be certain very small devices exhibiting an effective increase in carrier velocity due to a short-range phenomenon termed velocity overshoot (11).
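The three silicon material limits can be evaluated with a short sketch. The binary transition voltage of 0.5 V and the saturation velocity of 10^7 cm/s come from the text; the relative permittivity (11.7), breakdown field (3 × 10^5 V/cm), thermal conductivity (1.5 W/(cm·K)), and temperature difference (50 K) are typical or hypothetical values assumed here for illustration only.

```python
import math

EPS0_F_PER_CM = 8.8541878128e-14   # vacuum permittivity in F/cm

# Values stated in the text.
V_O = 0.5                          # binary transition voltage (V)
V_SAT_CM_PER_S = 1.0e7             # electron saturation velocity in silicon (cm/s)

# Assumed values, not taken from the text.
EPS_SI = 11.7 * EPS0_F_PER_CM      # permittivity of silicon (F/cm)
E_BREAKDOWN_V_PER_CM = 3.0e5       # breakdown field of silicon (V/cm)
K_SI_W_PER_CM_K = 1.5              # thermal conductivity of silicon (W/(cm K))
DELTA_T_K = 50.0                   # hypothetical transistor-to-heat-sink difference (K)

# Switching energy limit: E = eps * V_o^3 / (2 * E_c).
energy_j = EPS_SI * V_O**3 / (2.0 * E_BREAKDOWN_V_PER_CM)

# Transit time limit: t_d = V_o / (v_s * E_c).
transit_s = V_O / (V_SAT_CM_PER_S * E_BREAKDOWN_V_PER_CM)

# Thermal conductance limit: P = pi * K * v_s * dT * t_d.
power_w = math.pi * K_SI_W_PER_CM_K * V_SAT_CM_PER_S * DELTA_T_K * transit_s

print(f"E = {energy_j:.2e} J, t_d = {transit_s:.2e} s, P = {power_w:.2e} W")
```

With these assumed constants the limits come out near 2 × 10^-19 J, 0.17 ps, and 0.4 mW; different material constants would shift the numbers but not the form of the expressions.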

The fourth key semiconductor material limit is a dopant fluctuation limit, which is defined by the expression $\sigma/\mu = (\ell/\Delta x)^{3/2}$. The standard deviation and the mean value of the number of dopant atoms within a cube of semiconductor material of dimension $\Delta x$ are $\sigma$ and $\mu$, respectively; $\ell$ is the average distance between dopant atoms in the cube. This expression reveals that the relative standard deviation of the number of dopant atoms in a cube of semiconductor material, $\sigma/\mu$, increases without bound as the cube dimension, $\Delta x$, decreases. This poses a critical concern for TSI because it hints that deviations in the values of key device parameters, such as the threshold voltage of a transistor, may increase without bound as device dimensions are scaled to the 10-nm range.
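A brief numerical sketch shows how quickly the relative fluctuation grows as the cube shrinks. It assumes that the average dopant spacing is the inverse cube root of the dopant concentration; the concentration of 10^18 atoms/cm³ is a hypothetical value chosen so that the spacing is 10 nm.

```python
def mean_dopant_spacing_cm(concentration_per_cm3: float) -> float:
    """Average distance between dopant atoms, assumed to be N^(-1/3)."""
    return concentration_per_cm3 ** (-1.0 / 3.0)


def relative_fluctuation(spacing_cm: float, cube_edge_cm: float) -> float:
    """Dopant fluctuation limit: sigma/mu = (spacing / cube_edge)^(3/2)."""
    return (spacing_cm / cube_edge_cm) ** 1.5


# Hypothetical doping level, not taken from the text.
spacing = mean_dopant_spacing_cm(1.0e18)          # about 1e-6 cm, i.e. 10 nm

for edge_nm in (1000.0, 100.0, 10.0):
    edge_cm = edge_nm * 1.0e-7                     # 1 nm = 1e-7 cm
    ratio = relative_fluctuation(spacing, edge_cm)
    print(f"cube edge {edge_nm:6.0f} nm -> sigma/mu = {ratio:.3f}")
```

In this example the relative fluctuation rises from about 0.1% for a 1-μm cube to about 100% for a 10-nm cube, where the cube holds only one dopant atom on average.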

The time of flight $\tau$ of an electromagnetic wave in a solid dielectric material with a relative permittivity, $\varepsilon_r$, is expressed by $\tau = L(\varepsilon_r)^{1/2}/c_0$, which defines the fifth key material limit. The dashed locus (Fig. 2) represents this limit for $\varepsilon_r = 2$. The orange zone is a forbidden region for any interconnect whose relative permittivity $\varepsilon_r$ is greater than 2. Relative permittivity values less than 2 generally require porous materials consisting of gas “balloons” encased by thin solid walls.
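The free-space and dielectric time-of-flight limits can be compared with a minimal sketch. It assumes the wave travels at c_0 divided by the square root of the relative permittivity; the 2-cm interconnect length is a hypothetical value for illustration.

```python
import math

C0_CM_PER_S = 2.99792458e10   # velocity of light in free space (cm/s)


def time_of_flight_s(length_cm: float, eps_r: float = 1.0) -> float:
    """Minimum latency of a wave over length_cm in a medium of permittivity eps_r."""
    return length_cm * math.sqrt(eps_r) / C0_CM_PER_S


length_cm = 2.0   # hypothetical corner-to-corner interconnect length
print(f"free space : {time_of_flight_s(length_cm):.2e} s")
print(f"eps_r = 2.0: {time_of_flight_s(length_cm, 2.0):.2e} s")
```

For a fixed length, moving from free space to a dielectric with a relative permittivity of 2 lengthens the minimum latency by a factor of √2, about 41%.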

Device Limits

There are five key limits at the device level (1, 12) of the hierarchy. Metal-oxide-semiconductor field effect transistors (MOSFETs), the most critical devices of TSI, impose a switching energy, a transit time, and a parameter fluctuation limit. Interconnects impose key latency and cross-talk limits. An advanced MOSFET structure is illustrated in Fig. 3.

Figure 3

Schematic diagram of the cross section of a symmetrical double-gate MOSFET. The gate electrode is highly conducting, the gate oxide is highly insulating, and the undoped channel is semiconducting silicon. In this so-called metal-oxide-semiconductor field effect transistor, or MOSFET, an input signal voltage applied between the gate and source electrodes controls output current flow from drain to source.

During a binary switching transition, the energy stored on the capacitive gate or control electrode of a MOSFET device is transferred. This energy therefore represents its switching energy limit, given by $E = (1/2) C_g V_{dd}^2$. The gate capacitance of a minimum geometry MOSFET is expressed by $C_g = \varepsilon_{ox} L_{ch}^2 / T_{ox}$, where $\varepsilon_{ox}$ is the permittivity of the gate oxide, $L_{ch}$ is the channel length, and $T_{ox}$ is the gate oxide thickness. The binary signal voltage swing is assumed to equal the supply voltage $V_{dd}$, as is the case for the predominant complementary metal-oxide-semiconductor (CMOS) digital circuit family. The lower limit on $E$ corresponds to a minimum channel length, $L_{ch}$, or minimum size MOSFET operating at a minimum supply voltage, $V_{dd}$.

The intrinsic switching delay of a MOSFET can be expressed in its simplest form as the transit time of carriers across its channel from source to drain or $t_d = L_{ch}/v_s$, where the average velocity of a transiting electron is taken to be the saturation velocity, $v_s$.
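The device-level switching energy and transit time can be estimated with a short sketch. The channel length (10 nm), oxide thickness (1 nm), supply voltage (0.5 V), and saturation velocity (10^7 cm/s) are taken from values quoted in the article; the relative permittivity of 3.9 assumes a silicon dioxide gate insulator, which the text does not specify.

```python
EPS0_F_PER_M = 8.8541878128e-12    # vacuum permittivity (F/m)

EPS_OX = 3.9 * EPS0_F_PER_M        # assumed SiO2 gate oxide permittivity (F/m)
L_CH_M = 10e-9                     # channel length (m)
T_OX_M = 1e-9                      # gate oxide thickness (m)
V_DD = 0.5                         # supply voltage (V)
V_SAT_M_PER_S = 1.0e5              # saturation velocity, 1e7 cm/s expressed in m/s

# Gate capacitance of a minimum geometry MOSFET: C_g = eps_ox * L_ch^2 / T_ox.
c_gate_f = EPS_OX * L_CH_M**2 / T_OX_M

# Switching energy limit: E = (1/2) * C_g * V_dd^2.
energy_j = 0.5 * c_gate_f * V_DD**2

# Intrinsic switching delay: t_d = L_ch / v_s.
delay_s = L_CH_M / V_SAT_M_PER_S

print(f"C_g = {c_gate_f:.2e} F, E = {energy_j:.2e} J, t_d = {delay_s:.1e} s")
```

Under these assumptions the gate capacitance is a few attofarads, the switching energy is about 4 × 10^-19 J, and the transit time is about 0.1 ps.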

Both the switching energy, $E$, and the switching delay, $t_d$, of a MOSFET will be at a minimum for the smallest possible channel length, $L_{ch}$. It is this observation that has driven the quest for ever smaller transistors for the past four decades. Unfortunately, as transistor channel length is scaled down, eventually the gate or threshold voltage at which the device switches from open or nonconducting to closed or strongly conducting precipitously decreases. The double-gate MOSFET structure (Fig. 3) enables the smallest values of channel length. In this device, drain-to-source channel current is controlled by electric fields created by both top and bottom gate voltages rather than from a top gate only as in conventional MOSFETs (1).

A recently derived solution to the two-dimensional Poisson equation of electrophysics defines the channel length of a double-gate MOSFET as (13) [Eq. 1; equation image missing from source], where $\lambda = [1 + (1/r)][1 + (\pi/2)]^{-1} T_{Si}$, $r = T_{Si}/(3T_{ox})$, and $\beta = q/kT$. Figure 4 illustrates two plots of Eq. 1, which indicate the key opportunity for double-gate MOSFET channel lengths in the 10-nm range. The ultimate challenge of TSI is implementing several trillion of these devices (with tightly controlled gate oxide thickness $T_{ox}$ in the 1.0-nm range, silicon channel thickness $T_{Si}$ in the 3.0-nm range, and channel length $L_{ch}$ in the 10-nm range) in a single silicon chip selling for less than $100. As indicated in Fig. 4, these values of $T_{ox}$ and $T_{Si}$ are necessary to achieve channel lengths $L_{ch}$ in the 10-nm range.

Figure 4

(A) Channel length versus oxide thickness for $T_{Si}$ = 5 nm. (B) Channel length versus silicon thickness for $T_{ox}$ = 0.8 nm. These curves illustrate the potential to achieve double-gate MOSFETs with 10-nm channel lengths for gate oxide thickness in the 1.0-nm range and silicon channel thickness in the 3.0-nm range.

The third key device limit concerns the need for ultratight control of MOSFET dimensions and dopant impurity concentrations to preclude parameter fluctuations so large as to cause functional faults in device and circuit operation. Random deviations from nominal values of MOSFET and interconnect parameters preclude attainment of the precise performance levels defined by the hierarchy of limits on TSI. A prime example of this generalization is the fundamental limit imposed by thermodynamics on signal energy transfer during a binary switching transition, E(min) = (ln2)kT. At this level of signal energy transfer, the probability of error during a binary transition is unacceptably high and therefore mandates a larger value of switching energy and its associated lower probability of error. Moreover, double-gate MOSFET models of the impact of random placement of dopant atoms in the channel region (Fig. 3) reveal that control of threshold voltage deviation demands the use of very lightly doped (typically < $10^{15}$ atoms/cm³) channel regions (14).

A distributed resistance-capacitance network serves as the model for an isolated interconnect whose response time or latency, τ, increases quadratically as interconnect length increases and as metal width and height as well as insulator thickness are scaled downward to increase wiring density (1). (As the width and height of a metal interconnect continue to scale downward, an additional severe deleterious effect enters the problem. This is the increase in the effective resistivity, ρ, that results from several factors, including strong electron scattering at the interface of the conductor and its surrounding insulator, and from large temperature increases resulting from the poor thermal conductivity of insulating layers.)
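The quadratic dependence of latency on length can be illustrated with a minimal sketch based on the figure of merit $\tau = r_{int} c_{int} L^2$. The 250-nm square cross section matches the intermediate-length interconnect discussed in the article; the bulk copper resistivity (1.7 × 10^-6 Ω·cm) and the capacitance per unit length (2 pF/cm) are assumed values, and any numerical prefactor of order unity in the distributed-network response is omitted.

```python
RHO_CU_OHM_CM = 1.7e-6        # assumed bulk copper resistivity (ohm cm)
C_INT_F_PER_CM = 2.0e-12      # hypothetical capacitance per unit length (F/cm)
WIDTH_CM = 250e-7             # 250 nm expressed in cm
HEIGHT_CM = 250e-7            # square cross section


def resistance_per_cm(rho_ohm_cm: float, width_cm: float, height_cm: float) -> float:
    """Distributed resistance r_int = rho / (W * H) in ohm/cm."""
    return rho_ohm_cm / (width_cm * height_cm)


def rc_latency_s(length_cm: float, r_int: float, c_int: float) -> float:
    """Latency figure of merit tau = r_int * c_int * L^2 (prefactor omitted)."""
    return r_int * c_int * length_cm**2


r_int = resistance_per_cm(RHO_CU_OHM_CM, WIDTH_CM, HEIGHT_CM)
for length_cm in (0.1, 0.2, 0.4):
    tau = rc_latency_s(length_cm, r_int, C_INT_F_PER_CM)
    print(f"L = {length_cm:.1f} cm -> tau = {tau:.2e} s")
```

Each doubling of length quadruples the latency, and halving both cross-sectional dimensions at fixed length would also quadruple it through the resistance term, even before the resistivity increase noted above is taken into account.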

The normalized peak cross-talk voltage due to capacitive coupling between a quiescent interconnect and two adjacent parallel interconnects that undergo binary switching transitions is given by $V_n/V_{dd} = (1/2)[c_m/(c_{int} + c_m)]$, where $c_m$ is the distributed mutual capacitance between the quiescent interconnect and an adjacent interconnect. As mutual capacitance increases because of smaller interconnect spacing, peak cross-talk voltage increases (15).
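The dependence of cross-talk on mutual capacitance can be seen in a small sketch of this expression. The capacitance values are hypothetical and given in arbitrary but consistent units; only their ratio matters.

```python
def peak_crosstalk_fraction(c_mutual: float, c_int: float, prefactor: float = 0.5) -> float:
    """Normalized peak cross-talk V_n / V_dd = prefactor * c_m / (c_int + c_m)."""
    return prefactor * c_mutual / (c_int + c_mutual)


C_INT = 1.0   # hypothetical capacitance to the underlying plane (arbitrary units)

for c_mutual in (0.25, 1.0, 4.0):
    fraction = peak_crosstalk_fraction(c_mutual, C_INT)
    print(f"c_m / c_int = {c_mutual:4.2f} -> V_n / V_dd = {fraction:.3f}")
```

The noise fraction rises monotonically with the ratio of mutual to ground capacitance and approaches the prefactor (here 1/2) as the mutual capacitance dominates.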

The MOSFET switching energy and transit time limits result in the green forbidden zone of operation for a conventional (that is, single-gate bulk silicon) device whose channel length is greater than 50 nm (Fig. 1). A 50-nm channel length represents a conservative value for limiting channel length of such MOSFETs. The latency of an interconnect modeled as a distributed resistance-capacitance network is illustrated in Fig. 2. The green region represents a forbidden zone of operation for any interconnect with a copper conductor, an insulator with a relative permittivity of two, and a square cross-sectional dimension of 250 nm (a suitable value for intermediate length interconnects). Figures 1 and 2 illustrate the comparative values of key limits at the first three levels of the hierarchy (1).

Circuit Limits

The six key circuit limits (1, 12) on TSI are a static transfer curve, a switching energy, and a propagation delay limit imposed by CMOS logic circuits; latency and signal contamination limits imposed by global interconnect circuits; and a performance fluctuation limit.

To provide the quintessential capability of binary signal discrimination, the signal voltage swing of a CMOS digital logic circuit must satisfy the constraint $V_{dd} \ge 2(\ln 2)kT/q \ge 0.038$ V, where $q$ is the charge of a single electron and $T$ = 300 K (8). This static transfer curve limit applies to the predominant static CMOS logic circuit family for which binary signal swing is equal to the supply voltage. The switching energy limit is determined by the amount of energy that is transferred during a binary transition of an inverter, the basic circuit of the CMOS logic family. The switching energy is given by $E = (1/2) C_c V_{dd}^2$, where $C_c$ is the capacitance loading the output terminals of the circuit (1). The propagation delay limit is the average time, $t_d$, required for a binary signal appearing at the input terminals of a logic circuit to be propagated to its output terminals. In essence, $t_d$ is simply the circuit latency (1).

The latency of a global interconnect circuit is, for example, the time required for a signal to propagate from the output terminals of a driver circuit, feeding a global interconnect extending from corner to corner of a chip, to the input terminals of a receiver circuit. This latency is minimal if the total resistance of the interconnect is small compared with its characteristic impedance, $Z_o$, and the output resistance of the driver equals $Z_o$ (1). The characteristic impedance is given by $Z_o = (L/C)^{1/2}$, where $L$ and $C$ are the distributed inductance and capacitance per unit length of the interconnect, respectively.
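The two conditions for minimal global interconnect latency can be checked with a short sketch. The inductance and capacitance per unit length, the wire resistance, and the factor of 10 used to interpret "small compared with" are all hypothetical choices made for illustration.

```python
import math


def characteristic_impedance_ohm(l_per_cm: float, c_per_cm: float) -> float:
    """Characteristic impedance Z_o = sqrt(L / C) for distributed L and C."""
    return math.sqrt(l_per_cm / c_per_cm)


def is_low_loss(total_resistance_ohm: float, z_o_ohm: float, margin: float = 10.0) -> bool:
    """True if the total wire resistance is at least `margin` times below Z_o.

    The margin of 10 is an assumed reading of "small compared with".
    """
    return total_resistance_ohm * margin <= z_o_ohm


# Hypothetical line parameters: 4 nH/cm and 1.6 pF/cm.
z_o = characteristic_impedance_ohm(4.0e-9, 1.6e-12)
print(f"Z_o = {z_o:.1f} ohm")
print("matched driver resistance:", z_o)
print("low loss for a 3-ohm wire:", is_low_loss(3.0, z_o))
print("low loss for a 30-ohm wire:", is_low_loss(30.0, z_o))
```

With these assumed parameters the characteristic impedance is 50 Ω, so the driver output resistance would be set to 50 Ω and the total wire resistance kept to a few ohms.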

The signal contamination limit results from mutual inductance and capacitance between a global interconnect, the victim, and its surrounding interconnects, the aggressors, causing unwanted or contaminating noise to appear on the victim when an intended signal appears on the aggressors. A simplified expression for the normalized peak cross-talk noise is given by $V_n/V_{dd} = (\pi/4)[c_m/(c_{int} + c_m)]$, where $c_m$ is the mutual capacitance between adjacent interconnects and $c_{int}$ is the capacitance between an interconnect and its underlying conducting plane (15, 16).

The performance fluctuation limit at the circuit level results from transistor and interconnect electrical parameters deviating from their nominal values for any reason, including intrinsic and extrinsic manufacturing tolerances, temperature variations, and supply voltage changes. As previously noted, fluctuations prevent circuit performance levels from reaching those defined by nominal physical limits. Typical increases in propagation delay and power dissipation due to such fluctuations are 30% and 50% above nominal, respectively, for 50-nm generation CMOS logic circuits (17).

The switching energy and propagation delay limits for 50-nm generation CMOS logic circuits are illustrated in Fig. 5; Fig. 6 illustrates the global interconnect latency limit. In both figures, the blue regions define forbidden zones for operation due to circuit limits.

Figure 5

$P$ versus $t_d$ for all levels of the hierarchy. The blue and purple zones are forbidden by representative gigascale circuit and system limits. The tiny white triangle is the allowable design space for a representative gigascale chip.

Figure 6

$L^{-2}$ versus $\tau$ for all levels of the hierarchy. The blue and purple zones are forbidden by representative gigascale interconnect circuit and system level limits. The tiny white triangle is the allowable design space for the longest interconnects of a representative gigascale chip.

System Limits

Architecture, switching energy, heat removal, clock frequency or timing, and chip size impose five critical system limits on TSI. To elucidate these limits, it is helpful to select a representative set of requirements that must be satisfied by a gigascale system. The system to be considered requires 1 billion logic gates implemented with 50-nm generation CMOS technology. The required heat removal capacity of the package must not exceed 50 W/cm². The required clock frequency is 10 GHz. The entire system must be fabricated within a single silicon chip.

A distributed shared memory multiprocessor architecture that consists of a 24 by 24 array of 576 identical macrocellular microprocessors each containing 1.73 million gates is assumed. Each macrocell communicates directly only with its four nearest neighbors. The relatively small size of a macrocell and its nearest-neighbor-only external interconnects ensure relatively short internal and external interconnects and therefore small interconnect capacitances and hence small latency and switching energy dissipation.
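The gate budget implied by this architecture can be verified with a few lines of arithmetic, using only the array dimensions and gates per macrocell stated above.

```python
ARRAY_ROWS = 24
ARRAY_COLS = 24
GATES_PER_MACROCELL = 1.73e6

macrocells = ARRAY_ROWS * ARRAY_COLS             # number of identical microprocessors
total_gates = macrocells * GATES_PER_MACROCELL   # gates on the whole chip

print(f"{macrocells} macrocells, {total_gates:.3e} gates in total")
```

The product is just under 10^9, consistent with the requirement of 1 billion logic gates.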

To determine the switching energy limit, it is necessary to derive the complete stochastic interconnect length distribution of a macrocell (18). This enables calculation of the average capacitance, $C_s$, loading a two-input CMOS logic gate in the critical path of a macrocell. The switching energy limit is given by $E = (1/2) C_s V_{dd}^2$, where $V_{dd}$ is determined by minimizing the sum of the switching and static energy dissipation during a clock cycle (19).

The heat removal limit requires that the total power dissipation of the chip, $P_t$, is no greater than the cooling capacity of the package, or $P_t \le QA$, where $Q$ is the cooling coefficient of the package (in W/cm²) and $A$ is the chip area. Heat removal actually limits the performance or maximum clock frequency of the chip (1).
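A minimal sketch of the heat removal check follows. The cooling coefficient of 50 W/cm² is the value stated for the representative system; the chip area of 4 cm² and the trial power levels are hypothetical.

```python
def max_chip_power_w(cooling_w_per_cm2: float, area_cm2: float) -> float:
    """Largest total power the package can remove: Q * A."""
    return cooling_w_per_cm2 * area_cm2


def within_heat_budget(total_power_w: float, cooling_w_per_cm2: float, area_cm2: float) -> bool:
    """True if the heat removal limit P_t <= Q * A is satisfied."""
    return total_power_w <= max_chip_power_w(cooling_w_per_cm2, area_cm2)


Q_W_PER_CM2 = 50.0    # cooling coefficient from the representative requirements
AREA_CM2 = 4.0        # hypothetical chip area

budget_w = max_chip_power_w(Q_W_PER_CM2, AREA_CM2)
print(f"power budget: {budget_w:.0f} W")
print("150 W chip allowed:", within_heat_budget(150.0, Q_W_PER_CM2, AREA_CM2))
print("250 W chip allowed:", within_heat_budget(250.0, Q_W_PER_CM2, AREA_CM2))
```

Because switching power grows with clock frequency, a fixed budget of this kind is what caps the maximum clock frequency of the chip.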

The clock frequency limit requires that the clock period, $T_c$, must be greater than or equal to the sum of the clock skew, $T_{cs}$, and the critical path delay, $T_{cp}$, or $T_c \ge T_{cs} + T_{cp}$. Clock skew is the maximum difference in arrival times of a clock pulse at any two locations on the chip, and critical path delay is the maximum time interval required for a signal to propagate between two clocked locations.
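The timing constraint can be expressed as a short check. The 10-GHz clock is the stated requirement for the representative system; the skew and critical path delay values are hypothetical.

```python
def meets_timing(clock_period_s: float, skew_s: float, critical_path_s: float) -> bool:
    """True if T_c >= T_cs + T_cp."""
    return clock_period_s >= skew_s + critical_path_s


def max_clock_frequency_hz(skew_s: float, critical_path_s: float) -> float:
    """Highest clock frequency allowed by the timing limit: 1 / (T_cs + T_cp)."""
    return 1.0 / (skew_s + critical_path_s)


CLOCK_PERIOD_S = 1.0 / 10e9   # period of the required 10-GHz clock (100 ps)
SKEW_S = 10e-12               # hypothetical clock skew
CRITICAL_PATH_S = 80e-12      # hypothetical critical path delay

print("timing met:", meets_timing(CLOCK_PERIOD_S, SKEW_S, CRITICAL_PATH_S))
print(f"maximum frequency: {max_clock_frequency_hz(SKEW_S, CRITICAL_PATH_S) / 1e9:.1f} GHz")
```

With these assumed delays the 100-ps period leaves a 10-ps margin; tripling the skew to 30 ps would violate the limit.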

The interior of the tiny white triangle in the $P$-$t_d$ plane of Fig. 5 is the allowable design space for a system that fulfills all of its specified critical requirements. The surrounding purple region is a forbidden zone of operation in which one or more critical requirements cannot be fulfilled. Similarly, the small white triangle in the $L^{-2}$-$\tau$ plane of Fig. 6 represents the allowable design space and the purple zone is a forbidden region. The orthogonal sides of the triangle in Fig. 6 are defined by the edge length of a macrocell and the latency of an interconnect of the same length.

Conclusion

A hierarchy of fundamental, material, device, circuit, and system limits reveals that 10-nm TSI is feasible assuming the critical development of double-gate MOSFETs with gate oxide thickness in the 1.0-nm range, silicon channel thickness in the 3.0-nm range, and channel length in the 10-nm range.

In Fig. 5, the white triangle—the allowable design space for a year 2011 generation TSI system (20)—is separated from the forbidden red zone imposed by fundamental limits by over five orders of magnitude. This is observed by noting the separation of the loci, representing the fundamental limit from thermodynamics and the system switching energy limit, along the abscissa of the figure. This huge separation is the result of the large interconnect capacitance that must be charged or discharged during a binary transition and the relatively large binary signal swing of 0.5 V. This amount of signal swing is necessary for large drive currents, leading to small circuit propagation delays and hence 10-GHz clock frequencies.

After four decades of rapid advances in both the performance and productivity of silicon semiconductor technology, a systematic assessment of its hierarchy of physical limits reveals an enormous remaining potential to advance from current multibillion transistor chips to the multitrillion transistor range of terascale integration.

* To whom correspondence should be addressed. E-mail: james.meindl@mirc.gatech.edu

REFERENCES
