## Optimizing flow in dammed rivers

Hydropower dams radically alter river flow regimes, often with consequences for the functioning and productivity of the waters downstream. Where fisheries in large tropical river systems are affected, there can be knock-on effects on food security. For the Mekong River, Sabo *et al.* used a data-based time series modeling approach to estimate the features of the flow regime that optimize the fishery that is crucial to food security in Cambodia (see the Perspective by Poff and Olden). Fish futures can be maximized within a managed hydrologic system with careful prescription of flows. Such data-driven approaches can be used to link hydrology to ecology and food production and specify design principles that could help to deliver food security in other river systems.

## Structured Abstract

### INTRODUCTION

The Mekong River provides renewable energy and food security for a population of more than 60 million people in six countries: China, Myanmar, Lao PDR, Thailand, Vietnam, and Cambodia. Seasonal rains flood the river’s floodplain and delta. This flood pulse fuels what is likely the world’s largest freshwater fishery in Cambodia’s Tonle Sap Lake, with >2 million tonnes of annual harvest valued at ~$2 billion. Hydropower development is crucial to the region’s economic prosperity and is simultaneously a threat to fisheries and agriculture that thrived in the natural-flow regime. The Mekong is testament to the food, energy, and water challenges facing tropical rivers globally.

### RATIONALE

We hypothesized that high fisheries yields are driven by measurable attributes of hydrologic variability, and that these relationships can be used to design and implement future flow regimes that improve fisheries yield through control of impending hydropower operations. Hydrologic attributes that drive strong fisheries yields were identified using a data-driven approach that combined 17 years of discharge and standardized harvest data with several time-series methods in the frequency and time domains. We then analyzed century-scale time series of discharge data on the Mekong and associated hydroclimate data sets to understand how current dams, independent of climate, have changed key drivers of the fishery since the early 1960s. Finally, we used estimated hydrologic drivers of the historical bag net, or “Dai,” fishery on the Tonle Sap River—the largest commercial fishery in the Mekong—to design better fisheries futures by comparing designed flows to current and pre-dam (natural-flow) regimes.

### RESULTS

Our analysis identified several features of hydrologic variability that portend strong fisheries yield. These include two “high-level” descriptors: flood pulse extent (FPExt) and net annual anomaly (NAA). FPExt, which combines flood magnitude and duration, has long been hypothesized to drive fisheries yield in ecosystems subject to flood pulses, such as the Mekong. NAA is the annual sum of daily residual flows standardized to the long-term average hydrograph. Hence, NAA is a compact measure of hydrologic variance and can be further decomposed into nine shape “components.” Several of these components drive high fisheries yields, including a long low-flow period followed by a short, strong flood pulse with multiple peaks. All essential drivers of the flood pulse fishery have been changing since the closure of the first Mekong tributary dam and are independent of changes associated with climate observed over the past century. The direction of these changes is consistent with declining fisheries yield in the Tonle Sap. Projection of the fishery driven by a hypothetical “designer” hydrograph capturing the key shape features associated with strong yield improved harvest relative to current conditions; yield was projected to exceed that of the natural-flow regime by a factor of 3.7. This result was robust to the inclusion of density-dependent recruitment in our time-series model.

### CONCLUSION

A data-driven approach reveals a new perspective on hydrologic drivers of fishery productivity in the Mekong. The extent of the flood pulse is paramount, as previous literature suggests, but so are other descriptors of hydrologic variation, including anomalous low flows. Variance is key—specifically, the sequence and timing of within-year anomalous high and low flows. A focus on variance shifts the conversation from “How much water do we need?” to “When do we need it the most, and when can we spare it?” Beneficial components of variance in the hydrograph can be described by a simple Fourier series—an asymmetric rectangular pulse train. A quantitative ecological objective function fills a critical gap in the balancing of fisheries harvest with other important objective functions including hydropower generation, rice production, and transportation. This opens the possibility of specifying and implementing flow regimes to manage rivers to satisfice trade-offs between fishery productivity and other ecosystem services provided by tropical rivers subject to flood pulses.

## Abstract

Rivers provide unrivaled opportunity for clean energy via hydropower, but little is known about the potential impact of dam-building on the food security these rivers provide. In tropical rivers, rainfall drives a periodic flood pulse fueling fish production and delivering nutrition to more than 150 million people worldwide. Hydropower will modulate this flood pulse, thereby threatening food security. We identified variance components of the Mekong River flood pulse that predict yield in one of the largest freshwater fisheries in the world. We used these variance components to design an algorithm for a managed hydrograph to explore future yields. This algorithm mimics attributes of discharge variance that drive fishery yield: prolonged low flows followed by a short flood pulse. Designed flows increased yield by a factor of 3.7 relative to historical hydrology. Managing desired components of discharge variance will lead to greater efficiency in the Lower Mekong Basin food system.

Hydropower is currently the dominant source of renewable energy worldwide (*1*–*4*) and is poised to power some of the poorest, predominantly rural, populations in the world (*5*, *6*). Storage of streamflow in reservoirs and dam operations often dampen peak flows, deliver higher base flows, and change the range and frequency of discharge variability in rivers worldwide (*7*–*11*). This hydrologic alteration—from dams and other stressors—promotes the invasion of non-native aquatic species (*12*, *13*), alters food web structure (*14*), and reduces biodiversity by homogenizing regional freshwater faunas (*10*). Traditional river conservation under the natural-flow regime paradigm (*15*) posits that the restoration of pre-dam flow variability maximizes ecological outcomes (*15*–*18*). Although many empirical studies support this hypothesis, removal of large dams and restoration of pre-dam conditions is rarely possible, especially for systems with large hydropower dams or storage reservoirs that have long lifespans and are central to socio-environmental water systems (*19*).

In tropical basins, construction of mainstem and tributary dams is imminent and requires foresight about how these new facilities are implemented. In these cases, strategic operation of existing and future infrastructure to deliver the right dose of variation via smart control (*20*–*22*) may be a more effective and immediately deployable strategy to deliver human and ecological outcomes. This recent paradigm shift is fueling much research on algorithm development to prescribe ecological objective functions [e.g., (*23*)] and optimize these with other competing objective functions to harness water for power generation, irrigation, and other human uses [reviewed in (*24*)]. The framework for optimization is rich, but the algorithms for prescribing an ecological objective function are generally less well developed. Here, we address this limitation using a data-driven time-series approach relating hydrologic variation to fishery yields, and we provide tangible alternatives in the context of hydropower development and inland fishery production in the Lower Mekong Basin (LMB).

The Mekong is the eighth largest river by discharge and hosts one of the largest inland fisheries in the world (*25*). The proposed scope of hydropower development in the Mekong and other tropical rivers in Asia, Africa, and South America poses pronounced trade-offs for biodiversity and the production of freshwater fish for food (*26*, *27*). In the LMB, monsoon rains drive a flood pulse that inundates floodplain habitats throughout the basin. In flood-pulse fisheries, the duration and magnitude of the flood pulse—which controls the spatial extent of inundation (*28*)—is a key driver of production (*28*), but there is some support for the notion that flood timing, especially the relative sequence of weak and strong flood pulses, is important (*29*, *30*). This suggests that regulation of the timing and magnitude of the high and low flows could be harnessed to improve the fisheries. However, the exact features of a hydrograph that could promote fishery yields in the LMB, and how these might be translated into actionable design principles, remain unknown.

We tested the historic role of multiple variance components of the LMB hydrograph in driving harvest of fishes (total annual harvest in kilograms, standardized by effort) from the bag net or “Dai” fishery on the Tonle Sap River. The river connects the Mekong to the Tonle Sap Lake, the largest lake and wetland in Southeast Asia and a nursery habitat for as many as 300 species of fishes (*25*, *31*), many of which provide a key source of animal protein and vitamin A for rural subsistence fishing and agricultural communities (*32*–*34*). The Dai fishery is one of the most valuable and productive commercial fisheries in the LMB (*35*) and has the longest and most complete record of fishery catches in the region, with biomass and species composition data spanning 17 years (1996–2012) from 64 Dais at 14 locations along the Tonle Sap River.

Our analysis expands on previous research in at least three ways: (i) Our data-driven rather than mechanistic modeling approach allows us to connect key aspects of hydrology to fishery dynamics while minimizing assumptions arising from specified eco-physiological parameters (*36*). Briefly, we leverage observed relationships between variance components of recent hydrology and harvest to identify shape features in the hydrograph that should maximize desired outcomes for the future fishery. Hence, our analysis provides a path for scientifically informed proactive management of a flood-pulse fishery in systems in which a reasonable understanding of the biological mechanisms at play is not within reach. (ii) We focus on design of future hydrology via ecologically informed dam operations. This focus is timely in the Mekong given the imminence of extensive hydropower in the LMB and the potential to undertake this development in ways that minimize the impact on the fishery (and thus on regional food security). The method for design that we propose leverages a highly flexible and well-understood spectral toolbox that would have direct, transferable applicability to systems engineers who operate dams. In short, the design provides a potential tool for managing mainstem hydrology for fisheries as new dams come on line. (iii) We articulate a set of ecological design principles that achieve socioecological goals by modifying the variance within an annual hydrograph without increasing mean annual discharge. A focus on variance shifts the conversation from “How much water do we need for the fishery?” to “When do we need it the most, and when can we spare it?” (*37*).

## Defining the flow objective function for the Tonle Sap fishery

We used a data-driven, time-series approach that allowed us to quantify interannual variation in discharge variance and connect this to change in harvest of the fishery. Hydrologic variance was quantified and decomposed into nine shape components describing departure from the long-term trend, using discrete fast Fourier transform (DFFT) methods (*38*). These metrics of hydrologic variance were then tested in terms of their ability to explain variance in fishery harvest according to a multivariate autoregressive state space (MARSS) modeling framework (*39*).

### Discharge variance

We measured recent discharge variability and century-scale hydrologic change (see below) using DFFT methods (*38*) (Fig. 1) on daily average stage (water level) data for the Mekong River mainstem (Stung Treng, Cambodia). The DFFT is a type of spectral analysis that converts a time series into a vector of amplitudes (power) and phases associated with all frequencies in the data set (typically *N*/2, where *N* is the length of data in the time domain). In a hydrologic context, a small number of these characteristic frequencies [<3 in most cases (*38*)] constitute the characteristic signal of the discharge time series: seasonal variation in the hydrograph. We extracted this characteristic signal for the time series spanning 1993–2012 (consistent with the record of fishery yield, 1996–2012).

Using the characteristic signal for 1993–2012, we then identified low- and high-flow deviations from the characteristic signal. These deviations are called anomalies, and they sum to zero across the entire time series. However, there is considerable interannual variation in the sum of within-year anomalies. Hence, the sum of all positive and negative anomalies in a year, or the net annual anomaly (NAA), provides a simple composite measure of annual discharge variance. Therefore, NAA describes anomalous or atypical wetness (positive NAA) or dryness (negative NAA) in any given year (*40*, *41*).

The NAA, in turn, consists of nine components that define the sequence of deviation (i.e., the shape) of the annual hydrograph in terms of magnitude, duration, and frequency of low and high anomalies (Fig. 1). Noteworthy among these are the interflood interval (IFI), which measures the number of contiguous days between maximum high-flow anomalies in adjacent water years, and the equivalent interdrought interval (IDI) that measures contiguous days between negative anomalies. Spectral anomaly magnitude and frequency (SAM and SAF) respectively quantify the strength and number of independent observations of within-year anomalies. Thus, these metrics can be negative (low, or LSAM) or positive (high, or HSAM) relative to the long-term average condition. The timing of LSAM and HSAM can be measured against the expected long-term peak signal (in days). Finally, transition time measures the number of contiguous days between the day with the greatest absolute magnitude LSAM and the day with the greatest HSAM (Fig. 1). See table S1 for quantified annual measures of NAA and its nine component measures of variance for the record spanning 1996–2012.

We further quantified a more standard measure of the magnitude of the flood pulse: flood pulse extent or FPExt. FPExt is defined as the average deviation between stage and baseflow multiplied by the number of days in which flow exceeds baseflow (i.e., average magnitude × duration). This metric has been positively related to the spatial extent of flooding and fish harvest in flood-pulse fisheries (*28*). In our analysis of hydrologic drivers of the fishery, we divided these 11 metrics into two groups of covariates: “high-level” (HL), including FPExt and NAA, and “NAA-component,” including IFI, IDI, HSAM, LSAM, HSAF, LSAF, timing of HSAM and LSAM, and transition time.

### Flow-fish relationships

Flow-fish relationships were developed using catch (biomass in kilograms) and effort data (Dai days, equal to number of fishing days across all Dais in a given Dai row and year) collected by the Inland Fisheries Research and Development Institute of Cambodia and maintained by the Mekong River Commission. The data set includes monthly species-specific harvest, allowing the calculation of catch per unit effort (CPUE) for 64 Dais in 14 locations on the Tonle Sap River. We pooled catch data within location for each year, culminating in a data set of 14 time series of total CPUE, each 17 years long.

The relationship between annual HL and NAA-component drivers and total CPUE was determined using a multivariate autoregressive state-space (MARSS) framework. Multivariate autoregressive models are common in econometrics, where they are referred to as vector autoregression. In ecology they have been used most often to identify drivers of community and food web dynamics [reviewed in (*36*)]. Because they do not require eco-physiological parameters, MARSS models can be less challenging to parameterize than mechanistic models; instead of measuring the parameters, those parameters are inferred from observational time-series data according to theory about temporal correlation patterns (*42*). More specifically, abundance time series are used to estimate parameters associated with population growth, biotic interactions, or environmental stochasticity (*42*). Moreover, the state-space approach enables the separation of often substantial observation error from true fluctuations in yield, allowing for a more accurate identification of drivers (*43*, *44*). We used the “MARSS” R-package (*39*), which provides support for fitting MARSS models with covariates to multivariate time series data via maximum likelihood, using an expectation maximization algorithm. In our case, the covariates are annual measures of hydrologic variation extracted from DFFT. In matrix form, the MARSS model can be written as(1)(2)where **w*** _{t}* ~ MVN(0,

**Q**) and

**v**

*~ MVN(0,*

_{t}**R**). The fishery data are harvests of all species in one year from a single row of Dais (hereafter, “Dai”) standardized by effort (CPUE), where effort is Dai days in a year. These data enter the model as

**y**in Eq. 2, where

**y**

*is the log-transformed CPUE of all species at each Dai. We then modeled CPUE as a linear function of the “unobservable” or true catch (*

_{t}**x**

*) and*

_{t}**v**

*, a vector of observation errors. In the state process (Eq. 1),*

_{t}**B**is an interaction matrix that can be used to model the effect of abundances on each other (e.g., density dependence),

**C**is the matrix whose elements describe the effect of each hydrologic covariate on abundance at each Dai, and

**w**

*is a matrix of the process error, with process errors at time*

_{t}*t*being multivariate normal with mean 0 and covariance matrix

**Q**. In our case, the covariate data (

**c**

*) comprised a suite of 11 possible metrics of discharge variation for the current harvest year described above (Fig. 1), thereby allowing us to measure the impact of hydrology at time*

_{t}*t*(i.e., this year) on the change in harvest between

*t*– 1 and

*t*(i.e., last year and this year).

The number of covariates is large in number relative to our data set; more important, many of the covariates are likely correlated (e.g., HSAM and LSAM should be negatively correlated within a year). In addition to this, not all covariates are likely important. To address potential estimation bias associated with collinearity and model overspecification, we devised a three-step model selection routine: (i) We divided our drivers into HL and NAA-component groups, and carried out separate MARSS analysis on the same CPUE data using either HL or NAA-component drivers as covariates. HL drivers are highly correlated (fig. S9) but not collinear [variance inflation factor (VIF) < 5]. Hence, the HL model included both NAA and FPExt. The observation that NAA and FPExt are correlated underscores the positive relationship between high-flow anomalies captured in NAA and the extent of the flood pulse. The lack of collinearity reinforces the independence of low-flow events and FPExt. (ii) For the NAA-component model, we first screened drivers with potentially high collinearity, eliminating all drivers with VIF > 5. (iii) This VIF screen was followed by model selection using an all-possible-subsets (APS) routine on the model with all remaining, noncollinear NAA-component covariates. In this APS we considered only main effects (not interactions) of the remaining independent NAA drivers. Alternative models were tested using an information-theoretic approach [Akaike’s information criterion corrected for small sample size, AICc (*45*)] that represented different hypotheses concerning which drivers best explain observed fishery yields. Included in model selection were all possible combinations of hydrologic variables (after culling for strong collinearity). Ranges in effect sizes of all parameters are presented as 95% confidence intervals (CIs) based on 1000 parametric bootstrap samples (*41*).

### Result 1: An ecological objective function for the Mekong

We found that FPExt had strong positive effect sizes and that NAA had strong negative effect sizes, indicating that flood pulse extent and low flows (negative NAA) are almost equally strong and significant drivers of the fishery (Fig. 2 and Table 1). Of the many shape components of NAA (*41*), four were positively related to fish catch: the interflood interval (IFI), the maximum magnitude of spectral anomalies (HSAM), the frequency of events with large positive spectral anomalies (HSAF), and the minimum (negative) magnitude of spectral anomalies (LSAM). Of these high-level drivers, IFI had the highest-magnitude effect, followed closely by HSAM. However, the effect of LSAM was significant in magnitude, indicating that low flows (i.e., IFI, LSAM) are as important in determining catch as the magnitude of the flood pulse itself (HSAM, Fig. 2). Hence, the design principles for productive fisheries in this system include a long dry period (large IFI, large-magnitude negative NAA and LSAM) punctuated by a strong flood pulse (large FPExt and HSAM) characterized by multiple storm peaks (high HSAF). Variance, in addition to the magnitude of the flood pulse, is paramount.

### Density dependence

Our initial analysis assumed density independence (i.e., a **B** matrix with ones in the diagonal and zeros in the off-diagonal elements in Eq. 1), based on the assumption that high fishing pressure should maintain total density below the level triggering negative feedbacks. Despite this likely scenario, we also analyzed HL and NAA-component models assuming density dependence (i.e., **B** diagonal estimated by MARSS rather than fixed at 1). For HL models, the sign and relative magnitude of coefficients for FPExt and NAA were preserved; the flood pulse and negative NAA had positive effects on the fishery in the density-dependent HL model. However, the effect size for NAA was nonsignificant under the assumption of density dependence, likely owing to lower power in a more heavily parameterized model. Similarly, for NAA-component models, effect sizes generally were in accord between density-independent and density-dependent models (Fig. 2). NAA-component models under both assumptions featured significant effect sizes of IFI and LSAM. One significant difference between these two NAA-component models was a stronger influence of low flows in the density-dependent model (high-magnitude effect size for LSAM and nonsignificant effect size for HSAM and HSAF). We note that one of the NAA-component density-dependent candidate models produced a null (zero) process error variance estimate, thereby precluding proper model averaging and prescription of robust density-dependent designs. This was likely caused by limited data relative to estimated parameters, which was not the case for any of the density-independent candidate models. Nonetheless, regardless of the assumption about density dependence, the flood pulse and low-flow anomalies are important (positive) predictors of harvest (CPUE) (*41*).

## Long-term hydrologic change vis-à-vis design principles

Our observation that the ideal hydrograph for the fishery is one with a long dry period punctuated by a large-magnitude flood pulse is troubling, given observed trends in discharge variation in the LMB. Substantial work suggests that dams mute variation, dampening the flood pulse and augmenting low or base flows (*46*–*48*). We developed a spectral approach to quantify century-scale change in discharge variance, and we used this new tool to measure change in high-level and NAA-component drivers of the Mekong fishery with reference to the closure of the first dam on the Mekong (Ubol Ratana dam in the headwaters of the Mun River, Thailand, in 1964). This dam was chosen not to imply that all potential change in hydrology derives from a single structure (Ubol Ratana), but rather to pinpoint the start of extensive and continued development in the basin.

### Century-scale hydrologic change

We constructed a detrended hydrologic baseline using a windowed average DFFT analysis across all 37 consecutive 20-year time series from 1910 to 1964 (e.g., 1910–1919, 1911–1920, …, 1945–1964) predating the closure of Ubol Ratana. In each window, we estimated the trend, characteristic phase, amplitude, and frequency of the signal. This produced a sample of 37 point observations, which was bootstrapped to generate unbiased measures of central tendency for these historical parameters. Bootstrapped values for frequency, amplitude, and phase of significant signals allow us to reconstruct a detrended historical baseline (hereafter “baseline”; green line in Fig. 1).

With a detrended baseline in hand, we then compared (detrended) post-dam time series to the baseline to estimate NAA and its components. This was done using a windowed approach as above, but across all 83 consecutive 20-year time series from 1910 to 2012 (e.g., 1910–1919, … 1965–1984, 1966–1985, …, 1993–2012). Hence, both pre-dam and post-dam data were compared to a smoothed and bootstrapped baseline to obtain 83 point estimates of NAA and the components of NAA. Comparison of detrended post-dam observations to detrended baseline hydrology allowed us to measure changes in variance (second moment) not confounded by changes in the trend (first moment). We also estimated average FPExt numerically across this same record for all 83 20-year time windows in our record. We have thoroughly documented these methods (see below).

### Breakpoint analysis

With 83-year time series for FPExt, NAA, and NAA components, we identified breakpoints in the time series using a divisive hierarchical estimation algorithm for multiple change point analysis (*49*, *50*). To further strengthen the inference that dams, and not changes in climate, were responsible for any observed changes in hydrologic variance, we performed the same breakpoint analysis on annualized, basin-scale Palmer Drought Severity Index (PDSI) estimates reconstructed from tree ring data (Fig. 3) (*51*, *52*).

### Result 2: Change in hydrologic variance after dam closure

Our measure of FPExt has declined since the closure of the first dam on the Mekong, and there have been similar declines in IFI, HSAM, and HSAF and countervailing increases in NAA (Fig. 3) (*41*). Thus, dam development changes hydrologic variation in a manner exactly opposite to the pattern that promotes higher fishery yields. Hydroclimate has changed multiple times over the observed discharge record (1910 to the present), but identified breakpoints of change in the hydrograph do not coincide with changes in the PDSI. However, they do coincide with the period of dam closure (1954–1974) in tributaries of the LMB. Many of the initial dams were storage facilities (*41*, *53*) that greatly affect the shape (*11*) and hence the NAA components of the hydrograph (Fig. 3).

## Restoration potential of designed and natural flows

As with some of the first dams closed on tributaries of the LMB, many of those slated for construction in the Sekong, Sesan, and Srepok tributaries of the Mekong in Lao PDR, Vietnam, and Cambodia will create floodwater reservoirs that have potential to be used for dry-season agriculture (*48*) and could thus be used to generate designed flows. We hypothesized that implementation of controlled releases from these storage facilities mimicking the key design principles articulated above would deliver desired fishery targets equally or more effectively than restoration of the natural-flow regime.

### Spectral implementation of an ecological objective function

To test this hypothesis, we developed a flexible algorithm using a signal-processing tool kit that generates target river flows that capture key hydrologic elements associated with strong fishery yield (Fig. 2). This algorithm is a compound sinusoidal generating an asymmetric rectangular pulse train:(3)where *a*_{0} = *Ad*, *a _{n}* = (2

*A*/

*n*π) sin

*n*π

*d*, and

*d*=

*k*/

*T*. Here,

*A*,

*d*,

*f*,

*k*,

*n*,

*t*, and

*T*are the amplitude, duty cycle, frequency, length of the flood pulse, wave series number, time (in days), and period (365 days), respectively. Here we consider an asymmetric rectangular wave (

*b*= 0) to produce a deterministic hydrograph with known characteristics that was then corrupted with red noise:(4)where ε

_{n}*~*

_{t}*N*(0, σ), ρ = 0.95, and σ = 0.042.

Asymmetry (i.e., longer IFI) arises in this pulse train from a short duty cycle (associated with a brief flood pulse), and the reddened noise spectrum provides a more natural random walk that is characteristic of storm sequences. We created two designed flows—a “Good” design (long IFI and large relative amplitude of flood pulse) and a “Bad” design (shorter IFI and small relative amplitude of flood pulse)—simulating the expected effects of medium-term hydropower development (Fig. 4). We also created pre-dam hydrographs representing restoration of the natural-flow regime by resampling a >50-year record of observed annual metrics for NAA and FPExt from the observed record before the closure of the first dam on the LMB in 1964, and strings of NAA and FPExt representing current conditions (Fig. 4). We then projected the fishery forward, preserving the same MARSS model structure and using the estimated coefficients for discharge anomaly effects (in **C**) and process error variance (in **Q**). Trajectories of fishery catch were constrained by the HL metrics for each scenario. We used catch at each Dai in 2012 as initial values, ran 100,000 simulations per scenario, and then pooled abundances across Dais by realization to obtain 2.5, 25, 50, 75, and 97.5 percentile catch levels at each time step (8 years, i.e., 2013–2020).

For each scenario, we computed “global deltas” as the difference between median pooled catch at the simulation end (year 8) and initial pooled catch, divided by the initial pooled catch (i.e., observed pooled values in 2012). We similarly computed “annual deltas” between successive years, and calculated a mean annual delta for each scenario as a relevant metric for fishery management (i.e., the expected percent change in next year’s fishery in response to changes in the flow regime). Finally, to test the robustness of our findings under different assumptions, we compared the projections obtained in the previously described model to those obtained in a model that incorporates density dependence [that is, an aggregate measure of carrying capacity of the fisheries (*41*)].

### Result 3: Engineered hydrograph performs as well as or better than natural-flow regime

Natural flow restoration (the “Pre-dam” scenario) produced a 47% annual increase in yield (median biomass) relative to the modern fishery (Fig. 5A and Table 2) under the assumption of density independence. This increase is likely attributable to a stronger flood pulse during the pre-dam era captured by the reconstructed covariates used to project this scenario. This increase was not, however, significant (CI: –0.72 to 733.54%). Despite substantial increases under restoration of natural flows, designed flows mimicking long IFI and a short but strong flood pulse (the “Good” design) produced even greater yields than natural-flow regimes (76% annual increase; Fig. 5A and Table 2).

The lower bounds of 95% CIs were positive (i.e., the intervals did not include zero) for both the projected harvest (Fig. 5) and the global deltas based on 10,000 resampled realizations (CI: 0.66 to 2022.17%; Table 2). This indicates higher certainty in increased yield under this scenario than under restoration of natural flows. Global deltas by the end of the 8-year forecast under the Good design exceeded those under the Pre-dam scenario by a factor of 3.7 (Table 2). These improvements in yield emerged even with designed flows featuring an average flood pulse—comparable in magnitude to that of the observed (current) hydrological record—by accentuating variation between high and low flows via long IFI and short and strong HSAM (Fig. 4). By contrast, short IFI and a muted flood peak (“Bad” design; Fig. 4) led to sharp (53%) annual declines in yield in the Dai fishery (Fig. 5 and Table 2).

The direction of change under each scenario did not change if density dependence was included in the model (Fig. 5). The differences between the Good design and the Pre-dam scenario were dampened: By the end of the projection, the Good design under density-dependent models delivered total deltas that were higher than those for the Pre-dam scenario by a factor of 1.6, versus a factor of 3.7 difference under density-independent models (Table 2). However, the Good design was again the only scenario delivering a certain (*P* > 95%) catch increase by the end of the forecast (Fig. 5 and Table 2). Hence, engineering the timing of lows and highs relative to the natural-flow regime can produce results similar to or better than mimicking the natural-flow regime itself.

Finally, a sketch of existing and planned dam locations and reservoir storage with respect to key spawning areas for Tonle Sap Lake fish tied to regional food security suggests that designed flows are possible (Fig. 6). There is already adequate storage in existing storage facilities in China, and in the tributaries of the Mekong in Lao PDR and Thailand, to store and release designed flows. By contrast, further development of mainstem hydropower facilities lower in the basin presents a trade-off between access to key spawning areas and the flexibility to deliver designed flows from storage facilities. Specifically, the closure of mainstem dams in Cambodia would make designed flows moot, as all upstream spawning grounds would be inaccessible to Tonle Sap Lake fishes.

## Conclusion and outlook for managing the Mekong

Our results have implications for the management of hydropower development in tropical river basins, many of which provide food security in some of the poorest countries in the world via capture fisheries. In this context, previous work suggests two hypotheses to explain how dams diminish production of inland fisheries: (i) by reducing connectivity and dispersal (*54*), and (ii) by reducing primary productivity through the entrapment of sediment supply and associated nutrient delivery from headwaters to downstream nursery habitats (*46*, *55*). The importance of a long drought period (high IFI) suggests that changes in the timing of drought and of the flood pulse influence secondary production and fish harvest independent of impacts on connectivity and sediment supply. Therefore, we propose a third hypothesis: Flow variation (high and low) may drive production by controlling redox conditions in floodplain soils, and hence the production of terrestrial nutrients and organic matter and the transfer of this material to the aquatic ecosystem. This hypothesis, however, remains untested. Alternatively, the importance of a large flood pulse (large FPExt) may in part be related to increased catchability of fish at high water levels. Our state-space approach distinguishes between real fluctuations in the fishery and fluctuations in catch numbers arising from variability in observation or sampling. This provides evidence against this fourth hypothesis and suggests that our conclusion may be robust to flow-related increases in catchability. Future research should attempt to distinguish the relative importance of these four possible mechanisms linking hydrologic variation to fishery catches.

The asymmetrical pulse train specified above (Eqs. 3 and 4) represents a flexible design algorithm that captures fundamentally important features of hydrologic variance for fish production and yield in the Mekong. Long IFI also drives primary production in desert rivers, where nitrogen accumulation within the watershed and subsequent input are positively related to a discharge sequence including long dry periods (months) followed by exceptional high flows (*56*–*58*). Hence, the asymmetric pulse train could be a general design principle with wide application in freshwater conservation with respect to environmental flow restoration. Regardless of its generality, identifying robust ecological objective functions early in the planning process is a necessary precursor to optimization with other competing objective functions (i.e., hydropower); the pulse train provides a viable ecological objective function for sustainable operations of existing and planned dams in the Mekong. Although designed flows are a potentially powerful tool to increase fish production and therefore fishery yield, a design paradigm does not guarantee sustainability of the resource. Exceptionally little is known about the current level of sustainability of the LMB fishery. Nonetheless, implementing designed flows coupled with intensive time-series monitoring of catch can provide a means to better understand and manage the fishery through adaptive management (*59*).

Because the record used in this analysis postdates all major recent dam construction in the lower Mekong, our inferences are relevant despite any past impacts of dams on upstream dispersal and downstream sediment transport. Future impacts of dams on migration and sediment delivery are, however, not directly considered in this analysis. The majority of fishes migrating in and out of the Tonle Sap are small explosive breeders in the family Cyprinidae (*Henicorhynchus* spp., *Labiobarbus* spp., and *Paralaubuca* spp.; total length <25 cm) and likely do not migrate long distances relative to the longer-lived catfishes (e.g., Pangasiidae) and larger barbs and carp (e.g., *Probarbus*, *Catlocarpio*; family Cyprinidae). These small cyprinid fishes are nonetheless a staple in the daily meals of local residents across the LMB, and have even given their name to the Cambodian currency [riel (*60*)]. This tight connection between fish and human society underscores the need for formal trade-off analysis between hydropower and food systems, including fish and rice. This task is achievable given the articulation of an ecological objective function for the Mekong, as described above, and the breadth of approaches in multi-objective optimization already available (*24*, *61*–*63*).

## Methods

Our data-driven approach connects past hydrologic dynamics of the Mekong to fishery resources (specifically, fishery catches standardized by effort), and uses estimated coefficients based on the historical covariation to designed flows and forecast fishery catches under different scenarios (Fig. 7). Each step is further articulated below.

### Step 1: Quantifying past hydrological change and variability

*Discrete fast Fourier transform (DFFT) and discharge variability metrics*

Using available stage data at the Stung Treng gage, we identified daily discharge anomalies using DFFT (*38*). Fourier analysis allows for the extraction of the characteristic seasonal profile of a time series, constituted by the recurrent frequencies, amplitudes, and phases. With the reconstructed hydrograph (or “seasonal profile”), it is then possible to identify low- and high-flow anomalies, or departures from the expected seasonal profile. We defined a suite of metrics for characterizing discharge variability, as follows:

Flood pulse extent (FPExt) (*28*) is defined as the product of the average deviation between stage and baseflow and the number of days in which flow exceeds baseflow (i.e., average magnitude × duration). Here, we defined baseflow as the average stage for the time series. To estimate NAA, we rely on spectral methods defined in (*38*). A spectral anomaly is the difference between the daily observation and the “characteristic” or 20-year seasonal signal obtained from the frequency domain via DFFT and reconstructed in the time domain (Fig. 7). NAA is the net sum of all positive and negative anomalies in a year. Although the net sum for the entire time series used to estimate the characteristic signal is null, there is generally ample interannual variation in this statistic. In this way, NAA describes anomalous or “atypical” wetness (positive NAA) or dryness (negative NAA).

NAA (*40*) is a composite measure of deviations from an expected hydrograph. Its components are events that occur outside of the expected season (timing component) or are larger than expected for a given season (magnitude component), happen more often (frequency component), or last longer (duration component) than those constituting the characteristic signal. Collectively, the timing, magnitude, frequency, and duration describe the shape of the annual hydrograph and its departure from the norm. Hence, we developed spectral-based metrics for magnitude, frequency, and duration [following (*38*)] as well as newer metrics of timing that are relevant to flood pulse systems (Fig. 1). Noteworthy new metrics are IFI, which measures the number of contiguous days between high-flow anomalies, and IDI, which measures contiguous days between negative anomalies. Finally, transition time measures the number of contiguous days between LSAM and HSAM (the anomalies with the highest negative and positive spectral anomaly magnitudes).

Using these spectral metrics, we set out to quantify changes in the flood pulse and measures of daily hydrologic variation (from DFFT) across the LMB before and during periods of active dam construction (pre-dam, 1910–1964; post-dam, 1965–2008). The year 1965 marks the closure date of Ubol Ratana dam on the Phong River in Thailand, the first dam on the Mekong system. Over the ensuing 6 years (1965–1971), four additional dams were built in tributaries, resulting in five reservoirs with a total of ~11.56 km^{3} in active storage, the bulk of which is held in Nam Ngum 1, Lao PDR (7 km^{3}). Given the large increase in tributary storage between 1966 and 1971, we hypothesized that filling and ensuing operations of these five reservoirs changed the timing of daily high and low flows (within years) and that changes in timing are observable as changes in NAA and its components.

*Windowing with bootstrapped baseline*

The direct before/after comparison provides a means for measuring absolute but not relative change in key characteristics of the hydrograph. Here, we present a method for prescribing a baseline trend and signal from the historical record (before 1965) that is then used as a reference for comparing current hydrology (after 1965). Our baseline is appealing because it is derived from a distribution of baselines resampled from different windows of the historical climate record. This windowing approach requires a long time series, and the Stung Treng record provided adequate sample size to develop a robust baseline.

We used the stage data set from the Mekong River at the Stung Treng gage from 1910 to 2008. As above, we split this time series into a historical record (1910–1964) and a current record (1965–2008). To develop a historical baseline, we windowed the DFFT over all 37 consecutive 20-year time series from 1910 to 1965 (e.g., 1910–1919, 1911–1920, …, 1946–1965). In each window, we estimated the trend, characteristic phase, amplitude, and frequency of the signal. This produced a sample of 37 point observations, which we then bootstrapped to generate unbiased measures of central tendency for these historical baseline parameters. The historical baseline was then estimated using the bootstrapped median for the trend, amplitudes, frequencies, and phases. This baseline is a detrended sinusoidal, which is then used as a reference for computing anomalies in the current record.

With a detrended baseline in hand, we then estimated departures from that historical expectation as residuals (anomalies) of detrended observations referenced to the baseline for the entire time series. This differs from direct before/after comparison in that daily discharge is referenced to the detrended signal from the historical record rather than the current record. As in the historical analysis, we did this via a window-bootstrapping procedure in which we compared observations of daily discharge from all consecutive 20-year windows to the historical baseline. In each window, we estimated all stochastic properties outlined in Fig. 1. We also computed FPExt and related traditional flood pulse metrics in each window, which does not require a baseline for comparison.

The end result of the windowed bootstrap is a pair of distributions for each metric of hydrologic variability, one for the historical record and another for the current record, both referenced to the average historical baseline. Visual inspection of overlap of distributions allows for heuristic assessment of statistical change. In addition to this comparison of distributions, we plotted trends of historical and current records chronologically to provide a tool for visualizing breakpoints in direction. For several key parameters (FPExt, NAA, IFI, HSAM, and LSAM), we conducted a formal breakpoint analysis. Specifically, to test for significant breakpoints, we used a divisive hierarchical estimation algorithm for multiple change point analysis (*49*, *50*). This nonparametric approach is robust to violations of key assumptions of regression and provides a tool for identifying multiple transitions in the time series.

### Step 2: Estimating the influence of hydrologic variation on fishery catches

*The multivariate autoregressive state-space (MARSS) modeling approach*

The core of the hydro-fisheries model (Fig. 7) is a MARSS model that connects measures of hydrologic variation to CPUE. We first extracted hydrologic covariates from the stage data set at the Stung Treng gage. This was done for the period 1993–2012, to be consistent and overlapping with the Tonle Sap Dai catch data set (1996–2012). Analysis of this 20-year time series yielded annual estimates of FPExt, NAA, IFI, IDI, transition time, HSAM, LSAM, HSAF, LSAF, and timing of annual HSAM and LSAM, for a total of 11 possible hydrologic covariates. Then we modeled total catch at all 14 Dais using state-space versions of multivariate autoregressive (MAR) models, or MARSS models. MAR models have been traditionally used in econometrics, and more recently in some ecological applications, to quantify temporal population and community trends and their drivers [reviewed in (*36*)]. MAR models rely on theory about the patterns of temporal correlation that emerge from environmental drivers and species interactions (*42*) and are advantageous because they do not require ecophysiological parameters. Moreover, MARSS models allow estimation of non-process or observation error in the data, which is important because ignoring observation error can change our inference about the underlying process (*44*). Our fish count data certainly contained observation error that could potentially bias the measured influence of abiotic drivers (hydrology) on fishery catch. Therefore, MARSS allowed us to separate the variation in the fish count data attributable to observation error from the variation attributable to true population fluctuations. We used the “MARSS” R-package (*39*), which provides support for fitting MARSS models with covariates (i.e., annual measures of hydrologic variation) to multivariate time-series data via maximum likelihood, using an expectation-maximization algorithm. In the matrix form, a MARSS model takes the form following Eq. 1 and 2 above (Eq. 1 is the state process; Eq. 2 is the observation process).

Data enter the model as **y** (in Eq. 2) and as **c** (in Eq. 1), where **y*** _{t}* is fish abundance (here, total catch of all species combined) at each Dai, and

**c**

*is a suite of metrics of discharge variation used as possible covariates (see below). Catch data are modeled as a linear function of the “unobservable” or true catch (*

_{t}**x**

*) and*

_{t}**v**

*, a vector of non-process or observation errors, where observation errors at time*

_{t}*t*are multivariate normal with mean 0 and covariance matrix

**R**(Eq. 2). In the state process (Eq. 1),

**B**is an interaction matrix and can model the effect of abundances on each other (here we set it to “identity,” i.e., ones in the diagonal and zeros in the off-diagonal),

**C**is the matrix whose elements describe the effect of each covariate on abundance at each Dai (hereafter discharge anomaly effects), and

**w**

*is a vector of process errors, with process errors at time*

_{t}*t*being multivariate normal with mean 0 and covariance matrix

**Q**.

The model used here assumes a single process error variance (i.e., a single diagonal value in **Q** for all Dai), as the biology of the modeled fishes should not differ across Dais. In contrast, we assumed Dai-specific observation error (i.e., 14 different diagonal values in **R**), as site-specific conditions can potentially affect capturability—and hence observation error—in different ways in each Dai. The off-diagonal structure also differed between **Q** and **R**: Whereas we estimated covariance in **Q** (as “true” fish abundances should be to some extent coordinated over time across Dais, e.g., due to stochastic events not accounted for in the hydrologic covariates that could affect all Dais simultaneously), we assumed no covariance in **R** by fixing zeroes in the off-diagonals (as it is unlikely that measurement error varies in a coordinated manner across Dais).

For all models, we used the natural log–transformed and Z-scored fish biomass time series as variates, and the Z-scored hydrologic time series as covariates. All parameters, including the ones of interest (discharge anomaly effects in **C**) were assessed via 95% CIs (1000 bootstrap samples). After fitting the models, autocorrelation in the residuals was inspected via the autocorrelation function. The MARSS modeling involved three substeps:

1) Covariate selection: We first tested for multicollinearity among the 11 covariates. As expected, NAA was correlated with almost all NAA-component drivers (i.e., IFI, HSAM, etc.). Hence, we implemented two separate MARSS-DFFT models, one with FPExt and NAA as HL drivers and the other with 11 NAA-component drivers (see description in Fig. 7). For each of these models, we subsequently tested for collinearity. FPExt and NAA were strongly positively correlated (*R*^{2} = 0.85), but collinearity was not strong enough to force exclusion of one variable from the model (VIF < 5). By contrast, NAA-component drivers were collinear (VIF > 10), so we culled this set of drivers to a group of eight that produced acceptable levels of collinearity (VIF < 5).

2) MARSS model fitting and selection: We then fitted the HL MARSS model (i.e., a model using both FPExt and NAA as covariates) and the NAA-component MARSS models. For the NAA-component MARSS models, we compared all possible unique combinations among the eight NAA-component (noncollinear) drivers. We compared model fits (using AICc) as well as coefficients for all nonzero coefficients in each model. We then used these outputs to calculate variable weights (importance of each driver) and weighted coefficient estimates (effect sizes, because catch and driver data are Z-scored). Effect sizes were compared between MARSS model families (i.e., HL versus NAA-component).

3) Inclusion of density dependence: To test the robustness of our findings under different assumptions, we compared the effect sizes obtained in the previously described HL model to those obtained in an HL model that incorporates density dependence (that is, an aggregate measure of carrying capacity of the fisheries). A deterministic model frequently used to describe density-dependent population growth is a discrete-time Gompertz model (*64*):(5)where *n _{t}* is population abundance,

*u*is the intrinsic population growth rate, and

*b*governs the strength of density dependence. When

*b*= 1, there is no density dependence. When

*b*gets closer to 0, the strength of density dependence increases; that is, the stronger is the pullback to the mean (or mean-reverting effect). On a log scale, we can substitute ln(

*n*) for

_{t}*x*:(6)which is the origin of MARSS Eq. 1 (

_{t}*39*). Because MARSS is multivariate,

**B**is a matrix, and it can model the effect of a Dai’s catch on the catch at the next time step (diagonal values). Here we compared discharge anomaly effects under HL models with

**B**diagonal values being estimated (density dependence) versus fixed at 1 (density independence).

### Step 3: Projecting future fishery catches under current and designed flow scenarios

*Generating designed flows using Fourier series*

We compared projected fishery yields achieved by two hypothetical designed flows (“Good” and “Bad”) to simulated historical flows in a stochastic framework (Fig. 4). To simulate stochastic designed flows, we created a Fourier series with design principles from MARSS-DFFT and corrupted this deterministic profile with red noise to simulate atmospheric forcing. Design principles identified by MARSS included a strong flood pulse (high FPExt) and negative NAA with shape characteristics that include a long IFI punctuated by a large-magnitude flood with multiple impulses (high HSAM and HSAF). To capture this shape, we created an irregular rectangular pulse train described by Eq. 3.

Large-magnitude negative NAA and long IFI (i.e., the Good design) can be achieved in this Fourier series by setting the duty cycle (*d* = *k/T*) low (*d* = 0.41, *k* = 150-day flood pulse) and amplitude high (*A* = 0.7). Here, the period (*T* = 365 days) is much longer than the length of the pulse (*k*) such that the pulse train has a lengthy period of zero amplitude punctuated by strong pulses. We also set the mean annual flow (MAF) to 95% of the value from the observed record used in MARSS-DFFT. By assuming lower than historical MAF, we ensure that our Good designs do not require more water than is typically available in the LMB, and are attainable even under heavier water resource withdrawals than might be required to support irrigated agriculture or municipal use in the future. In addition to this Good design, we created a Bad design with shorter IFI and a dampened flood peak, characteristics of many managed river systems. Here, the duty cycle was set higher (*d* = 0.46, *k* = 170-day flood pulse) and amplitude low (*A* = 0.5) and MAF was at 100% of historical levels.

We note here that the transfer function represented by Eq. 3 produces a much more diverse set of designs, including square, triangle, and rectified waves (also *b _{n}* = 0) and sawtooth waves (

*b*=

_{n}*A/n*π) as well as combinations of each of these shapes.

*Designed flows in a stochastic framework*

We created 1000 realizations of a single Good and a single Bad design using red noise via Eq. 4 above. Red noise creates more realistic storm sequences than random (white) noise can achieve. To simulate a stochastic set of historical (pre-dam) hydrographs, we resampled HL and NAA drivers from DFFT analysis of daily stage data at Stung Treng over the time period 20 years before the onset of construction of the first dam (1941–1960). Specifically, we reconstructed 1000 unique 20-year sequences of annual drivers by drawing years (1941–1960) from a uniform random distribution and annealing sets of drivers to create sequences for each driver that preserved the annual correlation between HL and NAA drivers.

*MARSS simulation framework*

Preserving the same model structure specified for the historical record, we projected the fishery forward using the estimated coefficients for discharge anomaly effects (in **C**) and process error variance and covariance (in **Q**). Because in this case we were forecasting “true” as opposed to “observed” abundances, there was no need for an observation process (Eq. 2) and thus no **R** matrix was used. Trajectories of fishery catch were constrained by the HL metrics described above. We ran 100 realizations in each of the 1000 realizations per each scenario (for a total of 100,000 realizations per scenario), using total catch at each Dai in 2012 as initial values. We then pooled abundances across Dai by realization, and calculated 2.5, 25, 50, 75, and 97.5 percentile catch levels at each time step (2013–2020) across realizations. Thus, probabilistic distributions of future catch represent pooled catches across Dai but arise from Dai-specific forecasts.

## SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/358/6368/eaao1053/suppl/DC1

Supplementary Text

Figs. S1 to S10

Tables S1 to S3

Data S1 to S11

This is an article distributed under the terms of the Science Journals Default License.

## References and Notes

**Acknowledgments:**Supported by MacArthur Foundation award 100717 and NSF grant DEB-1541694 (G.W.H.); NSF grant CBET-1204478 (J.L.S.); NSF grant EAR-1740042 (J.L.S. and G.W.H.); the National Socio-Environmental Synthesis Center (SESYNC) with funding from NSF grant DBI-1052875 (A.R.); MAA–Ja Vesitekniikan Tuki (T.A.R.); and MacArthur Foundation awards 100718 and 15-108455-000-CSD to Conservation International. All authors thank the Mekong River Commission and the Inland Fisheries Research and Development Institute (IFReDI) of Cambodia for access to data, and C. Phen, T. Cochrane, M. Cooperman, C. Costello, T. Farrell, L. Hannah, L. Kauffman, K. McCann, E. Moody, T. Pool, N. Rooney, and S. Lim for helpful comments on the design of the research or previous versions of the manuscript. Illustrations were created by S. Deviche (Fig. 1) and M. Northrop (Fig. 6), Arizona State University. Data used in this analysis are presented in the supplementary materials and are available in raw form by request from the Mekong River Commission by writing to S.N. (sonam{at}mrcmekong.org).