Remembered reward locations restructure entorhinal spatial maps

Science  29 Mar 2019:
Vol. 363, Issue 6434, pp. 1447-1452
DOI: 10.1126/science.aav5297

Reward and the map in the brain

Recent findings suggest a more complex role of grid cells in the brain than simply coding for space. The grid map in the entorhinal cortex, which is responsible for encoding spatial information, is not as rigid as originally thought and can be distorted by environmental modifications (see the Perspective by Quian Quiroga). Butler et al. compared grid cell coding during a free-foraging task and a spatial memory task in rats. They discovered that entorhinal spatial maps restructure to incorporate the location of a learned reward. Boccara et al. tested the influence of behaviorally relevant information on the cognitive map that emerges from grid cell firing in the rat medial entorhinal cortex. They found that grid cells participate in neural coding of the goal locality, not the whole environment.

Science, this issue p. 1447, p. 1443; see also p. 1388


Ethologically relevant navigational strategies often incorporate remembered reward locations. Although neurons in the medial entorhinal cortex provide a maplike representation of the external spatial world, whether this map integrates information regarding learned reward locations remains unknown. We compared entorhinal coding in rats during a free-foraging task and a spatial memory task. Entorhinal spatial maps restructured to incorporate a learned reward location, which in turn improved positional decoding near this location. This finding indicates that different navigational strategies drive the emergence of discrete entorhinal maps of space and points to a role for entorhinal codes in a diverse range of navigational behaviors.

The ability to recall and navigate to a remembered reward location is essential to survival. The hippocampus and medial entorhinal cortex (MEC) contain cells that provide representations of self-location and orientation within the local spatial environment (1–5). Initial experiments suggested a dissociation between representations in these regions: spatially modulated codes sensitive to contextual features in the hippocampus and context-independent codes for position, orientation, and speed in the MEC (2, 3, 5–10). In contrast, recent work has shown that MEC spatial codes are flexible and adaptive (6, 11–13). However, these MEC spatial coding features have primarily been observed during random foraging, whereas ethologically relevant strategies often employ more complex behaviors such as goal-directed navigation (14). Although the MEC plays a critical role in navigation (15), the degree to which remembered reward locations influence MEC neural codes remains unknown.

We recorded neural activity in the MEC and surrounding cortical areas of seven rats as they explored two arenas (1.5 m by 1.5 m) (fig. S1). In environment one (ENV1; black walls, lemon scent), rats foraged for randomly scattered crushed cereal (2–5, 12). In environment two (ENV2; white walls, vanilla scent), rats navigated to a remembered, unmarked 20-cm–by–20-cm zone in response to an auditory cue and received a food reward (0.5 to 1 cereal units). The rats freely foraged for randomly scattered crushed cereal between trials (10) (Fig. 1, A and B, and fig. S2). Reward trials (cue onset to reward-zone entry) occurred ≥10 times per session (Fig. 1C). After training (mean number of sessions to reach criterion = 15; range = 8 to 24), animals took rapid, direct paths to the reward zone upon cue onset (Fig. 1D).

Fig. 1 Performance of a task induces grid rotation and rescaling.

(A) Schematic of environments. (B) Trajectories (gray) from a paired session. Trial trajectories are shown in color. Mean trial circuity and trial time are noted below the ENV2 trajectories. The reward zone is outlined in red. (C) Histogram of intertrial intervals. (D) Circuity and trial time improved with training in individual animals (gray lines). Data are aligned to each animal’s first post-trained session (red line). Black lines indicate the median value across animals. (E) (Left) Grid cell rate maps in both environments; peak firing rate and grid score are noted atop each plot. (Middle) Corresponding autocorrelations; spacing and orientation are noted atop each plot. Red lines indicate grid axes; white text indicates ellipticity. (Right) Corresponding ENV1-ENV2 cross-correlations. Distance from the cross-correlation’s center to the nearest peak is noted atop each plot. (F) (Left) Grid cell orientations. Red lines indicate rotations equivalent to modulo 60. (Right) Histogram of grid orientation differences for experimental and control animals. (G) (Left) Grid cell spacing. Red line indicates identical spacing. (Right) Histograms of grid spacing ratio. (H) (Left) Grid cell ellipticities. Red line indicates identical ellipticity. (Right) Histograms of ellipticity ratio. (I) Scatter plots of the innermost six fields in each grid cell’s autocorrelation. Orange lines represent north-south–aligned axes; blue lines represent east-west–aligned axes. (J) Unpaired grid cell recordings from four animals, clustered into modules according to spacing and orientation. (K) Mean orientations (left) and spacings (right) in each environment for each of the six modules in (J). Error bars indicate SEM.

We considered the coding features of 778 cells recorded in both environments (fig. S3). We identified cells as encoding position (P), head direction (H), or running speed (S), then further classified P-encoding cells as grid, border, or nongrid, nonborder spatial cells (12). Between environments, we observed equal proportions of grid and border cells, as well as of cells encoding P, H, or S (fig. S4A). Stability, information content, and average and peak firing rates did not change between environments, apart from the firing rates of grid cells (fig. S4, C to E). Multiple features of local field potential theta oscillations (6 to 10 Hz) were also similar between environments (fig. S5).

We next asked whether task demands alter the structure of MEC firing patterns (6, 9). Grid cells’ (n = 102 cells) firing patterns reorganized between environments, despite their shared geometric shape and size (Fig. 1E and table S1). First, the orientation of the grid pattern rotated (median absolute orientation change: 12.53°, P = 1.12 × 10⁻¹²) (Fig. 1F). These rotations varied across animals (mean rotation range: −27° to +7°) and resulted in grid orientations that were less environmentally aligned in ENV2 compared to ENV1 (P = 0.001) (Fig. 1I) (13). Second, there was a small decrease in grid spacing (P = 0.015) (Fig. 1G), but not in field size (P = 0.85), in ENV2. Third, we observed less-elliptical grid patterns in ENV2 (P = 0.006) (Fig. 1H). Finally, we observed a translation in the grid pattern (fig. S6, D to G) (16, 17). Co-recorded grid cells changed coherently and maintained their phase offsets (fig. S6A). The observed grid orientation, scaling, and ellipticity changes also held for unpaired grid cell recordings clustered into modules (Fig. 1, J and K) (18). Overall, 49 of 102 grid cells showed a statistically significant change on at least one measure (fig. S6B), with changes largely conserved within animals (fig. S6, C to H). Notably, we observed grid pattern translation but not orientation, spacing, or ellipticity changes when ENV1 and ENV2 had the same behavioral demand (random foraging, n = 3 rats), although there was no difference in the change in grid spacing between groups (Fig. 1, F to H, right; and table S1) (11, 13, 16, 17).
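Because a hexagonal grid pattern repeats every 60° of rotation, an orientation change between environments is only defined modulo 60°. The following is a minimal sketch of how such a change could be computed, not the authors' analysis code: the function name, the choice of wrapping into [−30°, 30°), and the example angles are all assumptions made for illustration.

```python
import statistics


def grid_orientation_change(theta_env1_deg, theta_env2_deg, period_deg=60.0):
    """Signed orientation change (ENV2 - ENV1), wrapped into [-period/2, period/2).

    Assumes grid orientation is measured in degrees and that rotations
    differing by a multiple of `period_deg` are equivalent.
    """
    half = period_deg / 2.0
    return (theta_env2_deg - theta_env1_deg + half) % period_deg - half


# Hypothetical (ENV1, ENV2) orientations for three grid cells, in degrees.
example_orientations = [(5.0, 55.0), (10.0, 22.5), (40.0, 28.0)]
changes = [grid_orientation_change(a, b) for a, b in example_orientations]

# Summary statistic of the kind quoted in the text: median absolute change.
median_abs_change = statistics.median(abs(c) for c in changes)
```

With this wrapping, a raw difference of +50° is treated as a −10° rotation, which keeps all changes within the range that a sixfold-symmetric pattern can distinguish.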

Consistent with task demands restructuring MEC representations, head direction, border, and nongrid spatial cells reorganized between environments. Head direction (HD) cells coherently rotated their preferred direction within sessions and animals (both P < 0.002) (Fig. 2A and fig. S7, A to C), with 70 of 132 cells exhibiting significant changes in tuning. Rotations were consistent with the rotation in grid orientation (all HD-grid cell pairs: correlation coefficient r = 0.45, P = 0.02; averaged within sessions: r = 0.70, P = 0.02) (Fig. 2, B and C). A majority (24 of 36) of border cells remapped between environments, primarily through rotations (Fig. 2, D and E) (6). Lastly, 196 of 271 nongrid spatial cells significantly remapped between ENV1 and ENV2, with task-trained animals showing more remapping than free-foraging controls (task-trained mean correlation coefficient ± SD: 0.32 ± 0.22; control: 0.41 ± 0.27; 49 of 100 control cells remapped, proportions test P = 3 × 10⁻⁵) (Fig. 2G and fig. S7E). We observed no changes in S-encoding cells (fig. S7, F and G).
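The comparison of remapping proportions (196 of 271 task-trained cells versus 49 of 100 control cells) can be illustrated with a pooled two-proportion z-test. This is a sketch under assumptions: the text names only a "proportions test", so the pooled-variance, two-sided form used here is one common variant and may not be the exact test the authors ran. The function name is hypothetical.

```python
import math


def two_proportion_z_test(k1, n1, k2, n2):
    """Two-sided, pooled two-proportion z-test.

    k1/n1 and k2/n2 are the counts of 'successes' (here, remapped cells)
    out of the totals in each group. Returns (z, p_value).
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal, via erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Counts taken from the text: task-trained vs. free-foraging control cells.
z, p = two_proportion_z_test(196, 271, 49, 100)
```

Under these assumptions z comes out a little above 4 and the P value is on the order of 10⁻⁵, in line with the reported value.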

Fig. 2 Performance of a task induces remapping in head direction, border, and nongrid spatial cells.

(A) (Top) Four co-recorded HD cells in each environment. The rightmost panel indicates each cell’s rotation between environments. (Bottom) Rotation angles observed across sessions. Gray lines indicate boundaries between animals. (B) Co-recorded grid and HD cells. (Top) HD tuning curves. (Bottom) Grid cell autocorrelations with grid axes. Co-rotation of grid and HD signals shown by rotating the ENV1 grid axes by the rotation observed in co-recorded HD cells (blue dashed lines). (C) (Gray) HD cell orientation change (between environments) versus grid cell orientation change for all possible pairs of co-recorded HD and grid cells. (Blue) Same data, with all HD or grid cells recorded within the same session averaged together. (D) Border cell rate maps in ENV1 and ENV2. (E) Histograms of border cell rate map ENV1 versus ENV2 correlation coefficients (left) and rotation values (right). (F) Nongrid spatial cell rate maps in ENV1 and ENV2. (G) (Left) Histogram of nongrid spatial cell rate map ENV1 versus ENV2 correlation coefficients (black, cells with significant remapping; gray, nonsignificant remapping). (Right) Histogram of the difference in spatial stability between ENV1 and ENV2.

We next examined whether spatial restructuring incorporated the remembered reward location. As running speed and spatial sampling differed between environments (fig. S2), we first downsampled the data to match in speed and position occupancy between environments (3, 5, 12). The relative activity of grid and nongrid spatial cells increased near the reward zone in ENV2 compared with ENV1 (signed-rank test, normalized activity versus distance slopes, grid: P = 0.0025; nongrid: P = 5 × 10⁻⁴) (Fig. 3, A and B, and fig. S8, A to D). This effect was also present at the level of individual animals (fig. S8A) and was not driven by increased occupancy near the reward zone (fig. S8, E to H). Directional and nondirectional grid cells showed comparable reward-related firing increases (fig. S8, I and J).
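The "normalized activity versus distance slope" can be sketched as a straight-line fit of a cell's normalized firing rate against distance from the reward zone, computed separately in each environment. The details below are assumptions for illustration only: rates are normalized by the cell's peak binned rate, distances are bin centers in centimeters, and the example values are invented. The authors' normalization and binning are described in their Materials and Methods, not in this section.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def activity_distance_slope(distances_cm, rates_hz):
    """Slope of peak-normalized firing rate versus distance from the reward zone.

    A negative slope means relatively higher activity near the reward zone.
    Peak normalization is an assumption of this sketch.
    """
    peak = max(rates_hz)
    normalized = [r / peak for r in rates_hz]
    return least_squares_slope(distances_cm, normalized)


# Hypothetical binned data for one cell: distance-bin centers and mean rates.
distances = [10.0, 30.0, 50.0, 70.0, 90.0]
env1_rates = [4.0, 4.0, 4.0, 4.0, 4.0]   # flat profile in ENV1
env2_rates = [8.0, 6.0, 5.0, 4.0, 4.0]   # elevated near the reward in ENV2

slope_env1 = activity_distance_slope(distances, env1_rates)
slope_env2 = activity_distance_slope(distances, env2_rates)
slope_difference = slope_env2 - slope_env1  # negative: more reward-weighted in ENV2
```

Collecting one such slope difference per cell gives the paired sample on which a signed-rank test of the kind reported above could be run.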

Fig. 3 Grid and nongrid spatial cells have localized firing-rate changes near the reward zone.

(A) (Left) Mean normalized grid cell firing rate as a function of distance from the reward zone. Shaded regions indicate SEM. (Right) Difference in grid cell firing rate (ENV2 compared with ENV1). (B) (Left) Mean normalized non-grid spatial cell firing rate as a function of distance from the reward zone. (Right) Difference in firing rate (ENV2 compared with ENV1). (C) (Left) Rate maps for three grid cells recorded in both environments. (Right) Corresponding field peak firing rates, plotted as a function of the field’s distance from the reward zone. Best-fit lines are shown; the difference between the best-fit lines (slope ENV2 − slope ENV1) is indicated in the upper left corner of each plot. (D) (Top) Best-fit slope values for each cell in ENV1 and ENV2. (Bottom) Histogram of slope differences for grid cells. FR, firing rate. **P = 0.01. (E) (Top) Distance from the reward zone to the highest FR field in each environment for each cell. (Bottom) Histogram of distance differences. **P = 0.01. (F) Nongrid spatial cells that show reward preference in ENV2 correspond to four categories of remapping (I, II, III, and IV); two examples per group are shown. (G) (Top) Fraction of reward-preferring cells in each remapping category (of 159 total reward-preferring cells). (Bottom) Fraction of cells in each remapping category that show reward preference.

We next investigated how grid cells restructure their firing toward the reward zone (fig. S9A). Our observation of coordinated translations between simultaneously recorded grid cells (fig. S6A) eliminated the possibility that cells translate independently. Emergence of new grid fields, distortion of the grid pattern, and systematic reshaping of grid fields were also eliminated, as we did not observe changes in grid score (fig. S4B), the number of fields, or the distance between the reward zone and closest field (fig. S9B and table S2). Moreover, we did not observe changes in field size or eccentricity as a function of fields’ proximity to the reward zone (fig. S9C). Finally, we examined whether grid field rate-remapping (19) shows reward specificity, such that fields near the reward zone exhibit higher firing rates. We did not observe significant changes in the overall field peak firing rates or coefficient of variation among field peak firing rates (fig. S9D). However, the peak firing rate of grid fields closer to the reward zone was higher in ENV2 (P = 0.01) (Fig. 3D), and the distance from the reward zone to the grid field with the highest firing rate was smaller in ENV2 (P = 0.01) (Fig. 3E and fig. S9, E and F).

We then investigated how nongrid spatial cells (n = 271 cells) remapped to support reward-localized changes in firing rates. Nongrid spatial cells did not extend their firing fields in a reward-specific manner, as average field size, total field area, and number of fields did not change (fig. S10A). Instead, many cells (n = 159 cells) heterogeneously remapped to preferentially encode the reward location (Fig. 3, F and G, and table S3). First, some cells (group I) exhibited coherent spatial tuning in both environments, with a firing field located closer to the reward zone in ENV2 (P = 2 × 10⁻⁵). A second group of cells (group II) exhibited coherent spatial tuning in ENV1, with the field farther from the reward than expected by chance (P = 0.02). Third, a population of cells (group III) had coherent spatial tuning only in ENV2, and this activity was closer to the reward zone than expected by chance (P = 0.002). Finally, group IV did not exhibit any coherent spatial fields but exhibited increased activity near the reward zone in ENV2. The proportion of cells exhibiting reward preference did not depend on the group type (all P > 0.05) (Fig. 3G, bottom). Further, reward preference and other coding features did not cluster (fig. S10, B and C).

We next asked whether these changes reflected neural activity during the spatial task trials or were persistent throughout the ENV2 recordings. We analyzed two rate maps for each ENV2 session: one for task trajectories (tone onset to zone entry) and one for speed- and position-matched no-task trajectories (Fig. 4, A and B). Grid cells’ average firing rate did not differ between the task and no-task trajectories, though nongrid spatial cells had higher firing rates during task times (fig. S11). Notably, task and no-task maps both exhibited significant increases in normalized activity near the reward zone (Fig. 4, C and D, and table S4), indicating that the reward influence was present throughout the session.

Fig. 4 Long-term changes in the spatial map support spatial decoding near the reward.

(A) Rate maps of the full ENV2 session (left), task trajectory (middle), and no-task trajectory (right) speed-matched for each position bin. (B) (Left) Average normalized firing rate as a function of distance from the reward zone for task (orange) and no-task (green) trajectories. Data are for the cell featured in (A). (Right) Mean running speed during task and no-task trajectories as a function of distance from the center of the reward zone, before and after speed-matching. (C and D) (Left) Average normalized firing rate for grid (C) and nongrid position (D) cells as a function of distance from the reward zone. (Right) The slopes of both task and no-task trajectories were significantly negatively distributed for grid (C) and nongrid position (D) cells. (E) Example decoding error maps for ENV1 (left), ENV2 (middle), and the normalized difference (ENV2 − ENV1; right) from a single session (ENV1: n = 6 P-encoding cells, ENV2: n = 5 cells). (F) Normalized error (left) and ENV2-ENV1 error difference (right) as a function of distance from the reward zone for the example in (E). (G) Normalized error versus distance from reward zone for each environment, averaged over all decoding sessions (n = 43 sessions). (H) Average difference in error (ENV2 − ENV1) for all sessions. (I) Distribution of slopes of ENV2-ENV1 tuning curves across sessions. (J) Across all sessions, the decoding error within 30 cm of the reward zone is lower in ENV2 than ENV1 (median difference in error = −4.3 cm, signed-rank P = 0.028).

Finally, we asked how the task-associated changes in MEC representations could affect navigation. MEC representations can support vector navigation by providing unique combinations of spatial firing patterns, which downstream neurons may use to estimate the distance between an animal’s position and a goal location (20). We estimated the animal’s position using the activity from simultaneously recorded neurons in ENV1 and ENV2 (Fig. 4, E and F). Using a Bayesian decoder, we observed that the decoding accuracy increased near the reward zone in ENV2 compared with ENV1 (ENV2 slope > ENV1 slope for 27 of 43 sessions, median slope difference = 1 × 10⁻³, signed-rank P = 0.042) (Fig. 4, G to I, and fig. S12). Moreover, the improved position decoding was highly localized to the reward zone, with a decrease in decoding error in ENV2 observed up to 30 cm from the reward-zone center (Fig. 4J and fig. S12, A to C). Reward-related decoding did not consistently covary with fluctuations in task performance (fig. S12, E and F).
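For readers unfamiliar with Bayesian position decoding, the idea can be sketched as follows. This section states only that a Bayesian decoder was used, so the specifics here are assumptions: independent Poisson spiking, a flat prior over position bins, a one-dimensional list of bins, and a maximum a posteriori readout. Function and variable names are hypothetical.

```python
import math


def decode_position(rate_maps, spike_counts, window_s, eps=1e-9):
    """Return the index of the most probable position bin.

    rate_maps:    one list per cell, giving its mean firing rate (Hz) in each bin.
    spike_counts: spikes emitted by each cell within the decoding window.
    window_s:     length of the decoding window in seconds.

    Assumes independent Poisson spiking and a flat prior, so the posterior
    over bins is proportional to the likelihood; terms that do not depend
    on position (the factorials) are dropped from the log-likelihood.
    """
    n_bins = len(rate_maps[0])
    best_bin, best_log_like = None, -math.inf
    for b in range(n_bins):
        log_like = 0.0
        for rates, count in zip(rate_maps, spike_counts):
            expected = rates[b] * window_s + eps  # eps avoids log(0)
            log_like += count * math.log(expected) - expected
        if log_like > best_log_like:
            best_bin, best_log_like = b, log_like
    return best_bin


# Two hypothetical cells with fields at opposite ends of a three-bin track.
rate_maps = [
    [10.0, 1.0, 1.0],  # cell A fires mostly in bin 0
    [1.0, 1.0, 10.0],  # cell B fires mostly in bin 2
]
decoded_when_a_fires = decode_position(rate_maps, [5, 0], window_s=0.5)
decoded_when_b_fires = decode_position(rate_maps, [0, 5], window_s=0.5)
```

Decoding error is then the distance between the decoded and true position; a map with more distinctive firing near one location yields a sharper likelihood, and therefore lower error, in that region.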

Our understanding of how remembered reward locations mediate MEC navigational codes has lagged owing to a lack of task diversity. Here, we report that the firing rate and spatial pattern of MEC representations restructure in response to changes in navigational strategy. This restructuring did not reflect trajectory-specific coding, as previously observed in the MEC (21), which suggests that task-relevant features of the two environments evoked separate long-term map representations (17). However, the precise parameters of MEC map restructuring may depend on experience and task familiarity, as recent work indicates (22). Combined, our data point to the MEC as a region capable of dynamically altering its coding features to integrate relevant contextual features to support a range of navigational strategies.

Supplementary Materials

Materials and Methods

Figs. S1 to S12

Tables S1 to S4

References (24–32)

References and Notes

Acknowledgments: We thank A. Borrayo and A. Diaz for histology assistance. Funding: L.M.G. is a New York Stem Cell Foundation–Robertson Investigator. This work was supported by funding from the New York Stem Cell Foundation, NIMH MH106475, NIDA DA042012, Office of Naval Research N000141812690, Simons Foundation 542987SPI, and the James S. McDonnell Foundation awarded to L.M.G. and a Stanford Interdisciplinary Graduate Fellowship awarded to K.H. Author contributions: L.M.G., W.N.B., and K.H. conceptualized experiments and analyses. W.N.B. and K.H. performed implantations and collected and analyzed data. All authors wrote the paper. Competing interests: The authors declare no competing interests. Data and materials availability: Data are available at Code is available at Zenodo (23).