Ocean Model Formulation Influences Transient Climate Response

The transient climate response (TCR) is 20% higher in the Alfred Wegener Institute Climate Model (AWI-CM) than in the Max Planck Institute Earth System Model (MPI-ESM), whereas the equilibrium climate sensitivity (ECS) is up to 10% higher in AWI-CM. These results are largely independent of the two model resolutions considered for each model. The two coupled CMIP6 models share the same atmosphere-land component ECHAM6.3 developed at the Max Planck Institute for Meteorology (MPI-M). However, ECHAM6.3 is coupled to two different ocean models, namely the MPIOM sea ice-ocean model developed at MPI-M and the FESOM sea ice-ocean model developed at the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI). A reason for the different TCR is related to ocean heat uptake in response to greenhouse gas

although ocean heat uptake and inhomogeneous Pacific surface warming that may not be properly simulated by CMIP5 models have been recently acknowledged as a source of uncertainty (Sherwood et al., 2020).
Typically, a coupled atmosphere-ocean model needs to be spun up for at least 500 years to account for adjustments on decadal to centennial time scales, while it may take 5,000 years to achieve an equilibrium of the deep ocean (Li et al., 2012). The e-folding time (the time at which the difference between simulated and equilibrium temperature of the deep ocean has decayed to 1/e times the difference between initialized and equilibrium temperature) of a previous version of the Alfred Wegener Institute Climate Model (AWI-CM) has been found to be about 800 years. For a radiation imbalance, the effectiveness of vertical mixing and diffusion at high latitudes as well as the thermohaline circulation have been found to be important for the heat redistribution in the ocean (Banks & Gregory, 2006; Gregory, 2000; Hansen et al., 2011). Depending on the intensity of the circulation and vertical mixing, the ocean can be a heat sink of different intensity in response to greenhouse gas forcing, which may also influence the climate sensitivity, especially the TCR. It has been found that the ocean has taken up 90% or more of the radiation imbalance over the last decades (von Schuckmann et al., 2020; Meyssignac et al., 2019; Rhein et al., 2013).
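As a worked illustration of the e-folding definition above, one can assume that the deep-ocean temperature relaxes purely exponentially with a single time scale τ. This single-exponential form is a simplifying assumption made here for illustration only and is not a result of the cited studies; the symbols T_eq, T_init, and τ are introduced for this sketch.

```latex
\[
T(t) - T_{\mathrm{eq}} = \left(T_{\mathrm{init}} - T_{\mathrm{eq}}\right) e^{-t/\tau},
\qquad
\frac{T(\tau) - T_{\mathrm{eq}}}{T_{\mathrm{init}} - T_{\mathrm{eq}}} = \frac{1}{e} \approx 0.37 .
\]
```

Under this assumption and with τ ≈ 800 years, a 500-year spin-up would still leave a fraction e^{-500/800} ≈ 0.54 of the initial deep-ocean disequilibrium.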
The role of the ocean for the climate sensitivity, especially TCR, has been investigated in various previous studies: Winton et al. (2014) examine the role of horizontal resolution (eddy parameterizing vs. eddy resolving) on the TCR and attribute changes in the TCR to differences in initial Atlantic meridional overturning circulation (AMOC) as well as AMOC decline and Southern Ocean surface warming under increasing greenhouse gas concentrations. Models with weaker AMOC decline tend to show higher TCR to ECS ratios. He et al. (2017) point out that starting from different ocean climates can have an influence on ocean circulation changes. In their study, a simulation with initially weaker overturning shows a smaller overturning decrease than the one with initially stronger overturning. The weaker overturning decrease leads to stronger high latitude surface heating (since the poleward energy transport is reduced less) and thus to a stronger TCR compared to their simulation with stronger overturning decrease. Furthermore, convection regions in the high latitudes such as the Labrador, Weddell, and Ross Seas play an important role for the redistribution of the heat. In addition, TCR may be strongly influenced by the calculation method, and long control simulations are important to get a robust estimate of TCR (Liang et al., 2013). This is in line with investigations of variability in a pre-industrial control state, which is stronger than the variability in a transient state with increasing greenhouse gas concentrations (Brierley et al., 2009). Newsom et al. (2020) argue that differences in surface warming and its influences on ventilation, especially in the Southern Ocean and subtropical areas, are important for the heat uptake of the ocean.
Two models in the CMIP6 archive, MPI-ESM and AWI-CM, offer the opportunity to examine the influence of a different ocean-sea ice model formulation on the climate while leaving the atmosphere and land surface component untouched. In contrast to previous studies (e.g., Winton et al., 2014; Yokohata et al., 2007), we consider not only different horizontal ocean resolutions within one model, but also completely different ocean models. Both models utilize the atmosphere and land surface model ECHAM 6.3/JSBACH 3.2.0. For the ocean and sea ice, MPI-ESM uses MPIOM and AWI-CM uses FESOM. The models are documented in Mauritsen et al. (2019) and Semmler et al. (2020). For both models, two different configurations are considered in this study: the low resolution (LR) version with dynamic vegetation in JSBACH and the high/medium resolution version without dynamic vegetation in JSBACH.
The MPI-ESM versions have been tuned to an ECS of 3°C using cloud feedback parameters to match the observed historical 2 m temperature evolution over the last 150 years (Mauritsen & Roeckner, 2020). Nevertheless, since the AWI-CM setups use the same ECHAM 6.3/JSBACH 3.2.0 version as MPI-ESM without any further tuning, different ECS and TCR in the AWI-CM setups can be interpreted as the influence of the different ocean model formulation and of the slightly different land-sea masks resulting from the coupling to the different ocean model formulation. A third candidate for this comparison could be the FOCI model (ECHAM6.3/JSBACH 3 coupled to the ocean model NEMO: Matthes et al., 2020). However, in this case the ECHAM 6.3 model has been tuned differently, and therefore the reasons for different evolutions of the ocean state cannot be purely ascribed to the different ocean model formulation.
In Section 2 a brief model description of both coupled systems is given; Section 3 compares and explains the results of the different models. In Section 4 the results are discussed in the light of previous studies and in Section 5 conclusions are given.

Model Components
In the following, a brief summary of the model components used in this study is given. We briefly characterize the model components rather than the coupled models since both coupled models share the same atmospheric and land components. Both coupled models employ the coupler OASIS3-MCT_3.0 (Craig et al., 2017).

ECHAM 6.3/JSBACH 3.2.0
The general circulation model for the atmosphere is ECHAM 6.3 as it is implemented in the CMIP6 version of MPI-ESM (Mauritsen et al., 2019; Mauritsen & Roeckner, 2020; Müller et al., 2018). ECHAM consists of a dry spectral-transform dynamical core, a transport model and a suite of physical parameterizations. The vertical discretization employs a hybrid sigma-pressure coordinate system. ECHAM's design principles and features are described in detail in Stevens et al. (2013); Mauritsen et al. (2019) account for changes from the CMIP5 to the CMIP6 generation of ECHAM. The model is applied here in two configurations that differ in horizontal and vertical resolution. The LR version applies a triangular truncation of the spherical harmonics to 63 wave numbers (T63). In physical space this translates to a grid spacing of roughly 200 km. In the vertical, the LR version employs 47 levels. The higher-resolution (HR) set-up has a T127 truncation (roughly 100 km) and 95 vertical levels. Both versions resolve the atmosphere up to 0.01 hPa or about 80 km.
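The correspondence between spectral truncation and grid spacing quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the commonly used Gaussian grid sizes of 192 longitudes for T63 and 384 longitudes for T127 (these numbers are not stated in the text), and the function names are hypothetical.

```python
EARTH_CIRCUMFERENCE_KM = 40075.0  # equatorial circumference of the Earth

# Assumed number of longitudes of the Gaussian grid for each truncation
# (192 for T63, 384 for T127); other truncations are not covered here.
ASSUMED_LONGITUDES = {63: 192, 127: 384}


def equatorial_spacing_km(truncation: int) -> float:
    """Zonal grid spacing at the equator for the assumed Gaussian grid."""
    return EARTH_CIRCUMFERENCE_KM / ASSUMED_LONGITUDES[truncation]


for truncation in (63, 127):
    print(f"T{truncation}: about {equatorial_spacing_km(truncation):.0f} km at the equator")
```

Under these assumptions the spacing comes out close to 200 km for T63 and close to 100 km for T127, in line with the rounded values given in the text.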
The land component is represented by the JSBACH model (Mauritsen et al., 2019; Reick et al., 2013, 2021). JSBACH 3.2.0 provides the lower boundary conditions for the atmosphere and includes the land geochemistry, soil hydrology, the terrestrial carbon cycle, and a river-routing scheme. In the LR version of JSBACH 3.2.0, dynamic vegetation is considered while in the high resolution version the vegetation is prescribed. While for the AWI model configurations this difference is reflected in the model name of the coupled system (AWI-ESM with dynamic vegetation and AWI-CM without dynamic vegetation), this is not the case for the MPI model configurations (both are called MPI-ESM).

FESOM 1.4
A detailed description of FESOM 1.4 is given by Wang et al. (2014). This model is the first global sea ice-ocean model to use unstructured meshes with variable resolution for climate research (Wang et al., 2014). The mesh flexibility allows resolution to be increased in dynamically active regions while keeping a relatively coarse resolution elsewhere. FESOM allows global multi-resolution simulations without traditional nesting. The dynamical core of FESOM 1.4 employs the finite element method to solve the primitive equations. The mesh is composed of horizontal triangles that constitute the faces of three-dimensional prisms which are cut into tetrahedral elements. The ice module FESIM uses 0-layer thermodynamics and elastic-viscous-plastic rheology. Small-scale mixing along isopycnals as well as tracer stirring through eddies are parameterized (Gent & McWilliams, 1990; Redi, 1982) and scaled by the local horizontal resolution. Vertical mixing is implemented via KPP (Large et al., 1994). FESOM 1.4 is used in this study in two different horizontal resolutions while maintaining the same 46 vertical levels. The LR resolution varies between about 25 km in the Arctic and the tropics and around 100 km in the subtropical regions (Figure 1a). The high resolution mesh features resolutions as fine as 8 km over key ocean regions such as the Gulf Stream/North Atlantic Current area, parts of the Southern Ocean as well as coastal regions (Figure 1b). Over the subtropical areas the horizontal resolution is around 80 km.

MPIOM
MPIOM is the ocean component of MPI-ESM (Mauritsen et al., 2019; Müller et al., 2018). MPIOM applies the Boussinesq and hydrostatic approximations and is discretized on an Arakawa C-grid in the horizontal and on a z-level grid in the vertical direction (Marsland et al., 2003). Subgrid-scale parameterizations such as those for lateral mixing on isopycnals and tracer transports by unresolved eddies are described in Jungclaus et al. (2013). Vertical mixing employs a Richardson-number dependent formulation (Pacanowski & Philander, 1981) and directly wind-induced mixing in the mixed layer (Marsland et al., 2003). The configurations used in this study differ in their horizontal grid design (Figure 1). The LR version uses a bi-polar grid, where one grid pole is located over Antarctica, the other over Greenland. The resulting grid features enhanced resolution in the northern deep water formation regions and the Greenland-Scotland Ridge (Figure 1c). The nominal 1.5° resolution is therefore transformed to less than 20 km near Greenland and almost 200 km in the tropical Pacific. The three-pole "HR" set-up has a more uniform resolution of 0.4° (Figure 1d), which can be classified as "eddy-permitting." The vertical dimension is represented in both cases by 40 levels with the first 20 levels covering the upper 700 m. The dynamical sea-ice model in MPIOM uses a viscoplastic rheology following Hibler (1979) and the thermodynamic representation of sea ice is based on a simple zero-layer mono-category formulation (Semtner, 1976). The sea-ice model is basically unchanged from the version described in Notz et al. (2013).

Simulations
In this study we use piControl, 1pctCO2, and abrupt-4xCO2 simulations from the Coupled Model Intercomparison Project 6 (CMIP6). A detailed description of the experimental protocol of these "Diagnostic, Evaluation and Characterization of Klima" (DECK) simulations is given in Eyring, Bony, et al. (2016). In short, piControl is a long control simulation of at least 500 years with constant 1850 greenhouse gas, aerosol, ozone, and solar forcing. Both the 1pctCO2 and the abrupt-4xCO2 simulations are branched off from the piControl simulation. For 1pctCO2, the CO2 concentration is increased by 1%/year for 140 years, reaching a quadrupling at the end. For abrupt-4xCO2, the CO2 concentration is quadrupled instantaneously at the beginning of the simulation and the simulation has been carried out for 150 years. The simulations of four model configurations are used in this study: AWI-CM-1-1-MR, AWI-ESM-1-1-LR, MPI-ESM1-2-HR, and MPI-ESM1-2-LR, for simplicity referred to in the following as AWI-MR, AWI-LR, MPI-HR, and MPI-LR. Overviews of the results of these simulations have been published in Semmler et al. (2020) for AWI-MR, and for MPI-LR and HR in Mauritsen et al. (2019), Müller et al. (2018), and Mauritsen and Roeckner (2020). Data of all four configurations are published at the Earth System Grid Federation (ESGF) (Danek et al., 2020; Jungclaus et al., 2019; Semmler et al., 2018; Wieners et al., 2019).
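The timing of CO2 doubling and quadrupling in the 1pctCO2 experiment follows directly from compound growth at 1% per year. The short sketch below illustrates the arithmetic; the function names are hypothetical and introduced only for this example.

```python
import math


def co2_ratio(years: float, rate: float = 0.01) -> float:
    """CO2 concentration relative to the start after compound growth."""
    return (1.0 + rate) ** years


def years_to_reach(factor: float, rate: float = 0.01) -> float:
    """Years of compound growth needed to multiply the concentration by `factor`."""
    return math.log(factor) / math.log(1.0 + rate)


print(f"doubling after     {years_to_reach(2.0):.1f} years")   # about 69.7
print(f"quadrupling after  {years_to_reach(4.0):.1f} years")   # about 139.3
print(f"ratio at year 140: {co2_ratio(140):.2f}")               # about 4.03
```

Doubling is therefore reached close to year 70, which is why averaging windows centered on year 70 are used for the TCR, and quadrupling is reached just before the end of the 140-year run.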
In the following, we classify AWI-MR and MPI-HR as high resolution configurations and AWI-LR and MPI-LR as LR configurations. AWI-MR consists of the high resolution version of FESOM 1.4 coupled to the high resolution version of ECHAM6.3. It is called AWI-MR and not AWI-HR because in CMIP6 this set-up is the MR configuration (AWI-CM-1-1-MR). Unless otherwise stated, we use years 60-80 of the 1pctCO2 simulation and compare these to the corresponding years 60-80 (after branch-off of the corresponding 1pctCO2 simulation) of the piControl simulation. Since the 1pctCO2 simulations represent a strongly transient climate, we opt for a relatively short averaging period, as we may otherwise mix signals from substantially different states of the climate system.

ECS and TCR
For the calculation of the ECS and the TCR, we use the same methodologies as defined in the ESMValTool (Eyring, Righi, et al., 2016). The ECS is computed according to the method of Gregory et al. (2004). For each year, the near-surface (2 m) air temperature change and the change in net downward radiative flux between the abrupt-4xCO2 and piControl simulations are computed. For this, a linear fit to the 150 years of the piControl simulation corresponding to years 1-150 of the abrupt-4xCO2 simulation is calculated. These annual fitted values are subtracted from the annual mean values of the abrupt-4xCO2 simulation. To compute the equilibrium temperature difference, a regression is built from all data points detrended in this way and extrapolated to the equilibrium (net downward radiative flux change = 0). The ECS is then obtained by dividing the equilibrium temperature difference by two, since a quadrupling of CO2 corresponds to two doublings. The TCR is computed as the globally averaged 2 m temperature change of the 1pctCO2 simulation versus piControl (CO2 doubling is reached after ∼70 years) averaged over the years 61-80. Here, the linear fit detrending was based on the first 140 years of the piControl simulation counted from the branch point of the 1pctCO2 simulation.
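The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the ESMValTool implementation: the inputs are assumed to be lists of global annual means, the function names are hypothetical, and the synthetic numbers in the demonstration are invented purely to exercise the code.

```python
def linear_fit(x, y):
    """Ordinary least squares; returns (slope, intercept) of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def detrended_anomaly(experiment, control):
    """Subtract the linear fit of the control run (against year index) from the experiment."""
    years = list(range(len(control)))
    slope, intercept = linear_fit(years, control)
    return [value - (slope * year + intercept) for year, value in zip(years, experiment)]


def gregory_ecs(delta_t, delta_n):
    """Regress the flux anomaly on the temperature anomaly, extrapolate to zero flux,
    and halve the 4xCO2 equilibrium warming to obtain the ECS per CO2 doubling."""
    slope, intercept = linear_fit(delta_t, delta_n)
    equilibrium_warming_4x = -intercept / slope
    return equilibrium_warming_4x / 2.0


def tcr(delta_t_1pct, first_year=61, last_year=80):
    """Mean temperature anomaly over years first_year..last_year (1-based, inclusive)."""
    window = delta_t_1pct[first_year - 1:last_year]
    return sum(window) / len(window)


# Synthetic demonstration with invented numbers: N = 7.4 - 1.2 * T exactly.
demo_t = [0.1 * i for i in range(1, 51)]
demo_n = [7.4 - 1.2 * t for t in demo_t]
print(f"ECS from synthetic data: {gregory_ecs(demo_t, demo_n):.2f} K")
```

In practice `delta_t` and `delta_n` would be obtained by applying `detrended_anomaly` to the abrupt-4xCO2 and piControl series of 2 m temperature and net downward radiative flux, respectively.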

Two-Layer Energy Balance Model (EBM)
Simplified climate models relating the global mean surface temperature to a prescribed effective radiative forcing (Geoffroy et al., 2013; Held et al., 2010; Mauritsen et al., 2019) are able to emulate the thermal properties and time-dependent responses seen in coupled atmosphere-ocean general circulation models (AOGCMs). Using analytical solutions, Geoffroy et al. (2013) provided a method to calibrate the parameters of a two-layer energy balance model (EBM) so that it can mimic a specific AOGCM. The first layer corresponds to the atmosphere, the land surface and the upper ocean; the second layer represents the deeper ocean below the mixed layer. The equations can be written as follows:

$$C \frac{dT}{dt} = F - \lambda T - \gamma \left(T - T_0\right)$$

$$C_0 \frac{dT_0}{dt} = \gamma \left(T - T_0\right)$$

Here the prognostic variables T and T_0 are the temperature perturbation at the surface and a characteristic temperature perturbation of the deeper ocean, respectively. F is the "effective" radiative forcing. The other symbols are free parameters: C and C_0 are the effective heat capacities of the upper and deeper layer, λ is the climate feedback parameter and γ is the heat uptake coefficient of the deeper ocean. The heat capacities C and C_0 correspond to equivalent ocean layer depths D and D_0 using equation 22 of Geoffroy et al. (2013).
In a first step the feedback parameter is estimated by a linear regression of the radiative imbalance at the top of the atmosphere as a function of the surface temperature perturbation (Gregory et al., 2004). The parameters can be determined by fitting the global mean surface air temperature response and regression analyses (see Geoffroy et al., 2013 for details). The solution includes the characteristic time scales for the fast (τ_f) and slow (τ_s) response. This method is applied to the abrupt-4xCO2 experiments where the effective radiative forcing is constant in time and is determined following Gregory et al. (2004).
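To illustrate how the deeper-ocean heat capacity shapes the surface response, the following sketch integrates a two-layer EBM numerically for a constant (abrupt) forcing. It assumes the standard two-layer form of Geoffroy et al. (2013), C dT/dt = F − λT − γ(T − T_0) and C_0 dT_0/dt = γ(T − T_0), and uses a simple forward-Euler scheme rather than the analytical solution. All parameter values are invented for illustration and are not the calibrated values of this study; the function name is hypothetical.

```python
def integrate_two_layer_ebm(forcing, lam, gamma, c_upper, c_deep, years, dt=0.05):
    """Forward-Euler integration of the two-layer EBM for constant forcing.

    Units: forcing in W m-2, lam and gamma in W m-2 K-1,
    heat capacities in W yr m-2 K-1, time in years.
    Returns a list of (T, T_0) pairs, one per simulated year.
    """
    t_upper, t_deep = 0.0, 0.0
    steps_per_year = round(1.0 / dt)
    yearly = []
    for _ in range(years):
        for _ in range(steps_per_year):
            exchange = gamma * (t_upper - t_deep)          # heat flux into the deeper layer
            d_upper = (forcing - lam * t_upper - exchange) / c_upper
            d_deep = exchange / c_deep
            t_upper += dt * d_upper
            t_deep += dt * d_deep
        yearly.append((t_upper, t_deep))
    return yearly


# Illustrative (hypothetical) parameters; only c_deep differs between the two runs.
common = dict(forcing=7.0, lam=1.2, gamma=0.7, c_upper=8.0, years=150)
large_deep = integrate_two_layer_ebm(c_deep=100.0, **common)
small_deep = integrate_two_layer_ebm(c_deep=50.0, **common)
print(f"T after 150 years, C_0 = 100: {large_deep[-1][0]:.2f} K")
print(f"T after 150 years, C_0 =  50: {small_deep[-1][0]:.2f} K")
```

With everything else held fixed, the run with the smaller deeper-ocean heat capacity warms faster at the surface, because the deeper layer saturates sooner and takes up less heat. Both runs approach the same equilibrium warming F/λ in the long term.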

Results
ECS and TCR from the four model configurations are shown in Table 1. The four values of TCR are around the mean value of 2.0°C from the CMIP6 simulations available at ESGF in March 2020 (Meehl et al., 2020), while the four values of ECS are all below the mean value of 3.7°C from these CMIP6 simulations (Meehl et al., 2020). Compared to the range of existing TCR values (1.3-3.0°C) and existing ECS values (1.8-5.6°C) according to Meehl et al. (2020), differences between the four model configurations are rather small (up to about 20% for TCR and up to about 10% for ECS). However, there are some important differences between the ocean characteristics of the four model configurations that lead to the differences, especially in the TCR; these are described in the following.
The ocean surface warms more strongly in most areas in the AWI simulations compared to the MPI simulations in the 1pctCO2 experiment (Figures 2c and 2i), consistent with their higher TCR. In fact, in the AWI simulations the ocean surface warms everywhere (Figures 2a and 2g) while there is an important exception in the MPI simulations in the North Atlantic subpolar gyre (Figures 2b and 2h), where a cooling of up to 1°C is simulated. This phenomenon is well known and is referred to as the North Atlantic warming hole (e.g., Drijfhout et al., 2012; Jungclaus et al., 2014; Keil et al., 2020). In the LR, the surface heating is up to around 1.5°C stronger in the AWI configuration compared to the MPI configuration over the western boundary current regions (Gulf Stream and North Atlantic Current, Brazil Current, and Kuroshio Current), and over the Southern Ocean (Figure 2c). There are some limited areas around the gateways to the Arctic as well as in the Agulhas region and south of Australia in which the AWI LR configuration shows up to around 1°C less surface heating compared to the MPI LR configuration. In the high resolution (Figure 2i), the strongest positive differences in the surface heating of up to around 2°C in AWI compared to MPI occur over the North Atlantic subpolar gyre, the Kuroshio Current as well as the North Pacific upwelling region off the US West Coast including the gateways to the Arctic. The pronounced positive difference over the North Atlantic gateways to the Arctic between AWI-MR and MPI-HR (Figure 2i) is in contrast to the negative difference between AWI-LR and MPI-LR (Figure 2c). Between AWI-MR and MPI-HR, small negative differences of up to 0.5°C are mainly restricted to subtropical areas in the Southern Hemisphere.
While globally averaged there is more ocean surface heating in the AWI configurations compared to the MPI configurations (difference AWI minus MPI: 0.19°C for LR and 0.28°C for HR), the opposite is true at intermediate depths of the ocean (1,000 m) (difference AWI minus MPI: −0.04°C for LR and −0.1°C for HR) (Figures 2d-2l). In the Southern Ocean as well as in the Arctic Ocean AWI-LR heats by up to around 1°C less than MPI-LR (Figure 2f). However, over the eastern parts of the North Atlantic MPI-LR shows cooling (Figure 2e) while AWI-LR shows warming (Figure 2d), leading to very pronounced positive differences of up to about 2°C in AWI-LR compared to MPI-LR. In the high resolution configuration, the strongest negative differences between AWI and MPI of up to 1°C occur in the Atlantic Ocean south of around 45°N. In contrast, in the Labrador Sea and the GIN seas (Figure 2l) a positive difference of similar magnitude can be seen. In conclusion, AWI configurations tend to warm more strongly at the surface and less at intermediate depths compared to MPI configurations. Regionally there are important exceptions that depend on the model resolution as described above.
The shallower warming in AWI configurations compared to MPI configurations can be seen especially over the Southern Ocean (Figure 3). In MPI configurations warming of at least 0.1 K penetrates into the deep layers down to at least 5,000 m already in the considered time period of 60-80 years (Figures 3b and 3e), which is not the case for AWI configurations (Figures 3a and 3d). In the case of the high resolution configurations, warming of at least 0.1 K penetrates into intermediate depths of the tropical ocean (more than 1,000 m) in MPI-HR (Figure 3e) but not in AWI-MR (Figure 3d) [...] changes can be seen at intermediate depths. A common feature is that AWI configurations heat less in northern high latitudes north of around 60° to 65°N than MPI configurations, in the case of LR at the surface and down to around 1,000 m (Figure 3c) and in the case of high resolution between around 100 and 900 m depth (Figure 3f).

Table 1 note: Tuned to be close to 3.0 (to match the observed historical development of 2 m temperature). (Eyring, Righi, et al., 2016)
Why do the AWI configurations, globally averaged, warm more strongly at the surface and less at intermediate depth than the MPI configurations? As a first step to answer this question, we investigate the change in the radiation budget over time with increasing CO2 concentrations and as a function of latitude. The globally averaged faster surface temperature response in AWI configurations compared to MPI configurations goes along with a faster decline of surface albedo in the 1pctCO2 experiment (Figure 4b). The stronger albedo decrease in AWI configurations is such that the outgoing longwave radiation (OLR) is reduced compared to the piControl only for the first ∼100 years (Figure 4a; note that positive values are defined downward, so that reduced upwelling OLR corresponds to positive values). Thereafter the OLR change relative to piControl fluctuates around zero in AWI-MR, meaning that the imbalance is almost completely linked to the change in shortwave radiation. In AWI-LR the OLR even becomes larger in the 1pctCO2 run than in the piControl run after ∼100 years, consistent with the even stronger albedo decrease in AWI-LR. That is not the case in the MPI configurations, where in both cases a reduced OLR still contributes around 40% to the total imbalance at the end of the 1pctCO2 runs.
The faster decline of surface albedo in AWI configurations compared to MPI configurations happens, for example, in the Southern Ocean due to sea ice decline (Figure A1), which is reflected by decreasing surface albedo and to some extent also decreasing planetary albedo (Figures 4d and 4f). Slightly further north, the LR versions exhibit a stronger planetary albedo increase compared to the high resolution versions, especially in the last 21 years of the 1pctCO2 simulation (Figures 4d and 4f). Differences in albedo changes are less pronounced at other latitudes, except for some interesting differences between low and high resolution configurations around 60°-70°N. The LR configurations have been run with dynamic vegetation, the high resolution configurations have not. Therefore, the stronger albedo decrease at those latitudes could be due to a northward extension of vegetation cover in the runs with dynamic vegetation. In fact, in addition to the four configurations discussed in this manuscript, we have run AWI-LR simulations without dynamic vegetation which show a similar albedo response at those latitudes compared to the two high resolution configurations without dynamic vegetation (not shown).
AWI-MR shows a weaker anomalous inner tropical heat uptake through radiation than all other configurations, especially the LR configurations (Figures 4c and 4e). This is apparently not related to albedo (Figures 4d and 4f). At the same time, the LR configurations feature the weakest TOA imbalance change around 15°N/15°S.
It is possible to mimic the surface temperature evolution of AOGCMs with a two-layer EBM containing the atmosphere and upper ocean as one layer and, as another layer, the deeper ocean that is still active as an energy reservoir at the considered time scales of 150 years (Geoffroy et al., 2013). The very deep ocean, which is quasi inactive as an energy sink at the considered time scales, is excluded in this simple model. AWI configurations show a low effective heat capacity C_0 in the deeper ocean compared to MPI configurations (Table 2). The difference is especially pronounced between the LR configurations and amounts to more than a factor of two. Similar differences exist for the related parameters: deeper ocean equivalent depth D_0 and slow relaxation time scale τ_s. In contrast, the upper ocean parameters C, D, and τ_f are similar between the four different configurations. By applying the coefficients given in Table 2, we can mimic the AOGCM-simulated surface temperature evolution by the EBM (Figure 5). Roughly consistent with the deeper ocean parameters, AWI-LR shows the strongest surface heating, while the two MPI configurations show the weakest surface heating.
Combining these insights with the results from Figures 2 and 3, the EBM analysis shows that AWI configurations, especially the LR configuration, simulate a stronger near-surface heating and less vertical heat exchange with the deep ocean (below the two layers of the EBM) compared to MPI configurations.
However, investigating the total ocean column including the very deep ocean not covered by the two-layer EBM, it turns out that the vertically integrated ocean heat content increases less in AWI configurations compared to MPI configurations according to the 1pctCO2 simulations at most latitudes, especially at high latitudes (Figure 6). Only in limited mid-latitude bands is there an increase of vertically integrated ocean heat content that is higher by a factor of two in AWI-MR compared to MPI-HR (4 × 10⁹ J m⁻² compared to 2 × 10⁹ J m⁻² at the end of the simulation). For LR configurations, mid-latitude differences in heat content increase between AWI and MPI models are less pronounced than for high resolution configurations. Over parts of the Southern Ocean the heat content increase is about three times smaller in AWI compared to MPI configurations (3 × 10⁹ J m⁻² as opposed to 10 × 10⁹ J m⁻² at the end of the simulation), and over the Arctic Ocean, depending on the resolution, 1.5 to 3 times smaller in AWI compared to MPI. This feature is more pronounced in LR compared to high resolution configurations. In the tropics, AWI configurations accumulate around 20% less additional energy than MPI configurations. Globally averaged, the vertically integrated ocean heat content increases least in AWI-LR, followed by AWI-MR, followed by the two MPI configurations (not shown).
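For orientation, the vertically integrated ocean heat content change per unit area can be approximated as ρ c_p Σ ΔT_k Δz_k over the model layers. The sketch below is a minimal illustration of this sum; the constant reference density and specific heat are assumed values (not taken from either ocean model), and the function name is hypothetical.

```python
RHO_SEAWATER = 1025.0  # kg m-3, assumed constant reference density
CP_SEAWATER = 3990.0   # J kg-1 K-1, assumed constant specific heat capacity


def heat_content_change(delta_t_by_layer, layer_thickness_m):
    """Vertically integrated heat content change in J m-2.

    delta_t_by_layer: temperature change of each layer in K
    layer_thickness_m: thickness of each layer in m (same order)
    """
    column_sum = sum(dt * dz for dt, dz in zip(delta_t_by_layer, layer_thickness_m))
    return RHO_SEAWATER * CP_SEAWATER * column_sum


# A uniform 1 K warming of the upper 1,000 m:
print(f"{heat_content_change([1.0], [1000.0]):.3e} J m-2")
```

Under these assumptions, a uniform 1 K warming of the upper 1,000 m corresponds to roughly 4 × 10⁹ J m⁻², which gives a sense of scale for the regional heat content changes discussed above.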
Differences in mixed-layer depth (Figure 7) suggest that weaker mixing in AWI compared to MPI configurations may be key for these differences. The strongest differences occur in the Weddell and Ross Gyres, and in the northern North Atlantic. This is consistent with the fact that the strongest differences in ocean heat content increase between AWI and MPI configurations occur in the high latitudes (Figure 6). Comparing the monthly maximum mixed layer depth with ARGO float measurement data (Holte et al., 2017), it turns out that AWI configurations underestimate mixing at northern high latitudes while MPI configurations overestimate Southern Ocean mixing, especially in the Weddell and Ross Seas. The ARGO float measurement data can be compared directly neither to the piControl nor to the 1pctCO2 simulations at the time of doubling of CO2, but would be representative for a CO2 concentration between piControl and doubling of CO2.
Vertical mixing has been identified as an important difference between AWI and MPI configurations, but what about the large-scale ocean circulation? As indicated in the discussion of Figure 2, in MPI configurations the North Atlantic subpolar gyre cools by around 1°C (Figures 2b and 2h) while this region warms by around 1°C in AWI configurations (Figures 2a and 2g). Over the GIN seas, the Barents Sea and the Gulf Stream separation, clearly larger positive temperature changes of up to around 5°C are simulated. In other words, there is a clear signature of the North Atlantic warming hole present in MPI configurations but not in AWI configurations. The warming hole has been previously linked to a decline in the AMOC (e.g., Drijfhout et al., 2012; Chemke et al., 2020; Keil et al., 2020; Menary & Wood, 2018). In our simulations, the AMOC starts at different levels for the different configurations, AWI-LR showing the weakest AMOC and MPI-LR the strongest (Figure 8). Here we explain the changes with the AMOC anomalies of the 1pctCO2 experiment with respect to piControl (Figure 9). While according to AWI configurations the AMOC at 26°N decreases by around 5 Sv and the northward heat transport by 0.2 PW toward the end of the 140-year period of the 1pctCO2 simulation, stronger decreases of around 8 Sv and 0.3 PW, respectively, occur in the MPI configurations (Figures 10 and 11). Keil et al. (2020) found in a large ensemble of MPI-ESM global warming experiments that AMOC changes are in fact essential for the warming hole but that they are accompanied by changes in the subpolar gyre circulation and associated heat export to the Nordic Seas, while cloud feedbacks had minor impacts on the warming hole. In the case of the MPI-ESM experiments evaluated here, no increase in heat export to the Nordic Seas has been found in the considered 1pctCO2 warming experiments in years 60-80, while a slight increase in heat export to the Nordic Seas is simulated in the AWI-MR configuration (not shown). Having said this, MPI configurations do show stronger warming north of around 60°N compared to AWI configurations between 0 and 1,000 m in the case of LR (Figure 3c) and between 100 and 900 m in the case of high resolution (Figure 3f).

Table 2
Parameter Estimates From the Global Mean Two-Layer Energy Balance Model According to Geoffroy et al. (2013) for AWI and MPI Configurations
Note. λ is the radiative feedback parameter, γ the heat exchange coefficient, C and C_0 the upper and deeper ocean effective heat capacities, D and D_0 the upper and deeper ocean layer equivalent depths, and τ_f and τ_s the fast and slow relaxation time scales.

Discussion and Conclusions
This study highlights the importance of the ocean, and in particular the intensity of high-latitude ocean mixing, especially for the TCR but also for the ECS. Different warming patterns may result in different TCR and ECS (Rose & Rayborn, 2016; Rugenstein et al., 2020) due to the stronger efficacy of high-latitude near-surface ocean warming compared to low-latitude near-surface ocean warming. One possible reason for this is differences in cloud feedback induced by the different underlying ocean (Andrews et al., 2012; Sherwood et al., 2020). The estimated ECS in our study is certainly prone to inaccuracies because the Gregory et al. (2004) method is applied to only 150 years of data. Specifically, the ECS may be underestimated when considering only 150 years of data (Knutti & Rugenstein, 2015; Rugenstein et al., 2020).
For climate change impact studies relevant for society the TCR is more appropriate as it serves better to predict the development of the climate in the next decades (Knutti et al., 2017). An equilibrium state of the coupled atmosphere-ocean system is reached only after several thousand years (Li et al., 2012) and is therefore not as relevant for political decision-making as the TCR.
In the case of the two considered CMIP6 models in this study we found that the AWI model upper ocean layers heat faster as a response to greenhouse gas forcing compared to the MPI model upper ocean layers while the opposite is true in the deep ocean starting from around 1,000 m downward. The faster upper layer ocean warming can be mimicked with a two-layer EBM. AWI configurations, especially AWI-LR, are pumping the heat less vigorously into the deep ocean (1,000 m and below) compared to MPI configurations and therefore warm relatively fast in the upper ∼1,000 m. From the sea surface, more energy is transferred to the atmosphere and then into space as longwave radiation in AWI configurations compared to MPI configurations. As a result, the anthropogenically induced energy imbalance leads to slower heat accumulation in the global ocean column than in MPI configurations. Differences come from the high latitudes while the tropics are hardly affected. We link this to weaker high-latitude ocean mixing, particularly in the key regions Weddell and Ross Gyres and Nordic Seas in AWI configurations compared to MPI configurations.
The convection regions of the ocean, which are linked to the changes in overturning strength, are key for the differences in ocean heat redistribution that we see in our study. Kostov et al. (2014) link the different vertical distribution of ocean heat uptake to differences in the AMOC. A weaker AMOC decrease in response to greenhouse gas forcing has been linked to stronger high-latitude surface heating and a stronger TCR increase (He et al., 2017; Winton et al., 2014). This is because a strong AMOC (or a weak AMOC decrease) causes a strong (or only slightly decreased) northward ocean energy transport away from the tropics to the northern North Atlantic and enables the tropical ocean to take up extra heat. In AWI configurations, the AMOC at 26°N decreases by around 5 Sv and the northward heat transport by 0.2 PW toward the end of the 140-year period of the 1pctCO2 simulation; in MPI configurations, stronger decreases of around 8 Sv and 0.3 PW occur. This leads to a lack of a North Atlantic subpolar gyre warming hole in AWI model configurations and therefore stronger surface warming compared to MPI model configurations. In contrast, MPI model configurations show a clear warming hole over the North Atlantic subpolar gyre with local cooling in this region.
A substantial increase in TCR along with a comparably weak increase in ECS has been found previously by Yokohata et al. (2007) when increasing horizontal ocean resolution; even the values (20% TCR increase along with 10% ECS increase) are very similar to the values reported in our study, with the difference that in our case this does not happen through increasing the horizontal resolution but through replacing the ocean component MPIOM with FESOM. In both cases, different high-latitude ocean mixing leads to the differences in TCR and ECS.
In our case the ocean mixed layer depth seems to suffer from different biases in the different models: AWI model configurations tend to show too weak mixing in the northern high latitudes, while MPI model configurations are more realistic compared to ARGO floats there (Figure 7). On the other hand, over the Southern Ocean the MPI model configurations show strong ocean mixing in the Weddell and Ross Gyres which is not observed from ARGO floats; here AWI model configurations seem to be more realistic.
There is an interesting aspect of tuning a model toward the observed near-surface temperature increase during the historical simulation by tuning the ECS to a certain value, as done for MPI model configurations (Mauritsen & Roeckner, 2020): a higher (lower) ECS model with stronger (weaker) deep-ocean mixing might yield a similar historical surface temperature evolution, that is, stronger deep-ocean mixing might compensate for a higher ECS when it comes to the TCR. Had one wanted to tune the AWI model configurations to match the historical 2 m temperature development, the tuning goal would have been a lower ECS compared to MPI model configurations.
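The compensation described above can be made concrete with a widely used approximation, sketched here as an illustration rather than as the method of this study. If ocean heat uptake under steadily increasing forcing is taken to be proportional to the surface warming with an efficiency kappa, then ECS = F2x / lam while TCR is approximately F2x / (lam + kappa). The constant `F2X`, the function names, and the parameter values are hypothetical.

```python
F2X = 3.7  # assumed forcing for a CO2 doubling (W m-2)


def ecs(lam):
    """Equilibrium warming for a CO2 doubling: heat uptake has ceased."""
    return F2X / lam


def tcr(lam, kappa):
    """Approximate transient warming at CO2 doubling.

    Assumes ocean heat uptake N = kappa * dT, so F = (lam + kappa) * dT.
    """
    return F2X / (lam + kappa)


def kappa_for_tcr(lam, target_tcr):
    """Heat uptake efficiency needed to reach a given TCR for feedback lam."""
    return F2X / target_tcr - lam


# Model A: lower ECS with weaker ocean heat uptake (hypothetical values).
lam_a, kappa_a = 1.2, 0.6
# Model B: higher ECS; find the heat uptake efficiency giving the same TCR.
lam_b = 1.0
kappa_b = kappa_for_tcr(lam_b, tcr(lam_a, kappa_a))

print(f"A: ECS {ecs(lam_a):.2f} K, TCR {tcr(lam_a, kappa_a):.2f} K")
print(f"B: ECS {ecs(lam_b):.2f} K, TCR {tcr(lam_b, kappa_b):.2f} K, "
      f"kappa {kappa_b:.2f} W m-2 K-1")
```

In this approximation only the sum lam + kappa matters for the TCR, so a model with a higher ECS can reproduce the same transient warming if its ocean takes up heat more efficiently.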
Regarding the vertical ocean heat uptake distribution, Gleckler et al. (2016) analyzed the ocean heat uptake from pre-industrial times to 2010 in three different layers: 0-700 m, 700-2,000 m, and below 2,000 m. They conclude that the uppermost 700 m have taken up around twice as much energy as the 700-2,000 m layer and around four times as much energy as the layer below 2,000 m. For a more recent time period, 1972, von Schuckmann et al. (2020) found a similar vertical distribution: the uppermost 700 m have taken up almost twice as much energy as the layer between 700 and 2,000 m and almost six times as much as the layer below 2,000 m. A quantitative comparison with our results cannot be made due to the different strength of transient forcing (1% per year in our idealized model simulations; less than half of that in the past few decades). Nevertheless, the order of magnitude is similar between both models and the observations. In AWI configurations the uppermost 700 m take up around three times as much energy as the layer between 700 and 2,000 m. For AWI-MR, the uppermost 700 m take up around nine times as much energy as the layer below 2,000 m, while in AWI-LR the deep ocean heat uptake is negligible and the factor between the heat uptake of the uppermost 700 m and that of the layer below 2,000 m is therefore as large as around 50. Consistent with the stronger ocean mixing in MPI configurations compared to AWI configurations, the factor between the two upper layers (uppermost 700 m vs. 700-2,000 m) in particular is limited to two in MPI-HR and even 1.5 in MPI-LR. In MPI-HR the uppermost 700 m take up around 10 times more heat than the layer below 2,000 m, while in MPI-LR the factor is only four. Generally, the observations show values between the AWI and the MPI configurations. That said, due to the stronger transient forcing in our idealized model simulations compared to the observed greenhouse gas concentration increase, the factors are expected to be higher than in the observations.
Brierley et al. (2009) state that cold ocean states in a pre-industrial climate tend to warm more strongly in response to greenhouse gas forcing than warm ocean states. While they investigate this for different states of the ocean in one long control run, we see this phenomenon in two different versions of the AWI model: AWI-LR is about 1 K colder than AWI-MR in pre-industrial climate and warms more strongly in response to greenhouse gas forcing.
The faster albedo response in AWI configurations compared to MPI configurations may be due to the more rapid surface temperature response (due to the weaker connection with the deep ocean as seen from EBM results and the weaker mixing in AWI configurations compared to MPI configurations) rather than the other way around. This implies that the equilibrium response should again be more similar between the models. Indeed, it is the TCR that is substantially different between AWI and MPI model configurations rather than the ECS. Donohoe et al. (2014) attempt to constrain shortwave and longwave radiation changes with satellite observations from the last decades. They conclude that the OLR reduction takes place only for a few decades after greenhouse gas forcing is switched on and that afterward shortwave radiation feedbacks set in and lead to enhanced shortwave radiation absorption at the surface. This seems to be more consistent with the AWI simulations (Figure 4a); however, a quantitative comparison is not possible because the simulations considered in our study are idealized 1pctCO2 simulations, in which the forcing increases faster than the observed greenhouse gas concentration increase during the last century.
Even though a quantitative comparison of the AWI and MPI simulations with observations cannot be made due to the pre-industrial and idealized 1pctCO2 rather than historical forcing, it becomes clear from this study that a realistic representation of high-latitude ocean mixing is crucial for constraining TCR and ECS. For future studies historical simulations should be considered in a comparison.

Figure A1 shows that in AWI configurations there is a faster decline of Antarctic sea ice concentration compared to MPI configurations. As indicated in the text of the main article, this is associated with faster decline of surface albedo in AWI configurations compared to MPI configurations.

Data Availability Statement
The climate model simulation data used in this study are publicly available at the Earth System Grid Federation (ESGF): https://esgf-data.dkrz.de/projects/cmip6-dkrz/.

Acknowledgments
The German Climate Computing Centre (DKRZ) granted the computing time and technical support for carrying out the CMIP6 simulations. BMBF provided funding for preparing CMIP6 simulations and supporting the development of postprocessing tools for the publication of the data. We are grateful to Claudia Hinrichs for providing valuable comments on the paper. We thank Oliver Gutjahr for providing ARGO float data. D. Sidorenko was supported by the Helmholtz Climate Initiative REKLIM (Regional Climate Change). H. Goessling was supported by the Federal Ministry of Education and Research of Germany in the framework of SSIP (Grant 01LN1701A). The work described in this paper has received funding from the Helmholtz Association through the project "Advanced Earth System Model Capacity" in the frame of the initiative "Zukunftsthemen." The content of the paper is the sole responsibility of the authors, and it does not represent the opinion of the Helmholtz Association, and the Helmholtz Association is not responsible for any use that might be made of information contained.