The Arctic Ocean in CMIP6 Models: Biases and Projected Changes in Temperature and Salinity

We examine the historical evolution and projected changes in the hydrography of the deep basin of the Arctic Ocean in 23 climate models participating in the Coupled Model Intercomparison Project Phase 6 (CMIP6). The comparison between historical simulations and observational climatology shows that the simulated Atlantic Water (AW) layer is too deep and thick in the majority of models, including the multi‐model mean (MMM). Moreover, the halocline is too fresh in the MMM. Overall our findings indicate that there is no obvious improvement in the representation of the Arctic hydrography in CMIP6 compared to CMIP5. The climate change projections reveal that the sub‐Arctic seas are outstanding warming hotspots, causing a strong warming trend in the Arctic AW layer. The MMM temperature increase averaged over the upper 700 m at the end of the 21st century is about 40% and 60% higher in the Arctic Ocean than the global mean in the SSP245 and SSP585 scenarios, respectively. Salinity in the upper few hundred meters is projected to decrease in the Arctic deep basin in the MMM. However, the spread in projected salinity changes is large and the tendency toward stronger halocline in the MMM is not simulated by all the models. The identified biases and projection uncertainties call for a concerted effort for major improvements of coupled climate models.

To better understand these changes and provide trustworthy future projections, high-quality modeling of the Arctic Ocean, including the proper representation of critical processes such as water mass transformations and the development of Arctic Atlantification, is required. This is especially important because observational data from the Arctic are limited owing to its remoteness, harsh environmental conditions, the presence of sea ice cover, and limited solar illumination, which together allow only restricted use of satellites to monitor ocean properties.
AW is the main oceanic heat source of the Arctic deep basin (Rudels & Friedrich, 2000). The warm AW layer is characterized by high temperatures and salinity in comparison to the halocline water, and can potentially impact the sea ice cover (Carmack et al., 2015; Dmitrenko et al., 2014; Polyakov et al., 2010). The AW inflow from the Nordic Seas consists of two branches: one through the Fram Strait by means of the West Spitsbergen Current, and the other through the Barents Sea. Observations show that the Barents Sea branch loses most of its heat to the atmosphere already in the Barents Sea. This leaves the Fram Strait branch as the major heat source of the Arctic AW layer (Schauer et al., 2008; Smedsrud et al., 2013). Part of the AW at the Fram Strait recirculates southwards into the Greenland Sea (Marnela et al., 2013). In this context, it has been shown that the water mass properties at the Fram Strait as well as the partition of the West Spitsbergen Current into the Arctic interior can be significantly influenced by mesoscale eddies (Hattermann et al., 2016; Wekerle et al., 2017).
As the baroclinic Rossby radius in the Arctic Ocean is on the order of a few kilometres or less, even state-of-the-art ocean models used in climate simulations are too coarse to resolve mesoscale eddies. Although model developers tune their model parameterizations and parameters to improve the representation of the ocean circulation in the Arctic region, significant temperature and salinity biases still exist, as shown in previous model intercomparison studies (Holloway et al., 2007; Ilıcak et al., 2016; Proshutinsky & Kowalik, 2007; Proshutinsky et al., 2016; Q. Wang et al., 2016a, 2016b). In particular, both the ocean models analyzed in the Arctic Ocean Model Intercomparison Project (AOMIP) and the ocean components of global climate models analyzed in the Coordinated Ocean Ice Reference Experiments Phase 2 (COREII) project show a large model spread in their simulated temperature and salinity in the Arctic halocline and AW layer when driven by atmospheric reanalysis forcing (Holloway et al., 2007; Ilıcak et al., 2016; Q. Wang et al., 2016a).
The Coupled Model Intercomparison Project (CMIP) was initiated by the World Climate Research Program to provide a standardized framework for carrying out climate change experiments with fully coupled models (Meehl et al., 2000). Although the fifth phase of CMIP (CMIP5) incorporated many of the same ocean models as those assessed in the COREII project (in ocean stand-alone simulations (Ilıcak et al., 2016)), the model spread of Arctic Ocean temperature and salinity in CMIP5 models is significantly larger than that in COREII models (Shu et al., 2019). The most probable explanation for this finding is that fully coupled models are further influenced by bias in atmospheric and land models as well as by biases that are amplified through two-way coupling between the ocean and the atmosphere. One major common issue in both forced ocean simulations and CMIP5 coupled model simulations is that the Arctic AW layer is too deep and too thick as reported in the aforementioned model assessment studies.
Currently, CMIP is in its sixth phase (CMIP6, Eyring et al., 2016) and it is crucial to analyze the performance of these models in simulating the present and the future state of temperature and salinity in the halocline and AW layer in the Arctic deep basin. Here, we examine the historical simulations and the future projections in CMIP6 coupled models. We focus on the following questions: (a) Can the available CMIP6 models adequately reproduce the temperature and salinity in the Arctic deep basin? Specifically, we want to know whether the large temperature and salinity biases in the Arctic deep basin found in CMIP5 models are reduced in the CMIP6 models. (b) How will the Arctic hydrography evolve and how does the warming trend in the Arctic Ocean deep basin compare to the global mean values in a warming world? This paper is organized as follows: Data processing and methodology are described in Section 2. Subsequently, the results and discussion are presented in Sections 3 and 4 respectively, followed by conclusions and suggestions for further investigations in Section 5.

Methodology and Data
We assess temperature and salinity in the CMIP6 historical simulations by comparing against the Polar Science Center Hydrographic Climatology (PHC) 3.0 database (Steele et al., 2001). PHC3.0 was generated based on observations obtained mainly before 2000. An alternative observational climatology is the temperature data from the World Ocean Atlas 2018 (WOA18; Locarnini et al., 2018), which, as presented in Figure 1a, is close to PHC3.0. However, we decided to use PHC3.0 as the main reference because WOA18 appears to be spatially discontinuous in temperature when plotted on the same two-dimensional grid as the model data. The mean vertical distributions of temperature and salinity in the Eurasian and Canadian basins are evaluated separately. These are the deep ocean basins with bottom topography deeper than 300 m, which are separated by the Lomonosov Ridge.

Figure 1.
Climatological (1979-2014) and basin mean potential temperature (top) and salinity bias (bottom) profiles in the Arctic Ocean. The Eurasian basin is shown in the left panels and the Canadian basin in the right panels. The 19 models that are taken into account for generating the multi-model means (MMM) are shown as thin solid and dashed lines. The four models excluded from the MMM are marked differently (with stars). For temperature profiles, the thick blue, black and green curves represent the MMM and the PHC3.0 and WOA18 climatologies, respectively. In contrast to temperature, salinity profiles are presented as biases with respect to PHC3.0 in order to show the spread across models more clearly. The black dashed curve represents the PHC3.0 observation itself (not the bias), with the corresponding salinity values shown on the upper x-axis. The thick blue curve is the MMM bias. The original salinity profiles are shown in Figure S1 in Supporting Information S1.

For simplicity, we hereafter refer to the modeled climatological mean, calculated over the 36 years (1979-2014) of the historical experiments, as the climatology. Moreover, to assess possible future changes of the Arctic Ocean, the climate change signals of temperature and salinity are calculated by subtracting the present-day values from the future values. Here we chose definitions for present-day (1995-2014) and long-term future (2081-2100) conditions that are consistent with those used in the upcoming IPCC AR6. Two Shared Socioeconomic Pathways (SSP) scenarios (O'Neill et al., 2016) are assessed in this study: SSP245 (the so-called medium forcing scenario, with 4.5 W/m² forcing at the end of the century) and SSP585 (the "high-end" or strong forcing scenario, with high carbon emissions and a radiative forcing of 8.5 W/m² by 2100).
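The climate change signal defined above (future mean minus present-day mean) can be illustrated with a minimal sketch. The function names and the synthetic linear warming series are assumptions made for illustration only; they are not taken from the study's processing scripts.

```python
import numpy as np

def period_mean(years, values, start, end):
    """Average `values` along axis 0 over the years start..end (inclusive)."""
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    mask = (years >= start) & (years <= end)
    if not mask.any():
        raise ValueError(f"no data between {start} and {end}")
    return values[mask].mean(axis=0)

def climate_change_signal(years, values, present=(1995, 2014), future=(2081, 2100)):
    """Future-period mean minus present-day mean (positive means an increase)."""
    return period_mean(years, values, *future) - period_mean(years, values, *present)

# Hypothetical annual-mean temperature series (deg C) with a linear trend.
years = np.arange(1995, 2101)
temperature = 0.5 + 0.02 * (years - 1995)

signal = climate_change_signal(years, temperature)
print(f"climate change signal: {signal:.2f} degC")
```

Because `period_mean` averages along the first axis only, the same functions would apply unchanged to arrays with additional depth or horizontal dimensions.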
The CMIP6 model data are provided through the Earth System Grid Federation (ESGF). The CMIP6 historical experiments cover the time period from 1850 to 2014. The projections from 2015 to 2100 are carried out as part of the scenario experiments, which define future scenarios based on approximate total radiative forcing levels by 2100. Among the many models participating in CMIP6, so far only 23 models from 18 institutions (see Table 1 and Table S1, Supporting Information) have provided the required data from both their historical and SSP experiments.

Note (Table 1). Canuto et al. (2002); EPBL, energetically constrained parameterization of the surface boundary layer (Reichl & Hallberg, 2018); GLS, generic length scale scheme of Umlauf and Burchard (2003); KPP, k-profile parameterization by Large et al. (1994); NK, surface mixed layer parameterization of Noh and Jin Kim (1999); PP, Richardson number-dependent scheme of Pacanowski and Philander (1981); TKE, Turbulent Kinetic Energy scheme based on the model of Gaspar et al. (1990).

The models have different grid resolutions and provide their three-dimensional data (here sea-water potential temperature and salinity) on different depth levels. Before computing the multi-model mean (MMM), all model outputs were re-gridded to the common grid of the PHC3.0 climatological data (1° × 1°). Re-gridding was done using Climate Data Operators (CDO; Schulzweida, 2019). Then, the re-gridded data were used to produce the MMM fields. Likewise, when averaging over a vertical level was needed, the individual model levels were interpolated to the 33 levels of the PHC3.0 data. However, given that individual models have different grid structures and topographies, re-gridding them to the 1° × 1° grid causes issues over continental slopes when calculating the MMM (as indicated by grid scale noise). Nevertheless, none of the aspects mentioned above is expected to impact the conclusions of this study.
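The vertical part of this procedure, interpolating each model profile onto common depth levels before averaging across models, can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual workflow: linear interpolation is assumed, the four target depths are hypothetical stand-ins for the 33 PHC3.0 levels, the profiles are synthetic, and the horizontal re-gridding done with CDO is not reproduced.

```python
import numpy as np

def to_common_levels(depth, profile, target_depth):
    """Linearly interpolate one profile onto the target depths.

    Target depths outside the model's own depth range are set to NaN
    so that they do not contribute to the multi-model statistics.
    """
    depth = np.asarray(depth, dtype=float)
    profile = np.asarray(profile, dtype=float)
    return np.interp(target_depth, depth, profile, left=np.nan, right=np.nan)

def multi_model_stats(profiles):
    """Mean and standard deviation across models, ignoring NaN entries."""
    stacked = np.vstack(profiles)
    return np.nanmean(stacked, axis=0), np.nanstd(stacked, axis=0)

# Hypothetical target levels (m); the real PHC3.0 grid has 33 levels.
target = np.array([0.0, 100.0, 200.0, 400.0])

# Two synthetic model temperature profiles (deg C) on different native levels.
model_a = to_common_levels([0, 200, 400], [-1.5, 0.5, 1.0], target)
model_b = to_common_levels([0, 100, 300, 500], [-1.0, 0.0, 1.0, 0.5], target)

mmm, spread = multi_model_stats([model_a, model_b])
print("MMM profile:", mmm)
```

Masking levels outside a model's depth range with NaN, rather than extrapolating, is one possible design choice; it keeps shallow models from contributing invented values at depth.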
For each model, the Atlantic Water Core Temperature (AWCT) is determined by finding the maximum temperature along the vertical axis at each location. The depth at which the maximum temperature occurs is defined as the Atlantic Water Core Depth (AWCD). In order to eliminate outlier results and for the outcome to be comparable to the assessment of CMIP5 AW layer, we implemented the same criterion as Shu et al. (2019) when calculating MMM. That is, if the simulated AWCD in any of the two basins is deeper than four times that of the observation, then the model is not considered in the MMM calculation (see Figure 2).
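The two definitions and the exclusion criterion can be sketched as below. The function names and the synthetic profile are hypothetical; the observed core depths of 300 m and 500 m are the approximate basin-mean values quoted in this paper for the Eurasian and Canadian basins, used here only to illustrate the threshold.

```python
import numpy as np

def aw_core(depth, temperature):
    """Return (AWCT, AWCD): the maximum temperature in a profile and its depth."""
    depth = np.asarray(depth, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    k = int(np.nanargmax(temperature))  # index of the warmest level, ignoring NaN
    return float(temperature[k]), float(depth[k])

def keep_in_mmm(model_awcd, observed_awcd, factor=4.0):
    """True if the model AWCD is not deeper than `factor` times the observed
    AWCD in any basin; models failing this test are left out of the MMM."""
    return all(model_awcd[b] <= factor * observed_awcd[b] for b in observed_awcd)

# Synthetic profile: cold surface layer above a warm core at 300 m.
depth = [0, 100, 300, 500, 1000]
temperature = [-1.6, -1.0, 0.9, 0.6, -0.2]
awct, awcd = aw_core(depth, temperature)

# Approximate observed basin-mean core depths (m) quoted in the text.
observed = {"Eurasian": 300.0, "Canadian": 500.0}
retained = keep_in_mmm({"Eurasian": 600.0, "Canadian": 900.0}, observed)
excluded = keep_in_mmm({"Eurasian": 1300.0, "Canadian": 900.0}, observed)
print(awct, awcd, retained, excluded)
```

In the second case the Eurasian core depth of 1300 m exceeds 4 × 300 m, so the model would be excluded even though its Canadian core depth passes the test.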

Figure 2.
Atlantic Water core depth (AWCD, in m) and Atlantic Water core temperature (AWCT, in °C) from the individual CMIP6 models (bars), multi-model-mean (blue solid line) and PHC3.0 climatology (black solid line) for the Eurasian and Canadian basins. White bars represent models that have been discarded from further analysis (i.e., models with AWCD larger than 4 times that of the observation). The models shown with white bars are excluded in the multi-model mean. The multimodel mean ± one standard deviation range is indicated through light blue shading.

Model Evaluation From Historical Simulations
The vertical profiles of observed hydrography highlight the vertical structure of the Arctic Ocean water masses in the Eurasian and Canadian basins (Figure 1). Averaged over each basin, the halocline is located above about 200 and 300 m in the Eurasian and Canadian basins, respectively. Below the cold halocline lies the warm AW layer, which occupies the layer between the halocline and about 800 m depth (as indicated by the depth of the 0°C isotherm in the vertical temperature profile). According to the observational data, the mean AWCT is about 1°C and 0.5°C and the mean AWCD amounts to about 300 and 500 m in the Eurasian and Canadian basins, respectively. Although the MMM reproduces the main vertical structure of the temperature and salinity to some extent, there are substantial biases. More specifically, the simulated MMM AWCD is about 250 m deeper than observed, and the simulated AW layer is too thick, with its lower boundary reaching a depth of about 3,000 m instead of about 800 m in the observations (Figure 1a). This is very similar to the results of CMIP5 models (Shu et al., 2019).
Inspection of individual models reveals that most of the models overestimate the AWCD (Figure 2a). There are three models with AWCD similar to or smaller than the observations in either basin; however, their AWCT is much lower than observed (Figure 2b). In particular, four models have extremely deep AW in at least one of the basins (depicted with white color in Figure 2a); hence they are excluded when calculating the MMM as described in Section 2. Even with these models excluded, the model spread (defined by the standard deviation; std) of the AWCD is about 250 m in both basins (Figure 2a). The model spread of the AWCT is also quite pronounced in both basins (about 1°C, Figure 2b). The range (difference between the maximum and the minimum) of AWCT in the models is more than 3.5°C in both basins. It is also worth noting, that the biases for AWCD and AWCT are very similar in the two basins for all models, which may not be too surprising given that the Canadian basin lies "downstream" of the Eurasian one (see below).
The spatial patterns of the MMM AWCT and AWCD are compared to observational estimates in Figure 3. The observations clearly show the AW pathway: AW enters the Arctic Ocean through the Fram Strait and circulates cyclonically along the continental slope in the Eurasian Basin; it then penetrates into the Canadian Basin. The AWCD deepens along the AW pathway, and it is on average about 200 m deeper in the Canadian Basin than in the Eurasian Basin (see also Figures 1a and 2a). The MMM AWCT is colder than observed nearly everywhere inside the Arctic Ocean, although its spatial pattern indicates that, on average, the simulated AW circulation is cyclonic as expected. The MMM AWCD reproduces the contrast between the two deep basins (deeper in the Canadian Basin); however, AWCD is overestimated by models in both basins. There are differences in the detailed spatial pattern of AWCD between the MMM and the observation. One outstanding difference is that the observed maximum is in the southeastern Canadian Basin, whereas in the MMM it is located in the western Canadian Basin.
The simulated salinity also has large biases in both basins, which are most pronounced in the halocline and at the surface (Figure 1b). The MMM salinity has negative biases up to 0.5 psu in the halocline in the Canadian Basin, and even larger biases in the Eurasian Basin. The largest fresh bias is closer to the surface in the Eurasian Basin than in the Canadian Basin given that the halocline is thinner in the Eurasian Basin. At the surface, the MMM salinity bias is negative in the Eurasian Basin and slightly positive in the Canadian Basin. Inspecting individual models reveals that the models have a large spread in the simulated salinity in the upper 400 m. The largest spread is at the surface, with the difference between the maximum and minimum surface salinity reaching more than 5 psu. Even at 200 m depth, the range of the simulated salinity between the models is still more than 1 psu. Although the MMM underestimates the upper ocean (∼400 m) salinity on average (thus overestimating the Arctic freshwater content), some models do significantly overestimate the upper ocean salinity.
In summary, CMIP6 historical simulations show a too deep and too thick AW layer and a too fresh halocline in a MMM sense, and they show considerable model spread in the simulated temperature and salinity. These issues are the same as in CMIP5 models (Shu et al., 2019). Importantly, not only can these "large-scale" issues be found in both CMIP5 and CMIP6 models; some details, such as the location of maximum AWCD (Figure 3) and opposite biases in MMM sea surface salinity between the two basins (Figure 1b), are also essentially the same in the two generations of CMIP models. Therefore, for the representation of the Arctic hydrography, CMIP6 does not show clear improvements compared to CMIP5.

Future Projections
In this section, we explore the climate change signals of Arctic temperature and salinity for two scenarios (i.e., SSP245 and SSP585). The climate change signal for zonal mean temperature in the Arctic deep basins as simulated by CMIP6 models is presented in Figure 4. In both scenarios, ocean warming mainly occurs in the upper 2,000 m. This holds for the MMM as well as for most individual models. For both basins and scenarios, the strongest warming signal for the MMM is found in two depth ranges: at depths close to the observed AWCD (about 200-500 m depth) and at the surface. The former indicates the warming of the AW layer, while the latter reflects the surface warming associated with sea ice decline, Arctic Amplification and increasing ocean heat inflow from the Pacific. In both scenarios, the warming in the AW layer is stronger in the Eurasian Basin than in the Canadian Basin. This is consistent with the fact that the AW circulates cyclonically from the Eurasian Basin to the Canadian Basin. For the MMM, the maximum climate change signal for the AW layer temperature is about 1.7°C (1.4°C) in the Eurasian (Canadian) Basin in SSP245, while it is about 3°C and 2.4°C in the two basins in SSP585, respectively. At the surface, the climate change signals in the two basins are comparable. In fact, the MMM surface temperature climate change amounts to about 1°C and 2.8°C in the SSP245 and SSP585 scenarios, respectively. Only in the Canadian Basin and for the more extreme SSP585 scenario is the projected climate change in surface temperature larger than that in the AW layer (by up to about 0.4°C). As the strongest warming in the AW layer is at depths shallower than the simulated AWCD in historical simulations (cf. Figures 1b and 4), the AWCD becomes shallower at the end of the 21st century (by about 200 m in both warming scenarios, see Figure S2 in Supporting Information S1).
A summary of the MMM temperature in different depth ranges for the historical period versus both future projection scenarios is presented in Table 2. Although there are temperature biases in the MMM of the historical simulations relative to the observations, the projected changes in the MMM temperature in both scenarios are larger than the model biases. Even when taking the large model spreads (standard deviation) into account, the projected warming is still considerable.
The spatial patterns of MMM climate change signals for AWCT are consistent with the source region and circulation direction of AW (Figure 5). In both scenarios the strongest warming signal starts at the Fram Strait, the entrance of the warm AW; it then propagates into and around the Eurasian Basin and then the Canadian Basin. The warming at the Fram Strait amounts to more than 2°C and 4°C in the SSP245 and SSP585 scenarios, respectively.
The MMM warming signal does not propagate from the Eurasian Basin to the Canadian Basin through the expected cyclonic boundary current (see PHC3.0 AW circulation pattern in Figure 3), as indicated by the extension of the warming signal from the Eurasian Basin toward the Canadian Basin through the central Arctic.

Table 2.
The Climatological Mean of the Potential Temperature in the Depth Ranges of 0-250 m and 250-700 m in the Eurasian and Canadian Basins of the Arctic Ocean in PHC3.0 and the Multi-Model Mean (MMM)
Note. The historical and two projection scenarios (2081-2100) are shown. The standard deviations of the model results are also indicated.

Despite the large warming trend in the MMM, the individual models show a large spread of the climate change signals for temperature. Not all of these signals from individual models are physically consistent with those of the MMM (Figure 4, for model spread see also the Hovmöller diagrams of temperature for individual models in Figure S3 in Supporting Information S1). The range of the climate change signals of temperature among the models is about 4°C in SSP245 and 7°C in SSP585; this is more than twice that of the MMM climate change signals. There are even two models with negligible or even negative temperature changes in the core depth range of the AW layer in both scenarios, while all other models predict ocean warming in the AW layer. Furthermore, the models do not agree on whether the ocean surface or the AW layer will warm more in the future. In both basins and in both scenarios, there are models with relatively stronger warming at the surface and models with stronger warming in the AW layer.
The strong warming trend in the Arctic Ocean is consistent with the intense warming in the inflow waters (Figure 6). The projected climate change for the temperature averaged over the upper 700 m reveals that the sub-Arctic seas close to the Arctic Ocean inflow gateways are all warming hotspots, namely the northern Nordic Seas, the Barents Sea and the Bering Sea. The warming in these hotspots is stronger than in most of the world ocean. The warming in the Pacific Water inflow, which mainly enters the upper Arctic Ocean, could partially explain the enhanced surface warming in the Canadian Basin, especially in the SSP585 scenario (Figure 4). The strong warming of the Arctic AW layer (Figure 5) can be associated with the warming of the AW entering the deep basins. The CMIP6 projections indicate that the currently observed ocean warming associated with Arctic Pacification and Arctic Atlantification (Polyakov et al., 2017, 2020; Timmermans et al., 2018) will continue to develop in a future warming climate.
As the ocean heat in the Arctic AW layer is a potential source of sea ice basal melting, we need to better quantify the future change in the Arctic AW layer in the context of global-ocean change. Hovmöller diagrams for MMM temperature for both the Arctic Ocean and the global ocean are shown in Figure 7a. In the Arctic Ocean, the strongest warming trend can be seen at the depth where AW prevails, while the surface ocean shows a comparatively smaller warming trend, as can be seen in Figure 4. In contrast, the maximum global average warming trend is at the ocean surface. Although the global mean surface warming trend is stronger than the mean over the Arctic surface, the warming in the AW layer of the Arctic Ocean causes stronger overall warming in the Arctic deep basin, as indicated by the time series of mean temperature averaged over the upper 700 m and upper 2,000 m (Figures 6 and 7b). The increase in temperature averaged over the upper 700 m of the Arctic deep basin at the end of the 21st century is higher than that of the global ocean by 0.4°C (40%) and 1°C (60%) in the SSP245 and SSP585 scenarios, respectively. Although the amplitude of the temperature increase averaged over the upper 2,000 m is smaller than that averaged over the upper 700 m, the amplified warming in the Arctic deep basin is even more pronounced. It is about 75% higher in the Arctic deep basin than in the global deep basin at the end of the 21st century in the SSP585 scenario. It is worth stressing that, according to the MMM, the warming in the Arctic Ocean has only started to stand out since the 2020s; in contrast, the temperature change is rather small from the beginning of industrialization to the present day (see the Hovmöller diagram for Arctic temperature covering the whole CMIP historical simulation period in Figure S4 in Supporting Information S1).
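As a back-of-the-envelope reading of the percentages quoted above, the relative amplification can be taken as the excess of the Arctic change over the global change, divided by the global change. The sketch below assumes this definition and backs out the temperature changes implied by the rounded numbers in the text; the implied values are illustrative arithmetic, not figures reported in the study, and the function names are hypothetical.

```python
def relative_amplification(arctic_change, global_change):
    """Fractional excess of the Arctic change over the global-mean change."""
    return (arctic_change - global_change) / global_change

def implied_changes(excess, fraction):
    """Back out (global, Arctic) changes from an absolute excess in degC
    and the same excess expressed as a fraction of the global change."""
    global_change = excess / fraction
    return global_change, global_change + excess

# Rounded values from the text: 0.4 degC (40%) in SSP245, 1 degC (60%) in SSP585.
global_245, arctic_245 = implied_changes(0.4, 0.40)
global_585, arctic_585 = implied_changes(1.0, 0.60)

print(f"SSP245: global ~{global_245:.2f} degC, Arctic ~{arctic_245:.2f} degC")
print(f"SSP585: global ~{global_585:.2f} degC, Arctic ~{arctic_585:.2f} degC")
```

Under these assumptions the quoted numbers correspond to an upper 700 m warming of roughly 1.0°C globally versus 1.4°C in the Arctic deep basin for SSP245, and roughly 1.7°C versus 2.7°C for SSP585.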
For the climate change signals of salinity, in both basins there is a freshening of the upper 400 m ocean in both scenarios (Figure 8 and Figure S5 in Supporting Information S1). The strongest freshening occurs in the upper halocline and in the mixed layer (upper 200 m), indicating an increase in freshwater storage in the Arctic Ocean in the future which is in agreement with Zanowski et al. (2021). The freshening is consistent with an enhanced hydrological cycle, and thus increased freshwater supply to the Arctic Ocean in a warming climate (Carmack Although salinity for MMM shows freshening in the Arctic Ocean in both warming scenarios, some of the models predict an increase of salinity in the upper 200 m depth, either near the surface or in the halocline (Figure 8). The range of projected salinity changes among the models amounts to about 2-3 psu even when the "outlier models" are excluded, which is much larger than the MMM salinity climate change signal. The large model spread in the simulated salinity in the future scenarios implies large spread in the simulated future Arctic freshwater storage in CMIP6 models. Therefore, the issue of large spread in Arctic freshwater storage simulated in CMIP5 models (Shu et al., 2018) remains nearly unchanged in CMIP6 models (see also Zanowski et al., 2021).
In summary, the CMIP6 MMM shows strong warming in the Arctic AW layer and at the surface for both future scenarios considered in this study. The AW layer is likely to become shallower. The warming in the bulk of the AW layer may cause the temperature climate change in the Arctic deep basins to be much larger than the global mean change at depth. The Arctic halocline is likely to become much fresher in the future, in particular in the Canadian Basin. However, the CMIP6 models have large spread and thus uncertainty in the simulated climate change signals for both temperature and salinity.

Discussion
Most of the state-of-the-art CMIP6 models simulate a warm AW layer below the cold halocline in the Arctic Ocean, which is one of the key characteristics of the Arctic Ocean evident from observations. However, the simulated AW layer is too thick and too deep compared to observations in most of the models and also in the MMM. This issue has been found in forced ocean simulations more than a decade ago (Holloway et al., 2007); it was prevalent in both forced and coupled ocean simulations in the period of CMIP5 (Ilıcak et al., 2016;Shu et al., 2019), and it continues to remain a critical issue in CMIP6 models, as shown by our analysis. There is agreement across the above-mentioned studies that numerical mixing in coarse resolution models is a main reason for this issue. Indeed, it was found that increasing horizontal resolution to 4.5 km in the Arctic Ocean, although not fully eddy resolving yet, can visibly reduce the too thick and too deep biases of the AW layer (Q. Wang et al., 2018). As the model resolutions in CMIP6 models are typically quite coarse, the associated numerical mixing is also probably the main reason for the spatially diffused pattern of anthropogenic warming in the central Arctic Ocean as shown in Figure 5.
The CMIP6 models tend to have a too fresh mid to lower halocline (with MMM salinity biases of more than 0.5 psu), as in CMIP5 models (Shu et al., 2019), which means a weaker stratification in the associated depth range. Strong diapycnal mixing can weaken the halocline stratification (Zhang & Steele, 2007). So it seems likely that the diapycnal numerical mixing associated with coarse model resolution is partially responsible for the salinity bias. Furthermore, the choice of applied vertical mixing parameterizations and associated model parameters can also influence vertical salinity profiles and stratification (Liang & Losch, 2018). Among other factors, Arctic freshwater sources, including river runoff and precipitation, which typically have considerable spread in climate model simulations (Shu et al., 2018), can also contribute to the identified model spread in upper ocean salinity.
We did not find a clear relationship between model performance and model horizontal resolution in the analyzed CMIP6 models (not shown). The reason could be that the horizontal resolutions in CMIP6 models are still coarse (Table 1), and much coarser than the resolution of 4.5 km that was found to be very effective in reducing long-standing model biases in ocean-only configurations (Q. Wang et al., 2018). Even in the HighResMIP of CMIP6, the spatial resolution in the Arctic Ocean is no finer than 1/4° (Docquier et al., 2019). Nevertheless, these models can improve the AW heat transport toward the Arctic Ocean to some extent in comparison to the models using 1° resolution, which encourages the use of higher model resolution in future CMIP efforts, as suggested by Docquier et al. (2019). There is an ongoing effort to reduce numerical mixing by improving model formulations (Griffies et al., 2020). As the model biases discussed above are likely to be partially associated with numerical mixing in the models, it remains to be seen whether such improvement can lead to breakthroughs in model performance in the Arctic Ocean in the next generations of CMIP simulations.
The MMM shows that the upper ocean including the upper halocline and mixed layer will become fresher in a warming climate (Figure 8), while, simultaneously, the AW layer will become warmer and the AWCD will become shallower (Figure 4 and Figure S2 in Supporting Information S1). The shoaling and warming of the AW layer implies that winter convection, if it happens, does not need to reach very deep to bring up a large amount of ocean heat. Therefore, the changes in the AW layer can potentially enhance sea ice decline through basal melting.
On the other hand, the freshening of the halocline may help to strengthen the isolation of sea ice from the warm AW layer. Because the decrease in upper ocean salinity, and thus the increase in stratification, is much smaller in the Eurasian Basin than in the Canadian Basin in the MMM (Figure 8), sea ice in the Eurasian Basin is more likely to be significantly influenced by the changes in the AW layer. The stronger freshening in the Canadian Basin than in the Eurasian Basin could be partially explained by the effect of sea ice decline, as follows. On average, the ocean surface Ekman transport is directed from the Eurasian Basin toward the Canadian Basin. Sea ice decline can increase ocean surface stress, and thus the Ekman transport, which can enhance freshwater accumulation in the Canadian Basin and tends to reduce it in the Eurasian Basin (Q. Wang et al., 2019; S. Wang et al., 2021).
The models have a large spread in the projected salinity and temperature changes in both analyzed scenarios. Some models show salinification in the upper ocean, and thus a weakening of the ocean stratification, while some models indicate upper ocean freshening that is much stronger than in the MMM, and thus a clear increase in the ocean stratification (Figure 8). Therefore, the models likely do not agree on changes in the strength of vertical mixing or on the possible emergence of deep convection in the Arctic deep basin. In order to better predict the future development of the Arctic Atlantification and its possible impact on sea ice, model uncertainties need to be reduced. As some of the model biases in the Arctic Ocean could have origins outside the Arctic Ocean and possibly in other components of the climate system (Gutjahr et al., 2019; Hinrichs et al., 2021), identifying these origins in individual models is needed to improve the Arctic Ocean representation in CMIP simulations.
Here we note that model biases and spreads are common in all parts of the global ocean in CMIP simulations, although we only assessed the Arctic Ocean in this paper. For example, very large model spreads in simulated key dynamical processes in the North Atlantic have also been found (Heuzé, 2017; Weijer et al., 2020). The model spreads inform us of the uncertainty range of the MMM. Our study suggests the need to further improve CMIP simulations to reduce this uncertainty, while we note that CMIP results are currently the best available basis for assessing possible future changes.
We used one realization from each model in this study. We checked a few realizations from one model and found that the differences in the simulated mean state and climate change signal between the realizations are much smaller than the spread between different models (not shown). Therefore, we consider our results representative of the overall status of CMIP6.
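The comparison between realization differences and inter-model spread can be illustrated with a short calculation. This is a sketch under stated assumptions: the helper `spread_ratio` is hypothetical, the population standard deviation is only one possible spread measure, and the numbers are invented for illustration rather than taken from the models analyzed here.

```python
from statistics import pstdev


def spread_ratio(realizations, models):
    """Ratio of the spread among realizations of one model to the
    spread among different models (one realization per model).

    Both spreads are population standard deviations. A ratio well
    below 1 means that internal variability is small compared with
    structural differences between models.
    """
    inter_model = pstdev(models)
    if inter_model == 0:
        raise ValueError("inter-model spread is zero")
    return pstdev(realizations) / inter_model


# Hypothetical AW layer warming (degC); NOT values from the study.
realizations_one_model = [1.62, 1.70, 1.66]
one_realization_per_model = [0.9, 1.7, 2.4, 1.2, 3.0]

ratio = spread_ratio(realizations_one_model, one_realization_per_model)
print(f"realization spread / inter-model spread = {ratio:.3f}")
```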

Conclusion
In this study, we assessed the temperature and salinity in the Arctic deep basin (the Eurasian and Canadian basins) in CMIP6 historical simulations and the respective climate change projections (SSP245 and SSP585 scenarios). One of our main findings is that the biases in Arctic Ocean temperature and salinity found in CMIP5 historical simulations remain virtually unchanged in the CMIP6 simulations. The AW layer is still too deep and too thick in nearly all models, the multi-model-mean (MMM) halocline is too fresh, and the models have a large spread in both the simulated temperature and salinity. Even some details of the model biases in CMIP6 are similar to those in CMIP5. Therefore, it can be concluded that there is essentially no improvement in the representation of the hydrography in the Arctic deep basins from CMIP5 to CMIP6.
We found that the warming in the Arctic deep basins in the future (2081-2100) relative to the present-day conditions (1995-2014) is largest in the upper AW layer (200-500 m) for the MMM, with a magnitude of about 1.7°C (3°C) and 1.4°C (2.4°C) in the Eurasian and Canadian basins in SSP245 (SSP585). The warming results in an uplift of the AW layer. The corresponding climate change signal of sea surface temperature amounts to about 1°C and 2.8°C in the SSP245 and SSP585 scenarios, respectively. Furthermore, it is shown that in the depth range of the Arctic AW layer, the Arctic Ocean has a stronger warming trend than the global mean. Averaged over the upper 700 m, the increase in Arctic basin temperature at the end of the 21st century is 40% and 60% higher than the global mean in the SSP245 and SSP585 scenarios, respectively. We further found that all the sub-Arctic seas close to the Arctic inflow gateways are warming hotspots in a warming climate, including the northern Nordic Seas, Barents Sea and eastern Bering Sea. Previous studies showed that the warming trend in the AW inflow is not only induced by warming upstream in the North Atlantic, but also can be enhanced by local atmospheric warming around the Arctic gateways (Asbjørnsen et al., 2020; Shu et al., 2021) and ocean-ice feedback processes (Q. Wang et al., 2020).
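The statement that the Arctic warming over the upper 700 m exceeds the global mean by a given percentage rests on two simple steps: a thickness-weighted vertical average and a relative difference. The sketch below illustrates these steps only; the function names, layer thicknesses, and temperature changes are hypothetical, and horizontal (area) weighting, which a full analysis would also require, is omitted.

```python
def layer_weighted_mean(values, thicknesses):
    """Vertical mean of `values` weighted by layer thickness (m)."""
    if len(values) != len(thicknesses) or not values:
        raise ValueError("values and thicknesses must be non-empty and equal in length")
    total = sum(thicknesses)
    return sum(v * t for v, t in zip(values, thicknesses)) / total


def relative_excess_percent(regional, global_mean):
    """Percentage by which the regional change exceeds the global mean change."""
    return 100.0 * (regional - global_mean) / global_mean


# Hypothetical layers spanning 0-700 m and warming per layer (degC);
# NOT values from the study.
thicknesses = [100.0, 100.0, 300.0, 200.0]
arctic_warming = [1.0, 1.2, 1.7, 1.3]
global_warming = [1.6, 1.2, 0.9, 0.6]

arctic_mean = layer_weighted_mean(arctic_warming, thicknesses)
global_mean = layer_weighted_mean(global_warming, thicknesses)
excess = relative_excess_percent(arctic_mean, global_mean)
print(f"Arctic warming exceeds the global mean by {excess:.0f}%")
```

With this definition, a regional warming of 1.4°C against a global mean of 1.0°C corresponds to an excess of 40%.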
The MMM salinity in the upper 400 m is found to decrease in both Arctic basins in the future scenarios, with a stronger decrease in the Canadian Basin than in the Eurasian Basin. Therefore, the stratification in the Arctic upper ocean is projected to be more stable in the MMM in both the SSP245 and SSP585 scenarios. However, the models show a large spread in the simulated climate change signal for upper ocean salinity, with some models indicating upper ocean salinification and others freshening. The upper ocean stratification influences the strength of vertical mixing, and thus potentially the impact of the AW layer on sea ice. Therefore, CMIP6 models do not agree on the extent to which future changes in the AW layer may influence sea ice.
The model biases in CMIP6 identified in this study call for a concerted effort to achieve major, rather than incremental, improvements of coupled climate models in support of future CMIP phases. We discussed possible reasons for the model biases and suggest that measures such as using higher model resolutions, reducing numerical mixing, improving vertical mixing parameterizations, and reducing model biases in the sub-Arctic seas and in related fields of other climate components could help reduce these biases.

Data Availability Statement
We would also like to thank the working groups, who prepared the following data for the CMIP6 historical, SSP245 and SSP585 experiments (listed in Table S1 in Supporting Information S1). The CMIP6 data were downloaded from https://esgf-data.dkrz.de/projects/esgf-dkrz/ and PHC3 data from http://psc.apl.washington.edu/nonwp_projects/PHC/Data3.html.