Process-Based Flood Risk Assessment for Germany

Large-scale flood risk assessments are crucial for decision making, for example on flood defense schemes, adaptation planning and the estimation of insurance premiums. We apply the process-based Regional Flood Model (RFM) to simulate a 5000-year flood event catalog for all major catchments in Germany and derive risk curves based on the losses per economic sector. The RFM uses a continuous process simulation including a multisite, multivariate weather generator, a hydrological model considering heterogeneous catchment processes, a coupled 1D–2D hydrodynamic model considering dike overtopping and hinterland storage, spatially explicit sector-wise exposure data and empirical multivariable loss models calibrated for Germany. For all components, uncertainties in the data and models are estimated. We estimate the median Expected Annual Damage (EAD) and Value at Risk at 99.5% confidence for Germany to be €0.529 bn and €8.865 bn, respectively. The commercial sector dominates, making up about 60% of the total risk, followed by the residential sector. The agriculture sector is already affected by floods with small return periods but contributes less than 3% to the total risk. The overall EAD is comparable to other large-scale estimates. However, the estimation of losses for specific return periods is substantially improved. The spatial consistency of the risk estimates avoids the large overestimation of losses for rare events that is common in other large-scale assessments with homogeneous return periods.

Following the 2013 flood in Germany, a new national flood protection program was launched with a funding volume of €5.437 bn (LAWA, 2014). For the prioritization of investments based on decision-support frameworks such as cost-benefit analyses, comprehensive large-scale risk assessment models are essential tools (e.g., de Moel et al., 2015; Kreibich et al., 2014; Vorogushyn et al., 2018).
Several flood risk assessments have been conducted on the national and continental scales. The models applied within these assessments are usually based on generalized data and simplified methods. Spatial dependence between precipitation or discharge peaks at multiple locations is often ignored, i.e., a uniform return period is assumed over large areas for risk estimation (e.g., Alfieri et al., 2015; Ward et al., 2013; Wing et al., 2018; Winsemius et al., 2013). In the resulting flood maps, the scenario with an N-year return period (e.g., 100-year) is composed of all flooded areas within the region, where each location shows the N-year flood. Representing exposed assets as rasterized Gross Domestic Product ignores their spatial variability (e.g., Alfieri et al., 2015, 2018; Ward et al., 2013). The vulnerability curves or flood loss models are often not calibrated for the area of application. For instance, generalized stage-damage curves for all asset types are applied (e.g., Alfieri et al., 2015; Rojas et al., 2013). Finally, uncertainties in data and models, especially with respect to exposure and vulnerability, are rarely considered due to lack of information and computational constraints (e.g., Alfieri et al., 2015, 2018). Over the last years, significant progress in addressing these limitations has converged in the development of a process-based Regional Flood Model (RFM) for Germany (Falter et al., 2015; Metin et al., 2018). For example, the assumption of a uniform return period of the driving variables leads to piecewise, mosaicked inundation patterns across large areas that would not occur in a single flood event, resulting in unrealistic risk estimates for specific return periods in large areas, although the Expected Annual Damage (EAD) is not affected (Lamb et al., 2010; Metin et al., 2020; Nguyen et al., 2020).
To overcome this assumption, the RFM accounts for spatial dependence between extremes by considering spatially consistent precipitation footprints, which result in coherent flows with heterogeneous return periods across the catchments. The effect of floodplain storage and the resulting risk reduction for downstream areas is considered within sub-basins through continuous hydrodynamic modeling (Apel et al., 2008; de Bruijn et al., 2014; Vorogushyn et al., 2018). Since ignoring the spatial distribution of assets results in coarse risk predictions at the local and regional levels (Jongman et al., 2012), the RFM uses spatially explicit, sector-wise asset values at the municipality level, which are disaggregated to detailed land use areas. Risk assessment using loss models that are not calibrated for the area of application results in unreliable risk predictions (Schröter et al., 2014; Wagenaar et al., 2018). In contrast, the RFM employs sector-wise validated empirical flood loss models specific to Germany (Tapia-Silva et al., 2011; Thieken et al., 2008). Since ignoring uncertainties in risk predictions impedes robust decision making (Beven et al., 2018; de Brito & Evers, 2016; Steinhausen et al., 2020), the RFM accounts for uncertainties in data and models such as simulated water levels, the spatial distribution of asset values and vulnerability curves.
The objective of this study is to provide for the first time a process-based, spatially consistent flood risk assessment for the residential, commercial and agricultural sectors for all major catchments in Germany. With the RFM a 5000-year long synthetic flood event catalog is simulated and risk curves derived based on the losses per economic sector. We further capture the range of uncertainty in each modeled component of risk (hazard, exposure and vulnerability). As these results are derived using the best process-based flood risk model chain currently available, they serve as a benchmark for future model development studies and regional risk estimates.

Regional Flood Model (RFM)
The RFM is a chain of four coupled models: (a) A stochastic multi-site, multivariate Regional Weather Generator (RWG) simulates synthetic time series of precipitation, temperature, humidity and solar radiation at climate station locations for the whole of Germany and the upstream parts of riparian countries (Hundecha et al., 2009; Nguyen et al., 2021). (b) These time series are fed into the hydrological model, the Soil and Water Integrated Model (SWIM) (Krysanova et al., 1998), to obtain runoff input into the river network. SWIM operates on a sub-basin scale and uses the Muskingum routing method. (c) The 1D hydrodynamic model component of the Regional Inundation Model (RIM) is driven by discharge input from SWIM and estimates flood flows and water levels along the river network, considering dike overtopping and water inflow into protected floodplains. If overtopping is simulated, the raster-based 2D model component of RIM is activated to estimate the spatial distribution of maximum inundation depth and duration in the protected hinterland. (d) These patterns are then intersected with exposure maps of asset values in €/m² separately for different asset types (e.g., buildings, contents) and the three sectors (residential, commercial, agriculture). See the Supporting Information S1 [chapter 2] for a detailed description of the exposure maps. Finally, specific multivariable Flood Loss Estimation MOdels (FLEMO) per asset type and sector are applied to compute economic loss. The rule-based loss models for the residential, commercial and agricultural sector (Tapia-Silva et al., 2011) were developed and validated on the basis of flood loss data from Germany.
The 2D hydrodynamic model component operates at a spatial resolution of 100 × 100 m, and the exposure data are disaggregated to units of the digital basic landscape model ATKIS Basic DLM. The loss estimation is therefore initially computed on polygons that may be smaller than the 100 × 100 m grid, but the results are interpreted on an aggregated scale (e.g., catchment). The individual components of the RFM are calibrated and validated based on past flood events in Germany (Falter et al., 2015; Hundecha et al., 2009; Seifert et al., 2010; Tapia-Silva et al., 2011; Thieken et al., 2008). Further technical details on the individual model components (see Figure 1) and their application in this study are provided in the following sections. The calibration and validation of the model components are described in the Supporting Information S1 [chapter 1].

Regional Weather Generator (RWG)
The first component of the RFM is the multi-site, multivariate stochastic Regional Weather Generator (RWG) based on a first-order multivariate autoregressive model considering the spatial correlation structure (Hundecha et al., 2009; Nguyen et al., 2021). The model has two simulation stages. In the first stage, it generates daily precipitation at multiple point locations simultaneously. In the second stage, if necessary, it generates daily non-precipitation variables such as temperature (maximum, minimum, average), relative humidity and solar radiation conditioned on the state (dry/wet) of the generated precipitation. The RWG was originally introduced by Hundecha et al. (2009) and has recently been improved and evaluated for all major German river basins by Nguyen et al. (2021). For non-zero precipitation values at individual stations, a theoretical distribution is fitted to simulate daily precipitation sums. The RWG provides two options: (a) a six-parameter mixed distribution (Frigessi et al., 2002; Hundecha et al., 2009) combining a Gamma distribution for bulk precipitation and a Generalized Pareto distribution for extreme precipitation, and (b) a parsimonious three-parameter extended Generalized Pareto distribution allowing a smooth transition between the bulk distribution and extreme values (Naveau et al., 2016; Nguyen et al., 2021). In the present study, the mixed distribution was used because it generally shows a better fit to observational data, although the fitting procedure is much more computationally expensive. The mixed distribution of non-zero precipitation is then combined with the frequency of dry days to formulate the distribution of the full range of precipitation values. For the non-precipitation variables, the normal distribution is used to model humidity and temperature values. The radiation data exhibit a relatively strong right skew; therefore, the normal distribution is fitted to their square-root-transformed values.
Moreover, to account for seasonality, the RWG is parameterized on a monthly basis.
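The core idea of a first-order multivariate autoregressive model with a prescribed spatial correlation can be illustrated with a minimal sketch. It is not the RWG implementation: the function name, the single shared lag-1 coefficient `phi` and the example correlation matrix are assumptions made for illustration, and the transformation of the Gaussian values to precipitation amounts via the fitted distributions is omitted.

```python
import numpy as np

def simulate_multisite_ar1(corr, phi, n_days, seed=0):
    """Simulate standardized daily Gaussian values at several sites.

    corr   : (k, k) target lag-0 spatial correlation matrix (assumed values)
    phi    : lag-1 autocorrelation, taken equal at all sites for simplicity
    n_days : length of the simulated series
    """
    rng = np.random.default_rng(seed)
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    # Innovation covariance chosen so that the stationary covariance is corr:
    # corr = phi^2 * corr + cov_eps  =>  cov_eps = (1 - phi^2) * corr
    chol_eps = np.linalg.cholesky((1.0 - phi**2) * corr)
    z = np.empty((n_days, k))
    # Start from the stationary distribution
    z[0] = np.linalg.cholesky(corr) @ rng.standard_normal(k)
    for t in range(1, n_days):
        z[t] = phi * z[t - 1] + chol_eps @ rng.standard_normal(k)
    return z

# Two hypothetical stations with a spatial correlation of 0.7
series = simulate_multisite_ar1([[1.0, 0.7], [0.7, 1.0]], phi=0.5, n_days=365)
```

A monthly parameterization, as described above, would correspond to using a separate `corr` and `phi` for each calendar month.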

Hydrological Model (SWIM)
The second component of the RFM is the conceptual, semi-distributed hydrological model SWIM (Krysanova et al., 1998). SWIM is designed as an integrated tool for hydrological and water quality modeling in meso-scale and large river basins. The hydrological component of SWIM solves the water balance equation considering snowmelt, precipitation, evapotranspiration, percolation, surface and subsurface runoff, recharge, and capillary rise and simulates daily runoff as a target variable. SWIM employs a 3-level spatial disaggregation scheme including basins, sub-basins and hydrotopes. A major river basin (e.g., Elbe) is subdivided into sub-basins linked through a topological network which is derived in a pre-processing step. The sub-basins are further subdivided into the hydrological response units or hydrotopes, where soil type, land use and other properties are assumed to be homogeneous. Runoff is calculated for each hydrotope and then aggregated on a sub-basin scale. For flow routing between sub-basins, the Muskingum routing approach is used (Cunge, 1969) to obtain total discharge at the outlet of each sub-basin.
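The Muskingum routing between sub-basins can be sketched as follows. This is the textbook form of the method, not SWIM code; the function name and the parameter values in the usage example are assumptions for illustration. `k` is the storage (travel time) constant and `x` the weighting factor, both in units consistent with the time step `dt`.

```python
def muskingum_route(inflow, k, x, dt, initial_outflow=None):
    """Route an inflow hydrograph through one reach with the Muskingum method.

    inflow : sequence of inflow discharges at a constant time step
    k      : storage constant (same time unit as dt)
    x      : weighting factor, typically between 0 and 0.5
    dt     : time step
    """
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom  # c0 + c1 + c2 == 1
    # Assume steady state at the start unless an initial outflow is given
    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow

# Hypothetical daily inflow hydrograph in m^3/s
routed = muskingum_route([10, 10, 50, 100, 60, 30, 10, 10, 10, 10], k=2.0, x=0.2, dt=1.0)
```

Because the method needs only `k` and `x` per reach and no cross-section geometry, it returns discharge but no water levels.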

Regional Inundation Model (RIM)
The third component of the RFM is a Regional Inundation Model (RIM) consisting of two coupled sub-components: RIM1D and RIM2D. Since SWIM employs the Muskingum method to route the flow between sub-basins, it does not need the detailed geometry of sub-basin channels (e.g., cross-section profiles) and hence cannot provide the water level information required for the simulation of flood defense overtopping. Therefore, RIM1D, a one-dimensional (1D) hydrodynamic routing model, is used for flow routing and for the calculation of water levels and overtopping discharge. The hydrodynamic routing scheme solves the diffusive wave equations describing water flow in open channels using an explicit finite difference scheme. An adaptive time step technique based on the Courant-Friedrichs-Lewy criterion is used to maintain numerical stability. When the water level exceeds the dike crest height, the overtopping flow into the hinterland is calculated with the broad-crested weir equations (Falter et al., 2015). In order to simulate the inundation process in the hinterland, RIM2D, a two-dimensional (2D) hydrodynamic model using an inertial formulation of the shallow water equations, is employed (Bates et al., 2010; Falter et al., 2015). The momentum and continuity equations are solved explicitly using a finite volume scheme and are suitable for efficient parallelization. In the RFM, RIM2D is implemented in the CUDA Fortran programming language on highly parallelized NVIDIA Graphical Processing Units (GPUs) (Falter et al., 2016). It should be noted that the 1D-2D coupling is currently one-way: water flows from the channel to the hinterland but not vice versa. The overtopping terminates as soon as the water level in the hinterland is equal to the water level in the channel.
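The overtopping rule and the stability constraint described above can be sketched in a few lines. This is an illustration under stated assumptions, not the RIM implementation: the function names, the discharge coefficient `cd` and the stability factor `alpha` are hypothetical, the weir formula is the standard free-flow broad-crested form without submergence correction, and the time step limit follows the form commonly used with inertial shallow water schemes.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def overtopping_discharge(channel_level, hinterland_level, crest_level,
                          crest_length, cd=1.0):
    """Free-flow broad-crested weir discharge over a dike section in m^3/s.

    One-way coupling: flow goes from channel to hinterland only and stops
    once the hinterland water level has reached the channel water level.
    cd is an assumed discharge coefficient.
    """
    if channel_level <= crest_level or hinterland_level >= channel_level:
        return 0.0
    head = channel_level - crest_level
    return cd * (2.0 / 3.0) ** 1.5 * math.sqrt(G) * crest_length * head ** 1.5

def cfl_time_step(max_depth, dx, alpha=0.7):
    """Adaptive time step limit dt = alpha * dx / sqrt(g * h_max).

    max_depth must be positive; alpha < 1 is an assumed safety factor.
    """
    return alpha * dx / math.sqrt(G * max_depth)

# 1 m of head over a hypothetical 100 m long dike section, dry hinterland
q = overtopping_discharge(11.0, 9.0, 10.0, 100.0)
dt = cfl_time_step(max_depth=1.0, dx=100.0)
```

The time step shrinks as the maximum water depth in the domain grows, which is why an adaptive rather than a fixed step is used.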

Flood Loss Estimation Models (FLEMO)
The final component of the RFM is loss estimation based on the FLEMO model family, which has been developed from empirical loss data of German river floods and validated in previous modeling studies (Kreibich et al., 2010; Tapia-Silva et al., 2011; Thieken et al., 2008). For the residential and commercial sectors, loss functions are based on water depth, discretized into five levels (<0.21, 0.21-0.6, 0.61-1, 1.01-1.5, >1.5 m). The residential sector model distinguishes three building types (single-family, semi-detached, multi-family) and two building quality levels (low/medium quality, high quality). The commercial sector is subdivided into four sub-sectors (producing industry, trade, corporate services, public & residential services) and three size classes according to the number of employees (1-10, 11-100, >100). For the agricultural sector, loss functions are based on four inundation duration classes (1-3, 4-7, 8-11, >11 days), the season of the flood (per month) and seven different crop types (canola, maize, potatoes, sugar beet, barley, rye, wheat) (Förster et al., 2008). Upper and lower bounds of the uncertainty in the loss estimation models are accounted for by applying the Mean Absolute Error estimated during calibration/validation of the FLEMO model family.
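The discretization of the hazard variables into the classes listed above can be sketched as a simple lookup. The function and constant names are hypothetical, rounding depths to whole centimetres before classification is an assumption about how values between the stated class limits are handled, and no loss ratios are included because they are not given here.

```python
from bisect import bisect_right

# Lower bounds (in cm) of the second to the last water depth class
DEPTH_BOUNDS_CM = [21, 61, 101, 151]
DEPTH_LABELS = ["<0.21 m", "0.21-0.6 m", "0.61-1 m", "1.01-1.5 m", ">1.5 m"]

# Lower bounds (in days) of the second to the last duration class
DURATION_BOUNDS_DAYS = [4, 8, 12]
DURATION_LABELS = ["1-3 days", "4-7 days", "8-11 days", ">11 days"]

def depth_class(depth_m):
    """Return the water depth class for a depth in metres."""
    depth_cm = round(depth_m * 100)  # assumption: classify at cm precision
    return DEPTH_LABELS[bisect_right(DEPTH_BOUNDS_CM, depth_cm)]

def duration_class(days):
    """Return the inundation duration class for a whole number of days."""
    if days < 1:
        raise ValueError("duration must be at least one day")
    return DURATION_LABELS[bisect_right(DURATION_BOUNDS_DAYS, days)]

print(depth_class(0.75), "|", duration_class(5))
```

In a rule-based model of this kind, the class labels would then index a table of loss ratios per building type, quality level, sub-sector or crop.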

Risk Curves
Flood risk estimates from a 5000-year event catalog simulated using the RFM are represented using the Occurrence Exceedance Probability and Aggregate Exceedance Probability curves. Both are standard measures used in the insurance industry that are included in typical vendor catastrophe modeling software (e.g., by RMS, AIR Worldwide, EQECAT) and are also used in the respective scientific literature (e.g., Hillier et al., 2015; Priestley et al., 2018). Aggregate Exceedance Probability is calculated by ranking the sum of the total loss of all events per year, while Occurrence Exceedance Probability is calculated by ranking only the most severe individual loss event per year. The resulting table has as many rows as there are years in the simulation (with empty rows for years without losses at the bottom); therefore, the simulated loss return period can be derived directly from this ranking. The difference between the Aggregate Exceedance Probability and Occurrence Exceedance Probability curves indicates whether the losses are produced by single extreme events or rather by many small to medium scale events (Priestley et al., 2018). From these curves we further derive the EAD, as well as the VAR (Value At Risk) and TVAR (Tail Value At Risk) based on the event with a loss return period of 200 years. The EAD distributes risk equally over time, while VAR gives the maximum expected loss in a year at a specified confidence level, and TVAR the average loss above the confidence level of VAR. TVAR is therefore an indicator of the upper tail risk. VAR at 99.5% confidence (VAR 99.5%), i.e., at a return period of 200 years, is the current regulatory standard in the Solvency II legislation on required minimum reserves of insurance undertakings (Dos Reis et al., 2010).
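The ranking procedure and the derived risk measures can be illustrated with a short sketch. The function names, the rank-based return period (number of simulated years divided by rank) and the convention that TVAR averages the annual losses at or above the VAR rank are assumptions for illustration; other plotting positions and tail conventions exist.

```python
import numpy as np

def exceedance_curves(event_years, event_losses, n_years):
    """Build Aggregate and Occurrence Exceedance Probability curves.

    event_years  : zero-based simulation year of each event
    event_losses : loss of each event
    n_years      : length of the simulated catalog in years
    """
    annual_sum = np.zeros(n_years)  # basis of the aggregate curve
    annual_max = np.zeros(n_years)  # basis of the occurrence curve
    for year, loss in zip(event_years, event_losses):
        annual_sum[year] += loss
        annual_max[year] = max(annual_max[year], loss)
    aep = np.sort(annual_sum)[::-1]  # years without loss end up at the bottom
    oep = np.sort(annual_max)[::-1]
    return_period = n_years / np.arange(1, n_years + 1)  # rank 1 = rarest
    return return_period, aep, oep

def risk_measures(annual_losses_desc, return_period=200):
    """EAD, VAR and TVAR from annual losses sorted in descending order."""
    n = len(annual_losses_desc)
    k = max(1, int(round(n / return_period)))  # rank with this return period
    ead = float(np.mean(annual_losses_desc))
    var = float(annual_losses_desc[k - 1])
    tvar = float(np.mean(annual_losses_desc[:k]))
    return ead, var, tvar

# Toy catalog: 10 simulated years, four events (two of them in year 0)
rp, aep, oep = exceedance_curves([0, 0, 3, 7], [5.0, 3.0, 10.0, 2.0], n_years=10)
ead, var, tvar = risk_measures(aep, return_period=5)
```

For a 5000-year catalog and a return period of 200 years, the VAR under this convention is the 25th largest annual loss.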
The uncertainty range in exposure is estimated based on uniform sampling of the factors leading to variations in the asset values of the residential and commercial sectors. Uncertainty in the flood loss models for the residential, commercial and agriculture sectors is estimated based on the validation of predictions against empirical loss data. Uncertainty in water depths from the 5000-year event catalog is quantified by adding Gaussian noise to the simulations that represents the random uncertainty in dike heights. The upper and lower uncertainty ranges from the data and the individual components of the RFM are aggregated and reported in the risk curves and in the EAD and VAR 99.5% estimates. The bounds indicate the aggregation of uncertainties in hazard (flood inundation levels), exposure (asset values) and vulnerability (flood loss models). Further details on uncertainty estimation are provided in the Supporting Information S1 and in chapter 2.1 for the respective RFM components.

Risk Estimates: Expected Annual Damage (EAD), Value at Risk (VAR 99.5%)
The median EAD for Germany aggregated across the three sectors (residential, commercial and agriculture) and across all catchments is estimated to be €0.529 bn (Figure 2c). The lower and upper limits of the EAD considering uncertainties are €0.245 bn and €0.94 bn, respectively. The commercial sector dominates, making up about 60% of the total risk, followed by the residential sector (Figure 2d). The agriculture sector contributes less than 3% to the total risk. The individual catchments contribute quite differently to the EAD: Rhine (28%), Danube (27%), Elbe (40%), Weser (5%) and Ems (<0.1%) (see Figure 2c). Across catchments, the Elbe, Danube and Rhine share about 90% of the total risk.
The high uncertainty range of several billion Euros of the VAR 99.5% estimation is not unusual for large scale flood risk assessments of extreme events (Alfieri et al., 2015). However, it underpins the importance to quantify and communicate uncertainties. The uncertainty of the probabilistic estimate for the VAR is of importance for rating agencies and regulatory requirements for insurers. Insurers are typically requested to keep a minimum amount of reserve assets as indicated by the VAR at a given confidence level, and are rated accordingly. However, the uncertainty of the modeling itself hints at the potential necessity of additional backup reserves. Improving the reliability of the upper tail risk estimates (i.e., reducing the uncertainty) is therefore of practical relevance.

Flood Risk Curves
Flood risk curves are shown in Figure 4, separated for the three sectors and the large river catchments in Germany. The differences between the Aggregate and Occurrence Exceedance Probability risk curves are rather small and always within the uncertainty bounds (Figure 4), indicating that the RFM rarely predicts multiple strong events per year. In the entire 5000-year simulation, only six flood events resulting in losses >€0.1 bn were modeled in the Ems catchment. These results agree with historical records such as EM-DAT and the Hanze database, which contains flood records since 1872 (EM-DAT, 2020; Paprotny et al., 2018). The losses to the residential/commercial and agricultural sectors start only at return periods of around 200 and 1,000 years, respectively (Figures 4g-4i). We estimate the EAD and VAR 99.5% for the Ems catchment to be €0.61 mn and €21 mn, respectively.
The agricultural sector is affected by floods of smaller return periods than the residential and commercial sectors in all catchments (Figures 4c, 4f, and 4l), as crop fields often have no protection at all or are protected by low dikes with a design level of 5- to 10-year floods (ATV-DVWK, 1989). Residential areas, i.e., cities and villages, are usually protected, with the protection level generally higher in more densely populated places. Commonly, dike design levels in Germany range between 100- and 500-year floods (te Linde et al., 2011). Therefore, it is to be expected that loss in the residential sector only occurs above a certain return period, i.e., the specific design level of flood protection in the flood-affected area. Commercial areas may sometimes be located outside the cities or even directly at the river for transportation reasons. Particularly along the Elbe, loss to commercial buildings is predicted to start at lower return periods than loss to residential buildings (Figures 4d-4f). This also explains the high EAD share of the commercial sector in the Elbe catchment (Figure 2b).
In the Danube catchment, the residential and commercial sectors contribute equally to the modeled risk, while the commercial sector dominates along the Elbe, Rhine and Weser (Figures 2a, 2b, 4a, and 4b). Agriculture makes up only a small share of the total risk; however, by far the highest loss in this sector occurs along the Elbe. This finding can be explained by the smooth topography in large parts of the Elbe catchment, which leads to shallow floods of large extent. Further, the relatively high agricultural losses are related to the coincidence of high flood probability and high crop susceptibility in spring and summer. The mean seasonal flood risk of crops (canola, sugar beets and potatoes) in the Elbe catchment peaks in spring and in late summer. Increased flood probability during the summer months also yields a larger risk for crops in the Danube catchment. In contrast, in the Rhine catchment flood probability is high in winter, when the flood susceptibility of crops is very low (Klaus et al., 2016). The agricultural sector in the Elbe catchment is also the only case where the Aggregate Exceedance Probability curve is constantly and visibly above the Occurrence Exceedance Probability curve (see Figure 4f), indicating that multiple small events per year occur, which do not affect residential buildings but do affect agricultural areas and a few commercial sites. This result can be explained by the flood regimes in Germany. The Elbe catchment shows a rather variable flood seasonality, that is, floods frequently occur in winter but are also not unusual in summer, while flood occurrence in the other catchments is more restricted to the winter or summer half year (Beurton & Thieken, 2009). The higher variability in the Elbe catchment can thus more easily lead to multiple small events within one year.

Significance and Interpretation of Risk Estimates
The median EAD for Germany aggregated across all catchments and across the residential, commercial and agriculture sectors was found to be €0.529 bn, with lower and upper uncertainty bounds of €0.245 bn and €0.94 bn, respectively. This value agrees with the EAD for Germany reported by studies at continental and global scales. For example, for the baseline (present) scenario, Alfieri et al. (2015) and Hattermann et al. (2014) reported EAD values of €0.4 bn-€1 bn and €0.5 bn, respectively. Additionally, global assessment studies such as Dottori et al. (2018) and Alfieri et al. (2018) reported EAD for Germany in the ranges €0.5-€2.8 bn and €0.1-€0.6 bn. Feyen et al. (2012) used two different climate models and derived EAD values of €0.581 bn and €0.483 bn, respectively, for the current climate. Alfieri et al. (2018) computed that the EAD for Germany based on reported losses in EM-DAT is between €0.74 bn and €2.5 bn. This agreement is in line with our expectation, since considering or disregarding spatial dependence does not influence the EAD but leads to bias in the risk curve. Accordingly, our risk estimate for an event corresponding to the 100-year loss is €4.973 bn (lower and upper uncertainty bounds of €2.345 bn and €8.69 bn, respectively), which is significantly lower than the potential flood impact value of €15.5 bn reported by Alfieri et al. (2015), who based their estimate on the assumption of homogeneous 100-year return period flood discharges. When assuming full dependence, i.e., simultaneous occurrence of 100-year return period floods at every location/spatial unit, the total 100-year loss integrated over the entire domain is strongly overestimated (Lamb et al., 2010; Nguyen et al., 2020).
In some of the risk curves, we see stepwise increases at return periods above 200 years (Figures 4a and 4b) and mainly above 2000 years (Figures 4e, 4j, 4k, and 4o). As an example, we have investigated the step change at about 200 years in the Danube basin (Figures 4a and 4b). Several events before and beyond the step change were selected, and the spatial loss patterns in three major branches (Upper Danube, Isar and Inn) with their respective return periods were analyzed. We have not detected any specific threshold in any of these branches whose exceedance would result, e.g., in a sudden inundation of a city and thus cause a step change in the risk curve. The step change is rather caused by a random combination of loss patterns, which are not extraordinary themselves but result in a considerable impact when combined. The fact that a random combination of loss patterns in different river branches results in a step change hints at the sampling uncertainty in the tail of the loss distribution. Analyzing more than 5000 years should result in smoother risk curves in the tails.
We compare the loss estimates from the events generated by the RFM between 1990 and 2003 with losses reported in the Hanze data set (Paprotny et al., 2018). The reported losses lie within the upper and lower bounds of the RFM loss estimations for four out of the five events (Figure 5). The event in 2002 was extreme and caused more than 100 dike breaches. Currently, the RFM does not account for dike breaches in the model chain, and only overtopping flow is considered. In particular, many breaches occurred in 2002 at water levels below the design height due to piping and slope instability breach mechanisms (Horlacher et al., 2005). Thus, we are unable to replicate the processes that occurred during the 2002 event, which results in an underestimation of losses. Hence, the detailed consideration of flood defense failures is an important next step in the further development of the RFM. Additionally, we aim to use more detailed object-level exposure data from crowd-sourced, open datasets (Paprotny et al., 2020; Sieg et al., 2019) and to implement probabilistic loss models (Schoppa et al., 2020; Steinhausen et al., 2020) in the future. Currently, due to the stationarity assumption of the weather generator, the period used to set up and calibrate the weather generator represents a certain climate state, and the presented simulations do not account for climate change.

Conclusions
Applying the process-based RFM, we derive risk curves based on the losses of the residential, commercial and agriculture sectors for the whole of Germany. The estimates for the median (lower and upper uncertainty bounds) EAD and VAR 99.5% for Germany are €0.529 bn (€0.245 bn and €0.94 bn) and €8.865 bn (€4.148 bn and €15.386 bn), respectively. The commercial sector dominates, making up about 60% of the total risk, followed by the residential sector. Across catchments, the Elbe, Danube and Rhine share about 90% of the total risk. These results support the investment decisions of the national flood protection program (LAWA, 2014) and stress the need to quantify, communicate and further reduce uncertainties in flood risk assessments, including for extreme events.
The process-based, spatially consistent flood risk estimates by the RFM are an important step forward, since the consideration of spatial dependence avoids the overestimation of losses for specific high return periods. Still, the results are associated with high uncertainty, and there is room for further improvement, such as the integration of dike breaches, the use of model components that are intrinsically probabilistic and the extension of the 5000-year simulation period. Nevertheless, the presented results should be regarded as first estimates, which can serve as a benchmark for future Germany-wide flood risk assessments.

Data Availability Statement
Sources of data and model components are referenced in the manuscript and Supporting Information S1. The data pertaining to risk curves can be accessed from the GFZ data repository (https://doi.org/10.5880/GFZ.4.4.2021.003).