Context aware benchmarking and tuning of a TByte-scale air quality database and web service

Betancourt, Clara
Hagemeier, Björn
Schröder, Sabine
Schultz, Martin G.

DOI: https://doi.org/10.1007/s12145-021-00631-4
Persistent URL: http://resolver.sub.uni-goettingen.de/purl?gldocs-11858/11033
Betancourt, Clara; Hagemeier, Björn; Schröder, Sabine; Schultz, Martin G., 2021: Context aware benchmarking and tuning of a TByte-scale air quality database and web service. In: Earth Science Informatics, 14, 3, 1597-1607, DOI: https://doi.org/10.1007/s12145-021-00631-4. 
 
Betancourt, Clara; Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Hagemeier, Björn; Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Schröder, Sabine; Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Schultz, Martin G.; Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany

Abstract

We present context-aware benchmarking and performance engineering of a mature TByte-scale air quality database system that was created by the Tropospheric Ozone Assessment Report (TOAR) and contains one of the world’s largest collections of near-surface air quality measurements. A special feature of our data service, https://join.fz-juelich.de, is the on-demand processing of several air quality metrics directly from the TOAR database. As a service used by more than 350 members of the international air quality research community, our web service must be easily accessible and functionally flexible while delivering good performance. We identify the current on-demand calculation of air quality metrics outside the database, together with the necessary transfer of large volumes of raw data, as the major performance bottleneck. In this study, we therefore explore and benchmark in-database approaches to statistical processing, which yield performance enhancements of up to 32%.
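
The contrast between calculating metrics outside the database and processing them in-database can be illustrated with a minimal sketch. It is not the TOAR implementation: SQLite stands in for the production database only so that the example is self-contained, and the table `measurements`, its columns, the synthetic values, and both function names are assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with synthetic hourly values (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements (station_id INTEGER, hour INTEGER, ozone_ppb REAL)"
    )
    # Three stations, 48 hourly values each; the numbers are made up.
    rows = [(s, h, 20.0 + s + (h % 24)) for s in range(3) for h in range(48)]
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", rows)
    return conn


def mean_outside_db(conn, station_id):
    """Transfer every raw row to the client, then aggregate in Python."""
    cursor = conn.execute(
        "SELECT ozone_ppb FROM measurements WHERE station_id = ?", (station_id,)
    )
    values = [row[0] for row in cursor]
    return sum(values) / len(values)


def mean_in_db(conn, station_id):
    """Let the database aggregate, so only a single number is transferred."""
    (mean,) = conn.execute(
        "SELECT AVG(ozone_ppb) FROM measurements WHERE station_id = ?", (station_id,)
    ).fetchone()
    return mean


conn = build_demo_db()
outside = mean_outside_db(conn, 1)
inside = mean_in_db(conn, 1)
print(outside, inside)
```

Both functions return the same mean; the difference lies in how much data leaves the database, which is the transfer cost the abstract identifies as the bottleneck. A simple average is used here only as a stand-in for the air quality metrics the service computes.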