README.txt

This dataset contains projected temperature and ozone data provided by EPA's Office of Research and Development in support of the manuscript "A Flexible Bayesian Ensemble Machine Learning Framework for Predicting Local Ozone Concentrations," by Xiang Ren, Panos Georgopoulos, et al.

This dataset was prepared using meteorology downscaled from CESM RCP8.5 simulation with the CMAQ chemical transport model. Further details are provided in Fann et al. (2021) and Nolte et al. (in review). Anthropogenic emissions were from the 2011 emission model platform used for analysis of the Heavy Duty Greenhouse Gas Rule (U.S. EPA, 2016).

=====
mda8.zip

mda8 - contains layer 1 daily maximum 8-h average O3 concentrations (ppbV) modeled by CMAQ for each day for the period 2045-01-01 through 2055-12-31.

Data are available for individual years in netCDF format (using Models-3 I/O API):
    mda8.YYYY.nc for YYYY in [2045, 2055]

The file cesm.mda8.Rdata contains exactly the same data:
    mda8 is an array with dimensions [148, 110, 365, 11] corresponding to column, row, julian day, and year.
    Data values are daily maximum 8-h average ozone in the lowest model layer, in ppbV.
    Column/row (1, 1) is in the lower left (southwest) corner of the domain.

=====
met.zip

met - contains R data files for meteorological variables, all dimensioned [148, 110, 365, 11].

cesm.mcip.t2.prec.2045-2055.Rdata:
    t2.dmax - daily maximum 2-m temperature, deg C
    t2.dmin - daily minimum 2-m temperature, deg C
    precip.dsum - daily accumulated precipitation, mm

The GRIDCRO and GRIDDOT files describe the grid used, in IOAPI/netCDF format.

=====
The following data/variables were provided but were not used in the analysis. Due to space limitations, they are not included in this dataset, but are archived at the EPA's Environmental Modeling and Visualization Laboratory (EMVL) Archival Storage Management (ASM) system.

cesm.mcip.rh.2045-2055.daily.Rdata:
    tlml.dmin - daily minimum temperature in lowest model layer, deg C
    tlml.dmax - daily maximum temperature in lowest model layer, deg C
    rh.dmin - daily minimum relative humidity in lowest model layer (fraction)
    rh.dmax - daily maximum relative humidity in lowest model layer (fraction)

    The CMAQ aerosol model computes RH from air temperature, air pressure, and water vapor mixing ratio using the algorithm of Alduchov and Eskridge, J. Appl. Meteorol. (1996). That procedure was replicated here. With this procedure, the calculated RH exceeds 1 less than 1% of the time. Bounds of [0.005, 0.99] were then applied to the RH, as done in CMAQ.

cesm.mcip.RGRND.2045-2055.daily.Rdata:
    var.dmean - shortwave radiation reaching ground, W m-2. This is the RGRND variable output by MCIP. It is not actually used within CMAQ, but it is a useful diagnostic for cloudiness.

cesm.mcip.winds.2045-2055.daily.Rdata:
    wspd.dmean - scalar average 10-m wind speed (i.e., average WSPD10 for all hours)
    wdir.dmean - vector average 10-m wind direction (calculated by computing u- and v-component velocities, averaging them, then taking the inverse tangent)

=====
emis.zip

emis - The CMAQ modeling was done using a 36x36 km grid. However, the emissions were also processed on a 12x12 km grid. Non-point source emissions were merged into a 2-d dataset for each day of 2011, with hourly time steps.

sumemis_2011ei.nc - Sum of emissions over each hour in 2011. Resulting units are 10^6 moles (for gas-phase species) and Mg, or metric tonnes, for aerosols. These emissions include all source sectors, including mobile sources and biogenic emissions. Emissions are speciated using the cb05 chemical mechanism.

Additionally, emissions are available from five point source sectors. Emissions from these sectors were placed onto the 12-km grid and summed to calculate annual totals.
    ptegu.2011 - point source emissions from electric generating units (power plants)
    ptnonipm.2011 - point source emissions from non-EGU sources (i.e., not calculated by the Integrated Planning Model)
    pt_oilgas.2011 - point source emissions from the oil and gas sector
    ptfire.2011 - point source emissions from wildland fires in the year 2011
    othpt.2011 - point source emissions within the modeling domain but outside the U.S.

The gas-phase emissions may be converted from moles to tons by multiplying by the molecular weight. However, be advised that the nonpoint emissions in sumemis_2011ei.nc contain mobile source emissions as well as biogenic emissions, which may not be desirable for this project. If necessary, emission totals could be recomputed without those sectors, but it would take some time and may not be worthwhile.

=====
REFERENCES

Fann, N. L., Nolte, C. G., Sarofim, M. C., Martinich, J., Nassikas, N. J. (2021), Associations between simulated future changes in climate, air quality, and human health, JAMA Network Open, 4, e2032064, doi:10.1001/jamanetworkopen.2020.32064.

Nolte, C. G., Spero, T. L., Bowden, J. H., Sarofim, M. C., Martinich, J., Mallard, M. S., Fann, N., "Regional Temperature-Ozone Relationships Across the U.S. Under Multiple Climate and Emissions Scenarios," submitted manuscript, 2020.

U.S. EPA, Emissions Inventory for Air Quality Modeling Technical Support Document: Heavy-Duty Vehicle Greenhouse Gas Phase 2 Final Rule, U.S. Environmental Protection Agency, EPA 420-R-16-008, 2016.

This file prepared by:
Christopher G. Nolte
Center for Environmental Measurement and Modeling
U.S. Environmental Protection Agency
Research Triangle Park, NC 27711
29 March 2021
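
=====
EXAMPLE: converting gas-phase emissions from moles to mass

The mole-to-mass conversion described under emis.zip can be sketched as below. This is only an illustration and is not part of the original processing. The function names and the choice of species are hypothetical. The sketch assumes "tons" means metric tonnes (Mg), matching the aerosol units in sumemis_2011ei.nc. Lumped cb05 species do not have a single molecular weight, so only a few explicit species are shown.

```python
# Molecular weights in g/mol for a few explicit (non-lumped) species.
# The species selection is illustrative only (an assumption, not from the dataset).
MOLECULAR_WEIGHT_G_PER_MOL = {
    "CO": 28.010,
    "NO": 30.006,
    "NO2": 46.0055,
    "SO2": 64.066,
}

GRAMS_PER_TONNE = 1.0e6  # 1 metric tonne (Mg) = 10^6 g


def moles_to_tonnes(moles, species):
    """Convert an emission total in moles to metric tonnes (hypothetical helper)."""
    mw = MOLECULAR_WEIGHT_G_PER_MOL[species]
    # moles * (g/mol) gives grams; divide by 10^6 to get metric tonnes.
    return moles * mw / GRAMS_PER_TONNE


def megamoles_to_tonnes(megamoles, species):
    """Convert a total given in units of 10^6 moles to metric tonnes.

    Because 10^6 mol * MW g/mol = MW * 10^6 g, the result is numerically
    the input value multiplied by the molecular weight.
    """
    return moles_to_tonnes(megamoles * 1.0e6, species)
```

For example, megamoles_to_tonnes(2.0, "CO") multiplies 2.0 by the CO molecular weight of 28.010 g/mol. If short tons are wanted instead, the result would need a further conversion (1 short ton = 907.18474 kg).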