Natural Hazard Histories Dataset

Type  File Geodatabase Feature Class

Thumbnail

Tags

Summary

The United States (US) Environmental Protection Agency (EPA) collected and analyzed yearly records of nine natural hazard events for the 50 US States and Puerto Rico (PR) from 2000 to 2019 to investigate temporal and spatial trends of natural hazard exposures, both individually and in combination, across the 20-year timespan. This feature class contains yearly natural hazard exposure estimates (percent area) for all US counties, boroughs, parishes, and PR municipalities (exceptions are documented). The hazards included in this data set are hurricanes, tropical storms, tornadoes, landslides, wildfires, drought, coastal and inland flooding, and earthquakes.

Description

This feature class contains yearly (2000-2019) natural hazard exposure estimates (percent area) at the county level for the entire US (including AK, HI, and PR). Secondary data sources were used to collect both tabular and spatial hazard exposure information related to hurricanes, tropical storms, tornadoes, landslides, wildfires, drought, coastal and inland flooding, and earthquakes. Data origin, temporal and geographic coverage, and methods for calculating exposure estimates are discussed later in this document on a per-hazard basis. Candidate secondary data was reviewed for accessibility, temporal and spatial scale, and data formatting. Target data acceptance criteria were open access, a per-year basis, and vector or raster data presentation. When multiple data sources were available for the same natural hazard, the data sets were compared and the source most likely to continue publishing data was selected. In cases where data did not fully meet acceptance criteria, the best available data was used for further analysis.

Credits

Summers, J.K., L.C. Harwell, K.D. Buck, L.M. Smith, D.N. Vivian, J.E. Harvey, M.D. McLaughlin and S.F. Hafner. 2017. Development of a Climate Resilience Screening Index (CRSI). Sustainable and Healthy Communities Research Program Technical Report. EPA600/R-17/238. Office of Research & Development, Washington, DC.

Summers, J.K., L.C. Harwell, K.D. Buck, L.M. Smith, D.N. Vivian, J.E. Harvey, M.D. McLaughlin, S.F. Hafner and C.A. McMillion. 2020. Development of a Cumulative Resilience Screening Index (CRSI) for Natural Hazards. Sustainable and Healthy Communities Research Program Technical Report. EPA600/R-20/274. Office of Research & Development, Washington, DC. (Revision and update of earlier report)

Use limitations

Please check sources, scale, accuracy, currency and other available information. Please confirm that you are using the most recent copy of both data and metadata.

Extent

West  -169.2     East -65.1
North  71.6     South 17.2

Scale Range

Maximum (zoomed in)  1:5,000
Minimum (zoomed out)  1:150,000,000

Topics and Keywords 

Themes or categories of the resource Environment

Theme keywords Environment, Exposure, Hazards, Resilience

Thesaurus's language English (UNITED STATES)
Thesaurus 
Title EPA GIS Keyword Thesaurus
Publication date 2007-11-02

Resource location online
Online location (URL) https://www.epa.gov/geospatial/epa-metadata-technical-specification
Name EPA Metadata Technical Specification
Function performed information

Theme keywords 020:097

Thesaurus's language English (UNITED STATES)
Thesaurus 
Title Federal Program Inventory
Publication date 2013-09-16

Resource location online
Online location (URL) https://www.performance.gov/federalprograminventory
Name Federal Program Inventory
Function performed information

Theme keywords Coastal Flooding, Drought, Earthquakes, Hurricanes, Inland Flooding, Landslides, Tornadoes, Tropical Storms, Wildfires

Thesaurus's language English (UNITED STATES)
Thesaurus 
Title User
Publication date 2021-08-05

Place keywords Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Puerto Rico, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, United States, Utah, Vermont, Virginia, Washington, Washington DC, West Virginia, Wisconsin, Wyoming

Thesaurus's language English (UNITED STATES)
Thesaurus 
Title EPA Place Names
Publication date 2015-01-31

Resource location online
Online location (URL) https://ofmpub.epa.gov/sor_internet/registry/termreg/searchandretrieve/taxonomies/search.do?search=&searchString=&taxonomyName=WBT%20-%20Geographic%20Locations
Name Web Taxonomy - Geographic Locations
Function performed information

Citation 

Title Natural Hazard Histories Dataset
Publication date 2021-08-26

Presentation formats digital map

Citation Contacts 

Responsible party - point of contact
Individual's name Kevin Summers
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position Research Ecologist

Contact information 
Phone
Voice 850-934-9244
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address summers.kevin@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Resource Details 

Dataset languages English (UNITED STATES)
Dataset character set utf8 - 8 bit UCS Transfer Format

Status completed
Spatial representation type vector

Supplemental information
In some cases, data was not available for the entire spatial (contiguous US, AK, HI, and PR) or temporal extent (2000-2019) for a natural hazard. The value -999 was used to represent “no data available”. For example, each county for 2019 for both landslide and tornado exposure contains a percent area value of -999, since no data was available for this year. This is mentioned again within a process step under the lineage section of this document.
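
Because -999 is a sentinel rather than a measurement, it needs to be masked before any statistics are computed. The following is a minimal Python sketch of that idea; the function name and the sample values are hypothetical and are not part of the dataset.

```python
NO_DATA = -999  # sentinel used in this dataset for "no data available"

def mean_percent_area(values):
    """Average percent-area values, ignoring the -999 no-data sentinel.

    Returns None when every value is no-data.
    """
    valid = [v for v in values if v != NO_DATA]
    if not valid:
        return None
    return sum(valid) / len(valid)

# Hypothetical tornado percent-area values for one county, 2017-2019;
# 2019 carries the sentinel because no tornado data was available.
tornado_pct = [0.5, 1.5, NO_DATA]
print(mean_percent_area(tornado_pct))
```

Leaving the sentinel in place would pull the average far below zero, which is not a valid percent area.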

Please note that percent of county area impacted for natural hazards was processed within separate maps using ArcGIS Pro 2.6.3. One map was created for each region (contiguous US, AK, HI, and PR). Each map, along with the spatial data it contained, was projected in an Albers projection. More specifically, the contiguous US map was projected in "USA Contiguous Albers Equal Area Conic" or ESRI:102003, AK was EPSG:3338, and HI was ESRI:102007. PR was assigned a custom projection, which was identical to the contiguous US's Albers projection, except that the central meridian was set equal to the PR state plane's central meridian value. Map projections were necessary because some ArcGIS Pro geoprocessing tools, e.g. the buffer tool, require data to be projected. Separate maps were necessary because no single projection was appropriate for all regions.

Processing environment Microsoft Windows 10 Version 10.0 (Build 18363) ; Esri ArcGIS 12.6.3.24783

Credits
Summers, J.K., L.C. Harwell, K.D. Buck, L.M. Smith, D.N. Vivian, J.E. Harvey, M.D. McLaughlin and S.F. Hafner. 2017. Development of a Climate Resilience Screening Index (CRSI). Sustainable and Healthy Communities Research Program Technical Report. EPA600/R-17/238. Office of Research & Development, Washington, DC.

Summers, J.K., L.C. Harwell, K.D. Buck, L.M. Smith, D.N. Vivian, J.E. Harvey, M.D. McLaughlin, S.F. Hafner and C.A. McMillion. 2020. Development of a Cumulative Resilience Screening Index (CRSI) for Natural Hazards. Sustainable and Healthy Communities Research Program Technical Report. EPA600/R-20/274. Office of Research & Development, Washington, DC. (Revision and update of earlier report)

ArcGIS item properties
Name NatHazHistories_Dataset
Location file://\\L2626XLOANER2\C$\Users\ALAMPER\OneDrive - Environmental Protection Agency (EPA)\Profile\Documents\Telework\NatHazHistories\GENERAL\Deliverable\Deliverable_20210825.gdb
Access protocol Local Area Network

Extents 

Extent
Description
This bounding box and temporal period extent are representative of the data set as a whole. As previously mentioned, multiple sources did not provide data for the entire spatial (contiguous US, AK, HI, and PR) or temporal extent (2000-2019). For further information on this, please refer to each hazard's source extent, which is found under the given hazard's data source within the lineage section of this document.

Geographic extent
Bounding rectangle
Extent type
Extent used for searching
West longitude -169.2
East longitude -65.1
South latitude 17.2
North latitude 71.6
Extent contains the resource Yes

Temporal extent
Beginning date 2000-01-01 00:00:00
Ending date 2019-12-31 00:00:00

Resource Points of Contact 

Point of contact - publisher
Individual's name Kevin Summers
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position Research Ecologist

Contact information 
Phone
Voice 850-934-9244
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address summers.kevin@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Point of contact - point of contact
Individual's name Andrea Lamper
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position ArcGIS Data Specialist (ORAU SSC)

Contact information 
Phone
Voice 850-934-9336
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address lamper.andrea@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Resource Maintenance 

Resource maintenance
Update frequency as needed

Maintenance contact - point of contact
Individual's name Kevin Summers
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position Research Ecologist

Contact information 
Phone
Voice 850-934-9244
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address summers.kevin@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Resource Constraints 

Constraints
Limitations of use
Please check sources, scale, accuracy, currency and other available information. Please confirm that you are using the most recent copy of both data and metadata.

Security constraints
Classification unclassified

User note
Public dataset: User defined text describing access to the dataset (limit is 255 characters)

Legal constraints
Limitations of use
EPA Public Domain License

Access constraints unrestricted license

Other constraints
https://edg.epa.gov/EPA_Data_License.html

Spatial Reference 

ArcGIS coordinate system
Type Geographic
Geographic coordinate reference GCS_WGS_1984
Coordinate reference details
GeographicCoordinateSystem
WKID 4326
XOrigin -400
YOrigin -400
XYScale 999999972.77078092
ZOrigin -100000
ZScale 10000
MOrigin -100000
MScale 10000
XYTolerance 8.9831528327088961e-09
ZTolerance 0.001
MTolerance 0.001
HighPrecision true
LeftLongitude -180
LatestWKID 4326
WKT GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433],AUTHORITY["EPSG",4326]]

Reference system identifier
Value EPSG:4326
Codespace EPSG
Version 6.14(3.0.1)


Spatial Data Properties 

Vector 
Level of topology for this dataset geometry only

Geometric objects
Feature class name NatHazHistories_Dataset
Object type composite
Object count 0
ArcGIS Feature Class Properties 
Feature class name NatHazHistories_Dataset
Feature type Simple
Geometry type Polygon
Has topology FALSE
Feature count 0
Spatial index TRUE
Linear referencing FALSE

Data Quality 

Scope of quality information 
Resource level dataset

Data quality report - Completeness commission 
Conformance test results
Test passed Yes

Product specification 
Title N/A

Lineage 

Lineage statement
A data source and process step are provided for each natural hazard within the data set. 

Each process step summarizes methods used to find percent area per hazard. Two additional process steps were included to describe methods conducted after percent area was calculated for all nine hazards. Within each data source, spatial and temporal extents for each hazard are specified. The earliest and latest dates recorded in the original dataset are reported within each hazard's temporal period extent (seen under source extent). YYYY-MM-DD format is required here, but the exact day may not be available. In this case, YYYY-01-01 or YYYY-12-31 was used, depending on whether events were reported throughout the given year. 


Process step 
Description
Hurricane data was downloaded as a shapefile and was originally titled "IBTrACS.since1980.list.v04r00.lines". This shapefile, in addition to a 2018 US Census Bureau's county boundaries shapefile, was brought into ArcGIS Pro version 2.6.3. A definition query was used to limit hurricane tracks or features to just those of interest - hurricane tracks occurring in or after the year 2000. To do so, the query was written as "year >= 2000 and (TRACK_TYPE = 'main' or TRACK_TYPE = 'PROVISIONAL') and USA_STATUS LIKE 'H%' and USA_SSHS >= 1". Once limited, a copy of this shapefile was created.

To illustrate possible area of impact, each feature within this copy was buffered. To create these buffers two new fields or columns were calculated. Four 34-knot wind radii fields were provided (one for each quadrant). The max value across these four fields was found, multiplied by 1852 (converting from nautical miles to associated map unit, meters), and reported within the first field. Then, the average of these max values per each hurricane category was found using the summary statistics tool. Each category's mean max value was applied to all associated features and recorded in the second field. The buffer was based on this second field. Within this tool, a dissolve field can be specified. Features were buffered and subsequently dissolved by year, so that one feature would represent total exposure for a given year. Please note that one warning message, 000635: Skipping feature <value> because of NULL or EMPTY geometry, was generated upon the buffer tool's completion. Features with a "Shape_Length" equal to 0 were the reason for this message.
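
The two calculated fields described above can be sketched in Python as follows. This is an illustration only, not the ArcGIS Pro workflow itself: the function names and sample radii are hypothetical, and the sketch averages every feature's value as-is because this document does not state whether zero (no data) radii were excluded before averaging.

```python
from collections import defaultdict

NM_TO_M = 1852  # metres per nautical mile

def max_radius_m(quadrant_radii_nm):
    """Largest of the four 34-knot quadrant radii, converted to metres."""
    return max(quadrant_radii_nm) * NM_TO_M

def mean_max_radius_by_category(tracks):
    """Map each hurricane category to the mean of its per-feature max radii (m)."""
    grouped = defaultdict(list)
    for category, radii_nm in tracks:
        grouped[category].append(max_radius_m(radii_nm))
    return {cat: sum(vals) / len(vals) for cat, vals in grouped.items()}

# Hypothetical track features: (category, four quadrant radii in nautical miles)
tracks = [
    (1, [100, 80, 60, 90]),
    (1, [0, 0, 0, 0]),        # zeros stand for missing radii in the source
    (2, [120, 110, 70, 100]),
]
buffer_m = mean_max_radius_by_category(tracks)
print(buffer_m)
```

Every feature of a given category then receives that category's mean value as its buffer distance, so features with missing radii are still buffered.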

To calculate percent area of county exposed to hurricane events, this buffer output was summed to the US county shapefile via the sum within tool. Within this tool, the group by field was specified as "year", so that percent area per county per year was recorded. This tool generated a feature class containing all US counties, and a join table which held percent area per year by county. Counties with no percent area for a given year were not reported within the join table. Sum within trials and associated join tables for the contiguous US, HI, and PR (regions with percent area recorded) were exported as csv files and brought into Excel for further analysis.

Rationale
"TRACK_TYPE" was specified as either "main" or "PROVISIONAL" because the remaining values, "spur" and "PROVISIONAL_spur", essentially represent alternate positions of the storm, which were noted to be short lived and not included when counting storms. For more information on "TRACK_TYPE" please refer to the IBTrACS v04 column documentation. 

The four 34-knot wind radii quadrants or tropical storm force wind fields were chosen as the buffer's starting point, instead of the 50-knot or 64-knot quadrants, because they illustrate the largest possible extent or area exposed to a hurricane. The buffer was not based on each 34-knot wind radii quadrant value, or even the max value found across these four fields for each feature, because many of these quadrant fields contained a value of zero; this value represented no data. Without an average max value per category, these features would not have been buffered. The concept of finding an average value per category to represent possible area of exposure was derived from FEMA's NRI technical document (https://www.fema.gov/sites/default/files/documents/fema_national-risk-index_technical-documentation.pdf) - see hurricane, spatial processing section for reference.

Process step 
Description
The same processing methods used for hurricane events were applied to tropical storm data. The only exception was the definition query used to limit the original shapefile mentioned previously. The query was written as "year >= 2000 and (TRACK_TYPE = 'main' or TRACK_TYPE = 'PROVISIONAL') and USA_STATUS = 'TS' and USA_SSHS = 0".

Rationale
Refer to rationale provided under the previous process step.

Process step 
Description
Tornado data was downloaded as a shapefile and was originally titled "1950-2018-torn-aspath". This shapefile, in addition to a 2018 US Census Bureau's county boundaries shapefile, was brought into ArcGIS Pro version 2.6.3. A definition query was used to limit tornado tracks or features to just those of interest - tornadoes occurring in or after the year 2000 with a magnitude greater than or equal to 0. To do so, the query was written as "yr >= 2000 and mag >= 0". Once limited, a copy of this shapefile was created.

To illustrate possible area of exposure, each feature within this copy was buffered. To create these buffers two new fields or columns were calculated. A width field, called "wid", was provided. For each feature, this width value was taken, divided by 1.094 (converting from yards to associated map unit, meters), and reported within the first field. Then, the width value in meters was divided by 2 and reported in the second field, since the width represented the entire width of the event and not just storm center to edge. The buffer was based on this second field. Within this tool, a dissolve field can be specified. Features were buffered and subsequently dissolved by year, so that one feature would represent total exposure for a given year. Please note that two warning messages, "000635: Skipping feature <value> because of NULL or EMPTY geometry" and "000636: Skipping feature <value> because a negative or very small distance resulted in no geometry", were generated upon the buffer tool's completion. Features with a "Shape_Length" or "wid" equal to 0 were the reason for these messages.
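
The buffer distance arithmetic above amounts to a unit conversion followed by halving. A minimal Python sketch, using the 1.094 factor quoted in this step (the function name is hypothetical):

```python
YARDS_PER_METER = 1.094  # conversion factor quoted in this process step

def tornado_buffer_distance_m(wid_yards):
    """Half of the reported tornado width, converted from yards to metres."""
    width_m = wid_yards / YARDS_PER_METER  # first calculated field
    return width_m / 2                     # second field: centre line to edge

# A hypothetical 100-yard-wide tornado path
print(tornado_buffer_distance_m(100))
```

A "wid" of 0 yields a buffer distance of 0, which is consistent with the 000636 warning reported for such features.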

To calculate percent area of county exposed to tornadoes, this buffer output was summed to the US county shapefile via the sum within tool. Within this tool, the group by field was specified as "yr", so that percent area per county per year was recorded. This tool generated a feature class containing all US counties, and a join table which held percent area per year by county. Counties with no percent area for a given year were not reported within the join table. Sum within trials and associated join tables for the contiguous US, HI, and PR (regions with percent area recorded) were exported as csv files and brought into Excel for further analysis.

Rationale
The "wid" field was chosen as the buffer's starting point because, other than length, it was the only extent-related information given. Events with a magnitude of less than 0 (value of -9) were excluded from further processing because their magnitude was unknown.

Process step 
Description
Landslide data was downloaded as a shapefile and was originally titled "US_Landslide_poly". This shapefile, in addition to a 2018 US Census Bureau's county boundaries shapefile, was brought into ArcGIS Pro version 2.6.3. A definition query was used to limit landslide events or features to just those of interest - landslides occurring in or after the year 2000 with a confidence greater than or equal to 3. To do so, the query was written as "Confidence >= 3 and (Date like '2000%' or Date like '2001%'...". Please note that the date field was formatted as a text field and was the only temporal field provided. Many dates were accompanied by text in various formats. Also, "like" within the query means "begins with". Once limited, a copy of this shapefile was created.

No buffer was necessary since events were presented as polygons. Although the buffer tool was not used, the dissolve tool was still needed to collapse events into a single feature per year. The dissolve was based on a field called "YEAR", which was created by extracting the first 4 characters of the "Date" field.
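
As an illustration of the "begins with" matching and the "YEAR" extraction, the following Python sketch takes the first four characters of a free-text date and keeps it only when it is a year in or after 2000. The function name and sample strings are hypothetical; the actual field values may be formatted differently.

```python
def year_from_date_text(date_text, first_year=2000):
    """Return the leading 4-digit year of a free-text date, or None.

    Only the first four characters are inspected, mirroring a
    "begins with" match on the text field.
    """
    prefix = date_text[:4]
    if len(prefix) == 4 and prefix.isdigit() and int(prefix) >= first_year:
        return int(prefix)
    return None

# Hypothetical "Date" values in mixed formats
for text in ["2005-03-14", "2011 spring, approximate", "1998-07-01", "circa 2003"]:
    print(repr(text), year_from_date_text(text))
```

Dates that do not begin with a four-digit year, such as the last sample, are not matched by this approach.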

To calculate percent area of county exposed to landslides, this dissolve output was summed to the US county shapefile via the sum within tool. Within this tool, the group by field was specified as "YEAR", so that percent area per county per year was recorded. This tool generated a feature class containing all US counties, and a join table which held percent area per year by county. Counties with no percent area for a given year were not reported within the join table. Sum within trials and associated join tables for the contiguous US and AK (regions with percent area recorded) were exported as csv files and brought into Excel for further analysis.

Rationale
Events with a confidence value of less than 3 (values of 1 or 2) were omitted because of the uncertainty associated with the event's location. The source's metadata document provides a brief description of each confidence level. The following source offers more comprehensive definitions (https://kgs.uky.edu/kgsmap/helpfiles/landslide_help.shtm) - see below. 

"1 - Possible landslide occurred in the area: The lowest confidence level reflects the uncertain nature of some media reports and the lack of expert classification and characterization from old maps. 2 - Probable landslide in the area: Although the exact location and extent of the landslide is not documented, it is probable that a landslide did occur within close proximity to the specified location. This includes geologic mapping of landslide deposits that may correspond to multiple landslides as well as individual landslides mapped with low resolution topographic data."

Process step 
Description
Wildfire data was downloaded as a shapefile and was originally titled "InteragencyFirePerimeterHistory". This shapefile, in addition to a 2018 US Census Bureau's county boundaries shapefile, was brought into ArcGIS Pro version 2.6.3. A definition query was used to limit wildfire events or features to just those of interest - wildfires occurring in or after the year 2000. To do so, the query was written as "FIRE_YEAR like '2000%' or FIRE_YEAR like '2001%'...". Please note that the date field was formatted as a text field. Also, "like" within the query means "begins with". Once limited, a copy of this shapefile was created.

No buffer was necessary since events were presented as polygons. Although the buffer tool was not used, the dissolve tool was still needed to collapse events into a single feature per year. The dissolve was based on the "FIRE_YEAR" field.

To calculate percent area of county exposed to wildfire, this dissolve output was summed to the US county shapefile via the sum within tool. Within this tool, the group by field was specified as "FIRE_YEAR", so that percent area per county per year was recorded. This tool generated a feature class containing all US counties, and a join table which held percent area per year by county. Counties with no percent area for a given year were not reported within the join table. Sum within trials and associated join tables for the contiguous US, AK, HI, and PR were exported as csv files and brought into Excel for further analysis.

Process step 
Description
Drought data was pulled from the US Drought Monitor's web database using R version 3.6.0. From a US Census Bureau county FIPS worksheet, a list of unique FIPS codes for each county in the United States was created as an R object. The FIPS codes were used to loop through the API pull-request URL so that drought severity statistics by percent area could be pulled between 1/1/2000 and 12/31/2020 for all US counties. Statistics type 1 is the measure of interest. The output data frames were stacked together to produce one large file to cleanse. Because the data came in a weekly format, the end date of the week was used to determine the corresponding year. An average of all weekly data per county, per year became the average percent area of drought metric. We chose to include only D2 drought intensity and greater (D3 and D4). 
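
The weekly-to-yearly aggregation can be sketched as below. The original work was done in R; this Python version is only an illustration. It assumes each weekly record already holds the combined percent area at D2 or worse, and the record layout, function name, and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def yearly_mean_drought(weekly_records):
    """Average weekly percent-area values per (FIPS, year).

    Each record is (fips, week_end_date, pct_area_d2_or_worse); the year
    is taken from the week's end date.
    """
    grouped = defaultdict(list)
    for fips, week_end, pct in weekly_records:
        grouped[(fips, week_end.year)].append(pct)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}

weekly = [
    ("01001", date(2011, 12, 26), 10.0),
    ("01001", date(2012, 1, 2), 20.0),   # week ends in 2012, so it counts toward 2012
    ("01001", date(2012, 1, 9), 40.0),
]
result = yearly_mean_drought(weekly)
print(result)
```

Keying on the end date means a week that straddles New Year is assigned entirely to the later year.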

Rationale
D0 and D1 (abnormally dry and moderate drought) intensity values were excluded because lower classifications seem to be more widespread and variable from year to year. We were most interested in areas exposed to persistent drought. Given this, it was most appropriate to start with D2 (severe drought) as our lowest intensity level. Please refer to the data source link for more information on drought level classification.

Process step 
Description
Coastal flood data was downloaded as a geodatabase, which contained one raster file per state. These raster files, in addition to a 2018 US Census Bureau's county boundaries shapefile, were brought into ArcGIS Pro version 2.6.3. Each raster was converted to a polygon feature class, and subsequently merged into one. A definition query was used to limit coastal flood features to just those of interest - possible coastal flood areas with a hazard value or "gridcode" greater than or equal to 1. To do so, the query was written as "gridcode >= 1".

To calculate percent area of county possibly exposed to coastal flooding, this merged feature class was summed to the US county shapefile via the sum within tool. Please note that for the contiguous US, this county shapefile was split up into smaller subsets before running sum within trials. Each sum within trial generated one feature class containing a number of US counties and associated percent area. Sum within trials for the contiguous US were stitched back together before exporting. Unlike previous hazards, the group by field was not necessary because only one year or data release was provided. Sum within trials for the contiguous US, HI, and PR were exported as csv files and brought into Excel for further analysis.

As mentioned previously, coastal flood data was not available for AK. Instead of omitting AK entirely, a subset of inland flood data was used to replicate one of the layers used to create the coastal flood composite layer. Inland flood data was clipped to AK, limited to only those areas of moderate (0.2% annual chance) to high risk (1% annual chance) of flooding and dissolved to a single feature before running the sum within tool.

Rationale
The "gridcode" field represents the number of flood hazards that may occur at that specific pixel (for raster data) or feature's location. It was not made clear why NOAA reported on locations with a "gridcode" of 0, and how these areas would differ from those outside the bounds of these raster files. For this reason, 0 values were excluded from further analysis.

For most other hazards, the processing methods mention a need for the dissolve tool. Such was not necessary here. Event-based hazards were dissolved to a single feature per year, while inland flooding was dissolved to a single feature overall since only one year or data release was available. Inland flooding, like other vector-derived hazards, was represented by millions of intricate polygon features, many of which likely overlapped several other features. Overlapping features cause percent area values per county to balloon and over-represent the hazard. Coastal flooding was derived from raster data, so unlike vector data, overlapping features were a non-issue.
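
The over-counting caused by overlapping features can be shown with a small grid-based Python sketch. It is a toy model, not the sum within tool: rectangles on an integer grid stand in for hazard polygons, and all names and values are hypothetical.

```python
def cells(rect):
    """Set of unit grid cells covered by rect = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return {(x, y) for x in range(xmin, xmax) for y in range(ymin, ymax)}

def percent_area(rects, county_cells, dissolve):
    """Percent of county cells covered, with or without dissolving overlaps."""
    if dissolve:
        # Union first, so each cell is counted once
        covered = set().union(*(cells(r) for r in rects)) & county_cells
        hazard_area = len(covered)
    else:
        # Sum each feature separately, so shared cells are counted repeatedly
        hazard_area = sum(len(cells(r) & county_cells) for r in rects)
    return 100 * hazard_area / len(county_cells)

county = cells((0, 0, 10, 10))              # 100 cells
hazards = [(0, 0, 6, 10), (4, 0, 10, 10)]   # the two features share columns 4-5

print(percent_area(hazards, county, dissolve=False))
print(percent_area(hazards, county, dissolve=True))
```

Without the dissolve, the two features sum to more than the county's own area; with it, the shared cells are counted once.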

NOAA's Coastal Flood Exposure Mapper displays the source data provided to me by the NOAA affiliate mentioned within this hazard's data source section of this document. Within the mapper, this data is called the "coastal flood hazard composite layer". An information tab is available for this layer, which gives a brief summary of the five hazards used to create the composite layer. One of these hazards was a subset of the same source data used for inland flooding and could easily be replicated. We felt it was important to represent AK if possible and this was the best available data to do so.

Process step 
Description
Inland flood data was downloaded as a geodatabase, which contained 8 feature classes and 2 tables. The "S_Fld_Haz_Ar" feature class, in addition to a 2018 US Census Bureau's county boundaries shapefile, was brought into ArcGIS Pro version 2.6.3. A definition query was used to limit inland flood features to just those of interest - all designated flood zones, except for those defined as "Area not included", "D", or "Open Water". To do so, the query was written as "FLD_ZONE <> 'AREA NOT INCLUDED' Or FLD_ZONE <> 'D' Or FLD_ZONE <> 'OPEN WATER'". Once limited, a copy of this feature class was created.
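
The stated intent of the filter, keeping every flood zone except the three excluded types, can be sketched in Python as a membership test. Note that, read literally, a condition joining three inequality tests with Or is satisfied by every non-null value, so the sketch below expresses the intent rather than transcribing the quoted query. Function names and sample values are hypothetical.

```python
EXCLUDED_ZONES = {"AREA NOT INCLUDED", "D", "OPEN WATER"}

def keep_flood_zone(fld_zone):
    """True when a FLD_ZONE value should be retained for analysis."""
    return fld_zone not in EXCLUDED_ZONES

def keep_literal_or(fld_zone):
    """Literal reading of three inequality tests joined with Or."""
    return (fld_zone != "AREA NOT INCLUDED"
            or fld_zone != "D"
            or fld_zone != "OPEN WATER")

zones = ["A", "AE", "D", "OPEN WATER", "X"]
kept = [z for z in zones if keep_flood_zone(z)]
print(kept)
print([z for z in zones if keep_literal_or(z)])  # nothing is filtered out
```

In SQL terms, the membership test corresponds to joining the inequality tests with And, or to a NOT IN list.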

Before calculating percent area, the check geometry and repair geometry tools were executed. The newly created feature class was then clipped to each region and dissolved into a single feature.

To calculate percent area of county possibly exposed to inland flooding, each region's clipped and dissolved output was summed to the region's county shapefile via the sum within tool. Please note that for the contiguous US, this county shapefile was split up into smaller subsets before running sum within trials. Each sum within trial generated one feature class containing a number of US counties and associated percent area. Sum within trials for the contiguous US were stitched back together before exporting. Unlike previous hazards, the group by field was not necessary because only one year or data release was provided. Sum within trials for the contiguous US, AK, HI, and PR were exported as csv files and brought into Excel for further analysis.

Rationale
From the 8 feature classes provided, the "S_Fld_Haz_Ar" feature class was selected for further analysis because it was the only polygon data with information pertaining to flood zone type. These zone types give context to how often an area might experience some degree of inland flooding. This information was necessary for the definition query mentioned above. The flood zone types, "Area not included", "D", or "Open Water", were excluded because no likelihood of exposure to inland flooding is associated with these values. "D" is said to represent an area of undetermined flood risk. 

The check and repair geometry tools were necessary because invalid topology error messages were thrown when clipping to each region. The main issue was self-intersection of features. After executing these, the aforementioned feature class was clipped and dissolved to each region to reduce processing time when using the sum within tool. 

Process step 
Description
Earthquake data was downloaded in shapefile format. Two shapefiles, one for the central and another for the eastern US, were provided for each one-year short-term induced seismicity model. Such models were only available for 2016, 2017, and 2018. These shapefiles, in addition to a 2018 US Census Bureau's county boundaries shapefile, were brought into ArcGIS Pro version 2.6.3. Shapefiles for each year were merged. A definition query was then used to limit features to just those of interest - a chance of damage from an earthquake greater than or equal to 1%. To do so, the query was written as "ValueRange <> '<1'". Once limited, a copy of each shapefile was created.

No buffer was necessary since features were presented as polygons. These polygons represented probability and not events, so a dissolve was not necessary either. 

To calculate percent area of county possibly exposed to earthquakes, each merged feature class was summed to the US county shapefile via the sum within tool. Each sum within trial generated one feature class containing all contiguous US counties and associated percent area. All sum within trials for the contiguous US were stitched back together before exporting. Unlike previous hazards, the group by field was not necessary because one sum within trial was run for each year. The merged sum within trial for just the contiguous US was exported as a csv file and brought into Excel for further analysis.

Rationale
Sum within trials for AK, HI, and PR were not conducted because no data was available for these regions. The data provided was probabilistic, and therefore could not be replicated with any degree of confidence. Long-term seismicity data was available for a number of years for these regions but was deemed not useful since no spatial data was offered.

Process step 
Description
Processing outside of ArcGIS Pro:

Exports for sum within trials and associated join tables, for each region and each hazard, were joined in Excel version 2102 via the "Join_ID" column. Once all regions were joined for each hazard, all counties with a value for percent area were stacked together.

Each hazard matrix was added to a new workbook where zero filling could be completed. Within this, counties which were not included within associated join tables were added back in and assigned a value of 0. This was done on a per hazard basis, for each county and for each year within the hazard's spatial and temporal extent. For example, every county within each year from 2000-2018 was represented, regardless of its percent area value, for tornadoes. The year 2019 was not included for tornadoes because no data was provided for that year.

Once the previous step was executed for each hazard, one sheet containing percent area for all counties for each year from 2000-2019 was created within a separate workbook. Each hazard was joined to this final matrix via a concatenated column containing "GEOID" and "YEAR". N/A values were replaced with -999, since N/A represented no data provided for a county within a given year.
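
The zero filling and -999 replacement were done in Excel; the logic can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual procedure: the join key format (GEOID and YEAR separated by an underscore), the function names, and the sample percent area value are hypothetical.

```python
NO_DATA = -999  # sentinel used in this dataset for "no data provided"


def join_key(geoid, year):
    # Assumed key format; the source only states that GEOID and YEAR were concatenated.
    return f"{geoid}_{year}"


def fill_hazard(geoids, years, covered_geoids, covered_years, reported):
    """Return {join_key: percent area} for every county-year.

    `reported` holds only county-years with some exposure. County-years inside
    the hazard's spatial and temporal extent but absent from `reported` get 0;
    county-years outside that extent get NO_DATA.
    """
    column = {}
    for geoid in geoids:
        for year in years:
            key = join_key(geoid, year)
            if geoid in covered_geoids and year in covered_years:
                column[key] = reported.get(key, 0.0)
            else:
                column[key] = NO_DATA
    return column


# Hypothetical two-county example; tornado data ends in 2018 per the source.
geoids = ["01001", "01003"]
years = range(2000, 2020)
tornado = fill_hazard(
    geoids,
    years,
    covered_geoids=set(geoids),
    covered_years=set(range(2000, 2019)),
    reported={join_key("01001", 2011): 0.42},  # hypothetical value
)
print(tornado["01001_2011"], tornado["01003_2011"], tornado["01001_2019"])
```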

Rationale
Zero filling was deemed appropriate because it represents counties with no exposure, which are arguably just as important as counties with some degree of exposure. The best available data sources were used for each hazard, but each source has its own limitations and may misreport events. Given this, zero exposure in terms of percent area does not mean that a county has zero risk or likelihood of exposure. In the context of this dataset, zero percent area simply means no occurrence of a defined hazard event within that county for that year.

N/A values were replaced with -999 so that counties with no data could be easily identified.
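
Because -999 is a numeric sentinel, it should be masked before any statistics are computed on the hazard columns. A minimal Python sketch of this, using a hypothetical county series and function name:

```python
NO_DATA = -999  # sentinel for "no data provided"


def mean_exposure(values):
    """Average percent area, skipping the -999 no-data sentinel.

    Returns None when no valid values remain.
    """
    valid = [v for v in values if v != NO_DATA]
    return sum(valid) / len(valid) if valid else None


# Hypothetical series: two years of zero exposure, one year with exposure,
# and one year with no data.
series = [0.0, 0.0, 3.0, NO_DATA]

print(mean_exposure(series))      # sentinel excluded
print(sum(series) / len(series))  # sentinel wrongly included
```

Zeros are kept in the average because they represent years with no recorded exposure, whereas -999 entries carry no information and are dropped.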

Process step 
Description
Importing final matrix back into ArcGIS Pro:

The final data matrix was then joined to a copy of the US Census Bureau's 2018 county boundary shapefile in ArcGIS Pro version 2.6.3 via the "GEOID" field. Please note that this copy contained the three main islands of the US Virgin Islands; these were removed after the join was completed. A total of 64,400 records remained (3,220 counties multiplied by 20 years). Additionally, percent area values greater than 100% were capped at 100%. Such values exceeded 100% by less than 1%.
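
The record count and the capping rule can be checked with a few lines of Python. This is a sketch only; the sample values and the function name are hypothetical, and the -999 no-data sentinel is assumed to pass through unchanged.

```python
def cap_percent(value, limit=100.0, no_data=-999):
    """Cap a percent area value at `limit`, leaving the no-data sentinel untouched."""
    if value == no_data:
        return value
    return min(value, limit)


# Expected record count: one row per county per year.
counties, years = 3220, 20
record_count = counties * years
print(record_count)

# Hypothetical values, including one slightly above 100%.
values = [0.0, 57.3, 100.4, -999]
capped = [cap_percent(v) for v in values]
print(capped)
```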

Rationale
This join was necessary to create a spatial data set. Despite its current spatial context, this feature class can be easily exported as a table for use in other applications. 

Values exceeding 100% by less than 1% were deemed an artifact of the sum within tool in ArcGIS Pro. Because of this, capping at 100% seemed appropriate.

Source data 
Description
Hurricane and Tropical Storm Data Source: International Best Track Archive for Climate Stewardship (IBTrACS) 
Data Source Link: https://www.ncdc.noaa.gov/ibtracs/index.php?name=ib-v4-access 
*Data updated weekly*

Downloaded "IBTrACS.since1980.list.v04r00.lines.zip" on 03/18/2021. This zip file contains a polyline shapefile, which houses all recorded hurricane tracks (both preliminary and final tracks) worldwide from 1980 to the present. A technical document and column documentation pdf relating to this "version 4" data download are available as well. To view these, please visit the data source link listed above.

Source medium name online link

Extent of the source data
Description
Spatial extent DOES cover entire area of interest (contiguous US, AK, HI, and PR).
Temporal extent DOES cover entire timeframe of interest (2000-2019).

Temporal extent
Beginning date 1980-01-01 00:00:00
Ending date 2021-03-16 00:00:00

Source data 
Description
Tornado Data Source: NOAA's Storm Prediction Center (SPC)
Data Source Link: https://www.spc.noaa.gov/gis/svrgis/
*Data updated at unknown frequency. At time of download, data was last updated 10/01/2019*

Downloaded "1950-2018-torn-aspath.zip" on 09/02/2020. This zip file contains a polyline shapefile, which houses all recorded tornado paths within the US (including US territories) from 1950 to 2018. A database description document and changelog are available as well. To view these, please visit the data source link listed above. 

Source medium name online link

Extent of the source data
Description
Spatial extent DOES cover entire area of interest (contiguous US, AK, HI, and PR).
Temporal extent DOES NOT cover entire timeframe of interest (2000-2019) - did not include 2019 at time of download.

Temporal extent
Beginning date 1950-01-03 00:00:00
Ending date 2018-12-31 00:00:00

Source data 
Description
Landslide Data Source: USGS's Landslide Inventories across the United States
Data Source Link: https://www.sciencebase.gov/catalog/item/5c7065b4e4b0fe48cb43fbd7
*Data updated annually. At time of download, data was last updated 03/21/2019 (even though publication date is listed as 09/19/2019)*

Downloaded "US_Landslide_1.zip" on 09/30/2020. This zip file contains a polygon shapefile, which houses all recorded landslide incidents within the US (including US territories) from 1900 to 2019. A metadata document provided as an xml is available as well. To view this document, please visit the data source link listed above.

Source medium name online link

Extent of the source data
Description
Spatial extent DOES cover entire area of interest (contiguous US, AK, HI, and PR).
Temporal extent DOES NOT cover entire timeframe of interest (2000-2019) - did not include 2019 at time of download. 

Please note that the "date" field (the only temporal field provided) is a text field, which contains numerical dates with adjoining text in a variety of formats. No cleaning of this field was conducted. Unlike other hazards, this hazard's begin and end date was pulled from the source's metadata. Also, begin date was specified as 1900-01-01, but this entry would not save within this metadata template - looks to be too early of a date.

Temporal extent
Ending date 2019-01-01 00:00:00

Source data 
Description
Wildfire Data Source: National Interagency Fire Center, Interagency Fire Perimeter History - All Years
Data Source Link: https://data-nifc.opendata.arcgis.com/datasets/interagency-fire-perimeter-history-all-years/explore?location=43.578757%2C63.134208%2C3.57
*Data updated at unknown frequency. At time of download, data was last updated 06/12/2020*

Downloaded "Interagency_Fire_Perimeter_History_-_All_Years.zip" on 09/08/2020. This zip file contains a polygon shapefile, which houses all recorded wildfire incidents within the US (including US territories) from 1835 to 2019. An attributes pdf document is available as well. To view this document, please visit the data source link listed above.

Source medium name online link

Extent of the source data
Description
Spatial extent DOES cover entire area of interest (contiguous US, AK, HI, and PR).
Temporal extent DOES cover entire timeframe of interest (2000-2019).

Please note begin date was specified as 1835-01-01, but this entry would not save within this metadata template - looks to be too early of a date.

Temporal extent
Ending date 2019-05-22 00:00:00

Source data 
Description
Drought Data Source: U.S. Drought Monitor
Data Source Link: https://droughtmonitor.unl.edu/DmData/DataDownload.aspx
*Data updated weekly*

Drought severity statistics for percent area per county for D2 drought severity and greater from 1/1/2000-12/31/2020 were pulled via API and brought into R for further processing. A metadata tab is available at the data source listed above.

Source medium name online link

Extent of the source data
Description
Spatial extent DOES cover entire area of interest (contiguous US, AK*, HI, and PR)
Temporal extent DOES cover entire timeframe of interest (2000-2019).

*Please note that drought data is collected for MOST of AK. No data was reported for 12 of the 29 Alaskan GEOIDs, these being: 02050, 02105, 02158, 02170, 02180, 02188, 02195, 02198, 02230, 02240, 02275 and 02290.

Temporal extent
Beginning date 2000-01-01 00:00:00
Ending date 2020-12-31 00:00:00

Source data 
Description
Coastal Flooding Data Source: Coastal Flood Exposure Mapper, Coastal Flood Hazard Composite Layer
Data Source Link: https://coast.noaa.gov/floodexposure/#-10575352,4439107,5z
*Data updated at unknown frequency*

Downloaded "OCM_CoastalFloodHazardComposite_FY19" from a Google Drive shared with a NOAA affiliate on 03/23/2021. This zip file contains a geodatabase. Within this, raster files for coastal flooding are provided by state. Please note data for AK is not included, as the contact stated that no data currently exists. Also, this data is not event based - it only illustrates the most up to date representation of possible exposure to coastal flooding. The contact mentioned that, although a variety of data sources were used to create this composite layer, each with a different production date, the composite layer was created at the end of the 2019 calendar year. No metadata is available for this source.

Source medium name online link

Extent of the source data
Description
Spatial extent DOES NOT cover entire area of interest (contiguous US, AK, HI, and PR) - did not include AK (per NOAA affiliate, this does not yet exist).
Temporal extent DOES NOT cover entire timeframe of interest (2000-2019) - only included 2019 data release at time of download.

Temporal extent
Beginning date 2019-01-01 00:00:00
Ending date 2019-01-01 00:00:00

Source data 
Description
Inland Flooding Data Source: National Flood Hazard Layer (NFHL), Seamless Nationwide NFHL GIS data
Data Source Link: https://catalog.data.gov/dataset/national-flood-hazard-layer-nfhl
*Data updated monthly*

Downloaded "NFHL_Key_Layers.gdb" from the data source link on 02/09/2021. This zip file contains a geodatabase. Within this, various polygon shapefiles, which represent different aspects of inland flooding, are provided. Please note, this data is not event based - it only illustrates the most up to date representation of possible exposure to inland flooding. Two metadata documents and a database technical reference are available as well. To view these, please visit the data source link listed above. Within these metadata documents, no "last update" date pertaining to the data itself is specified. A process step within these documents mentions that the resource is updated on a monthly basis. Since these documents were last updated 11/2020, we felt it was most appropriate to state that the data was representative of 2019. We did not feel confident that inland flood data or FIRM panels for 2020 were included within this set of data.

Source medium name online link

Extent of the source data
Description
Spatial extent DOES cover entire area of interest (contiguous US, AK, HI, and PR)
Temporal extent DOES NOT cover entire timeframe of interest (2000-2019) - only included 2019 data release at time of download.

Temporal extent
Beginning date 2019-01-01 00:00:00
Ending date 2019-01-01 00:00:00

Source data 
Description
Earthquakes Data Source: Short-term Induced Seismicity Models
Data Source Link: https://www.usgs.gov/natural-hazards/earthquake-hazards/science/short-term-induced-seismicity-models?qt-science_center_objects=0#qt-science_center_objects
*Model updated at unknown frequency. At time of download, most recent one-year seismic hazard forecast was released in 2018*

Downloaded two zip files, one for each half of the contiguous US, for each year (2016, 2017, and 2018) from the data source link on 12/31/2020. Each zip file contains a single shapefile, which illustrates percent chance of damage from an earthquake. Please note, this data is not event based - each set of shapefiles represents a one-year probabilistic seismic hazard forecast for the central and eastern US from induced and natural earthquakes.

2018 downloads: Please note that data illustrating minor-damage, instead of moderate damage, was downloaded because the source referenced this data when stating "For consistency, the updated 2018 forecast is developed using the same probabilistic seismicity-based methodology as applied in the two previous forecasts."
1. "Chance of potentially minor-damage ground shaking in 2018 based on the average of horizontal spectral response acceleration for 1.0-second period and peak ground acceleration for the Central and Eastern United States"
2. "Chance of potentially minor-damage ground shaking in 2018 based on the average of horizontal spectral response acceleration for 1.0-second period and peak ground acceleration for the Western United States"

2017 downloads:
1. "Chance of damage from an earthquake in 2017 based on the average of horizontal spectral response acceleration for 1.0-second period and peak ground acceleration for the Central and Eastern United States"
2. "Chance of damage from an earthquake in 2017 based on the average of horizontal spectral response acceleration for 1.0-second period and peak ground acceleration for the Western United States"

2016 downloads: 
1. "Chance of damage from an earthquake in 2016 based on the average of horizontal spectral response acceleration for 1.0-second period and peak ground acceleration for the Central and Eastern United States"
2. "Chance of damage from an earthquake in 2016 based on the average of horizontal spectral response acceleration for 1.0-second period and peak ground acceleration for the Western United States"

Source medium name online link

Extent of the source data
Description
Spatial extent DOES NOT cover entire area of interest (contiguous US, AK, HI, and PR) - did not include AK, HI, or PR at time of download.
Temporal extent DOES NOT cover entire timeframe of interest (2000-2019) - only included models for 2016, 2017, and 2018 at time of download.

Temporal extent
Beginning date 2016-01-01 00:00:00
Ending date 2018-01-01 00:00:00

Distribution 

Distribution format
Name File Geodatabase Feature Class

Fields 

Details for object NatHazHistories_Dataset 
Type File Geodatabase Feature Class
Row count 1
Definition
Natural Hazard Histories Dataset

Definition source
Point of Contact

Field OBJECTID 
Data type Object ID
Width 5
Alias OBJECTID
Precision 0
Scale 0

Field description
Sequential unique whole numbers that are automatically generated

Description source
Esri

Range of values
Minimum value 1
Maximum value 64400

Description of values
Sequential unique whole numbers that are automatically generated

Field Shape 
Data type Geometry
Width 7
Alias Shape
Precision 0
Scale 0

Field description
Coordinates defining the features

Description source
Esri

List of values
Value Polygon
Description Polygon Shape
Enumerated domain value definition source ArcGIS Pro

Description of values
Coordinates defining the features

Field STATEFP 
Data type Text
Width 2
Alias STATEFP
Precision 0
Scale 0

Field description
Current state Federal Information Processing Series (FIPS) code

Description source
US Census Bureau

Coded values
Name of codelist cb_2018_us_county_500k.zip
Source https://www.census.gov/programs-surveys/geography/technical-documentation/naming-convention/cartographic-boundary-file.html

Field COUNTYFP 
Data type Text
Width 3
Alias COUNTYFP
Precision 0
Scale 0

Field description
Current county Federal Information Processing Series (FIPS) code

Description source
US Census Bureau

Coded values
Name of codelist cb_2018_us_county_500k.zip
Source https://www.census.gov/programs-surveys/geography/technical-documentation/naming-convention/cartographic-boundary-file.html

Field COUNTYNS 
Data type Text
Width 8
Alias COUNTYNS
Precision 0
Scale 0

Field description
Current county Geographic Names Information System (GNIS) code

Description source
US Census Bureau

Coded values
Name of codelist cb_2018_us_county_500k.zip
Source https://www.census.gov/programs-surveys/geography/technical-documentation/naming-convention/cartographic-boundary-file.html

Field AFFGEOID 
Data type Text
Width 14
Alias AFFGEOID
Precision 0
Scale 0

Field description
American FactFinder summary level code + geovariant code + '00US' + GEOID

Description source
US Census Bureau

Coded values
Name of codelist cb_2018_us_county_500k.zip
Source https://www.census.gov/programs-surveys/geography/technical-documentation/naming-convention/cartographic-boundary-file.html

Field GEOID 
Data type Text
Width 5
Alias GEOID
Precision 0
Scale 0

Field description
County identifier; a concatenation of current state Federal Information Processing Series (FIPS) code and county FIPS code

Description source
US Census Bureau

Coded values
Name of codelist cb_2018_us_county_500k.zip
Source https://www.census.gov/programs-surveys/geography/technical-documentation/naming-convention/cartographic-boundary-file.html
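
The identifier fields above are built by concatenation, which can be illustrated with a short Python sketch. The county summary level code "050" and geovariant code "00" are assumptions drawn from US Census Bureau conventions and are not stated in this document; the function names and sample codes are hypothetical.

```python
def make_geoid(statefp, countyfp):
    """Concatenate the 2-digit state FIPS code and 3-digit county FIPS code."""
    return statefp.zfill(2) + countyfp.zfill(3)


def make_affgeoid(geoid, summary_level="050", geovariant="00"):
    """Summary level code + geovariant code + '00US' + GEOID.

    The default summary level and geovariant values are assumed for counties.
    """
    return f"{summary_level}{geovariant}00US{geoid}"


geoid = make_geoid("01", "001")  # illustrative codes
affgeoid = make_affgeoid(geoid)
print(geoid, affgeoid, len(affgeoid))
```

Under these assumptions the resulting AFFGEOID is 14 characters long, which agrees with the field width listed for AFFGEOID.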

Field NAME 
Data type Text
Width 100
Alias NAME
Precision 0
Scale 0

Field description
Current county name

Description source
US Census Bureau

Coded values
Name of codelist cb_2018_us_county_500k.zip
Source https://www.census.gov/programs-surveys/geography/technical-documentation/naming-convention/cartographic-boundary-file.html

Field YEAR 
Data type Double
Width 4
Alias YEAR
Precision 0
Scale 0

Field description
Year of percent area exposure

Description source
Point of Contact

Range of values
Minimum value 2000
Maximum value 2019

Field UNIT_S_ 
Alias UNIT(S)
Data type Text
Width 255
Precision 0
Scale 0

Field description
Unit of exposure within hazard columns

Description source
Point of Contact

List of values
Value PERCENT_AREA
Description Percent Area
Enumerated domain value definition source Point of Contact

Field HURRICANE 
Data type Double
Width 4
Alias HURRICANE
Precision 0
Scale 0

Field description
Percent area of county exposed to hurricane(s) (category 1 or greater) within a given year

Description source
Point of Contact

Range of values
Minimum value 0
Maximum value 100

Field TROPICAL_STORM 
Alias TROPICAL STORM
Data type Double
Width 4
Precision 0
Scale 0

Field description
Percent area of county exposed to tropical storm(s) (category 0) within a given year

Description source
Point of Contact

Range of values
Minimum value 0
Maximum value 100

Field TORNADO 
Data type Double
Width 4
Alias TORNADO
Precision 0
Scale 0

Field description
Percent area of county exposed to tornado(s) (magnitude 0 or greater) within a given year

Description source
Point of Contact

Range of values
Minimum value -999
Maximum value 9.08

Field LANDSLIDE 
Data type Double
Width 4
Alias LANDSLIDE
Precision 0
Scale 0

Field description
Percent area of county exposed to landslide(s) (confidence level 3 or greater) within a given year

Description source
Point of Contact

Range of values
Minimum value -999
Maximum value 7.11

Field WILDFIRE 
Data type Double
Width 4
Alias WILDFIRE
Precision 0
Scale 0

Field description
Percent area of county exposed to wildfire(s) within a given year

Description source
Point of Contact

Range of values
Minimum value 0
Maximum value 64.05

Field DROUGHT 
Data type Double
Width 4
Alias DROUGHT
Precision 0
Scale 0

Field description
Percent area of county exposed to drought (drought severity 2 or greater) within a given year

Description source
Point of Contact

Range of values
Minimum value -999
Maximum value 100

Field COASTAL_FLOODING 
Alias COASTAL FLOODING
Data type Double
Width 4
Precision 0
Scale 0

Field description
Percent area of county with possible exposure to coastal flooding (hazard number 1 or greater) within a given year

Description source
Point of Contact

Range of values
Minimum value -999
Maximum value 100

Field INLAND_FLOODING 
Alias INLAND FLOODING
Data type Double
Width 4
Precision 0
Scale 0

Field description
Percent area of county with possible exposure to inland flooding (flood zones equal to those starting with A, V, or X) within a given year

Description source
Point of Contact

Range of values
Minimum value -999
Maximum value 100

Field EARTHQUAKE 
Data type Double
Width 4
Alias EARTHQUAKE
Precision 0
Scale 0

Field description
Percent area of county with greater than or equal to 1% chance of damage from an earthquake within a given year

Description source
Point of Contact

Range of values
Minimum value -999
Maximum value 100

Field COUNTY_AREA_SQKM 
Alias COUNTY AREA (SQKM)
Data type Double
Width 7
Precision 0
Scale 0

Field description
"Shape_Area" generated by ArcGIS Pro divided by 1,000,000 (square meters to square kilometers)

Description source
Point of Contact

Description of values
"Shape_Area" generated by ArcGIS Pro divided by 1,000,000 (square meters to square kilometers)

Field Shape_Length 
Alias Shape Length
Data type Double
Width 8
Precision 0
Scale 0

Field description
Length of feature in internal units

Description source
Esri

Description of values
Positive real numbers that are automatically generated

Field Shape_Area 
Alias Shape Area
Data type Double
Width 8
Precision 0
Scale 0

Field description
Area of feature in internal units squared

Description source
Esri

Description of values
Positive real numbers that are automatically generated

Metadata Details 

Metadata language English (UNITED STATES)
Metadata character set utf8 - 8 bit UCS Transfer Format

Metadata identifier 3D3CDEDB-090C-45CC-BF95-BE40949183F1

Scope of the data described by the metadata dataset
Scope name

Last update 2021-08-26 

ArcGIS metadata properties
Metadata format ArcGIS 1.0

Created in ArcGIS for the item 2021-08-05 12:27:37
Last modified in ArcGIS for the item 2021-08-26 14:23:29

Automatic updates
Have been performed Yes
Last update 2021-08-25 15:54:59

Metadata Contacts 

Metadata contact - point of contact
Individual's name Kevin Summers
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position Research Ecologist

Contact information 
Phone
Voice 850-934-9244
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address summers.kevin@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Metadata contact - point of contact
Individual's name Andrea Lamper
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position ArcGIS Data Specialist (ORAU SSC)

Contact information 
Phone
Voice 850-934-9336
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address lamper.andrea@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Metadata Maintenance 

Maintenance
Update frequency as needed

Maintenance contact - point of contact
Individual's name Kevin Summers
Organization's name USEPA/ORD/CEMM/GEMMD/EAB
Contact's position Research Ecologist

Contact information 
Phone
Voice 850-934-9244
Address
Type both
Delivery point 1 Sabine Island Drive
City Gulf Breeze
Administrative area FL
Postal code 32561-5299
Country US
e-mail address summers.kevin@epa.gov
Online resource
Online location (URL) https://www.epa.gov/aboutepa/about-sustainable-and-healthy-communities-research-program
Connection protocol WWW:LINK

Thumbnail and Enclosures 

Thumbnail
Thumbnail type
Image file