##### 04/25
Data include predicted observed-to-expected (O/E) scores of taxonomic composition modeled from the 2007 National Lakes Assessment (NLA) (https://www.epa.gov/national-aquatic-resource-surveys/nla). Predictions were made for all lakes in the medium-resolution National Hydrography Dataset Plus version 2.1 (NHDPlusV21, waterbodies layer) that fall within the NLA sample frame. Predictions were made with a random forest regression model that used NLA 2007 plankton O/E scores as the response variable and LakeCat (https://www.epa.gov/national-aquatic-resource-surveys/lakecat-dataset) watershed metrics as predictor variables. Applying the model to NHDPlusV21 lakes that are within the NLA sample frame and have LakeCat data produced predictions for 268,700 lakes. The data file contains more rows than MS Excel can display, and data may be lost if the file is opened with that software. See Hill et al. 2023 for model details: https://doi.org/10.1073/pnas.2120259119
####
COMID – Unique identifier of each lake that can be tied back to the waterbodies layer of the NHDPlusV21 and to LakeCat.
LakeAreaM2 – Area of lake in square meters.
FTYPE – NHDPlus waterbody type. These have been filtered down to LakePond and Reservoir to match the NLA sampling frame for 2007.
STATE – Name of the state where the lake is located.
Prd_OE – Predicted O/E score based on 2007 NLA plankton O/E scores (Hill et al. 2023; https://doi.org/10.1073/pnas.2120259119). Prd_OE ranges from 0.26 to 1.29.
NARS_ECO3 – Aggregated ecoregions used by the National Aquatic Resource Surveys: Western Mountains (WMTNS), Plains and Lowlands (PLNLOW), or Eastern Highlands (EHIGH).
Longitude – Longitude of lake in decimal degrees (coordinate reference system EPSG:4269, NAD83).
Latitude – Latitude of lake in decimal degrees (coordinate reference system EPSG:4269, NAD83).
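
Because the file is too large to open safely in MS Excel, one option is to read it row by row in a script. The sketch below is illustrative only and is not part of the dataset: it assumes the data file is a comma-separated text file with a header row matching the column names listed above, and the function names and the rounding tolerance on the Prd_OE range are hypothetical choices. It checks each row against the data dictionary and computes the mean Prd_OE per NARS_ECO3 ecoregion.

```python
import csv
from collections import defaultdict

# Column names and allowed values come from the data dictionary above.
EXPECTED_COLUMNS = ["COMID", "LakeAreaM2", "FTYPE", "STATE", "Prd_OE",
                    "NARS_ECO3", "Longitude", "Latitude"]
VALID_FTYPES = {"LakePond", "Reservoir"}
VALID_ECOREGIONS = {"WMTNS", "PLNLOW", "EHIGH"}
OE_MIN, OE_MAX = 0.26, 1.29   # documented Prd_OE range
OE_TOLERANCE = 0.005          # assumption: the documented range is rounded


def summarize_predictions(rows):
    """Validate rows (dicts keyed by column name) against the data dictionary.

    Returns (row count, mean Prd_OE per ecoregion, list of problems), where
    each problem is a (COMID, message) tuple.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    problems = []
    n_rows = 0
    for row in rows:
        n_rows += 1
        oe = float(row["Prd_OE"])
        eco = row["NARS_ECO3"]
        if row["FTYPE"] not in VALID_FTYPES:
            problems.append((row["COMID"], "unexpected FTYPE"))
        if eco not in VALID_ECOREGIONS:
            problems.append((row["COMID"], "unexpected NARS_ECO3"))
        if not (OE_MIN - OE_TOLERANCE <= oe <= OE_MAX + OE_TOLERANCE):
            problems.append((row["COMID"], "Prd_OE outside documented range"))
        totals[eco] += oe
        counts[eco] += 1
    means = {eco: totals[eco] / counts[eco] for eco in counts}
    return n_rows, means, problems


def summarize_file(path):
    """Stream a CSV file from disk so the whole table is never held in memory."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return summarize_predictions(reader)
```

Calling `summarize_file` with the path to the downloaded file should report 268,700 rows if the file is complete; a lower count would suggest the file was truncated, for example by having been saved from a spreadsheet program.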