『Abstract
The occurrence of elevated levels of arsenic and nitrate in aquifers
impacted by agricultural activities is common and can result in
adverse health effects in rural areas. Numerous wells located
in the Ogallala aquifer in the southern High Plains of Texas have
exceeded the maximum contaminant level (MCL) for both arsenic and nitrate. To
model the simultaneous exceedance of both chemicals, two types
of Logistic Regression (LR) models were developed by (a) treating
arsenic and nitrate independently and combining the marginal probabilities
of their exceedance, and (b) treating the two exceedances together
by using a multinomial model. Influencing variables that represent
both soil and aquifer properties, and for which data were readily
available, were identified. The predictive capacities of the two
models were evaluated using Receiver Operating Characteristics
(ROCs), and spatial trends in predictions were studied. The LR
model constructed from the marginal probabilities had lower overall
accuracy (59% correct classifications) and was extremely conservative,
over-predicting exceedances. In contrast, the multinomial model
showed good overall accuracy (79% correct classifications), made
correct predictions 90% of the time when both arsenic and
nitrate MCL exceedances were observed, and was a good fit for
wells located in agricultural areas. The results of the multinomial
model also confirm previous studies that attributed shallow subsurface
arsenic to anthropogenic activities. Based on the insights provided
by the model, it is recommended that, in agricultural areas,
the occurrence of arsenic and nitrate be evaluated
together.
Keywords: Arsenic; Nitrate; Logistic regression; Ogallala aquifer;
Receiver operating characteristics; Land use』
1. Introduction
2. Methodology
2.1. Conceptual model
2.2. Data
2.3. Simultaneous exceedance assuming independence among arsenic
and nitrate sources
2.4. Multinomial logistic regression for simultaneous exceedance
2.5. Selection of influencing variables
2.6. Metrics for model evaluation
3. Results and discussion
3.1. Ordinary logistic regression models for arsenic and
nitrate
3.1.1. Arsenic
3.1.2. Nitrate
3.2. Multinomial LR for simultaneous exceedance
3.3. Comparison of performance of the multinomial and independent
LR models
4. Summary and conclusions
Acknowledgments
Appendix A. Supplementary material
References