Preferred Name
Jacob Peters
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Date of Graduation
5-7-2020
Semester of Graduation
Spring
Degree Name
Master of Science (MS)
Department
Department of Biology
Second Advisor
Bruce A. Wiggins
Third Advisor
Patrice M. Ludwig
Abstract
American ginseng (Panax quinquefolius) is a well-known and sought-after medicinal plant native to North America that faces an increasing threat of extinction due to overharvesting, herbivory, and habitat loss. Species distribution and habitat suitability models may be valuable to landowners interested in sustainable harvest or to institutions interested in the conservation and restoration of the species. When sampling effort is unequal across a region of interest, some locations with appropriate habitat are likely to be misrepresented in model predictions. This study refined a state-derived species distribution model for ginseng through increased sampling effort across the Cumberland Plateau of Virginia and experimental manipulation of model parameters using the machine learning method Random Forest. Over many iterations, sixteen final models were constructed that varied in spatial partitioning, removal of correlated variables, and the spatial extent used for background point generation, in an effort to reduce overfitting and increase accuracy. Models were evaluated using partial dependence plots, area under the curve (AUC), and out-of-bag (OOB) error. Comparing those models, this study determined that different methods may be appropriate depending on the goal of the project, and that they yield more accurate and realistic species distribution and habitat suitability models than were previously available. This study concludes that, although model parameters can be altered to increase accuracy or reduce overfitting, the most effective means of reducing the impact of sampling deficiency is to balance sampling effort across the region of study.
Included in
Biostatistics Commons, Botany Commons, Forest Biology Commons, Forest Management Commons, Population Biology Commons, Statistical Models Commons