Supplement 1. R code and the data set necessary to conduct the Random Forest analysis. Alexander Y. Karatayev, Lyubov E. Burlakova, Sergey E. Mastitsky, and Dianna K. Padilla. 2016.
<h2>File List</h2><div>
<p><a href="dreissena_in_lakes_of_belarus.csv">dreissena_in_lakes_of_belarus.csv</a> (MD5: 3dc2d2f89af3064223358983c785771d)</p>
<p><a href="r_script_random_forest.R">r_script_random_forest.R</a> (MD5: af1295890d60bc832955e940889e4575)</p>
</div><h2>Description</h2><div>
<p>This Supplement contains the two files needed to fully reproduce the results obtained with the Random Forest classifier. The first file, dreissena_in_lakes_of_belarus.csv, is a plain text table with 553 records, each described by the following variables:</p>
<p>1. Lake_Code: numeric code uniquely identifying each lake (for reference only, not used explicitly in the analysis).</p>
<p>2. ZMpresence: indicator of whether a lake is infested with zebra mussel (0 for non-infested, 1 for infested).</p>
<p>3. LAREA: lake area</p>
<p>4. LVOL: lake volume</p>
<p>5. MAXD: maximum depth</p>
<p>6. AVED: average depth</p>
<p>7. SPECWATSHED: specific watershed (i.e., drainage area)</p>
<p>8. TRANSP: Secchi depth</p>
<p>9. COLOR: water color</p>
<p>10. pH: water pH</p>
<p>11. HCO3: HCO3 content</p>
<p>12. SO4: SO4 content</p>
<p>13. Cl: Cl content</p>
<p>14. Ca: Ca content</p>
<p>15. Mg: Mg content</p>
<p>16. TDS: total dissolved solids</p>
<p>17. Fe: Fe content</p>
<p>18. Si: Si content</p>
<p>19. NH4: NH4 content</p>
<p>20. NO2: NO2 content</p>
<p>21. PO4: PO4 content</p>
<p>22. PermOx: permanganate oxidizability</p>
<p>23. N: latitude (decimal degrees)</p>
<p>24. E: longitude (decimal degrees)</p>
<p>Missing values in the data set are denoted as NA.</p>
<p>The second file, r_script_random_forest.R, loads the data into R (assuming that the file dreissena_in_lakes_of_belarus.csv is stored in the current R working directory), fits the Random Forest model, and plots the results.
The analysis relies on four add-on packages: caret, geosphere, randomForest, and ggplot2. All of these packages are assumed to be already installed on the user's computer (if not, they can be freely downloaded from the Comprehensive R Archive Network, cran.r-project.org, or installed directly from within R using the following command: install.packages(c("caret", "geosphere", "randomForest", "ggplot2"))). </p> </div>
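<p>Before running r_script_random_forest.R, it may help to confirm that the four packages are installed and that the data file reads in as described. The following is a minimal sketch only and is not part of the supplied script: the helper names missing_packages and load_lake_data are hypothetical, the file is assumed to be comma-separated with a header row, and the conversion of ZMpresence to a factor is an assumption made here because randomForest fits a classification model when the response is a factor.</p>

```r
# Packages named in the description above.
required_packages <- c("caret", "geosphere", "randomForest", "ggplot2")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: read the lake table and run basic sanity checks.
# Assumes a comma-separated file with a header row and NA for missing values.
load_lake_data <- function(path = "dreissena_in_lakes_of_belarus.csv") {
  lakes <- read.csv(path, na.strings = "NA")
  absent <- setdiff(c("Lake_Code", "ZMpresence"), names(lakes))
  if (length(absent) > 0) {
    stop("Missing expected columns: ", paste(absent, collapse = ", "))
  }
  # Assumption: treat the 0/1 indicator as a two-level factor.
  lakes$ZMpresence <- factor(lakes$ZMpresence, levels = c(0, 1),
                             labels = c("non_infested", "infested"))
  lakes
}

# Report (but do not install) any packages that are missing.
to_install <- missing_packages(required_packages)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
}

# Inspect the data only if the file is in the current working directory.
if (file.exists("dreissena_in_lakes_of_belarus.csv")) {
  lakes <- load_lake_data()
  print(dim(lakes))               # the description states 553 records, 24 variables
  print(colSums(is.na(lakes)))    # number of missing values per variable
}
```

<p>If any packages are reported as not installed, the install.packages() command given above can be used before sourcing the script.</p>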