Appendix B — Package References

Packages & Functions Reference

This table lists the packages and commands used throughout the book, along with what each command does and the chapter in which it is first introduced.

Package	Command	What it does	First introduced
base R	`<-`	Assigns values to objects for later use.	Introduction to R
base R	`c()`	Combines multiple values into a single vector.	Introduction to R
base R	`:`	Creates integer sequences (e.g., `1:10`).	Introduction to R
base R	`[]`	Indexes and subsets elements from vectors or data frames.	Introduction to R
base R	`$`	Accesses or creates columns within a data frame.	Introduction to R
base R	`sqrt()`	Computes square roots.	Introduction to R
base R	`log()` / `log10()`	Computes natural and base-10 logarithms.	Introduction to R
base R	`round()`	Rounds numeric values to a specified number of digits.	Introduction to R
base R	`class()`	Identifies the data type (class) of an object.	Introduction to R
base R	`length()`	Returns the number of elements in a vector.	Introduction to R
base R	`factor()`	Converts character data into categorical (factor) variables.	Introduction to R
base R	`levels()`	Displays the levels associated with a factor.	Introduction to R
base R	`data.frame()`	Combines vectors into a tabular data structure.	Introduction to R
base R	`head()` / `tail()`	Displays the first or last rows of a dataset.	Introduction to R
base R	`str()`	Displays the internal structure and data types of a dataset.	Introduction to R
base R	`summary()`	Produces descriptive summaries of variables or model results.	Introduction to R
base R	`table()`	Creates frequency tables for categorical data.	Introduction to R
base R	`nrow()` / `ncol()`	Returns the number of rows or columns in a dataset.	Introduction to R
base R	`colnames()`	Displays or modifies column names of a data frame.	Introduction to R
base R	`read.csv()`	Imports CSV files into R as data frames.	Introduction to R
base R	`getwd()` / `setwd()`	Gets or sets the current working directory.	Introduction to R
base R	`install.packages()`	Installs packages from CRAN.	Introduction to R
base R	`library()`	Loads an installed package into the current R session.	Introduction to R
base R	`?function_name`	Accesses built-in help documentation for a function.	Introduction to R
magrittr	`%>%`	Pipes the result of one expression into the next function as its first argument.	Introduction to tidyverse
dplyr	`select()`	Chooses specific columns from a dataset.	Introduction to tidyverse
dplyr	`filter()`	Keeps rows that meet logical conditions.	Introduction to tidyverse
dplyr	`arrange()`	Orders rows based on column values.	Introduction to tidyverse
dplyr	`mutate()`	Creates or modifies columns.	Introduction to tidyverse
dplyr	`rename()`	Renames columns using `new_name = old_name`.	Introduction to tidyverse
dplyr	`distinct()`	Returns unique rows or value combinations.	Introduction to tidyverse
dplyr	`if_else()`	Returns one of two values depending on a logical condition (a type-strict version of `ifelse()`).	Introduction to tidyverse
dplyr	`case_when()`	Applies multiple conditional rules.	Introduction to tidyverse
base R	`is.na()`	Identifies missing (`NA`) values.	Introduction to tidyverse
tidyr	`drop_na()`	Removes rows containing missing values.	Introduction to tidyverse
dplyr	`count()`	Counts observations by group.	Introduction to tidyverse
dplyr	`group_by()`	Groups data for grouped operations.	Introduction to tidyverse
dplyr	`summarise()`	Computes summary statistics for groups.	Introduction to tidyverse
dplyr	`n()`	Returns group size within `summarise()`.	Introduction to tidyverse
base R	`sessionInfo()`	Displays information about the current R session, including loaded packages.	Introduction to tidyverse
dplyr	`inner_join()`	Performs a SQL-style inner join, keeping only rows that match in both datasets.	Comparing Two Groups
tidyr	`pivot_longer()`	Converts data from wide format to long format.	Comparing Two Groups
tidyr	`pivot_wider()`	Converts data from long format back to wide format.	Comparing Two Groups
base R	`rbind()`	Combines multiple data frames by binding rows together.	Comparing Two Groups
base R	`merge()`	Joins two data frames together based on a shared key variable.	Comparing Two Groups
base R	`mean()`	Calculates the average of numeric values.	Comparing Two Groups
stats	`t.test()`	Tests whether two group means differ significantly.	Comparing Two Groups
stats	`cor()`	Computes correlation coefficients (Pearson by default).	Correlation Analysis
stats	`cor.test()`	Tests whether the correlation between two variables differs significantly from zero.	Correlation Analysis
graphics	`pairs()`	Creates a scatterplot matrix.	Correlation Analysis
GGally	`ggpairs()`	Creates an enhanced scatterplot matrix that also shows correlations.	Correlation Analysis
ppcor	`pcor.test()`	Computes and tests the partial correlation between two variables while controlling for others.	Correlation Analysis
base R	`ifelse()`	Recodes variables conditionally.	Correlation Analysis
base R	`set.seed()`	Ensures reproducibility when generating random data.	Comparing Multiple Means
stats	`rnorm()`	Generates random values from a normal distribution.	Comparing Multiple Means
stats	`aov()`	Fits ANOVA models.	Comparing Multiple Means
supernova	`supernova()`	Displays ANOVA results in structured tables.	Comparing Multiple Means
stats	`TukeyHSD()`	Performs post-hoc pairwise comparisons.	Comparing Multiple Means
base R	`plot()`	Visualizes post-hoc comparison results.	Comparing Multiple Means
AICcmodavg	`aictab()`	Compares models using AIC.	Comparing Multiple Means
stats	`xtabs()`	Constructs contingency tables using a formula interface.	Analyzing Categorical Data
janitor	`tabyl()`	Creates clean contingency tables.	Analyzing Categorical Data
janitor	`adorn_percentages()`	Converts counts to proportions (by row, by column, or overall).	Analyzing Categorical Data
janitor	`adorn_ns()`	Displays counts and percentages together.	Analyzing Categorical Data
janitor	`clean_names()`	Standardizes column names into a consistent format (snake_case by default).	Correlation Analysis
stats	`chisq.test()`	Performs Chi-Square tests of independence.	Analyzing Categorical Data
gmodels	`CrossTable()`	Produces detailed cross-tabulations.	Analyzing Categorical Data
pheatmap	`pheatmap()`	Draws a heatmap of residuals or contributions.	Analyzing Categorical Data
rcompanion	`cramerV()`	Measures association strength between categorical variables.	Analyzing Categorical Data
stats	`lm()`	Fits linear regression models.	Linear Regression
broom	`tidy()`	Tidies model coefficients.	Linear Regression
broom	`glance()`	Extracts model-level statistics.	Linear Regression
lmtest	`bptest()`	Runs the Breusch-Pagan test for heteroscedasticity.	Linear Regression
stats	`AIC()`	Computes the Akaike information criterion (AIC) used to compare regression models.	Linear Regression
stats	`step()`	Performs stepwise model selection.	Linear Regression
stats	`glm()`	Fits generalized linear models, including logistic regression (binomial family).	Logistic Regression
base R	`exp()`	Computes the exponential function; used to convert log-odds coefficients to odds ratios.	Logistic Regression
caTools	`sample.split()`	Splits data into training/testing sets.	Logistic Regression
caret	`confusionMatrix()`	Evaluates classification performance.	Logistic Regression
pscl	`pR2()`	Computes pseudo R² values.	Logistic Regression
caret	`varImp()`	Assesses predictor importance.	Logistic Regression
car	`vif()`	Computes variance inflation factors to detect multicollinearity.	Logistic Regression
pROC	`roc()`	Builds ROC curves.	Logistic Regression
pROC	`auc()`	Computes area under the ROC curve (AUC).	Logistic Regression
knitr	`knitr::opts_chunk$set()`	Sets global chunk options in R Markdown.	Reproducible Reporting
knitr	`kable()`	Creates formatted tables for reports.	Reproducible Reporting
nycOpenData	`nyc311()`	Downloads NYC 311 Service Request data from NYC Open Data.	Reproducible Reporting
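
Several of the dplyr commands in the table are usually chained together with the pipe. The following is a minimal sketch of that pattern, not an example taken from the book's chapters. It assumes the dplyr package is installed and uses R's built-in `mtcars` dataset; the object name `mpg_by_cyl` and the column names `n_cars` and `mean_mpg` are chosen here purely for illustration.

```r
library(dplyr)  # also makes the %>% pipe available

# mtcars ships with R: mpg = miles per gallon, cyl = number of cylinders
mpg_by_cyl <- mtcars %>%
  filter(!is.na(mpg)) %>%            # keep rows where mpg is not missing
  group_by(cyl) %>%                  # one group per cylinder count
  summarise(
    n_cars   = n(),                  # group size
    mean_mpg = round(mean(mpg), 1)   # average mpg, rounded to 1 digit
  ) %>%
  arrange(cyl)                       # order rows by cylinder count

mpg_by_cyl
```

Each step receives the data frame produced by the step before it, so the pipeline reads top to bottom in the order the operations are applied.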

citation("base")

To cite R in publications use:

  R Core Team (2025). _R: A Language and Environment for Statistical
  Computing_. R Foundation for Statistical Computing, Vienna, Austria.
  <https://www.R-project.org/>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2025},
    url = {https://www.R-project.org/},
  }

We have invested a lot of time and effort in creating R, please cite it
when using it for data analysis. See also 'citation("pkgname")' for
citing R packages.

citation("ggplot2")

To cite ggplot2 in publications, please use

  H. Wickham. ggplot2: Elegant Graphics for Data Analysis.
  Springer-Verlag New York, 2016.

A BibTeX entry for LaTeX users is

  @Book{,
    author = {Hadley Wickham},
    title = {ggplot2: Elegant Graphics for Data Analysis},
    publisher = {Springer-Verlag New York},
    year = {2016},
    isbn = {978-3-319-24277-4},
    url = {https://ggplot2.tidyverse.org},
  }

citation("dplyr")

To cite package 'dplyr' in publications use:

  Wickham H, François R, Henry L, Müller K, Vaughan D (2023). _dplyr: A
  Grammar of Data Manipulation_. doi:10.32614/CRAN.package.dplyr
  <https://doi.org/10.32614/CRAN.package.dplyr>, R package version
  1.1.4, <https://CRAN.R-project.org/package=dplyr>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {dplyr: A Grammar of Data Manipulation},
    author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller and Davis Vaughan},
    year = {2023},
    note = {R package version 1.1.4},
    url = {https://CRAN.R-project.org/package=dplyr},
    doi = {10.32614/CRAN.package.dplyr},
  }

citation("tidyr")

To cite package 'tidyr' in publications use:

  Wickham H, Vaughan D, Girlich M (2024). _tidyr: Tidy Messy Data_.
  doi:10.32614/CRAN.package.tidyr
  <https://doi.org/10.32614/CRAN.package.tidyr>, R package version
  1.3.1, <https://CRAN.R-project.org/package=tidyr>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {tidyr: Tidy Messy Data},
    author = {Hadley Wickham and Davis Vaughan and Maximilian Girlich},
    year = {2024},
    note = {R package version 1.3.1},
    url = {https://CRAN.R-project.org/package=tidyr},
    doi = {10.32614/CRAN.package.tidyr},
  }
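
The citation output above can also be produced programmatically. As a rough sketch, `citation()` and `toBibtex()` (both from the utils package that ships with R) can be combined to collect BibTeX entries for several packages at once. The vector `pkgs` and the list `bib_entries` are illustrative names, and the example sticks to packages bundled with R so that it runs without extra installs; swapping in names such as "dplyr" or "tidyr" assumes those packages are installed.

```r
# Packages to cite. "base" and "stats" are bundled with R, so citation()
# returns the R Core Team entry for both of them.
pkgs <- c("base", "stats")

# Build one BibTeX entry (a character vector of lines) per package
bib_entries <- lapply(pkgs, function(pkg) toBibtex(citation(pkg)))
names(bib_entries) <- pkgs

# Print each entry, preceded by a BibTeX comment naming the package
for (pkg in pkgs) {
  cat("% ", pkg, "\n", sep = "")
  writeLines(bib_entries[[pkg]])
  cat("\n")
}
```

The printed entries can be pasted into a `.bib` file. Note that the entries are generated without citation keys (`@Manual{,`), so a key has to be added by hand before they can be cited.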