| Package | Command | What it does | First introduced |
|---|---|---|---|
| base R | <- | Assigns values to objects for later use. | Introduction to R |
| base R | c() | Combines multiple values into a single vector. | Introduction to R |
| base R | : | Creates integer sequences (e.g., 1:10). | Introduction to R |
| base R | [] | Indexes and subsets elements from vectors or data frames. | Introduction to R |
| base R | $ | Accesses or creates columns within a data frame. | Introduction to R |
| base R | sqrt() | Computes square roots. | Introduction to R |
| base R | log() / log10() | Computes natural and base-10 logarithms. | Introduction to R |
| base R | round() | Rounds numeric values to a specified number of digits. | Introduction to R |
| base R | class() | Identifies the data type (class) of an object. | Introduction to R |
| base R | length() | Returns the number of elements in a vector. | Introduction to R |
| base R | factor() | Converts character data into categorical (factor) variables. | Introduction to R |
| base R | levels() | Displays the levels associated with a factor. | Introduction to R |
| base R | data.frame() | Combines vectors into a tabular data structure. | Introduction to R |
| base R | head() / tail() | Displays the first or last rows of a dataset. | Introduction to R |
| base R | str() | Displays the internal structure and data types of a dataset. | Introduction to R |
| base R | summary() | Produces descriptive summaries of variables or model results. | Introduction to R |
| base R | table() | Creates frequency tables for categorical data. | Introduction to R |
| base R | nrow() / ncol() | Returns the number of rows or columns in a dataset. | Introduction to R |
| base R | colnames() | Displays or modifies column names of a data frame. | Introduction to R |
| base R | read.csv() | Imports CSV files into R as data frames. | Introduction to R |
| base R | getwd() / setwd() | Gets or sets the current working directory. | Introduction to R |
| base R | install.packages() | Installs packages from CRAN. | Introduction to R |
| base R | library() | Loads an installed package into the current R session. | Introduction to R |
| base R | ?function_name | Accesses built-in help documentation for a function. | Introduction to R |
| magrittr | %>% | Passes the result of one operation into the next. | Introduction to tidyverse |
| dplyr | select() | Chooses specific columns from a dataset. | Introduction to tidyverse |
| dplyr | filter() | Keeps rows that meet logical conditions. | Introduction to tidyverse |
| dplyr | arrange() | Orders rows based on column values. | Introduction to tidyverse |
| dplyr | mutate() | Creates or modifies columns. | Introduction to tidyverse |
| dplyr | rename() | Renames columns using new_name = old_name. | Introduction to tidyverse |
| dplyr | distinct() | Returns unique rows or value combinations. | Introduction to tidyverse |
| dplyr | if_else() | Creates values based on a binary condition. | Introduction to tidyverse |
| dplyr | case_when() | Applies multiple conditional rules. | Introduction to tidyverse |
| base R | is.na() | Identifies missing (NA) values. | Introduction to tidyverse |
| tidyr | drop_na() | Removes rows containing missing values. | Introduction to tidyverse |
| dplyr | count() | Counts observations by group. | Introduction to tidyverse |
| dplyr | group_by() | Groups data for grouped operations. | Introduction to tidyverse |
| dplyr | summarise() | Computes summary statistics for groups. | Introduction to tidyverse |
| dplyr | n() | Returns group size within summarise(). | Introduction to tidyverse |
| base R | sessionInfo() | Displays information about the current R session, including loaded packages. | Introduction to tidyverse |
| dplyr | inner_join() | Performs a SQL-style inner join, keeping only rows that match in both datasets. | Comparing Two Groups |
| tidyr | pivot_longer() | Converts data from wide format to long format. | Comparing Two Groups |
| tidyr | pivot_wider() | Converts data from long format back to wide format. | Comparing Two Groups |
| base R | rbind() | Combines multiple data frames by binding rows together. | Comparing Two Groups |
| base R | merge() | Joins two data frames together based on a shared key variable. | Comparing Two Groups |
| base R | mean() | Calculates the average of numeric values. | Comparing Two Groups |
| stats | t.test() | Tests whether two group means differ significantly. | Comparing Two Groups |
| stats | cor() | Computes Pearson correlation coefficients. | Correlation Analysis |
| stats | cor.test() | Computes and tests correlations. | Correlation Analysis |
| base R | pairs() | Creates a scatterplot matrix. | Correlation Analysis |
| GGally | ggpairs() | Enhanced scatterplot matrix with correlations. | Correlation Analysis |
| ppcor | pcor.test() | Computes partial correlations. | Correlation Analysis |
| base R | ifelse() | Recodes variables conditionally. | Correlation Analysis |
| base R | set.seed() | Ensures reproducibility when generating random data. | Comparing Multiple Means |
| stats | rnorm() | Generates random values from a normal distribution. | Comparing Multiple Means |
| stats | aov() | Fits ANOVA models. | Comparing Multiple Means |
| supernova | supernova() | Displays ANOVA results in structured tables. | Comparing Multiple Means |
| stats | TukeyHSD() | Performs post-hoc pairwise comparisons. | Comparing Multiple Means |
| base R | plot() | Visualizes post-hoc comparison results. | Comparing Multiple Means |
| AICcmodavg | aictab() | Compares models using AIC. | Comparing Multiple Means |
| base R | xtabs() | Constructs contingency tables using a formula interface. | Analyzing Categorical Data |
| janitor | tabyl() | Creates clean contingency tables. | Analyzing Categorical Data |
| janitor | adorn_percentages() | Converts counts to percentages. | Analyzing Categorical Data |
| janitor | adorn_ns() | Displays counts and percentages together. | Analyzing Categorical Data |
| janitor | clean_names() | Cleans names of an object. | Correlations |
| stats | chisq.test() | Performs Chi-Square tests of independence. | Analyzing Categorical Data |
| gmodels | CrossTable() | Detailed cross-tabulations. | Analyzing Categorical Data |
| pheatmap | pheatmap() | Heatmap visualization of residuals or contributions. | Analyzing Categorical Data |
| rcompanion | cramerV() | Measures association strength between categorical variables. | Analyzing Categorical Data |
| stats | lm() | Fits linear regression models. | Linear Regression |
| broom | tidy() | Tidies model coefficients. | Linear Regression |
| broom | glance() | Extracts model-level statistics. | Linear Regression |
| lmtest | bptest() | Tests heteroscedasticity. | Linear Regression |
| stats | AIC() | Compares regression models. | Linear Regression |
| stats | step() | Performs stepwise model selection. | Linear Regression |
| stats | glm() | Fits generalized linear models, including logistic regression (binomial family). | Logistic Regression |
| base R | exp() | Converts log-odds to odds ratios. | Logistic Regression |
| caTools | sample.split() | Splits data into training/testing sets. | Logistic Regression |
| caret | confusionMatrix() | Evaluates classification performance. | Logistic Regression |
| pscl | pR2() | Computes pseudo R² values. | Logistic Regression |
| caret | varImp() | Assesses predictor importance. | Logistic Regression |
| car | vif() | Detects multicollinearity. | Logistic Regression |
| pROC | roc() | Builds ROC curves. | Logistic Regression |
| pROC | auc() | Computes area under the ROC curve (AUC). | Logistic Regression |
| knitr | knitr::opts_chunk$set() | Sets global chunk options in R Markdown. | Reproducible Reporting |
| knitr | kable() | Creates formatted tables for reports. | Reproducible Reporting |
| nycOpenData | nyc311() | Downloads NYC 311 Service Request data from NYC Open Data. | Reproducible Reporting |
Appendix B — Package References
C Packages & Functions Reference
This table consolidates the packages and commands used throughout the book, what each command does, and where it is first introduced.
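
As a quick illustration of how several of the base R and stats commands in the table fit together, here is a minimal sketch. The data values and the object names (`scores`, `group`, `df`, `tt`) are made up for this example and do not come from the book's datasets.

```r
# Hypothetical example data: six scores from two made-up groups.
scores <- c(12, 15, 11, 18, 20, 17)                # c() combines values into a vector
group  <- factor(c("a", "a", "a", "b", "b", "b"))  # factor() creates a categorical variable

# data.frame() combines the vectors into a tabular structure
df <- data.frame(group = group, score = scores)

# Inspect the data
str(df)              # structure and column types
head(df, 3)          # first three rows
summary(df$score)    # descriptive summary of one column, accessed with $

# Basic counts and group means
n_rows  <- nrow(df)                          # number of rows
grp_tab <- table(df$group)                   # frequency table of the factor
mean_a  <- mean(df$score[df$group == "a"])   # [] subsets by a logical condition
mean_b  <- mean(df$score[df$group == "b"])

# t.test() compares the two group means (Welch test by default)
tt <- t.test(score ~ group, data = df)
print(tt)
```

With only three observations per group the test result is not meaningful; the point is only to show the calling pattern for each function.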

citation("base")

To cite R in publications use:

R Core Team (2025). _R: A Language and Environment for Statistical
Computing_. R Foundation for Statistical Computing, Vienna, Austria.
<https://www.R-project.org/>.

A BibTeX entry for LaTeX users is

@Manual{,
  title = {R: A Language and Environment for Statistical Computing},
  author = {{R Core Team}},
  organization = {R Foundation for Statistical Computing},
  address = {Vienna, Austria},
  year = {2025},
  url = {https://www.R-project.org/},
}

We have invested a lot of time and effort in creating R, please cite it
when using it for data analysis. See also 'citation("pkgname")' for
citing R packages.

citation("ggplot2")

To cite ggplot2 in publications, please use

H. Wickham. ggplot2: Elegant Graphics for Data Analysis.
Springer-Verlag New York, 2016.

A BibTeX entry for LaTeX users is

@Book{,
  author = {Hadley Wickham},
  title = {ggplot2: Elegant Graphics for Data Analysis},
  publisher = {Springer-Verlag New York},
  year = {2016},
  isbn = {978-3-319-24277-4},
  url = {https://ggplot2.tidyverse.org},
}

citation("dplyr")

To cite package 'dplyr' in publications use:

Wickham H, François R, Henry L, Müller K, Vaughan D (2023). _dplyr: A
Grammar of Data Manipulation_. doi:10.32614/CRAN.package.dplyr
<https://doi.org/10.32614/CRAN.package.dplyr>, R package version
1.1.4, <https://CRAN.R-project.org/package=dplyr>.

A BibTeX entry for LaTeX users is

@Manual{,
  title = {dplyr: A Grammar of Data Manipulation},
  author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller and Davis Vaughan},
  year = {2023},
  note = {R package version 1.1.4},
  url = {https://CRAN.R-project.org/package=dplyr},
  doi = {10.32614/CRAN.package.dplyr},
}

citation("tidyr")

To cite package 'tidyr' in publications use:

Wickham H, Vaughan D, Girlich M (2024). _tidyr: Tidy Messy Data_.
doi:10.32614/CRAN.package.tidyr
<https://doi.org/10.32614/CRAN.package.tidyr>, R package version
1.3.1, <https://CRAN.R-project.org/package=tidyr>.

A BibTeX entry for LaTeX users is

@Manual{,
  title = {tidyr: Tidy Messy Data},
  author = {Hadley Wickham and Davis Vaughan and Maximilian Girlich},
  year = {2024},
  note = {R package version 1.3.1},
  url = {https://CRAN.R-project.org/package=tidyr},
  doi = {10.32614/CRAN.package.tidyr},
}
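
The entries above are the printed output of R's `citation()` function, so they can be regenerated for whichever versions are installed locally. The sketch below shows one way to do that; the helper name `cite_if_installed` and the package list are assumptions made for this example, and the versions and years printed on another machine may differ from those shown above.

```r
# citation() with no argument returns the citation for R itself
# (the same entry as citation("base")).
r_cite <- citation()

# toBibtex() converts a citation object into BibTeX lines
r_bib <- toBibtex(r_cite)
print(r_bib)

# Hypothetical helper: return a package's citation only if it is installed,
# so the script does not stop with an error on a missing package.
cite_if_installed <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    citation(pkg)
  } else {
    message("Package '", pkg, "' is not installed; skipping.")
    invisible(NULL)
  }
}

# Package names taken from the entries above
pkgs  <- c("ggplot2", "dplyr", "tidyr")
cites <- lapply(pkgs, cite_if_installed)
names(cites) <- pkgs
```

Each non-`NULL` element of `cites` can be printed directly or passed to `toBibtex()` in the same way as `r_cite`.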