Experimental design in Education
Educational Statistics and Research Methods (ESRM) Program
University of Arkansas
2025-01-13
Self introduction
The syllabus
Goal of this class
Delivery method of class materials
Assignments
Policy
Weekly schedule
Brief introduction to statistical software
Tell me:
No examination
Light homework: 3-5 assignments with multiple-choice questions and light calculations
To teach the big picture and general directions rather than statistical details
A lot of examples
All materials use R!
Philosophy: Focus on accessibility + learning-by-doing
The AMS class heavily emphasizes hands-on, task-oriented practice
No anxiety-prone tasks (e.g., hand calculations, memorizing formulas)
No anxiety-prone methods of evaluation (e.g., timed tests)
Materials:
Lecture slides present concepts—the what and the why
Example documents: reinforce the concepts and demonstrate the how using software—R packages
All available at the course website (hosted outside of Blackboard)
Participants will have the opportunity to earn up to 100 total points in this course.
Up to 88 points can be earned from homework assignments (3-6 assignments in total)
Up to 12 points may be earned from submitting in-class quizzes. In-class quizzes will be given at random times during class. These will be graded on effort only; incorrect answers will not be penalized.
Bonus points (10 points)
Late Assignment:
My job (besides providing materials and assignments):
Answer questions via email, in individual meetings, or in group-based Zoom office hours. You can each work on homework during office hours and get immediate assistance (and then keep working)
Your job (in descending order of timely importance):
Ask questions—preferably in class, but any time is better than none
Frequently review the class material, focusing on mastering the vocabulary, logic, and procedural skills
Don’t wait until the last minute to start homework, and don’t be afraid to ask for help if you get stuck on one thing for more than 15 minutes
Practice using the software to implement the techniques you are learning on data you care about
Do the readings for a broader perspective and additional examples (best after the lecture)
Attendance: Strongly recommended but not required
Please do not attend in-person if you might be sick!
Please do not attend if you receive an inclement weather notification
You can also join the class via Zoom
You won’t miss out: I will post YouTube recordings (audio + screen share) on the course website upon request.
Changes will be sent via email by 9 am on class days
I will update the homework and in-class quiz links on class days. If a link has not been uploaded, there are two possibilities: (1) I forgot to upload it, in which case I will upload it later and notify you by email; or (2) I decided not to upload it, or I removed it.
I may change to Zoom-only for dangerous weather or if I am sick.
I will show examples primarily using R and R packages. Some important R packages include:
tidyverse: a collection of R packages for data cleaning and data transformation
ggplot2: a popular package for data visualization
Why not SPSS?
SPSS could only be used for some, but not all, of our content
More importantly, it doesn’t have as much room to grow; R has many new packages being developed via CRAN and GitHub
Why not SAS?
SAS is not open source, meaning that we cannot check the source code if something goes wrong
SAS is also commercial, whereas R is free
There are some points to consider
R packages are only as good as their authors (so little quality control)
Syntax and capabilities are idiosyncratic to the packages
The good things are:
If you really master R, you can do things yourself (e.g., write your own algorithm for a complex model)
You can check the source code of R packages and know where issues come from
You can communicate with R package authors and provide some suggestions
You can become an R package author yourself and be famous
R is a comprehensive statistical and graphical programming language
We can use the R language through multiple interfaces or IDEs, e.g., the terminal, VS Code, or RStudio.
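As a small illustration of the points above, the sketch below writes a sample standard deviation by hand and compares it with the built-in stats::sd. The function name my_sd and the example scores are made up for this illustration. Printing a function's name without parentheses is one simple way to look at its source code.

```r
# Hypothetical helper: sample standard deviation written from scratch.
my_sd <- function(x) {
  sqrt(sum((x - mean(x))^2) / (length(x) - 1))
}

scores <- c(82, 90, 75, 88, 95)  # made-up example data

# Compare the hand-written version with the built-in function.
sd_own <- my_sd(scores)
sd_builtin <- stats::sd(scores)
all.equal(sd_own, sd_builtin)

# Typing a function name without parentheses prints its source code.
stats::sd
```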
We will mainly focus on RStudio because of its convenience
RStudio is a product of Posit and is free for personal use
You can download and install base R via r-project.org (currently R-4.4.1)
Then, after the installation of R, you can download RStudio via posit.co (currently)
After installing R and RStudio, you can open RStudio to start programming in R.
However, your R only has the base packages
To enhance its utility, most users will install R packages for certain purposes
R packages are uploaded to platforms (e.g., CRAN or GitHub) by researchers or companies
Those R packages typically have version numbers. Some functions may be available in some versions (like Ver. 1.1) but not in others.
Do not upgrade your packages if your code is running well
R users are free to download and use those R packages
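To see which version of a package is installed, base R provides packageVersion(). The sketch below uses the stats package only because it ships with every R installation; the minimum version "4.0.0" is an arbitrary value chosen for illustration.

```r
# Look up the installed version of a package.
stats_version <- packageVersion("stats")
print(stats_version)

# Compare against a minimum version before relying on a newer function.
meets_minimum <- stats_version >= "4.0.0"
meets_minimum
```

The same call works for any installed package, for example packageVersion("ggplot2") once ggplot2 is installed.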
To download a package, you need to know its name
For example, to download the latest version of the tidyverse package from CRAN, you can run install.packages("tidyverse") in the Console panel of RStudio
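A minimal sketch of a CRAN installation that first checks whether the package is already present. The interactive() guard is an assumption added here so that the download only happens when you run the code yourself in the Console, not when the file is run unattended.

```r
pkg <- "tidyverse"

# TRUE if the package is already installed in your library.
already_installed <- requireNamespace(pkg, quietly = TRUE)

# Download from CRAN only when needed (requires an internet connection).
if (!already_installed && interactive()) {
  install.packages(pkg)
}
```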
You can also download the development version of a package from GitHub using the pak package
The development version contains the most up-to-date code, which may be unstable
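One possible way to do this is sketched below. The reference "tidyverse/tidyverse" (GitHub owner, then repository name) is assumed here for illustration, and the interactive() guard is an added assumption so that nothing is downloaded when the code is run unattended.

```r
# GitHub reference in "owner/repository" form (assumed for illustration).
github_ref <- "tidyverse/tidyverse"

if (interactive()) {
  # pak itself is installed from CRAN first if it is missing.
  if (!requireNamespace("pak", quietly = TRUE)) {
    install.packages("pak")
  }
  # Install the development version from GitHub.
  pak::pak(github_ref)
}
```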
You can update the package and its dependencies
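A hedged sketch of two ways to update an installed package: pak::pkg_install() with upgrade = TRUE, which also upgrades dependencies, or base R's update.packages() when pak is not installed. As before, the interactive() guard is an added assumption to avoid unattended downloads.

```r
pkg <- "tidyverse"

if (interactive()) {
  if (requireNamespace("pak", quietly = TRUE)) {
    # Reinstall the package and upgrade its dependencies to the latest versions.
    pak::pkg_install(pkg, upgrade = TRUE)
  } else {
    # Base R alternative: update only the named package, without prompting.
    update.packages(oldPkgs = pkg, ask = FALSE)
  }
}
```

Keep in mind the earlier advice: if your code is running well, there is usually no need to update.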
To carry out a task, you need to use functions contained in R packages
There are two ways of using R functions
Direct way: call the function as package::function; you don’t have to load the package first
Use-after-load way: load the package into your session first; then you can call the function by name without specifying the package name
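The two ways can be compared side by side. This sketch uses the tools package and its file_ext() function only because tools ships with R but is not attached by default; the file name is made up. The same pattern applies to packages such as dplyr or ggplot2.

```r
# Direct way: package::function, no library() call needed.
ext_direct <- tools::file_ext("lecture01.qmd")

# Use-after-load way: attach the package, then call the function by name.
library(tools)
ext_loaded <- file_ext("lecture01.qmd")

# Both calls return the same result.
identical(ext_direct, ext_loaded)
```

The direct way makes it obvious which package a function comes from, while the use-after-load way saves typing when you call many functions from the same package.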
How do you know whether you have already loaded a package?
You can use the sessionInfo() function
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS 15.2
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Chicago
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] cmdstanr_0.8.1.9000 rstan_2.32.6 StanHeaders_2.32.10
loaded via a namespace (and not attached):
[1] tensorA_0.36.2.1 utf8_1.2.4 generics_0.1.3
[4] digest_0.6.37 magrittr_2.0.3 evaluate_1.0.1
[7] grid_4.4.1 fastmap_1.2.0 jsonlite_1.8.9
[10] processx_3.8.4 pkgbuild_1.4.5 backports_1.5.0
[13] ps_1.8.1 gridExtra_2.3 fansi_1.0.6
[16] QuickJSR_1.4.0 scales_1.3.0 codetools_0.2-20
[19] abind_1.4-8 cli_3.6.3 rlang_1.1.4
[22] munsell_0.5.1 yaml_2.3.10 tools_4.4.1
[25] inline_0.3.20 parallel_4.4.1 checkmate_2.3.2
[28] dplyr_1.1.4 colorspace_2.1-1 ggplot2_3.5.1
[31] curl_6.0.0 vctrs_0.6.5 posterior_1.6.0
[34] R6_2.5.1 matrixStats_1.4.1 stats4_4.4.1
[37] lifecycle_1.0.4 V8_6.0.0 pkgconfig_2.0.3
[40] RcppParallel_5.1.9 pillar_1.9.0 gtable_0.3.6
[43] loo_2.8.0 glue_1.8.0 Rcpp_1.0.13-1
[46] xfun_0.49 tibble_3.2.1 tidyselect_1.2.1
[49] rstudioapi_0.17.1 knitr_1.49 htmltools_0.5.8.1
[52] rmarkdown_2.29 compiler_4.4.1 distributional_0.5.0
The output contains several pieces of information:
R version, operating system, matrix libraries (BLAS/LAPACK), and locale
Attached packages (you can call the functions of those packages by name)
Packages loaded via a namespace (and not attached): you cannot call their functions by name alone, so you need to attach them with library() or require() first
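The same information can be pulled out of the object that sessionInfo() returns. The sketch below is one way to list attached and loaded-only packages; the variable names are chosen here for illustration, and the results depend on your own session.

```r
info <- sessionInfo()

# Base packages that are attached in this session.
info$basePkgs

# Other attached packages (NULL if none are attached).
names(info$otherPkgs)

# Packages loaded via a namespace but not attached (NULL if none).
names(info$loadedOnly)

# Check whether one specific package is attached.
stats_attached <- "package:stats" %in% search()
stats_attached
```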
After you finish an R script, you have multiple ways of running the code:
Method 1: click the Run button at the top right of the script editor in RStudio
Method 2: select the code you want to run and press Ctrl + Enter (Win) or Command + Return (Mac)
Method 3: run Rscript [FILENAME].r in a terminal to run the whole script
Method 4: use an R Notebook to run R code interactively
| | Script file is .R | Script file is .rmd or .qmd |
|---|---|---|
| Run the whole script | | |
| Run the partial script | | |
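Running a whole script can also be tried from within R. The sketch below writes a tiny throwaway script to a temporary file (its contents are made up for illustration), runs it in the current session with source(), and then runs it in a separate R process with Rscript, which is what Method 3 does from a terminal.

```r
# Create a small example script in a temporary file.
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "print(mean(x))"), script_path)

# Run the whole script in the current R session.
source(script_path)

# Run the whole script in a fresh R process, as Rscript does in a terminal.
rscript <- file.path(R.home("bin"), "Rscript")
out <- system2(rscript, shQuote(script_path), stdout = TRUE)
out
```

With source(), objects created by the script (here x) stay in your session; with Rscript, the script runs in its own process and only its printed output comes back.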
ESRM 64503: Lecture01