STAT 624 with Dr. Richardson

Homework 9 is active.

Homework Link	Due Date
Homework 1	September 16 at 6:00 p.m.
Homework 2	September 23 at 6:00 p.m.
Homework 3	September 30 at 6:00 p.m.
Homework 4	October 10 at 11:59 p.m.
Homework 5	October 17 at 11:59 p.m.
Homework 6	October 26 at 11:59 p.m.
Homework 7	November 7 at 11:59 p.m.
Homework 8	November 22 at 11:59 p.m.
Homework 9	December 14 at 11:59 p.m.

Best Practices Grading (10 points available in every assignment)

Well-commented code (3 pts)
Scripts begin with description of what is in the file
Library calls at beginning of the code
Variables that have meaning are well-labeled
Code is indented to improve readability of functions and other similar environments
Values that are used multiple times in code are assigned to variables
References to file paths are local and not global
Loops are avoided when possible

Python for R users: Slides

2017 Final
2018 Midterm
2018 Final

2018 Final.
Data: pop.csv
The solutions use code get_ests.cpp and final2018_p1.R and final2018_p2.R
The video of me working through the solution is

Video: Final 2018

2019 Midterm
R solution: Code using this data: Data
Video: Midterm 2019

2019 Final.
The website on the final is outdated. Instead use https://richardson.byu.edu/old624F2019/Final2019Data/.
The solutions use code getdata.py and ks.cpp and kscomp.R
The video of me working through the solution is

Video: Final 2019

2020 Midterm
R solution: Code

2023 Midterm
R solution: Code

Project

November 20th: Lecture
November 21st: Lecture
November 29th: Lecture

Overview of course:

Computing in the department and getting set up:

Jamison Orton, CSR: Introduction to departmental computing resources and policies
Statistics Computing Resources. Log in and try again if you get "Access Denied".
Introduction to rencher.byu.edu
Text-based login (ssh)
- Linux distributions have a built-in ssh client.
- Mac OS X has a built-in ssh client.
- Windows ssh client: PuTTY.
File transfer (scp)
- Linux distributions have a built-in scp client.
- Mac OS X has a built-in scp client.
- Windows file transfer using WinSCP.
Enter your account on rencher and change password

ssh net.id@rencher.byu.edu
Enter temporary password
Change password: enter passwd, then follow the prompts to enter your old password and choose a new one.

Introduction to vi. Basic UNIX commands: cd; ls; mkdir; rm; rm -r; cp; vi; mv; scp

Setting up account for git access. Use the following terminal commands and just press enter when given any options:

ssh-keygen
cat ~/.ssh/id_rsa.pub

Email the output starting from ssh-rsa and ending with net.id@rencher

R: An Introduction to R.
Python: An Introduction to Python.

Running scripts in BATCH mode.
Using tmux for split vi/R or vi/Python session.

More on tmux. Cheat Sheet

Download this file and rename it ~/.vimrc
Download this file and rename it ~/.vimrc.line-feeder-3
Download this file and rename it ~/.vimrc.latex-helper
Download this file and rename it ~/.tmux.conf

Clone your personal git repository

git clone git624@collings.byu.edu:your.netid

Read-only STAT 624 general git repository:

Clone command: git clone git624@collings.byu.edu:general624

Git: A free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

Introduction to git
Initial set up for git:
- git config --global user.name "Your Name Comes Here"
- git config --global user.email you@yourdomain.example.com
- git config --global merge.tool meld
Hosting providers: GitLab.com, Bitbucket.org, and GitHub.com
Git tutorials
Pro Git book --- a comprehensive reference

Submitting Homework

git pull
git add filename or git add .
git commit -m "message"
git push

Note: Check git status before each step while you are getting used to it.

LaTeX: Introduction

Article
Presentation

Simulation Studies: A Comprehensive Insight

Simulation studies are techniques used to imitate a real-world process on a computer, with the objective of understanding or evaluating the statistical properties associated with the process. They serve as a powerful tool for analytical or numerical analysis.
Applications of simulation studies include, but are not limited to:
- Evaluating point estimators: Point estimators are used to provide the single "best guess" of a parameter based on observed data.
- Assessing confidence intervals: Repeated sampling shows how reliably an interval procedure captures the true parameter.
- Checking the finite-sample statistical properties of estimators and testing procedures that have been motivated through asymptotics. See the demonstrations in R and Python: meantest.R and meantest.py
- Testing hypotheses: Simulation studies assist in testing the validity of a hypothesis under specific assumptions.
- Describing distributions: Simulation helps in understanding and describing the distributions of random variables.
- And many others, including optimizing solutions and predicting outcomes.
Simulation studies are particularly valuable when theoretical derivations are unavailable, difficult, or intractable, offering a practical avenue to gain insights and solutions.
The terms "simulation study" and "Monte Carlo study" are synonymous, both involving statistical simulations to approximate solutions to quantitative problems.
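
As a minimal sketch of such a study (this is not the course's meantest.py, whose contents are not shown here), the Python code below estimates how often the large-sample 95% interval, mean +/- 1.96 * s / sqrt(n), covers the true mean of normal data. The function name estimate_coverage, the sample sizes, the number of replications, and the seed are all illustrative assumptions.

```python
import math
import random
import statistics

Z_975 = statistics.NormalDist().inv_cdf(0.975)  # about 1.96


def estimate_coverage(n, reps=2000, true_mean=0.0, true_sd=1.0, seed=624):
    """Estimate how often mean +/- z * s / sqrt(n) covers true_mean."""
    rng = random.Random(seed)  # fixed seed so the study is reproducible
    hits = 0
    for _ in range(reps):
        # Simulate one data set from the assumed model
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        center = statistics.fmean(sample)
        half_width = Z_975 * statistics.stdev(sample) / math.sqrt(n)
        # Record whether this interval contains the true mean
        if center - half_width < true_mean < center + half_width:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for n in (5, 100):
        print(f"n = {n:3d}: estimated coverage = {estimate_coverage(n):.3f}")
```

For normal data the estimated coverage should fall noticeably short of 95% at n = 5, because the normal quantile ignores the extra variability from estimating the standard deviation, and should be close to 95% at n = 100.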

Explore More: Monte Carlo Integration
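
As a rough illustration of Monte Carlo integration (a sketch under stated assumptions, not the linked material), the code below estimates the integral of x^2 over [0, 1], whose exact value is 1/3, by averaging the integrand at uniform random points. The function name mc_integrate, the number of draws, and the seed are hypothetical choices.

```python
import math
import random
import statistics


def mc_integrate(func, lower, upper, draws=100_000, seed=624):
    """Estimate the integral of func over [lower, upper] and its standard error."""
    rng = random.Random(seed)
    width = upper - lower
    # width * func(U), with U uniform on [lower, upper], has expectation
    # equal to the integral, so its sample mean estimates the integral.
    values = [width * func(rng.uniform(lower, upper)) for _ in range(draws)]
    estimate = statistics.fmean(values)
    std_error = statistics.stdev(values) / math.sqrt(draws)
    return estimate, std_error


if __name__ == "__main__":
    estimate, std_error = mc_integrate(lambda x: x ** 2, 0.0, 1.0)
    print(f"estimate = {estimate:.4f}, standard error = {std_error:.4f}")
```

Reporting the standard error alongside the estimate shows how much of the gap from the exact value is attributable to simulation noise.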

Best Coding Practices:

Well-commented code (3 pts)
Scripts begin with description of what is in the file
Library calls at beginning of the code
Variables that have meaning are well-labeled
Code is indented to improve readability of functions and other similar environments
Values that are used multiple times in code are assigned to variables
References to file paths are local and not global
Loops are avoided when possible

This is demonstrated in good_form.R and good_form.py. The opposite extreme is in bad_form.R and bad_form.py.

When your code isn't working, what are your options? Here are some debugging tips:

Check that all the packages you need are loaded
Check all of your parentheses
Print out small pieces of your code
Run code one line at a time outside any loops
Create a flag
Check plots of results of certain steps
Remember that not all bugs throw errors
Copy and paste the error directly into Google or a chatbot

I personally like this list: https://blog.hartleybrody.com/debugging-code-beginner/
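
Two of the tips above, "Create a flag" and "Remember that not all bugs throw errors", can be sketched in a few lines of Python. The functions covers_buggy and covers are hypothetical and are not taken from find_the_bugs.py; they only illustrate a typo that runs without complaint and a debug flag that prints intermediate values.

```python
def covers_buggy(lower, upper, target):
    """Intended to report whether target lies inside (lower, upper)."""
    # Silent bug: the second comparison reuses `lower` instead of `upper`,
    # so the result is always False, yet Python raises no error.
    return lower < target and target < lower


def covers(lower, upper, target, debug=False):
    """Report whether target lies inside (lower, upper)."""
    if debug:
        # The "flag": extra output that is switched on only while investigating
        print(f"lower={lower:.3f} upper={upper:.3f} target={target:.3f}")
    return lower < target < upper


if __name__ == "__main__":
    # Check a small case where the right answer is known in advance
    print(covers_buggy(-1.0, 1.0, 0.0))       # False, although 0 is inside
    print(covers(-1.0, 1.0, 0.0, debug=True))  # True, after the debug line
```

Checking a tiny case with a known answer is what exposes the first function; nothing in its output looks like an error.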

The scripts find_the_bugs.R and find_the_bugs.py implement a simulation study that tests the effect of an outlier on the coverage of regression coefficient confidence intervals. But they don't work. Use these principles to find out why!

Parallel Computing

foreach.
Example using source code: rf.R

multiprocessing.
Example using source code: rf.py
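
The contents of rf.py are not reproduced here. As a minimal, hypothetical sketch of the same idea, the code below spreads independent simulation replicates across worker processes with multiprocessing.Pool. The function names, the number of processes, and the seeding scheme (one seed per replicate) are assumptions made for illustration.

```python
import random
import statistics
from multiprocessing import Pool


def one_replicate(seed, n=50):
    """Simulate one data set of size n and return its sample mean."""
    rng = random.Random(seed)  # a separate, reproducible stream per task
    return statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))


def run_serial(seeds):
    """Run the replicates one after another in the current process."""
    return [one_replicate(seed) for seed in seeds]


def run_parallel(seeds, processes=2):
    """Run the same replicates across worker processes."""
    with Pool(processes=processes) as pool:
        return pool.map(one_replicate, seeds)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import
    seeds = range(200)
    means = run_parallel(seeds)
    print(len(means), round(statistics.fmean(means), 3))
    print(means == run_serial(seeds))  # same seeds give the same results
```

Because each replicate carries its own seed, the serial and parallel runs return identical results, which makes it easy to confirm that parallelizing did not change the answer.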

Before we delve into the simulations and tests, let's review some important terms:

Size: It refers to the probability of rejecting the null hypothesis when it is true.
Power: This is the probability that a test correctly rejects the null hypothesis when the alternative hypothesis is true.
P-value: The probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed. It is compared with the chosen significance level to decide whether to reject the null hypothesis.

Hypothesis Testing and Power Calculation via simulation:

hyp.R: This R script demonstrates the process of hypothesis testing and power calculation through simulation techniques.
hyp.py: This Python script is parallel to the R script and explores hypothesis testing and power calculation using Python.
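
The following is a minimal sketch of the idea, not the contents of hyp.py. It assumes a two-sided one-sample z-test of H0: mean = 0 with known standard deviation 1, and estimates the size and the power as the fraction of simulated data sets in which the p-value falls below alpha. The function names, sample size, alternative mean of 0.5, and seed are illustrative assumptions.

```python
import math
import random
import statistics

STD_NORMAL = statistics.NormalDist()


def z_test_p_value(sample, null_mean=0.0, sd=1.0):
    """Two-sided p-value for H0: mean == null_mean when sd is known."""
    z = (statistics.fmean(sample) - null_mean) / (sd / math.sqrt(len(sample)))
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def rejection_rate(true_mean, n=25, reps=4000, alpha=0.05, seed=624):
    """Fraction of simulated data sets in which H0: mean == 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Data generated under H0: the rejection rate estimates the size
    print(f"estimated size  (true mean 0.0): {rejection_rate(0.0):.3f}")
    # Data generated under an alternative: the rejection rate estimates the power
    print(f"estimated power (true mean 0.5): {rejection_rate(0.5):.3f}")
```

The estimated size should land near alpha = 0.05, and the estimated power grows as the true mean moves away from 0 or as n increases.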

Permutation Tests: