Poisson Distribution and Binomial Distribution Regulation

This final project analyzes the bias between a Poisson distribution and a Binomial distribution using simulation in R. The project has three parts. First, it explains what the Poisson and Binomial distributions are and why bias arises between them. Second, I will try to…
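As a rough illustration of the kind of comparison described above, the following minimal sketch contrasts a Binomial distribution with the Poisson distribution that has the same mean. The values of `n` and `p`, the range of `k`, and the number of simulated draws are illustrative assumptions and are not taken from the project itself.

```r
# Compare Binomial(n, p) with a Poisson distribution of the same mean.
# n and p are illustrative assumptions, not values from the project.
n <- 1000
p <- 0.005
lambda <- n * p

# Exact probability mass functions over a range of counts
k <- 0:20
binom_pmf <- dbinom(k, size = n, prob = p)
pois_pmf  <- dpois(k, lambda = lambda)

# Largest pointwise gap between the two mass functions
max_abs_diff <- max(abs(binom_pmf - pois_pmf))

# Simulation check: draw from both distributions and compare sample means
set.seed(1)
binom_draws <- rbinom(10000, size = n, prob = p)
pois_draws  <- rpois(10000, lambda = lambda)

print(max_abs_diff)
print(c(binomial = mean(binom_draws), poisson = mean(pois_draws)))
```

Making `p` larger while holding `n * p` fixed should widen the gap between the two mass functions, which is one way to explore where the approximation breaks down.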


Financial Crises Analysis

According to the neoclassical economic school of thought, capitalist economies are cyclical in nature (Claessens, Kose, Laeven and Valencia 2). A period of growth is usually followed by a sharp decline in economic activity. However, although the boom and bust aspects of the capitalist economy are somewhat…


Singular Value Decomposition

options(digits=3)
A = cbind(c(1,3,4,5,0,0,0), c(1,3,4,5,2,0,1), c(1,3,4,5,0,0,0), c(0,0,0,0,4,5,2), c(0,0,0,0,4,5,2))
# default decomposition
s1 = svd(A)
A1 = s1$u %*% diag(s1$d) %*% t(s1$v)
s1
A1
# lower-rank decomposition (thin SVD)
k = 3
s2 = svd(A, k, k)
D = diag(k)
diag(D) = s2$d[1:k]
A2 = s2$u %*% D %*%…


Regression and Classification Trees

#############################################################
############ Regression and Classification Trees ############
#############################################################
####################################
############# R code 1 #############
####################################
# install the rpart.plot package: install.packages("rpart.plot")
library(dplyr) # to access the select() function
library(caret) # to access the train() function
library(rpart.plot) # to access the prp() function for plotting reg. trees
homes <-…


Ridge Regression and the LASSO

########################################################
############ Ridge Regression and the LASSO ############
########################################################
##################################
############# R code #############
##################################
library(ISLR) # to access the Hitters dataset
library(dplyr) # to access the select() function
library(glmnet) # to access the cv.glmnet() function
# let's examine the Hitters dataset
str(Hitters)
View(Hitters)
# remove any variables…


K-fold Cross-Validation

##### How to manually run k-fold cross-validation using a for loop
#####################################################################
# load the CollegesNew dataset
colleges <- read.csv("C:/Users/casem/Google Drive/Drew/Statistical Machine Learning/Data/CollegesNew.csv", header = TRUE)
##### initial stuff #################################################
# set a seed for reproducibility since k-fold CV involves random numbers
set.seed(100)
# initial stuff
k…
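The excerpt above is cut off before the loop itself, so the following is only a minimal sketch of manual k-fold cross-validation with a for loop, not the post's own code. It assumes the built-in `mtcars` dataset in place of CollegesNew, an arbitrary linear model (`mpg ~ wt + hp`), and `k = 5`; all of these choices are illustrative.

```r
# Manual k-fold cross-validation with a for loop.
# Assumptions: built-in mtcars data, k = 5, and a simple linear model.
set.seed(100)  # fold assignment is random, so fix the seed

k <- 5
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
folds <- sample(rep(1:k, length.out = n))

# Storage for the test-set mean squared error of each fold
mse <- numeric(k)

for (i in 1:k) {
  train <- mtcars[folds != i, ]  # fit on all folds except fold i
  test  <- mtcars[folds == i, ]  # evaluate on the held-out fold i

  fit  <- lm(mpg ~ wt + hp, data = train)
  pred <- predict(fit, newdata = test)

  mse[i] <- mean((test$mpg - pred)^2)
}

# The cross-validated error estimate is the average over the k folds
cv_mse <- mean(mse)
print(cv_mse)
```

Each row is used for testing exactly once and for training in the other folds, which is what distinguishes k-fold cross-validation from a single holdout split.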


Cross-validation

########################################################
#################### Holdout Method ####################
########################################################
##################################
############ R code 1 ############
##################################
# load the CollegesNew dataset
colleges <- read.csv("C:/Users/casem/Google Drive/Drew/Statistical Machine Learning/Data/CollegesNew.csv", header = TRUE)
##### setting a seed
# set a seed for reproducible results whenever generating random numbers
set.seed(1) # nothing special about…


KNN, Logistic Regression, Linear /Quadratic Discriminant Analysis

########################################################
############## k-Nearest Neighbors (k-NN) ##############
########################################################
##################################
############ R code 1 ############
##################################
# we'll use the caret package for k-NN (and other methods later)
# another common package is called class (just an FYI)
# install the following packages: caret and e1071
# install.packages(c("caret", "e1071"))
#…


K-means Clustering and Hierarchical Clustering

########################################################
################### k-Means Clustering #################
########################################################
##################################
############ R code 1 ############
##################################
library(tidyverse) # to access select() and others (see last part of code)
library(factoextra) # to access the fviz_cluster() and fviz_nbclust() functions
library(gridExtra) # to access the grid.arrange() function
drivers <- read.table("C:/Users/casem/Google Drive/Drew/Statistical Machine Learning/Data/delivery_drivers.txt", header…
