library(Sleuth3)
library(ggplot2)
source(url("http://stat512.cwick.co.nz/code/stat_qqline.r"))
#look at the first few entries
head(ex0824)
#Fit the model
fit <- lm(Rate~Age,data=ex0824)
#Look at the TWO residual plots to assess linearity and constant spread
qplot(.fitted, .resid, data = fit) + geom_smooth() # Residuals vs. fitted values
qplot(Age, .resid, data = fit) + geom_smooth() # Residuals vs. Age (explanatory variable)
#' From these two plots it definitely looks like the constant variance assumption is violated:
#' as age increases, the spread (standard deviation) of the residuals decreases.
#' While less evident, the linearity assumption also appears to be violated.
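#' A rough numeric check of the pattern above (a sketch, not part of the original analysis):
#' split Age into four equal-width bins and compare the residual SD within each bin.
#' The bin count and the name age_bin are arbitrary choices made for illustration.
age_bin <- cut(ex0824$Age, breaks = 4)
tapply(resid(fit), age_bin, sd) # SDs that shrink across bins would agree with the plots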
#Let's also check the normality
qplot(sample = .resid, data = fit) + stat_qqline()
#' From this plot it appears there are some outliers in the right tail, so the normality assumption
#' is questionable. How will this affect our results?
#' Let's see what the fabulous log transform does here
fit <- lm(log(Rate)~Age,data=ex0824)
#Look at the TWO residual plots to assess linearity and constant spread
qplot(.fitted, .resid, data = fit) + geom_smooth() # Residuals vs. fitted values
qplot(Age, .resid, data = fit) + geom_smooth() # Residuals vs. Age (explanatory variable)
#Let's also check the normality
qplot(sample = .resid, data = fit) + stat_qqline()
#' This time it appears to me that all of our assumptions are met.
#' So now how about we look at the interpretation of the slope. Remember we log transformed the
#' response, so first let us find the estimate and back-transform it.
summary(fit) # This tells us the estimate of the slope is -0.019
exp(-0.0190) # This value is equal to 0.9812
# 95% CI for the back-transformed slope: 0.9798 to 0.9826
exp(-0.0190 - 1.96 * 0.0007357)
exp(-0.0190 + 1.96 * 0.0007357)
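#' The same quantities can be pulled from the fitted model instead of being retyped by hand.
#' A minimal sketch: confint() uses a t quantile rather than 1.96, so the interval may differ
#' from the hand calculation in the last decimal place.
exp(coef(fit)["Age"])                  # back-transformed slope estimate
exp(confint(fit, "Age", level = 0.95)) # back-transformed 95% CI for the slope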
#' Sample evidence indicates that a one-month increase in age is associated with a multiplicative change in the median respiratory rate of 0.9812 (95% confidence interval: 0.9798 to 0.9826). Note that Age in ex0824 is recorded in months, not years.
#' So in regular English: "As age increases by 1 month, the median respiratory rate is about 2% lower."
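#' The percentage statement can also be computed directly from the log-scale fit.
#' A sketch only: the 12-unit version assumes Age is recorded in months, as the Sleuth3
#' documentation for ex0824 appears to state.
100 * (exp(coef(fit)["Age"]) - 1)      # percent change in median Rate per one-unit increase in Age
100 * (exp(12 * coef(fit)["Age"]) - 1) # percent change in median Rate per 12 units of Age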