library(Sleuth3)
library(ggplot2)
#' # Q1
#' ## 1
# head(ex0923)
qplot(AFQT, Income2005, data=ex0923)
#' Income and intelligence score (AFQT) appear to be positively related. The spread
#' of income appears to increase with increasing average income, an
#' indication the constant spread assumption may be violated. A log
#' transform of income may alleviate this problem. In general, income is very
#' right skewed (more evidence for a log transform), resulting in a few outliers
#' with very high income.
qplot(Educ, Income2005, data=ex0923)
#' Education and income also appear to be positively related. It's really hard to
#' evaluate spread because of the drastically different numbers of people in the
#' education categories. I probably would do boxplots:
qplot(factor(Educ), Income2005, data=ex0923, geom = "boxplot")
#' Now it's obvious spread is also increasing with increasing Education, validating
#' our suggestion to log transform income.
# qplot(AFQT, log(Income2005), data=ex0923) # things improve!
# qplot(factor(Educ), log(Income2005), data=ex0923, geom = "boxplot")
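#' As a rough numerical check on the boxplots (a sketch, not part of the
#' original solution), the standard deviation of income within each education
#' level can be compared on the raw and log scales. The names `n`, `sd_raw`,
#' `sd_log` and `spread_by_educ` are introduced here for illustration only.
n      <- tapply(ex0923$Income2005, ex0923$Educ, length)   # people per education level
sd_raw <- tapply(ex0923$Income2005, ex0923$Educ, sd)       # spread on the dollar scale
sd_log <- tapply(log(ex0923$Income2005), ex0923$Educ, sd)  # spread on the log scale
spread_by_educ <- cbind(n, sd_raw, sd_log)
round(spread_by_educ, 2)
#' If the log transform is helping, `sd_log` should vary much less across
#' education levels than `sd_raw`. Levels with very few people give noisy
#' (or NA) values, so they should not be over-interpreted.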
qplot(AFQT,Educ, data=ex0923)
#' There appears to be a positive relationship between education and intelligence
#' score, although there is a lot of spread. For example, at 8 or fewer
#' years of education, AFQT scores are all below 25, but at 12 years of education
#' we see scores over the entire range of AFQT. Again, the picture might be clearer
#' with boxplots:
qplot(factor(Educ), AFQT, data=ex0923, geom = "boxplot")
#' There seems to be a positive non-linear relationship between average intelligence score
#' and years of education. The spread in intelligence score also seems smaller at
#' the extremes of education. Since both education and intelligence score
#' are entering our model as explanatory variables, we don't require any distributional
#' assumptions and no transformations are necessary for them.
#'
#' There are a few outlying people who are very highly educated (> 20 years) but
#' have low intelligence scores.
#'
#'
#' ## 2
fit <- lm(log(Income2005)~Gender + Educ + AFQT, data=ex0923)
summary(fit)
confint(fit)
#' ## 3
exp(0.6245)
exp(c(0.557, 0.691))
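#' The hard-coded values above are copied from the `summary(fit)` and
#' `confint(fit)` output. A sketch of pulling them from the fitted model instead,
#' assuming the gender coefficient is the only one whose name starts with
#' "Gender" (`gender_term` and `ratio` are illustrative names):
gender_term <- grep("^Gender", names(coef(fit)), value = TRUE)
ratio <- exp(c(estimate = unname(coef(fit)[gender_term]),
               confint(fit, parm = gender_term)[1, ]))
ratio               # multiplicative effect on median income
100 * (ratio - 1)   # the same effect expressed as a percent increase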
#'
#' There is convincing evidence that the median income in 2005 is not the same
#' for males and females after accounting for education and intelligence score
#' (p-value < 0.0001, t-test for different intercepts). It is estimated that
#' the median income for males is 1.87 times that for females with the same
#' education and intelligence score. With 95% confidence, the median income for
#' males is between 1.74 and 1.99 times that for females with the same
#' education and intelligence score. (Or, equivalently, with 95% confidence, the
#' median income for males is between 74% and 99% higher than for females with the
#' same education and intelligence score.)
#'
#' *(Also acceptable)*
#' There is convincing evidence that the mean log income in 2005 is not the same
#' for males and females after accounting for education and intelligence score
#' (p-value < 0.0001, t-test for different intercepts). It is estimated that
#' the mean log income for males is 0.62 units higher than for females with the same
#' education and intelligence score. With 95% confidence, the mean log income for
#' males is between 0.56 and 0.69 units higher than for females with the same
#' education and intelligence score.
#'
#'
#' # Q2
qplot(log(Height),log(Force),data = ex0722,colour = Species)
fit <- lm(log(Force) ~ log(Height) * Species, data = ex0722)
summary(fit)
unique(ex0722$Species)
confint(fit)
ex0722$Species1 <- relevel(ex0722$Species, ref = "Hemigrapsus nudus")
fit <- lm(log(Force) ~ log(Height) * Species1, data = ex0722)
summary(fit)
confint(fit)
#' There is moderate evidence that the relationship between mean log force and
#' log height is different for Hemigrapsus nudus compared to Cancer productus
#' (t-test for equal slopes, p-value = 0.04). It is estimated that the increase in
#' mean log force associated with a one unit increase in log height is 1.66 units
#' *less* for Hemigrapsus nudus than for Cancer productus. With 95% confidence,
#' this increase is between 0.05 and 3.27 units *less* for Hemigrapsus nudus than
#' for Cancer productus.
#'
#' There is no evidence that the relationship between mean log force and log
#' height is different for Lophopanopeus bellus compared to Cancer productus
#' (t-test for equal slopes, p-value = 0.28). It is estimated that the increase in
#' mean log force associated with a one unit increase in log height is 0.9 units
#' *more* for Lophopanopeus bellus than for Cancer productus. With 95% confidence,
#' this increase is between 0.79 units *less* and 2.59 units *more* for
#' Lophopanopeus bellus than for Cancer productus.
#'
#' There is convincing evidence that the relationship between mean log force and
#' log height is different for Lophopanopeus bellus compared to Hemigrapsus nudus
#' (t-test for equal slopes, p-value = 0.001). It is estimated that the increase in
#' mean log force associated with a one unit increase in log height is 2.57 units
#' *more* for Lophopanopeus bellus than for Hemigrapsus nudus. With 95% confidence,
#' this increase is between 1.06 and 4.06 units *more* for Lophopanopeus bellus
#' than for Hemigrapsus nudus.
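#'
#' One way to check these comparisons (a sketch, not part of the original
#' solution): an interaction model with a separate intercept and slope for every
#' species gives the same slope estimates as fitting each species separately, so
#' the slope differences can be read off directly. `slopes` is an illustrative name.
slopes <- sapply(split(ex0722, ex0722$Species), function(d) {
  coef(lm(log(Force) ~ log(Height), data = d))[["log(Height)"]]
})
slopes                      # estimated slope of log force on log height, by species
outer(slopes, slopes, "-")  # row species slope minus column species slope
#' The off-diagonal entries should match the interaction estimates reported by
#' `summary(fit)` under each choice of reference level; the standard errors and
#' p-values still have to come from the interaction model itself.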