MISY 262

Exam Date and Time: Friday June 17 at 7pm EST (no make-ups). ALL students will be
required to take Exam 1 at the same time (7pm EST on June 17) which is the time
designated by the University of Delaware for this course.
Format: Online on Canvas and Zoom. Exam 1 will be proctored by Professor Kourpas.
You will need to log into Zoom at the same time and have your camera on. You will not
be able to take the exam and receive a grade if you do not have your computer camera
on Zoom. Exam 1 monitoring software on Canvas will note what programs you have
open during Exam 1, so any use of email, chat, or other online programs is strictly
prohibited and will be reported, and the appropriate University disciplinary action will be enforced.
Absolutely no use of cellphones or multiple computers during the exam.
Zoom Link for Exam 1: https://udel.zoom.us/j/95130647263
Do not try to access before Exam 1.
Lectures covered in Exam 1: The first 8 lectures will be covered (Lectures 1-8), with
emphasis on the Linear Regression lectures (Lectures 4-8).
Time Limit: 90 minutes (DSS Students will have adjusted time). You can’t start, stop,
and continue! Time starts from the time you first access the exam! If you are a DSS
student, you will need to have approval from the DSS office sent directly to Professor
Kourpas (some students have already done so).
Access to Material: You can have access to your computers, including lecture slides,
lecture videos, R software, and scripts you have built as part of this class (absolutely no
email/chat, or any material and notes that have not been part of my course). In some
parts, you will be required to run models on R, transfer your results on Canvas, and
answer questions. In other parts, you will be required to interpret R output and make
decisions. All answers will be submitted on Canvas. Some questions will be essay
format, some will be multiple choice.
Exams should be completed individually! Any indication of academic dishonesty
will be reported and will result in the appropriate University disciplinary action.
Study Guide & Hints
The following study guide was designed as a summary to help you study and prepare
for the cumulative Final. It is not supposed to be an exhaustive list of questions or
Basic R Programming and Statistical Concepts: Lectures 1-3
• Know the different types of variables and how to recognize them (continuous,
discrete, categorical, binary, nominal, ordinal, etc.)
• Understand the difference between independent variables (predictors) and
dependent variables (response, outcome) and how to use them in regression.
• Understand how to read files in R, and how to use and recognize the right syntax of
the read command. Example: > d = read.csv("Packages.csv", header=TRUE)
• Understand the basic R commands, how to use and recognize them.
Examples: length, mean, median, unique, quantile, dim, cor
• Understand the use of the “which” command, how to use it appropriately and how to
recognize its meaning/logic.
• Know how to use the aggregate command. The aggregate command can have more
than one variable in the grouping list. Understand the logic.
Example: > aggregate(d$driversworking, by=list(d$year, d$weekend), mean)
• Understand how to use the histogram command in R, and how to interpret
histograms. What do histograms show us?
Examples: hist(d$avghoursperdriver), hist(fit$resid), hist(tstresid)
• Understand how to use the plot command in R to create scatterplots, and how to
interpret scatterplots. What do scatterplots show us?
Example: plot(d$avghoursperdriver, d$driversworking)
• Know how to use and interpret correlation in R (i.e., cor function in R). What does
correlation mean? (just a number means nothing…)
• Know how to divide and cut a dataset, creating a new subset of the data in R. You
have to be able to recognize and understand the syntax and logic.
Example: w = d[which(d$age > 25), ]
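
The commands listed above can be tried together on a small data set. The following is a minimal sketch, not course material: the data frame is invented for illustration, and only the column names (year, weekend, driversworking, avghoursperdriver) are borrowed from the examples in this guide.

```r
# Hypothetical toy data: column names come from the guide's examples,
# the values are made up for illustration only.
d <- data.frame(
  year              = c(2020, 2020, 2020, 2021, 2021, 2021),
  weekend           = c(0, 1, 0, 0, 1, 1),
  driversworking    = c(10, 6, 12, 11, 5, 7),
  avghoursperdriver = c(8.0, 9.5, 7.5, 7.8, 10.0, 9.0)
)

# Basic commands
dim(d)                      # number of rows and columns
length(d$driversworking)    # number of observations in one column
mean(d$driversworking)      # 8.5
median(d$driversworking)    # 8.5
unique(d$year)              # the distinct values: 2020 2021

# which() returns the POSITIONS (row numbers) where the condition is TRUE
idx <- which(d$driversworking > 9)   # rows 1, 3 and 4

# Subsetting: keep those rows and, because nothing follows the comma, all columns
w <- d[idx, ]

# aggregate() with two grouping variables: the mean is computed separately
# for every year/weekend combination that occurs in the data
agg <- aggregate(d$driversworking,
                 by = list(year = d$year, weekend = d$weekend),
                 mean)

# Correlation: the sign gives the direction, the size gives the strength
r <- cor(d$avghoursperdriver, d$driversworking)
```

In this invented data the correlation is negative: days with more drivers working tend to have fewer average hours per driver. That direction and strength is the kind of statement the "just a number means nothing" remark asks for.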
Linear Regressions (simple & multiple): Lectures 4-8
• Understand when to use Linear versus Logistic Regression
• Understand and recognize the R command for building a linear regression model.
Example: > fit = lm(d$avghoursperdriver ~ d$pctoversized + d$weatherconditions)
• Understand what statistical significance means and how to test for it. What does it
mean if a variable is not statistically significant?
• To test for significance at standard levels of significance (e.g., α=0.05) use stars or
p-values (if p-value is less than the level of significance α, then the
variable/coefficient is statistically significant). P-values allow you to test statistical
significance at non-standard levels of significance (e.g., α=0.0375).
• Understand how to build a model equation after you have tested for statistical
significance.
• Make sure you know how to interpret the intercept in a model, the
coefficients/slopes, and the Coefficient of Determination (R2).
• If a variable is not statistically significant, it is a candidate for removal. DO NOT
interpret the associated coefficient, because we do not have enough
evidence (at the specific significance level α) that the coefficient is
different from zero, since we failed to reject the null hypothesis. So what happens
then? The correct thing is to rerun the model without the predictor variables that are
not statistically significant. IF really pressed for time, some business analytics people and
statisticians just write the model with the rest of the coefficients (rest of independent
variables), but this is less preferred.
• Now what happens to the coefficient of determination (R2)? The coefficient of
determination is still valid (for all independent variables in the model, even the ones
that are not statistically significant) IF the associated p-value of the F-test is less than the
significance level α (usually 0.05). So you can still make the statement of
interpretation that X% of the variation in the dependent variable (Y) is explained by
the independent variables (X’s, including the non-significant ones). The only time the
R2 is not interpretable is if ALL independent variables are not statistically significant.
• What is left unexplained is 100% minus the R2.
• Note: If you add additional independent variables to an existing model, R-squared
CAN NEVER decrease, even if the variables added are not significant.
• Incremental impacts on dependent variable (Y) and economic significance have to
do with the sign (increase or decrease) and magnitude (how much) of the associated
coefficients of independent variables. Model fit (how good your model is, how much
of the variation in dependent variable is explained by the independent variables) is
given by the coefficient of determination (R2).
• Sometimes the intercept in a model may not be interpretable (e.g., negative weight,
see the example in Lecture 8, pages 19-21 and Lecture 8 Video/Part 3), but it should
still be included in the model as it helps us build the line of best fit.
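
The regression workflow described above (fit, test significance, rerun without non-significant predictors, read R2) can be sketched as follows. This is a minimal illustration on simulated data, not an example from the lectures: the variables y, x1, and x2, the seed, and the sample size are all assumptions, and the formula uses the `data = d` style rather than the `d$` style shown earlier.

```r
set.seed(262)  # arbitrary seed so the simulated numbers are reproducible

# Simulated (hypothetical) data: y truly depends on x1 but not on x2
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 3 + 2 * x1 + rnorm(n)
d  <- data.frame(y, x1, x2)

# Full model with both predictors
fit <- lm(y ~ x1 + x2, data = d)
s   <- summary(fit)   # printing s shows coefficients, stars, R-squared, F-test

# Significance test: compare each coefficient's p-value with alpha
alpha       <- 0.05
pvals       <- s$coefficients[, "Pr(>|t|)"]
significant <- pvals < alpha   # TRUE means statistically significant at alpha

# Coefficient of determination and the p-value of the overall F-test
r2  <- s$r.squared
f   <- s$fstatistic
f_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

# Rerun without x2 (the predictor that, by construction, has no real effect)
fit_small <- lm(y ~ x1, data = d)
r2_small  <- summary(fit_small)$r.squared

# The larger model's R-squared is never below the smaller model's
r2 >= r2_small

# Estimated intercept and slope for writing the model equation
coef(fit_small)
```

Here x1 comes out as significant because the simulated effect is large relative to the noise. Whether x2 happens to cross the α threshold depends on the random draw, so check its p-value rather than assuming the result.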
