# Practical Machine Learning Quiz

## Quiz 1

1.
Question 1
Which of the following are components in building a machine learning algorithm?

1 point

• Machine learning
• Statistical inference
• Training and test sets
• Artificial intelligence

Question 1

Which of the following are components in building a machine learning algorithm?

Artificial intelligence
Machine learning
Training and test sets
Creating features.
Statistical inference

2.
Question 2
Suppose we build a prediction algorithm on a data set and it is 100% accurate on that data set. Why might the algorithm not work well if we collect a new data set?

1 point

• We have too few predictors to get good out of sample accuracy.
• We may be using a bad algorithm that doesn’t predict well on this kind of data.
• Our algorithm may be overfitting the training data, predicting both the signal and the noise.
• We have used neural networks, which have notoriously bad performance.
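The overfitting option can be illustrated with a small base R simulation. This is a minimal sketch under assumptions that are not part of the quiz: the data are simulated, `make_data` and `predict_1nn` are hypothetical helper names, and a 1-nearest-neighbour rule stands in for any algorithm flexible enough to memorize its training set.

```r
set.seed(42)

# Simulated binary outcome: signal from x plus random noise (hypothetical data).
make_data <- function(n) {
  x <- runif(n)
  y <- as.integer(x + rnorm(n, sd = 0.3) > 0.5)
  data.frame(x = x, y = y)
}

train <- make_data(200)
test  <- make_data(200)

# 1-nearest-neighbour: copy the label of the closest training point.
predict_1nn <- function(x_new, train) {
  sapply(x_new, function(x0) train$y[which.min(abs(train$x - x0))])
}

# On the training set every point is its own nearest neighbour,
# so the noise is reproduced along with the signal.
train_acc <- mean(predict_1nn(train$x, train) == train$y)
test_acc  <- mean(predict_1nn(test$x, train) == test$y)

cat("Training accuracy:", train_acc, "\n")
cat("Test accuracy:    ", test_acc, "\n")
```

The training accuracy is perfect by construction, while the accuracy on the fresh sample is lower because the memorized noise does not carry over.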

3.
Question 3
What are typical sizes for the training and test sets?

1 point

• 90% training set, 10% test set
• 10% test set, 90% training set
• 60% in the training set, 40% in the testing set.
• 50% training set, 50% test set

Question 3

What are typical sizes for the training and test sets?

90% training set, 10% test set
0% training set, 100% test set.
50% in the training set, 50% in the testing set.
80% training set, 20% test set
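For illustration, a random split into training and test sets can be sketched in base R. The data frame `dat`, its columns, and the 60/40 proportion (one of the options listed above) are assumptions chosen for the example.

```r
set.seed(2024)

# Hypothetical data set with 100 observations.
n <- 100
dat <- data.frame(id = seq_len(n), x = rnorm(n))

# Draw 60% of the row indices at random for training;
# the remaining rows form the test set.
train_idx <- sample(seq_len(n), size = round(0.6 * n))
training  <- dat[train_idx, ]
testing   <- dat[-train_idx, ]

cat("Training rows:", nrow(training), "\n")
cat("Testing rows: ", nrow(testing), "\n")
```

Negative indexing guarantees that no row ends up in both sets.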

4.
Question 4
What are some common error rates for predicting binary variables (i.e. variables with two possible values like yes/no, disease/normal, clicked/didn’t click)? Check the correct answer(s).

1 point

• R^2
• Root mean squared error
• Correlation
• Sensitivity
• Median absolute deviation

Question 4

What are some common error rates for predicting binary variables (i.e. variables with two possible values like yes/no, disease/normal, clicked/didn’t click)? Check the correct answer(s).

Predictive value of a positive
Correlation
Root mean squared error
Median absolute deviation
R^2
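Error measures for a binary outcome are usually read off the 2x2 table of predictions against truth. The sketch below computes sensitivity, specificity, and the predictive value of a positive from two made-up vectors; `truth` and `pred` are hypothetical values, not data from the course.

```r
# Hypothetical true labels and predictions (1 = positive, 0 = negative).
truth <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
pred  <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Cells of the confusion matrix.
tp <- sum(pred == 1 & truth == 1)  # true positives
fn <- sum(pred == 0 & truth == 1)  # false negatives
fp <- sum(pred == 1 & truth == 0)  # false positives
tn <- sum(pred == 0 & truth == 0)  # true negatives

sensitivity <- tp / (tp + fn)  # P(predicted positive | truly positive)
specificity <- tn / (tn + fp)  # P(predicted negative | truly negative)
ppv         <- tp / (tp + fp)  # P(truly positive | predicted positive)

print(c(sensitivity = sensitivity, specificity = specificity, ppv = ppv))
```

Measures such as root mean squared error or R^2 are, by contrast, typically reported for continuous outcomes.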

5.
Question 5
Suppose that we have created a machine learning algorithm that predicts whether a link will be clicked with 99% sensitivity and 99% specificity. The rate the link is clicked is 1/1000 of visits to a website. If we predict the link will be clicked on a specific visit, what is the probability it will actually be clicked?

1 point

• 50%
• 99%
• 9%
• 99.9%
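The probability asked for here is the predictive value of a positive, which follows from Bayes' rule. A short sketch of the arithmetic, using only the numbers given in the question (the variable names are illustrative):

```r
sensitivity <- 0.99       # P(predict click | click)
specificity <- 0.99       # P(predict no click | no click)
prevalence  <- 1 / 1000   # P(click)

# Joint probabilities of a predicted click.
p_true_pos  <- sensitivity * prevalence
p_false_pos <- (1 - specificity) * (1 - prevalence)

# Bayes' rule: P(click | predict click).
ppv <- p_true_pos / (p_true_pos + p_false_pos)

cat(sprintf("P(clicked | predicted click) = %.1f%%\n", 100 * ppv))
```

Because clicks are rare, false positives outnumber true positives roughly ten to one, so the result is about 9% even though sensitivity and specificity are both 99%.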

## Quiz 2

1.
Question 1
Load the Alzheimer’s disease data using the commands:

    library(AppliedPredictiveModeling)
    data(AlzheimerDisease)

Which of the following commands will create non-overlapping training and test sets with about 50% of the observations assigned to each?

1 point

testIndex = createDataPartition(diagnosis, p = 0.50,list=FALSE)
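To make "non-overlapping" concrete, the idea of a 50/50 split stratified by outcome can be sketched in base R. This is not caret's implementation of createDataPartition(); the `diagnosis` factor below is simulated and the class sizes are assumptions.

```r
set.seed(123)

# Hypothetical outcome with unbalanced classes.
diagnosis <- factor(rep(c("Impaired", "Control"), times = c(30, 70)))

# Sample half of the row indices within each class.
idx_by_class <- split(seq_along(diagnosis), diagnosis)
train_idx <- sort(unlist(
  lapply(idx_by_class, function(i) sample(i, size = ceiling(length(i) / 2))),
  use.names = FALSE
))

# Every row not selected for training goes to the test set.
test_idx <- setdiff(seq_along(diagnosis), train_idx)

print(table(diagnosis[train_idx]))
print(table(diagnosis[test_idx]))
```

The key point is that the test indices are defined as the complement of the training indices, so the two sets cannot share an observation.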

2.
Question 2
Load the cement data using the commands:

    library(AppliedPredictiveModeling)
    data(concrete)
    library(caret)
    set.seed(1000)
    inTrain = createDataPartition(mixtures$CompressiveStrength, p = 3/4)[[1]]
    training = mixtures[ inTrain,]
    testing = mixtures[-inTrain,]

Make a plot of the outcome (CompressiveStrength) versus the index of the samples. Color by each of the variables in the data set (you may find the cut2() function in the Hmisc package useful for turning continuous covariates into factors). What do you notice in these plots?

1 point

There is a non-random pattern in the plot of the outcome versus index that is perfectly explained by the Age variable so there may be a variable missing.

There is a non-random pattern in the plot of the outcome versus index that is perfectly explained by the FlyAsh variable.

The outcome variable is highly correlated with FlyAsh.

There is a non-random pattern in the plot of the outcome versus index that does not appear to be perfectly explained by any predictor suggesting a variable may be missing.

3.
Question 3
Load the cement data using the commands:

    library(AppliedPredictiveModeling)
    data(concrete)
    library(caret)
    set.seed(1000)
    inTrain = createDataPartition(mixtures$CompressiveStrength, p = 3/4)[[1]]
    training = mixtures[ inTrain,]
    testing = mixtures[-inTrain,]

Make a histogram and confirm the SuperPlasticizer variable is skewed. Normally you might use the log transform to try to make the data more symmetric. Why would that be a poor choice for this variable?

1 point

The log transform does not reduce the skewness of the non-zero values of SuperPlasticizer

The log transform produces negative values which can not be used by some classifiers.

The SuperPlasticizer data include negative values so the log transform can not be performed.

There are values of zero so when you take the log() transform those values will be -Inf.

Question 3

Make a histogram and confirm the SuperPlasticizer variable is skewed. Normally you might use the log transform to try to make the data more symmetric. Why would that be a poor choice for this variable?

There are a large number of values that are the same and even if you took the log(SuperPlasticizer + 1) they would still all be identical so the distribution would not be symmetric.
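The behaviour of the log transform on zeros can be checked directly. The vector `x` below is a hypothetical stand-in for a skewed variable with many zeros, not the actual SuperPlasticizer values.

```r
# Hypothetical skewed variable with several exact zeros.
x <- c(0, 0, 0, 0.5, 2, 10)

log_x   <- log(x)       # zeros become -Inf
log1p_x <- log(x + 1)   # zeros become 0, but they are still all tied

print(log_x)
print(log1p_x)
```

Adding 1 before taking the log avoids the infinite values, but the block of identical observations remains a single spike in the histogram.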

4.
Question 4
Load the Alzheimer’s disease data using the commands:

    library(caret)
    library(AppliedPredictiveModeling)
    set.seed(3433)
    data(AlzheimerDisease)

Find all the predictor variables in the training set that begin with IL. Perform principal components on these variables with the preProcess() function from the caret package. Calculate the number of principal components needed to capture 90% of the variance. How many are there?

1 point

12

9

5

10
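The counting step can be sketched with base R's prcomp() on simulated data. This is an illustration of the calculation, not the caret preProcess() call the question asks for; the matrix `X`, its `IL_` column names, and the two-factor structure are assumptions.

```r
set.seed(7)

# Simulated predictors: six columns driven by two latent factors plus noise.
n <- 100
latent   <- matrix(rnorm(n * 2), ncol = 2)
loadings <- matrix(runif(2 * 6, -1, 1), nrow = 2)
X <- latent %*% loadings + matrix(rnorm(n * 6, sd = 0.3), ncol = 6)
colnames(X) <- paste0("IL_", 1:6)

# PCA on centered and scaled predictors.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Proportion of variance per component and its running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative variance reaches 90%.
n_comp <- which(cum_var >= 0.90)[1]

print(round(cum_var, 3))
cat("Components needed for 90% of the variance:", n_comp, "\n")
```

In caret, the corresponding threshold is set through the `thresh` argument of preProcess() when `method = "pca"`.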

5.
Question 5
Load the Alzheimer’s disease data using the commands:

    library(caret)
    library(AppliedPredictiveModeling)
    set.seed(3433)
    data(AlzheimerDisease)

Create a training data set consisting of only the predictors with variable names beginning with IL and the diagnosis. Build two predictive models, one using the predictors as they are and one using PCA with principal components explaining 80% of the variance in the predictors. Use method="glm" in the train function.

What is the accuracy of each method in the test set? Which is more accurate?

1 point

• Non-PCA Accuracy: 0.65; PCA Accuracy: 0.72
• Non-PCA Accuracy: 0.74; PCA Accuracy: 0.74
• Non-PCA Accuracy: 0.72; PCA Accuracy: 0.71
• Non-PCA Accuracy: 0.72; PCA Accuracy: 0.65

## Quiz 3

1.
Question 1
For this quiz we will be using several R packages. R package versions change over time; the right answers have been checked using the following versions of the packages.

AppliedPredictiveModeling: v1.1.6

caret: v6.0.47

ElemStatLearn: v2012.04-0

pgmm: v1.1

rpart: v4.1.8

If you aren’t using these versions of the packages, your answers may not exactly match the right answer, but hopefully should be close.

Load the cell segmentation data from the AppliedPredictiveModeling package using the commands:

    library(AppliedPredictiveModeling)
    data(segmentationOriginal)
    library(caret)

1. Subset the data to a training set and testing set based on the Case variable in the data set.

2. Set the seed to 125 and fit a CART model with the rpart method using all predictor variables and default caret settings.

3. In the final model what would be the final model prediction for cases with the following variable values:

a. TotalIntenCh2 = 23,000; FiberWidthCh1 = 10; PerimStatusCh1 = 2

b. TotalIntenCh2 = 50,000; FiberWidthCh1 = 10; VarIntenCh4 = 100

c. TotalIntenCh2 = 57,000; FiberWidthCh1 = 8; VarIntenCh4 = 100

d. FiberWidthCh1 = 8; VarIntenCh4 = 100; PerimStatusCh1 = 2

1 point

• a. PS; b. WS; c. PS; d. Not possible to predict
• a. PS; b. WS; c. PS; d. WS
• a. PS; b. Not possible to predict; c. PS; d. Not possible to predict
• a. WS; b. WS; c. PS; d. Not possible to predict

2.
Question 2
If K is small in a K-fold cross validation, is the bias in the estimate of out-of-sample (test set) accuracy smaller or bigger? If K is small, is the variance in the estimate of out-of-sample (test set) accuracy smaller or bigger? Is K large or small in leave one out cross validation?

1 point

The bias is larger and the variance is smaller. Under leave one out cross validation K is equal to one.

The bias is larger and the variance is smaller. Under leave one out cross validation K is equal to the sample size.

The bias is smaller and the variance is bigger. Under leave one out cross validation K is equal to one.

The bias is smaller and the variance is smaller. Under leave one out cross validation K is equal to the sample size.
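The mechanics of K-fold cross validation, and how leave one out fits into it, can be sketched in base R. The data are simulated, and `cv_mse` is a hypothetical helper written for this example rather than a function from any package used in the course.

```r
set.seed(11)

# Simulated regression data.
n <- 60
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# K-fold cross validation estimate of mean squared error for lm(y ~ x).
cv_mse <- function(dat, k) {
  n <- nrow(dat)
  # Assign each row to one of k folds at random.
  folds <- sample(rep(seq_len(k), length.out = n))
  fold_mse <- sapply(seq_len(k), function(i) {
    fit      <- lm(y ~ x, data = dat[folds != i, ])
    held_out <- dat[folds == i, ]
    mean((held_out$y - predict(fit, newdata = held_out))^2)
  })
  mean(fold_mse)
}

mse_5fold <- cv_mse(dat, k = 5)          # each model sees 80% of the data
mse_loocv <- cv_mse(dat, k = nrow(dat))  # each model sees all but one row

cat("5-fold CV MSE:", mse_5fold, "\n")
cat("LOOCV MSE:    ", mse_loocv, "\n")
```

Setting `k` to the number of rows makes every fold a single observation, which is exactly leave one out cross validation.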

3.
Question 3
Load the olive oil data using the commands:

    library(pgmm)
    data(olive)
    olive = olive[,-1]

(NOTE: If you have trouble installing the pgmm package, you can download the olive dataset here: olive_data.zip. After unzipping the archive, you can load the file using the load() function in R.)

These data contain information on 572 different Italian olive oils from multiple regions in Italy. Fit a classification tree where Area is the outcome variable. Then predict the value of area for the following data frame using the tree command with all defaults.

    newdata = as.data.frame(t(colMeans(olive)))

What is the resulting prediction? Is the resulting prediction strange? Why or why not?

1 point

0.005291005 0 0.994709 0 0 0 0 0 0. There is no reason why the result is strange.

2.783. It is strange because Area should be a qualitative variable – but tree is reporting the average value of Area as a numeric variable in the leaf predicted for newdata

0.005291005 0 0.994709 0 0 0 0 0 0. The result is strange because Area is a numeric variable and we should get the average within each leaf.

4.59965. There is no reason why the result is strange.
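The difference between treating a coded category as numeric and treating it as a factor can be shown without fitting a tree. The values in `area_in_leaf` are made up to represent the area codes of the observations that fall into one leaf; the nine levels are an assumption based on the nine values in the options above.

```r
# Hypothetical area codes of the observations in a single leaf.
area_in_leaf <- c(2, 3, 3, 3, 3)

# Numeric outcome: a regression leaf predicts the average code.
numeric_prediction <- mean(area_in_leaf)

# Factor outcome: a classification leaf predicts class proportions
# and the most frequent class.
class_counts     <- table(factor(area_in_leaf, levels = 1:9))
class_probs      <- class_counts / sum(class_counts)
class_prediction <- names(which.max(class_counts))

cat("Average of codes:", numeric_prediction, "\n")
cat("Most frequent class:", class_prediction, "\n")
print(round(class_probs, 2))
```

An average such as 2.8 is not a valid area label, which is why a numeric prediction for a qualitative variable looks odd.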

4.
Question 4
Load the South Africa Heart Disease Data and create training and test sets with the following code:

    library(ElemStatLearn)
    data(SAheart)
    set.seed(8484)
    train = sample(1:dim(SAheart)[1],size=dim(SAheart)[1]/2,replace=F)
    trainSA = SAheart[train,]
    testSA = SAheart[-train,]

Then set the seed to 13234 and fit a logistic regression model (method="glm", be sure to specify family="binomial") with Coronary Heart Disease (chd) as the outcome and age at onset, current alcohol consumption, obesity levels, cumulative tobacco, type-A behavior, and low density lipoprotein cholesterol as predictors. Calculate the misclassification rate for your model using this function and a prediction on the “response” scale:

    missClass = function(values,prediction){sum(((prediction > 0.5)*1) != values)/length(values)}

What is the misclassification rate on the training set? What is the misclassification rate on the test set?

1 point

• Test Set Misclassification: 0.38; Training Set: 0.25
• Test Set Misclassification: 0.31; Training Set: 0.27
• Test Set Misclassification: 0.35; Training Set: 0.31
• Test Set Misclassification: 0.32; Training Set: 0.30
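How the missClass function behaves can be seen on simulated data with base R's glm(). This sketch does not use the SAheart data or caret's train(); the data frame `sim`, its single predictor, and the 100/100 split are assumptions made for the example.

```r
# Misclassification rate at a 0.5 probability cutoff (function from the quiz).
missClass <- function(values, prediction) {
  sum(((prediction > 0.5) * 1) != values) / length(values)
}

# Toy check: predictions 0.9, 0.2, 0.4, 0.6 become 1, 0, 0, 1,
# so two of the four labels are wrong.
toy_err <- missClass(c(1, 0, 1, 0), c(0.9, 0.2, 0.4, 0.6))

set.seed(99)

# Simulated binary outcome with one predictor.
n <- 200
x <- rnorm(n)
sim <- data.frame(x = x, y = rbinom(n, 1, plogis(0.5 + x)))
train_sim <- sim[1:100, ]
test_sim  <- sim[101:200, ]

# Logistic regression fitted on the training half only.
fit <- glm(y ~ x, family = "binomial", data = train_sim)

# Predictions on the "response" scale are probabilities between 0 and 1.
train_err <- missClass(train_sim$y, predict(fit, type = "response"))
test_err  <- missClass(test_sim$y, predict(fit, newdata = test_sim, type = "response"))

cat("Toy error:     ", toy_err, "\n")
cat("Training error:", train_err, "\n")
cat("Test error:    ", test_err, "\n")
```

Predicting on the link scale instead would give log-odds, for which the 0.5 cutoff inside missClass would no longer be appropriate.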

5.
Question 5
Load the vowel.train and vowel.test data sets:

    library(ElemStatLearn)
    data(vowel.train)
    data(vowel.test)

Set the variable y to be a factor variable in both the training and test set. Then set the seed to 33833. Fit a random forest predictor relating the factor variable y to the remaining variables. Read about variable importance in random forests here: http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#ooberr The caret package uses by default the Gini importance.

Calculate the variable importance using the varImp function in the caret package. What is the order of variable importance?

[NOTE: Use randomForest() specifically, not caret, as there have been some issues reported with that approach. 11/6/2016]

1 point

• The order of the variables is: x.10, x.7, x.9, x.5, x.8, x.4, x.6, x.3, x.1, x.2
• The order of the variables is: x.2, x.1, x.5, x.6, x.8, x.4, x.9, x.3, x.7, x.10
• The order of the variables is: x.10, x.7, x.5, x.6, x.8, x.4, x.9, x.3, x.1, x.2
• The order of the variables is: x.1, x.2, x.3, x.8, x.6, x.4, x.5, x.9, x.7, x.10

## Quiz 4

1.
Question 1
For this quiz we will be using several R packages. R package versions change over time; the right answers have been checked using the following versions of the packages.

AppliedPredictiveModeling: v1.1.6

caret: v6.0.47

ElemStatLearn: v2012.04-0

pgmm: v1.1

rpart: v4.1.8

gbm: v2.1

lubridate: v1.3.3

forecast: v5.6

e1071: v1.6.4

If you aren’t using these versions of the packages, your answers may not exactly match the right answer, but hopefully should be close.

Load the vowel.train and vowel.test data sets:

    library(ElemStatLearn)
    data(vowel.train)
    data(vowel.test)

Set the variable y to be a factor variable in both the training and test set. Then set the seed to 33833. Fit (1) a random forest predictor relating the factor variable y to the remaining variables and (2) a boosted predictor using the “gbm” method. Fit these both with the train() command in the caret package.

What are the accuracies for the two approaches on the test data set? What is the accuracy among the test set samples where the two methods agree?

1 point

• RF Accuracy = 0.6082; GBM Accuracy = 0.5152; Agreement Accuracy = 0.5152
• RF Accuracy = 0.6082; GBM Accuracy = 0.5152; Agreement Accuracy = 0.6361
• RF Accuracy = 0.9881; GBM Accuracy = 0.8371; Agreement Accuracy = 0.9983
• RF Accuracy = 0.9987; GBM Accuracy = 0.5152; Agreement Accuracy = 0.9985
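"Agreement accuracy" means accuracy computed only on the test samples where both models give the same prediction. A small sketch with made-up label vectors (`truth`, `pred_rf`, and `pred_gbm` are hypothetical, not output from fitted models):

```r
# Hypothetical true labels and predictions from two models.
truth    <- c("a", "b", "c", "a", "b", "c")
pred_rf  <- c("a", "b", "a", "a", "c", "c")
pred_gbm <- c("a", "b", "a", "b", "b", "c")

# Accuracy of each model on all test samples.
acc_rf  <- mean(pred_rf == truth)
acc_gbm <- mean(pred_gbm == truth)

# Restrict to the samples where the two models agree,
# then score the shared prediction against the truth.
agree     <- pred_rf == pred_gbm
acc_agree <- mean(pred_rf[agree] == truth[agree])

cat("RF accuracy:       ", acc_rf, "\n")
cat("GBM accuracy:      ", acc_gbm, "\n")
cat("Agreement accuracy:", acc_agree, "\n")
```

Here the models agree on four of the six samples and are correct on three of those four.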

2.
Question 2
Load the Alzheimer’s data using the following commands

    library(caret)
    library(gbm)
    set.seed(3433)
    library(AppliedPredictiveModeling)
    data(AlzheimerDisease)

Set the seed to 62433 and predict diagnosis with all the other variables using a random forest (“rf”), boosted trees (“gbm”) and linear discriminant analysis (“lda”) model. Stack the predictions together using random forests (“rf”). What is the resulting accuracy on the test set? Is it better or worse than each of the individual predictions?

1 point

Stacked Accuracy: 0.76 is better than random forests and boosting, but not lda.

Stacked Accuracy: 0.80 is better than random forests and lda and the same as boosting.

Stacked Accuracy: 0.93 is better than all three other methods

Stacked Accuracy: 0.88 is better than all three other methods

3.
Question 3
Load the concrete data with the commands:

    set.seed(3523)
    library(AppliedPredictiveModeling)
    library(caret)  # provides createDataPartition()
    data(concrete)
    inTrain = createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
    training = concrete[ inTrain,]

Set the seed to 233 and fit a lasso model to predict Compressive Strength. Which variable is the last coefficient to be set to zero as the penalty increases? (Hint: it may be useful to look up ?plot.enet).

1 point

CoarseAggregate

Cement

BlastFurnaceSlag

FineAggregate

4.
Question 4
Load the data on the number of visitors to the instructor’s blog from here:

Using the commands:

    library(lubridate) # For year() function below
    training = dat[year(dat$date) < 2012,]
    testing = dat[(year(dat$date)) > 2011,]
    tstrain = ts(training$visitsTumblr)

Fit a model using the bats() function in the forecast package to the training time series. Then forecast this model for the remaining time points. For how many of the testing points is the true value within the 95% prediction interval bounds?

1 point

96%

93%

95%

94%
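The coverage calculation itself is a simple comparison of the observed values with the interval bounds. The vectors below are made up; with the forecast package the bounds would typically come from the `lower` and `upper` components of the forecast object, which is stated here as an assumption rather than shown.

```r
# Hypothetical observed values and 95% prediction interval bounds.
actual  <- c(10, 12, 15, 30, 11)
lower95 <- c(8, 9, 10, 11, 12)
upper95 <- c(14, 15, 16, 17, 18)

# TRUE where the observed value lies inside its interval.
inside <- actual >= lower95 & actual <= upper95

# Share of test points covered by the interval.
coverage <- mean(inside)

cat(sprintf("Coverage: %.0f%%\n", 100 * coverage))
```

In this toy example three of the five points fall inside their intervals.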

5.
Question 5
Load the concrete data with the commands:

    set.seed(3523)
    library(AppliedPredictiveModeling)
    library(caret)  # provides createDataPartition()
    data(concrete)
    inTrain = createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
    training = concrete[ inTrain,]

Set the seed to 325 and fit a support vector machine using the e1071 package to predict Compressive Strength using the default settings. Predict on the testing set. What is the RMSE?

1 point

6.93

6.72

107.44

35.59
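RMSE is the square root of the average squared difference between predictions and observed values. A minimal sketch with a hypothetical helper `rmse` and made-up numbers:

```r
# Root mean squared error between predictions and observed values.
rmse <- function(predicted, observed) {
  sqrt(mean((predicted - observed)^2))
}

# Hypothetical predictions and observations.
predicted <- c(1, 2, 3)
observed  <- c(1, 2, 5)

example_rmse <- rmse(predicted, observed)
cat("RMSE:", example_rmse, "\n")
```

With a fitted model, `predicted` would be the output of predict() on the testing set and `observed` the corresponding CompressiveStrength values.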

### Peer-graded Assignment: Prediction Assignment Writeup

https://github.com/sabhi27/Coursera-Assignment-Week-4—–Practical-Machine-Learning

### Course Project Prediction Quiz

1.
Question 1
Prediction for case 1

1 point

A

B

C

D

E

2.
Question 2
Prediction for case 2

1 point

A

B

C

D

E

3.
Question 3
Prediction for case 3

1 point

A

B

C

D

E

4.
Question 4
Prediction for case 4

1 point

A

B

C

D

E

5.
Question 5
Prediction for case 5

1 point

A

B

C

D

E

6.
Question 6
Prediction for case 6

1 point

A

B

C

D

E

7.
Question 7
Prediction for case 7

1 point

A

B

C

D

E

8.
Question 8
Prediction for case 8

1 point

A

B

C

D

E

9.
Question 9
Prediction for case 9

1 point

A

B

C

D

E

10.
Question 10
Prediction for case 10

1 point

A

B

C

D

E

11.
Question 11
Prediction for case 11

1 point

A

B

C

D

E

12.
Question 12
Prediction for case 12

1 point

A

B

C

D

E

13.
Question 13
Prediction for case 13

1 point

A

B

C

D

E

14.
Question 14
Prediction for case 14

1 point

A

B

C

D

E

15.
Question 15
Prediction for case 15

1 point

A

B

C

D

E

16.
Question 16
Prediction for case 16

1 point

A

B

C

D

E

17.
Question 17
Prediction for case 17

1 point

A

B

C

D

E

18.
Question 18
Prediction for case 18

1 point

A

B

C

D

E

19.
Question 19
Prediction for case 19

1 point

A

B

C

D

E

20.
Question 20
Prediction for case 20

1 point

A

B

C

D

E