Cheat sheet for prediction and classification models in R

David Smith's picture

Ricky Ho has created a 6-page PDF reference card on Big Data Machine Learning, with examples implemented in the R language. (A free registration to DZone Refcardz is required to download the PDF.) The examples cover:

  • Predictive modeling overview (how to set up test and training sets in R)
  • Linear regression (using lm)
  • Logistic regression (using glm)
  • Regression with regularization (using the glmnet package)
  • Neural networks (using nnet)
  • Support vector machines (using tune.svm from the e1071 package)
  • Naïve Bayes models (using naiveBayes from the e1071 package)
  • K-nearest-neighbors classification (using the knn function from the class package)
  • Decision trees (using rpart)
  • Ensembles of trees (using the randomForest package)
  • Gradient boosting (using the gbm package)
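
To give a flavor of the first three items, here is a minimal sketch in base R. It is not taken from the reference card: the 70/30 split, the random seed, the choice of predictors and all variable names (train_idx, lm_fit, glm_fit and so on) are assumptions made for illustration. It splits the built-in iris data into training and test sets, fits a linear regression with lm, and fits a logistic regression with glm.

```r
# Illustrative sketch only: split ratio, seed, predictors and names are
# arbitrary choices, not the reference card's own code.
set.seed(42)

# Hold out roughly 30% of the rows as a test set
n <- nrow(iris)
train_idx  <- sample(n, size = round(0.7 * n))
iris_train <- iris[train_idx, ]
iris_test  <- iris[-train_idx, ]

# Linear regression (lm): predict sepal length from the other measurements
lm_fit  <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
              data = iris_train)
lm_pred <- predict(lm_fit, newdata = iris_test)
rmse    <- sqrt(mean((iris_test$Sepal.Length - lm_pred)^2))

# Logistic regression (glm): is the flower of species virginica or not?
iris_train$is_virginica <- as.integer(iris_train$Species == "virginica")
iris_test$is_virginica  <- as.integer(iris_test$Species == "virginica")
glm_fit <- glm(is_virginica ~ Sepal.Length + Sepal.Width,
               data = iris_train, family = binomial)

# type = "response" returns probabilities; 0.5 is an arbitrary cutoff
glm_prob  <- predict(glm_fit, newdata = iris_test, type = "response")
glm_class <- as.integer(glm_prob > 0.5)
accuracy  <- mean(glm_class == iris_test$is_virginica)

print(rmse)
print(accuracy)
```

The same pattern of fitting on the training rows and calling predict on the held-out rows carries over to most of the other packages in the list, though the arguments to predict differ from package to package.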


The examples use the traditional built-in R data sets (such as the iris data, used in the neural network example), so unfortunately there's not much of a "big data" aspect to the reference card. But if you're just getting started with prediction and classification models in R, this cheat sheet is a useful guide.

DZone Refcardz:  Big Data Machine Learning Patterns for Predictive Analytics