RTextTools is a machine learning package for automatic text classification that makes it simple for novice users to get started with machine learning, while allowing experienced users to easily experiment with different settings and algorithm combinations. The package includes nine algorithms for ensemble classification (svm, slda, boosting, bagging, random forests, glmnet, decision trees, neural networks, maximum entropy), comprehensive analytics, and thorough documentation.


maxent is an R package with tools for low-memory multinomial logistic regression, also known as maximum entropy. The focus of this maximum entropy classifier is to minimize memory consumption on very large datasets, particularly sparse document-term matrices represented by the tm package. The classifier is based on an efficient C++ implementation written by Dr. Yoshimasa Tsuruoka.


Computationally efficient procedures for regularized estimation with the semiparametric additive hazards regression model.


Regression modeling using rules with added instance-based corrections.


A fast L1-regularized regression (Lasso) solver using the cyclic coordinate descent algorithm, also known as Lasso shooting. This implementation can choose which coefficients to penalize. It supports coefficient-specific penalties and can take X'X and X'y instead of X and y.
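
As a rough illustration of the technique named above (not the package's R code or API), the following Python sketch runs cyclic coordinate descent directly on X'X and X'y, with a separate penalty for each coefficient. The function names `soft_threshold` and `lasso_shooting`, the objective scaling (1/2)||y - Xb||^2 + sum_j lambda_j |b_j|, and the stopping rule are all assumptions made for this example.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_shooting(xtx, xty, penalties, max_iter=1000, tol=1e-10):
    """Cyclic coordinate descent for the Lasso using only X'X and X'y.

    Assumed objective: (1/2)||y - Xb||^2 + sum_j penalties[j] * |b_j|.
    xtx is a p-by-p list of lists, xty and penalties are lists of length p.
    A penalty of 0 leaves that coefficient unpenalized.
    Assumes every diagonal entry of xtx is strictly positive.
    """
    p = len(xty)
    beta = [0.0] * p
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(p):
            # Correlation of column j with the residual that excludes b_j.
            rho = xty[j] - sum(xtx[j][k] * beta[k] for k in range(p) if k != j)
            new_value = soft_threshold(rho, penalties[j]) / xtx[j][j]
            max_change = max(max_change, abs(new_value - beta[j]))
            beta[j] = new_value
        # Stop once a full sweep changes no coefficient by more than tol.
        if max_change < tol:
            break
    return beta


# Orthogonal design: each coefficient is solved in a single sweep.
print(lasso_shooting([[2.0, 0.0], [0.0, 4.0]], [3.0, 1.0], [1.0, 2.0]))
# prints [1.0, 0.0]
```

Working from X'X and X'y means the per-sweep cost depends only on the number of coefficients, not on the number of observations, which is one plausible reason a solver would accept these inputs in place of X and y.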


This package facilitates the use of data mining algorithms in classification and regression tasks by presenting a short and coherent set of functions. While several DM algorithms can be used, it is particularly suited for Neural Networks (NN) and Support Vector Machines (SVM). Versions: 1.3.1 - minor corrections; 1.3 - new classification and regression metrics (improved mmetric function); 1.2 - new input importance methods (improved Importance function); 1.1 - minor error corrections; 1.0 - first version.


The Stuttgart Neural Network Simulator (SNNS) is a library containing many standard implementations of neural networks. This package wraps the SNNS functionality to make it available from within R. Using the RSNNS low-level interface, all of the algorithmic functionality and flexibility of SNNS can be accessed. Furthermore, the package contains a convenient high-level interface, so that the most common neural network topologies and learning algorithms integrate seamlessly into R.


Two classification ensemble methods based on logic regression models. Logforest uses a bagging approach to construct an ensemble of logic regression models. LBoost uses a combination of boosting and cross-validation to construct an ensemble of logic regression models. Both methods are used for classification of binary responses based on binary predictors and for identification of important variables and variable interactions predictive of a binary outcome.


RGP is a simple modular Genetic Programming (GP) system built in pure R. In addition to general GP tasks, the system supports Symbolic Regression by GP through the familiar R model formula interface. GP individuals are represented as R expressions, and an optional type system enables domain-specific function sets containing functions of diverse domain and range types. A basic set of genetic operators for variation (mutation and crossover) and selection is provided.


Functions to perform dimensionality reduction for classification if the covariance matrices of the classes are unequal.