How to preserve the ID field when using RTextTools

I am using RTextTools to train and classify data that comes from a MySQL table. The table has a field called id that identifies each document in the database. However, after running the following code, the id field is no longer present in the output.

matrix <- create_matrix(cbind(data$text, data$id),
                        language="english", removeNumbers=TRUE,
                        removeSparseTerms=.998)

corpus <- create_corpus(matrix,
                        as.numeric(data$valid),
                        trainSize=1:750, testSize=751:1000,
                        virgin=FALSE)

SVM <- train_model(corpus, "SVM")

SVM_CLASSIFY <- classify_model(corpus, SVM)

As stated above, data$id seems to be lost during this process. Any idea how I can keep each ID linked to its document's classification result?
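
One possible approach is to keep the id out of the matrix and reattach it by row position afterwards. As far as I can tell, create_matrix pastes the columns it is given into a single text per row, so the id is treated as a word (and removeNumbers=TRUE would strip it). The sketch below assumes that classify_model returns one row per test document, in the same order as testSize. The toy data frame and the hand-written SVM_CLASSIFY are hypothetical stand-ins for the real objects, so that the snippet runs without RTextTools or the MySQL table.

```r
# Hypothetical stand-in for the data read from the MySQL table.
data <- data.frame(id = c(101, 102, 103, 104),
                   text = c("first doc", "second doc", "third doc", "fourth doc"),
                   stringsAsFactors = FALSE)

# Rows used as the test set (751:1000 in the question).
test_rows <- 3:4

# Hypothetical stand-in for the data frame returned by classify_model();
# assumed to hold one row per test document, in test_rows order.
SVM_CLASSIFY <- data.frame(SVM_LABEL = c("1", "0"),
                           SVM_PROB = c(0.9, 0.7))

# Guard against a mismatch before pairing ids with predictions by position.
stopifnot(nrow(SVM_CLASSIFY) == length(test_rows))

# Reattach the ids of the test documents to their predictions.
results <- cbind(id = data$id[test_rows], SVM_CLASSIFY)
print(results)
```

With the real data, the same last step would be cbind(id = data$id[751:1000], SVM_CLASSIFY), and the matrix would be built from data$text alone. It is worth spot-checking a few rows against the database to confirm the ordering assumption.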