Manipulating values with commas
I actually have a solution for this problem, but I am curious if there is a better way to do what I was trying to do.
I scraped some data from majorleaguesoccer.com and read it into R using
mls.reg.tmp <- read.table("../data/mls_reg_season_20100812.csv", header = FALSE, sep = ";")
Note that I used sep = ";" because some of the attendance figures were in the thousands on the website, so they contained commas, and I scraped them "as is", e.g.,
In hindsight, I should've just removed the commas in Python in the pre-processing step of this project. I should also point out that there were some text fields in the data set as well.
Given that I want to use the attendance data as a numeric value, I converted using as.numeric and gsub. As an example in a call to ggplot:
ggplot(data = mls.reg.dat, aes(x = as.numeric(gsub(",", "", a_tot)), y = sog)) + geom_point() + facet_wrap(~ team)
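As a minimal sketch, the same gsub/as.numeric conversion could also be done once on the column instead of inside the plotting call. The sample values below are made up for illustration and are not taken from the scraped file; a_num is a hypothetical name.

```r
# Hypothetical sample standing in for the scraped attendance column (a_tot)
a_tot <- c("12,345", "9,876", "1,234,567")

# Strip the thousands separators as a literal match, then convert to numeric
a_num <- as.numeric(gsub(",", "", a_tot, fixed = TRUE))

print(a_num)
```

With the real data, the result could be stored back into the data frame once (for example as a new column), so that later calls can refer to a plain numeric column.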
Question: Is this the most efficient way of working with data such as this? Or is there a specialized function for doing something along these lines?
I'm posting the question here because I spent quite a bit of time (> 30 min) just working out this simple solution and thought that others might benefit from it as well.