Skip to Content

Plotting a shape file with ggplot2 error

Over the last few days I pretty much dove head first into using R for mapping. I have used R extensively for modelling etc. but not this kind of work before. I have some questions and issues regarding shapefiles, how they're read and so on.

I have downloaded the shape files from the Australian Bureau of Statistics there are numerous files with state borders, post codes, cities and so on.
The shape files are massive, the Australian state borders has about 1.8 million coordinate points in it, the other file I tried was the statistical area which has over 8 million in it. I didn't do anything with this file as it is just too big for my R set up.

I read the shape file in with readShapePoly and converted it like so

AUS@data$id = rownames(AUS@data)    
AUS.points = fortify(AUS, region="id")
AUS.df = join(AUS.points, AUS@data, by="id")

Once I had converted the State borders shape file from SpatialPolygonsDataFrame to a regular dataframe I plotted it successfully but it took forever and the detail was too great.
I thought to use thinnedSpatialPoly to simplify it but it gives the error:

Error in stopifnot(length(Sr@polygons) == nrow(data)) :trying to get slot "polygons" from an object of a basic class ("NULL") with no slots

Which google cannot help me with.

My next strategy was to read it into SAS and use proc greduce which takes the file and creates a density field and you can choose how dense the polygons are.

proc mapimport out=states datafile='\Digital Boundaries\States\Shape file\STE_2011_AUST.shp';
id ste_code11;    run;
proc greduce data = states out = reduced_states;
id ste_code11;    run;

SAS has crap graphics and couldn't even plot the thing for me so I exported the dataset and read it back into R with the new density field which I hoping to subset the dataframe by and use in my plots.

My problem now is that when I go to plot in R i get this

ggplot(data=states.df, aes(X, Y, group=SEGMENT)) + 
geom_polygon(colour='black', fill='white') + theme_bw()

I guess it is because the polygons are not in order or have broken? I used this function to try and rejoin my polygons but still no luck

RegroupElements <- function(df, longcol, idcol){
g <- rep(1, length(df[,longcol]))
if (diff(range(df[,longcol])) > 300) { # check if longitude within group differs more                than 300 deg, ie if element was split
d <- df[,longcol] > mean(range(df[,longcol])) # we use the mean to help us separate the   extreme values
g[!d] <- 1 # some marker for parts that stay in place (we cheat here a little, as we do not take into account concave polygons)
g[d] <- 2 # parts that are moved 
g <- paste(df[, idcol], g, sep=".") # attach to id to create unique group variable for   the dataset
df$group.regroup <- g
### Function to close regrouped polygons
# Takes dataframe, checks if 1st and last longitude value are the same, if not, inserts  first as last and reassigns order variable
ClosePolygons <- function(df, longcol, ordercol){
if (df[1,longcol] != df[nrow(df),longcol]) {
tmp <- df[1,]
df <- rbind(df,tmp)
o <- c(1: nrow(df)) # rassign the order variable
df[,ordercol] <- o

So, finally my questions!
How do people deal with large overly detailed shape files?
Why wasn't thinnedspatialpoly working (I'd like to avoid SAS if possible)?
How can I get my plot to not look like crap?

Finally my R specs:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    
attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     
other attached packages:
[1] gridExtra_0.9   gpclib_1.5-1    ggmap_2.1       maptools_0.8-16
[5] lattice_0.20-6  rgeos_0.2-7     plyr_1.7.1      stringr_0.6    
[9] ggplot2_0.9.1   sp_0.9-99       shapefiles_0.6  foreign_0.8-50 
[13] fastshp_0.1-0  
loaded via a namespace (and not attached):
[1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       labeling_0.1      
[5] MASS_7.3-18        memoise_0.1        munsell_0.3        png_0.1-4         
[9] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     RgoogleMaps_1.2.0 
[13] rjson_0.2.8        scales_0.2.1       tools_2.15.1