fromJSON {RJSONIO}
Description
This function and its methods read content in JSON format and de-serializes it into R objects. JSON content is made up of logicals, integers, real numbers, strings, arrays of these and associative arrays/hash tables using key: value pairs. These map very naturally to R data types (logical, integer, numeric, character, and named lists).
Usage
fromJSON(content, handler = NULL,
default.size = 100, depth = 150L, allowComments = TRUE,
asText = isContent(content), data = NULL,
maxChar = c(0L, nchar(content)), simplify = Strict,
nullValue = NULL, simplifyWithNames = TRUE,
encoding = NA_character_, stringFun = NULL, ...)
Arguments
- content
- the JSON content. This can be the name of a file or the content itself as a character string. We will add support for connections in the near future.
- handler
- an R object that is responsible for processing each individual token/element within the JSON content. By default, this is
NULLand we use the fast libjson parsing approach. Unless you want to customize the processing of the nodes in the tree, useNULL. This can be an R function, a list of functions with class"JSONParserHandler"havingupdateandvalueelements, or the address of a native (C) routine. In the case of the latter, thedataparameter can be used to specify an object that is passed to the C routine each time it is called. This will commonly be anexternalptrobject. - default.size
- a number giving the default buffer size to use for arrays and objects in an effort to avoid reallocating each time we add a new element.
- depth
- the maximum number of nested JSON levels, i.e. arrays and objects within arrays and objects.
- allowComments
- a logical value indicating whether to allow C-style comments within the JSON content or to raise an error if they are encountered.
- asText
- a logical value indicating whether the value of the
contentargument should be treated as the JSON content, i.e. read directly rather than considered the name of a file. - data
- a value that is only used when the value of
handleris a native (C) routine. In this case, the value is passed in each call to that C routine by the JSON tokenizer. - maxChar
- an integer vector of length 2 giving the start and end offsets in the character string to be processed. This allows the caller to specify a subset of the string to process without explicitly having to make a copy of the substring.
- simplify
- either a logical value or a number, e.g. the value of the variable
Strict(the default). This controls whether we attempt to collapse collections/arrays of homogeneous scalar elements to R vectors. If this isFALSE, no effort to combine scalars is made and they remain as separate list elements. If this isTRUE, then logicals, numbers and strings are collapsed to their common types in the same manner asc. The valueStrictdoes attempt to collapse collections of scalars but only if they are all of the same type, i.e. all strings, all numbers or all logicals. If we want to collapse numbers, but not logicals or characters, we can useStrictNumeric. Similarly, to collapse logicals but not numeric or character collections, we useStrictLogical. And, to collapse only character collections, we useStrictCharacter. If we want to collapse two types but not a third, we add the two values, e.g.StrictLogical + StrictNumeric, or pass them as a vectorc(StrictLogical, StrictNumeric).Strictis merely the combination of all 3 of the individual strict variables. Currently this is only implemented when the caller does not provide a handler and in the C code. - nullValue
- an R value that is used when we encounter a JSON
nullvalue in the JSON content. This can be used to mapnullto something more R-like such asNA. This can be an arbitrary R object. - simplifyWithNames
- a logical value that controls whether we attempt to collapse collections if the elements have names in the JSON content, i.e. a dictionary/associative array. If this is
TRUE, then we consider collapsing according to the value ofsimplify. If this isFALSE, if the collection has names, we do not attempt to simplify. - encoding
- the encoding for the content. This is used to ensure the encoding of any resulting strings/character vectors have this encoding. The default for this value is to use the same encoding as the input content.
- ...
- additional parameters for methods.
- stringFun
- an R function or a compiled routine (by address or name). The purpose of this is to process every string as it is encountered in the JSON content and to either convert return it as-is, or to convert it to a suitable R value. This, for example, might convert strings of the form "/new Date(2313213)/" or "/Date(12312312)/". The result is placed in the R object being generated from the JSON content where the original string would appear. So this allows us to handle strings with a special meaning.
If this is an R function, it is passed a single argument - the value of the string - and it can return that or any other R object, presumably derived from that original string. If a compiled routine is specified, it can be one of two types. Both take a simple C string. The default type returns a
SEXP, i.e. an R object. If the class ofstringFunis eitherAsIsorNativeStringRoutine, then that routine must return a C string, i.e. a char *. This will then be converted to an R character vector of length 1, using the default encoding given byencoding.
Values
An R object created by mapping the JSON content to its R equivalent.
References
See Also
toJSON the non-exported collector function {RJSONIO:::basicJSONHandler}.
Examples
fromJSON(I(toJSON(1:10))) fromJSON(I(toJSON(1:10 + .5))) fromJSON(I(toJSON(c(TRUE, FALSE, FALSE, TRUE)))) x = fromJSON('{"ok":true,"id":"x123","rev":"1-1794908527"}') # Reading from a connection. It is a text connection so we could # just read the text directly, but this could be a dynamic connection. m = matrix(1:27, 9, 3) txt = toJSON(m) con = textConnection(txt) identical(m, fromJSON(con)) # not true! fromJSON() returns just a list. # Use a connection and move the cursor ahead to skip over some lines. f = system.file("sampleData", "obj1.json", package = "RJSONIO") con = file(f, "r") readLines(con, 1) fromJSON(con) close(con) f = system.file("sampleData", "embedded.json", package = "RJSONIO") con = file(f, "r") readLines(con, 1) # eat the first line fromJSON(con, maxNumLines = 4) close(con) ## Not run: if(require(rjson)) { # We see an approximately a factor of 3.9 speed up when we use # this approach that mixes C-level tokenization and an R callback # function to gather the results into objects. f = system.file("sampleData", "usaPolygons.as", package = "RJSONIO") t1 = system.time(a <- RJSONIO:::fromJSON(f)) t2 = system.time(b <- fromJSON(paste(readLines(f), collapse = " "))) } ## End(Not run) # Use a C routine fromJSON(I("[1, 2, 3, 4]"), getNativeSymbolInfo("R_json_testNativeCallback", "RJSONIO")) # Use a C routine that populates an R integer vector with the # elements read from the JSON array. Note that we must ensure # that the array is big enough. fromJSON(I("[1, 2, 3, 4]"), getNativeSymbolInfo("R_json_IntegerArrayCallback", PACKAGE = "RJSONIO"), data = rep(1L, 5)) x = fromJSON(I("[1.1, 2.2, 3.3, 4.4]"), getNativeSymbolInfo("R_json_RealArrayCallback", PACKAGE = "RJSONIO"), data = rep(1, 5)) length(x) = 4 # This illustrates a "specialized" handler which knows what it is # expecting and pre-allocates the answer # This then populates the answer with the values. # The speed improvement is 1.8 versus "infinity"! x = rnorm(1000000) str = toJSON(x, digits = 6) fromJSON(I(str), getNativeSymbolInfo("R_json_RealArrayCallback", PACKAGE = "RJSONIO"), data = numeric(length(x))) # This is another example of very fast reading of specific JSON. x = matrix(rnorm(1000000), 1000, 1000) str = toJSON(x, digits = 6) v = fromJSON(I(str), getNativeSymbolInfo("R_json_RealArrayCallback", PACKAGE = "RJSONIO"), data = matrix(0, 1000, 1000)) # nulls and NAs fromJSON("{ 'abc': 1, 'def': 23, 'xyz': null, 'ooo': 4}", nullValue = NA) fromJSON("{ 'abc': 1, 'def': 23, 'xyz': null, 'ooo': 4}", nullValue = NULL) # default fromJSON("[1, 2, 3, null, 4]", nullValue = NA) fromJSON("[1, 2, 3, null, 4]", nullValue = NULL) # we can supply a complex object for null if we ever should need to. fromJSON('[ 1, 2, null]', nullValue = list(a = 1, b = 1:10))[[3]] # Using StrictNumeric, etc. x = list(sub1 = list(a = 1:10, b = 100, c = 1000), sub2 = list(animal1 = "ape", animal2 = "bear", animal3 = "cat"), sub3 = rep(c(TRUE, FALSE), 3)) js = toJSON(x) fromJSON(js) # leave character strings uncollapsed fromJSON(js, simplify = StrictNumeric + StrictLogical) fromJSON(js, simplify = c(StrictNumeric, StrictLogical)) fromJSON(js, simplifyWithNames = FALSE) fromJSON(js, simplifyWithNames = TRUE) ####### # stringFun txt = '{ "magnitude": 3.8, "longitude": -125.012, "latitude": 40.382, "date": "new Date(1335515917000)", "when": "/Date(1335515917000)/", "country": "USA", "verified": true }' convertJSONDate = function(x) { if(grepl("/?(new )?Date\\(", x)) { val = gsub(".*Date\\(([0-9]+)\\).*", "\1", x) structure(as.numeric(val)/1000, class = c("POSIXct", "POSIXt")) } else x } fromJSON(txt, stringFun = convertJSONDate) # A C routine for converting dates jtxt = '[ 1, "/new Date(12312313)", "/Date(12312313)"]' ans = fromJSON(jtxt) ans = fromJSON(jtxt, stringFun = "R_json_dateStringOp") # A C routine that returns a char * - leaves strings as is c = fromJSON(jtxt, stringFun = I("dummyStringOperation")) c = fromJSON(jtxt, stringFun = I(getNativeSymbolInfo("dummyStringOperation"))) c = fromJSON(jtxt, stringFun = I(getNativeSymbolInfo("dummyStringOperation")$address)) # I() or class = "NativeStringRoutine". c = fromJSON(jtxt, stringFun = structure("dummyStringOperation", class = "NativeStringRoutine"))
