De-classify and re-classify data?
Some of the data I work with contains sensitive information (names of people, dates, locations, etc.). However, I sometimes need to share "the numbers" with other people to get help with statistical analysis, or process the data on more powerful machines where I can't control who looks at it.
Ideally I would like to work like this:
- Read the data into R (look at it, clean it, etc.)
- Select a data frame that I want to de-classify, run it through a package and receive two "files": the de-classified data and a translation-file. The latter I will keep myself.
- The de-classified data can be shared, manipulated and processed without worries.
- I re-classify the processed data using the translation-file.
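To make the workflow above concrete, here is a minimal base R sketch of the kind of round trip I mean. The function names `declassify()` and `reclassify()` and the example data frame are hypothetical (they are not from any existing package), and the approach is plain code substitution, so it only hides the labels themselves.

```r
# Hypothetical helpers, not from any existing package.
# Replace the values in the chosen columns with opaque codes and
# return both the coded data and the lookup tables (the "translation-file").
declassify <- function(df, cols) {
  key <- list()
  for (col in cols) {
    original <- unique(as.character(df[[col]]))
    # Shuffle so the codes do not reveal the order of first appearance
    original <- original[sample.int(length(original))]
    codes <- sprintf("%s_%04d", col, seq_along(original))
    key[[col]] <- data.frame(original = original, code = codes,
                             stringsAsFactors = FALSE)
    df[[col]] <- codes[match(as.character(df[[col]]), original)]
  }
  list(data = df, key = key)
}

# Map the codes back to the original values using the lookup tables.
reclassify <- function(df, key) {
  for (col in names(key)) {
    k <- key[[col]]
    df[[col]] <- k$original[match(df[[col]], k$code)]
  }
  df
}

# Made-up example data
patients <- data.frame(
  name  = c("Ann", "Bob", "Ann"),
  city  = c("Oslo", "Bergen", "Oslo"),
  value = c(1.2, 3.4, 5.6),
  stringsAsFactors = FALSE
)

out <- declassify(patients, c("name", "city"))
shared   <- out$data                       # safe to hand over
restored <- reclassify(shared, out$key)    # out$key stays with me
```

Here `out$key` would be saved separately (for example with `saveRDS()`), and numeric columns such as `value` pass through untouched so the analysis still works on the coded data.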
I suppose that this can also be useful when uploading data for processing "in the cloud" (Amazon etc.).
Have you been in this situation? I first thought about writing a "randomize" function myself, but then I realized there is no limit to how sophisticated this could get (for example, offsetting time-stamps without losing their order). Maybe there is already an established method or tool for this?
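As an illustration of the time-stamp case, one simple option (an assumption on my part, not an established method) is to shift every time-stamp by the same secret random offset, which keeps both the order and the intervals intact. The name `shift_times()` and the ten-year range are made up for this sketch.

```r
# Hypothetical helper: shift all time-stamps by one secret offset.
# A single common offset preserves order and the gaps between events.
shift_times <- function(x) {
  offset_secs <- sample.int(3650L, 1) * 86400  # 1 to 3650 whole days, in seconds
  list(shifted = x - offset_secs, offset_secs = offset_secs)
}

# Made-up example time-stamps
stamps <- as.POSIXct(c("2012-03-01 10:00:00",
                       "2012-03-01 09:30:00",
                       "2012-03-02 08:00:00"), tz = "UTC")

s <- shift_times(stamps)
back <- s$shifted + s$offset_secs   # undo the shift with the stored offset
```

The offset would go into the translation-file. Note that a whole-day shift leaves the time of day and the spacing between events visible, which may or may not be acceptable.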
Thanks to everyone who contributes to the [r] tag here at Stack Overflow!