R performance with data reshaping
I am trying to reshape a data frame in R, and I am running into problems with the recommended ways of doing so. The data frame has the following structure:

    ID          DATE1       DATE2       VALTYPE  VALUE
    'abcd1233'  2009-11-12  2009-12-23  'TYPE1'  123.45
    ...
VALTYPE is a string column stored as a factor with only 2 levels (TYPE1 and TYPE2). I need to transform the data frame into the following one (a "wide" transpose), based on common ID and dates:
    ID          DATE1       DATE2       VALUE.TYPE1  VALUE.TYPE2
    'abcd1233'  2009-11-12  2009-12-23  123.45       NA
    ...
The data frame has more than 4,500,000 observations (although about 70% of the VALUEs are NA). The machine is an Intel-based Linux workstation with 4 GB of RAM. When the data is loaded (from a compressed Rdata file) into a fresh R process, the process grows to about 250 MB, which clearly leaves a lot of room for the reshaping.
These are my experiences so far:
RESULT: Error: cannot allocate vector of size 4.8 Gb
- Using the cast() method of the reshape package:
    tbl2 <- cast(tbl, ID + DATE1 + DATE2 ~ VALTYPE)
  RESULT: The R process consumes all RAM with no end in sight. The process eventually had to be killed.
- Using by() and merge():
    sp <- by(tbl[c(1, 2, 3, 5)], tbl$VALTYPE, function(x) x)
    tbl <- merge(sp[["TYPE1"]], sp[["TYPE2"]], by = c("ID", "DATE1", "DATE2"), all = TRUE, sort = TRUE)
  RESULT: Works fine, although it is not very elegant or foolproof (i.e. it will break if more types are added).
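The by() and merge() attempt hard-codes the two type names, which is why it would break with a third type. One way to lift that restriction might look like the sketch below. It is not the asker's code: the toy `tbl` and its values are made up for illustration, and the approach has not been timed on 4.5 million rows, so its memory behaviour at that scale is untested.

```r
# Toy data in the same long layout as the question (values are invented).
tbl <- data.frame(
  ID      = c("a1", "a1", "b2", "c3"),
  DATE1   = as.Date(c("2009-11-12", "2009-11-12", "2009-11-13", "2009-11-14")),
  DATE2   = as.Date(c("2009-12-23", "2009-12-23", "2009-12-24", "2009-12-25")),
  VALTYPE = factor(c("TYPE1", "TYPE2", "TYPE1", "TYPE2")),
  VALUE   = c(123.45, 67.8, 1.5, NA),
  stringsAsFactors = FALSE
)

keys <- c("ID", "DATE1", "DATE2")

# One data frame per VALTYPE level, with VALUE renamed to VALUE.<level>.
pieces <- lapply(levels(tbl$VALTYPE), function(lv) {
  piece <- tbl[tbl$VALTYPE == lv, c(keys, "VALUE")]
  names(piece)[names(piece) == "VALUE"] <- paste("VALUE", lv, sep = ".")
  piece
})

# Fold the pieces together with successive full outer joins on the keys.
wide <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE), pieces)
```

Because the pieces are built from levels(tbl$VALTYPE), a new type only adds one more merge() call to the fold. Each merge() still copies its inputs, so this trades generality for the same kind of memory cost as the two-type version.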
To add insult to injury, the operation can be done in about 3 lines of AWK or Perl (and with hardly any RAM). So the question is: what is a better way to do this operation in R, using the recommended methods, without consuming all available RAM?
A useful trick is to combine the id variables into a single character vector and then do the reshape.

    tbl$NEWID <- with(tbl, paste(ID, DATE1, DATE2, sep = ";"))
    tbl2 <- recast(tbl, NEWID ~ VALTYPE, measure.var = "VALUE")

On a similar-sized problem this is about 40% faster on my Intel Core 2 Duo 2.2 GHz MacBook.
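The combined-key idea can also be tried without the reshape package. The sketch below swaps in a different technique from the one in the answer: it uses base R matrix indexing to fill a preallocated wide matrix. The toy `tbl` is invented, the code assumes each key and type pair occurs at most once (with duplicates, the last value wins), and it has not been benchmarked against the 40% figure above.

```r
# Toy data in the same long layout as the question (values are invented).
tbl <- data.frame(
  ID      = c("a1", "a1", "b2", "c3"),
  DATE1   = as.Date(c("2009-11-12", "2009-11-12", "2009-11-13", "2009-11-14")),
  DATE2   = as.Date(c("2009-12-23", "2009-12-23", "2009-12-24", "2009-12-25")),
  VALTYPE = factor(c("TYPE1", "TYPE2", "TYPE1", "TYPE2")),
  VALUE   = c(123.45, 67.8, 1.5, NA),
  stringsAsFactors = FALSE
)

# Collapse the three id columns into one character key per row.
key   <- with(tbl, paste(ID, DATE1, DATE2, sep = ";"))
ukey  <- unique(key)
types <- levels(tbl$VALTYPE)

# Preallocate the wide result: one row per key, one column per type.
wide <- matrix(NA_real_, nrow = length(ukey), ncol = length(types),
               dimnames = list(ukey, paste("VALUE", types, sep = ".")))

# Two-column index matrix: (row of the key, column of the type).
wide[cbind(match(key, ukey), as.integer(tbl$VALTYPE))] <- tbl$VALUE
```

The result is a numeric matrix whose row names are the combined keys. If the separate ID and date columns are needed again, strsplit(rownames(wide), ";") recovers them, provided ";" does not occur inside any ID.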