The problem to discuss is how to convert the following list of data frames into a single data frame.
x <- list(data.frame(math=90, science=85), data.frame(math=98, science=82))
One commonly known approach is to flatten the list into a vector with unlist(), reshape that vector into a matrix, and finally convert the matrix to a data frame.
> x <- list(data.frame(math=90, science=85),
+           data.frame(math=98, science=82))
>
> as.data.frame(matrix(ncol=2, byrow=TRUE, unlist(x)))
  V1 V2
1 90 85
2 98 82
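But this has a drawback: unlist() performs type coercion, because a vector can hold only one data type. It also throws away the original column names, leaving V1 and V2. For example, in the output below the names come back as numbers: the strings were stored as factors, and unlist() reduced them to their integer codes. (In R 4.0 and later, where data.frame() no longer converts strings to factors by default, everything is coerced to character instead.)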
> x <- list(data.frame(name="foo", value=1),
+           data.frame(name="bar", value=2))
>
> as.data.frame(matrix(ncol=2, byrow=TRUE, unlist(x)))
  V1 V2
1  1  1
2  1  2
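To make the coercion visible, here is a minimal sketch in base R. It assumes strings are kept as plain character vectors (stringsAsFactors = FALSE is set explicitly so the behavior does not depend on the R version); the variable names flat and m are only for illustration.

```r
# Two one-row data frames with a character column and a numeric column.
# stringsAsFactors = FALSE is set explicitly (an assumption of this sketch)
# so the example behaves the same before and after R 4.0.
x <- list(data.frame(name = "foo", value = 1, stringsAsFactors = FALSE),
          data.frame(name = "bar", value = 2, stringsAsFactors = FALSE))

# unlist() must return a single atomic vector, so the numbers
# are coerced to character to match the strings.
flat <- unlist(x)
print(class(flat))

# Rebuilding a data frame from the flattened vector does not restore
# the numeric type of the second column, nor the column names.
m <- as.data.frame(matrix(flat, ncol = 2, byrow = TRUE))
print(sapply(m, class))
print(names(m))
```

The second column is no longer numeric after the round trip, so any arithmetic on it would need an explicit conversion first.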
The solution is to use do.call() with rbind(). It calls rbind() once, passing every element of the list as a separate argument.
> x <- list(data.frame(name="foo", value=1),
+           data.frame(name="bar", value=2))
>
> do.call(rbind, x)
  name value
1  foo     1
2  bar     2
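As a rough illustration of what do.call() does here, the sketch below compares it with writing out the rbind() call by hand for a two-element list. The names combined and manual are made up for this example.

```r
x <- list(data.frame(name = "foo", value = 1),
          data.frame(name = "bar", value = 2))

# do.call() builds a single call, rbind(x[[1]], x[[2]]),
# using the list elements as the arguments.
combined <- do.call(rbind, x)

# The same call written out by hand.
manual <- rbind(x[[1]], x[[2]])

# The two results should agree, and the column types are preserved.
print(all.equal(combined, manual))
print(sapply(combined, class))
```

Unlike the unlist() approach, each column keeps its own type and name, because rbind() combines the data frames column by column.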
But the issue is that rbind() is slow when the list is long.
> x <- lapply(1:10000, function(x) {
+   data.frame(name=paste(x, "foo"), value=x)
+ })
> head(x)
[[1]]
   name value
1 1 foo     1

[[2]]
   name value
1 2 foo     2

[[3]]
   name value
1 3 foo     3

[[4]]
   name value
1 4 foo     4

[[5]]
   name value
1 5 foo     5

[[6]]
   name value
1 6 foo     6

> system.time(do.call(rbind, x))
   user  system elapsed
 10.100   0.014  10.114
Ten seconds is not bad, right? But it can grow to several minutes as the data frames get more columns.
rbindlist() in the data.table package solves this problem, as shown below. Note that it takes an almost negligible amount of time.
> library(data.table)
> x <- lapply(1:10000, function(x) {
+   data.frame(name=paste(x, "foo"), value=x)
+ })
> system.time(rbindlist(x))
   user  system elapsed
  0.003   0.000   0.003
A data.table is a subclass of data.frame, so it should work wherever a data.frame is expected. And, if needed, a data.table can easily be converted to a plain data.frame using as.data.frame().
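The class relationship and the conversion can be checked with a short sketch like the one below. It assumes the data.table package is installed; the names dt and df are only for illustration.

```r
library(data.table)

x <- list(data.frame(name = "foo", value = 1),
          data.frame(name = "bar", value = 2))

# rbindlist() returns a data.table, which also inherits from data.frame.
dt <- rbindlist(x)
print(class(dt))
print(inherits(dt, "data.frame"))

# as.data.frame() returns a copy with the data.table class dropped.
df <- as.data.frame(dt)
print(class(df))
```

Converting is the cautious choice when the result is handed to code that relies on data.frame-style indexing, since the `[` operator of a data.table follows its own rules.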