D.1 Creating Data Frames

Most frequently, we will be creating data frames from a series of numeric, logical, and character vectors of identical lengths.

##           u     v w
## 1 0.1815171  TRUE A
## 2 0.9197226 FALSE B
## 3 0.3117235 FALSE C
## 4 0.0641516  TRUE D
## 5 0.3964216 FALSE E
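
A data frame like the one above can be obtained by calling data.frame() with one named argument per column. Below is a minimal sketch: the column names u, v, and w follow the output shown, whereas the object name x and the seed passed to set.seed() are our own assumptions, so the generated values will differ from those printed above.

```r
set.seed(123)  # arbitrary seed (an assumption), for reproducibility only
x <- data.frame(
    u=runif(5),                                  # numeric vector
    v=sample(c(TRUE, FALSE), 5, replace=TRUE),   # logical vector
    w=LETTERS[1:5]                               # character vector
)
print(x)
```

All three vectors are of length 5, hence the resulting data frame has 5 rows and 3 columns.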

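Note that in R versions before 4.0.0, when we create a data frame, strings are automatically converted to factors by default (from R 4.0.0 onwards, stringsAsFactors defaults to FALSE and character vectors are left as they are).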

## [1] "factor"

Throughout the history of computing with R, this has caused way too many bugs (recall, for instance, what happens when we call as.numeric() on a factor: we get the integer level codes, not the original values). In order to change this behaviour, either pass the stringsAsFactors=FALSE argument to data.frame() or switch this feature off globally by calling options(stringsAsFactors=FALSE) (recommended).
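
To illustrate, here is a small sketch of the pitfall and of both remedies. The object names f and g and the example strings are made up for this illustration, and stringsAsFactors is passed explicitly so that the outcome does not depend on the R version in use.

```r
# force the old default: character columns become factors
f <- data.frame(w=c("10", "20", "5"), stringsAsFactors=TRUE)
class(f$w)                       # "factor"
as.numeric(f$w)                  # integer level codes, not 10, 20, 5
as.numeric(as.character(f$w))    # the original numbers

# per-call fix: keep strings as character vectors
g <- data.frame(w=c("10", "20", "5"), stringsAsFactors=FALSE)
class(g$w)                       # "character"

# global switch; matters for R versions before 4.0.0,
# newer versions already default to FALSE
options(stringsAsFactors=FALSE)
```

Converting via as.character() first is the usual way to recover numbers stored in a factor.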

Some objects, such as matrices, can easily be coerced to data frames:

##      x  y  z  w
## [1,] 1  2  3  4
## [2,] 5  6  7  8
## [3,] 9 10 11 12
##   x  y  z  w
## 1 1  2  3  4
## 2 5  6  7  8
## 3 9 10 11 12
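
Output of this kind can be generated with as.data.frame(). The following is a sketch only: the object names A and A_df are assumptions, and the matrix is rebuilt from the values and column names visible in the printout above.

```r
# a 3x4 integer matrix filled row by row, with column names only
A <- matrix(1:12, byrow=TRUE, nrow=3,
    dimnames=list(NULL, c("x", "y", "z", "w")))
print(A)

# each matrix column becomes a data frame column of the same name
A_df <- as.data.frame(A)
print(A_df)
```

Note that the matrix had no row names, so the data frame receives the default ones: 1, 2, 3.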

Named lists are amongst other candidates for conversion:

## $setosa
##    min median   mean    max 
##  4.300  5.000  5.006  5.800 
## 
## $versicolor
##    min median   mean    max 
##  4.900  5.900  5.936  7.000 
## 
## $virginica
##    min median   mean    max 
##  4.900  6.500  6.588  7.900
##        setosa versicolor virginica
## min     4.300      4.900     4.900
## median  5.000      5.900     6.500
## mean    5.006      5.936     6.588
## max     5.800      7.000     7.900
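
A conversion like the one above can be sketched as follows. We assume here that the list summarises the Sepal.Length column of the built-in iris dataset within each species (this is an inference from the species names and the values printed, not something stated in the text); the object names l and l_df are likewise hypothetical.

```r
# one named numeric vector of summary statistics per species
l <- lapply(split(iris$Sepal.Length, iris$Species), function(x) {
    c(min=min(x), median=median(x), mean=mean(x), max=max(x))
})
print(l)

# each list element becomes a column; the names of the vectors
# serve as row names
l_df <- as.data.frame(l)
print(l_df)
```

For this to work, all the list elements must be of the same length.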