r - Determine the data types of a data frame's columns

Tags : r, dataframe, types

Top 5 Answers for r - Determine the data types of a data frame's columns

Votes : 90

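Your best bet to start is str() (see ?str for its documentation). To explore some examples, let's make some data:

set.seed(3221)  # this makes the example exactly reproducible
my.data <- data.frame(y  = rnorm(5),
                      x1 = c(1:5),
                      x2 = c(TRUE, TRUE, FALSE, FALSE, FALSE),
                      X3 = letters[1:5],
                      # needed since R 4.0.0, where strings are no longer
                      # converted to factors by default
                      stringsAsFactors = TRUE)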

@Wilmer E Henao H's solution is very streamlined:

sapply(my.data, class)
        y        x1        x2        X3 
"numeric" "integer" "logical"  "factor" 

Using str() gets you that information plus extra goodies (such as the levels of your factors and the first few values of each variable):

str(my.data)
'data.frame':  5 obs. of  4 variables:
 $ y : num  1.03 1.599 -0.818 0.872 -2.682
 $ x1: int  1 2 3 4 5
 $ x2: logi  TRUE TRUE FALSE FALSE FALSE
 $ X3: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5

@Gavin Simpson's approach is also streamlined, but provides slightly different information than class():

sapply(my.data, typeof)
       y        x1        x2        X3 
"double" "integer" "logical" "integer" 

For more information about class, typeof, and the middle child, mode, see this excellent SO thread: "A comprehensive survey of the types of things in R; 'mode' and 'class' and 'typeof' are insufficient".
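To make the difference between the three functions concrete, here is a minimal sketch that tabulates class(), typeof() and mode() side by side. The data frame example_df and its column names are made up for illustration and are not part of the answer above.

```r
# Hypothetical example data; the column names are made up for illustration.
example_df <- data.frame(
  n = c(1.5, 2.5),
  i = 1:2,
  f = factor(c("a", "b")),
  d = as.Date(c("2020-01-01", "2020-01-02"))
)

# One row per column of example_df, one column per way of asking for its type.
# class(col)[1] keeps only the first class so every entry has length one.
type_summary <- data.frame(
  class  = sapply(example_df, function(col) class(col)[1]),
  typeof = sapply(example_df, typeof),
  mode   = sapply(example_df, mode),
  stringsAsFactors = FALSE
)
print(type_summary)
```

The factor and the Date column are where the three answers diverge: class() reports how R treats the column, while typeof() and mode() report how it is stored.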

Votes : 88

sapply(yourdataframe, class) 

where yourdataframe is the name of the data frame you're using.
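One caveat worth knowing: when a column has more than one class (a date-time column, for instance), sapply() can no longer simplify the result to a character vector and returns a list instead. The sketch below shows this and one possible workaround with vapply(); the data frame events and its columns are hypothetical.

```r
# Hypothetical data frame with a date-time column, which has two classes.
events <- data.frame(
  id   = 1:2,
  when = as.POSIXct(c("2020-01-01", "2020-01-02"), tz = "UTC")
)

# The class vectors have different lengths, so sapply() falls back to a list.
raw_classes <- sapply(events, class)
print(is.list(raw_classes))

# Collapse each column's classes into one string so the result is always
# a named character vector with one entry per column.
col_classes <- vapply(events,
                      function(col) paste(class(col), collapse = "/"),
                      character(1))
print(col_classes)
```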

Votes : 72

I would suggest

sapply(foo, typeof) 

if you need the actual storage types of the vectors in the data frame. class() is somewhat of a different beast: it reports the object's class (for example "factor"), not how the values are stored.

If you don't need to get this information as a vector (i.e. you don't need it to do something else programmatically later), just use str(foo).

In both cases foo would be replaced with the name of your data frame.
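As an illustration of using the result programmatically, the following sketch picks out columns by type. The contents of foo here are invented for the example; only the sapply(foo, typeof) call comes from the answer.

```r
# Hypothetical contents for foo; replace with your own data frame.
foo <- data.frame(a = c(1.5, 2.5),
                  b = 1:2,
                  c = c("x", "y"),
                  stringsAsFactors = FALSE)

# Named character vector of storage types, one entry per column.
col_types <- sapply(foo, typeof)

# Names of the columns stored as doubles.
double_cols <- names(col_types)[col_types == "double"]

# Keep only numeric columns (both integer and double count as numeric).
numeric_only <- foo[sapply(foo, is.numeric)]

print(col_types)
print(double_cols)
print(names(numeric_only))
```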

Votes : 67

Simply pass your data frame into the following function:

data_types <- function(frame) {
  res <- lapply(frame, class)
  res_frame <- data.frame(unlist(res))
  barplot(table(res_frame), main="Data Types", col="steelblue", ylab="Number of Features")
}

to produce a plot of all data types in your data frame. For the iris dataset we get the following:

data_types(iris) 

[Image: bar plot titled "Data Types" showing the number of iris columns of each class]
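If you only want the counts behind such a plot, without drawing anything, a minimal sketch is to tabulate the classes directly. Taking only the first class of each column is an assumption made here so that multi-class columns are counted once.

```r
# Count how many columns of iris fall into each (first) class.
type_counts <- table(sapply(iris, function(col) class(col)[1]))
print(type_counts)
```

For iris this reports four numeric columns and one factor column.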

Votes : 54

For small data frames:

library(tidyverse)

as_tibble(mtcars)

prints the data frame with each column's data type shown under its name:

# A tibble: 32 x 11
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
 * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1

For large data frames:

glimpse(mtcars) 

gives you a structured view of data types:

Observations: 32
Variables: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17....
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, ...
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 167.6, 167.6...
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180, 205, 215...
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92, 3.07, 3.0...
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440, 3.440...
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18.30, 18.90...
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, ...
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, ...
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, ...
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2, 2, 4, 2, ...

To get a list of the columns' data types (as noted by @Alexandre):

map(mtcars, class) 

gives a list of data types:

$mpg
[1] "numeric"

$cyl
[1] "numeric"

$disp
[1] "numeric"

$hp
[1] "numeric"

To change the data type of a column:

library(hablar)

mtcars %>%
  convert(chr(mpg, am),
          int(carb))

converts columns mpg and am to character and the column carb to integer:

# A tibble: 32 x 11
   mpg     cyl  disp    hp  drat    wt  qsec    vs am     gear  carb
   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <int>
 1 21        6  160    110  3.9   2.62  16.5     0 1         4     4
 2 21        6  160    110  3.9   2.88  17.0     0 1         4     4
 3 22.8      4  108     93  3.85  2.32  18.6     1 1         4     1
 4 21.4      6  258    110  3.08  3.22  19.4     1 0         3     1
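If you would rather not add a package dependency, the same conversion can be sketched in base R. This is an alternative written for illustration, not what hablar does internally, and the name converted is made up here.

```r
# Work on a copy so the built-in mtcars data set is left untouched.
converted <- mtcars

# Convert mpg and am to character, and carb to integer.
converted$mpg  <- as.character(converted$mpg)
converted$am   <- as.character(converted$am)
converted$carb <- as.integer(converted$carb)

# Check the result: one class per column.
print(sapply(converted, class))
```

Unlike the hablar pipeline, this returns a plain data frame, not a tibble, and keeps the row names of mtcars.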
