[Experimental] codebook() displays the codebook of a dataset or of selected variables.

codebook_detail() shows more detailed information. It is a wrapper around datawizard::data_codebook() with improved output formatting.

Usage

codebook(.data, ...)

codebook_detail(
  .data,
  ...,
  .type = c("flextable", "tibble"),
  n = Inf,
  max_values = 10,
  range_at = 6,
  verbose = TRUE
)

Arguments

.data

The input data (data frame or tibble).

...

<tidy-select> or <data-masking> Variables to include in the codebook. If omitted, all variables are included.

.type

The output type, either "flextable" or "tibble". The default for codebook_detail() is "flextable".

n

The number of rows to display. Default is Inf.

max_values

Maximum number of values to display per variable. Use this to avoid too many rows when variables have many unique values.

range_at

The number of unique values a numeric vector must have for a range to be printed for that variable instead of a frequency table of all its values. This is useful when the data contain numeric variables with only a few unique values, for which full frequency tables should be displayed rather than value ranges.

verbose

Toggle warnings and messages on or off.
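
The effect of range_at can be seen by calling the wrapped function directly. The following is a minimal sketch, assuming the datawizard package is installed and that codebook_detail() passes range_at through to datawizard::data_codebook() unchanged (the page describes it as a wrapper, but the pass-through is an assumption). It uses the built-in mtcars data rather than any dataset from this package.

```r
# Assumption: datawizard is installed. mtcars ships with base R.
library(datawizard)

# cyl has only 3 unique values (4, 6, 8), fewer than the default range_at = 6,
# so it is summarised as a frequency table with one row per value.
freq_cb <- data_codebook(mtcars, select = "cyl")

# Lowering range_at to 2 makes the same variable qualify for a range summary.
range_cb <- data_codebook(mtcars, select = "cyl", range_at = 2)

print(freq_cb)
print(range_cb)
```

Comparing the two printed codebooks shows the trade-off: a frequency table is more informative for variables with few distinct values, while a range keeps the output short for continuous variables.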

Value

codebook() returns a tibble. codebook_detail() returns a flextable or a tibble, depending on .type.

Examples

starwars
#> # A tibble: 87 × 14
#>    name     height  mass hair_color skin_color eye_color birth_year sex   gender
#>    <chr>     <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> 
#>  1 Luke Sk…    172    77 blond      fair       blue            19   male  mascu…
#>  2 C-3PO       167    75 NA         gold       yellow         112   none  mascu…
#>  3 R2-D2        96    32 NA         white, bl… red             33   none  mascu…
#>  4 Darth V…    202   136 none       white      yellow          41.9 male  mascu…
#>  5 Leia Or…    150    49 brown      light      brown           19   fema… femin…
#>  6 Owen La…    178   120 brown, gr… light      blue            52   male  mascu…
#>  7 Beru Wh…    165    75 brown      light      blue            47   fema… femin…
#>  8 R5-D4        97    32 NA         white, red red             NA   none  mascu…
#>  9 Biggs D…    183    84 black      light      brown           24   male  mascu…
#> 10 Obi-Wan…    182    77 auburn, w… fair       blue-gray       57   male  mascu…
#> # ℹ 77 more rows
#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
#> #   vehicles <list>, starships <list>

codebook(starwars)
#> # A tibble: 14 × 4
#>    variable   type          n unique
#>    <chr>      <chr>     <int>  <int>
#>  1 name       character    87     87
#>  2 height     integer      81     45
#>  3 mass       double       59     38
#>  4 hair_color character    82     12
#>  5 skin_color character    87     31
#>  6 eye_color  character    87     15
#>  7 birth_year double       43     36
#>  8 sex        character    83      4
#>  9 gender     character    83      2
#> 10 homeworld  character    77     48
#> 11 species    character    83     37
#> 12 films      list         87     24
#> 13 vehicles   list         87     11
#> 14 starships  list         87     17

codebook(starwars, 1:4)
#> # A tibble: 4 × 4
#>   variable   type          n unique
#>   <chr>      <chr>     <int>  <int>
#> 1 name       character    87     87
#> 2 height     integer      81     45
#> 3 mass       double       59     38
#> 4 hair_color character    82     12

codebook(starwars, ends_with("color"))
#> # A tibble: 3 × 4
#>   variable   type          n unique
#>   <chr>      <chr>     <int>  <int>
#> 1 hair_color character    82     12
#> 2 skin_color character    87     31
#> 3 eye_color  character    87     15

codebook(starwars, where(is.numeric))
#> # A tibble: 3 × 4
#>   variable   type        n unique
#>   <chr>      <chr>   <int>  <int>
#> 1 height     integer    81     45
#> 2 mass       double     59     38
#> 3 birth_year double     43     36

codebook(lifeexp)
#> # A tibble: 6 × 5
#>   variable  label                    type             n unique
#>   <chr>     <chr>                    <chr>        <int>  <int>
#> 1 region    Region                   double+label    68      3
#> 2 country   Country                  character       68     68
#> 3 popgrowth Avg. annual % growth     double          68     30
#> 4 lexp      Life expectancy at birth double          68     18
#> 5 gnppc     GNP per capita           double          63     62
#> 6 safewater Safe water               double          40     29
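
Judging from the outputs above, the n column appears to count non-missing values and the unique column the distinct non-missing values (for example, sex has 83 non-missing entries and 4 distinct values). The following base R sketch reproduces that summary under this reading. It is an inference from the examples, not the package's implementation; codebook_sketch() is a hypothetical helper, and it ignores variable labels and labelled types such as "double+label".

```r
# Hypothetical helper: one row per column with its storage type,
# the number of non-missing values, and the number of distinct
# non-missing values.
codebook_sketch <- function(data) {
  data.frame(
    variable = names(data),
    type = vapply(data, typeof, character(1), USE.NAMES = FALSE),
    n = vapply(data, function(x) sum(!is.na(x)), integer(1),
               USE.NAMES = FALSE),
    unique = vapply(data, function(x) length(unique(x[!is.na(x)])),
                    integer(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# airquality ships with base R and contains missing values in Ozone
# and Solar.R, so n differs from the number of rows for those columns.
cb <- codebook_sketch(airquality)
print(cb)
```

Because the sketch relies on typeof(), a factor column would be reported as "integer"; the type labels produced by codebook() may differ.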

# codebook_detail() is less stable than codebook().
# Some column types may not be recognized.
lifeexp %>%
  dplyr::select(-region) %>%
  codebook_detail()

id         name       label                     type       missings    values                  n          prop       row_id
character  character  character                 character  character   character               character  character  integer
1          country    Country                   character  0 (0.0%)    Albania                 1          1.5%       1
                                                                       Argentina               1          1.5%       1
                                                                       Armenia                 1          1.5%       1
                                                                       Austria                 1          1.5%       1
                                                                       Azerbaijan              1          1.5%       1
                                                                       Belarus                 1          1.5%       1
                                                                       Belgium                 1          1.5%       1
                                                                       Bolivia                 1          1.5%       1
                                                                       Bosnia and Herzegovina  1          1.5%       1
                                                                       Brazil                  1          1.5%       1
                                                                       (...)                                         1
                                                                                                                     1
2          popgrowth  Avg. annual % growth      numeric    0 (0.0%)    [-0.5, 3]               68                    2
                                                                                                                     2
3          lexp       Life expectancy at birth  numeric    0 (0.0%)    [54, 79]                68                    3
                                                                                                                     3
4          gnppc      GNP per capita            numeric    5 (7.4%)    [370, 39980]            63                    4
                                                                                                                     4
5          safewater  Safe water                numeric    28 (41.2%)  [28, 100]               40                    5
                                                                                                                     5

n: 20