Usage
desc_facvar(
.data,
vf,
format = "n_/N_ (pc_%)",
digits = 0,
pad_width = 12,
ncat_max = 20,
export_raw_values = FALSE
)
Arguments
- .data
A data.frame, where vf are column names of categorical variables
- vf
A character vector
- format
A character string, formatting options.
- digits
A numeric. Number of digits for the percentage (passed to interval formatting function).
- pad_width
A numeric. Minimum character length of value output (passed to
stringr::str_pad()).
- ncat_max
A numeric. Maximum number of levels allowed for each variable. See Details.
- export_raw_values
A logical. Should the raw values be exported?
Value
A data.frame with columns
- var
The variable name.
- level
The level of the variable.
- value
The formatted value with possible number of cases n_, number of available cases N_, and percentage pc_, depending on the format argument.
- n_avail
The number of cases with available data for this variable.
Details
Many other packages provide tools to summarize data. This one is just
the package author's favorite.
Important format inputs are
- n_
Number of patients with the categorical variable at said level.
- N_
Number of patients with an available value for this variable.
- pc_
Percentage of n / N.
The format argument should contain at least the words "n_", "N_",
and optionally "pc_".
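To illustrate how these placeholders combine, here is a minimal base R sketch that substitutes n_, N_, and pc_ into a format string. It is not the package's implementation: fill_format() is a hypothetical helper introduced only for this illustration, it handles one level at a time, and the package's own rounding and padding may differ.

```r
# Hypothetical helper, not part of the package: fills the placeholders
# of a format string for a single level (scalar n and N).
fill_format <- function(n, N, format = "n_/N_ (pc_%)", digits = 0) {
  # Percentage of n / N, rounded to `digits` decimal places
  pc <- formatC(100 * n / N, format = "f", digits = digits)
  # fixed = TRUE matches literally and case-sensitively,
  # so replacing "n_" leaves "N_" untouched
  out <- gsub("n_", as.character(n), format, fixed = TRUE)
  out <- gsub("N_", as.character(N), out, fixed = TRUE)
  gsub("pc_", pc, out, fixed = TRUE)
}

fill_format(1, 7)
fill_format(6, 7, format = "n_ out of N_, pc_%", digits = 1)
```

The sketch leaves out the padding controlled by pad_width.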
ncat_max ensures that you didn't provide a continuous
variable to desc_facvar(). If one of your variables has many levels,
set ncat_max to Inf or a high value.
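The idea behind this safeguard can be sketched in base R as a count of distinct non-missing values compared against a cap. This is an assumption about the general technique, not the package's code: check_ncat() is a hypothetical name, and whether desc_facvar() itself raises an error or a warning is not stated here.

```r
# Hypothetical check, not the package's code: stop when a variable
# has more distinct non-missing values than ncat_max.
check_ncat <- function(x, ncat_max = 20) {
  n_levels <- length(unique(x[!is.na(x)]))
  if (n_levels > ncat_max) {
    stop("Too many levels (", n_levels, "); is this variable continuous?")
  }
  invisible(n_levels)
}

# A categorical variable with two levels passes
check_ncat(c("smoker", "non-smoker", "smoker"))

# A continuous variable with 7 distinct values exceeds a cap of 5;
# the error message is captured instead of stopping the script
age <- c(60, 50, 56, 49, 75, 69, 85)
res <- tryCatch(
  check_ncat(age, ncat_max = 5),
  error = function(e) conditionMessage(e)
)

# Setting the cap to Inf disables the check
check_ncat(age, ncat_max = Inf)
```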
The equivalent for continuous data is desc_cont().
Examples
df1 <-
  data.frame(
    smoke_status = c("smoker", "non-smoker",
                     "smoker", "smoker",
                     "smoker", "smoker",
                     "non-smoker"
    ),
    hypertension = c(1, 1, 0, 1, 1, 1, 1),
    age = c(60, 50, 56, 49, 75, 69, 85),
    bmi = c(18, 30, 25, 22, 23, 21, 22)
  )
# Use default formatting
desc_facvar(.data = df1, vf = c("hypertension", "smoke_status"))
#> # A tibble: 4 × 4
#> var level value n_avail
#> <chr> <chr> <chr> <int>
#> 1 hypertension 0 " 1/7 (14%) " 7
#> 2 hypertension 1 " 6/7 (86%) " 7
#> 3 smoke_status non-smoker " 2/7 (29%) " 7
#> 4 smoke_status smoker " 5/7 (71%) " 7
# Use custom formatting
desc_facvar(.data = df1,
            vf = c("hypertension", "smoke_status"),
            format = "n_ out of N_, pc_%",
            digits = 1)
#> # A tibble: 4 × 4
#> var level value n_avail
#> <chr> <chr> <chr> <int>
#> 1 hypertension 0 1 out of 7, 14.3% 7
#> 2 hypertension 1 6 out of 7, 85.7% 7
#> 3 smoke_status non-smoker 2 out of 7, 28.6% 7
#> 4 smoke_status smoker 5 out of 7, 71.4% 7
# You might want to export raw values, to run plotting or
# other formatting functions
desc_facvar(.data = df1,
            vf = c("hypertension", "smoke_status"),
            export_raw_values = TRUE)
#> # A tibble: 4 × 6
#> var level value n_avail n pc
#> <chr> <chr> <chr> <int> <int> <dbl>
#> 1 hypertension 0 " 1/7 (14%) " 7 1 14.3
#> 2 hypertension 1 " 6/7 (86%) " 7 6 85.7
#> 3 smoke_status non-smoker " 2/7 (29%) " 7 2 28.6
#> 4 smoke_status smoker " 5/7 (71%) " 7 5 71.4
