Usage
tb_subset(
wd_in,
wd_out,
subset_var = c("drecno", "medprod_id", "meddra_id", "age"),
sv_selection,
rm_suspdup = TRUE
)
Arguments
- wd_in
Path to the source directory (character)
- wd_out
Path to the output directory (character)
- subset_var
One of "drecno", "medprod_id", "meddra_id", "age"
- sv_selection
A named list or a vector containing the appropriate ids (depending on subset_var, see Details)
- rm_suspdup
A logical. Should suspected duplicates be removed? TRUE by default
Value
Parquet files in the output directory. All tables from a VigiBase ECL main folder are written as subsets (including suspectedduplicates).
Details
You must select a subset variable with subset_var and provide an appropriate list according to this variable in sv_selection.
Available subset_var:
- drecno will use Drug Record Number (DrecNo), from WHO Drug, and will subset from drug (see get_drecno()).
- medprod_id will use MedicinalProd_Id, also from drug. May be useful if requesting from ATC classes (see get_atc_code()).
- meddra_id will use MedDRA_Id, subset from adr (see get_llt_soc() or get_llt_smq()).
- age will use AgeGroup from demo. See below.
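The two shapes accepted by sv_selection (a named list or a plain vector) can be sketched in base R as follows. The ids and the names group_a and group_b are made-up placeholders, not real codes. In practice the list would typically come from a helper such as get_drecno() or get_llt_soc(). The checks at the end are only a suggested sanity test and are not part of tb_subset().

```r
# Named list form: one named element per group of ids.
# These ids are hypothetical placeholders, NOT real DrecNo or MedDRA codes.
sv_list <- list(
  group_a = c(1001L, 1002L),
  group_b = 2001L
)

# Plain vector form, e.g. age group ids
sv_vector <- c(7, 8)

# Optional sanity checks before passing either object as sv_selection
stopifnot(
  is.list(sv_list),
  !is.null(names(sv_list)),
  all(nzchar(names(sv_list))),
  is.numeric(unlist(sv_list)),
  is.numeric(sv_vector)
)
```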
Age group ids are as follows:
1 0 - 27 days
2 28 days to 23 months
3 2 - 11 years
4 12 - 17 years
5 18 - 44 years
6 45 - 64 years
7 65 - 74 years
8 >= 75 years
9 Unknown
Example: To work with patients aged 18 to 74, provide c(5, 6, 7) as sv_selection.
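To avoid hard-coding ids, one possible approach is to keep the table above as a lookup vector and derive sv_selection from the group labels. This is a minimal base R sketch. The object names age_groups and wanted are illustrative and not part of the package.

```r
# Age group ids (as names) and labels, copied from the table above
age_groups <- c(
  "1" = "0 - 27 days",
  "2" = "28 days to 23 months",
  "3" = "2 - 11 years",
  "4" = "12 - 17 years",
  "5" = "18 - 44 years",
  "6" = "45 - 64 years",
  "7" = "65 - 74 years",
  "8" = ">= 75 years",
  "9" = "Unknown"
)

# Labels of the groups to keep (patients aged 18 to 74)
wanted <- c("18 - 44 years", "45 - 64 years", "65 - 74 years")

# Recover the matching ids as a numeric vector
sv_selection <- as.numeric(names(age_groups)[age_groups %in% wanted])
sv_selection
```

The resulting vector can then be passed as sv_selection together with subset_var = "age".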
Use dt_parquet() to load the tables afterward.
Examples
if (FALSE) { # interactive()
  # --- technical steps ---- #
  wd_in <- paste0(tempdir(), "/", "tbsubsetex")
  dir.create(wd_in)
  create_ex_main_pq(wd_in)
  # Select a subset_var and corresponding data
  # Subset on adr colitis codes
  adr_llt <-
    list(
      colitis = "Colitis"
    ) |>
    get_llt_soc(term_level = "pt", meddra_, verbose = FALSE)
  wd_out <- paste0(wd_in, "/", "colitis_subset", "/")
  tb_subset(wd_in, wd_out,
            subset_var = "meddra_id",
            sv_selection = adr_llt)
  # Subset on drug codes
  d_drecno <-
    list(
      ipi = "ipilimumab"
    ) |>
    get_drecno(mp = mp_, verbose = FALSE)
  wd_out <- paste0(wd_in, "/", "ipilimumab_subset", "/")
  tb_subset(wd_in, wd_out,
            subset_var = "drecno",
            sv_selection = d_drecno)
  # Subset on age >= 65 years (age groups 7 and 8)
  sv_selection <-
    c(7, 8)
  wd_out <- paste0(wd_in, "/", "more_than_65_subset", "/")
  tb_subset(wd_in, wd_out,
            subset_var = "age",
            sv_selection = sv_selection)
  # Clean up the temporary example directory
  unlink(wd_in, recursive = TRUE)
}