Transform Vigibase WHO .txt files to .parquet files
WHODrug is delivered as a folder of zipped text files, which you should
transform to a more efficient format. The parquet format from arrow has several advantages:
it can work with out-of-memory data, which makes it possible to process tables on
a computer with limited RAM. It is also lightweight and standard across different
languages.
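As a rough illustration of the out-of-memory point, the sketch below writes a small toy table to parquet with the arrow package, then filters it lazily and only collects the matching rows into R. The table, its column names and the file name are made up for the example and are not actual WHODrug content; it assumes the arrow and dplyr packages are installed and R >= 4.1 (for the `|>` pipe).

```r
library(arrow)
library(dplyr)

# Toy table (hypothetical content, not real WHODrug data)
toy <- data.frame(
  id = 1:5,
  name = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

toy_path <- file.path(tempdir(), "toy.parquet")
write_parquet(toy, toy_path)

# open_dataset() creates a reference to the file without loading its rows
toy_ds <- open_dataset(toy_path)

# The filter is evaluated by arrow; only matching rows reach R at collect()
res <- toy_ds |>
  filter(id > 3) |>
  collect()

# Clear the temporary file
unlink(toy_path)
```

Only the rows kept by the filter are materialized as an R data frame, which is what makes this approach workable for tables larger than the available RAM.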
The function also creates variables in each table. See tb_vigibase() for some running examples, and try ?mp_ or ?thg_ for more details.
Use dt_parquet() to load the tables afterward.
Arguments
- path_who
Character string, a directory containing the WHODrug .txt tables. It is also the output directory.
- force
Logical, to be passed to cli::cli_progress_update(). Used for internal purposes.
Value
.parquet files written into the path_who directory.
Some columns are returned as integer (all Id columns, including MedicinalProd_Id, with the notable exception of DrecNo), and some columns as numeric (Quantity from the ingredient table).
All other columns are character.
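One possible way to check column types on a parquet file is to read its schema with arrow and then look at the R classes. The sketch below does this on a toy file: the columns imitate the description above (an integer Id column, a numeric Quantity, a character column), but the values, the `drug_name` column and the file name are hypothetical and not produced by tb_who(). It assumes the arrow package is installed.

```r
library(arrow)

# Toy table mimicking the documented types (hypothetical values)
toy <- data.frame(
  MedicinalProd_Id = 1:3,        # integer, like the Id columns
  Quantity = c(0.5, 10, 2.25),   # numeric (double)
  drug_name = c("a", "b", "c"),  # character; hypothetical column name
  stringsAsFactors = FALSE
)

toy_path <- file.path(tempdir(), "toy_types.parquet")
write_parquet(toy, toy_path)

# Read as an Arrow Table and inspect the schema (Arrow types)
toy_tbl <- read_parquet(toy_path, as_data_frame = FALSE)
toy_tbl$schema

# Corresponding R classes once converted to a data frame
col_classes <- vapply(
  as.data.frame(toy_tbl),
  function(x) class(x)[1],
  character(1)
)
col_classes

# Clear the temporary file
unlink(toy_path)
```

The same two steps could be pointed at a file in path_who once tb_who() has run; the exact file names depend on the tables it creates.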
Examples
if (FALSE) { # interactive()
# Use the examples from tb_main if you want to see these functions in action.
path_who <- paste0(tempdir(), "/whodrug_directory/")
dir.create(path_who)
create_ex_who_txt(path_who)
tb_who(path_who = path_who)
# Clear temporary files when you're done
unlink(path_who, recursive = TRUE)
}