The examples below demonstrate how to recode variables with the
recodeflow function rec_with_table().
Our examples use the following packages:
Package recodeflow
Steps on how to install recodeflow
are in how to install
#Load the package
library(recodeflow)
Package dplyr
to combine datasets (function: bind_rows).
#Load the package
library(dplyr)
Our examples use example data
Our examples use the dataset pbc
from the package survival.
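We’ve split this dataset in two (tester1 and tester2) to mimic real data,
e.g., the same survey performed in separate years. For our examples,
we’ve also added an age-group column (agegrp) to each dataset:
5-year age groups in tester1 and 10-year age groups in tester2.
These are later recoded into agegrp5 and agegrp10.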
test1 <- survival::pbc[1:209,]
test2 <- survival::pbc[210:418,]
#Adapting the data for How To examples. Breaking cont age variable into categories - 5 and 10 year age groups.
agegrp <- cut(test1$age, breaks = c(25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80), right = FALSE)
agegrp <- as.numeric(agegrp)
tester1 <- cbind(test1, agegrp)
agegrp <- cut(test2$age, breaks = c(20, 30, 40, 50, 60, 70, 80), right = FALSE)
agegrp <- as.numeric(agegrp)
tester2 <- cbind(test2, agegrp)
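To make the binning step concrete, here is a minimal base-R sketch of how cut() with right = FALSE followed by as.numeric() turns a continuous age into a group code. The ages are hypothetical values chosen only for illustration; the breaks mirror the ones used for tester1 and tester2.

```r
# Hypothetical ages, for illustration only (not taken from pbc)
ages <- c(26.5, 30, 44.9, 58.8, 79.2)

# Same breaks as in the data preparation: 5-year groups from 25,
# 10-year groups from 20
breaks5 <- seq(25, 80, by = 5)
breaks10 <- seq(20, 80, by = 10)

# right = FALSE makes each interval closed on the left: [25,30), [30,35), ...
code5 <- as.numeric(cut(ages, breaks = breaks5, right = FALSE))
code10 <- as.numeric(cut(ages, breaks = breaks10, right = FALSE))

# Side-by-side view of each age and its two group codes
data.frame(age = ages, code5 = code5, code10 = code10)
```

Because the intervals are left-closed, an age of exactly 30 falls in the second 5-year group rather than the first.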
In our example datasets, the variable sex
contains the
values: m for males and f for females.
Using dataset tester1, we’ll recode the variable sex
into a harmonized sex
variable. As the recode log below shows, the harmonized
sex
variable keeps the values m for males and f for
females.
Recode the sex variable in tester1.
sex_1 <- rec_with_table(data = tester1,
variables = "sex",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
var_labels = c(sex = "sex"),
database_name = 'tester1'
)
#> The variable sex was recoded into sex for the database tester1 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 27
#> 2 f f 182
#> 3 NA::a 9 0
#> 4 NA(b) else 0
head(sex_1)
#> sex
#> 1 f
#> 2 f
#> 3 m
#> 4 f
#> 5 f
#> 6 f
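The log above is organized as a table of rules (From, value_to, rows_recoded). As a rough illustration of the idea of a table-driven recode, and not of how rec_with_table() is implemented, the following base-R sketch applies a small hypothetical rules table with match(). The column names from and to and the raw values are assumptions made for this example.

```r
# Hypothetical rules table: one row per recode rule (names are illustrative)
rules <- data.frame(
  from = c("m", "f"),
  to = c("m", "f"),
  stringsAsFactors = FALSE
)

# Hypothetical raw values; "x" matches no rule
raw <- c("f", "f", "m", "x")

# Look up each raw value in the rules; unmatched values become NA
recoded <- rules$to[match(raw, rules$from)]

# Count how many rows each outcome received, including unmatched ones
table(recoded, useNA = "ifany")
```

Keeping the rules in a table rather than in code makes the recode easy to review and to reuse across datasets.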
We’ll recode and combine the variable sex
for our two
datasets.
Recode the sex variable in tester1 and tester2.
sex_1 <- rec_with_table(data = tester1,
variables = "sex",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
var_labels = c(sex = "Sex"),
database_name = 'tester1'
)
#> The variable sex was recoded into sex for the database tester1 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 27
#> 2 f f 182
#> 3 NA::a 9 0
#> 4 NA(b) else 0
head(sex_1)
#> sex
#> 1 f
#> 2 f
#> 3 m
#> 4 f
#> 5 f
#> 6 f
sex_2 <- rec_with_table(data = tester2,
variables = "sex",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
var_labels = c(sex = "Sex"),
database_name = 'tester2'
)
#> The variable sex was recoded into sex for the database tester2 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 17
#> 2 f f 192
#> 3 NA::a 9 0
#> 4 NA(b) else 0
tail(sex_2)
#> sex
#> 413 f
#> 414 f
#> 415 f
#> 416 f
#> 417 f
#> 418 f
sex_combined <- bind_rows(sex_1, sex_2)
head(sex_combined)
#> sex
#> 1 f
#> 2 f
#> 3 m
#> 4 f
#> 5 f
#> 6 f
tail(sex_combined)
#> sex
#> 413 f
#> 414 f
#> 415 f
#> 416 f
#> 417 f
#> 418 f
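After stacking two recoded datasets it can be worth confirming that no rows were lost or duplicated. The sketch below uses small hypothetical data frames and base rbind() as a stand-in for bind_rows(), so that it runs without any packages.

```r
# Hypothetical stand-ins for two recoded datasets
part1 <- data.frame(sex = c("f", "f", "m"), stringsAsFactors = FALSE)
part2 <- data.frame(sex = c("f", "m"), stringsAsFactors = FALSE)

# rbind() is used here as a base-R stand-in for bind_rows()
stacked <- rbind(part1, part2)

# The stacked data should have one row per input row
stopifnot(nrow(stacked) == nrow(part1) + nrow(part2))

# Category counts should equal the sum of the per-dataset counts
counts <- table(stacked$sex)
counts
```

The same two checks can be applied to sex_1, sex_2, and sex_combined.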
Labels are lost when the datasets are combined.
Use set_data_labels()
to label the variables in your
final dataset. set_data_labels()
sets the labels with the
original information in variables
and
variable_details
.
labeled_sex_combined <- set_data_labels(
data_to_label = sex_combined,
variable_details = recodeflow::tester_variable_details,
variables_sheet = recodeflow::tester_variables
)
You could have a situation where a variable is the same across datasets but its categories change.
In our example data the variable agegrp
is different in
tester1 and tester2.
In tester1, the agegrp variable is 5-year age groups: 25-29, 30-34, 35-39, etc.
In tester2, the agegrp variable is 10-year age groups: 20-29, 30-39, 40-49, etc.
There are three options to facilitate the use of variables with inconsistent categories across datasets.
Option 1: Recode the agegrp variable into a common variable only for datasets with the same category responses
Recode the agegrp
variable into a common variable only
in datasets where the categories are the same. If the categories are
different between datasets, separate columns will be created.
The categories in the agegrp
variable in tester1 are
different than the categories of agegrp
in tester2.
Therefore, it is not possible to have the same agegrp
categories across our example data sets.
Recode agegrp5 in tester1 and recode agegrp10 in tester2.
agegrp_1 <- rec_with_table(data = tester1,
variables = "agegrp5",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester1'
)
#> The variable agegrp was recoded into agegrp5 for the database tester1 the following recodes were made:
#> # A tibble: 12 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 1 1 2
#> 2 2 2 12
#> 3 3 3 15
#> 4 4 4 37
#> 5 5 5 37
#> 6 6 6 40
#> 7 7 7 28
#> 8 8 8 21
#> 9 9 9 9
#> 10 10 10 6
#> 11 11 11 2
#> 12 Na::b else 0
head(agegrp_1)
#> agegrp5
#> 1 7
#> 2 7
#> 3 10
#> 4 6
#> 5 3
#> 6 9
agegrp_2 <- rec_with_table(data = tester2,
variables = "agegrp10",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester2')
#> The variable agegrp was recoded into agegrp10 for the database tester2 the following recodes were made:
#> # A tibble: 7 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 1 1 1
#> 2 2 2 39
#> 3 3 3 52
#> 4 4 4 67
#> 5 5 5 45
#> 6 6 6 5
#> 7 NA(b) else 0
head(agegrp_2)
#> agegrp10
#> 210 3
#> 211 4
#> 212 3
#> 213 4
#> 214 5
#> 215 3
Combine the harmonized agegrp5 in tester1 with the harmonized agegrp10 in tester2.
agegrp_combined <- bind_rows(agegrp_1, agegrp_2)
head(agegrp_combined)
#> agegrp5 agegrp10
#> 1 7 <NA>
#> 2 7 <NA>
#> 3 10 <NA>
#> 4 6 <NA>
#> 5 3 <NA>
#> 6 9 <NA>
tail(agegrp_combined)
#> agegrp5 agegrp10
#> 413 <NA> 2
#> 414 <NA> 5
#> 415 <NA> 2
#> 416 <NA> 4
#> 417 <NA> 4
#> 418 <NA> 4
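The NA pattern in the combined output arises because the two recoded datasets have different columns, and bind_rows() fills the columns a dataset lacks with NA. The following base-R sketch mimics that behaviour with a small hypothetical helper, fill_missing(), which is not part of recodeflow or dplyr.

```r
# Hypothetical recoded fragments with non-overlapping columns
a <- data.frame(agegrp5 = c(7, 7, 10))
b <- data.frame(agegrp10 = c(3, 4, 3))

# Hypothetical helper: add any missing columns as NA, then order the columns
fill_missing <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- NA
  }
  df[, cols, drop = FALSE]
}

all_cols <- union(names(a), names(b))

# Stack the two fragments once they share the same columns
stacked <- rbind(fill_missing(a, all_cols), fill_missing(b, all_cols))
stacked
```

Each row has a value in exactly one of the two columns, which is why this option keeps the age groups separate rather than harmonized.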
Option 2: Recode the categorical agegrp variable into a continuous age_cont variable
Recode the categorical variable agegrp into a single harmonized continuous variable age_cont.
age_cont takes the midpoint age of each agegrp category in each dataset.
With this option, agegrp from each dataset can be combined into a single harmonized variable.
Recode agegrp in tester1 and agegrp in tester2 to the harmonized continuous variable age_cont.
agegrp_1_cont <- rec_with_table(data = tester1,
variables = "age_cont",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester1')
#> The variable agegrp was recoded into age_cont for the database tester1 the following recodes were made:
#> # A tibble: 12 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 27 1 2
#> 2 32 2 12
#> 3 37 3 15
#> 4 42 4 37
#> 5 47 5 37
#> 6 52 6 40
#> 7 57 7 28
#> 8 62 8 21
#> 9 67 9 9
#> 10 72 10 6
#> 11 77 11 2
#> 12 NA else 0
head(agegrp_1_cont)
#> age_cont
#> 1 57
#> 2 57
#> 3 72
#> 4 52
#> 5 37
#> 6 67
agegrp_2_cont <- rec_with_table(data = tester2,
variables = "age_cont",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester2')
#> The variable agegrp was recoded into age_cont for the database tester2 the following recodes were made:
#> # A tibble: 7 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 25 1 1
#> 2 35 2 39
#> 3 45 3 52
#> 4 55 4 67
#> 5 65 5 45
#> 6 75 6 5
#> 7 NA else 0
head(agegrp_2_cont)
#> age_cont
#> 210 45
#> 211 55
#> 212 45
#> 213 55
#> 214 65
#> 215 45
Combine the harmonized age_cont from tester1 and tester2.
agegrp_cont_combined <- bind_rows(agegrp_1_cont, agegrp_2_cont)
head(agegrp_cont_combined)
#> age_cont
#> 1 57
#> 2 57
#> 3 72
#> 4 52
#> 5 37
#> 6 67
tail(agegrp_cont_combined)
#> age_cont
#> 413 35
#> 414 65
#> 415 35
#> 416 55
#> 417 55
#> 418 55
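The midpoint mapping can be reproduced by hand with a lookup vector indexed by group code. This is only a sketch of the arithmetic, using the midpoints shown in the recode logs and the group codes shown in the earlier head() output; it is not the package's implementation.

```r
# Midpoints as listed in the recode logs
mid5 <- seq(27, 77, by = 5)    # group codes 1..11 in tester1
mid10 <- seq(25, 75, by = 10)  # group codes 1..6 in tester2

# Group codes copied from the head() output of the recoded age groups
codes5 <- c(7, 7, 10, 6, 3, 9)
codes10 <- c(3, 4, 3, 4, 5, 3)

# Index the lookup vectors by group code to get the midpoint ages
age_cont_1 <- mid5[codes5]
age_cont_2 <- mid10[codes10]

# Both results are on the same scale, so they can be stacked directly
age_cont_all <- c(age_cont_1, age_cont_2)
age_cont_all
```

Note that the midpoints from the two datasets differ in precision: a value of 57 stands for a 5-year band, while 55 stands for a 10-year band.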
Option 3: Recode the agegrp variable into a harmonized categorical variable
Dataset tester1 has 5-year age groups (e.g., 30-34, 35-39), and tester2 has 10-year age groups (e.g., 30-39). Therefore, we can collapse the 5-year age groups in dataset tester1 to the same 10-year age groups in dataset tester2.
Recode variable agegrp in tester1 into agegrp10. Recode variable agegrp in tester2 into agegrp10.
agegrp10_1 <- rec_with_table(data = tester1,
variables = "agegrp10",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester1')
#> The variable agegrp was recoded into agegrp10 for the database tester1 the following recodes were made:
#> # A tibble: 12 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 1 1 2
#> 2 2 2 12
#> 3 2 3 15
#> 4 3 4 37
#> 5 3 5 37
#> 6 4 6 40
#> 7 4 7 28
#> 8 5 8 21
#> 9 5 9 9
#> 10 6 10 6
#> 11 6 11 2
#> 12 NA(b) else 0
head(agegrp10_1)
#> agegrp10
#> 1 4
#> 2 4
#> 3 6
#> 4 4
#> 5 2
#> 6 5
agegrp10_2 <- rec_with_table(data = tester2,
variables = "agegrp10",
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester2')
#> The variable agegrp was recoded into agegrp10 for the database tester2 the following recodes were made:
#> # A tibble: 7 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 1 1 1
#> 2 2 2 39
#> 3 3 3 52
#> 4 4 4 67
#> 5 5 5 45
#> 6 6 6 5
#> 7 NA(b) else 0
head(agegrp10_2)
#> agegrp10
#> 210 3
#> 211 4
#> 212 3
#> 213 4
#> 214 5
#> 215 3
Combine the harmonized agegrp10 from tester1 and tester2.
agegrp10_combined <- bind_rows(agegrp10_1, agegrp10_2)
head(agegrp10_combined)
#> agegrp10
#> 1 4
#> 2 4
#> 3 6
#> 4 4
#> 5 2
#> 6 5
tail(agegrp10_combined)
#> agegrp10
#> 413 2
#> 414 5
#> 415 2
#> 416 4
#> 417 4
#> 418 4
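The collapse from 5-year codes to 10-year codes follows the From and value_to columns of the tester1 log. As a sketch of that mapping in base R (not the package's own code), a lookup vector indexed by the 5-year code gives the 10-year code:

```r
# 10-year code for each 5-year code 1..11, as listed in the tester1 log
collapse <- c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6)

# 5-year group codes copied from the head() output of agegrp5 in tester1
codes5 <- c(7, 7, 10, 6, 3, 9)

# Index the lookup vector by the 5-year code
codes10 <- collapse[codes5]
codes10
```

The first 5-year group (25-29) maps to a 10-year group on its own because the 10-year groups start at 20, while the 5-year groups start at 25.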
The variables argument in rec_with_table()
allows
multiple variables to be recoded from a dataset.
In this example, the age
and sex
variables
from the tester1 and tester2 datasets will be recoded and labeled using
rec_with_table()
.
We’ll then combine the two recoded datasets into a single dataset and
label it using set_data_labels()
.
Recode age and sex in datasets tester1 and tester2.
age_sex_1 <- rec_with_table(data = tester1,
variables = c("age", "sex"),
variable_details = recodeflow::tester_variable_details,
log = TRUE,
var_labels = c(age = "Age", sex = "Sex"),
database_name = 'tester1')
#> The variable age was recoded into age for the database tester1 the following recodes were made:
#> # A tibble: 3 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [20,80] 209
#> 2 NA::a 999 0
#> 3 NA else 0
#> The variable sex was recoded into sex for the database tester1 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 27
#> 2 f f 182
#> 3 NA::a 9 0
#> 4 NA(b) else 0
head(age_sex_1)
#> age sex
#> 1 58.76523 f
#> 2 56.44627 f
#> 3 70.07255 m
#> 4 54.74059 f
#> 5 38.10541 f
#> 6 66.25873 f
age_sex_2 <- rec_with_table(data = tester2,
variables = c("age", "sex"),
variable_details = recodeflow::tester_variable_details,
log = TRUE,
var_labels = c(age = "Age", sex = "Sex"),
database_name = 'tester2'
)
#> The variable age was recoded into age for the database tester2 the following recodes were made:
#> # A tibble: 3 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [20,80] 209
#> 2 NA::a 999 0
#> 3 NA else 0
#> The variable sex was recoded into sex for the database tester2 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 17
#> 2 f f 192
#> 3 NA::a 9 0
#> 4 NA(b) else 0
head(age_sex_2)
#> age sex
#> 210 49.76318 m
#> 211 52.91444 f
#> 212 47.26352 f
#> 213 50.20397 f
#> 214 69.34702 f
#> 215 41.16906 f
Combine the harmonized age and sex from tester1 and tester2.
combined_age_sex <- bind_rows(age_sex_1, age_sex_2)
head(combined_age_sex)
#> age sex
#> 1 58.76523 f
#> 2 56.44627 f
#> 3 70.07255 m
#> 4 54.74059 f
#> 5 38.10541 f
#> 6 66.25873 f
Use set_data_labels()
to label the variables in your
final dataset. set_data_labels()
sets the labels with the
original information in variables
and
variable_details
.
var_labels
can be used for all the variables in
variables.csv
or for a subset of variables.
labeled_combined_age_sex <-
set_data_labels(
data_to_label = combined_age_sex,
variable_details = recodeflow::tester_variable_details,
variables_sheet = recodeflow::tester_variables
)
You can check if labels have been added to your recoded dataset by
using get_label()
.
library(sjlabelled)
#>
#> Attaching package: 'sjlabelled'
#> The following object is masked from 'package:dplyr':
#>
#> as_label
get_label(labeled_combined_age_sex)
#> age sex
#> "age" "sex"
For more information on get_label()
and other label
helper functions, please refer to the sjlabelled
package.
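For a quick check without extra packages, variable labels can also be inspected with base R. The sketch below assumes that labels are stored in a "label" attribute on each column, which is the convention sjlabelled reads; the data frame and label text are hypothetical.

```r
# Hypothetical two-column dataset
df <- data.frame(
  age = c(58.8, 56.4),
  sex = c("f", "f"),
  stringsAsFactors = FALSE
)

# Assumption: variable labels live in a "label" attribute on each column
attr(df$age, "label") <- "Age"
attr(df$sex, "label") <- "Sex"

# Read the labels back as a named character vector
labels <- vapply(df, function(x) attr(x, "label"), character(1))
labels
```

If a column has no label, attr() returns NULL and vapply() stops with an error, which makes missing labels easy to spot.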
All the variables listed in the variables
worksheet can be
recoded with rec_with_table()
.
In this example, all variables specified in the
variables
worksheet will be recoded and combined for the
datasets tester1 and tester2.
recoded1 <- rec_with_table(data = tester1,
variables = recodeflow::tester_variables,
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester1'
)
#> The variable age was recoded into age for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [20,80] 209
#> 2 NA::a 999 0
#> 3 <NA> else 0
#> The variable agegrp was recoded into age_cont for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 52 6 40
#> 2 27 1 2
#> 3 32 2 12
#> 4 37 3 15
#> 5 42 4 37
#> 6 47 5 37
#> 7 57 7 28
#> 8 62 8 21
#> 9 67 9 9
#> 10 72 10 6
#> 11 77 11 2
#> 12 <NA> else 0
#> The variable agegrp was recoded into agegrp10 for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 6 10 6
#> 2 6 11 2
#> 3 2 3 15
#> 4 3 4 37
#> 5 1 1 2
#> 6 2 2 12
#> 7 4 7 28
#> 8 5 8 21
#> 9 3 5 37
#> 10 4 6 40
#> 11 5 9 9
#> 12 NA(b) else 0
#> The variable agegrp was recoded into agegrp5 for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 4 4 37
#> 2 5 5 37
#> 3 1 1 2
#> 4 2 2 12
#> 5 3 3 15
#> 6 9 9 9
#> 7 6 6 40
#> 8 7 7 28
#> 9 8 8 21
#> 10 10 10 6
#> 11 11 11 2
#> 12 Na::b else 0
#> The variable albumin was recoded into albumin for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [1,5] 209
#> 2 NA::a 99 0
#> 3 <NA> else 0
#> The variable alk.phos was recoded into alk.phos for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [200,15000] 209
#> 2 NA::a 99999 0
#> 3 <NA> else 0
#> The variable ascites was recoded into ascites for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 18
#> 2 NA::a 9 0
#> 3 0 0 191
#> 4 NA(b) else 0
#> The variable ast was recoded into ast for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [20,500] 209
#> 2 NA::a 9999 0
#> 3 <NA> else 0
#> The variable bili was recoded into bili for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,100] 209
#> 2 <NA> else 0
#> The variable chol was recoded into chol for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [100,2000] 186
#> 2 NA::a 9999 0
#> 3 <NA> else 23
#> The variable copper was recoded into copper for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,1000] 208
#> 2 NA::a 9999 0
#> 3 <NA> else 1
#> The variable edema was recoded into edema for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 0.5 0.5 22
#> 2 1 1 16
#> 3 0 0 171
#> 4 NA::a 9 0
#> 5 NA(b) else 0
#> The variable hepato was recoded into hepato for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 0 0 102
#> 2 1 1 107
#> 3 NA::a 9 0
#> 4 NA(b) else 0
#> The variable platelet was recoded into platelet for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,1000] 205
#> 2 NA::a 9999 0
#> 3 <NA> else 4
#> The variable protime was recoded into protime for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [5, 30] 209
#> 2 NA::a 99 0
#> 3 <NA> else 0
#> The variable sex was recoded into sex for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 m m 27
#> 2 NA::a 9 0
#> 3 f f 182
#> 4 NA(b) else 0
#> The variable spiders was recoded into spiders for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 0 0 145
#> 2 NA::a 9 0
#> 3 1 1 64
#> 4 NA(b) else 0
#> The variable stage was recoded into stage for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 12
#> 2 2 2 46
#> 3 4 4 74
#> 4 NA::a 9 0
#> 5 3 3 77
#> 6 NA(b) else 0
#> The variable status was recoded into status for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 7
#> 2 2 2 108
#> 3 0 0 94
#> 4 NA::a 9 0
#> 5 NA(b) else 0
#> The variable time was recoded into time for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,5000] 209
#> 2 NA::a 9999 0
#> 3 <NA> else 0
#> The variable trig was recoded into trig for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,1000] 185
#> 2 NA::a 9999 0
#> 3 <NA> else 24
#> The variable trt was recoded into trt for the database tester1 the following recodes were made:
#> value_to From rows_recoded
#> 1 2 2 103
#> 2 NA::a 9 0
#> 3 1 1 106
#> 4 NA(b) else 0
recoded2 <- rec_with_table(data = tester2,
variables = recodeflow::tester_variables,
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester2'
)
#> The variable age was recoded into age for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [20,80] 209
#> 2 NA::a 999 0
#> 3 <NA> else 0
#> The variable agegrp was recoded into age_cont for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 25 1 1
#> 2 35 2 39
#> 3 45 3 52
#> 4 55 4 67
#> 5 65 5 45
#> 6 75 6 5
#> 7 <NA> else 0
#> The variable agegrp was recoded into agegrp10 for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 1
#> 2 2 2 39
#> 3 3 3 52
#> 4 4 4 67
#> 5 5 5 45
#> 6 6 6 5
#> 7 NA(b) else 0
#> The variable albumin was recoded into albumin for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [1,5] 209
#> 2 NA::a 99 0
#> 3 <NA> else 0
#> The variable alk.phos was recoded into alk.phos for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [200,15000] 103
#> 2 NA::a 99999 0
#> 3 <NA> else 106
#> The variable ascites was recoded into ascites for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 6
#> 2 NA::a 9 0
#> 3 0 0 97
#> 4 NA(b) else 106
#> The variable ast was recoded into ast for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [20,500] 103
#> 2 NA::a 9999 0
#> 3 <NA> else 106
#> The variable bili was recoded into bili for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,100] 209
#> 2 <NA> else 0
#> The variable chol was recoded into chol for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [100,2000] 98
#> 2 NA::a 9999 0
#> 3 <NA> else 111
#> The variable copper was recoded into copper for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,1000] 102
#> 2 NA::a 9999 0
#> 3 <NA> else 107
#> The variable edema was recoded into edema for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 0.5 0.5 22
#> 2 1 1 4
#> 3 0 0 183
#> 4 NA::a 9 0
#> 5 NA(b) else 0
#> The variable hepato was recoded into hepato for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 0 0 50
#> 2 1 1 53
#> 3 NA::a 9 0
#> 4 NA(b) else 106
#> The variable platelet was recoded into platelet for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,1000] 202
#> 2 NA::a 9999 0
#> 3 <NA> else 7
#> The variable protime was recoded into protime for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [5, 30] 207
#> 2 NA::a 99 0
#> 3 <NA> else 2
#> The variable sex was recoded into sex for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 m m 17
#> 2 NA::a 9 0
#> 3 f f 192
#> 4 NA(b) else 0
#> The variable spiders was recoded into spiders for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 0 0 77
#> 2 NA::a 9 0
#> 3 1 1 26
#> 4 NA(b) else 106
#> The variable stage was recoded into stage for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 9
#> 2 2 2 46
#> 3 4 4 70
#> 4 NA::a 9 0
#> 5 3 3 78
#> 6 NA(b) else 6
#> The variable status was recoded into status for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 1 1 18
#> 2 2 2 53
#> 3 0 0 138
#> 4 NA::a 9 0
#> 5 NA(b) else 0
#> The variable time was recoded into time for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,5000] 209
#> 2 NA::a 9999 0
#> 3 <NA> else 0
#> The variable trig was recoded into trig for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 copy [0,1000] 97
#> 2 NA::a 9999 0
#> 3 <NA> else 112
#> The variable trt was recoded into trt for the database tester2 the following recodes were made:
#> value_to From rows_recoded
#> 1 2 2 51
#> 2 NA::a 9 0
#> 3 1 1 52
#> 4 NA(b) else 106
combined_dataset <- bind_rows(recoded1, recoded2)
labeled_combined <- set_data_labels(data_to_label = combined_dataset,
variable_details = recodeflow::tester_variable_details,
variables_sheet = recodeflow::tester_variables
)
To know the origin of each row of data, you can use the rec_with_table() argument attach_data_name.
When attach_data_name is set to TRUE, a column (data_name) is added with the name of the dataset each row comes from.
Recode age and sex and attach the dataset name for tester1 and tester2.
age_sex_1 <- rec_with_table(data = tester1,
variables = c("age", "sex"),
variable_details = recodeflow::tester_variable_details,
var_labels = c(age = "Age", sex = "Sex"),
log = TRUE,
attach_data_name = TRUE,
database_name = 'tester1'
)
#> The variable age was recoded into age for the database tester1 the following recodes were made:
#> # A tibble: 3 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [20,80] 209
#> 2 NA::a 999 0
#> 3 NA else 0
#> The variable sex was recoded into sex for the database tester1 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 27
#> 2 f f 182
#> 3 NA::a 9 0
#> 4 NA(b) else 0
age_sex_2 <- rec_with_table(data = tester2,
variables = c("age", "sex"),
variable_details = recodeflow::tester_variable_details,
var_labels = c(age = "Age", sex = "Sex"),
log = TRUE,
attach_data_name = TRUE,
database_name = 'tester2'
)
#> The variable age was recoded into age for the database tester2 the following recodes were made:
#> # A tibble: 3 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [20,80] 209
#> 2 NA::a 999 0
#> 3 NA else 0
#> The variable sex was recoded into sex for the database tester2 the following recodes were made:
#> # A tibble: 4 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 m m 17
#> 2 f f 192
#> 3 NA::a 9 0
#> 4 NA(b) else 0
combined_age_sex <- bind_rows(age_sex_1, age_sex_2)
head(combined_age_sex)
#> age sex data_name
#> 1 58.76523 f tester1
#> 2 56.44627 f tester1
#> 3 70.07255 m tester1
#> 4 54.74059 f tester1
#> 5 38.10541 f tester1
#> 6 66.25873 f tester1
tail(combined_age_sex)
#> age sex data_name
#> 413 35.00068 f tester2
#> 414 67.00068 f tester2
#> 415 39.00068 f tester2
#> 416 56.99932 f tester2
#> 417 58.00137 f tester2
#> 418 52.99932 f tester2
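The effect of a source column can be illustrated by adding one by hand before stacking. This base-R sketch uses hypothetical data frames and rbind() as a stand-in for bind_rows(); it shows the idea only and is not how attach_data_name is implemented.

```r
# Hypothetical recoded fragments
part1 <- data.frame(age = c(58.8, 56.4, 70.1))
part2 <- data.frame(age = c(49.8, 52.9))

# Tag every row with the name of the dataset it came from
part1$data_name <- "tester1"
part2$data_name <- "tester2"

# Stack the fragments; the source column travels with each row
stacked <- rbind(part1, part2)

# Rows per source dataset
source_counts <- table(stacked$data_name)
source_counts
```

With the source recorded per row, later analyses can stratify or filter by dataset without re-deriving row ranges.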
Derived variables are variables that are not in the original dataset; rather they are created using variables from the original dataset.
Descriptions of derived functions are in the article derived functions
To recode a derived variable, you must:
define the derived variable in variables and variable_details.
Our example derived variable example_der equals chol times bili.
Recode chol and bili and the derived variable example_der for tester1 and tester2.
derived1 <- rec_with_table(data = tester1,
variables = c("chol", "bili","example_der"),
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester1')
#> The variable bili was recoded into bili for the database tester1 the following recodes were made:
#> # A tibble: 2 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [0,100] 209
#> 2 NA else 0
#> The variable chol was recoded into chol for the database tester1 the following recodes were made:
#> # A tibble: 3 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [100,2000] 186
#> 2 NA::a 9999 0
#> 3 NA else 23
derived2 <- rec_with_table(data = tester2,
variables = c("chol", "bili","example_der"),
variable_details = recodeflow::tester_variable_details,
log = TRUE,
database_name = 'tester2')
#> The variable bili was recoded into bili for the database tester2 the following recodes were made:
#> # A tibble: 2 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [0,100] 209
#> 2 NA else 0
#> The variable chol was recoded into chol for the database tester2 the following recodes were made:
#> # A tibble: 3 × 3
#> value_to From rows_recoded
#> <chr> <chr> <int>
#> 1 copy [100,2000] 98
#> 2 NA::a 9999 0
#> 3 NA else 111
Combine the recoded chol, bili, and example_der from tester1 and tester2.
combined_der <- bind_rows(derived1, derived2)
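As a plain-R illustration of what example_der represents, the product of chol and bili can be computed directly on a small hypothetical data frame. This sketch only shows the arithmetic; whether recodeflow treats missing inputs in the same way is not shown in the output above.

```r
# Hypothetical values; the second row has a missing chol
toy <- data.frame(
  chol = c(261, NA, 176),
  bili = c(14.5, 1.1, 1.4)
)

# Derived variable: chol times bili; in base R a missing input gives NA
toy$example_der <- toy$chol * toy$bili
toy
```

Comparing such a hand calculation with a few rows of combined_der is a simple way to confirm that the derived variable was computed as intended.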