There are two types of derived variables in the CCHS surveys. Both types
of derived variables are supported in cchsflow.
cchsflow calculates these more complex derived variables using functions
that are referenced in variable_details.csv within the RecTo section with
the prefix ‘Func::’. The variables used in the function are referenced in
the variableStart section with the prefix ‘DerivedVar::’. For example,
BMI (HWTGBMI_der) includes Func::bmi_fun in the RecTo section and
DerivedVar::[HWTGHTM, HWTGWTK] in the variableStart section, which
indicates the two starting variables (HWTGHTM, HWTGWTK).
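To make the two prefixes concrete, here is a minimal base R sketch of how a ‘Func::’ entry and a ‘DerivedVar::’ entry could be turned into a function call. The helper names parse_func() and parse_derived_vars() are hypothetical, the toy data frame is illustrative, and bmi_fun here is a simplified stand-in; this is not necessarily how cchsflow resolves these fields internally.

```r
# Hypothetical helpers for illustration; cchsflow's own parsing may differ.
parse_func <- function(rec_to) {
  # "Func::bmi_fun" -> "bmi_fun"
  sub("^Func::", "", rec_to)
}

parse_derived_vars <- function(variable_start) {
  # "DerivedVar::[HWTGHTM, HWTGWTK]" -> c("HWTGHTM", "HWTGWTK")
  inner <- sub("^DerivedVar::\\[(.*)\\]$", "\\1", variable_start)
  trimws(strsplit(inner, ",", fixed = TRUE)[[1]])
}

# Simplified stand-in for the BMI function: weight (kg) / height (m)^2.
bmi_fun <- function(HWTGHTM, HWTGWTK) {
  HWTGWTK / (HWTGHTM * HWTGHTM)
}

# Toy data with already-harmonized height and weight (illustrative values).
toy <- data.frame(HWTGHTM = c(1.651, 1.753), HWTGWTK = c(81, NA))

fun_name  <- parse_func("Func::bmi_fun")
var_names <- parse_derived_vars("DerivedVar::[HWTGHTM, HWTGWTK]")

# Call the named function, passing the starting variables as named arguments.
toy$HWTGBMI_der <- do.call(fun_name, as.list(toy[var_names]))
toy
```

The point of the sketch is the division of labour: the RecTo entry names the function, and the variableStart entry names the columns handed to it.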
While BMI is calculated across all CCHS cycles, the method by which
it is calculated varies across cycles, leading to misclassification
error that might affect your study. As such, a derived variable for BMI
has been created in cchsflow that uses harmonized height (HWTGHTM) and
weight (HWTGWTK) variables across all CCHS cycles.
Using rec_with_table(), you can transform the derived BMI variable
across multiple CCHS cycles and create a transformed dataset.
In order to derive variables, you must first load the existing custom function associated with the derived variable:
# Custom ifelse that treats an NA condition as FALSE
if_else2 <- function(x, a, b) {
  falseifNA <- function(x) {
    ifelse(is.na(x), FALSE, x)
  }
  ifelse(falseifNA(x), a, b)
}

# BMI derived variable
# HWTGHTM: height (in meters)
# HWTGWTK: weight (in kilograms)
bmi_fun <- function(HWTGHTM, HWTGWTK) {
  # Return weight / height^2 only when both inputs are present, else NA
  if_else2((!is.na(HWTGHTM)) & (!is.na(HWTGWTK)),
           (HWTGWTK / (HWTGHTM * HWTGHTM)), NA)
}
bmi2003 <- rec_with_table(cchs2003_p,
                          variables = c("HWTGHTM", "HWTGWTK", "HWTGBMI_der"),
                          log = TRUE)
## No variable_details detected.
## Loading cchsflow variable_details
## Using the passed data variable name as database_name
## NOTE for HWTGHTM: 2001 and 2003 CCHS use inches, values converted to meters to 3 decimal points
## NOTE for HWTGHTM: 74+ inches converted to 76 inches
## The variable HWTCGHT was recoded into HWTGHTM for the database cchs2003_p the following recodes were made:
## value_to From rows_recoded
## 1 1.118 1 0
## 2 1.143 2 0
## 3 1.168 3 0
## 4 1.194 4 0
## 5 1.219 5 0
## 6 1.245 6 0
## 7 1.27 7 0
## 8 1.295 8 0
## 9 1.321 9 0
## 10 1.346 10 0
## 11 1.372 11 0
## 12 1.397 12 0
## 13 1.422 13 0
## 14 1.448 14 1
## 15 1.473 15 0
## 16 1.499 16 2
## 17 1.524 17 5
## 18 1.549 18 8
## 19 1.575 19 14
## 20 1.6 20 16
## 21 1.626 21 17
## 22 1.651 22 19
## 23 1.676 23 17
## 24 1.702 24 25
## 25 1.727 25 16
## 26 1.753 26 10
## 27 1.778 27 13
## 28 1.803 28 14
## 29 1.829 29 10
## 30 1.854 30 6
## 31 1.93 31 6
## 32 NA::a 96 0
## 33 NA::b 99 1
## The variable HWTCGWTK was recoded into HWTGWTK for the database cchs2003_p the following recodes were made:
## value_to From rows_recoded
## 1 copy [27.0,135.0] 192
## 2 NA::a 996 0
## 3 NA::b [997,999] 8
bmi2010 <- rec_with_table(cchs2010_p,
                          variables = c("HWTGHTM", "HWTGWTK", "HWTGBMI_der"),
                          log = TRUE)
## No variable_details detected.
## Loading cchsflow variable_details
## Using the passed data variable name as database_name
## NOTE for HWTGHTM: Height is a reported in meters from 2005 CCHS onwards
## The variable HWTGHTM was recoded into HWTGHTM for the database cchs2010_p the following recodes were made:
## value_to From rows_recoded
## 1 copy [0.914,2.134] 190
## 2 NA::a 9.996 2
## 3 NA::b [9.997,9.999] 8
## The variable HWTGWTK was recoded into HWTGWTK for the database cchs2010_p the following recodes were made:
## value_to From rows_recoded
## 1 copy [27.0,135.0] 186
## 2 NA::a 999.96 0
## 3 NA::b [999.97,999.99] 14
Since derived variables are based on previously transformed
variables, if you want to transform only your derived variable, you must
also specify its base CCHS variables in rec_with_table(), as shown above.
So for the derived BMI variable, you will also have to specify the
height (HWTGHTM) and weight (HWTGWTK) variables.
Using bind_rows(), you can then combine your transformed datasets.
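A minimal sketch of the combining step, assuming the dplyr package is installed. The two toy data frames below are hypothetical stand-ins for bmi2003 and bmi2010, with illustrative values only.

```r
library(dplyr)

# Toy stand-ins for the transformed datasets; values are illustrative only.
toy2003 <- data.frame(HWTGHTM = c(1.651, 1.753), HWTGWTK = c(81, 81))
toy2010 <- data.frame(HWTGHTM = c(1.575, 1.829), HWTGWTK = c(77, 106))

# Stack the rows of both datasets; columns are matched by name.
combined <- bind_rows(toy2003, toy2010)
nrow(combined)
```

With the objects created earlier, the equivalent call would be bind_rows(bmi2003, bmi2010).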
HWTGHTM | HWTGWTK | HWTGBMI_der |
---|---|---|
1.651 | 81 | 29.71604 |
1.753 | 81 | 26.35853 |
1.575 | 77 | 31.04056 |
1.829 | 106 | 31.68681 |
1.727 | 72 | 24.14059 |
1.549 | 81 | 33.75843 |
1.727 | 81 | 27.15816 |
1.651 | NA | NA |
1.727 | 77 | 25.81702 |
1.676 | 68 | 24.20811 |
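
As a quick check, the HWTGBMI_der values in the table can be reproduced by hand from the height and weight columns. This is a minimal base R sketch using three rows copied from the table; the variable names height_m, weight_kg and bmi are chosen here for illustration.

```r
# First, second, and eighth rows of the table (height in m, weight in kg).
height_m  <- c(1.651, 1.753, 1.651)
weight_kg <- c(81, 81, NA)

# BMI = weight (kg) / height (m)^2; a missing input propagates to NA.
bmi <- weight_kg / height_m^2
round(bmi, 5)
```

The first two results should agree with the table to the displayed precision, and the row with a missing weight yields NA.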
Creating a derived variable requires the harmonization of existing CCHS variables and a custom function that uses those harmonized variables. For more information on how to create a derived variable, see here.