recodeflow supports the use of derived variables. A derived variable can be calculated by any custom function, as long as the calculation can be done on a per-row basis. Functions that require operations across rows or on the full data set are not supported.
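To illustrate the per-row constraint, the sketch below contrasts a function that depends only on the values in one row with one that needs the whole column. Both functions and the small data frame are hypothetical and are not part of recodeflow.

```r
# Hypothetical sample data, for illustration only.
df <- data.frame(chol = c(261, 302, 176), bili = c(14.5, 1.1, 1.4))

# Per-row: each result depends only on that row's values,
# so a function of this shape can be used for a derived variable.
product_fun <- function(chol, bili) {
  chol * bili
}

# Cross-row: the result for a row depends on the column mean,
# that is, on every other row, so it does not fit the per-row model.
centre_chol <- function(chol) {
  chol - mean(chol)
}

# The per-row function gives the same answer for row 2 whether it
# sees the full data or only that row.
product_all <- product_fun(df$chol, df$bili)
product_row2 <- product_fun(df$chol[2], df$bili[2])

# The cross-row function gives a different answer for row 2 in isolation.
centred_all <- centre_chol(df$chol)
centred_row2 <- centre_chol(df$chol[2])  # always 0 for a single value
```

A quick check of whether a function qualifies is to ask if its result for one row would change when the other rows are removed.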
The two most common uses for derived variables are:
To create derived variables, you need to complete two steps:
We’ll walk through an example of creating a derived variable with our example data.
Our custom function multiplies the blood concentration of cholesterol (chol) by the blood concentration of bilirubin (bili).
Create the custom function: Here is the customized function for our derived variable (chol*bili):
#' example_der_fun calculates chol * bili
#'
#' @param chol the row value for chol
#' @param bili the row value for bili
#' @return the product of chol and bili, or NA if either input is NA
#' @export
example_der_fun <- function(chol, bili) {
  # as.numeric() coerces the inputs in case categorical numeric variables are used.
  # Warning: if either chol or bili is NA, the result is NA.
  example_der <- as.numeric(chol) * as.numeric(bili)
  return(example_der)
}
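As a quick check before wiring the function into the worksheets, it can be called directly. The sketch below repeats the definition so it runs on its own, and the input values are illustrative. It also shows a coercion pitfall in base R that is worth ruling out if your inputs are stored as factors.

```r
# Same definition as above, repeated so this block runs on its own.
example_der_fun <- function(chol, bili) {
  as.numeric(chol) * as.numeric(bili)
}

numeric_result <- example_der_fun(261, 14.5)        # numeric inputs
character_result <- example_der_fun("261", "14.5")  # character inputs are coerced
missing_result <- example_der_fun(NA, 14.5)         # a missing input gives NA

# Caution: as.numeric() on a factor returns the internal level codes,
# not the labels, so convert factors with as.character() first.
chol_factor <- factor("261")
level_code <- as.numeric(chol_factor)                 # 1, the level code
label_value <- as.numeric(as.character(chol_factor))  # 261, the label
```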
Note: You must use roxygen2 documentation for custom functions; otherwise, the function cannot be attached to a package. See the roxygen2 documentation for how to format and document your function.
Load the custom function into your R environment. Load the customized function by either:
rec_with_table parameter to pass the path to your function R script.
If you don’t load the customized function, you cannot create the derived variable.
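One way to load the function is base R's source(). The sketch below writes the function to a temporary script purely so the example is self-contained. In practice you would point source() at your own file, and the temporary path here is only for illustration.

```r
# Write the function to a temporary script (stand-in for your own .R file).
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "example_der_fun <- function(chol, bili) {",
  "  as.numeric(chol) * as.numeric(bili)",
  "}"
), script_path)

# source() evaluates the script; by default the function is defined
# in the global environment.
source(script_path)

# Confirm the function is available before recoding.
is_loaded <- exists("example_der_fun", mode = "function")
```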
Add the derived variable to the variable_details and variables worksheets.
Add the derived variable to the variables worksheet. You’ll use the same nomenclature as any other variable. See the article variables_sheet for nomenclature rules.
Add the derived variable to the variable_details worksheet. See the article variable_details for nomenclature rules.
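As a rough sketch of what a derived-variable entry might look like, the data frame below shows a single row with three columns. The column names (variable, variableStart, recEnd) and the "DerivedVar::" and "Func::" prefixes are assumptions about the worksheet convention, and a real row has more columns, so check the variable_details article before relying on them.

```r
# Assumed layout: only three columns are shown, and the "DerivedVar::" and
# "Func::" prefixes are an assumption to verify against the article.
derived_row <- data.frame(
  variable         = "example_der",
  variableStart    = "DerivedVar::[chol, bili]",
  recEnd           = "Func::example_der_fun",
  stringsAsFactors = FALSE
)

# Pull the function name and the input variables back out of the row.
fun_name <- sub("^Func::", "", derived_row$recEnd)
inputs <- strsplit(
  gsub("^DerivedVar::\\[|\\]$", "", derived_row$variableStart),
  ",\\s*"
)[[1]]
```

Under these assumptions, the row ties the derived variable to its input variables (chol and bili) and to the custom function that computes it.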
Use the function rec_with_table to recode the data and create your derived variable.
#Load the package
library(recodeflow)
Recode the underlying variables (chol and bili) and the derived variable (example_der).
derived1 <- rec_with_table(
data = pbc,
variables = c("chol", "bili","example_der"),
variable_details = recodeflow::tester_variable_details,
database_name = 'tester1',
log = TRUE)
## The variable bili was recoded into bili for the database tester1 the following recodes were made:
## # A tibble: 2 × 3
## value_to From rows_recoded
## <chr> <chr> <int>
## 1 copy [0,100] 418
## 2 NA else 0
## The variable chol was recoded into chol for the database tester1 the following recodes were made:
## # A tibble: 3 × 3
## value_to From rows_recoded
## <chr> <chr> <int>
## 1 copy [100,2000] 284
## 2 NA::a 9999 0
## 3 NA else 134
## bili chol example_der
## 1 14.5 261 3784.5
## 2 1.1 302 332.2
## 3 1.4 176 246.4
## 4 1.8 244 439.2
## 5 3.4 279 948.6
## 6 0.8 248 198.4
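The printed rows can be checked by hand against the custom function. The sketch below copies the six rows shown above into a data frame and recomputes example_der one row at a time. It repeats the function definition so it runs without recodeflow.

```r
# Same computation as the custom function, repeated so this block is standalone.
example_der_fun <- function(chol, bili) {
  as.numeric(chol) * as.numeric(bili)
}

# First six rows copied from the output above.
printed <- data.frame(
  bili        = c(14.5, 1.1, 1.4, 1.8, 3.4, 0.8),
  chol        = c(261, 302, 176, 244, 279, 248),
  example_der = c(3784.5, 332.2, 246.4, 439.2, 948.6, 198.4)
)

# Apply the function one row at a time, as a per-row derived variable would be.
recomputed <- mapply(example_der_fun, printed$chol, printed$bili)

# all.equal() allows for floating-point rounding in the products.
matches <- isTRUE(all.equal(recomputed, printed$example_der))
```

Because the function returns NA when either input is NA, rows where chol was recoded to NA (134 rows in the log above) would be expected to have NA for example_der as well.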