Main generation functions

Generate categorical, continuous, date, and survival variables. Use create_mock_data() for batch generation or individual functions for fine-grained control.

create_cat_var()
Create categorical variable for MockData
create_con_var()
Create continuous variable for MockData
create_date_var()
Create date variable for MockData
create_survival_dates()
Create paired survival dates for cohort studies
create_wide_survival_data()
Create wide survival data for cohort studies
create_mock_data()
Create mock data from configuration files

Configuration

Read and validate v0.2 configuration files. Import existing recodeflow metadata or create new configurations for mock data generation workflows.

read_mock_data_config()
Read and validate MockData configuration file defining variable specifications for mock data generation
read_mock_data_config_details()
Read and validate MockData configuration details file containing distribution parameters and category proportions
validate_mock_data_config()
Validate MockData configuration against schema requirements including required columns and unique identifiers
validate_mock_data_config_details()
Validate MockData configuration details against schema requirements including proportion sums and parameter completeness
validate_mockdata_metadata()
Validate MockData extension fields
import_from_recodeflow()
Import and convert recodeflow variables and variable details metadata files to MockData configuration format

Helper functions

Utilities for metadata processing, proportions, type coercion, and data quality. They support the main generation functions and can also be called directly in custom workflows.

get_variable_details()
Get variable details for a specific variable
extract_proportions()
Extract proportions from details subset
extract_distribution_params()
Extract distribution parameters from details
sample_with_proportions()
Sample with proportions
apply_missing_codes()
Apply missing codes to values
apply_rtype_defaults()
Apply rType defaults to variable details
add_garbage()
Add garbage specifications to variables data frame
apply_garbage()
Apply garbage data from variables.csv
has_garbage()
Check if garbage is specified
make_garbage()
Make garbage
generate_garbage_values()
Generate garbage values
print(<mockdata_validation_result>)
Print MockData validation results
get_variables_by_role()
Get variables by role
get_enabled_variables()
Get enabled variables
identify_derived_vars()
Identify derived variables using recodeflow patterns
get_raw_var_dependencies()
Extract raw variable dependencies from derived variable metadata
get_cycle_variables()
Get list of variables used in a specific database/cycle
get_raw_variables()
Get list of unique raw variables for a database/cycle

Parsers

Parse recodeflow notation for variable specifications and range syntax. Extract structured information from metadata for mock data generation.

parse_range_notation()
Parse range notation from variable_details
parse_variable_start()
Parse variableStart field to extract raw variable name