Reference
variables
Contains all the variables needed for algorithm development and implementation
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The name of the variable | string | |
| role | The different processes that the variable was part of during the algorithm development | string | |
| variableType | The statistical type of the variable | category | Categorical: Categorical type; Continuous: Continuous type |
| databaseStart | The databases that the variable can be harmonized from | string | |
| variableStart | The names of the variable for each starting database | string | |
| units | The units for the variable | string | |
| label | A short description of the variable | string | |
| labelLong | A long description of the variable | string | |
| description | Any issues that should be taken into account before using this variable | string | |
variable-details
Contains the details for the variables defined in a variables sheet
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The name of the variable | string | |
| dummyVariable | The name of the dummy variable. Not valid if the variable is continuous. | string | |
| typeEnd | The type of the variable | category | cat: Categorical type; cont: Continuous type |
| databaseStart | The starting databases whose harmonization details this row contains | string | |
| variableStart | The names of the variable for each starting database | string | |
| typeStart | The type of the starting variables | category | cat: Categorical type; cont: Continuous type; N/A: Used mainly for derived variables when there are many start variables |
| recEnd | The value that the variable will be harmonized to. Can be a number or an interval. | string | |
| catLabel | The description for the category. | string | |
| catLabelLong | A longer description for the category. | string | |
| numValidCat | The number of valid categories for the variables. | number | |
| units | The units for the variable. Only for continuous variables. | string | |
| recStart | The value of the starting variable to harmonize from. Can be a single value or an interval. | string | |
| catStartLabel | The description for the category of the start variable. | string | |
| variableStartShortLabel | A short description of the starting variables | string | |
| variableStartLabel | A long description of the starting variables | string | |
| notes | Any issues/problems that users of this variable should consider. Can also include any problems encountered during harmonization. | string | |
lookup
Contains the score range that each bin belongs to
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| catValue | The bin number | number | |
| range | The range of score values for this bin. Should use mathematical range (interval) notation. | string | |
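
The `range` column can be read mechanically once its notation is fixed. The following is a minimal sketch, assuming intervals are written like `[0,10)` with numeric (or `inf`/`-inf`) endpoints; the exact accepted syntax is not specified in this reference, and `parse_range` and `find_bin` are hypothetical helper names.

```python
import re
from typing import NamedTuple


class Interval(NamedTuple):
    low: float
    high: float
    low_closed: bool
    high_closed: bool

    def contains(self, x: float) -> bool:
        # Respect open/closed endpoints on each side.
        above = x >= self.low if self.low_closed else x > self.low
        below = x <= self.high if self.high_closed else x < self.high
        return above and below


# Assumed syntax: "[low,high]", "(low,high)", or a mix of bracket types.
_RANGE_RE = re.compile(r"^\s*([\[\(])\s*([^,]+?)\s*,\s*([^,]+?)\s*([\]\)])\s*$")


def parse_range(text: str) -> Interval:
    match = _RANGE_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid range: {text!r}")
    left, low, high, right = match.groups()
    return Interval(float(low), float(high), left == "[", right == "]")


def find_bin(rows, score):
    """Return the catValue of the first row whose range contains score, else None."""
    for row in rows:
        if parse_range(row["range"]).contains(score):
            return row["catValue"]
    return None
```

Here `rows` stands for the lookup sheet loaded as a list of dictionaries keyed by column name, for example `find_bin([{"catValue": 1, "range": "[0,10)"}], 3.5)`.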
descriptive
Contains descriptive statistics (for example, mean and median) for variables
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The variable whose descriptive statistics the row contains | string | |
| catValue | The value of the category whose descriptive statistics the row contains. N/A for continuous variables. | number | |
| n | The number of individuals used to calculate the row’s statistics | number | |
| proportion | The proportion of all the individuals in the study used to calculate this row’s statistics | number | |
| median | The median statistic | number | |
model-export
Contains the list of all files that are part of the export for an algorithm
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| fileType | The type of file | category | variables: A variables file; variable-details: A variable details file; descriptive: A descriptive lookup file; descriptive-bins: A descriptive bins file; model-export: A model export file; model-steps: A model steps file; dummy: A dummy file; center: A centering file; rcs: An RCS file; interaction: An interaction file; fine-and-gray: A fine and gray file; cox: A cox file; survival_bins: File with survival data for the different bins; lookup: Contains the range of score values for each bin in a survival algorithm; validate: Contains the validation rules for variables in the algorithm; tables: Defines a tables model export file |
| filePath | The path to the file relative to the model export file | string | |
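
Because `filePath` is relative to the model export file rather than to the working directory, a loader has to resolve each path against the export file's folder. A minimal sketch, assuming the sheets are stored as CSV files with a header row (the storage format is not stated in this reference) and using the hypothetical helper name `load_model_export`:

```python
import csv
from pathlib import Path


def load_model_export(export_path):
    """Map each fileType to the list of resolved file paths it refers to."""
    export_path = Path(export_path)
    base_dir = export_path.parent  # filePath values are relative to this folder
    files = {}
    with export_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            files.setdefault(row["fileType"], []).append(base_dir / row["filePath"])
    return files
```

A list is kept per `fileType` because nothing in this reference rules out several files of the same type in one export.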
model-steps
Contains the steps to score an algorithm
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| step | The type of step to perform | category | dummy: Step used to create dummy variables; center: Step used to create centered variables; rcs: Step used to create spline terms; interaction: Step used to create interaction terms; fine-and-gray: Step used to calculate the outcome of a fine and gray model; cox: Step used to calculate the outcome of a cox proportional hazards model; simple-model: Step used to calculate the outcome of a simple model; logistic-regression: Step used to calculate the outcome of a logistic regression model |
| fileType | The type of file referenced in the file path column | category | N/A: Missing since certain steps don’t need to specify the file type; beta-coefficients: A beta coefficients file for a fine and gray model or a cox model; baseline-hazards: The baseline hazards for a fine and gray model or a cox model |
| filePath | The path to the file relative to the location of the model steps file | string | |
| notes | Any notes for the future | string | |
dummy
Contains the variables to convert to dummy variables
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| origVariable | The name of the variable to dummy | string | |
| catValue | The category value in the original variable which the dummy variable represents | string | |
| dummyVariable | The name of the dummy variable | string | |
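
Each row of this sheet describes one indicator column. A minimal sketch of how a scorer might apply the rows to a single record, assuming the dummy variable is coded 1 when the original value equals `catValue` and 0 otherwise (the coding is not spelled out in this reference; `apply_dummy_rows` is a hypothetical helper name):

```python
def apply_dummy_rows(record, dummy_rows):
    """Return a copy of record with one 0/1 column added per dummy row."""
    out = dict(record)
    for row in dummy_rows:
        # Compare as strings because catValue is typed as a string in the sheet.
        matches = str(record[row["origVariable"]]) == str(row["catValue"])
        out[row["dummyVariable"]] = 1 if matches else 0
    return out
```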
center
Contains the variables to create centered variables
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| origVariable | The name of the variable to center | string | |
| centerValue | The value to center with | string | |
| centeredVariable | The name of the new centered variable | string | |
| centeredVariableType | The type of the new centered variable | category | cat: Categorical type; cont: Continuous type |
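
As an illustration of how these rows could be applied, the sketch below assumes that centering means subtracting `centerValue` from the original value, which is the usual meaning but is not stated explicitly in this reference. `apply_center_rows` is a hypothetical helper name.

```python
def apply_center_rows(record, center_rows):
    """Return a copy of record with one centered column added per center row."""
    out = dict(record)
    for row in center_rows:
        # centerValue is typed as a string in the sheet, so convert before subtracting.
        original = float(record[row["origVariable"]])
        out[row["centeredVariable"]] = original - float(row["centerValue"])
    return out
```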
interaction
Contains the interaction variables for an algorithm
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| interactionVariable | The name of the interaction variable | string | |
| interactingVariables | The names of the variables that are part of this interaction variable | string | |
| interactionVariableType | The statistical type of the interaction variable | category | cat: Categorical type; cont: Continuous type |
rcs
Contains the RCS variables for an algorithm
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The name of the variable which will be splined | string | |
| rcsVariables | The names of the spline variables in correct order | string | |
| knots | The knot values to use, separated by semicolons | string | |
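
To make the `knots` column concrete, the sketch below parses a semicolon-separated knot string and computes restricted cubic spline terms. It assumes the truncated power basis with the normalization described by Harrell (as used in the R `rms` package); this reference does not say which formulation the scorer uses, so treat the formula as an assumption. `parse_knots` and `rcs_basis` are hypothetical helper names.

```python
def parse_knots(text):
    """Parse a value such as "1;2;3" into a list of floats."""
    return [float(part) for part in text.split(";") if part.strip()]


def rcs_basis(x, knots):
    """Return [x, term_1, ..., term_{k-2}] for k knots (k - 1 values in total)."""
    k = len(knots)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")
    t = sorted(knots)
    norm = (t[-1] - t[0]) ** 2  # assumed scaling, following Harrell's convention
    last_gap = t[k - 1] - t[k - 2]

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    terms = [x]  # the first spline variable is the untransformed value
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[k - 2]) * (t[k - 1] - t[j]) / last_gap
            + cube_plus(x - t[k - 1]) * (t[k - 2] - t[j]) / last_gap
        )
        terms.append(term / norm)
    return terms
```

Under this assumption, the values returned would be assigned, in order, to the names listed in `rcsVariables`. The nonlinear terms are zero below the first knot and linear beyond the last knot, which is the defining property of a restricted cubic spline.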
validate
Contains the validation rules for variables in the algorithm
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The name of the variable to validate | string | |
| rule | The type of validation rule to apply to a variable | category | type: Validate the data type of the variable; range: Validate that the variable is within a range of values. Meant for continuous variables; allowed: Validate that the variable contains a value from a set of valid values. Meant for categorical variables; nullable: Whether missing values are allowed for a variable |
| value | The value to use when applying the validation | string | |
| error_handle | How to handle failed validations | category | error: Failed validations for the variable should throw an error, stopping the scoring process; warning: Failed validations for the variable should log a warning and continue the scoring process; truncate: Failed validations for the variable should log a warning, truncate the failed variable value, and continue the scoring process |
| error_replace | The value used to replace variable values that fail a validation whose error_handle value is warning | string | |
| location | The step in the scoring process at which the validation should be applied | string | |
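
The three `error_handle` values can be illustrated with a `range` rule. The sketch below is not the scorer's implementation: it assumes the bounds have already been parsed from the `value` column (whose format is not specified here), that truncation means clamping to the nearest bound, and that a `warning` without an `error_replace` keeps the original value. `check_range` and `ValidationError` are hypothetical names.

```python
import logging

logger = logging.getLogger("validate")


class ValidationError(Exception):
    """Raised when a failed validation must stop the scoring process."""


def check_range(variable, value, low, high, error_handle, error_replace=None):
    """Validate low <= value <= high and react according to error_handle."""
    if low <= value <= high:
        return value
    message = f"{variable}={value} is outside [{low}, {high}]"
    if error_handle == "error":
        raise ValidationError(message)
    if error_handle == "truncate":
        logger.warning(message)
        # Assumption: truncating means clamping to the nearest bound.
        return min(max(value, low), high)
    if error_handle == "warning":
        logger.warning(message)
        # Assumption: keep the original value when no replacement is given.
        return value if error_replace is None else error_replace
    raise ValueError(f"unknown error_handle: {error_handle!r}")
```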
beta-coefficients
Contains the coefficients for regression models
Algorithm type(s): cox, fine-and-gray, logistic-regression
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The name of the variable whose beta coefficient the row contains. If this is the coefficient for the intercept then use the name Intercept | string | |
| coefficient | The beta coefficient | number | |
| type | The statistical type of the variable | category | cat: Categorical type; cont: Continuous type |
baseline-hazards
Contains the baseline hazards to use with a cox proportional hazards model or a fine and gray model
Algorithm type(s): cox, fine-and-gray
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| time | The time up to which the baseline hazard should be used | number | |
| baselineHazard | The baseline hazard value | number | |
survival_bins
Contains the survival data for the different bins in a survival algorithm
Algorithm type(s): survival
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| catValue | The bin number | number | |
lookup
Contains the range of score values for each bin in a survival algorithm
Algorithm type(s): survival
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| catValue | The bin number | number | |
| range | The range of score values for this bin. Should use mathematical range (interval) notation. | string | |
tables
Contains the list of tables referenced in the variable and variable detail files
Algorithm type(s): all
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| tableName | The name of the table | string | |
| tablePath | The path to the table relative to this file | string | |
simple-model
Contains the metadata for a simple model
Algorithm type(s): simple-model
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| name | The name of the metadata | category | outputVariableName: The name of the output variable for the model. Should be defined in the variables and variable details sheets. |
| value | The value of the metadata | string | |
logistic-regression
Contains the beta coefficients for a logistic regression model
Algorithm type(s): logistic-regression
Columns
| Column Name | Description | Type | Category Values |
|---|---|---|---|
| variable | The name of the variable whose beta coefficient the row contains. If this is the coefficient for the intercept then use the name Intercept | string | |
| coefficient | The beta coefficient | number | |
| type | The statistical type of the variable | category | cat: Categorical type; cont: Continuous type |
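
A minimal sketch of how the rows of this sheet combine into a predicted probability, using the standard logistic regression formula. It assumes every non-intercept variable is already numeric in the record (for example, after the dummy, center, and interaction steps); `logistic_score` is a hypothetical helper name.

```python
import math


def logistic_score(record, beta_rows):
    """Return the predicted probability 1 / (1 + exp(-linear_predictor))."""
    linear_predictor = 0.0
    for row in beta_rows:
        beta = float(row["coefficient"])
        if row["variable"] == "Intercept":
            # The intercept row contributes its coefficient directly.
            linear_predictor += beta
        else:
            linear_predictor += beta * float(record[row["variable"]])
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```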