Model Parameters Step Tests
This section documents the unit tests for Model Parameters transformation
steps. Each subdirectory under tests/testdata/step-tests/ represents a test
case for a specific transformation step.
Directory Structure
tests/testdata/step-tests/
├── test-data.csv                 # Shared test data used by all step tests
├── test-variables.csv            # Shared variables definition file
└── test-{stepname}/              # Test directory for a specific step
    ├── test-model-export.csv     # Model export file defining files for transformation
    ├── test-model-steps.csv      # Model steps file defining transformation steps
    ├── test-{stepname}.csv       # Step-specific parameters file
    └── test-expected.csv         # Expected output (auto-generated)
Adding a New Unit Test
To add a unit test for a Model Parameters transformation step:
1. Create Test Directory
Create a new subdirectory named test-{stepname} where {stepname} is the
name of the transformation step you want to test.
2. Create Required Files
Each test directory must contain the following files:
test-model-export.csv
This file defines the files used for the transformation. It generally has the same structure for each unit test:
fileType,filePath
variables,../test-variables.csv
model-steps,./test-model-steps.csv
The file references:
- test-variables.csv – The shared variables definition file (located in the parent directory)
- test-model-steps.csv – The model steps file (located in the test directory)
test-model-steps.csv
This file defines the transformation steps to execute. The structure is:
step,fileType,filePath,notes
{stepname},N/A,./test-{stepname}.csv,
Replace {stepname} with your step name. If the step doesn’t require a
separate file, use N/A for fileType and filePath.
test-{stepname}.csv
This file contains step-specific parameters. The structure depends on the transformation step being tested. For example, a logistic regression step would have the coefficients for each variable and the intercept. Refer to the Model Parameters documentation for details on each step’s required parameters.
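Steps 1 and 2 can also be scripted. The following is a minimal sketch, not part of the project: scaffold_step_test is a hypothetical helper name, and it assumes the two boilerplate files always take the form shown above. It writes test-model-export.csv and test-model-steps.csv and leaves an empty test-{stepname}.csv for you to fill in with the step's parameters.

```python
from pathlib import Path


def scaffold_step_test(root: Path, stepname: str) -> Path:
    """Create test-{stepname}/ under root with the boilerplate CSV files.

    Hypothetical helper for illustration; the file contents mirror the
    examples in this section.
    """
    test_dir = root / f"test-{stepname}"
    test_dir.mkdir(parents=True, exist_ok=True)

    # Model export file: points at the shared variables file and the
    # model steps file in this test directory.
    (test_dir / "test-model-export.csv").write_text(
        "fileType,filePath\n"
        "variables,../test-variables.csv\n"
        "model-steps,./test-model-steps.csv\n",
        encoding="utf-8",
    )

    # Model steps file: a single row for the step under test.
    (test_dir / "test-model-steps.csv").write_text(
        "step,fileType,filePath,notes\n"
        f"{stepname},N/A,./test-{stepname}.csv,\n",
        encoding="utf-8",
    )

    # Step-specific parameters: created empty, never overwritten,
    # because its structure depends on the step being tested.
    params = test_dir / f"test-{stepname}.csv"
    if not params.exists():
        params.touch()

    return test_dir
```

For example, scaffold_step_test(Path("tests/testdata/step-tests"), "stepname") would create the test-stepname directory described above.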
3. Generate Expected Output
After creating your test files, generate the expected output by running:
# From the project root directory
# Generate expected output for all tests
python -m tests.generate_step_tests_expected
# Or generate for only your new test
python -m tests.generate_step_tests_expected stepname
The generate_step_tests_expected function in
tests/generate_step_tests_expected.py will:
- Read the shared test data from test-data.csv
- Iterate through each subdirectory in the step-tests folder (or only the specified steps)
- Run the model pipeline using each test-model-export.csv file
- Save the pipeline output as test-expected.csv in each subdirectory
The generated test-expected.csv file will be used by the unit tests to
verify correct transformation behavior.
You can also call the function directly from Python:
from tests.generate_step_tests_expected import generate_step_tests_expected
# Generate expected output for all tests
generate_step_tests_expected()
# Or generate for only your new test
generate_step_tests_expected(steps=["stepname"])
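The generation loop listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in tests/generate_step_tests_expected.py: generate_expected_sketch is a hypothetical name, run_pipeline is a stand-in callable for the project's real pipeline entry point (whose signature may differ), and the sketch assumes a step name maps to the directory test-{stepname}.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional


def generate_expected_sketch(
    root: Path,
    run_pipeline: Callable[[Path, Path], str],
    steps: Optional[Iterable[str]] = None,
) -> List[Path]:
    """Write test-expected.csv into each selected test directory.

    run_pipeline is a hypothetical stand-in: it receives the shared data
    file and a test-model-export.csv path and returns the output as CSV text.
    """
    data_file = root / "test-data.csv"
    # Assumption: the step name "x" selects the directory "test-x".
    wanted = None if steps is None else {f"test-{s}" for s in steps}

    written = []
    for test_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if wanted is not None and test_dir.name not in wanted:
            continue
        export_file = test_dir / "test-model-export.csv"
        if not export_file.exists():
            continue  # not a step test directory
        expected = test_dir / "test-expected.csv"
        expected.write_text(run_pipeline(data_file, export_file), encoding="utf-8")
        written.append(expected)
    return written
```

Passing the pipeline in as a callable keeps the sketch independent of the project's internals; the real function calls the model pipeline directly.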
4. Run Tests Automatically
Once you’ve created your test directory and generated the expected output, your new unit test will be automatically discovered and run when the test suite executes. No additional registration or configuration is needed.
The test automation is implemented in tests/test_pipeline.py, which:
- Scans all subdirectories in the step-tests/ folder
- Automatically runs each test it finds
- Compares the pipeline output against the test-expected.csv file
To run all tests, execute:
pytest
Your new step test will be included automatically alongside all existing step tests.
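To make the discovery and comparison behavior concrete, here is a minimal standard-library sketch. It is not the code in tests/test_pipeline.py: discover_step_tests and matches_expected are hypothetical names, and the cell-by-cell string comparison is an assumption (the real tests may, for instance, compare numeric values with a tolerance).

```python
import csv
import io
from pathlib import Path
from typing import List


def discover_step_tests(root: Path) -> List[Path]:
    """Return every subdirectory of root that contains test-expected.csv."""
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "test-expected.csv").exists()
    )


def read_rows(text: str) -> List[List[str]]:
    """Parse CSV text into a list of rows, independent of line endings."""
    return list(csv.reader(io.StringIO(text, newline="")))


def matches_expected(actual_csv: str, test_dir: Path) -> bool:
    """Compare pipeline output (as CSV text) with the stored expected file."""
    with (test_dir / "test-expected.csv").open(newline="", encoding="utf-8") as f:
        expected_rows = list(csv.reader(f))
    return read_rows(actual_csv) == expected_rows
```

With pytest, a list like the one returned by discover_step_tests can be passed to pytest.mark.parametrize so that each directory becomes its own test case, which is one way the "no registration needed" behavior described above can be achieved.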
Model Parameters Steps Reference
For detailed information about available transformation steps and their parameters, see the Model Parameters Reference Documentation.
Regenerating Expected Output
If you need to update the expected output after fixing a bug or updating
transformation logic, follow the same process as described in section “3.
Generate Expected Output” above. The generate_step_tests_expected() function
can regenerate output for all tests or specific tests using the steps
parameter.
After regenerating, review the changes to ensure the new output is correct, then
commit the updated test-expected.csv files.
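One way to review a regenerated file is to diff it against its previous contents before committing. The sketch below uses Python's difflib; expected_diff is a hypothetical helper name, and reading the old and new file contents (for example, from version control and from disk) is left to the caller.

```python
import difflib


def expected_diff(old_text: str, new_text: str, name: str = "test-expected.csv") -> str:
    """Return a unified diff between two versions of an expected-output file.

    An empty string means the regenerated output is unchanged.
    """
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
    )
    return "".join(lines)
```

An empty result confirms that a regeneration did not alter a test's expected output; any non-empty diff should correspond to the bug fix or logic change you intended.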