This vignette demonstrates a more complex use of Rydra for creating an insurance risk model.
1. The Insurance Risk Model
This model calculates the risk score for an individual based on their age, BMI, and smoking status. It also includes a high-risk adjustment for individuals who are both older than 50 and have a BMI above 30.
The model is defined in the insurance_risk_model.yaml file:
model_name: "insurance_risk"
constants:
  age: 40
  bmi: 25
main_model:
  intercepts:
    baseline: 0.5
  coefficients:
    age_centered: 0.02
    bmi_log: 0.1
    smoker_modifier: 0.3
    high_risk_adjustment: 0.2
  transformations:
    - name: "age_centered"
      formula: "center_variable(age, constants.age)"
    - name: "bmi_log"
      formula: "log_transform(bmi)"
  factors:
    - name: "smoker"
      levels:
        - value: 0
          coefficient: "intercepts.baseline"
        - value: 1
          coefficient: "coefficients.smoker_modifier"
  conditions:
    - name: "high_risk_age_bmi"
      condition: "age > 50 && bmi > 30"
      coefficient: "coefficients.high_risk_adjustment"
  output_transformation: "truncate_variable(result, 0, 1)"
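The coefficient fields in the factors and conditions sections are dotted paths such as "coefficients.smoker_modifier" rather than literal numbers. As a rough illustration of how such a path maps to a value, the sketch below walks a nested R list that mirrors the intercepts and coefficients of the YAML above. Note that resolve_ref is a hypothetical helper written for this illustration, not part of Rydra's API, and Rydra's internal lookup may work differently.

```r
# Nested list mirroring the intercepts and coefficients of the YAML above.
model <- list(
  intercepts = list(baseline = 0.5),
  coefficients = list(
    age_centered = 0.02,
    bmi_log = 0.1,
    smoker_modifier = 0.3,
    high_risk_adjustment = 0.2
  )
)

# Hypothetical helper (not a Rydra function): follow a dotted path
# such as "coefficients.smoker_modifier" through a nested list.
resolve_ref <- function(config, ref) {
  keys <- strsplit(ref, ".", fixed = TRUE)[[1]]
  value <- config
  for (key in keys) {
    # Fail loudly when a path segment does not exist.
    if (!is.list(value) || is.null(value[[key]])) {
      stop("Unknown reference: ", ref)
    }
    value <- value[[key]]
  }
  value
}

smoker_coef <- resolve_ref(model, "coefficients.smoker_modifier")
baseline <- resolve_ref(model, "intercepts.baseline")
```

Keeping the numbers in one place and referring to them by path means a coefficient can be changed without touching the factor or condition that uses it.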
2. Using the Insurance Risk Model
We can use the rydra_calculate function to calculate the risk score for different individuals.
Scenario 1: Young, healthy non-smoker
library(Rydra)
input_data <- list(
  age = 20,
  bmi = 18,
  smoker = 0
)
result <- rydra_calculate(
  config_path = system.file("extdata", "insurance_risk_model.yaml", package = "Rydra"),
  data = input_data,
  model_name = "main_model"
)
print(result)
Scenario 2: Older, smoker with high BMI
input_data <- list(
  age = 52,
  bmi = 31,
  smoker = 1
)
result <- rydra_calculate(
  config_path = system.file("extdata", "insurance_risk_model.yaml", package = "Rydra"),
  data = input_data,
  model_name = "main_model"
)
print(result)
How the calculation works
- Transformations
  - age_centered = center_variable(age, constants.age) = age − 40
  - bmi_log = log_transform(bmi) = ln(bmi)
- Aggregate score (result before output transformation)
  - base_score = intercepts.baseline + 0.02 × age_centered + 0.1 × bmi_log
  - factor_coeffs_sum comes from the active level of smoker:
    - 0 → intercepts.baseline (0.5)
    - 1 → coefficients.smoker_modifier (0.3)
  - conditional_coeffs_sum comes from age > 50 && bmi > 30, which adds coefficients.high_risk_adjustment (0.2) when true
  - total_score = base_score + factor_coeffs_sum + conditional_coeffs_sum
- Output transformation
  - final_result = truncate_variable(result, 0, 1)
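One way to picture the conditional step is to treat the condition string as an R expression and evaluate it with the input list as its environment. The sketch below does this in base R. It is an assumption about the mechanism made purely for illustration: conditional_term is a hypothetical helper, not a Rydra function, and Rydra may evaluate conditions differently.

```r
# Condition string and coefficient taken from the YAML in section 1.
condition <- "age > 50 && bmi > 30"
high_risk_adjustment <- 0.2

# Hypothetical helper (not a Rydra function): evaluate a condition string
# against the input list and return the coefficient to add, or 0.
conditional_term <- function(condition, data, coefficient) {
  # Variables come from `data`; operators such as > and && come from base R.
  is_true <- eval(parse(text = condition), envir = data, enclos = baseenv())
  if (isTRUE(is_true)) coefficient else 0
}

scenario_1 <- list(age = 20, bmi = 18, smoker = 0)
scenario_2 <- list(age = 52, bmi = 31, smoker = 1)

term_1 <- conditional_term(condition, scenario_1, high_risk_adjustment)
term_2 <- conditional_term(condition, scenario_2, high_risk_adjustment)
```

Here term_1 is 0 because neither threshold is met, while term_2 is 0.2 because both are. Evaluating strings as code is only reasonable when the configuration file comes from a trusted source.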
Worked examples
- Scenario 1 (age=20, bmi=18, smoker=0):
  - age_centered = 20 − 40 = −20; bmi_log = ln(18) ≈ 2.8904
  - base_score = 0.5 + 0.02×(−20) + 0.1×2.8904 = 0.5 − 0.4 + 0.2890 ≈ 0.3890
  - factor_coeffs_sum = 0.5; conditional_coeffs_sum = 0 (condition false)
  - total_score ≈ 0.3890 + 0.5 + 0 = 0.8890 → final_result ≈ 0.8890 (not truncated)
- Scenario 2 (age=52, bmi=31, smoker=1):
  - age_centered = 52 − 40 = 12; bmi_log = ln(31) ≈ 3.4340
  - base_score = 0.5 + 0.02×12 + 0.1×3.4340 = 0.5 + 0.24 + 0.3434 ≈ 1.0834
  - factor_coeffs_sum = 0.3; conditional_coeffs_sum = 0.2 (condition true)
  - total_score ≈ 1.0834 + 0.3 + 0.2 = 1.5834 → final_result = 1.0 after truncation
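The arithmetic in the worked examples can be reproduced without Rydra, which is a useful cross-check when editing the YAML. The following is a minimal base-R sketch, assuming the semantics given in this vignette: centering subtracts the constant 40, the log transform is the natural log, and truncation clamps the result to the interval from 0 to 1. The function manual_risk_score is a hypothetical re-implementation with the coefficients hard-coded, not Rydra's own code.

```r
# Hypothetical hand-written version of the insurance risk model
# (not Rydra code); coefficients are copied from the YAML in section 1.
manual_risk_score <- function(age, bmi, smoker) {
  # Transformations (assumed: centering subtracts 40, log is natural).
  age_centered <- age - 40
  bmi_log <- log(bmi)

  # Intercept plus the two continuous terms.
  base_score <- 0.5 + 0.02 * age_centered + 0.1 * bmi_log

  # Factor term: level 0 uses intercepts.baseline, level 1 smoker_modifier.
  factor_coeffs_sum <- if (smoker == 1) 0.3 else 0.5

  # Conditional term: high_risk_adjustment when both thresholds are exceeded.
  conditional_coeffs_sum <- if (age > 50 && bmi > 30) 0.2 else 0

  total_score <- base_score + factor_coeffs_sum + conditional_coeffs_sum

  # Output transformation (assumed: clamp to [0, 1]).
  min(max(total_score, 0), 1)
}

score_1 <- manual_risk_score(age = 20, bmi = 18, smoker = 0)
score_2 <- manual_risk_score(age = 52, bmi = 31, smoker = 1)
```

Under these assumptions score_1 is about 0.889 and score_2 is exactly 1, in line with the two scenarios worked through by hand.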
3. Conclusion
This example demonstrates how Rydra can be used to create more complex models with conditional logic. The use of YAML makes the model easy to understand and maintain, even as the complexity grows. The output transformation ensures that the final risk score always lies between 0 and 1.