Vignette 2: Insurance Risk Model

This vignette demonstrates a more complex use of Rydra for creating an insurance risk model.

1. The Insurance Risk Model

This model calculates the risk score for an individual based on their age, BMI, and smoking status. It also adds a high-risk adjustment when the individual is older than 50 and has a BMI above 30.

The model is defined in the insurance_risk_model.yaml file:

model_name: "insurance_risk"

constants:
  age: 40
  bmi: 25

main_model:
  intercepts:
    baseline: 0.5

  coefficients:
    age_centered: 0.02
    bmi_log: 0.1
    smoker_modifier: 0.3
    high_risk_adjustment: 0.2

  transformations:
    - name: "age_centered"
      formula: "center_variable(age, constants.age)"
    - name: "bmi_log"
      formula: "log_transform(bmi)"

  factors:
    - name: "smoker"
      levels:
        - value: 0
          coefficient: "intercepts.baseline"
        - value: 1
          coefficient: "coefficients.smoker_modifier"

  conditions:
    - name: "high_risk_age_bmi"
      condition: "age > 50 && bmi > 30"
      coefficient: "coefficients.high_risk_adjustment"

  output_transformation: "truncate_variable(result, 0, 1)"

2. Using the Insurance Risk Model

We can use the rydra_calculate function to calculate the risk score for different individuals.

Scenario 1: Young, healthy non-smoker

library(Rydra)

input_data <- list(
  age = 20,
  bmi = 18,
  smoker = 0
)

result <- rydra_calculate(
  config_path = system.file("extdata", "insurance_risk_model.yaml", package = "Rydra"),
  data = input_data,
  model_name = "main_model"
)

print(result)

[1] 0.8890372

Scenario 2: Older, smoker with high BMI

input_data <- list(
  age = 52,
  bmi = 31,
  smoker = 1
)

result <- rydra_calculate(
  config_path = system.file("extdata", "insurance_risk_model.yaml", package = "Rydra"),
  data = input_data,
  model_name = "main_model"
)

print(result)

[1] 1

How the calculation works

Transformations

age_centered = center_variable(age, constants.age) = age − 40
bmi_log = log_transform(bmi) = ln(bmi)
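
The two transformations can be reproduced in base R for a quick sanity check. This is a minimal sketch that assumes center_variable subtracts the reference value and log_transform takes the natural logarithm, as the formulas above state. The helper names center_by and natural_log are hypothetical stand-ins, not Rydra's own functions.

```r
# Hypothetical stand-ins for Rydra's transformation helpers (assumed behaviour).
center_by <- function(x, reference) x - reference  # assumed: subtract the reference value
natural_log <- function(x) log(x)                  # assumed: natural logarithm

# Inputs taken from the two scenarios in this vignette.
age_centered <- center_by(c(20, 52), 40)
bmi_log <- natural_log(c(18, 31))

print(age_centered)
print(round(bmi_log, 4))
```

The centered ages come out as −20 and 12, and the log-transformed BMIs round to 2.8904 and 3.4340, which are the values used in the worked examples.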

Aggregate score (result before output transformation)

base_score = intercepts.baseline + 0.02 × age_centered + 0.1 × bmi_log
factor_coeffs_sum comes from the active level of smoker:
- 0 → intercepts.baseline (0.5)
- 1 → coefficients.smoker_modifier (0.3)
conditional_coeffs_sum comes from age > 50 && bmi > 30 → adds coefficients.high_risk_adjustment (0.2) when true
total_score = base_score + factor_coeffs_sum + conditional_coeffs_sum
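
The aggregation above can be checked by hand in base R. The sketch below copies the coefficients from the YAML file and assumes the three sums combine exactly as described. raw_risk_score is a hypothetical helper written for illustration, not part of Rydra, and it returns the score before the output transformation.

```r
# Coefficients copied from insurance_risk_model.yaml.
baseline <- 0.5
coef_age_centered <- 0.02
coef_bmi_log <- 0.1
smoker_modifier <- 0.3
high_risk_adjustment <- 0.2
reference_age <- 40

# Hypothetical helper: un-truncated score for one individual.
raw_risk_score <- function(age, bmi, smoker) {
  base_score <- baseline +
    coef_age_centered * (age - reference_age) +
    coef_bmi_log * log(bmi)
  # Active level of the smoker factor: 0 maps to the baseline, 1 to the modifier.
  factor_coeffs_sum <- if (smoker == 1) smoker_modifier else baseline
  # The adjustment applies only when both parts of the condition hold.
  conditional_coeffs_sum <- if (age > 50 && bmi > 30) high_risk_adjustment else 0
  base_score + factor_coeffs_sum + conditional_coeffs_sum
}

scenario_1 <- raw_risk_score(age = 20, bmi = 18, smoker = 0)
scenario_2 <- raw_risk_score(age = 52, bmi = 31, smoker = 1)
print(c(scenario_1, scenario_2))
```

Under these assumptions, Scenario 1 agrees with the 0.8890372 returned by rydra_calculate, and Scenario 2 comes out at roughly 1.58 before truncation. Note that a non-smoker's score includes intercepts.baseline twice: once in base_score and once through the smoker factor.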

Output transformation

final_result = truncate_variable(result, 0, 1)
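
As an illustration of what the truncation does, the following base R sketch clamps values to the interval [0, 1]. It assumes truncate_variable(x, lower, upper) behaves like a plain clamp. The clamp helper is hypothetical and is not Rydra's implementation.

```r
# Hypothetical clamp, assumed to mirror truncate_variable(x, lower, upper).
clamp <- function(x, lower, upper) pmin(pmax(x, lower), upper)

# A value below the range, one inside it, and one above it.
clamped <- clamp(c(-0.2, 0.8890372, 1.5834), 0, 1)
print(clamped)
```

Values inside the interval pass through unchanged, while values outside it are replaced by the nearest bound.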

Worked examples

- Scenario 1 (age=20, bmi=18, smoker=0):
  - age_centered = 20 − 40 = −20; bmi_log = ln(18) ≈ 2.8904
  - base_score = 0.5 + 0.02×(−20) + 0.1×2.8904 = 0.5 − 0.4 + 0.2890 ≈ 0.3890
  - factor_coeffs_sum = 0.5; conditional_coeffs_sum = 0 (condition false)
  - total_score ≈ 0.3890 + 0.5 + 0 = 0.8890 → final_result ≈ 0.8890 (not truncated)
- Scenario 2 (age=52, bmi=31, smoker=1):
  - age_centered = 52 − 40 = 12; bmi_log = ln(31) ≈ 3.4340
  - base_score = 0.5 + 0.02×12 + 0.1×3.4340 = 0.5 + 0.24 + 0.3434 ≈ 1.0834
  - factor_coeffs_sum = 0.3; conditional_coeffs_sum = 0.2 (condition true)
  - total_score ≈ 1.0834 + 0.3 + 0.2 = 1.5834 → final_result = 1.0 after truncation

3. Conclusion

This example demonstrates how Rydra can be used to create more complex models that combine transformations, factor levels, and conditional logic. Keeping the model definition in YAML makes it easy to read and maintain, even as the complexity grows. The output transformation ensures that the final risk score always lies between 0 and 1.