Validate a Single Study's Data Files Against the Project Schema

Reads the metadata.yml and data.csv files for a specific study ID within a metawoRld project and checks their structure and content against the rules defined in the project's _metawoRld.yml schema. It also checks internal consistency (e.g., links between data and metadata).

This is useful for checking manually edited files (e.g., those created using add_study_template()).

Usage

validate_study(study_id, path = ".", check_linkages = TRUE)

Arguments

study_id: Character string. The unique identifier of the study to validate.
path: Character string. The path to the root directory of the metawoRld project. Defaults to the current working directory (".").
check_linkages: Logical. Should the consistency checks between data.csv (method_ref_id, group_label) and metadata.yml (measurement_methods, outcome_groups) be performed? Defaults to TRUE.
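
To make the check_linkages argument more concrete, here is a minimal sketch of what such a consistency check could look like. It is not the package's actual implementation. It assumes that method_ref_id and group_label in data.csv are matched against the top-level keys of measurement_methods and outcome_groups in metadata.yml, which is inferred from the example below. The objects metadata and data and the helper find_unlinked() are hypothetical names used only for illustration.

```r
# Hypothetical sketch of a linkage check; NOT the package's actual code.
# `metadata` mimics a parsed metadata.yml, `data` a parsed data.csv.
metadata <- list(
  outcome_groups = list(
    g1 = list(name = "Group A"),
    g2 = list(name = "Group B")
  ),
  measurement_methods = list(
    m_elisa = list(analysis_type = "ELISA")
  )
)
data <- data.frame(
  method_ref_id = c("m_elisa", "m_elisa"),
  group_label = c("g1", "g_BAD"),
  stringsAsFactors = FALSE
)

# Return the values used in the data that have no matching key in the metadata
# (assumption: links are resolved by key name)
find_unlinked <- function(values, defined) {
  setdiff(unique(values), defined)
}

bad_methods <- find_unlinked(data$method_ref_id, names(metadata$measurement_methods))
bad_groups <- find_unlinked(data$group_label, names(metadata$outcome_groups))

if (length(bad_groups) > 0) {
  message("group_label values not defined in metadata: ",
          paste(bad_groups, collapse = ", "))
}
```

Under these assumptions, every method_ref_id resolves, while the group_label "g_BAD" is flagged because it has no entry under outcome_groups. With check_linkages = FALSE, a check of this kind would be skipped.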

Value

Returns TRUE (invisibly) if all validation checks pass. If any check fails, the function stops with an informative error message using rlang::abort().
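
Because failure is signalled as an error rather than a FALSE return value, scripts that validate many studies may want to convert the error into a logical result. The sketch below shows one way to do that with base R's tryCatch(). The function fake_validate() is a hypothetical stand-in that only mimics the documented contract (invisible TRUE on success, an error on failure), and is_valid() is an illustrative wrapper that is not part of the package.

```r
# Hypothetical stand-in for validate_study(); it only mimics the documented
# contract: invisible TRUE on success, an error on failure.
fake_validate <- function(ok) {
  if (!ok) stop("required field 'title' is missing")
  invisible(TRUE)
}

# Illustrative wrapper: evaluate a validation call and return TRUE or FALSE
# instead of stopping the script.
is_valid <- function(expr) {
  tryCatch(
    isTRUE(expr),
    error = function(e) {
      message("Validation failed: ", conditionMessage(e))
      FALSE
    }
  )
}

passed <- is_valid(fake_validate(TRUE))
failed <- is_valid(fake_validate(FALSE))
```

In real use, the call inside is_valid() would be validate_study(study_id, path = ...). Errors raised with rlang::abort() inherit from the base "error" condition class, so the same tryCatch() handler applies.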

Examples

if (FALSE) { # \dontrun{
# --- Setup: Create a project and add a template ---
proj_path <- file.path(tempdir(), "validate_study_test")
create_metawoRld(
  path = proj_path,
  project_name = "Validate Study Test",
  project_description = "Testing validate_study()"
)
add_study_template(proj_path, "ManualStudy01")

# --- Scenario 1: Validate the raw template (might fail required field checks) ---
# This will likely fail because placeholders like "REQUIRED: ..." are still present.
tryCatch(
  validate_study("ManualStudy01", path = proj_path),
  error = function(e) print(paste("Validation Failed (as expected for raw template):", e$message))
)

# --- Scenario 2: Manually Edit Files (Simulated) ---
# Imagine the user fills the files. Let's simulate correct filling:
meta_file <- file.path(proj_path, "data", "ManualStudy01", "metadata.yml")
data_file <- file.path(proj_path, "data", "ManualStudy01", "data.csv")

# Create valid-looking content programmatically for the example
valid_meta <- list(
  study_id = "ManualStudy01", title = "Manually Entered Study",
  authors = list("User U"), year = 2024, journal = "Data Entry Journal",
  study_design = "Cross-sectional", country = "Local", sample_type = "Serum",
  outcome_groups = list(
    g1 = list(name = "Group A", definition = "Criteria A"),
    g2 = list(name = "Group B", definition = "Criteria B")
  ),
  measurement_methods = list(
    m_elisa = list(analysis_type = "ELISA", target_cytokine = "CYTOK", unit = "pg/mL")
  )
  # Optional fields omitted for brevity
)
yaml::write_yaml(valid_meta, meta_file)

valid_data <- data.frame(
  measurement_id = "meas1", method_ref_id = "m_elisa", cytokine_name = "CYTOK",
  group_label = "g1", gestational_age_timing = "Any", n = 50,
  statistic_type = "mean_sd", value1 = 10.0, value2 = 2.5
)
readr::write_csv(valid_data, data_file)

# --- Scenario 3: Validate the correctly filled study ---
validate_study("ManualStudy01", path = proj_path) # Should now pass and print success

# --- Scenario 4: Introduce an error (e.g., bad group_label in data.csv) ---
invalid_data <- valid_data
invalid_data$group_label[1] <- "g_BAD" # This label doesn't exist in metadata
readr::write_csv(invalid_data, data_file)

tryCatch(
  validate_study("ManualStudy01", path = proj_path),
  error = function(e) print(paste("Validation Failed (invalid group link):", e$message))
)

# --- Scenario 5: Validate without checking linkages ---
# This should pass even with the bad group_label, because the linkage check is skipped
validate_study("ManualStudy01", path = proj_path, check_linkages = FALSE)

# --- Clean up ---
unlink(proj_path, recursive = TRUE)
} # }