Validate a Single Study's Data Files Against the Project Schema
validate_study.Rd
Reads the metadata.yml and data.csv files for a specific study ID within a metawoRld project and checks their structure and content against the rules defined in the project's _metawoRld.yml schema. It also checks internal consistency (e.g., links between data and metadata).
This is useful for checking manually edited files (e.g., those created using add_study_template).
Arguments
- study_id
Character string. The unique identifier of the study to validate.
- path
Character string. The path to the root directory of the metawoRld project. Defaults to the current working directory (.).
- check_linkages
Logical. Should the consistency checks between data.csv (method_ref_id, group_label) and metadata.yml (measurement_methods, outcome_groups) be performed? Defaults to TRUE.
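The linkage check described for check_linkages can be pictured as a set comparison: every method_ref_id and group_label used in data.csv should appear as a key under measurement_methods and outcome_groups in metadata.yml. The following is a minimal sketch of that idea in base R, not the package's actual implementation; find_unlinked is a hypothetical helper and the inputs are toy data.

```r
# Hypothetical helper (not part of metawoRld): list the values used in the
# data that have no matching key in the metadata.
find_unlinked <- function(data, metadata) {
  list(
    method_ref_id = setdiff(unique(data$method_ref_id),
                            names(metadata$measurement_methods)),
    group_label   = setdiff(unique(data$group_label),
                            names(metadata$outcome_groups))
  )
}

# Toy inputs mirroring the structures named above (assumed shapes)
metadata <- list(
  outcome_groups = list(g1 = list(name = "Group A"), g2 = list(name = "Group B")),
  measurement_methods = list(m_elisa = list(analysis_type = "ELISA"))
)
data <- data.frame(
  method_ref_id = c("m_elisa", "m_elisa"),
  group_label   = c("g1", "g_BAD"),
  stringsAsFactors = FALSE
)

unlinked <- find_unlinked(data, metadata)
print(unlinked)
```

Here the group_label entry of the result contains the one label that is missing from the metadata, while the method_ref_id entry is empty.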
Value
Returns TRUE (invisibly) if all validation checks pass. If any check fails, the function stops with an informative error message using rlang::abort().
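Because a failed check raises an error rather than returning FALSE, validating several studies in one pass requires catching the error for each study. The sketch below shows one way to do that with tryCatch; validate_many and fake_validator are hypothetical names introduced for illustration, and the stand-in validator only imitates the pass/fail behaviour described above so that the code runs without a project on disk.

```r
# Hypothetical helper (not part of metawoRld): run a validator over several
# study IDs and record pass/fail plus the error message instead of stopping.
validate_many <- function(study_ids, validator, ...) {
  results <- lapply(study_ids, function(id) {
    tryCatch(
      {
        validator(id, ...)
        data.frame(study_id = id, valid = TRUE, message = NA_character_,
                   stringsAsFactors = FALSE)
      },
      error = function(e) {
        data.frame(study_id = id, valid = FALSE, message = conditionMessage(e),
                   stringsAsFactors = FALSE)
      }
    )
  })
  do.call(rbind, results)
}

# Stand-in validator so the sketch is runnable on its own.
# In real use, pass validate_study and path = <project root> instead.
fake_validator <- function(study_id, ...) {
  if (study_id == "BadStudy") stop("group_label 'g_BAD' not found in metadata")
  invisible(TRUE)
}

report <- validate_many(c("GoodStudy", "BadStudy"), fake_validator)
print(report)
```

Errors raised with rlang::abort() inherit from the base error class, so the same error handler should catch them.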
Examples
if (FALSE) { # \dontrun{
# --- Setup: Create a project and add a template ---
proj_path <- file.path(tempdir(), "validate_study_test")
create_metawoRld(
  path = proj_path,
  project_name = "Validate Study Test",
  project_description = "Testing validate_study()"
)
add_study_template(proj_path, "ManualStudy01")
# --- Scenario 1: Validate the raw template (might fail required field checks) ---
# This will likely fail because placeholders like "REQUIRED: ..." are still present.
tryCatch(
  validate_study("ManualStudy01", path = proj_path),
  error = function(e) print(paste("Validation Failed (as expected for raw template):", e$message))
)
# --- Scenario 2: Manually Edit Files (Simulated) ---
# Imagine the user fills the files. Let's simulate correct filling:
meta_file <- file.path(proj_path, "data", "ManualStudy01", "metadata.yml")
data_file <- file.path(proj_path, "data", "ManualStudy01", "data.csv")
# Create valid-looking content programmatically for the example
valid_meta <- list(
  study_id = "ManualStudy01", title = "Manually Entered Study",
  authors = list("User U"), year = 2024, journal = "Data Entry Journal",
  study_design = "Cross-sectional", country = "Local", sample_type = "Serum",
  outcome_groups = list(
    g1 = list(name = "Group A", definition = "Criteria A"),
    g2 = list(name = "Group B", definition = "Criteria B")
  ),
  measurement_methods = list(
    m_elisa = list(analysis_type = "ELISA", target_cytokine = "CYTOK", unit = "pg/mL")
  )
  # Optional fields omitted for brevity
)
yaml::write_yaml(valid_meta, meta_file)
valid_data <- data.frame(
  measurement_id = "meas1", method_ref_id = "m_elisa", cytokine_name = "CYTOK",
  group_label = "g1", gestational_age_timing = "Any", n = 50,
  statistic_type = "mean_sd", value1 = 10.0, value2 = 2.5
)
readr::write_csv(valid_data, data_file)
# --- Scenario 3: Validate the correctly filled study ---
validate_study("ManualStudy01", path = proj_path) # Should now pass and print success
# --- Scenario 4: Introduce an error (e.g., bad group_label in data.csv) ---
invalid_data <- valid_data
invalid_data$group_label[1] <- "g_BAD" # This label doesn't exist in metadata
readr::write_csv(invalid_data, data_file)
tryCatch(
  validate_study("ManualStudy01", path = proj_path),
  error = function(e) print(paste("Validation Failed (invalid group link):", e$message))
)
# --- Scenario 5: Validate without checking linkages ---
# This should pass even with the bad group_label, as linkage check is skipped
validate_study("ManualStudy01", path = proj_path, check_linkages = FALSE)
# --- Clean up ---
unlink(proj_path, recursive = TRUE)
} # }