Assess Relevance for a Batch of Study Identifiers
Source: df_assess_batch.Rd
Runs the relevance assessment workflow (df_assess_relevance) for multiple
DOIs/PMIDs, leveraging caching and providing a summary of results.
Usage
df_assess_batch(
  chat,
  identifiers,
  metawoRld_path,
  force_fetch = FALSE,
  force_assess = FALSE,
  email = NULL,
  ncbi_api_key = NULL,
  stop_on_error = FALSE,
  ...
)

Arguments
- identifiers
Character vector. A vector of DOIs and/or PMIDs.
- metawoRld_path
Character string. Path to the root of the metawoRld project.
- force_fetch
Logical. If TRUE, bypass the metadata cache for all identifiers.
- force_assess
Logical. If TRUE, bypass the assessment cache for all identifiers.
- email
Character string (optional). Email for NCBI Entrez.
- ncbi_api_key
Character string (optional). NCBI API key.
- stop_on_error
Logical. If TRUE, the batch process stops if any single assessment fails. If FALSE (default), it attempts to process all identifiers and reports errors in the summary.
- ...
Additional arguments passed down to df_assess_relevance and subsequently to the LLM API call function (e.g., temperature).
- service
Character string. The LLM service to use (e.g., "openai").
- model
Character string. The specific LLM model name.
Value
A data frame (tibble) summarizing the assessment results for each identifier, with columns:
- identifier
The DOI or PMID.
- status
"Success" or "Failure".
- decision
Assessment decision ("Include", "Exclude", etc.) if status is "Success".
- score
Confidence score if status is "Success".
- rationale
LLM rationale if status is "Success".
- error_message
The error message if status is "Failure".
Also prints progress and summary information to the console. Assessment
results are saved to the cache within the metawoRld project.
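Because the summary is an ordinary tibble, failed identifiers can be pulled out and retried with base subsetting; a minimal sketch, assuming the column names listed above and a batch_results object from a previous run:

```r
# Keep only the rows whose assessment failed
failures <- subset(batch_results, status == "Failure",
                   select = c(identifier, error_message))
print(failures)

# Retry just those identifiers, bypassing the cached (failed) assessments
# retry <- df_assess_batch(chat, failures$identifier, proj_path, force_assess = TRUE)
```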
Examples
if (FALSE) { # \dontrun{
# --- Prerequisites ---
# 1. Set API key: usethis::edit_r_environ("project") -> add OPENAI_API_KEY=sk-... -> Restart R
# 2. Create a dummy metawoRld project
proj_path <- file.path(tempdir(), "assess_batch_proj")
metawoRld::create_metawoRld(
  proj_path,
  project_name = "Test Batch Assessment",
  project_description = "Testing DataFindR batch assessment",
  inclusion_criteria = c("Human study", "Pregnancy", "Serum or Plasma", "Cytokine measurement"),
  exclusion_criteria = c("Animal study", "Review article", "Non-English")
)
# --- Identifiers from a hypothetical search ---
ids_to_assess <- c(
  "31772108",            # Should likely be Include
  "25376210",            # Should likely be Include
  "invalid_pmid",        # Should fail fetch
  "10.1038/nature14539"  # Example DOI (Nature review, likely Exclude)
)
# --- Run Batch Assessment ---
# The signature above requires a chat object as its first argument; the
# construction shown here is illustrative -- substitute whatever LLM chat
# object your setup uses.
chat <- ellmer::chat_openai(model = "gpt-3.5-turbo")
batch_results <- df_assess_batch(
  chat = chat,
  identifiers = ids_to_assess,
  metawoRld_path = proj_path,
  email = "your.email@example.com", # Replace with your email
  service = "openai",
  model = "gpt-3.5-turbo",
  stop_on_error = FALSE # Continue processing even if one fails
)
# --- View Results ---
print(batch_results)
# --- Clean up ---
unlink(proj_path, recursive = TRUE)
} # }