Import Cached Extraction Data for a Batch of Studies

Reads cached extraction data (JSON files saved by df_extract_batch or created manually) for multiple studies and imports each one into the specified metawoRld project using df_import_extraction.

Usage

df_import_batch(
  identifiers,
  metawoRld_path,
  overwrite = FALSE,
  validate_json = TRUE,
  merge_metadata = TRUE,
  stop_on_error = FALSE
)

Arguments

identifiers: Character vector. DOIs and/or PMIDs of studies whose cached extraction data is ready for import.
metawoRld_path: Character string. Path to the root of the target metawoRld project.
overwrite: Logical. Passed to metawoRld::add_study_data via df_import_extraction. If TRUE, overwrite existing study data in the metawoRld project. Defaults to FALSE.
validate_json: Logical. Passed to df_import_extraction. If TRUE, validate the cached JSON against the schema before importing. Defaults to TRUE.
merge_metadata: Logical. Passed to df_import_extraction. If TRUE, merge cached bibliographic metadata with extracted metadata. Defaults to TRUE.
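stop_on_error: Logical. If TRUE, stop the whole batch as soon as any single import fails. If FALSE, record the failure in the returned summary and continue with the remaining identifiers. Defaults to FALSE.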
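
A minimal sketch of a call with every argument written out at its documented default. The identifiers and the project path are hypothetical placeholders, not real studies or a real project. The call is guarded so the snippet still runs when the package that provides df_import_batch is not loaded.

```r
# Hypothetical inputs: placeholder DOI/PMID values and a placeholder project path.
ids <- c("10.1000/example.doi", "12345678")
project_path <- file.path(tempdir(), "my_metawoRld_project")

# Only attempt the import if df_import_batch is available in this session.
if (exists("df_import_batch", mode = "function")) {
  results <- df_import_batch(
    identifiers    = ids,
    metawoRld_path = project_path,
    overwrite      = FALSE,  # keep existing study data
    validate_json  = TRUE,   # check cached JSON against the schema first
    merge_metadata = TRUE,   # combine cached bibliographic and extracted metadata
    stop_on_error  = FALSE   # keep going if one study fails
  )
}
```

Leaving stop_on_error at FALSE suits large batches, where a single bad cache file should not block the other imports.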

Value

A data frame (tibble) summarizing the import attempt for each identifier, with columns:

identifier: The DOI or PMID.
status: "Success", "Skipped" (e.g., cache file missing), or "Failure".
metawoRld_study_path: Path to the study directory if import was successful.
error_message: The error message if status is "Failure" or "Skipped".

As a side effect, the function prints progress and summary information while it runs.
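
The returned summary can be inspected with ordinary data frame operations. The sketch below uses a mock data frame with the documented columns rather than real output. All values are invented for illustration, and the use of NA for non-applicable fields is an assumption, not confirmed behavior.

```r
# Mock summary with the documented columns (values invented for illustration;
# NA for missing paths/messages is an assumption).
results <- data.frame(
  identifier = c("10.1000/example.one", "12345678", "10.1000/example.two"),
  status = c("Success", "Skipped", "Failure"),
  metawoRld_study_path = c("path/to/study_dir", NA, NA),
  error_message = c(NA, "Cache file not found", "Schema validation failed"),
  stringsAsFactors = FALSE
)

# Count how many identifiers ended in each status.
status_counts <- table(results$status)

# Rows that were not imported, with the reason reported for each.
not_imported <- results[
  results$status != "Success",
  c("identifier", "status", "error_message")
]

# Identifiers that failed outright and may be worth retrying after a fix.
retry_ids <- results$identifier[results$status == "Failure"]
```

Here retry_ids could be passed back as the identifiers argument once the underlying problem has been corrected.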