
Creates a detailed LLM prompt from the schema defined in a metawoRld project. The prompt guides data extraction primarily from the Methods, Results, and Tables/Figures sections of a full research paper. The schema is read directly from the specified project path.

Usage

.generate_extraction_prompt(
  identifier = NULL,
  fetched_metadata = NULL,
  prompt_template = readr::read_file(system.file(fs::path("prompts",
    "_extraction_prompt.txt"), package = "DataFindR"))
)

Arguments

identifier

Character string (optional). The study identifier (DOI or PMID) to pre-fill or reference in the prompt. Can be used as the default study_id.

fetched_metadata

List (optional). Pre-fetched metadata (e.g., from df_fetch_metadata) containing fields such as title, authors, and year. If provided, the prompt can instruct the LLM to focus less on these fields.

prompt_template

Character string (optional). The prompt template text. Defaults to the contents of prompts/_extraction_prompt.txt shipped with the DataFindR package.

metawoRld_path

Character string. Path to the root of the metawoRld project containing the _metawoRld.yml file with the schema.
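
The arguments above feed a text template (the prompt_template default shown under Usage). As a rough illustration of how such a template could be filled, here is a minimal sketch in base R. The `{{...}}` placeholder syntax, the placeholder names, and the `fill_template()` helper are all hypothetical and are not taken from DataFindR's internals.

```r
# Hypothetical helper (not part of DataFindR): replace each {{name}}
# placeholder in a template string with the matching value.
fill_template <- function(template, values) {
  for (name in names(values)) {
    placeholder <- paste0("{{", name, "}}")
    # fixed = TRUE matches the placeholder literally, not as a regex
    template <- gsub(placeholder, values[[name]], template, fixed = TRUE)
  }
  template
}

# Assumed template text; the real _extraction_prompt.txt may look different.
template <- paste(
  "Study identifier: {{study_id}}",
  "Known title: {{title}}",
  "Extract data from the Methods, Results, and Tables/Figures.",
  sep = "\n"
)

prompt <- fill_template(
  template,
  list(study_id = "10.1234/test.doi.full", title = "Pre-Fetched Title")
)
cat(prompt, "\n")
```

Keeping the template as plain text with named placeholders means the wording of the prompt can be edited without touching the code that fills it.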

Value

Character string. The formatted LLM prompt for data extraction.
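
Because the return value is a single character string, it can be checked and saved with base R alone. The sketch below uses a stand-in string rather than a real call to the function, so the prompt text shown is an assumption for illustration only.

```r
# Stand-in for the returned value (illustrative text, not real output).
extraction_prompt <- "Extract the sample size and outcomes.\nFollow the project schema."

# Confirm the shape: one character string, possibly containing newlines.
stopifnot(is.character(extraction_prompt), length(extraction_prompt) == 1)

# Save the prompt so it can be supplied to an LLM alongside the paper.
prompt_file <- tempfile(fileext = ".txt")
writeLines(extraction_prompt, prompt_file)

# Reading it back yields one element per line of the prompt.
prompt_lines <- readLines(prompt_file)
restored <- paste(prompt_lines, collapse = "\n")

# Clean up the temporary file.
unlink(prompt_file)
```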

Examples

if (FALSE) { # \dontrun{
# --- Setup: Create a dummy metawoRld project ---
proj_path <- file.path(tempdir(), "prompt_test_proj_full")
metawoRld::create_metawoRld(
   proj_path,
   project_name = "Prompt Generation Test Full",
   project_description = "Testing prompt generation",
   inclusion_criteria = c("Human", "Pregnancy"),
   exclusion_criteria = c("Animal")
   # Assuming default schema is created inside
)
doi <- "10.1234/test.doi.full"

# Example with pre-fetched metadata
fetched_meta_example <- list(
    identifier = doi, type = "doi", title = "Pre-Fetched Title",
    authors = list("Smith J", "Doe A"), year = 2023, journal = "Fetched Journal",
    abstract = "Fetched abstract text..."
)

# --- Generate the prompt (with fetched metadata hint) ---
extraction_prompt_with_meta <- df_generate_extraction_prompt(
    metawoRld_path = proj_path,
    identifier = doi,
    fetched_metadata = fetched_meta_example
)
cat("--- Prompt with Fetched Metadata Hint --- \n")
# cat(extraction_prompt_with_meta) # View the generated prompt

# --- Generate the prompt (without fetched metadata hint) ---
extraction_prompt_no_meta <- df_generate_extraction_prompt(
    metawoRld_path = proj_path,
    identifier = doi,
    fetched_metadata = NULL # Explicitly NULL
)
cat("\n--- Prompt without Fetched Metadata Hint --- \n")
# cat(extraction_prompt_no_meta) # View the generated prompt

# --- Clean up ---
unlink(proj_path, recursive = TRUE)
} # }