Before you run tar_make(), you must supply a single CSV file containing the extracted study data. The pipeline reads this file as its only external input; every downstream model, plot, and report derives from it. This page describes the required directory layout, the expected column format, and the variable transformations that clean() applies before any statistical modeling begins.
Directory structure
Place your extracted study data at exactly the following path relative to the project root: data/raw/data.csv. The data/raw/ directory is scanned by lsData() at pipeline startup, and the file must be named data.csv. The data/processed/ directory is reserved for any intermediate or feature-engineered outputs generated during the run.
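A quick pre-flight check can catch a misplaced file before the pipeline starts. The sketch below is only an illustration: check_input() is a hypothetical helper, not part of the project, and it assumes the working directory (or the root argument) is the project root.

```r
# Hypothetical helper (not part of the pipeline): stops early if the input
# file is not at data/raw/data.csv under the given project root.
check_input <- function(root = ".") {
  path <- file.path(root, "data", "raw", "data.csv")
  if (!file.exists(path)) {
    stop("Input file not found: ", path, call. = FALSE)
  }
  invisible(path)
}

# Usage, from the project root:
# check_input()
```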
Input data format
Your data.csv must contain the following columns. Column names are matched exactly as they appear in the source code; extra columns are ignored.
| Column | Type | Notes |
|---|---|---|
| Author | string | Study author name(s) |
| Patient's age | string | Age descriptor; copied as-is into the Age variable |
| Year of Publication | numeric | Four-digit year; cast to integer by clean() |
| Prevalence | string or numeric | Proportion of inappropriate use; accepts comma (,) or dot (.) as decimal separator |
| Sample size | numeric | Total number of patients; cast to integer |
| Inappropriate indication | numeric | Count of patients with inappropriate indication; cast to integer |
| Continent | string | Must contain "Asia", "Europe", or "North America" as a substring; anything else maps to "Other" |
| Setting | string | Must contain "Hospital" as a substring to map to "Hospital Setting"; otherwise "Other" |
| Guideline | string | "Yes" maps to "Followed Guideline(s)"; any other value maps to "No Guideline" |
| JBI_Classification | string | Methodological quality classification; used as-is in subgroup and regression models |
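Because column names are matched exactly, it can help to verify the header before running the pipeline. The following is a minimal sketch, not the pipeline's own validation: required_cols and missing_cols() are hypothetical names, and reading with base read.csv() is an assumption about how you inspect the file.

```r
# Column names taken from the table above
required_cols <- c(
  "Author", "Patient's age", "Year of Publication", "Prevalence",
  "Sample size", "Inappropriate indication", "Continent", "Setting",
  "Guideline", "JBI_Classification"
)

# Hypothetical helper: returns the required columns absent from a data frame
missing_cols <- function(tbl) setdiff(required_cols, names(tbl))

# Usage: check.names = FALSE keeps headers with spaces and apostrophes intact
# tbl <- read.csv("data/raw/data.csv", check.names = FALSE)
# missing_cols(tbl)  # character(0) when every required column is present
```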
Data cleaning
The clean() function in src/R/clean.R applies all variable transformations in a single dplyr::mutate() call. The cleaned data frame is stored as the tbl_clean target and is the direct input to every modeling target.
Age standardization
Patient's age is renamed to the syntactically valid R name Age. No type conversion is applied; the value is carried forward as-is for use in subgroup analysis.
Year as integer
Year of Publication is cast to integer, dropping any trailing decimals that may result from spreadsheet export. Year is used as a continuous covariate in the multivariable meta-regression.
Prevalence normalization
Prevalence values may use a comma as the decimal separator (e.g., 0,42). gsub() replaces all commas with dots before the value is coerced to numeric. The resulting value is a proportion between 0 and 1.
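The comma-to-dot replacement can be sketched in base R as follows. normalize_prevalence() is a hypothetical name used for illustration; the exact expression inside clean() may differ.

```r
# Hypothetical helper: replace decimal commas with dots, then coerce to numeric
normalize_prevalence <- function(x) {
  as.numeric(gsub(",", ".", as.character(x), fixed = TRUE))
}

normalize_prevalence(c("0,42", "0.37"))
# returns 0.42 0.37
```

Values that still fail to parse after the replacement become NA with a warning, which is a useful signal that a cell contains something other than a proportion.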
Sample size as integer
Sample size is cast to integer.
Continent classification
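Continent values are matched by substring against "Asia", "Europe", and "North America"; anything else maps to "Other". The factor reference level is "North America" (the first level of the factor).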
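Substring-based continent classification could look like the sketch below. classify_continent() is a hypothetical helper; only "North America" as the first factor level comes from this page, while the order of the remaining levels and the order in which the patterns are tested are assumptions.

```r
# Hypothetical helper: map free-text continent labels to four categories.
# Assumption: levels after "North America" are ordered arbitrarily here.
classify_continent <- function(x) {
  label <- ifelse(grepl("North America", x, fixed = TRUE), "North America",
           ifelse(grepl("Asia", x, fixed = TRUE), "Asia",
           ifelse(grepl("Europe", x, fixed = TRUE), "Europe", "Other")))
  factor(label, levels = c("North America", "Asia", "Europe", "Other"))
}

classify_continent(c("East Asia", "Western Europe", "North America", "Africa"))
```

Putting "North America" first makes it the baseline against which the other continents are compared in R's default treatment contrasts.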
Setting dichotomization
Any Setting value containing "Hospital" maps to "Hospital Setting"; all others become "Other". The factor reference level is "Hospital Setting".
Guideline use categorization
The Guideline column is recoded to a descriptive label. Studies that reported using a guideline are tagged "Followed Guideline(s)"; all others are tagged "No Guideline".
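The setting and guideline recodes follow the same pattern and can be sketched together. Both function names are hypothetical, and the exact-match test on "Yes" is an assumption; the real clean() may match differently.

```r
# Hypothetical helper: "Hospital" anywhere in the value -> "Hospital Setting"
recode_setting <- function(x) {
  label <- ifelse(grepl("Hospital", x, fixed = TRUE), "Hospital Setting", "Other")
  factor(label, levels = c("Hospital Setting", "Other"))
}

# Hypothetical helper: assumes an exact match on "Yes"; NA counts as no guideline
recode_guideline <- function(x) {
  ifelse(x %in% "Yes", "Followed Guideline(s)", "No Guideline")
}

recode_setting(c("Teaching Hospital", "Primary care"))
recode_guideline(c("Yes", "No", NA))
```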
Logit prevalence
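Prevalence is transformed to the logit scale, log(p / (1 - p)). A prevalence of exactly 0 or 1 produces -Inf or Inf; verify that no study reports a prevalence at the boundary before running the pipeline.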
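The boundary problem is easy to demonstrate with base R's qlogis(), which computes log(p / (1 - p)). The at_boundary() helper below is hypothetical and only illustrates one way to screen the data; it is not part of the pipeline.

```r
p <- c(0.10, 0.42, 0.90)
qlogis(p)          # logit of each proportion, same as log(p / (1 - p))

qlogis(c(0, 1))    # returns -Inf Inf

# Hypothetical helper: flags proportions that would produce infinite logits
at_boundary <- function(p) !is.na(p) & (p <= 0 | p >= 1)

at_boundary(c(0, 0.5, 1))
# returns TRUE FALSE TRUE
```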
Variance of logit prevalence
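The sampling variance of the logit prevalence is computed for each study and passed to the meta-analytic models through the vi (variance) argument.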
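This page does not show the variance formula, so the sketch below uses the standard delta-method approximation for a logit-transformed proportion, 1/(n p) + 1/(n (1 - p)), as an assumption about what the pipeline computes. logit_variance() is a hypothetical name.

```r
# Assumed formula (standard large-sample variance of a logit proportion);
# the pipeline's own calculation may differ.
logit_variance <- function(p, n) {
  1 / (n * p) + 1 / (n * (1 - p))
}

logit_variance(p = 0.5, n = 100)
# returns 0.04
```

The variance grows without bound as p approaches 0 or 1, which is another reason to screen out boundary prevalences before modeling.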
Column name sanitization
make.names() is applied to all column names in the final step, replacing spaces and special characters with dots so every column is a syntactically valid R name (e.g., Patient's age → Patient.s.age).
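The effect of make.names() on a few of the input headers can be checked directly in an R session. The vector of names below is only an illustrative subset of the columns listed earlier.

```r
raw_names <- c("Patient's age", "Year of Publication", "Sample size",
               "JBI_Classification")

# Spaces and apostrophes become dots; underscores are already valid
make.names(raw_names)
# returns "Patient.s.age" "Year.of.Publication" "Sample.size" "JBI_Classification"
```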