Updated April 2026 · Cookbook · 19 min read

Claude literature-review skill: 10 PRISMA pipelines (2026)

Ten real systematic-review pipelines — PRISMA scaffold, Semantic Scholar query, triple-database Boolean merge, BibTeX export, methods extraction, gap analysis, citation graph, snowball, summary table, thesis-chapter draft — each as a single Claude prompt with the exact Python it produces.

Already know what skills are? Skip to the cookbook. First time? Read the explainer then come back. Need the install? It’s on the /skills/literature-review page.

Editorial illustration: a fanned stack of three translucent paper-card glyphs on the left, connected by a teal flow arc to a luminous magnifying-glass glyph on the right, on a midnight navy background.
On this page · 21 sections
  1. What this skill does
  2. The cookbook
  3. Install + README
  4. Watch it being built
  5. 01 · PRISMA flow scaffold (search → screen → eligibility → included)
  6. 02 · Semantic Scholar full-text query → ranked list
  7. 03 · Boolean search across arXiv + OpenAlex + PubMed
  8. 04 · BibTeX export with consistent citation style
  9. 05 · Auto-extract methods sections from N papers
  10. 06 · Find-the-gap analysis (compare 5 papers' contributions)
  11. 07 · Citation graph (paper A → cited by N → cited by M)
  12. 08 · Snowball search from a seed paper
  13. 09 · Auto-summary table (Author / Year / Method / Sample / Finding)
  14. 10 · Export full review as a markdown thesis-chapter draft
  15. Community signal
  16. The contrarian take
  17. Real pipelines shipped
  18. Gotchas
  19. Pairs well with
  20. FAQ
  21. Sources

What this skill actually does

Sixty seconds of context before the cookbook — what the literature-review skill is, what Claude returns when you invoke it, and the one thing it does NOT do for you.


Conduct comprehensive, systematic literature reviews using multiple academic databases (PubMed, arXiv, bioRxiv, Semantic Scholar, etc.).

K-Dense-AI / claude-scientific-skills, the skill author · /skills/literature-review

What Claude returns

You ask in natural language; Claude returns a deterministic pipeline — Python scripts that hit Semantic Scholar (api.semanticscholar.org/graph/v1), arXiv (export.arxiv.org/api/query), OpenAlex, and PubMed E-utilities, save raw JSON per stage to ./review/, deduplicate by DOI then by title-hash, run scripts/verify_citations.py against CrossRef, and emit a refs.bib + a markdown synthesis. The whole pipeline plugs into pandoc with --citeproc for APA, Nature, Vancouver, Chicago, or IEEE PDFs.

What it does NOT do

It does not retrieve paywalled full-text PDFs, does not perform meta-analysis statistics, and does not replace expert judgment in quality assessment — those stay yours.

How you trigger it

"systematic literature review on transformer time-series forecasting"
"comprehensive lit review with PRISMA flow and BibTeX export"
"snowball search from this DOI two hops out"

Cost when idle

~120 tokens at idle (the skill name + description sit in the system prompt). The seven scripts and citation-style templates load only when a query triggers the skill.

The cookbook

Each entry below is a literature-review pipeline you could run this week. They’re ordered the way I’d teach them — the early ones (PRISMA scaffold, Semantic Scholar ranked list, triple-merge) sit on the corpus side; the middle set (BibTeX, methods extraction, gap analysis) does the synthesis; the next three (citation graph, snowball, summary table) cover citation chasing, the supplementary search PRISMA explicitly asks for, plus the summary grid reviewers expect. Use case 10 stitches the artifacts into a chapter draft — that’s the only one you can’t skip.

One trade-off worth naming up front: this skill is a competitor to the arxiv-mcp-server and the pubmed MCP server. The skill is ~120 idle tokens and ships seven scripts. Each MCP server keeps a tool schema in your context window every turn. Pick the skill when each review is a one-shot pipeline and idle cost matters; pick the MCP when one agent needs to drive long-lived database sessions across many turns. The contrarian section below covers the one threat model both routes share — hallucinated citations — and how the skill engineers around it.

Install + README

If the skill isn’t on your machine yet, here’s the one-liner. The full install panel (Codex, Copilot, Antigravity variants) lives on the skill page. The README below is the raw SKILL.md from K-Dense-AI/claude-scientific-skills — same source the install pulls from.

One-line install · by K-Dense-AI

Open skill page

Install

mkdir -p .claude/skills/literature-review && curl -L -o skill.zip "https://mcp.directory/api/skills/download/89" && unzip -o skill.zip -d .claude/skills/literature-review && rm skill.zip

Installs to .claude/skills/literature-review

Watch it being built

Andy Stapleton has spent the last year stress-testing every AI literature-review tool from inside academia. This comparison is useful before the cookbook because it shows where the Claude skill route sits relative to the SaaS UI tier (Elicit, Consensus, Scite, ResearchRabbit) — one layer below, hitting the same APIs directly, with your data staying local.

01

PRISMA flow scaffold (search → screen → eligibility → included)

Stand up the four-stage PRISMA pipeline with an auditable count at every step — identification, screening, eligibility, included.

For: Anyone writing a thesis chapter, journal-grade systematic review, or grant report.

The prompt

Use the literature-review skill to scaffold a PRISMA 2020 review on 'transformer architectures for time-series forecasting'. Run the multi-database search (arXiv, Semantic Scholar, OpenAlex). Save the raw hits to ./review/01-identified.json, deduplicate to ./review/02-deduped.json, then screen titles+abstracts to ./review/03-screened.json. For each stage emit a count (n=) so I can render the PRISMA flow diagram with the scientific-schematics skill afterwards.

What the script looks like

# scripts/prisma_pipeline.py
import json, os, requests
QUERY = "transformer time series forecasting"
S2 = "https://api.semanticscholar.org/graph/v1/paper/search"
os.makedirs("review", exist_ok=True)

# 1. identification
r = requests.get(S2, params={"query": QUERY, "limit": 100,
    "fields": "title,abstract,year,authors,externalIds,citationCount"},
    timeout=30)
r.raise_for_status()
hits = r.json().get("data", [])
json.dump(hits, open("review/01-identified.json", "w"))
print(f"n_identified={len(hits)}")

# 2. dedupe by DOI, fall back to lowercased title
seen, uniq = set(), []
for p in hits:
    key = (p.get("externalIds") or {}).get("DOI") or p["title"].lower().strip()
    if key not in seen:
        seen.add(key); uniq.append(p)
json.dump(uniq, open("review/02-deduped.json", "w"))
print(f"n_deduped={len(uniq)}")

One-line tweak

Add a fourth stage `04-included.json` after a manual full-text pass — Claude can pre-fill an inclusion-rationale column you tick off in review.

02

Semantic Scholar full-text query → ranked list

Hit the Semantic Scholar Graph API once, return a ranked table (title · year · citations · venue · DOI) for the top 20 hits.

For: Researchers who want a citation-weighted starting set in under a minute.

The prompt

Use the literature-review skill. Query Semantic Scholar's /graph/v1/paper/search endpoint for 'retrieval-augmented generation evaluation'. Sort the top 20 by citationCount descending. Return a markdown table with columns: Title | Year | Cites | Venue | DOI. Save raw JSON to ./review/s2-rag.json so I can re-rank locally if needed.

What the script looks like

import requests, json
URL = "https://api.semanticscholar.org/graph/v1/paper/search"
params = {"query": "retrieval augmented generation evaluation",
          "limit": 20,
          "fields": "title,year,citationCount,venue,externalIds"}
r = requests.get(URL, params=params, timeout=30); r.raise_for_status()
data = sorted(r.json()["data"],
              key=lambda p: p.get("citationCount", 0),
              reverse=True)
json.dump(data, open("review/s2-rag.json", "w"), indent=2)
for p in data:
    doi = (p.get("externalIds") or {}).get("DOI", "–")
    venue = (p.get("venue") or "")[:24]   # venue can be null in the API response
    print(f"| {p['title'][:60]} | {p.get('year', '–')} | {p.get('citationCount', 0)} | {venue} | {doi} |")

One-line tweak

The free public endpoint shares a pool of roughly 1,000 requests per second across all unauthenticated users (see the community signal below), so bursts still hit 429s. Set `x-api-key` from a partner key when you scale past 50 papers per query.
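A minimal way to attach a partner key, reusing URL and params from the script above; the environment-variable name is illustrative, not something the skill defines:

import os, requests
headers = {"x-api-key": os.environ["S2_API_KEY"]}   # hypothetical env var name
r = requests.get(URL, params=params, headers=headers, timeout=30)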

03

Boolean search across arXiv + OpenAlex + PubMed

Run the same Boolean expression against three databases in parallel and merge results into one deduplicated CSV.

For: Cross-disciplinary reviews where one corpus is never enough (CS + biology, ML + medicine).

The prompt

Use the literature-review skill. For the query ("large language model" AND ("clinical" OR "medical")) AND year:2024-2026, hit arXiv (export.arxiv.org/api/query), OpenAlex (api.openalex.org/works) and PubMed (eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi). Merge by DOI, save to ./review/triple-merge.csv with columns: doi, title, source, year, citations.

What the script looks like

import requests, csv, urllib.parse as u
Q = '"large language model" AND ("clinical" OR "medical")'
endpoints = {
  "arxiv":    f"http://export.arxiv.org/api/query?search_query=all:{u.quote(Q)}&max_results=50",
  "openalex": f"https://api.openalex.org/works?search={u.quote(Q)}&filter=publication_year:2024-2026&per-page=50",
  "pubmed":   f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term={u.quote(Q)}&retmode=json&retmax=50",
}
rows = []
for src, url in endpoints.items():
    rows += parse(src, requests.get(url, timeout=30))   # parse() normalizes each source to dicts
# dedupe by DOI, fall back to lowercased title
seen, out = set(), []
for r in rows:
    k = r["doi"] or r["title"].lower()
    if k not in seen:
        seen.add(k); out.append(r)
with open("review/triple-merge.csv", "w", newline="") as f:
    w = csv.DictWriter(f, fieldnames=["doi", "title", "source", "year", "citations"])
    w.writeheader(); w.writerows(out)
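parse() is left abstract above. A minimal sketch of the OpenAlex arm only, based on the documented /works response shape, is below; the arXiv arm (Atom XML) and the PubMed arm (esearch JSON, PMIDs only) need their own branches. Define it above the endpoints loop.

def parse(src, resp):
    # Normalize one source's response to dicts with doi/title/source/year/citations.
    if src != "openalex":
        return []   # sketch: arXiv and PubMed parsing omitted here
    rows = []
    for w in resp.json().get("results", []):
        rows.append({
            "doi": (w.get("doi") or "").replace("https://doi.org/", ""),
            "title": w.get("title") or "",
            "source": src,
            "year": w.get("publication_year"),
            "citations": w.get("cited_by_count", 0),
        })
    return rows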

One-line tweak

Add bioRxiv via api.biorxiv.org/details/biorxiv when your topic skews preprint — same merge step, one extra source row.

04

BibTeX export with consistent citation style

Convert the included-set into a clean references.bib with APA / Nature / Vancouver / Chicago / IEEE templates ready for pandoc.

For: Anyone whose journal target dictates the citation style and rejects messy BibTeX.

The prompt

Use the literature-review skill. Take ./review/04-included.json, fetch BibTeX from doi.org for each entry (Accept: application/x-bibtex), normalize keys to AuthorYearKeyword format, write ./review/refs.bib. Emit a second file ./review/style-block.tex selecting the natbib style my journal wants (default: nature.bst).

What the script looks like

import json, requests, re
included = json.load(open("review/04-included.json"))
out = []
for p in included:
    doi = (p.get("externalIds") or {}).get("DOI")
    if not doi:
        continue                      # nothing to resolve without a DOI
    bib = requests.get(f"https://doi.org/{doi}",
        headers={"Accept": "application/x-bibtex"}, timeout=20).text
    # normalize cite-key: first-author surname + year + first title word
    first = p["authors"][0]["name"].split()[-1].lower()
    key  = f"{first}{p['year']}{p['title'].split()[0].lower()}"
    bib  = re.sub(r"@(\w+)\{[^,]+,", f"@\\1{{{key},", bib, count=1)
    out.append(bib)
open("review/refs.bib","w").write("\n\n".join(out))
open("review/style-block.tex","w").write(r"\bibliographystyle{nature}\bibliography{refs}")

One-line tweak

Swap nature.bst for apalike.bst, vancouver.bst, chicago.bst, or ieeetr.bst. The skill's SKILL.md ships templates for all five — the FAQ below covers which one your target wants.

05

Auto-extract methods sections from N papers

Read the methods section of every included paper and emit a comparison table — sample size, model, dataset, evaluation metric.

For: Anyone synthesising methodology trends across a field (the meat of a thesis chapter).

The prompt

Use the literature-review skill on ./review/04-included.json. For each PDF, locate the Methods/Methodology section, extract: (a) sample size, (b) model architecture or technique, (c) dataset name, (d) primary evaluation metric. Write a markdown table to ./review/methods-table.md with one row per paper. If a paper has no PDF, fall back to its Semantic Scholar abstract.

What the script looks like

import json
from pathlib import Path
# read_section() and claude_extract() are skill-provided helpers,
# in the same vein as claude_call() in use case 06
included = json.load(open("review/04-included.json"))
rows = ["| Paper | n | Model | Dataset | Metric |", "|---|---|---|---|---|"]
for p in included:
    pdf = Path(f"pdfs/{p['paperId']}.pdf")
    text = read_section(pdf, heading="Methods") if pdf.exists() else p["abstract"]
    fields = claude_extract(text, schema={"n":"int","model":"str",
                                          "dataset":"str","metric":"str"})
    rows.append(f"| {p['title'][:40]} | {fields['n']} | {fields['model']} "
                f"| {fields['dataset']} | {fields['metric']} |")
open("review/methods-table.md","w").write("\n".join(rows))

One-line tweak

Pipe the same table into a Pareto plot (sample size vs. citation count) with the data-analysis skill — surfaces which methods are over- or under-evaluated.

06

Find-the-gap analysis (compare 5 papers' contributions)

Cluster the contributions of the top 5 most-cited papers, then surface the open questions none of them answers — the literature gap.

For: PhD students writing the 'Why this thesis?' opening of chapter one.

The prompt

Use the literature-review skill on the top 5 papers in ./review/03-screened.json (sorted by citationCount). For each, extract its single-sentence contribution. Cluster the five contributions into themes. Then write a 200-word 'Gap analysis' section to ./review/gap.md naming three open questions that none of the five papers tackles — cite each claim.

What the script looks like

import json
top5 = sorted(json.load(open("review/03-screened.json")),
              key=lambda p: p.get("citationCount",0), reverse=True)[:5]

prompt = """For each paper, in 1 sentence: what is its central contribution?
Then group contributions into 2-3 themes.
Then list 3 open questions that NONE of the five address.
Cite every claim with [SmithYearKeyword]. Output markdown."""

# claude_call() is the literature-review skill helper
md = claude_call(prompt, attachments=[p["abstract"] for p in top5])
open("review/gap.md","w").write(md)

One-line tweak

Set `top10` instead of `top5` for a research proposal — three gaps from five papers is a chapter; three gaps from ten is a grant section.

07

Citation graph (paper A → cited by N → cited by M)

Snowball outward two hops from a seed paper, render a citation graph, save as Graphviz DOT plus a CSV of edges.

For: Anyone tracing the lineage of an idea — 'who has built on Vaswani 2017 in the last 18 months?'

The prompt

Use the literature-review skill. Seed = DOI 10.48550/arXiv.1706.03762 (Attention is All You Need). Walk Semantic Scholar's /paper/{id}/citations endpoint two hops. Cap at 50 nodes per hop. Save the edge list to ./review/cite-graph.csv (src_id, dst_id, year) and a Graphviz DOT file to ./review/cite-graph.dot.

What the script looks like

import requests, csv, time
S2 = "https://api.semanticscholar.org/graph/v1/paper"
seed = "DOI:10.48550/arXiv.1706.03762"   # the Graph API expects the DOI: prefix for DOI lookups
edges, frontier, visited, depth = [], [seed], {seed}, 0
while depth < 2:
    next_front = []
    for pid in frontier:
        r = requests.get(f"{S2}/{pid}/citations",
            params={"limit": 50, "fields": "paperId,year"}, timeout=30).json()
        for c in r.get("data", []):
            cp = c["citingPaper"]
            edges.append((pid, cp["paperId"], cp.get("year", "")))
            if cp["paperId"] not in visited:              # don't revisit nodes
                visited.add(cp["paperId"]); next_front.append(cp["paperId"])
        time.sleep(0.5)                                   # stay under the shared rate limit
    frontier, depth = next_front, depth + 1
csv.writer(open("review/cite-graph.csv", "w", newline="")).writerows(
    [("src", "dst", "year"), *edges])
with open("review/cite-graph.dot", "w") as f:
    f.write("digraph G {\n")
    for s, d, _ in edges:
        f.write(f'  "{s[:8]}" -> "{d[:8]}";\n')
    f.write("}\n")

One-line tweak

Render the DOT to PNG with `dot -Tpng cite-graph.dot -o cite-graph.png`, then drop the image into the gap-analysis section from use case 6 to make 'who built on whom' visible.

08

Snowball search from a seed paper

Backward + forward snowball: pull every reference of the seed, then every paper that cites the seed, dedupe to one CSV.

For: Cochrane-style reviewers who need supplementary search to clear the PRISMA 'records identified through other sources' box.

The prompt

Use the literature-review skill. Seed paperId = 6b85b63579a916f705a8e10a49bd8d849d91b1fc. Hit /paper/{id}/references and /paper/{id}/citations. Save backward+forward into ./review/snowball.csv. Tag each row with origin='backward' or origin='forward' so I can audit which arm contributed which paper.

What the script looks like

import requests, csv
S2   = "https://api.semanticscholar.org/graph/v1/paper"
SEED = "6b85b63579a916f705a8e10a49bd8d849d91b1fc"

def fetch(arm):
    url = f"{S2}/{SEED}/{'references' if arm=='backward' else 'citations'}"
    r = requests.get(url, params={"limit": 200,
        "fields": "paperId,title,year,externalIds"}, timeout=30).json()
    key = "citedPaper" if arm == "backward" else "citingPaper"
    return [(arm, p[key]["paperId"], p[key]["title"],
             p[key].get("year",""),
             p[key].get("externalIds",{}).get("DOI","–"))
            for p in r.get("data", [])]

rows = fetch("backward") + fetch("forward")
csv.writer(open("review/snowball.csv", "w", newline="")).writerows(
    [("origin","paperId","title","year","doi"), *rows])
print(f"backward={sum(1 for r in rows if r[0]=='backward')} "
      f"forward={sum(1 for r in rows if r[0]=='forward')}")

One-line tweak

Set `limit=1000` and paginate with `offset` once your seed has more than 200 cites; the API caps at 1000 per page.
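A paginated variant of fetch(), assuming the same response shape as the script above; a sketch, not the skill's script:

def fetch_all(arm, page=200):
    # Same as fetch(), but walks `offset` until a short page comes back.
    key = "citedPaper" if arm == "backward" else "citingPaper"
    url = f"{S2}/{SEED}/{'references' if arm == 'backward' else 'citations'}"
    rows, offset = [], 0
    while True:
        r = requests.get(url, params={"limit": page, "offset": offset,
            "fields": "paperId,title,year,externalIds"}, timeout=30).json()
        batch = r.get("data", [])
        rows += [(arm, p[key]["paperId"], p[key]["title"],
                  p[key].get("year", ""),
                  p[key].get("externalIds", {}).get("DOI", "–")) for p in batch]
        if len(batch) < page:
            break
        offset += page
    return rows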

09

Auto-summary table (Author / Year / Method / Sample / Finding)

Render the canonical literature-review summary table — one row per included paper, five columns the reviewers expect to see.

For: Anyone whose chapter / journal section opens with the 'Table 1: summary of included studies' grid.

The prompt

Use the literature-review skill on ./review/04-included.json. For each paper extract: first-author surname, year, method (one phrase), sample size or dataset, primary finding (one sentence). Output a sortable markdown table to ./review/summary-table.md AND a tidy CSV to ./review/summary-table.csv that I can pivot in pandas.

What the script looks like

import json, csv
included = json.load(open("review/04-included.json"))
header = ["author","year","method","sample","finding"]
rows = []
for p in included:
    # claude_extract() is the same structured-extraction helper as in use case 05
    fields = claude_extract(p["abstract"], schema={
        "method":"str","sample":"str","finding":"str"})
    rows.append({
      "author": p["authors"][0]["name"].split()[-1],
      "year":   p["year"],
      "method": fields["method"],
      "sample": fields["sample"],
      "finding": fields["finding"],
    })
# markdown
with open("review/summary-table.md","w") as f:
    f.write("| " + " | ".join(header) + " |\n")
    f.write("|" + "|".join(["---"]*len(header)) + "|\n")
    for r in rows:
        f.write("| " + " | ".join(str(r[h]) for h in header) + " |\n")
# csv
with open("review/summary-table.csv", "w", newline="") as f:
    w = csv.DictWriter(f, fieldnames=header)
    w.writeheader(); w.writerows(rows)

One-line tweak

Add a sixth column 'risk-of-bias' and have Claude grade each study against the Cochrane RoB-2 tool — turns the table into a Cochrane-grade artifact.

10

Export full review as a markdown thesis-chapter draft

Stitch the cookbook outputs into one publication-shaped chapter — abstract, search strategy, PRISMA flow, summary table, gap analysis, references.

For: PhD students at the 'I have all the artifacts, now write chapter 2' moment.

The prompt

Use the literature-review skill. Read ./review/{gap.md, summary-table.md, methods-table.md, refs.bib, prisma.png}. Write ./review/chapter-2.md with sections: 1) 250-word abstract, 2) Search strategy (Boolean string + databases + dates), 3) PRISMA flow (![](prisma.png)), 4) Summary table, 5) Methodology synthesis, 6) Gap analysis, 7) References. Use natbib \\citep keys from refs.bib.

What the script looks like

# scripts/stitch_chapter.py
# assumes review/abstract.md and review/search-strategy.md were drafted by Claude
# earlier in the same run; the other inputs come from use cases 01-06
import pathlib
chapter = []
chapter.append("# Chapter 2 — Systematic Literature Review\n")
chapter.append("## 2.1 Abstract\n" +
    pathlib.Path("review/abstract.md").read_text())
chapter.append("## 2.2 Search strategy\n" +
    pathlib.Path("review/search-strategy.md").read_text())
chapter.append("## 2.3 PRISMA flow\n![PRISMA](prisma.png)\n")
chapter.append("## 2.4 Summary of included studies\n" +
    pathlib.Path("review/summary-table.md").read_text())
chapter.append("## 2.5 Methodology synthesis\n" +
    pathlib.Path("review/methods-table.md").read_text())
chapter.append("## 2.6 Gap analysis\n" +
    pathlib.Path("review/gap.md").read_text())
chapter.append("## 2.7 References\n\\bibliography{refs}\n")
pathlib.Path("review/chapter-2.md").write_text("\n\n".join(chapter))
# pandoc -> PDF: pandoc chapter-2.md --citeproc --bibliography=refs.bib -o ch2.pdf

One-line tweak

Append `--csl=nature.csl` to the pandoc call to lock the citation style. The skill's `scripts/generate_pdf.py` wraps this so you don't fight pandoc flags.
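If you'd rather drive that pandoc call from Python than remember the flags, a rough equivalent of what a wrapper like scripts/generate_pdf.py would run is below; the exact flags inside the skill's script are an assumption:

import subprocess
subprocess.run(["pandoc", "review/chapter-2.md", "--citeproc",
                "--bibliography=review/refs.bib", "--csl=nature.csl",
                "-o", "review/ch2.pdf"], check=True)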

Community signal

Three voices that frame why a Claude skill is the right shape for this problem. The first is the Allen Institute’s own framing of the corpus the skill leans on. The second is the publisher-coverage problem that forces you to merge databases in the first place. The third is the methodological floor every systematic review has to clear.

Most Semantic Scholar endpoints are available to the public without authentication, but they are rate-limited to 1000 requests per second shared among all unauthenticated users.

Allen Institute for AI · Semantic Scholar · Blog

The reason use cases 2, 3, 7, 8 hit the API directly: it's free, public, and 214 million papers wide. The skill is the orchestrator; this is the corpus.

Source
Unfortunately the big publishers (Elsevier and Springer) are forcing other indices like OpenAlex, etc. to remove abstracts so they're harder to get.

shishy (HN, ex-scite.ai) · Hacker News

Why a single-database review will under-cover: Elsevier-side coverage is contracting in OpenAlex. Use case 3 (triple-merge) exists for exactly this reason.

Source
A systematic review cannot be done with semantic search and should never be done in a preprint collection.

tokai (HN) · Hacker News

The methodological floor: Boolean expansion plus PRISMA stages — not vibes-based vector retrieval — is what separates a review from a vibe.

Source

The contrarian take

Not everyone is sold on AI-driven literature reviews. The most cited critique — both in academia and in the popular press — comes from the bibliographic-fabrication literature itself. Roger Watson · Enago Academy (citing Deakin University study) summarised the headline number:

ChatGPT (GPT-4o) fabricated roughly one in five academic citations, with more than half (56%) of citations either fake or containing errors. Among fabricated citations that included DOIs, 64% linked to real but completely unrelated papers — making the errors harder to spot without careful verification.

Roger Watson · Enago Academy (citing Deakin University study) · Blog

Synthesising the Deakin University and multi-model citation-fabrication studies.

Source

He's right, and the literature-review skill is built around this exact failure mode. The skill never trusts a model-generated citation: every reference goes through scripts/verify_citations.py which round-trips the DOI through CrossRef and the originating database (PubMed / arXiv / Semantic Scholar). Anything that doesn't resolve gets dropped. The contrarian critique is the entire reason the skill exists as a deterministic pipeline — and why use case 4 (BibTeX export) explicitly fetches BibTeX from doi.org rather than asking Claude to type one out.
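The shape of that check is simple enough to sketch. This is a stand-in for the idea, not the skill's verify_citations.py itself:

import requests

def doi_resolves(doi):
    # A DOI that CrossRef cannot find never makes it into refs.bib.
    r = requests.get(f"https://api.crossref.org/works/{doi}", timeout=20)
    return r.status_code == 200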

The honest framing: a vanilla LLM prompt “write me a literature review on X” is dangerous — the fabrication rate sits between 20% and 56% depending on topic. A pipeline that (a) hits real database APIs, (b) verifies every DOI through CrossRef, and (c) fetches BibTeX from doi.org rather than generating it, is a different artifact. The literature-review skill is the second thing, not the first. If you’re doing the first, stop.

Real pipelines shipped

Concrete sources the cookbook depends on. None of these are marketing — they cite endpoint URLs, paper counts, and the explicit methodology the skill implements.

Gotchas (the four that bite)

Sourced from the Semantic Scholar API docs, the PRISMA 2020 Statement, and the citation-fabrication literature flagged in the contrarian section.

Never trust a model-typed citation — always round-trip the DOI

The Deakin / Enago studies put fabrication at 20–56% for raw LLM citations, with 64% of fake DOIs pointing at real-but-unrelated papers. The skill's verify_citations.py is non-negotiable; if you bypass it, you ship fabricated references.

Semantic Scholar's free tier shares 1000 req/sec across all anonymous users

Translation: at peak hours you'll see 429s. Add `time.sleep(0.5)` between calls in long-running snowball jobs (use case 8), or apply for a partner API key and set the `x-api-key` header.
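If you'd rather react to 429s than pace every call, a minimal backoff wrapper (a sketch, not part of the skill) looks like:

import time, requests

def get_with_backoff(url, params=None, tries=5):
    # Retry on 429 with exponential backoff: 1s, 2s, 4s, 8s...
    for attempt in range(tries):
        r = requests.get(url, params=params, timeout=30)
        if r.status_code != 429:
            r.raise_for_status()
            return r
        time.sleep(2 ** attempt)
    r.raise_for_status()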

OpenAlex is shedding Elsevier abstracts — coverage is contracting

Per the HN thread cited above, big publishers are forcing OpenAlex to remove abstracts. A single-database review on OpenAlex alone will systematically miss recent Elsevier work. Use case 3 (triple-merge) is the mitigation.

PRISMA 2020 requires a flow diagram — generate it, don't fudge it

Reviewers count nodes. The skill emits per-stage n= counts so the scientific-schematics skill can render the flow with real numbers. Skip this and your review fails the methods check at desk-review.

Pairs well with

Curated to match the cookbook’s actual integrations: the research-side skills (arxiv-search, academic-researcher, citation-management, systematic-literature-review), the synthesis-side skills (research-paper-writer, summarize, data-analysis, pdf-to-markdown), and the long-running MCP servers for when an agent needs persistent database sessions (arxiv-mcp-server, pubmed, zotero, citeassist-citation-retrieval, search-papers). The natural cross-link is the pdf-to-markdown skill cookbook — you’ll need it for use case 5 (methods extraction) the moment your corpus has more than ten PDFs.

Two posts that compose well with this cookbook: What are Claude Code skills? covers the underlying mechanism, and Claude Code best practices covers the orchestration patterns the longer use cases (6, 10) lean on.

Frequently asked questions

What is the literature-review skill, and how is it different from the deep-research skill?

The literature-review skill is K-Dense-AI's PRISMA-grade systematic-review pipeline: it queries academic databases (Semantic Scholar, arXiv, OpenAlex, PubMed, bioRxiv), enforces the four-stage screening flow, and verifies every citation against CrossRef before writing the BibTeX. The deep-research skill is a generic web-research wrapper — broader, faster, no PRISMA contract, no citation verification. Reach for literature-review when the output has to clear journal review; reach for deep-research when you want a quick scan.

Does the literature-review skill hallucinate citations like ChatGPT?

It is engineered specifically not to. The skill never trusts a model-generated reference: every entry round-trips through scripts/verify_citations.py against CrossRef and the originating database. Use case 4 (BibTeX export) fetches the BibTeX from doi.org with `Accept: application/x-bibtex` rather than asking the model to compose one. Anything that fails to resolve is dropped from the bibliography. The Enago / Deakin University study found 56% of raw ChatGPT citations are fabricated or wrong — that's the bar this pipeline is designed to clear.

Which databases does the literature-review skill query?

Semantic Scholar Graph API (~214M papers, the cross-disciplinary backbone), arXiv (preprints across CS / physics / quant-bio / econ), bioRxiv + medRxiv (biomedical preprints), PubMed via NCBI E-utilities (biomedical, clinical), and OpenAlex (250M+ works, open-license). Specialized scientific lookups go via ChEMBL, KEGG, UniProt, AlphaFold, and PDB. Use case 3 above runs the same Boolean expression in parallel across arXiv + OpenAlex + PubMed and dedupes the merge — that's the canonical pattern.

literature-review skill vs the arxiv-search skill — which one do I want?

arxiv-search is single-corpus and fast: it hits export.arxiv.org/api/query and returns ranked preprints. The literature-review skill orchestrates arxiv-search-style queries across four databases plus Semantic Scholar plus citation verification plus PRISMA stages plus BibTeX export. Pair them: write the arXiv-only Boolean with arxiv-search, then promote it into a multi-database systematic review with literature-review when you need journal-grade rigor.

Which citation styles does the BibTeX export support?

APA (7th ed.), Nature, Vancouver, Chicago, and IEEE — selected via the `--citation-style` flag on scripts/generate_pdf.py. The skill ships the matching .bst (or .csl) files and writes a tiny style-block.tex you include in your pandoc invocation. Use case 4 above demonstrates the natbib variant; for CSL-driven pandoc-citeproc workflows, swap nature.bst for nature.csl and add `--csl=nature.csl --citeproc` to the pandoc call.

Is PRISMA 2020 actually supported, or just claimed?

Actually supported. The SKILL.md mandates the four-stage flow (identification → screening → eligibility → included) with explicit n= counts at every stage and a generated PRISMA flow diagram. The flow diagram is rendered by the sibling scientific-schematics skill — so install both. Use case 1 above is the literal scaffold the SKILL.md describes; use case 10 stitches the diagram, summary table, gap analysis, and BibTeX into a single chapter draft.

Why is the literature-review skill page getting impressions for 'openalex skill'?

Because the skill ships a working OpenAlex query path (use case 3 in the cookbook) and is currently one of the only Claude skills that does. The dedicated openalex-database skill is the lower-level alternative if you only need raw OpenAlex hits; the literature-review skill is the right answer when OpenAlex is one of three or four corpora you want to merge inside a PRISMA pipeline. Both pages link to each other so the long-tail OpenAlex traffic finds whichever lane it actually wanted.

Sources

Primary

Community

Critical and contrarian

Internal
