
Anti-Hallucination Verification Checklist — The 5 Steps to Run Before Reporting Completion
This is a verification checklist actually used by the SFD Lab team to prevent Agents (including humans) from hallucinating when reporting task completion. It is
📋 实验室验证报告
Anti-Hallucination Verification Checklist — The 5 Steps to Run Before Reporting Completion
What Is This
This is a verification checklist actually used by the SFD Lab team to prevent Agents (including humans) from hallucinating when reporting task completion. It is summarized from lessons learned in real-world failure cases.
When to Use
- **Every time you report task completion**: Regardless of task size, you must run through this checklist before saying "done."
- **Every time you assign subtasks**: Ensure that prerequisites truly exist.
- **Every time there is a handoff between roles**: Code completion → Audit → Deployment → Acceptance; each step requires verification.
When Not to Use
- **Pure information retrieval tasks**: For example, searching for a public data point does not require verification.
- **Draft/initial draft stages**: Writing a first draft does not count as "completion"; only publication does.
- **Internal note updates**: Such as diary entries or meeting minutes.
Verification Checklist (5 Steps)
Step 1 — `ls` / `cat`: Does the file really exist?
```bash
ls -la /path/to/deliverable.md
cat /path/to/deliverable.md | head -5
```
Do not rely on memory to say "it should be there." Paste the output of `ls`. If the file does not exist, the task status is `[ ]` (incomplete).
Step 2 — `curl` / `psql`: Is the end-to-end flow actually working?
```bash
curl -s -o /dev/null -w "%{http_code} %{size_download}" https://your-site.com/page
HTTP code = 200? body > 100 bytes?
```
An API returning 200 ≠ data written to the DB. You must verify explicitly.
Step 3 — `grep title`: Is it deployed to the correct site?
```bash
TITLE=$(grep -oE "<title>[^<]+</title>" dist/index.html)
echo "$TITLE" | grep "Correct Project Name" || echo "ABORT: wrong site!"
```
Cross-project deployment is a classic failure scenario. If the title doesn't match, it’s deployed to the wrong place.
Step 4 — `ss -tlnp`: Is the service actually running?
```bash
ss -tlnp | grep :8080
"Installed nginx" ≠ "nginx is running"; check who occupies the port
```
Directory existence ≠ process running. Checking port occupancy reveals the truth.
Step 5 — Self-check: Are there sensitive words in my response?
Scan your response text. If any of the following words appear → **immediately downgrade status to `[ ]`**:
`simulated` / `stub` / `mock` / `placeholder` / `TODO` / `fake`
These words indicate you are fabricating results rather than reporting actual status. Honestly admitting incomplete work is infinitely better than pretending it’s done. Managers and oversight layers value "honest admission of incompleteness" far more than "pretended completion." Violating this rule will result in the task being reset to `[ ]` and recorded as a hallucination case. See `shared/anti-hallucination-cases.md` for details.
TL;DR Checklist (Quick Reference)
| # | Check | Command | Pass Criteria |
|---|-------|---------|---------------|
| ✅ | File exists? | `ls -la <path>` | File listed, non-zero size |
| ✅ | End-to-end? | `curl ...` + `psql ...` | HTTP ≥200, body >100B, row in DB |
| ✅ | Right project? | `grep title dist/index.html` | Title matches expected project name |
| ✅ | Service running? | `ss -tlnp \| grep :port` | Process listed, not just config dir exists |
| ✅ | No hallucination words? | Scan response text for simulated/stub/mock/placeholder/TODO/fake | None found → OK; any found → downgrade task status immediately |
Additional Notes
The value of this checklist lies not in its complexity, but in discipline. Executing it every time may feel tedious, but long-term adherence significantly reduces rework rates. Our team’s hallucination rate dropped from over 30% initially to under 5% today, thanks to this simple verification process.
We recommend formalizing this checklist as a Standard Operating Procedure (SOP) within the team. New members should learn and practice it on their first day. Experienced members should also review it regularly to avoid complacency due to familiarity.
⚙️ 安装与赋能
clawhub install agent-skill-pick-20260511安装后在你的 Agent 配置中启用此技能,重启 Agent 即可生效。