Add automated docs & notebooks freshness + normalization checks#3228
Conversation
Introduce a new CLI tool (.github/tools/docs_and_notebooks_check.py) to scan notebooks and Markdown docs for staleness and verification metadata under the 'deeplabcut' namespace. Adds a default YAML config (.github/tools/docs_and_notebooks_report_config.yml), a README for the tool (.github/tools/docs_and_notebooks_tool_README.md), and an output ignore entry in .gitignore. The tool uses pydantic schemas, computes last_git_updated from git history, reads/writes notebook top-level metadata and Markdown frontmatter (idempotent updates), and supports report/check/update modes. Outputs machine- and human-readable reports (nb_docs_status.json / .md). Requires pydantic and PyYAML; designed to be safe-by-default for CI (read-only unless --write is passed).
Introduce a GitHub Actions workflow to scan docs and notebooks for staleness. The workflow runs on push and PRs to main, checks out full git history, uses Python 3.12, installs pydantic and pyyaml, and runs a read-only staleness report and an optional policy check using .github/tools/docs_and_notebooks_check.py with tools/staleness_config.yml. Results (JSON/MD) are uploaded as the staleness-report artifact. Workflow is limited to content read permissions and has a 10-minute timeout.
Rename OUTPUT_FILENAME from 'nb_docs_status' to 'docs_nb_checks' and use it for the default --out-dir (tmp/docs_nb_checks). Update the README to show the check command as a fenced code block and clarify allowlist behavior. Update .gitignore to ignore the new tmp/docs_nb_checks path.
Ensure DLC metadata is JSON-serializable by converting date/datetime fields to ISO strings and preserving exclude_none behavior. Uses pydantic v2 API (model_dump(mode="json", exclude_none=True)) and falls back to pydantic v1 via json.loads(meta.json(...)). Adds a docstring and clarifying comments. This prevents json.dumps from failing when writing .ipynb files and keeps compatibility across pydantic versions.
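The v2-first dump with a v1 fallback described above can be sketched as follows. The `DLCMeta` fields here are placeholders, not the tool's actual schema:

```python
import json
from datetime import date
from typing import Optional
from pydantic import BaseModel

class DLCMeta(BaseModel):
    # Minimal stand-in for the tool's schema; field names are assumptions.
    last_verified: Optional[date] = None
    verified_by: Optional[str] = None

def meta_to_jsonable(meta: BaseModel) -> dict:
    """Dump with date/datetime fields as ISO strings, dropping unset fields.
    Prefers the pydantic v2 API and falls back to v1."""
    try:
        return meta.model_dump(mode="json", exclude_none=True)  # pydantic v2
    except AttributeError:
        return json.loads(meta.json(exclude_none=True))  # pydantic v1
```

Without `mode="json"` (or the v1 round-trip through `.json()`), `json.dumps` would raise `TypeError` on the raw `date` objects when serializing the .ipynb file.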
Update docs-and-notebooks tool to use nbformat for reading/writing notebooks, validate .ipynb files, and detect whether notebooks are normalized. Add notebook_is_normalized helper and ensure write_ipynb_meta uses nbformat.writes/validate. Introduce a new policy field require_notebook_normalized (and add it to the report config defaults) and enforce it to emit violations when notebooks are not normalized. Also update CI job to install nbformat and pin pydantic, and update the script header notes to list the new dependency. These changes let CI detect invalid or non-normalized notebooks and reduce formatting churn when normalizing files.
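A `notebook_is_normalized` helper along these lines would detect canonical-format drift. This is a sketch: the trailing-newline handling and exact comparison are assumptions about the tool's behavior.

```python
import nbformat

def notebook_is_normalized(path: str) -> bool:
    """True when the on-disk content already matches nbformat's canonical
    serialization (compared modulo a trailing newline, an assumption here)."""
    with open(path, encoding="utf-8") as f:
        raw = f.read()
    nb = nbformat.reads(raw, as_version=4)
    nbformat.validate(nb)  # raises on structurally invalid notebooks
    return raw.rstrip("\n") == nbformat.writes(nb).rstrip("\n")
```

Because `nbformat.writes` is deterministic, any notebook saved in a different JSON layout (e.g. compact separators) is flagged as non-normalized, which is the drift CI wants to catch.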
Add a local pre-commit hook 'dlc-docs-notebooks-check' that runs .github/tools/docs_and_notebooks_check.py to check DLC docs and notebooks for staleness, validate nbformat, and perform normalization. The hook targets Jupyter and Markdown files, passes filenames to the script, and declares additional dependencies (pydantic>=2,<3, pyyaml, nbformat>=5).
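Based on that description, the local hook entry would look roughly like the sketch below; the `entry` invocation and `files` pattern are assumptions, only the id, targets, and dependency pins come from the summary above.

```yaml
- repo: local
  hooks:
    - id: dlc-docs-notebooks-check
      name: DLC docs & notebooks check
      entry: python .github/tools/docs_and_notebooks_check.py check
      language: python
      files: \.(ipynb|md)$
      additional_dependencies: ["pydantic>=2,<3", "pyyaml", "nbformat>=5"]
```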
Pull request overview
Adds an automated “docs + notebooks freshness” scanning tool and wires it into CI/pre-commit to surface stale or invalid content (git-updated date, human “last_verified”, nbformat validation, and notebook canonical-format drift).
Changes:
- Introduces .github/tools/docs_and_notebooks_check.py plus a YAML config to scan docs/notebooks, emit JSON/MD reports, and (optionally) update metadata in-place.
- Adds a GitHub Actions workflow to run report + check and upload report artifacts.
- Adds an (optional) pre-commit hook and ignores generated report output under tmp/.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| .pre-commit-config.yaml | Adds local hook intended to run the checks on touched Markdown/notebooks. |
| .gitignore | Ignores generated report output directory under tmp/. |
| .github/workflows/docs_and_notebooks_checks.yml | Adds CI job to run report/check and upload artifacts. |
| .github/tools/docs_and_notebooks_tool_README.md | Documents the tool’s purpose, metadata locations, and usage. |
| .github/tools/docs_and_notebooks_report_config.yml | Defines scan include/exclude globs and policy thresholds/allowlists. |
| .github/tools/docs_and_notebooks_check.py | Implements scanning/reporting/checking/updating logic using git + nbformat + pydantic. |
Comments suppressed due to low confidence (1)
.github/workflows/docs_and_notebooks_checks.yml:43
- The comment says check mode “will fail only once you populate allowlists”, but docs_and_notebooks_check.py also returns non-zero if any notebooks produce errors (e.g., nbformat validation failures). If the intent is non-blocking until allowlists are set, either adjust the script’s exit-code behavior or update this comment to reflect that errors can already fail the job.
```yaml
# Optional: run check mode (will fail only once you populate allowlists in config)
- name: Run staleness policy check (optional gate)
  run: |
    python .github/tools/docs_and_notebooks_check.py check \
      --config .github/tools/docs_and_notebooks_report_config.yml --out-dir tmp/docs_nb_checks
```
Use docs_and_notebooks_report_config.yml as the default config and resolve it relative to the script. Rename machine/human report outputs to docs_nb_checks.{json,md}. Add an optional --targets argument to the report and check subcommands; scan_files now accepts a targets list and filters scanned paths to only those targets. Make --config default a string path and adjust error-exit logic so parsing errors don't cause a non-zero exit in report mode. Minor doc/formatting tweaks.
Update references and examples in .github/tools/docs_and_notebooks_tool_README.md: change config reference to .github/tools/docs_and_notebooks_report_config.yml, update report output paths to tmp/docs_nb_checks/..., simplify the example 'check' command, and replace usages of tools/staleness.py with .github/tools/docs_and_notebooks_check.py in the update/example commands. Also tidy the 'Writes' section formatting.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Rename .github/tools/docs_and_notebooks_* to tools/ and update references. Updated workflow (.github/workflows/docs_and_notebooks_checks.yml) and pre-commit config to call tools/docs_and_notebooks_check.py and use tools/docs_and_notebooks_report_config.yml, updated the tool script's internal docs and the README paths, and tweaked the workflow name to "Docs & notebooks freshness and formatting checks".
Treat metadata-only commits specially: add META_COMMIT_MARKER and compute a last_content_updated by skipping commits that contain that marker (falling back to raw git-touched date with a warning). Rename policy/config and record fields from git->content (warn_if_content_older_than_days, last_content_updated, days_since_content_update) and add debug last_git_touched plus last_metadata_updated metadata. Add guardrail requiring --ack-meta-commit-marker when writing metadata/normalizing notebooks, print a suggested commit message, and introduce a new normalize subcommand to deterministically reformat notebooks. Misc: factor git date parsing, update update_files API and behavior to optionally set content dates from git, propagate warnings when fallback is used, and update reporting output to surface content/git-touched/metadata timestamps.
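The "skip metadata-only commits" computation can be sketched as below, assuming the `chore(metadata)` marker convention named later in this PR; the function name and exact git invocation are illustrative, not the tool's code.

```python
import subprocess
from datetime import date
from typing import Optional

META_COMMIT_MARKER = "chore(metadata)"  # marker convention documented in this PR

def last_content_updated(repo_root: str, rel_path: str) -> Optional[date]:
    """Most recent commit date for a file, skipping commits whose subject
    carries the metadata marker. Returns None if every commit was
    metadata-only (the caller then falls back to the raw git-touched date
    with a warning)."""
    out = subprocess.run(
        ["git", "-C", repo_root, "log", "--follow",
         "--format=%cs %s", "--", rel_path],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        day, _, subject = line.partition(" ")
        if META_COMMIT_MARKER not in subject:
            return date.fromisoformat(day)
    return None
```

`%cs` is the committer date in YYYY-MM-DD form (git ≥ 2.21), which keeps the parsing trivial; the fallback-to-None path is what triggers the propagated warning described above.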
Add a comprehensive contract test suite at tests/tools/docs_and_notebooks_checks/test_check_contracts.py verifying constants, DLCMeta schema, git-derived dates, scan/update/normalize behaviors, and output writing. Also call model_rebuild() for the Pydantic models in tools/docs_and_notebooks_check.py to ensure models are rebuilt correctly when using __future__ annotations so the tests import and validate the runtime models as expected.
Clarify and expand the Docs & Notebooks checks tool README: rename last_git_updated to last_content_updated (computed from git but ignoring metadata-only commits), add last_metadata_updated and verified_for metadata fields, and emphasize separation of content vs metadata. Document the META_COMMIT_MARKER requirement for metadata-only/normalization commits and provide suggested commit messaging and guardrails for update/normalize operations. Reorganize commands (report, check, update, normalize), note that update/normalize are write-only for maintainers, and add CI guidance (use actions/checkout fetch-depth: 0) and required dependencies. Also include troubleshooting tips and mention deterministic notebook normalization and Pydantic model rebuild guidance.
Reduce job timeout from 10 to 5 minutes, upgrade actions/checkout to v6 and actions/setup-python to v6, and allow the staleness policy check step to continue-on-error. These changes use newer action releases, shorten runtime limits, and ensure the optional gate doesn't fail the workflow.
Include pydantic>2 in pyproject.toml dependencies to require Pydantic v2+ for project data models/validation and ensure compatibility with code expecting Pydantic v2 behavior.
Differentiate between absent and invalid DLC metadata in notebooks and markdown files. read_ipynb_meta now returns a has_dlc flag; parse_dlc_meta returns (meta, valid). scan_files uses the new flags to set rec.meta and append explicit warnings "missing_metadata" or "invalid_metadata" instead of treating all None as missing. Call sites in update_files and normalize_notebooks updated to unpack the extra return value. Misc cleanup around frontmatter handling and error/warning reporting.
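The absent-vs-invalid split can be sketched as follows; the minimal `DLCMeta` model and the exact return convention are assumptions drawn from the summary above.

```python
from datetime import date
from typing import Optional, Tuple
from pydantic import BaseModel, ValidationError

class DLCMeta(BaseModel):  # minimal stand-in for the tool's schema
    last_verified: Optional[date] = None

def parse_dlc_meta(raw) -> Tuple[Optional[DLCMeta], bool]:
    """Return (meta, valid). Absent metadata -> (None, True), so the caller
    emits a 'missing_metadata' warning; present-but-unparseable ->
    (None, False), so the caller emits 'invalid_metadata' instead."""
    if raw is None:
        return None, True
    try:
        return DLCMeta.model_validate(raw), True
    except ValidationError:
        return None, False
```

Keeping the two cases distinct means a typo in `last_verified` no longer masquerades as "never added metadata", which changes how maintainers triage the report.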
Add two tests to verify notebook metadata validation: one ensures a notebook missing the "deeplabcut" namespace triggers a "missing_metadata" warning and leaves meta as None; the other ensures an invalid "deeplabcut" namespace (bad last_verified value) triggers an "invalid_metadata" warning and leaves meta as None. Both tests create minimal notebooks in a temp git repo, commit them, run the tool scan, and assert the expected warnings and record kinds.
Change GitHub Actions test step to install the package with development extras (pip install -e .[dev]) instead of only pytest. Update pyproject.toml dev dependency group by adding pydantic>2 and nbformat>5 and removing black so CI/tests have the required dev libraries available.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (1)
pyproject.toml:118
- Version specifiers pydantic>2 / nbformat>5 exclude the major boundary versions (e.g., pydantic 2.0.0, nbformat 5.0.0) and don’t set an upper bound. For stability and to match the constraints used elsewhere in the PR (e.g., pydantic>=2,<3), consider using >=...,<... style constraints here too.
```toml
[dependency-groups]
dev = [
    "pydantic>2",
    "nbformat>5",
    "coverage",
    "pytest",
    "pytest-cov",
]
```
```yaml
- name: Install dependencies
  shell: bash -el {0}  # Important: activates the conda environment
  run: |
    python -m pip install --upgrade pip setuptools wheel
    pip install --no-cache-dir -e .

- name: Install ffmpeg
  run: |
    if [ "$RUNNER_OS" == "Linux" ]; then
      sudo apt-get update
      sudo apt-get install ffmpeg
    elif [ "$RUNNER_OS" == "macOS" ]; then
      brew install ffmpeg || true
    else
      choco install ffmpeg
    fi
  shell: bash

- name: Run pytest tests
  shell: bash -el {0}  # Important: activates the conda environment
  run: |
    pip install --no-cache-dir pytest
    pip install --no-cache-dir -e .[dev]
    python -m pytest
```
The workflow installs the package twice: first pip install -e . in the dependencies step, then pip install -e .[dev] right before running pytest. This is redundant and increases CI time/variance. Consider installing -e .[dev] once (or installing only the extra test deps you need) and removing the duplicate install.
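A consolidated step could look like the sketch below (step names follow the quoted workflow; whether the `pytest` pin is still needed separately is an assumption):

```yaml
- name: Install dependencies
  shell: bash -el {0}  # Important: activates the conda environment
  run: |
    python -m pip install --upgrade pip setuptools wheel
    pip install --no-cache-dir -e .[dev]  # single editable install, test extras included
```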
```python
# Dry-run normalize: should set would_change True if not normalized
records = tool.normalize_notebooks(
    repo_root=repo,
    cfg=cfg,
    targets=[rel],
    write=False,
    ack_meta_commit_marker=True,
)
assert len(records) == 1
assert records[0].kind == "ipynb"
# may be True depending on canonical formatting differences
assert records[0].would_change in (True, False)
```
test_normalize_is_explicit_and_marks_would_change currently asserts records[0].would_change in (True, False), which is always true and doesn’t validate the intended contract. If the test’s purpose is to ensure non-canonical notebooks are detected, it should construct input that is guaranteed to differ from nbformat.writes(..., indent=2) and then assert would_change is True (or otherwise assert on a specific, deterministic condition).
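One way to make the input deterministically non-canonical, as the reviewer suggests, is sketched below; the helper name is hypothetical and only illustrates constructing a fixture guaranteed to differ from nbformat's serialization.

```python
import json
import os
import tempfile

import nbformat

def make_non_canonical_notebook(dirpath: str) -> str:
    """Write a structurally valid notebook in compact JSON, so it is
    guaranteed to differ from nbformat's indented canonical serialization."""
    nb = nbformat.v4.new_notebook()
    path = os.path.join(dirpath, "nb.ipynb")
    with open(path, "w", encoding="utf-8") as f:
        f.write(json.dumps(json.loads(nbformat.writes(nb)), separators=(",", ":")))
    return path
```

With such a fixture, the test can assert `would_change is True` unconditionally instead of the vacuous `in (True, False)` check.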
tools/docs_and_notebooks_check.py (outdated)
```python
lines.append("## Notes\n")
lines.append("- 'Out of date' does not necessarily mean 'broken'. Use this as a triage signal.\n")
lines.append("- last_git_updated is computed from git history. last_verified is human-controlled.\n\n")
```
The markdown report notes mention last_git_updated, but the tool uses last_git_touched / last_content_updated terminology. This is confusing and makes the report inconsistent with the schema and README; update the text to match the actual field names.
```diff
-lines.append("- last_git_updated is computed from git history. last_verified is human-controlled.\n\n")
+lines.append("- last_git_touched / last_content_updated are computed from git history. last_verified is human-controlled.\n\n")
```
tools/docs_and_notebooks_check.py (outdated)
```python
"errors": sum(1 for r in records if r.errors),
"missing_metadata": sum(1 for r in records if "missing_metadata" in r.warnings),
"missing_last_verified": sum(1 for r in records if "missing_last_verified" in r.warnings),
"git_stale": sum(1 for r in records if any(w.startswith("git_stale") for w in r.warnings)),
```
summarize() computes the git_stale total by looking for warnings starting with git_stale, but the scanner emits content_stale>... warnings. As a result, the report summary will always show Git-stale as 0 even when content is stale. Rename the total key and/or update the warning prefix check so the summary matches the emitted warnings, and align the label in to_markdown() accordingly.
```diff
-"git_stale": sum(1 for r in records if any(w.startswith("git_stale") for w in r.warnings)),
+"git_stale": sum(1 for r in records if any(w.startswith("content_stale") for w in r.warnings)),
```
```python
# Optional: mark maintenance time if we actually write
if write:
    meta.last_metadata_updated = today
```
In update_files(), last_metadata_updated is set whenever --write is used (before checking whether any metadata actually changed). This makes the operation non-idempotent and can cause daily rewrites/diffs even when no other fields changed. Consider only setting/updating last_metadata_updated if a write is actually needed (i.e., after detecting a change) so reruns don’t produce churn.
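The idempotent variant the reviewer asks for can be sketched as below, treating metadata as plain dicts for brevity; the function name is hypothetical.

```python
from datetime import date

def stamp_if_changed(before: dict, after: dict, today: date) -> dict:
    """Only refresh last_metadata_updated when some other field actually
    changed, so rerunning update --write on an unchanged file yields no diff."""
    def strip(m: dict) -> dict:
        return {k: v for k, v in m.items() if k != "last_metadata_updated"}
    if strip(before) == strip(after):
        return dict(before)  # nothing to write; keep the old stamp
    return {**after, "last_metadata_updated": today.isoformat()}
```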
```python
if write:
    # Rewrite notebook in canonical form
    write_ipynb_meta(abs_path, nb)

    # Update embedded maintenance timestamp
    meta = rec.meta or DLCMeta()
    meta.last_metadata_updated = today

    nb_meta = nb.setdefault("metadata", {})
    prev = nb_meta.get(DLC_NAMESPACE, {})
    if not isinstance(prev, dict):
        prev = {}
    merged = dict(prev)
    merged.update(meta_to_jsonable(meta))
    nb_meta[DLC_NAMESPACE] = merged

    # Write again to persist metadata update (still canonical)
    write_ipynb_meta(abs_path, nb)
    rec.meta = meta
```
normalize_notebooks() writes the notebook once to normalize formatting, then updates last_metadata_updated and writes again. This doubles I/O and increases the chance of inconsistencies if the first write succeeds but the second fails. You can set the metadata update before writing and perform a single canonical write when write=True.
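The single-write version suggested here can be sketched as follows; the helper name is hypothetical, and the trailing newline is an assumption about the canonical form.

```python
import nbformat

DLC_NAMESPACE = "deeplabcut"

def normalize_and_stamp(path: str, today_iso: str) -> None:
    """Set the maintenance timestamp first, then perform one canonical write,
    instead of write -> mutate metadata -> write again."""
    nb = nbformat.read(path, as_version=4)
    ns = nb.setdefault("metadata", {}).setdefault(DLC_NAMESPACE, {})
    ns["last_metadata_updated"] = today_iso
    with open(path, "w", encoding="utf-8") as f:
        f.write(nbformat.writes(nb) + "\n")
```

A single write also makes the operation atomic-ish: either the normalized notebook with its stamp lands on disk, or nothing does.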
```yaml
- name: Upload staleness artifacts
  uses: actions/upload-artifact@v4
  with:
    name: staleness-report
    path: |
      tmp/docs_nb_checks/*.json
      tmp/docs_nb_checks/*.md
    if-no-files-found: error
```
Note: in the summary this sometimes gets rendered twice; needs checking.
Introduce a --no-step-summary CLI flag and use it in the GitHub Actions workflow to prevent writing to GITHUB_STEP_SUMMARY during the docs/notebooks check job. The script now respects args.no_step_summary when deciding whether to emit the step summary. Also adjust the summary writing behavior to include the full markdown content (previous truncation to 220 lines was removed/commented) when writing is enabled.
Update GitHub Actions workflow to use actions/checkout@v6 and modify the editable install command used in the test step. Replaces "pip install -e .[dev]" with "pip install -e . --group dev" before running pytest, aligning the workflow with the newer checkout action and the revised dependency installation syntax.
```diff
@@ -0,0 +1,186 @@
+# Docs & Notebooks Checks Tool
```
TODO: update CONTRIBUTING.md as well.
Document the marker convention (chore(metadata)):
- when to use it
- examples of metadata‑only changes
- why it matters (preserves meaningful content age)
Summary / Purpose
This PR introduces a safe-by-default automation tool to improve release confidence by continuously tracking the “freshness” and validity of our docs and notebooks. Specifically, it:
- Scans Markdown docs (docs/**/*.md) and notebooks (examples/COLAB/**/*.ipynb, examples/JUPYTER/**/*.ipynb, and docs/**/*.ipynb).
The intent is to reduce regressions and staleness, without blocking development or rewriting content in CI by default.
Related
For other linting-effort PRs, see:
Design Overview
Safety-first approach (default read-only)
The tool is designed to be non-invasive in CI:
- report and check are read-only modes intended for PR CI and scheduled jobs.
- update --write is an explicit opt-in mode intended for maintainers (local runs or dedicated maintenance PRs).
Minimal and predictable notebook handling
For notebooks:
Schemas & config
Uses Pydantic v2 models with an explicit schema_version to keep report/config evolvable.
Behavior is controlled via a YAML config (docs_and_notebooks_report_config.yml) including include/exclude globs and policy thresholds.
Policy “ratcheting” is supported through allowlists:
require_metadata, require_recent_verification, and require_notebook_normalized (all empty by default)
This will let us set priority targets and deadlines that must not go out-of-date for CI to pass.
How this work is split across PRs, and the exact separation of concerns, still need to be decided.
What’s Included
New tool
.github/tools/docs_and_notebooks_check.py
- report: generate JSON/MD report
- check: report + policy enforcement (currently allowlist-only)
- update --write: update last_git_updated (and optionally verification fields); explicit opt-in
- validates notebooks via nbformat.validate
New configuration
.github/tools/docs_and_notebooks_report_config.yml
CI integration (standalone job)
Adds a dedicated workflow/job that runs:
- python .github/tools/docs_and_notebooks_check.py report
- python .github/tools/docs_and_notebooks_check.py check
Uploads the output artifacts (docs_nb_checks.json / .md) for easy review
Uses fetch-depth: 0 so git dates are accurate
Repo maintenance tools
Non-goals
How To Use
Pre-commit / Developer Workflow (optional but recommended)
This PR includes a pre-commit hook so contributors get fast feedback when editing notebooks/docs.
The hook should run the script in check mode on touched files.
Policy & Ratcheting Plan (Future-proofing)
This PR intentionally starts permissive and allows checks to become stricter over time without breaking existing workflows:
Testing / Verification
Future Actions & Extensions (Ideas)
Reviewer Notes / Expected Diff Characteristics
If update --write (or notebook normalization) is run, some notebooks may show larger diffs due to nbformat canonical formatting.
After the initial normalization, diffs should remain small and predictable, similar to other linting in pre-commit hooks.
The tool is designed to avoid touching notebook cells/outputs; only metadata is modified.