
Add automated docs & notebooks freshness + normalization checks#3228

Draft
C-Achard wants to merge 23 commits into main from cy/automated-docs&nb-report

Conversation

C-Achard (Collaborator) commented Mar 5, 2026

Summary / Purpose

This PR introduces a safe-by-default automation tool to improve release confidence by continuously tracking the “freshness” and validity of our docs and notebooks. Specifically, it:

  • Scans docs (docs/**/*.md) and notebooks (examples/COLAB/**/*.ipynb, examples/JUPYTER/**/*.ipynb, and docs/**/*.ipynb)
  • Captures two complementary signals:
    • last_git_updated: computed from git history (most recent commit touching the file)
    • last_verified: human-controlled “verified correct/working” date (missing treated as a warning initially)
  • Validates notebooks using nbformat.validate and detects when a notebook is not normalized to our canonical write format (warns with notebook_not_normalized)
  • Produces machine-readable and human-readable output reports to make drift visible and actionable

The intent is to reduce regressions and staleness, without blocking development or rewriting content in CI by default.
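As a rough illustration, the last_git_updated signal can be computed like this (function names and the exact git invocation are assumptions, not the PR's code):

```python
# Illustrative sketch only: derive a file's "last_git_updated" date from git
# history, as described above. Names are assumptions, not the PR's API.
import subprocess
from datetime import datetime, timezone
from typing import Optional

def last_git_updated(path: str, repo_root: str = ".") -> Optional[datetime]:
    """Author date of the most recent commit touching `path`, or None."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%aI", "--", path],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout.strip()
    return datetime.fromisoformat(out) if out else None

def days_since_update(path: str, repo_root: str = ".") -> Optional[int]:
    """Age of the file's last content-touching commit, in whole days."""
    ts = last_git_updated(path, repo_root)
    return None if ts is None else (datetime.now(timezone.utc) - ts).days
```

This is also why the CI job checks out with fetch-depth: 0; a shallow clone would make these dates wrong.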

Related

For other linting-effort PRs, see:

Design Overview

Safety-first approach (default read-only)

The tool is designed to be non-invasive in CI:

  • report and check are read-only modes intended for PR CI and scheduled jobs.
  • update --write is an explicit opt-in mode intended for maintainers (local runs or dedicated maintenance PRs).

Minimal and predictable notebook handling

For notebooks:

  • Use nbformat for read/validate/write (notebooks are “notebook-native” objects).
  • If writing, restrict updates to top-level notebook metadata only under the metadata.deeplabcut namespace (no cell/output changes).
  • Detect formatting drift by comparing the on-disk content with nbformat.writes(..., indent=2); on mismatch, emit the notebook_not_normalized warning
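A minimal sketch of the drift check, assuming a straight string comparison against nbformat's serialization (the PR compares against nbformat.writes(..., indent=2); the default serialization is used here):

```python
# Minimal sketch of the "notebook_not_normalized" check: a notebook counts as
# normalized if its on-disk text matches nbformat's canonical serialization.
import nbformat

def notebook_is_normalized(path: str) -> bool:
    with open(path, encoding="utf-8") as f:
        on_disk = f.read()
    nb = nbformat.reads(on_disk, as_version=nbformat.NO_CONVERT)
    nbformat.validate(nb)  # raises on schema errors
    return on_disk == nbformat.writes(nb)
```

Any notebook saved by a tool with different JSON formatting (key order, indentation) will fail this comparison even though its content is identical, which is exactly the drift this warning is meant to surface.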

Schemas & config

Uses Pydantic v2 models with an explicit schema_version to keep report/config evolvable.
Behavior is controlled via a YAML config (docs_and_notebooks_report_config.yml) including include/exclude globs and policy thresholds.
Policy “ratcheting” is supported through allowlists:
require_metadata, require_recent_verification, and require_notebook_normalized (all empty by default)

This lets us designate priority targets with deadlines that must stay current for CI to pass.
Implementation specifics and how to split the work across PRs still need to be decided.
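The schema could look roughly like this (field names follow the text above; defaults and types are assumptions, not the PR's exact models):

```python
# Hedged sketch of the Pydantic v2 models described above. Field names come
# from this PR description; defaults/types are illustrative assumptions.
from datetime import date
from typing import Optional
from pydantic import BaseModel

class DLCMeta(BaseModel):
    schema_version: int = 1
    last_git_updated: Optional[date] = None  # computed from git history
    last_verified: Optional[date] = None     # human-controlled; missing -> warning

class Policy(BaseModel):
    warn_if_older_than_days: int = 365
    # Allowlists start empty; populating them "ratchets" enforcement.
    require_metadata: list[str] = []
    require_recent_verification: list[str] = []
    require_notebook_normalized: list[str] = []
```

Serializing with model_dump(mode="json", exclude_none=True) keeps the metadata JSON-safe (dates become ISO strings), which is what the notebook-writing path needs.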

What’s Included

New tool

  • .github/tools/docs_and_notebooks_check.py
    • report: generate JSON/MD report
    • check: report + policy enforcement (currently allowlist-only)
    • update --write: update last_git_updated (and optionally verification fields) — explicit opt-in
  • Notebook validation via nbformat.validate
  • Notebook canonical formatting drift detection (notebook_not_normalized warning)

New configuration

  • .github/tools/docs_and_notebooks_report_config.yml
  • Includes scan patterns for examples/COLAB, examples/JUPYTER, and docs
  • Excludes _build, build, .ipynb_checkpoints
  • Sets warning thresholds to 365 days
  • Treats missing last_verified as a warning (until consensus-driven tiering/ratcheting)
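A config along these lines captures the settings listed above (key names beyond the documented require_* allowlists are illustrative guesses, not the file's exact schema):

```yaml
# Illustrative sketch of docs_and_notebooks_report_config.yml;
# key names are assumptions based on the description, not the exact file.
schema_version: 1
include:
  - "docs/**/*.md"
  - "docs/**/*.ipynb"
  - "examples/COLAB/**/*.ipynb"
  - "examples/JUPYTER/**/*.ipynb"
exclude:
  - "**/_build/**"
  - "**/build/**"
  - "**/.ipynb_checkpoints/**"
policy:
  warn_if_older_than_days: 365
  missing_last_verified: warn
  # Allowlists start empty; populate them to ratchet enforcement (Phase 2).
  require_metadata: []
  require_recent_verification: []
  require_notebook_normalized: []
```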

CI integration (standalone job)

Adds a dedicated workflow/job that runs:

  • python .github/tools/docs_and_notebooks_check.py report
  • python .github/tools/docs_and_notebooks_check.py check

  • Uploads the output artifacts (docs_nb_checks.json / .md) for easy review
  • Uses fetch-depth: 0 so git dates are accurate

Repo maintenance tools

  • .gitignore updated to ignore generated outputs:
    • **/tmp/docs_notebooks_status/

Non-goals

  • No automatic rewriting of notebooks/docs in PR CI (read-only by default).
  • No tier auto-assignment: tier exists in schema but is intentionally left unset unless the project agrees on definitions.
  • Not trying to guarantee that every notebook executes end-to-end in CI (separate, costlier dimension).

How To Use

  • See included README
  • See new pre-commit hook

Pre-commit / Developer Workflow (optional but recommended)

This PR includes a pre-commit hook so contributors get fast feedback when editing notebooks/docs.
The hook should run the script in check mode on touched files.
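Such a hook could be wired up roughly as follows (the entry and file pattern are assumptions; the hook id and dependency pins come from this PR's commit log, and the actual definition lives in .pre-commit-config.yaml):

```yaml
# Hedged sketch of the local pre-commit hook; entry/args may differ from
# the actual .pre-commit-config.yaml in this PR.
- repo: local
  hooks:
    - id: dlc-docs-notebooks-check
      name: DLC docs & notebooks check
      entry: python tools/docs_and_notebooks_check.py check
      language: python
      files: \.(md|ipynb)$
      additional_dependencies: ["pydantic>=2,<3", "pyyaml", "nbformat>=5"]
```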

Policy & Ratcheting Plan (Future-proofing)

This PR intentionally starts permissive and allows checks to become stricter over time without breaking existing workflows:

  • Phase 1 (now): warn only (no CI failures) for missing metadata / missing last_verified / notebook normalization drift.
  • Phase 2: populate allowlists for “high priority” docs/notebooks:
    • require_metadata: must have DLC metadata/frontmatter
    • require_recent_verification: must have recent last_verified
    • require_notebook_normalized: must match canonical nbformat writes

Testing / Verification

  • Ran report locally and inspected generated docs_nb_checks.md
  • Ran check locally (no unexpected failures)
  • Verified notebook validation works (nbformat errors appear under errors)
  • Confirmed notebook_not_normalized warning appears for non-canonical notebooks
  • CI workflow runs successfully and uploads artifacts

Future Actions & Extensions (Ideas)

  • Notebook tiering / ownership metadata (once consensus exists).
  • Integrate with release process: require verification for key docs/notebooks for RCs / stable releases.

Reviewer Notes / Expected Diff Characteristics

If update --write (or notebook normalization) is run, some notebooks may show larger diffs due to nbformat canonical formatting.
After the initial normalization, diffs should remain small and predictable, similar to other linting in pre-commit hooks.
The tool is designed to avoid touching notebook cells/outputs; only metadata is modified.
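The metadata-only write can be sketched like so (helper name and merge strategy are illustrative assumptions, not the PR's exact code):

```python
# Hedged sketch of the metadata-only write path: only the
# nb["metadata"]["deeplabcut"] namespace is touched; cells and outputs
# are never modified. Helper name and signature are illustrative.
import nbformat

DLC_NAMESPACE = "deeplabcut"

def write_dlc_meta(path: str, updates: dict) -> None:
    nb = nbformat.read(path, as_version=nbformat.NO_CONVERT)
    ns = dict(nb.metadata.get(DLC_NAMESPACE, {}))
    ns.update(updates)  # merge into the namespace, never replace wholesale
    nb.metadata[DLC_NAMESPACE] = ns
    nbformat.validate(nb)
    nbformat.write(nb, path)  # canonical nbformat serialization
```

Because nbformat.write re-serializes the whole file, the first such write on a non-canonical notebook also normalizes it, which explains the one-time larger diffs noted above.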

C-Achard added 6 commits March 5, 2026 11:08
Introduce a new CLI tool (.github/tools/docs_and_notebooks_check.py) to scan notebooks and Markdown docs for staleness and verification metadata under the 'deeplabcut' namespace. Adds a default YAML config (.github/tools/docs_and_notebooks_report_config.yml), a README for the tool (.github/tools/docs_and_notebooks_tool_README.md), and an output ignore entry in .gitignore. The tool uses pydantic schemas, computes last_git_updated from git history, reads/writes notebook top-level metadata and Markdown frontmatter (idempotent updates), and supports report/check/update modes. Outputs machine- and human-readable reports (nb_docs_status.json / .md). Requires pydantic and PyYAML; designed to be safe-by-default for CI (read-only unless --write is passed).
Introduce a GitHub Actions workflow to scan docs and notebooks for staleness. The workflow runs on push and PRs to main, checks out full git history, uses Python 3.12, installs pydantic and pyyaml, and runs a read-only staleness report and an optional policy check using .github/tools/docs_and_notebooks_check.py with tools/staleness_config.yml. Results (JSON/MD) are uploaded as the staleness-report artifact. Workflow is limited to content read permissions and has a 10-minute timeout.
Rename OUTPUT_FILENAME from 'nb_docs_status' to 'docs_nb_checks' and use it for the default --out-dir (tmp/docs_nb_checks). Update the README to show the check command as a fenced code block and clarify allowlist behavior. Update .gitignore to ignore the new tmp/docs_nb_checks path.
Ensure DLC metadata is JSON-serializable by converting date/datetime fields to ISO strings and preserving exclude_none behavior. Uses pydantic v2 API (model_dump(mode="json", exclude_none=True)) and falls back to pydantic v1 via json.loads(meta.json(...)). Adds a docstring and clarifying comments. This prevents json.dumps from failing when writing .ipynb files and keeps compatibility across pydantic versions.
Update docs-and-notebooks tool to use nbformat for reading/writing notebooks, validate .ipynb files, and detect whether notebooks are normalized. Add notebook_is_normalized helper and ensure write_ipynb_meta uses nbformat.writes/validate. Introduce a new policy field require_notebook_normalized (and add it to the report config defaults) and enforce it to emit violations when notebooks are not normalized. Also update CI job to install nbformat and pin pydantic, and update the script header notes to list the new dependency. These changes let CI detect invalid or non-normalized notebooks and reduce formatting churn when normalizing files.
Add a local pre-commit hook 'dlc-docs-notebooks-check' that runs .github/tools/docs_and_notebooks_check.py to check DLC docs and notebooks for staleness, validate nbformat, and perform normalization. The hook targets Jupyter and Markdown files, passes filenames to the script, and declares additional dependencies (pydantic>=2,<3, pyyaml, nbformat>=5).
@C-Achard C-Achard added this to the Centralize linting update milestone Mar 5, 2026
@C-Achard C-Achard self-assigned this Mar 5, 2026
C-Achard added the enhancement, COLAB, Jupyter, documentation, and CI labels Mar 5, 2026
@C-Achard C-Achard requested a review from Copilot March 5, 2026 10:58
Copilot AI (Contributor) left a comment

Pull request overview

Adds an automated “docs + notebooks freshness” scanning tool and wires it into CI/pre-commit to surface stale or invalid content (git-updated date, human “last_verified”, nbformat validation, and notebook canonical-format drift).

Changes:

  • Introduces .github/tools/docs_and_notebooks_check.py + YAML config to scan docs/notebooks, emit JSON/MD reports, and (optionally) update metadata in-place.
  • Adds a GitHub Actions workflow to run report + check and upload report artifacts.
  • Adds an (optional) pre-commit hook and ignores generated report output under tmp/.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
.pre-commit-config.yaml Adds local hook intended to run the checks on touched Markdown/notebooks.
.gitignore Ignores generated report output directory under tmp/.
.github/workflows/docs_and_notebooks_checks.yml Adds CI job to run report/check and upload artifacts.
.github/tools/docs_and_notebooks_tool_README.md Documents the tool’s purpose, metadata locations, and usage.
.github/tools/docs_and_notebooks_report_config.yml Defines scan include/exclude globs and policy thresholds/allowlists.
.github/tools/docs_and_notebooks_check.py Implements scanning/reporting/checking/updating logic using git + nbformat + pydantic.
Comments suppressed due to low confidence (1)

.github/workflows/docs_and_notebooks_checks.yml:43

  • The comment says check mode “will fail only once you populate allowlists”, but docs_and_notebooks_check.py also returns non-zero if any notebooks produce errors (e.g., nbformat validation failures). If the intent is non-blocking until allowlists are set, either adjust the script’s exit-code behavior or update this comment to reflect that errors can already fail the job.
      # Optional: run check mode (will fail only once you populate allowlists in config)
      - name: Run staleness policy check (optional gate)
        run: |
          python .github/tools/docs_and_notebooks_check.py check \
          --config .github/tools/docs_and_notebooks_report_config.yml --out-dir tmp/docs_nb_checks


C-Achard added 4 commits March 5, 2026 14:38
Use docs_and_notebooks_report_config.yml as the default config and resolve it relative to the script. Rename machine/human report outputs to docs_nb_checks.{json,md}. Add an optional --targets argument to the report and check subcommands; scan_files now accepts a targets list and filters scanned paths to only those targets. Make --config default a string path and adjust error-exit logic so parsing errors don't cause a non-zero exit in report mode. Minor doc/formatting tweaks.
Update references and examples in .github/tools/docs_and_notebooks_tool_README.md: change config reference to .github/tools/docs_and_notebooks_report_config.yml, update report output paths to tmp/docs_nb_checks/..., simplify the example 'check' command, and replace usages of tools/staleness.py with .github/tools/docs_and_notebooks_check.py in the update/example commands. Also tidy the 'Writes' section formatting.
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.



Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
C-Achard added 2 commits March 6, 2026 11:38
Rename .github/tools/docs_and_notebooks_* to tools/ and update references. Updated workflow (.github/workflows/docs_and_notebooks_checks.yml) and pre-commit config to call tools/docs_and_notebooks_check.py and use tools/docs_and_notebooks_report_config.yml, updated the tool script's internal docs and the README paths, and tweaked the workflow name to "Docs & notebooks freshness and formatting checks".
C-Achard added 4 commits March 6, 2026 14:24
Treat metadata-only commits specially: add META_COMMIT_MARKER and compute a last_content_updated by skipping commits that contain that marker (falling back to raw git-touched date with a warning). Rename policy/config and record fields from git->content (warn_if_content_older_than_days, last_content_updated, days_since_content_update) and add debug last_git_touched plus last_metadata_updated metadata. Add guardrail requiring --ack-meta-commit-marker when writing metadata/normalizing notebooks, print a suggested commit message, and introduce a new normalize subcommand to deterministically reformat notebooks. Misc: factor git date parsing, update update_files API and behavior to optionally set content dates from git, propagate warnings when fallback is used, and update reporting output to surface content/git-touched/metadata timestamps.
Add a comprehensive contract test suite at tests/tools/docs_and_notebooks_checks/test_check_contracts.py verifying constants, DLCMeta schema, git-derived dates, scan/update/normalize behaviors, and output writing. Also call model_rebuild() for the Pydantic models in tools/docs_and_notebooks_check.py to ensure models are rebuilt correctly when using __future__ annotations so the tests import and validate the runtime models as expected.
Clarify and expand the Docs & Notebooks checks tool README: rename last_git_updated to last_content_updated (computed from git but ignoring metadata-only commits), add last_metadata_updated and verified_for metadata fields, and emphasize separation of content vs metadata. Document the META_COMMIT_MARKER requirement for metadata-only/normalization commits and provide suggested commit messaging and guardrails for update/normalize operations. Reorganize commands (report, check, update, normalize), note that update/normalize are write-only for maintainers, and add CI guidance (use actions/checkout fetch-depth: 0) and required dependencies. Also include troubleshooting tips and mention deterministic notebook normalization and Pydantic model rebuild guidance.
Reduce job timeout from 10 to 5 minutes, upgrade actions/checkout to v6 and actions/setup-python to v6, and allow the staleness policy check step to continue-on-error. These changes use newer action releases, shorten runtime limits, and ensure the optional gate doesn't fail the workflow.
C-Achard added 4 commits March 6, 2026 14:36
Include pydantic>2 in pyproject.toml dependencies to require Pydantic v2+ for project data models/validation and ensure compatibility with code expecting Pydantic v2 behavior.
Differentiate between absent and invalid DLC metadata in notebooks and markdown files. read_ipynb_meta now returns a has_dlc flag; parse_dlc_meta returns (meta, valid). scan_files uses the new flags to set rec.meta and append explicit warnings "missing_metadata" or "invalid_metadata" instead of treating all None as missing. Call sites in update_files and normalize_notebooks updated to unpack the extra return value. Misc cleanup around frontmatter handling and error/warning reporting.
Add two tests to verify notebook metadata validation: one ensures a notebook missing the "deeplabcut" namespace triggers a "missing_metadata" warning and leaves meta as None; the other ensures an invalid "deeplabcut" namespace (bad last_verified value) triggers an "invalid_metadata" warning and leaves meta as None. Both tests create minimal notebooks in a temp git repo, commit them, run the tool scan, and assert the expected warnings and record kinds.
Change GitHub Actions test step to install the package with development extras (pip install -e .[dev]) instead of only pytest. Update pyproject.toml dev dependency group by adding pydantic>2 and nbformat>5 and removing black so CI/tests have the required dev libraries available.
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

pyproject.toml:118

  • Version specifiers pydantic>2 / nbformat>5 exclude the major boundary versions (e.g., pydantic 2.0.0, nbformat 5.0.0) and don’t set an upper bound. For stability and to match the constraints used elsewhere in the PR (e.g., pydantic>=2,<3), consider using >=... ,<... style constraints here too.
[dependency-groups]
dev = [
  "pydantic>2",
  "nbformat>5",
  "coverage",
  "pytest",
  "pytest-cov",
]


Comment on lines 63 to 85
      - name: Install dependencies
        shell: bash -el {0} # Important: activates the conda environment
        run: |
          python -m pip install --upgrade pip setuptools wheel
          pip install --no-cache-dir -e .

      - name: Install ffmpeg
        run: |
          if [ "$RUNNER_OS" == "Linux" ]; then
            sudo apt-get update
            sudo apt-get install ffmpeg
          elif [ "$RUNNER_OS" == "macOS" ]; then
            brew install ffmpeg || true
          else
            choco install ffmpeg
          fi
        shell: bash

      - name: Run pytest tests
        shell: bash -el {0} # Important: activates the conda environment
        run: |
          pip install --no-cache-dir pytest
          pip install --no-cache-dir -e .[dev]
          python -m pytest
Copilot AI commented Mar 6, 2026

The workflow installs the package twice: first pip install -e . in the dependencies step, then pip install -e .[dev] right before running pytest. This is redundant and increases CI time/variance. Consider installing -e .[dev] once (or installing only the extra test deps you need) and removing the duplicate install.

Comment on lines +313 to +325
        # Dry-run normalize: should set would_change True if not normalized
        records = tool.normalize_notebooks(
            repo_root=repo,
            cfg=cfg,
            targets=[rel],
            write=False,
            ack_meta_commit_marker=True,
        )
        assert len(records) == 1
        assert records[0].kind == "ipynb"
        # may be True depending on canonical formatting differences
        assert records[0].would_change in (True, False)

Copilot AI commented Mar 6, 2026

test_normalize_is_explicit_and_marks_would_change currently asserts records[0].would_change in (True, False), which is always true and doesn’t validate the intended contract. If the test’s purpose is to ensure non-canonical notebooks are detected, it should construct input that is guaranteed to differ from nbformat.writes(..., indent=2) and then assert would_change is True (or otherwise assert on a specific, deterministic condition).


lines.append("## Notes\n")
lines.append("- 'Out of date' does not necessarily mean 'broken'. Use this as a triage signal.\n")
lines.append("- last_git_updated is computed from git history. last_verified is human-controlled.\n\n")
Copilot AI commented Mar 6, 2026

The markdown report notes mention last_git_updated, but the tool uses last_git_touched / last_content_updated terminology. This is confusing and makes the report inconsistent with the schema and README; update the text to match the actual field names.

Suggested change
lines.append("- last_git_updated is computed from git history. last_verified is human-controlled.\n\n")
lines.append("- last_git_touched / last_content_updated are computed from git history. last_verified is human-controlled.\n\n")

"errors": sum(1 for r in records if r.errors),
"missing_metadata": sum(1 for r in records if "missing_metadata" in r.warnings),
"missing_last_verified": sum(1 for r in records if "missing_last_verified" in r.warnings),
"git_stale": sum(1 for r in records if any(w.startswith("git_stale") for w in r.warnings)),
Copilot AI commented Mar 6, 2026

summarize() computes the git_stale total by looking for warnings starting with git_stale, but the scanner emits content_stale>... warnings. As a result, the report summary will always show Git-stale as 0 even when content is stale. Rename the total key and/or update the warning prefix check so the summary matches the emitted warnings, and align the label in to_markdown() accordingly.

Suggested change
"git_stale": sum(1 for r in records if any(w.startswith("git_stale") for w in r.warnings)),
"git_stale": sum(1 for r in records if any(w.startswith("content_stale") for w in r.warnings)),

Comment on lines +554 to +557
        # Optional: mark maintenance time if we actually write
        if write:
            meta.last_metadata_updated = today

Copilot AI commented Mar 6, 2026

In update_files(), last_metadata_updated is set whenever --write is used (before checking whether any metadata actually changed). This makes the operation non-idempotent and can cause daily rewrites/diffs even when no other fields changed. Consider only setting/updating last_metadata_updated if a write is actually needed (i.e., after detecting a change) so reruns don’t produce churn.

Comment on lines +636 to +654
        if write:
            # Rewrite notebook in canonical form
            write_ipynb_meta(abs_path, nb)

            # Update embedded maintenance timestamp
            meta = rec.meta or DLCMeta()
            meta.last_metadata_updated = today

            nb_meta = nb.setdefault("metadata", {})
            prev = nb_meta.get(DLC_NAMESPACE, {})
            if not isinstance(prev, dict):
                prev = {}
            merged = dict(prev)
            merged.update(meta_to_jsonable(meta))
            nb_meta[DLC_NAMESPACE] = merged

            # Write again to persist metadata update (still canonical)
            write_ipynb_meta(abs_path, nb)
            rec.meta = meta
Copilot AI commented Mar 6, 2026

normalize_notebooks() writes the notebook once to normalize formatting, then updates last_metadata_updated and writes again. This doubles I/O and increases the chance of inconsistencies if the first write succeeds but the second fails. You can set the metadata update before writing and perform a single canonical write when write=True.

Comment on lines +51 to +58
      - name: Upload staleness artifacts
        uses: actions/upload-artifact@v4
        with:
          name: staleness-report
          path: |
            tmp/docs_nb_checks/*.json
            tmp/docs_nb_checks/*.md
          if-no-files-found: error
C-Achard (Collaborator, Author) commented

Note: in the summary this sometimes gets rendered twice; check.

C-Achard added 2 commits March 6, 2026 20:01
Introduce a --no-step-summary CLI flag and use it in the GitHub Actions workflow to prevent writing to GITHUB_STEP_SUMMARY during the docs/notebooks check job. The script now respects args.no_step_summary when deciding whether to emit the step summary. Also adjust the summary writing behavior to include the full markdown content (previous truncation to 220 lines was removed/commented) when writing is enabled.
Update GitHub Actions workflow to use actions/checkout@v6 and modify the editable install command used in the test step. Replaces "pip install -e .[dev]" with "pip install -e . --group dev" before running pytest, aligning the workflow with the newer checkout action and the revised dependency installation syntax.
@@ -0,0 +1,186 @@
# Docs & Notebooks Checks Tool
C-Achard (Collaborator, Author) commented

TODO: Update CONTRIBUTING.md as well.
Document the marker convention (chore(metadata)):

  • when to use it
  • examples of metadata‑only changes
  • why it matters (preserves meaningful content age)
