Update FMPose3D modelzoo integration #3221

Draft
deruyter92 wants to merge 19 commits into DeepLabCut:main from deruyter92:jaap/fmpose_modelzoo_integration

@deruyter92 deruyter92 commented Feb 25, 2026

Note: this PR depends on the bleeding-edge code from the FMPose3D repository. After release, the latest FMPose3D dependency needs to be specified in DeepLabCut before merging.

Summary:
This PR adds a dedicated FMPose3D inference path in the PyTorch modelzoo flow and wires early routing for fmpose3d_* models. It relocates FMPose-specific wrappers and inference helpers to pose_estimation_pytorch/modelzoo/fmpose_3d and standardizes outputs to include explicit 3D artifacts (*_3d.h5 / *_3d.json).

Motivation:
Whereas the first implementation of FMPose3D in #3208 was rather separate from the rest of the modelzoo pipeline, this PR aims to integrate it better into the existing modelzoo video inference pipeline.

Changes:

  1. Revert the preliminary implementation from #3208 (Add fmpose3d functionality):

    • move api wrapper to pose_estimation_pytorch
    • delete recipe notebook on how to use the API
  2. Route video inference through video_inference_superanimal:

    • early branching to own video inference pipeline
    • just a thin wrap around FMPose3D, using the superanimal inference format
  3. Add brief example usage in the existing superanimal docs

Example usage:

import deeplabcut

video_path = "demo-video.mp4"
result = deeplabcut.video_inference_superanimal(
    videos=[video_path],
    superanimal_name="",  # ignored for fmpose3d models
    model_name="fmpose3d_animals",
    batch_size=8,
    fmpose_return_3d=True,  # (optional: include 3D dataframe in returned payload)
)

Additional considerations / limitations:

  • Single-individual 3D limitation: upstream FMPose3D lifting still effectively uses one individual, so multi-animal settings are clamped/warned rather than fully supported.
  • Mixed return contract: enabling fmpose_return_3d changes payload structure, which adds API complexity for callers.
  • Inconsistent parameter intuition: the entire 2D modelzoo convention currently follows the intuition that superanimal_name -> means 'dataset family' and model_name -> means 'model architecture'. For FMPose3D the 'dataset family' is different and therefore ignored.
  • Inseparability: the core principle of FMPose3D is lifting 2D keypoints to 3D. To simplify the integration, however, FMPose3D is treated as a single model that directly computes video -> 2D -> 3D, so the 2D step cannot be run separately. Users who want to apply the 3D lifting to their own 2D keypoints should therefore use the FMPose3D API rather than this modelzoo inference function.
  • No video adaptation / transfer learning: while the other 2D modelzoo models can be fine-tuned using video adaptation or trained further from pretrained weights, this is not the case for the FMPose3D model. Users who want to fine-tune the backbone 2D estimator should work with a standard 2D superanimal model and use the FMPose3D API to lift their keypoints, which is fairly straightforward.
  • Redundant outputs: consistent with the 2D superanimal pattern, both JSON and HDF 3D artifacts are written, which increases storage and maintenance overhead. This might need to be reconsidered in the future.
  • Format bridge complexity: FMPose outputs are converted into DLC-style 3D DataFrames, consistent with the 3D pipelines in DeepLabCut. While this works fine if formatting conventions are kept the same, a more robust (validated) data format would be useful.
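
As an illustration of the first limitation above (multi-animal settings are clamped and warned about rather than fully supported), here is a minimal sketch of a warn-then-clamp helper. The function name and the `max_individuals` parameter are hypothetical and are not taken from this PR; only the clamp-and-warn behaviour is.

```python
import warnings


def clamp_max_individuals(max_individuals: int) -> int:
    """Clamp a requested individual count to 1 (hypothetical helper).

    Assumption: the 3D lifting step handles a single individual, so any
    larger request is reduced to 1 and the caller is warned.
    """
    if max_individuals < 1:
        raise ValueError("max_individuals must be at least 1")
    if max_individuals > 1:
        warnings.warn(
            "FMPose3D lifting uses a single individual; "
            f"clamping max_individuals from {max_individuals} to 1.",
            UserWarning,
            stacklevel=2,
        )
        return 1
    return max_individuals
```

Warning instead of raising keeps existing multi-animal call sites working, at the cost of silently dropping individuals if the warning is ignored.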

- Add fmpose3d_humans and fmpose3d_animals framework mapping to pytorch.
- Branch early in video_inference_superanimal(...) when model_name.startswith("fmpose3d").
- Route directly to _video_inference_fmpose3d(...)
- Add _video_inference_fmpose3d(...) as a dedicated path.
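
The early branching listed above can be sketched as follows. This is a simplified illustration, not the code in this PR: both pipeline functions are stubs, and only the `model_name.startswith("fmpose3d")` check and the function names `video_inference_superanimal` / `_video_inference_fmpose3d` come from the description.

```python
def _video_inference_fmpose3d_stub(videos, model_name, **kwargs):
    """Stub standing in for the dedicated FMPose3D path."""
    return {"pipeline": "fmpose3d", "model_name": model_name, "videos": list(videos)}


def _video_inference_2d_stub(videos, superanimal_name, model_name, **kwargs):
    """Stub standing in for the existing 2D superanimal path."""
    return {
        "pipeline": "superanimal_2d",
        "superanimal_name": superanimal_name,
        "model_name": model_name,
        "videos": list(videos),
    }


def route_video_inference(videos, superanimal_name, model_name, **kwargs):
    """Dispatch to the FMPose3D path before any 2D-specific setup runs."""
    if model_name.startswith("fmpose3d"):
        # superanimal_name is not forwarded: it is ignored for fmpose3d models.
        return _video_inference_fmpose3d_stub(videos, model_name, **kwargs)
    return _video_inference_2d_stub(videos, superanimal_name, model_name, **kwargs)
```

Branching at the top of the function means none of the 2D-specific validation has to know about FMPose3D.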

Keep batch loop minimal in DLC:
- result_2d = api.prepare_2d(np.stack(frames))
- result_3d = api.pose_3d(result_2d.keypoints, result_2d.image_size)

DLC responsibilities only:
- iterate video frames/chunks,
- convert 2D results to DLC {"bodyparts": ...} layout,
- save .h5, .json, _3d.json, and optional labeled video.
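
A minimal sketch of that batch loop, assuming a stand-in API object. Only the method names `prepare_2d` / `pose_3d` and the attributes `keypoints` / `image_size` are taken from the lines above; `StubFMPoseAPI`, `Result2D`, `chunked`, and `run_batches` are hypothetical, and plain lists replace the `np.stack(frames)` call so that the example needs no third-party packages.

```python
from dataclasses import dataclass


@dataclass
class Result2D:
    keypoints: list    # one list of (x, y) tuples per frame
    image_size: tuple  # (height, width)


class StubFMPoseAPI:
    """Hypothetical stand-in; the real FMPose3D return types may differ."""

    def prepare_2d(self, frames):
        height, width = len(frames[0]), len(frames[0][0])
        # Fake 2D estimator: a single keypoint at the origin for every frame.
        return Result2D(keypoints=[[(0.0, 0.0)] for _ in frames],
                        image_size=(height, width))

    def pose_3d(self, keypoints, image_size):
        # Fake lifting: append z = 0.0 to every 2D keypoint.
        return [[(x, y, 0.0) for (x, y) in frame] for frame in keypoints]


def chunked(frames, batch_size):
    """Yield consecutive batches of at most batch_size frames."""
    batch = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_batches(api, frames, batch_size):
    """Run 2D estimation and 3D lifting batch by batch, collecting per-frame results."""
    poses_2d, poses_3d = [], []
    for batch in chunked(frames, batch_size):
        result_2d = api.prepare_2d(batch)  # the PR passes np.stack(frames) here
        result_3d = api.pose_3d(result_2d.keypoints, result_2d.image_size)
        poses_2d.extend(result_2d.keypoints)
        poses_3d.extend(result_3d)
    return poses_2d, poses_3d
```

Everything model-specific stays behind the two API calls; the loop itself only chunks frames and accumulates results, which matches the split of responsibilities listed above.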
@deruyter92 deruyter92 requested a review from C-Achard February 25, 2026 16:01

@C-Achard C-Achard left a comment

Thanks, looks good overall. I generally agree with your point on separation of concerns, but the main concern for now is probably that it conforms to the modelzoo API.

Comment on lines +102 to +111
import deeplabcut

video_path = "demo-video.mp4"
deeplabcut.video_inference_superanimal(
    videos=[video_path],
    superanimal_name="superanimal_quadruped",  # ignored for fmpose3d models
    model_name="fmpose3d_animals",
    batch_size=8,
    fmpose_return_3d=True,  # include 3D dataframe in returned payload
)

I think being able to call it like this is really nice in the end, great job!

@deruyter92 deruyter92 marked this pull request as ready for review February 26, 2026 13:43
@C-Achard C-Achard requested a review from Copilot February 26, 2026 14:03
@deruyter92 deruyter92 removed the request for review from MMathisLab February 26, 2026 14:05

Copilot AI left a comment

Pull request overview

This PR integrates FMPose3D monocular 3D pose estimation into the DeepLabCut modelzoo pipeline by creating a dedicated inference path. It relocates FMPose3D functionality from deeplabcut/modelzoo/fmpose_3d to deeplabcut/pose_estimation_pytorch/modelzoo/fmpose_3d and routes FMPose3D models through the video_inference_superanimal function with early branching. The PR standardizes outputs by writing both 2D predictions and 3D artifacts (*.h5 and *.json files) and provides an optional return flag for in-memory 3D DataFrames.

Changes:

  • Relocated FMPose3D wrappers from modelzoo/fmpose_3d to pose_estimation_pytorch/modelzoo/fmpose_3d with enhanced inference pipeline
  • Added early routing in video_inference_superanimal for fmpose3d_* models with dedicated inference function
  • Removed preliminary FMPose3D recipe notebook and updated documentation to include usage example in ModelZoo.md

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `tests/pose_estimation_pytorch/modelzoo/test_fmpose_integration.py` | Updated imports and added comprehensive unit tests with new pytest markers (unittest/functional) |
| `pyproject.toml` | Added pytest markers for fmpose3d, unittest, and functional test categories |
| `docs/recipes/fmpose3d.ipynb` | Deleted preliminary recipe notebook (reverting PR #3208) |
| `docs/ModelZoo.md` | Added FMPose3D usage example with video_inference_superanimal API |
| `deeplabcut/pose_estimation_pytorch/modelzoo/fmpose_3d/inference.py` | New dedicated FMPose3D inference pipeline with 2D-to-3D conversion and dataframe generation |
| `deeplabcut/pose_estimation_pytorch/modelzoo/fmpose_3d/fmpose3d.py` | Relocated FMPose3D API wrapper with model metadata definitions |
| `deeplabcut/pose_estimation_pytorch/modelzoo/fmpose_3d/__init__.py` | Added module with refactoring markers for future keypoint/pandas migrations |
| `deeplabcut/pose_estimation_pytorch/modelzoo/fmpose_3d/README.md` | Added brief FMPose3D overview documentation |
| `deeplabcut/modelzoo/video_inference.py` | Added early branching logic for fmpose3d models with parameter forwarding |
| `deeplabcut/modelzoo/models_to_framework.json` | Registered fmpose3d_humans and fmpose3d_animals as pytorch models |
| `deeplabcut/modelzoo/fmpose_3d/fmpose3d.py` | Deleted old location of FMPose3D wrapper |
| `deeplabcut/modelzoo/fmpose_3d/__init__.py` | Deleted old module initialization |
| `_toc.yml` | Removed fmpose3d recipe from table of contents |

@MMathisLab
Member

> Note: this PR depends on the bleeding-edge code from the FMPose3D repository.

Shall I make a new release of FMPose3D on PyPI?

MMathisLab commented Feb 26, 2026

Here I disagree:

Inconsistent parameter intuition: the entire 2D modelzoo convention currently follows the intuition that superanimal_name -> means 'dataset family' and model_name -> means 'model architecture'. For FMPose3D the 'dataset family' is different and therefore ignored.

Because it uses the 2D SA_quadruped data, or human data, for example, then the architecture is the 3D lifting pipeline...

    superanimal_name="",  # ignored for fmpose3d models
    model_name="fmpose3d_animals",

why not use this, which better adheres to the SA style?

    superanimal_name="quadruped",  # quadruped, human
    model_name="fmpose3d", 

and keep naming consistent:

fmpose_return_3d=True --> fmpose3d_return_3d=True
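
One way the proposed naming could be mapped onto the model names currently registered in this PR is sketched below. The mapping table, the function name, and the tolerance for a `superanimal_` prefix are all assumptions made for illustration; none of them are part of the PR or of the proposal above.

```python
# Assumed mapping from dataset family to the model names registered in this PR.
_FMPOSE3D_VARIANTS = {
    "quadruped": "fmpose3d_animals",
    "human": "fmpose3d_humans",
}


def resolve_fmpose3d_variant(superanimal_name: str, model_name: str) -> str:
    """Map (superanimal_name, "fmpose3d") to a concrete variant (hypothetical helper)."""
    if model_name != "fmpose3d":
        raise ValueError(f"Not an FMPose3D model: {model_name!r}")
    prefix = "superanimal_"
    # Assumption: accept both "quadruped" and "superanimal_quadruped".
    if superanimal_name.startswith(prefix):
        key = superanimal_name[len(prefix):]
    else:
        key = superanimal_name
    try:
        return _FMPOSE3D_VARIANTS[key]
    except KeyError:
        raise ValueError(
            f"Unsupported superanimal_name for fmpose3d: {superanimal_name!r}"
        ) from None
```

With a resolver like this, `superanimal_name` would select the dataset family again instead of being ignored, which is the consistency argument made above.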

@deruyter92
Collaborator Author

> Note: this PR depends on the bleeding-edge code from the FMPose3D repository.
>
> Shall I make a new release of FMPose3D on PyPI?

Yes, that would be great!

@deruyter92 deruyter92 added the enhancement New feature or request label Feb 28, 2026
@deruyter92 deruyter92 marked this pull request as draft February 28, 2026 09:01
@deruyter92
Collaborator Author

Note to self, @C-Achard, and @MMathisLab:
I changed the status to draft to make sure that this PR is only merged after:

  1. Releasing the latest FMPose3D package
  2. Updating the lower-bound required version of the fmpose3d extra in this PR's requirements.txt and setup.py.

@MMathisLab, let me know when the new version of FMPose3D is released; then I'll proceed with updating the requirements.
