X Tutup
Skip to content

fix(transfer_manager): Prevent path traversal in download_many_to_path#1768

Open
chandra-siri wants to merge 13 commits intomainfrom
disallow_folder_traversal
Open

fix(transfer_manager): Prevent path traversal in download_many_to_path#1768
chandra-siri wants to merge 13 commits intomainfrom
disallow_folder_traversal

Conversation

@chandra-siri
Copy link
Collaborator

@chandra-siri chandra-siri commented Mar 9, 2026

fix(transfer_manager): Prevent path traversal in download_many_to_path

This PR addresses a security vulnerability where download_many_to_path could be exploited to write files outside the intended destination directory.

The fix ensures that the resolved path for each blob download remains within the bounds of the user-provided destination_directory. If a blob name would result in a path outside this directory (e.g., by using ../), a warning is issued, and that specific blob download is skipped. This prevents directory traversal attacks.

Absolute paths in blob names (e.g., /etc/passwd) are now treated as relative to the destination_directory, so /etc/passwd will be downloaded to destination_directory/etc/passwd.

See b/449616593 for more details.

BREAKING CHANGE: Blobs that would resolve to a path outside the destination_directory are no longer downloaded. While this is a security fix, users relying on the previous behavior to write files outside the target directory will see a change.

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: storage Issues related to the googleapis/python-storage API. labels Mar 9, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a security issue in the download_many_to_path function of the transfer_manager module. It prevents potential directory traversal vulnerabilities by ensuring that downloaded files are always saved within the intended destination directory. The changes include a new path resolution function, improved documentation, and a warning mechanism to alert users when a blob's path resolves outside the allowed directory.

Highlights

  • Security Fix: This PR prevents downloading files to directories outside the intended destination, addressing a potential security vulnerability.
  • Path Resolution: Introduces a _resolve_path function to safely resolve the destination path for downloaded blobs.
  • Warning Mechanism: Adds a warning message when a blob's resolved path falls outside the specified destination directory.
Changelog
  • google/cloud/storage/transfer_manager.py
    • Added Path import for path manipulation.
    • Added traceback import for debugging.
    • Wrapped the method name in executor.submit with parentheses for clarity.
    • Implemented _resolve_path function to resolve the target directory and blob path.
    • Wrapped the method name in executor.submit with parentheses for clarity.
    • Added a TODO to update the docstring for upload_many_from_filenames.
    • Updated the docstring for download_many_to_path to reflect path resolution.
    • Added a note and TODO to the docstring for download_many_to_path regarding path resolution outside the destination directory.
    • Added a TODO to the docstring for download_many_to_path regarding destination directory parameter.
    • Implemented logic to prevent downloads outside the destination directory and issue a warning.
  • tests/unit/test_transfer_manager.py
    • Modified the expected blob file paths in test_download_many_to_path to include the current working directory.
Activity
  • Introduced Path and traceback imports.
  • Implemented _resolve_path function for secure path resolution.
  • Added warning mechanism to prevent downloads outside the destination directory.
  • Updated docstrings to reflect the new security measures.
  • Modified tests to align with the path resolution changes.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces a Path object and a _resolve_path function to resolve the destination path for downloaded blobs in the download_many_to_path function, preventing path traversal vulnerabilities. The tests were updated to reflect the change in destination path. Review feedback indicates that the implementation of _resolve_path is incorrect for relative paths and will raise a ValueError when blob_path is a relative path. Also, the pathlib.Path.is_relative_to() method was introduced in Python 3.9, and since this library supports Python 3.7+, using this method will cause an AttributeError on Python versions 3.7 and 3.8. The traceback module is imported but never used in the file. The TODO comment should be replaced with a concrete example before merging. The docstring for destination_directory contains a TODO comment, a reference to the old os.path.join implementation, and a redundant parenthetical note. The expression Path(destination_directory).resolve() is evaluated on every iteration of the loop.

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Mar 10, 2026
@chandra-siri chandra-siri marked this pull request as ready for review March 10, 2026 07:06
@chandra-siri chandra-siri requested review from a team as code owners March 10, 2026 07:06
@chandra-siri chandra-siri changed the title fix: prevent downloading file in directory outside fix(transfer_manager): Prevent path traversal in download_many_to_path Mar 10, 2026
@chandra-siri chandra-siri changed the title fix(transfer_manager): Prevent path traversal in download_many_to_path fix(transfer_manager): Prevent path traversal in download_many_to_path Mar 10, 2026
a `blob_name` "/etc/passwd" will be downloaded into
"destination_directory/etc/passwd" instead of "/etc/passwd"
Similarly,
"/tmp/my_fav_blob" downloads to "destination_directory/tmp/my_fav_blob"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a redundant example.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm.. I gave one more example to be clear.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it if you insist :)

"blobname",
[
"../../local/target",
"../mypath",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krishnamd-jkp edge case we discussed yesterday

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Download directory should be /local/target but I think you are mentioning it as mypath/ below

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it works for mypath/ as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the googleapis/python-storage API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

X Tutup