File Upload¶
While it is possible to manually parse uploaded file content and metadata from the wsgi.input entry in the environment dictionary, doing so is often verbose and error-prone.
To simplify this, we will use the Multipart parser module. It provides a convenient parse_form_data function that returns a pair of MultiDict instances, each containing several MultipartPart objects.
The file_upload.py script below demonstrates how to upload and save multiple files:
#!/usr/bin/env python
from wsgiref.simple_server import make_server
from wsgiref.types import WSGIEnvironment, StartResponse
from wsgiref.validate import validator
from multipart import parse_form_data # type: ignore
from html import escape
from os.path import basename, join
html_template = '''
<!DOCTYPE html>
<html>
<body>
<form method="post" action="" enctype="multipart/form-data">
<fieldset>
<legend>Files to upload</legend>
<label for="file1">First File</label>
<input id="file1" type="file" name="file1" />
<label for="file2">Second File</label>
<input id="file2" type="file" name="file2" />
</fieldset>
<p>
<input type="submit" value="Upload Files"/>
</p>
</form>
<p>Uploaded Files: <pre>{files_items}</pre></p>
</body>
</html>
'''
# chose a suitable upload path directory
# and set permissions for the http server user
# In Linux /tmp is already permitted
upload_directory = r'/Uploaded_files'
def application (
environ: WSGIEnvironment, start_response: StartResponse
):
# MultiDict instances.
# One for form data and the other for files
forms, files = parse_form_data(environ)
# Each uploaded file properties goes in a list item
file_items = []
# loop over each uploaded file
for key in files.keys():
# MultipartPart instance
uploaded_file = files[key]
# read and close the file as binary
file_content = uploaded_file.file.read()
uploaded_file.file.close()
file_content_size = len(file_content)
# The name attribute of the <input> tag
name = uploaded_file.name
# basename strips leading path from file name
# to avoid directory traversal attacks
upload_path = join(
upload_directory, basename(uploaded_file.filename)
)
# format properties as text for presentation
file_items.append ('\n'.join([
f'{name} file size: {file_content_size}',
f'{name} upload path: {upload_path}',
f'{name} filename: {uploaded_file.filename}',
f'{name} size: {uploaded_file.size}',
f'{name} charset: {uploaded_file.charset}',
f'{name} Content-Disposition: {uploaded_file.disposition}',
f'{name} Content-Type: {uploaded_file.content_type}',
f'{name} Content: {file_content}',
]))
# save the file
open(
# strip leading path from file name
# to avoid directory traversal attacks
f'{upload_path}', 'wb'
).write(file_content)
output_values = dict(
files_items = escape(
'\n\n'.join([file for file in file_items])
),
)
response_body = html_template.format(**output_values).encode()
status = '200 OK'
response_headers = [
('Content-Type', 'text/html; charset=utf-8'),
('Content-Length', str(len(response_body)))
]
start_response(status, response_headers)
return [response_body]
validator_app = validator(application)
httpd = make_server('localhost', 8051, validator_app)
httpd.serve_forever()
Save the file, change the upload_directory path variable to a suitable location and execute the script.
On Fedora and likely all Red Hat-derived distributions SELinux will prevent the apache user from accessing the shelve file in the /tmp directory. To resolve this, run the following commands as root:
# setsebool -P httpd_unified on
# semanage boolean -l | grep httpd_unified
$ python /path/to/file_upload.py
Then, access it at http://localhost:8051