File Upload

While it is possible to manually parse uploaded file content and metadata from the wsgi.input entry in the environment dictionary, doing so is often verbose and error-prone.
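
To give a sense of what manual parsing involves, here is a minimal sketch that uses only the standard library's email parser. The function name parse_multipart_manually is hypothetical, and the sketch assumes a small request body that fits in memory and applies no size limit, so treat it as an illustration rather than something to deploy.

```python
from email.parser import BytesParser


def parse_multipart_manually(environ):
    """Return a list of (field name, filename, payload bytes) tuples."""
    # CONTENT_LENGTH arrives as a string and may be missing or empty.
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        length = 0
    body = environ['wsgi.input'].read(length)

    # The part boundary is declared in the Content-Type header, so
    # prepend that header to the body and let the parser split it.
    header = (
        b'Content-Type: ' + environ['CONTENT_TYPE'].encode() + b'\r\n\r\n'
    )
    message = BytesParser().parsebytes(header + body)
    if not message.is_multipart():
        return []

    parts = []
    for part in message.get_payload():
        # The <input> name is a parameter of Content-Disposition.
        name = part.get_param('name', header='content-disposition')
        parts.append(
            (name, part.get_filename(), part.get_payload(decode=True))
        )
    return parts
```

Even this short version has to deal with the content length, the boundary and per-part headers by hand, and it still ignores upload size limits and character sets.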

To simplify this, we will use the Multipart parser module. Its parse_form_data function returns a pair of MultiDict instances: the first holds the ordinary form fields, and the second holds the uploaded files as MultipartPart objects.

The file_upload.py script below demonstrates how to upload and save multiple files:

#!/usr/bin/env python

from wsgiref.simple_server import make_server
from wsgiref.types import WSGIEnvironment, StartResponse
from wsgiref.validate import validator
from multipart import parse_form_data # type: ignore
from html import escape
from os.path import basename, join

html_template = '''
<!DOCTYPE html>
<html>
  <body>
    <form method="post" action="" enctype="multipart/form-data">
      <fieldset>
        <legend>Files to upload</legend>
        <label for="file1">First File</label>
        <input id="file1" type="file" name="file1" />
        <label for="file2">Second File</label>
        <input id="file2" type="file" name="file2" />
      </fieldset>
      <p>
        <input type="submit" value="Upload Files"/>
      </p>
    </form>
    <p>Uploaded Files:</p>
    <pre>{files_items}</pre>
  </body>
</html>
'''

# Choose a suitable upload directory
# and make it writable by the HTTP server user.
# On Linux, /tmp is already writable.
upload_directory = r'/Uploaded_files'

def application(
   environ: WSGIEnvironment, start_response: StartResponse
):

   # parse_form_data returns two MultiDict instances:
   # one for form fields and the other for uploaded files
   forms, files = parse_form_data(environ)

   # Properties of each uploaded file, one list item per file
   file_items = []

   # loop over each uploaded file
   for key in files.keys():

      # MultipartPart instance
      uploaded_file = files[key]

      # read the whole file as bytes, then close it
      file_content = uploaded_file.file.read()
      uploaded_file.file.close()
      file_content_size = len(file_content)

      # The name attribute of the <input> tag
      name = uploaded_file.name

      # basename strips leading path from file name
      # to avoid directory traversal attacks
      upload_path = join(
         upload_directory, basename(uploaded_file.filename)
      )

      # format properties as text for presentation
      file_items.append('\n'.join([
         f'{name} file size: {file_content_size}',
         f'{name} upload path: {upload_path}',
         f'{name} filename: {uploaded_file.filename}',
         f'{name} size: {uploaded_file.size}',
         f'{name} charset: {uploaded_file.charset}',
         f'{name} Content-Disposition: {uploaded_file.disposition}',
         f'{name} Content-Type: {uploaded_file.content_type}',
         f'{name} Content: {file_content}',
      ]))

      # save the file; the with statement closes it
      # even if the write fails
      with open(upload_path, 'wb') as output_file:
         output_file.write(file_content)

   output_values = dict(
      files_items = escape(
         '\n\n'.join(file_items)
      ),
   )
   response_body = html_template.format(**output_values).encode()
   status = '200 OK'

   response_headers = [
      ('Content-Type', 'text/html; charset=utf-8'),
      ('Content-Length', str(len(response_body)))
   ]

   start_response(status, response_headers)
   return [response_body]

validator_app = validator(application)

httpd = make_server('localhost', 8051, validator_app)
httpd.serve_forever()
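
The listing above reads each upload fully into memory and relies on basename alone to keep files inside the upload directory. As a sketch of a more defensive variant, the hypothetical helper save_upload below rejects unusable names, checks that the resolved target stays inside the directory, and copies the data in chunks. It assumes only that the source object is a binary file-like object, as uploaded_file.file is in the listing.

```python
import shutil
from os.path import basename, commonpath, join, realpath


def save_upload(source, directory, client_filename):
    """Copy the file-like object `source` into `directory`; return the path."""
    # Keep only the last path component sent by the client.
    filename = basename(client_filename)
    if filename in ('', '.', '..'):
        raise ValueError(f'unusable file name: {client_filename!r}')

    # Resolve symbolic links and confirm the target stays inside
    # the upload directory.
    root = realpath(directory)
    target = realpath(join(root, filename))
    if commonpath([root, target]) != root:
        raise ValueError(
            f'path escapes upload directory: {client_filename!r}'
        )

    # Copy in chunks instead of holding the whole upload in memory.
    with open(target, 'wb') as destination:
        shutil.copyfileobj(source, destination)
    return target
```

Inside the loop it could be called as save_upload(uploaded_file.file, upload_directory, uploaded_file.filename), in place of the separate read and write steps, provided the file has not already been read.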

Save the script as file_upload.py, set the upload_directory variable to a suitable location, and run it.
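
Before starting the server, it can help to confirm that the chosen directory exists and is writable. The sketch below is one way to do that with the standard library. The function name ensure_upload_directory is hypothetical, and the 0o700 mode is an assumption suited to a directory used by a single user.

```python
import os


def ensure_upload_directory(path):
    """Create `path` if needed and confirm the current user can write to it."""
    # The mode applies only when the directory is created,
    # and the process umask may narrow it further.
    os.makedirs(path, mode=0o700, exist_ok=True)

    # Writing a file into a directory needs write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        raise PermissionError(f'cannot write to upload directory: {path}')
    return path
```

Calling ensure_upload_directory(upload_directory) once at startup would surface a missing or read-only directory before the first upload arrives.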

On Fedora, and likely on all Red Hat-derived distributions, SELinux will prevent the apache user from writing uploaded files to the /tmp directory. To resolve this, run the following commands as root:

# setsebool -P httpd_unified on
# semanage boolean -l | grep httpd_unified

$ python /path/to/file_upload.py

Then, access it at http://localhost:8051
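
The upload can also be exercised without a browser. The following sketch builds a multipart/form-data request by hand and posts it with urllib. The helper names build_multipart and upload are hypothetical, and the sketch assumes that field names and file names contain no double quotes or line breaks.

```python
import uuid
from urllib.request import Request, urlopen


def build_multipart(files):
    """Encode {field: (filename, bytes)} as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    lines = []
    for field, (filename, content) in files.items():
        disposition = (
            f'Content-Disposition: form-data; '
            f'name="{field}"; filename="{filename}"'
        )
        lines += [
            f'--{boundary}'.encode(),
            disposition.encode(),
            b'Content-Type: application/octet-stream',
            b'',  # blank line separates part headers from content
            content,
        ]
    # The closing boundary carries two trailing dashes.
    lines += [f'--{boundary}--'.encode(), b'']
    body = b'\r\n'.join(lines)
    return body, f'multipart/form-data; boundary={boundary}'


def upload(url, files):
    """POST the files to `url` and return the response text."""
    body, content_type = build_multipart(files)
    request = Request(
        url, data=body, method='POST',
        headers={'Content-Type': content_type},
    )
    with urlopen(request) as response:
        return response.read().decode()
```

With the server running, upload('http://localhost:8051', {'file1': ('hello.txt', b'hello')}) should return the HTML page with the properties of the uploaded file.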
