title Python CSV Module - Python Cheatsheet
description Python has a csv module, which allows us to work easily with CSV files.
Python CSV Module

The csv module provides tools to read from and write to CSV files, which are commonly used for data exchange.

csv Module vs Manual File Read-Write

While you can read and write CSV files using basic file operations and string methods (such as open() with read() or write(), and split()), the csv module is designed to handle edge cases such as quoted fields, embedded delimiters, and different line endings. It ensures compatibility with CSV files generated by other programs (like Excel) and reduces the risk of parsing errors. For most CSV tasks, prefer the csv module over manual parsing.
For more on file handling basics, see the File and directory Paths page.
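
As a small illustration of the quoted-field edge case mentioned above, the sketch below parses the same line with split() and with csv.reader(). The sample line is made up for this example, and io.StringIO stands in for an open file.

```python
import csv
import io

# Hypothetical sample line: the second field contains a comma inside quotes.
line = 'Alice,"London, UK",30'

# A naive split cuts the quoted field in two and keeps the quote characters.
naive_fields = line.split(',')

# csv.reader treats the quoted text as a single field and strips the quotes.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)   # ['Alice', '"London', ' UK"', '30']
print(parsed_fields)  # ['Alice', 'London, UK', '30']
```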

To get started, import the module:

import csv

csv.reader()

This function takes any iterable that yields strings, one line at a time. In practice this is usually an open file object, opened with newline='' as follows:

import csv

file_path = 'file.csv'

# Read CSV file
with open(file_path, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)

    # Iterate through each row
    for line in reader:
        print(line)

This function returns a reader object that yields each row as a list of strings when iterated over. Each column in a row can be accessed by index, without calling the built-in split() method yourself.
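
A minimal sketch of index-based access, assuming a file whose first row is a header. The data is invented for illustration, and io.StringIO is used in place of a real file so the snippet runs on its own.

```python
import csv
import io

# In-memory stand-in for an open file; the contents are made up.
csvfile = io.StringIO("name,age,city\nAlice,30,London\nBob,25,Paris\n")

reader = csv.reader(csvfile)
header = next(reader)  # consume the first row: ['name', 'age', 'city']

names = []
for row in reader:
    # Every field arrives as a string, so convert numbers explicitly.
    age = int(row[1])
    names.append(row[0])
    print(row[0], age)
```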

csv.writer()

This function takes the file object to write CSV data to. As with the reader function, open the file with newline='' and call it like this:

import csv

file_path = 'file.csv'

# Open file for writing CSV
with open(file_path, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)

    # do something

The "do something" block could be replaced with the use of the following functions:

writer.writerow()

Writes a single row to the CSV file.

# Write header row
writer.writerow(['name', 'age', 'city'])
# Write data row
writer.writerow(['Alice', 30, 'London'])

writer.writerows()

Writes multiple rows at once.

# Prepare multiple rows
rows = [
    ['name', 'age', 'city'],
    ['Bob', 25, 'Paris'],
    ['Carol', 28, 'Berlin']
]
# Write all rows at once
writer.writerows(rows)
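
To see exactly what writerow() and writerows() emit, the sketch below writes to an in-memory buffer instead of a file. The buffer and the 'Carol, Jr.' value are assumptions made for this example; the latter shows that a field containing the delimiter is quoted automatically.

```python
import csv
import io

# In-memory buffer used in place of a file opened with newline=''.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)

# One header row, then several data rows in a single call.
writer.writerow(['name', 'age', 'city'])
writer.writerows([
    ['Bob', 25, 'Paris'],
    ['Carol, Jr.', 28, 'Berlin'],  # contains a comma, so it gets quoted
])

output = buffer.getvalue()
print(repr(output))
```

Non-string values such as the integers above are converted with str() before being written.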

csv.DictReader

Allows you to read CSV files and access each row as a dictionary, using the first row of the file as the keys (column headers) by default.

import csv

# Read CSV as dictionary (first row becomes keys)
with open('people.csv', 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    # Access columns by name instead of index
    for row in reader:
        print(row['name'], row['age'])
  • Each row is a regular dict in Python 3.8 and later (an OrderedDict in Python 3.6 and 3.7).

  • If your CSV does not have headers, you can provide them with the fieldnames parameter. The first row of the file is then read as data instead of as headers:

    reader = csv.DictReader(csvfile, fieldnames=['name', 'age', 'city'])
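
A self-contained sketch of the fieldnames parameter on header-less data. The sample rows are invented, and io.StringIO replaces the open file.

```python
import csv
import io

# Header-less data: the first line is already a record.
csvfile = io.StringIO("Alice,30,London\nBob,25,Paris\n")

reader = csv.DictReader(csvfile, fieldnames=['name', 'age', 'city'])
rows = list(reader)

# The first line was read as data, not consumed as a header.
print(rows[0]['name'], rows[0]['city'])  # Alice London
print(len(rows))                         # 2
```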

csv.DictWriter

Lets you write dictionaries as rows in a CSV file. You must specify the fieldnames (column headers) when creating the writer.

import csv

fieldnames = ['name', 'age', 'city']
rows = [
    {'name': 'Alice', 'age': 30, 'city': 'London'},
    {'name': 'Bob', 'age': 25, 'city': 'Paris'}
]

# Write dictionaries to CSV
with open('people_dict.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()  # writes the header row
    writer.writerows(rows)
  • Use writer.writeheader() to write the column headers as the first row.
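  • Each dictionary passed to writer.writerows() should use keys from the fieldnames specified when creating the writer. By default, a key that is not in fieldnames raises a ValueError (controlled by the extrasaction parameter), and a missing key is written as the restval value, which defaults to an empty string.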
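
The sketch below shows how DictWriter treats dictionaries whose keys do not line up exactly with fieldnames, using the restval and extrasaction parameters. The 'N/A' placeholder and the extra 'email' key are assumptions chosen for illustration.

```python
import csv
import io

fieldnames = ['name', 'age', 'city']

buffer = io.StringIO(newline='')
writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    restval='N/A',          # written for any missing key
    extrasaction='ignore',  # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerow({'name': 'Alice', 'age': 30})  # 'city' is missing
writer.writerow({'name': 'Bob', 'age': 25, 'city': 'Paris',
                 'email': 'bob@example.com'})  # 'email' is extra
output = buffer.getvalue()

# With the default extrasaction='raise', an extra key is an error.
strict = csv.DictWriter(io.StringIO(newline=''), fieldnames=fieldnames)
try:
    strict.writerow({'name': 'Carol', 'email': 'carol@example.com'})
    raised = False
except ValueError:
    raised = True

print(repr(output))
print(raised)  # True
```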

Additional params to csv.reader() and csv.writer()

delimiter

The one-character string used to separate fields. As the name "comma-separated values" suggests, the default is the comma ','. Depending on the locale, Excel may generate CSV files that use the semicolon as the delimiter.

import csv

# Read CSV with semicolon delimiter
with open('data_semicolon.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=';')
    for row in reader:
        print(row)
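
The same parameter works on the writing side. This round-trip sketch writes semicolon-delimited rows to an in-memory buffer and reads them back; the data is hypothetical and uses a decimal comma, a common reason for choosing ';' as the delimiter.

```python
import csv
import io

buffer = io.StringIO(newline='')

# Write with a semicolon delimiter; the comma in '3,50' needs no quoting.
writer = csv.writer(buffer, delimiter=';')
writer.writerow(['name', 'price'])
writer.writerow(['Widget', '3,50'])
written = buffer.getvalue()

# Read the same data back using the matching delimiter.
buffer.seek(0)
rows = list(csv.reader(buffer, delimiter=';'))

print(repr(written))
print(rows)
```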

lineterminator

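The string that csv.writer() writes at the end of each row. The default is "\r\n"; pass "\n" for Unix-style line endings. csv.reader() ignores this parameter and recognizes both "\r" and "\n" as line endings.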
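
A brief sketch comparing the default line ending with lineterminator='\n', written to in-memory buffers so the raw output can be inspected:

```python
import csv
import io

# Default behaviour: each row ends with '\r\n'.
default_buffer = io.StringIO(newline='')
csv.writer(default_buffer).writerow(['a', 'b'])
default_output = default_buffer.getvalue()

# Unix-style line endings via the lineterminator parameter.
unix_buffer = io.StringIO(newline='')
csv.writer(unix_buffer, lineterminator='\n').writerow(['a', 'b'])
unix_output = unix_buffer.getvalue()

print(repr(default_output))  # 'a,b\r\n'
print(repr(unix_output))     # 'a,b\n'
```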

quotechar

Character used to quote fields containing special characters (default is ").

# Use single quote as quote character
reader = csv.reader(csvfile, quotechar="'")
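
A runnable version of the fragment above, assuming input that quotes fields with single quotes. The sample data is invented and io.StringIO stands in for the open file.

```python
import csv
import io

# The second field of the data row contains a comma, wrapped in single quotes.
csvfile = io.StringIO("name,greeting\nAlice,'Hello, world'\n")

reader = csv.reader(csvfile, quotechar="'")
rows = list(reader)

print(rows[1])  # ['Alice', 'Hello, world']
```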

For more details, see the official Python csv module documentation.
