How to Read and Write CSV

A practical guide to the CSV format — structure rules, edge cases with quotes and commas, parsing in JavaScript and Python, and common pitfalls.

2 min read · Updated Apr 11, 2026

CSV structure rules (RFC 4180)

CSV (Comma-Separated Values) is a plain-text format in which each record normally occupies one line and the fields within a record are separated by commas. There is no single definitive standard, but RFC 4180 is the most widely followed description. Key rules: the optional header row has the same format as data rows; each row should have the same number of fields; line endings are CRLF in the spec, but LF is widely accepted.

text
name,age,city
Alice,30,New York
Bob,25,London
"Carol, Jr.",28,"Paris, France"
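
The equal-field-count rule is easy to check mechanically. The following is a minimal sketch using Python's standard csv module; check_field_counts is a hypothetical helper name, and the sample text is the example above with one deliberately short row added.

```python
import csv
import io


def check_field_counts(text):
    """Return (line_number, field_count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(text, newline=''))
    header = next(reader)
    expected = len(header)
    # reader.line_num is the number of physical lines consumed so far
    return [(reader.line_num, len(row)) for row in reader if len(row) != expected]


sample = (
    'name,age,city\r\n'
    'Alice,30,New York\r\n'
    'Bob,25\r\n'
    '"Carol, Jr.",28,"Paris, France"\r\n'
)
problems = check_field_counts(sample)
print(problems)  # [(3, 2)]
```

The quoted commas in the last row do not count as separators, so only the short row on line 3 is reported.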

Quoting rules

A field must be wrapped in double quotes if it contains a comma, a double quote, or a newline. A double quote inside a quoted field is escaped by doubling it.

text
# Field containing a comma — must be quoted
"New York, NY"

# Field containing a double quote — escape by doubling
"He said ""hello"""

# Field containing a newline — must be quoted
"line one
line two"

# Optional: quote a plain field (allowed but unnecessary)
"Alice"
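
These rules rarely need to be applied by hand. As a sketch, Python's csv.writer with its default settings (minimal quoting) quotes only the fields that require it, and csv.reader reverses the process. The sample values are illustrative and mirror the cases above.

```python
import csv
import io

fields = ['New York, NY', 'He said "hello"', 'line one\nline two', 'Alice']

# Write one record; the default QUOTE_MINIMAL quotes only where required
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(fields)
encoded = buffer.getvalue()
print(encoded)
# "New York, NY","He said ""hello""","line one
# line two",Alice

# Reading the text back recovers the original values
decoded = next(csv.reader(io.StringIO(encoded, newline='')))
print(decoded == fields)  # True
```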

Parsing CSV in JavaScript

Never split on commas manually: it breaks on quoted fields. Use a parsing library instead. For production use, PapaParse is a widely used library that runs in both the browser and Node.

javascript
// Using PapaParse (browser or Node)
import fs from 'node:fs';
import Papa from 'papaparse';

// Parse a CSV string
const csvString = 'name,age,city\nAlice,30,New York\n';
const result = Papa.parse(csvString, {
  header: true,         // first row becomes object keys
  skipEmptyLines: true,
  dynamicTyping: true,  // numeric and boolean strings are converted
});
console.log(result.data);
// [ { name: 'Alice', age: 30, city: 'New York' } ]
if (result.errors.length > 0) console.error(result.errors); // malformed rows are reported here

// Parse a file in Node.js
const csv = fs.readFileSync('data.csv', 'utf8');
const { data } = Papa.parse(csv, { header: true, skipEmptyLines: true });

Parsing CSV in Python

Python's standard library includes the csv module, which handles quoting and escaping correctly. Open files with newline='' so that the module can manage line endings itself.

python
import csv

# Reading
with open('data.csv', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)  # header row → dict keys
    for row in reader:
        print(row['name'], row['age'])

# Writing
fields = ['name', 'age', 'city']
rows = [{'name': 'Alice', 'age': 30, 'city': 'New York'}]

with open('out.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
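
To illustrate what handling quoting correctly means in practice, the sketch below compares a naive split with csv.reader on a single in-memory line. The sample line is taken from the structure example earlier in this guide.

```python
import csv
import io

line = '"Carol, Jr.",28,"Paris, France"'

# Naive split breaks quoted fields apart at every comma
naive = line.split(',')
print(naive)   # ['"Carol', ' Jr."', '28', '"Paris', ' France"']

# csv.reader respects the quotes and strips them
parsed = next(csv.reader(io.StringIO(line, newline='')))
print(parsed)  # ['Carol, Jr.', '28', 'Paris, France']
```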

Common pitfalls

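Encoding: always specify UTF-8 explicitly. Excel may assume a legacy system encoding when it opens a CSV file that has no BOM, which garbles non-ASCII characters.

Line endings: Windows tools typically write CRLF; Unix tools write LF. A robust parser accepts both. In Python, open the file with newline='' and let the csv module handle line endings, because newline translation can corrupt newlines embedded in quoted fields.

Trailing commas: some exporters add a trailing comma on every row, producing an extra empty field.

BOM: Excel's "CSV UTF-8" export prepends a UTF-8 BOM (EF BB BF). Read such files with encoding='utf-8-sig' in Python so that the BOM does not end up in the first header name.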
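
The BOM and trailing-comma pitfalls can be reproduced and handled in a few lines. This is a sketch under stated assumptions: the bytes are made-up sample data imitating such an export, and dropping the last field only when every row ends with an empty one is a heuristic, not a rule from any specification.

```python
import csv
import io

# Bytes as an exporter might produce them: UTF-8 BOM plus a trailing comma per row
raw = b'\xef\xbb\xbfname,city,\r\nZo\xc3\xab,Z\xc3\xbcrich,\r\n'

# Decoding as plain UTF-8 leaves the BOM glued to the first header name
plain = next(csv.reader(io.StringIO(raw.decode('utf-8'), newline='')))
print(plain[0] == '\ufeffname')  # True

# 'utf-8-sig' strips the BOM if one is present
rows = list(csv.reader(io.StringIO(raw.decode('utf-8-sig'), newline='')))

# Heuristic: drop the last field only when every row ends with an empty one
if rows and all(row and row[-1] == '' for row in rows):
    rows = [row[:-1] for row in rows]
print(rows)  # [['name', 'city'], ['Zoë', 'Zürich']]
```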

Frequently asked questions

What is the difference between CSV and TSV?

TSV (Tab-Separated Values) uses a tab character as the delimiter instead of a comma. Because tabs rarely appear in data, TSV usually requires less quoting. It is common in bioinformatics and data pipelines but less common in user-facing tools.
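
In Python, the same csv module can read and write TSV by changing the delimiter. The sketch below uses illustrative in-memory data and shows that commas need no quoting once the tab is the separator.

```python
import csv
import io

tsv_text = 'name\tcity\r\nCarol, Jr.\tParis, France\r\n'

# delimiter='\t' switches the reader to tab-separated input
rows = list(csv.reader(io.StringIO(tsv_text, newline=''), delimiter='\t'))
print(rows)  # [['name', 'city'], ['Carol, Jr.', 'Paris, France']]

# Writing: commas no longer trigger quoting because they are not the delimiter
out = io.StringIO(newline='')
csv.writer(out, delimiter='\t').writerows(rows)
print(out.getvalue() == tsv_text)  # True
```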

Can CSV store multiple data types?

CSV has no type system — everything is a string. The consumer decides whether to parse a field as a number, date, or boolean. This is a common source of bugs: Excel will silently convert values like '001' or '1/2' to numbers or dates.
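
One way to keep type decisions explicit is to convert columns by name and leave everything else as a string. In this minimal sketch the converters mapping and the column names are hypothetical; identifiers such as '001' stay intact because no converter is listed for them.

```python
import csv
import io

text = 'id,age,active\r\n001,30,true\r\n'

# Hypothetical per-column converters; columns not listed stay as strings
converters = {'age': int, 'active': lambda value: value.lower() == 'true'}

rows = []
for record in csv.DictReader(io.StringIO(text, newline='')):
    rows.append({key: converters.get(key, str)(value) for key, value in record.items()})

print(rows)  # [{'id': '001', 'age': 30, 'active': True}]
```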
