Reading text files is a fundamental Python skill that every developer needs to master. Python's built-in functions like `open()` and `read()` make it straightforward to work with text data in your programs.
This guide covers essential techniques for handling text files efficiently. We've created practical code examples with Claude, an AI assistant built by Anthropic, to help you master file operations.
## Read a text file with `open()` and `read()`

```python
file = open('example.txt', 'r')
content = file.read()
print(content)
file.close()
```

```
Hello, World!
This is a sample file.
Python file handling is easy.
```
The `open()` function creates a file object that provides a connection to your text file, while the `'r'` parameter specifies read-only access. Opening in read-only mode prevents accidental modification of the file's contents.
Python's `read()` method loads the entire file content into memory as a single string. While this works well for small files, you should consider alternative methods for large files to avoid exhausting memory. The `close()` call properly releases system resources after you finish reading.
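For files too large to load at once, `read()` also accepts an optional size argument, letting you process the file in fixed-size chunks. A minimal sketch (the file name `big_file.txt` is hypothetical; the example creates a sample file so it runs on its own):

```python
# Create a sample file standing in for a real large file
with open('big_file.txt', 'w') as f:
    f.write('x' * 10000)

chunk_count = 0
with open('big_file.txt', 'r') as f:
    while True:
        chunk = f.read(4096)  # Read at most 4096 characters per call
        if not chunk:         # An empty string signals end of file
            break
        chunk_count += 1

print(f"Processed {chunk_count} chunks")  # Processed 3 chunks
```

Each iteration holds at most one chunk in memory, so the peak memory use stays constant no matter how large the file grows.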
Python offers several smarter ways to handle text files beyond basic `read()` operations, giving you more control over memory usage and error handling.
## Read files line by line with a `for` loop

```python
file = open('example.txt', 'r')
for line in file:
    print(line.strip())  # strip() removes newline characters
file.close()
```

```
Hello, World!
This is a sample file.
Python file handling is easy.
```
This approach processes text files one line at a time, making it ideal for handling larger files efficiently.

- The `for` loop automatically iterates through each line of the file, keeping only one line in memory at a time
- The `strip()` method removes both leading and trailing whitespace, including the newline character (`\n`) that typically appears at the end of each line

The line-by-line technique balances simplicity with performance. It provides granular control over file processing while keeping your code clean and maintainable.
## Use `readlines()` to get a list of lines

```python
file = open('example.txt', 'r')
lines = file.readlines()
print(lines)
file.close()
```

```
['Hello, World!\n', 'This is a sample file.\n', 'Python file handling is easy.']
```
The `readlines()` method loads all lines from a text file into a Python list. Each line becomes a separate string element, preserving the newline character (`\n`) at the end of each line except the last one.

- `readlines()` stores the entire file content in memory at once

The resulting list structure makes it easy to process lines using Python's built-in list operations. You can slice, sort, or filter lines without additional file operations.
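Those list operations are easy to demonstrate. A short sketch, with a sample file (`example_lines.txt`, a hypothetical name) created inline so the code is self-contained:

```python
# Create a small sample file for demonstration
with open('example_lines.txt', 'w') as f:
    f.write('banana\napple\ncherry\n')

with open('example_lines.txt', 'r') as f:
    lines = [line.strip() for line in f.readlines()]

print(lines[:2])                                 # Slice: ['banana', 'apple']
print(sorted(lines))                             # Sort: ['apple', 'banana', 'cherry']
print([l for l in lines if l.startswith('a')])   # Filter: ['apple']
```

Stripping the newlines first keeps the later comparisons and sorting free of trailing `\n` surprises.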
## Use the `with` statement for safer file handling

```python
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)
# File is automatically closed when leaving the with block
```

```
Hello, World!
This is a sample file.
Python file handling is easy.
```
The `with` statement provides a cleaner, more reliable way to handle file operations in Python. It automatically manages system resources by closing the file when you're done, even if errors occur during execution.

- The `as` keyword binds the open file to a variable (`file`) intended for use within the indented block
- No explicit `close()` calls are needed, reducing the chance of resource leaks

Modern Python developers prefer the `with` statement because it combines safety with simplicity. The syntax clearly shows where file operations begin and end, making code more maintainable and less prone to bugs.
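A single `with` statement can also manage more than one file; every file opened in its header is closed automatically when the block ends. A minimal sketch with hypothetical file names:

```python
# Create two small input files for the demonstration
with open('first.txt', 'w') as f:
    f.write('Hello from first\n')
with open('second.txt', 'w') as f:
    f.write('Hello from second\n')

# Open both files in one with statement; both close automatically
with open('first.txt', 'r') as src, open('second.txt', 'r') as extra:
    combined = src.read() + extra.read()

print(combined.count('Hello'))       # 2
print(src.closed and extra.closed)   # True: both files are closed
```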
Python's file handling capabilities extend far beyond basic reading operations, with powerful tools like `seek()`, `tell()`, and `pathlib` that give you precise control over file processing.
## Navigate files with `seek()` and `tell()`

```python
with open('example.txt', 'r') as file:
    file.seek(7)             # Move to byte offset 7 in the file
    partial = file.read(5)   # Read 5 characters
    position = file.tell()   # Get current position
    print(f"Read '{partial}' and now at position {position}")
```

```
Read 'World' and now at position 12
```
The `seek()` and `tell()` methods give you precise control over file navigation. `seek()` moves the file pointer to a specific byte position, while `tell()` reports the current position in the file.

- `seek(7)` positions the pointer at byte offset 7, skipping "Hello, " so reading starts at "World"
- `read(5)` retrieves exactly 5 characters from the current position
- `tell()` confirms the new position at byte 12, which accounts for the initial seek plus the five characters read

This granular control proves invaluable when you need to extract specific portions of text files or implement features like resumable downloads.
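`seek()` also takes an optional `whence` argument: with `whence=2` the offset is measured from the end of the file, which is useful for tail-style reads. Note that text-mode files only allow `seek(0, 2)` relative to the end, so this sketch opens the file in binary mode (`'rb'`) and decodes the bytes afterward:

```python
# Create a small sample file for demonstration
with open('tail_demo.txt', 'w') as f:
    f.write('Hello, World!')

with open('tail_demo.txt', 'rb') as f:   # Binary mode allows seeking from the end
    f.seek(-6, 2)                        # Position 6 bytes before the end
    tail = f.read().decode('utf-8')      # Decode the raw bytes back to text

print(tail)  # World!
```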
## Read files with a specific encoding

```python
with open('unicode_example.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(f"File contains {len(content)} characters")
    print(content[:20])  # First 20 characters
```

```
File contains 45 characters
こんにちは, 世界! Hello
```
Python's `encoding` parameter enables you to work with text files containing characters from different languages and writing systems. The `utf-8` encoding handles most international text reliably, making it the standard choice for modern applications.

- The `encoding` parameter tells Python how to interpret the bytes in your text file
- The `len()` function counts characters accurately regardless of how many bytes each character occupies in UTF-8

String slicing with `content[:20]` works seamlessly with encoded text. Python treats each character as a single unit, whether it's an English letter, Japanese character, or emoji.
## Use `pathlib` for modern file operations

```python
from pathlib import Path

file_path = Path('example.txt')
text = file_path.read_text(encoding='utf-8')
print(f"File exists: {file_path.exists()}")
print(text[:15])  # First 15 characters
```

```
File exists: True
Hello, World!
T
```
The `pathlib` module modernizes file handling in Python by treating file paths as objects instead of plain strings. This approach provides cleaner syntax and more intuitive operations for working with files.

- The `Path` class creates a path object that represents your file location, making it easy to check file existence with `exists()`
- The `read_text()` method simplifies file reading by combining opening, reading, and closing into a single call
- The `encoding` parameter ensures proper handling of special characters and international text

This object-oriented approach reduces common file handling errors and makes your code more maintainable. The `pathlib` module integrates seamlessly with other Python features like string formatting and slicing operations.
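`pathlib` also scales naturally to many files: `Path.glob()` finds files matching a pattern, and each match supports `read_text()` directly. A sketch using a hypothetical `notes` directory created inline:

```python
from pathlib import Path

# Set up a sample directory with two text files
notes = Path('notes')
notes.mkdir(exist_ok=True)
(notes / 'a.txt').write_text('first note\n', encoding='utf-8')
(notes / 'b.txt').write_text('second note\n', encoding='utf-8')

# Read every .txt file found by glob() into a dict keyed by file name
contents = {p.name: p.read_text(encoding='utf-8')
            for p in sorted(notes.glob('*.txt'))}

print(sorted(contents))  # ['a.txt', 'b.txt']
```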
Python's `csv` module transforms raw spreadsheet data into actionable insights by efficiently parsing comma-separated values and enabling rapid calculations across large datasets.
```python
import csv

with open('sales_data.csv', 'r') as file:
    csv_reader = csv.reader(file)
    headers = next(csv_reader)
    total_sales = 0
    for row in csv_reader:
        total_sales += float(row[2])
    print(f"Total sales: ${total_sales:.2f}")
```
This code efficiently processes a CSV file containing sales records. The `csv.reader()` function creates an iterator that reads each row as a list, making it easy to handle structured data. The `next()` call skips the first row containing column headers.

- The `float()` conversion transforms the string value into a number for calculations

The `with` statement ensures proper file handling by automatically closing the file after processing. This pattern works well for both small and large datasets since it processes one row at a time.
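A common variant is `csv.DictReader`, which uses the header row to map each record to a dictionary, so columns are referenced by name instead of index. A sketch with a small inline dataset (the column names are illustrative):

```python
import csv

# Create a small sample CSV (illustrative column names)
with open('sales_data.csv', 'w', newline='') as f:
    f.write('date,region,amount\n'
            '2024-01-01,East,100.50\n'
            '2024-01-02,West,200.25\n')

total = 0.0
with open('sales_data.csv', 'r', newline='') as f:
    for row in csv.DictReader(f):      # Each row is a dict keyed by header
        total += float(row['amount'])  # Refer to the column by name

print(f"Total sales: ${total:.2f}")  # Total sales: $300.75
```

Referencing `row['amount']` rather than `row[2]` keeps the code readable and resilient if the column order changes.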
## Use `re` for error monitoring

Python's `re` module combines with file handling to extract critical error patterns from log files, enabling developers to track and analyze application issues systematically.
```python
import re
from collections import Counter

error_pattern = r"ERROR: (.*)"
errors = []

with open('application.log', 'r') as log_file:
    for line in log_file:
        match = re.search(error_pattern, line)
        if match:
            errors.append(match.group(1))

error_counts = Counter(errors)
print(f"Found {len(errors)} errors. Most common:")
for error, count in error_counts.most_common(3):
    print(f"{count} occurrences: {error}")
```
This code efficiently scans a log file to identify and count error messages. The `re.search()` function looks for lines matching the pattern `ERROR:` followed by any text. Each error message gets stored in a list for analysis.

- The `Counter` class transforms the error list into a frequency table
- The `most_common(3)` method reveals the top three recurring errors

The script outputs a summary showing the total error count and details about the most frequent issues. This approach helps developers quickly identify problematic patterns in their application logs.
Python's file operations can raise several common errors that you need to handle carefully to keep your code robust.
## Handle `FileNotFoundError` gracefully

A `FileNotFoundError` occurs when Python can't locate a file you're trying to access. The basic file reading code below demonstrates a common mistake: it assumes the target file exists without implementing proper error checks.
```python
def read_config(filename):
    file = open(filename, 'r')
    content = file.read()
    file.close()
    return content

# Will crash if config.txt doesn't exist
config = read_config('config.txt')
print("Configuration loaded")
```
The code fails because it directly attempts to open and read the file without checking its existence first. This creates an unhandled exception that crashes the program. The following code demonstrates a more resilient approach.
```python
def read_config(filename):
    try:
        with open(filename, 'r') as file:
            return file.read()
    except FileNotFoundError:
        print(f"Config file {filename} not found, using defaults")
        return "default_setting=True"

config = read_config('config.txt')
print("Configuration loaded")
```
The improved code wraps file operations in a `try-except` block to handle missing files gracefully. Instead of crashing, it provides a default configuration when the file isn't found. The `with` statement ensures proper file closure regardless of success or failure.
## Fix `UnicodeDecodeError` with proper encoding

A `UnicodeDecodeError` appears when Python can't properly interpret special characters in text files. This common issue occurs when reading files containing non-ASCII characters like emojis or international text without specifying the correct encoding.
```python
# Trying to read a UTF-8 file with default encoding
with open('international_text.txt', 'r') as file:
    content = file.read()  # May raise UnicodeDecodeError
    print(content)
```
The code assumes all text files use your system's default character encoding. When the file contains special characters like emojis or international text, Python can't decode them properly. The solution appears in the code below.
```python
# Specifying the correct encoding
with open('international_text.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)
```
The `encoding='utf-8'` parameter tells Python to interpret the text using UTF-8, the standard encoding that supports international characters, emojis, and special symbols. This simple addition prevents decoding errors when your files contain non-ASCII text.
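When you can't guarantee a file's encoding, `open()` also accepts an `errors` parameter that controls what happens to bytes that won't decode. With `errors='replace'`, undecodable bytes become the Unicode replacement character instead of raising an exception. A sketch with a deliberately mis-encoded file:

```python
# Write bytes that are valid Latin-1 but not valid UTF-8
with open('legacy.txt', 'wb') as f:
    f.write(b'caf\xe9')  # 'café' encoded as Latin-1

# Strict UTF-8 reading would raise UnicodeDecodeError here;
# errors='replace' keeps going and inserts U+FFFD instead
with open('legacy.txt', 'r', encoding='utf-8', errors='replace') as f:
    content = f.read()

print(content)  # caf� (the bad byte becomes the replacement character)
```

Use this as a last resort for salvaging data; when possible, identify the real encoding (here, `encoding='latin-1'`) and decode losslessly.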
## Manage the file pointer when reading multiple times

Reading a file multiple times requires careful attention to the file pointer's position. When you call methods like `read()` or `readline()`, Python tracks your location in the file. The following code demonstrates a common mistake developers make when attempting sequential reads.
```python
with open('example.txt', 'r') as file:
    first_line = file.readline()
    print(f"First line: {first_line.strip()}")

    # Trying to read the whole file again
    all_content = file.read()
    print(f"All content has {len(all_content)} characters")  # Fewer than expected
```
The file pointer sits at the end of the first line after the `readline()` call. Any subsequent read starts from this position instead of from the beginning. The code below demonstrates the proper way to handle multiple reads.
```python
with open('example.txt', 'r') as file:
    first_line = file.readline()
    print(f"First line: {first_line.strip()}")

    # Reset the file position to the beginning
    file.seek(0)
    all_content = file.read()
    print(f"All content has {len(all_content)} characters")
```
The `seek(0)` call resets the file pointer to the beginning, enabling you to read the file's content multiple times within the same open session. Without this reset, subsequent reads would start from wherever the pointer last stopped, potentially missing content.

- The file pointer advances with every `read()` or `readline()` call
- Use `seek()` strategically when you need to process the same content in different ways

This pattern proves especially useful when validating file content before processing or when implementing features like progress tracking in file operations.
## What is the difference between `'r'` and `'rt'` modes?

The `'r'` mode opens a file for reading in text mode, automatically handling line endings based on your operating system. The `'rt'` mode does exactly the same thing: the `t` is redundant since text mode is the default.
Both modes convert platform-specific line endings (`\r\n` on Windows, `\n` on Unix) to `\n` when reading. This automatic conversion ensures your code works consistently across different operating systems without manual line-ending management.
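The translation is easy to observe by writing Windows-style line endings and reading them back in text mode versus binary mode (`'rb'`), which leaves the bytes untouched. A small sketch:

```python
# Write raw bytes with a Windows-style \r\n line ending
with open('endings.txt', 'wb') as f:
    f.write(b'line one\r\nline two\n')

with open('endings.txt', 'r') as f:    # Text mode: \r\n becomes \n
    text = f.read()

with open('endings.txt', 'rb') as f:   # Binary mode: bytes left as-is
    raw = f.read()

print('\r' in text)   # False: translated away
print(b'\r' in raw)   # True: preserved
```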
## How do I fix encoding issues when reading files?

File encoding issues stem from how computers store text using different character sets. Start by detecting the file's encoding using a tool like `chardet`, then explicitly specify the encoding when opening the file with `open()`'s `encoding` parameter.

When errors occur, try common encodings like UTF-8, ASCII, or your system's default. Detection tools help identify the correct encoding by analyzing byte patterns in the file.
## What happens if I try to read a file that doesn't exist?

When you attempt to read a nonexistent file, Python raises a `FileNotFoundError` exception. This error acts as a safeguard, preventing your code from proceeding with invalid file operations that could cause problems downstream.

Operating systems use this error-handling approach because they need to verify a file's existence before allocating system resources for reading. This verification happens during the initial file open operation, before any actual reading begins.
## Do I need to close files manually when using `with`?

No, you don't need to manually close files when using Python's `with` statement. The `with` statement automatically handles both opening and closing through a context manager. When the code block completes, whether normally or due to an error, Python's context manager ensures proper cleanup.

This automatic resource management prevents common issues like resource leaks or locked files. The context manager implements the `__enter__` and `__exit__` methods behind the scenes, making file handling more reliable than manual `close()` calls.
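The same protocol works for your own classes: any object that implements `__enter__` and `__exit__` can be used in a `with` statement. A minimal illustrative sketch (the `ManagedResource` class is hypothetical, written only to mirror what file objects do):

```python
class ManagedResource:
    """Illustrative context manager mirroring file-object cleanup."""
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self               # The value bound by 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True        # Cleanup runs even if an error occurred
        return False              # Don't suppress exceptions

with ManagedResource() as res:
    print(res.closed)  # False inside the block

print(res.closed)  # True: __exit__ ran on the way out
```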
## What's the difference between `read()`, `readline()`, and `readlines()`?

The `read()` method loads an entire file into memory as a single string. This works well for small files but can overwhelm system resources with large datasets. `readline()` reads one line at a time, making it memory-efficient for processing large files line by line. `readlines()` returns all lines in a list format, which provides convenient iteration while still loading the complete file into memory.

- Use `read()` for small text files you need as one string
- Use `readline()` for memory-efficient processing of large files
- Use `readlines()` when you need the whole file as a list of lines
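The three methods are easy to compare side by side on the same small file. A sketch (the sample file is created inline):

```python
# Create a three-line sample file
with open('compare.txt', 'w') as f:
    f.write('one\ntwo\nthree\n')

with open('compare.txt', 'r') as f:
    whole = f.read()         # One string with embedded newlines

with open('compare.txt', 'r') as f:
    first = f.readline()     # Just the first line, newline included

with open('compare.txt', 'r') as f:
    as_list = f.readlines()  # List of lines, newlines included

print(len(whole))    # 14 characters in total
print(repr(first))   # 'one\n'
print(len(as_list))  # 3 lines
```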