Skip to main content

Overview

PhpSpreadsheet provides multiple ways to read spreadsheet files from various formats. You can use the IOFactory class for automatic file type detection or create specific reader instances for known file types.

Using IOFactory::load()

The simplest way to load a file is using the static load() method, which automatically detects the file type:
use PhpOffice\PhpSpreadsheet\IOFactory;

$spreadsheet = IOFactory::load('path/to/file.xlsx');
This method:
  • Attempts to identify the file type based on extension
  • Tests the file with appropriate readers
  • Loads the file and returns a Spreadsheet object

Loading with Flags

You can pass flags to control how the file is loaded:
use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Reader\IReader;

$spreadsheet = IOFactory::load(
    'file.xlsx',
    IReader::READ_DATA_ONLY | IReader::IGNORE_EMPTY_CELLS
);
Available flags:
  • IReader::LOAD_WITH_CHARTS - Load charts from the file
  • IReader::READ_DATA_ONLY - Read only data, not formatting
  • IReader::IGNORE_EMPTY_CELLS - Skip empty cells
  • IReader::IGNORE_ROWS_WITH_NO_CELLS - Skip rows with no cells
  • IReader::ALLOW_EXTERNAL_IMAGES - Allow loading external images
  • IReader::DONT_ALLOW_EXTERNAL_IMAGES - Block external images

Limiting Reader Types

If you know the file format but want to test only specific readers:
$testAgainstFormats = [
    IOFactory::READER_XLSX,
    IOFactory::READER_XLS,
];

$spreadsheet = IOFactory::load('file.xls', 0, $testAgainstFormats);

Creating Reader Instances

Using IOFactory::createReader()

Create a reader for a specific format:
use PhpOffice\PhpSpreadsheet\IOFactory;

// Create reader by type
$reader = IOFactory::createReader('Xlsx');
$spreadsheet = $reader->load('file.xlsx');
Available reader types:
  • Xlsx - Excel 2007+ (.xlsx)
  • Xls - Excel 97-2003 (.xls)
  • Xml - Excel 2003 XML (.xml)
  • Ods - OpenDocument Spreadsheet (.ods)
  • Slk - Symbolic Link (.slk)
  • Gnumeric - Gnumeric (.gnumeric)
  • Html - HTML (.html, .htm)
  • Csv - CSV files (.csv)

Direct Instantiation

You can also instantiate readers directly:
use PhpOffice\PhpSpreadsheet\Reader\Xlsx;

$reader = new Xlsx();
$spreadsheet = $reader->load('file.xlsx');

Identifying File Types

Use IOFactory::identify() to detect the file type before loading:
use PhpOffice\PhpSpreadsheet\IOFactory;

// Get file type as string (e.g., "Xlsx")
$fileType = IOFactory::identify('file.xlsx');

// Get full class name (since PhpSpreadsheet 4.1.0)
$className = IOFactory::identify('file.xlsx', null, true);

// Create reader based on identified type
$reader = IOFactory::createReader($fileType);
$spreadsheet = $reader->load('file.xlsx');

Using createReaderForFile()

This method combines file type detection and reader creation:
$reader = IOFactory::createReaderForFile('file.xlsx');
$spreadsheet = $reader->load('file.xlsx');

Reader Options

Read Data Only

Ignore formatting and structure, read only cell values:
$reader = IOFactory::createReader('Xlsx');
$reader->setReadDataOnly(true);
$spreadsheet = $reader->load('file.xlsx');
When reading data only, date formatting is not loaded, so dates appear as numeric values. The Gnumeric reader is an exception and can still identify dates.

Ignore Empty Cells

Skip loading cells with null or empty values to save memory:
$reader->setReadEmptyCells(false);

Load Specific Sheets

Load only named worksheets:
// Load single sheet
$reader->setLoadSheetsOnly('Sheet1');

// Load multiple sheets
$reader->setLoadSheetsOnly(['Sheet1', 'Sheet3']);

// Reset to load all sheets
$reader->setLoadAllSheets();

Load Charts

Include charts when loading:
$reader->setIncludeCharts(true);
$spreadsheet = $reader->load('file.xlsx');

// Or using flags
$spreadsheet = $reader->load('file.xlsx', IReader::LOAD_WITH_CHARTS);

Using Read Filters

Read filters allow you to load only specific cells or ranges, which is useful for large files:
use PhpOffice\PhpSpreadsheet\Reader\IReadFilter;

class MyReadFilter implements IReadFilter
{
    public function readCell($columnAddress, $row, $worksheetName = ''): bool
    {
        // Read only rows 1-10 and columns A-E
        if ($row >= 1 && $row <= 10) {
            if (in_array($columnAddress, range('A', 'E'))) {
                return true;
            }
        }
        return false;
    }
}

$reader = IOFactory::createReader('Xlsx');
$reader->setReadFilter(new MyReadFilter());
$spreadsheet = $reader->load('large-file.xlsx');

Chunking Large Files

Process large files in chunks to reduce memory usage:
class ChunkReadFilter implements IReadFilter
{
    private int $startRow = 0;
    private int $endRow = 0;

    public function setRows(int $startRow, int $chunkSize): void
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    public function readCell($columnAddress, $row, $worksheetName = ''): bool
    {
        if ($row == 1 || ($row >= $this->startRow && $row < $this->endRow)) {
            return true;
        }
        return false;
    }
}

$reader = IOFactory::createReader('Xlsx');
$chunkSize = 1000;
$chunkFilter = new ChunkReadFilter();
$reader->setReadFilter($chunkFilter);

for ($startRow = 2; $startRow <= 100000; $startRow += $chunkSize) {
    $chunkFilter->setRows($startRow, $chunkSize);
    $spreadsheet = $reader->load('huge-file.xlsx');
    // Process chunk...
}
Read filters don’t renumber cells. If you filter rows 100-200, they remain numbered as 100-200, not 1-101. Use rangeToArray() instead of toArray() when working with filtered data.

Security Considerations

XML External Entity (XXE) Protection

PhpSpreadsheet automatically protects against XXE attacks in XML-based formats (Xlsx, Xml, Ods, Gnumeric). If XML entities are detected in the DOCTYPE, an exception is thrown.

External Images

By default, external images are not loaded. Enable with caution:
$reader->setAllowExternalImages(true);

// Or using flags
$spreadsheet = $reader->load('file.xlsx', IReader::ALLOW_EXTERNAL_IMAGES);

Error Handling

Always wrap file operations in try-catch blocks:
use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Reader\Exception;

try {
    $spreadsheet = IOFactory::load('file.xlsx');
} catch (Exception $e) {
    die('Error loading file: ' . $e->getMessage());
}

Helper Methods

List Worksheet Names

Get worksheet names without loading the entire file:
$reader = IOFactory::createReader('Xlsx');
$worksheetNames = $reader->listWorksheetNames('file.xlsx');

foreach ($worksheetNames as $name) {
    echo $name . "\n";
}

List Worksheet Info

Get worksheet dimensions and metadata:
$worksheetData = $reader->listWorksheetInfo('file.xlsx');

foreach ($worksheetData as $worksheet) {
    echo "Name: {$worksheet['worksheetName']}\n";
    echo "Rows: {$worksheet['totalRows']}\n";
    echo "Columns: {$worksheet['totalColumns']}\n";
    echo "Range: A1:{$worksheet['lastColumnLetter']}{$worksheet['totalRows']}\n";
}

Best Practices

Performance Tips

  • Use setReadDataOnly(true) when you don’t need formatting
  • Use setReadEmptyCells(false) to reduce memory usage
  • Implement read filters for large files
  • Process large files in chunks
  • Use explicit reader types instead of auto-detection for better performance

Memory Management

  • Close spreadsheet objects when done: $spreadsheet->disconnectWorksheets()
  • Process files in chunks for very large datasets
  • Consider using read filters to load only required data

Build docs developers (and LLMs) love