The arrow package for R provides access to Apache Arrow’s C++ library, enabling fast data processing, dplyr integration, and support for various data formats including Parquet and Arrow IPC.

Quick Install

Install the latest release from CRAN:
install.packages("arrow")
This works on Windows, macOS, and most Linux distributions without requiring additional system dependencies.

Installation Methods

Installing from CRAN is the simplest method for most users:
1. Install from CRAN

install.packages("arrow")
This installs pre-compiled binaries on Windows and macOS, and attempts to install from source with automatic dependency resolution on Linux.
2. Load the package

library(arrow)
3. Check installation

arrow_info()
This displays information about your Arrow installation, including version and available features.
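In scripts that may run more than once, it can help to skip the install step when the package is already present. The following is a minimal sketch in base R; `install_if_missing` and its `installer` argument are illustrative names, not part of arrow.

```r
# Install a package only when it is not already available.
# `install_if_missing` is a hypothetical helper, not an arrow function.
# `installer` defaults to install.packages() and can be swapped for testing.
install_if_missing <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed; nothing to do
  }
  installer(pkg)
  invisible(TRUE)  # an installation was attempted
}
```

Calling `install_if_missing("arrow")` would then install from the configured repositories only on the first run.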

From R-universe

R-universe provides pre-compiled binaries for the most common operating systems:
install.packages(
  "arrow", 
  repos = c("https://apache.r-universe.dev", "https://cloud.r-project.org")
)
Linux users: Consult the R-universe documentation for guidance on binary availability and repository configuration for your distribution.

From conda-forge

If you’re using conda for environment management:
conda install -c conda-forge --strict-channel-priority r-arrow
The --strict-channel-priority flag ensures all dependencies come from conda-forge for consistency.

Platform-Specific Considerations

macOS

Architecture Compatibility: On macOS, the R version must match your processor architecture:
  • Apple Silicon (M1/M2/M3): Use R compiled for ARM64
  • Intel processors: Use R compiled for x86_64
Using Intel-compiled R on Apple Silicon Macs (even via Rosetta) will result in segfaults and crashes.
1. Check your R architecture

R.version$arch
The output should be aarch64 for Apple Silicon or x86_64 for Intel.
2. Install arrow

install.packages("arrow")
3. Verify installation

library(arrow)
arrow_info()
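The architecture check can be wrapped in a small helper that turns the raw string into a readable message. This is a sketch in base R only; `describe_r_arch` is an illustrative name, and the labels simply restate the aarch64 and x86_64 values described above.

```r
# Map the architecture string reported by R to a human-readable label.
# `describe_r_arch` is a hypothetical helper, not an arrow function.
describe_r_arch <- function(arch = R.version$arch) {
  switch(
    arch,
    aarch64 = "ARM64 build of R (expected on Apple Silicon)",
    x86_64  = "x86_64 build of R (expected on Intel Macs)",
    paste0("other architecture: ", arch)  # fallback for anything else
  )
}

# Report the architecture of the running R session
message(describe_r_arch())
```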

Linux

On Linux, CRAN does not provide pre-compiled binaries, so the package is built from source. The installation process attempts to automatically handle dependencies.
For detailed Linux installation guidance, including distribution-specific instructions, see the official installation guide.

Ubuntu/Debian

For a smoother installation experience, you can pre-install system dependencies:
# Install system dependencies
sudo apt update
sudo apt install -y \
  libcurl4-openssl-dev \
  libssl-dev \
  libxml2-dev

# Then install arrow in R
install.packages("arrow")

RHEL/CentOS/Rocky Linux

# Install system dependencies
sudo yum install -y \
  libcurl-devel \
  openssl-devel \
  libxml2-devel

# Then install arrow in R
install.packages("arrow")

Windows

Windows users can simply install from CRAN:
install.packages("arrow")
Compiler Requirement: As of arrow 23.0.0, building from source requires C++20 support. On Windows, this means R 4.3 or later is required. R 4.2 has incomplete C++20 support and may work with special configuration.

Compiler Requirements

As of version 23.0.0, arrow requires C++20 to build from source. This has important implications:
  • Windows: R 4.3+ required (R 4.2 has limited support)
  • Linux: Modern GCC (11+) or Clang (12+) required
  • macOS: Xcode 13+ or Command Line Tools with equivalent version
Pre-built binaries bypass these requirements.
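Before attempting a source build, the R version requirement can be checked programmatically. The sketch below uses only base R; `meets_minimum_r` is an illustrative helper, and the 4.3.0 default mirrors the Windows requirement stated above. It compares version objects rather than strings, so that a version such as 4.10.0 is ordered correctly.

```r
# Check whether an R version meets a minimum version.
# `meets_minimum_r` is a hypothetical helper, not an arrow function.
meets_minimum_r <- function(version = getRversion(), minimum = "4.3.0") {
  # Compare as version objects, component by component, not as text
  as.package_version(version) >= as.package_version(minimum)
}

if (!meets_minimum_r()) {
  message("R ", getRversion(), " is older than 4.3; consider pre-built binaries.")
}
```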

Verifying Your Installation

After installation, verify that arrow is working correctly:
1. Load the library

library(arrow)
2. Check Arrow info

arrow_info()
This displays:
  • Arrow version
  • Available features (Parquet, Dataset, S3, etc.)
  • Build configuration
3. Test basic functionality

# Create a simple table
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 35)
)

# Convert to Arrow table
tbl <- arrow_table(df)
print(tbl)

# Write to Parquet
write_parquet(df, "test.parquet")

# Read back
df2 <- read_parquet("test.parquet")
print(df2)
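The same check can be made self-verifying and free of leftover files. The sketch below writes to a temporary file, reads it back, and compares the result with the input; `parquet_round_trip_ok` is an illustrative helper, and it assumes that a plain data frame of basic column types survives the round trip unchanged.

```r
# Write a data frame to a temporary Parquet file, read it back, and report
# whether the round trip preserved it. `parquet_round_trip_ok` is a
# hypothetical helper built on arrow's write_parquet() and read_parquet().
parquet_round_trip_ok <- function(df) {
  path <- tempfile(fileext = ".parquet")
  on.exit(unlink(path), add = TRUE)  # remove the temporary file afterwards
  arrow::write_parquet(df, path)
  restored <- as.data.frame(arrow::read_parquet(path))
  isTRUE(all.equal(df, restored))
}

# Only run the check when arrow is actually installed
if (requireNamespace("arrow", quietly = TRUE)) {
  people <- data.frame(name = c("Alice", "Bob"), age = c(25, 30))
  print(parquet_round_trip_ok(people))
}
```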

Common Use Cases

Reading and Writing Parquet Files

library(arrow)
library(dplyr)

# Write data to Parquet
mtcars %>%
  write_parquet("mtcars.parquet")

# Read Parquet file
df <- read_parquet("mtcars.parquet")

# Read only selected columns (column selection is handled by Arrow)
df_selected <- read_parquet(
  "mtcars.parquet",
  col_select = c(mpg, cyl, hp),
  as_data_frame = TRUE
)

Working with Large Datasets

library(arrow)
library(dplyr)

# Open a dataset (doesn't load into memory)
ds <- open_dataset("path/to/parquet/files/")

# Query using dplyr (executed in Arrow)
result <- ds %>%
  filter(year == 2023) %>%
  select(name, value) %>%
  group_by(name) %>%
  summarise(total = sum(value)) %>%
  collect()  # Only collect() loads data into R
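The query above uses a placeholder path and column names. For a version that runs as-is, the following sketch first writes the built-in mtcars data as a partitioned Parquet dataset in a temporary directory and then queries it. The directory and the choice of `cyl` as the partition column are assumptions made for illustration.

```r
library(arrow)
library(dplyr)

# Write mtcars as a Parquet dataset, one subdirectory per value of cyl
dataset_dir <- tempfile("mtcars_ds_")
write_dataset(mtcars, dataset_dir, partitioning = "cyl")

# Open the dataset lazily; no rows are read at this point
ds <- open_dataset(dataset_dir)

# Filter on the partition column, aggregate in Arrow, then collect into R
summary_tbl <- ds %>%
  filter(cyl == 4) %>%
  group_by(gear) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  collect()

print(summary_tbl)
```

Because the filter is on the partition column, Arrow can skip the files for the other cylinder counts rather than scanning them.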

Using dplyr with Arrow

library(arrow)
library(dplyr)

# Create Arrow table
tbl <- arrow_table(mtcars)

# Use dplyr verbs (computed in Arrow)
result <- tbl %>%
  filter(mpg > 20) %>%
  select(mpg, cyl, hp) %>%
  mutate(kpl = mpg * 0.425144) %>%
  arrange(desc(mpg)) %>%
  collect()  # Convert back to data.frame

Reading from CSV

library(arrow)

# Read CSV with Arrow (fast!)
df <- read_csv_arrow("large_file.csv")

# Or open as dataset for larger-than-memory files
ds <- open_dataset(
  "large_file.csv",
  format = "csv"
)

Installing Nightly Builds

For the latest development features:
install.packages(
  "arrow",
  repos = c("https://nightlies.apache.org/arrow/r", "https://cloud.r-project.org")
)
Nightly builds are development versions and may contain bugs. Use stable releases for production environments.
For more information, see the installing nightly builds guide.

Troubleshooting

If installation fails on Linux, ensure system dependencies are installed:
# Ubuntu/Debian
sudo apt install -y libcurl4-openssl-dev libssl-dev libxml2-dev

# RHEL/CentOS
sudo yum install -y libcurl-devel openssl-devel libxml2-devel
Then retry the installation in R. See the installation details for more help.
Segfaults or crashes on macOS usually indicate an architecture mismatch:
  1. Check your R architecture: R.version$arch
  2. On M1/M2/M3 Macs, it should be aarch64
  3. If it shows x86_64, download the ARM64 build of R from CRAN
  4. Reinstall arrow after switching R versions
If a feature you expect is unavailable, check which features your build includes:
arrow_info()
If features are missing:
  • Try installing from R-universe or conda-forge instead of CRAN
  • CRAN builds include S3 but not GCS support
  • See the cloud storage article for details
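To check for specific features rather than reading the full printout, the capabilities can be inspected programmatically. This sketch assumes that `arrow_info()` returns a list whose `capabilities` element is a named logical vector with names such as "s3" and "gcs", as in recent releases. `missing_features` is an illustrative helper, not part of arrow.

```r
# Return the wanted features that are not enabled in a capabilities vector.
# `missing_features` is a hypothetical helper; `capabilities` is assumed to
# be a named logical vector such as arrow_info()$capabilities.
missing_features <- function(capabilities, wanted) {
  enabled <- names(capabilities)[as.logical(capabilities)]
  setdiff(wanted, enabled)
}

# Only query arrow when it is actually installed
if (requireNamespace("arrow", quietly = TRUE)) {
  caps <- arrow::arrow_info()$capabilities
  print(missing_features(caps, c("parquet", "dataset", "s3", "gcs")))
}
```

An empty result means every requested feature is enabled in the installed build.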
If a source build on Windows fails for lack of C++20 support, you need R 4.3 or later:
  1. Update R to version 4.3+
  2. Or install pre-built binaries from R-universe
  3. See installation details

Additional Resources

  • Arrow for R Documentation: Complete R package documentation
  • R Cookbook: Practical recipes for common tasks
  • Scaling Up with R and Arrow: Free online book
  • Cheatsheet: Quick reference guide

