Native s3 Filesystem Blog by Samrat002 · Pull Request #860 · apache/flink-web

Samrat002 · 2026-06-11T18:58:49Z

No description provided.

gaborgsomogyi · 2026-06-13T12:14:29Z

+slug: "announcing-native-s3-fs"
+url: "/2026/06/14/announcing-native-s3-fs/"
+authors:
+- gabor:


Are there GitHub names?

gaborgsomogyi · 2026-06-13T12:16:39Z

cc @davidradl if you have some time, since you're native 🙂

davidradl · 2026-06-15T08:18:37Z

+
+Apache Flink relies on the underlying filesystem for much of its work: reading and writing application data, materializing streaming sinks, and storing checkpoints and savepoints for recovery. For years, S3 support in Flink meant choosing between two Hadoop-based plugins, each with its own trade-offs and configuration quirks. With Flink 2.3, there is a better option.
+
+Today we're introducing `flink-s3-fs-native`, A ground-up, Hadoop-free S3 filesystem built specifically for Flink. It ships as an experimental opt-in plugin in Flink 2.3, is already running in production at scale at major technology companies, and delivers measurable, reproducible performance gains.


I wonder if "opt-in plugin" could be turned into a hyperlink to a section that describes it.

davidradl · 2026-06-15T08:21:21Z

+
+| | |
+|---|---|
+| **~2x faster checkpoints** | 48.8 s average vs 90.1 s with the Presto plugin; up to 4.5x at small state sizes |


Maybe point to the test section with a hyperlink.
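
As a quick sanity check on the headline figure, the ratio can be recomputed from the two averages quoted in the table row. This is only a sketch of the arithmetic; the variable names are illustrative and nothing here comes from the benchmark harness itself.

```python
# Average checkpoint durations quoted in the summary table (seconds).
native_avg_s = 48.8   # flink-s3-fs-native
presto_avg_s = 90.1   # Presto plugin

# Speedup is the old duration divided by the new one.
speedup = presto_avg_s / native_avg_s

# Equivalent view: percentage of checkpoint time saved.
time_saved_pct = (1 - native_avg_s / presto_avg_s) * 100

print(f"speedup: {speedup:.2f}x")
print(f"checkpoint time reduced by {time_saved_pct:.1f}%")
```

The averages work out to roughly 1.85x, so "~2x" is a rounded-up figure; linking to the test section would let readers see that the larger gains (up to 4.5x) apply to small state sizes.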

davidradl · 2026-06-15T08:22:22Z

+| **~2x faster checkpoints** | 48.8 s average vs 90.1 s with the Presto plugin; up to 4.5x at small state sizes |
+| **Drop-in replacement** | Swap the JAR, keep your existing `flink-conf.yaml`, restart your cluster |
+| **No Hadoop dependency** | ~13 MB JAR vs ~30–93 MB; no CVE triage on Hadoop transitive dependencies |
+| **AWS SDK v2** | Async-first I/O; AWS SDK v1 entered maintenance mode December 2025 |


Hyperlinks for the AWS items and the end date would be helpful.

davidradl · 2026-06-15T08:24:37Z

+
+Both share a common base layer that adapts a Hadoop `FileSystem` into a Flink `FileSystem`. This adaptation layer adds indirection, limits Flink-specific optimizations, and ties the implementation to Hadoop's configuration model and SDK lifecycle.
+
+As a result, you could have exactly-once sinks or a lighter read path, but not both. In addition, you are carrying Hadoop dependency hell.


nit: hell -> challenges

davidradl · 2026-06-15T08:28:10Z

+
+### Test environment
+
+The benchmark ran on Amazon EKS (ap-south-1) with a Flink 2.1.1 cluster composed of 1 JobManager (2 GB memory, 1 core) and 2 TaskManagers (6 GB memory, 1.5 cores, 4 task slots each) for a total parallelism of 8. The workload targeted 20 GB of RocksDB state with full, non-incremental checkpoints every 60 seconds in EXACTLY_ONCE mode. The test ran for approximately 77 minutes. Configurations for both plugins were identical except for the plugin JAR itself.


Add a caveat that users' performance might differ.
Can we point to the payloads, so users can run exactly this benchmark?
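
For anyone trying to reproduce the setup, the sizing numbers in the paragraph can be cross-checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: state is assumed to be spread evenly across slots, and the checkpoint count is only an upper bound, since a checkpoint that takes longer than the 60-second interval delays the next one.

```python
# Cluster shape taken from the "Test environment" paragraph.
task_managers = 2
slots_per_task_manager = 4
total_parallelism = task_managers * slots_per_task_manager

# Target state size; even distribution across slots is an assumption.
state_gb = 20
state_per_slot_gb = state_gb / total_parallelism

# Run length and checkpoint interval from the same paragraph.
run_minutes = 77
checkpoint_interval_s = 60

# Upper bound on triggered checkpoints: one per interval, ignoring overruns.
max_checkpoints = (run_minutes * 60) // checkpoint_interval_s

print(f"total parallelism: {total_parallelism}")
print(f"state per slot: {state_per_slot_gb} GB (if evenly distributed)")
print(f"at most {max_checkpoints} checkpoints in {run_minutes} minutes")
```

Publishing the actual job and configuration alongside the post would remove the need for assumptions like these.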

davidradl · 2026-06-15T08:31:12Z

+- **Enhanced observability** : S3 operation metrics (latency, retry counts, throughput) exposed through Flink's metric system, giving platform teams visibility into S3 I/O behavior.
+- **Stream-based S3 read/write** : Improving memory efficiency for large object operations.
+
+**Phase 2: Recommended default.** Once stability is proven across a broad set of community deployments, the native plugin will be promoted to the recommended default for new Flink installations. Documentation, quickstarts, and tutorials will be updated accordingly.


How will we know that "proven across a broad set of community deployments" has happened?

davidradl · 2026-06-15T08:32:47Z

+
+**Phase 2: Recommended default.** Once stability is proven across a broad set of community deployments, the native plugin will be promoted to the recommended default for new Flink installations. Documentation, quickstarts, and tutorials will be updated accordingly.
+
+**Phase 3: Legacy deprecation.** The Hadoop and Presto plugins will be formally deprecated with a defined support window before removal.


We could deprecate in phase 1, as we do not intend to enhance these connectors.

davidradl · 2026-06-15T08:33:55Z

+
+`flink-s3-fs-native` is part of Apache Flink and is developed in the open. The module lives at `flink-filesystems/flink-s3-fs-native` in the [Flink repository](https://github.com/apache/flink).
+
+The migration is safe and requires minimal deployment changes. If your team is already evaluating or running this in production, we want to hear from you. Your feedback directly shapes the path from experimental to default.


Maybe be explicit as to how to contact us about this. Is there a tag that should be used in a dev list post?

Samrat002 wants to merge 1 commit into apache:asf-site from Samrat002:s3-release-blog
