Summary
When using df.write.format("avro").save(path) on Dataproc Serverless runtime 3.0 (Spark 4.0.1, Scala 2.13), every avro write fails with:
java.lang.NoClassDefFoundError: scala/collection/immutable/StringOps
at org.apache.spark.sql.avro.AvroFileFormat.supportFieldName(AvroFileFormat.scala:163)
at org.apache.spark.sql.execution.datasources.DataSourceUtils$.$anonfun$checkFieldNames$1(DataSourceUtils.scala:74)
...
Caused by: java.lang.ClassNotFoundException: scala.collection.immutable.StringOps
Root cause
scala.collection.immutable.StringOps exists as a class in Scala 2.12 but was moved to scala.collection.StringOps in Scala 2.13 — scala.collection.immutable.StringOps is only a type alias (no .class file) in 2.13.
The AvroFileFormat.supportFieldName method referenced in the stack trace is not present in spark-avro_2.13-4.0.0.jar from Maven Central (Spark 4.0 migrated spark-avro to DataSource V2). The class is therefore being loaded from the Dataproc Serverless runtime 3.0's internal JAR bundle, which contains an AvroFileFormat compiled against Scala 2.12 while the runtime standard library is Scala 2.13.
In other words: the runtime ships a Scala 2.12-compiled V1 compatibility shim for AvroFileFormat in a Scala 2.13 environment, causing class loading to fail at the first String operation inside the shim.
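One way to check this diagnosis against a specific JAR is to look for the AvroFileFormat class entry and search its bytes for the Scala 2.12 class name. The following is a minimal sketch, not the method used to produce this report: the function name inspect_jar and the two constants are hypothetical, and it assumes the class name appears as a plain ASCII string in the class file's constant pool.

```python
import zipfile

# Class entry named in the stack trace, and the Scala 2.12-only class it needs.
AVRO_CLASS = "org/apache/spark/sql/avro/AvroFileFormat.class"
OLD_STRINGOPS = b"scala/collection/immutable/StringOps"


def inspect_jar(jar_path):
    """Return (has_avro_file_format, references_old_stringops) for one JAR.

    Assumption: a reference to the old class shows up as an ASCII byte
    string inside the compiled class, so a substring search is enough.
    """
    with zipfile.ZipFile(jar_path) as jar:
        if AVRO_CLASS not in jar.namelist():
            return (False, False)
        class_bytes = jar.read(AVRO_CLASS)
    return (True, OLD_STRINGOPS in class_bytes)
```

A result of (True, True) for a JAR on the runtime classpath would be consistent with the Scala 2.12 shim described above; (False, False) is what the report describes for spark-avro_2.13-4.0.0.jar from Maven Central.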
Reproduction
On Dataproc Serverless runtime 3.0 (Spark 4.0.1), submit any PySpark batch that writes a DataFrame in avro format:
df.write.mode("overwrite").format("avro").save("gs://my-bucket/output/")
Fails immediately with the ClassNotFoundException above.
- Workaround: use runtime 2.3 (Spark 3.5) instead — avro writes succeed.
- Supplying an external spark-avro_2.13-4.0.0.jar does not help; the runtime's internal Scala 2.12 AvroFileFormat is still picked up by DataSourceUtils.checkFieldNames.
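For anyone who wants a complete batch file rather than the single write call, the reproduction can be wrapped as follows. This is a sketch under assumptions: the function names, the app name, and the two sample rows are hypothetical and not from the original job; only the write chain and the gs://my-bucket/output/ path come from the report.

```python
def write_avro(df, output_path):
    # The exact call chain from the report; this is the line that fails
    # on runtime 3.0.
    df.write.mode("overwrite").format("avro").save(output_path)


def main(output_path):
    # Imported lazily so the helper above can be loaded without PySpark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("avro-write-repro").getOrCreate()
    # Hypothetical sample data; any DataFrame should behave the same way.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    try:
        write_avro(df, output_path)
    finally:
        spark.stop()
```

To run it as a batch, add a call such as main("gs://my-bucket/output/") at the bottom of the file and submit it once on runtime 3.0 and once on runtime 2.3 to compare.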
Environment
- Dataproc Serverless runtime: 3.0.13 (latest as of 2026-06)
- Spark version: 4.0.1
- Scala runtime: 2.13 (confirmed by runtime 3.0 docs)
- External spark-avro JAR: spark-avro_2.13-4.0.0 (does NOT contain AvroFileFormat; only V2 classes in org.apache.spark.sql.v2.avro.*)
- Runtime 2.3 (Spark 3.5, Scala 2.13) with spark-avro_2.13-3.5.5.jar: works correctly
Expected behavior
Avro format writes should work on Dataproc Serverless runtime 3.0 (Spark 4.0.1 + Scala 2.13) without ClassNotFoundException.
Suggested fix
Ensure the AvroFileFormat V1 compatibility shim bundled inside the Spark 4.0 / Dataproc runtime 3.0 distribution is compiled against Scala 2.13 (referencing scala.collection.StringOps, not scala.collection.immutable.StringOps).