
Use SparkSession/DataFrame over SparkContext/RDD when writing model metadata #2401

Description

@qziyuan

With https://issues.apache.org/jira/browse/SPARK-48909, Spark ML now uses SparkSession and the DataFrame API to write model metadata, for example:
spark.createDataFrame(Seq(Tuple1(metadataJson))).write.text(metadataPath)

However, SynapseML still relies on SparkContext and the RDD API in places such as this line. This prevents SynapseML from functioning in environments where RDDs are no longer supported—for example, Databricks clusters with Unity Catalog enabled.

Would it be possible for SynapseML to adopt the changes introduced in SPARK-48909 to ensure compatibility with such environments?
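A minimal sketch of what this change could look like on the SynapseML side, assuming the current code writes metadata roughly the way Spark ML used to (sc.parallelize + saveAsTextFile); the object and method names below (MetadataWriteSketch, writeMetadataRdd, writeMetadataDf) are illustrative only and are not the actual SynapseML code path:

import org.apache.spark.sql.SparkSession

object MetadataWriteSketch {
  // Assumed current pattern: goes through SparkContext and the RDD API,
  // which fails on clusters where RDDs are unavailable (e.g. Unity Catalog clusters).
  def writeMetadataRdd(spark: SparkSession, metadataJson: String, metadataPath: String): Unit =
    spark.sparkContext.parallelize(Seq(metadataJson), 1).saveAsTextFile(metadataPath)

  // Requested pattern, following SPARK-48909: the same single-line JSON payload
  // is written through the DataFrame API, with no SparkContext/RDD involved.
  def writeMetadataDf(spark: SparkSession, metadataJson: String, metadataPath: String): Unit =
    spark.createDataFrame(Seq(Tuple1(metadataJson))).write.text(metadataPath)
}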
