
Use SparkSession/DataFrame over SparkContext/RDD when writing model metadata #2401

Description

@qziyuan

With https://issues.apache.org/jira/browse/SPARK-48909, Spark ML now uses SparkSession and the DataFrame API to write model metadata, for example:
spark.createDataFrame(Seq(Tuple1(metadataJson))).write.text(metadataPath)

However, SynapseML still relies on SparkContext and the RDD API in places such as this line. This prevents SynapseML from functioning in environments where RDDs are no longer supported—for example, Databricks clusters with Unity Catalog enabled.

Would it be possible for SynapseML to adopt the changes introduced in SPARK-48909 to ensure compatibility with such environments?
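A minimal sketch of what this change could look like on the SynapseML side, assuming the current code writes metadata roughly the way Spark ML used to (sc.parallelize + saveAsTextFile); the object and method names below (MetadataWriteSketch, writeMetadataRdd, writeMetadataDf) are illustrative only and are not the actual SynapseML code path:

import org.apache.spark.sql.SparkSession

object MetadataWriteSketch {
  // Assumed current pattern: goes through SparkContext and the RDD API,
  // which fails on clusters where RDDs are unavailable (e.g. Unity Catalog clusters).
  def writeMetadataRdd(spark: SparkSession, metadataJson: String, metadataPath: String): Unit =
    spark.sparkContext.parallelize(Seq(metadataJson), 1).saveAsTextFile(metadataPath)

  // Requested pattern, following SPARK-48909: the same single-line JSON payload
  // is written through the DataFrame API, with no SparkContext/RDD involved.
  def writeMetadataDf(spark: SparkSession, metadataJson: String, metadataPath: String): Unit =
    spark.createDataFrame(Seq(Tuple1(metadataJson))).write.text(metadataPath)
}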
