-
Notifications
You must be signed in to change notification settings - Fork 76
Detect runtime env #427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Detect runtime env #427
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other comments (2)
-
clickhouse-core/src/main/scala/com/clickhouse/spark/client/NodeClient.scala (63-66)
The parseVersionString method returns None when the version string has fewer than 3 parts, but the original code would return "Spark-ClickHouse-Connector" in this case. Consider handling this case explicitly to maintain the same behavior:
private def parseVersionString(version: String): Option[(String, String, String)] = { val parts = version.split("_") if (parts.length < 3) None else Some((parts(0), parts(1), parts(2))) }
-
clickhouse-core/src/main/scala/com/clickhouse/spark/client/NodeClient.scala (78-79)
Consider expanding the shouldInferRuntime method to handle more truthy values for better compatibility with different configuration styles:
private def shouldInferRuntime(): Boolean = { val value = nodeSpec.infer_runtime_env.toLowerCase value == "true" || value == "1" || value == "yes" || value == "on" }
💡 To request another review, post a new comment with "/windsurf-review".
def detectViaThreadNames(): Option[String] = { | ||
val threadNames = Thread.getAllStackTraces.keySet().asScala.map(_.getName.toLowerCase) | ||
|
||
if (threadNames.exists(_.contains("databricks"))) { | ||
Some("Databricks") | ||
} else if (threadNames.exists(t => t.contains("glue") || t.contains("awsglue"))) { | ||
Some("Glue") | ||
} else if (threadNames.exists(_.contains("emr"))) { | ||
Some("EMR") | ||
} else if (threadNames.exists(_.contains("dataproc"))) { | ||
Some("Dataproc") | ||
} else { | ||
None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The detectViaThreadNames()
method doesn't check for Synapse in thread names, while the other detection methods do. Consider adding a check for Synapse thread names for consistency.
def detectViaThreadNames(): Option[String] = { | ||
val threadNames = Thread.getAllStackTraces.keySet().asScala.map(_.getName.toLowerCase) | ||
|
||
if (threadNames.exists(_.contains("databricks"))) { | ||
Some("Databricks") | ||
} else if (threadNames.exists(t => t.contains("glue") || t.contains("awsglue"))) { | ||
Some("Glue") | ||
} else if (threadNames.exists(_.contains("emr"))) { | ||
Some("EMR") | ||
} else if (threadNames.exists(_.contains("dataproc"))) { | ||
Some("Dataproc") | ||
} else { | ||
None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The detectViaThreadNames()
method doesn't check for HDInsight, but HDInsight is included in the detectViaClassLoader()
method. Consider adding a check for HDInsight thread names for consistency.
def detectViaStackTrace(): Option[String] = { | ||
val stackTrace = Thread.currentThread().getStackTrace | ||
val stackClasses = stackTrace.map(_.getClassName.toLowerCase) | ||
|
||
// Check for platform-specific classes in stack | ||
if ( | ||
stackClasses.exists(c => | ||
c.contains("com.databricks.logging") || | ||
c.contains("databricks.spark") || | ||
c.contains("com.databricks.backend") | ||
) | ||
) { | ||
Some("Databricks") | ||
} else if ( | ||
stackClasses.exists { c => | ||
c.contains("com.amazonaws.services.glue") || | ||
c.contains("aws.glue") || | ||
c.contains("awsglue") | ||
} | ||
) { | ||
Some("Glue") | ||
} else if ( | ||
stackClasses.exists(c => | ||
c.contains("com.amazon.emr") || | ||
c.contains("amazon.emrfs") | ||
) | ||
) { | ||
Some("EMR") | ||
} else if ( | ||
stackClasses.exists(c => | ||
c.contains("com.google.cloud.dataproc") || | ||
c.contains("dataproc") | ||
) | ||
) { | ||
Some("Dataproc") | ||
} else if ( | ||
stackClasses.exists(c => | ||
c.contains("com.microsoft.azure.synapse") || | ||
c.contains("synapse.spark") | ||
) | ||
) { | ||
Some("Synapse") | ||
} else { | ||
None | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The detectViaStackTrace()
method doesn't check for HDInsight, but HDInsight is included in the detectViaClassLoader()
method. Consider adding a check for HDInsight in stack traces for consistency.
Only Glue (AWS) and Databricks environments were tested and verified; the rest of the managed services were added based on online research and vendor documentation.
close #397