#17 DataFrameImplicits #18
 * @param schema An optional schema to validate if the column already exists (a very low probability)
 * @return A name that can be used as a unique column name
 */
def getUniqueName(prefix: String, schema: Option[StructType]): String = {
I would make this an implicit on StructType. When not checked against a schema it's so trivial, I wouldn't even bother making it a common function.
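To illustrate the suggestion, here is a minimal sketch of `getUniqueName` as an implicit on `StructType`. The object and class names (`StructTypeImplicitsSketch`, `StructTypeEnhancements`) are hypothetical, and the numeric-suffix strategy is an assumption made for the example, not necessarily how the PR generates names. The comparison is case-insensitive because Spark resolves column names case-insensitively by default.

```scala
import org.apache.spark.sql.types.StructType

// Hypothetical names; a sketch of the reviewer's idea, not the PR's actual code.
object StructTypeImplicitsSketch {

  implicit class StructTypeEnhancements(val schema: StructType) extends AnyVal {

    /** Returns `prefix` if no field uses it, otherwise `prefix_N` for the
      * smallest N >= 1 that is not taken. Names are compared ignoring case.
      */
    def getUniqueName(prefix: String): String = {
      val taken = schema.fieldNames.map(_.toLowerCase).toSet
      def isFree(name: String): Boolean = !taken.contains(name.toLowerCase)

      if (isFree(prefix)) prefix
      else Iterator.from(1).map(i => s"${prefix}_$i").find(isFree).get
    }
  }
}
```

With `import StructTypeImplicitsSketch._` in scope, the call site becomes `df.schema.getUniqueName("tmp")`, and the `Option[StructType]` parameter is no longer needed.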
It can be removed since it will be replaced by this
Not sure how the linked code addresses the need for a unique column name? 🤔
log.warn(s"Column '$colName' already exists. The content of the column will be overwritten.")
overwriteWithErrorColumn(df, colName, colExpr)
This again is pretty Enceladus-specific. I would make this branch of code an input parameter:
ifExists: (DataFrame, String) => Unit = (_, _) => {}
This would also eliminate the need for the specific errorColumn code.
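A rough sketch of what that parameterization could look like follows. The method name `withColumnIfAbsent` and the wrapper names are hypothetical; only the `ifExists` signature and its no-op default come from the comment above. Because the callback returns `Unit`, it can log or throw but cannot change the result, so an existing column is returned untouched.

```scala
import org.apache.spark.sql.{Column, DataFrame}

// Hypothetical names; only the `ifExists` parameter shape is taken from the review comment.
object DataFrameImplicitsSketch {

  implicit class DataFrameEnhancements(df: DataFrame) {

    /** Adds `colName` only when no column of that name exists (ignoring case).
      * When it does exist, `ifExists` is invoked and the DataFrame is returned unchanged.
      */
    def withColumnIfAbsent(
        colName: String,
        colExpr: Column,
        ifExists: (DataFrame, String) => Unit = (_, _) => {}
    ): DataFrame = {
      if (df.columns.exists(_.equalsIgnoreCase(colName))) {
        ifExists(df, colName) // caller decides: warn, throw, or do nothing
        df
      } else {
        df.withColumn(colName, colExpr)
      }
    }
  }
}
```

A caller that wants the warning from the diff could pass `(_, name) => log.warn(s"Column '$name' already exists.")`, which keeps project-specific handling out of the shared library.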
So remove the overwriteWithErrorColumn function entirely?
src/main/scala/za/co/absa/spark/commons/schema/SchemaUtils.scala
Outdated
…e-implicits # Conflicts: # README.md
Closes #17