Bugfix/48 no storer write #51
Conversation
- hdfs test enabled for build, while s3 ignored
- readme update
TODO: add a test to verify whether it works with s3-over-hadoopFs like this.
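A minimal sketch of what that s3-over-hadoopFs check might look like, assuming Spark with the Hadoop S3A connector on the classpath and Atum's listener already registered; the bucket and path below are hypothetical:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical check: write through Hadoop's S3A filesystem and expect
// the listener to emit an _INFO file next to the output, as on HDFS.
val spark = SparkSession.builder().appName("s3a-info-check").getOrCreate()
val df = spark.range(10).toDF("id")
val outputPath = "s3a://some-test-bucket/atum/output" // hypothetical bucket
df.write.mode(SaveMode.Overwrite).parquet(outputPath)
```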
```diff
 case _ =>
-  Atum.log.info("No usable storer is set, therefore no data will be written the automatically with DF-save to an _INFO file.")
+  Atum.log.debug(s"SparkQueryExecutionListener.onSuccess: writing to Hadoop FS")
+  writeInfoFileForQuery(qe)
```
Missing the info file write here was the main cause.
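For context, a simplified sketch of the shape this listener logic appears to take after the fix; the S3 branch and its helper name here are placeholders, not the actual API:

```scala
// Assumed shape: on a successful `save`, dispatch to a storer-specific
// writer, falling back to the Hadoop FS _INFO writer when no dedicated
// storer matches.
override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
  if (funcName == "save") {
    storer match {
      case Some(s3Storer) =>
        writeInfoFileForS3Query(qe, s3Storer) // hypothetical S3-specific branch
      case _ =>
        Atum.log.debug(s"SparkQueryExecutionListener.onSuccess: writing to Hadoop FS")
        writeInfoFileForQuery(qe) // the write that was previously missing
    }
  }
}
```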
Code reviewed, built, and run in a stand-alone project.
```scala
df.write.mode(SaveMode.Overwrite)
  .parquet(outputPath)

{
```
I am not sure I get how this is styled. Shouldn't there be some keyword here?
What do you mean? The `{}` block is used to limit the visibility of `val outputPath`, as a logical constraint. Or are you asking about the formatting of the block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, interesting, I hadn't thought about it that way. I have never seen a standalone block like this in Scala, so it seemed weird to me.
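For illustration, a minimal sketch of the idiom under discussion; the path below is hypothetical:

```scala
// A bare block is just an expression; vals declared inside it are not
// visible afterwards, so each step gets a tightly scoped outputPath.
{
  val outputPath = "/tmp/atum-example/stage1" // hypothetical path
  df.write.mode(SaveMode.Overwrite).parquet(outputPath)
}
// outputPath is out of scope here; a later block may define its own.
```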
Just small stuff, mostly code style.
Resolved review threads (outdated):
- atum/src/main/scala/za/co/absa/atum/core/SparkQueryExecutionListener.scala
- examples/src/test/scala/za/co/absa/atum/HdfsInfoIntegrationSuite.scala
```scala
import za.co.absa.atum.utils._

@Ignore
class SampleMeasurementsS3RunnerSpec extends AnyFunSuite
```
Why is this here?
Because unlike the hadoop-fs tests, these tests should not be run against actual S3. Thus, they now:
- serve as an example
- can be run manually, provided certain conditions are met (files exist on S3 inside a specified bucket, a key ID is supplied, a local SAML profile is supplied)
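As a sketch of the mechanism, assuming ScalaTest 3.x; the test body is a placeholder, not the actual suite contents:

```scala
import org.scalatest.Ignore
import org.scalatest.funsuite.AnyFunSuite

// A class-level @Ignore makes ScalaTest skip every test in the suite;
// removing the annotation re-enables it for a manual run.
@Ignore
class SampleMeasurementsS3RunnerSpec extends AnyFunSuite {
  test("writes an _INFO file to S3") {
    // placeholder: would need the bucket contents, key ID, and
    // local SAML profile described above
  }
}
```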
Resolved review thread (outdated):
- atum/src/main/scala/za/co/absa/atum/utils/ExecutionPlanUtils.scala
Force-pushed: 6a03ada → a051dfa
I guess I can approve the functionality as well.
Merge commit message: "# Conflicts: examples/pom.xml"
Bugfix of #48.
A small integration test has been added to verify the correct behavior for this case. It was also tested with Enceladus's aws-poc, and the _INFO file generation was observed to work without problems (without changes; an explicit output path is used there anyway).
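A hedged sketch of the kind of assertion such an integration test can make, assuming a local Hadoop-backed SparkSession with Atum's tracking already enabled (setup omitted); names and paths are illustrative, not the actual suite:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.{SaveMode, SparkSession}

// Illustrative check: after a DataFrame save, an _INFO file should sit
// next to the output on the (Hadoop) filesystem.
val spark = SparkSession.builder().master("local[*]").appName("info-check").getOrCreate()
val outputPath = "/tmp/atum-it/output" // hypothetical path
spark.range(10).toDF("id").write.mode(SaveMode.Overwrite).parquet(outputPath)

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
assert(fs.exists(new Path(s"$outputPath/_INFO")))
```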