
Conversation

@dk1844 dk1844 (Collaborator) commented Dec 14, 2020

Bugfix of #48.

A small integTest has been added to pin down the correct behavior for this case. It was also tested with Enceladus's aws-poc, and the _INFO file generation was observed to work without problems there (no changes were needed, since an explicit output path is used there anyway).
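
For context, a minimal sketch of the scenario this fix targets is shown below. It is not the integTest code from this PR; the method names (spark.enableControlMeasuresTracking, df.setCheckpoint) follow Atum's documented implicit API and their exact signatures may differ between versions. The point is that a DataFrame save without any explicit storer configured should still produce an _INFO file next to the output.

    // Hedged sketch only: names follow Atum's documented implicit API, not this PR's integTest code.
    import org.apache.spark.sql.{SaveMode, SparkSession}
    import za.co.absa.atum.AtumImplicits._

    object InfoFileOnDfSaveSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("info-file-on-df-save")
          .master("local[*]")
          .getOrCreate()

        // Start tracking from an existing source _INFO file; note that no storer (e.g. an S3 storer) is set.
        spark.enableControlMeasuresTracking(sourceInfoFile = "data/input/_INFO")

        val df = spark.read.parquet("data/input")
          .setCheckpoint("after read")

        // With the fix, this DF-save should also trigger writing the _INFO file to the output path.
        df.write.mode(SaveMode.Overwrite).parquet("data/output")
      }
    }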

@dk1844 dk1844 requested a review from Zejnilovic December 14, 2020 13:17
case _ =>
  Atum.log.info("No usable storer is set, therefore no data will be written automatically with DF-save to an _INFO file.")
  Atum.log.debug(s"SparkQueryExecutionListener.onSuccess: writing to Hadoop FS")
  writeInfoFileForQuery(qe)
@dk1844 dk1844 (Collaborator, Author) Dec 14, 2020

The missing info file write here was the main cause of the bug.

@Zejnilovic Zejnilovic (Collaborator) left a comment

Code reviewed, built, and ran in a stand-alone project.

df.write.mode(SaveMode.Overwrite)
  .parquet(outputPath)

{
Collaborator

I am not sure I get how this is styled. Shouldn't there be some keyword here?

@dk1844 dk1844 (Collaborator, Author) Jan 4, 2021

what what?
The {} block is used to limit the visibility of val outputPath, as a logical constraint. Or are you discussing the formatting of the block?

Collaborator

Ah, interesting, I hadn't thought about it that way. I have never seen a standalone block like this in Scala, so it looked weird to me.
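
For readers who haven't seen the pattern, here is a minimal standalone sketch (not code from this PR) of how a bare {} block limits the scope of a val in Scala:

    import org.apache.spark.sql.{DataFrame, SaveMode}

    object ScopedBlockExample {
      def writeExample(df: DataFrame): Unit = {
        // The bare block confines outputPath to the write call;
        // the name is not in scope after the closing brace.
        {
          val outputPath = "data/output/example"
          df.write.mode(SaveMode.Overwrite).parquet(outputPath)
        }
        // println(outputPath) // would not compile: outputPath is out of scope here
      }
    }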

@benedeki benedeki (Collaborator) left a comment

Just small stuff, mostly code style.

import za.co.absa.atum.utils._

@Ignore
class SampleMeasurementsS3RunnerSpec extends AnyFunSuite
Collaborator

Why is this here?

@dk1844 dk1844 (Collaborator, Author)

Because, unlike the Hadoop-FS tests, these tests should not be run automatically against actual S3. Thus, they now:

  1. serve as an example
  2. can be run manually, provided certain conditions are met (the files exist on S3 in a specified bucket, a key ID is supplied, and a local SAML profile is supplied); see the sketch below
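
As a hedged illustration (not the actual test file), this is roughly how a ScalaTest @Ignore annotation on the suite class keeps such S3-dependent tests out of automated runs while still serving as compilable example code:

    import org.scalatest.Ignore
    import org.scalatest.funsuite.AnyFunSuite

    // The whole suite is skipped by default; drop @Ignore once the S3 prerequisites
    // listed above (bucket contents, key ID, local SAML profile) are in place.
    @Ignore
    class S3RunnerManualSketchSpec extends AnyFunSuite {
      test("writes the _INFO file to the configured S3 bucket") {
        // intentionally left as a placeholder; runs only when executed manually
      }
    }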

@dk1844 dk1844 requested a review from benedeki January 4, 2021 10:26
@dk1844 dk1844 force-pushed the bugfix/48-no-storer-write branch from 6a03ada to a051dfa on January 4, 2021 10:31
@benedeki benedeki (Collaborator) commented Jan 4, 2021

I guess I can approve the functionality as well.
I worked on the merge of the Enceladus AWS-POC branch, which requires Atum 3.x; some tests failed with the currently released version, but with this snapshot they pass.

@dk1844 dk1844 merged commit da979ee into master Jan 6, 2021
@dk1844 dk1844 deleted the bugfix/48-no-storer-write branch January 6, 2021 15:09
@dk1844 dk1844 linked an issue Jan 7, 2021 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

Atum 3+ does not write pending checkpoints
