Skip to content

Conversation

@lqjack
Copy link
Contributor

@lqjack lqjack commented Oct 23, 2018

Fix the nextPos

Fix the nextPos
@steveloughran
Copy link
Contributor

  1. I think the same problem surfaces in remainingInCurrentRequest() too, which is where some inspection is going to be needed to carefully understand what's up.
  2. For this method, I think maybe we could cut it completely, and fix up available() to return a default value when wrappedStream==null (0 or maybe 1 except when at EOF), and when the stream isn't null, delegate to it. That pushes down the problem of estimating the number of non-blocking bytes into the http connector.

Tests.

In AbstractContractSeekTest there's some existing tests which could take some more asserts on that available call

  • testSeekZeroByteFile: available() == 0, always
  • testSeekReadClosedFile : call available() on an empty file, it should raise some IllegalStateException or IOE, but not an NPE

For some of the other tests, I think you insert a couple of checks to say " available > 0" after a seek + read() Call. This also clarifies that "what should be available after a seek but before a read()?" As for cloudstores with lazy seek, available == 0, though if that breaks gzip then maybe they should return 1, so the assert should be available >1 with an option in the contract to say "actually returns 0".

It's a tricky one to test on because really, the sole asserts should be

  1. fails if the stream is closed (unless its a 0?)
  2. returns >= if the stream is open
  3. is always < remaining file length.

Probably assert #3 is the one to check for: available() is valid when it is >=0 and <= (filelength -getPos()), with care taken about that second clause to make sure there's no off-by-one errors in the assert/code

@lqjack
Copy link
Contributor Author

lqjack commented Oct 24, 2018

@steveloughran Thanks very much , I will update.

1, Fix remainingInCurrentRequest
2, append assert in the Junit Test
@lqjack
Copy link
Contributor Author

lqjack commented Oct 25, 2018

@steveloughran I have updated the code , please help reveiw and comment, thanks .

@steveloughran
Copy link
Contributor

As the usual test police, which S3 endpoint did you rerun all the hadoop-aws tests against?
https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/testing.html

This only takes < 10 minutes, even remotely

mvn verify -Dparallel-tests -DtestsThreadCount=10 -Dscale
# or with s3guard
mvn verify -Dparallel-tests -DtestsThreadCount=10 -Ds3guard -Ddynamodb -Dscale

I'll help with testing the other stores (azure wasb/abfs/adl), before the commit, but I'd appreciate the diligence here as it shows you've done the homework. thx

@lqjack
Copy link
Contributor Author

lqjack commented Nov 17, 2018

@steveloughran Due to the upgrade to the latest version of the Mac OS, I need take some time to resolve the environment before I can run the test successful .

@steveloughran
Copy link
Contributor

no worries

fix junit
update the available logic
fix junit
Fix the nextPos
@steveloughran steveloughran changed the title HADOOP-15870 HADOOP-15870 S3AInputStream.remainingInFile should use nextReadPos Mar 28, 2019
@steveloughran
Copy link
Contributor

this is merged in, closing

shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
1. Added a meaningful name for the container thread pool threads.
2. Made the thread names for framework threads consistent.
3. Made a couple of monitoring/metrics threads daemon.
4. Fixed a few checkstyle warning about missing param/throws documentation.

Author: Prateek Maheshwari <[email protected]>

Reviewers: Jagadish <[email protected]>, Jacob M <[email protected]>

Closes apache#433 from prateekm/container-thread-pool-name
NyteKnight pushed a commit to NyteKnight/hadoop that referenced this pull request Jun 25, 2024
singer-bin pushed a commit to singer-bin/hadoop that referenced this pull request Dec 19, 2024
Author: Nandor Kollar <[email protected]>

Closes apache#433 from nandorKollar/PARQUET-1115 and squashes the following commits:

5504a39 [Nandor Kollar] PARQUET-1115: Warn users when misusing parquet-tools merge
f2ece26 [Nandor Kollar] PARQUET-1115: Warn users when misusing parquet-tools merge
4f3ec99 [Nandor Kollar] PARQUET-1115: Warn users when misusing parquet-tools merge
f97e620 [Nandor Kollar] PARQUET-1115: Prevent users from misusing parquet-tools merge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants