[SPARK-3586][streaming] Support nested directories in Spark Streaming #6588
Test build #33990 has finished for PR 6588 at commit
Test build #36271 has finished for PR 6588 at commit
@zsxwing Do you have any further plans for this PR? When is it going to be merged? Thanks.
This (and the other recursive call on l.203 above) may blow the call stack in very deep hierarchies (think Maildir++, etc.). I realise there's a depth limit set, but users may want to set it extremely high for precisely those cases.
How about doing the exploration with a @tailrec auxiliary function that takes a queue of yet-to-be-explored directories as an argument?
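To make the suggestion concrete, here is a minimal sketch of what such a traversal could look like. It is not the PR's code: it uses `java.io.File` as a stand-in for the Hadoop `FileSystem` API so that it runs on its own, and the names `NestedDirs`, `listDirectories`, `loop` and `maxDepth` are hypothetical, chosen only for illustration.

```scala
import java.io.File
import scala.annotation.tailrec
import scala.collection.immutable.Queue

// Hypothetical helper, not part of Spark: lists `root` and its nested
// directories breadth-first, descending at most `maxDepth` levels below root.
object NestedDirs {
  def listDirectories(root: File, maxDepth: Int): Vector[File] = {
    // `pending` holds the directories still to be explored, paired with their depth.
    // The recursive call is in tail position, so the compiler turns it into a loop
    // and stack usage stays constant regardless of how deep the tree is.
    @tailrec
    def loop(pending: Queue[(File, Int)], found: Vector[File]): Vector[File] =
      pending.dequeueOption match {
        case None => found
        case Some(((dir, depth), rest)) =>
          val children: Seq[File] =
            if (depth < maxDepth) {
              // listFiles() returns null when the path cannot be listed
              Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty).filter(_.isDirectory)
            } else {
              Seq.empty
            }
          loop(rest ++ children.map(c => (c, depth + 1)), found :+ dir)
      }

    loop(Queue((root, 0)), Vector.empty)
  }
}
```

With this shape, a very high `maxDepth` only grows the queue on the heap rather than the call stack. A real version would presumably list directories through the Hadoop `FileSystem` instead of `java.io.File`.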
@tdas: any update on this PR? Is there anything that still needs to be done?
@zsxwing : your solution is not going to work on S3, since |
Good point!
@zsxwing is this stalled now? It doesn't look like this will proceed.
Let me just close it. Maybe I'll revisit it later.
Is there a plan to support nested directory streaming with the latest versions of Spark?
Closes #2765