-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-15954][SQL] Disable loading test tables in Python tests #14005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The diff is a lot smaller when ignoring whitespaces: https://github.com/apache/spark/pull/14005/files?w=1 Most of the changes are just some indentation change. |
|
Great approach, I mentioned I had a similar approach available in the other PR #13737 (comment) to fix this (adding a flag to disable loading the tables) (although mine used a lazy val and only had the if around register) but looks functional equivalent. Note: I ran the scala hive tests locally and they failed, I think that the default should be to load the tables. You can fix it here or I can push my version. But other than that LGTM pending tests passing. |
|
ah yes default should definitely be true. let me fix that. |
|
Test build #61579 has finished for PR 14005 at commit
|
|
LGTM pending tests. I'll go ahead and close my original PR. cc @MLnick and @sameeragarwal . |
|
Test build #61582 has finished for PR 14005 at commit
|
|
Merging in master/2.0. |
## What changes were proposed in this pull request? This patch introduces a flag to disable loading test tables in TestHiveSparkSession and disables that in Python. This fixes an issue in which python/run-tests would fail due to failure to load test tables. Note that these test tables are not used outside of HiveCompatibilitySuite. In the long run we should probably decouple the loading of test tables from the test Hive setup. ## How was this patch tested? This is a test only change. Author: Reynold Xin <[email protected]> Closes #14005 from rxin/SPARK-15954. (cherry picked from commit 38f4d6f) Signed-off-by: Reynold Xin <[email protected]>
…skipped always ## What changes were proposed in this pull request? Currently, `HiveContext` in SparkR is not being tested and always skipped. This is because the initiation of `TestHiveContext` is being failed due to trying to load non-existing data paths (test tables). This is introduced from #14005 This enables the tests with SparkR. ## How was this patch tested? Manually, **Before** (on Mac OS) ``` ... Skipped ------------------------------------------------------------------------ 1. create DataFrame from RDD (test_sparkSQL.R#200) - Hive is not build with SparkSQL, skipped 2. test HiveContext (test_sparkSQL.R#1041) - Hive is not build with SparkSQL, skipped 3. read/write ORC files (test_sparkSQL.R#1748) - Hive is not build with SparkSQL, skipped 4. enableHiveSupport on SparkSession (test_sparkSQL.R#2480) - Hive is not build with SparkSQL, skipped 5. sparkJars tag in SparkContext (test_Windows.R#21) - This test is only for Windows, skipped ... ``` **After** (on Mac OS) ``` ... Skipped ------------------------------------------------------------------------ 1. sparkJars tag in SparkContext (test_Windows.R#21) - This test is only for Windows, skipped ... ``` Please refer the tests below (on Windows) - Before: https://ci.appveyor.com/project/HyukjinKwon/spark/build/45-test123 - After: https://ci.appveyor.com/project/HyukjinKwon/spark/build/46-test123 Author: hyukjinkwon <[email protected]> Closes #14889 from HyukjinKwon/SPARK-17326. (cherry picked from commit 50bb142) Signed-off-by: Shivaram Venkataraman <[email protected]>
…skipped always ## What changes were proposed in this pull request? Currently, `HiveContext` in SparkR is not being tested and always skipped. This is because the initiation of `TestHiveContext` is being failed due to trying to load non-existing data paths (test tables). This is introduced from #14005 This enables the tests with SparkR. ## How was this patch tested? Manually, **Before** (on Mac OS) ``` ... Skipped ------------------------------------------------------------------------ 1. create DataFrame from RDD (test_sparkSQL.R#200) - Hive is not build with SparkSQL, skipped 2. test HiveContext (test_sparkSQL.R#1041) - Hive is not build with SparkSQL, skipped 3. read/write ORC files (test_sparkSQL.R#1748) - Hive is not build with SparkSQL, skipped 4. enableHiveSupport on SparkSession (test_sparkSQL.R#2480) - Hive is not build with SparkSQL, skipped 5. sparkJars tag in SparkContext (test_Windows.R#21) - This test is only for Windows, skipped ... ``` **After** (on Mac OS) ``` ... Skipped ------------------------------------------------------------------------ 1. sparkJars tag in SparkContext (test_Windows.R#21) - This test is only for Windows, skipped ... ``` Please refer the tests below (on Windows) - Before: https://ci.appveyor.com/project/HyukjinKwon/spark/build/45-test123 - After: https://ci.appveyor.com/project/HyukjinKwon/spark/build/46-test123 Author: hyukjinkwon <[email protected]> Closes #14889 from HyukjinKwon/SPARK-17326.
What changes were proposed in this pull request?
This patch introduces a flag to disable loading test tables in TestHiveSparkSession and disables that in Python. This fixes an issue in which python/run-tests would fail due to failure to load test tables.
Note that these test tables are not used outside of HiveCompatibilitySuite. In the long run we should probably decouple the loading of test tables from the test Hive setup.
How was this patch tested?
This is a test only change.