Closed pull request: changes from all commits (56 commits)
81706c3  read/write Timestamp ntz or ltz to Orc uses UTC timestamp (beliefer, Nov 25, 2021)
8c01499  Update code (beliefer, Nov 25, 2021)
8e1fb0a  Update OrcFileFormat.scala (beliefer, Nov 25, 2021)
1a3e6fb  Update OrcUtils.scala (beliefer, Nov 25, 2021)
7c343a7  Update sql/core/src/test/scala/org/apache/spark/sql/execution/datasou… (beliefer, Nov 25, 2021)
92d7b8f  Update code (beliefer, Nov 25, 2021)
998e93d  Update code (beliefer, Nov 25, 2021)
d192d96  [SPARK-32079][PYTHON] Remove namedtuple hack by replacing built-in pi… (HyukjinKwon, Nov 25, 2021)
444cfe6  Revert "[SPARK-37445][BUILD] Rename the maven profile hadoop-3.2 to h… (HyukjinKwon, Nov 26, 2021)
69e1151  [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL… (HyukjinKwon, Nov 26, 2021)
95fc4c5  [SPARK-37457][PYTHON] Update cloudpickle to v2.0.0 (HyukjinKwon, Nov 26, 2021)
87be39c  Update code (beliefer, Nov 26, 2021)
f9a1d72  Update code (beliefer, Nov 26, 2021)
7b50cf0  [SPARK-34735][SQL][UI] Add modified configs for SQL execution in UI (ulysses-you, Nov 26, 2021)
f399d0d  [SPARK-37437][BUILD] Remove unused hive profile and related CI test (AngersZhuuuu, Nov 27, 2021)
db9a982  [SPARK-37461][YARN] YARN-CLIENT mode client.appId is always null (AngersZhuuuu, Nov 28, 2021)
e91ef19  [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs (ueshin, Nov 29, 2021)
a3886ba  [SPARK-37319][K8S][FOLLOWUP] Set JAVA_HOME for Java 17 installed by a… (sarutak, Nov 29, 2021)
5d09828  [SPARK-37447][SQL] Cache LogicalPlan.isStreaming() result in a lazy val (JoshRosen, Nov 29, 2021)
0c3c4e2  [SPARK-37452][SQL] Char and Varchar break backward compatibility betw… (yaooqinn, Nov 29, 2021)
251e6fd  [SPARK-37464][SQL] SCHEMA and DATABASE should simply be aliases of NA… (cloud-fan, Nov 29, 2021)
0f631b1  [SPARK-33875][SQL][FOLLOWUP] Handle the char/varchar column for `Desc… (Peng-Lei, Nov 29, 2021)
a6ca481  [SPARK-36346][SQL][FOLLOWUP] Rename `withAllOrcReaders` to `withAllNa… (dongjoon-hyun, Nov 29, 2021)
7484c1b  [SPARK-37468][SQL] Support ANSI intervals and TimestampNTZ for UnionE… (sarutak, Nov 29, 2021)
1966416  [SPARK-37454][SQL][FOLLOWUP] Time travel timestamp expression should … (cloud-fan, Nov 30, 2021)
7689102  [SPARK-37485][CORE][SQL] Replace `map` with expressions which produce… (LuciferYang, Nov 30, 2021)
e36fae4  [SPARK-37484][CORE][SQL] Replace `get` and `getOrElse` with `getOrElse` (LuciferYang, Nov 30, 2021)
c38c617  [SPARK-37482][PYTHON] Skip check monotonic increasing for Series.asof… (dchvn, Nov 30, 2021)
fe1bb55  [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesRe… (kazuyukitanimura, Nov 30, 2021)
98b0c80  [SPARK-36850][SQL] Migrate CreateTableStatement to v2 command framework (huaxingao, Nov 30, 2021)
49b5dd1  [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders (beliefer, Nov 30, 2021)
3657703  [SPARK-37465][PYTHON] Bump minimum pandas version to 1.0.5 (Yikun, Nov 30, 2021)
e031d00  [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check dis… (Yikun, Nov 30, 2021)
1a43112  [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should p… (AngersZhuuuu, Nov 30, 2021)
e3256b8  [SPARK-36396][PYTHON] Implement DataFrame.cov (dchvn, Nov 30, 2021)
ac7c52d  [MINOR][DOC] Update doc for `ResourceProfileManager.isSupported` (wzhfy, Nov 30, 2021)
fdb33dd  [SPARK-37505][MESOS][TESTS] Add a log4j.properties for `mesos` module UT (LuciferYang, Nov 30, 2021)
ca25534  [SPARK-37509][CORE] Improve Fallback Storage upload speed by avoiding… (dongjoon-hyun, Nov 30, 2021)
2b04496  [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapsho… (dongjoon-hyun, Dec 1, 2021)
d61c2f4  [SPARK-37490][SQL] Show extra hint if analyzer fails due to ANSI type… (gengliangwang, Dec 1, 2021)
e7fa289  [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartiti… (sunchao, Dec 1, 2021)
004cab1  Update code (beliefer, Dec 1, 2021)
e4f6a0d  Update code (beliefer, Dec 1, 2021)
d925ce7  read/write Timestamp ntz or ltz to Orc uses UTC timestamp (beliefer, Nov 25, 2021)
e3775e3  Update code (beliefer, Nov 25, 2021)
e8f015a  Update OrcFileFormat.scala (beliefer, Nov 25, 2021)
659aa2f  Update OrcUtils.scala (beliefer, Nov 25, 2021)
e6bc2da  Update sql/core/src/test/scala/org/apache/spark/sql/execution/datasou… (beliefer, Nov 25, 2021)
d9e2fba  Update code (beliefer, Nov 25, 2021)
4491498  Update code (beliefer, Nov 25, 2021)
7cbf277  Update code (beliefer, Nov 26, 2021)
86012a4  Update code (beliefer, Nov 26, 2021)
cb91b31  Update code (beliefer, Dec 1, 2021)
f313ea5  Update code (beliefer, Dec 1, 2021)
82738b1  Update code (beliefer, Dec 1, 2021)
aec499f  Update code (beliefer, Dec 1, 2021)
12 changes: 6 additions & 6 deletions .github/workflows/build_and_test.yml
@@ -63,37 +63,37 @@ jobs:
echo '::set-output name=branch::master'
echo '::set-output name=type::scheduled'
echo '::set-output name=envs::{"SCALA_PROFILE": "scala2.13"}'
- echo '::set-output name=hadoop::hadoop3.3'
+ echo '::set-output name=hadoop::hadoop3.2'
elif [ "${{ github.event.schedule }}" = "0 7 * * *" ]; then
echo '::set-output name=java::8'
echo '::set-output name=branch::branch-3.2'
echo '::set-output name=type::scheduled'
echo '::set-output name=envs::{"SCALA_PROFILE": "scala2.13"}'
- echo '::set-output name=hadoop::hadoop3.3'
+ echo '::set-output name=hadoop::hadoop3.2'
elif [ "${{ github.event.schedule }}" = "0 10 * * *" ]; then
echo '::set-output name=java::8'
echo '::set-output name=branch::master'
echo '::set-output name=type::pyspark-coverage-scheduled'
echo '::set-output name=envs::{"PYSPARK_CODECOV": "true"}'
- echo '::set-output name=hadoop::hadoop3.3'
+ echo '::set-output name=hadoop::hadoop3.2'
elif [ "${{ github.event.schedule }}" = "0 13 * * *" ]; then
echo '::set-output name=java::11'
echo '::set-output name=branch::master'
echo '::set-output name=type::scheduled'
echo '::set-output name=envs::{"SKIP_MIMA": "true", "SKIP_UNIDOC": "true"}'
- echo '::set-output name=hadoop::hadoop3.3'
+ echo '::set-output name=hadoop::hadoop3.2'
elif [ "${{ github.event.schedule }}" = "0 16 * * *" ]; then
echo '::set-output name=java::17'
echo '::set-output name=branch::master'
echo '::set-output name=type::scheduled'
echo '::set-output name=envs::{"SKIP_MIMA": "true", "SKIP_UNIDOC": "true"}'
- echo '::set-output name=hadoop::hadoop3.3'
+ echo '::set-output name=hadoop::hadoop3.2'
else
echo '::set-output name=java::8'
echo '::set-output name=branch::master' # Default branch to run on. CHANGE here when a branch is cut out.
echo '::set-output name=type::regular'
echo '::set-output name=envs::{}'
- echo '::set-output name=hadoop::hadoop3.3'
+ echo '::set-output name=hadoop::hadoop3.2'
fi

# Build: build Spark and run the tests for specified modules.
@@ -463,14 +463,14 @@ object ResourceProfile extends Logging {
case ResourceProfile.CORES =>
cores = execReq.amount.toInt
case rName =>
- val nameToUse = resourceMappings.get(rName).getOrElse(rName)
+ val nameToUse = resourceMappings.getOrElse(rName, rName)
customResources(nameToUse) = execReq
}
}
customResources.toMap
} else {
defaultResources.customResources.map { case (rName, execReq) =>
- val nameToUse = resourceMappings.get(rName).getOrElse(rName)
+ val nameToUse = resourceMappings.getOrElse(rName, rName)
(nameToUse, execReq)
}
}
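A side note on the `get(k).getOrElse(default)` to `getOrElse(k, default)` cleanup in this file: the two forms return the same value, but the single call skips the intermediate `Option`. A minimal sketch, with a made-up mapping standing in for `resourceMappings`:

```scala
object GetOrElseSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical resource-name mapping, standing in for `resourceMappings` above.
    val resourceMappings = Map("gpu" -> "nvidia.com/gpu")

    // Old style: wraps the lookup in an Option, then unwraps it.
    val oldStyle = resourceMappings.get("fpga").getOrElse("fpga")
    // New style: a single call, no intermediate Option allocation.
    val newStyle = resourceMappings.getOrElse("fpga", "fpga")

    assert(oldStyle == newStyle) // both fall back to the key itself
    println(s"$oldStyle / $newStyle")
  }
}
```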
@@ -57,8 +57,10 @@ private[spark] class ResourceProfileManager(sparkConf: SparkConf,
private val notRunningUnitTests = !isTesting
private val testExceptionThrown = sparkConf.get(RESOURCE_PROFILE_MANAGER_TESTING)

- // If we use anything except the default profile, its only supported on YARN right now.
- // Throw an exception if not supported.
+ /**
+  * If we use anything except the default profile, it's only supported on YARN and Kubernetes
+  * with dynamic allocation enabled. Throw an exception if not supported.
+  */
private[spark] def isSupported(rp: ResourceProfile): Boolean = {
val isNotDefaultProfile = rp.id != ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
val notYarnOrK8sAndNotDefaultProfile = isNotDefaultProfile && !(isYarn || isK8s)
@@ -103,7 +105,7 @@ private[spark] class ResourceProfileManager(sparkConf: SparkConf,
def resourceProfileFromId(rpId: Int): ResourceProfile = {
readLock.lock()
try {
- resourceProfileIdToResourceProfile.get(rpId).getOrElse(
+ resourceProfileIdToResourceProfile.getOrElse(rpId,
throw new SparkException(s"ResourceProfileId $rpId not found!")
)
} finally {
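The new scaladoc states the rule this method enforces: custom (non-default) profiles are only usable on YARN or Kubernetes, and only with dynamic allocation enabled. A condensed, hypothetical model of that check is sketched below; the field and message names are illustrative, and the real method also honors the testing flags visible above:

```scala
// Condensed model of the rule described in the scaladoc; not the actual Spark internals.
final case class ClusterState(isYarn: Boolean, isK8s: Boolean, dynamicAllocationEnabled: Boolean)

object ResourceProfileSupportSketch {
  val DefaultProfileId = 0 // stands in for ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID

  def isSupported(rpId: Int, state: ClusterState): Boolean = {
    val isNotDefaultProfile = rpId != DefaultProfileId
    val notYarnOrK8s = isNotDefaultProfile && !(state.isYarn || state.isK8s)
    val noDynamicAllocation = isNotDefaultProfile && !state.dynamicAllocationEnabled
    if (notYarnOrK8s || noDynamicAllocation) {
      throw new IllegalStateException(
        "Custom ResourceProfiles are only supported on YARN and Kubernetes with dynamic allocation enabled")
    }
    true
  }

  def main(args: Array[String]): Unit = {
    val yarnDynAlloc = ClusterState(isYarn = true, isK8s = false, dynamicAllocationEnabled = true)
    println(isSupported(1, yarnDynAlloc)) // true
    // The default profile is always accepted, regardless of cluster manager.
    println(isSupported(DefaultProfileId,
      ClusterState(isYarn = false, isK8s = false, dynamicAllocationEnabled = false)))
  }
}
```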
@@ -56,7 +56,7 @@ private[spark] class FetchFailedException(
// which intercepts this exception (possibly wrapping it), the Executor can still tell there was
// a fetch failure, and send the correct error msg back to the driver. We wrap with an Option
// because the TaskContext is not defined in some test cases.
- Option(TaskContext.get()).map(_.setFetchFailed(this))
+ Option(TaskContext.get()).foreach(_.setFetchFailed(this))

def toTaskFailedReason: TaskFailedReason = FetchFailed(
bmAddress, shuffleId, mapId, mapIndex, reduceId, Utils.exceptionString(this))
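The `map` to `foreach` change is behavior-preserving: the callback runs purely for its side effect, and `foreach` returns `Unit` instead of an `Option[Unit]` that was being discarded. A minimal sketch, with a stand-in class in place of `TaskContext`:

```scala
object OptionSideEffectSketch {
  // Stand-in for TaskContext; only the side-effecting callback matters here.
  final class Ctx { def setFetchFailed(): Unit = println("fetch failure recorded") }

  def main(args: Array[String]): Unit = {
    val maybeCtx: Option[Ctx] = Option(new Ctx)

    // Discouraged for side effects: builds an Option[Unit] that is immediately thrown away.
    maybeCtx.map(_.setFetchFailed())

    // Preferred: foreach signals the intent and returns Unit.
    maybeCtx.foreach(_.setFetchFailed())

    // Option(null) is None, so nothing runs when the context is absent (e.g. in tests).
    Option(null: Ctx).foreach(_.setFetchFailed())
  }
}
```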
16 changes: 11 additions & 5 deletions core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
@@ -31,6 +31,7 @@ import org.apache.spark.deploy.SparkHadoopUtil
import org.apache.spark.internal.Logging
import org.apache.spark.internal.config.{STORAGE_DECOMMISSION_FALLBACK_STORAGE_CLEANUP, STORAGE_DECOMMISSION_FALLBACK_STORAGE_PATH}
import org.apache.spark.network.buffer.{ManagedBuffer, NioManagedBuffer}
+ import org.apache.spark.network.util.JavaUtils
import org.apache.spark.rpc.{RpcAddress, RpcEndpointRef, RpcTimeout}
import org.apache.spark.shuffle.{IndexShuffleBlockResolver, ShuffleBlockInfo}
import org.apache.spark.shuffle.IndexShuffleBlockResolver.NOOP_REDUCE_ID
@@ -60,15 +61,17 @@ private[storage] class FallbackStorage(conf: SparkConf) extends Logging {
val indexFile = r.getIndexFile(shuffleId, mapId)

if (indexFile.exists()) {
+ val hash = JavaUtils.nonNegativeHash(indexFile.getName)
fallbackFileSystem.copyFromLocalFile(
new Path(indexFile.getAbsolutePath),
- new Path(fallbackPath, s"$appId/$shuffleId/${indexFile.getName}"))
+ new Path(fallbackPath, s"$appId/$shuffleId/$hash/${indexFile.getName}"))

val dataFile = r.getDataFile(shuffleId, mapId)
if (dataFile.exists()) {
+ val hash = JavaUtils.nonNegativeHash(dataFile.getName)
fallbackFileSystem.copyFromLocalFile(
new Path(dataFile.getAbsolutePath),
- new Path(fallbackPath, s"$appId/$shuffleId/${dataFile.getName}"))
+ new Path(fallbackPath, s"$appId/$shuffleId/$hash/${dataFile.getName}"))
}

// Report block statuses
@@ -86,7 +89,8 @@ private[storage] class FallbackStorage(conf: SparkConf) extends Logging {
}

def exists(shuffleId: Int, filename: String): Boolean = {
- fallbackFileSystem.exists(new Path(fallbackPath, s"$appId/$shuffleId/$filename"))
+ val hash = JavaUtils.nonNegativeHash(filename)
+ fallbackFileSystem.exists(new Path(fallbackPath, s"$appId/$shuffleId/$hash/$filename"))
}
}

@@ -168,7 +172,8 @@ private[spark] object FallbackStorage extends Logging {
}

val name = ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID).name
- val indexFile = new Path(fallbackPath, s"$appId/$shuffleId/$name")
+ val hash = JavaUtils.nonNegativeHash(name)
+ val indexFile = new Path(fallbackPath, s"$appId/$shuffleId/$hash/$name")
val start = startReduceId * 8L
val end = endReduceId * 8L
Utils.tryWithResource(fallbackFileSystem.open(indexFile)) { inputStream =>
@@ -178,7 +183,8 @@
index.skip(end - (start + 8L))
val nextOffset = index.readLong()
val name = ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID).name
- val dataFile = new Path(fallbackPath, s"$appId/$shuffleId/$name")
+ val hash = JavaUtils.nonNegativeHash(name)
+ val dataFile = new Path(fallbackPath, s"$appId/$shuffleId/$hash/$name")
val f = fallbackFileSystem.open(dataFile)
val size = nextOffset - offset
logDebug(s"To byte array $size")
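The change above inserts a hash-derived subdirectory between the shuffle id and the file name, so uploads fan out across multiple directories on the fallback storage instead of funneling every map output into one. A rough sketch of the resulting layout; the hash function below is a local stand-in, not the actual `JavaUtils.nonNegativeHash` implementation:

```scala
object FallbackPathSketch {
  // Stand-in for JavaUtils.nonNegativeHash: fold a possibly negative hashCode into [0, Int.MaxValue].
  private def nonNegativeHash(name: String): Int = {
    val h = name.hashCode
    if (h == Int.MinValue) 0 else math.abs(h)
  }

  // Mirrors the path pattern in the diff: appId/shuffleId/<hash>/<fileName>.
  def fallbackPathFor(appId: String, shuffleId: Int, fileName: String): String =
    s"$appId/$shuffleId/${nonNegativeHash(fileName)}/$fileName"

  def main(args: Array[String]): Unit = {
    // Example block file names; writers and readers must derive the same hash segment.
    Seq("shuffle_1_2_0.index", "shuffle_1_2_0.data").foreach { name =>
      println(fallbackPathFor("app-20211130", 1, name))
    }
  }
}
```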
10 changes: 5 additions & 5 deletions dev/create-release/release-build.sh
@@ -192,7 +192,7 @@ SCALA_2_12_PROFILES="-Pscala-2.12"
HIVE_PROFILES="-Phive -Phive-thriftserver"
# Profiles for publishing snapshots and release to Maven Central
# We use Apache Hive 2.3 for publishing
- PUBLISH_PROFILES="$BASE_PROFILES $HIVE_PROFILES -Phive-2.3 -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud"
+ PUBLISH_PROFILES="$BASE_PROFILES $HIVE_PROFILES -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud"
# Profiles for building binary releases
BASE_RELEASE_PROFILES="$BASE_PROFILES -Psparkr"

@@ -322,18 +322,18 @@ if [[ "$1" == "package" ]]; then
# 'python/pyspark/install.py' and 'python/docs/source/getting_started/install.rst'
# if you're changing them.
declare -A BINARY_PKGS_ARGS
- BINARY_PKGS_ARGS["hadoop3.3"]="-Phadoop-3 $HIVE_PROFILES"
+ BINARY_PKGS_ARGS["hadoop3.2"]="-Phadoop-3.2 $HIVE_PROFILES"
if ! is_dry_run; then
BINARY_PKGS_ARGS["without-hadoop"]="-Phadoop-provided"
BINARY_PKGS_ARGS["hadoop2.7"]="-Phadoop-2.7 $HIVE_PROFILES"
fi

declare -A BINARY_PKGS_EXTRA
- BINARY_PKGS_EXTRA["hadoop3.3"]="withpip,withr"
+ BINARY_PKGS_EXTRA["hadoop3.2"]="withpip,withr"

if [[ $PUBLISH_SCALA_2_13 = 1 ]]; then
- key="hadoop3.3-scala2.13"
- args="-Phadoop-3 $HIVE_PROFILES"
+ key="hadoop3.2-scala2.13"
+ args="-Phadoop-3.2 $HIVE_PROFILES"
extra=""
if ! make_binary_release "$key" "$SCALA_2_13_PROFILES $args" "$extra" "2.13"; then
error "Failed to build $key package. Check logs for details."
7 changes: 2 additions & 5 deletions dev/run-tests-jenkins.py
@@ -172,11 +172,8 @@ def main():
# Switch the Hadoop profile based on the PR title:
if "test-hadoop2.7" in ghprb_pull_title:
os.environ["AMPLAB_JENKINS_BUILD_PROFILE"] = "hadoop2.7"
if "test-hadoop3.3" in ghprb_pull_title:
os.environ["AMPLAB_JENKINS_BUILD_PROFILE"] = "hadoop3.3"
# Switch the Hive profile based on the PR title:
if "test-hive2.3" in ghprb_pull_title:
os.environ["AMPLAB_JENKINS_BUILD_HIVE_PROFILE"] = "hive2.3"
if "test-hadoop3.2" in ghprb_pull_title:
os.environ["AMPLAB_JENKINS_BUILD_PROFILE"] = "hadoop3.2"
# Switch the Scala profile based on the PR title:
if "test-scala2.13" in ghprb_pull_title:
os.environ["AMPLAB_JENKINS_BUILD_SCALA_PROFILE"] = "scala2.13"
29 changes: 4 additions & 25 deletions dev/run-tests.py
@@ -334,7 +334,7 @@ def get_hadoop_profiles(hadoop_version):

sbt_maven_hadoop_profiles = {
"hadoop2.7": ["-Phadoop-2.7"],
"hadoop3.3": ["-Phadoop-3"],
"hadoop3.2": ["-Phadoop-3.2"],
}

if hadoop_version in sbt_maven_hadoop_profiles:
@@ -345,24 +345,6 @@
sys.exit(int(os.environ.get("CURRENT_BLOCK", 255)))


- def get_hive_profiles(hive_version):
- """
- For the given Hive version tag, return a list of Maven/SBT profile flags for
- building and testing against that Hive version.
- """
- 
- sbt_maven_hive_profiles = {
- "hive2.3": ["-Phive-2.3"],
- }
- 
- if hive_version in sbt_maven_hive_profiles:
- return sbt_maven_hive_profiles[hive_version]
- else:
- print("[error] Could not find", hive_version, "in the list. Valid options",
- " are", sbt_maven_hive_profiles.keys())
- sys.exit(int(os.environ.get("CURRENT_BLOCK", 255)))
- 
- 
def build_spark_maven(extra_profiles):
# Enable all of the profiles for the build:
build_profiles = extra_profiles + modules.root.build_profile_flags
@@ -615,8 +597,7 @@ def main():
# to reflect the environment settings
build_tool = os.environ.get("AMPLAB_JENKINS_BUILD_TOOL", "sbt")
scala_version = os.environ.get("AMPLAB_JENKINS_BUILD_SCALA_PROFILE")
hadoop_version = os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE", "hadoop3.3")
hive_version = os.environ.get("AMPLAB_JENKINS_BUILD_HIVE_PROFILE", "hive2.3")
hadoop_version = os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE", "hadoop3.2")
test_env = "amplab_jenkins"
# add path for Python3 in Jenkins if we're calling from a Jenkins machine
# TODO(sknapp): after all builds are ported to the ubuntu workers, change this to be:
@@ -626,15 +607,13 @@
# else we're running locally or GitHub Actions.
build_tool = "sbt"
scala_version = os.environ.get("SCALA_PROFILE")
hadoop_version = os.environ.get("HADOOP_PROFILE", "hadoop3.3")
hive_version = os.environ.get("HIVE_PROFILE", "hive2.3")
hadoop_version = os.environ.get("HADOOP_PROFILE", "hadoop3.2")
if "GITHUB_ACTIONS" in os.environ:
test_env = "github_actions"
else:
test_env = "local"

- extra_profiles = get_hadoop_profiles(hadoop_version) + get_hive_profiles(hive_version) + \
- get_scala_profiles(scala_version)
+ extra_profiles = get_hadoop_profiles(hadoop_version) + get_scala_profiles(scala_version)

print("[info] Using build tool", build_tool, "with profiles",
*(extra_profiles + ["under environment", test_env]))
2 changes: 2 additions & 0 deletions dev/sparktestsupport/modules.py
@@ -464,6 +464,7 @@ def __hash__(self):
"pyspark.sql.tests.test_streaming",
"pyspark.sql.tests.test_types",
"pyspark.sql.tests.test_udf",
"pyspark.sql.tests.test_udf_profiler",
"pyspark.sql.tests.test_utils",
]
)
@@ -606,6 +607,7 @@ def __hash__(self):
"pyspark.pandas.namespace",
"pyspark.pandas.numpy_compat",
"pyspark.pandas.sql_processor",
"pyspark.pandas.sql_formatter",
"pyspark.pandas.strings",
"pyspark.pandas.utils",
"pyspark.pandas.window",
14 changes: 6 additions & 8 deletions dev/test-dependencies.sh
@@ -35,7 +35,7 @@ HADOOP_MODULE_PROFILES="-Phive-thriftserver -Pmesos -Pkubernetes -Pyarn -Phive \
MVN="build/mvn"
HADOOP_HIVE_PROFILES=(
hadoop-2.7-hive-2.3
- hadoop-3.3-hive-2.3
+ hadoop-3.2-hive-2.3
)

# We'll switch the version to a temp. one, publish POMs using that new version, then switch back to
@@ -84,22 +84,20 @@ $MVN -q versions:set -DnewVersion=$TEMP_VERSION -DgenerateBackupPoms=false > /de

# Generate manifests for each Hadoop profile:
for HADOOP_HIVE_PROFILE in "${HADOOP_HIVE_PROFILES[@]}"; do
- if [[ $HADOOP_HIVE_PROFILE == **hadoop-3.3-hive-2.3** ]]; then
- HADOOP_PROFILE=hadoop-3
- HIVE_PROFILE=hive-2.3
+ if [[ $HADOOP_HIVE_PROFILE == **hadoop-3.2-hive-2.3** ]]; then
+ HADOOP_PROFILE=hadoop-3.2
else
HADOOP_PROFILE=hadoop-2.7
- HIVE_PROFILE=hive-2.3
fi
echo "Performing Maven install for $HADOOP_HIVE_PROFILE"
- $MVN $HADOOP_MODULE_PROFILES -P$HADOOP_PROFILE -P$HIVE_PROFILE jar:jar jar:test-jar install:install clean -q
+ $MVN $HADOOP_MODULE_PROFILES -P$HADOOP_PROFILE jar:jar jar:test-jar install:install clean -q

echo "Performing Maven validate for $HADOOP_HIVE_PROFILE"
- $MVN $HADOOP_MODULE_PROFILES -P$HADOOP_PROFILE -P$HIVE_PROFILE validate -q
+ $MVN $HADOOP_MODULE_PROFILES -P$HADOOP_PROFILE validate -q

echo "Generating dependency manifest for $HADOOP_HIVE_PROFILE"
mkdir -p dev/pr-deps
- $MVN $HADOOP_MODULE_PROFILES -P$HADOOP_PROFILE -P$HIVE_PROFILE dependency:build-classpath -pl assembly -am \
+ $MVN $HADOOP_MODULE_PROFILES -P$HADOOP_PROFILE dependency:build-classpath -pl assembly -am \
| grep "Dependencies classpath:" -A 1 \
| tail -n 1 | tr ":" "\n" | awk -F '/' '{
# For each dependency classpath, we fetch the last three parts split by "/": artifact id, version, and jar name.
2 changes: 1 addition & 1 deletion docs/sql-ref-ansi-compliance.md
@@ -528,7 +528,7 @@ Below is a list of all the keywords in Spark SQL.
|ROW|non-reserved|non-reserved|reserved|
|ROWS|non-reserved|non-reserved|reserved|
|SCHEMA|non-reserved|non-reserved|non-reserved|
- |SCHEMAS|non-reserved|non-reserved|not a keyword|
+ |SCHEMAS|non-reserved|non-reserved|non-reserved|
|SECOND|non-reserved|non-reserved|non-reserved|
|SELECT|reserved|non-reserved|reserved|
|SEMI|non-reserved|strict-non-reserved|non-reserved|
@@ -387,11 +387,11 @@ private[kafka010] class KafkaOffsetReaderAdmin(

// Calculate offset ranges
val offsetRangesBase = untilPartitionOffsets.keySet.map { tp =>
- val fromOffset = fromPartitionOffsets.get(tp).getOrElse {
+ val fromOffset = fromPartitionOffsets.getOrElse(tp,
// This should not happen since topicPartitions contains all partitions not in
// fromPartitionOffsets
throw new IllegalStateException(s"$tp doesn't have a from offset")
- }
+ )
val untilOffset = untilPartitionOffsets(tp)
KafkaOffsetRange(tp, fromOffset, untilOffset, None)
}.toSeq
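This rewrite relies on `getOrElse`'s default argument being by-name: the `IllegalStateException` is only constructed when the partition really is missing, matching the old `.get(tp).getOrElse { ... }` block. A small standalone sketch with a fabricated offsets map:

```scala
object LazyDefaultSketch {
  def main(args: Array[String]): Unit = {
    // Fabricated partition-to-offset map; real code keys by TopicPartition.
    val fromPartitionOffsets = Map("topic-0" -> 42L)

    def fromOffset(tp: String): Long =
      // The default is evaluated lazily, so no exception is built for known partitions.
      fromPartitionOffsets.getOrElse(tp, throw new IllegalStateException(s"$tp doesn't have a from offset"))

    println(fromOffset("topic-0")) // 42
    try fromOffset("topic-1") catch {
      case e: IllegalStateException => println(e.getMessage)
    }
  }
}
```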
4 changes: 2 additions & 2 deletions hadoop-cloud/pom.xml
@@ -190,7 +190,7 @@

<profiles>
<!--
- hadoop-3 profile is activated by default so hadoop-2.7 profile
+ hadoop-3.2 profile is activated by default so hadoop-2.7 profile
also needs to be declared here for building with -Phadoop-2.7.
-->
<profile>
@@ -201,7 +201,7 @@
enables store-specific committers.
-->
<profile>
<id>hadoop-3</id>
<id>hadoop-3.2</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
7 changes: 1 addition & 6 deletions pom.xml
@@ -3349,15 +3349,10 @@
</profile>

<profile>
<id>hadoop-3</id>
<id>hadoop-3.2</id>
<!-- Default hadoop profile. Uses global properties. -->
</profile>

<profile>
<id>hive-2.3</id>
<!-- Default hive profile. Uses global properties. -->
</profile>

<profile>
<id>yarn</id>
<modules>