Commit ee1dd86
Merge branch 'trunk' into MAPREDUCE-7422
2 parents 7947a9d + 04b31d7

276 files changed: +9376 -3647 lines changed

BUILDING.txt

Lines changed: 84 additions & 21 deletions
@@ -492,39 +492,66 @@ Building on CentOS 8
 
 ----------------------------------------------------------------------------------
 
-Building on Windows
+Building on Windows 10
 
 ----------------------------------------------------------------------------------
 Requirements:
 
-* Windows System
+* Windows 10
 * JDK 1.8
-* Maven 3.0 or later
-* Boost 1.72
-* Protocol Buffers 3.7.1
-* CMake 3.19 or newer
-* Visual Studio 2010 Professional or Higher
-* Windows SDK 8.1 (if building CPU rate control for the container executor)
-* zlib headers (if building native code bindings for zlib)
+* Maven 3.0 or later (maven.apache.org)
+* Boost 1.72 (boost.org)
+* Protocol Buffers 3.7.1 (https://github.com/protocolbuffers/protobuf/releases)
+* CMake 3.19 or newer (cmake.org)
+* Visual Studio 2019 (visualstudio.com)
+* Windows SDK 8.1 (optional, if building CPU rate control for the container executor. Get this from
+  http://msdn.microsoft.com/en-us/windows/bg162891.aspx)
+* Zlib (zlib.net, if building native code bindings for zlib)
+* Git (preferably, get this from https://git-scm.com/download/win since the package also contains
+  Unix command-line tools that are needed during packaging).
+* Python (python.org, for generation of docs using 'mvn site')
 * Internet connection for first build (to fetch all Maven and Hadoop dependencies)
-* Unix command-line tools from GnuWin32: sh, mkdir, rm, cp, tar, gzip. These
-  tools must be present on your PATH.
-* Python ( for generation of docs using 'mvn site')
 
-Unix command-line tools are also included with the Windows Git package which
-can be downloaded from http://git-scm.com/downloads
+----------------------------------------------------------------------------------
 
-If using Visual Studio, it must be Professional level or higher.
-Do not use Visual Studio Express. It does not support compiling for 64-bit,
-which is problematic if running a 64-bit system.
+Building guidelines:
 
-The Windows SDK 8.1 is available to download at:
+Hadoop repository provides the Dockerfile for building Hadoop on Windows 10, located at
+dev-support/docker/Dockerfile_windows_10. It is highly recommended to use this and create the
+Docker image for building Hadoop on Windows 10, since you don't have to install anything else
+other than Docker and no additional steps are required in terms of aligning the environment with
+the necessary paths etc.
 
-http://msdn.microsoft.com/en-us/windows/bg162891.aspx
+However, if you still prefer taking the route of not using Docker, this Dockerfile_windows_10 will
+still be immensely useful as a raw guide for all the steps involved in creating the environment
+needed to build Hadoop on Windows 10.
 
-Cygwin is not required.
+Building using the Docker:
+We first need to build the Docker image for building Hadoop on Windows 10. Run this command from
+the root of the Hadoop repository.
+> docker build -t hadoop-windows-10-builder -f .\dev-support\docker\Dockerfile_windows_10 .\dev-support\docker\
+
+Start the container with the image that we just built.
+> docker run --rm -it hadoop-windows-10-builder
+
+You can now clone the Hadoop repo inside this container and proceed with the build.
+
+NOTE:
+While one may perceive the idea of mounting the locally cloned (on the host filesystem) Hadoop
+repository into the container (using the -v option), we have seen the build to fail owing to some
+files not being able to be located by Maven. Thus, we suggest cloning the Hadoop repository to a
+non-mounted folder inside the container and proceed with the build. When the build is completed,
+you may use the "docker cp" command to copy the built Hadoop tar.gz file from the docker container
+to the host filesystem. If you still would like to mount the Hadoop codebase, a workaround would
+be to copy the mounted Hadoop codebase into another folder (which doesn't point to a mount) in the
+container's filesystem and use this for building.
+
+However, we noticed no build issues when the Maven repository from the host filesystem was mounted
+into the container. One may use this to greatly reduce the build time. Assuming that the Maven
+repository is located at D:\Maven\Repository in the host filesystem, one can use the following
+command to mount the same onto the default Maven repository location while launching the container.
+> docker run --rm -v D:\Maven\Repository:C:\Users\ContainerAdministrator\.m2\repository -it hadoop-windows-10-builder
 
-----------------------------------------------------------------------------------
 Building:
 
 Keep the source code tree in a short path to avoid running into problems related

@@ -540,6 +567,24 @@ configure the bit-ness of the build, and set several optional components.
 Several tests require that the user must have the Create Symbolic Links
 privilege.
 
+To simplify the installation of Boost, Protocol buffers, OpenSSL and Zlib dependencies we can use
+vcpkg (https://github.com/Microsoft/vcpkg.git). Upon cloning the vcpkg repo, checkout the commit
+7ffa425e1db8b0c3edf9c50f2f3a0f25a324541d to get the required versions of the dependencies
+mentioned above.
+> git clone https://github.com/Microsoft/vcpkg.git
+> cd vcpkg
+> git checkout 7ffa425e1db8b0c3edf9c50f2f3a0f25a324541d
+> .\bootstrap-vcpkg.bat
+> .\vcpkg.exe install boost:x64-windows
+> .\vcpkg.exe install protobuf:x64-windows
+> .\vcpkg.exe install openssl:x64-windows
+> .\vcpkg.exe install zlib:x64-windows
+
+Set the following environment variables -
+(Assuming that vcpkg was checked out at C:\vcpkg)
+> set PROTOBUF_HOME=C:\vcpkg\installed\x64-windows
+> set MAVEN_OPTS=-Xmx2048M -Xss128M
+
 All Maven goals are the same as described above with the exception that
 native code is built by enabling the 'native-win' Maven profile. -Pnative-win
 is enabled by default when building on Windows since the native components

@@ -557,6 +602,24 @@ the zlib 1.2.7 source tree.
 
 http://www.zlib.net/
 
+
+Build command:
+The following command builds all the modules in the Hadoop project and generates the tar.gz file in
+hadoop-dist/target upon successful build. Run these commands from an
+"x64 Native Tools Command Prompt for VS 2019" which can be found under "Visual Studio 2019" in the
+Windows start menu. If you're using the Docker image from Dockerfile_windows_10, you'll be
+logged into "x64 Native Tools Command Prompt for VS 2019" automatically when you start the
+container.
+
+> set classpath=
+> set PROTOBUF_HOME=C:\vcpkg\installed\x64-windows
+> mvn clean package -Dhttps.protocols=TLSv1.2 -DskipTests -DskipDocs -Pnative-win,dist^
+ -Drequire.openssl -Drequire.test.libhadoop -Pyarn-ui -Dshell-executable=C:\Git\bin\bash.exe^
+ -Dtar -Dopenssl.prefix=C:\vcpkg\installed\x64-windows^
+ -Dcmake.prefix.path=C:\vcpkg\installed\x64-windows^
+ -Dwindows.cmake.toolchain.file=C:\vcpkg\scripts\buildsystems\vcpkg.cmake -Dwindows.cmake.build.type=RelWithDebInfo^
+ -Dwindows.build.hdfspp.dll=off -Dwindows.no.sasl=on -Duse.platformToolsetVersion=v142
+
 ----------------------------------------------------------------------------------
 Building distributions:
 
LICENSE-binary

Lines changed: 8 additions & 8 deletions
@@ -220,12 +220,12 @@ com.cedarsoftware:java-util:1.9.0
 com.cedarsoftware:json-io:2.5.1
 com.fasterxml.jackson.core:jackson-annotations:2.12.7
 com.fasterxml.jackson.core:jackson-core:2.12.7
-com.fasterxml.jackson.core:jackson-databind:2.12.7
+com.fasterxml.jackson.core:jackson-databind:2.12.7.1
 com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:2.12.7
 com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:2.12.7
 com.fasterxml.jackson.module:jackson-module-jaxb-annotations:2.12.7
 com.fasterxml.uuid:java-uuid-generator:3.1.4
-com.fasterxml.woodstox:woodstox-core:5.3.0
+com.fasterxml.woodstox:woodstox-core:5.4.0
 com.github.davidmoten:rxjava-extras:0.8.0.17
 com.github.stephenc.jcip:jcip-annotations:1.0-1
 com.google:guice:4.0

@@ -241,8 +241,8 @@ com.google.guava:guava:27.0-jre
 com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-guava
 com.microsoft.azure:azure-storage:7.0.0
 com.nimbusds:nimbus-jose-jwt:9.8.1
-com.squareup.okhttp3:okhttp:4.9.3
-com.squareup.okio:okio:1.6.0
+com.squareup.okhttp3:okhttp:4.10.0
+com.squareup.okio:okio:3.2.0
 com.zaxxer:HikariCP:4.0.3
 commons-beanutils:commons-beanutils:1.9.3
 commons-cli:commons-cli:1.2

@@ -310,7 +310,7 @@ org.apache.commons:commons-csv:1.9.0
 org.apache.commons:commons-digester:1.8.1
 org.apache.commons:commons-lang3:3.12.0
 org.apache.commons:commons-math3:3.6.1
-org.apache.commons:commons-text:1.9
+org.apache.commons:commons-text:1.10.0
 org.apache.commons:commons-validator:1.6
 org.apache.curator:curator-client:5.2.0
 org.apache.curator:curator-framework:5.2.0

@@ -362,7 +362,7 @@ org.ehcache:ehcache:3.3.1
 org.lz4:lz4-java:1.7.1
 org.objenesis:objenesis:2.6
 org.xerial.snappy:snappy-java:1.0.5
-org.yaml:snakeyaml:1.32
+org.yaml:snakeyaml:1.33
 org.wildfly.openssl:wildfly-openssl:1.0.7.Final
 
 

@@ -427,7 +427,7 @@ hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/bootstrap.min.js
 hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js
 hadoop-tools/hadoop-sls/src/main/html/css/bootstrap.min.css
 hadoop-tools/hadoop-sls/src/main/html/css/bootstrap-responsive.min.css
-hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/*
+hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.11.5/*
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jt/jquery.jstree.js
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/resources/TERMINAL

@@ -523,7 +523,7 @@ junit:junit:4.13.2
 HSQL License
 ------------
 
-org.hsqldb:hsqldb:2.5.2
+org.hsqldb:hsqldb:2.7.1
 
 
 JDOM License

LICENSE.txt

Lines changed: 1 addition & 1 deletion
@@ -252,7 +252,7 @@ hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/bootstrap.min.js
 hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js
 hadoop-tools/hadoop-sls/src/main/html/css/bootstrap.min.css
 hadoop-tools/hadoop-sls/src/main/html/css/bootstrap-responsive.min.css
-hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/*
+hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.11.5/*
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jt/jquery.jstree.js
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/resources/TERMINAL

hadoop-client-modules/hadoop-client-runtime/pom.xml

Lines changed: 3 additions & 0 deletions
@@ -148,6 +148,7 @@
             <!-- Leave javax APIs that are stable -->
             <!-- the jdk ships part of the javax.annotation namespace, so if we want to relocate this we'll have to care it out by class :( -->
             <exclude>com.google.code.findbugs:jsr305</exclude>
+            <exclude>io.netty:*</exclude>
             <exclude>io.dropwizard.metrics:metrics-core</exclude>
             <exclude>org.eclipse.jetty:jetty-servlet</exclude>
             <exclude>org.eclipse.jetty:jetty-security</exclude>

@@ -156,6 +157,8 @@
             <exclude>org.bouncycastle:*</exclude>
             <!-- Leave snappy that includes native methods which cannot be relocated. -->
             <exclude>org.xerial.snappy:*</exclude>
+            <!-- leave out kotlin classes -->
+            <exclude>org.jetbrains.kotlin:*</exclude>
           </excludes>
         </artifactSet>
         <filters>

hadoop-common-project/hadoop-common/pom.xml

Lines changed: 5 additions & 0 deletions
@@ -383,6 +383,11 @@
       <artifactId>mockwebserver</artifactId>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>com.squareup.okio</groupId>
+      <artifactId>okio-jvm</artifactId>
+      <scope>test</scope>
+    </dependency>
     <dependency>
       <groupId>dnsjava</groupId>
       <artifactId>dnsjava</artifactId>

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProvider.java

Lines changed: 2 additions & 1 deletion
@@ -639,13 +639,14 @@ public void invalidateCache(String name) throws IOException {
   public abstract void flush() throws IOException;
 
   /**
-   * Split the versionName in to a base name. Converts "/aaa/bbb/3" to
+   * Split the versionName in to a base name. Converts "/aaa/bbb@3" to
    * "/aaa/bbb".
    * @param versionName the version name to split
    * @return the base name of the key
    * @throws IOException raised on errors performing I/O.
   */
   public static String getBaseName(String versionName) throws IOException {
+    Objects.requireNonNull(versionName, "VersionName cannot be null");
     int div = versionName.lastIndexOf('@');
     if (div == -1) {
       throw new IOException("No version in key path " + versionName);
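
The patched getBaseName now null-checks its argument before splitting the version name at the last '@'. A self-contained sketch mirroring that logic (illustrative class name, not the actual Hadoop KeyProvider):

```java
import java.io.IOException;
import java.util.Objects;

public class GetBaseNameSketch {
    // Mirrors the patched KeyProvider.getBaseName: "/aaa/bbb@3" -> "/aaa/bbb".
    static String getBaseName(String versionName) throws IOException {
        // New guard from this commit: fail fast with NPE instead of a confusing one later.
        Objects.requireNonNull(versionName, "VersionName cannot be null");
        int div = versionName.lastIndexOf('@');
        if (div == -1) {
            throw new IOException("No version in key path " + versionName);
        }
        return versionName.substring(0, div);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(getBaseName("/aaa/bbb@3")); // prints /aaa/bbb
    }
}
```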

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileRange.java

Lines changed: 21 additions & 1 deletion
@@ -55,13 +55,33 @@ public interface FileRange {
    */
   void setData(CompletableFuture<ByteBuffer> data);
 
+  /**
+   * Get any reference passed in to the file range constructor.
+   * This is not used by any implementation code; it is to help
+   * bind this API to libraries retrieving multiple stripes of
+   * data in parallel.
+   * @return a reference or null.
+   */
+  Object getReference();
+
   /**
    * Factory method to create a FileRange object.
    * @param offset starting offset of the range.
    * @param length length of the range.
    * @return a new instance of FileRangeImpl.
    */
   static FileRange createFileRange(long offset, int length) {
-    return new FileRangeImpl(offset, length);
+    return new FileRangeImpl(offset, length, null);
+  }
+
+  /**
+   * Factory method to create a FileRange object.
+   * @param offset starting offset of the range.
+   * @param length length of the range.
+   * @param reference nullable reference to store in the range.
+   * @return a new instance of FileRangeImpl.
+   */
+  static FileRange createFileRange(long offset, int length, Object reference) {
+    return new FileRangeImpl(offset, length, reference);
   }
 }
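
The getReference contract added here simply round-trips an opaque caller-supplied object: the range never inspects it, it just hands it back so parallel readers can map results to, say, stripe descriptors. A self-contained sketch of that pattern (illustrative names, not the actual Hadoop classes):

```java
// Minimal sketch of a reference-carrying range, in the spirit of this commit.
// Class and field names are illustrative, not the Hadoop originals.
public class RangeReferenceSketch {
    static final class Range {
        private final long offset;
        private final int length;
        private final Object reference; // caller-supplied tag, e.g. a stripe descriptor

        Range(long offset, int length, Object reference) {
            this.offset = offset;
            this.length = length;
            this.reference = reference;
        }

        Object getReference() {
            return reference; // returned untouched; never used by the range itself
        }
    }

    public static void main(String[] args) {
        Object stripe = "stripe-7";
        Range range = new Range(1024, 4096, stripe);
        // The very same object comes back, letting the caller correlate results.
        System.out.println(range.getReference() == stripe); // prints true
    }
}
```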

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java

Lines changed: 33 additions & 0 deletions
@@ -1543,6 +1543,39 @@ public FSDataOutputStream append(Path f, int bufferSize) throws IOException {
   public abstract FSDataOutputStream append(Path f, int bufferSize,
       Progressable progress) throws IOException;
 
+  /**
+   * Append to an existing file (optional operation).
+   * @param f the existing file to be appended.
+   * @param appendToNewBlock whether to append data to a new block
+   * instead of the end of the last partial block
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   * (default).
+   * @return output stream.
+   */
+  public FSDataOutputStream append(Path f, boolean appendToNewBlock) throws IOException {
+    return append(f, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
+        IO_FILE_BUFFER_SIZE_DEFAULT), null, appendToNewBlock);
+  }
+
+  /**
+   * Append to an existing file (optional operation).
+   * This function is used for being overridden by some FileSystem like DistributedFileSystem
+   * @param f the existing file to be appended.
+   * @param bufferSize the size of the buffer to be used.
+   * @param progress for reporting progress if it is not null.
+   * @param appendToNewBlock whether to append data to a new block
+   * instead of the end of the last partial block
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   * (default).
+   * @return output stream.
+   */
+  public FSDataOutputStream append(Path f, int bufferSize,
+      Progressable progress, boolean appendToNewBlock) throws IOException {
+    return append(f, bufferSize, progress);
+  }
+
   /**
    * Concat existing files together.
    * @param trg the path to the target destination.
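
The new append overloads follow a delegate-by-default pattern: the base FileSystem implementation drops the appendToNewBlock flag and falls back to the classic append, while a subclass such as DistributedFileSystem can override the four-argument form to honor it. A standalone sketch of that pattern (hypothetical class names, plain strings standing in for output streams):

```java
// Sketch of the delegation pattern behind the new append overloads.
// Names are illustrative, not the Hadoop classes themselves.
public class AppendOverloadSketch {
    static class BaseFs {
        String append(String path, int bufferSize) {
            return "append(" + path + ", " + bufferSize + ")";
        }
        // New overload: the default implementation ignores the flag.
        String append(String path, int bufferSize, boolean appendToNewBlock) {
            return append(path, bufferSize);
        }
    }

    static class DistributedFs extends BaseFs {
        @Override
        String append(String path, int bufferSize, boolean appendToNewBlock) {
            // Only the subclass gives the flag a meaning.
            return appendToNewBlock
                ? "append " + path + " into a fresh block"
                : append(path, bufferSize);
        }
    }

    public static void main(String[] args) {
        BaseFs fs = new DistributedFs();
        System.out.println(fs.append("/f", 4096, true));        // prints: append /f into a fresh block
        System.out.println(new BaseFs().append("/f", 4096, true)); // prints: append(/f, 4096)
    }
}
```

The upside of this shape is that existing FileSystem subclasses keep compiling and behaving unchanged; only filesystems that can actually start a fresh block need to override anything.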

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/CombinedFileRange.java

Lines changed: 2 additions & 2 deletions
@@ -29,10 +29,10 @@
  * together into a single read for efficiency.
  */
 public class CombinedFileRange extends FileRangeImpl {
-  private ArrayList<FileRange> underlying = new ArrayList<>();
+  private List<FileRange> underlying = new ArrayList<>();
 
   public CombinedFileRange(long offset, long end, FileRange original) {
-    super(offset, (int) (end - offset));
+    super(offset, (int) (end - offset), null);
     this.underlying.add(original);
   }
 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/FileRangeImpl.java

Lines changed: 18 additions & 1 deletion
@@ -34,9 +34,21 @@ public class FileRangeImpl implements FileRange {
   private int length;
   private CompletableFuture<ByteBuffer> reader;
 
-  public FileRangeImpl(long offset, int length) {
+  /**
+   * nullable reference to store in the range.
+   */
+  private final Object reference;
+
+  /**
+   * Create.
+   * @param offset offset in file
+   * @param length length of data to read.
+   * @param reference nullable reference to store in the range.
+   */
+  public FileRangeImpl(long offset, int length, Object reference) {
     this.offset = offset;
     this.length = length;
+    this.reference = reference;
   }
 
   @Override

@@ -71,4 +83,9 @@ public void setData(CompletableFuture<ByteBuffer> pReader) {
   public CompletableFuture<ByteBuffer> getData() {
     return reader;
   }
+
+  @Override
+  public Object getReference() {
+    return reference;
+  }
 }
