-
Couldn't load subscription status.
- Fork 9.1k
HADOOP-18883. [ABFS]: Expect-100 JDK bug resolution: prevent multiple server calls #6022
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
29d08d5
4847c6a
11e4c2d
afb5860
2835269
b15468f
ef871b3
01d509e
75c722a
bd38659
01cc8eb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -69,6 +69,7 @@ public final class AbfsHttpConstants { | |
| * and should qualify for retry. | ||
| */ | ||
| public static final int HTTP_CONTINUE = 100; | ||
| public static final String EXPECT_100_JDK_ERROR = "Server rejected operation"; | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wont the JDK bug were redundant connections are attempted be an issue for any exception case in getOutputStream() ? Shouldn't the case to handle be irrespective of explicit server rejected request case ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For any IOException thrown by getOutputStream, no headers / inputStream will be parsed. The flow of code is such that for IOException other expect-100 error, exception will be thrown back to AbfsRestOperation which will retry again as per retry-policy. |
||
|
|
||
| // Abfs generic constants | ||
| public static final String SINGLE_WHITE_SPACE = " "; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,12 +22,14 @@ | |
| import java.io.InputStream; | ||
| import java.io.OutputStream; | ||
| import java.net.HttpURLConnection; | ||
| import java.net.ProtocolException; | ||
| import java.net.URL; | ||
| import java.util.List; | ||
|
|
||
| import javax.net.ssl.HttpsURLConnection; | ||
| import javax.net.ssl.SSLSocketFactory; | ||
|
|
||
| import org.apache.hadoop.classification.VisibleForTesting; | ||
| import org.apache.hadoop.fs.azurebfs.utils.UriUtils; | ||
| import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory; | ||
|
|
||
|
|
@@ -43,6 +45,7 @@ | |
| import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable; | ||
| import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema; | ||
|
|
||
| import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR; | ||
| import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE; | ||
| import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT; | ||
|
|
||
|
|
@@ -83,6 +86,7 @@ public class AbfsHttpOperation implements AbfsPerfLoggable { | |
| private long sendRequestTimeMs; | ||
| private long recvResponseTimeMs; | ||
| private boolean shouldMask = false; | ||
| private boolean connectionDisconnectedOnError = false; | ||
|
|
||
| public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult( | ||
| final URL url, | ||
|
|
@@ -324,14 +328,26 @@ public void sendRequest(byte[] buffer, int offset, int length) throws IOExceptio | |
| */ | ||
| outputStream = getConnOutputStream(); | ||
| } catch (IOException e) { | ||
| /* If getOutputStream fails with an exception and expect header | ||
| is enabled, we return back without throwing an exception to | ||
| the caller. The caller is responsible for setting the correct status code. | ||
| If expect header is not enabled, we throw back the exception. | ||
| connectionDisconnectedOnError = true; | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. setting this field here and using in processResponse() means that we won't be processing response for any IOException. But isn't the intent to not process only in case of JDK error? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We want to prevent any further API called on the |
||
| /* If getOutputStream fails with an expect-100 exception , we return back | ||
| without throwing an exception to the caller. Else, we throw back the exception. | ||
| */ | ||
| String expectHeader = getConnProperty(EXPECT); | ||
| if (expectHeader != null && expectHeader.equals(HUNDRED_CONTINUE)) { | ||
| if (expectHeader != null && expectHeader.equals(HUNDRED_CONTINUE) | ||
| && e instanceof ProtocolException | ||
| && EXPECT_100_JDK_ERROR.equals(e.getMessage())) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As mentioned in an earlier comment, the HttpUrlConnection that we hold currently is disconnected while inside this catch block. Do we want to prevent later API calls that trigger connections irrespective of throttled failures ? If so, setting the status of connectionDisconnectedOnError should be at the start of catch block ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Taken. The setting of connectionDisconnectedOnError in starting of the catch block. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess the question by @snvijaya is Do we want to prevent later API calls that trigger connections irrespective of any failures? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At In the JDK's implementation of Also, the implementation of
Hence, any further API is prevented on the HttpUrlConnection object which has got an IOException in getOutputStream. |
||
| LOG.debug("Getting output stream failed with expect header enabled, returning back ", e); | ||
| /* | ||
| * In case expect-100 assertion has failed, headers and inputStream should not | ||
| * be parsed. Reason being, conn.getHeaderField(), conn.getHeaderFields(), | ||
| * conn.getInputStream() will lead to repeated server call. | ||
| * ref: https://bugs.openjdk.org/browse/JDK-8314978. | ||
| * Reading conn.responseCode() and conn.getResponseMessage() is safe in | ||
| * case of Expect-100 error. Reason being, in JDK, it stores the responseCode | ||
| * in the HttpUrlConnection object before throwing exception to the caller. | ||
| */ | ||
| this.statusCode = getConnResponseCode(); | ||
| this.statusDescription = getConnResponseMessage(); | ||
| return; | ||
| } else { | ||
| LOG.debug("Getting output stream failed without expect header enabled, throwing exception ", e); | ||
|
|
@@ -364,7 +380,17 @@ public void sendRequest(byte[] buffer, int offset, int length) throws IOExceptio | |
| * @throws IOException if an error occurs. | ||
| */ | ||
| public void processResponse(final byte[] buffer, final int offset, final int length) throws IOException { | ||
| if (connectionDisconnectedOnError) { | ||
| LOG.debug("This connection was not successful or has been disconnected, " | ||
| + "hence not parsing headers and inputStream"); | ||
| return; | ||
| } | ||
| processConnHeadersAndInputStreams(buffer, offset, length); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the method name can remain same as before. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did this to make it more testable. This is to abstract out the logic which actually reads the responseHeaders and inputStream. And in future, if some more logic would need to be added, that can be added in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it. |
||
| } | ||
|
|
||
| void processConnHeadersAndInputStreams(final byte[] buffer, | ||
| final int offset, | ||
| final int length) throws IOException { | ||
| // get the response | ||
| long startTime = 0; | ||
| startTime = System.nanoTime(); | ||
|
|
@@ -608,6 +634,11 @@ String getConnResponseMessage() throws IOException { | |
| return connection.getResponseMessage(); | ||
| } | ||
|
|
||
| @VisibleForTesting | ||
| Boolean getConnectionDisconnectedOnError() { | ||
| return connectionDisconnectedOnError; | ||
| } | ||
|
|
||
| public static class AbfsHttpOperationWithFixedResult extends AbfsHttpOperation { | ||
| /** | ||
| * Creates an instance to represent fixed results. | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this error string be returned by server only for this exception ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exception message thrown by JDK and not server: https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/net/www/protocol/http/HttpURLConnection.java#L1364