Errors propagate through transaction #247
Conversation
def _raise_transaction_closed(self):
-   errors = self._bidirectional_stream.drain_errors()
+   errors = self._bidirectional_stream.get_errors()
new name, aligned with client-nodejs
def done(self, request_id: UUID):
    self._response_collector.remove(request_id)
we now clean up completed single/stream requests from the response collectors, so we don't propagate and print errors for every old operation that the transaction handled but which isn't active anymore!
error = TypeDBClientException.of_rpc(e)
self.close(error)
raise error
changed this so that we propagate our own exception instead of the gRPC error throughout the transaction queues
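For illustration, a minimal sketch of the fan-out this enables; the helper name is hypothetical, and the attribute names are taken from the hunks in this PR rather than the exact implementation:

```python
# Sketch: wrap the gRPC error once and record it on every per-request collector,
# so any later get() in this transaction raises our own exception type rather
# than a raw RpcError.
def _handle_stream_error(self, e):            # hypothetical helper name
    error = TypeDBClientException.of_rpc(e)   # wrap the gRPC RpcError
    with self._collectors_lock:
        for collector in self._collectors.values():
            collector.close(error)            # each collector remembers the error
    raise error
```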
return self._response_collector.get_errors()

-   def close(self, error: RpcError = None):
+   def close(self, error: TypeDBClientException = None):
new type: use our own exception everywhere
def get(self) -> T:
-   return self._stream.fetch(self._request_id)
+   value = self._stream.fetch(self._request_id)
+   self._stream.done(self._request_id)
a Single can immediately remove itself from the transaction stream when the user retrieves the value via get()
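As a sketch, the full method presumably ends by returning the fetched value; the final `return` line is an assumption, since the hunk above is truncated:

```python
def get(self) -> T:
    value = self._stream.fetch(self._request_id)  # block until the single response arrives
    self._stream.done(self._request_id)           # eagerly remove this request from the stream
    return value                                  # assumed final line, not shown in the hunk
```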
collector.close(error)

-   def drain_errors(self) -> [RpcError]:
+   def get_errors(self) -> [TypeDBClientException]:
new error typing
error = collector.get_error()
if error is not None:
    errors.append(error)
get error from each collector queue
self._error: TypeDBClientException = None

def get(self, block: bool) -> R:
    response = self._response_queue.get(block=block)
-   if response.message:
+   if response.is_response():
        return response.message
-   elif response.error:
-       raise TypeDBClientException.of_rpc(response.error)
-   else:
+   elif response.is_done() and self._error is None:
        raise TypeDBClientException.of(TRANSACTION_CLOSED)
+   elif response.is_done() and self._error is not None:
+       raise TypeDBClientException.of(TRANSACTION_CLOSED_WITH_ERRORS, self._error)
+   else:
+       raise TypeDBClientException.of(ILLEGAL_STATE)
same as in client-nodejs: we now return the recorded error, and don't put the error in the Done queue slot
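A sketch of the queue-side counterpart this implies (shape assumed, mirroring the hunks above): the error is stored once on the queue, and the Done marker, which no longer carries an error, just unblocks readers.

```python
def close(self, error: Optional[TypeDBClientException]):
    self._error = error               # remembered for every later get() call
    self._response_queue.put(Done())  # Done no longer carries the error; it only unblocks the reader
```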
typedb/stream/response_collector.py (Outdated)
    def is_response(self):
        return True

    def is_done(self):
        return False


class Done:

-   def __init__(self, error: Optional[RpcError]):
-       self.error = error
+   def __init__(self):
+       pass

    def is_response(self):
        return False

    def is_done(self):
        return True
is this the right way to do this in python??
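If the `is_response()`/`is_done()` marker methods feel un-Pythonic, one common alternative is plain marker classes with `isinstance` checks at the call site; a small self-contained sketch of that option (not what this PR does):

```python
# Hypothetical alternative: no marker methods, just isinstance() dispatch.
import queue

class Response:
    def __init__(self, message):
        self.message = message

class Done:
    pass

q = queue.Queue()
q.put(Response("hello"))
q.put(Done())

while True:
    item = q.get()
    if isinstance(item, Response):
        print(item.message)
    elif isinstance(item, Done):
        break
```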
def __next__(self) -> transaction_proto.Transaction.ResPart:
    if not self._has_next():
        self._bidirectional_stream.done(self._request_id)
we also let the transaction stream know this query stream is done so it can be removed
typedb/stream/response_collector.py (Outdated)
-   def close(self, error: Optional[RpcError]):
+   def remove(self, request_id: UUID):
        with self._collectors_lock:
            del self._collectors[request_id]
Something feels awkward about this - not the fault of the PR but rather the existing code. We have a class named ResponseCollector, which contains an object named _collectors. This is simply illogical since the type of those collectors is not ResponseCollector. Do you have any idea how we might improve the terminology?
It's a fair point, I think the idea is to see the ResponseCollector as a single thing - which means the naming inside is off - renamed to _response_queues.
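A sketch of what the renamed internals might look like (names other than `_response_queues` are assumptions):

```python
from threading import Lock
from typing import Dict
from uuid import UUID

class ResponseCollector:
    def __init__(self):
        # renamed from _collectors: one response queue per in-flight request id,
        # rather than nested "collectors" inside a ResponseCollector
        self._response_queues: Dict[UUID, "ResponseCollector.Queue"] = {}
        self._lock = Lock()  # lock name assumed
```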
Review Summary:
## What is the goal of this PR?

We no longer delete response collectors in a transaction after receiving a response to a "single" request, or receiving a "DONE" message in a stream. This fixes a possible error when loading 50+ answers in one query and then performing a second query.

## What are the changes implemented in this PR?

We had previously added code to clean up used response collectors in #247. But this broke in the scenario where we open a transaction, run a query that loads 51 answers (the prefetch size + 1), and then run a second query. The server would respond to the first query with: 50 answers -> CONTINUE -> 1 answer [compensating for latency] -> DONE. The client would respond to CONTINUE with STREAM to keep iterating, and the server would respond to STREAM with a 2nd DONE message. The iterator for query 1 finishes as soon as we see the first DONE message, so we stop reading responses at that point, meaning the second DONE may never be read by the client.

But opening the iterator for query 2 causes us to continue reading messages from the transaction stream - note that we have no control over which request is being "currently served"; all responses use the same pipeline, the same gRPC stream. That's why we have the Response Collectors - when we get a response for a request that is different to the request we actually asked for, we need to store it in its respective Collector bucket.

We could mitigate the issue by patching the server, but its current behaviour is actually pretty intuitive - if you send it a STREAM request and it has no more answers, it responds with DONE. We could change it to not respond at all, but that would be adding complexity where it is not really necessary to do so. So instead, we're reverting back to the old client behaviour, where the response collectors follow the lifetime of the Transaction, noting that Transactions are typically short-lived so cleanup will be performed in a timely manner anyway.
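To make the failure mode concrete, here is a runnable toy simulation of the routing described above (no gRPC; the names and messages are purely illustrative):

```python
# Toy model: one shared response stream routes messages into per-request queues.
from queue import Queue

response_queues = {}   # request_id -> Queue, like the client's per-request collectors

def route(req_id, msg):
    q = response_queues.get(req_id)
    if q is None:
        # a response arrived for a request whose queue was already deleted
        raise KeyError(f"no collector for request {req_id!r}: {msg!r}")
    q.put(msg)

response_queues["query-1"] = Queue()
route("query-1", "DONE")          # first DONE: the query-1 iterator stops reading here
del response_queues["query-1"]    # eager cleanup of the finished request (the behaviour being reverted)

try:
    route("query-1", "DONE")      # the late second DONE, read while serving query 2
except KeyError as e:
    print("late DONE has nowhere to go:", e)
```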
## What is the goal of this PR?

Revert previous changes from #247 and #248, which made query queues and iterators throw the same error idempotently. However, this goes counter to standard usage of iterators and queues, which are not meant to behave idempotently (each item is only returned once, and if they have an error they should no longer be used).

## What are the changes implemented in this PR?

* remove idempotent error state of collectors and queues, which back query iterators
* note that we still store the error on the transaction bidirectional stream, in case the server throws an exception when there are no query iterators active

Note: mirrors change from typedb/typedb-driver#372
What is the goal of this PR?
We align with Client NodeJS version 2.6.1 (mostly the work in typedb/typedb-driver-nodejs#197), which implements a better error propagation mechanism: when an exception occurs, we store it against all of the transaction's active transmit queues, so it can be retrieved whenever the user tries to perform any further operation on the transaction.
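Roughly, the behaviour this gives the user; this is a sketch only, where the queries, the import path, and the assumption that the first query triggers a server-side error are all illustrative:

```python
# Assumes `transaction` is an open TypeDB transaction.
from typedb.common.exception import TypeDBClientException  # import path assumed

try:
    list(transaction.query().match("match $x isa not-a-defined-type;"))  # assume the server rejects this
except TypeDBClientException as first_error:
    print("first failure:", first_error)

try:
    list(transaction.query().match("match $y isa person;"))  # any later operation on the transaction
except TypeDBClientException as later_error:
    print("propagated:", later_error)  # the recorded error is surfaced again
```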
What are the changes implemented in this PR?