[Backend] Refactor ipynb kernel messages serialization #436
Conversation
ui/src/components/nodes/Code.tsx
Outdated
```tsx
<Box
  component="pre"
  whiteSpace="pre-wrap"
  key={i + 1}
```
runtime/src/zmq-utils.ts
Outdated
```ts
// There's no exec_count in display_data, thus we pass in the session exec_count
count: exec_count,
```
Could we pass in 0? This value doesn't seem useful, so I don't feel maintaining a session exec_count is necessary; it increases the logic complexity.
I think the count in `execute_result` is enough.
> I think the count in `execute_result` is enough.

The tricky part is that `execute_result` is not always available; instead, IIUC, `execute_reply` would be the last message in each cell run. The key is how to update the count properly; the logic would live on either the frontend or the backend.
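The idea above could be sketched roughly like this. The message types and the handler below are hypothetical illustrations, not the actual PR code; the real message shapes follow the Jupyter messaging protocol, where `execute_reply` closes every cell run and `execute_result` appears only when the cell produces a value:

```typescript
// Hypothetical sketch: take the count from execute_reply (always the last
// message of a run) or execute_result (when present); stream messages carry
// no execution_count at all.
type KernelMsg =
  | { msg_type: "execute_result"; content: { execution_count: number } }
  | { msg_type: "execute_reply"; content: { execution_count: number } }
  | { msg_type: "stream"; content: { name: "stdout" | "stderr"; text: string } };

function resolveExecCount(msgs: KernelMsg[]): number | null {
  let count: number | null = null;
  for (const m of msgs) {
    if (m.msg_type === "execute_result" || m.msg_type === "execute_reply") {
      count = m.content.execution_count;
    }
  }
  return count;
}
```

Because `execute_reply` is always delivered, this avoids maintaining a separate session counter.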
I think we can wait until the final execution result before setting the count. This should be consistent with Jupyter's behavior?
> I think we can wait until the final execution result before setting the count. This should be consistent with Jupyter's behavior?

Yes, I think we can do it in the frontend as well.
I'm not sure that maintaining `session_exec_count` ourselves is accurate. I'd vote for using `msgs.content.execution_count`, because it is always accurate and seems enough for the purpose of showing a count.
```ts
text?: string;
count: number;
image?: string;
}[];
```
I am not confident about changing this. It is a breaking change: the old values in the DB need a migration to work with the new code. If we really want to change this, we need to supply a DB migration script or procedure.
I think the issues you are trying to fix are:
1. the order of stderr
2. being able to display multiple images (at the end, not mixed with the stdout/stderr streams)

I think you can fix (1) without introducing this schema change. For (2), I'd suggest skipping it for now; it's not crucial.
Also, using an array to store streams doesn't sound like a good idea. The stream can come at any granularity, e.g., commonly line-by-line because `\n` flushes the stream in most languages, or users may call `flush()` manually.
I believe "stream" output is supposed to be concatenated upon receiving.
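A minimal sketch of that concatenation idea (the names here are made up for illustration, not the actual PR code): consecutive chunks from the same stream get merged into one segment, so the stdout/stderr interleaving order is preserved without keeping every tiny chunk as its own array entry:

```typescript
// Hypothetical sketch: merge stream chunks as they arrive. A chunk joins the
// previous segment when the stream name matches; otherwise a new segment
// starts. Flush granularity no longer matters, only the ordering.
interface Segment {
  name: "stdout" | "stderr";
  text: string;
}

function pushChunk(segments: Segment[], chunk: Segment): Segment[] {
  const last = segments[segments.length - 1];
  if (last && last.name === chunk.name) {
    last.text += chunk.text; // same stream: concatenate in place
  } else {
    segments.push({ ...chunk }); // stream switched: start a new segment
  }
  return segments;
}
```

This would also address the stderr-ordering issue (1) without storing one entry per raw message.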
In summary, I suggest fixing only issue (1) here, with a minimal change so that we don't have to worry about migration. Forget about (2); it's not important.
It's better to open another PR to fix issue 1, and leave this PR for future reference.
> Also, using an array to store streams doesn't sound like a good idea. The stream can come at any granularity, e.g., commonly line-by-line because `\n` flushes the stream in most languages, or users may call `flush()` manually. I believe "stream" output is supposed to be concatenated upon receiving.

So the change here is to put all the returned messages from the kernel in an array, rather than manually separating each message and concatenating the `text` field. The ipynb kernel would decide how to concatenate each line's execution.
> I am not confident about changing this. It is a breaking change: the old values in the DB need a migration to work with the new code. If we really want to change this, we need to supply a DB migration script or procedure.
> I think the issues you are trying to fix are:
> 1. the order of stderr
> 2. being able to display multiple images (at the end, not mixed with the stdout/stderr streams)
>
> I think you can fix (1) without introducing this schema change. For (2), I'd suggest skipping it for now; it's not crucial.

IIUC, the JSON format change will not render the `result` field correctly in existing repos. In that case, will a re-execution of each pod overwrite the result field?
Yes, you are right: re-execution will fix the result field. The only thing that breaks is the existing results, which might not be important enough to warrant a migration, especially at this early release point.
What do you say about these two issues? @senwang86 The rest of the code looks good to me.
These 2 issues are addressed in ab4818c; can you give it a test?
ui/src/components/nodes/Code.tsx
Outdated
```tsx
  return <></>;
}
default:
  return <></>;
```
The console warning is still there, caused by these two lines. Adding `key={combineKey}` will fix it.
I see, SG.
Cool, thanks!
Summary
Currently, the kernel messages are concatenated and stored in a single object, i.e., `pod.result`. This concatenation behavior creates a few discrepancies with Colab regarding the output results, e.g., it can't produce multiple plots in a single pod, and the order of line execution might confuse users (see screenshots in the Test section).
Test
Before
After
Follow-up
`ResultBlock` in `Code.tsx` needs more tuning.