Integrate the pipe() syscall implementation originally written by @cynecx #4935

atrosinenko · 2017-02-11T15:37:10Z

This pull request is based on the PR #4378 by @cynecx. I am not a JavaScript developer and I am not familiar with the Emscripten FS layer, though, the majority of PIPEFS code was already written by @cynecx and even reviewed there to some extent. The problem with that PR was the lack of tests. I have added some tests and the tests revealed some problems with the original implementation that I have fixed.

There still may be some performance issue with the (common?) use case with writing a few bytes, and then reading these bytes when implementing events, since every read() call seems to pop an almost empty 8kb buffer, and write() will allocate a new one, so It may generate some excessive garbage.

Another peculiarity of this code is that the read end of the pipe always behaves as if it is non-blocking -- is it currently possible to block the read operation at all?

atrosinenko · 2017-02-11T15:41:43Z

src/library_pipefs.js

        assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));
-        var data = new Uint8Array(buffer);
-        data = buffer.subarray(offset, offset + length);
+        var data = buffer.subarray(offset, offset + length);


The previous code (and similar code in write) looks like an incorrectly written conversion to the byte-sized buffer. Is it safe to assume that the buffer is always already byte-sized?

atrosinenko · 2017-02-11T15:46:40Z

Unfortunately my source code comment is not shown anyway, so is it safe to assume that the buffer argument to read and write operations is always byte-sized (an Uint8Array)?

cynecx · 2017-02-11T16:48:53Z

I think the input buffer type (for read) must be a type of ArrayBuffer and not somekind of view, otherwise the whole read function wouldn't actually read something as we would read into a newly created buffer.

new Uint8Array(buffer); // Where buffer instanceof ArrayBuffer => The resulting Uint8Array is a view on an existing buffer.
new Uint8Array(buffer); // Where buffer instanceof Uint8Array || ArrayBuffer.isView(buffer) => The resulting Uint8Array is a copy

I think we should change this line:

assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));

to

assert(buffer instanceof ArrayBuffer);
var data = new Uint8Array(buffer); // Create a view of an ArrayBuffer

However, if we also want to accept typed views and ArrayBuffer we should do something like:

assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));
buffer = new Uint8Array(
    ArrayBuffer.isView(buffer) ? buffer.buffer : buffer);

cynecx · 2017-02-11T19:10:17Z

src/library_pipefs.js

+            toRemove++;
+          }
+        }
+


I think we could add something like this to prevent deallocating all buffers and leave one bucket alive:

if (toRemove && toRemove == pipe.buckets.length) { toRemove--; pipe.buckets[toRemove].offset = 0; pipe.buckets[toRemove].roffset = 0; }

Thank you @cynecx, integrated this code too :)

atrosinenko · 2017-02-12T12:28:22Z

@cynecx changing the line

assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));

to

assert(buffer instanceof ArrayBuffer);

in either read or write makes assert fail.

atrosinenko · 2017-02-12T12:45:40Z

Meanwhile, I ran some tests and at least test_freetype (test_core.default) and test_poppler (test_core.default) are failing. But when I switch to the incoming branch (70d3702), they fail too.

cynecx · 2017-02-12T13:25:08Z

@atrosinenko Could you console.log() the buffer?

atrosinenko · 2017-02-12T13:49:26Z

When ran under Node, console.log() tries to print a lot in this test (since read and write are called many times), but when I add something like console.log(buffer.byteLength) just before asserts, it prints 16777216. I think, we should wait until @kripken's comment about FS design.

juj · 2017-02-13T11:28:33Z

src/library_pipefs.js

@@ -0,0 +1,208 @@
+mergeInto(LibraryManager.library, {
+  $PIPEFS__postset: '__ATINIT__.push(function() { PIPEFS.root = FS.mount(PIPEFS, {}, null); });',


Hmm, this is actually a bit suspect: I don't think we have any type of ordering guaranteed in our JS dependencies, so calling FS.mount() here as a postset to PIPEFS might not guarantee that FS.staticInit() would have been called before. I think if it does, then it's due to 'f' < 'p' alphabetically by accident?

I think, it may be worth looking at sockfs -- it seems to have the same dependency.

That is a good observation.

Reading closer, it looks like FS.init.initialized is not needed to be true in order to be able to FS.mount() (FS does that itself in FS.staticInit() as well) However FS.staticInit() must have been called before FS.mount() can. I think if this does not hold, then errors will naturally manifest by the absence of FS.nameTable, so things should be good here.

juj · 2017-02-13T11:32:14Z

src/library_pipefs.js

+  $PIPEFS: {
+    BUCKET_BUFFER_SIZE: 1024 * 8, // 8KiB Buffer
+    mount: function (mount) {
+      return FS.createNode(null, '/', {{{ cDefine('S_IFDIR') }}} | 511 /* 0777 */, 0);


This line looks odd, it's mounting the root directory of the whole filesystem to be PIPEFS - does this replace MEMFS altogether?

Looks like the same code is used in sockfs -- is it currently supported?

SOCKFS implements BSD sockets support via a WebSockets backend, it is currently supported. Reading closer, this is not mounting PIPEFS to the root of the filesystem, this is calling FS.createNode (and not FS.mount) to create the filesystem root directory. This should be redundant, the root directory should exist, and this actually would create multiple versions of that node. Does PIPEFS.mount get called from anywhere? It can be removed if not.

PIPEFS.mount is called from $PIPEFS__postset through FS.mount. But it looks like this createNode creates kind of pseudo-root directory for PIPEFS not linked to the FS.root. Maybe it is OK?

Oh right, missed that line. Yeah, that is a bit odd since it creates multiple nodes that have the identical path /. Would instead here return FS.lookupPath('/'); work instead of return FS.createNode(...);? That would avoid creating duplicate root nodes. If it doesn't work for some reason, then that's ok if we document this peculiar behavior to understand why. (I know we do have the same proprty in the existing SOCKFS, would be good to figure that out too)

I'm not sure, that it creates identical root nodes: when null is passed as parent to createNode, as far as I understand, it creates a node that is parent of itself. And since all nodes have sequential inode numbers, then this pseudo-root node should have different parent inode compared to the "real" root directory. Looks like the only way new node is registered in createNode is through FS.hashAddNode(...) and all lookups through FS.nameTable are performed with comparison by name and parent id. Though, it seems that any call to FS.unmount will remove nodes, that are not linked to FS.root. Meanwhile FS.mount has special treatment for pseudo-mounts.

Oh, I probably misunderstood FS.unmount() -- it should not remove such pseudo mounts.

juj · 2017-02-13T11:36:57Z

src/library_pipefs.js

+
+        pipe.buckets.forEach(function (bucket) {
+          currentLength += bucket.offset - bucket.roffset;
+        });


I see there is a comment below about optimising to not generate extra garbage. In case that is interesting, this forEach construct also generates temporary garbage by creating an iterator object and a temporary function that gets nuked afterwards. The second iteration over the buckets below is using a for (var i = 0; i < pipe.buckets.length; i++) { construct. That is a nice and garbage-free way to iterate over an array.

juj · 2017-02-13T11:39:37Z

src/library_pipefs.js

+
+        while (toRemove--) {
+          pipe.buckets.shift();
+        }


A naive implementation of .shift() in a browser could be of linear complexity, so this while loop can end up being quadratic complexity. Calling pipe.buckets.splice(0, toRemove); would achieve the same with better performance?

juj · 2017-02-13T11:42:22Z

src/library_pipefs.js

+          data = data.subarray(freeBytesInCurrBuffer, data.byteLength);
+        }
+
+        var numBuckets = ~~(data.byteLength / PIPEFS.BUCKET_BUFFER_SIZE);


Interesting to see ~~ to truncate down to integer. Generally x|0 is used throughout the codebase to achieve this, though I suppose these end up being functionally identical?

juj · 2017-02-13T11:44:18Z

src/modules.js

        'library_fs.js',
        'library_memfs.js',
        'library_tty.js',
+        'library_pipefs.js',


Oh, I presume the fs < pipefs order does end up coming from here. In that case, probably worth to add a note here that pipefs static init depends on fs static init being called before, so the ordering in this list is fixed.

Unfortunately, moving library_pipefs.js above the library_fs.js here seems not to change the order in which postsets are emitted in the resulting js. Reversing __deps to $FS -> $PIPEFS seems not to change it too. So it works, but no one know how. :(

It is likely that these names go to a dictionary later on, so the alphabetical ordering will be restored later even if they are not present now. However I think no need to worry if it works, since issues of not having initialized the filesystem should be apparent if that happens. (For curiosity, you could try renaming the file to library_aipefs.js or something similar and see if that causes issues)

juj · 2017-02-13T11:45:52Z

tests/unistd/pipe.c

+    puts("success");
+
+    return 0;
+}


Very nice work with the test! Btw, in the upcoming ASMFS filesystem, these kinds of tests are super valuable because they help make sure that the constructs will be multithreading capable.

atrosinenko · 2017-02-15T06:50:56Z

Fixed some review issues as suggested.

atrosinenko · 2017-03-04T08:56:22Z

What further actions are expected from me on this PR?

kripken · 2017-03-06T18:46:51Z

(@juj has been at a conference, GDC, but hopefully he'll have time to reply soon)

atrosinenko · 2017-04-10T11:04:00Z

Excuse me for the delay. @juj , what further actions are expected from me on this PR?

I have run random500 tests and it found an error in test_unistd_misc. It should definitely be adjusted to reflect presence of pipe syscall, but it tests how does pipe(0) work too. It should be possible to return EFAULT in that particular case, but does it test something really useful, since NULL is not the only possible invalid pointer?

atrosinenko · 2017-05-06T09:41:52Z

@kripken @juj What should I fix in this PR?

atrosinenko · 2017-06-05T19:28:37Z

Hello? Does anybody hear me? :)

juj · 2017-06-06T10:08:12Z

Very sorry for the delay. This does look good, see the question about the mount function.

Also looking at the test/unistd/misc.c test, the expected results file test/unistd/misc.out assumes that pipes are always failing, i.e.:

pipe(good): -1, errno: 38
pipe(bad): -1, errno: 38

We should update that test to reflect the new implementation status, should it be as follows?

pipe(good): 0, errno: 0
pipe(bad): -1, errno: 38

atrosinenko · 2017-06-06T19:52:06Z

Considering this test, I think we may either ignore NULL, remove the second line and consider such cases as Undefined Behavior, but it may be worth to really return error in this particular case to facilitate debug. Maybe it is even required by some standard...

I hope I prepare some sane version of the PR this weekend.

Make poll() behave differently for the read end and write end of the pipe.

Remove constructing of `Uint8Array`s that are disposed immediately. Suppose buffer is always byte-sized.

Fix the `read` operation so it returns EAGAIN if there is no data to read as if the read end of a pipe is always in a non-blocking mode.

@cynecx

The current PIPEFS `read()` implementation seems to produce excessive garbage with use cases such as write several bytes, read everything, write, read..., since it everytime throws away an almost empty 8Kb buffer. This commit addes a fix proposed by @cynecx.

atrosinenko · 2017-06-11T17:10:18Z

Rebased PR onto fresh incoming and fixed test_unistd_misc. Now it passes all the tests that are run on my installation with python tests/runner.py. See my reply on PIPEFS.mount. Meanwhile, is it OK to suppose buffers passed to read and write are always byte-sized?

atrosinenko · 2017-06-24T09:15:00Z

I have added a comment clarifying my current view of situation with PIPEFS.mount(...).

I tested the following program:

#include <stdio.h>
#include <assert.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>
#include <errno.h>

void ls()
{
  DIR *dir = opendir("/");
  struct dirent *ent;
  int total = 0;

  assert(dir != NULL);
  printf("####\n");
  while(errno = 0, ent = readdir(dir))
  {
    printf("%s\n", ent->d_name);
	total++;
  }
  assert(errno == 0);
  printf("Total: %d\n\n", total);
  assert(closedir(dir) == 0);
}

int main()
{
  ls();
  fopen("/test.txt", "w");
  ls();
  int fd[2];
  assert(pipe(fd) == 0);
  ls();
}

And when ran with emcc test.c -o test.html --preload-file test.c --emrun && emrun --browser firefox test.html it prints

pre-main prep time: 28 ms
####
test.c
proc
dev
home
tmp
..
.
Total: 7

####
test.txt
test.c
proc
dev
home
tmp
..
.
Total: 8

####
test.txt
test.c
proc
dev
home
tmp
..
.
Total: 8

that looks good.

kripken · 2017-06-26T17:16:36Z

src/library_pipefs.js

+        assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));
+        var data = buffer.subarray(offset, offset + length);
+
+        if(length <= 0) {


space after if

kripken · 2017-06-26T17:17:10Z

src/library_syscall.js

+    var fdPtr = SYSCALLS.get();
+
+    if (fdPtr == 0) {
+        throw new FS.ErrnoError(ERRNO_CODES.EFAULT);


should be only 2 spaces

kripken · 2017-06-26T17:18:38Z

Aside from those tiny style comments, this looks good to me, nice work. @juj had some comments earlier, though, so let's give him a little more time to look at this before merging.

atrosinenko · 2017-06-26T18:37:59Z

Fixed those two style issues, thanks.

atrosinenko · 2017-07-18T12:50:51Z

Ping :) Meanwhile, can we safely assume that buffers passed to read and write are always byte-sized?

juj · 2017-07-18T17:06:28Z

Thanks for the update. This looks good to merge to me now.

atrosinenko · 2017-07-20T11:33:41Z

Thank you very much everyone who participated, we ultimately did it. :)

juj · 2017-07-21T10:31:36Z

Thanks @atrosinenko for connecting to the various conversations on the bug tracker on this.

atrosinenko commented Feb 11, 2017

View reviewed changes

cynecx reviewed Feb 11, 2017

View reviewed changes

juj added the filesystem label Feb 13, 2017

juj reviewed Feb 13, 2017

View reviewed changes

atrosinenko force-pushed the pipefs branch from aacca24 to 75b3634 Compare April 10, 2017 07:36

cynecx added 5 commits June 11, 2017 09:49

Add support for pipe syscall

47038b1

[PIPEFS] write/read: Optimizations (NFC)

1b9709f

[PIPEFS] Fix code-style issues

8872a86

[PIPEFS] Optimize poll function

b3b9607

[PIPEFS] createPipe: Return an object instead of an array

71d32cd

atrosinenko added 9 commits June 11, 2017 09:49

Add some tests for the pipe() syscall

1709da1

[PIPEFS] Test writing / reading in large chunks

d9c3120

[PIPEFS] Fix poll() behavior

59e2cda

Make poll() behave differently for the read end and write end of the pipe.

[PIPEFS] Fix performance issue

bc35790

Remove constructing of `Uint8Array`s that are disposed immediately. Suppose buffer is always byte-sized.

[PIPEFS] Return EAGAIN when no data is available to read

ddcd951

Fix the `read` operation so it returns EAGAIN if there is no data to read as if the read end of a pipe is always in a non-blocking mode.

[PIPEFS] Fix some review issues

b4d4d12

[PIPEFS] Use the asm.js style of real number truncation instead of ~~

0ae0817

Fix test_unistd_misc

6d14096

atrosinenko force-pushed the pipefs branch from 0b20db4 to 6d14096 Compare June 11, 2017 16:58

[PIPEFS] Add note to PIPEFS.mount(...)

71f49b3

kripken reviewed Jun 26, 2017

View reviewed changes

[PIPEFS] Fixed two style issues.

0fca85b

kripken merged commit 8fce4a2 into emscripten-core:incoming Jul 19, 2017

atrosinenko deleted the pipefs branch July 20, 2017 09:16

This was referenced Jul 20, 2017

Add support for pipe syscall (PIPEFS) #4378

Closed

pipe is not working in the browser #4405

Closed

pipe() error and Boost.ASIO not working #2883

Closed

sbc100 mentioned this pull request Oct 16, 2025

Fix select didn't work with pipes and the timeout argument #25523

Open

		@@ -0,0 +1,208 @@
		mergeInto(LibraryManager.library, {
		$PIPEFS__postset: '__ATINIT__.push(function() { PIPEFS.root = FS.mount(PIPEFS, {}, null); });',

Integrate the pipe() syscall implementation originally written by @cynecx #4935

Integrate the pipe() syscall implementation originally written by @cynecx #4935

Uh oh!

Conversation

atrosinenko commented Feb 11, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

atrosinenko commented Feb 11, 2017

Uh oh!

cynecx commented Feb 11, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

atrosinenko commented Feb 12, 2017

Uh oh!

atrosinenko commented Feb 12, 2017

Uh oh!

cynecx commented Feb 12, 2017

Uh oh!

atrosinenko commented Feb 12, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

atrosinenko commented Feb 15, 2017

Uh oh!

atrosinenko commented Mar 4, 2017

Uh oh!

kripken commented Mar 6, 2017

Uh oh!

atrosinenko commented Apr 10, 2017

Uh oh!

atrosinenko commented May 6, 2017

Uh oh!

atrosinenko commented Jun 5, 2017

Uh oh!

juj commented Jun 6, 2017

Uh oh!

atrosinenko commented Jun 6, 2017

Uh oh!

atrosinenko commented Jun 11, 2017

Uh oh!

atrosinenko commented Jun 24, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!