Integrate the pipe() syscall implementation originally written by @cynecx #4935
  assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));
- var data = new Uint8Array(buffer);
- data = buffer.subarray(offset, offset + length);
+ var data = buffer.subarray(offset, offset + length);
The previous code (and similar code in write) looks like an incorrectly written conversion to the byte-sized buffer. Is it safe to assume that the buffer is always already byte-sized?
Unfortunately my source code comment is not shown anyway, so is it safe to assume that the
I think the input buffer type (for read) must be an ArrayBuffer and not some kind of view; otherwise the read function wouldn't actually read anything, because we would read into a newly created buffer.
I think we should change this line:
to
However, if we also want to accept typed views and ArrayBuffer we should do something like:
|
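As a rough sketch of what accepting both input kinds could look like (not the code from this PR; `toByteView` is a hypothetical helper name): `new Uint8Array(arrayBuffer)` aliases the underlying memory, whereas `new Uint8Array(typedArray)` copies the elements, so a view has to be re-wrapped through its `buffer`, `byteOffset`, and `byteLength`. Note also that a plain `ArrayBuffer` has no `subarray` method, so it has to be wrapped before slicing.

```javascript
// Hypothetical helper (not part of the PR): normalize any accepted input to a
// Uint8Array that aliases the caller's memory instead of copying it.
function toByteView(buffer) {
  if (buffer instanceof ArrayBuffer) {
    return new Uint8Array(buffer);
  }
  if (ArrayBuffer.isView(buffer)) {
    // Preserve the view's window into the underlying ArrayBuffer.
    return new Uint8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  }
  throw new TypeError('expected an ArrayBuffer or a typed-array view');
}

// A write through the normalized view is visible through the original buffer,
// even when the original view is not byte-sized.
var target = new Uint16Array(4);                  // 8 bytes of storage
var bytes = toByteView(target).subarray(2, 6);    // bytes 2..5 of that storage
bytes.set([1, 2, 3, 4]);
```

With this shape, the later `subarray(offset, offset + length)` call always operates on byte offsets, regardless of the element size of the view that was passed in.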
toRemove++;
}
}
I think we could add something like this to prevent deallocating all buffers and leave one bucket alive:
if (toRemove && toRemove == pipe.buckets.length) {
toRemove--;
pipe.buckets[toRemove].offset = 0;
pipe.buckets[toRemove].roffset = 0;
}
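To see the effect of that guard in isolation, here is a minimal sketch on a simplified pipe model. It assumes each bucket only tracks a write offset (`offset`) and a read offset (`roffset`); the function name `trimDrainedBuckets` and the use of `splice` for the removal are illustrative choices, not the PR's code.

```javascript
// Simplified model: a bucket is fully drained when roffset has caught up
// with offset. Hypothetical helper, not taken from library_pipefs.js.
function trimDrainedBuckets(pipe) {
  var toRemove = 0;
  for (var i = 0; i < pipe.buckets.length; i++) {
    var bucket = pipe.buckets[i];
    if (bucket.offset !== bucket.roffset) break;   // still holds unread data
    toRemove++;
  }

  // The guard from the comment above: never drop the last bucket, just
  // rewind it so the next write can reuse its storage.
  if (toRemove && toRemove == pipe.buckets.length) {
    toRemove--;
    pipe.buckets[toRemove].offset = 0;
    pipe.buckets[toRemove].roffset = 0;
  }

  pipe.buckets.splice(0, toRemove);
  return toRemove;
}
```

In a write-a-few-bytes, read-everything loop, the single remaining bucket is rewound on every read instead of being discarded and reallocated.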
Thank you @cynecx, integrated this code too :)
@cynecx changing the line assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer)); to assert(buffer instanceof ArrayBuffer); in either
Meanwhile, I ran some tests and at least
@atrosinenko Could you console.log() the buffer?
When ran under Node,
@@ -0,0 +1,208 @@
mergeInto(LibraryManager.library, {
  $PIPEFS__postset: '__ATINIT__.push(function() { PIPEFS.root = FS.mount(PIPEFS, {}, null); });',
Hmm, this is actually a bit suspect: I don't think we have any type of ordering guaranteed in our JS dependencies, so calling FS.mount() here as a postset to PIPEFS might not guarantee that FS.staticInit() would have been called before. I think if it does, then it's due to 'f' < 'p' alphabetically by accident?
I think, it may be worth looking at sockfs -- it seems to have the same dependency.
That is a good observation.
Reading closer, it looks like FS.init.initialized does not need to be true in order to call FS.mount() (FS does that itself in FS.staticInit() as well). However, FS.staticInit() must have been called before FS.mount() can be. I think if this does not hold, then errors will naturally manifest through the absence of FS.nameTable, so things should be good here.
  $PIPEFS: {
    BUCKET_BUFFER_SIZE: 1024 * 8, // 8KiB Buffer
    mount: function (mount) {
      return FS.createNode(null, '/', {{{ cDefine('S_IFDIR') }}} | 511 /* 0777 */, 0);
This line looks odd, it's mounting the root directory of the whole filesystem to be PIPEFS - does this replace MEMFS altogether?
Looks like the same code is used in sockfs -- is it currently supported?
SOCKFS implements BSD sockets support via a WebSockets backend, and it is currently supported. Reading closer, this is not mounting PIPEFS to the root of the filesystem; it is calling FS.createNode (and not FS.mount) to create the filesystem root directory. This should be redundant: the root directory should already exist, and this would actually create multiple versions of that node. Does PIPEFS.mount get called from anywhere? It can be removed if not.
PIPEFS.mount is called from $PIPEFS__postset through FS.mount. But it looks like this createNode creates a kind of pseudo-root directory for PIPEFS that is not linked to FS.root. Maybe it is OK?
Oh right, missed that line. Yeah, that is a bit odd since it creates multiple nodes that have the identical path /. Would return FS.lookupPath('/'); work here instead of return FS.createNode(...);? That would avoid creating duplicate root nodes. If it doesn't work for some reason, then that's OK as long as we document this peculiar behavior and why it is needed. (I know we do have the same property in the existing SOCKFS; it would be good to figure that out too.)
I'm not sure that it creates identical root nodes: when null is passed as parent to createNode, as far as I understand, it creates a node that is its own parent. And since all nodes have sequential inode numbers, this pseudo-root node should have a different parent inode than the "real" root directory. It looks like the only way a new node is registered in createNode is through FS.hashAddNode(...), and all lookups through FS.nameTable are performed by comparing name and parent id. Though it seems that any call to FS.unmount will remove nodes that are not linked to FS.root. Meanwhile, FS.mount has special treatment for pseudo-mounts.
Oh, I probably misunderstood FS.unmount() -- it should not remove such pseudo mounts.
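The argument in this thread (a node created with a null parent becomes its own parent, and lookups compare parent id and name) can be illustrated with a toy name table. This is only a model of the behavior as described in the comments, not Emscripten's FS code; `toyCreateNode`, `toyLookupNode`, and the string key format are made up for the example.

```javascript
// Toy model of a name table keyed by (parent id, name). All names here are
// hypothetical and chosen for illustration only.
var toyNextInode = 1;
var toyNameTable = {};

function toyCreateNode(parent, name) {
  var node = { id: toyNextInode++, name: name };
  node.parent = parent || node;        // a null parent makes the node its own parent
  toyNameTable[node.parent.id + ':' + name] = node;
  return node;
}

function toyLookupNode(parent, name) {
  return toyNameTable[parent.id + ':' + name];
}

var realRoot = toyCreateNode(null, '/');   // stands in for the "real" root
var pipeRoot = toyCreateNode(null, '/');   // stands in for the PIPEFS pseudo-root
```

Under these assumptions the two `/` nodes get different inode numbers and therefore different lookup keys, so they do not shadow each other even though their names are identical.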
src/library_pipefs.js
Outdated
pipe.buckets.forEach(function (bucket) {
  currentLength += bucket.offset - bucket.roffset;
});
I see there is a comment below about optimising to not generate extra garbage. In case that is interesting, this forEach construct also generates temporary garbage by creating an iterator object and a temporary function that gets nuked afterwards. The second iteration over the buckets below is using a for (var i = 0; i < pipe.buckets.length; i++) { construct. That is a nice and garbage-free way to iterate over an array.
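For reference, the same sum written with an indexed loop might look like the sketch below. The wrapper function `bufferedLength` is hypothetical; only the `offset` and `roffset` bucket fields come from the snippet under review. How much garbage either form really produces depends on the JavaScript engine.

```javascript
// Sum the unread bytes across all buckets without allocating a closure.
// Hypothetical wrapper around the loop discussed in the review.
function bufferedLength(pipe) {
  var currentLength = 0;
  for (var i = 0; i < pipe.buckets.length; i++) {
    var bucket = pipe.buckets[i];
    currentLength += bucket.offset - bucket.roffset;
  }
  return currentLength;
}
```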
src/library_pipefs.js
Outdated
while (toRemove--) {
  pipe.buckets.shift();
}
A naive implementation of .shift() in a browser could be of linear complexity, so this while loop can end up being of quadratic complexity. Would calling pipe.buckets.splice(0, toRemove); achieve the same with better performance?
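A small sketch comparing the two removal strategies on plain arrays shows that they leave the array in the same state; `dropFrontWithShift` and `dropFrontWithSplice` are hypothetical names, and no performance claim is made here beyond what the review comment suggests.

```javascript
// Remove the first `toRemove` elements one at a time (the reviewed approach).
function dropFrontWithShift(buckets, toRemove) {
  while (toRemove--) {
    buckets.shift();
  }
  return buckets;
}

// Remove the first `toRemove` elements in a single call (the suggestion).
function dropFrontWithSplice(buckets, toRemove) {
  buckets.splice(0, toRemove);
  return buckets;
}
```

Both functions mutate the array in place, so the result is equivalent as long as `toRemove` does not exceed the array length.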
src/library_pipefs.js
Outdated
data = data.subarray(freeBytesInCurrBuffer, data.byteLength);
}

var numBuckets = ~~(data.byteLength / PIPEFS.BUCKET_BUFFER_SIZE);
Interesting to see ~~ used to truncate down to an integer. Generally x|0 is used throughout the codebase to achieve this, though I suppose these end up being functionally identical?
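Both operators convert their operand to a 32-bit signed integer, so they agree on every input, including values outside the int32 range (where both wrap). The sketch below checks a handful of sample values; the helper names and the sample list are illustrative only.

```javascript
function truncTilde(x) { return ~~x; }
function truncOr(x) { return x | 0; }

// A few sample inputs, including a negative value, a value above 2^32,
// NaN, and Infinity.
var truncSamples = [0, 1.9, -1.9, 8191.999, 2147483647.5, 4294967296 + 5.5, NaN, Infinity];
var truncAgree = truncSamples.every(function (x) {
  return truncTilde(x) === truncOr(x);
});

// The bucket count from the reviewed line, for a 20000-byte write and the
// 8 KiB bucket size used in this PR.
var exampleBuckets = ~~(20000 / (1024 * 8));
```

Note that both forms truncate toward zero, unlike `Math.floor`, which matters only for negative inputs and is not a concern for a byte length.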
'library_fs.js',
'library_memfs.js',
'library_tty.js',
'library_pipefs.js',
Oh, I presume the fs < pipefs order does end up coming from here. In that case, it is probably worth adding a note here that pipefs static init depends on fs static init having been called before, so the ordering in this list is fixed.
Unfortunately, moving library_pipefs.js above library_fs.js here does not seem to change the order in which postsets are emitted in the resulting JS. Reversing __deps to $FS -> $PIPEFS does not seem to change it either. So it works, but no one knows how. :(
It is likely that these names go into a dictionary later on, so the alphabetical ordering will be restored later even if it is not present now. However, I think there is no need to worry if it works, since a failure to initialize the filesystem should be apparent if that happens. (Out of curiosity, you could try renaming the file to library_aipefs.js or something similar and see if that causes issues.)
puts("success");

return 0;
}
Very nice work with the test! Btw, in the upcoming ASMFS filesystem, these kinds of tests are super valuable because they help make sure that the constructs will be multithreading capable.
Fixed some review issues as suggested.
What further actions are expected from me on this PR?
(@juj has been at a conference, GDC, but hopefully he'll have time to reply soon)
Excuse me for the delay. @juj , what further actions are expected from me on this PR? I have run
Hello? Does anybody hear me? :)
Very sorry for the delay. This does look good, see the question about the Also looking at the
We should update that test to reflect the new implementation status, should it be as follows?
|
Considering this test, I think we may either ignore I hope I prepare some sane version of the PR this weekend. |
Make poll() behave differently for the read end and write end of the pipe.
Remove construction of `Uint8Array`s that are discarded immediately. Assume the buffer is always byte-sized.
Fix the `read` operation so it returns EAGAIN if there is no data to read, as if the read end of a pipe is always in non-blocking mode.
The current PIPEFS `read()` implementation seems to produce excessive garbage with use cases such as write several bytes, read everything, write, read..., since every time it throws away an almost empty 8 KiB buffer. This commit adds a fix proposed by @cynecx.
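The read and poll behavior described in these commit messages can be modeled on a toy pipe. This is a sketch under stated assumptions, not the PR's implementation: the chunk list, the function names, the string error code, and the numeric poll flag values are all illustrative (real values come from the platform headers).

```javascript
// Illustrative flag values only.
var TOY_POLLIN = 0x1;
var TOY_POLLOUT = 0x4;

function makeToyPipe() {
  return { chunks: [] };
}

function toyPipeWrite(pipe, bytes) {
  if (bytes.length === 0) return 0;
  pipe.chunks.push(Uint8Array.from(bytes));
  return bytes.length;
}

// Always behaves as non-blocking: an empty pipe yields EAGAIN instead of
// waiting for a writer. May return fewer bytes than requested.
function toyPipeRead(pipe, length) {
  if (length <= 0) return new Uint8Array(0);
  if (pipe.chunks.length === 0) {
    var err = new Error('no data available');
    err.code = 'EAGAIN';
    throw err;
  }
  var head = pipe.chunks[0];
  var out = head.subarray(0, length);
  if (length >= head.length) {
    pipe.chunks.shift();
  } else {
    pipe.chunks[0] = head.subarray(length);
  }
  return out;
}

// The write end is always reported writable; the read end is readable only
// while unread data is buffered.
function toyPipePoll(pipe, isWriteEnd) {
  if (isWriteEnd) return TOY_POLLOUT;
  return pipe.chunks.length > 0 ? TOY_POLLIN : 0;
}
```

A caller that polls the read end before reading never sees EAGAIN in this model; a caller that reads unconditionally has to handle it.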
Rebased PR onto fresh |
I have added a comment clarifying my current view of situation with
I tested the following program:
#include <stdio.h>
#include <assert.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>
#include <errno.h>

void ls()
{
    DIR *dir = opendir("/");
    struct dirent *ent;
    int total = 0;
    assert(dir != NULL);
    printf("####\n");
    while(errno = 0, ent = readdir(dir))
    {
        printf("%s\n", ent->d_name);
        total++;
    }
    assert(errno == 0);
    printf("Total: %d\n\n", total);
    assert(closedir(dir) == 0);
}

int main()
{
    ls();
    fopen("/test.txt", "w");
    ls();
    int fd[2];
    assert(pipe(fd) == 0);
    ls();
}
And when ran with
that looks good. |
src/library_pipefs.js
Outdated
assert(buffer instanceof ArrayBuffer || ArrayBuffer.isView(buffer));
var data = buffer.subarray(offset, offset + length);

if(length <= 0) {
space after if
src/library_syscall.js
Outdated
var fdPtr = SYSCALLS.get();

if (fdPtr == 0) {
throw new FS.ErrnoError(ERRNO_CODES.EFAULT);
should be only 2 spaces
Aside from those tiny style comments, this looks good to me, nice work. @juj had some comments earlier, though, so let's give him a little more time to look at this before merging.
Fixed those two style issues, thanks.
Ping :) Meanwhile, can we safely assume that
Thanks for the update. This looks good to merge to me now.
Thank you very much everyone who participated, we ultimately did it. :)
Thanks @atrosinenko for connecting to the various conversations on the bug tracker on this.
This pull request is based on the PR #4378 by @cynecx. I am not a JavaScript developer and I am not familiar with the Emscripten FS layer; however, the majority of the PIPEFS code was already written by @cynecx and even reviewed there to some extent. The problem with that PR was the lack of tests. I have added some tests, and the tests revealed some problems with the original implementation that I have fixed.
There still may be a performance issue with the (common?) use case of writing a few bytes and then reading them back when implementing events, since every read() call seems to pop an almost empty 8 KiB buffer, and write() will allocate a new one, so it may generate some excessive garbage.
Another peculiarity of this code is that the read end of the pipe always behaves as if it is non-blocking -- is it currently possible to block the read operation at all?