@martindurant martindurant commented May 25, 2021

  • remove abspath from the Linux/macOS branch of make_path_posix and move the fast path to the start of the long function
  • add logging to FUSE
  • add an option so that the instance cache does not keep instances alive (but still shares them), via the config key "weakref_instance_cache"
  • allow the start/end arguments of cat_file to be negative, like a slice
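
The last item, negative bounds that behave like a slice, could be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper name `resolve_range` is hypothetical, and it assumes the total file size is already known.

```python
def resolve_range(size, start=None, end=None):
    """Turn slice-like start/end (possibly negative or None) into absolute offsets.

    Hypothetical helper for illustration; not part of fsspec.
    """
    # slice.indices applies the same rules as data[start:end]
    lo, hi, _ = slice(start, end).indices(size)
    # Guard against end < start so the range is never negative in length
    return lo, max(lo, hi)


data = b"0123456789"
lo, hi = resolve_range(len(data), start=-4, end=-1)
chunk = data[lo:hi]  # same bytes as data[-4:-1]
```

Leaning on `slice.indices` keeps the semantics identical to ordinary Python slicing, including clamping of out-of-range values.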

Martin Durant added 6 commits May 25, 2021 12:52
Turns out it takes a surprisingly long time to evaluate paths
(os.path.abspath) and this shows up in fastparquet for reading
from many files.
Because cwd can change during run
@martindurant martindurant changed the title from "Add logging to FUSE; cache make_path_posix" to "Weakref instances" May 26, 2021
Martin Durant added 2 commits May 31, 2021 11:43
Make common things first and use os.scandir in ls
@martindurant martindurant changed the title from "Weakref instances" to "optimisations" May 31, 2021
@martindurant martindurant merged commit 8ea35ab into fsspec:master Jun 1, 2021
@martindurant martindurant deleted the log_and_cache branch June 1, 2021 12:34
else:
t = "other"
# scandir DirEntry
out = path.stat(follow_symlinks=False)

Is this a breaking change in the API? In the PyTorch Lightning tests we're now seeing this stack trace:

kwargs = {}
    def info(self, path, **kwargs):
        if isinstance(path, str):
            path = self._strip_protocol(path)
            out = os.stat(path, follow_symlinks=False)
            link = os.path.islink(path)
            if os.path.isdir(path):
                t = "directory"
            elif os.path.isfile(path):
                t = "file"
            else:
                t = "other"
        else:
            # scandir DirEntry
>           out = path.stat(follow_symlinks=False)
E           TypeError: stat() got an unexpected keyword argument 'follow_symlinks'

example: https://github.com/PyTorchLightning/pytorch-lightning/runs/2766424283?check_suite_focus=true

Member Author

No, the API did not change; this code path is only meant to be called by internal functions. The test is whether the path is a str, and I suspect that the tmpdir fixture used for the test is not actually a str, but something str-like. Calling str(tmpdir) in the test would solve it. However, fsspec should perhaps have coped with this in the more user-facing isdir/info functions (if my guess about the cause is correct).
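
A small sketch of the situation described above, assuming the fixture yields a path-like object rather than a `str`. Here `pathlib.Path` stands in for whatever the fixture returns, and `describe` is a hypothetical stand-in for the `isinstance(path, str)` branch in the traceback, not fsspec code.

```python
import os
import pathlib
import tempfile


def describe(path):
    # Mirrors the branch in the traceback: only a real str takes the os.stat route.
    if isinstance(path, str):
        return "str branch"
    return "DirEntry branch"


with tempfile.TemporaryDirectory() as d:
    p = pathlib.Path(d)                  # path-like, but not a str
    as_is = describe(p)                  # falls through to the DirEntry branch
    converted = describe(str(p))         # explicit conversion, as suggested above
    via_fspath = describe(os.fspath(p))  # works for any object implementing os.PathLike
```

Converting at the call site is the smallest change; `os.fspath` is an alternative when the object implements the `os.PathLike` protocol.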

Yes, you're right: tmpdir is a py.path.local object. While we can update our tests to handle this, does it make sense to update the instance check as well?

*[self._pipe_file(k, v, **kwargs) for k, v in path.items()]
)

async def _cat_file(self, path, start=None, end=None, **kwargs):
Collaborator

@martindurant what's the purpose of **kwargs here? Should subclasses check that additional kwargs aren't passed?

Member Author

It's so that subclasses can pass any extra arguments (like headers) without changing the signature. If a subclass has no use for such arguments, then kwargs can be ignored.
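
To illustrate the pattern described in this reply, here is a minimal sketch. The classes `BaseFS`, `HeaderFS` and `PlainFS` are hypothetical and only mimic the `_cat_file` signature shown in the diff; they are not fsspec implementations.

```python
import asyncio


class BaseFS:
    async def _cat_file(self, path, start=None, end=None, **kwargs):
        raise NotImplementedError


class HeaderFS(BaseFS):
    """Hypothetical backend that understands an extra `headers` option."""

    async def _cat_file(self, path, start=None, end=None, **kwargs):
        headers = kwargs.get("headers", {})
        # A real backend would issue a request; here we just report what it saw.
        return {"path": path, "range": (start, end), "headers": headers}


class PlainFS(BaseFS):
    """Hypothetical backend with no extra options; it ignores kwargs."""

    async def _cat_file(self, path, start=None, end=None, **kwargs):
        return {"path": path, "range": (start, end)}


with_headers = asyncio.run(HeaderFS()._cat_file("a", headers={"X-Token": "t"}))
plain = asyncio.run(PlainFS()._cat_file("a", headers={"X-Token": "t"}))
```

The same call works against both backends; only the one that knows about `headers` acts on it.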

Collaborator

> without changing the signature.

What's the motivation for not changing the signature? If the subclass accepts additional kwargs, then it can add them. As long as its signature is a superset of the parent's and the function is valid with the defaults, things will work fine.

> then kwargs can be ignored.

Functions probably shouldn't silently ignore kwargs, since that leads to problems that are difficult to debug.
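
The alternative argued for here might look like the sketch below: the extra option is an explicit keyword, and anything unrecognised raises instead of being dropped. `StrictFS` is a hypothetical class used only for illustration.

```python
class StrictFS:
    """Hypothetical backend: explicit extra argument, unknown kwargs rejected."""

    def cat_file(self, path, start=None, end=None, headers=None, **kwargs):
        if kwargs:
            # Fail loudly rather than silently ignoring a misspelled option
            raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
        return path, start, end, headers or {}


fs = StrictFS()
ok = fs.cat_file("a", headers={"X-Token": "t"})
try:
    fs.cat_file("a", header={"X-Token": "t"})  # typo: 'header' instead of 'headers'
    typo_caught = False
except TypeError:
    typo_caught = True
```

The signature is still a superset of the parent's, so generic callers keep working, while a misspelled keyword surfaces immediately.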

Member Author

I guess it's an indication to downstream that this is a reasonable place to add extra args, and it enables fsspec to add arguments (like start/end) without immediately breaking other implementations in the common case that they are left with default values such as None.
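
The compatibility argument can be sketched as follows. `OldBackend`, `NewBackend` and the `cat` helper are hypothetical names, assumed purely for illustration of a caller that starts forwarding newer arguments.

```python
class OldBackend:
    """Hypothetical implementation written before start/end existed."""

    def cat_file(self, path, **kwargs):
        return f"whole:{path}"


class NewBackend:
    """Hypothetical implementation that understands start/end."""

    def cat_file(self, path, start=None, end=None, **kwargs):
        return f"{path}[{start}:{end}]"


def cat(fs, path, start=None, end=None):
    # Generic caller always forwards the newer arguments.
    return fs.cat_file(path, start=start, end=end)


old_result = cat(OldBackend(), "a")                  # still works, defaults are None
new_result = cat(NewBackend(), "a", start=0, end=4)  # honours the range
```

Note the trade-off raised earlier in the thread: if a caller passed a non-default `start` to `OldBackend`, it would be silently ignored rather than rejected.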
