Skip to content

BF: Fix interaction between indexed_gzip presence and keep_file_open flag #679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 15, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
109 changes: 84 additions & 25 deletions nibabel/arrayproxy.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,11 +42,14 @@
used for the lifetime of the ``ArrayProxy``. It should be set to one of
``True``, ``False``, or ``'auto'``.

If ``True``, a single file handle is created and used. If ``False``, a new
file handle is created every time the image is accessed. For gzip files, if
``'auto'``, and the optional ``indexed_gzip`` dependency is present, a single
file handle is created and persisted. If ``indexed_gzip`` is not available,
behaviour is the same as if ``keep_file_open is False``.
Management of file handles will be performed either by ``ArrayProxy`` objects,
or by the ``indexed_gzip`` package if it is used.

If this flag is set to ``True``, a single file handle is created and used. If
``False``, a new file handle is created every time the image is accessed. For
gzip files, if ``'auto'``, and the optional ``indexed_gzip`` dependency is
present, a single file handle is created and persisted. If ``indexed_gzip`` is
not available, behaviour is the same as if ``keep_file_open is False``.

If this is set to any other value, attempts to create an ``ArrayProxy`` without
specifying the ``keep_file_open`` flag will result in a ``ValueError`` being
Expand Down Expand Up @@ -160,8 +163,10 @@ def __init__(self, file_like, spec, mmap=True, keep_file_open=None):
# Permit any specifier that can be interpreted as a numpy dtype
self._dtype = np.dtype(self._dtype)
self._mmap = mmap
self._keep_file_open = self._should_keep_file_open(file_like,
keep_file_open)
# Flags to keep track of whether a single ImageOpener is created, and
# whether a single underlying file handle is created.
self._keep_file_open, self._persist_opener = \
self._should_keep_file_open(file_like, keep_file_open)
self._lock = RLock()

def __del__(self):
Expand All @@ -184,16 +189,64 @@ def __setstate__(self, state):
self._lock = RLock()

def _should_keep_file_open(self, file_like, keep_file_open):
"""Called by ``__init__``, and used to determine the final value of
``keep_file_open``.
"""Called by ``__init__``.

This method determines how to manage ``ImageOpener`` instances,
and the underlying file handles - the behaviour depends on:

- whether ``file_like`` is an an open file handle, or a path to a
``'.gz'`` file, or a path to a non-gzip file.
- whether ``indexed_gzip`` is present (see
:attr:`.openers.HAVE_INDEXED_GZIP`).

An ``ArrayProxy`` object uses two internal flags to manage
``ImageOpener`` instances and underlying file handles.

- The ``_persist_opener`` flag controls whether a single
``ImageOpener`` should be created and used for the lifetime of
this ``ArrayProxy``, or whether separate ``ImageOpener`` instances
should be created on each file access.

- The ``_keep_file_open`` flag controls qwhether the underlying file
handle should be kept open for the lifetime of this
``ArrayProxy``, or whether the file handle should be (re-)opened
and closed on each file access.

The internal ``_keep_file_open`` flag is only relevant if
``file_like`` is a ``'.gz'`` file, and the ``indexed_gzip`` library is
present.

The return value is derived from these rules:
This method returns the values to be used for the internal
``_persist_opener`` and ``_keep_file_open`` flags; these values are
derived according to the following rules:

- If ``file_like`` is a file(-like) object, ``False`` is returned.
Otherwise, ``file_like`` is assumed to be a file name.
- If ``keep_file_open`` is ``auto``, and ``indexed_gzip`` is
not available, ``False`` is returned.
- Otherwise, the value of ``keep_file_open`` is returned unchanged.
1. If ``file_like`` is a file(-like) object, both flags are set to
``False``.

2. If ``keep_file_open`` (as passed to :meth:``__init__``) is
``True``, both internal flags are set to ``True``.

3. If ``keep_file_open`` is ``False``, but ``file_like`` is not a path
to a ``.gz`` file or ``indexed_gzip`` is not present, both flags
are set to ``False``.

4. If ``keep_file_open`` is ``False``, ``file_like`` is a path to a
``.gz`` file, and ``indexed_gzip`` is present, ``_persist_opener``
is set to ``True``, and ``_keep_file_open`` is set to ``False``.
In this case, file handle management is delegated to the
``indexed_gzip`` library.

5. If ``keep_file_open`` is ``'auto'``, ``file_like`` is a path to a
``.gz`` file, and ``indexed_gzip`` is present, both internal flags
are set to ``True``.

6. If ``keep_file_open`` is ``'auto'``, and ``file_like`` is not a
path to a ``.gz`` file, or ``indexed_gzip`` is not present, both
internal flags are set to ``False``.

Note that a value of ``'auto'`` for ``keep_file_open`` will become
deprecated behaviour in version 2.4.0, and support for ``'auto'`` will
be removed in version 3.0.0.

Parameters
----------
Expand All @@ -206,8 +259,10 @@ def _should_keep_file_open(self, file_like, keep_file_open):
Returns
-------

The value of ``keep_file_open`` that will be used by this
``ArrayProxy``, and passed through to ``ImageOpener`` instances.
A tuple containing:
- ``keep_file_open`` flag to control persistence of file handles
- ``persist_opener`` flag to control persistence of ``ImageOpener``
objects.
"""
if keep_file_open is None:
keep_file_open = KEEP_FILE_OPEN_DEFAULT
Expand All @@ -216,12 +271,15 @@ def _should_keep_file_open(self, file_like, keep_file_open):
'\'auto\', True, False}')
# file_like is a handle - keep_file_open is irrelevant
if hasattr(file_like, 'read') and hasattr(file_like, 'seek'):
return False
# don't have indexed_gzip - auto -> False
if keep_file_open == 'auto' and not (openers.HAVE_INDEXED_GZIP and
file_like.endswith('.gz')):
return False
return keep_file_open
return False, False
# if the file is a gzip file, and we have_indexed_gzip,
have_igzip = openers.HAVE_INDEXED_GZIP and file_like.endswith('.gz')
if keep_file_open == 'auto':
return have_igzip, have_igzip
elif keep_file_open:
return True, True
else:
return False, have_igzip

@property
@deprecate_with_version('ArrayProxy.header deprecated', '2.2', '3.0')
Expand Down Expand Up @@ -265,13 +323,14 @@ def _get_fileobj(self):
A newly created ``ImageOpener`` instance, or an existing one,
which provides access to the file.
"""
if self._keep_file_open:
if self._persist_opener:
if not hasattr(self, '_opener'):
self._opener = openers.ImageOpener(
self.file_like, keep_open=self._keep_file_open)
yield self._opener
else:
with openers.ImageOpener(self.file_like) as opener:
with openers.ImageOpener(
self.file_like, keep_open=False) as opener:
yield opener

def get_unscaled(self):
Expand Down
Loading