MemoryError when sequentially loading big data sets #208

@rphlypo

Description

Although each of the unzipped data sets fits in RAM, calling a sequence of functions and deleting each object after writing it to disk raises a MemoryError, with the traceback given below. My data (gzipped NIfTI-1 format, extension .nii.gz) is about 1GB of gzipped data per data set and my machine has 16GB of RAM, yet it crashes on roughly every third data set with the exact same message (error on the same line of code, same traceback).

Is this due to the data handling in gzip, or to overhead added by the asanyarray constructor?

Thanks in advance for any feedback on this issue.
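For reference, the loop is roughly the following (a minimal sketch: the file names and the per-subject processing step are placeholders, not my actual pipeline code):

    import gc
    import nibabel as nib

    def process(data):
        # placeholder for the per-subject processing step (hypothetical)
        return data * 2

    # hypothetical file names; each is ~1GB of gzipped NIfTI-1 data
    paths = ["subject_01.nii.gz", "subject_02.nii.gz", "subject_03.nii.gz"]

    for path in paths:
        img = nib.load(path)
        data = img.get_data()          # the MemoryError is raised from here (see traceback)
        out = nib.Nifti1Image(process(data), img.get_affine())
        nib.save(out, path.replace(".nii.gz", "_processed.nii.gz"))
        del img, data, out             # each object is deleted after writing to disk
        gc.collect()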

~/.local/lib/python2.7/site-packages/nibabel/spatialimages.pyc in get_data(self)
339
340 def get_data(self):
--> 341 return np.asanyarray(self._data)
342
343 @property

/usr/lib/python2.7/dist-packages/numpy/core/numeric.pyc in asanyarray(a, dtype, order)
285
286 """
--> 287 return array(a, dtype, copy=False, order=order, subok=True)
288
289 def ascontiguousarray(a, dtype=None):

~/.local/lib/python2.7/site-packages/nibabel/arrayproxy.pyc in __array__(self)
53 ''' Cached read of data from file '''
54 if self._data is None:
---> 55 self._data = self._read_data()
56 return self._data
57

~/.local/lib/python2.7/site-packages/nibabel/arrayproxy.pyc in _read_data(self)
58 def _read_data(self):
59 fileobj = allopen(self.file_like)
---> 60 data = self.header.data_from_fileobj(fileobj)
61 if isinstance(self.file_like, basestring): # filename
62 fileobj.close()

~/.local/lib/python2.7/site-packages/nibabel/analyze.pyc in data_from_fileobj(self, fileobj)
484 '''
485 # read unscaled data
--> 486 data = self.raw_data_from_fileobj(fileobj)
487 # get scalings from header. Value of None means not present in header
488 slope, inter = self.get_slope_inter()

~/.local/lib/python2.7/site-packages/nibabel/analyze.pyc in raw_data_from_fileobj(self, fileobj)
456 shape = self.get_data_shape()
457 offset = self.get_data_offset()
--> 458 return array_from_file(shape, dtype, fileobj, offset)
459
460 def data_from_fileobj(self, fileobj):

~/.local/lib/python2.7/site-packages/nibabel/volumeutils.pyc in array_from_file(shape, in_dtype, infile, offset, order)
482 if datasize == 0:
483 return np.array([])
--> 484 data_str = infile.read(datasize)
485 if len(data_str) != datasize:
486 if hasattr(infile, 'name'):

/usr/lib/python2.7/gzip.pyc in read(self, size)
254 try:
255 while size > self.extrasize:
--> 256 self._read(readsize)
257 readsize = min(self.max_read_chunk, readsize * 2)
258 except EOFError:

/usr/lib/python2.7/gzip.pyc in _read(self, size)
306
307 uncompress = self.decompress.decompress(buf)
--> 308 self._add_read_data( uncompress )
309
310 if self.decompress.unused_data != "":

/usr/lib/python2.7/gzip.pyc in _add_read_data(self, data)
324 self.crc = zlib.crc32(data, self.crc) & 0xffffffffL
325 offset = self.offset - self.extrastart
--> 326 self.extrabuf = self.extrabuf[offset:] + data
327 self.extrasize = self.extrasize + len(data)
328 self.extrastart = self.offset

MemoryError:
