MemoryError when sequentially loading big data sets #208

@rphlypo

Description

Although each of the unzipped data sets fits in RAM, calling a sequence of functions and deleting each object after writing it to disk raises a MemoryError, with the traceback given below. My data (gzipped NIfTI-1 format, extension .nii.gz) is about 1GB of gzipped data per data set and my machine has 16GB of RAM, yet it crashes on roughly every third data set with the exact same message (error on the same line of code, same traceback).

Is this due to the data handling in gzip, or to overhead added by the asanyarray constructor?

Thanks in advance for any feedback on this issue.
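For reference, the loop is roughly the following (a minimal sketch: the file names and the per-subject processing step are placeholders, not my actual pipeline code):

    import gc
    import nibabel as nib

    def process(data):
        # placeholder for the per-subject processing step (hypothetical)
        return data * 2

    # hypothetical file names; each is ~1GB of gzipped NIfTI-1 data
    paths = ["subject_01.nii.gz", "subject_02.nii.gz", "subject_03.nii.gz"]

    for path in paths:
        img = nib.load(path)
        data = img.get_data()          # the MemoryError is raised from here (see traceback)
        out = nib.Nifti1Image(process(data), img.get_affine())
        nib.save(out, path.replace(".nii.gz", "_processed.nii.gz"))
        del img, data, out             # each object is deleted after writing to disk
        gc.collect()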

~/.local/lib/python2.7/site-packages/nibabel/spatialimages.pyc in get_data(self)
339
340 def get_data(self):
--> 341 return np.asanyarray(self._data)
342
343 @property

/usr/lib/python2.7/dist-packages/numpy/core/numeric.pyc in asanyarray(a, dtype, order)
285
286 """
--> 287 return array(a, dtype, copy=False, order=order, subok=True)
288
289 def ascontiguousarray(a, dtype=None):

~/.local/lib/python2.7/site-packages/nibabel/arrayproxy.pyc in __array__(self)
53 ''' Cached read of data from file '''
54 if self._data is None:
---> 55 self._data = self._read_data()
56 return self._data
57

~/.local/lib/python2.7/site-packages/nibabel/arrayproxy.pyc in _read_data(self)
58 def _read_data(self):
59 fileobj = allopen(self.file_like)
---> 60 data = self.header.data_from_fileobj(fileobj)
61 if isinstance(self.file_like, basestring): # filename
62 fileobj.close()

~/.local/lib/python2.7/site-packages/nibabel/analyze.pyc in data_from_fileobj(self, fileobj)
484 '''
485 # read unscaled data
--> 486 data = self.raw_data_from_fileobj(fileobj)
487 # get scalings from header. Value of None means not present in header
488 slope, inter = self.get_slope_inter()

~/.local/lib/python2.7/site-packages/nibabel/analyze.pyc in raw_data_from_fileobj(self, fileobj)
456 shape = self.get_data_shape()
457 offset = self.get_data_offset()
--> 458 return array_from_file(shape, dtype, fileobj, offset)
459
460 def data_from_fileobj(self, fileobj):

~/.local/lib/python2.7/site-packages/nibabel/volumeutils.pyc in array_from_file(shape, in_dtype, infile, offset, order)
482 if datasize == 0:
483 return np.array([])
--> 484 data_str = infile.read(datasize)
485 if len(data_str) != datasize:
486 if hasattr(infile, 'name'):

/usr/lib/python2.7/gzip.pyc in read(self, size)
254 try:
255 while size > self.extrasize:
--> 256 self._read(readsize)
257 readsize = min(self.max_read_chunk, readsize * 2)
258 except EOFError:

/usr/lib/python2.7/gzip.pyc in _read(self, size)
306
307 uncompress = self.decompress.decompress(buf)
--> 308 self._add_read_data( uncompress )
309
310 if self.decompress.unused_data != "":

/usr/lib/python2.7/gzip.pyc in _add_read_data(self, data)
324 self.crc = zlib.crc32(data, self.crc) & 0xffffffffL
325 offset = self.offset - self.extrastart
--> 326 self.extrabuf = self.extrabuf[offset:] + data
327 self.extrasize = self.extrasize + len(data)
328 self.extrastart = self.offset

MemoryError:
