
ENH: Improve par-rec support #253

Closed

larsoner wants to merge 13 commits into nipy:master from Eric89GXL:fp-checks

Conversation

larsoner
Contributor

Goals:

  • Detect and correct mis-ordered acquisition (e.g., x, y, t, z).
  • Detect and return multiple echoes in dataset.
  • Detect and optionally permit missing data (e.g., run too short, doesn't contain full dataset).
  • Add file overwrite checks and option to overwrite.
  • Correct the output when multiple data types with different scale factors are present in the same recording (e.g., phase contrast vs. complex).
  • Output DTI b-values and vectors, if recorded.
  • Output dwell time.
  • Make tests more comprehensive.

Although it's not 100% complete, it would be good to get a round of review to make sure I'm not way off.

nhdr.set_slope_inter(1., 0.)
else:
nhdr.set_slope_inter(slope, intercept)
data = raw_data
Contributor Author

@matthew-brett, this was done to support writing data that had variable scale factors.

The raw_data.dtype is np.int16 for one of my datasets, which is not good. Does the NIfTI format support writing other datatypes, maybe np.float32? I tried outputting without casting (which means np.float64 after this math), and that definitely did not work. It would be nice not to have to use an integer format, especially for data like this that must be scaled before being saved.

The cleaner alternative here would be to save a multidimensional vector of scale factors, but from looking at the nifti code a bit it seems like it expects single scalars.

And from a bit of reading it seems like other dtypes are supported, but I'm not sure how to get it to work; it seems like the header will have to be changed to "know" that the dtype is not np.int16.
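Something like this, maybe? (A rough sketch only, assuming the standard Nifti1Header / Nifti1Image API; affine and outfilename are placeholders for whatever the surrounding parrec2nii code uses, and the other names follow the diff above.)

    import numpy as np
    import nibabel as nib

    # tell the NIfTI header about the float dtype before writing, otherwise
    # the int16 dtype copied from the PAR header is kept
    scaled = raw_data.astype(np.float32) * slope + intercept
    nhdr.set_data_dtype(np.float32)
    nhdr.set_slope_inter(1., 0.)  # scaling is already applied to the data
    img = nib.Nifti1Image(scaled, affine, nhdr)
    nib.save(img, outfilename)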

@larsoner
Contributor Author

One thing @mrjeffs suggested was that for DTI scans there should be an option (enabled by default) to discard the trace volume(s), if present. Interested to hear if others agree, too.

@matthew-brett
Member

Can you say more about the trace volumes?

@larsoner
Contributor Author

In a couple of our data files the trace / mean diffusivity images are given for b=0 and b=3000. It might be convenient to discard them by default -- in our protocol this would make it so that the resulting NIFTI file had 64 volumes corresponding to the 64 directions, as opposed to 66 volumes (since our PAR/REC files have the two trace images in them).

@matthew-brett
Member

Sorry to take a little while to get back to you with a review - internet is very tough here, but I should be able to get to my mail every couple of days.

Here's a review all in a lump because I could write it offline, and the github web interface is very slow to use here.

Thanks a lot for this.

The PEP8 changes in the parrec2nii option parser have the effect of adding an
annoying leading space to the output strings for the options. For example, this:

  --origin=ORIGIN        Reference point of the q-form transformation of the
                        NIfTI image. If 'scanner' the (0,0,0) coordinates will

instead of:

  --origin=ORIGIN       Reference point of the q-form transformation of the
                        NIfTI image. If 'scanner' the (0,0,0) coordinates will

In general the PEP8 changes are a bit confusing. Probably too late now, but
I personally prefer to make PEP8 changes, by hand, only within code blocks I'm
editing myself, to make it as obvious as possible where the real changes are.

You've found that the par / rec code is not organized very well for
testing or re-use, because there is far too much code in the bin/parrec2nii
file. The reason for this is partly lack of development time, and partly
because Michael's original use for the code assumed a direct relationship
between the REC file and the output NIfTI image data. You've run into many
deviations from this, of which the most obvious are:

  • interleaved slice schemes
  • time before slice ordering
  • multiple scale factors

If the REC file has one scale factor and no reordering, then there is a direct
translation to the NIfTI file, and you can just write the NIfTI single
scalefactor. Otherwise you have to do something more complicated.

At the moment, I think I'm right in saying that all the data reordering is in
parrec2nii. This has the advantage that you can detect the simple vs
not-simple case there, and write the simple NIfTI if possible, but the
disadvantage that loading the image into memory using parrec.load will
often result in a file with broken ordering.

Also - again partly because of all the code in parrec2nii, I think there
aren't any tests for the new code.

So, I think the best thing to do here is to move the non-option code out of
parrec2nii and into the parrec header.

The header could have the logic for resorting the slices, with a helper method
maybe something like this:

    def sorted_slice_indices(self):
        """ Return indices to sort (and maybe discard) slices in REC file

        Returns list for indexing into a single dimension of an array.
        """
        # No attempt to detect missing combinations or early stop
        idefs = self.image_defs
        is_trace = np.logical_and(
            np.all(idefs['diffusion'] == 0, axis=1),  # zero gradient direction
            idefs['diffusion_b_factor'] != 0)  # but non-zero b value
        keys = zip(
            idefs['echo number'],
            idefs['dynamic scan number'],
            is_trace, # put trace scan at end
            idefs['gradient orientation number'],
            idefs['slice number'],
            idefs['index in REC file'])
        return [key[-1] for key in sorted(keys)]

This would generate a vector of slice_indices. Then you could write a helper
function to get the data:

def _data_from_rec(rec_fileobj, in_shape, dtype, slice_indices, out_shape,
                   scaling=None):
    rec_data = array_from_file(in_shape, dtype, rec_fileobj)
    if scaling is not None:
        slopes, inters = scaling
        # upcast out of the on-disk integer type before scaling
        rec_data = rec_data * slopes + inters
    n_slices = in_shape[-1]
    if (len(slice_indices) != n_slices or
            np.any(slice_indices != np.arange(n_slices))):
        rec_data = rec_data[..., slice_indices]
    return rec_data.reshape(out_shape, order='F')

That would make it easy to rewrite the *_from_fileobj methods:

    def get_rec_shape(self):
        inplane_shape = tuple(self._get_unique_image_prop('recon resolution'))
        return inplane_shape + (len(self.image_defs),)

    def get_slice_scaling(self, method='dv'):
        scale_slope = self.image_defs['scale slope']
        rescale_slope = self.image_defs['rescale slope']
        rescale_intercept = self.image_defs['rescale intercept']
        if method == 'dv':
            return rescale_slope, rescale_intercept
        elif method == 'fp':
            return (1. / scale_slope,
                    rescale_intercept / (rescale_slope * scale_slope))
        raise ValueError("Unknown scaling method '%s'." % method)

    def raw_data_from_fileobj(self, fileobj):
        return _data_from_rec(
                fileobj,
                self.get_rec_shape(),
                self.get_data_dtype(),
                self.sorted_slice_indices(),
                self.get_data_shape(),
                scaling=None)

    def data_from_fileobj(self, fileobj):
        return _data_from_rec(
                fileobj,
                self.get_rec_shape(),
                self.get_data_dtype(),
                self.sorted_slice_indices(),
                self.get_data_shape(),
                scaling=self.get_slice_scaling())

The slice_indices method would make it fairly easy to support things like
multiple echoes and dynamics, for example, by sorting by (slice number,
dynamic number, echo number) and have shape (slice x, slice y, n_slices,
n_dynamics, n_echoes).

Reordering time and slice would just be a question of returning slice indices
for sorting slices by (slice number, time). Discarding volumes or dealing
with truncation would just involve dropping the matching indices from the
returned list.
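For example, discarding the trace volumes discussed earlier could just be a filter on the returned indices. Rough sketch only; hdr is a hypothetical PARRECHeader instance and the field names are the same assumptions as in sorted_slice_indices above:

    import numpy as np

    idefs = hdr.image_defs
    is_trace = np.logical_and(
        np.all(idefs['diffusion'] == 0, axis=1),  # zero gradient direction
        idefs['diffusion_b_factor'] != 0)         # but non-zero b value
    indices = np.asarray(hdr.sorted_slice_indices())
    indices = indices[~is_trace[indices]].tolist()  # drop slices of trace volumes
    # A truncated run could be handled the same way, by dropping the indices
    # of any dynamic / echo that does not have its full set of slices.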

Your ArrayProxy could be something very simple like:

from nibabel.openers import BinOpener  # used to open file_like below


class PARRECArrayProxy(object):
    def __init__(self, file_like, header):
        self.file_like = file_like
        # Copies of values needed to read array
        self._shape = header.get_data_shape()
        self._dtype = header.get_data_dtype()
        self._slice_indices = header.sorted_slice_indices()
        self._slice_scaling = header.get_slice_scaling()
        self._rec_shape = header.get_rec_shape()

    @property
    def shape(self):
        return self._shape

    @property
    def dtype(self):
        return self._dtype

    @property
    def is_proxy(self):
        return True

    def get_unscaled(self):
        with BinOpener(self.file_like) as fileobj:
            return _data_from_rec(
                    fileobj,
                    self._rec_shape,
                    self._dtype,
                    self._slice_indices,
                    self._shape,
                    scaling = None)

    def __array__(self):
        with BinOpener(self.file_like) as fileobj:
            return _data_from_rec(
                    fileobj,
                    self._rec_shape,
                    self._dtype,
                    self._slice_indices,
                    self._shape,
                    scaling = self._slice_scaling)

parrec2nii could use the returned scaling to detect when it can use a single
scalefactor.

    # rec_fobj here stands for an open file object on the REC file
    slopes, inters = pr_hdr.get_slice_scaling()
    if not np.any(np.diff(slopes)) and not np.any(np.diff(inters)):
        # Single scalefactor case
        nhdr.set_slope_inter(slopes[0], inters[0])
        nhdr.write_to(outfile)
        array_to_file(pr_hdr.raw_data_from_fileobj(rec_fobj), outfile)
    else:
        # Multi scalefactor case
        nhdr.set_slope_inter(1, 0)
        nhdr.set_data_dtype(np.float64)
        nhdr.write_to(outfile)
        array_to_file(pr_hdr.data_from_fileobj(rec_fobj), outfile)

where the header takes care of all the nasty re-sorting and scaling.

It would be good to support easy conversion of a parrec header to a nifti
header, in memory (and use this in parrec2nii). This could go as a header
method as_analyze_map, as in:

    def as_analyze_map(self):
        return dict(
            descrip="%s;%s;%s;%s" % (
                self.general_info['exam_name'],
                self.general_info['patient_name'],
                self.general_info['exam_date'].replace(' ', ''),
                self.general_info['protocol_name']),
            # etc
        )

Sorry - I should have documented this feature better, but it's a way that any
header can represent itself as key, value pairs of nifti / analyze fields.

If you do that, then you just do:

nimg = nifti1.Nifti1Image(raw_data, affine, pr_hdr)

and let as_analyze_map deal with the automatic conversion between par / rec
and nifti header fields.

Can you move dwell_time to a function in a new module like mriutils,
with the constants as module globals like:

GYROMAGNETIC_RATIO = 42.57  # MHz/T

Then you could maybe make get_water_fat_shift and get_echo_train_length
methods of the parrec.header, and use those? What do you think?
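For illustration, the dwell time calculation could then be a small function in that module. This is a sketch only, using the usual Philips water / fat shift relation; the function and argument names are made up and the constants are approximate:

    GYROMAGNETIC_RATIO = 42.57  # MHz/T
    WATER_FAT_SHIFT_PPM = 3.4   # approximate water / fat chemical shift difference


    def calculate_dwell_time(water_fat_shift, echo_train_length, field_strength):
        """Dwell time in seconds from water fat shift (pixels) and echo train length."""
        # MHz/T * T * ppm: the 1e6 factors cancel, so the product is already in Hz
        water_fat_shift_hz = field_strength * WATER_FAT_SHIFT_PPM * GYROMAGNETIC_RATIO
        return water_fat_shift / (water_fat_shift_hz * (echo_train_length + 1))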

It would be good to do the bvals, bvec calculation within the header object
too. Maybe these could go together in a new method of the parrec header - say
get_q_vectors (bvals * bvecs) (see discussion on the list about q_vector
field in JSON header). The trace scan could maybe be NaNs in this vector?
Then parrec2nii could just write out the bvals, bvecs files, as in something
like:

    q_vectors = header.get_q_vectors()
    bvals = np.sqrt(np.sum(q_vectors ** 2, axis=1))
    bvecs = np.zeros_like(q_vectors)
    b_non_zero = bvals > 0  # or some threshold
    bvecs[b_non_zero] = q_vectors[b_non_zero] / bvals[b_non_zero][:, np.newaxis]
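A rough sketch of what get_q_vectors itself might look like, with one row per image definition, the same assumed field names as above, and NaNs for the trace rows as suggested (parrec2nii would presumably pick one row per volume before writing the bvals / bvecs files):

    def get_q_vectors(self):
        # np is numpy, imported at module level in parrec.py
        idefs = self.image_defs
        bvals = idefs['diffusion_b_factor'].astype(float)
        bvecs = idefs['diffusion'].astype(float)  # assumed to be ~unit directions
        q_vectors = bvecs * bvals[:, np.newaxis]
        is_trace = np.logical_and(np.all(bvecs == 0, axis=1), bvals != 0)
        q_vectors[is_trace] = np.nan  # mark trace scans
        return q_vectors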

The q_vectors / bvecs will need rotating to match the stored image
orientation, to match the FSL convention that gradients are oriented to the
voxel axes. I think this would be (in parrec.py):

    import numpy.linalg as npl
    from nibabel.affines import apply_affine
    slice_orientation = header.get_slice_orientation()
    permute_to_psl = ACQ_TO_PSL.get(slice_orientation)
    q_vectors = apply_affine(npl.inv(permute_to_psl), q_vectors)

There are some useful test files that Yarick H put up on github at:

https://github.com/yarikoptic/nitest-balls1

We could easily include some or all of these in the nibabel repo. Could you
use those for testing?

Thanks a lot for working on this. With just a little more work I think we
could make this close to the best available par / rec conversion.

@larsoner
Contributor Author

Thanks for the comprehensive review. The only part I disagree with is the "little" in the phrase "a little more work" :)

But to be serious, the next issue in my mind was also that most of the heavy lifting is currently done by parrec2nii, which is unfortunate if one wants to use the files in Python directly (as opposed to writing files out). I agree that we should get it into the .load function instead if possible, and make parrec2nii a thin wrapper around methods on that class, as opposed to it doing any real work.

I'm happy to implement all of the suggestions here, they all seem reasonable. My only concern is the timeline. I can try to do this this week, but I'm a little bit wary of adequately implementing and testing all the changes by the time nibabel wants to release (this week?). I'll see how far I can get...

@larsoner
Contributor Author

Okay, I made some progress along the lines you suggested. It likely broke some of the old tests. I'll take a look at the test files that you mentioned hopefully soon. I get confused about how all the various classes and class methods are meant to interact, so it's probably not correct yet. But I'll at least try to get tests passing and it working on my files, then we can make it cleaner.

@@ -142,47 +176,69 @@ def proc_file(infile, opts):
else:
# anatomical or DTI
nhdr.set_xyzt_units('mm', 'unknown')
Contributor Author

@matthew-brett for example, could I somehow move this into my pr_hdr.as_analyze_map function so that I wouldn't have to set it in nhdr in parrec2nii? I think from your comment I'd need to update my dict returned by as_analyze_map, but I don't know how exactly (as_analyze_map doesn't seem to be used anywhere other than in as_analyze_map, where it's not clear what's happening).

Contributor Author

Or in analyze.py, rather.

Member

Right - yes - the units thing would go very well in
PARRECHeader.as_analyze_map - thanks again for working on this. I'll help
as much as I can...

Contributor Author

I'm just not sure what the syntax would look like; since as_analyze_map doesn't seem to be used or documented anywhere, I don't have anything to go on...

Member

Oh dear - sorry that is true.

I will write something better today and try to push it to master, but
as_analyze_map is a method of a header that takes 'self' as the only
argument (self being the header) and returns a dict, where the keys are
field names in the nifti or analyze header and the values are the values
for those keys in the analyze header. So, if you look at - say - the
nifti1.py header field definitions, you could return in the output
dict any named field in that definition. In the case of
set_xyzt_units, you would have to go to the nifti1.py source to see what
header fields that sets, and set those with key, value pairs in the
returned dict. Is that a little clearer? I'll post something soon with
better documentation.
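For the set_xyzt_units case, a sketch of what the returned dict might contain; the 'xyzt_units' field name and the unit codes (2 = mm, 8 = sec) come from the NIfTI-1 definitions in nifti1.py, and the descrip value is just an example:

    def as_analyze_map(self):
        return dict(
            descrip='converted from PAR/REC',
            xyzt_units=2,  # NIFTI_UNITS_MM; use 2 + 8 for mm and sec
        )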

Contributor Author

Yeah, that's good enough for now. If you're feeling adventurous, you could even just open a PR into my branch to extend the support if I don't get to it, then this could serve as the example. I'll see how far I can get and ping you to check it out.

@coveralls

Coverage Status

Coverage decreased (-0.23%) when pulling 2bde1d1 on Eric89GXL:fp-checks into 66b8d38 on nipy:master.

@larsoner
Contributor Author

Got tests passing again, now I need to make them more comprehensive / incorporate the new files.

@larsoner mentioned this pull request Aug 27, 2014
@larsoner
Contributor Author

@matthew-brett none of the Philips files in that test set contain the XYTZ ordering, unfortunately. What would you recommend for extending tests? I'm not experienced enough with these data yet to know what the appropriate tests are. Any chance you can open a PR into my branch to clean up loose ends / add tests?

@matthew-brett
Member

Ok - I will add a PR into your branch - will maybe be on Monday - won't
have internet for a while now.

@larsoner
Contributor Author

Sounds good

@matthew-brett
Member

It's a bit late for you now, but I just pushed this up for review, for as_analyze_map:

#256

If you have time to rebase on top of that in the next 10 minutes or so, that would help. Don't worry otherwise, I'm going offline again :)

@larsoner
Contributor Author

I could rebase, but there are no merge conflicts...?

@matthew-brett
Member

OK - don't worry - I'll do it offline, push on Monday or so.

Refactor header conversion routine to use mapping if available, then
basic conversion always.

Document the ``as_analyze_map`` interface in docstring.

Add ``as_analyze_map`` tests and API documentation.

Rename ``_set_format_specifics`` to ``_clean_after_mapping`` to make
clear that the routine is to set needed defaults after creating a header
from a mapping.

Remove unused ``_set_format_specifics`` from freesurfer/mghformat
@matthew-brett
Member

Sorry to be slow - various things came up this week, I might have more time over the weekend.

@larsoner
Contributor Author

larsoner commented Sep 5, 2014

Don't sweat it -- we're going to work on some testing with our data in parallel, hopefully next week as well.

@coveralls

Coverage Status

Coverage decreased (-0.21%) when pulling 323e946 on Eric89GXL:fp-checks into 4c2ffe8 on nipy:master.

@larsoner
Contributor Author

Rebased to refactor-analyze-map branch and pushed. The only test failures I get on my system are ones caught by un-merged #254, and Travis is happy. @matthew-brett any idea on a timeline to take a look?

@matthew-brett
Member

Sorry for the delay - I got distracted by various things. I will take a look now, get back to you in a day or two.


@matthew-brett
Member

Eric - are you online? I'm planning to upload some work but wanted your advice first.

@larsoner
Contributor Author

larsoner commented Oct 3, 2014

Indeed -- what's up?

@matthew-brett
Member

I've done some work on your branch. I found I needed to merge in another PR. I would like to merge the PRs to master, then push the rebased branch to another PR, closing this one. Is that OK?

@larsoner
Contributor Author

larsoner commented Oct 3, 2014

Sure, I can always open PRs into that branch instead if necessary -- whatever organization makes most sense at your end is fine by me.

@larsoner
Contributor Author

larsoner commented Oct 3, 2014

BTW, it might be good to get #254 in sooner rather than later since it touches many files, too; it makes testing output much cleaner / easier to parse.

@matthew-brett
Member

OK - thanks - have just pushed #261 with rebased changes.

matthew-brett added a commit that referenced this pull request Oct 13, 2014
MRG: Cleaner tests

In the process of working on #253 I was a little bit surprised by the number of warnings during testing. This cleans that up, and fixes a couple minor errors along the way.
@larsoner deleted the fp-checks branch October 24, 2014 19:02
grlee77 pushed a commit to grlee77/nibabel that referenced this pull request Mar 15, 2016
MRG: Cleaner tests

In the process of working on nipy#253 I was a little bit surprised by the number of warnings during testing. This cleans that up, and fixes a couple minor errors along the way.