Skip to content

WIP: Adding ICA_AROMA functionality #539

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 59 commits into from
Jun 28, 2017
Merged

WIP: Adding ICA_AROMA functionality #539

merged 59 commits into from
Jun 28, 2017

Conversation

jdkent
Copy link
Collaborator

@jdkent jdkent commented May 27, 2017

This is my attempt at adding ICA_AROMA functionality to fmriprep. Almost 100% certain it doesn't work as is.

I have ICA_AROMA completing it's own transforms since it appears the template space could be something something different than ICA AROMA is expecting, in addition, ICA_AROMA requires a affine fsl matrix and fnirt fsl warp to pass in so ANTs is off the table (afaik).

I may continue editing/working on this, but first I wanted feedback if this is a worthwhile endeavour or if it's better to hi-jack the original code from ICA-AROMA to make it flexibly fit in the fmriprep workflows (as was done with compcor).

Thanks!
James

@chrisgorgo
Copy link
Contributor

Hey. This is a great contribution. In fact we planned to do this - see #334 (comment) so I am really happy to see your contribution! I prefer reusing rather than reinventing so using the Nipype interfaces as you are doing is preferable to hijacking the code.

A few things I would suggest:

  • Do we need to perform the normalization again using FNIRT or maybe we can just reused the normalized times generated using ANTs (outputs of epi_mni_trans_wf)? It would save some time.

  • We should make sure the smoothing is performed with the same algorithm/parameters as in the original paper to match the training set.

Things worth considering, but potentially a separate PR:

  • It would be great to add visualization of the good/bad components to the report. I wrote a generic melodic reportlet generator that could be reused here: [RTM] Added MELODICRPT. niworkflows#134. This would require running melodic outside of aroma, but is should support this use case.

  • Not sure we need to make users chose between aggressive and non-agressive aggressors. We could output both sets in the confounds.tsv file as a default. There should be a way to generate both sets without much computational overhead. It would avoid users having to rerun all of fmriprep if they change their minds.

Once again - thanks for this PR!

@jdkent
Copy link
Collaborator Author

jdkent commented May 28, 2017

Thanks you for your comments! Would the template: MNI152NLin2009cAsym always have to be specified to run ICA_AROMA, since it looks like it's currently not mandatory.

  • I can see about reporting the good/bad components, but I will at least run melodic separately to generate the report.

  • aggressive v. non-aggressive. This is completed via fsl_regfilt, where the non-aggressive denoising is completed through partial regression (e.g. variance shared with the other components is kept in the data) versus full regression (e.g. the shared variance is removed from the data). So I would want to return the coefficients from the partial regression, I think. fsl_regfilt does not return the coefficients sadly, so I would need to calculate that myself, perhaps with this?

  • It would be easier just to pass in the epi_mni and forgo the transforms, is this less recommended? I guess I'm less sure about whether the ouputs are organized so we can pass one affine and one warp if there are fieldmaps in the processing stream.

  • I will look into the published smoothing method

@chrisgorgo
Copy link
Contributor

chrisgorgo commented May 28, 2017 via email

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 1, 2017

I'm still not done with this. Things I still need to work on:

  • ica aroma partial regression: I've given it some effort, but I can't replicate the results from fsl_regfilt for non-aggressive denoising. My attempt followed this line of reasoning.
    linear regression: y_hat = mx +b, so the residual (or partial) after regression should be y - y_hat, right? I can send a link to some example data if what I did wasn't a simple mathematical error.

  • I still need to work on the melodicRPT, I'll continue reading up on that to add that functionality.

  • clean up the code and get formatting consistent. I still don't have a good sense of what should be a method, a node, and a workflow, and I saw in a tutorial that some simple mathematical operations (like number output from ImageStats) can be modified before being passed into another node.

Just making an update to keep consistent with working on the code/learn the nipype framework.

@chrisgorgo
Copy link
Contributor

chrisgorgo commented Jun 1, 2017 via email

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 1, 2017

RE: regression
here is the relevant bit of code:

def calc_residuals(x,y):
        X = np.column_stack(x+[[1]*len(x[0])])
         beta_hat = np.linalg.lstsq(X,y)[0]
         y_hat = np.dot(X,beta_hat)
         residuals = y - y_hat
 
         return residuals

Where y is the "bad" motion IC, and x is a matrix of the good_ICs. Maybe will leave it to a separate PR if I can't figure it out soon. I asked on the FSL listserve to possibly get an answer

@chrisgorgo
Copy link
Contributor

chrisgorgo commented Jun 1, 2017 via email

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 12, 2017

Okay, I'm getting closer to making this a reality, there are still several TODOs before it's ready for a formal review:

  • I can't get the basic melodic reportlet to work, I get the following error:
170612-01:58:13,360 workflow INFO:
	 Executing node ds_ica_aroma_report in dir: /out/work2/fmriprep_wf/single_subject_controlGE159_wf/func_preproc_task_flanker_wf/func_reports_wf/ds_ica_aroma_report
170612-01:58:13,463 workflow ERROR:
	 ['Node ds_ica_aroma_report failed to run on host 6c638518fad3.']
170612-01:58:13,466 workflow INFO:
	 Saving crash info to /out/fmriprep/sub-controlGE159/log/20170612-015349_3ed5f63d-a4cb-4a3c-9244-2955be966222/crash-20170612-015813-root-ds_ica_aroma_report-8fc746fc-5653-40cb-99ba-ced522bd80e3.pklz
170612-01:58:13,467 workflow INFO:
	 Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 322, in _send_procs_to_workers
    self.procs[jobid].run()
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/engine/nodes.py", line 372, in run
    self._run_interface()
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/engine/nodes.py", line 482, in _run_interface
    self._result = self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/engine/nodes.py", line 613, in _run_command
    result = self._interface.run()
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/interfaces/base.py", line 1066, in run
    self._check_mandatory_inputs()
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/interfaces/base.py", line 971, in _check_mandatory_inputs
    raise ValueError(msg)
ValueError: DerivativesDataSink requires a value for input 'in_file'. For a list of required inputs, see DerivativesDataSink.help()

^it looks like a simple error but I can't find the solution right now.
Here is the link to the workflow graph made, in case I connected the nodes incorrectly.

  • Have ica_aroma output in the correct directory: right now all the output is going into a /tmp/ directory, but I think with ENH: ICA_AROMA: make current working directory default output directory nipy/nipype#2056, this issue should be fixed as soon as niworkflows uses the updated version of nipype.

  • change the testing suite to reflect ica aroma functionality.

  • I've changed the dockerfile to include fsl atlases, but I don't know if this is needed, or if I could make a symbolic link between the template image fmriprep already uses to the standard place ica aroma expects the MNI images.

Good news: within a docker container, I have it working (albiet the melodic report error above), but the confounds tsv file is generated with the expected output (the partial regression implemented here is not identical to fsl_regfilt's partial regression, but it's close)

@chrisgorgo
Copy link
Contributor

Great progress! I had a look at the code, but unfortunately I cannot see what could be wrong with the way you use MELODICRPT. Things worth checking:

  • Is the SVG file present in the working directory of the MELODICRPT node? You might need to set remove_unnecessary_outputs to false in nipype.cfg

  • the only difference between how you are using MELODICRPT and how it it used in mriqc is setting the report_mask. This should not be necessary, but it is a good idea anyway since you already have a mask available and there is no need to calculate it again the the MELODICRPT.


from niworkflows.nipype.interfaces.nilearn import SignalExtraction
from fmriprep.interfaces.utils import prepare_roi_from_probtissue


def init_discover_wf(bold_file_size_gb, name="discover_wf"):
def init_discover_wf(bold_file_size_gb, name="discover_wf",use_aroma=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use_aroma should not have a default set here. This might be the core of the error you are getting.

[('aroma_confounds','aroma_confounds')]),
#TODO change melodic report to reflect noise and non-noise components
(melodic, outputnode,
[('out_report','ica_aroma_report')]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ica_aroma_report => out_report

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 12, 2017

Thanks @chrisfilo and @effigies, you guys are rockstars! That fixed the report error I was getting in the workflow. Sorry that re-running takes forever, when nipy/nipype#2056 is implemented, debugging will be way faster (I think).

Fred Mertz and others added 2 commits June 12, 2017 17:51
@effigies
Copy link
Member

Okay. Let me try to summarize the situation. As I think I've said, I've never used MELODIC or AROMA; hopefully I understand roughly what's going on, but I recognize what follows is likely to be laughably simplistic. Please correct me if it's also wrong.

  1. MELODIC finds IC time series, which I assume are global signals, independent of space. Some of these get labeled noise, some get labelled signal.
  2. Aggressive de-noising simply regresses out all noise ICs from every voxel.
  3. Non-aggressive does some kind of per-voxel modification of the ICs, so we can't actually generate these regressors in such a way as to apply in a GLM anyway.

If that's basically right, then I think your strategy seems reasonable.

I believe the BIDS preference is always for TSV, not CSV, so I'd use TSV for both of your files. We should also avoid _ in the descriptive parts of the filenames, since that's used as a field delimiter in BIDS. How about:

  • aroma_noise_classification.csv -> AROMAnoiseICs.tsv
  • aroma_melodic_mix.txv -> MELODICmix.tsv

Or something like that? Since the MELODIC output is actually not specific to AROMA post-processing, I don't think we need to label it with both AROMA and MELODIC. I'm open to alternative capitalization schemes.

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 23, 2017

Looks like you got it right for points 1 and 2, and I'm not sure if I have it right for point 3, but I'll try explaining what I think I know more.
RE: 3, The response I got from the fsl listserv suggests it's possible to make a non-aggressive denoising glm matrix, but I'm not mathematically inclined enough to recreate the process. Additionally, the pseudo-inverse of Y_clean (where Y_clean is the image data) is what we need to pass into the glm. It may be possible to simplify the matrix, but the direct solution appears to be each voxel would get a column. (if my understanding is correct, haha).

I believe the BIDS preference is always for TSV, not CSV,

I also believe the preference is for tsv, but this is another special case. I include the noise IC classification as a csv because fsl_regfilt does not accept tab separated values with it's "-f" flag. However, fsl_regfilt does not tell you it can't read the .tsv, instead it only filters the first index/number in the list. Making the AROMAnoiseICs.tsv could lead to misuse on the user end if they try to pass the tsv directly to fsl_regfilt without knowing that quirk. ugh, that is annoying.

^^clarification of fsl_regfilt, it does not directly read in a file for the -f flag, rather you cat the contents of the file following the -f flag like so:
-f $(cat AROMAnoiseICs.tsv)
so the problem may be a part of the way bash parses text on the commandline, and not fsl_regfilt itself.

@effigies
Copy link
Member

Alright. I think in any case, it's reasonable to go ahead in this PR and not produce the non-aggressive regressors, and just add documentation. If future investigation shows we can do it for users without producing a column per voxel, that can be another PR. (Incidentally, if it would end up being a column per-voxel, it probably makes more sense to output it as a .nii.gz time series so that the spatial information is directly encoded.)

Thanks for the context on the CSV issue. That makes sense and let's definitely go ahead with your plan, there.

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 23, 2017

I'm not entirely sure what the errors for the ci/circle mean for me, I've tested it on my data (however, there were not edge cases where no signal components were detected), so I am unsure where I should look to fix the error.

Figured it out by saying it out loud, rubber duck debugging works again

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 24, 2017

Now I'm out of ideas on how to pass the tests, it looks like running ICA_AROMA.py from the container failed, perhaps due to some property of the data?

#572 fixed it

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 25, 2017

@effigies @chrisfilo Question: what software do you use to preview the *rst files in docs? I tried retest for ubuntu, but my first impression is disappointing.

Again thank you for taking the time to help me through this journey, I believe the code is ready barring documentation formatting and any further suggestions you have.

@effigies
Copy link
Member

python setup.py build_sphinx

Generally run it in a docker container, because it's pretty touchy.

@effigies
Copy link
Member

And I'll have a thorough look through (probably Monday). Thanks so much for all the effort you've put into this.

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. Review time.

Looks good! Lots of little comments and a couple organizational issues, but no substantial code changes that I see.

Also:

  • Can you add the ICA-AROMA dependency and version to docs/installation.rst?

@@ -320,15 +324,29 @@ Confounds estimation

from fmriprep.workflows.confounds import init_discover_wf
wf = init_discover_wf(name="discover_wf",
use_aroma=False,ignore_aroma_err=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space after first comma.


# ICA_AROMA options
g_aroma = parser.add_argument_group('Specific options for running ICA_AROMA')
g_aroma.add_argument('--use_aroma', action='store_true', default=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's switch this to --use-aroma? This won't change the variable name, so it's a lightweight change. The only other places to update it are in the docs where you talk about the flag.

bold_file_size_gb=3)

Given a motion-corrected fMRI, a brain mask, MCFLIRT movement parameters and a
segmentation, the `discover_wf` sub-workflow calculates potential
confounds per volume.
Optional (--use_aroma): give a motion corrected fMRI in MNI space, and mask in MNI space,
the `discover_wf` sub-workflow calculates potential motion related
independent components using ICA_AROMA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this addition can be discarded. The next section explains the change in outputs.

bold_file_size_gb=3)

Given a motion-corrected fMRI, a brain mask, MCFLIRT movement parameters and a
segmentation, the `discover_wf` sub-workflow calculates potential
confounds per volume.
Optional (--use_aroma): give a motion corrected fMRI in MNI space, and mask in MNI space,
the `discover_wf` sub-workflow calculates potential motion related
independent components using ICA_AROMA

Calculated confounds include the mean global signal, mean tissue class signal,
tCompCor, aCompCor, Framewise Displacement, 6 motion parameters and DVARS.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to:

... 6 motion parameters, DVARS, and, if the ``--use-aroma`` flag is enabled, the noise
components identified by ICA-AROMA (those to be removed by the "aggressive" denoising strategy).

``fsl_regfilt -i <BIDS SUBJECT>_task-<TASK>_bold_space-<SPACE>_preproc.nii.gz
-f $(cat <BIDS SUBJECT>_task-<TASK>_bold_AROMAnoiseICs.csv)
-d <BIDS SUBJECT>_task-<TASK>_bold_MELODICmix.tsv
-o <BIDS SUBJECT>_task-<TASK>_bold_space-<SPACE>_AromaNonAggressiveDenoised.nii.gz``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to rephrase this whole section a bit just to improve the narrative flow a little. Think of the following as a rough template, and feel free to tweak it further:

*Note*: Confounds for performing *non*-aggressive denoising cannot be generated in FMRIPREP.
If the ``--use-aroma`` flag is passed to FMRIPREP, the noise components and MELODIC mix will
be generated, and non-aggressive denoising may be performed with ``fsl_regfilt``, *e.g.*::

    fsl_regfilt -i sub-<subject_label>_task-<task_id>_bold_space-<space>_preproc.nii.gz \
        -f $(cat sub-<subject_label>_task-<task_id>_bold_AROMAnoiseICs.csv) \
        -d sub-<subject_label>_task-<task_id>_bold_MELODICmix.tsv \
        -o sub-<subject_label>_task-<task_id>_bold_space-<space>_AromaNonAggressiveDenoised.nii.gz

It may be worth linking to any online documentation that you think appropriate for fsl_regfilt, and this can include mailing list threads if they give better context. This is a judgment call; if you don't think it'll be helpful, skip it.

('melodic_mix', 'melodic_mix')]),
# TODO change melodic report to reflect noise and non-noise components
(melodic, outputnode,
[('out_report', 'out_report')]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these connections can be reduced to one or two lines. I find this extremely hard to read as is.

debug, output_grid_ref, layout=None):
fmap_bspline, fmap_demean, debug, output_grid_ref,
use_aroma, ignore_aroma_err,
use_syn, force_syn, layout=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to reduce the diff to a single addition, rather than 2-/4+.

fmap_bspline, fmap_demean, use_syn, force_syn,
use_aroma, ignore_aroma_err,
debug, output_grid_ref, layout=None):

@@ -142,11 +150,11 @@ def init_func_preproc_wf(bold_file, ignore, freesurfer,
('subject_id', 'inputnode.subject_id'),
('fs_2_t1_transform', 'inputnode.fs_2_t1_transform')
]),
(inputnode, discover_wf, [('t1_tpms', 'inputnode.t1_tpms')]),
(inputnode, discover_wf, [('t1_tpms', 'inputnode.t1_tpms'),
]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's revert this newline.

(discover_wf, func_derivatives_wf, [
('outputnode.aroma_noise_ics', 'inputnode.aroma_noise_ics'),
('outputnode.melodic_mix', 'inputnode.melodic_mix')]),
])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our current pattern is to pass anything that needs to go to derivatives to outputnode, and then pass the outputnode fields to func_derivatives_wf.inputnode. The reasoning is basically that if it's a derivative, it's something another workflow should reasonably expect access to.

workflow.connect([
(inputnode, ds_ica_aroma_report, [('source_file', 'source_file'),
('ica_aroma_report', 'in_file')]),
])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The close brace should be aligned with the start of the tuple. e.g.

if x:
    wf.connect([
        (node1, node2, [(...)]),
        ])

.circle/tests.sh Outdated
@@ -43,7 +43,7 @@ case ${CIRCLE_NODE_INDEX} in
| tee $HOME/docs/builddocs.log
cat $HOME/docs/builddocs.log && if grep -q "ERROR" $HOME/docs/builddocs.log; then false; else true; fi
fmriprep-docker -i poldracklab/fmriprep:latest --help
fmriprep-docker -i poldracklab/fmriprep:latest --config $HOME/nipype.cfg -w $HOME/ds005/scratch $HOME/data/ds005 $HOME/ds005/out participant --output-space fsaverage5 --debug --write-graph
fmriprep-docker -i poldracklab/fmriprep:latest --config $HOME/nipype.cfg -w $HOME/ds005/scratch $HOME/data/ds005 $HOME/ds005/out participant --debug --write-graph --use_aroma --ignore-aroma-denoising-errors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update --use-aroma here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, thanks!

@effigies effigies merged commit df80957 into nipreps:master Jun 28, 2017
@effigies
Copy link
Member

@jdkent Thanks so much for spending a month on this. This was a huge contribution.

@chrisgorgo
Copy link
Contributor

chrisgorgo commented Jun 28, 2017 via email

@jdkent
Copy link
Collaborator Author

jdkent commented Jun 28, 2017

@effigies @chrisfilo This was an amazing learning experience for me and I can't emphasize enough how thankful I am for you two taking the time to guide me through this contribution. I plan to keep contributing to fmriprep and projects like it (hopefully I'll need less help and be able to help others as I learn more about nipype and python).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants