-
Notifications
You must be signed in to change notification settings - Fork 135
WiP RF+ENH: further RF to support flexible heuristic to name sequences so they follow BIDS in their naiming #32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…or getting arb set of files Decided to split away 'session' notion, which was pretty much used in the script only whenever multiple tarballs are provided, as a sign of multiple sessions. Otherwise -- session is given on cmdline, and thus processing that particular session might be 'incompatible' with future runs for other sessions if we save the entire mapping in a single file which we might load, and which wouldn't have that session information. Also, to allow for consumption of arbitrary set of dicoms, which might be coming from different studies and sessions, we need to analyze/group before extracting useful session sequence information. So, that was also in preparation to that
…from the same study instance uid
now that we use named tuple, easy to make it into a key. For file groups though we would need to unpair them, but it all just makes "alignment" easier and allows to group per studyUID easier
now that we use named tuple, easy to make it into a key. For file groups though we would need to unpair them, but it all just makes "alignment" easier and allows to group per studyUID easier
r=9; out=../outputs-all-reran$r; rm -rf $out; HEUDICONV_LOGLEVEL=DEBUG bin/heudiconv --dbg -f heuristics/dbic_bids.py -c dcm2niix -o $out -b ../dartmouth-phantoms/bids_test3
Codecov Report
@@ Coverage Diff @@
## master #32 +/- ##
===========================================
+ Coverage 14.11% 78.35% +64.24%
===========================================
Files 4 6 +2
Lines 503 1109 +606
===========================================
+ Hits 71 869 +798
+ Misses 432 240 -192
Continue to review full report at Codecov.
|
…ierarchy of BIDS, safeguards against writing over
…fo (BK) seqinfo not necessarily matches info in terms of # of items apparently... so will not work as is
…cludes fieldmaps also creates template files for dataset descriptor etc
|
@yarikoptic - the heuristics has become very complicated now - requires a sufficiently enlightened python programmer. can we have the simpler version, which was relatively easy for people to learn? also:
|
|
I didn't change any of heuristics IIRC (at least not yet) ;) indeed currently it is a heap of changes, but once again -- I am just trying first to tune it so I could do what I need to accomplish, and later would hopefully look to catch back diversions from original implementation... which is tough as you know since there were not a single test of any kind. |
…roviding dicom series time for files mtime
…ailing spaces from json files
…uth-phantoms/bids_test4-20161014/phantom-1/fmap_acq-3mm
…writes), unified some names, etc
* gh-mvdoc/enh-anonym: Reload scans keys if existing, start adding tests Clean up requirements and setup.py Update dcmstack source to fix missing importsys Fix dbic_bids.py compatibility with python 3.x Remove forgotten pdb Extract function to get row info and add test Add randomly generated column to avoid figuring out date from hash Name variables that make sense Simplify parsing of date NF: save scan_keys tsv file BF: do not test with bids flag if using convertall Add more requirements BF: add missing import re in embedder
|
@yarikoptic - any chance this can be looked at and @matthew-brett's comments addressed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthew-brett @satra ha -- for some reason I was looking for your @matthew-brett comments in the PR against my clone, and saw none... then looked here. hopefully I caught all of them. I am now about to push the changes addressing (some) of them. And also merged with the work we did during brainhack -- to improve annonimization etc
| from collections import namedtuple | ||
| from collections import defaultdict | ||
| from collections import OrderedDict as ordereddict | ||
| from os.path import isdir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could as well be
just a mater of habit/style and also
- easier to prune -- I just need to delete the lines which are grayed out by pycharm due to imported names not being used.
Whenever it is all within a single line -- more pain, since need to go select specific names etc - I believe that it eases for pycharm to refactor/move pieces of code into other modules, so it finds/moves such import statements as well
bin/heudiconv
Outdated
| converted. This edited file will always overwrite the original file. If | ||
| there is a need to revert to original state, please delete this edit.txt | ||
| file and rerun the conversion | ||
| This script uses DicomStack and mri_convert to convert DICOM directories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
took your version and expanded... most probably will need to resolve conflicts with master by now ;) thanks
| ) | ||
|
|
||
|
|
||
| class TempDirs(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid concern! But primarily for the case whenever such refactoring would (and if, since original idea was to have a single script IIRC) happen -- ATM it is just as fine imho, so just added a TODO comment on that
bin/heudiconv
Outdated
| dcmfilter : callable, optional | ||
| If called on dcm_data and returns True, it is used to set | ||
| series_id | ||
| grouping : str ('studyUID', 'accession_number') or None, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, done
bin/heudiconv
Outdated
| Parameters | ||
| ---------- | ||
| fl : list of str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no reason for type (_list) suffix really -- making it files (noone works with open files in here, so no fear of confusing with file instances) and flfilter into file_filter
| allowed_groupings = ['studyUID', 'accession_number', None] | ||
| if grouping not in allowed_groupings: | ||
| raise ValueError('I do not know how to group by {0}'.format(grouping)) | ||
| per_studyUID = grouping == 'studyUID' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be done imho either way. this centralizes comparison to the string value into a variable. Happen we decide to refactor anything and complicate the logic to decide e.g. on per_studyUID value -- will be just a single point to change instead of hunting for all the comparisons to a string value. So I would leave it as is for now
bin/heudiconv
Outdated
| pass | ||
| studyUID_ = mw.dcm_data.StudyInstanceUID | ||
| except AttributeError: | ||
| #import pdb; pdb.set_trace() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks -- cleaned up here and elsewheres
bin/heudiconv
Outdated
| except AttributeError: | ||
| #import pdb; pdb.set_trace() | ||
| lgr.info("File %s is missing any StudyInstanceUID" % filename) | ||
| studyUID_ = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, did to file_studyUID
* origin/master: enh: dcm2niix now outputs TotalReadoutTime Conflicts: Dockerfile -- kept our (Satra's) reformatted version and just progressed dcm2niix to v1.0.20170621 (60bab318ee738b644ebb1396bbb8cbe1b006218f)
…later in the code
* gh-mvdoc/enh-anonym: Add test for add_rows_to_scans_keys_file BF: forgotten nipype Conflicts: setup.py -- both fixed for nipype
|
@yarikoptic - LGTM - i'm going to file an issue about cluster/distributed computing support to be added back after this is merged. |
add David V. Smith (DVSneuro) as author
Sits on top of #31
It implements perspective heuristic which would allow simply to name sequences following BIDS convention. More details available at https://docs.google.com/document/d/1R54cgOe481oygYVZxI7NHrifDyFUZAjOBwCTu7M7y48/edit?usp=sharing
TODOs
move derivatives(atm there is no derivatives)distribution-restrictions=sensitiveone of the test DICOM datasets to try it on is http://datasets.datalad.org/?dir=/dicoms/dartmouth-phantoms/bids_test3