Skip to content

Conversation

@chris-langfield
Copy link
Collaborator

@chris-langfield chris-langfield commented Nov 20, 2021

Previously, a RelionSource would automatically look for the following CTF related parameters in the STAR file:

"_rlnVoltage",                                                                                                         
"_rlnDefocusU",                                                                                                        
"_rlnDefocusV",                                                                                                        
"_rlnDefocusAngle",                                                                                                    
"_rlnSphericalAberration",                                                                                             
"_rlnAmplitudeContrast", 

If they exist in the STAR file, self.unique_filters is populated with a CTFFilter for each unique combination of the parameters. This filter is constructed directly from the parameters above except for "_rlnDefocusAngle", which is multiplied by np.pi / 180 to convert from degrees to radians.

When filter_indices is set, behind the scenes, the ImageSource.filter_indices setter reads the properties of the unique_filters that were already set:

attribute_list = (
                "voltage",
                "defocus_u",
                "defocus_v",
                "defocus_ang",
                "Cs",
                "alpha",
            )
            filter_values = np.zeros((len(indices), len(attribute_list)))
            for i, filt in enumerate(self.unique_filters):
                filter_values[indices == i] = [
                    getattr(filt, attribute, np.nan) for attribute in attribute_list
                ]

And then reassigns this metadata, even though it has already been read in from the STAR file (which allowed the filters to be created in the first place):

self.set_metadata(
            [
                "_rlnVoltage",
                "_rlnDefocusU",
                "_rlnDefocusV",
                "_rlnDefocusAngle",
                "_rlnSphericalAberration",
                "_rlnAmplitudeContrast",
            ],
            filter_values,
        )

As far as I can tell, this is completely circular except for the "_rlnDefocusAngle", which was changed from degrees to radians during the construction of the filter.

This PR

  • Allows a RelionSource to be created from a STAR file with no metadata, i.e. one column: _rlnImageName. In particular, when a CoordinateSource is saved (ImageSource subclass to represent datasets with coordinate files representing picked particles #489 CoordinateSource -- class factory pattern refactor #520), it creates a STAR file in this format, meaning it can be loaded and saved as a RelionSource.

  • Removes the Relion-specific behavior of the setter method of ImageSource.filter_indices, which tries to populate the above parameters. Also, the table containing the data types of each Relion field has been moved from ImageSource to RelionSource. This relates to Separate Relion-specific defaults from ImageSource #519 where we would like to make ImageSource more generalized.

  • Refactors the classmethod starfile2df in RelionSource to populate_metadata, and adds some comments for clarity.

  • Changes the default metadata storage of _rlnDefocusAngle from radians to degrees, to agree with how it is represented by Relion. This change is a side effect of removing the filter_indices setter, which would repopulate this parameter based on the CTFFilter attribute, which is in radians. Now it keeps the original value read from the STAR file.

@codecov
Copy link

codecov bot commented Nov 20, 2021

Codecov Report

Merging #522 (b2a7d0c) into develop (ce200cc) will decrease coverage by 0.00%.
The diff coverage is 92.10%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #522      +/-   ##
===========================================
- Coverage    87.17%   87.16%   -0.01%     
===========================================
  Files          108      108              
  Lines         7984     7987       +3     
===========================================
+ Hits          6960     6962       +2     
- Misses        1024     1025       +1     
Impacted Files Coverage Δ
src/aspire/source/image.py 95.66% <ø> (+0.22%) ⬆️
src/aspire/source/relion.py 93.47% <90.32%> (-1.98%) ⬇️
src/aspire/source/simulation.py 99.41% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ce200cc...b2a7d0c. Read the comment docs.

@chris-langfield chris-langfield force-pushed the refactor_image_src_519 branch 3 times, most recently from 3543145 to 01afc7c Compare November 29, 2021 15:34
@chris-langfield chris-langfield linked an issue Dec 3, 2021 that may be closed by this pull request
@chris-langfield
Copy link
Collaborator Author

Squashed down to one commit. @garrettwrong I added a method in Simulation to populate the metadata that would have been populated by the ImageSource.filter_indices setter. However, it is clunky and adds lines of code. It could potentially be useful for other subclasses (e.g. if CTFFilters are added to a CoordinateSource). Should I move it up into ImageSource instead?

@janden
Copy link
Collaborator

janden commented Dec 8, 2021

The issue of storing the CTF parameters probably merits a discussion. Right now, as you point out, we store it in two places: unique_filters and various metadata fields. Ideally, there'd just be one. This is why you're seeing the copying from one place to another – to make sure that the values correspond (except for the degrees↔radians).

The root of the complication is that while we have some filters (like the CTF) that are specified by the forward model (and hence have their parameters as part of the metadata), others are given by our image processing functionality (whitening, phase flipping, downsampling, etc.). This is not so much an issue when all we're doing is processing the data within the framework, but once we want to export (write STAR files) we need to have something that fits into the CTF format again.

@chris-langfield chris-langfield self-assigned this Dec 8, 2021
garrettwrong
garrettwrong previously approved these changes Feb 11, 2022
Copy link
Collaborator

@garrettwrong garrettwrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize this was waiting for me to look at, sorry. Overall this looks good. I made a few comments. I think only one or two things you might consider before sending to Joakim, but I didn't feel strongly enough about them to hold it up.

# for these columns
#
# class attributes of CTFFilter:
CTFFilter_attributes = (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if these would be better as a dictionary mapping between CTFFilter_attributes and CTF_params. They look similar. At least similar enough I had to look if exactly the same...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah there is a one-to-one correspondence. But the two sets of strings are used separately, so I think to do the set_metadata line that follows this line, I'd have to do something like CTFDict.keys(), which I would probably find more confusing if I were a newcomer. Unless I'm missing something

This was referenced Feb 17, 2022
@chris-langfield chris-langfield merged commit 36cc4ec into develop Feb 17, 2022
garrettwrong pushed a commit that referenced this pull request Feb 22, 2022
* Refactor ImageSource <-> RelionSource, and change Simulation accordingly to populate its metadata

* typo

Co-authored-by: Christopher Langfield <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Separate Relion-specific defaults from ImageSource

4 participants