Implement total_branch_length using numpy ops? #1794

@jeromekelleher

After #1704 lands we can implement total_branch_length like this:

    @property
    def total_branch_length(self):
        nodes = self.preorder()
        parent = self.parent_array[nodes]
        time = self.tree_sequence.tables.nodes.time
        branch_length = time[parent] - time[nodes]
        return float(np.sum(branch_length[parent != NULL]))

I'm sure this'll be faster most of the time, but I guess there could be some cases where the overhead of creating the time array leads to a regression. There'll be a bit of memory overhead too, but I doubt it's significant.
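To make the comparison concrete, here is a minimal sketch on plain NumPy arrays rather than a real tskit tree. The arrays `parent`, `time` and `nodes` are made-up stand-ins for `parent_array`, the node time column and the output of `preorder()`, and both helper functions are hypothetical names used only for illustration.

```python
import numpy as np

NULL = -1  # same sentinel value as tskit.NULL

# Toy tree: leaves 0, 1, 2; node 3 is the parent of 0 and 1; node 4 is the root.
# Node 5 is not reachable from the root, so it does not appear in `nodes`.
parent = np.array([3, 3, 4, 4, NULL, NULL], dtype=np.int32)
time = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 0.5])
nodes = np.array([4, 3, 0, 1, 2], dtype=np.int32)  # preorder from the root


def total_branch_length_loop(nodes, parent, time):
    # Node-by-node version: add the branch above every non-root node.
    total = 0.0
    for u in nodes:
        p = parent[u]
        if p != NULL:
            total += time[p] - time[u]
    return float(total)


def total_branch_length_numpy(nodes, parent, time):
    # Vectorised version, following the snippet in this issue.
    p = parent[nodes]
    # Where p == NULL (-1), time[p] wraps around to the last entry of `time`;
    # those entries are meaningless and are dropped by the mask below.
    branch_length = time[p] - time[nodes]
    return float(np.sum(branch_length[p != NULL]))


print(total_branch_length_loop(nodes, parent, time))
print(total_branch_length_numpy(nodes, parent, time))
```

The vectorised version allocates temporaries the size of `nodes`, which is the memory overhead mentioned above; the cost of obtaining the time array from the tables is not modelled here.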

Any thoughts on whether we should/shouldn't do things like this?

(In retrospect this should have been a function rather than a property, as it would have been more useful (you could specify the root node you wanted to sum from), and it also violates the "properties are inexpensive" rule. I think my reasoning at the time was that this was something we should be able to keep track of efficiently and incrementally, but this isn't actually true under the current definition, which only sums the branch length reachable from the roots. That ship has sailed anyway, but I guess we should make a note of this in the comments if/when we're updating the method.)
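For illustration only, a function form that takes the node to sum from might look something like the sketch below. It works on plain arrays, not on a tskit tree: `preorder_from` and `branch_length_below` are hypothetical helpers, and excluding the branch above the chosen node is an assumption about what such a function would mean, not something decided in this issue.

```python
import numpy as np

NULL = -1  # same sentinel value as tskit.NULL


def preorder_from(root, parent):
    # Hypothetical helper: preorder traversal of the subtree below `root`,
    # built from a parent array. A real tree would provide this itself.
    children = {}
    for u, p in enumerate(parent):
        if p != NULL:
            children.setdefault(int(p), []).append(u)
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children.get(u, [])))
    return np.array(order, dtype=np.int64)


def branch_length_below(root, parent, time):
    nodes = preorder_from(root, parent)
    # Assumption: the branch above `root` itself is not counted.
    nodes = nodes[nodes != root]
    return float(np.sum(time[parent[nodes]] - time[nodes]))


# Toy tree: leaves 0, 1, 2; node 3 is the parent of 0 and 1; node 4 is the root.
parent = np.array([3, 3, 4, 4, NULL], dtype=np.int32)
time = np.array([0.0, 0.0, 0.0, 1.0, 2.0])

print(branch_length_below(4, parent, time))  # whole tree
print(branch_length_below(3, parent, time))  # subtree below node 3
```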

Labels: Performance, Python API
