Skip to content

Conversation

@henryiii
Copy link
Member

@henryiii henryiii commented Nov 8, 2019

Adding class families.

  • New families of classes can be defined with @set_family.
    • The cpp classes are now distinct, with reprs directly from C++'s <<.
    • The cpp axes are now distinct classes with C++ calling conventions, closes cpp mode methods #207
    • Test added for external use of @set_family
  • The lookup system for C++ classes is now unified, with a single @register(class_list) and cast instead of a collection of _to_* functions and various ._CLASSES and such. If there is just one C++ class associated (through subclassing) to a Python class, @register is not needed.
  • __module__ is now being set to the correct locations using @set_module
  • __all__ has been added to public modules. It is always the third line, after the future stuff.
  • Free functions mentioned in Remove bh.algorithm #197 dropped (closes Remove bh.algorithm #197)
  • Python mode reprs respect Python class names and calling conventions
  • Add Transforms: Log, Sqrt, and Pow
    • Remove Regular.pow, etc. shortcuts.
  • Nicer error when C++ module is not built (like when running in main directory)

Note that accumulators currently do not participate in the family scheme.

@henryiii
Copy link
Member Author

henryiii commented Nov 8, 2019

@HDembinski, do you want correct __module__'s for the classes in cpp? Axis, etc. (histogram is easy and correct already in this PR). This is what it looks like now:

>>> bh.Histogram # Correct
boost_histogram.Histogram
>>> import boost_histogram.cpp as bh
>>> bh.histogram # Correct
boost_histogram.cpp.histogram
>>> bh.axis.regular # Should be boost_histogram.cpp.axis.regular
boost_histogram.axis.Regular

Should this report the correct module "boost_histogram.cpp.axis", and have the correct name "regular"? Also, I'm thinking if we move the repr's to Python, the CPP classes can have the original BoostHistogram reprs, which would be rather nice. It would give Python users access to the ascii histogram, which was one of the most popular parts of the talk at CHEP. Also, is the line printout available separately in Boost.Histogram?

Also, do you want cpp imported by default, so you can do import boost_histogram as bh; bh.cpp.... without import boost_histogram.cpp? Or is the main expected way to use it import boost_histogram.cpp as bh?

The cpp module is 100% yours, I'll craft it any way you like. But I need input.

@henryiii
Copy link
Member Author

henryiii commented Nov 9, 2019

Separate question: Where do you like to set __module__? At the definition, at import site, or with a decorator? Different projects do it differently; Numpy seems to do all three interchangeably, of course.

from numpy.core.overrides import set_module
@set_module("boost_histogram")
class Histogram ...

 # (we probably would use our own set_module, it's just 4 lines)

vs.

class Histogram ...

Histogram.__module__ = "boost_histogram"

vs.

from _internal.hist import Histogram
Histogram.__module__ = "boost_histogram"

I'm currently doing #2, but can switch if you have a preference; I don't.

@henryiii henryiii force-pushed the henryiii-classes branch 3 times, most recently from 0c20aa8 to 5e0b004 Compare November 9, 2019 05:33
@HDembinski
Copy link
Member

I am ok with "wrong" __module__ etc in the cpp module, if it simplifies the code.

@HDembinski
Copy link
Member

Option 2 looks fine.

@HDembinski
Copy link
Member

Should this report the correct module "boost_histogram.cpp.axis", and have the correct name "regular"?

I think it is ok if it is not correct. If you feel super strong about it and you are ok with having extra code complexity to have it right, I am ok with this. I personally think it is not worth it. Everything should look nice for the main module, but the cpp module is only there for convenience of some people who use C++ Boost.Histogram. This is a much smaller fraction of the user base and I think this fraction does not look at __module__ etc.

Also, I'm thinking if we move the repr's to Python, the CPP classes can have the original BoostHistogram reprs, which would be rather nice.

Ok, good idea.

It would give Python users access to the ascii histogram, which was one of the most popular parts of the talk at CHEP. Also, is the line printout available separately in Boost.Histogram?

Feel free to adapt the code for boost-histogram. In Python, it is much easier to detect the terminal width.

@henryiii
Copy link
Member Author

If you feel super strong about it and you are ok with having extra code complexity to have it right

I don't feel super strong about it, but I've already written the code now to get it right on the plane (it's needed for returning the correct Axis from Histogram in cpp mode anyway), so the extra complexity is not that high to add it (lower than before). I'll complete the PR then make a comparison. It would take about 30 seconds to remove it if we decide to.

It would be a bit harder to add to accumulators, so if you don't mind, I'd like to leave those unwrapped even if we wrap Storage/Transform. (And, three of the accumulators are going to be included in boost-histogram anyway, so they technically are not the same as in Boost.Histogram).

@HDembinski
Copy link
Member

Sounds all good. Sorry for prematurely removing the Storage class.

@henryiii henryiii force-pushed the henryiii-classes branch 2 times, most recently from f5f4933 to 2496a2e Compare November 12, 2019 12:29
@henryiii henryiii force-pushed the henryiii-classes branch 2 times, most recently from fce67bf to 2a14ef7 Compare November 14, 2019 04:17
@henryiii
Copy link
Member Author

I think this is finally ready for review, @HDembinski. I've listed the changes in the issue description. One key feature of building a general system to select between the normal and cpp versions of the classes turned out to be the fact it can be used to allow external packages to do the same trick of defining families of classes. External packages will not be able to remove methods/properties, like cpp can, but they can add and replace (and, besides being disallowed by the Python language, removing a property/method would require internal knowledge of the private interface, which we don't support). This will be key for Hist, since Hist will provide new methods like "name" on axis, and will make metadata readonly. See the test_subclassing for an example of how this can be done.

Future cleanup (later PR) will restructure the _internal directory to better mimic the normal directory, and will come up with a way to allow a custom AxesTuple as well (maybe just as an example of how to do it with the existing code).


Also, due to the drastically different design direction I ended up taking, @set_module turned out to be much more natural than setting __module__ after the definition. It nicely shows the key difference between things like int and Int, it goes with the other decorators, and it is much harder to forget than something after the class definition (I forgot it a time or two before). I think you'll agree; but if you still like setting it afterwards, I will move it back.

@henryiii henryiii changed the title WIP: Polish Class families Nov 14, 2019
@henryiii
Copy link
Member Author

Edit: one more simplification I forgot to put in will be pushed in a moment

Copy link
Member

@HDembinski HDembinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this solution at first glance, which makes me uncomfortable, but I don't want to block this. I hope you know what you are doing. In any case, this is not a change of the public interface, so go ahead.

@henryiii
Copy link
Member Author

Okay, there are lots of other fixes here I'd like to get in, and it will unblock future work. I do have an alternate solution in mind you might like better, a little simpler in some ways but with different tradeoffs. This is still a good base to compare them if we do want to try that solution. Let's try to discuss it early next week.

@henryiii henryiii merged commit 20b5dfd into develop Nov 16, 2019
@henryiii henryiii deleted the henryiii-classes branch November 16, 2019 22:40
@henryiii
Copy link
Member Author

(I'll write up an issue in the next few days that describes the current solution and proposes the alternative I have in mind, so you can pick or iterate. If we keep the current one, I can still use that to start the docs for it)

@HDembinski
Copy link
Member

It is not so urgent, I think. I am much more eager to fix the bugs that nsmith pointed out.

@HDembinski
Copy link
Member

We can always make the implementation nicer later :).

@HDembinski
Copy link
Member

Although if you want to depend on these facilities in hist, then it is kinda part of the public interface, although for developers and not for users.

@henryiii henryiii mentioned this pull request Nov 17, 2019
@henryiii
Copy link
Member Author

Yes, it affects the public interface for developers building on it, so we'll need to iron it out for Hist, but no rush quite yet. I'll probably aim for end-of-the-week then to write it up (busy week ahead), and no reason to work on it before 0.6.0.

HDembinski pushed a commit that referenced this pull request Dec 1, 2019
* Warning for uninitialized storage

* Drop free functions, add __module__ and __all__

* Fix a few more missing __module__s

* Clean up __all__

* Remove .reduce and .project from cpp class

* Move Axis repr to Python and add transforms

* Lookup for transforms works

* Using repr from Python for Histogram

* Use Boost.Histogram << as repr for cpp.histogram

* Helpful error when C++ code not built

* Use register for Axes

* Use simpler storage scheme

* Remove final _to_ function

* CPP histogram returns CPP axis now

* Revamp decorator system, now could be useful in other packages

* Adding subclassing test

* Adding Storage/Transforms to the register system

* Simplify: no longer require @register for direct subclasses

* Unicode strings in repr on Python 2 (fix test)

* Simplify use of _compute_common_index

* Drop a few extra __module__ sets that were superfluous

* Fix __module__ on accumulators and slicing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cpp mode methods Remove bh.algorithm

3 participants