boost-histogram currently doesn't work very well with Pandas DataFrames: each column has to be wrapped in np.asarray before filling (and sometimes explicitly cast to a NumPy string dtype).
My suggestion would be to do the np.asarray wrapping inside the Python fill wrapper, so that fill calls can be simplified when the inputs are not NumPy arrays.
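A minimal sketch of what that wrapping could look like, written as a standalone helper rather than as a change to boost-histogram itself. The name fill_arraylike is hypothetical and is not part of the library; the helper only assumes that the object passed in has a fill method taking one array per axis. The cast of object-dtype string arrays to a NumPy string dtype is one possible way to cover the explicit cast mentioned above.

```python
import numpy as np


def fill_arraylike(hist, *args, **kwargs):
    """Hypothetical helper: coerce every positional argument to a NumPy
    array before forwarding it to hist.fill."""
    converted = []
    for arg in args:
        arr = np.asarray(arg)
        # Columns of Python strings typically arrive as object dtype;
        # cast them to a NumPy string dtype only if every element is a str.
        if arr.dtype == object and all(isinstance(x, str) for x in arr.ravel()):
            arr = arr.astype(str)
        converted.append(arr)
    # Keyword arguments are passed through untouched.
    return hist.fill(*converted, **kwargs)
```

With a helper like this, a call could pass Pandas columns, lists, or tuples directly, for example fill_arraylike(hist, skhep.file_project, skhep.timestamp.dt.year).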
You also can't set up categories from arbitrary iterables, such as sets, but only from true lists, which is restrictive.
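Until that is supported, one possible workaround is a small helper that turns any iterable into a list before it reaches the axis constructor. The name unique_categories is hypothetical. Unlike list(set(...)), this sketch keeps the categories in first-seen order, so the axis layout is reproducible between runs.

```python
def unique_categories(values):
    """Hypothetical helper: collapse any iterable of hashable values into
    a list of unique items, preserving first-seen order."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(values))
```

The result is a plain list, so it could be passed where a list is currently required, for example bh.axis.StrCategory(unique_categories(skhep.file_project)).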
Actual usage:
hist = bh.Histogram(
    bh.axis.StrCategory(list(set(skhep.file_project))),
    bh.axis.Integer(2018, 2021, underflow=False, overflow=False),
    bh.axis.Integer(2, 4, underflow=False, overflow=False),
    storage=bh.storage.Int64(),
)
hist.fill(
    np.asarray(skhep.file_project, dtype=str),
    np.asarray(skhep.timestamp.dt.year),
    np.asarray(skhep.details_python.str[0].astype(int)),
)

Ideal usage:
hist = bh.Histogram(
    bh.axis.StrCategory(set(skhep.file_project)),
    bh.axis.Integer(2018, 2021, underflow=False, overflow=False),
    bh.axis.Integer(2, 4, underflow=False, overflow=False),
    storage=bh.storage.Int64(),
)
hist.fill(
    skhep.file_project,
    skhep.timestamp.dt.year,
    skhep.details_python.str[0].astype(int),
)