Miscellaneous usability issues

Even the best-laid plans won't survive first contact with the user. Let me know if you want these broken into separate issues, but most are small. P.S. This is with v0.5.2.

I installed the package and played around with it. First reaction: the docstrings are quite sparse, probably due to the large amount of pybind11 shenanigans.
It would be nice if the inheritance between the various axis types were visible from Python, or at least if there were something that lets me check `isinstance(x, bh.axis.axis)` or something to that effect.

### 1 (see #216)

First dumb move:
```
ax = bh.axis.regular(20, 0, 1)
ax.value()
```
raises a cryptic `TypeError`, but I just didn't know the function required arguments.

### 2 (see #215)

For growable categories, I hit several obstacles:
```
ax = bh.axis.category([], growth=True)
```
raises `ValueError` saying there must be at least one bin.
Is it not possible to create a growable axis with no starting categories?

### 3
Answered well by https://github.com/scikit-hep/boost-histogram/issues/214#issuecomment-554683890

Playing with indexing,
```
ax = bh.axis.category(['old'], growth=True)
ax.index('new')
# returns 1
ax.extent
# returns 1, why?
```
So I assume we should treat standalone axis objects as immutable, with no actual state w.r.t. filling?
Of course, if I make a histogram using this axis, everything works as expected:
```
h = bh.histogram(bh.axis.category(['old'], growth=True))
h.fill(['hi', 'there'])
h.axis(0).extent
# returns 3
```
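For what it's worth, here is a toy pure-Python model of how I read this behaviour: a lookup never grows the axis, only a fill does. This is a sketch of my understanding, not boost-histogram's implementation; `ToyCategory` and its `index`/`update` methods are hypothetical names of my own.

```python
class ToyCategory:
    """Toy stand-in for a growable category axis (hypothetical, not bh's code)."""

    def __init__(self, categories):
        self.categories = list(categories)

    @property
    def extent(self):
        return len(self.categories)

    def index(self, value):
        # Pure lookup: an unknown value maps to one-past-the-end
        # and the axis itself is left unchanged.
        try:
            return self.categories.index(value)
        except ValueError:
            return len(self.categories)

    def update(self, value):
        # What a fill would do in this model: grow the axis if the value is new.
        if value not in self.categories:
            self.categories.append(value)
        return self.categories.index(value)


ax = ToyCategory(["old"])
lookup = ax.index("new")          # lookup only, nothing is added
extent_after_lookup = ax.extent
for label in ["hi", "there"]:     # "filling" grows the axis
    ax.update(label)
extent_after_fill = ax.extent
```

In this model the lookup returns 1 and leaves the extent at 1, while the two fills bring it to 3, which matches the numbers above.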

### 4 (see #184)

The options for how to provide indices to fill are a bit weird:
```
h = bh.histogram(bh.axis.category(['old'], growth=True))
h.fill('hi')
```
raises `ValueError` about casting strings to `char const&`, so I cannot use plain strings?

### 4.a (see #230)

I see the choice to interpret python strings as iterables causes some headache:
```
h = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h.fill('hi', np.arange(4))
```
raises `ValueError: spans must have compatible lengths`
(although even when they are compatible, the string to char array issue comes up again)
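To illustrate what I mean by the string-as-iterable headache, here is a minimal pure-Python sketch of the kind of normalization I would expect before filling. The helper name `as_sequence` is hypothetical and not part of boost-histogram.

```python
def as_sequence(value):
    """Wrap a lone string (or other scalar) in a list; pass sequences through."""
    if isinstance(value, (str, bytes)):
        # A string is iterable, but here it is one category label, not a
        # sequence of one-character labels.
        return [value]
    try:
        iter(value)
    except TypeError:
        return [value]  # plain scalar such as a float
    return list(value)


chars = list("hi")           # what naive iteration over the string sees
labels = as_sequence("hi")   # what a user filling with 'hi' most likely means
```

With something like this, `'hi'` would be treated as a single label of length 1 rather than as two characters, so it could then be broadcast against the other arguments.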

### 5 (see #233)

Ok, so sticking to arrays,
```
h = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h.fill(['hi'], np.arange(4))
# Segmentation fault: 11
```
Uh oh!
But this works OK:
```
h = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h.fill(['hi'], np.arange(1))
```

### 5.a (see #230, not supported)

How about numpy scalars (dimension-0 arrays)?
```
h = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h.fill(np.array('hi'), np.arange(3))
```
raises a cryptic `ValueError: allocator<T>::allocate(size_t n) 'n' exceeds maximum supported size`

But indeed numpy 1D arrays work:
```
h = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h.fill(np.repeat('hi', 3), np.arange(3))
h.fill(np.array(['hi']), np.arange(3))
```
so broadcasting is supported, nice!

### 6

When adding histograms, the growable categories are not as flexible as I'd like:
```
h = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h2 = bh.histogram(bh.axis.category([''], growth=True), bh.axis.regular(20, 0, 1))
h.fill(np.array(['hi']), np.array([.1]))
h2.fill(np.array(['hi again']), np.array([.2]))
h + h2
```
raises `ValueError: axes of histograms differ`, when clearly one could be grown to accept the other.
I tried a workaround:
```
h.fill(np.array(['hi', 'hi again']), np.array([0, 0]), weight=np.zeros(2))
h2.fill(np.array(['hi', 'hi again']), np.array([0, 0]), weight=np.zeros(2))
h + h2
```
which also fails since the categories are introduced in a different order in the respective histograms.
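The behaviour I was hoping for is addition matched by category label rather than by position. A minimal sketch with plain dicts standing in for 1D categorical histograms (`merge_by_label` is a hypothetical helper, not a boost-histogram function):

```python
def merge_by_label(*histograms):
    """Add toy categorical histograms by label, ignoring insertion order."""
    total = {}
    for hist in histograms:
        for label, count in hist.items():
            total[label] = total.get(label, 0.0) + count
    return total


# Toy stand-ins for h and h2: the category label maps to its bin content.
h = {"": 0.0, "hi": 1.0}
h2 = {"": 0.0, "hi again": 1.0}
merged = merge_by_label(h, h2)
```

Here the result holds the union of the categories, and the order in which each histogram first saw a label does not matter.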

### 7

About categorical axes, it looks like the storage contains the outer product of growable categories:
```
h = bh.histogram(
  bh.axis.category([''], growth=True),
  bh.axis.category([''], growth=True),
  bh.axis.category([''], growth=True),
  bh.axis.regular(60, 60, 120),
)
h.fill(
  ['dataset1', 'dataset2', 'dataset2'],
  ['region1', 'region1', 'region2'],
  ['', '', 'JESup'],
  [90., 86., 92.],
)
import pickle
len(pickle.dumps(h))
```
returns 9290 bytes (787 when the histogram is empty), while for comparison,
```
import coffea.hist as hist
h = hist.Hist('events',
  hist.Cat('dataset', ''),
  hist.Cat('region', ''),
  hist.Cat('systematic', ''),
  hist.Bin('mass', '', 60, 60, 120),
)
h.fill(dataset='dataset1', region='region1', systematic='', mass=90.)
h.fill(dataset='dataset2', region='region1', systematic='', mass=86.)
h.fill(dataset='dataset2', region='region2', systematic='JESup', mass=92.)
len(pickle.dumps(h))
```
returns 4898 bytes (2880 when empty). Coffea histograms have a fair bit more pickling overhead than boost-histogram, but the sparse storage more than makes up for it once the histogram is filled.
Is there a way to request sparse bin storage here?
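To make the dense-versus-sparse point concrete, here is a rough pure-Python count of cells for the fill above. It is a sketch under my own assumptions: every category axis starts with `''`, flow bins are ignored, and the sparse layout is simply a dict keyed by bin tuple; neither library necessarily stores things this way.

```python
from collections import defaultdict
from math import prod

fills = [
    ("dataset1", "region1", "", 90.0),
    ("dataset2", "region1", "", 86.0),
    ("dataset2", "region2", "JESup", 92.0),
]


def mass_bin(mass, nbins=60, lo=60.0, hi=120.0):
    # Regular binning matching the 60 bins between 60 and 120 used above.
    return int((mass - lo) * nbins / (hi - lo))


sparse = defaultdict(float)      # only cells that were actually filled
seen = [{""}, {""}, {""}]        # labels seen so far on each category axis
for dataset, region, syst, mass in fills:
    for labels, value in zip(seen, (dataset, region, syst)):
        labels.add(value)
    sparse[(dataset, region, syst, mass_bin(mass))] += 1.0

# Dense storage allocates the full outer product of all axes.
dense_cells = prod(len(labels) for labels in seen) * 60
sparse_cells = len(sparse)
```

Under these assumptions the dense layout needs 3 x 3 x 2 x 60 = 1080 cells for 3 filled ones, and the gap widens multiplicatively with every new category.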
