Skip to content

Conversation

@DerThorsten
Copy link

@DerThorsten DerThorsten commented Jun 12, 2017

pybind11 will raise a TypeError if the exported classes / functions are called with incompatible function arguments.
In the cpp impl pybind11::repr / __repr__ is called, but I would strongly suggest to call pybind::str / __str_ .
__repr__ can get very very long for array like classes.
The printout can take very long and furthermore it does not help the user very much.

One could also something like:

#ifdef PYBIND11_USE_REPR_IN_TYPE_ERROR
msg += pybind::repr(obj);
#else
msg += pybind::str(obj);
#endif

If this is preferred, I could implement this.
But I can see almost no benefit to just using __str__

@jagerman
Copy link
Member

Unfortunately, neither repr nor str is ever going to be completely satisfactory: repr is, as you point out, sometimes overly verbose; str, on the other hand, sometimes doesn't include enough information to diagnose the actual type mismatch. See #517 and #518 for why we originally changed this to repr (before that, it did use str).

@jagerman
Copy link
Member

Also, for making this configurable, a define is discouraged—it's easy to cause an ODR violation if you link together different compilation units which don't have exact feature macro matches.

If you do want to write up support for showing str instead of repr, it would be preferrable to use options.h approach so that you could use py::options::type_error_repr() or py::options::type_error_str() to configure the behaviour.

@DerThorsten
Copy link
Author

I'll try to implement it as suggested using and commit the results the next days

@DerThorsten
Copy link
Author

I used type_error_print_repr instead of type_error_repr.
Also I only implemented enable_type_error_print_repr and enable_type_error_print_str.
I did not implement disable_type_error_print_repr because this would be the same as enable_type_error_print_str.

More suggestions?

@jagerman
Copy link
Member

I'd prefer the names type_error_print_repr, i.e. dropping the enable_ prefix since, as you point out, the "disable" equivalent is really the _str method; these aren't really about enabling/disabling something, but just about making an A-or-B choice.

@DerThorsten DerThorsten changed the title print __str__ instead of __repr__ in case of TypeError from incompatible function arguments Extended options object to choose between the print of __repr__ and __str__ (edited title) Jun 12, 2017
@DerThorsten
Copy link
Author

DerThorsten commented Jun 12, 2017

So type_error_print_repr is the getter function, you suggest to give the setter the same name? This will not work (edit. ok the one is static, the other is not, but having the same names, is that what you want?)

@jagerman
Copy link
Member

Ah, right, I forgot about the conflicting getter. How about: set_type_error_print_repr?

@DerThorsten
Copy link
Author

Fine ! commit is coming in a sec

@DerThorsten
Copy link
Author

Anything else I can do to accept this PR?

@jagerman
Copy link
Member

jagerman commented Jun 12, 2017

Some suggestions:

  • The way options works is that one creates a locally-scoped options instance, set options in that instance, which then affects definitions within the scope of that options instance. That is, it works like this:
py::class_<C> c(m, "C");
{
    py::options options;
    options.set_type_error_print_str();
    c.def("f", [](ArgType a) { /* ... */ }); // Uses str() on argument failure
}
c.def("g", [](ArgType a) { /* ... */ }); // Uses repr() on argument failure

The problem is that once you're out of the scope, the option gets reset to default, so you need to preserve the option as part of the function definition at the time the definition is set.

To do this, you'll need to add a bool use_repr : 1; to the struct function_record with the other bools (in attr.h), initialize it to true in the constructor, then in pybind11.h's cpp_function::initialize() you can set rec->use_repr = options::type_error_print_repr();. Then, finally, instead of checking options::type_error_print_repr() to decide which print mode to use, you check overloads->use_repr. There is one small disadvantage here that if there are multiple overloads you're going to get the repr/str choice in effect when the first overload was defined, but that seems like a perfectly acceptable limitation to me.

  • It's going to need some tests to make sure it's working. I suggest adding them to test_docstring_options (and rename it to test_options since it won't be just about docstring options anymore).
  • the commit currently misses the display of kwargs values (in the loop just under the part you changed): you'll want changing the {!r} in the format to {!s} when in str mode.
  • Add a short bit of documentation of the feature to docs/advanced/misc.rst (at the end, just after the current docstring description) with a short example of how to use it.

@DerThorsten
Copy link
Author

Thanks for the detailed instructions!
Will push changes soon.

@dean0x7d
Copy link
Member

Long messages for arrays are annoying, but I'm not sure that switching to str really helps in the general case. Consider:

import numpy as np

def str_vs_repr(n):
    a = np.arange(n)
    print(str(a))
    print(repr(a))

For str_vs_repr(1000), both print the entire array which is too long. For str_vs_repr(10000) both truncate, which looks fine. Point is: the behavior of str and repr is the same with regard to the length of the message.

Perhaps it would be better to leave repr but truncate long output, i.e. replace pybind11::repr(arg) with something like "{!r:.100}"_s.format(arg).

@DerThorsten
Copy link
Author

DerThorsten commented Jun 12, 2017

@dean0x7d good suggestion.

I would suggest the following:
Users should still be able to choose between str and repr and ontop
of that one can only print the first k chars if the string is to long.
I would suggest to also make it an option how long the string should be at max.

@jagerman
Copy link
Member

Another option that might be valuable: displaying the argument type name, rather than either repr or str.

@DerThorsten
Copy link
Author

DerThorsten commented Jun 12, 2017

So in our usecases with https://github.com/DerThorsten/nifty/ we get the TypeError because we called the function with the wrong arguments.
I would say that it was never helpful that the actual value of the args was printed.

Therefore I really like your suggestion

@jagerman
Copy link
Member

Okay, forget the str part, and make the feature choose between repr() (default, current behaviour) and the type name. I also agree that I'd find it much more useful than just being able to toggle between str and repr. (Py_TYPE(args_[ti])->tp_name should get the type name for you). All my other points (about tests and docs and whatnot) still apply.

@DerThorsten
Copy link
Author

DerThorsten commented Jun 12, 2017

Ok, you get a PR soon (but most probably not today)

@dean0x7d
Copy link
Member

I'm not really a fan of making everything an option. Good defaults are very important. An option should be added only if there is a compelling reason.

For str vs repr, I'm not sure when str would ever have an advantage in this use case (error reporting/debugging). As for repr vs. type name, repr generally already includes either the explicit type name or an unambiguous representation (for builtins). With the truncation to a reasonable length, repr should be all that's needed.

@DerThorsten
Copy link
Author

DerThorsten commented Jun 12, 2017

Also fine.
All which is important for me is to get a truncation, no mather how!
since @dean0x7d solution is very general and fixes the problem for any case I am starting to be a fan of this solution =)

@jagerman feel free to decide, I'll implement any solution since I really need this to be fixed =)

@jagerman
Copy link
Member

I don't feel particularly strongly either way, so I guess that means go with the truncated repr approach.

@wjakob
Copy link
Member

wjakob commented Jun 12, 2017

I am not so sure about this PR. It's possible to extend pybind11 in a myriad of different ways to address every one-off requirement that users of this project might have, which will make everyone happy in the short term, but which would lead to significant bloat eventually. When we accept PRs, they should provide a clear improvement that helps more than just one or two users. This one IMO seems below the threshold.

@jagerman
Copy link
Member

@wjakob - right; I think we settled on just truncating the repr rather than adding an option for it.

@DerThorsten
Copy link
Author

DerThorsten commented Jun 13, 2017

@jagerman I impemented the trucation and add

   "...[truncated by pybind11]"

I think this message could reduce confusing. Are you fine with this impl.
I also added a tiny unit test which can be extended if you consider it to be to small.

ATM I use 100 as a size limit which could be to small. Suggestions?

Should I open a new PR or should we stick to this one?

Greetings

@jagerman
Copy link
Member

Stick to this PR.

  • The [truncated by pybind11] part of the message isn't necessary (and worse, it just makes the string longer again).
  • you want "substring" in string, not string.contains("substring"). But I think it might be better to do a full string test here (rather than just a contains) to make sure that the truncation is happening at the correct place.
  • it looks to me like the truncation string is always added, but should only be added when a truncation actually takes place. Perhaps we could add a size() method to class str in pytypes.h:
    // Returns the string size (in unicode codepoints, *not* bytes)
    size_t size() { return (size_t) PyUnicode_GET_LENGTH(m_ptr); }

and then in the "invoked with" code use something like this for the shortening logic:

auto arg_repr = pybind11::repr(args_[ti]);
std::string arg =
    #ifdef NDEBUG
    arg_repr.size() <= 100 ? (std::string) arg_repr : (std::string) pybind11::str("{!r:.95}").format(arg_repr) + "[...]";
    #else
    arg_repr;
    #endif

I'm using 100 for the sake of example: I think we could shorten it to, e.g., 50 (with 45 in the format). We could also use just "..." instead of "[...]", and change the format size value to total-3 instead of total-5.

I think it might also be worthwhile only when NDEBUG is not defined so that, when compiling in debug mode, you get the entire, unaltered repr value (I incorporated this into the code above).

@jagerman
Copy link
Member

@DerThorsten - ping, any updates on this?

@jagerman jagerman mentioned this pull request Jul 21, 2017
6 tasks
@wjakob
Copy link
Member

wjakob commented Sep 11, 2018

I took another look at this and just don't think that this is important enough to warrant the necessary changes. Sorry, it seems like you spent quite a bit of time on this.

@wjakob wjakob closed this Sep 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants