Skip to content

Conversation

@CuriousLearner
Copy link
Member

@CuriousLearner CuriousLearner commented Dec 3, 2017

This fixes the newly added print suggestion for cases when there is leading whitespace in the initial data.

I've also added a test case for this. @ncoghlan Can you please check this?

https://bugs.python.org/issue32028

Copy link
Contributor

@ncoghlan ncoghlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new test case is good, and the implementation changes are headed in the right general direction. However, it should be sufficient to reorder the existing operations, rather than calling _PyUnicode_XStrip twice.

Py_ssize_t text_len = PyUnicode_GET_LENGTH(self->text);
PyObject *data = PyUnicode_Substring(self->text, PRINT_OFFSET, text_len);
// Issue 32028: Handle case when whitespace is used with print call
PyObject *initial_data = _PyUnicode_XStrip(self->text, 2, strip_sep_obj);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than adding a new call to _PyUnicode_XStrip, just reorder the existing operations:

  1. First strip any surrounding ASCII whitespace (i.e. do this first, instead of last)
  2. Then get the length of the result (rather than the length of the original text)
  3. Then skip over PRINT_OFFSET characters at the beginning

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor point I missed when reviewing the original PR: defining and using const int STRIP_BOTH = 2; will make the _PyUnicode_XStrip call more self explanatory.

So we may as well make that change now, since the code is being modified anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for the const int STRIP_BOTH = 2;. I was explaining my patch to a few folks and I had to consult my blog for that particular thing. Few months down the line I didn't even remember that.

So, yes that would be great.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncoghlan So, while I modified the code, one of the test cases fails with re-ordering the code. That test case is:

test_string_with_excessive_whitespace. Since we're only stripping the the data at the beginning, and then directly trying to extract the sub-string.

I think we may need to strip the leading chars to make the previous test case pass. What do you say?

cc @serhiy-storchaka

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed my current code which re-orders the existing operations & the test_string_with_excessive_whitespace is failing. I'm waiting for your reply on this. I think we might want to strip the leading whitespace from data.


def test_string_with_leading_whitespace(self):
python2_print_str = '''if 1:
print "Hello World"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're stripping all leading whitespace, you may as well indent this 4 spaces relative to the target variable name (so 12 leading spaces total)

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to check for errors and count references.

const int STRIP_BOTH = 2;
// Issue 32028: Handle case when whitespace is used with print call
PyObject *initial_data = _PyUnicode_XStrip(self->text, STRIP_BOTH, strip_sep_obj);
Py_ssize_t text_len = PyUnicode_GET_LENGTH(initial_data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initial_data can be NULL.

PyObject *initial_data = _PyUnicode_XStrip(self->text, STRIP_BOTH, strip_sep_obj);
Py_ssize_t text_len = PyUnicode_GET_LENGTH(initial_data);
PyObject *data = _PyUnicode_XStrip( \
PyUnicode_Substring(initial_data, PRINT_OFFSET, text_len), \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of PyUnicode_Substring() can be NULL.
Otherwise it will be leaked.

@CuriousLearner
Copy link
Member Author

@serhiy-storchaka Nice catch! Sorry, I missed those memory leaks. I've tried to address those issues. Can you please take a pass.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove redundant empty lines.

@CuriousLearner
Copy link
Member Author

@serhiy-storchaka Fixed :)

Copy link
Contributor

@ncoghlan ncoghlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd missed that the second _PyUnicode_XStrip call handled excess whitespace between print and the expression being printed, so I was wrong about that being redundant (and the regression test suite was right).

I've adjusted the NEWS entry wording, so +1 from me (I'll merge once CI finishes).

@CuriousLearner
Copy link
Member Author

Thank you so much @ncoghlan & @serhiy-storchaka :)

@CuriousLearner
Copy link
Member Author

CuriousLearner commented Dec 16, 2017

@serhiy-storchaka @ncoghlan I guess we can merge this now :)

Also needs a backport to 3.6 label on this one.

@CuriousLearner
Copy link
Member Author

Hi @serhiy-storchaka @ncoghlan

Do we need something else here? Please let me know and I'll do that :)

@ncoghlan
Copy link
Contributor

Just closing & reopending to restart the CI (Appveyor's a required check now, and it didn't run properly)

@ncoghlan ncoghlan closed this Jan 18, 2018
@ncoghlan ncoghlan reopened this Jan 18, 2018
@ncoghlan ncoghlan merged commit d57f26c into python:master Jan 20, 2018
@miss-islington
Copy link
Contributor

Thanks @CuriousLearner for the PR, and @ncoghlan for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 20, 2018
The suggested replacement for print statements previously failed to account
for leading whitespace and hence could end up including unwanted text in
the proposed call to the print builtin.

Patch by Sanyam Khurana.
(cherry picked from commit d57f26c)
@bedevere-bot
Copy link

GH-5249 is a backport of this pull request to the 3.6 branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants