@cmaloney
Contributor

@cmaloney cmaloney commented Oct 5, 2025

Resolves 7 warnings.

Generally: "file object" -> IOBase; "binary"/byte read/write -> BufferedIOBase, "text"/str read/write -> TextIOBase

My reasoning for specific cases (happy to change):

  • EOFError is for input builtin function which reads a str, so TextIOBase
  • os.read and os.write speak bytes but the sentence is about sys.stdin and sys.stdout which are TextIOBase; use the function that gets called by sys.stdout.write (TextIOBase.write).
  • os.exec: flush is talking about files generally, so IOBase
  • email.parser is talking about binary files / bytes so BufferedIOBase
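
The general mapping above can be checked against the objects that `open()` returns. This is a minimal sketch, not part of the PR: the temporary file and the variable names are illustrative assumptions.

```python
import io
import os
import tempfile

# Illustrative only: the temporary file and variable names are assumptions.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")

    # Text mode: write() takes str, and the object is a TextIOBase.
    with open(path, "w", encoding="utf-8") as text_file:
        text_file.write("hello")
        is_text = isinstance(text_file, io.TextIOBase)

    # Binary mode with default buffering: read() returns bytes,
    # and the object is a BufferedIOBase.
    with open(path, "rb") as binary_file:
        data = binary_file.read()
        is_buffered = isinstance(binary_file, io.BufferedIOBase)
        # Every file object in the io hierarchy is also an IOBase.
        is_generic = isinstance(binary_file, io.IOBase)

print(is_text, is_buffered, is_generic, data)
```

Both objects are `IOBase` instances, which is why "file object" in general maps to `IOBase` while the text and binary cases map to the more specific classes.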

📚 Documentation preview 📚: https://cpython-previews--139592.org.readthedocs.build/

Resolves 7 warnings.

Generally: "file object" -> IOBase; "binary"/`byte` read/write -> BufferedIOBase, "text"/`str` read/write -> TextIOBase

Specific cases:

 - `EOFError` is for `input` builtin function which reads a `str`, so `TextIOBase`
 - `os.read` and `os.write` speak bytes but the sentence is about `sys.stdin` and `sys.stdout` which are `TextIOBase`; use the function that gets called by `sys.stdout.write`.
 - `os.exec`: `flush` is talking about files generally, so `IOBase`
 - `email.parser` is talking about binary files / `bytes` so `BufferedIOBase`
@cmaloney cmaloney requested a review from a team as a code owner October 5, 2025 07:38
@bedevere-app bedevere-app bot added the docs (Documentation in the Doc dir) and skip news labels Oct 5, 2025
@github-project-automation github-project-automation bot moved this to Todo in Docs PRs Oct 5, 2025
@cmaloney cmaloney changed the title from "gh-101100: Fix sphinx warnings around I/O" to "gh-101100: Fix sphinx reference warnings around I/O" Oct 6, 2025
@emmatyping emmatyping self-requested a review October 20, 2025 05:49
@emmatyping
Member

I think this makes sense since read and write aren't defined on IOBase. I do wonder though if it would make sense to define them there without specifying a signature, like

.. method:: read(...) # literally ... to show the signature is elided
.. method:: write(...)
   Explanation of why read and write don't have signatures...

@AA-Turner wondering if you have thoughts on the best path forward here.

@AA-Turner
Member

AA-Turner commented Oct 20, 2025

Though at runtime, IOBase has no read:

>>> import io
>>> io.IOBase.read
Traceback (most recent call last):
  File "...", line 1, in <module>
    io.IOBase.read
AttributeError: type object 'IOBase' has no attribute 'read'

I worry it'd be confusing to 'lie' here, given that there's somewhat of a conflict: in one place we note that it is RawIOBase that has read(), rather than IOBase, yet in another we say that read() ought to be considered part of the interface.
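
The same runtime check can be extended to the other abstract base classes. This is a small sketch added for illustration; the variable name `iobase_has_read` is an assumption, and the printed values are left for the reader to confirm on their own interpreter.

```python
import io

# Report whether each abstract I/O base class exposes read() and write().
for cls in (io.IOBase, io.BufferedIOBase, io.TextIOBase):
    print(cls.__name__, hasattr(cls, "read"), hasattr(cls, "write"))

# The case shown in the traceback above: IOBase itself has no read().
iobase_has_read = hasattr(io.IOBase, "read")
```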

Eliding the signatures entirely ought to work, i.e.:

.. method:: IOBase.read

   ... spam spam spam spam ham eggs and spam ...

A

@cmaloney
Contributor Author

cmaloney commented Oct 20, 2025

Re: common definition, I'm open to it but worry it will be hard to express the three different variants of "write" (and "read") concisely:

  1. write which retries on EINTR but nothing else. It can return after a partial write, which the caller must retry; the value must be bytes ("Raw I/O")
  2. write which either raises or writes/takes ownership of the whole value, which must be bytes ("Buffered I/O")
  3. write which either raises or writes/takes ownership of the whole value, which must be text ("Text I/O")
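
The difference between variants 1 and 2 can be sketched as a retry loop. This is an illustration under stated assumptions, not code from CPython: `write_all` is a hypothetical helper, and the temporary file stands in for any raw stream.

```python
import io
import os
import tempfile


def write_all(raw: io.RawIOBase, data: bytes) -> None:
    """Hypothetical helper: keep calling raw.write() until all data is written."""
    view = memoryview(data)
    while len(view) > 0:
        written = raw.write(view)  # may accept fewer bytes than offered
        if written is None:
            # A non-blocking raw stream returns None when nothing could be written.
            raise BlockingIOError("raw stream would block")
        view = view[written:]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "raw.bin")
    payload = b"spam" * 1000

    # buffering=0 in binary mode returns a raw stream (io.FileIO).
    with open(path, "wb", buffering=0) as raw_file:
        write_all(raw_file, payload)

    with open(path, "rb") as check:
        round_trip = check.read()
```

With a buffered stream (variant 2) the loop is unnecessary in blocking mode, because `write()` either takes the whole value or raises.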

Scanning through GitHub repos (including CPython, which caches .pyc using "Raw I/O" for speed), people frequently use 1 as though it were 2, and most of the time it works. Sometimes you end up with partial files... Practically, I like pointing people to pathlib.Path.{read,write}_{text,bytes} because a lot of the time people write files in full (and those can/do optimize for that).
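
For the whole-file case, the pathlib methods mentioned above look like this. This is a minimal sketch; the file names and contents are illustrative assumptions.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Binary round trip: write_bytes() returns the number of bytes written.
    blob = Path(tmp) / "example.bin"
    written = blob.write_bytes(b"\x00\x01\x02")
    round_trip = blob.read_bytes()

    # Text round trip: pass the encoding explicitly on both sides.
    note = Path(tmp) / "note.txt"
    note.write_text("hello\n", encoding="utf-8")
    text = note.read_text(encoding="utf-8")
```

Each call opens the file, transfers the full value, and closes it, so there is no partial-write handling for the caller to get wrong.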

I have a general project to clarify the I/O docs around read/write more (they don't match shipping behavior, e.g. more than one system call may be made in Raw I/O; gh-129011). That's a lot more intricate, as it also needs to cover non-blocking behavior, which is well specified at the "Raw I/O" and "Buffered I/O" layers, and there is a lot of corner-case complexity... but most people use print, so the baseline write needs to be understandable without deep systems knowledge.

@cmaloney
Contributor Author

What would you think of putting a generic read and write definition into https://docs.python.org/3/glossary.html#term-file-object?

It feels like they're concepts that are replicated a lot in Python code. Another option would be https://docs.python.org/3/library/io.html#static-typing which defines a generic read and write, but understanding that doc, to me, requires a lot more knowledge than someone learning about read and write might have (what is typing, how do I read this / what is the generic type, ...).

@AA-Turner
Member

This almost feels a better fit for the data model (in the PLR). We already have pseudo definitions for 'protocols', eg for module or sequence, I think we could consider adding a brief point on file ops. Not guaranteeing its acceptance, but floating the idea.

A

@cmaloney
Contributor Author

@emmatyping, @AA-Turner I added file.read, file.write, file.close, and a note that "with file" should work, to the PLR Data Model. If that looks like a reasonable approach I'll move the references there (otherwise I can undo it in this PR).
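
As a rough illustration of code that depends only on those operations, here is a sketch that is not taken from the PR or the PLR text. `shout` is a hypothetical helper, and `io.StringIO` stands in for any object providing `read`, `write`, `close`, and `with` support.

```python
import io


def shout(source, sink):
    """Hypothetical helper: needs only source.read(), sink.write(), and `with`."""
    with source:  # closes source on exit, as `with file` does
        sink.write(source.read().upper())


sink = io.StringIO()
shout(io.StringIO("spam"), sink)
result = sink.getvalue()
sink.close()
print(result)
```

The helper never checks the concrete class, which is the kind of duck-typed usage a data model entry for file operations would describe.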

Co-authored-by: Carol Willing <[email protected]>
Member

@emmatyping emmatyping left a comment

I think this looks good, thank you!

@cmaloney
Contributor Author

Pre-merge, I'd like either a review from @AA-Turner on the Python Language Reference change, or to pull that change out (and open a separate issue for it, and possibly for refactoring more broadly).

Contributor

@willingc willingc left a comment

I'm satisfied with the language ref change; I will leave it for @AA-Turner to re-review and merge.

Labels

awaiting merge, docs (Documentation in the Doc dir), skip news

Projects

Status: Todo

4 participants