gh-135944: Add a "Runtime Components" Section to the Execution Model Docs #135945
@@ -398,6 +398,112 @@ See also the description of the :keyword:`try` statement in section :ref:`try`
and :keyword:`raise` statement in section :ref:`raise`.

.. _execcomponents:

Runtime Components
==================

Python's execution model does not operate in a vacuum.  It runs on a
computer.  When a program runs, the conceptual layers of how it runs
on the computer look something like this::

   host machine and operating system (OS)
     process
       OS thread (runs machine code)

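These OS-level layers can be observed from a running Python program.  A
minimal sketch, using only the standard library:

```python
import os
import threading

# A freshly started program is one process containing one OS thread.
pid = os.getpid()                # the process' ID, assigned by the OS
tid = threading.get_native_id()  # the OS thread running this code
print("process id:", pid)
print("OS thread id:", tid)
```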
Hosts and processes are isolated and independent from one another.
However, threads are not.

Review comment:
   This falls a little oddly: "multiples of each".  Threads, sure.  But to
   say that a (Python) program could grow to include multiple hosts or
   processes would take a more extended idea of "program" than I think the
   execution model seeks to address.  Wouldn't it?

Reply:
   Good point.  I'll narrow that.

Reply:
   fixed

A program always starts with exactly one thread, known as the "main"
thread, though it may grow to run in multiple threads.  Not all
platforms support threads, but most do.  For those that do, all threads
in a process share all the process' resources, including memory.

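A minimal sketch of both points, using the :mod:`threading` module: the
program starts in the main thread, and any threads it adds share the
process' memory.

```python
import threading

# A program starts in exactly one thread, the "main" thread.
started_in_main = threading.current_thread() is threading.main_thread()

data = []  # ordinary process memory, shared by all threads

def worker():
    data.append("written by the worker thread")

t = threading.Thread(target=worker)
t.start()
t.join()

# The main thread sees the worker's write, because both threads
# share the process' memory.
print(started_in_main, data)
```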
The fundamental point of threads is that each thread does *run*
independently, at the same time as the others.  That may be only
conceptually at the same time ("concurrently") or physically
("in parallel").  Either way, the threads effectively run
at a non-synchronized rate.

.. note::

   That non-synchronized rate means none of the global state is
   guaranteed to stay consistent for the code running in any given
   thread.  Thus multi-threaded programs must take care to coordinate
   access to intentionally shared resources.  Likewise, they must be
   diligent about not accessing any *other* resources from multiple
   threads; otherwise two threads running at the same time might
   accidentally interfere with each other's use of some shared data.
   All of this is true for both Python programs and the Python runtime.

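One common way to coordinate access to intentionally shared data is a
:class:`threading.Lock`.  A minimal sketch:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # The read-modify-write below is not atomic; the lock keeps
        # two threads from interleaving it and losing updates.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(50_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000
```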
The cost of this broad, unstructured requirement is the tradeoff for
the concurrency and, especially, the parallelism that threads provide.
The alternative generally means dealing with non-deterministic bugs
and data corruption.

Review comment:
   This sounds a bit like the risk of accidents is unavoidable.  Though I
   take your point, some science is available, so I offer: "The care
   needed to co-ordinate access to intentionally shared data, when two or
   more threads could be running, is exactly what makes threads a painful
   way to exploit the potential concurrence of multiple CPUs."

Reply:
   Arguably, the risk of accidents with threads is unavoidable. 😄
   That said, I'll clarify similarly to what you've suggested.

Reply:
   fixed

Reply:
   Sorry, I've driven you to use a lot of words.  I'm not as pessimistic
   as this about threads, but won't claim correctness comes easy.  You're
   the editor.

The same layers apply to each Python program, with some extra layers
specific to Python::

   host
     process
       Python runtime
         interpreter
           Python thread (runs bytecode)

When a Python program starts, it looks exactly like that, with one
of each.  The process has a single global runtime to manage Python's
process-global resources.  The runtime may grow to include multiple
interpreters, and each interpreter may grow to include multiple Python
threads.  The initial interpreter is known as the "main" interpreter,
and the initial thread, where the runtime was initialized, is known
as the "main" thread.

An interpreter completely encapsulates all of the non-process-global
runtime state that the interpreter's Python threads share.  For example,
all its threads share :data:`sys.modules`, but each interpreter has its
own :data:`sys.modules`.

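The within-interpreter half of that sharing is easy to observe.  A
minimal sketch: a worker thread sees the very same :data:`sys.modules`
object as the main thread, because both belong to the same interpreter.

```python
import sys
import threading

results = {}

def check(modules_seen_by_main):
    # Every thread in the same interpreter shares one sys.modules.
    results["same_object"] = sys.modules is modules_seen_by_main

t = threading.Thread(target=check, args=(sys.modules,))
t.start()
t.join()
print(results["same_object"])  # True
```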
.. note::

   The interpreter here is not the same as the "bytecode interpreter",
   which is what regularly runs in threads, executing compiled Python code.

A Python thread represents the state necessary for the Python runtime
to *run* in an OS thread.  It also represents the execution of Python
code (or any supported C-API) in that OS thread.  Depending on the
implementation, this probably includes the current exception and
the Python call stack.  The Python thread always identifies the
interpreter it belongs to, meaning the state it shares
with other threads.

.. note::

   Here "Python thread" does not necessarily refer to a thread created
   using the :mod:`threading` module.

Each Python thread is associated with a single OS thread, which is where
it can run.  In the opposite direction, a single OS thread can have many
Python threads associated with it.  However, only one of those Python
threads is "active" in the OS thread at a time.  The runtime will operate
in the OS thread relative to the active Python thread.

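The association between a Python thread and its OS thread can be
observed through the :mod:`threading` module.  A sketch:
:func:`threading.get_ident` identifies the Python-level thread, while
:func:`threading.get_native_id` identifies the OS thread it runs in.

```python
import threading

info = {}

def record():
    # Record both identifiers from inside the worker thread.
    info["python_id"] = threading.get_ident()
    info["os_id"] = threading.get_native_id()

t = threading.Thread(target=record)
t.start()
t.join()

# While both threads were alive, they occupied distinct OS threads.
print(info["os_id"] != threading.get_native_id())  # True
```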
For an interpreter to be used in an OS thread, it must have a
corresponding active Python thread.  Thus switching between interpreters
means changing the active Python thread.  An interpreter can have Python
threads, active or inactive, for as many OS threads as it needs.  It may
even have multiple Python threads for the same OS thread, though at most
one can be active at a time.

Once a program is running, new Python threads can be created using the
:mod:`threading` module (on platforms and Python implementations that
support threads).  Additional processes can be created using the
:mod:`os`, :mod:`subprocess`, and :mod:`multiprocessing` modules.
You can run coroutines (async) in the main thread using :mod:`asyncio`.
Interpreters can be created and used with the
:mod:`concurrent.interpreters` module.

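A minimal sketch of a few of these, using only the standard library: a
new thread in the current process, a separate process with its own
Python runtime, and a coroutine run in the main thread.

```python
import asyncio
import subprocess
import sys
import threading

# A new Python thread in the current process:
t = threading.Thread(target=print, args=("hello from a thread",))
t.start()
t.join()

# A separate process, with its own Python runtime:
proc = subprocess.run(
    [sys.executable, "-c", "print('hello from a process')"],
    capture_output=True,
    text=True,
)
print(proc.stdout.strip())

# A coroutine run in the main thread:
async def greet():
    return "hello from a coroutine"

print(asyncio.run(greet()))
```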

.. rubric:: Footnotes

.. [#] This limitation occurs because the code that is executed by these operations

Review comment:
   I think it would be a good idea to mention that this is CPython
   specific.  Other implementations are allowed to have different runtime
   models.

Reply:
   Oh, oops, didn't see this.  Some parts are CPython specific here,
   though.

Reply:
   I'm trying to avoid any CPython-specific notions here.  Do you have an
   example of another runtime model?

Reply:
   No, I'm not familiar with how any other implementations work; this is
   coming solely off a bit of speculation on my part.  My concern is that
   a thread is inherently CPython-specific, because some Python
   implementations exist in areas that don't have access to OS threads,
   such as Brython for JS, and probably MicroPython/CircuitPython for
   some microcontrollers.

   I do think that this is a great section to have for subinterpreters, I
   just think it'd be better to keep some parts of it CPython specific.
   Does that make sense?

Reply:
   From a conceptual standpoint, which is the level of this text, threads
   are not implementation-specific (or platform-specific): the Python
   runtime always deals with at least the one thread in which it was
   started.  Not all platforms and implementations support multiple
   threads, but that's a different matter.  I'll add a sentence
   clarifying that point.

Reply:
   Ok, I think I can get behind that.  Thanks!

Reply:
   fixed

Reply:
   On the question of implementations that do not have access to OS
   threads, unless they are unable to create Thread objects, I think this
   is covered by the possibility that multiple Python threads may be
   mapped to the single OS thread available.  I'm assuming this is
   multiple Python threads in one interpreter, but I'm interested to be
   reassured about the multiplicity of the relationships in this data
   model.  I'm reading "OS thread" as "JVM Thread" for my purposes.  It's
   not necessarily the same, but I don't think we can/should peek behind
   that particular abstraction.