Commit 21f8da8

Merge branch 'main' into gh-143511-remote-exec-docs
2 parents 269d9ff + 5f57f69 commit 21f8da8

60 files changed (+1439, -280 lines)

Android/testbed/app/build.gradle.kts

Lines changed: 6 additions & 1 deletion
@@ -92,7 +92,12 @@ android {
 }
 throw GradleException("Failed to find API level in $androidEnvFile")
 }
-targetSdk = 35
+
+// This controls the API level of the maxVersion managed emulator, which is used
+// by CI and cibuildwheel. 34 takes up too much disk space (#142289), 35 has
+// issues connecting to the internet (#142387), and 36 and later are not
+// available as aosp_atd images yet.
+targetSdk = 33
 
 versionCode = 1
 versionName = "1.0"

Doc/c-api/memory.rst

Lines changed: 5 additions & 1 deletion
@@ -677,7 +677,11 @@ The pymalloc allocator
 Python has a *pymalloc* allocator optimized for small objects (smaller or equal
 to 512 bytes) with a short lifetime. It uses memory mappings called "arenas"
 with a fixed size of either 256 KiB on 32-bit platforms or 1 MiB on 64-bit
-platforms. It falls back to :c:func:`PyMem_RawMalloc` and
+platforms. When Python is configured with :option:`--with-pymalloc-hugepages`,
+the arena size on 64-bit platforms is increased to 2 MiB to match the huge page
+size, and arena allocation will attempt to use huge pages (``MAP_HUGETLB`` on
+Linux, ``MEM_LARGE_PAGES`` on Windows) with automatic fallback to regular pages.
+It falls back to :c:func:`PyMem_RawMalloc` and
 :c:func:`PyMem_RawRealloc` for allocations larger than 512 bytes.
 
 *pymalloc* is the :ref:`default allocator <default-memory-allocators>` of the

Doc/library/subprocess.rst

Lines changed: 18 additions & 3 deletions
@@ -803,14 +803,29 @@ Instances of the :class:`Popen` class have the following methods:
 
 .. note::
 
-When the ``timeout`` parameter is not ``None``, then (on POSIX) the
-function is implemented using a busy loop (non-blocking call and short
-sleeps). Use the :mod:`asyncio` module for an asynchronous wait: see
+When ``timeout`` is not ``None`` and the platform supports it, an
+efficient event-driven mechanism is used to wait for process termination:
+
+- Linux >= 5.3 uses :func:`os.pidfd_open` + :func:`select.poll`
+- macOS and other BSD variants use :func:`select.kqueue` +
+``KQ_FILTER_PROC`` + ``KQ_NOTE_EXIT``
+- Windows uses ``WaitForSingleObject``
+
+If none of these mechanisms are available, the function falls back to a
+busy loop (non-blocking call and short sleeps).
+
+.. note::
+
+Use the :mod:`asyncio` module for an asynchronous wait: see
 :class:`asyncio.create_subprocess_exec`.
 
 .. versionchanged:: 3.3
 *timeout* was added.
 
+.. versionchanged:: 3.15
+if *timeout* is not ``None``, use efficient event-driven implementation
+on Linux >= 5.3 and macOS / BSD.
+
 .. method:: Popen.communicate(input=None, timeout=None)
 
 Interact with process: Send data to stdin. Read data from stdout and stderr,
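
The Linux path described in the note above can be sketched roughly as follows. This is an illustrative approximation, not the code added by this commit: `wait_for_exit` is a hypothetical helper name, and the fallback loop with its 50 ms sleep is an assumption about what a "busy loop" might look like. The helper only detects termination; the caller still reaps the child with `Popen.wait()`.

```python
import os
import select
import subprocess
import sys
import time


def wait_for_exit(proc, timeout):
    """Return True if `proc` (a subprocess.Popen) exits within `timeout` seconds.

    Hypothetical helper for illustration only; not CPython's implementation.
    """
    pidfd = None
    if hasattr(os, "pidfd_open") and hasattr(select, "poll"):
        try:
            pidfd = os.pidfd_open(proc.pid)
        except OSError:
            pidfd = None  # e.g. the running kernel lacks pidfd support
    if pidfd is not None:
        try:
            poller = select.poll()
            poller.register(pidfd, select.POLLIN)
            # poll() takes milliseconds; the pidfd becomes readable on exit.
            return bool(poller.poll(int(timeout * 1000)))
        finally:
            os.close(pidfd)
    # Fallback: non-blocking check plus short sleeps (the "busy loop").
    deadline = time.monotonic() + timeout
    while proc.poll() is None:
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.05)
    return True
```

A caller might use it as `if wait_for_exit(proc, 5.0): rc = proc.wait()`. On platforms without `os.pidfd_open` the sketch simply takes the polling path.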

Doc/using/configure.rst

Lines changed: 15 additions & 0 deletions
@@ -783,6 +783,21 @@ also be used to improve performance.
 
 See also :envvar:`PYTHONMALLOC` environment variable.
 
+.. option:: --with-pymalloc-hugepages
+
+Enable huge page support for :ref:`pymalloc <pymalloc>` arenas (disabled by
+default). When enabled, the arena size on 64-bit platforms is increased to
+2 MiB and arena allocation uses ``MAP_HUGETLB`` (Linux) or
+``MEM_LARGE_PAGES`` (Windows) with automatic fallback to regular pages.
+
+The configure script checks that the platform supports ``MAP_HUGETLB``
+and emits a warning if it is not available.
+
+On Windows, use the ``--pymalloc-hugepages`` flag with ``build.bat`` or
+set the ``UsePymallocHugepages`` MSBuild property.
+
+.. versionadded:: 3.15
+
 .. option:: --without-doc-strings
 
 Disable static documentation strings to reduce the memory footprint (enabled

Doc/whatsnew/3.15.rst

Lines changed: 20 additions & 0 deletions
@@ -743,6 +743,20 @@ ssl
 
 (Contributed by Ron Frederick in :gh:`138252`.)
 
+subprocess
+----------
+
+* :meth:`subprocess.Popen.wait`: when ``timeout`` is not ``None`` and the
+platform supports it, an efficient event-driven mechanism is used to wait for
+process termination:
+
+- Linux >= 5.3 uses :func:`os.pidfd_open` + :func:`select.poll`.
+- macOS and other BSD variants use :func:`select.kqueue` + ``KQ_FILTER_PROC`` + ``KQ_NOTE_EXIT``.
+- Windows keeps using ``WaitForSingleObject`` (unchanged).
+
+If none of these mechanisms are available, the function falls back to the
+traditional busy loop (non-blocking call and short sleeps).
+(Contributed by Giampaolo Rodola in :gh:`83069`).
 
 symtable
 --------
@@ -1463,6 +1477,12 @@ Build changes
 modules that are missing or packaged separately.
 (Contributed by Stan Ulbrych and Petr Viktorin in :gh:`139707`.)
 
+* The new configure option :option:`--with-pymalloc-hugepages` enables huge
+page support for :ref:`pymalloc <pymalloc>` arenas. When enabled, arena size
+increases to 2 MiB and allocation uses ``MAP_HUGETLB`` (Linux) or
+``MEM_LARGE_PAGES`` (Windows) with automatic fallback to regular pages.
+On Windows, use ``build.bat --pymalloc-hugepages``.
+
 * Annotating anonymous mmap usage is now supported if Linux kernel supports
 :manpage:`PR_SET_VMA_ANON_NAME <PR_SET_VMA(2const)>` (Linux 5.17 or newer).
 Annotations are visible in ``/proc/<pid>/maps`` if the kernel supports the feature

Include/internal/pycore_code.h

Lines changed: 0 additions & 10 deletions
@@ -292,17 +292,7 @@ extern int _PyCode_SafeAddr2Line(PyCodeObject *co, int addr);
 extern void _PyCode_Clear_Executors(PyCodeObject *code);
 
 
-#ifdef Py_GIL_DISABLED
-// gh-115999 tracks progress on addressing this.
-#define ENABLE_SPECIALIZATION 0
-// Use this to enable specialization families once they are thread-safe. All
-// uses will be replaced with ENABLE_SPECIALIZATION once all families are
-// thread-safe.
-#define ENABLE_SPECIALIZATION_FT 1
-#else
 #define ENABLE_SPECIALIZATION 1
-#define ENABLE_SPECIALIZATION_FT ENABLE_SPECIALIZATION
-#endif
 
 /* Specialization functions, these are exported only for other re-generated
 * interpreters to call */

Include/internal/pycore_jit.h

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ typedef _Py_CODEUNIT *(*jit_func)(
 int _PyJIT_Compile(_PyExecutorObject *executor, const _PyUOpInstruction *trace, size_t length);
 void _PyJIT_Free(_PyExecutorObject *executor);
 void _PyJIT_Fini(void);
+PyAPI_FUNC(int) _PyJIT_AddressInJitCode(PyInterpreterState *interp, uintptr_t addr);
 
 #endif // _Py_JIT
 

Include/internal/pycore_obmalloc.h

Lines changed: 19 additions & 3 deletions
@@ -208,7 +208,11 @@ typedef unsigned int pymem_uint; /* assuming >= 16 bits */
 * mappings to reduce heap fragmentation.
 */
 #ifdef USE_LARGE_ARENAS
-#define ARENA_BITS 20 /* 1 MiB */
+# ifdef PYMALLOC_USE_HUGEPAGES
+# define ARENA_BITS 21 /* 2 MiB */
+# else
+# define ARENA_BITS 20 /* 1 MiB */
+# endif
 #else
 #define ARENA_BITS 18 /* 256 KiB */
 #endif
@@ -469,7 +473,7 @@ nfp free pools in usable_arenas.
 */
 
 /* How many arena_objects do we initially allocate?
-* 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the
+* 16 = can allocate 16 arenas = 16 * ARENA_SIZE before growing the
 * `arenas` vector.
 */
 #define INITIAL_ARENA_OBJECTS 16
@@ -512,14 +516,26 @@ struct _obmalloc_mgmt {
 
 memory address bit allocation for keys
 
-64-bit pointers, IGNORE_BITS=0 and 2^20 arena size:
+ARENA_BITS is configurable: 20 (1 MiB) by default on 64-bit, or
+21 (2 MiB) when PYMALLOC_USE_HUGEPAGES is enabled. All bit widths
+below are derived from ARENA_BITS automatically.
+
+64-bit pointers, IGNORE_BITS=0 and 2^20 arena size (default):
 15 -> MAP_TOP_BITS
 15 -> MAP_MID_BITS
 14 -> MAP_BOT_BITS
 20 -> ideal aligned arena
 ----
 64
 
+64-bit pointers, IGNORE_BITS=0 and 2^21 arena size (hugepages):
+15 -> MAP_TOP_BITS
+15 -> MAP_MID_BITS
+13 -> MAP_BOT_BITS
+21 -> ideal aligned arena
+----
+64
+
 64-bit pointers, IGNORE_BITS=16, and 2^20 arena size:
 16 -> IGNORE_BITS
 10 -> MAP_TOP_BITS
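
The bit-width tables in the comment above can be reproduced with a little arithmetic. The sketch below is a guess at the derivation, not the header's actual macros: it assumes the top and middle levels each take the remaining address bits divided by three and rounded up, with the bottom level taking what is left. `radix_bits` is a hypothetical name used only for illustration.

```python
def radix_bits(arena_bits, ignore_bits=0, pointer_bits=64):
    """Split the address bits above the arena offset into three map levels.

    Assumption (not confirmed by the diff): top and mid levels each get
    ceil(remaining / 3) bits and the bottom level gets the remainder.
    """
    address_bits = pointer_bits - ignore_bits
    remaining = address_bits - arena_bits
    interior = (remaining + 2) // 3  # integer ceil(remaining / 3)
    return {
        "top": interior,
        "mid": interior,
        "bot": remaining - 2 * interior,
        "arena": arena_bits,
    }


def initial_arena_bytes(arena_bits, initial_arena_objects=16):
    """Bytes covered by the initial `arenas` vector: 16 * ARENA_SIZE."""
    return initial_arena_objects * (1 << arena_bits)
```

Under this assumption, `radix_bits(20)` and `radix_bits(21)` match the two IGNORE_BITS=0 tables. `initial_arena_bytes` gives 4 MiB only for 256 KiB arenas (`ARENA_BITS` 18), 16 MiB for 1 MiB arenas and 32 MiB for 2 MiB arenas, so the initial capacity depends on `ARENA_BITS`; this is consistent with the commit dropping the fixed "4MB" figure from the comment.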

Lib/profiling/sampling/sample.py

Lines changed: 17 additions & 12 deletions
@@ -42,6 +42,7 @@ def _pause_threads(unwinder, blocking):
 LiveStatsCollector = None
 
 _FREE_THREADED_BUILD = sysconfig.get_config_var("Py_GIL_DISABLED") is not None
+
 # Minimum number of samples required before showing the TUI
 # If fewer samples are collected, we skip the TUI and just print a message
 MIN_SAMPLES_FOR_TUI = 200
@@ -64,19 +65,23 @@ def __init__(self, pid, sample_interval_usec, all_threads, *, mode=PROFILING_MOD
         self.realtime_stats = False
 
     def _new_unwinder(self, native, gc, opcodes, skip_non_matching_threads):
-        if _FREE_THREADED_BUILD:
-            unwinder = _remote_debugging.RemoteUnwinder(
-                self.pid, all_threads=self.all_threads, mode=self.mode, native=native, gc=gc,
-                opcodes=opcodes, skip_non_matching_threads=skip_non_matching_threads,
-                cache_frames=True, stats=self.collect_stats
-            )
+        kwargs = {}
+        if _FREE_THREADED_BUILD or self.all_threads:
+            kwargs['all_threads'] = self.all_threads
         else:
-            unwinder = _remote_debugging.RemoteUnwinder(
-                self.pid, only_active_thread=bool(self.all_threads), mode=self.mode, native=native, gc=gc,
-                opcodes=opcodes, skip_non_matching_threads=skip_non_matching_threads,
-                cache_frames=True, stats=self.collect_stats
-            )
-        return unwinder
+            kwargs['only_active_thread'] = bool(self.all_threads)
+
+        return _remote_debugging.RemoteUnwinder(
+            self.pid,
+            mode=self.mode,
+            native=native,
+            gc=gc,
+            opcodes=opcodes,
+            skip_non_matching_threads=skip_non_matching_threads,
+            cache_frames=True,
+            stats=self.collect_stats,
+            **kwargs
+        )
 
     def sample(self, collector, duration_sec=None, *, async_aware=False):
         sample_interval_sec = self.sample_interval_usec / 1_000_000
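
The thread-selection logic in the new `_new_unwinder` can be isolated as a small pure function, which makes the four input combinations easy to inspect. This is only a sketch of the branching shown in the hunk above: `unwinder_thread_kwargs` is a hypothetical name, and the real method passes the result straight to `_remote_debugging.RemoteUnwinder`.

```python
def unwinder_thread_kwargs(free_threaded_build, all_threads):
    """Mirror the kwargs selection from the diff (illustrative only)."""
    kwargs = {}
    if free_threaded_build or all_threads:
        kwargs["all_threads"] = all_threads
    else:
        # Reached only when all_threads is falsy, so this is always False.
        kwargs["only_active_thread"] = bool(all_threads)
    return kwargs


# Enumerate every combination of build type and all_threads setting.
for free_threaded in (False, True):
    for all_threads in (False, True):
        print(free_threaded, all_threads,
              unwinder_thread_kwargs(free_threaded, all_threads))
```

Compared with the removed code, a build without free threading and with `all_threads=True` now passes `all_threads=True` where it previously passed `only_active_thread=True`.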
