Skip to content

MNT: add linter for thread-unsafe C API uses #28634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lvllvl
Copy link
Contributor

@lvllvl lvllvl commented Apr 2, 2025

Adding a linter to audit PRs.
This would lint out PRs that use problematic functions such as

  • PyList_GetItem
  • PyList_GET_ITEM
  • PyDict_GetItem
  • PyDict_getItemWithError
  • PyDict_Next
  • PyDict_GetItemString
  • _PyDict_GetItemStringWithError

Attempts to resolve #26159

@lvllvl lvllvl force-pushed the issue-26159 branch 5 times, most recently from 62be0b4 to b77b43a Compare April 3, 2025 01:50
@lvllvl lvllvl marked this pull request as ready for review April 3, 2025 02:19
@ngoldbaum
Copy link
Member

ngoldbaum commented Apr 3, 2025

Does this need a whole new CI job? I know I suggested that in the issue, but it occurs to me that this could be integrated into e.g. the spin lint command.

Also rather than just checking diffs we could do this codebase-wide and add e.g. // noqa comments throughout the codebase, since it is safe to use borrowed references in some situations, but I think having a linter that alerts contributors and reviewers about the problem is good.

If it ends up being really invasive to add the lint opt-outs we can reconsider.

If the lint fails, it should suggest what to do to fix the issue. One of:

  • use an API that returns a strong reference

or, add an opt-out to the lint if:

  • you are very sure that a lock is held
  • you are very sure you have the only reference to the object (e.g. it is newly created in the local scope).

The linter should also probably check all of the functions listed in the free-threaded HOWTO in the CPython docs:

https://docs.python.org/3/howto/free-threading-extensions.html#borrowed-references

Also see here for more details about why we need to do this and why we can't just globally replace all these functions:

https://py-free-threading.github.io/porting-extensions/#cpython-c-api-usage

We probably should also be linting uses of fast accessor macros that don't do any locking.

Also, FWIW, there are probably still lurking issues. I'm aware of Py_SEQUENCE_FAST at least (#28046), I've just had more pressing things to work on and haven't been able to finish that one off. I'm hoping to finish off that work soon, or at least upstream the trivial fixes I have but leave an issue open for the more complicated issues in the array coercion code.

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes in the right direction, but may need a bit of care. Overall, perhaps good to keep it simple, and maybe just use git grep for all, like,

git grep -e PyList_GetItem\( --or -e PyList_GET_ITEM\( ... origin/main...HEAD

Or perhaps better write the list to a file and use -f.

Notes:

  1. important to add the ( otherwise correct functions like PyList_GetItemRef will be matched too.
  2. You have to be sure to match all C files, so also .c.src and .cpp, etc. My suggestion would be to just match all files and perhaps remove .rst... since this only runs on the differences, false positives are not very likely.

@seberg
Copy link
Member

seberg commented Apr 4, 2025

I almost wonder if one could do a custom clang-tidy style check (not if that can that work with .c.src though, I admit)? That might also be useful for other projects then.

Plus that means we have an established pattern for allowing it, just add a // NOLINT: Newly created tuple comment. I think an inline code-comment is vastly preferential to some global allowlist.

@lvllvl lvllvl force-pushed the issue-26159 branch 5 times, most recently from e69e629 to 3cf055d Compare April 11, 2025 01:00
@ngoldbaum
Copy link
Member

This is looking much closer to being mergeable. Let me know if you need a hand with getting the circleci build fixed.

@ngoldbaum
Copy link
Member

.. or maybe it's unrelated and would go away with a rebase?

@lvllvl lvllvl force-pushed the issue-26159 branch 3 times, most recently from bd4afe5 to 5e57f30 Compare April 12, 2025 18:40
@lvllvl
Copy link
Contributor Author

lvllvl commented Apr 12, 2025

@ngoldbaum

  1. I think the rebase worked, kind of.
  2. The test is set to fail if any of the functions are used but currently I think there's about 92 uses of the functions, here's a link: C API Borrowed Ref Lint
  3. Should I assume that all the currently flagged uses of the functions are approved uses? I.e., should I just mark each listed instance as ok // noqa: borrowed-ref OK
  4. I think a failing version of this test would cause 4 tests to fail, azure-pipeline numpy, azure-pipeline numpy (ComprehensiveTest Lint), Linux, and Run MyPy/C API Borrowed Ref Lint. I think it's because those jobs run linter.py, is that ok if it functions that way?

@lvllvl lvllvl force-pushed the issue-26159 branch 2 times, most recently from 7811388 to 0ea4d9c Compare April 13, 2025 17:24
@lvllvl lvllvl requested a review from ngoldbaum April 13, 2025 18:10
@lvllvl lvllvl force-pushed the issue-26159 branch 3 times, most recently from af12173 to dccef23 Compare April 20, 2025 23:01
@lvllvl lvllvl force-pushed the issue-26159 branch 2 times, most recently from 7bc287b to 3e90169 Compare April 21, 2025 01:18
@lvllvl lvllvl force-pushed the issue-26159 branch 2 times, most recently from d155200 to 56b3d77 Compare April 21, 2025 17:48
@ngoldbaum
Copy link
Member

Let me know if you'd like a hand with some git surgery :)

@lvllvl
Copy link
Contributor Author

lvllvl commented Apr 21, 2025

@ngoldbaum looks like all the tests are passing now.

  1. Re: numpy/_core/src/common/pythoncapi_compat.h, I think I accidentally added this file into this location. So you were right, this was the source of all the additional lines. I removed this from my PR
  2. Re: Linter in tools/ci/check_c_api_usage.sh, I changed the ALL_FILES filter to ignore the submodule section. I tested it in my codespace and here, seems to work ok. The line is now:
ALL_FILES=$(find numpy -type f \( -name "*.c" -o -name "*.h" -o -name "*.c.src" -o -name "*.cpp" \) ! -path "*/pythoncapi-compat/*")

Let me know if you need any additional changes. thanks for the help!

@lvllvl lvllvl requested a review from ngoldbaum April 21, 2025 19:03
Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, I'm glad this is shaping up.

I guess my main issue with this is that we should be relatively sure that all of the ones we're marking as OK really are OK, otherwise we'll be misleading future readers.

I need to sit down with this PR and try to stare at all the uses and see if I can come up with possible problems.

We might need a new NOQA category for uses that are known to be problematic but need manual fixes. That would allow us to catch and triage new uses in CI while not needing to go through and fix absolutely everything that's still in the library.

I will try to do this soon but may not have a ton of time before PyCon.

@@ -108,7 +108,7 @@ PyUFuncOverride_GetOutObjects(PyObject *kwds, PyObject **out_kwd_obj, PyObject *
* PySequence_Fast* functions. This is required for PyPy
*/
PyObject *seq;
seq = PySequence_Fast(*out_kwd_obj,
seq = PySequence_Fast(*out_kwd_obj, // noqa: borrowed-ref OK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually have some changes I need to upstream related to this from back in February, see #28046 (comment). Maybe I should finally get around to sending in what I have...

@ngoldbaum
Copy link
Member

Hi all, I'm really sorry for the delay here. Totally my fault for letting it sit for so long.

I'll try to set aside some time to get this up to shape.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MNT: Add linter for thread-unsafe C API uses
4 participants