-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Make it easier to create python stub files #3268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
|
Filed as internal issue #USD-10058 |
|
/AzurePipelines run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Hi @chadrik - sorry for the delay; we finally had a chance to talk this over, and we like the idea of at least your step #1 at this point. We're going to do a little more testing for build-time impact, and we're pretty sure the new pxr_boost should support this, though if you had a chance to rebase against current dev (or 24.11) to verify, that would be awesome also! |
|
Hi all, I'm back on this! I'm going to rebase onto dev, and see if we can get this merged. By the way, I've succeeded in moving my stub generation process into OpenImageIO and I'm getting close on OpenColorIO. OpenEXR is next. I'm hoping to hit get the entire ASWF-and-friends done before SIGGRAPH. @spiffmon is there any new info I should be aware of? |
|
Nothing new to add, @chadrik , |
pmolodo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a few nitpicks, but I think it would be ok to merge even if they're not addressed...
| if (NOT PXR_BUILD_PYTHON_DOCUMENTATION) | ||
| _add_define(BOOST_PYTHON_NO_PY_SIGNATURES) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the documentation builds also build TF, I'm thinking that this comment may now be out of date, and we can remove the define unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setting of the BOOST_PYTHON_NO_PY_SIGNATURES define has moved here: https://github.com/PixarAnimationStudios/OpenUSD/blob/dev/pxr/external/boost/python/CMakeLists.txt#L327
I tried to port the same behavior, like so:
--- a/pxr/external/boost/python/CMakeLists.txt
+++ b/pxr/external/boost/python/CMakeLists.txt
@@ -321,11 +321,13 @@ pxr_library(python
DISABLE_PRECOMPILED_HEADERS
)
-# Disable insertion of C++ signatures in docstrings. We generate this
-# information separately via our Python doc build process.
-target_compile_definitions(python
- PUBLIC PXR_BOOST_PYTHON_NO_PY_SIGNATURES
-)
+if (NOT PXR_BUILD_PYTHON_DOCUMENTATION)
+ # Disable insertion of C++ signatures in docstrings. We generate this
+ # information separately via our Python doc build process.
+ target_compile_definitions(python
+ PUBLIC PXR_BOOST_PYTHON_NO_PY_SIGNATURES
+ )
+endif()Unfortunately, this leads to a build error that I don't understand.
/Users/chad/dev/USD/pxr/external/boost/python/type_id.hpp:88:9: error: 'typeid' of incomplete type 'pxrInternal_v0_25_8__pxrReserved__::ArAsset'
88 | typeid(T)
| ^
/Users/chad/dev/USD/pxr/external/boost/python/converter/registered.hpp:111:31: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::type_id<const volatile pxrInternal_v0_25_8__pxrReserved__::ArAsset &>' requested here
111 | return registry::lookup(type_id<T&>());
| ^
/Users/chad/dev/USD/pxr/external/boost/python/converter/registered.hpp:118:14: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::converter::detail::registry_lookup2<const volatile pxrInternal_v0_25_8__pxrReserved__::ArAsset>' requested here
118 | return registry_lookup2((T(*)())0);
| ^
/Users/chad/dev/USD/pxr/external/boost/python/converter/registered.hpp:129:64: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::converter::detail::registry_lookup1<const volatile pxrInternal_v0_25_8__pxrReserved__::ArAsset &>' requested here
129 | registration const& registered_base<T>::converters = detail::registry_lookup1(type<T>());
| ^
/Users/chad/dev/USD/pxr/external/boost/python/to_python_value.hpp:117:54: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::shared_ptr_to_python_value<const std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> &>::get_pytype<pxrInternal_v0_25_8__pxrReserved__::ArAsset>' requested here
117 | PyTypeObject const* get_pytype() const {return get_pytype((type<argument_type>*)0);}
| ^
/Users/chad/dev/USD/pxr/external/boost/python/detail/caller.hpp:137:98: note: in instantiation of member function 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::shared_ptr_to_python_value<const std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> &>::get_pytype' requested here
137 | return create_result_converter((PyObject*)0, (ResultConverter *)0, (ResultConverter *)0).get_pytype();
| ^
/Users/chad/dev/USD/pxr/external/boost/python/detail/caller.hpp:159:61: note: (skipping 6 contexts in backtrace; use -ftemplate-backtrace-limit=0 to see all)
159 | , &detail::converter_target_type<result_converter>::get_pytype
| ^
/Users/chad/dev/USD/pxr/external/boost/python/make_function.hpp:148:20: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::make_function_aux<std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> (pxrInternal_v0_25_8__pxrReserved__::ArResolver::*)(const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &) const, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::default_call_policies, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::type_list<std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset>, pxrInternal_v0_25_8__pxrReserved__::ArResolver &, const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &>, std::integral_constant<int, 1>>' requested here
148 | return detail::make_function_aux(
| ^
/Users/chad/dev/USD/pxr/external/boost/python/class.hpp:458:13: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::make_function<std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> (pxrInternal_v0_25_8__pxrReserved__::ArResolver::*)(const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &) const, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::default_call_policies, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::keywords<1>, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::type_list<std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset>, pxrInternal_v0_25_8__pxrReserved__::ArResolver &, const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &>>' requested here
458 | , make_function(
| ^
/Users/chad/dev/USD/pxr/external/boost/python/class.hpp:527:15: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::class_<pxrInternal_v0_25_8__pxrReserved__::ArResolver, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::noncopyable>::def_impl<pxrInternal_v0_25_8__pxrReserved__::ArResolver, std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> (pxrInternal_v0_25_8__pxrReserved__::ArResolver::*)(const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &) const, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::def_helper<pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::keywords<1>>>' requested here
527 | this->def_impl(
| ^
/Users/chad/dev/USD/pxr/external/boost/python/class.hpp:205:15: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::class_<pxrInternal_v0_25_8__pxrReserved__::ArResolver, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::noncopyable>::def_maybe_overloads<std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> (pxrInternal_v0_25_8__pxrReserved__::ArResolver::*)(const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &) const, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::keywords<1>>' requested here
205 | this->def_maybe_overloads(name, a1, a2, &a2);
| ^
/Users/chad/dev/USD/pxr/usd/ar/wrapResolver.cpp:97:10: note: in instantiation of function template specialization 'pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::class_<pxrInternal_v0_25_8__pxrReserved__::ArResolver, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::noncopyable>::def<std::shared_ptr<pxrInternal_v0_25_8__pxrReserved__::ArAsset> (pxrInternal_v0_25_8__pxrReserved__::ArResolver::*)(const pxrInternal_v0_25_8__pxrReserved__::ArResolvedPath &) const, pxrInternal_v0_25_8__pxrReserved__::pxr_boost::python::detail::keywords<1>>' requested here
97 | .def("OpenAsset", &This::OpenAsset,
| ^
/Users/chad/dev/USD/pxr/usd/ar/resolver.h:25:7: note: forward declaration of 'pxrInternal_v0_25_8__pxrReserved__::ArAsset'
25 | class ArAsset;
| ^
1 error generated.
gmake[2]: *** [pxr/usd/ar/CMakeFiles/_ar.dir/build.make:233: pxr/usd/ar/CMakeFiles/_ar.dir/wrapResolver.cpp.o] Error 1
My goal in not setting BOOST_PYTHON_NO_PY_SIGNATURES is to allow boost to add its representation of the python signatures to the python docstrings.
In an ideal world, instead of adding these to the __doc__ attribute of each object we would write them to disk as a build artifact that can be read by the custom docstring generator. Now that we have our own fork of boost that might actually be on the table, but either way this build error needs to be resolved.
Unfortunately this has reached a level of complexity that I won't be able to solve it on my own, which means I can't make stubs for any versions of USD after the switch to pxr_boost. Any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how far on the journey I can go, but this particular error is saying that wrapResolver.cpp, with that option enabled, requires a complete declaration of ArAsset, whereas resolver.h is only forward declaring it. So I think you just need to include asset.h in wrapResolver.cpp
| not parentPath[-1].name.startswith(obj.__module__.split(".")[1]): | ||
| # The doxygen object clearly indicates it's from another module. | ||
| obj = None | ||
| elif type(obj).__module__ != 'Boost.Python': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the module name changed with pxr_boost, but seeing this reminds me that you didn't explicitly confirm if you've tested this with pxr_boost yet...
| # boost signature has been prepended. that means if there is | ||
| # an existing description provided in the C++ wrapper it will be | ||
| # indented. strip the signature and dedent the description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an example before / after docstring (either real or made up) would greatly help understanding here. I had to re-read this a few times before I thought I understood what's going on, as my initial assumption about what these would look like was wrong.
| for line in lines: | ||
| if not found and line.startswith(' '): | ||
| found = True | ||
| if found: | ||
| newLines.append(line) | ||
| return textwrap.dedent('\n'.join(newLines)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, basically we're finding the first indented line, grabbing it and all following, and then dedenting.
I'm assuming you haven't encountered the situation where unindented text follows indented, ie:
A
B
C
D
E
F
...but we should consider what to do if it happens. Current behavior would be to return:
C
D
E
F
...which is likely not desired.
Simplest solution is to just error if we're fairly sure this should never happen, and/or can't make any guesses at what the best behavior would be if it does. Other options would be to return None, or one of:
C
D
F
or
C
D
E
F
...all of which feel more likely to be correct.
| found = False | ||
| newLines = [] | ||
| for line in lines: | ||
| if not found and line.startswith(' '): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also add a and line.strip() check here? ie, if we have several lines of indented-but-empty lines preceeding the start of the non-whitespace section, we'll probably want to strip those too.
Would also (help) prevent us from returning a result that's nothing but whitespace-only lines. (We'd also need to check if found=False after the loop).
| for line in lines: | ||
| if not found and line.startswith(' '): | ||
| found = True | ||
| if found: | ||
| newLines.append(line) | ||
| return textwrap.dedent('\n'.join(newLines)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we return None if found = False, or error?
| return None | ||
| if len(doc) > 0 and not overloads[0].isModule(): | ||
| newDoc = self.__stripBoostSig(doc) | ||
| if newDoc is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some cases where __stripBoostSig() could return a non-None string that's either empty or nothing but whitespace. We should either clean those up, or change this check to if newDoc is None or not newDoc.strip():, or both (an effectively empty docstring is probably never the correct thing to return, I'm guessing).
| and not overloads[0].isModule(): | ||
| Debug("Docstring exists for %s - skipping" % pypath) | ||
| return None | ||
| if len(doc) > 0 and not overloads[0].isModule(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to add the old we always override the module doc string for now... comment back here
| try: | ||
| module = importlib.import_module(".__DOC", f_locals["__name__"]) | ||
| module.Execute(f_locals) | ||
| if os.environ.get("PXR_USD_PYTHON_DISABLE_DOCS", "false").lower() not in ("1", "true", "yes"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tf.GetEnvBool() also returns true for "on" - we should probably follow that convention:
https://github.com/PixarAnimationStudios/OpenUSD/blob/v25.05.01/pxr/base/tf/getenv.h#L48
@pmolodo, thanks for the review! Sorry it's taken me awhile to get back to this. I'll address a few of your notes now, but I'm unfortunately a bit stuck putting this back together after the pxr_boost change. I'm hoping that someone can point me in the right direction. |
This PR makes it easier to create python stub files, which are used for IDE completion and static type analysis. The direct benefactors of this change are a very niche group, but we're a group that support the community by providing tools that make developing with OpenUSD a more productive and pleasurable experience.
I'm using these changes to create stubs which are published to pypi here. You can peruse the stub files here. For the past year or two I've had to maintain my own fork of OpenUSD to make this possible.
In order to sell you on this change, I'm going to demonstrate that the function signatures in the stub files that I'm creating are superior to those included with docstrings that ship with OpenUSD, and I'd like to plant the idea that this PR is the first step towards an eventual integration of stub generation into the native OpenUSD build process.
Background
As someone who wants very accurate python stubs, I've faced a handful of challenges:
object. However, it does give us the actual number of python overloads, and is reliable about certain result types likelistandtuple, and this extra information can help fill the gap between live inspection of python objects and the rich information in the C++ docs.Comparison between python signatures: yours (native) vs mine (cg-stubs)
Overview
As part of its build process, OpenUSD generates docstrings in side-car python files named
__DOC.py, which are loaded at import time. In order to make use of these docstrings within your IDE, it must be configured to be able to import the OpenUSD python modules to inspect them.By comparison, pyi stub files are a lightweight approach to load not only function and class documentation into your IDE but also type information, which is informative, and is used to drive code completion. Additionally, stub files can be used by static type analysis tools like mypy to validate your code. Providing access to stub files within your IDE is typically as simple as running
pip install types-usdwithin your project's virtualenv.Quality
This is the docstring that OpenUSD generates for the method
Usd.ClipsAPI.GetClipAssetPaths:These are my stubs generated for the same methods:
You'll notice that the docstrings present in yours and mine are equivalent (and share the same whitespace idiosyncracies) because I'm using the same
doxygenlibmodules to process docstrings.The primary difference is in the overload signatures. Here's a more direct side-by-side comparsion:
First overload
Second overload
tl;dr The native signatures are wrong in many ways. Below is a table explaining the sources of these differences:
std::vectorandstd::sequencestd::set,std::unordered_set,std::function,std::map,std::unordered_map,std::optionalAr.ResolvedPathtypedefandusingstatements to substitute these aliases for their actual typesselforclsargsselfandclsargs for methodsOk, I'm convinced. What next?
Here's my proposed 3-step plan:
Step 1
Merge this PR.
Step 2
I will contribute type annotations to several open source project which I can use to further validate my stubs, and continue to broadcast the existence of these stubs far and wide to drive adoption. I've already created a PR for the great USD Cookbook repo. Ideally, if Pixar is open to it, I would like to contribute type annotations to OpenUSD itself, for example in
Usdviewq. To ease into this, I have a followup to this PR to add annotations todoxygenlib.Step 3
If we are all in agreement, then the final step would be to move the pyi stub generation process into OpenUSD itself. I'm a pretty busy guy and moving the project into OpenUSD would ensure that it undergoes more regular releases than I can manage. I will be around to help out with maintenance of the generator.
Q & A
What has changed under the hood?
There are two main changes in this PR, both limited to the build process:
doxygenlibis updated to hide these signatures from users.Tf.PreparePythonModuleoverrides function__doc__attributes with values from__DOC.py. This behavior has remained unchanged. What has changed is the functions that are included in__DOC.pyfiles during the build process. Prior to this change,cdWriterDocstringomitted entries in__DOC.pyfor functions which had a docstring provided by Boost, because an existing docstring indicated that it had been manually authored in the C++ wrapper.After this PR, all functions have docstrings generated by Boost (because of the new Boost signatures), so we have to do a little extra work to determine if the Boost docstrings are just the auto-generated python signatures, or custom docstrings from the wrappers. Now when
cdWriterDocstringprocesses an existing docstring, it removes the Boost signatures and if there's anything left it writes it to__DOC.py.The upshot is that the additional docstrings make the python lib directory slightly larger, 25MB, or about 0.6% to a full OpenUSD build:
What are the differences for the user?
As mentioned, I've taken great pains to preserve the original behavior as much as possible, however, there are some edge cases.
Functions for which
cdWriterDocstringfound a C++ match in doxygen will remain the same, because they will recieve an entry in__DOC.py. However, there are functions whichcdWriterDocstringnever visits and which therefore do not receive an override in__DOC.py. These functions will now have docstrings generated by Boost.For example, before this PR, the following function had no docstring:
$ python3.9 -c "import pxr.Sdf as Sdf;print(Sdf.ListEditorProxy_SdfNameKeyPolicy.ContainsItemEdit.__doc__)" NoneAfter this PR, it has a docstring generated by Boost:
Why enable the boost signatures by default?
If we follow through with the 3-step plan above, building with signatures will need to be the default, because we will not want to build one version of OpenUSD with signatures enabled to create the stubs, and then rebuild with signatures disabled.
Why not provide an option to enable the Boost signatures?
I'm open to doing this, but adding an option to enable the Boost signatures adds complexity to the doxygenlib code because it needs to produce parity in output with both scenarios.
I know this is a lot to read, but I thought I'd front load all of the context necessary to make a decision on this PR. Take your time and ask as many questions as you like.
Thanks!