(Updated to remove details about pyconfigure now this is gone.)
The autogen.sh/configure/make steps are pretty unusual for a Python application. I suspect we may be able to add a lot of clarity and remove a significant barrier to contribution by looking at an alternative, but I'm not fully across the background here.
`./autogen.sh && ./configure && make` currently builds the translations, finds a Python interpreter, creates a virtualenv, and installs the Python/JS dependencies.
The translation building makes sense, and it's nice that the system can find the Python interpreter, but beyond that these tools aren't normally used to create virtualenvs or install Python/JS deps. That's all pretty unusual for a Python application, and adds a lot of fairly dense build config/infrastructure.
Installing could be a compelling feature if we're not installing as a Python package, but maybe a Python package would be easier.
This issue has become more urgent to address now due to autoconf version issues we're starting to see. See #12.
I asked for an opinion on #pypa IRC about packaging applications:
- sturm I'm working on packaging for the MediaGoblin project - which is a Python web app. I think it makes sense to distribute a PyPI package, but we have both translations to compile and some third-party JS libs to bundle. Conceptually, would we pre-build the translations and JS bundle, then include them as data files for a subsequent `python -m build`?
- sturm There's no way to add extra steps in `python -m build`, right? It's just for preparing the final distribution?
- webknjaz You need to learn about PEP517 in-tree build backends — this will allow you to add some code when producing artifacts.
- sturm webknjaz: great, thanks I'll go do some reading
- webknjaz OTOH it's questionable that you really need to distribute your app. Let me check what that is
- sturm webknjaz: I'd be very interested in your thoughts on whether a PyPI package is even a good idea. It seems like a grey area for applications.
- webknjaz Yep. I think it heavily depends on the application and the intended use of it.
- sturm webknjaz: any rules of thumb you'd suggest in deciding?
- webknjaz Well, libs and frameworks go to PyPI, and they have open-ended direct (usually only direct) deps specified.
- webknjaz For apps, it's different. If you expose some sort of a CLI (or runpy) executable, you could probably use PyPI as well.
- webknjaz But normally apps need stricter env specification.
- webknjaz Like exact pins of the whole dependency tree (including transitive deps).
- sturm stricter? as in only targeting a particular platform?
- sturm ah I see
- sturm we've been trying to keep pins very loose to hopefully ease distro packaging
- webknjaz Normally, this is a task for virtualenvs and requirements+constraints files, and the assumption that the users probably wouldn't attempt to install incompatible deps/packages into the said venv
- sturm that's a good point though, your existing PyPI release could break at any moment if a new incompatible dependency was released and you hadn't fully pinned it down.
- webknjaz That's why you mostly need to package a virtualenv (or something similar).
- webknjaz And there's a lot of tools that solve this problem in different (although similar) ways
- webknjaz There's zipapps (natively supported by the Python interpreter)
- webknjaz And tools like shiv or pex for creating them
- webknjaz Also, there's things like py2exe and similar, that produce OS-specific bundles even including a separate interpreter copy sometimes
- sturm ok, thanks - I'll look into that
- webknjaz FYI we maintain this documentation website https://packaging.python.org/
- webknjaz You should be able to find more pointers there
- sturm is the main hassle you'd expect for distributing an app on PyPI the problem of getting the many deps to align/remain compatible?
- sturm or are there other issues too?
- webknjaz Also, here's a few more specific pointers https://packaging.python.org/overview/#packaging-python-applications https://packaging.python.org/discussions/deploying-python-applications/
- sturm great, thanks
- webknjaz As for your last question: when you publish an open-ended thing, be prepared to continuously maintain/test it as the dependency tree will be changing over time (each new user install will attempt to download the latest deps, not those you tested against at the time of the publication)
- webknjaz Also, you'll be shifting the responsibility of managing the transitive dep tree pins in the installation locations to the end users.
- webknjaz I guess, if the end-users are pythonistas, it's not very problematic, just point them to pip-tools as an example solution. But you may end up having to educate folks who don't understand that they'd need to manage a proper venv. OTOH now that we've adopted pipx, it may ease some of the issues with attempting to install incompatible things.
- sturm right, makes sense. Is there any particular reason that dropping all the transitive deps into setup.py/cfg would be a bad idea?
- sturm (aside from that it's not generally done)
- webknjaz Well, normally, the right place for this is requirements. And pip-tools is not able to generate them in setup.cfg/py
- webknjaz But it may be fine if you tell the users to install with pipx
- sturm webknjaz: thanks very much, that's all really helpful - appreciate your time :)
- webknjaz Another trade-off with the pinned deps in the libs is that it would be hard to get security fixes somewhere deep in the tree. Every time you go this route, you will need to have a mechanism in place to make new releases.
- webknjaz You're welcome :)
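The in-tree build backend that webknjaz mentions could look roughly like the sketch below. The module name and the `_prebuild` step are hypothetical; the hook names are the standard PEP 517 ones, and everything simply delegates to setuptools after running extra steps.

```python
# _build_backend.py -- hypothetical in-tree PEP 517 backend, referenced from
# pyproject.toml via `build-backend = "_build_backend"` and `backend-path = ["."]`.
# It runs extra steps (e.g. compiling translations, bundling JS) before
# delegating to the stock setuptools backend.
from setuptools import build_meta as _orig

# Re-export the standard hooks that pip and `python -m build` call directly.
get_requires_for_build_wheel = _orig.get_requires_for_build_wheel
get_requires_for_build_sdist = _orig.get_requires_for_build_sdist
prepare_metadata_for_build_wheel = _orig.prepare_metadata_for_build_wheel


def _prebuild():
    # Placeholder: compile translations / bundle JS here before building.
    pass


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    _prebuild()
    return _orig.build_wheel(wheel_directory, config_settings, metadata_directory)


def build_sdist(sdist_directory, config_settings=None):
    _prebuild()
    return _orig.build_sdist(sdist_directory, config_settings)
```

This keeps the packaging metadata in the usual place and confines the custom build logic to one small module.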
This would be an improvement. As far as I can tell, this project doesn't need or benefit from autotools; it's easier to develop and build Python packages by sticking closely to the modern Python ecosystem.
I've been looking over the build system and making a checklist of changes and thoughts:
- If autotools is removed, the `bootstrap.sh` file can be removed. Besides autotools setup, all `bootstrap.sh` does is call `git submodule update --init`, which can be done manually.
- The build tool doesn't need to create a virtual environment for a developer. They can create a virtual environment at their own discretion. I think this is how most projects do things?
- Several of the Python packages are installed using `pip`. For example, the command `./bin/pip install sphinxcontrib-applehelp sphinxcontrib-htmlhelp sphinxcontrib-jsmath` appears in `Dockerfile-debian-11-sqlite`. These requirements can be moved into a `requirements.txt`; then `python -m pip install -r requirements.txt` will handle the installation.
- I don't think the source build should be pulling in `bower` itself. `bower` can be added as a dependency when packaging; otherwise, the developer should be able to figure it out, and `bower` should be called manually.
- `pdf.js` can be put into `bower.json`. That's one less submodule.
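For instance, the Sphinx extras from the Dockerfile example above could live in a `requirements.txt` along these lines (file name and exact contents assumed):

```
sphinxcontrib-applehelp
sphinxcontrib-htmlhelp
sphinxcontrib-jsmath
```

which is then installed in one step with `python -m pip install -r requirements.txt`.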
To hack on `mediagoblin` with this system, the developer would need to do something like the following:

```shell
python3 -m venv venv
source venv/bin/activate
git submodule update --init
npm install
python setup.py install
```
This seems okay. I doubt it's convenient or useful to package for `pip`, but if the build structure is modernized it should be easier to hack on and package the project as a Python package.
Final thought: does it even make sense to have a `setup.py`? I don't expect other code to call into `mediagoblin`, but the celery tasks need a way to import it.
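The celery constraint is just about importability: workers locate task code via normal imports, so `mediagoblin` must be on `sys.path`, which is exactly what installing the package (even with `pip install --editable .`) arranges. A minimal sketch of that idea, using a stdlib helper rather than celery itself (the missing module name is made up):

```python
# Celery workers resolve tasks by importing modules by dotted path, so the
# package has to be importable from the worker's environment.
import importlib


def is_importable(name):
    """Return True if `name` can be imported in this environment."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


print(is_importable("json"))                    # stdlib module: True
print(is_importable("mediagoblin_uninstalled"))  # hypothetical missing name: False
```

So even if nothing else ever does `import mediagoblin`, some form of installation (or an editable install for developers) is still needed for the workers.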
My understanding is that we need to have a configure script as it is a requirement for GNU software https://www.gnu.org/prep/standards/standards.html#Configuration (note the “should”, not “must”, so maybe I'm reading too much into it). I also just found https://www.gnu.org/help/evaluation.html#whatmeans, which states “[T]here are a few general principles. Certain parts of the GNU coding standards directly affect the consistency of the whole system. These include the standards for configuring and building a program, and the standards for command-line options. It is important to make all GNU programs follow these standards,” but immediately tempers this with “where they are applicable.” Maybe having a configure script is not applicable to a Python program.
Anyway, taking a different tack, I have refactored the configure/make stuff to be (I think) a bit nicer. My branch is here: https://git.sr.ht/~shtrom/mediagoblin/log/refactor-configure
In my view, the first part of the configure/make phase creates all the artifacts that are needed prior to building the Python package, which can then be built the normal way. This is the approach that I have taken there.
There is also a bit of logic around Dockerising the whole build (which was my main objective, until configure/make became too much of a sore spot). This way, we don't have to struggle too much with providing build dependencies (and the right versions thereof). Nonetheless, the configure script should now be a lot more specific at detecting them (and is used both to detect Docker on the host and to check the dependencies in the guest).
The work on that branch is not entirely complete. I do get a functional Docker container out of it, which is a good start, though it is a bit fat... The docs generation is still in progress, and I haven't fully tested the final Python wheel. A recurring concern I have is the dependency on python-gobject, which I think can only be built locally or by the distro, and is not a native Python package.
I believe the configure script is useful to attract developers accustomed to the GNU world, but not the Python world. And the current script is not a big one; effectively it is just a wrapper around Python setuptools. It could also be made a wrapper around pyconfigure with minimal work.
After learning a little about autoconf today, it struck me that we're not actually using pyconfigure at all (ie. the macros in m4/python.m4). I've gone ahead and deleted them, and the installation still runs no problem. That's a really good start.
I'm also looking at moving to `python -m pip install --editable .` in the Makefile rather than `python setup.py develop`. This will bring us closer still to a modern Python workflow.
Noting that we're now using `python -m pip install --editable .` in the Makefile rather than `python setup.py develop`. There are no longer any references to pyconfigure in the codebase.
Leaving this issue open, but I'll rename it to reflect that this is more about "configure" rather than "pyconfigure".
My current concern with our use of configure/make is that we're using it to install dependencies, which isn't what people normally do.
I agree. This is not the correct way to use autotools. Instead, the configure script should just detect that all the necessary dependencies are present, and the Makefile should merely use them.
This is the pattern I followed in my refactor and its various iterations, the most recent being https://git.sr.ht/~shtrom/mediagoblin/tree/refactor-configure-py311-bookworm
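As a rough illustration of that detect-only pattern (in Python rather than m4, and with an assumed tool list), the configure-time check boils down to:

```python
# Detect-only sketch: like `configure`, verify the build tools exist on PATH,
# but leave installing them to the user or the distro.
import shutil


def missing_build_deps(tools=("python3", "npm", "msgfmt")):
    """Return the subset of `tools` not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]


missing = missing_build_deps()
if missing:
    print("configure: missing tools:", ", ".join(missing))
else:
    print("configure: all build dependencies present")
```

The Makefile then assumes those tools are present and simply invokes them, rather than creating virtualenvs or running `pip` itself.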