Week 3 and 4: Argparse F2PY
The much awaited F2PY with an argparse frontend is here. It is in review as I write this blog, but if you are feeling adventurous and can’t wait to use it, you can clone my fork of NumPy and check out the f2py_front branch. Installation of this experimental NumPy version is detailed in these instructions. (Of course I am joking, no one who gets excited over F2PY builds NumPy.)
Let’s dive into the changes that have taken place:-
Python has really good command line parsing frameworks such as click, Rich, Typer, and of course, the built-in argparse. However, argparse isn’t usually the first choice for a developer when it comes to adding CLIs. Many developers prefer typer for this purpose as it provides higher degrees of freedom and more polished interfaces for flags and arguments, autocompletion, and better code structure compared to argparse's rigid requirements and boilerplate, such as handling the Namespace with an if-else ladder and attaching the correct functions.
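To make the "Namespace with an if-else ladder" point concrete, here is a minimal sketch. It is a toy parser, not F2PY's real interface: the option names and the `dispatch` helper are assumptions made up for illustration.

```python
import argparse


def build_parser():
    # Hypothetical toy parser; the options only loosely mimic a build tool.
    parser = argparse.ArgumentParser(prog="toy")
    parser.add_argument("files", nargs="*", help="input files")
    parser.add_argument("-m", dest="module", default="untitled", help="module name")
    parser.add_argument("-c", dest="build", action="store_true", help="build the module")
    return parser


def dispatch(args):
    # argparse only hands back a Namespace; deciding what to run is up to us.
    if args.build:
        return f"build {args.module} from {args.files}"
    elif args.files:
        return f"generate wrappers for {args.module} from {args.files}"
    else:
        return "nothing to do"


args = build_parser().parse_args(["-c", "-m", "mylib", "source.f"])
print(dispatch(args))
```

The ladder is tedious, but it also means every decision is made in plain Python on plain values, which is handy when an older code path has to be kept alive.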
However, when you are renovating a CLI from the early 2000s and need to maintain backward compatibility, there is only so much you can modernise in one go. Additionally, numpy does not wish to force users to have third-party dependencies. argparse provides us with everything needed to implement all the features of F2PY’s sys.argv based front-end. In fact, because argparse forces the user to handle the Namespace, we were able to easily maintain backwards compatibility with the numpy.distutils compilation. Moreover, the other projects would add heavy dependencies to NumPy for maintaining and using only a very small part of the codebase.
So my GSoC mentor, Mr (soon to be Dr) Rohit Goswami, started building the frontend with argparse in October ’21. I wasn’t a part of this project until a little before GSoC this year. The goal was to modernise F2PY’s command line parsing interface, f2py2e.
Now, let me remind you of the major capabilities of F2PY:-
- Generating a signature file for the user to tweak the interface between Python and the library the user will build with F2PY.
- Generating the C wrapper of the Fortran code, which can be compiled and linked with the Fortran code to produce the desired Python library.
- Building the Python library. Basically, the automated 2nd step, where F2PY generates a C wrapper for a Fortran source (or a pyf) file and uses numpy.distutils for compilation and linking.
All 3 steps above make sense in a linear flow. Firstly, you create a signature file for your Fortran code using
f2py -m mypythonlib -h signature.pyf source.f # signature.pyf generated
and make sure the interface is correct. Secondly, you can generate a C wrapper using the created signature file,
f2py -m mypythonlib signature.pyf # Generate a mypythonlibmodule.c wrapper
but it is almost never required outside of debugging, unless the wrappers are to be explicitly modified and checked into version control. F2PY can reliably create the desired interface of the wrapper using your tweaked .pyf file. However, if you want to build your library using another build system, this C wrapper file is important. It needs to be bundled with the other Fortran sources and F2PY’s internal C libraries, as described here.
Thirdly, you will build your module using
f2py -c -m mypythonlib signature.pyf source.f
This will create your Python library shared object file, which you can import and use.
>>> import mypythonlib
>>> mypythonlib.sum([1, 2])
3.0
If you read through f2py2e, you will find that it has two main functions: run_main() for handling wrapper and signature file generation (steps 1 and 2) and run_compile() for building extension modules using numpy.distutils (step 3). Now, the f2py2e command line parsing design was a product of its times. The run_main() and run_compile() functions each implement command line parsing separately, in two different ways. They use different default values for important information like the build directory and module name.
run_compile runs the entire workflow by emulating what
run_main does internally with
run_main and run_compile are two divergent code paths, although they are parts of the same 3 step flow we discussed above.
This design led to a lot of problems while we tried to maintain backwards compatibility with
numpy.distutils. If you read my implementation f2pyarg here, you can see I am reconstructing command line arguments from Namespace because
distutils needs it to call
Probably in a year,
numpy.distutils will be deprecated, which means F2PY won’t need to maintain the convoluted flow for building modules using
numpy.distutils. F2PY is shifting to Meson, and by the time distutils is deprecated F2PY will be fully supported by the
meson build system.
To read more about
f2py2e you can visit my blog Week 1: F2PY Frontend.
f2pyarg does it?
f2pyarg is the ongoing reimplementation of F2PY’s frontend. I started working on it during GSoC after Rohit had created a basic layout by November last year. My plan was to split the bulky
f2py2e into two files -
f2pyarg for dealing with the parser, which would connect to a service.py file responsible for dealing with F2PY internals. My goal here is to create a hierarchical file structure which can be further improved. This step separates F2PY’s functionality from F2PY’s parser. The user can now access F2PY's functionality by passing values instead of feeding command line arguments to a parser.
f2pyarg.py's design is fairly straightforward and easy to understand. It has a main parser with 2 argument groups (the --hint-signature/-h group and the -c group). It has some helper functions to sanitize input arguments, and some command line flag reconstruction functions (e.g. get_f2py_flags_dist, get_fortran_compiler_flags) necessary for building with
process_args is the main argument handling part of the frontend. First we have to parse the positional arguments (the file paths etc). I faced a lot of problems here in implementing the only: flags of the original CLI, f2py2e. Basically, F2PY provides these flags so the user can pass the functions s/he would like to omit or select. But the argparse framework can’t accept only: as a flag; flags need to start with a hyphen, like --skip:. So these flags and their arguments were being sent to the positional argument list along with the source files. f2pyarg therefore has to segregate this mixture of files and functions.
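The segregation could look roughly like the sketch below. It is not the code in f2pyarg; the function name is hypothetical, and it assumes the classic F2PY syntax where a function list opened by `only:` or `skip:` is closed by a lone `:`.

```python
def split_files_and_functions(positionals):
    """Separate file paths from 'only:' / 'skip:' function lists.

    Assumption: a lone ':' token closes the currently open function list.
    """
    files, only, skip = [], [], []
    target = files  # the list that receives the next plain token
    for token in positionals:
        if token == "only:":
            target = only
        elif token == "skip:":
            target = skip
        elif token == ":":
            target = files  # back to collecting file paths
        else:
            target.append(token)
    return files, only, skip


files, only, skip = split_files_and_functions(
    ["source.f", "only:", "foo", "bar", ":", "extra.f90"]
)
print(files, only, skip)
```

Everything outside an open list is treated as a file path here; a real implementation would likely validate those paths as well.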
The files are then segregated based on their extension. The flow continues by sanitizing the module name and signature file, and creating settings for F2PY’s internal modules. I want to improve this settings mechanism in the future. The current way of creating these large dictionaries and passing them around doesn’t please me. If the user has chosen not to build the module right now (-c flag omitted), the generate files function simply creates the signature file or wrapper.
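Segregating files by extension can be sketched as follows. The extension sets and the function name are assumptions for illustration; the lists F2PY actually recognises may differ.

```python
from pathlib import Path

# Assumed extension sets for this sketch; the real lists may differ.
SIGNATURE_EXTS = {".pyf"}
FORTRAN_EXTS = {".f", ".for", ".f77", ".f90", ".f95"}


def segregate_by_extension(paths):
    """Group input paths into signature files, Fortran sources and the rest."""
    groups = {"signature": [], "fortran": [], "other": []}
    for p in paths:
        ext = Path(p).suffix.lower()  # compare case-insensitively
        if ext in SIGNATURE_EXTS:
            groups["signature"].append(p)
        elif ext in FORTRAN_EXTS:
            groups["fortran"].append(p)
        else:
            groups["other"].append(p)
    return groups


print(segregate_by_extension(["signature.pyf", "source.f", "helper.F90", "lib.o"]))
```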
If the user wants to build and passes the -c flag, we build the module using numpy.distutils. Now, the new numpy.distutils module building mechanism is messy too. It recreates the command line arguments from the Namespace arguments, numpy.distutils internally calls f2py2e so we are still not free from it, and so on. But we couldn’t improve it any further because of the time limitation. Furthermore, whatever fancier implementation we make will be deprecated soon, so it’s not worth it. We just cleaned up the original run_compile method as much as we could and stopped there to catch our breath. My next part of the project, implementing the backend build system with Meson, will probably make everything much cleaner and more maintainable.
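For readers wondering what "recreating the command line arguments from the Namespace" means in practice, here is a minimal sketch. The attribute names (`build`, `module`, `build_dir`, `files`) and the helper are hypothetical; they are not the names used in f2pyarg.

```python
import argparse


def namespace_to_argv(ns):
    """Rebuild an argv-style list from an already parsed Namespace.

    Hypothetical attribute names; they only illustrate the idea.
    """
    argv = []
    if getattr(ns, "build", False):
        argv.append("-c")
    if getattr(ns, "module", None):
        argv += ["-m", ns.module]
    if getattr(ns, "build_dir", None):
        argv += ["--build-dir", ns.build_dir]
    argv += list(getattr(ns, "files", []))
    return argv


ns = argparse.Namespace(
    build=True,
    module="mypythonlib",
    build_dir=None,
    files=["signature.pyf", "source.f"],
)
print(namespace_to_argv(ns))
```

Parsing arguments only to serialise them again is exactly the kind of round trip that makes this part of the flow feel convoluted.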
If you are a seasoned Python programmer or a veteran CLI designer, I urge you to review and drop constructive criticism on my implementation - f2pyarg.
I think F2PY’s frontend will improve a lot once we deprecate
f2pyarg still does wrapper generation and compilation separately due to the numpy.distutils requirement. Shifting F2PY completely to meson and possibly cmake will improve code reusability and the frontend code structure a lot. I am looking forward to making a good design in the upcoming months as I work on the Meson integration.
I am a beginner with a lot to learn about software design and programming. I will continue to maintain this code I have written and improve it with others’ help.
f2pyarg has come and it’s here to stay.