One of the perks of my new gig at Tembo is that I have more time to work on PGXN. In the last ten years I've had very little time to give, so things have stagnated. The API, for example, hasn't seen a meaningful update since 2016!
But that's all changed now, and every bit of the PGXN architecture has experienced a fair bit of TLC in the last few weeks. A quick review.
PGXN Manager provides user registration and extension release services. It maintains the root registry of the project, ensuring the consistency of download formatting and naming ($extension-$version.zip if you're wondering). Last year I added a LISTEN/NOTIFY queue for publishing new releases to Twitter and Mastodon, replacing the old Twitter posting on upload. It has worked quite well (although Twitter killed the API so we no longer post there), but has sometimes disappeared for days or weeks at a time. I've futzed with it over the past year, but last weekend I think I finally got to the bottom of the problem.
As a result, the PGXN Mastodon account should say much more up-to-date than it sometimes has. I've also added additional logging capability, because sometimes the posts fail and I need better clarity into what failed and how so it, too, can be fixed. So messaging should continue to become more reliable. Oh, and testing and unstable releases are now prominently marked as such in the posts (or will be, next time someone uploads a non-stable release).
I also did a little work on the formatting of the How To doc, upgrading to MultiMarkdown 6 and removing some post-processing that appended unwanted backslashes to the ends of code lines.
PGXN API (docs) has been the least loved part of the architecture, but all that changed in the last couple weeks. I finally got over my trepidation and got it working where I could hack on it again. Turns out, the pending issues and to-dos — some dating back to 2011! — weren't so difficult to take on as I had feared. In the last few weeks, API has finally come unstuck and evolved:
Switched the parsing of Markdown documents from the quite old Text::Markdown module to CommonMark, which is a thin wrapper around the well-maintained and modern cmark library. The main way in which this change will be visible is in the support for fenced code blocks. Compare the pg_later 0.0.14 README, formatted with the old parser, with pg_later 0.1.0 README, formatted with the new parser.
Updated the full text indexer to index the file identified in the META.json as the documentation for an extension even if the file name is different from the extension (issue 10), including the README.
Furthermore, if the indexer finds no documentation for the extension, either in its metadata or an appropriately-named file, it falls back on the README (issue 12).
These two fixes, the oldest and most significant of the whole system, greatly increase the accuracy of documentation search, because so many extensions have no docs other than the README. Prior to this fix, for example, a search for "later" failed to turn up pg_later, and now it's appropriately the first result.
The indexer now indexes releases marked "testing" or "unstable" in the metadata if there is no stable release (issue 2!). This makes new extensions under development easier to find on the site than previously, when only stable releases were indexed. However, once a stable release is made, no more testing or unstable releases will be indexed. I believe this is more intuitive behavior.
Also fixed some permissions issue, ensuring that source code unzipped for browsing is accessible to the app. In other words, a zip file without READ permission would trigger a not found response; the API now ensures that all files and directories are accessible to the application.
As a bonus, the API now also recognizes plain text files (.txt and .text) as documentation and indexes their contents and adds links for display on the site. Previously plain text files other than READMEs were ignored (issue 13).
I'm quite pleased with some of these fixes, and relieved to have finally cleared out the karmic overhang of the backlog. But the indexing and HTML rendering changes, in particular, are tactical: until the entire registry can be re-indexed, only new uploads will benefit from the indexing improvements. I hope to find time to work on the indexing project soon.
The pgxn-site project powers the main site, pgxn.org, and has seen 7 new releases since the beginning of the year! Notable changes:
Changed the default search index from Documentation too Distributions, because most release include no docs other than a READMe, which was indexed as part of the distribution and not the extension contained in a distribution. As a result, searches return more relevant and comprehensive results. However, I say "was indexed" because, as described above, the PGXN API has since been updated to appropriately identify and index READMEs as extension documentation. So this change might get reverted at some point, but not before the entire registry can be re-indexed.
Changed the link to the How To in the footer from the somewhat opaque "Release It!" to "Release on PGXN". Should make it clearer what it is. Also, have you considered releasing your extension on PGXN? You should!
Fixed "not found" errors for links to files with uppercase letters, such as /dist/vectorize/vector-serve/README.html.
Tweaked the CSS a bit to improve list item alignment in the "Contents" of doc pages (like semver), as well as the spacing between definition lists, as in the FAQ.
Switched the parsing and formatting of the static pages (about, art, FAQ, feedback, and mirroring) from the quite ancient Text::MultiMarkdown module to MultiMarkdown 6.
Oh, and I fixed or removed a bunch of outdated links from those pages, and replaced all relevant HTTP URLs with HTTPS URLs. Then had to revert those changes from a bunch of SVG files, where xmlns="http://www.w3.org/2000/svg" is relevant but xmlns="https://www.w3.org/2000/svg" is not. 🤦🏻♂️
I've made a number of more iterative improvements, as well, mostly to the maintainability of PGXN projects:
The GitHub repositories for all of these projects have been updated with comprehensive test and release workflows, as well as improved Perl release configuration.
The WWW::PGXN, PGXN::API::Searcher Perl modules have been updated with test and release workflows and improved Perl release configuration, as well.
The libexec configuration of the PGXN Client Homebrew formula has been fixed, allowing PGXN::Meta::Validator to install itself in the proper place to allow pgxn validate-meta to work properly on Homebrew managed-systems (like mine!)
The pgxn/pgxn-tools Docker image, which saw some fixes and improvements last month, has continued to evolve, as well. Recent improvements include support for PostgreSQL TAP tests (not to be confused with pgTAP, already supported), finer control over managing the Postgres cluster — or multiple clusters — by setting NO_CLUSTER= to prevent pg-start from creating the cluster. Both these improvements better enable multi-cluster testing.
I think that about covers it. Although I'm thinking and writing quite a lot about what's next for the Postgres extension ecosystem, I expect to continue tending to the care and feeding and improvement of PGXN as well. To the future!