Tools for static blogging, ActivityPub and the Fediverse

Sat Jan 11 2020 00:00:00 GMT+0000 (Coordinated Universal Time)

Over Christmas 2019 I decided to port my blog away from the all-in-one CMS I'd used for 15 years (Plone).

I turned it into a static blog, using ActivityPub along the way to create and push content with a GUI on Android and desktop. I discovered some great projects and apps on the way.

Somewhat covered in this post are ActivityPub, Mastodon, Pump.io, Plume, Tusky, AndStatus, Subway Tooter, Metalsmith, Pumpa and Plone.

What I migrated from

Feel free to skip this section unless you're into the Plone CMS or porting systems in general :)

My blog had been running on Plone for 15 years, but initial tests over Christmas of upgrading it and its plugins to the latest and greatest version of Plone indicated it would be quite some work on all kinds of levels. I had already done a complex upgrade of it once before, a couple of years ago.

Plone is a great system. I made my living consulting, programming its products and teaching it for 10 years. But one would not necessarily be wrong if one said that Plone is the most complex CMS on earth. It has some very powerful principles such as content frameworks, acquisition, multiple inheritance, interfaces & adapters (either in XML files or somewhere in the code), a great workflow engine, a fine-grained permissions system and event listeners that, when all brought together, can create a multi-dimensional contraption of zen-like complexity (if that makes sense). It also has a great search engine.

Plone is object oriented to the core with an object database, and with all of that come the usual problems with objects: There are so many ways they can be configured and used. When porting from one blogging framework in Plone to another, one can export to XML files, and then reimport them to the new system. However if no one has written these exports and imports, you will need to delve into how the objects actually work.

From the point of view of the reader or author, a blog post is just a piece of text and some pictures. But for Plone or any similar system it is an object of multiple inheritance which is workflowable and has data on it separated into different categories such as metadata. It may try to be compliant with different standards such as Dublin Core. The data may be stored on the object as attributes with names you can figure out, or they may be stored in dictionaries with weird top key names. Maybe you should call methods instead of accessing attributes directly. Maybe you should create an accessor object that accesses the objects for you. Maybe the document object should be configured with an accessor object that is pluggable. And so on, you get the picture.

When different people make different objects with different ideas of how they should work, and put them all into the same system, you may get a combinatorial explosion of how things could interact, and you need to read up on and understand each one to make them work together. Files, on the other hand, at least have limited behavior.

Also the sheer age of Plone and its underlying Zope server has created archeological layers of coding styles and patterns.

In fact, the Zope ecosystem has such a fierce reputation that a guy has made a joke package called "zope.cooties" that you can include in your project to discourage people from reading your code, asking questions or submitting pull requests:

https://pypi.org/project/zope.cooties/

On top of that, or rather underpinning it, are different versions of Python and different ways of installing software in the Python universe. Dealing with old stuff, you sometimes need to know which month a package version dates from to know what to use, especially when the software goes out and tries to install dependencies for you.

https://stackoverflow.com/questions/6344076/differences-between-distribute-distutils-setuptools-and-distutils2

Version pinning mitigates a lot of this, but sooner or later the complex system starts "leaking" and you are pulling in the wrong kitchen sink.

As I was trying to upgrade all these different things I ended up not being able to estimate how much there was left to do. I realised I was not comfortable with having all my content in a system I could not touch. So my Plone had to go.

I tried using wget and httrack to mirror my Plone site, but because of CGI parameters you can get into permutations of URLs that just go on and on. In hindsight I could have turned off CGI params, but in the end I just asked Plone through ZCatalog queries what URLs it has for documents and images, and I wrote a downloader to fetch those specific URLs.
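Such a downloader can be quite small. Here is a minimal sketch in Python, assuming the catalog query has already produced a plain list of URLs. The function names and the `mirror` target directory are hypothetical, and this is an illustration rather than the original script:

```python
import os
import urllib.request
from urllib.parse import urlsplit


def url_to_local_path(url, target_dir="mirror"):
    """Map a URL to a file path below target_dir, ignoring any query string."""
    path = urlsplit(url).path.strip("/")
    if not path:
        # The site root has no file name of its own.
        path = "index.html"
    return os.path.join(target_dir, *path.split("/"))


def download_all(urls, target_dir="mirror"):
    """Fetch each URL once and store the response body under target_dir."""
    for url in urls:
        local_path = url_to_local_path(url, target_dir)
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            out.write(response.read())
```

Because the list of URLs is fixed up front, there is no crawling and hence no runaway URL permutations. A document URL that is also a path prefix of other URLs (a folderish page) would need extra handling that this sketch leaves out.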

Static blogging

Ok, now I had 2900 files. I decided to go completely the other way from Plone, in order to cover as much conceptual ground as possible, and learn. Plone is super-dynamic, so let's go for static blogging! I will not end up losing control of a bunch of static files after all.

Static blogging has become all the rage with systems such as Jekyll, Nikola, Pelican, Next, Hugo and Gatsby. However, I chose Metalsmith for the initial porting. Metalsmith is just one big pipeline and hence conceptually very simple.

Remember I was tired of complex stuff, and if everything flows in a pipeline there will be no spooky interaction at a distance, or the multi-dimensional Zen of a Plone request (did I say Plone request? I mean of course Plone multi adapter).
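The pipeline idea can be illustrated in a few lines. The following is a Python analogy, not Metalsmith's JavaScript API: a collection of files keyed by path is handed from one plugin function to the next. The plugin names and the sample files are made up for the example:

```python
def drop_drafts(files):
    """Remove every file whose metadata marks it as a draft."""
    return {path: f for path, f in files.items() if not f.get("draft")}


def add_titles(files):
    """Give files without a title one derived from the file name."""
    for path, f in files.items():
        f.setdefault("title", path.rsplit("/", 1)[-1].rsplit(".", 1)[0])
    return files


def run_pipeline(files, plugins):
    """Pass the whole file collection through each plugin in order."""
    for plugin in plugins:
        files = plugin(files)
    return files


files = {
    "posts/hello.md": {"contents": "Hello"},
    "posts/secret.md": {"contents": "Not yet", "draft": True},
}
result = run_pipeline(files, [drop_drafts, add_titles])
```

Each step only sees what the previous step handed over, which is what keeps the whole thing easy to reason about.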

The plugin universe of Metalsmith did not have exactly the stuff I needed, but it is dead easy to make your own plugins, and within a short time I had made Metalsmith plugins for:

(These are on a works-for-me quality level and nothing beyond that)

Many, maybe all, static blog systems employ a metadata standard called Frontmatter. It allows a file to have a section at the beginning with data in e.g. YAML or JSON format, depending a bit on the platform.
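To make that concrete, here is a minimal sketch in Python of splitting such a file into metadata and body. It assumes the common convention of a block delimited by `---` lines and only understands flat `key: value` pairs, so it is a simplification and not a real YAML parser. The function name and the sample document are hypothetical:

```python
def split_front_matter(text):
    """Return (metadata, body) for a document that may start with front matter.

    Only flat "key: value" lines are understood; this is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter: everything after it is the body.
            body = "\n".join(lines[index + 1:])
            return metadata, body
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()
    # No closing delimiter found: treat the whole text as body.
    return {}, text


example = """---
title: Tools for static blogging
date: 2020-01-11
---
Over Christmas 2019 I decided to port my blog."""

metadata, body = split_front_matter(example)
```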

The client side search engine question was interesting: how do you search a static blog? The answer is that you export the search index to the web browser and have it search everything. I found the most efficient way for me was simply to export all contents to the client side in a JSON file. My almost 3000 pages only weighed in at 3MB in plain text, and a brute-force search of that is instantaneous on both laptop and mobile. There are more elaborate solutions such as Flexsearch, but the index size turned out to be the limiting factor in my case: Flexsearch by default weighed in at 150MB in index size. That size can be trimmed, but I guess Flexsearch comes to the fore for massive amounts of text, such as 50,000 documents and upwards, and/or sophisticated searches. I just AND together my search words. Another solution is lunr.js.
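The brute-force AND search is simple enough to sketch. In the browser this would of course be JavaScript; the Python below only shows the logic. The export format (a JSON list of records with `url` and `text` fields) is an assumption made for the example:

```python
import json


def search(pages, query):
    """Return URLs of pages whose text contains every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    return [
        page["url"]
        for page in pages
        if all(word in page["text"].lower() for word in words)
    ]


# Hypothetical export format: a JSON list of {"url", "text"} records.
exported = json.dumps([
    {"url": "/plone-upgrade", "text": "Upgrading Plone and its plugins"},
    {"url": "/static", "text": "Static blogging with Metalsmith"},
    {"url": "/both", "text": "From Plone to a static Metalsmith blog"},
])
pages = json.loads(exported)
```

Every query scans every page, which is fine at a few megabytes of text and would stop being fine well before the sizes where an index pays off.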

Wysiwyg publishing to a static blog from desktop or mobile is not a thing, ActivityPub to the rescue

Plone and other similar CMSes have excellent HTML Wysiwyg editors, where you can even just paste in images and they get included. Static blogs afaict generally have nothing. You are supposed to slog away in Markdown, make your image links in code and then hit the command line for publishing. Not an option on mobile and not much fun on desktop either. However, there are great Markdown and image helper plugins for e.g. the VS Code and Atom text editors.

For VS Code there are Markdown Preview Enhanced, Paste Image and Markdown All In One, which I indeed used for making this post. In fact, including images is so easy I may have gone a little overboard…

Making blog posts on mobile

You could go with note taking apps on mobile such as Joplin or Orgzly, but they edit a document tree, and I wanted to push content to my static blog, not have the whole blog on my phone.

But are there any general clients for editing rich content and pushing it to publishing? It turns out there are, such as Tusky, AndStatus and Subway Tooter, and here we are entering the world of federated updates and blogging on the ActivityPub standard.

I have tried out Tusky, AndStatus and Subway Tooter on Android and they all have their pros and cons.

AndStatus can handle a lot of different accounts simultaneously:

…while Tusky seems to only handle one. Tusky looks nicer though and can take photos directly with the camera. Subway Tooter looks a bit worse but can also record video which Tusky cannot.

All three can include an image or video already recorded. Even with the multiple account feature, it seems AndStatus cannot handle services on non standard ports.

What are then the ActivityPub systems that you can push to?

The Fediverse

ActivityPub is a standard widely used in an ecosystem called the Fediverse. The Fediverse is the idea that instead of using global centralized services, we should use many hubs in a federation, where these hubs cooperate to make a service, sometimes similar to let's say Twitter or Facebook.

I think the Fediverse is important. If you have all your social interactions and data in centralized services, you could well lose all your contacts and interactions if a service stopped working. It therefore seems prudent to complement the central service with at least a bit of Fediversing.

From what I have seen of the Fediverse so far, I believe that digital signatures, including certificate systems, must become a more widely used part of it.

However for my blog project and for this post, the interesting part of the Fediverse is that federation means there are standards for communication which means we can use components such as editors and servers any way we want. They are pluggable.

ActivityPub servers

ActivityPub is an open standard for social content. Systems based on it got a lot of attention back in 2013, and have now made a comeback. It is about breaking free from centralized services and building up a federated publishing ecosystem.

Mastodon

The first ActivityPub server system I encountered was Mastodon which is Twitter-like. It limits the post size to 500 characters by default but has excellent support on e.g. Android. I had an account on a Mastodon server and I can now push content to the server with Tusky or the other apps, and then pull it down to my static blog via RSS:

Tusky→Mastodon→RSS→My blog
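The last hop, from RSS to the blog, could look something like the following minimal sketch in Python. It only parses a feed that has already been downloaded; the sample feed is made up in the general shape of RSS 2.0 rather than copied from a real server, and the function name is hypothetical:

```python
import xml.etree.ElementTree as ET


def items_from_rss(rss_text):
    """Extract link, date and HTML content from each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            "html": item.findtext("description", default=""),
        })
    return posts


# A made-up feed for illustration, not real server output.
sample = """<rss version="2.0">
  <channel>
    <title>example</title>
    <item>
      <link>https://social.example/@me/1</link>
      <pubDate>Sat, 11 Jan 2020 10:00:00 +0000</pubDate>
      <description>&lt;p&gt;Hello from Tusky&lt;/p&gt;</description>
    </item>
  </channel>
</rss>"""

posts = items_from_rss(sample)
```

Each resulting record could then be written out as a file with front matter for the static site generator to pick up.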

This may seem a bit like overkill for getting stuff from A to B, but you get the benefit of having Android support for publishing.

Pump.io

But what if you want longer texts than 500 characters? Blog posts are not tweets after all. There is another ActivityPub server called pump.io which has no default size limits afaict. It is written in JavaScript while Mastodon is in Ruby. I'd prefer JavaScript and Python and possibly Rust, while I'd like to avoid Ruby and Go (no offense, those are just languages I have less experience with). However, how well maintained the systems are will be more important in the end, which I will have to look into further.

I had problems using the public pumps, so I set up my own, and after some work I can now publish to it from the desktop client Pumpa. I put my pump server on a high port, and most pump.io clients, or indeed any ActivityPub clients, cannot handle that: there is no way to enter another port, and if you make it part of your identity string, the clients cannot parse it. The exception is the Pumpa desktop app, which does it just fine.
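Handling a port in the identity string is not much work in itself. Here is a minimal sketch in Python, assuming an identity of the form `user@host:port`. The function name and the default port are hypothetical, and this is not how Pumpa or any other client actually does it:

```python
def parse_account_id(account_id, default_port=443):
    """Split "user@host" or "user@host:port" into (user, host, port)."""
    user, _, host_part = account_id.partition("@")
    if not user or not host_part:
        raise ValueError("expected user@host[:port], got %r" % account_id)
    host, separator, port_text = host_part.rpartition(":")
    if separator and port_text.isdigit():
        return user, host, int(port_text)
    # No explicit port: fall back to the default.
    return user, host_part, default_port
```

IPv6 literals are not handled; the sketch only covers host names and IPv4 addresses.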

Pump.io is hard to proxy and I have decided to use it only for long blog posts from the Pumpa client on the desktop. I tried to put my pump.io instance on a low port but you basically need to start it as root for that, which I feel is a bit unnecessary. I'd prefer to proxy it behind nginx but the docs advise you not to.

The cool thing, though, is that it does not matter to me which channels I use to reach my blog. They are just conveyors. All roads will lead to Rome anyway, i.e. to my static blog. So I can use pump for some posts and Mastodon or something else for other posts. And there may be even better things in the ActivityPub universe for blogging than Mastodon or Pump…

Getting a feed out of pump.io and learning about OAuth1

Pump.io, in contrast to Mastodon, does not have RSS feeds. Instead you are supposed to use a client that authenticates itself via OAuth1, gets permissions and pulls down JSON. I tried to install some elaborate feed reader systems, but eventually I found some very elegant seven year old Python scripts by Dirk Krause that do the job just fine in a few lines of code (check the comments at the end for what may need to be changed). https://gist.github.com/dirkk0/5875461
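Once the JSON has been pulled down, turning it into blog posts is the easy part. The sketch below, in Python, leaves out the OAuth1-authenticated fetch entirely and only processes a feed document. The field names (`items`, `verb`, `object`, `objectType`, `content`, `published`) are assumed from the ActivityStreams 1.0 JSON format and should be checked against what the server actually returns; the sample feed is made up:

```python
import json


def notes_from_feed(feed_json):
    """Pick the posted notes out of an ActivityStreams-style feed document."""
    feed = json.loads(feed_json)
    notes = []
    for activity in feed.get("items", []):
        obj = activity.get("object", {})
        # Keep only "post" activities whose object is a note.
        if activity.get("verb") == "post" and obj.get("objectType") == "note":
            notes.append({
                "published": activity.get("published", ""),
                "html": obj.get("content", ""),
            })
    return notes


# Made-up sample; field names are assumptions, see above.
sample_feed = json.dumps({
    "items": [
        {
            "verb": "post",
            "published": "2020-01-11T10:00:00Z",
            "object": {"objectType": "note", "content": "<p>A long post</p>"},
        },
        {
            "verb": "favorite",
            "published": "2020-01-11T11:00:00Z",
            "object": {"objectType": "note", "content": "<p>Someone else</p>"},
        },
    ]
})
notes = notes_from_feed(sample_feed)
```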

But wait, there's more!

And once I had done all that work, often dealing with old code, it turned out there is a brand spanking new ActivityPub blog engine called Plume, written in Rust, that can be installed with Docker, with Snap or in other ways. Maybe that is the future?

Update: I tried installing from snap, and it cannot understand its own command line arguments as given in the documentation. The command:

$> sudo plume.plm instance new --private --domain plume.example.com --name 'My Plume Instance' -l 'CC-BY'
error: Found argument 'Plume' which wasn't expected, or isn't valid in this context

indicates that command line parsing does not work properly. I will wait with this offering.

Plume

Looking more into ActivityPub, is it really suitable for blogging?

Maybe I have not delved deep enough into the ActivityPub standard, but from what I can see from pump.io's JSON output and from the spec itself, it seems to be a lot about pushing one item such as a photo or a piece of text. But for a longer blog post you want multipart. Not sure if that is covered?

For shorter blog posts you can get by with the description of the event + the photo object.
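On the multipart question: the ActivityStreams 2.0 vocabulary that ActivityPub builds on does define an `Article` object type and an `attachment` property, so a longer post with several images can at least be expressed. Below is a hedged sketch in Python of what such an activity could look like. The URLs and text are made up, and how well servers and clients actually handle it is not something this shows:

```python
import json

# A longer text with two attached images, expressed as one object.
article = {
    "type": "Article",
    "name": "Tools for static blogging",
    "content": "<p>First paragraph.</p><p>Second paragraph.</p>",
    "attachment": [
        {"type": "Image", "url": "https://blog.example/img/one.png"},
        {"type": "Image", "url": "https://blog.example/img/two.png"},
    ],
}

# The activity that would announce the article; actor URL is hypothetical.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://blog.example/users/me",
    "object": article,
}

payload = json.dumps(activity, indent=2)
```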

Another standard: MicroPub

https://indieweb.org/Micropub

There is also what appears to be a nascent standard for publishing to blogs. In fact here is a code library specifically made to post to static blogs, in this case Jekyll via GitHub but I guess it would pretty much work for any static blog system with git and frontmatter.

https://github.com/voxpelli/webpage-micropub-to-github
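For a feel of how small a Micropub post is, here is a minimal sketch in Python that builds, but does not send, a form-encoded request creating an entry. The `h=entry` and `content` fields and the Bearer token header follow the Micropub specification; the endpoint URL, the token and the function name are hypothetical:

```python
import urllib.parse
import urllib.request


def build_micropub_request(endpoint, token, content, name=None):
    """Build (but do not send) a form-encoded Micropub request creating an h-entry."""
    fields = {"h": "entry", "content": content}
    if name:
        fields["name"] = name
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


request = build_micropub_request(
    "https://blog.example/micropub",  # hypothetical endpoint
    "secret-token",                   # hypothetical access token
    "Hello from my phone",
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, against a real endpoint and with a real token.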

There is one Android client for MicroPub called Indigenous, but I have not tried it yet.

First (after)thoughts

I'm starting to think that a decent compromise between complex dynamic blogging/CMS systems and static blogs is to use a dynamic system to manage and make the content, while the actual publishing goes to static files. A number of CMS systems have such plugins now. However, I will soldier on with the new toolchain I built up, for now.

The editing experience was really nice in Plone and I miss it. Blogging gets slower without it. Still it feels good to have the files, and have them under source control. Although Plone has revisions (of course :) ).

This blog post was posted via rsync from a git repository that was populated via a Metalsmith pipeline, with a Markdown document created in VS Code as input. All running on Ubuntu Linux 19.10 or 18.04 LTS. The images were taken from the web browser or snagged with Flameshot. Served out statically with Nginx. CSS provided mostly by Twitter Bootstrap for the new blog posts.