kaniini's blog!

The OTP ecosystem which grew out of Erlang has all sorts of useful applications included with it, such as support for encoding and decoding ASN.1 messages based on ASN.1 definition files.

I recently began work on Cacophony, which is a programmable LDAP server implementation, intended to be embedded in the Pleroma platform as part of the authentication components. This is intended to allow applications which support LDAP-based authentication to connect to Pleroma as a single sign-on solution. More on that later, that's not what this post is about.

Compiling ASN.1 files with mix

The first thing you need to do in order to make use of the asn1 application is install a mix task to compile the files. Thankfully, somebody already published a Mix task to accomplish this. To use it, you need to make a few changes to your mix.exs file:

  1. Add compilers: [:asn1] ++ Mix.compilers() to your project function.
  2. Add {:asn1ex, git: "https://github.com/vicentfg/asn1ex"} in the dependencies section.

After that, run mix deps.get to install the Mix task into your project.

Once you're done, you just place your ASN.1 definitions file in the asn1 directory, and it will generate a parser in the src directory when you compile your project. The generated parser module will be automatically loaded into your application, so don't worry about it.

For example, if you have asn1/LDAP.asn1, the compiler will generate src/LDAP.erl and src/LDAP.hrl, and the generated module can be called as :LDAP in your Elixir code.

How the generated ASN.1 parser works

ASN.1 objects are marshaled (encoded) and demarshaled (parsed) to and from Erlang records. Erlang records are essentially tuples which begin with an atom that identifies the type of the record.

Elixir provides a module for working with records, which comes with some documentation that explains the concept in more detail. Overall, though, the functions in the Record module are unnecessary and not really worth using; I just mention it for completeness.

Here is an example of a record that contains sub-records inside it. We will be using this record for our examples.

message = {:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE}

This message maps to an LDAP unbindRequest, inside an LDAP envelope. The unbindRequest carries a null payload, which is represented by :NULL.

The LDAP envelope (the outer record) contains three fields: the message ID, the request itself, and an optional access-control modifier, which we don't want to send, so we use the special :asn1_NOVALUE parameter. Accordingly, this message has an ID of 1 and represents an unbindRequest without any special access-control modifiers.

Encoding messages with the encode/2 function

To encode a message, you must represent it in the form of an Erlang record, as shown in our example. Once you have the Erlang record, you pass it to the encode/2 function:

iex(1)> message = {:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE}
{:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE}
iex(2)> {:ok, msg} = :LDAP.encode(:LDAPMessage, message)
{:ok, <<48, 5, 2, 1, 1, 66, 0>>}

The first parameter is the Erlang record type of the outside message. An astute observer will notice that this signature has a peculiar quality: it takes the Erlang record type as a separate parameter as well as the record. This is because the generated encode and decode functions are recursive-descent, meaning they walk the passed record as a tree and recurse downward on elements of the record!
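To see where the bytes in the encoded message come from, here is a minimal Python sketch that builds the same message by hand. It is not how the generated Erlang module works internally; it only illustrates the BER tag-length-value layout, and it assumes short-form lengths (values under 128 bytes). The helper name tlv is made up for this example.

```python
def tlv(tag: int, value: bytes) -> bytes:
    """Build one BER tag-length-value element (short-form length only)."""
    if len(value) >= 128:
        raise ValueError("this sketch only handles short-form lengths")
    return bytes([tag, len(value)]) + value


# messageID: universal INTEGER (tag 0x02) holding the value 1
message_id = tlv(0x02, (1).to_bytes(1, "big"))
# unbindRequest: [APPLICATION 2] NULL, so tag 0x42 with an empty body
unbind_request = tlv(0x42, b"")
# LDAPMessage: universal SEQUENCE (tag 0x30) wrapping both elements
ldap_message = tlv(0x30, message_id + unbind_request)

print(list(ldap_message))  # [48, 5, 2, 1, 1, 66, 0]
```

The outer element cannot be written until the inner ones are known, because its length byte depends on them. That is the same inside-out dependency the recursive encoder has to deal with.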

Decoding messages with the decode/2 function

Now that we have encoded a message, how do we decode one? Well, let's use our msg as an example:

iex(6)> {:ok, decoded} = :LDAP.decode(:LDAPMessage, msg)
{:ok, {:LDAPMessage, 1, {:unbindRequest, :NULL}, :asn1_NOVALUE}}
iex(7)> decoded == message
true

As you can see, decoding works the same way as encoding, except the input and output are reversed: you pass in the binary message and get an Erlang record out.
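For the curious, the reverse direction can be sketched in Python as well. This is only an illustration of walking the tag-length-value structure for this one message shape, not a general LDAP decoder and not the code the asn1 application generates. The function names read_tlv and decode_unbind are hypothetical, and the result is a plain Python tuple that mimics the Erlang record.

```python
def read_tlv(data: bytes, offset: int = 0):
    """Read one element; return (tag, value, offset of the next element)."""
    tag, length = data[offset], data[offset + 1]
    if length >= 128:
        raise ValueError("this sketch only handles short-form lengths")
    start = offset + 2
    end = start + length
    if end > len(data):
        raise ValueError("truncated element")
    return tag, data[start:end], end


def decode_unbind(data: bytes):
    """Decode an LDAPMessage that wraps an unbindRequest."""
    tag, body, _ = read_tlv(data)
    if tag != 0x30:
        raise ValueError("expected a SEQUENCE")
    # Descend into the SEQUENCE body and read its two children in order.
    id_tag, id_bytes, pos = read_tlv(body)
    op_tag, op_bytes, _ = read_tlv(body, pos)
    if id_tag != 0x02 or op_tag != 0x42 or op_bytes:
        raise ValueError("not an unbindRequest")
    message_id = int.from_bytes(id_bytes, "big", signed=True)
    return ("LDAPMessage", message_id, ("unbindRequest", "NULL"), "asn1_NOVALUE")


decoded = decode_unbind(bytes([48, 5, 2, 1, 1, 66, 0]))
print(decoded)
```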

Hopefully this blog post is useful in answering questions that I am sure people have about making use of the asn1 application with Elixir. There is basically no documentation for it anywhere, and there are no guides either, which is why I wrote this post.

Historically, there has been a practice of combining URIs with access tokens containing sufficient entropy to make them difficult to brute force. A few different techniques have been implemented to do this, but those techniques can be considered implementation specific. One of the earliest and most notable uses of this technique can be observed in the Second Life backend APIs.

An example of a capability URL being used in the fediverse today would be in the way that Pleroma refers to ActivityPub objects and activities: https://socially.whimsic.al/objects/b05e883b-b66f-421b-ac46-c6018f539533 is a capability URL that allows access to a specific object.

However, while capability URLs are difficult to brute force because their tokens carry enough entropy to make guessing expensive, they have a serious flaw: since the access token is part of the resource being accessed, they tend to leak the access token in two ways:

  • Browser Referer headers
  • Server access and error logs

Since the usage of capability URLs has become widespread, a wonderful specification has been released by the IETF: OAuth 2.0. OAuth provides a possible solution in the form of bearer tokens, which can be passed around using an Authorization header.

Bearer Capability URIs (aka bearcaps) are essentially a construct which combines a resource with an access token that happens to grant some form of access to that resource, while keeping that access token out of band. Since the access token is out of band, it cannot be leaked in a Referer header or in a server log.

But what does a bearcap URI look like? It's actually quite simple. Let's build on the example URL I gave above. Its bearcap URI would be something like: bearcap:?u=https://socially.whimsic.al/notice/9nnmWVszgTY13FduAS&t=b05e883b-b66f-421b-ac46-c6018f539533.

Let's break this down into parts.

bearcap is the name of the URI scheme, and then that is further broken down into a pair of URI query parameters:

  • u is the URI that the supplied token happens to be able to access in some way
  • t is the access token itself

In this case, a client which wanted to consume this bearcap (say to fetch the object behind it) would make a GET request to the URI specified, with the token specified:

GET /notice/9nnmWVszgTY13FduAS HTTP/1.1
Host: socially.whimsic.al
Authorization: Bearer b05e883b-b66f-421b-ac46-c6018f539533

HTTP/1.1 200 OK
[...]
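As a rough illustration of the client side, here is a small Python sketch that splits a bearcap URI into its two parameters and derives the request URL and Authorization header from them. The function names are made up for this example, and the parsing rules are an assumption based only on the format described above, not on a finished specification.

```python
from urllib.parse import parse_qs


def parse_bearcap(uri: str):
    """Split a bearcap URI into (target URL, access token)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "bearcap" or not rest.startswith("?"):
        raise ValueError("not a bearcap URI")
    params = parse_qs(rest[1:])
    if "u" not in params or "t" not in params:
        raise ValueError("bearcap URI needs both u and t parameters")
    return params["u"][0], params["t"][0]


def request_for(uri: str):
    """Return the URL to GET and the headers carrying the token out of band."""
    url, token = parse_bearcap(uri)
    return url, {"Authorization": f"Bearer {token}"}


bearcap = (
    "bearcap:?u=https://socially.whimsic.al/notice/9nnmWVszgTY13FduAS"
    "&t=b05e883b-b66f-421b-ac46-c6018f539533"
)
url, headers = request_for(bearcap)
print(url)
print(headers)
```

Note that the token never appears in the URL that is actually requested, which is the whole point. A target URL that carries its own query string would presumably need to be percent-encoded inside u; the sketch does not address that.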

In a future post, we will go into how bearcap URIs can help to bring some aspects of capability-based security to ActivityPub.

ActivityStreams provides for a multitude of different actor and object types, which ActivityPub capitalizes on effectively. However, neither ActivityPub nor ActivityStreams provide a method for hinting how a given actor or object should be interpreted in the vocabulary.

The purpose of this blog post is to document how the litepub community intends to provide behavioural hinting in ActivityPub, as well as demonstrate an edge case where behavioural hinting is useful.

A Quick Refresher: what unhinted ActivityStreams objects look like

This is an example actor, which is a relay service. It represents how relay services appear now.

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    "https://pleroma.site/schemas/litepub-0.1.jsonld"
  ],
  "id": "https://pleroma.site/relay",
  "type": "Application",
  "endpoints": {
    "sharedInbox": "https://pleroma.site/inbox"
  },
  "followers": "https://pleroma.site/relay/followers",
  "following": "https://pleroma.site/relay/following",
  "inbox": "https://pleroma.site/relay/inbox"
}

As you can tell, the type is set to Application, which when interpreted as a JSON-LD document expands to https://www.w3.org/ns/activitystreams#Application.

Hinting objects through compound typing

In ActivityPub, different activities impose different side effects, but in many cases, it is not necessarily optimal to impose all side effects in all contexts. To know when we want to impose certain side effects or not, we need more semantic knowledge of the intention behind an object.

To solve this semantic quandary, JSON-LD provides a mechanism known as compound typing. In other words, an object can be two or more different types at once. For example, a Person object could also be a Mother or a Partner object.

How does this apply to ActivityPub? By using the same mechanism, we can effectively hint the object to indicate how an implementation should ideally treat it:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    "https://pleroma.site/schemas/litepub-0.1.jsonld",
    {"Invisible": "litepub:Invisible"}
  ],
  "id": "https://pleroma.site/relay",
  "type": ["Application", "Invisible"],
  "endpoints": {
    "sharedInbox": "https://pleroma.site/inbox"
  },
  "followers": "https://pleroma.site/relay/followers",
  "following": "https://pleroma.site/relay/following",
  "inbox": "https://pleroma.site/relay/inbox"
}

Voila! Now an implementation which understands type hinting will understand that this relay service should not be visible to end users, which means that side effects caused by it doing its job shouldn't be visible either.
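On the consuming side, the main practical wrinkle is that type may now be either a single string or a list. A minimal Python sketch of checking for the hint follows. It assumes the document is in compacted form and uses the Invisible term as in the example above; a real implementation would more likely run a JSON-LD processor and compare expanded IRIs. The helper names are hypothetical.

```python
def type_set(obj: dict) -> set:
    """Normalize the type property to a set, whether it is a string or a list."""
    raw = obj.get("type", [])
    return {raw} if isinstance(raw, str) else set(raw)


def is_invisible(obj: dict) -> bool:
    """True when the object carries the Invisible behavioural hint."""
    return "Invisible" in type_set(obj)


plain_relay = {"id": "https://pleroma.site/relay", "type": "Application"}
hinted_relay = {
    "id": "https://pleroma.site/relay",
    "type": ["Application", "Invisible"],
}

print(is_invisible(plain_relay))   # False
print(is_invisible(hinted_relay))  # True
```

An implementation that does not know about the hint still sees Application in the type list, so the hinted actor degrades to the unhinted behaviour.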

Of course, respecting such hinting is not mandatory, and therefore any security-dependent functionality shouldn't depend on behavioural hints. But security aside, they do have their uses.

I have assigned the litepub:Invisible type as the first behavioural hint, for cases where side effects should not be visible to end users, as in the case of relaying and group chats (what matters in both cases is that the peer discovers the referenced message instead of showing the Announce activity directly).

We just cut the v1.0.90 tag (aka Pleroma 1.1 RC1) this afternoon, so I figured it would be a good time to write a blog post updating people on what is happening.

Everything about Pleroma 1.1

So the first Pleroma 1.1 release candidate has been cut, but what does that ultimately mean? When will the final release happen? Ultimately, that depends on you.

We want people to test the 1.1 release candidate and report back to us about how it performs — thus the first release candidate really begins a cycle of incorporating feedback and bug reports then cutting further release candidates.

However, don't despair! We have been using an improved release engineering process, which has left the tree in a pretty reliable state versus the 1.0 release cycle, which was somewhat seat of the pants. Accordingly, we only expect to have a single revision cycle.

This means that the expected timeline is as such:

  • October 7: Pleroma v1.0.95 tag (aka Pleroma 1.1 RC2).
  • October 14: Pleroma v1.1.0 tag; 1.1 becomes the new stable branch head, release/1.1 branch created for future updates.
  • October 28: Pleroma v1.1.1 tag; revision update for anything that got caught after the 1.1.0 release.

So the key date is October 14th, that is when we expect that 1.1.0 will drop.

Pleroma 1.0.x deprecation timeline

At the moment, Pleroma 1.0.x is actively maintained. As users migrate to the new 1.1.0 release, we expect interest in the 1.0.x branch to wane. However, we want to ensure that the transition from 1.0.x to 1.1.x goes smoothly.

As such, I plan to keep the Pleroma 1.0.x tree under maintenance (since I'm doing most of the maintenance work on it) until April 2020. This gives roughly 6 months for users to migrate from 1.0.x to 1.1.x. However, new features will not be backported to 1.0.x at this point — only security fixes. If you want new features, then you should switch to the new stable tree or follow the maint tree.

Release engineering in a nutshell

Starting with Pleroma 1.1, we have adopted a new release engineering process that is similar to how other large scale git projects do release engineering.

In this workflow, we have three branches:

  • stable: this branch represents the latest and greatest stable release, and is actually just an alias to whatever that release's branch is.
  • maint: this branch is where new changes are held until we cut a new stable release
  • develop: this branch contains the bleeding edge code (but we try to keep it reasonably stable)

In general, things graduate from develop to maint to stable. Every so often, we freeze develop for a while and then cut a new maint branch, which will become the next release series.

By doing this, we allow for features to mature before exposing wider audiences to them. This ensures that people remain confident in Pleroma.

OStatus deprecation and removal

As has been discussed before, OStatus is deprecated in Pleroma since 1.0. Mastodon is removing OStatus in the 3.0 release. In general, it is time for the fediverse to move on from OStatus for a multitude of reasons. But the question remains: what is the actual deprecation plan for OStatus in Pleroma?

Right now, the plan is to disable the OStatus modules by default in Pleroma 1.2, which should release toward the end of March 2020 if we are able to keep our current cadence. After the 1.2 release, we will most likely remove them entirely, which means that OStatus will most likely be excised from the Pleroma tree by October 2020.

This timeline should not be a problem, since GNU Social intend to make a release this month with ActivityPub support, and everyone else migrated to ActivityPub months ago.

OCAP

I wanted to write a little about our OCAP plans, but ultimately I want to wait a little while before writing about this. So I will cover it in the next blog post about Pleroma instead.

One of my areas of interest in multimedia coding has always been writing audio visualizers. Audio visualizers are software which take audio data as input, run various equations on it and use the results of those equations to render visuals.

You may remember from your childhood using WinAmp to listen to music. The MilkDrop plugin and AVS plugin included in WinAmp are examples of audio visualizers. AVS is a classical visualization engine that operates on evaluating a scene graph to composite the visuals. MilkDrop on the other hand defines the entire visual as a Direct3D scene and uses the hardware to composite the image.

MilkDrop is a very powerful visualizer, and a clone of it was made called projectM. projectM, unfortunately, has some stability issues (and frankly design issues, like the insistence on loading fonts to render text that will never actually be rendered) and so we do not include it in Audacious at the moment. The newest versions of MilkDrop and projectM even support pixel shaders, which allow for many calculations to be run in parallel for each pixel in the final image. It is a very impressive piece of software.

But, with all that said about MilkDrop, I feel like AVS is closer to what I actually like in a visualization engine. But AVS has a lot of flaws too. Describing logic in a scene graph is a major pain. However, in 2019, the situation is a lot different than when AVS was created — JavaScript engines are ubiquitous and suitably performant, so what if we could develop a programming language based on JavaScript that is domain-specific for visualization coding?

And so, that is what LVis is. LVis stands for Lapine Visualizer (I mean, I'm egotistical, what is there to say?) and uses the same underlying tech that QML apps use to glue together native pieces of code that result in beautiful visuals.

LVis rendering a complex waveform

But why JavaScript? People already know it, and the fundamentals are easy enough that anyone can pick it up. I already hear the groans of “hacker” “news” at this justification, but expecting people to learn something like Rust to do art is simply not realistic — p5.js is popular for a reason: it's quick and easy to get something going.

And the LVis variant of JavaScript is indeed quite easy to learn: you get the Math functions from JavaScript and a bunch of other components that you can pull into your presets. That's all you get, nothing else.

LVis rendering another complex waveform, with bloom

There are quite a few details we need to fill in, like documenting the specifics of the JavaScript-based DSL used by the LVis engine, so people can confidently write presets. We also need to add additional effects, like video echo and colour remapping.

I think it is important to talk about what LVis is not. LVis is not MilkDrop or projectM. It is based on raster graphics, and provides an architecture not that dissimilar to SVGA-based graphics cards in the 90s. Everything is a surface, and surfaces are freely allocatable, but the world is not an OpenGL scene graph. In this way, it is very similar to AVS.

Right now, the LVis source kit includes a plugin for Audacious and a more fully-featured client that allows for debugging and running specific presets. The client, however, requires PulseAudio, as it monitors the system audio output. I am open to adding patches for other audio systems.

You can download the sources from git.sr.ht.

LVis demonstration of complex waveforms

I think it is now at a point where others can start playing with it and contributing presets. The core language DSL is basically stable; I don't expect to change anything in a way that would cause breakage. So, please download it and send me your presets!

I've been taking a break from focusing on fediverse development for the past couple of weeks — I've done some things, but it's not my focus right now because I'm waiting for Pleroma's develop tree to stabilize enough to branch it for the 1.1 stable releases. So, I've been doing some multimedia coding instead.

The most exciting aspect of this has been libreplayer, which is essentially a generic interface between replayer emulation cores and audio players. For example, it will be possible for a replayer emulation core to simply target libreplayer and an audio player to target libreplayer and allow these two things (the emulation core and the libreplayer client) to work together to produce audio.

The first release of libreplayer will drop soon. It will contain a PSF1/PSF2 player that is free of binary blobs. This is an important milestone because the only other PSF1/PSF2 replayer that is blob-free has many emulation bugs due to the use of incorrectly modified CPU emulation code from MAME. Highly Experimental's dirty secret is that it contains an obfuscated version of the PS2 firmware that has been stripped down.

And so, the naming of libreplayer is succinct, in two ways: one, it's self-descriptive, libreplayer obviously conveys that it's a library for accessing replayers, but also due to the emulation cores included in the source kit being blob-free, it implies that the replayer emulation cores we include are free as in freedom, which is also important to me.

What does this mean for audacious? Well, my intention is to replace the uglier replayer emulation plugins in audacious with a libreplayer client and clean-room implementations of each replayer core. I also intend to introduce replayer emulation cores that are not yet supported in audacious in a good way.

Hopefully this allows for the emulation community to be more effective stewards of their own emulation cores, while allowing for projects like audacious to focus on their core competencies. I also hope that having high-quality clean-room implementations of emulator cores written to modern coding practice will help to improve the security of the emulation scene in general. Time will tell.

(I Shitpost Therefore I Am)

This blog has been a long time coming, because we have a lot we need to talk about. Every day, too many people try to claim tutelage over a perpetually growing dung heap. I've written before about the flawed security model that was adopted in the ensuing rush to get real-world ActivityPub implementations out the door. This is not one of those posts.

In the interests of avoiding outright cancellation (which will happen anyway), I will just note that the next sections should be taken with an extreme content warning: many of the sections dissect and examine various incidents that intersect outright harassment or direct examples of white nationalism that have gone entirely unnoticed by the “cancel crew.”

Arguably, I think it's time to cancel the “cancel crew” because they're not protecting us as promised, and in the absence of funding to purchase security services from Prolexic, they will be completely unequipped for the future that they've largely created for us all.

What is the Fediverse anyway?

It's 2019, we're in Web 5.0 or whatever the current buzzword is, Social was dead, Facebook stock was in freefall and, for the last few years, the idea of an independent, federated social network has been growing a new life, largely catalyzed by the launch of the Mastodon platform in late 2016.

It has been said that the Internet is a series of tubes, and that services like Netflix are clogging them up. I have a different perspective: I argue that the collective Sidekiq and Oban instances of Mastodon and Pleroma nodes are the slow-moving garbage-laden trucks shipping around untold terabytes per day of trash. And that trash? That trash is what we lovingly call the fediverse.

If you want to get technical, the fediverse is the federation of servers running OStatus and ActivityPub protocols. Numerous software projects implement these protocols: Mastodon, Pleroma, Hubzilla, Friendica, GNU Social and PixelFed are good examples. They serve various niches, but have some level of interoperability.

Defenders of the fediverse say that the growth of the fediverse is the fruit of cooperation and collaboration. However, they rarely mention how this cooperation is achieved: name-calling, mischaracterization, disinformation and “cancellation”.

Death By Shitposting

How does an open-world network based on anarchy police itself? Cancellation and tribalism of course, but at least there's the ACAB emoji. Like in proprietary social media, clout on the fediverse is derived from elevating one's reputation at the expense of others. Sometimes this happens for good reason, but usually nobody actually knows the reason it is happening. Like the AMBER alerts you receive on your phone, you just know it's time to get your shotgun and join the mob!

Whenever there's a design flaw with the protocol, it's best to blame the software implementations for disagreeing to the level at which they will cover up the design flaw, instead of the actual design flaw. As is frequently observed, software other than the software of the user's choice is seen as problematic because their software created a flawed security model in the first place.

A Social Network Free Of Nazis

Content warning: We're going to talk about actual nazis. If this bothers you, you may want to skip this section.

One of the main advertising points of the Mastodon software is that the Mastodon Network is free of nazis. Of course, the Mastodon Network is the fediverse, an open-world federated network, and Mastodon itself is free software licensed under AGPL, all of which means that this claim is technically infeasible to enforce. So, how have they been doing with this?

Well, if you use Pleroma or other software that is not Mastodon and doesn't completely buy into the (broken) Mastodon security model, you're a nazi according to many Mastodon users. So, that's part of the point, but not really, and it's not even what I am getting at.

The real question is how is Mastodon doing with having a nazi-free network? Well, Gab and KiwiFarms joined the fediverse lately, and much of the fediverse is anxious about these developments. There are certainly arguments for blocking both of those instances, but that's still not what I'm talking about. This is, however, the ball the Mastodon people have been keeping their eyes on.

Nazis? In the fediverse? It's more likely than you think.

The easiest way to find actual bona fide nazis on the fediverse is to look at Pieville. Pieville is an instance operated by people associated with StormFront, a self-described “White Nationalist Community.” Users openly share videos and messages from key people in the white nationalist movement, such as Billy Roper and William Pierce. Other neo-nazi figures like Alex Linder have an account there. Oh, and Pieville runs Mastodon v2.7.4 at the time of writing.

Whatever you think of Gab or KiwiFarms, Pieville is on a completely different level, and it's surprising to see nobody discussing them as a threat, while cooking up all sorts of threat scenarios about Gab and KiwiFarms. This is not a defense of either of those instances, but it makes me wonder why our eyes aren't on the real ball.

Pieville isn't the only one. There are others, but Pieville has recently blocked fediverse.space from crawling their instance.

The Scriptkiddie-ification Of The Fediverse

Nazis aren't the only problem. The security model where data is distributed to as many nodes as humanly possible and security is not properly enforced to ensure relationships exist with nodes prior to sharing data with them is a problem.

This leads to numerous incidents where instances you don't expect to have copies of your data have copies of your data.

But even that is not the real problem. The real problem is the script kiddies abusing these implementation flaws, and the lack of audience restriction capabilities in the software, which lead people to post things publicly when they probably shouldn't.

Oh, and by the way, there is already a fediverse-wide search engine, which was built in public view while everyone was fighting in order to gain clout.

So, how do we fix the fediverse?

We need to transition the security model away from one that is cooperative, to one that has border-oriented security. The Internet itself is a federated network, but BGP defines clear boundaries and policy. OCAP or other capability-based systems will do the same for the fediverse. Instead of cancelling each other, we should concentrate on building real security tools and deploying a real security model.

The good news is that progress is being made on this front. Hopefully by 2020, we will have some real solutions widely deployed and people can go back to taking it easy.

We are approaching the end of the merge window for the 1.0.5 release of Pleroma, which will likely be cut next Tuesday (August 13). I have been trying to aim for bi-weekly updates to the Pleroma releases, so that communities tracking stable have the latest security fixes as well as minimally impacting feature additions.

How to get features into the stable release branch?

As the stable release branches are largely frozen, you have to request that a feature be included into the master branch. Stable releases are always cut from master. To do so, open an issue on the Pleroma gitlab or comment on the relevant MR so that a maintainer may tag it with a backport request.

When cutting new releases, a branch is created, such as release/1.0.5, which contains the proposed release. Users are encouraged to test this branch and report on whether any problems exist in the proposed release. These branches contain manual backports done by me at the time of preparing the release, and build on any backports done by others to master, so tracking master instead of a release tag will also get you some of the backports if they are done using the process of using an MR and feature branch for the backport.

Bugfixes

Mastodon API: Set follower/following counters to 0 when hiding followers/following is enabled by @rin@patch.cx

Pleroma reports follower/following counts as 0 in the ActivityStreams 2.0 representations when the user requests to hide their social network. This change adjusts the Mastodon API responses to also return 0 when this setting is enabled.

(backport 409bcad5 to release/1.0.5)

Mastodon API: Fix thread mute detection by @rin@patch.cx

Fix a logic error where CommonAPI.thread_muted? was being called in the wrong context, leading it to always report as false.

(backport 0802a088 to release/1.0.5)

Mastodon API: Return profile URL when available instead of actor URI for MastodonAPI mention URL by @Thib@social.sitedethib.com

Return the profile URL specified in the actor object instead of the actor's IRI when possible in Mastodon API responses. This makes our behaviour consistent with how Mastodon returns profile URLs.

(backport 9c0da100..a10c840a to release/1.0.5)

Correctly style anchor tags in templates by @lanodan@queer.hacktivis.me

Correctly style anchor tags in templates so they match the rest of the template design.

(backport a035ab8c to release/1.0.5)

Do not re-embed ActivityStreams objects after updating them in the IR by @rin@patch.cx

Pleroma's current internal representation (IR) uses a split log of activities (the activities table) and underlying AS2 objects (the objects table). For storage efficiency, the IR refers to child objects by their stable IRI when stored in the IR. In some cases, updates of child objects would result in the child object being re-embedded in the parent activity.

(backport 73d8d5c4..4f1b9c54 to release/1.0.5)

Strip IR-specific fields including likes from incoming and outgoing activities by @sergey@pleroma.broccoli.si

In some cases, IR fields would be shared with peer instances. This caused occasional problems, as some of the IR fields would be serialized in ways that would be inappropriate. Accordingly, we remove all IR-specific vocabulary from incoming and outgoing activities before processing them further.

(backport 0c1d72ab..fa59de5c to release/1.0.5)

Fix --uploads-dir in instance gen task by @lanodan@queer.hacktivis.me

Due to a typo, --uploads-dir was not correctly respected when using the CLI to deploy an instance.

(backport 977c2d04 to release/1.0.5)

Fix documentation for invite gen task by @lanodan@queer.hacktivis.me

Fix typos in the documentation of this task for the --max-use and --expires-at options, underscores were used instead of dashes.

(backport 8815f070 to release/1.0.5)

Handle MRF rejections of incoming AP activities by @sergey@pleroma.broccoli.si

Previously, MRF rejections would be logged to the error log as a crash. This allows for MRF rejections to be more gracefully handled.

(backport d61c2ca9 to release/1.0.5)

New Features

Add relay list task by @kaniini@pleroma.site

Adds the relay list task that has been missing since relay support was implemented. Multiple people have observed that this task was missing for a long time, but nobody got around to writing it until now.

(backport cef3af55 to release/1.0.5)

Add listener port and ip option for 'instance gen' task by @sachin@bikeshed.party

Adds the --listener-port and --listener-ip to the instance gen task. This is primarily useful for automated deployments of Pleroma.

(backport 6d0ae264 to release/1.0.5)

Add wildcard domain matches to MRF simple policy by @alexs@bikeshed.party

Adds the ability for mrf_simple to match using wildcards, for example, against *.example.com instead of just example.com.

(backport 54832360..e886ade7 to release/1.0.5)

With all of the recent hullabaloo with Gab, and then, today Kiwi Farms joining the fediverse, there has been a lot of people asking questions about how data flows in the fediverse and what exposure they actually have.

I'm not really particularly a fan of either of those websites, but that's beside the point. The point here is to provide an objective presentation of how instances federate with each other and how these federation transactions impact exposure.

How Instances Federate

To start, let's describe a basic model of a federated network. This network will have five actors in it:

  • alyssa@social.example
  • bob@chatty.example
  • chris@photos.example
  • emily@cat.tube
  • sophie@activitypub.dev

(yeah yeah, I know, I'm not that good at making up fake domains.)

Next, we will build some relationships:

  • Sophie follows Alyssa and Bob
  • Emily follows Alyssa and Chris
  • Chris follows Emily and Alyssa
  • Bob follows Sophie and Alyssa
  • Alyssa follows Bob and Emily

Here's what that looks like as a graph:

A graph of social relationships.

Broadcasts

Normally, posts flow through the network in the form of broadcasts. A broadcast-type post is one that is sent to, and only to, a predetermined set of targets, typically your followers collection.

So, this means that if Sophie makes a post, chatty.example is the only server that gets a copy of it. It does not matter that chatty.example is peered with other instances (social.example).

This is, by far, the majority of traffic inside the fediverse.
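If it helps to see the broadcast rule as code, here is a small Python sketch of the example network. The FOLLOWS table is just the relationship list from above, and the helper names are mine. It only models who gets pushed a copy, nothing else.

```python
# "X follows Y" relationships from the example network.
FOLLOWS = {
    "sophie@activitypub.dev": ["alyssa@social.example", "bob@chatty.example"],
    "emily@cat.tube": ["alyssa@social.example", "chris@photos.example"],
    "chris@photos.example": ["emily@cat.tube", "alyssa@social.example"],
    "bob@chatty.example": ["sophie@activitypub.dev", "alyssa@social.example"],
    "alyssa@social.example": ["bob@chatty.example", "emily@cat.tube"],
}


def followers_of(actor):
    """Everyone who follows the given actor."""
    return {who for who, targets in FOLLOWS.items() if actor in targets}


def domain(actor):
    """The instance an actor lives on."""
    return actor.split("@", 1)[1]


def broadcast_targets(author):
    """Servers that are pushed a copy of a followers-addressed post."""
    return sorted({domain(f) for f in followers_of(author)})


print(broadcast_targets("sophie@activitypub.dev"))  # ['chatty.example']
```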

Relaying

The other kind of transaction is easily described as relaying.

To extend our example above, let's say that Bob chooses to Announce (Mastodon calls this a boost, Pleroma calls this a repeat) the post Sophie sent him.

Because Bob is followed by Sophie and Alyssa, both of these people receive a copy of the Announce activity (an activity is a message which describes a transaction). Relay activities refer to the original message by its unique identifier, and recipients of Announce activities use the unique identifier to fetch the referred message.

For now, we will assume that Alyssa's instance (social.example) was able to succeed in fetching the original post, because there's presently no access control in practice on fetching posts in ActivityPub.

This now means that Sophie's original post is present on three servers:

  • activitypub.dev
  • chatty.example
  • social.example

Relaying can cause perceived problems when an instance blocks another instance, but these problems are actually caused by a lack of access control on object fetches.

Replying

A variant on the broadcast-style transaction is a Create activity that references an object as a reply.

Let's say Alyssa responds to Sophie's post that was boosted to her. She composes a reply that references Sophie's original post with the inReplyTo property.

Because Alyssa is followed by actors on the entire network, now the entire network goes and fetches Sophie's post and has a copy of it.

This too can cause problems when an instance blocks another. And like in the relaying case, it is caused by a lack of access control on object fetches.
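The relay and reply cases can be traced with a short Python sketch that tracks which servers end up holding Sophie's post. It assumes, as the text does, that every fetch succeeds because there is no access control on object fetches. The helper names are mine.

```python
# "X follows Y" relationships from the example network.
FOLLOWS = {
    "sophie@activitypub.dev": ["alyssa@social.example", "bob@chatty.example"],
    "emily@cat.tube": ["alyssa@social.example", "chris@photos.example"],
    "chris@photos.example": ["emily@cat.tube", "alyssa@social.example"],
    "bob@chatty.example": ["sophie@activitypub.dev", "alyssa@social.example"],
    "alyssa@social.example": ["bob@chatty.example", "emily@cat.tube"],
}


def domain(actor):
    return actor.split("@", 1)[1]


def follower_domains(actor):
    """Servers hosting at least one follower of the given actor."""
    return {domain(who) for who, targets in FOLLOWS.items() if actor in targets}


sophie, bob, alyssa = "sophie@activitypub.dev", "bob@chatty.example", "alyssa@social.example"

# Broadcast: the origin server plus the servers of Sophie's followers.
holders = {domain(sophie)} | follower_domains(sophie)

# Relay: servers receiving Bob's Announce fetch the post it refers to.
holders |= follower_domains(bob)
after_announce = sorted(holders)
print(after_announce)
# ['activitypub.dev', 'chatty.example', 'social.example']

# Reply: servers receiving Alyssa's reply fetch the inReplyTo parent.
holders |= follower_domains(alyssa)
after_reply = sorted(holders)
print(after_reply)
# ['activitypub.dev', 'cat.tube', 'chatty.example', 'photos.example', 'social.example']
```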

Metadata Leakage

From time to time, people talk about metadata leakage with ActivityPub. But what does that actually mean?

Some people erroneously believe that the metadata leakage problem has to do with public (without access control) posts appearing on instances which they have blocked. While that is arguably a problem, that problem is related to the lack of access controls on public posts. The technical term for a publicly available post is as:Public, a reference to the security label that is applied to them.

The metadata leakage problem is an entirely different problem. It deals with posts that are not labelled as:Public.

The metadata leakage problem is this: If Sophie composes a post addressed to her followers collection, then only Bob receives it. So far, so good, no leakage. However, because of bad implementations (and other problems), if Bob replies back to Sophie, then his post will be sent not only to Sophie, but also to Alyssa. Based on that, Alyssa now has knowledge that Sophie posted something, but no actual idea what that something was. That's why it's called a metadata leakage problem: metadata about one of Sophie's objects existing, and hints about its contents (based on the text of the reply), are leaked to Alyssa.
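Here is the same scenario as a Python sketch. The one assumption I am adding is what the "bad implementation" does: in this model it addresses Bob's reply to Sophie plus Bob's own followers collection. The names are mine.

```python
# "X follows Y" relationships from the example network.
FOLLOWS = {
    "sophie@activitypub.dev": ["alyssa@social.example", "bob@chatty.example"],
    "emily@cat.tube": ["alyssa@social.example", "chris@photos.example"],
    "chris@photos.example": ["emily@cat.tube", "alyssa@social.example"],
    "bob@chatty.example": ["sophie@activitypub.dev", "alyssa@social.example"],
    "alyssa@social.example": ["bob@chatty.example", "emily@cat.tube"],
}


def followers_of(actor):
    return {who for who, targets in FOLLOWS.items() if actor in targets}


sophie, bob = "sophie@activitypub.dev", "bob@chatty.example"

# Sophie's followers-only post: visible to her and her followers.
original_audience = {sophie} | followers_of(sophie)

# Assumed bad behaviour: the reply goes to Sophie and to Bob's followers.
reply_recipients = {sophie} | followers_of(bob)

# Anyone who sees the reply without having been allowed to see the original.
leaked_to = sorted(reply_recipients - original_audience)
print(leaked_to)  # ['alyssa@social.example']
```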

This problem is the big one. It's not technically ActivityPub's fault, either, but a problem in how ActivityPub is typically implemented. But at the same time, it means that followers-only posts can be risky. Mastodon covers up the metadata leakage problem by hiding replies to users you don't follow, but that's all it is, a cover-up of the problem.

Solution?

The solution to the metadata leakage problem is to have replies be forwarded to the OP's audience. But to do this, we need to rework the way the protocol works a bit. That's where proposals like moving to an OCAP-based variant of ActivityPub come into play. In those variants, doing this is easy. But in what we have now, doing this is difficult.

Anyway, I hope this post helps to explain how data flows through the network.

OCAP refers to Object CAPabilities. Object Capabilities are one of many possible ways to achieve capability-based security. OAuth Bearer Tokens, for example, are an OCAP-style implementation.

In this context, OCAP refers to an adaptation of ActivityPub which utilizes capability tokens.

But why should we care about OCAP? OCAP is a more flexible approach that allows for more efficient federation (considerably reduced cryptography overhead!) as well as conditional endorsement of actions. The latter enables things like forwarding Create activities using tokens that would not normally be authorized to do such things (think of this like sudo, but inside the federation). Tokens can also be used to authorize fetches allowing for non-public federation that works reliably without leaking metadata about threads.

In short, OCAP fixes almost everything that is lacking about ActivityPub's security, because it defines a rigid, robust and future-proof security model for the fediverse to use.

How does it all fit together?

This work is being done in the LitePub (maybe soon to be called SocialPub) working group. LitePub is to ActivityPub what the WHATWG is to HTML5. The examples I use here don't necessarily completely line up with what is really in the spec, because they are meant to just be a basic outline of how the scheme works.

So the first thing that we do is extend the AS2 actor description with a new endpoint (capabilityAcquisitionEndpoint) which is used to acquire a new capability object.

Example: Alyssa P. Hacker's actor object
{
  "@context": "https://social.example/litepub-v1.jsonld",
  "id": "https://social.example/~alyssa",
  "capabilityAcquisitionEndpoint": "https://social.example/caps/new",
  [...]
}

Bob has a server which lives at chatty.example. Bob wants to exchange notes with Alyssa. To do this, Bob's instance needs to acquire a capability that it will use to federate in the future. It does so by POSTing a document to the capabilityAcquisitionEndpoint and signing it with HTTP Signatures:

Example: Bob's instance acquires the inbox:write and objects:read capabilities
{
  "@context": "https://chatty.example/litepub-v1.jsonld",
  "id": "https://chatty.example/caps/request/9b2220dc-0e2e-4c95-9a5a-912b0748c082",
  "type": "Request",
  "capability": ["inbox:write", "objects:read"],
  "actor": "https://chatty.example"
}

It should be noted here that Bob's instance itself makes the request, using an instance-specific actor. This is important because capability tokens are scoped to their actor. In this case, the capability token may be invoked by any child actors of the instance, because it's an instance-wide token. But the instance could request the token strictly on Bob's behalf by using Bob's actor and signing the request with Bob's key.
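As a rough sketch, building that request document in Python could look like the following. The build_capability_request helper is hypothetical, the document shape simply mirrors the example above rather than the actual spec, and the HTTP Signatures step is left out entirely.

```python
import json
import uuid


def build_capability_request(instance_url, capabilities):
    """Build a capability request document shaped like the example above."""
    return {
        "@context": f"{instance_url}/litepub-v1.jsonld",
        # Each request gets a fresh identifier.
        "id": f"{instance_url}/caps/request/{uuid.uuid4()}",
        "type": "Request",
        "capability": list(capabilities),
        # The instance-wide actor makes the request, so the resulting
        # token would be scoped to the whole instance.
        "actor": instance_url,
    }


request_doc = build_capability_request(
    "https://chatty.example", ["inbox:write", "objects:read"]
)
print(json.dumps(request_doc, indent=2))
```

To request a token strictly on Bob's behalf, you would pass Bob's actor as the actor instead and sign with Bob's key.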

Alyssa's instance responds with a capability object:

Example: A capability token
{
  "@context": "https://social.example/litepub-v1.jsonld",
  "id": "https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11",
  "type": "Capability",
  "capability": ["inbox:write", "objects:read"],
  "scope": "https://chatty.example",
  "actor": "https://social.example"
}

There are a few peculiar things about this object that I'm sure you've probably noticed. Let's look at this object together:

  • The scope describes the actor which may use the token. Implementations check the scope for validity by matching it against the actor referenced in the message.

  • The actor here describes the actor which granted the capability. Usually this is an instance-wide actor, but it may also be any other kind of actor.
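A scope check along those lines might be sketched like this in Python. To be clear about what I am assuming: the may_invoke function is hypothetical, and the rule that "a scope with no path is instance-wide, and any actor on the same origin is a child actor" is my guess at a reasonable reading, not something the spec is quoted as saying.

```python
from urllib.parse import urlsplit


def origin(url):
    """Return the (scheme, host) pair of a URL."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc)


def may_invoke(capability, activity, needed):
    """Hypothetical check: may the activity's actor use this capability?"""
    # The activity has to reference this exact capability.
    if activity.get("capability") != capability["id"]:
        return False
    # The token has to grant the permission being exercised.
    if needed not in capability["capability"]:
        return False
    scope, actor = capability["scope"], activity["actor"]
    if actor == scope:
        return True
    # Assumption: a scope with no path is an instance-wide token, and any
    # actor on the same origin counts as a child actor of that instance.
    if urlsplit(scope).path in ("", "/"):
        return origin(actor) == origin(scope)
    return False


capability = {
    "id": "https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11",
    "type": "Capability",
    "capability": ["inbox:write", "objects:read"],
    "scope": "https://chatty.example",
    "actor": "https://social.example",
}
activity = {
    "capability": capability["id"],
    "type": "Create",
    "actor": "https://chatty.example/~bob",
}

print(may_invoke(capability, activity, "inbox:write"))  # True
```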

In traditional ActivityPub the mechanism through which Bob authenticates and later authorizes federation is left undefined. This is the hole that got filled with signature-based authentication, and is being filled again with OCAP.

But how do we invoke the capability to exchange messages? There are a couple of ways.

When pushing messages, we can simply reference the capability by including it in the message:

Example: Pushing a note using a capability
{
  "@context": "https://chatty.example/litepub-v1.jsonld",
  "id": "https://chatty.example/activities/63ffcdb1-f064-4405-ab0b-ec97b94cfc34",
  "capability": "https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11",
  "type": "Create",
  "object": {
    "id": "https://chatty.example/objects/de18ad80-879c-4ad2-99f7-e1c697c0d68b",
    "type": "Note",
    "attributedTo": "https://chatty.example/~bob",
    "content": "hey alyssa!",
    "to": ["https://social.example/~alyssa"]
  },
  "to": ["https://social.example/~alyssa"],
  "cc": [],
  "actor": "https://chatty.example/~bob"
}

Easy enough, right? Well, there's another way we can do it as well, which is to use the capability as a bearer token (because it is one). This is useful when fetching objects:

Example: Fetching an object with HTTP + capability token
GET /objects/de18ad80-879c-4ad2-99f7-e1c697c0d68b HTTP/1.1
Accept: application/activity+json
Authorization: Bearer https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11

HTTP/1.1 200 OK
Content-Type: application/activity+json

[...]

Because we have a valid capability token, the server can make decisions on whether or not to disclose the object based on the relationship associated with that token.
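For the client side, a minimal Python sketch of preparing that fetch with the standard library looks like this. The build_fetch helper is mine, and the request is only constructed, not sent, since the domains here are made up.

```python
from urllib.request import Request


def build_fetch(object_url, capability_id):
    """Prepare a GET for an object, presenting the capability as a bearer token."""
    return Request(
        object_url,
        headers={
            "Accept": "application/activity+json",
            "Authorization": f"Bearer {capability_id}",
        },
        method="GET",
    )


req = build_fetch(
    "https://chatty.example/objects/de18ad80-879c-4ad2-99f7-e1c697c0d68b",
    "https://social.example/caps/640b0093-ae9a-4155-b295-a500dd65ee11",
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing req to urllib.request.urlopen would actually perform the fetch against a real server.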

This is basically OCAP in a nutshell. It's simple and easy for implementations to adopt and gives us a framework for extending it in the future to allow for all sorts of things without leakage of cryptographically-signed metadata.

If this sort of stuff interests you, drop by #litepub on freenode!