洪 民憙 (Hong Minhee) :nonbinary:

@hongminhee@hollo.social

I have deeply mixed feelings about ActivityPub's adoption of JSON-LD, as someone who's spent way too long dealing with it while building Fedify.

Part of me wishes it had never happened. A lot of developers jump into ActivityPub development without really understanding JSON-LD, and honestly, can you blame them? The result is a growing number of implementations producing technically invalid JSON-LD. It works, sort of, because everyone's just pattern-matching against what Mastodon does, but it's not correct. And even developers who do take the time to understand JSON-LD often end up hardcoding their documents anyway, because proper JSON-LD processor libraries simply don't exist for many languages. No safety net, no validation, just vibes and hoping you got the @context right. Naturally, mistakes creep in.

But then the other part of me thinks: well, we're stuck with JSON-LD now. There's no going back. So wouldn't it be nice if people actually used it properly? Process the documents, normalize them, do the compaction and expansion dance the way the spec intended. That's what Fedify does.

Here's the part that really gets to me, though. Because Fedify actually processes JSON-LD correctly, it's more likely to break when talking to implementations that produce malformed documents. From the end user's perspective, Fedify looks like the fragile one. “Why can't I follow this person?” Well, because their server is emitting garbage JSON-LD that happens to work with implementations that just treat it as a regular JSON blob. Every time I get one of these bug reports, I feel a certain injustice. Like being the only person in the group project who actually read the assignment.
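A minimal sketch of that mismatch, in Python. Everything here is a toy under stated assumptions: `TOY_CONTEXT` is a heavily trimmed, hypothetical stand-in for the real ActivityStreams context, and `naive_type` and `toy_expand_keys` are illustrative names, not Fedify's code or any library's API. The point is only that a reader that ignores `@context` accepts a document whose terms a JSON-LD processor would drop as undefined.

```python
AS_CONTEXT_URL = "https://www.w3.org/ns/activitystreams"
AS = "https://www.w3.org/ns/activitystreams#"

# Hypothetical, heavily trimmed stand-in for the real ActivityStreams context.
TOY_CONTEXT = {
    "id": "@id",
    "type": "@type",
    "content": AS + "content",
    "attributedTo": AS + "attributedTo",
}


def naive_type(doc):
    """Plain-JSON reading: look at the key and ignore @context entirely."""
    return doc.get("type")


def toy_expand_keys(doc):
    """Map keys through the declared context and drop undefined ones.

    Real expansion also rewrites values and wraps them in arrays; this toy
    only shows that terms without a definition do not survive.
    """
    ctx = TOY_CONTEXT if doc.get("@context") == AS_CONTEXT_URL else {}
    return {ctx[key]: value for key, value in doc.items() if key in ctx}


well_formed = {
    "@context": AS_CONTEXT_URL,
    "id": "https://example.com/notes/1",
    "type": "Note",
    "content": "hello",
}

# Same payload, but the sender forgot @context: every term is now undefined.
malformed = {
    "id": "https://example.com/notes/1",
    "type": "Note",
    "content": "hello",
}

print(naive_type(well_formed), naive_type(malformed))
print(toy_expand_keys(well_formed))
print(toy_expand_keys(malformed))
```

The plain-JSON reader sees a `Note` in both cases, while the toy processor ends up with nothing to work with for the second document, which is roughly how the stricter side ends up looking like the broken one.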

To be fair, there are real practical reasons why most people don't bother with proper JSON-LD processing. Implementing a full processor is genuinely a lot of work. It leans on the entire Linked Data stack, which is bigger than most people expect going in. And the performance cost isn't trivial either. Fedify uses some tricks to keep things fast, and I'll be honest, that code isn't my proudest work.

Anyway, none of this is going anywhere. Just me grumbling into the void. If you're building an ActivityPub implementation, maybe consider using a JSON-LD processor if one's available for your language. And if you're not going to, at least test your output against implementations that do.

🪨

@Varpie@peculiar.florist · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee I have the same feeling. The idea behind JSON-LD is nice, but it isn't widely available, so developing with it becomes a headache: do I want to create a JSON-LD processor, spending twice the time I wanted to, or do I just consider it as JSON for now and hope someone will make a JSON-LD processor soon? Often, the answer is the latter, because it's a big task that we're not looking for when creating fedi software.

Doug Webb

@douginamug@mastodon.xyz · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee I'm reading this thread as a relative noob, but what I see again and again: almost no one "properly" implements it, largely because it is hard but also because the spec itself is unclear. Most people who get stuff done have to go off-spec to actually ship.

This seems a fundamental weakness of the - and that's disregarding the limitations coming from the base architecture. It seems to pose a mid/long-term existential threat.

What can we do to help improve things?

shopkeeper

@potatomeow@fosstodon.org · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee i'm struggling badly with rust cuz it's rust being rust... i can imagine a duck typing lang might have an easier time

Doug Webb

@douginamug@mastodon.xyz · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee thank you for doing hard work with the plumbing! You are helping build coherence in this place and I'm grateful for it. Diverse people, unified standards 🖤

초무

@chomu.dev@bsky.brid.gy · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

How about building a library that handles JSON-LD properly?

Hazelnoot

@hazelnoot@enby.life · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee@hollo.social boosting this for the excellent points, even though I'm one of the people not using JSON-LD and frequently producing malformed documents.

(And honestly, I don't think I'll change that soon. Sharkey only uses JSON-LD on one single code path, and even that's been enough to introduce critical bugs. I'm planning to remove the JSON-LD lib entirely from Campfire fork.)

((And that's not even getting into the security problems with every JSON-LD lib I've ever audited...))

Rimu

@rimu@piefed.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

JSON-LD is a trap. Sorry you fell in.

silverpill

@silverpill@mitra.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee

>There's no going back.

We absolutely must go back. Either we have a vibrant ecosystem where building stuff is a pleasant experience, or fediverse slowly dies while linked data cultists harass developers about nonresolvable URLs in @context.

JSON-LD adds nothing to ActivityPub, it only creates problems. Time to move on.

Phil

@philcowans@universeodon.com · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee - boosting as I think this is an interesting discussion to have. I'm working on some ActivityPub adjacent ideas which use semantic web concepts, but I'm not deep enough in yet to have strong feelings about the right standards to use in different places.

julian

@julian@activitypub.space · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee@hollo.social I'll give you my take on this... which is that my understanding of JSON-LD is that with JSON-LD you can have two disparate apps using the same property, like thread, and avoid namespace collision because one is actually https://example.org/ns/thread and the other's really https://foobar.com/ns/thread.

Great.

I posit that this is a premature optimization, and one that fails because of inadequate adoption. There are likely documented cases of implementations using the same property, and those concern the actual ActivityStreams vocabulary, and the solution to that is to communicate and work together so that you don't step on each others' toes.

I personally feel that it is a technical solution to a problem that can be completely handled by simply talking to one another... but we're coders, we're famously anti-social yes? mmmmm...
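For readers who have not seen the namespace mechanism at work, here is a minimal Python sketch of the idea described above. It is a toy, not a JSON-LD processor: `expand_keys` is a hypothetical helper, the two contexts contain only the `thread` term, and the property values are made up. Only the two namespace IRIs come from the post.

```python
# Two apps use the same short name, but each context maps it to its own IRI.
app_a_context = {"thread": "https://example.org/ns/thread"}
app_b_context = {"thread": "https://foobar.com/ns/thread"}

# Hypothetical documents; the values are invented for illustration.
doc_a = {"@context": app_a_context, "thread": "https://a.example/threads/1"}
doc_b = {"@context": app_b_context, "thread": "https://b.example/threads/9"}


def expand_keys(doc):
    """Replace each short key with the IRI its own @context assigns to it."""
    ctx = doc["@context"]
    return {
        ctx[key]: value
        for key, value in doc.items()
        if key != "@context" and key in ctx
    }


expanded_a = expand_keys(doc_a)
expanded_b = expand_keys(doc_b)

# After expansion the two properties no longer share a name.
print(expanded_a)
print(expanded_b)
```

Whether that disambiguation is worth the machinery is exactly the judgment call being argued here; the sketch only shows what the machinery buys.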

Ben Pate 🤘🏻

@benpate@mastodon.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee

Great piece of writing, and I agree with lots of what you say here. Building this stuff is super hard, so good for you doing it “the right way.”

I hate not being able to trust the data I receive.

I’m one of those who’s taking the shortcuts, which has plenty of drawbacks, so I’m glad you’re in here fighting the good fight.

Luke Kanies

@lkanies@hachyderm.io · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee @jalefkowit huh. I’ve been pondering using it for some projects of mine, so this is good to know.

Is it a fundamental problem with JSON-LD, such that it should just be avoided, or a problem with how ActivityPub uses it?

And is there something else you’d recommend that fulfills the same goals?

pettter

@pettter@social.accum.se · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee XML was always the better option tbh.

Evan Prodromou

@evan@cosocial.ca · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee I agree that new developers should use a JSON-LD processor. It saves a lot of heartache.

Evan Prodromou

@evan@cosocial.ca · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee do you use the activitystrea.ms module from npm? It takes a lot of the pain out.

mcc

@mcc@mastodon.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee How hard would it be for a future version of ActivityPub to simply back out JSON-LD support? Would there be a downside to this?

Jocelynephiliac :reclaimer:

@twipped@twipped.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee @jalefkowit JSON-LD is the sole reason I gave up on building anything for the fediverse. It is ludicrously over-engineered for the purpose, all the worst parts of XML crammed into JSON.

Want to add a field to any of the objects? Better be prepared to host a schema definition from now until the end of time. Good luck figuring out how to write that definition though.

artemist

@artemist@social.mildlyfunctional.gay · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee I attempted to deal with parsing json-ld and I really wish they had just gone with XML. XML is also complicated, but it has mature implementations, while JSON-LD is a new protocol that never felt to me to have compelling advantages except matching data models people were used to using.

Hugo Mills

@darkling@mstdn.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee You have my deepest sympathies. I've been using Linked Data (correctly, for the most part) since 2002, and I feel your pain.

Khleedril

@khleedril@cyberplace.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee This is a conversation I've been wanting to see for a long time. As things stand you have to develop ActivityPub specially for every service you want to talk to, because of all the implementation inconsistency. I hope we can converge on a simple solution soon.

Kan-Ru Chen 🦀

@kanru@g0v.social · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee I had a similar realization early on when implementing Pinka. I almost went full JSON-LD but found that to properly expand the document I might need to make network calls. I stopped worrying about unknown terms and just hard coded a list of well-known AS and APub terms for interoperability.
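A rough Python sketch of that kind of shortcut, assuming nothing about how Pinka actually does it. `KNOWN_TERMS` is an illustrative subset of ActivityStreams and ActivityPub property names, and `pick_known` is a hypothetical helper: it keeps the well-known terms, ignores everything else, and never touches the network.

```python
# Illustrative subset only; a real implementation would list many more terms.
KNOWN_TERMS = frozenset({
    "id", "type", "actor", "object", "to", "cc",
    "published", "content", "inbox", "outbox",
})


def pick_known(doc):
    """Keep well-known terms; silently ignore @context and unknown extensions."""
    return {key: value for key, value in doc.items() if key in KNOWN_TERMS}


incoming = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.com/activities/1",
    "type": "Create",
    "actor": "https://example.com/users/alice",
    "someExtension": {"anything": "goes"},  # hypothetical unknown term
}

print(pick_known(incoming))
```

The trade-off is the one the thread keeps circling: no context fetches and no processor, at the cost of trusting that every sender means the same thing by these short names.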

Digimon Story: Eevee Stranger

@asonix@masto.asonix.dog · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee completely agree here

JSON-LD has been something I really want to support in federated software I work on, but the existing libraries (for rust) don't offer everything I'd want (framing) which means I'd be hardcoding outgoing documents. Not the end of the world but it would be nice if the ecosystem was more established

kopper :colon_three:

@kopper@not-brain.d.on-t.work · Reply to 洪 民憙 (Hong Minhee) :nonbinary:'s post

@hongminhee from the point of view of someone who is "maintaining" a JSON-LD processing fedi software and has implemented their own JSON-LD processing library (which is, to my knowledge, the fastest in its programming language), JSON-LD is pure overhead. there is nothing it allows for that can't be done with

1. making fields which take multiple values explicit
2. always using namespaces and letting HTTP compression take care of minimizing the transfer
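The two points above can be sketched in Python as follows. This is a hypothetical wire format made up for illustration, not an existing standard: keys are full IRIs and multi-valued fields are always lists, so the reader needs no context and no shape checks. `recipients_compact` shows the branching that compacted JSON-LD forces on a plain-JSON consumer for comparison. HTTP compression is not demonstrated here.

```python
AS = "https://www.w3.org/ns/activitystreams#"


def recipients_compact(doc):
    """Compacted JSON-LD style: `to` may be absent, a single string, or a list."""
    value = doc.get("to", [])
    return value if isinstance(value, list) else [value]


def recipients_explicit(doc):
    """Hypothetical explicit style: full-IRI key, value is always a list."""
    return doc.get(AS + "to", [])


compact_single = {"to": AS + "Public"}
compact_many = {
    "to": [AS + "Public", "https://example.com/users/alice/followers"],
}
explicit = {AS + "to": [AS + "Public"]}

print(recipients_compact(compact_single))
print(recipients_compact(compact_many))
print(recipients_explicit(explicit))
```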

without JSON-LD, fedi software could use zero-ish-copy deserialization for a majority of their objects (when strings aren't escaped) through tools like serde_json and Cow<str>, or System.Text.Json.JsonDocument. JSON-LD processing effectively mandates a JSON node DOM (in the algorithms standardized, you may be able to get rid of it with Clever Programming)

additionally, due to JSON-LD 1.1 features like @type:@json, you cannot even fetch contexts ahead of time of running JSON DOM transformations, meaning all JSON-LD code has to be async (in the languages which have the concept), potentially losing out on significant optimizations that can't be done in coroutines due to various reasons (e.g. C# async methods can't have ref structs, Rust async functions usually require thread safety due to tokio's prevalence, even if they're run in a single-threaded runtime)

this is on top of context processing introducing a network dependency to the deserialization of data, wasting time and data on non-server cases (e.g. activitypub C2S). sure you can cache individual contexts, but then the context can change underneath you, desynchronizing your cached context and, in the worst case, opening you up to security vulnerabilities
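One way to picture the caching trade-off is a loader that only serves contexts pinned locally and can tell when a freshly fetched copy no longer matches the pin. The Python sketch below is an assumption-laden illustration: `PinnedContextLoader`, `digest`, and the example context contents are all hypothetical, and no real JSON-LD library's document-loader API is implied.

```python
import hashlib
import json


def digest(context_doc):
    """SHA-256 over a canonical JSON encoding of the context document."""
    canonical = json.dumps(context_doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PinnedContextLoader:
    """Serve contexts from a local table; never fetch at deserialization time."""

    def __init__(self):
        self._pinned = {}

    def pin(self, url, context_doc):
        self._pinned[url] = (context_doc, digest(context_doc))

    def load(self, url):
        if url not in self._pinned:
            raise LookupError(f"context not pinned: {url}")
        return self._pinned[url][0]

    def matches_pin(self, url, fetched_doc):
        """True if a freshly fetched copy is identical to the pinned one."""
        return url in self._pinned and digest(fetched_doc) == self._pinned[url][1]


loader = PinnedContextLoader()
url = "https://example.com/ns/context"  # hypothetical context URL
pinned_doc = {"@context": {"thread": "https://example.org/ns/thread"}}
changed_doc = {"@context": {"thread": "https://evil.example/ns/thread"}}

loader.pin(url, pinned_doc)
print(loader.load(url) == pinned_doc)
print(loader.matches_pin(url, pinned_doc))
print(loader.matches_pin(url, changed_doc))
```

Pinning removes the network dependency and makes drift detectable, but it also means legitimate upstream changes are not picked up until someone updates the pin by hand.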

json-ld is not my favorite part of this protocol