Musubi Resources:
Insights & Best Practices
How to audit your fixed ML classifier
Four signs that a fixed ML classifier might not be working for your Trust & Safety operations: what the symptoms are, how to diagnose them, and what might work instead.
How to use LLMs for Content Moderation (in 2026)
A lot has changed in how T&S teams use LLMs for content moderation. This is a practitioner's guide to what's working in 2026: model selection, policy engineering, agentic workflows, and the operational practices that separate mature systems from experimental ones.
Rule-Based vs. Fixed ML vs. LLM Content Moderation: How to Choose
A practical comparison of rule-based, fixed ML, and LLM-based content moderation for T&S teams: how each approach works, where each breaks down, and how to think about the decision in 2026.
We Tried to Detect Bots in 500 Comments. We Found a More Interesting Problem.
Can you tell which online comments were written by a bot? We scored 500 of them across eight dimensions and a library of 60+ AI-writing patterns. The answer changed what we think platforms should be optimizing for.
How Bluesky Kept Spam in Check While Growing to 41 Million Users
How Musubi helped Bluesky achieve 60x faster spam takedowns and 99.8% decision accuracy while scaling to 41 million users.
LLM Content Moderation: Implementation Guide for Trust & Safety Teams
A practical guide to LLM content moderation for T&S teams: model selection, integration architecture, bias mitigation, golden datasets, and human oversight. Real deployment pitfalls and solutions from production systems.
Agreement Observability for AI Moderation
When AI moderation drifts in production, the signal often comes too late. This post walks through the Agreement Observability tool we built to track model-moderator agreement in real time and simulate threshold tradeoffs before they become production problems.