[{"data":1,"prerenderedAt":110},["ShallowReactive",2],{"writing-ai-content-tagging":3},{"id":4,"title":5,"author":6,"authorImage":7,"body":8,"dateModified":92,"datePublished":93,"description":14,"editorsPick":94,"extension":95,"intro":96,"mainImageId":97,"meta":98,"navigation":99,"ogImage":100,"path":101,"readingTimeMinutes":102,"schemaImages":103,"seo":104,"series":97,"shortIntro":105,"sitemap":106,"slug":107,"stem":108,"twitterImage":100,"__hash__":109},"writing/writing/ai-content-tagging.md","AI can tag your whole content archive in an afternoon. That's where the trouble starts.","Daniel Betts","/images/writing/author-daniel-betts.webp",{"type":9,"value":10,"toc":85},"minimark",[11,15,18,21,26,32,35,38,41,45,50,53,56,59,67,70,74,79,82],[12,13,14],"p",{},"Nobody got into publishing to manage tags. They're the plumbing of any content archive, the labels that let a topic hub gather together everything ever written on a subject. If you want \"related articles\" to actually be related; if you want search to work well; basically, if you want a good site - you need good tagging. When it is good, nobody notices, your site works, your readers stay. When it's bad, though, your readers don't find the content they're interested in and they leave, without writing to tell you why.",[12,16,17],{},"And tags are almost never good, because of how they get made. Tagging by hand fails in two very predictable ways. Firstly, time pressure. Editors are busy actually publishing articles - they don't have time for tagging, so it gets skipped, or it gets done quickly, which is to say, not very well. Secondly, inconsistency. Even when it is done properly, the same idea ends up filed three slightly different ways: \"subscriber acquisition\" on one piece, \"acquiring subscribers\" on the next, \"subscriptions - growth\" on a third. Three editors, or one editor on three different days. Each near-duplicate quietly splits the very group of articles the tag was meant to pull together. 
The hub that should show forty pieces shows twelve. The other twenty-eight sit under labels just different enough to be invisible to each other, and to everyone looking.",[12,19,20],{},"So when someone tells you that AI can read every article you publish and tag it in seconds, of course it sounds like the answer to all your prayers. And it absolutely can. That part is real, and it really could be as simple as an afternoon's work to wire up. The first time you run it, it will likely look like magic, giving you the detailed tagging that you've never managed to get from those busy, inconsistent humans. But this is also going to be where most of the trouble starts, because the easy version and the version you can actually live with will look identical for about a week.",[22,23,25],"h2",{"id":24},"the-fix-that-makes-the-mess-worse","The fix that makes the mess worse",[27,28,29],"blockquote",{},[12,30,31],{},"A model left to pick its own tags doesn't tidy your taxonomy. It floods it.",[12,33,34],{},"Point a model at your articles with a free hand and it will tag each one well. The catch is hiding in the words \"each one\". Asked to tag a piece about holding on to readers, it coins \"reader retention\". On the next, written a little differently, \"audience retention\". On a third, \"keeping subscribers\". Every one of those is an entirely sensible tag. Each is decided in isolation. None of them knows the other two exist. So rather than settling on a shared vocabulary, the model does, at machine speed and machine scale, exactly what your three editors did by hand. It sprawls.",[12,36,37],{},"Only now there's no afternoon's worth of articles to fix. There are thousands, already tagged, sitting under a taxonomy with three hundred near-synonyms in it where fifty real topics should be. You haven't organised the archive. 
You've made it more disorganised than years of neglect ever managed, in bulk and in a single stroke.",[12,39,40],{},"The obvious instinct is to clean it up afterwards. Run a de-duplication pass, merge the synonyms, reconcile the mess once it exists. But that's a losing race against a process that coins fresh variants faster than anyone can merge them. Consistency isn't something you can bolt on at the end; either it holds across the whole archive by design, or it doesn't hold at all. Getting that part right is most of the real work, and it has almost nothing to do with calling an API.",[22,42,44],{"id":43},"and-thats-the-trap-you-hit-first-not-the-only-one","And that's the trap you hit first, not the only one",[27,46,47],{},[12,48,49],{},"A taxonomy is an editorial asset, and the people who own the publication should stay the people who own its vocabulary.",[12,51,52],{},"The other problems come up later, once you're further into the project.",[12,54,55],{},"There's cost. Every article tagged is a billed call to a model provider, so the cost scales with the size of your archive. A back-catalogue of a hundred thousand articles is exactly where a careless design turns into an open chequebook. The expensive failures here are the quiet ones: they don't show up in the demo, they show up on the invoice a month later, with half the archive already processed. Keeping that under control is an engineering discipline in its own right.",[12,57,58],{},"There's the fact that a tag-on-publish system has to run untended for months inside a live site, surviving restarts, deploys and the occasional malformed article, without ever tagging the same thing twice or quietly falling over.",[12,60,61,62,66],{},"And there's the most insidious one: a model that is, every so often, confidently and plausibly wrong. At scale that poisons an archive with tags that look right and aren't. 
So the system has to be designed and built to be wrong ",[63,64,65],"em",{},"safely",", and to show its working to a human who can overrule it.",[12,68,69],{},"That last point matters more than it first appears, and it's worth being clear about. Just because you can hand your taxonomy to a machine and walk away, doesn't mean you should. The shape that works is the model suggesting and a person deciding: tags applied automatically, flagged plainly as machine-made, and an editor able to glance, accept, or throw them out in a second. Not because the AI can't be trusted with the easy cases - it can, that's the whole point of using it - but because a taxonomy is an editorial asset, and the people who own the publication should stay the people who own its vocabulary. A system that degrades, on a bad week, to \"tagged but not yet checked\" is miles better than one that degrades to \"never tagged\". A system that takes the decision away from editors altogether is how you end up trusting none of it.",[22,71,73],{"id":72},"why-it-looks-so-easy","Why it looks so easy",[27,75,76],{},[12,77,78],{},"The demo and the thing you can run for a year are different objects wearing the same clothes.",[12,80,81],{},"The reason AI tagging is so easy to get wrong is exactly the reason it looks so easy to get right. Suggesting a tag for one article is a solved problem. Anyone can stand it up in an afternoon and it'll look like the future. Tagging a hundred thousand articles in a way that leaves the archive more findable next year instead of less is a different problem entirely, and it happens to look the same from the outside.",[12,83,84],{},"The gap between the two is invisible right up until you're standing in it: three hundred tags where only fifty belong, a cleanup bill nobody budgeted for, and the slow realisation that the clever shortcut quietly cost you more than it saved. Tags were always the unglamorous part of running a publication, and that hasn't changed. 
What's changed is that now you can lose the one thing they were ever for - readers finding what they came for - faster and far more convincingly than you ever could by hand.",{"title":86,"searchDepth":87,"depth":87,"links":88},"",2,[89,90,91],{"id":24,"depth":87,"text":25},{"id":43,"depth":87,"text":44},{"id":72,"depth":87,"text":73},"2026-06-30T15:31:08.9188892","2026-06-30T15:31:08.9187046",false,"md","AI can tag your entire content archive in an afternoon. The trouble is what it does to your taxonomy - and why consistency, not speed, is the hard part.",null,{},true,"https://api.missionsystems.co.uk/api/images/4","/writing/ai-content-tagging",6,[100],{"title":5,"description":14},"Tagging one article looks like magic. Tagging a hundred thousand is anything but.",{"loc":101},"ai-content-tagging","writing/ai-content-tagging","_U6VvJyphPnhE3XAiOFlraOjExPqRj46H92JbaiUgVU",1782833630971]