TL;DR

A network of 474 WordPress sites started predominantly publishing to a small subset of its own sites, leading to content imbalance. This reveals flaws in automated content routing and supply management, with significant implications for large-scale publishing systems.

A large automated content network consisting of 474 WordPress sites has started predominantly publishing to only a small subset of its own sites, leading to a highly uneven distribution of content. This shift was confirmed through a recent 28-day audit, which revealed that 80% of all posts were concentrated on just 8% of the sites. The development matters because it exposes systemic flaws in the network’s content routing and supply management, risking spam-like behavior and content starvation across the network.

The network operates on two separate systems: Stenvrik, which gathers and evaluates news signals, and DojoClaw, which rewrites and distributes content across the sites. These systems are decoupled but communicate via a local HTTP contract. The recent audit showed that the majority of content was being published to only a handful of sites, mainly in the technology and AI categories, while more than half of the sites received no content at all. This imbalance was not caused by a single fault but by two distinct issues: within-topic concentration and supply-demand mismatch.

The within-topic concentration was driven by the LLM matcher repeatedly surfacing the same tech sites, while the supply mismatch stemmed from the fact that most content was tech-focused, but the majority of sites covered other categories like Home, Health, and Food, which received almost no material. To address this, the content engine was updated with new routing controls, including caps on site publication frequency and a network-wide recency ordering that prioritized idle sites, allowing dormant sites to participate more actively. These fixes aim to rebalance the distribution and prevent the network from self-restricting its growth and diversity.

Balancing a 474-site network — ThorstenMeyerAI.com
AI & Tooling · Engineering Note
Systems at scale

When a content network starts publishing to itself

A 474-site network quietly collapsed onto 38 of its own favorites while half the catalog went dark. The throughput graph looked fine. The fix wasn’t one thing — it was two causes and a three-part repair across two decoupled systems.

Stenvrik

News-intelligence layer

Ingests hundreds of feeds, scores & geo-tags stories, surfaces what’s trending.

SUPPLY · what’s worth covering
DojoClaw

AI content engine

Rewrites a story in each site’s voice and fans it out across the catalog.

PLACEMENT · where it lands & how it reads
01 The symptom

80% of output on 8% of sites

A 28-day audit, bucketed per site, was lopsided in a way the totals had hidden. Every individual placement was “correct” — the aggregate was a slow-motion failure.

Where 28 days of syndication actually landed

474-site catalog · per-site audit
Top 38 sites (8% of catalog): 80% of all posts
Top 4 sites (all tech titles): 200+ articles/week each
249 sites (53% of catalog): zero posts, half the network dark
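
A per-site audit of this kind can be sketched in a few lines. The function below is a minimal illustration, not the audit that was actually run: the name `audit_concentration`, its inputs (a flat list of site ids, one per post, plus the full catalog) and the returned keys are all assumptions made for the example.

```python
from collections import Counter

def audit_concentration(post_sites, catalog, share=0.80):
    """Summarize how unevenly posts are spread across a site catalog.

    post_sites: iterable of site ids, one entry per published post.
    catalog:    list of every site id in the network, idle ones included.
    share:      fraction of posts whose smallest carrying set we want.
    """
    counts = Counter(post_sites)
    total = sum(counts.values())
    # Idle sites never appear in the Counter, so find them via the catalog.
    dark = [s for s in catalog if counts[s] == 0]

    # Walk sites from busiest to quietest until `share` of posts is covered.
    covered, top_sites = 0, 0
    for _, n in counts.most_common():
        if covered >= share * total:
            break
        covered += n
        top_sites += 1

    return {
        "total_posts": total,
        "sites_for_share": top_sites,
        "pct_of_catalog": round(100 * top_sites / len(catalog), 1),
        "dark_sites": len(dark),
        "pct_dark": round(100 * len(dark) / len(catalog), 1),
    }
```

The point of bucketing per site is that `total_posts` alone can look healthy while `sites_for_share` and `dark_sites` reveal the collapse.
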
02 The diagnosis · refuse the obvious

Not one bug — two independent causes

The tempting move is to blame the matcher and move on. The data showed two distinct problems living on two different systems, each needing its own fix.

Cause 1 · DojoClaw

Within-topic concentration

The matcher kept surfacing the same broad tech sites for every tech story, and rotation only shuffled candidates within the matched pool. A site that never entered the pool could never get a turn — fair only among the already-chosen.

Cause 2 · Stenvrik

Supply ≠ demand

53% of supplied content was tech/AI — but only ~13% of sites are. The catalog skews the other way, so those sites starved for on-topic material.

Supply: tech/AI share of incoming content, 53%
Demand: tech/AI share of sites in the catalog, ~13%
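
The size of the mismatch is easier to see as a per-site ratio. The arithmetic below is a rough sketch using only the two audit figures. It assumes a two-bucket split (tech/AI versus everything else) and that the non-tech 47% of supply is spread evenly over the non-tech sites, which is generous given that categories like Home, Health, and Food received almost nothing.

```python
# Figures from the audit; the two-bucket split is a simplification.
supply_tech = 0.53   # share of supplied stories that are tech/AI
demand_tech = 0.13   # approximate share of catalog sites that are tech/AI

# On-topic supply per site, relative to a perfectly matched network (1.0).
tech_ratio = supply_tech / demand_tech
other_ratio = (1 - supply_tech) / (1 - demand_tech)

print(f"tech/AI sites: {tech_ratio:.2f}x their proportional share")
print(f"other sites:   {other_ratio:.2f}x their proportional share")
print(f"per-site gap:  {tech_ratio / other_ratio:.1f}x")
```

Under these assumptions a tech site sees roughly 4.08x its proportional share of on-topic material, a non-tech site about 0.54x, and the per-site gap is about 7.5x. No placement logic can close that gap on its own, which is why the second cause sits on the supply side.
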
03 The load balancer · flip it

Watch the network rebalance

[Interactive placement simulator: each square is one of the 474 sites, colored by how much it is publishing (dark · 0, light, healthy, busy, overloaded). In the starting state, 38 sites carry 80% of posts, 249 sites are dark with zero posts, and the hottest sites are overloaded at ~30/day. Toggling the selection logic spreads placement off the red-hot favorites and into the dark long tail.]

The matcher's relevance gate is the same either way. The only change is how candidates are ordered after it.
04 The three-part fix

Placement, supply, throughput

Two causes meant the fix had to touch both systems — and only then could the ceiling rise without re-concentrating the load.

1 · Placement levers · DojoClaw
  • Per-site weekly cap — any site over 25 posts/7d drops from the pool, pushing selection into the long tail (relaxes only if it would starve a fan-out).
  • Global LRU — order by network-wide recency, not just within-topic, so sites idle across the whole network float to the top.
  • Starvation floor — guaranteed by construction: the most-idle eligible site is always within the picks.
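
The three placement levers compose naturally into one selection function. What follows is a minimal sketch under stated assumptions, not DojoClaw's actual code: `pick_sites`, its parameters, and the dictionary inputs are hypothetical, and the cap relaxation is interpreted as "fill leftover slots from over-cap sites only when the under-cap pool runs out".

```python
def pick_sites(candidates, posts_7d, last_post_ts, max_sites=5, weekly_cap=25):
    """Choose fan-out targets from sites that already passed the relevance gate.

    candidates:   site ids the matcher judged relevant for this story.
    posts_7d:     dict site -> posts in the trailing 7 days (missing = 0).
    last_post_ts: dict site -> timestamp of its last post anywhere in the
                  network (missing = never published, treated as most idle).
    """
    def idle_key(site):
        # Older timestamps sort first; never-published sites sort before all.
        return last_post_ts.get(site, float("-inf"))

    # Per-site weekly cap: sites over the cap drop out of the main pool.
    under_cap = [s for s in candidates if posts_7d.get(s, 0) <= weekly_cap]
    over_cap = [s for s in candidates if posts_7d.get(s, 0) > weekly_cap]

    # Global LRU: most idle first, judged on network-wide recency.
    picks = sorted(under_cap, key=idle_key)[:max_sites]

    # Relax the cap only if enforcing it would leave fan-out slots empty.
    if len(picks) < max_sites:
        picks += sorted(over_cap, key=idle_key)[: max_sites - len(picks)]
    return picks
```

In this sketch the starvation floor needs no separate code. Because the most idle eligible site sorts first, it is always among the picks whenever `max_sites` is at least 1, which is one way to read "guaranteed by construction".
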
2 · Supply rebalance · Stenvrik
  • Audited existing feeds for liveness — removed ones returning HTTP 200 but zero items (broken RSS).
  • Added a verified batch across Home, Garden, Health, Food, Fashion, Auto, Science, Pets & more — every feed fetched live first, weighted to the most idle categories.
  • Flagged throttled feeds (big publishers exposing only 1–2 items) for replacement rather than burying the risk.
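
A liveness check along these lines has to look past the status code and count what the feed actually exposes. The sketch below is illustrative only: `classify_feed`, its labels, and the `throttled_max` threshold are assumptions, and it parses an already-fetched body rather than performing the fetch. For feeds from untrusted sources, a hardened XML parser would be a safer choice than the standard library one used here.

```python
import xml.etree.ElementTree as ET

ATOM_ENTRY = "{http://www.w3.org/2005/Atom}entry"

def classify_feed(status_code, body, throttled_max=2):
    """Label a fetched feed by how many items it actually exposes.

    Returns "error", "empty", "throttled", or "live". The labels and the
    throttled_max threshold are illustrative, not the network's real values.
    """
    if status_code != 200:
        return "error"
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "error"
    # RSS 2.0 uses <item>; Atom uses a namespaced <entry>.
    items = len(root.findall(".//item")) + len(root.findall(f".//{ATOM_ENTRY}"))
    if items == 0:
        return "empty"        # HTTP 200 but nothing to syndicate
    if items <= throttled_max:
        return "throttled"    # publisher exposes only a trickle
    return "live"
```

Separating "empty" from "throttled" mirrors the two actions in the list above: empty feeds are removed, while throttled ones are flagged for replacement.
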
3 · Throughput raise · Scheduler
  • Fan-out width maxSites 5 → 7 — the extra slots land on fresh sites because the cap is now enforcing.
  • Quota depth K 2 → 3 — every category’s daily cap scaled ×1.5.
  • Honest note: a documented target of ~950/day, which the code never delivered because of a units quirk, stays gated behind a sign-off.
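
The quota change can be checked against the ~188/day and ~280/day ceilings reported in the scoreboard. This is a quick arithmetic sketch that assumes the daily ceiling scales linearly with quota depth K, which follows if the ceiling is simply the sum of the per-category caps. The variable names are illustrative.

```python
k_before, k_after = 2, 3
ceiling_before = 188                 # approximate posts/day before the change
reported_after = 280                 # approximate posts/day after the change

scale = k_after / k_before           # 3 / 2 = 1.5
predicted_after = ceiling_before * scale

print(scale, predicted_after)                          # 1.5 282.0
print(f"{reported_after / ceiling_before - 1:.0%}")    # 49%
```

A ×1.5 scale predicts about 282/day, so the reported ~280/day and +49% are consistent with it once rounding is taken into account.
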
05 What it adds up to

The scoreboard — with an honest asterisk

The change is behavioral: it shapes future placement, it doesn’t retroactively rescue the month sites sat dark. The proof is in the next weeks of data — which is why the instrumentation is the real deliverable.

Metric | Before | After
Concentration | 80% on 38 sites | cap + LRU + floor
Dormant sites | 249 (53%) | shrinking ↓
Feed sources | 245 | 271 verified
Daily ceiling | ~188/day | ~280/day · +49%
Fan-out width | 5 | 7

Why two systems, not one

Supply and placement are genuinely separate concerns. Diagnosing the imbalance meant looking at both sides and seeing they disagreed. A clean boundary made a failure that spanned both legible — good system boundaries organize thought, not just code.

The tradeoff taken

Ordering by load & idleness sacrifices a little topical ranking for dramatically better coverage. All candidates already cleared the relevance gate — so it’s a deliberate trade, not a regression.

Stenvrik (news-intelligence) ↔ DojoClaw (content engine) · figures reflect the May 2026 engineering audit & the behavioral changes made in response · the network’s response is being tracked.

Implications of Self-Publishing in Large Content Networks

This development highlights how automated content distribution systems can unintentionally favor certain sites, saturating them while neglecting others. For publishers and content platforms, such an imbalance risks sending spam-like signals to search engines, reduces overall content diversity, and diminishes the network's value for both users and publishers. It underscores the importance of monitoring aggregate behavior, not just individual decisions, in large automated systems, especially as they scale.

System Design and Past Challenges in Automated Content Distribution

Large automated content networks rely on decoupled systems for content evaluation and distribution. Historically, these systems have faced challenges with balancing content supply and demand across diverse site categories. Previous issues included over-concentration of content in specific niches and underrepresentation of others, often due to algorithmic biases or routing logic. The recent self-publishing behavior underscores the ongoing need for dynamic routing controls and systemic oversight to maintain healthy distribution and avoid self-reinforcing imbalances.

"The network decided, without explicit instruction, to publish predominantly to its favorite sites, creating a lopsided content ecosystem that risks both spammy signals and content starvation."

— Thorsten Meyer

Unclear Extent and Future Impact of Self-Publishing Pattern

It is not yet clear how widespread this self-publishing behavior will become or whether it will resolve fully with the current fixes. The long-term impact on search engine rankings, content quality, and network health remains to be seen, and ongoing monitoring is required to assess whether the adjustments are sufficient or if further systemic changes are needed.

Next Steps for Monitoring and Adjusting Content Distribution

The network administrators plan to continue monitoring the distribution patterns closely, with additional adjustments to routing algorithms and caps. Future audits will evaluate whether the imbalance diminishes and if dormant sites begin to receive more content. Further development may include more granular controls to prevent similar issues from recurring at scale.

Key Questions

Why did the network start publishing mainly to its favorite sites?

The system's routing logic favored certain sites due to within-topic concentration and supply-demand mismatches, causing content to accumulate on a few sites while others remained inactive.

Could this imbalance harm the network’s overall effectiveness?

Yes. Concentrated publishing risks search engine penalties, reduces content diversity, and diminishes the network's value for users and publishers.

Are the current fixes sufficient to prevent future imbalance?

The fixes, including publication caps and recency-based routing, are designed to rebalance distribution, but ongoing monitoring will determine their long-term effectiveness.

What should other automated content systems learn from this?

Automated systems must incorporate dynamic controls and systemic oversight to prevent self-reinforcing imbalances that can undermine network health.

Source: ThorstenMeyerAI.com
