When a Content Network Starts Publishing to Itself

TL;DR

A 474-site content network is sending 80% of its posts to 38 sites (8% of the catalog), while 249 sites (53%) received nothing in a 28-day audit. The imbalance traces to two causes: placement logic that keeps selecting the same sites, and a content supply that does not match what most sites cover.

A large automated content network of 474 WordPress sites has been publishing most of its content to 38 of them, leaving more than half the network inactive. A recent 28-day audit confirmed the pattern and traced it to weaknesses in how content is sourced and placed, which could affect the network’s overall health and search visibility.

The network is managed by two distinct systems: Stenvrik, which sources and judges editorial content, and DojoClaw, which handles content rewriting and distribution across sites. Recent analysis shows that 80% of all posts are concentrated on only 8% of the sites, with 249 sites receiving no posts at all in a 28-day period. This imbalance indicates a tendency for the system to favor certain sites, effectively marginalizing the rest.

The investigation found two causes. The first is within-topic concentration: the matcher repeatedly surfaced the same broad tech sites for tech stories, and rotation only shuffled sites already in that matched pool. The second is a supply mismatch: about 53% of incoming content was tech/AI, but only about 13% of sites cover that topic, so sites in niches such as home, health, and food received little or no relevant content. Together these produced a self-reinforcing cycle in which favored sites accumulated posts and the rest were neglected.

To address this, DojoClaw’s selection logic was changed in three ways: a per-site cap of 25 posts per 7 days, network-wide least-recently-used ordering so that idle sites surface first, and a starvation floor that keeps the most-idle eligible site among the picks. On the supply side, Stenvrik’s feeds were audited for liveness and extended into under-served categories. These changes aim to diversify content placement and reduce the network’s reliance on a small subset of sites.

Balancing a 474-site network — ThorstenMeyerAI.com
AI & Tooling · Engineering Note
Systems at scale

When a content network starts publishing to itself

A 474-site network quietly collapsed onto 38 of its own favorites while half the catalog went dark. The throughput graph looked fine. There was no single fix: the audit found two causes, and the repair had three parts spread across two decoupled systems.

Stenvrik · news-intelligence layer
Ingests hundreds of feeds, scores & geo-tags stories, surfaces what’s trending.
SUPPLY · what’s worth covering

DojoClaw · AI content engine
Rewrites a story in each site’s voice and fans it out across the catalog.
PLACEMENT · where it lands & how it reads

01 The symptom

80% of output on 8% of sites

A 28-day audit, bucketed per site, was lopsided in a way the totals had hidden. Every individual placement was “correct” — the aggregate was a slow-motion failure.

Where 28 days of syndication actually landed

474-site catalog · per-site audit
  • Top 38 sites (8% of catalog): 80% of all posts
  • Top 4 sites (all tech titles): 200+ articles/week each
  • 249 sites (53% of catalog): zero posts; half the network dark
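The per-site bucketing behind these figures can be sketched as follows. This is a minimal illustration, assuming the audit produces a mapping from site to post count for the window; the function name `concentration_report` and the toy catalog are hypothetical, not the network’s actual tooling.

```python
from typing import Dict


def concentration_report(posts_per_site: Dict[str, int], share_pct: int = 80) -> dict:
    """Summarize how unevenly posts are spread across a catalog.

    posts_per_site must list every site in the catalog, with 0 for
    sites that received nothing in the audit window.
    """
    total = sum(posts_per_site.values())
    counts = sorted(posts_per_site.values(), reverse=True)

    # Walk from the busiest site down until share_pct of all posts is covered.
    covered = 0
    sites_needed = 0
    for count in counts:
        if total == 0 or covered * 100 >= share_pct * total:
            break
        covered += count
        sites_needed += 1

    dormant = sum(1 for count in counts if count == 0)
    n = len(counts)
    return {
        "sites_for_share": sites_needed,
        "pct_of_catalog": round(100 * sites_needed / n, 1) if n else 0.0,
        "dormant_sites": dormant,
        "pct_dormant": round(100 * dormant / n, 1) if n else 0.0,
    }


# Toy catalog (hypothetical): 2 busy sites, 3 light sites, 5 dark sites.
toy = {"a": 40, "b": 40, "c": 10, "d": 5, "e": 5,
       "f": 0, "g": 0, "h": 0, "i": 0, "j": 0}
report = concentration_report(toy)
```

Totals alone would hide this shape: the toy catalog publishes 100 posts, yet two sites carry 80 of them and five carry none.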
02 The diagnosis · refuse the obvious

Not one bug — two independent causes

The tempting move is to blame the matcher and move on. The data showed two distinct problems living on two different systems, each needing its own fix.

Cause 1 · DojoClaw

Within-topic concentration

The matcher kept surfacing the same broad tech sites for every tech story, and rotation only shuffled candidates within the matched pool. A site that never entered the pool could never get a turn — fair only among the already-chosen.
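A toy simulation makes the "fair only among the already-chosen" point concrete. It assumes a simple round-robin rotation over the matched pool; the site names and the `rotate_within_pool` helper are hypothetical and stand in for whatever rotation DojoClaw actually used.

```python
from collections import Counter
from itertools import cycle, islice


def rotate_within_pool(matched_pool, stories, fan_out):
    """Round-robin placement that only ever draws from the matched pool."""
    posts = Counter()
    rotation = cycle(matched_pool)
    for _ in range(stories):
        # Each story takes the next fan_out sites in the rotation.
        for site in islice(rotation, fan_out):
            posts[site] += 1
    return posts


catalog = ["tech-a", "tech-b", "tech-c", "niche-ai-1", "niche-ai-2"]
matched_pool = catalog[:3]  # the matcher keeps surfacing the same three sites

posts = rotate_within_pool(matched_pool, stories=6, fan_out=2)
never_picked = [site for site in catalog if posts[site] == 0]
```

Within the pool the load is perfectly even (four posts each), yet the two sites outside it stay at zero no matter how many stories arrive.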

Cause 2 · Stenvrik

Supply ≠ demand

53% of supplied content was tech/AI — but only ~13% of sites are. The catalog skews the other way, so those sites starved for on-topic material.

Supply · tech/AI share of incoming content: 53%
Demand · tech/AI share of sites in the catalog: ~13%
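The mismatch can be measured by comparing each category’s share of incoming stories with its share of sites. The sketch below assumes every story and every site carries a single category label; only the 53% and ~13% tech/AI figures come from the audit, and the home/health split in the toy data is invented for illustration.

```python
from collections import Counter
from typing import Dict, List


def share_by_category(labels: List[str]) -> Dict[str, float]:
    """Return each category's fraction of the given labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


def supply_demand_gap(story_categories: List[str],
                      site_categories: List[str]) -> Dict[str, float]:
    """Supply share minus demand share, per category.

    Positive means the category is oversupplied; negative means the
    sites in that category are short of on-topic material.
    """
    supply = share_by_category(story_categories)
    demand = share_by_category(site_categories)
    categories = set(supply) | set(demand)
    return {c: supply.get(c, 0.0) - demand.get(c, 0.0) for c in categories}


# Toy data shaped like the audit: tech dominates supply, not the catalog.
stories = ["tech"] * 53 + ["home"] * 20 + ["health"] * 27
sites = ["tech"] * 13 + ["home"] * 40 + ["health"] * 47
gap = supply_demand_gap(stories, sites)
```

A gap of roughly +0.40 for tech and negative values elsewhere is the numeric form of "supply ≠ demand".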
03 The load balancer · flip it

Watch the network rebalance

Each square is one of the 474 sites; color is how much it’s publishing. Toggle the selection logic to see placement spread off the red-hot favorites and into the dark long tail.

Placement simulator

Same matcher relevance gate either way — the only change is how candidates are ordered after it.

Simulator readout: 38 sites carrying 80% of posts · 249 dark sites with zero posts · hottest sites overloaded at ~30/day
Legend: dark (0) · light · healthy · busy · overloaded
04 The three-part fix

Placement, supply, throughput

Two causes meant the fix had to touch both systems — and only then could the ceiling rise without re-concentrating the load.

1 · Placement levers (DojoClaw)
  • Per-site weekly cap — any site over 25 posts/7d drops from the pool, pushing selection into the long tail (relaxes only if it would starve a fan-out).
  • Global LRU — order by network-wide recency, not just within-topic, so sites idle across the whole network float to the top.
  • Starvation floor — guaranteed by construction: the most-idle eligible site is always within the picks.
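The three levers above can be combined in one selection function. This is a minimal sketch under stated assumptions, not DojoClaw’s code: the `Site` fields, the `pick_sites` name, and the rule of relaxing the cap by admitting the least-loaded over-cap sites first are all hypothetical. Only the 25 posts per 7 days threshold and the ordering idea come from the note.

```python
from dataclasses import dataclass
from typing import List

WEEKLY_CAP = 25  # posts per 7 days, from the note above


@dataclass
class Site:
    name: str
    posts_last_7d: int
    last_post_ts: float  # time of the site's last post on any topic; 0.0 = never


def pick_sites(candidates: List[Site], max_sites: int) -> List[Site]:
    """Choose up to max_sites from candidates that already passed the relevance gate."""
    under_cap = [s for s in candidates if s.posts_last_7d <= WEEKLY_CAP]

    if len(under_cap) < max_sites:
        # Relax the cap only when enforcing it would leave the fan-out short,
        # admitting the least-loaded over-cap sites first (an assumption).
        over_cap = sorted(
            (s for s in candidates if s.posts_last_7d > WEEKLY_CAP),
            key=lambda s: s.posts_last_7d,
        )
        pool = under_cap + over_cap[: max_sites - len(under_cap)]
    else:
        pool = under_cap

    # Global LRU: oldest last-post time first, so idle sites float to the top.
    # Taking a prefix of this ordering always includes the most-idle eligible
    # site, which is what "starvation floor by construction" means here.
    pool.sort(key=lambda s: s.last_post_ts)
    return pool[:max_sites]


sites = [
    Site("big-tech-hub", posts_last_7d=210, last_post_ts=1000.0),
    Site("gadget-news", posts_last_7d=12, last_post_ts=900.0),
    Site("quiet-ai-blog", posts_last_7d=0, last_post_ts=0.0),
]
picks = [s.name for s in pick_sites(sites, max_sites=2)]
relaxed = [s.name for s in pick_sites(sites, max_sites=3)]
```

With two slots the overloaded hub is excluded and the never-used site goes first. With three slots the cap relaxes so the fan-out is not starved, but the hub still sorts last.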
2 · Supply rebalance (Stenvrik)
  • Audited existing feeds for liveness — removed ones returning HTTP 200 but zero items (broken RSS).
  • Added a verified batch across Home, Garden, Health, Food, Fashion, Auto, Science, Pets & more — every feed fetched live first, weighted to the most idle categories.
  • Flagged throttled feeds (big publishers exposing only 1–2 items) for replacement rather than burying the risk.
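The liveness check described above comes down to counting items instead of trusting the status code. The sketch below assumes the feed body has already been fetched and uses only the standard library; the labels `dead`, `throttled`, and `live`, and the helper names, are hypothetical.

```python
import xml.etree.ElementTree as ET

THROTTLED_MAX_ITEMS = 2  # "only 1-2 items", per the note above


def count_items(feed_xml: str) -> int:
    """Count RSS <item> or Atom <entry> elements; 0 if the XML does not parse."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError:
        return 0
    # Strip any XML namespace so Atom's "{ns}entry" matches as well.
    tags = [el.tag.rsplit("}", 1)[-1] for el in root.iter()]
    return sum(1 for tag in tags if tag in ("item", "entry"))


def classify_feed(status_code: int, body: str) -> str:
    """Label a fetched feed by what it actually delivers."""
    if status_code != 200:
        return "unreachable"
    n = count_items(body)
    if n == 0:
        return "dead"        # HTTP 200 but nothing to publish
    if n <= THROTTLED_MAX_ITEMS:
        return "throttled"   # flag for replacement
    return "live"


EMPTY = "<rss><channel><title>x</title></channel></rss>"
ONE = "<rss><channel><item><title>a</title></item></channel></rss>"
FULL = "<rss><channel>" + "<item><title>a</title></item>" * 5 + "</channel></rss>"
labels = [classify_feed(200, body) for body in (EMPTY, ONE, FULL)]
```

A feed that answers 200 with an empty channel looks healthy to an uptime check and contributes nothing to supply, which is why the item count is the signal that matters.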
3 · Throughput raise (Scheduler)
  • Fan-out width maxSites 5 → 7: the extra slots land on fresh sites because the per-site cap is now enforced.
  • Quota depth K 2 → 3 — every category’s daily cap scaled ×1.5.
  • Honest note: a documented ~950/day intent the code never delivered (units quirk) stays gated behind a sign-off.
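The throughput figures can be cross-checked with a little arithmetic. The sketch assumes the daily ceiling is a sum of per-category caps that each scale linearly with K, which the note implies but does not state; the ~188/day and ~280/day values are the figures from the scoreboard in this article, and the function names are hypothetical.

```python
def scaled_ceiling(old_ceiling: float, old_k: int, new_k: int) -> float:
    """Project a daily ceiling after changing quota depth K.

    Assumes every per-category cap scales linearly with K.
    """
    return old_ceiling * new_k / old_k


def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


projected = scaled_ceiling(188, old_k=2, new_k=3)  # K 2 -> 3 is a x1.5 scale
reported_gain = pct_change(188, 280)               # reported ~188 -> ~280 per day
```

A ×1.5 scale on ~188/day projects to 282/day, in line with the rounded ~280/day ceiling, and the reported change works out to about +49%.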
05 What it adds up to

The scoreboard — with an honest asterisk

The change is behavioral: it shapes future placement, it doesn’t retroactively rescue the month sites sat dark. The proof is in the next weeks of data — which is why the instrumentation is the real deliverable.

Metric         | Before           | After
Concentration  | 80% on 38 sites  | cap + LRU + floor
Dormant sites  | 249 (53%)        | shrinking ↓
Feed sources   | 245              | 271 verified
Daily ceiling  | ~188/day         | ~280/day · +49%
Fan-out width  | 5                | 7

Why two systems, not one

Supply and placement are genuinely separate concerns. Diagnosing the imbalance meant looking at both sides and seeing they disagreed. A clean boundary made a failure that spanned both legible — good system boundaries organize thought, not just code.

The tradeoff taken

Ordering by load & idleness sacrifices a little topical ranking for dramatically better coverage. All candidates already cleared the relevance gate — so it’s a deliberate trade, not a regression.

ThorstenMeyerAI.com
Stenvrik (news-intelligence) ↔ DojoClaw (content engine) · figures reflect the May 2026 engineering audit & the behavioral changes made in response · the network’s response is being tracked.

Implications of Self-Publishing on Network Diversity

This development matters because it exposes how automated content systems can inadvertently create distribution biases, leading to potential SEO issues and reduced content diversity. Overloading a few sites risks search engine penalties for spam-like behavior and diminishes the value of the entire network. It also highlights the importance of algorithmic adjustments to maintain a healthy, balanced content ecosystem, especially as automation scales.

System Design and Past Distribution Challenges

The network’s architecture relies on two decoupled systems: Stenvrik, which sources and evaluates content based on real-time signals, and DojoClaw, which manages content rewriting and placement. Historically, the network faced issues with uneven distribution, with a small number of sites dominating the output. Previous efforts focused on tweaking routing logic, but the recent concentration on favored sites reveals deeper systemic behavior that reinforces bias if left unaddressed. The 28-day audit was the first comprehensive analysis to uncover this pattern of concentration and neglect across the network.

"Without intervention, the network risks becoming a few dominant sites and many inactive ones, which undermines its purpose and search engine performance."

— Content network engineer

Extent and Future Impact of Self-Publishing Pattern

It remains unclear whether this self-publishing trend is a temporary adjustment or a persistent systemic shift. The long-term impact on search rankings, content diversity, and network health is still being evaluated. Additionally, how widespread similar behaviors are in other automated networks is unknown, and further monitoring is needed to confirm whether the recent algorithmic tweaks will stabilize distribution.

Planned Algorithmic Adjustments and Monitoring

The team plans to continue refining the distribution algorithms, including dynamic caps and recency-based site prioritization, to promote more equitable content spread. Ongoing audits and performance metrics will track whether these changes restore balance. Further development may involve more granular controls to prevent over-concentration and to ensure all sites receive relevant content over time.

Key Questions

Why is the network publishing mostly to a few sites?

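The audit found two causes. DojoClaw’s matcher kept surfacing the same broad tech sites, and rotation only shuffled sites already in the matched pool, so sites outside it never got a turn. At the same time, Stenvrik supplied mostly tech/AI content (53%) to a catalog in which only about 13% of sites cover that topic.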

Could this imbalance affect search engine rankings?

Yes, overloading a few sites with too much content may trigger spam filters or reduce overall content quality signals, harming SEO performance.

Are these issues specific to this network?

While specific to this system, similar distribution biases can occur in other automated content networks if algorithms are not carefully managed.

What measures are being taken to fix this problem?

Adjustments include a per-site cap of 25 posts per 7 days, network-wide recency ordering that favors idle sites, a starvation floor that keeps the most-idle eligible site among the picks, and newly verified feeds for under-served categories.

Source: ThorstenMeyerAI.com
