May 10, 2026 · 5 min read · javascript · automation · seo · ai

Shipping Distribution: Indexing, Social Fan-out, and AI Cascades

Building a geopolitical news engine requires more than just generation. This week I focused on automated distribution, social fan-out queues, and hardening the SEO pipeline for GeoPolitiq.


Building a content engine today is a battle against invisibility. You can have the most sophisticated LLM prompts and the cleanest UI, but if the crawlers don't see you and the social graphs don't carry you, you're just screaming into a void.

This week at GeoPolitiq, the focus shifted from "how do we generate this?" to "how do we make sure the world sees this?" I pushed 10 commits that overhauled our distribution layer, hardened our bot detection, and refined the AI generation pipeline to be more regionally aware.

Here is the breakdown of what went into the repository this week.

The Distribution Problem: Beyond the Sitemap

For a news-oriented platform, waiting for Google to crawl your sitemap is a losing strategy. News is perishable. If an article about a diplomatic shift in the Indo-Pacific isn't indexed within minutes, it loses half its value.

I implemented a dual-pronged indexing strategy: Google Indexing API and IndexNow.

Google Indexing API + IndexNow

Most people think the Google Indexing API is only for job postings or livestream events. In practice, it’s the only way to get near-instantaneous crawls. I hooked this into the publish lifecycle. The moment a piece of content clears the generation hurdle, a POST request hits the Google API.

But Google isn't the only player. IndexNow covers Bing, Yandex, and Seznam. By implementing both, we ensure that as soon as a URL is live, the major search engines are pinged.

// Assumes `axios` and an authenticated `googleClient` (googleapis) are
// configured elsewhere in the module.
async function notifySearchEngines(url) {
  // IndexNow: one ping covers Bing, Yandex, and Seznam.
  const indexNowPromise = axios.post('https://www.bing.com/indexnow', {
    host: 'geopolitiq.com',
    key: process.env.INDEXNOW_KEY,
    urlList: [url]
  });

  // Google Indexing API: flag the URL as new or updated.
  const googlePromise = googleClient.indexing.urlNotifications.publish({
    requestBody: {
      url: url,
      type: 'URL_UPDATED'
    }
  });

  // allSettled: one engine failing must not block the other.
  return Promise.allSettled([indexNowPromise, googlePromise]);
}

I also fixed a nagging issue with the sitemap-news.xml. It was returning empty under certain edge cases where the date filtering was too aggressive. Now, it correctly captures the last 48 hours of activity, which is critical for Google News inclusion.
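The 48-hour window can be expressed as a simple inclusive cutoff. This is a minimal sketch of the fix; the `posts` shape and function name are assumptions, not the actual sitemap code:

```javascript
// Google News sitemaps should only list articles from the last 48 hours.
const NEWS_WINDOW_MS = 48 * 60 * 60 * 1000;

function recentNewsPosts(posts, now = Date.now()) {
  // Inclusive cutoff: anything published within the window stays in,
  // so an article published exactly 48 hours ago is still captured.
  const cutoff = now - NEWS_WINDOW_MS;
  return posts.filter((p) => new Date(p.publishedAt).getTime() >= cutoff);
}
```

The original bug was the kind of off-by-one that an exclusive comparison invites: articles right at the boundary silently dropped out, leaving the sitemap empty during quiet periods.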

Social Fan-out: The Queue System

Automating social media isn't just about hitting an API; it's about managing rate limits and platform-specific quirks. This week I added a social fan-out queue.

Instead of making the publishing process wait for five different social APIs to respond, the system now pushes a job to a fan-out queue. This handles reposting to Twitter (X), LinkedIn, and others.
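The core of the idea is that publishing only enqueues; delivery happens later, per platform, with retries. A minimal sketch, assuming an array-backed queue and an illustrative platform list (the real system presumably uses a persistent store):

```javascript
// Hypothetical platform list; the production set may differ.
const PLATFORMS = ['twitter', 'linkedin', 'tumblr', 'nostr'];

// Publishing pushes one job per platform instead of awaiting each API.
// Each job carries an attempt counter so a worker can retry with backoff.
function enqueueFanout(queue, post) {
  for (const platform of PLATFORMS) {
    queue.push({ platform, url: post.url, title: post.title, attempts: 0 });
  }
  return queue.length;
}
```

One slow or rate-limited API then delays only its own job, never the publish itself.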

The Tumblr and Nostr Integration

I spent a significant chunk of time on maintenance scripts for the more "niche" but high-signal platforms.

  1. Tumblr OAuth Setup: Setting up the OAuth 1.0a flow for Tumblr is always more painful than it should be in a world of Bearer tokens, but it's done.
  2. Nostr (NIP-09): I added support for NIP-09 (Deletions). If we have to burn a generation because of a factual error or a prompt hallucination, we need to be able to broadcast that deletion across the Nostr relay network.
  3. Backfills: I wrote scripts to backfill our Google Indexing and social history for older posts that were published before this infrastructure was live.
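For the NIP-09 piece, a deletion request is just a kind-5 event whose `e` tags reference the event IDs to retract. A sketch of the unsigned event (NIP-01 `id` computation and signing are omitted, and the helper name is mine):

```javascript
// Build an unsigned NIP-09 deletion request. Relays that honor NIP-09
// will hide or delete the referenced events from the same pubkey.
function buildDeletionEvent(pubkey, eventIds, reason = '') {
  return {
    kind: 5,
    pubkey,
    created_at: Math.floor(Date.now() / 1000),
    tags: eventIds.map((id) => ['e', id]),
    content: reason
  };
}
```

Broadcasting this to the same relays that carried the original post is what lets us "un-publish" a hallucinated article network-wide.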

Refactoring AI Generation: Regional Sonar

The core of GeoPolitiq is AI-driven analysis. Previously, the generation was a bit too monolithic. I've refactored this into what I'm calling a Regional Sonar approach.

Instead of a generic "write about this news," the system now uses per-region configurations. The prompts for a story about Eastern Europe are fundamentally different from those about Sub-Saharan Africa. They look for different indicators—energy security in the former, infrastructure debt in the latter.
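The per-region configuration can be as simple as a lookup table feeding the prompt builder. The indicator lists below mirror the examples above; everything else is an assumed structure, not the production config:

```javascript
// Illustrative per-region prompt configuration.
const REGION_CONFIGS = {
  'eastern-europe': { indicators: ['energy security', 'alliance posture'] },
  'sub-saharan-africa': { indicators: ['infrastructure debt', 'resource flows'] }
};

function buildPrompt(region, headline) {
  const cfg = REGION_CONFIGS[region];
  if (!cfg) throw new Error(`No config for region: ${region}`);
  return `Analyze "${headline}" with attention to: ${cfg.indicators.join(', ')}.`;
}
```

Failing loudly on an unknown region is deliberate: a silently generic prompt is exactly the monolithic behavior this refactor removes.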

The Image Cascade

Visuals shouldn't be an afterthought. I implemented an image cascade system. If a primary image generation fails or returns a low-confidence result, the system falls back to a secondary prompt or a high-quality placeholder based on the region. This ensures we never ship a "broken" looking article.
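The cascade boils down to: try each generator in order, accept the first result above a confidence threshold, otherwise fall back to the regional placeholder. A sketch under assumed signatures (the generator interface, threshold, and placeholder path are all illustrative):

```javascript
// Walk the generator chain; a throw or low-confidence result just
// falls through to the next candidate.
async function imageCascade(generators, region, threshold = 0.6) {
  for (const generate of generators) {
    try {
      const result = await generate(region);
      if (result && result.confidence >= threshold) return result.url;
    } catch {
      // Failed generator: continue to the next fallback.
    }
  }
  // Last resort: a curated per-region placeholder, so nothing ships broken.
  return `/placeholders/${region}.jpg`;
}
```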

I also deepened the prompts. We aren't just looking for summaries anymore; we're looking for implications. The goal is to answer "Why does this matter to a strategist?" rather than "What happened?"

Infrastructure and Hardening

As the site gains visibility, the bots follow. Not the good bots (like Google), but the scrapers and the vulnerability scanners.

Bot Detection and Analytics

I implemented a more aggressive bot detection layer. It’s not just about blocking IPs; it’s about analyzing the Referer header and the behavior pattern. I added a rejection analytics dashboard in the admin panel so I can see exactly who is trying to scrape the site and why.

If a request comes in with a suspicious header or from a known scraper range, it gets hit with a 403, and that event is logged with the full context. This helps in fine-tuning the WAF rules without locking out legitimate users.
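Stripped of the WAF plumbing, the decision logic is a classifier over request headers that returns both a verdict and a loggable reason. A simplified sketch; the user-agent pattern list is illustrative, not the production ruleset:

```javascript
// Known-scraper user-agent fragments (illustrative).
const SCRAPER_UA = /python-requests|curl|scrapy|go-http-client/i;

// Returns an allow/deny verdict plus the reason, so every 403 is
// logged with the context that triggered it.
function classifyRequest(headers) {
  const ua = headers['user-agent'] || '';
  if (!ua) return { allow: false, reason: 'missing-user-agent' };
  if (SCRAPER_UA.test(ua)) return { allow: false, reason: 'known-scraper-ua' };
  return { allow: true, reason: 'ok' };
}
```

Keeping the reason string machine-readable is what makes the rejection analytics dashboard possible: you aggregate on `reason`, not on raw log lines.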

Admin Upgrades and Manual Triggers

I fixed a bug in the manual-trigger hook. Sometimes you just need to re-run a generation or re-push a social post manually from the dashboard. The hook was failing due to a mismatch in environment-driven configs. That’s resolved, and the admin UI now has better feedback for these long-running tasks.

SEO and Feeds

I added /feed.xml and /llms.txt aliases. The latter is becoming increasingly important for the "AI-friendly" web. It provides a markdown-optimized version of the site's content for other LLMs to consume efficiently.

I also started UTM-tagging the RSS feeds. This seems small, but it’s the only way to differentiate between a user reading in Feedly vs. someone clicking a link on a social platform. Data is only useful if it’s clean.
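The tagging itself is a one-liner with the WHATWG URL API; the specific `utm_*` values are assumptions about the naming scheme:

```javascript
// Rewrite an article URL for inclusion in the RSS feed, so clicks
// from feed readers are attributable separately from social traffic.
function tagForRss(rawUrl) {
  const url = new URL(rawUrl);
  url.searchParams.set('utm_source', 'rss');
  url.searchParams.set('utm_medium', 'feed');
  return url.toString();
}
```

Using `searchParams.set` rather than string concatenation also makes the operation idempotent: re-tagging an already-tagged URL doesn't duplicate parameters.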

The Monetization Layer: Adsterra

Let’s talk about the elephant in the room: sustainability. I’ve integrated Adsterra ads into the layout. While I’m not a fan of cluttered, ad-heavy pages, the compute costs for running high-end LLMs and image generators aren't zero. The integration is handled via the admin panel, allowing me to toggle placements or swap providers if the UX takes too much of a hit.

What’s Next?

The foundation is now solid: near-instant indexing, a social fan-out queue, regionally aware generation, and a hardened edge.

The next step is to refine the "Scored Related Posts" algorithm. Currently, it’s a basic keyword match. I want to move toward a vector-based similarity search so that if you’re reading about lithium mining in Chile, the system suggests articles on EV supply chains in China, even if the keywords don't perfectly overlap.
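The scoring half of that is straightforward once embeddings exist: cosine similarity over precomputed vectors, top-K by score. A sketch of the direction (where the embeddings come from is out of scope here, and the function names are mine):

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank candidate posts by similarity to the target and keep the top K.
function relatedPosts(target, candidates, topK = 3) {
  return candidates
    .map((c) => ({ ...c, score: cosineSimilarity(target.vector, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A brute-force scan like this is fine at blog scale; an approximate nearest-neighbor index only becomes worth it at tens of thousands of posts.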

GeoPolitiq is becoming more than a project; it's becoming an autonomous organism. 10 commits this week, 0 PRs (because I'm moving fast and breaking things in my own repo), and 9 new issues to tackle.

Onward.


