Shipping Distribution: Indexing, Social Fan-out, and AI Cascades
Building a content engine in 2024 is a battle against invisibility. You can have the most sophisticated LLM prompts and the cleanest UI, but if the crawlers don't see you and the social graphs don't carry you, you're just screaming into a void.
This week at GeoPolitiq, the focus shifted from "how do we generate this?" to "how do we make sure the world sees this?" I pushed 10 commits that overhauled our distribution layer, hardened our bot detection, and refined the AI generation pipeline to be more regionally aware.
Here is the breakdown of what went into the repository this week.
The Distribution Problem: Beyond the Sitemap
For a news-oriented platform, waiting for Google to crawl your sitemap is a losing strategy. News is perishable. If an article about a diplomatic shift in the Indo-Pacific isn't indexed within minutes, it loses half its value.
I implemented a dual-pronged indexing strategy: Google Indexing API and IndexNow.
Google Indexing API + IndexNow
Most people think the Google Indexing API is only for job postings or livestream events, which is all Google officially supports it for. In practice, it's the fastest way I've found to get near-instantaneous crawls. I hooked this into the publish lifecycle: the moment a piece of content clears the generation hurdle, a POST request hits the Google API.
But Google isn't the only player. IndexNow covers Bing, Yandex, and Seznam. By implementing both, we ensure that as soon as a URL is live, the major search engines are pinged.
```js
// `googleClient` is assumed to be an authenticated Indexing API client,
// e.g. google.indexing({ version: 'v3', auth }) from the googleapis package.
const axios = require('axios');

async function notifySearchEngines(url) {
  // IndexNow: a single ping covers Bing, Yandex, Seznam, and friends.
  const indexNowPromise = axios.post('https://www.bing.com/indexnow', {
    host: 'geopolitiq.com',
    key: process.env.INDEXNOW_KEY,
    urlList: [url]
  });

  // Google Indexing API: announce that the URL was added or updated.
  const googlePromise = googleClient.urlNotifications.publish({
    requestBody: {
      url: url,
      type: 'URL_UPDATED'
    }
  });

  return Promise.allSettled([indexNowPromise, googlePromise]);
}
```
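Promise.allSettled is deliberate here: if the IndexNow endpoint times out, the Google notification still lands, and the publish flow never blocks on a flaky third party.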
I also fixed a nagging issue with sitemap-news.xml. It was returning an empty sitemap in certain edge cases where the date filtering was too aggressive. Now it correctly captures the last 48 hours of activity, which is critical for Google News inclusion.
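For the curious, the fix boils down to a rolling 48-hour window rather than a calendar-day cutoff. A minimal sketch (the helper and field names are illustrative, not the actual schema):

```js
// Google News sitemaps should only carry articles from the last 48 hours.
const WINDOW_MS = 48 * 60 * 60 * 1000;

function recentPostsForNewsSitemap(allPosts, now = Date.now()) {
  // Compare timestamps directly instead of truncating to calendar days,
  // which is the kind of over-aggressive filtering that emptied the sitemap.
  return allPosts.filter(
    post => now - new Date(post.publishedAt).getTime() <= WINDOW_MS
  );
}
```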
Social Fan-out: The Queue System
Automating social media isn't just about hitting an API; it's about managing rate limits and platform-specific quirks. This week I added a social fan-out queue.
Instead of making the publishing process wait for five different social APIs to respond, the system now pushes a job to a fan-out queue. This handles reposting to Twitter (X), LinkedIn, and others.
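The plumbing looks roughly like this; I'm sketching it with BullMQ on Redis, and the queue name, payload shape, and retry policy are illustrative rather than gospel:

```js
const { Queue, Worker } = require('bullmq');

const connection = { host: '127.0.0.1', port: 6379 };
const socialQueue = new Queue('social-fanout', { connection });

// Publish path: enqueue and return immediately. No social API on the hot path.
async function enqueueFanout(post) {
  await socialQueue.add('repost', {
    url: post.url,
    title: post.title,
    platforms: ['twitter', 'linkedin', 'tumblr', 'nostr']
  }, {
    attempts: 5,
    backoff: { type: 'exponential', delay: 30000 } // back off to respect rate limits
  });
}

// Worker side: each platform gets its own adapter for its own quirks.
new Worker('social-fanout', async (job) => {
  for (const platform of job.data.platforms) {
    await postToPlatform(platform, job.data); // hypothetical per-platform adapter
  }
}, { connection });
```

Decoupling this from the publish request means a LinkedIn outage delays LinkedIn, not the article.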
The Tumblr and Nostr Integration
I spent a significant chunk of time on maintenance scripts for the more "niche" but high-signal platforms.
- Tumblr OAuth Setup: Setting up the OAuth 1.0a flow for Tumblr is always more painful than it should be in a world of Bearer tokens, but it's done.
- Nostr (NIP-09): I added support for NIP-09 (Deletions). If we have to burn a generation because of a factual error or a prompt hallucination, we need to be able to broadcast that deletion across the Nostr relay network (sketched after this list).
- Backfills: I wrote scripts to backfill our Google Indexing and social history for older posts that were published before this infrastructure was live.
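For the NIP-09 piece, a deletion is just another signed event: kind 5, with `e` tags pointing at the event ids to retract. A minimal sketch using nostr-tools (key management and the relay pool are elided):

```js
const { finalizeEvent } = require('nostr-tools/pure');

// NIP-09: a kind-5 event asks relays to delete the referenced events.
function buildDeletionEvent(eventIdsToDelete, reason, secretKey) {
  return finalizeEvent({
    kind: 5,
    created_at: Math.floor(Date.now() / 1000),
    tags: eventIdsToDelete.map(id => ['e', id]),
    content: reason // e.g. 'retracted: factual error in generation'
  }, secretKey);
}
```

Relays aren't obligated to honor kind-5 requests, but well-behaved ones drop the referenced events, which is the best retraction story Nostr offers.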
Refactoring AI Generation: Regional Sonar
The core of GeoPolitiq is AI-driven analysis. Previously, the generation was a bit too monolithic. I've refactored this into what I'm calling a Regional Sonar approach.
Instead of a generic "write about this news," the system now uses per-region configurations. The prompts for a story about Eastern Europe are fundamentally different from those about Sub-Saharan Africa. They look for different indicators—energy security in the former, infrastructure debt in the latter.
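Concretely, "per-region configuration" means something like the following; the region keys, indicator lists, and framing strings here are illustrative stand-ins for the production config:

```js
// Illustrative per-region prompt configuration.
const REGION_CONFIGS = {
  'eastern-europe': {
    indicators: ['energy security', 'alliance posture', 'sanctions exposure'],
    framing: 'Weigh second-order effects on European energy and defense policy.'
  },
  'sub-saharan-africa': {
    indicators: ['infrastructure debt', 'commodity dependence', 'external lending'],
    framing: 'Weigh debt sustainability and great-power financing dynamics.'
  }
};

function buildAnalysisPrompt(regionKey, story) {
  const cfg = REGION_CONFIGS[regionKey];
  return [
    `You are a geopolitical analyst covering ${regionKey}.`,
    `Prioritize these indicators: ${cfg.indicators.join(', ')}.`,
    cfg.framing,
    '',
    `Story: ${story.headline}`,
    story.summary
  ].join('\n');
}
```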
The Image Cascade
Visuals shouldn't be an afterthought. I implemented an image cascade system. If a primary image generation fails or returns a low-confidence result, the system falls back to a secondary prompt or a high-quality placeholder based on the region. This ensures we never ship a "broken" looking article.
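In code, the cascade is a short fallback chain, roughly like this (the confidence threshold and helper names are placeholders):

```js
// Try the primary prompt, then a simplified secondary prompt, then a
// curated regional placeholder. Never ship a broken-looking article.
async function resolveArticleImage(article) {
  const attempts = [
    () => generateImage(primaryImagePrompt(article)),
    () => generateImage(secondaryImagePrompt(article))
  ];
  for (const attempt of attempts) {
    try {
      const result = await attempt();
      if (result && result.confidence >= 0.7) return result.url; // threshold is illustrative
    } catch (err) {
      console.warn('image attempt failed, falling through:', err.message);
    }
  }
  return regionalPlaceholder(article.region); // high-quality stock fallback
}
```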
I also deepened the prompts. We aren't just looking for summaries anymore; we're looking for implications. The goal is to answer "Why does this matter to a strategist?" rather than "What happened?"
Infrastructure and Hardening
As the site gains visibility, the bots follow. Not the good bots (like Google), but the scrapers and the vulnerability scanners.
Bot Detection and Analytics
I implemented a more aggressive bot detection layer. It’s not just about blocking IPs; it’s about analyzing the Referer header and the behavior pattern behind each request. I added a rejection analytics dashboard to the admin panel so I can see exactly who is trying to scrape the site and what they’re after.
If a request comes in with a suspicious header or from a known scraper range, it gets hit with a 403, and that event is logged with the full context. This helps in fine-tuning the WAF rules without locking out legitimate users.
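In middleware terms, it's a filter that runs before anything else touches the request. A simplified sketch for an Express-style stack (the patterns and the `logRejection` sink are stand-ins for the real rules):

```js
// Stand-in patterns; the real list is tuned from the rejection dashboard.
const BLOCKED_UA = [/curl\//i, /python-requests/i, /scrapy/i];

function botFilter(req, res, next) {
  const ua = req.get('user-agent') || '';
  const referer = req.get('referer') || '';

  if (BLOCKED_UA.some(pattern => pattern.test(ua))) {
    // Log the full context so WAF rules can be tuned without guesswork.
    logRejection({ ip: req.ip, ua, referer, path: req.path, reason: 'user-agent' });
    return res.status(403).send('Forbidden');
  }
  next();
}

app.use(botFilter);
```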
Admin Upgrades and Manual Triggers
I fixed a bug in the manual-trigger hook. Sometimes you just need to re-run a generation or re-push a social post manually from the dashboard. The hook was failing due to a mismatch in environment-driven configs. That’s resolved, and the admin UI now has better feedback for these long-running tasks.
SEO and Feeds
I added /feed.xml and /llms.txt aliases. The latter is becoming increasingly important for the "AI-friendly" web. It provides a markdown-optimized version of the site's content for other LLMs to consume efficiently.
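The aliases themselves are thin routes over existing handlers; in an Express-style router that could look like this (the canonical paths and the `renderLlmsTxt` helper are assumptions):

```js
// Friendly aliases over the canonical endpoints.
app.get('/feed.xml', (req, res) => res.redirect(301, '/rss.xml')); // assumes /rss.xml is canonical
app.get('/llms.txt', (req, res) => {
  res.type('text/plain');
  res.send(renderLlmsTxt()); // hypothetical: markdown-optimized content index
});
```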
I also started UTM-tagging the RSS feeds. This seems small, but it’s the only way to differentiate between a user reading in Feedly vs. someone clicking a link on a social platform. Data is only useful if it’s clean.
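The tagging happens at feed-render time; the idea fits in one function (the parameter values are illustrative):

```js
// Append UTM parameters to every article link as the RSS feed is rendered,
// so feed-reader traffic is distinguishable from social clicks in analytics.
function tagFeedUrl(articleUrl) {
  const url = new URL(articleUrl);
  url.searchParams.set('utm_source', 'rss');
  url.searchParams.set('utm_medium', 'feed');
  url.searchParams.set('utm_campaign', 'geopolitiq');
  return url.toString();
}
```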
The Monetization Layer: Adsterra
Let’s talk about the elephant in the room: sustainability. I’ve integrated Adsterra ads into the layout. While I’m not a fan of ad-cluttered pages, the compute costs of running high-end LLMs and image generators aren't zero. The integration is handled via the admin panel, allowing me to toggle placements or swap providers if the UX takes too much of a hit.
What’s Next?
The foundation is now solid. We have:
- Instant Indexing (Google/Bing)
- Social Distribution (Twitter, Tumblr, Nostr, LinkedIn)
- Regional AI Logic (Sonar-based prompts)
- Bot Protection
The next step is to refine the "Scored Related Posts" algorithm. Currently, it’s a basic keyword match. I want to move toward a vector-based similarity search so that if you’re reading about lithium mining in Chile, the system suggests articles on EV supply chains in China, even if the keywords don't perfectly overlap.
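The scoring core of that is small; what's still undecided is where the embeddings come from and where they live. A sketch of the ranking half (assuming each candidate post already carries an `embedding` array):

```js
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidates against the current article's embedding.
function relatedPosts(currentEmbedding, candidates, limit = 5) {
  return candidates
    .map(post => ({ post, score: cosineSimilarity(currentEmbedding, post.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map(entry => entry.post);
}
```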
GeoPolitiq is becoming more than a project; it's becoming an autonomous organism. 10 commits this week, 0 PRs (because I'm moving fast and breaking things in my own repo), and 9 new issues to tackle.
Onward.