AI SEO

Inside SellThru's AI SEO Infrastructure — Why We Built It, What It Does, and Where It's Going

Most agencies talk about AI. We decided to build with it — from the data layer up. This is a transparent look at the infrastructure powering our SEO practice in 2026 and why it matters for clients in the GCC.

At some point in late 2025, we made a decision that most agencies our size don’t make: to stop waiting for off-the-shelf tools to catch up with what the market actually needs, and to start building our own. Not because we enjoy engineering for its own sake, but because the gap between what standard SEO tooling offers and what clients in the UAE and Saudi Arabia actually need had become too wide to ignore.

This article is a transparent breakdown of what we’ve built, the reasoning behind each decision, and where the infrastructure is heading. If you’ve read our piece on how we ranked for “Best SEO LLM Agency Dubai” — this is the engine behind that result.

Why We Built It

Four problems kept appearing in our client work that standard tools couldn’t solve well enough.

01

No LLM visibility data for GCC markets

Every passive LLM monitoring tool returns empty for UAE and KSA — by design. The only valid methodology is active querying, which no off-the-shelf product was doing for Arabic-language GCC queries.

02

Too much analyst time on repeatable tasks

Keyword search volume pulls, traffic comparisons, keyword distribution analysis, technical health checks — these tasks are predictable and repeatable. We wanted to automate them to give the team its time back for thinking, not data assembly.

03

Audit quality depends on who runs it

When methodology lives in an analyst’s head rather than a system, outputs vary. Consistency at scale requires infrastructure, not guidelines.

04

Reporting is too slow to be strategic

When data retrieval and assembly consumes most of an analyst’s time, there’s little left for interpretation. The senior SEO brain should be on insight, not spreadsheet mechanics.

“The gap between what standard SEO tooling offers and what GCC clients actually need had become too wide to paper over with process.”

What We Built — The Full Stack

The infrastructure has three layers: a data layer that pulls live signals from multiple sources, an integration layer that routes and structures those signals, and an AI analysis layer powered by Claude that turns structured data into consistent, calibrated deliverables. Here’s how they fit together.

Diagram 01 · The Full AI SEO Stack — Three Layers

Layer 1 (Data): DataForSEO (keywords · backlinks · SERP) · Google Search Console (impressions · CTR · position) · Google Analytics 4 (organic traffic · AI referral) · ChatGPT API (active LLM query, UAE + KSA) · Gemini API (market-context responses)
Layer 2 (Integration): SellThru MCP server at mcp.sellthru.me. Live orchestration, auth and routing. Hosted on a VPS, managed by SellThru.
Layer 3 (AI + Output): Claude analysis engine (Anthropic API, claude-sonnet) driven by the SEO Prompt Generator v10, producing the SEO Audit, Weekly Report, Monthly Report and GEO/LLM Audit.
LLM SEO tracking engine for GCC · sellthruagency.com · Built 2025–2026
Three-layer architecture: Data sources feed a hosted MCP server, which routes live signals into Claude for structured analysis — producing four repeatable audit outputs for clients across UAE and KSA.

Layer 1 — The Data Layer

Five live data connections power every audit we run. Three are standard to any serious SEO practice — DataForSEO for keyword and SERP data, Google Search Console for impression and ranking data, and GA4 for organic traffic and AI referral attribution. Two are specific to our LLM SEO work and are the reason we had to build rather than buy.

The ChatGPT API and Gemini API connections enable active querying — we submit market-specific prompts and analyse the actual responses returned. This is the only methodology that works for MENA. Every passive LLM monitoring tool returns empty data for GCC markets. If an agency is telling you they’re tracking your LLM visibility in UAE or KSA using a SaaS dashboard, ask them where that data is coming from.
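As a rough illustration of what active querying involves (the prompts, topic and brand names here are hypothetical, not our production query set, and the live API call is left out so only the logic is shown), the core loop is: build market-specific prompts per language, submit them, and scan the actual returned text for brand presence:

```python
from dataclasses import dataclass

# Illustrative sketch only. In production the response_text would come back
# from a live ChatGPT or Gemini API call; here it is passed in directly.

@dataclass
class QueryResult:
    market: str          # "UAE" or "KSA"
    prompt: str
    response_text: str   # the actual text the LLM returned

def build_prompts(topic: str) -> list[dict]:
    """Build one English prompt for UAE and one Arabic prompt for KSA."""
    return [
        {"market": "UAE", "lang": "en",
         "prompt": f"What is the best {topic} in Dubai?"},
        {"market": "KSA", "lang": "ar",
         "prompt": f"ما هي أفضل {topic} في الرياض؟"},
    ]

def brand_mentioned(result: QueryResult, brand: str) -> bool:
    """Check whether the brand appears in the model's actual response."""
    return brand.lower() in result.response_text.lower()
```

The point of the pattern is that visibility is measured from real model responses, not from a third-party index that has no GCC coverage.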

Diagram 02 · Data Layer — What Each Source Contributes

DataForSEO (API, double JSON parse): keyword volumes, difficulty, SERP features. Calibrated before output.
GSC (date dimension, not query aggregation): impressions, CTR, average position. WoW and MoM delta tracking.
GA4 (Organic Search channel group): organic sessions, AI referral traffic. ChatGPT / Perplexity attribution.
ChatGPT + Gemini (active query, not passive tracking): brand mentions, citation sources, share of voice. UAE (EN) and KSA (AR) queried separately.
All sources route through the MCP server to Claude: structured, calibrated, consistent.
Key implementation detail: GSC data uses the date dimension (not query aggregation) for accurate totals. GA4 organic traffic uses sessionDefaultChannelGroup = "Organic Search". AI referral traffic uses a sessionSourceMedium CONTAINS filter to catch ChatGPT and Perplexity sessions.
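For readers who work with the GA4 Data API, the two filters just described can be sketched as plain request fragments. The field names (sessionDefaultChannelGroup, sessionSourceMedium) and match types (EXACT, CONTAINS) follow the GA4 Data API; the helper function is an illustrative mirror of the CONTAINS check, not our production code, and the source list is an assumption:

```python
# Fragment of a GA4 Data API runReport dimensionFilter that isolates the
# "Organic Search" default channel group (exact match).
ORGANIC_FILTER = {
    "filter": {
        "fieldName": "sessionDefaultChannelGroup",
        "stringFilter": {"matchType": "EXACT", "value": "Organic Search"},
    }
}

# Substrings to look for in sessionSourceMedium when attributing AI referral
# traffic. Illustrative list; extend as new AI referrers appear.
AI_SOURCES = ["chatgpt", "perplexity"]

def is_ai_referral(session_source_medium: str) -> bool:
    """Mirror of the sessionSourceMedium CONTAINS check for AI referrals."""
    s = session_source_medium.lower()
    return any(src in s for src in AI_SOURCES)
```

The EXACT match avoids counting "Paid Search" or "Organic Social" sessions, while the CONTAINS check catches variants like "chatgpt.com / referral".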

Layer 2 — The MCP Integration Layer

The MCP (Model Context Protocol) server is the connective tissue. It’s a server we host internally at mcp.sellthru.me, managed by our AI developer, that allows Claude to query live data sources directly during analysis — rather than working from static CSV exports. This matters because audit outputs should reflect current conditions, not last week’s data pull.

MCP is an Anthropic-developed standard that gives AI models structured access to external tools and data sources. By hosting our own server, we control what data Claude can access, how it’s formatted before it reaches the model, and how outputs are structured. It also means we can extend the system — adding new data sources or report types — without rebuilding from scratch.
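Conceptually, the server's three jobs (authenticate, route, format) reduce to a small dispatch pattern. The sketch below is a plain-Python illustration of that pattern, not the MCP SDK and not our actual implementation; tool names, arguments and the auth check are all invented for the example:

```python
# Minimal illustration of tool registration and routing, the pattern an MCP
# server applies when a model issues a tool call.

TOOLS = {}

def tool(name):
    """Decorator that registers a function as a callable tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("gsc_performance")
def gsc_performance(site: str, dimension: str = "date") -> dict:
    # In production this would call the Search Console API; stubbed here.
    return {"source": "gsc", "site": site, "dimension": dimension, "rows": []}

def handle_tool_call(name: str, args: dict, api_key: str) -> dict:
    """Authenticate, route to the named tool, return a structured response."""
    if api_key != "expected-key":              # auth gate
        return {"error": "unauthorised"}
    if name not in TOOLS:                      # routing
        return {"error": f"unknown tool: {name}"}
    return {"tool": name, "result": TOOLS[name](**args)}
```

Because every data source sits behind a registered tool, adding a new source means registering one more function, which is the extensibility point mentioned above.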

Diagram 03 · MCP Layer — How Claude Accesses Live Data

① An analyst triggers a prompt. ② The MCP server (mcp.sellthru.me: auth, routing, formatting; hosted on a VPS, managed internally) makes live API calls to the standard SEO sources (DataForSEO, GSC, GA4) and the ChatGPT + Gemini APIs. ③ The structured response returns into Claude's context.
Claude never works from stale exports. Every prompt triggers live API calls through the MCP server, returning fresh data structured before reaching the model — making outputs consistent regardless of which analyst triggers the prompt.

Layer 3 — The AI Analysis Layer

Claude processes all data through a structured prompt library we’ve built and refined across dozens of real client audits. The reason we use Claude specifically — over other models — comes down to instruction fidelity. SEO audit methodology is precise: the right metric, from the right source, calibrated correctly, interpreted in the right market context. Claude follows complex, multi-part instructions without drifting from them.

The prompt library currently covers four output types — each a distinct document with its own data requirements, structure, and market calibration rules. The architecture behind this library is what we’re covering in the next piece.

Diagram 04 · Four Output Types — What the System Produces

Prompt Library v10 → Claude → four structured output types:
Output 01 · New Client SEO Audit: full site, Word doc, GCC calibrated
Output 02 · Weekly SEO Report: GSC + GA4, WoW delta
Output 03 · Monthly SEO Report: full performance, MoM delta
Output 04 · GEO/LLM Audit: ChatGPT + Gemini, active MENA querying
All four outputs share the same data pipeline. The GEO/LLM Audit (Output 04) is the most differentiated — it uses active querying through LLM APIs, making it the only output type that requires the full five-source data layer.

What This Means for Our Agency Positioning

This infrastructure is not just an operational efficiency play. It’s a deliberate positioning bet. The SEO LLM agency market in Dubai is still early. The agencies that establish measurable, repeatable LLM visibility capability now — while the competitive set is small — will be significantly harder to displace when the market matures.

We’ve already seen this play out on our own brand. The infrastructure we built to serve clients is the same system we used to rank SellThru for “best SEO LLM agency Dubai” — appearing in Google’s top three and being cited by name in Gemini’s AI Overview. The methodology is proven on our own domain before it’s applied to a client’s.

For clients, this means their LLM visibility audit is not a one-off project produced by an analyst manually querying ChatGPT. It’s a structured output from a system that runs repeatedly, producing comparable results over time, with consistent calibration across UAE and KSA markets.

“We used the same infrastructure we built for clients to rank ourselves. Proof of methodology, not just proof of concept.”

What’s Next — The SEO Prompt Generator

The infrastructure described above is the foundation. What sits on top of it is the SEO Prompt Generator — a structured prompt library, currently at version 10, that defines exactly how Claude analyses data for each of the four output types. It specifies which metrics to pull, how to interpret them, what calibration to apply, and how to structure the output.

The prompt generator is what turns a capable AI model into a consistent SEO analyst. Without it, Claude produces variable outputs depending on how questions are phrased. With it, every audit follows the same methodology — regardless of who triggers it.
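A simplified sketch of why that holds (the section names, metrics and calibration rule below are invented for illustration, not the real library): because the template for each output type is fixed, two analysts requesting the same report assemble identical instructions, so only the underlying data varies between runs:

```python
# Illustrative prompt-template registry. Each output type pins its metrics,
# calibration rule and required structure, so assembly is deterministic.

TEMPLATES = {
    "weekly_report": {
        "metrics": ["impressions", "ctr", "avg_position"],
        "calibration": "Compare week over week; flag deltas over 10%.",
        "structure": ["Summary", "Wins", "Risks", "Next Actions"],
    },
}

def build_prompt(output_type: str, market: str, data_summary: str) -> str:
    """Assemble the full instruction block for one report run."""
    t = TEMPLATES[output_type]
    return (
        f"Output type: {output_type}\n"
        f"Market: {market}\n"
        f"Metrics to analyse: {', '.join(t['metrics'])}\n"
        f"Calibration rule: {t['calibration']}\n"
        f"Required sections: {', '.join(t['structure'])}\n"
        f"Data:\n{data_summary}"
    )
```

The determinism is the point: phrasing is removed from the analyst's hands, so output variance can only come from the data, not from how the question was asked.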

Coming Next

Inside the SEO Prompt Generator — Architecture, Methodology and Why It Changes How Agencies Work

A detailed breakdown of how we structured a prompt library across four report types, the design decisions that make outputs consistent at scale, and why instruction architecture matters more than model selection.


See What an LLM Visibility Audit Looks Like

Active querying methodology. Dual-market UAE + KSA coverage. Consistent, calibrated outputs every time.

View the LLM Audit Service