
Building SEO products in-house: a product manager's guide to internal linking algorithms, programmatic SEO & technical audit apps

TL;DR

Most SEO teams consume tools. The teams that compound an advantage build their own. This guide walks through the discipline of building SEO products in-house as a Product Manager (PM), with internal linking algorithms as the lead example and programmatic SEO platforms and technical audit apps as supporting playbooks. The principles are the same across all three: scope tight, ship in cohorts, A/B test the win, measure in revenue. The lead case study (an internal linking algorithm shipped by three engineers in under four months) delivered a $1.8 million annualized gross profit lift across more than a billion pages.

Most SEO teams consume tools. The teams that compound an advantage build them. That is the line that separates an SEO function plugged into a stack of vendors from one that ships software the rest of the org can feel in revenue.

This guide is for Product Managers (PMs) who own search inside a product organization, and for SEO specialists transitioning into PM roles. The angle is narrow: how to build SEO products in-house, treating internal linking algorithms, programmatic SEO platforms and technical SEO audit apps as first-class software you scope, ship and govern. The benchmark is not a clever blog post. It is a Product Requirements Document (PRD) good enough that engineering can build from it without three more meetings.

I will use one project as the lead example throughout: an internal linking algorithm we shipped at Expedia Group with three engineers, in under four months, that delivered a $1.8 million annualized gross profit (GP) lift in an A/B test. The supporting examples (a programmatic SEO platform launched in cohorts, a lightweight technical audit app) come from adjacent work in the same organization. The principles are the same across all three. The PM craft of scoping, shipping and measuring is what makes the difference.

What is SEO product engineering? Unpacking the discipline of building search products in-house

SEO product engineering is the practice of treating search-relevant software as a product, with a roadmap, owner, PRD, sprint plan and success metrics. The output is not a campaign or a deck. It is code that runs in production and changes how search engines see your site.

The shift is conceptual before it is technical. Eli Schwartz framed it in Product-Led SEO when he argued that the SEO strategy is the product itself, not a marketing layer applied to it. The follow-on insight is that some of the highest-ROI work an SEO PM can do is to build a product whose only job is to drive organic outcomes. That is what an internal linking algorithm or a programmatic SEO platform actually is.

The framing has a pedigree outside SEO too. In his Y Combinator talk on how software is changing again, Andrej Karpathy describes an agentic turn in software: programs increasingly get generated by giving an intelligent agent a high-level intent rather than line-by-line instructions. Adapt that idea to SEO and you get a useful contrast: a content brief is line-by-line instructions; a programmatic SEO platform is a generative system the PM specifies once and then governs. The latter scales; the former does not.

From SEO checklists to SEO products: a paradigm shift

The traditional SEO operating model is a checklist of tasks shuttled into Jira: fix canonicals, add schema, write meta descriptions, ship a redirect map. Useful, but it does not compound. Every quarter, the same backlog reappears with a fresh paint job.

A product mindset replaces the checklist with three artifacts: an inventory of SEO-influencing systems (the templates, modules, pipelines and algorithms that affect search), a roadmap that prioritizes interventions by revenue impact, and a PRD per intervention that engineering can build from. The work shifts from doing SEO tasks to building the systems that do SEO tasks for you.

That is when an internal linking algorithm becomes a roadmap item. Or a programmatic SEO platform. Or a custom audit app. They stop being side projects and become products with PMs, OKRs and post-launch governance.

Why building SEO products beats buying tools alone: benefits for teams & businesses

The business case for in-house SEO products has three pillars: velocity, fit, and unit economics. None of them argue against buying tools. They argue against relying on tools alone for everything.

Buy when the use case is generic. Crawling, log analysis, rank tracking, on-page auditing: those are well-served by Botify, Lumar, Screaming Frog, Sitebulb and Oncrawl. Build when the use case is specific to your business model. A travel site needs a destination-graph linking algorithm that knows about origin-destination corridors, not a generic “related articles” widget. A SaaS marketplace needs a programmatic SEO platform that pulls from its proprietary integrations database. Off-the-shelf cannot follow you there.

The cost case is straightforward. Enterprise SEO platforms run six figures annually at scale. Two engineer-years of work amortized across the same period often produces a sharper, narrower product that ships features faster than vendor roadmaps. The CFO test is the three-year Total Cost of Ownership (TCO) comparison. If the build cost is lower and the strategic value is higher, you build.
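
As an illustration of that test (all figures are placeholders; substitute your own license quote and loaded engineering cost), the comparison is a few lines of arithmetic:

```python
# Three-year TCO comparison: vendor license vs. in-house build.
# Every figure here is an illustrative placeholder, not a benchmark.
vendor_annual_license = 180_000   # assumed enterprise platform fee
engineer_year_cost = 200_000      # assumed fully loaded engineer cost

vendor_tco_3y = 3 * vendor_annual_license               # 540,000

build_upfront = 2 * engineer_year_cost                  # two engineer-years to ship
build_maintenance = 0.25 * engineer_year_cost           # assumed yearly upkeep
build_tco_3y = build_upfront + 2 * build_maintenance    # 500,000

print(f"Vendor 3y TCO: ${vendor_tco_3y:,}  Build 3y TCO: ${build_tco_3y:,}")
```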

Speed matters too. BrightEdge channel-share research shows organic search drives 53% of trackable website traffic. That figure is dated, and the true share today is probably closer to 30%, but organic search remains the second-largest marketing channel after direct. A channel this big should not be bottlenecked behind a vendor’s quarterly release cycle.

Boosting velocity and efficiency across the SEO SDLC

The Software Development Life Cycle (SDLC) for SEO products looks the same as for any other product, with one quirk: the user is partly Googlebot, and Googlebot does not file feedback tickets. So you instrument the system instead. Logs, indexation reports, ranking diffs, crawl coverage. They are your user research panel.

When you build the right tooling, the team’s clock speed jumps. In one initiative I led, the SEO team relied on entirely manual workflows for everything from business cases to alt-text optimization to report scheduling. We mapped the bottlenecks, built lightweight automations across Slack, JIRA and Google Colab, and prototyped a GPT-4 Vision-based alt-tag generator. The team got back roughly 100 weekly hours, or about $260,000 in annualized savings, and started running experiments instead of writing strategy decks. The point is not the absolute number. It is that automating the SEO ops layer is itself a product, and PMs who treat it as one ship faster than peers who do not.

There is a developer-productivity analogue worth borrowing. GitHub’s research on Copilot users found that developers using AI assistance completed coding tasks 55% faster than those who did not. The number does not transfer directly to SEO, but the underlying mechanic does: when the slow, repeatable parts of the work are automated, humans get to spend their time on the parts that actually require judgement.

Empowering lean enterprise SEO teams

In-house SEO products let small teams operate above their weight class. Three engineers and a PM with a clear PRD can ship something that would take a generic vendor a multi-quarter feature request to deliver, because the team owns every part of the stack: the data model, the algorithm, the deployment pipeline, the dashboards, the kill switch.

The lean-team angle is also a budget angle. Vendor contracts at enterprise scale start at six figures and grow with usage. An in-house build pays its engineers and is then amortized over years of compounding value. That is how you justify the headcount to leadership: not “we need more SEO people,” but “we are building software whose payback period is under twelve months.”

The SEO product builder’s toolkit: core platforms, frameworks & build-block stacks

The toolkit for building SEO products in-house is a layered stack: vendor platforms for the generic work, custom code for the specific work, and a thin glue layer that lets the two interoperate. Treating it that way avoids the false choice between “all SaaS” and “all in-house.”

Enterprise platforms vs. tactical tools

For the generic work, two categories of vendor cover most needs.

Enterprise SEO platforms. Botify and Lumar (formerly DeepCrawl) lead the enterprise tier with crawling, log file analysis, monitoring and (increasingly) AI-search visibility tracking. Botify has positioned heavily around AI search optimization since their 2026 predictions piece, and Lumar markets a Generative Engine Optimization (GEO), accessibility and SEO toolkit alongside an “AI dev ticket writer.” These products replace internal infrastructure for crawling, log ingestion and dashboarding, which is exactly the part you should not build yourself unless you have a very good reason.

Tactical and diagnostic tools. Screaming Frog and Sitebulb remain the standard for hands-on, per-URL investigation. Oncrawl specializes in log file analysis for crawl budget work. The comparison from The Rank Masters covers the current landscape and how each product is integrating AI-driven features for 2026.

The decision is rarely “platform vs. nothing.” It is “platform plus this targeted in-house product to cover what the platform cannot.”

Custom build stacks for the three product types

For the in-house product layer, three reference stacks cover most of the surface area you will encounter.

Internal linking algorithms. The minimum stack is a graph data store (Neo4j, or Postgres with adjacency tables for smaller graphs), a similarity model (Term Frequency-Inverse Document Frequency [TF-IDF], embeddings from a sentence-transformer, or a domain-specific score like origin-destination corridor distance), and a deployment job that writes the recommended links into your CMS template tables. The conceptual underpinning still traces to the original PageRank paper by Brin and Page: links are weighted votes, and the algorithm decides which votes to cast.
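
As a minimal sketch of the similarity layer, assuming the sentence-transformers package and a toy in-memory inventory (a production version would read pages from the graph store), candidate ranking looks like this:

```python
# Rank internal link candidates per source page by embedding cosine
# similarity. Page content below is illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

pages = {
    "/paris-hotels": "Hotels in Paris: where to stay near the Marais...",
    "/paris-attractions": "Top attractions in Paris: Louvre, Eiffel Tower...",
    "/berlin-nightlife": "Berlin nightlife guide: clubs and late bars...",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
urls = list(pages)
emb = model.encode([pages[u] for u in urls], normalize_embeddings=True)
sim = emb @ emb.T  # cosine similarity, since embeddings are unit-normalized

def link_candidates(source_url: str, k: int = 5) -> list[tuple[str, float]]:
    i = urls.index(source_url)
    ranked = sorted(
        ((urls[j], float(sim[i, j])) for j in range(len(urls)) if j != i),
        key=lambda t: t[1], reverse=True,
    )
    return ranked[:k]

print(link_candidates("/paris-hotels"))  # Paris pages should outrank Berlin
```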

Programmatic SEO platforms. A typical 2026 stack uses Postgres or Supabase for the database layer, Next.js or a similar static-site framework for fast page generation, and a sync layer (often Python or Node.js) between the two. The DiscoveredLabs comparison of programmatic SEO tools makes the case that 2026’s programmatic SEO has shifted toward entity-first architectures designed for AI citation, not just blue-link rankings, which changes how you specify the data model in the PRD.
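
A hypothetical version of that sync layer, with an illustrative schema (table and column names are assumptions, not a real data model), is just a query-and-serialize loop:

```python
# Pull rows from Postgres and emit one JSON payload per generated page
# for the static-site build to consume. Connection string, table, and
# columns are illustrative.
import json
import pathlib
import psycopg2

conn = psycopg2.connect("dbname=pseo user=app")  # assumed connection
out_dir = pathlib.Path("data/pages")
out_dir.mkdir(parents=True, exist_ok=True)

with conn, conn.cursor() as cur:
    cur.execute(
        "SELECT slug, title, entity_json, cohort, indexable "
        "FROM generated_pages WHERE cohort <= %s", (2,)
    )
    for slug, title, entity_json, cohort, indexable in cur:
        payload = {
            "slug": slug,
            "title": title,
            "entity": entity_json,   # proprietary data the template renders
            "cohort": cohort,        # which rollout wave this page belongs to
            "robots": "index" if indexable else "noindex",  # per-cohort policy
        }
        (out_dir / f"{slug}.json").write_text(json.dumps(payload))
```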

Technical SEO audit apps. The starter stack is a Python ETL job hitting Google Search Console’s API, BigQuery or Postgres for storage, and a dashboarding layer (Looker, Power BI, or a Streamlit app for engineers who prefer code). Add log file ingestion when crawl-budget work becomes a priority. The point is to start with the painful audit you already run by hand and automate that one first, not to clone Botify.
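
A minimal sketch of that first ETL step, assuming OAuth credentials are already in hand, pulls page-level clicks from the Search Analytics API and flags the zero-traffic URLs:

```python
# Fetch page-level clicks from Google Search Console and diff against
# a full URL inventory. The inventory source is illustrative.
from googleapiclient.discovery import build

def fetch_page_clicks(creds, site_url: str, start: str, end: str) -> dict[str, int]:
    service = build("searchconsole", "v1", credentials=creds)
    resp = service.searchanalytics().query(
        siteUrl=site_url,
        body={"startDate": start, "endDate": end,
              "dimensions": ["page"], "rowLimit": 25_000},
    ).execute()
    return {row["keys"][0]: row["clicks"] for row in resp.get("rows", [])}

def zero_traffic_pages(inventory: set[str], clicks: dict[str, int]) -> set[str]:
    # Pages exposed to Googlebot that earn no search traffic: candidates
    # for the noindex/removal work described later in this guide.
    return {url for url in inventory if clicks.get(url, 0) == 0}
```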

Hands-on SEO product building: playbooks for the three flagship products

This is the working core of the guide. Three playbooks. One lead, two supporting. Each starts with the PRD because the PRD is where most SEO product work goes wrong.

Mastering the SEO PRD: from idea to scoped engineering work

The SEO PRD is the artifact that translates “we should improve internal linking” into “engineering can build this in three sprints.” Get this right and the rest of the project moves at engineering pace. Get it wrong and you spend the rest of the project re-explaining the same idea in different rooms.

The skeleton is generic but the SEO-specific sections are what carry the work. A solid SEO PRD covers the business case in revenue terms, the technical specification (URL patterns, canonical rules, indexation policy, schema, internal linking logic), the acceptance criteria written for both users and search bots, and the rollback plan. The full template lives in the SEO PRD guide, which I would treat as the prerequisite reading for the rest of this section.

The thing that distinguishes an SEO PRD from a generic one is the bot-as-user lens. Every spec gets a “what does Googlebot need to see?” check. Canonical handling? Spec it. JavaScript rendering pathway? Spec it. Pagination? Spec it. The acceptance criteria should let a Quality Assurance (QA) engineer test for both human and bot behavior without ambiguity.

Lead playbook: shipping an internal linking algorithm in under four months

This is the project I keep coming back to because it taught me more about shipping SEO products than any other.

The setting was a large travel site’s flagship US site. A previous internal linking overhaul had taken 10 engineers and 9 months. Traffic was soft. The Senior Vice President (SVP) of Search challenged the team to deliver a winning A/B test in under four months with a fraction of the resources: three MarTech engineers, one data engineer, and one program manager. No buffer.

I led a December vision workshop to define the scope. We deliberately picked a narrow improvement (a geographic-relevance tweak that would bias internal links toward destinations in the same travel corridor) instead of a re-architecture. That single scoping decision is what made the timeline possible. The PRD specified the algorithm inputs (origin-destination distance from a partner data source, destination popularity, content-type filters), the output (a ranked list of internal link candidates per source page), and the integration points (the CMS template tables that drive the linking modules across more than a billion pages).
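
The production algorithm is proprietary, but a hypothetical score of the same shape (weights and the distance decay are placeholders) shows how those inputs combine into a ranked candidate list:

```python
# Geographic-relevance scoring: bias link candidates toward
# destinations in the same travel corridor. All constants are
# illustrative, not the shipped values.
import math

def corridor_score(distance_km: float, popularity: float,
                   same_content_type: bool) -> float:
    distance_decay = math.exp(-distance_km / 500.0)  # assumed 500 km scale
    type_bonus = 1.0 if same_content_type else 0.6   # content-type filter
    return distance_decay * popularity * type_bonus

def rank_candidates(source: dict, candidates: list[dict], k: int = 10):
    scored = [
        (c["url"], corridor_score(c["distance_km"], c["popularity"],
                                  c["type"] == source["type"]))
        for c in candidates
    ]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```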

The team’s secret weapon was a shared deployment checklist on a Slack Canvas. Every milestone had owners: pipeline tables, data verification, User Acceptance Testing (UAT) phases, A/B test launch criteria, rollback triggers. Nothing fancy, just one document that everyone updated.

We rolled out in mid-April. The A/B test split at the page-template level (not user level, because Googlebot does not have a session cookie). Half the URLs in scope received the new linking logic; half stayed on the legacy logic as control. After six weeks of data, the test emerged a clear winner, and the algorithm rolled out to 100% with a $1.8 million annualized GP uplift. The data engineering involved is the same shape as the advanced internal linking strategies for e-commerce and large sites playbook, just adapted to a travel data model.
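
One way to implement that page-level split is to hash each URL, so cohort assignment is deterministic and sticky across deploys; a sketch, with the 50/50 fraction as a placeholder:

```python
# Deterministic page-level cohort assignment. No session cookie
# required, which is the point when the "user" is Googlebot.
import hashlib

def assign_cohort(url: str, treatment_fraction: float = 0.5) -> str:
    bucket = int(hashlib.sha256(url.encode()).hexdigest(), 16) % 10_000
    return "treatment" if bucket < treatment_fraction * 10_000 else "control"

# Same URL always lands in the same cohort, deploy after deploy.
assert assign_cohort("/hotels/paris") == assign_cohort("/hotels/paris")
```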

Three takeaways carry over to any internal linking algorithm project, regardless of vertical.

Pick a narrow scope. Ship a winning test. Then expand. Re-architectures fail because they take too long to prove value. Targeted algorithmic tweaks ship in months and either earn the investment for the next iteration or get killed cleanly.

A/B test at the template level, not the user level. This is the only way to isolate Googlebot’s response from user behavior. Page-level cohorts also let you measure ranking and crawl effects directly, not just downstream conversion.

Build a kill switch into the deployment. If the A/B test is a winner, you ramp to 100%. If something goes wrong, you roll back the cohort in minutes, not days. That alone changes the risk profile of every linking experiment after it.

Supporting playbook: launching a programmatic SEO platform in cohorts

Programmatic SEO is the second product type where in-house builds shine, because the templates and data sources are usually too company-specific for an off-the-shelf tool. The 2026 best practice has shifted toward measured rollouts in cohorts of 50 to 100 pages at a time, validated against indexation and traffic before scaling, as Rank Me Higher’s 2026 programmatic SEO guide and the DigitalApplied programmatic SEO playbook both stress.

The PRD for a programmatic SEO platform answers seven questions before any code is written. What are the data sources? What are the templates (one per intent cluster)? What is the URL pattern? What is the indexation policy by cohort? What are the canonical and pagination rules? What is the internal linking model between generated pages? Where is the kill switch?

Cohort rollout looks like this in practice. Cohort 1 is 50 pages launched as noindex, just to verify rendering, schema validity, and internal link wiring. Cohort 2 is 200 pages indexed, with a Service Level Objective (SLO): 95% of pages serve 200 OK, and at least 60% of pages are indexed by Google within 90 days. Cohort 3 (500 to 1,000 pages) ships only if cohort 2 hits SLO. Each cohort is a learning step. The kill switch lets you noindex an entire cohort in one deploy if quality drops or thin-content signals appear.
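
A sketch of the SLO gate, assuming per-page status and indexation flags arrive from the audit pipeline (the thresholds mirror the SLOs above):

```python
# Promote the next cohort only if the current one clears its
# thresholds; otherwise flip the whole cohort to noindex.
def cohort_slo(pages: list[dict], min_ok=0.95, min_indexed=0.60) -> dict:
    ok_rate = sum(p["status"] == 200 for p in pages) / len(pages)
    indexed_rate = sum(p["indexed"] for p in pages) / len(pages)
    return {
        "ok_rate": ok_rate,
        "indexed_rate": indexed_rate,
        "promote_next_cohort": ok_rate >= min_ok and indexed_rate >= min_indexed,
    }

def kill_switch(pages: list[dict]) -> list[str]:
    # URLs to flip to noindex in a single deploy if the gate fails.
    return [p["url"] for p in pages]
```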

This is the discipline that separates a programmatic SEO platform from a content-spam pipeline. Without cohort governance, the temptation is to ship 50,000 pages on day one. With cohort governance, the platform behaves like a real product: incremental, instrumented, reversible.

Supporting playbook: building a lightweight technical SEO audit app

The third product is the one most teams should start with, because it pays for itself in the first quarter. The triggering pain point is usually the same: the team runs the same audits every week by hand, the data lives in three different tools, and nobody has time to keep dashboards current.

The product I keep referencing here started as a deceptively simple problem. Out of millions of pages exposed to Googlebot across our portfolio, only tens of thousands drove any traffic. Massive indexation gaps were suspected, but Google Search Console only reported at the sitemap level, which masked the true scope. The team built an Extract-Transform-Load (ETL) pipeline that scraped GSC indexation data, aggregated it in Power BI, and surfaced section-level indexation rates across more than fifty brand properties. The analysis revealed that only 40% of pages were actually indexed, even on our strongest sites. We then developed an algorithm to noindex or remove low-value pages, prioritizing quality over volume, and overall page indexation lifted by 30%.

The lesson is the shape of the build. The product was a Python ETL plus a dashboard plus an algorithm. Three components, each replaceable. The team did not try to build a Botify clone. They built the smallest possible audit app that answered one painful question and then shipped it.

The natural next layers, when the team is ready, are log file ingestion (to confirm what bots actually crawl), schema validation, redirect chain detection, and threshold-based alerting. Each layer follows the same pattern: solve a real audit pain, validate the solution, and only then expand. The Sprintzeal write-up on AI-driven 2026 audits is a good guide to which audit signals are now worth automating, including AI bot access verification and Interaction to Next Paint (INP) field monitoring.

Mastering PM-engineering collaboration: the art of pair-shipping SEO products

If the playbooks are the what, PM-engineering collaboration is the how. SEO products fail more often on coordination than on technique. Engineering ships great code that misses the SEO requirement. SEO writes great requirements that engineering deprioritizes. The gap is rarely malice. It is usually translation.

Best practices for effective PM-engineering pairing on SEO work

The PM’s job in an SEO product is to be the bridge, not the bouncer. Three habits cover most of it.

Specify in user stories that include the bot as a user. “As Googlebot, I want each generated page to specify a single canonical URL so that I consolidate ranking signals to the preferred version.” That format is familiar to engineers, the criteria are binary, and it kills the eternal debate about what “good SEO” means.

Pair on the technical design before the sprint, not during it. A 30-minute working session with the lead engineer to walk through the data model and integration points saves a week of rework later. Bring the PRD; expect to leave with a smarter version of it.

Be present at code review, at least for the SEO-critical parts. Not to read the code line by line, but to ask the questions that catch SEO regressions before they ship. “If this throws a 500 error, what status code does the user (and Googlebot) get?” “Where is the canonical tag set, and is there any path that strips it?”

I have seen this pay off in real numbers. On a landing-page app I worked on (about a million pages), Largest Contentful Paint (LCP) was averaging 4.5 seconds, threatening tie-breaker SEO rankings. We hosted internal “Speed and Conversion Rate (CVR)” workshops using Deloitte’s research linking 0.1-second mobile speed gains to 8.4% retail conversion uplifts and 10.1% travel uplifts, got the Chief Product Officer (CPO) to set site speed as a product OKR, and rolled out CMS template optimizations across all pages. LCP dropped 40% and the 75th percentile cleared the 2.5-second target across the full one million pages. None of that lands without engineering sitting at the same table as SEO from week one.

Integrating SEO product work into existing dev workflows (CI/CD, A/B testing, version control)

The integration is mechanical once the collaboration habits are in place. SEO product work belongs inside the same Continuous Integration / Continuous Deployment (CI/CD) pipeline as everything else: feature branches, pull requests, staging environments, automated tests, deployment gates. The framing in the agile SEO strategy guide for enterprise teams covers the cadence side. The piece worth adding here is the testing layer.

Three testing patterns repay their cost.

Pre-merge crawl tests. A short Lighthouse or Screaming Frog run against the staging build, gated on a configurable threshold (e.g. zero new orphan pages, no canonical regressions, LCP within budget). A minimal sketch of this pattern follows the list.

Schema validation in CI. Run Schema.org’s validator or Google’s Rich Results Test against the generated pages. Block the merge if structured data breaks.

Post-deploy A/B test instrumentation. Tag pages by cohort in the analytics layer so the test is queryable from day one, not bolted on after the fact.
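
A pytest-style sketch of the first pattern, with an illustrative staging host and URL sample:

```python
# Fetch a sample of staging URLs and fail the build on status or
# canonical regressions. Host and URLs are illustrative.
import requests
from bs4 import BeautifulSoup

STAGING_SAMPLE = [
    "https://staging.example.com/hotels/paris",
    "https://staging.example.com/hotels/berlin",
]

def test_status_and_canonical():
    for url in STAGING_SAMPLE:
        resp = requests.get(url, timeout=10)
        assert resp.status_code == 200, f"{url} returned {resp.status_code}"
        soup = BeautifulSoup(resp.text, "html.parser")
        canonical = soup.find("link", rel="canonical")
        assert canonical is not None, f"{url} is missing a canonical tag"
        assert canonical["href"].startswith("https://"), f"{url} canonical is not absolute"
```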

These are the same patterns that work for any product team. The SEO twist is that the user includes a non-human, so a portion of the tests need to simulate Googlebot rather than a logged-in person.

Ensuring quality & safety: critically evaluating SEO product output

In-house SEO products fail safe or fail loud. The PM’s responsibility is to make sure they fail safe. That means accepting that algorithmic output will sometimes be wrong, and designing the system so the wrong output gets caught before it reaches Googlebot at scale.

Frameworks for reviewing and validating algorithmic output

For an internal linking algorithm, the validation framework has three layers. Spot-check sample output against semantic relevance: does the algorithm pair “Paris hotels” with “Paris attractions” or with “Berlin nightlife”? Score anchor text quality: is it descriptive and varied, or does the same anchor repeat across hundreds of source pages? Audit equity distribution: is the algorithm concentrating links on a healthy set of priority pages, or starving them?
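
The anchor-text and equity checks reduce to a few lines once the algorithm’s output is a list of (source, target, anchor) tuples; the thresholds below are illustrative starting points, not standards:

```python
from collections import Counter

def repeated_anchors(links: list[tuple], max_share: float = 0.05) -> dict:
    # Anchors that dominate the link set are a diversity red flag.
    counts = Counter(anchor for _, _, anchor in links)
    total = len(links)
    return {a: c for a, c in counts.items() if c / total > max_share}

def equity_concentration(links: list[tuple], top_n: int = 100) -> float:
    # Share of all inlinks landing on the top-N target pages: too high
    # starves the long tail, too low starves the priority pages.
    inlinks = Counter(target for _, target, _ in links)
    top = sum(c for _, c in inlinks.most_common(top_n))
    return top / sum(inlinks.values())
```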

For a programmatic SEO platform, the validation framework is mostly about content depth. Each generated page needs at least 300 words of genuinely unique content beyond boilerplate, ideally with proprietary data points (local pricing, internal-system specs, first-party reviews). When a page cannot meet that bar, the platform should default to noindex rather than ship a thin page. DigitalApplied’s pSEO best-practice guide covers this in more detail.

For a technical audit app, validation flips: the auditor itself is being audited. The risk is false confidence (the dashboard says everything is green and reality is red). Run the audit app’s findings against a manual audit of a sample, monthly at first, then quarterly once the agreement rate is high enough.

Debugging and troubleshooting SEO products

When an SEO product underperforms its expected lift, the debugging tree is usually three branches deep.

The algorithm is doing what you specified, but the spec was wrong. This is the most common case. The internal linking algorithm assigned links correctly, but the geographic-relevance signal was less predictive than search volume. Fix at the spec layer.

The algorithm is right, but the integration is wrong. The links are being generated correctly upstream, but the CMS template is rendering them inside a <div> that gets stripped from the static Hypertext Markup Language (HTML) by an aggressive minifier. Fix at the integration layer.

The algorithm and integration are both right, but the measurement is wrong. The A/B test is split at the user level instead of the page level, so Googlebot’s experience averages out. Fix at the test layer.

The trick is to walk the tree from cheapest fix to most expensive. Most teams do the reverse, redesigning the algorithm before they have ruled out the integration.

Risk considerations: indexation safety, traffic regression, ethical use of AI-generated content

The last layer is the safety case. Three risks deserve a named owner in the PRD.

Indexation safety. Any product that generates URLs at scale needs an indexation policy and a kill switch. Cohorts can be moved between index and noindex without redeploying code. The SEO QA checklist for flawless performance walks through the staging-side discipline that catches most indexation problems before launch.

Traffic regression. Define the regression threshold up front. If template-level traffic drops more than X% relative to control, the change rolls back automatically. Without a pre-agreed threshold, the team will argue about whether the drop is real for weeks while traffic bleeds.
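
The check itself is trivial once the threshold is agreed; a sketch (the 10% default is a placeholder, and a real implementation would add a significance test before acting):

```python
def should_rollback(treatment_traffic: float, control_traffic: float,
                    max_relative_drop: float = 0.10) -> bool:
    if control_traffic <= 0:
        return False  # no baseline to compare against
    relative_change = (treatment_traffic - control_traffic) / control_traffic
    return relative_change < -max_relative_drop
```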

Ethical use of AI-generated content. Programmatic SEO platforms increasingly mix template-driven content with Large Language Model (LLM) generation. The 2026 best practice is Retrieval-Augmented Generation (RAG) with human review checkpoints on a representative sample. Google’s helpful content guidance is explicit: scaled content abuse is a violation regardless of how the content is produced. The internal policy should match.

Overcoming challenges & advanced SEO product strategies

The trouble with in-house SEO products is the trouble with all in-house software: they accumulate technical debt, drift in scope, and quietly become someone’s part-time job that nobody else can maintain. PMs who want their products to outlast their tenure plan for those failure modes from the start.

Common pitfalls: technical debt, scope creep, over-reliance on automation

Technical debt accumulates fastest in glue code. The ETL job that scrapes GSC every Monday started as a notebook and ended up running production dashboards. The fix is to budget refactor cycles into the roadmap (one sprint per quarter, dedicated to debt) and to set a clear “production-readiness” bar that any in-house product crosses before it gets daily-use status.

Scope creep happens when stakeholders treat the product as a wishlist. “Can the linking algorithm also surface our seasonal campaigns?” “Can the audit app also check competitor URLs?” Each request is reasonable in isolation. The PM’s job is to keep the product focused on the original problem statement and to push extensions through the same prioritization framework as new work.

Over-reliance on automation is the most subtle. When the audit app says everything is fine, the team stops looking. The countermeasure is a quarterly manual spot-check (a real human, on a real sample of URLs) to confirm the dashboard reflects reality. Treat the automation as a force multiplier, not a replacement for judgement.

Scaling SEO products for enterprise and multi-brand portfolios

Scaling adds two flavors of complexity: governance and integration. Governance is the question of who owns which version of the product across brands. Integration is the question of how the product hooks into a heterogeneous CMS landscape.

The pattern that works is centralized product, distributed deployment. One PM owns the master roadmap and the central code. Each brand or business unit gets a deployment configuration that maps the central product to its specific data sources, templates and metric definitions. Changes to the central code go through a single change-management process; changes to the configurations stay local.

The scaling challenges I have seen in enterprise environments (multi-brand SEO orgs running across different CMS stacks) are covered in more depth in the advanced enterprise SEO integration guide. The headline pattern is that structural alignment beats process alignment. If the org chart has SEO product PMs sitting outside the brand silos, with a shared mandate, the products scale. If SEO product PMs report into individual brand teams, every brand reinvents the same product slightly differently, and none of them ship as fast.

The future of SEO products: from algorithms to agentic systems

The next iteration of SEO products is already visible in the platforms that ship features fastest. The center of gravity is moving from search-engine optimization to AI-system optimization, and from algorithms to agents.

The evolving role of the SEO PM

The SEO PM’s job is widening, not narrowing. The work still includes traditional optimization: indexation, internal linking, schema, page experience. But three new responsibilities are showing up in the role’s job descriptions.

Generative Engine Optimization (GEO) is the discipline of getting cited by AI answer engines (Google AI Overviews, Perplexity, ChatGPT search). The signals that matter shift from blue-link rankings to citation worthiness: structured data, factually dense content, entity coherence. The Botify 2026 predictions piece is a useful primer on the direction.

AI-bot governance is the work of deciding which AI agents can crawl your site, on what terms. This was invisible until very recently. At Expedia, our sites were initially invisible to ChatGPT, Perplexity and Copilot because Akamai Web Application Firewall (WAF) rules blocked unallowlisted agents. We supplied the executive team with a vetted list of AI user agents, manually unblocked them, and negotiated rate-limiting guardrails. The result was a noticeable surge in AI-bot traffic and an extrapolated $0.5 million in annualized GP. The deeper write-up is in the Google AI Mode SEO strategies guide.

Predictive analytics and AI-augmented PM workflows are increasingly part of the toolkit. The SEO product management guide on AI and predictive analytics covers where forecasting, semantic content optimization and automated technical monitoring produce the most lift. PMs who learn these tools ship faster and forecast more accurately than peers who do not.

AI-driven innovation: beyond algorithms to autonomous SEO agents

The longer-term shift is from algorithms a PM specifies to agents a PM directs. An autonomous internal linking agent does not need a hand-coded similarity model; it monitors the site’s link graph, identifies imbalances, generates link suggestions, runs them through a quality gate, and ships them on a schedule. The PM’s role becomes specifying the goal, the constraints and the rollback rules, then governing the agent’s behavior.

That is closer to Karpathy’s framing of agentic AI than to traditional software engineering. It is also where the discipline is heading. PMs who get comfortable specifying intent and guardrails (rather than every line of behavior) will ship the next generation of SEO products. PMs who insist on hand-specifying every detail will spend the next three years writing PRDs the agents can already generate.

The constant across all of this is what made the lead playbook work in the first place. Pick a narrow scope. Write a tight spec. Ship in cohorts. A/B test the win. Build the kill switch. Measure in revenue. The tools change. The discipline does not.

Pick one product to build this quarter. Write the PRD. Ship the smallest version that proves the lift. Then come back and tell me the number.

References

  1. Schwartz, E. Product-Led SEO. https://www.productledseo.com/

  2. Karpathy, A. Software Is Changing (Again). Y Combinator. https://www.ycombinator.com/library/MW-andrej-karpathy-software-is-changing-again

  3. Brin, S. & Page, L. (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Stanford University. http://infolab.stanford.edu/~backrub/google.html

  4. BrightEdge. Channel Share Research. https://www.brightedge.com/resources/research-reports/channel_share

  5. GitHub. Quantifying GitHub Copilot’s impact on developer productivity and happiness. https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/

  6. Deloitte. Milliseconds make millions. https://www.deloitte.com/ie/en/services/consulting/research/milliseconds-make-millions.html

  7. Botify. 6 Predictions for the Future of AI Search in 2026. https://www.botify.com/blog/future-ai-search-2026

  8. Lumar. Website Optimization Platform. https://www.lumar.io/

  9. The Rank Masters. Best AI Tools for Technical SEO Audits (2026 Guide). https://www.therankmasters.com/insights/technical-seo/best-ai-tools-technical-seo-audits

  10. DiscoveredLabs. Programmatic SEO Tools And Platforms: A Technical Marketer’s Comparison. https://discoveredlabs.com/blog/programmatic-seo-ools-and-platforms-a-technical-marketers-comparison

  11. DigitalApplied. Programmatic SEO: Scale Content with Templates 2026. https://www.digitalapplied.com/blog/programmatic-seo-scale-content-templates-2026

  12. Rank Me Higher. Programmatic SEO in 2026: A Complete Guide. https://rankmehigher.co/learn/programmatic-seo-guide/

  13. Sprintzeal. Technical SEO Audit 2026: AI Bots, INP & Automated Audits. https://www.sprintzeal.com/blog/ai-powered-technical-seo-audit

  14. Google Search Central. Creating helpful, reliable, people-first content. https://developers.google.com/search/docs/fundamentals/creating-helpful-content

  15. Google Search Console API. Webmaster Tools API Reference. https://developers.google.com/webmaster-tools/v1/api_reference_index

  16. Schema.org Validator. https://validator.schema.org/

  17. Google. Rich Results Test. https://search.google.com/test/rich-results

Oscar Carreras

Director of Technical SEO with 19+ years of enterprise experience at Expedia Group. I drive scalable SEO strategy, team leadership, and measurable organic growth.

Frequently Asked Questions

What is an SEO product, and why would you build one in-house?

An SEO product is software you build, own and ship to influence how search engines discover, render or rank your pages. Common examples are internal linking algorithms, programmatic SEO platforms (template-based page generation), and technical SEO audit apps. You build in-house when off-the-shelf platforms like Botify or Lumar do not fit your data model, when license costs exceed the engineering cost of a focused build, or when the product needs to run inside your CMS or pipelines (where third-party tools cannot reach without integration work).

When does it make sense to build vs. buy SEO tools (Botify, Lumar, Screaming Frog)?

Buy when the product is generic and the vendor's roadmap matches yours: site crawling, log file analysis, rank tracking, basic auditing. Build when the use case is specific to your business model (e.g. a destination-graph linking algorithm for a travel site, or a programmatic SEO platform that pulls from a proprietary inventory database). A useful rule of thumb is the three-year total cost of ownership: if a vendor will charge more than two engineer-years over three years and the product is core to your business, an in-house build usually wins on flexibility and unit economics.

How do you A/B test an internal linking algorithm without messing up rankings?

Split the test at the page-template level rather than the user level, because Googlebot does not have a session cookie. Pick a representative subset of URLs (typically 10 to 50 percent of a template), apply the new linking logic only to those pages, and keep a control set on the legacy logic. Track ranking, organic traffic and conversion at the cohort level over at least a six-week window to filter out weekly seasonality. Keep a kill switch in the deployment pipeline so you can roll back the cohort if traffic regressions hit a defined threshold.

What does the technical PRD for a programmatic SEO platform look like?

It defines the data sources (the database tables that feed each page), the page templates (one template per intent cluster), the URL pattern, the indexation policy (which cohorts are indexable from launch, which start as noindex), the canonicalization rules, the internal linking model between templates, the kill switch, and the monitoring SLOs (e.g. 95% of generated pages must serve 200 OK, indexation rate above 60% within 90 days). Without those guardrails the platform turns into a thin-content factory and Google notices.

How do you build a lightweight technical SEO audit app in-house?

Start with the smallest version that solves one painful audit you currently run by hand. The starter stack is a Python ETL job hitting Google Search Console (GSC) and BigQuery on a schedule, a SQL layer that joins crawl data with traffic and indexation, and a dashboard (Looker, Power BI, or a Streamlit app). Once the team gets value, layer on log file ingestion, schema validation, and alerting. The point is not to rebuild Botify; it is to surface the three or four signals your team chases every Monday morning, automatically.