27 Apr 2026 · 7 min read

Which schema and content patterns do AI engines actually reward?

Four types of background code and three content patterns determine whether AI tools pull from your website or skip it. Most sites get the content right and the structure wrong — or the structure right and the content format wrong. Both need to work together.

Most small business websites are more readable than they are useful to AI tools.

The content is well-written. The copy is clear. But AI tools skip over it because the structural signals they look for are either missing or incorrectly formatted.

This article explains what those signals are, how they work, and what you can do to fix them — without needing to understand code.

Why AI tools skip well-written websites

AI tools are pattern-matching systems. They are trained to extract information from content that matches certain patterns. Content and code that match those patterns get pulled into recommendations. Content that does not — however well-written — gets passed over.

Think of it like a supermarket scanner. The scanner does not care how good the product is. It reads the barcode. No barcode, no sale. Background code is the barcode for your website. Without it, AI tools have to infer everything from your visible text, and they do this conservatively, which means they skip businesses they cannot confirm.

What is background code, and why do AI tools care about it?

Background code — known as schema markup (say it: "SKEE-muh") — is a type of code embedded in your website that tells automated systems what your content means, not just what it says.

Your visible website is written for people. Background code is written for machines.

A paragraph that says "Smith & Co helps Melbourne businesses with employment law disputes" communicates meaning to a human reader. Background code communicates the same information to a machine: there is a business called Smith & Co, it is a legal service, it operates in Melbourne, it serves businesses, and employment law is its primary field. The machine does not have to interpret anything. The code declares it directly.

AI tools value this because it removes uncertainty. A tool that can read confirmed background code has more confidence in a recommendation than one that is guessing from your visible content. Conservative tools — ChatGPT and Claude in particular — tend to recommend businesses they can confirm. Background code is one of the primary ways to give them that confirmation.

The four types of background code that matter for small businesses

Each type serves a different purpose. They are not interchangeable.

Business identity code (Organization schema)

This declares your business name, website, industry, description, and contact information. It is the first type to implement because everything else depends on it. Think of it as registering your business with the AI tool's version of ASIC — you are formally declaring that you exist, what you are, and how to reach you.

Without it, AI tools build their understanding of your business from guesswork. With it, you give them a confirmed, structured anchor to recommend from.
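
For readers who want to see what this looks like, business identity code is usually published as a JSON-LD block inside the page's HTML. The sketch below uses Python only to build and print such a block. The business name comes from the Smith & Co example above; the URL, description, and email are placeholders, not real details, and the helper name to_json_ld_script is made up for this illustration.

```python
import json

# Placeholder details for illustration only; replace with your own.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",  # schema.org spells the type name with a "z"
    "name": "Smith & Co",
    "url": "https://www.example.com",
    "description": "Employment law advice for Melbourne businesses.",
    "contactPoint": {
        "@type": "ContactPoint",
        "contactType": "customer service",
        "email": "hello@example.com",
    },
}


def to_json_ld_script(data: dict) -> str:
    """Wrap a schema.org dictionary in the script tag that goes in the page HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_json_ld_script(organization))
```

The printed block is what a developer or plugin would place in the page. Visitors never see it.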

Location code (LocalBusiness schema)

This applies when customers choose you partly because of where you are. It declares your address, the area you serve, your opening hours, and your phone number.

Gemini (Google's AI tool) reads this directly alongside your Google Business Profile listing. Businesses with strong location code and a complete Google Business Profile appear in Gemini's local service recommendations far more consistently than those relying on one or the other alone.
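
As a rough sketch, location code for the same example business might look like the block below. The address, phone number, and opening hours are invented placeholders. The property names (address, areaServed, openingHoursSpecification, telephone) are standard schema.org properties, but which ones a given AI tool actually reads is not something this sketch can confirm.

```python
import json

# Hypothetical address, phone, and hours for illustration only.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Smith & Co",
    "url": "https://www.example.com",
    "telephone": "+61 3 5550 0000",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Melbourne",
        "addressRegion": "VIC",
        "postalCode": "3000",
        "addressCountry": "AU",
    },
    "areaServed": "Melbourne",
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "09:00",
            "closes": "17:00",
        }
    ],
}

# Print the JSON-LD that would sit inside a script tag on the page.
print(json.dumps(local_business, indent=2))
```

Whatever values you use here should match your Google Business Profile exactly, since the two are read side by side.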

Services code (Service schema)

This declares what you sell. Most small businesses do not have it. Without it, AI tools have to infer your service category from your visible content — which produces more variable results than a clear, declared service description.

If you offer multiple services, a separate services code block for each primary service helps AI tools match you to specific recommendation queries. Someone asking "who does employment law for small businesses" gets a better match to a business with employment law declared as a specific service than one where it is only mentioned in paragraph text.
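
One way to picture "a separate block for each primary service" is the sketch below. The helper build_service is hypothetical, and the second service and both descriptions are invented for illustration. Only the employment law example comes from this article.

```python
import json


def build_service(name: str, description: str) -> dict:
    """Build one schema.org Service block for a single primary service."""
    return {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": name,
        "serviceType": name,
        "description": description,
        # Provider and area are placeholders matching the example business.
        "provider": {"@type": "Organization", "name": "Smith & Co"},
        "areaServed": "Melbourne",
    }


# One block per primary service, rather than one block listing everything.
services = [
    build_service(
        "Employment law for small businesses",
        "Advice and representation for small businesses in employment disputes.",
    ),
    build_service(
        "Workplace dispute resolution",
        "Help resolving disputes between employers and employees.",
    ),
]

for service in services:
    print(json.dumps(service, indent=2))
```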

FAQ code (FAQPage schema)

This labels your question-and-answer section so AI tools know to extract from it. When paired with a well-formatted FAQ section on your page, it lets tools pull specific question-and-answer pairs and attribute them directly to your business.

This is the code type that produces the most visible improvement for most businesses. The reason is structural: AI tools produce answers in question-and-answer format. FAQ content in FAQ code is already in that format. The tool can lift it almost directly.
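
To make the structure concrete, here is a minimal sketch of FAQ code built from question-and-answer pairs. The helper build_faq_page is a made-up name, and the answer text is deliberately a placeholder rather than real legal guidance. The questions and answers in the code should match the ones visible on your page word for word.

```python
import json


def build_faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Turn (question, answer) pairs into a schema.org FAQPage block."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder answers: replace with the exact text shown on your FAQ page.
faq = build_faq_page(
    [
        (
            "How long does an employment law dispute typically take to resolve?",
            "Placeholder answer: put your one-sentence answer here, then the detail.",
        ),
        (
            "Can a small business handle a Fair Work claim without a lawyer?",
            "Placeholder answer: put your one-sentence answer here, then the detail.",
        ),
    ]
)

print(json.dumps(faq, indent=2))
```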

The three content patterns AI tools extract from most reliably

Background code tells AI tools what your business is. Content gives them something to cite when they recommend you. Both need to be in place.

Pattern 1: Question-led headings

Every page heading written as a question gives AI tools a clear extraction point. "Employment law for small businesses" is a topic label. "What happens when a small business in Victoria gets an unfair dismissal claim?" is a question.

The difference is not stylistic — it is functional. AI tools are designed to match user queries to content. A heading phrased as a question matches that format directly. A topic label does not.

Go through your website's page headings right now. For every heading that is a noun phrase or topic label, ask: what question is this content answering? Rewrite the heading as that question.
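
If you have a long list of headings, a small script can do the first pass of this audit. The sketch below is a deliberately crude check: it treats any heading that does not end in a question mark as a topic label. The function name is hypothetical, and the script only flags headings. Deciding what question each one should become is still a human job.

```python
def find_topic_label_headings(headings: list[str]) -> list[str]:
    """Return the headings that are not phrased as questions (no trailing '?')."""
    return [h for h in headings if not h.strip().endswith("?")]


# The two example headings from this section.
headings = [
    "Employment law for small businesses",
    "What happens when a small business in Victoria gets an unfair dismissal claim?",
]

for heading in find_topic_label_headings(headings):
    print(f"Rewrite as a question: {heading}")
```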

Pattern 2: Answer-first paragraphs

The first sentence of every section should directly answer the heading. No context-setting. No "it depends." The answer first, then the supporting detail.

AI tools extract the first sentence of a section far more often than any other part of it. A section that opens with background before the answer loses that extraction advantage.

Before: "Employment law is a complex area that affects businesses of all sizes, and it is important to understand your obligations as an employer."

After: "Employment lawyers help small businesses manage unfair dismissal claims, workplace disputes, and redundancy processes."

The second version gives an AI tool an extractable answer in the first sentence. The first version does not.

Pattern 3: FAQ sections with customer-language questions

A dedicated FAQ section with ten to fifteen questions is the single highest-yield content investment for AI visibility.

The questions must be phrased the way a potential buyer would type them into ChatGPT or Perplexity. Specific. Outcome-focused. No industry jargon.

Test each question with this simple check: would a potential customer actually type this into ChatGPT? If not, rewrite it until they would.

Examples of weak FAQ questions that get skipped:

  • "What makes you different?"
  • "How do I get started?"
  • "What services do you provide?"

Examples of strong FAQ questions that get extracted:

  • "What should I do if an employee makes an unfair dismissal claim against my business?"
  • "How long does an employment law dispute typically take to resolve?"
  • "Can a small business handle a Fair Work claim without a lawyer?"

What is an llms.txt file, and how does it fit in?

An llms.txt file (say it: "L-L-M-S dot text") is a simple text file on your website that tells AI tools which pages are most important and how to interpret your content.

Think of it as leaving a note for a new employee: "the important folders are these three; the rest is background material." llms.txt is still a proposed convention rather than a formal standard, so not every AI tool reads it. Tools that do can use it to prioritise the right pages. Without it, they make their own decisions, and those decisions are often wrong for smaller sites where not every page carries equal weight.

This file is not background code. It does not declare structured information about your business. What it does is guide AI tools to the pages that matter most — your service descriptions, your FAQ section, your key articles — rather than letting them weight everything equally.

Writing an llms.txt takes about 30 minutes. The format guide is at llmstxt.org. It is one of the fastest improvements available, and most small business websites do not have one.
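
As a rough idea of what the file contains, the sketch below assembles a small llms.txt following the general layout described at llmstxt.org: a title line, a short summary, then sections of links with a note on each. The helper build_llms_txt, the page URLs, and the descriptions are all invented for illustration. Check the format guide before publishing your own.

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Assemble llms.txt text: title, summary, then sections of annotated links."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical pages for the example business.
llms_txt = build_llms_txt(
    "Smith & Co",
    "Employment law advice for Melbourne businesses.",
    {
        "Services": [
            ("Employment law", "https://www.example.com/employment-law", "Main service page"),
        ],
        "FAQ": [
            ("Frequently asked questions", "https://www.example.com/faq", "Customer questions and answers"),
        ],
    },
)

print(llms_txt)
```

The resulting text is saved as llms.txt at the top level of your website, alongside your home page.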

How to put it all together

Start with business identity code and FAQ code implemented at the same time. That combination produces the most consistent baseline improvement across all AI tools.

Then work through your page headings on every key page. Rewrite any topic label as a question. This is a content edit, not a technical task — you can do it without a developer.

Check your FAQ section. If you do not have one, create it. Aim for ten questions minimum, all phrased in customer language. Each answer should directly respond to the question in the first sentence, then support it with detail.

Write an llms.txt and add it to your website.

Add services code for each primary thing you offer.

The pattern that produces the most visible improvement, in the shortest time, for the widest range of businesses: FAQ code paired with a properly formatted, customer-language FAQ section. If you only have time for one improvement this week, make it this pairing.


Frequently asked questions

Which type of background code matters most for small businesses?

FAQ code matters most because it structures the content format AI tools already extract from: question-and-answer pairs. Business identity code is the foundation and should be implemented first, but FAQ code produces the most visible improvement in how often AI tools cite your business. If time is limited, do business identity code first to establish who you are, then FAQ code to give tools something specific to cite.

Do I need a web developer to add background code?

Not always. WordPress sites with Yoast SEO or Rank Math have background code built into the plugin, so you configure it rather than code it. Wix, Squarespace and Shopify have built-in support for basic types. For services code and custom FAQ code, a developer can add them in a few hours. The return on that time investment is high relative to most website work.

Does Google's validator check whether AI tools will use my code?

No. Google's Rich Results Test and Schema.org's validator check whether your background code is written correctly. They do not check whether AI tools will act on it. Code that passes Google's validator is a good starting point, but AI tools may still weight it differently based on how it relates to your page content and other signals.
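
Before pasting code into a validator, a developer can run a quick local sanity check like the sketch below. It only confirms that the block is valid JSON with a schema.org context and a declared type. It is not a substitute for the validators above, and the function name is hypothetical.

```python
import json


def basic_json_ld_problems(raw: str) -> list[str]:
    """Return a list of obvious problems; an empty list means none were found."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    problems = []
    if "schema.org" not in str(data.get("@context", "")):
        problems.append("@context does not point at schema.org")
    if "@type" not in data:
        problems.append("@type is missing")
    return problems


good = '{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}'
bad = '{"@context": "https://schema.org"}'

print(basic_json_ld_problems(good))
print(basic_json_ld_problems(bad))
```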

How many FAQ questions do I need?

Five is the minimum for FAQ code to signal meaningful depth. Ten to fifteen is the range that produces reliable extraction results across multiple AI tools. The questions must be phrased in customer language — how a potential buyer would type them into an AI tool — not how you would describe your own services.

Does content freshness affect AI tool extraction?

Yes, particularly for tools that check live websites. Perplexity pulls from live web content rather than relying entirely on its original training, which means a fresh FAQ page can appear in Perplexity responses within days of publication. Publishing new content consistently creates compounding material for AI tools to cite.

