How AI Uses Structured Data
When an AI model encounters a web page, it needs to answer basic questions: What is this page about? Is this a product, an article, a person, an organization? What are the key facts?
Without structured data, the AI has to infer these answers from the page's text content — a process that's error-prone and context-dependent. With schema.org JSON-LD, the answers are explicit and machine-readable. The AI doesn't have to guess that "$29.99" is a price — it's labeled as schema:price.
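To make the contrast concrete, here is a minimal Python sketch. The page text, the JSON-LD string, and both helper functions are hypothetical illustrations, not how any particular AI system is implemented. It shows that scanning free text yields several candidate prices, while the labeled field yields exactly one.

```python
import json
import re

# Hypothetical page text containing several dollar amounts.
page_text = "Widget Pro now $29.99 (was $39.99). Free shipping over $50."

# Hypothetical JSON-LD describing the same product.
json_ld = """
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Widget Pro",
  "offers": {"@type": "Offer", "price": "29.99", "priceCurrency": "USD"}
}
"""


def guess_prices_from_text(text):
    """Return every dollar amount in free text; the caller still has to guess which one is the price."""
    return re.findall(r"\$(\d+(?:\.\d+)?)", text)


def read_price_from_json_ld(raw):
    """Return (price, currency) from a Product's offer, or None if no price is labeled."""
    data = json.loads(raw)
    offer = data.get("offers") or {}
    if "price" not in offer:
        return None
    return offer["price"], offer.get("priceCurrency")


text_candidates = guess_prices_from_text(page_text)
labeled_price = read_price_from_json_ld(json_ld)
print(text_candidates)  # three candidates, only one of which is the current price
print(labeled_price)    # the single labeled price and its currency
```

The text-based approach needs extra heuristics to choose among the candidates; the structured lookup does not.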
Which Types Matter Most
Schema.org has 800+ types, but AI models primarily use a small subset. Here are the types that have the most impact on AI readiness:
Article / BlogPosting
Use on: Blog posts, news, guides, tutorials
Helps AI identify the article's headline, author, publish date, and topic. Critical for content sites.
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to Configure robots.txt",
  "author": { "@type": "Person", "name": "Jane Doe" },
  "datePublished": "2025-01-15"
}
Organization
Use on: Homepage, about page
Establishes your brand identity for AI — name, logo, social profiles, contact info. Helps AI accurately attribute content.
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "url": "https://acme.com",
  "logo": "https://acme.com/logo.png"
}
Product
Use on: E-commerce product pages
Price, availability, reviews — the facts AI shopping assistants need to make recommendations.
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Widget Pro",
  "offers": {
    "@type": "Offer",
    "price": "29.99",
    "priceCurrency": "USD"
  }
}
FAQPage
Use on: FAQ sections, support pages
Question-answer pairs are directly consumable by AI assistants. The most AI-friendly schema type.
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is llms.txt?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "A standard for..."
    }
  }]
}
SoftwareApplication
Use on: Tool/app landing pages
Tells AI what your software does, what platform it runs on, and whether it's free. Essential for developer tools.
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "w2agent",
  "applicationCategory": "DeveloperApplication",
  "operatingSystem": "Any"
}
BreadcrumbList
Use on: Any page with navigation hierarchy
Helps AI understand where a page sits in your site structure — crucial for large sites.
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [{
    "@type": "ListItem",
    "position": 1,
    "name": "Docs",
    "item": "https://example.com/docs"
  }]
}
Implementation: JSON-LD
Always use JSON-LD format (not Microdata or RDFa). JSON-LD is embedded in a <script type="application/ld+json"> tag in your page's head or body. It's easier to maintain, doesn't interleave with your HTML, and is the format recommended by Google and preferred by AI systems.
You can include multiple JSON-LD blocks on a single page — one for the Organization, one for the Article, one for BreadcrumbList. They're independent and don't need to reference each other.
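As a rough sketch of how several independent blocks might be generated server-side, the Python below serializes each schema dictionary into its own script tag. The function names `render_json_ld` and `render_blocks` and the sample dictionaries are hypothetical; only the standard-library `json` module is used.

```python
import json


def render_json_ld(schema):
    """Serialize one schema dict into a <script type="application/ld+json"> tag."""
    body = json.dumps(schema, indent=2, ensure_ascii=False)
    # Escape "</" so a string value such as "</script>" cannot close the tag early.
    # "<\/" is still valid JSON and decodes back to "</".
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


def render_blocks(schemas):
    """Render each schema as its own independent block, one tag per schema."""
    return "\n".join(render_json_ld(schema) for schema in schemas)


# Hypothetical example data mirroring the snippets above.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Corp",
    "url": "https://acme.com",
}
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to Configure robots.txt",
    "datePublished": "2025-01-15",
}

html_snippet = render_blocks([organization, article])
print(html_snippet)
```

Generating the tags from data structures rather than hand-editing strings keeps the JSON syntactically valid by construction.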
What AI Actually Reads
Not all schema.org properties are equally useful to AI. Focus on these high-value properties:
- name / headline: The primary identifier. What is this thing?
- description: A concise summary AI can use directly in responses.
- author / publisher: Attribution and credibility signals.
- datePublished / dateModified: Freshness signals. AI prefers recent content.
- price / availability: For products, the facts users ask AI about.
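One way to act on this list is a small completeness check. The sketch below is illustrative only: the `EXPECTED` mapping is an assumption derived from the properties listed above, not an official requirement set, and `missing_properties` is a hypothetical helper. Dotted paths such as `offers.price` refer to nested objects.

```python
# Assumed mapping of schema types to the high-value properties listed above.
EXPECTED = {
    "Article": ["headline", "description", "author", "datePublished"],
    "Product": ["name", "description", "offers.price", "offers.availability"],
    "Organization": ["name", "description"],
}

EMPTY_VALUES = (None, "", [], {})


def missing_properties(schema, expected=EXPECTED):
    """Return the expected property paths that are absent or empty in a schema dict.

    Assumes "@type" is a single string; types not in the mapping yield an empty list.
    """
    missing = []
    for path in expected.get(schema.get("@type"), []):
        node = schema
        for key in path.split("."):
            if isinstance(node, dict) and node.get(key) not in EMPTY_VALUES:
                node = node[key]
            else:
                missing.append(path)
                break
    return missing


# Hypothetical product markup matching the Product snippet earlier in this article.
product = {
    "@type": "Product",
    "name": "Widget Pro",
    "offers": {"@type": "Offer", "price": "29.99", "priceCurrency": "USD"},
}

gaps = missing_properties(product)
print(gaps)
```

For the sample product, the check reports that `description` and `offers.availability` are not yet present.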
Auto-Generate Structured Data
w2agent audits your existing structured data and generates missing schemas based on your page content. It detects page types (article, product, FAQ) and creates the appropriate JSON-LD.
Testing Your Structured Data
Before relying on schema.org markup to improve AI readiness, verify it's valid and parseable. Invalid JSON-LD is silently ignored — it doesn't show errors on the page, so issues are easy to miss.
# Extract and validate JSON-LD from a page
# (assumes a single JSON-LD block that sits entirely on one line of the HTML)
curl -s https://your-site.com/blog/post | \
  grep -o '<script type="application/ld+json">[^<]*</script>' | \
  sed -e 's/<script[^>]*>//' -e 's/<\/script>//' | \
  python3 -m json.tool

# Or use sed + jq (works on macOS and Linux; same one-line assumption)
curl -s https://your-site.com/ | \
  sed -n 's/.*<script type="application\/ld+json">\(.*\)<\/script>.*/\1/p' | \
  jq .
Google's Rich Results Test and Schema.org's validator are the authoritative tools for deeper validation. The w2agent audit checks for the presence and syntactic validity of JSON-LD on every page it scans.
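Line-oriented shell tools struggle when a JSON-LD block spans several lines or a page has more than one block. As a hedged alternative, the Python sketch below uses the standard-library `html.parser` and `json` modules to collect every block and report whether each one parses. The class `JsonLdExtractor`, the function `validate_json_ld`, and the sample HTML are hypothetical; the check covers JSON syntax only, not schema.org semantics.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def validate_json_ld(html):
    """Return one (is_valid, parsed_data_or_error_message) tuple per JSON-LD block."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    results = []
    for raw in parser.blocks:
        try:
            results.append((True, json.loads(raw)))
        except json.JSONDecodeError as exc:
            results.append((False, str(exc)))
    return results


# Hypothetical page: one valid multi-line block and one block with a syntax error.
sample_html = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp"
}
</script>
<script type="application/ld+json">{"@type": "Article", "headline": }</script>
</head><body></body></html>
"""

results = validate_json_ld(sample_html)
for is_valid, detail in results:
    print("valid" if is_valid else "invalid", detail)
```

To run this against a live page, the HTML could be fetched first (for example with `urllib.request`) and passed to `validate_json_ld`.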
Impact on AI Responses
Complete schema.org markup gives AI systems explicit fields to work from. When an assistant such as ChatGPT or Perplexity summarizes your product, structured fields like price, availability, and description can be read directly instead of being inferred from page text. The difference is accuracy: text parsing introduces errors that explicit, correctly maintained structured data avoids.
Schema.org works at the page level. For site-level discovery, pair it with llms.txt (content index) and agent-card.json (capability declaration) for a complete AI-readiness stack. The w2agent score measures all three layers together.