Sitemaps tell search engines which pages exist on your site and when they were last updated. robots.txt tells crawlers which paths they may crawl and which to skip (note that it controls crawling, not indexing). Both are essential for technical SEO.
Sitemap
Basic Static Sitemap
```typescript
// app/sitemap.ts
import type { MetadataRoute } from "next";

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: "https://yourdomain.com",
      lastModified: new Date(),
      changeFrequency: "weekly",
      priority: 1,
    },
    {
      url: "https://yourdomain.com/about",
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.8,
    },
    {
      url: "https://yourdomain.com/services",
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.9,
    },
    {
      url: "https://yourdomain.com/contact",
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.7,
    },
  ];
}
```
This generates /sitemap.xml automatically.
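For reference, the generated file looks roughly like this, with one `<url>` entry per array item (exact output may vary slightly between Next.js versions, and the `lastmod` value here is illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com</loc>
    <lastmod>2025-01-01T00:00:00.000Z</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1</priority>
  </url>
  <!-- ...remaining entries... -->
</urlset>
```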
Dynamic Sitemap with Blog Posts
```typescript
// app/sitemap.ts
import type { MetadataRoute } from "next";

async function getBlogPosts() {
  // Fetch from your CMS, database, or file system
  const posts = await getAllPosts();
  return posts;
}

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const baseUrl = "https://yourdomain.com";

  // Static pages
  const staticPages: MetadataRoute.Sitemap = [
    {
      url: baseUrl,
      lastModified: new Date(),
      changeFrequency: "weekly",
      priority: 1,
    },
    {
      url: `${baseUrl}/about`,
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.8,
    },
    {
      url: `${baseUrl}/services`,
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.9,
    },
    {
      url: `${baseUrl}/blog`,
      lastModified: new Date(),
      changeFrequency: "daily",
      priority: 0.9,
    },
    {
      url: `${baseUrl}/contact`,
      lastModified: new Date(),
      changeFrequency: "monthly",
      priority: 0.7,
    },
  ];

  // Dynamic blog posts
  const posts = await getBlogPosts();
  const blogPages: MetadataRoute.Sitemap = posts.map((post) => ({
    url: `${baseUrl}/blog/${post.slug}`,
    lastModified: new Date(post.date),
    changeFrequency: "monthly" as const,
    priority: 0.6,
  }));

  return [...staticPages, ...blogPages];
}
```
Multiple Sitemaps for Large Sites
For sites with tens of thousands of pages, split the sitemap with `generateSitemaps`. Next.js calls the default export once per returned `id` and serves each chunk at its own URL (`/sitemap/[id].xml` in recent Next.js versions):

```typescript
// app/sitemap.ts
import type { MetadataRoute } from "next";

const SITEMAP_SIZE = 50_000; // Sitemap protocol limit: 50,000 URLs per file

export async function generateSitemaps() {
  // getPostCount: your own data-access helper
  const totalPosts = await getPostCount();
  const numSitemaps = Math.ceil(totalPosts / SITEMAP_SIZE);
  return Array.from({ length: numSitemaps }, (_, i) => ({ id: i }));
}

export default async function sitemap({
  id,
}: {
  id: number;
}): Promise<MetadataRoute.Sitemap> {
  const start = id * SITEMAP_SIZE;
  // getPostsRange: your own data-access helper
  const posts = await getPostsRange(start, start + SITEMAP_SIZE);
  return posts.map((post) => ({
    url: `https://yourdomain.com/blog/${post.slug}`,
    lastModified: new Date(post.date),
  }));
}
```
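The chunking arithmetic above can be factored into a small pure helper for reasoning and testing. This is a sketch; the function name and return shape are illustrative, not a Next.js API:

```typescript
// Sketch: split a total URL count into sitemap chunks of at most
// `size` URLs each. `end` is exclusive, matching the range slicing above.
function sitemapChunks(
  totalUrls: number,
  size = 50_000
): Array<{ id: number; start: number; end: number }> {
  const count = Math.ceil(totalUrls / size);
  return Array.from({ length: count }, (_, i) => ({
    id: i,
    start: i * size,
    end: Math.min((i + 1) * size, totalUrls),
  }));
}
```

For example, 120,000 posts yield three sitemaps covering indices 0 to 49,999, 50,000 to 99,999, and 100,000 to 119,999.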
robots.txt
Basic robots.txt
```typescript
// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: ["/api/", "/admin/", "/private/"],
    },
    sitemap: "https://yourdomain.com/sitemap.xml",
  };
}
```
Environment-Aware robots.txt
Block indexing on staging/preview environments:
```typescript
// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  const baseUrl = process.env.NEXT_PUBLIC_URL || "https://yourdomain.com";
  const isProduction =
    process.env.NODE_ENV === "production" &&
    baseUrl === "https://yourdomain.com";

  if (!isProduction) {
    return {
      rules: {
        userAgent: "*",
        disallow: "/",
      },
    };
  }

  return {
    rules: [
      {
        userAgent: "*",
        allow: "/",
        disallow: ["/api/", "/admin/"],
      },
    ],
    sitemap: `${baseUrl}/sitemap.xml`,
  };
}
```
This prevents staging and preview deployments from being indexed by search engines.
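The production check is easy to get subtly wrong, so it can help to pull it into a pure function. A sketch using the same environment values as the example above (the function name is our own):

```typescript
// Sketch: crawlers are allowed only when NODE_ENV is "production"
// AND the deployment URL matches the canonical production URL.
// Preview deployments often build with NODE_ENV=production, which is
// why the URL comparison is needed as a second condition.
function shouldAllowIndexing(
  nodeEnv: string | undefined,
  baseUrl: string,
  productionUrl = "https://yourdomain.com"
): boolean {
  return nodeEnv === "production" && baseUrl === productionUrl;
}
```

This keeps preview deployments, which run a production build at a different URL, blocked from crawlers.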
Blocking Specific Bots
```typescript
// app/robots.ts
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: "*",
        allow: "/",
        disallow: ["/api/", "/admin/"],
      },
      {
        userAgent: "GPTBot",
        disallow: "/", // Block OpenAI's crawler
      },
      {
        userAgent: "CCBot",
        disallow: "/", // Block Common Crawl
      },
    ],
    sitemap: "https://yourdomain.com/sitemap.xml",
  };
}
```
Priority Values Guide
| Page Type | Suggested Priority |
|---|---|
| Homepage | 1.0 |
| Main service/product pages | 0.9 |
| Blog index, about page | 0.8 |
| Individual blog posts | 0.6 |
| Contact, legal pages | 0.5-0.7 |
| Utility pages | 0.3 |
Priority values are relative hints to search engines, not absolute rankings.
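To keep these values consistent across a codebase, the table can be encoded as a lookup. A minimal sketch; the page-type names are our own, not part of any API:

```typescript
// Sketch: priority lookup mirroring the table above.
type PageType =
  | "homepage"
  | "service"
  | "blogIndex"
  | "blogPost"
  | "legal"
  | "utility";

const PAGE_PRIORITY: Record<PageType, number> = {
  homepage: 1.0,
  service: 0.9,
  blogIndex: 0.8,
  blogPost: 0.6,
  legal: 0.5, // contact/legal pages: 0.5-0.7 depending on importance
  utility: 0.3,
};

function priorityFor(type: PageType): number {
  return PAGE_PRIORITY[type];
}
```

Sitemap entries can then reference `priorityFor("blogPost")` instead of scattering magic numbers through the file.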
Change Frequency Guide
| Frequency | Use For |
|---|---|
| always | Real-time content (stock prices, live feeds) |
| hourly | News sites, high-frequency updates |
| daily | Blog index, active forums |
| weekly | Homepage, frequently updated pages |
| monthly | Most content pages, blog posts |
| yearly | Legal pages, rarely changed content |
| never | Archived content |
Verification
- Visit `yourdomain.com/sitemap.xml`: it should return XML
- Visit `yourdomain.com/robots.txt`: it should return your rules
- Submit the sitemap to Google Search Console
- Use Google's robots.txt tester to verify the rules
- Check that blocked paths return the expected behavior
Common Mistakes
- Blocking CSS/JS files: search engines need these to render pages
- Indexable staging environments: staging and preview deployments should block all crawlers
- Stale lastModified dates: use actual modification dates, not `new Date()` at build time
- Missing sitemap reference: always include the sitemap URL in robots.txt
- Forgetting dynamic routes: generate sitemap entries for every public page
Need SEO Setup?
We configure sitemaps, robots.txt, structured data, and technical SEO for every website we build. Contact us for professional SEO implementation.