Create a Dynamic Sitemap with Next.js

March 2, 2020 (4y ago)

To improve your Search Engine Optimization (SEO), you might need to add a sitemap or robots.txt file to your Next.js site.

A sitemap defines the relationship between pages of your site. Search engines utilize this file to more accurately index your site. You can also provide additional information such as last updated time, how frequently the page changes, and more.

A robots.txt file tells search engines which pages or files the crawler can or can't request from your site.

Static Sitemaps

If your site does not update frequently, you might currently have a static sitemap. This is a basic .xml file defining the content of your site. Here's a simple example:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
      <loc>https://leerob.io</loc>
  </url>
  <url>
      <loc>https://leerob.io/blog</loc>
  </url>
</urlset>

As your site scales, you will probably want to create your sitemap dynamically.

Dynamic Sitemaps

If your site frequently changes, you should dynamically create a sitemap. Let's first look at an example where your site content is file-based (e.g., contained inside the /pages directory).

First, let's add globby so we can fetch a list of routes.

$ yarn add --dev globby

Note: globby might not work on Windows.

Next, we can create a Node script at scripts/generate-sitemap.mjs. This file will dynamically build a sitemap based on your /pages directory.

import { writeFileSync } from 'fs';
import { globby } from 'globby';
import prettier from 'prettier';

async function generate() {
  const prettierConfig = await prettier.resolveConfig('./.prettierrc.js');
  const pages = await globby([
    'pages/*.js',
    'data/**/*.mdx',
    '!data/*.mdx',
    '!pages/_*.js',
    '!pages/api',
    '!pages/404.js',
  ]);

  const sitemap = `
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmln="http://www.sitemaps.org/schemas/sitemap/0.9">
        ${pages
          .map((page) => {
            const path = page
              .replace('pages', '')
              .replace('data', '')
              .replace('.js', '')
              .replace('.mdx', '');
            const route = path === '/index' ? '' : path;

            return `
              <url>
                  <loc>${`https://leerob.io${route}`}</loc>
              </url>
            `;
          })
          .join('')}
    </urlset>
    `;

  const formatted = prettier.format(sitemap, {
    ...prettierConfig,
    parser: 'html',
  });

  // eslint-disable-next-line no-sync
  writeFileSync('public/sitemap.xml', formatted);
}

generate();

Finally, add postbuild script in your package.json to run this script after next build completes. Your generated file gets created at public/sitemap.xml which is then accessible from the root of your site.

{
  "scripts": {
    "build": "next build",
    "postbuild": "node ./scripts/generate-sitemap.mjs",
    "dev": "next dev",
    "start": "next start",
    "lint": "next lint"
  }
}

External Content

If you have externally hosted data (e.g., a CMS), you'll need to make an API request before you can create your sitemap. This implementation will vary depending on your data source, but the idea is the same. To demonstrate, I've created an example using placeholder data.

First, create a new file at pages/sitemap.xml.js.

import React from 'react';

class Sitemap extends React.Component {
  static async getInitialProps({ res }) {
    const request = await fetch(EXTERNAL_DATA_URL);
    const posts = await request.json();

    res.setHeader('Content-Type', 'text/xml');
    res.write(createSitemap(posts));
    res.end();
  }
}

export default Sitemap;

When the route /sitemap.xml is initially loaded, we will fetch posts from an external data source and then write an XML file as the response.

import React from 'react';

const EXTERNAL_DATA_URL = 'https://jsonplaceholder.typicode.com/posts';

const createSitemap = (posts) => `<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmln="http://www.sitemaps.org/schemas/sitemap/0.9">
        ${posts
          .map(({ id }) => {
            return `
                <url>
                    <loc>${`${EXTERNAL_DATA_URL}/${id}`}</loc>
                </url>
            `;
          })
          .join('')}
    </urlset>
    `;

class Sitemap extends React.Component {
  static async getInitialProps({ res }) {
    const request = await fetch(EXTERNAL_DATA_URL);
    const posts = await request.json();

    res.setHeader('Content-Type', 'text/xml');
    res.write(createSitemap(posts));
    res.end();
  }
}

export default Sitemap;

Here's a condensed example of the output.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
      <loc>https://jsonplaceholder.typicode.com/posts/1</loc>
  </url>
  <url>
      <loc>https://jsonplaceholder.typicode.com/posts/2</loc>
  </url>
</urlset>

robots.txt

Finally, we can create a static file at public/robots.txt to define which files can be crawled and where the sitemap is located.

User-agent: *
Sitemap: https://leerob.io/sitemap.xml