nuxt-simple-robots


Tame the robots crawling and indexing your Nuxt site with ease.


Status: v3 Released 🎉
Please report any issues 🐛
Made possible by my Sponsor Program 💖
Follow me @harlan_zw 🐦 • Join Discord for help

Features

  • 🤖 Merge in your existing robots.txt or programmatically create a new one
  • 🗿 Automatic X-Robots-Tag header and <meta name="robots" ...> meta tag
  • 🔄 Integrates with route rules and runtime hooks
  • 🔒 Disables non-production environments from being indexed
  • Solves common issues with a best-practice default config

Installation

  1. Install the nuxt-simple-robots dependency in your project:
# yarn
yarn add -D nuxt-simple-robots
# npm
npm install -D nuxt-simple-robots
# pnpm
pnpm i -D nuxt-simple-robots
  2. Add it to the modules section of your nuxt.config:
export default defineNuxtConfig({
  modules: ['nuxt-simple-robots']
})

Documentation

📖 Read the full documentation for more information.

Usage

Robots.txt configuration

The recommended way to configure your robots.txt is to create a robots.txt file in your project root or assets folder.

For environments that are indexable, this file will be parsed and merged with the module config.

User-agent: *
Disallow: /secret
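
To make the "parsed and merged" step more concrete, here is a minimal sketch of how a robots.txt file could be turned into structured groups before being combined with module config. The `parseRobotsTxt` function and the `RobotsGroup` shape are hypothetical names used for illustration only; they are not the module's actual parser.

```typescript
// Hypothetical shape for illustration; not the module's internal type.
interface RobotsGroup {
  userAgent: string[]
  allow: string[]
  disallow: string[]
}

// Parse robots.txt text into groups.
// Consecutive User-agent lines are collected into the same group.
function parseRobotsTxt(text: string): RobotsGroup[] {
  const groups: RobotsGroup[] = []
  let current: RobotsGroup | undefined
  let lastWasUserAgent = false
  for (const rawLine of text.split(/\r?\n/)) {
    // drop comments and surrounding whitespace
    const line = rawLine.replace(/#.*$/, '').trim()
    const sep = line.indexOf(':')
    if (sep === -1)
      continue
    const field = line.slice(0, sep).trim().toLowerCase()
    const value = line.slice(sep + 1).trim()
    if (field === 'user-agent') {
      // a User-agent line after a rule line starts a new group
      if (!current || !lastWasUserAgent) {
        current = { userAgent: [], allow: [], disallow: [] }
        groups.push(current)
      }
      current.userAgent.push(value)
      lastWasUserAgent = true
      continue
    }
    lastWasUserAgent = false
    if (!current)
      continue
    if (field === 'allow')
      current.allow.push(value)
    else if (field === 'disallow')
      current.disallow.push(value)
  }
  return groups
}

// The example file above yields a single group for `*` that disallows /secret.
const parsed = parseRobotsTxt('User-agent: *\nDisallow: /secret\n')
```

Groups in this form can then be appended to any groups defined in config, which is one plausible reading of "merged" here.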

If you'd prefer to load your robots.txt file from a different path, you can use the mergeWithRobotsTxtPath config.

Public folder

You're free to place your robots.txt in your <rootDir>/public folder; however, you won't benefit from all the features of this module.

Programmatic build-time configuration

If you need programmatic control, you can configure the module using nuxt.config with the following options:

  • disallow - An array of paths to disallow for the * user-agent.
  • allow - An array of paths to allow for the * user-agent.
  • groups - A stack of objects to provide granular control (see below).

export default defineNuxtConfig({
  robots: {
    // provide simple disallow rules for all robots `user-agent: *`
    disallow: ['/secret'],
    // add more granular rules
    groups: [
      // block specific robots from specific pages
      {
        userAgents: ['AdsBot-Google-Mobile', 'AdsBot-Google-Mobile-Apps'],
        disallow: ['/admin'],
        allow: ['/admin/login'],
        comments: 'Allow Google AdsBot to index the login page but no other admin pages'
      },
    ]
  }
})
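
As a rough illustration of what this config describes, the sketch below serializes groups into robots.txt text. `renderRobotsTxt` and `GroupInput` are hypothetical names, and the field names simply mirror the example above; the module's real output (ordering, credits comments, sitemap lines) may differ.

```typescript
// Group shape mirroring the config example above; an assumption for illustration.
interface GroupInput {
  userAgents: string[]
  allow?: string[]
  disallow?: string[]
  comments?: string
}

// Serialize groups into robots.txt text, one block per group.
function renderRobotsTxt(groups: GroupInput[]): string {
  return groups
    .map((group) => {
      const lines: string[] = []
      if (group.comments)
        lines.push(`# ${group.comments}`)
      for (const agent of group.userAgents)
        lines.push(`User-agent: ${agent}`)
      for (const path of group.allow ?? [])
        lines.push(`Allow: ${path}`)
      for (const path of group.disallow ?? [])
        lines.push(`Disallow: ${path}`)
      return lines.join('\n')
    })
    .join('\n\n')
}

// The top-level `disallow` option is treated here as a group for `*`.
const robotsTxt = renderRobotsTxt([
  { userAgents: ['*'], disallow: ['/secret'] },
  {
    userAgents: ['AdsBot-Google-Mobile', 'AdsBot-Google-Mobile-Apps'],
    disallow: ['/admin'],
    allow: ['/admin/login'],
  },
])
```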

Route Rules configuration

If you prefer, you can use route rules to configure how your routes are indexed by search engines.

You can provide the following rules:

  • { index: false } - Will disable the route from being indexed using the robotsDisabledValue config.
  • { robots: <string> } - Will add the provided string as the robots rule.

export default defineNuxtConfig({
  routeRules: {
    // use the `index` shortcut for simple rules
    '/secret/**': { index: false },
    // add exceptions for individual routes
    '/secret/visible': { index: true },
    // use the `robots` rule if you need finer control
    '/custom-robots': { robots: 'index, follow' },
  }
})

The rules are applied using the following logic:

  • X-Robots-Tag header - SSR only
  • <meta name="robots"> - When using the defineRobotMeta composable or the RobotMeta component
  • /robots.txt disallow entry - When disallowNonIndexableRoutes is enabled
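
The sketch below shows one way a route's robots value could be resolved from rules like those above. It is a simplification under stated assumptions: `matches` and `resolveRobots` are hypothetical helpers, the glob matcher only understands a trailing `/**`, and "longer pattern wins" stands in for real route rule precedence. The two default strings are copied from the Module Config section.

```typescript
interface RobotsRouteRule {
  index?: boolean
  robots?: string
}

// Defaults as documented for robotsEnabledValue and robotsDisabledValue.
const robotsEnabledValue = 'index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1'
const robotsDisabledValue = 'noindex, nofollow'

// Simplified matcher (assumption): a `/**` suffix matches the prefix and everything below it.
function matches(pattern: string, path: string): boolean {
  if (pattern.endsWith('/**')) {
    const prefix = pattern.slice(0, -3)
    return path === prefix || path.startsWith(`${prefix}/`)
  }
  return pattern === path
}

// Merge every matching rule, letting longer (more specific) patterns override
// shorter ones, then turn the result into a robots directive string.
function resolveRobots(rules: Record<string, RobotsRouteRule>, path: string): string {
  const merged: RobotsRouteRule = {}
  const patterns = Object.keys(rules)
    .filter(pattern => matches(pattern, path))
    .sort((a, b) => a.length - b.length)
  for (const pattern of patterns)
    Object.assign(merged, rules[pattern])
  if (merged.robots)
    return merged.robots
  return merged.index === false ? robotsDisabledValue : robotsEnabledValue
}

const routeRules: Record<string, RobotsRouteRule> = {
  '/secret/**': { index: false },
  '/secret/visible': { index: true },
  '/custom-robots': { robots: 'index, follow' },
}

const secretPage = resolveRobots(routeRules, '/secret/page') // disabled value
const visiblePage = resolveRobots(routeRules, '/secret/visible') // enabled value
const customPage = resolveRobots(routeRules, '/custom-robots') // 'index, follow'
```

The resolved string is what would end up in the X-Robots-Tag header or the robots meta tag for that route.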

Meta Tags

By default, only the /robots.txt and X-Robots-Tag HTTP header will be used to control indexing.

For SSG apps, or to make debugging easier, it's recommended to add the robots meta tag to your pages as well.

Within your app.vue or a layout:

<script lang="ts" setup>
// Use Composition API
defineRobotMeta()
</script>

<template>
  <div>
    <!-- OR Component API -->
    <RobotMeta />
  </div>
</template>

Prerendering robots.txt

If you plan to prerender your robots.txt and aren't providing absolute sitemap URLs, then you should provide a canonical site URL through the nuxt-site-config module.

export default defineNuxtConfig({
  // @see https://github.com/harlan-zw/nuxt-site-config
  site: {
    url: process.env.NUXT_SITE_URL || 'https://example.com',
  },
})

Module Config

enabled

  • Type: boolean
  • Default: true
  • Required: false

Conditionally toggle the module.

sitemap

  • Type: string | string[] | false
  • Default: false

The sitemap URL(s) for the site. If you have multiple sitemaps, you can provide an array of URLs.

You must either define the runtime config siteUrl or provide the sitemap as absolute URLs.

export default defineNuxtConfig({
  robots: {
    sitemap: [
      '/sitemap-one.xml',
      '/sitemap-two.xml',
    ],
  },
})
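
To illustrate why a site URL is needed for relative entries like these, here is a minimal sketch that resolves sitemap values into absolute `Sitemap:` lines using the standard `URL` constructor. `resolveSitemaps` is a hypothetical helper and `cdn.example.com` is a made-up host; the module's own resolution logic is not shown in this README.

```typescript
// Resolve sitemap entries against a canonical site URL.
// Absolute URLs pass through unchanged; relative paths are joined to siteUrl.
function resolveSitemaps(sitemap: string | string[] | false, siteUrl: string): string[] {
  if (sitemap === false)
    return []
  const entries = Array.isArray(sitemap) ? sitemap : [sitemap]
  return entries.map(entry => new URL(entry, siteUrl).toString())
}

const sitemapLines = resolveSitemaps(
  ['/sitemap-one.xml', 'https://cdn.example.com/sitemap-two.xml'],
  'https://example.com',
).map(url => `Sitemap: ${url}`)
```

Without a base URL, `new URL('/sitemap-one.xml')` would throw, which mirrors the requirement stated above.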

allow

  • Type: string[]
  • Default: []
  • Required: false

Allow paths to be indexed for the * user-agent (all robots).

disallow

  • Type: string[]
  • Default: []
  • Required: false

Disallow paths from being indexed for the * user-agent (all robots).

groups

  • Type: { userAgent: []; allow: []; disallow: []; comments: [] }[]
  • Default: []
  • Required: false

Define more granular rules for the robots.txt. Each group is a set of rules for specific user agent(s).

export default defineNuxtConfig({
  robots: {
    groups: [
      {
        userAgents: ['AdsBot-Google-Mobile', 'AdsBot-Google-Mobile-Apps'],
        disallow: ['/admin'],
        allow: ['/admin/login'],
        comments: 'Allow Google AdsBot to index the login page but no other admin pages'
      },
    ]
  }
})

robotsEnabledValue

  • Type: string
  • Default: 'index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1'
  • Required: false

The value to use when the site is indexable.

robotsDisabledValue

  • Type: string
  • Default: 'noindex, nofollow'
  • Required: false

The value to use when the site is not indexable.

disallowNonIndexableRoutes

  • Type: boolean
  • Default: false

Whether route rules that disallow indexing should also be added to the /robots.txt file.
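
A minimal config sketch combining this option with a route rule, using only options described in this README. The exact entry written to /robots.txt is not specified here, so treat the comment as an expectation rather than a guarantee.

```typescript
export default defineNuxtConfig({
  robots: {
    // ask the module to mirror non-indexable route rules into /robots.txt
    disallowNonIndexableRoutes: true,
  },
  routeRules: {
    // expected to produce a matching disallow entry in /robots.txt
    '/secret/**': { index: false },
  },
})
```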

mergeWithRobotsTxtPath

  • Type: boolean | string
  • Default: true
  • Required: false

Specify a robots.txt path to merge the config from, relative to the root directory.

When set to true, the default path of <publicDir>/robots.txt will be used.

When set to false, no merging will occur.
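
For example, pointing the module at a file in a non-default location might look like the sketch below. The path `config/robots.txt` is hypothetical and only illustrates the string form of the option.

```typescript
export default defineNuxtConfig({
  robots: {
    // hypothetical path, resolved relative to the root directory
    mergeWithRobotsTxtPath: 'config/robots.txt',
  },
})
```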

blockNonSeoBots

  • Type: boolean
  • Default: false
  • Required: false

Blocks some non-SEO bots from crawling your site. This is not a replacement for a full-blown bot management solution, but it can help to reduce the load on your server.

See const.ts for the list of bots that are blocked.

export default defineNuxtConfig({
  robots: {
    blockNonSeoBots: true
  }
})

debug

  • Type: boolean
  • Default: false
  • Required: false

Enables debug logs and a debug endpoint.

credits

  • Type: boolean
  • Default: true
  • Required: false

Control the module credit comment in the generated robots.txt file.

# START nuxt-simple-robots (indexable) <- credits
 ...
# END nuxt-simple-robots <- credits

export default defineNuxtConfig({
  robots: {
    credits: false
  }
})

siteUrl - DEPRECATED

  • Type: string

Used to ensure sitemaps are absolute URLs.

Note: This is only required when prerendering your site.

This is now handled by the nuxt-site-config module.

You should provide url through site config instead; see the nuxt-site-config module for more examples.

export default defineNuxtConfig({
  site: {
    url: process.env.NUXT_SITE_URL || 'https://example.com',
  },
})

indexable - DEPRECATED

  • Type: boolean
  • Default: process.env.NODE_ENV === 'production'

Whether the site is indexable by search engines.

This is now handled by the nuxt-site-config module.

If you need to change the default, provide indexable through site config instead; see the nuxt-site-config module for more examples.

Nuxt Hooks

robots:config

Type: async (config: ModuleOptions) => void | Promise<void>

This hook allows you to modify the robots config before it is used to generate the robots.txt and meta tags.

export default defineNuxtConfig({
  hooks: {
    'robots:config': (config) => {
      // modify the config
      config.sitemap = '/sitemap.xml'
    },
  },
})

Nitro Hooks

robots:robots-txt

Type: async (ctx: { robotsTxt: string }) => void | Promise<void>

This hook allows you to modify the robots.txt content before it is sent to the client.

import { defineNitroPlugin } from 'nitropack/runtime/plugin'

export default defineNitroPlugin((nitroApp) => {
  if (!process.dev) {
    nitroApp.hooks.hook('robots:robots-txt', async (ctx) => {
      // remove comments from robotsTxt in production
      ctx.robotsTxt = ctx.robotsTxt.replace(/^#.*$/gm, '').trim()
    })
  }
})
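
To see what the replacement in this hook does to the content, the same expression can be run standalone. The sample text below is made up for illustration, and `stripComments` is just a local wrapper around the hook body.

```typescript
// Same transformation as the hook body, extracted so it can be run on its own.
function stripComments(robotsTxt: string): string {
  return robotsTxt.replace(/^#.*$/gm, '').trim()
}

// Made-up sample resembling a generated file.
const sample = [
  '# START nuxt-simple-robots (indexable)',
  'User-agent: *',
  '# keep crawlers out of the secret area',
  'Disallow: /secret',
  '# END nuxt-simple-robots',
].join('\n')

// Comment lines become empty lines; leading and trailing ones are trimmed away.
const stripped = stripComments(sample)

// Optional extra step (not in the hook above): collapse the leftover blank lines.
const compact = stripped.replace(/\n{2,}/g, '\n')
```

Note that the regular expression only removes lines that start with `#`; a trailing comment after a directive on the same line would be left in place.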

License

MIT License © 2022-PRESENT Harlan Wilton