Add site map and search engine robots
An attempt to generate and write the sitemap.xml
file to the root directory results in the following error:
Please make sure that "/" is writable by the web-server.
With Adobe Commerce on cloud infrastructure, you can only write to specific directories, such as var
, pub/media
, pub/static
, or app/etc
. When you generate the sitemap.xml
file using the Admin panel, you must specify the /media/
path.
You do not have to generate a robots.txt
file because it generates the robots.txt
content on demand and stores it in the database. You can view the content in your browser with the <domain.your.project>/robots.txt
link.
This requires ECE-Tools version 2002.0.12 and later with an updated .magento.app.yaml
file. See an example of these rules in the magento-cloud repository.
To generate a sitemap.xml
file in version 2.2 and later:
-
Access the Admin.
-
On the Marketing menu, click Site Map in the SEO & Search section.
-
In the Site Map view, click Add Sitemap.
-
In the New Site Map view, enter the following values:
- Filename:
sitemap.xml
- Path:
/media/
- Filename:
-
Click Save & Generate. The new site map becomes available in the Site Map grid.
-
Click the path in the Link for Google column.
To add content to the robots.txt
file:
- Access the Admin.
- On the Content menu, click Configuration in the Design section.
- In the Design Configuration view, click Edit for the website in the Action column.
- In the Main Website view, click Search Engine Robots.
- Update the Edit custom instruction of robots.txt field.
- Click Save Configuration.
- Verify the
<domain.your.project>/robots.txt
file in your browser.
<domain.your.project>/robots.txt
file generates a 404 error
, Submit an Adobe Commerce Support ticket to remove the redirect from /robots.txt
to /media/robots.txt
.Rewrite using Fastly VCL snippet
If you have different domains and you need separate site maps, you can create a VCL to route to the proper sitemap. Generate the sitemap.xml
file in the Admin panel as described above, then create a custom Fastly VCL snippet to manage the redirect. See Custom Fastly VCL snippets.
Use a Fastly VCL snippet for redirect
Create a custom VCL snippet to rewrite the path for sitemap.xml
to /media/sitemap.xml
using the type
and content
key-value pairs.
{
"name": "sitemapxml_rewrite",
"dynamic": "0",
"type": "recv",
"priority": "90",
"content": "if ( req.url.path ~ \"^/?sitemap.xml$\" ) { set req.url = \"/media/sitemap.xml\"; }"
}
The following example demonstrates how to rewrite the path for robots.txt
and sitemap.xml
to /media/robots.txt
and /media/sitemap.xml
{
"name": "sitemaprobots_rewrite",
"dynamic": "0",
"type": "recv",
"priority": "90",
"content": "if ( req.url.path ~ \"^/?sitemap.xml$\" ) { set req.url = \"/media/sitemap.xml\"; } else if (req.url.path ~ \"^/?robots.txt$\") { set req.url = \"/media/robots.txt\";}"
}
To use a Fastly VCL snippet for particular domain redirect:
Create a pub/media/domain_robots.txt
file, where the domain is domain.com
, and use the next VCL snippet:
{
"name": "domain_robots",
"dynamic": "0",
"type": "recv",
"priority": "90",
"content": "if ( req.url.path == \"/robots.txt\" ) { if ( req.http.host ~ \"(domain).com$\" ) { set req.url = \"/media/\" re.group.1 \"_robots.txt\"; }}"
}
The VCL snippet routes http://domain.com/robots.txt
and presents the pub/media/domain_robots.txt
file.
To configure a redirect for robots.txt
and sitemap.xml
in a single snippet, create pub/media/domain_robots.txt
and pub/media/domain_sitemap.xml
files, where the domain is domain.com
and use the next VCL snippet:
{
"name": "domain_sitemaprobots",
"dynamic": "0",
"type": "recv",
"priority": "90",
"content": "if ( req.url.path == \"/robots.txt\" ) { if ( req.http.host ~ \"(domain).com$\" ) { set req.url = \"/media/\" re.group.1 \"_robots.txt\"; }} else if ( req.url.path == \"/sitemap.xml\" ) { if ( req.http.host ~ \"(domain).com$\" ) { set req.url = \"/media/\" re.group.1 \"_sitemap.xml\"; }}"
}
In the sitemap
admin config, you must specify the location of the file using pub/media/
rather than /
.
Configure indexing by search engine
To activate robots.txt
customizations, you must enable the Indexing by search engines is On for <environment-name>
option in your project settings.
robots.txt
file, add robots.txt
to the Front Name Allowlist at Stores > Configuration > General > Web > UPWARD PWA Configuration.