Issues with Googlebot accessing my robots.txt file

Hello!

I’m experiencing issues with Googlebot accessing my robots.txt file. When Google tries to fetch /robots.txt, it returns “Failed: Robots.txt unreachable”.

I’m hosting a static robots.txt file in the public/ directory of my Next.js app.

We have checked everything from our side and everything is okay. The website works fine and no changes have been made recently.

https://tabeer.ae/robots.txt is directly accessible from the browser but on Google Search Console it is not reachable. I am sharing some screenshots which show everything was okay before 6th April but after that robots.txt is not reachable by Google Search Console.

Can you help us resolve this issue?


Hi @it-tabeerae, welcome to the Vercel Community!

Sorry to see you’re facing this issue. I can see that the URL is accessible from the internet.

Let’s try one quick fix. I’ve updated your robots.txt to have one space after each colon:


User-Agent: *

Allow: /
Disallow: /private/

Sitemap: https://www.tabeer.ae/sitemap.xml
Sitemap: https://www.tabeer.ae/projects/sitemap.xml
Sitemap: https://www.tabeer.ae/news-and-events/sitemap.xml

Can you try making this change and see if that makes any difference?
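If it helps, the spacing change can be applied mechanically. This is only a minimal sketch, assuming the original file had missing or extra spaces after the colons; `normalize_robots` is a hypothetical helper name, not part of any library.

```python
def normalize_robots(text: str) -> str:
    """Rewrite each `field:value` line as `field: value` (one space after the colon)."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines, comments, and lines without a colon are only trimmed.
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            out.append(stripped)
            continue
        # Split on the first colon only, so URLs such as https://... stay intact.
        field, value = stripped.split(":", 1)
        out.append(f"{field.strip()}: {value.strip()}")
    return "\n".join(out) + "\n"
```

Running it over the contents of public/robots.txt and writing the result back gives a file in the same shape as the one above.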

Hi @anshumanb, thank you for the reply. I updated the robots.txt but it didn’t make any difference. It is still the same: when I run “Test live URL” in Google Search Console, it says robots.txt is unreachable.

I see. Have you tried the solutions mentioned in this post from Google Support?

Can you also try using the https://www.tabeer.ae/robots.txt URL? I think without the www it gets redirected.

Yes, I have tried the solutions from the link you shared. Nothing changed.

I have also tested with www…

without www
curl -I https://tabeer.ae/robots.txt
HTTP/1.1 308 Permanent Redirect
Cache-Control: public, max-age=0, must-revalidate
Content-Type: text/plain
Date: Wed, 16 Apr 2025 13:21:47 GMT
Location: https://www.tabeer.ae/robots.txt
Refresh: 0;url=https://www.tabeer.ae/robots.txt
Server: Vercel
Strict-Transport-Security: max-age=63072000
X-Vercel-Id: bom1::cm8vz-1744809707383-4bc7d6841a3f

with www
curl -I https://www.tabeer.ae/robots.txt
HTTP/1.1 200 OK
Accept-Language: en-US,en;q=0.5
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Age: 1358
Cache-Control: public, max-age=0, must-revalidate
Content-Disposition: inline; filename="robots.txt"
Content-Length: 199
Content-Type: text/plain; charset=utf-8
Date: Wed, 16 Apr 2025 13:43:41 GMT
Etag: "c63d75e7ff334ff4d2bd33fa9affa289"
Last-Modified: Wed, 16 Apr 2025 13:21:02 GMT
Server: Vercel
Strict-Transport-Security: max-age=63072000
Vary: Accept-Language
X-Matched-Path: /robots.txt
X-Vercel-Cache: HIT
X-Vercel-Id: bom1::z86lw-1744811021221-540a48990bf9
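The same two checks can be repeated from a script, which makes it easier to re-run them whenever Search Console reports the error again. This is a rough sketch using only the Python standard library; `head` and `describe` are hypothetical helper names, and the User-Agent string is an arbitrary placeholder (it does not imitate Googlebot).

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)

def head(url: str, timeout: float = 10.0):
    """Send one HEAD request without following redirects, similar to `curl -I`."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/", headers={"User-Agent": "robots-check/0.1"})
        resp = conn.getresponse()
        # Lower-case the header names so lookups are case-insensitive.
        return resp.status, {name.lower(): value for name, value in resp.getheaders()}
    finally:
        conn.close()

def describe(status: int, headers: dict) -> str:
    """Summarize a response as either a redirect target or a content type."""
    if status in REDIRECT_CODES:
        return f"{status} redirect -> {headers.get('location', '?')}"
    return f"{status} {headers.get('content-type', 'no content-type')}"
```

For example, `print(describe(*head("https://tabeer.ae/robots.txt")))` should report the 308 and its Location, and the www URL should report the 200 with its content type, matching the curl output above.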

@anshumanb It was working just fine for some time. We had no issues and nothing has been changed either. But all of a sudden, from 6th April, this is happening.
Please help me resolve this.

Hi @it-tabeerae, thanks for the additional information. I’m still digging in to see what else we can try. Can you confirm that you don’t have any Firewall rules set up in your Vercel project settings?

No, there are no firewall rules set up.

Hi @it-tabeerae, thanks for confirming. I’ve asked about this issue internally to see how we can help you. I’ll keep you posted when I hear back.

Hey there! :waving_hand: Thanks for reporting this. Sorry to hear you’re running into trouble with your robots.txt file in Google Search Console.

We looked into this internally and can confirm that Googlebot has been receiving valid 200 OK responses for /robots.txt — in fact, we’ve served 18 successful requests to Googlebot in just the last 12 hours. :white_check_mark:

That said, we did notice one thing: the only non-200 response was a 308 redirect from your raw domain (e.g. yourdomain.com) to www.yourdomain.com. While Google typically handles redirects well, it’s possible this redirect is causing a hiccup in how Googlebot is accessing your robots.txt.

To help us dig a little deeper, could you let us know:

  1. Which version of your domain is added to Google Search Console — www or the root domain?
  2. Is your robots.txt file accessible directly from both domain.com/robots.txt and www.domain.com/robots.txt?
  3. If you haven’t already, try using Google’s robots.txt Tester to see how it renders your file from their end.

Let us know what you find!

https://tabeer.ae/robots.txt
https://www.tabeer.ae/robots.txt
Yes, it is accessible both ways.

Three days ago I fixed the issue in Google Search Console. I uploaded the sitemaps again and they were fetched.
I also changed the robots.txt a bit: I removed the www from the sitemap links in robots.txt, and everything was fine after that.

User-Agent: *
Allow: /
Disallow: /private/

Sitemap: https://tabeer.ae/sitemap.xml
Sitemap: https://tabeer.ae/projects/sitemap.xml
Sitemap: https://tabeer.ae/news-and-events/sitemap.xml
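Since the curl output earlier in the thread shows tabeer.ae answering with a 308 to www.tabeer.ae, it may be worth checking which host each Sitemap line points at. The thread does not establish that this matters to Google, so treat the following only as a diagnostic sketch; `sitemap_hosts` and `mismatched` are hypothetical helper names.

```python
from urllib.parse import urlsplit

def sitemap_hosts(robots_txt: str) -> list:
    """Return (url, host) pairs for every Sitemap line in a robots.txt body."""
    pairs = []
    for line in robots_txt.splitlines():
        # Split on the first colon only; the URL itself contains colons.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            url = value.strip()
            pairs.append((url, urlsplit(url).hostname))
    return pairs

def mismatched(robots_txt: str, expected_host: str) -> list:
    """List Sitemap URLs whose host differs from expected_host."""
    return [url for url, host in sitemap_hosts(robots_txt) if host != expected_host]
```

With the file above, `mismatched(text, "www.tabeer.ae")` would list all three Sitemap URLs, because they use the host that redirects rather than the one that returns 200.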


The robots.txt was also fetched in Google Search Console after that, and everything was fine.

But I checked today and the same thing is happening again.



@pawlean
In Google Search Console we have added tabeer.ae as a property.

@anshumanb @pawlean
Hi, can you please help resolve this issue? I don’t see anything wrong on our side: everything is the same as before, and it was working before.

Hi @it-tabeerae, after speaking to the team internally we couldn’t find any requests that we declined. It could be some transient issues with Googlebot and your domain.

One thing I noticed in the sitemap.xml is that you are using url tags for linking to other sitemaps. I’d recommend following the “Manage your sitemaps with sitemap index files” guide in the Google Search Central documentation to update your sitemap formatting.
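As a rough illustration of the sitemap index format, here is a small sketch that builds one with the Python standard library. It assumes the two child sitemaps are the /projects and /news-and-events ones listed in your robots.txt; `build_sitemap_index` is a hypothetical helper, and your Next.js app may generate its sitemaps differently.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemap_urls) -> str:
    """Build a sitemap index: one <sitemap><loc> entry per child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Assumed child sitemaps, taken from the robots.txt shown earlier in the thread.
index_xml = build_sitemap_index([
    "https://www.tabeer.ae/projects/sitemap.xml",
    "https://www.tabeer.ae/news-and-events/sitemap.xml",
])
```

The point is that child sitemaps are referenced with sitemapindex/sitemap elements rather than urlset/url elements, which are meant for pages.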

But we haven’t changed the sitemap formatting. It is the same as before. This problem only started recently, in April.

@anshumanb @pawlean

Hi, I am still facing the same issue. There is one thing I noticed when accessing the URLs without www:

C:\Users\it>curl -I https://tabeer.ae/sitemap.xml
HTTP/1.1 308 Permanent Redirect
Cache-Control: public, max-age=0, must-revalidate
Content-Type: text/plain
Date: Tue, 06 May 2025 07:03:20 GMT
Location: https://www.tabeer.ae/sitemap.xml
Refresh: 0;url=https://www.tabeer.ae/sitemap.xml
Server: Vercel
Strict-Transport-Security: max-age=63072000
X-Vercel-Id: bom1::cfzpz-1746515000317-1abff6135925

The Content-Type header is showing text/plain…

It should show text/html.

Can this be the issue?

I checked with the sitemap testing tool.

With www it’s fine.

Hi @it-tabeerae, thanks for sharing the update. I tried the same website you shared but I get a “green” result with no issues.

Regardless, I think you can use Edge Middleware to update the response header and set it to text/html, or you can use vercel.json to update the headers for the /sitemap.xml path.
