Right now, robots.txt on lemmy.ca is configured this way:
User-Agent: *
Disallow: /login
Disallow: /login_reset
Disallow: /settings
Disallow: /create_community
Disallow: /create_post
Disallow: /create_private_message
Disallow: /inbox
Disallow: /setup
Disallow: /admin
Disallow: /password_change
Disallow: /search/
Disallow: /modlog
Would it be a good idea, privacy-wise, to deny GPTBot from scraping content from the server?
User-agent: GPTBot
Disallow: /
Thanks!
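For anyone who wants to sanity-check the proposed rule before deploying it, here is a minimal sketch using Python's standard urllib.robotparser. It parses the rules quoted above plus the GPTBot block and asks whether a given agent may fetch a path. The helper name can_fetch, the agent name SomeOtherBot, and the path /post/1 are made up for illustration. Keep in mind that robots.txt is only advisory: it works if the crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Rules quoted in the post, plus the proposed GPTBot block.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /login
Disallow: /login_reset
Disallow: /settings
Disallow: /create_community
Disallow: /create_post
Disallow: /create_private_message
Disallow: /inbox
Disallow: /setup
Disallow: /admin
Disallow: /password_change
Disallow: /search/
Disallow: /modlog

User-agent: GPTBot
Disallow: /
"""


def can_fetch(agent: str, path: str) -> bool:
    """Return True if `agent` is allowed to fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical agent names and paths, chosen only for the demo.
    for agent, path in [
        ("GPTBot", "/post/1"),
        ("SomeOtherBot", "/post/1"),
        ("SomeOtherBot", "/login"),
    ]:
        print(agent, path, can_fetch(agent, path))
```

With these rules, GPTBot is refused everywhere, while other crawlers are only kept out of the paths listed under the wildcard entry.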
Yes. Ban them.
You probably want == instead, or else we will all be forbidden.
I would have thought so too, but == failed the syntax check:
2023/08/07 15:36:59 [emerg] 2315181#2315181: unexpected "==" in condition in /etc/nginx/sites-enabled/lemmy.ca.conf:50
You actually want ~ though, because GPTBot is just part of the user agent, not the full string.
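To show why the exact comparison never fires, here is a small Python sketch of the two matching styles. It only imitates the semantics (nginx's = is an exact string comparison, ~ is a case-sensitive regex match) and is not nginx code. The sample user agent string is an assumption about what GPTBot sends; check your access logs for the real value. The function names are made up for illustration.

```python
import re

# Assumed example of a GPTBot user agent; verify against your own access logs.
GPTBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)


def exact_match(user_agent: str) -> bool:
    """Imitates an exact comparison: the whole header must equal 'GPTBot'."""
    return user_agent == "GPTBot"


def pattern_match(user_agent: str) -> bool:
    """Imitates a case-sensitive regex match: 'GPTBot' may appear anywhere."""
    return re.search(r"GPTBot", user_agent) is not None


if __name__ == "__main__":
    print("exact:  ", exact_match(GPTBOT_UA))
    print("pattern:", pattern_match(GPTBOT_UA))
```

The exact comparison only succeeds when the header is literally GPTBot and nothing else, which is why the pattern form is the one that actually catches the crawler.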
Strangely, nginx uses a single = for the comparison that most languages write as ==. It’s a very strange config format… https://nginx.org/en/docs/http/ngx_http_rewrite_module.html#if
Look at me! I’m the GPTBot now!
Thanks for empowering my laziness =)