• cranakis@reddthat.com
    7 days ago

    Might be too much work, but you can allow a subset of traffic to bypass a CF WAF rule if the federated traffic is identifiable vs. the scrapers.

    Edit: I’m reading up. What I said above may not apply to the one-click thing: https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click/

    I do support turning it on after what I read at that link.

    Edit 2: From here: https://developers.cloudflare.com/bots/get-started/free/#limitations

    Limitations: You cannot bypass or skip Bot Fight Mode using the Skip action in WAF custom rules or using Page Rules. Skip, Bypass, and Allow actions apply to rules or rulesets running on the Ruleset Engine. While Super Bot Fight Mode rules are implemented in the Ruleset Engine, Bot Fight Mode checks are not. This is why you can skip Super Bot Fight Mode, but not Bot Fight Mode. If you need to skip Bot Fight Mode, consider using Super Bot Fight Mode.

    It’s like they tried to make that confusing to read.

    • Tiff@reddthat.comM
      7 days ago

      Possibly, as it’s one generic endpoint, but it also blocked a few other things people in the fediverse created, which are mighty helpful in diagnosing these and other issues.

      So relying on some AI model, or whatever CF uses, is probably not going to be the best thing for us, as it classified a POST request as a crawler?? 🤷

      I’d have to whitelist every regular endpoint as well, and then it gets messy, as CF only gives you so much control as a free user.

      So, for the moment I’ve blocked the most annoying ones based on User-Agent.
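
      For anyone wondering what blocking on User-Agent boils down to, here is a rough Python sketch of the matching logic. The token list and the `is_blocked` helper are hypothetical, picked purely for illustration; this is not the actual rule or list used on the instance, and on Cloudflare the equivalent would be written as a WAF rule rather than code.

```python
from typing import Optional

# Illustrative tokens only (assumption): not the list actually used on the instance.
BLOCKED_UA_TOKENS = ("GPTBot", "Bytespider", "ClaudeBot")


def is_blocked(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent contains any blocked token (case-insensitive)."""
    if not user_agent:
        # A missing header is not treated as a match here; other rules can handle it.
        return False
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BLOCKED_UA_TOKENS)


if __name__ == "__main__":
    print(is_blocked("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(is_blocked("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"))
```

      The obvious weakness is that the header is trivially spoofable, so this only stops crawlers that identify themselves honestly.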

      • cranakis@reddthat.com
        7 days ago

        I’d have to whitelist every regular endpoint

        That’s why I started with “this might be too much work” 😆. Seems like there would be a way to do it without the automated bot blocking, just using allow and deny (or challenge, I guess it is here). The list would be a bitch to create by hand, but shouldn’t it already exist somewhere in the federation configs? If so, you could broadly allow those while blocking or challenging everything else. I guess it comes down to how you identify bot traffic on the free plan without the tool on.
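
        The allow-or-challenge idea above could be sketched roughly like this in Python. Everything here is an assumption for illustration: `decide`, `normalize_host`, and the example hostnames are made up, and the sketch takes for granted that the sending instance's hostname is already known for each request, which is exactly the hard part on the free plan.

```python
from typing import Iterable


def normalize_host(host: str) -> str:
    """Lowercase a hostname and drop surrounding whitespace and any trailing dot."""
    return host.strip().lower().rstrip(".")


def decide(source_host: str, allowed_hosts: Iterable[str]) -> str:
    """Return "allow" for hosts on the federation allow-list, "challenge" otherwise."""
    allowed = {normalize_host(h) for h in allowed_hosts}
    return "allow" if normalize_host(source_host) in allowed else "challenge"


if __name__ == "__main__":
    # Hypothetical allow-list; a real one would come from the instance's federation data.
    federated = ["lemmy.example", "other-instance.example"]
    print(decide("Lemmy.Example.", federated))
    print(decide("scraper.invalid", federated))
```

        Challenging rather than blocking by default keeps a path open for real browsers that are not on the list.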

        Full disclosure: I have CF Enterprise experience, but I’m just guessing on the Lemmy/federation part and haven’t messed with CF free.