cmd/anubis: configurable difficulty per-bot rule (#53)

Closes #30

Introduces the "challenge" field in bot rule definitions:

```json
{
  "name": "generic-bot-catchall",
  "user_agent_regex": "(?i:bot|crawler)",
  "action": "CHALLENGE",
  "challenge": {
    "difficulty": 16,
    "report_as": 4,
    "algorithm": "slow"
  }
}
```

This makes Anubis return a challenge page for every user agent with "bot" or "crawler" in it (case-insensitively) with difficulty 16 using the old "slow" algorithm but reporting in the client as difficulty 4.

This is useful when you want to make certain clients in particular suffer.

Additional validation and testing logic has been added to make sure that users do not define "impossible" challenge settings.

If no algorithm is specified, Anubis defaults to the "fast" algorithm.

Signed-off-by: Xe Iaso <me@xeiaso.net>
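The validation and defaulting behavior described in the commit message can be illustrated with a small sketch. This is not the Anubis implementation (which lives in the Go code under cmd/anubis and is not shown on this page); `parse_challenge`, `ChallengeRules`, and the upper bound of 64 are hypothetical names and values chosen for illustration. The only behavior taken from the source is that the algorithm must be `"fast"` or `"slow"`, that it defaults to `"fast"`, and that "impossible" settings are rejected.

```python
from dataclasses import dataclass

# Assumption: a SHA-256 hex digest has 64 digits, so more than 64 leading
# zeros could never be satisfied. The bounds Anubis really enforces may differ.
MAX_DIFFICULTY = 64
ALGORITHMS = {"fast", "slow"}


@dataclass
class ChallengeRules:
    difficulty: int
    report_as: int
    algorithm: str = "fast"  # default stated in the commit message


def _check_range(key: str, value: object) -> int:
    """Reject values that are not integers in 1..MAX_DIFFICULTY."""
    if not isinstance(value, int) or isinstance(value, bool):
        raise ValueError(f"{key} must be an integer, got {value!r}")
    if not 1 <= value <= MAX_DIFFICULTY:
        raise ValueError(f"{key} must be between 1 and {MAX_DIFFICULTY}, got {value}")
    return value


def parse_challenge(raw: dict) -> ChallengeRules:
    """Validate the "challenge" object of a bot rule (hypothetical helper)."""
    algorithm = raw.get("algorithm") or "fast"  # missing or empty -> "fast"
    if algorithm not in ALGORITHMS:
        raise ValueError(f'algorithm must be "fast" or "slow", got {algorithm!r}')
    return ChallengeRules(
        difficulty=_check_range("difficulty", raw.get("difficulty")),
        report_as=_check_range("report_as", raw.get("report_as")),
        algorithm=algorithm,
    )
```

With this sketch, the rule from the commit message parses unchanged, a rule without an `"algorithm"` key falls back to `"fast"`, and a difficulty of 0 or an unknown algorithm name raises `ValueError` at load time rather than producing a challenge no client could ever report correctly.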
author: Xe Iaso <me@xeiaso.net> 2025-03-21 13:48:00 -0400
committer: GitHub <noreply@github.com> 2025-03-21 13:48:00 -0400
commit: d3e509517c12ddf82adf8ab29a36da9da9bd2bd2 (patch)
tree: 1bf6faa3454cadbeabd2ca822585ba038f673e73 /docs
parent: 90049001e9fac3d11cbe4f45dee473f5b2601171 (diff)
download: anubis-d3e509517c12ddf82adf8ab29a36da9da9bd2bd2.tar.xz anubis-d3e509517c12ddf82adf8ab29a36da9da9bd2bd2.zip
3 files changed, 52 insertions, 0 deletions
diff --git a/docs/docs/CHANGELOG.md b/docs/docs/CHANGELOG.md
index 2b55261..a15af7b 100644
--- a/docs/docs/CHANGELOG.md
+++ b/docs/docs/CHANGELOG.md
@@ -11,6 +11,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+- Administrators can now define artificially hard challenges using the "slow" algorithm:
+
+  ```json
+  {
+    "name": "generic-bot-catchall",
+    "user_agent_regex": "(?i:bot|crawler)",
+    "action": "CHALLENGE",
+    "challenge": {
+      "difficulty": 16,
+      "report_as": 4,
+      "algorithm": "slow"
+    }
+  }
+  ```
+
+  This allows administrators to cause particularly malicious clients to use unreasonable amounts of CPU. The UI will also lie to the client about the difficulty.
+
 - Docker images now explicitly call `docker.io/library/<thing>` to increase compatibility with Podman et. al
   [#21](https://github.com/TecharoHQ/anubis/pull/21)
 - Don't overflow the image when browser windows are small (eg. on phones)
diff --git a/docs/docs/admin/algorithm-selection.mdx b/docs/docs/admin/algorithm-selection.mdx
new file mode 100644
index 0000000..e5bf962
--- /dev/null
+++ b/docs/docs/admin/algorithm-selection.mdx
@@ -0,0 +1,12 @@
+---
+title: Proof-of-Work Algorithm Selection
+---
+
+Anubis offers two proof-of-work algorithms:
+
+- `"fast"`: highly optimized JavaScript that will run as fast as your computer lets it
+- `"slow"`: intentionally slow JavaScript that will waste time and memory
+
+The fast algorithm is used by default to limit impacts on users' computers. Administrators may configure individual bot policy rules to use the slow algorithm in order to make known malicious clients waitloop and do nothing useful.
+
+Generally, you should use the fast algorithm unless you have a good reason not to.
diff --git a/docs/docs/admin/policies.md b/docs/docs/admin/policies.md
index bdf8a20..481a455 100644
--- a/docs/docs/admin/policies.md
+++ b/docs/docs/admin/policies.md
@@ -68,6 +68,29 @@ There are three actions that can be returned from a rule:
 
 Name your rules in lower case using kebab-case. Rule names will be exposed in Prometheus metrics.
 
+Rules can also have their own challenge settings. These are customized using the `"challenge"` key. For example, here is a rule that makes challenges artificially hard for connections with the substring "bot" in their user agent:
+
+```json
+{
+  "name": "generic-bot-catchall",
+  "user_agent_regex": "(?i:bot|crawler)",
+  "action": "CHALLENGE",
+  "challenge": {
+    "difficulty": 16,
+    "report_as": 4,
+    "algorithm": "slow"
+  }
+}
+```
+
+Challenges can be configured with these settings:
+
+| Key          | Example  | Description                                                                                                                                                                                    |
+| :----------- | :------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `difficulty` | `4`      | The challenge difficulty (number of leading zeros) for proof-of-work. See [Why does Anubis use Proof-of-Work?](/docs/design/why-proof-of-work) for more details.                               |
+| `report_as`  | `4`      | What difficulty the UI should report to the user. Useful for messing with industrial-scale scraping efforts.                                                                                   |
+| `algorithm`  | `"fast"` | The algorithm used on the client to run proof-of-work calculations. This must be set to `"fast"` or `"slow"`. See [Proof-of-Work Algorithm Selection](./algorithm-selection) for more details. |
+
 In case your service needs it for risk calculation reasons, Anubis exposes information about the rules that any requests match using a few headers:
 
 | Header            | Explanation                                          | Example          |
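To give a sense of why `"difficulty": 16` paired with `"report_as": 4` is so punishing, here is a rough sketch of a leading-zeros proof-of-work loop. It assumes that the "number of leading zeros" in the policies.md table refers to hexadecimal digits of a SHA-256 digest computed over the challenge string followed by a nonce; that detail is an assumption rather than something this diff states, and `solve` and `expected_attempts` are hypothetical names.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> tuple[int, str]:
    """Find a nonce whose digest starts with `difficulty` zero hex digits.

    Assumption: the hash is SHA-256 over challenge + nonce, and zeros are
    counted in hex digits. The real client-side algorithm may differ.
    """
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


def expected_attempts(difficulty: int) -> int:
    """Average number of hashes needed: each hex digit is zero 1 time in 16."""
    return 16 ** difficulty


if __name__ == "__main__":
    # Keep the demo difficulty small so it finishes quickly.
    nonce, digest = solve("example-challenge", 3)
    print(f"nonce={nonce} digest={digest}")
    for d in (4, 16):
        print(f"difficulty {d}: about {expected_attempts(d):,} hashes on average")
```

Under these assumptions, difficulty 4 averages 16^4 = 65,536 hashes, while difficulty 16 averages 16^16 = 2^64 hashes, which no browser will finish in any reasonable time. Reporting the challenge as difficulty 4 hides that gap from whoever is watching the client.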