From 6858f66a62416354a349d8090fcb45b5262056eb Mon Sep 17 00:00:00 2001
From: Sandro
Date: Fri, 25 Apr 2025 19:38:02 +0200
Subject: Add check endpoint which can be used with nginx' auth_request function (#266)

* Add check endpoint which can be used with nginx' auth_request function

* feat(cmd): allow configuring redirect domains

* test: add test environment for the nginx_auth PR

This is a full local setup of the nginx_auth PR including HTTPS so that it's easier to validate in isolation. This requires an install of k3s (https://k3s.io) with traefik set to listen on localhost. This will be amended in the future but for now this works enough to ship it.

Signed-off-by: Xe Iaso

* fix(cmd|lib): allow empty redirect domains variable

Signed-off-by: Xe Iaso

* fix(test): add space to target variable in anubis container

Signed-off-by: Xe Iaso

* docs(admin): rewrite subrequest auth docs, make generic

* docs(install): document REDIRECT_DOMAINS flag

Signed-off-by: Xe Iaso

* feat(lib): clamp redirects to the same HTTP host

Only if REDIRECT_DOMAINS is not set.

Signed-off-by: Xe Iaso

---------

Signed-off-by: Xe Iaso
Co-authored-by: Xe Iaso
---
 docs/docs/CHANGELOG.md                             |   2 +
 docs/docs/admin/configuration/redirect-domains.mdx |  94 ++++++++++++++
 docs/docs/admin/configuration/subrequest-auth.mdx  | 139 +++++++++++++++++++++
 docs/docs/admin/installation.mdx                   |  39 +++---
 4 files changed, 255 insertions(+), 19 deletions(-)
 create mode 100644 docs/docs/admin/configuration/redirect-domains.mdx
 create mode 100644 docs/docs/admin/configuration/subrequest-auth.mdx

(limited to 'docs')

diff --git a/docs/docs/CHANGELOG.md b/docs/docs/CHANGELOG.md
index d824edd..128014c 100644
--- a/docs/docs/CHANGELOG.md
+++ b/docs/docs/CHANGELOG.md
@@ -16,6 +16,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Refactor check logic to be more generic and work on a Checker type
 - Add more AI user agents based on the [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt) project
 - Embedded challenge data in initial HTML response to improve performance
+- Added support for using Nginx's `auth_request` directive with Anubis
+- Added support for restricting the allowed redirect domains
 - Whitelisted [DuckDuckBot](https://duckduckgo.com/duckduckgo-help-pages/results/duckduckbot/) in botPolicies
 - Improvements to build scripts to make them less independent of the build host
 - Improved the OpenGraph error logging
diff --git a/docs/docs/admin/configuration/redirect-domains.mdx b/docs/docs/admin/configuration/redirect-domains.mdx
new file mode 100644
index 0000000..4181143
--- /dev/null
+++ b/docs/docs/admin/configuration/redirect-domains.mdx
@@ -0,0 +1,94 @@
+---
+title: Redirect Domain Configuration
+---
+
+import Tabs from "@theme/Tabs";
+import TabItem from "@theme/TabItem";
+
+Anubis has an HTTP redirect in the middle of its check validation logic. This redirect allows Anubis to set a cookie on validated requests so that users don't need to pass challenges on every page load.
+
+This flow looks something like this:
+
+```mermaid
+sequenceDiagram
+  participant User
+  participant Challenge
+  participant Validation
+  participant Backend
+
+  User->>+Challenge: GET /
+  Challenge->>+User: Solve this challenge
+  User->>+Validation: Here's the solution, send me to /
+  Validation->>+User: Here's a cookie, go to /
+  User->>+Backend: GET /
+```
+
+However, in some cases a sufficiently dedicated attacker could trick a user into clicking on a validation link with a solution pre-filled out. For example:
+
+```mermaid
+sequenceDiagram
+  participant Hacker
+  participant User
+  participant Validation
+  participant Evil Site
+
+  Hacker->>+User: Click on yoursite.com with this solution
+  User->>+Validation: Here's a solution, send me to evilsite.com
+  Validation->>+User: Here's a cookie, go to evilsite.com
+  User->>+Evil Site: GET evilsite.com
+```
+
+When a redirect targets a domain that is not allowed, Anubis rejects it with an error like this:
+
+```text
+Redirect domain not allowed
+```
+
+## Configuring allowed redirect domains
+
+By default, Anubis will limit redirects to be on the same HTTP Host that Anubis is running on (e.g. requests to yoursite.com cannot redirect outside of yoursite.com). If you need to set more than one domain, fill the `REDIRECT_DOMAINS` environment variable with a comma-separated list of domain names that Anubis should allow redirects to.
+
+:::note
+
+These domains are matched as _exact strings_; wildcard matches are not supported.
+
+:::
+
+
+
+
+```shell
+# anubis.env
+
+REDIRECT_DOMAINS="yoursite.com,secretplans.yoursite.com"
+# ...
+```
+
+
+
+
+```yaml
+services:
+  anubis-nginx:
+    image: ghcr.io/techarohq/anubis:latest
+    environment:
+      REDIRECT_DOMAINS: "yoursite.com,secretplans.yoursite.com"
+      # ...
+```
+
+
+
+
+Inside your Deployment, StatefulSet, or Pod:
+
+```yaml
+- name: anubis
+  image: ghcr.io/techarohq/anubis:latest
+  env:
+    - name: REDIRECT_DOMAINS
+      value: "yoursite.com,secretplans.yoursite.com"
+    # ...
+```
+
+
+
diff --git a/docs/docs/admin/configuration/subrequest-auth.mdx b/docs/docs/admin/configuration/subrequest-auth.mdx
new file mode 100644
index 0000000..a4bbda6
--- /dev/null
+++ b/docs/docs/admin/configuration/subrequest-auth.mdx
@@ -0,0 +1,139 @@
+---
+title: Subrequest Authentication
+---
+
+import Tabs from "@theme/Tabs";
+import TabItem from "@theme/TabItem";
+
+Anubis can act in one of two modes:
+
+1. Reverse proxy (the default): Anubis sits in the middle of all traffic and reverse proxies it to its destination. This is the moral equivalent of a middleware in your favorite web framework.
+2. Subrequest authentication mode: your web server asks Anubis to check each request, and requests that don't pass muster are forwarded to Anubis for challenge processing. This is the equivalent of Anubis being a sidecar service.
+
+## Nginx
+
+Anubis can perform [subrequest authentication](https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-subrequest-authentication/) with the `auth_request` module in Nginx. In order to set this up, keep the following things in mind:
+
+The `TARGET` environment variable in Anubis must be set to a space, e.g.:
+
+
+
+
+```shell
+# anubis.env
+
+TARGET=" "
+# ...
+```
+
+
+
+
+```yaml
+services:
+  anubis-nginx:
+    image: ghcr.io/techarohq/anubis:latest
+    environment:
+      TARGET: " "
+      # ...
+```
+
+
+
+
+Inside your Deployment, StatefulSet, or Pod:
+
+```yaml
+- name: anubis
+  image: ghcr.io/techarohq/anubis:latest
+  env:
+    - name: TARGET
+      value: " "
+    # ...
+```
+
+
+
+To configure this, add the following location blocks to each server pointing to the service you want to protect:
+
+```nginx
+location /.within.website/ {
+  # Assumption: Anubis is running in the same network namespace as
+  # nginx on localhost TCP port 8923
+  proxy_pass http://127.0.0.1:8923;
+  auth_request off;
+}
+
+location @redirectToAnubis {
+  return 307 /.within.website/?redir=$scheme://$host$request_uri;
+  auth_request off;
+}
+```
+
+This sets up `/.within.website` to point to Anubis. Any requests that Anubis rejects or throws a challenge to will be sent here. This also sets up a named location `@redirectToAnubis` that will redirect any requests to Anubis for advanced processing.
+
+Finally, add this to your root location block:
+
+```nginx
+location / {
+  # diff-add
+  auth_request /.within.website/x/cmd/anubis/api/check;
+  # diff-add
+  error_page 401 = @redirectToAnubis;
+}
+```
+
+This checks every request that does not match a more specific location against Anubis to ensure the client is genuine.
+
+As a result, these requests are checked by Anubis before they reach your backend. If you have other locations that don't need Anubis to do validation, add the `auth_request off` directive to their blocks:
+
+```nginx
+location /secret {
+  # diff-add
+  auth_request off;
+
+  # ...
+}
+```
+
+Here is a complete example of an Nginx server listening over TLS and pointing to Anubis:
+
+
+  Complete example
+
+```nginx
+# /etc/nginx/conf.d/nginx.local.cetacean.club.conf
+
+server {
+  listen 443 ssl;
+  listen [::]:443 ssl;
+  server_name nginx.local.cetacean.club;
+  ssl_certificate /etc/techaro/pki/nginx.local.cetacean.club/tls.crt;
+  ssl_certificate_key /etc/techaro/pki/nginx.local.cetacean.club/tls.key;
+  ssl_protocols TLSv1.2 TLSv1.3;
+  ssl_ciphers HIGH:!aNULL:!MD5;
+
+  proxy_set_header X-Real-IP $remote_addr;
+  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+
+  location /.within.website/ {
+    proxy_pass http://localhost:8923;
+    auth_request off;
+  }
+
+  location @redirectToAnubis {
+    return 307 /.within.website/?redir=$scheme://$host$request_uri;
+    auth_request off;
+  }
+
+  location / {
+    auth_request /.within.website/x/cmd/anubis/api/check;
+    error_page 401 = @redirectToAnubis;
+    root /usr/share/nginx/html;
+    index index.html index.htm;
+  }
+}
+```
+
+
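Note on the `auth_request` contract used in subrequest-auth.mdx above: nginx allows the original request when the subrequest returns a 2xx status and denies it on 401 or 403, which is why the root location maps `error_page 401` to `@redirectToAnubis`. The Go sketch below illustrates that contract only. It is not the Anubis check endpoint: the cookie name `example-challenge-pass` and the "any non-empty cookie passes" rule are assumptions made for illustration, whereas a real checker would validate a signed token.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// passCookie is a hypothetical cookie name used only for this sketch.
const passCookie = "example-challenge-pass"

// checkHandler answers an auth_request subrequest: 200 lets nginx serve the
// original request, 401 makes nginx run its `error_page 401` handler.
// Assumption: a non-empty cookie counts as "passed"; a real checker would
// verify a signed token instead.
func checkHandler(w http.ResponseWriter, r *http.Request) {
	if c, err := r.Cookie(passCookie); err == nil && c.Value != "" {
		w.WriteHeader(http.StatusOK)
		return
	}
	w.WriteHeader(http.StatusUnauthorized)
}

// statusFor runs one in-memory request through checkHandler and returns the
// status code nginx would see for the subrequest.
func statusFor(cookieValue string) int {
	req := httptest.NewRequest(http.MethodGet, "/.within.website/x/cmd/anubis/api/check", nil)
	if cookieValue != "" {
		req.AddCookie(&http.Cookie{Name: passCookie, Value: cookieValue})
	}
	rec := httptest.NewRecorder()
	checkHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("no cookie:", statusFor(""))
	fmt.Println("with cookie:", statusFor("opaque-token"))
}
```

The handler deliberately returns no body: nginx only looks at the subrequest's status code when deciding whether to continue.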
diff --git a/docs/docs/admin/installation.mdx b/docs/docs/admin/installation.mdx index 1070ccb..d0dc725 100644 --- a/docs/docs/admin/installation.mdx +++ b/docs/docs/admin/installation.mdx @@ -49,25 +49,26 @@ For more detailed information on installing Anubis with native packages, please Anubis uses these environment variables for configuration: -| Environment Variable | Default value | Explanation | -| :----------------------------- | :---------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `BIND` | `:8923` | The network address that Anubis listens on. For `unix`, set this to a path: `/run/anubis/instance.sock` | -| `BIND_NETWORK` | `tcp` | The address family that Anubis listens on. Accepts `tcp`, `unix` and anything Go's [`net.Listen`](https://pkg.go.dev/net#Listen) supports. | -| `COOKIE_DOMAIN` | unset | The domain the Anubis challenge pass cookie should be set to. This should be set to the domain you bought from your registrar (EG: `techaro.lol` if your webapp is running on `anubis.techaro.lol`). See [here](https://stackoverflow.com/a/1063760) for more information. | -| `COOKIE_PARTITIONED` | `false` | If set to `true`, enables the [partitioned (CHIPS) flag](https://developers.google.com/privacy-sandbox/cookies/chips), meaning that Anubis inside an iframe has a different set of cookies than the domain hosting the iframe. | -| `DIFFICULTY` | `4` | The difficulty of the challenge, or the number of leading zeroes that must be in successful responses. | -| `ED25519_PRIVATE_KEY_HEX` | unset | The hex-encoded ed25519 private key used to sign Anubis responses. If this is not set, Anubis will generate one for you. This should be exactly 64 characters long. See below for details. 
| -| `ED25519_PRIVATE_KEY_HEX_FILE` | unset | Path to a file containing the hex-encoded ed25519 private key. Only one of this or its sister option may be set. | -| `METRICS_BIND` | `:9090` | The network address that Anubis serves Prometheus metrics on. See `BIND` for more information. | -| `METRICS_BIND_NETWORK` | `tcp` | The address family that the Anubis metrics server listens on. See `BIND_NETWORK` for more information. | -| `OG_EXPIRY_TIME` | `24h` | The expiration time for the Open Graph tag cache. | -| `OG_PASSTHROUGH` | `false` | If set to `true`, Anubis will enable Open Graph tag passthrough. | -| `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.mdx). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. | -| `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. | -| `SOCKET_MODE` | `0770` | _Only used when at least one of the `*_BIND_NETWORK` variables are set to `unix`._ The socket mode (permissions) for Unix domain sockets. | -| `TARGET` | `http://localhost:3923` | The URL of the service that Anubis should forward valid requests to. Supports Unix domain sockets, set this to a URI like so: `unix:///path/to/socket.sock`. | -| `USE_REMOTE_ADDRESS` | unset | If set to `true`, Anubis will take the client's IP from the network socket. For production deployments, it is expected that a reverse proxy is used in front of Anubis, which pass the IP using headers, instead. | -| `WEBMASTER_EMAIL` | unset | If set, shows a contact email address when rendering error pages. This email address will be how users can get in contact with administrators. 
| +| Environment Variable | Default value | Explanation | +| :----------------------------- | :---------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `BIND` | `:8923` | The network address that Anubis listens on. For `unix`, set this to a path: `/run/anubis/instance.sock` | +| `BIND_NETWORK` | `tcp` | The address family that Anubis listens on. Accepts `tcp`, `unix` and anything Go's [`net.Listen`](https://pkg.go.dev/net#Listen) supports. | +| `COOKIE_DOMAIN` | unset | The domain the Anubis challenge pass cookie should be set to. This should be set to the domain you bought from your registrar (EG: `techaro.lol` if your webapp is running on `anubis.techaro.lol`). See [here](https://stackoverflow.com/a/1063760) for more information. | +| `COOKIE_PARTITIONED` | `false` | If set to `true`, enables the [partitioned (CHIPS) flag](https://developers.google.com/privacy-sandbox/cookies/chips), meaning that Anubis inside an iframe has a different set of cookies than the domain hosting the iframe. | +| `DIFFICULTY` | `4` | The difficulty of the challenge, or the number of leading zeroes that must be in successful responses. | +| `ED25519_PRIVATE_KEY_HEX` | unset | The hex-encoded ed25519 private key used to sign Anubis responses. If this is not set, Anubis will generate one for you. This should be exactly 64 characters long. See below for details. | +| `ED25519_PRIVATE_KEY_HEX_FILE` | unset | Path to a file containing the hex-encoded ed25519 private key. Only one of this or its sister option may be set. | +| `METRICS_BIND` | `:9090` | The network address that Anubis serves Prometheus metrics on. See `BIND` for more information. 
| +| `METRICS_BIND_NETWORK` | `tcp` | The address family that the Anubis metrics server listens on. See `BIND_NETWORK` for more information. | +| `OG_EXPIRY_TIME` | `24h` | The expiration time for the Open Graph tag cache. | +| `OG_PASSTHROUGH` | `false` | If set to `true`, Anubis will enable Open Graph tag passthrough. | +| `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.mdx). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. | +| `REDIRECT_DOMAINS` | unset | If set, restrict the domains that Anubis can redirect to when passing a challenge.

If this is unset, Anubis may redirect to any domain which could cause security issues in the unlikely case that an attacker passes a challenge for your browser and then tricks you into clicking a link to your domain. | +| `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. | +| `SOCKET_MODE` | `0770` | _Only used when at least one of the `*_BIND_NETWORK` variables are set to `unix`._ The socket mode (permissions) for Unix domain sockets. | +| `TARGET` | `http://localhost:3923` | The URL of the service that Anubis should forward valid requests to. Supports Unix domain sockets, set this to a URI like so: `unix:///path/to/socket.sock`. | +| `USE_REMOTE_ADDRESS` | unset | If set to `true`, Anubis will take the client's IP from the network socket. For production deployments, it is expected that a reverse proxy is used in front of Anubis, which pass the IP using headers, instead. | +| `WEBMASTER_EMAIL` | unset | If set, shows a contact email address when rendering error pages. This email address will be how users can get in contact with administrators. | For more detailed information on configuring Open Graph tags, please refer to the [Open Graph Configuration](./configuration/open-graph.mdx) page. -- cgit v1.2.3
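
Note on the redirect rule documented in redirect-domains.mdx: with `REDIRECT_DOMAINS` set, a redirect target must match one listed domain exactly, and with it unset the redirect is clamped to the request's own host. The Go sketch below shows one way such a check could look. It is an illustration under stated assumptions, not the code from this commit: `parseRedirectDomains` and `redirectAllowed` are hypothetical names, the request host is assumed to be passed without a port, and the URL handling is not hardened against every browser parsing quirk.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseRedirectDomains turns a comma-separated REDIRECT_DOMAINS-style value
// into a set. An empty value yields an empty set, meaning "not configured".
// Hypothetical helper for this sketch.
func parseRedirectDomains(raw string) map[string]struct{} {
	allowed := make(map[string]struct{})
	for _, d := range strings.Split(raw, ",") {
		if d = strings.TrimSpace(d); d != "" {
			allowed[d] = struct{}{}
		}
	}
	return allowed
}

// redirectAllowed reports whether the redir target may be followed.
// Assumption: requestHost is the Host of the incoming request without a port.
func redirectAllowed(redir, requestHost string, allowed map[string]struct{}) bool {
	u, err := url.Parse(redir)
	if err != nil {
		return false
	}
	// Only plain web targets are considered; javascript:, data: and similar
	// schemes are refused outright.
	if u.Scheme != "" && u.Scheme != "http" && u.Scheme != "https" {
		return false
	}
	host := u.Hostname()
	if host == "" {
		// A relative target such as "/" stays on the current host.
		return u.Scheme == ""
	}
	if len(allowed) == 0 {
		// Nothing configured: clamp to the host the request came in on.
		return host == requestHost
	}
	// Exact string match only, no wildcard or subdomain matching.
	_, ok := allowed[host]
	return ok
}

func main() {
	allowed := parseRedirectDomains("yoursite.com,secretplans.yoursite.com")
	for _, target := range []string{
		"https://secretplans.yoursite.com/",
		"https://evilsite.com/",
		"/",
	} {
		fmt.Println(target, redirectAllowed(target, "yoursite.com", allowed))
	}
}
```

A set lookup keeps the match exact, which mirrors the note that wildcard matches are not supported; a subdomain that is not listed is rejected even when its parent domain is listed.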