Automated link checking on a Hugo site with Lychee and Gitea Actions


Broken links on a static site are easy to miss. You rename a post, change a slug, or just forget to update a reference somewhere, and suddenly you have a 404 sitting there for weeks until someone actually mentions it. I wanted to catch those automatically on every push, not find out about them later.

Lychee is a fast, async link checker written in Rust. It can check internal links without making any network requests (working entirely against the built HTML files), and it checks external links against live URLs too. Both checks run as steps in a Gitea Actions workflow on every push to main or dev, and on every pull request.

Here is how I have it set up. You will need:

  • A Hugo site hosted on Gitea
  • Gitea with Actions enabled
  • An Actions runner connected to your Gitea instance

Create a .lychee.toml in the root of your repo. This file controls timeouts, retries, and which URLs to skip entirely.

# https://github.com/lycheeverse/lychee/blob/master/lychee.example.toml

# Timeout per request in seconds
timeout = 20

# Retry failed requests (handles transient errors and rate limits)
max_retries = 3
retry_wait_time = 2

# Suppress per-file progress output (errors still show)
no_progress = true

# Ignore non-navigable schemes, and domains that reliably block CI runners.
# LinkedIn, Twitter/X, and Facebook return 403/999 to bots - not real broken links.
exclude = [
  "^mailto:",
  "^tel:",
  "^javascript:",
  "linkedin\\.com",
  "twitter\\.com",
  # Anchored so it does not also match hosts like dropbox.com or netflix.com
  "^https?://(www\\.)?x\\.com(/|$)",
  "facebook\\.com",
]

# Accept these HTTP status codes as valid beyond the default 2xx.
# 429 = Too Many Requests: the link exists, we are just being rate-limited.
accept = ["200..=299", "429"]

The exclusions matter. LinkedIn, Twitter/X, and Facebook all return 403 or 999 responses to automated tools. Those are not broken links; they are platforms blocking bots. Without excluding them you will get false positives on every run. Accepting 429 follows the same logic: if a site rate-limits the runner, the link still works, and the server is simply refusing to answer right now.
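Exclude entries are regular expressions that can match anywhere in a URL, so an unanchored pattern may skip more than intended. The sketch below is one way to sanity-check a pattern before committing it. It uses grep -E as a stand-in for lychee's regex engine (an assumption that should hold for simple patterns like these), and is_excluded, check, and the sample URLs are hypothetical. In .lychee.toml the backslashes are doubled; on the shell command line they are not.

```shell
#!/bin/sh
# is_excluded PATTERN URL
# True when the URL matches the pattern, i.e. the link would be skipped.
# grep -E is only a stand-in for lychee's regex engine (assumption).
is_excluded() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Print what would happen to a URL under a given pattern.
check() {
  if is_excluded "$1" "$2"; then
    echo "skip   $2"
  else
    echo "check  $2"
  fi
}

loose='x\.com'                            # matches anywhere in the URL
strict='^https?://(www\.)?x\.com(/|$)'    # matches the x.com host only

check "$loose"  'https://x.com/someuser'
check "$loose"  'https://www.dropbox.com/s/file'
check "$strict" 'https://x.com/someuser'
check "$strict" 'https://www.dropbox.com/s/file'
```

With the loose pattern both URLs are skipped, because dropbox.com contains the text x.com. With the strict pattern only the x.com link is skipped and the Dropbox link is still checked.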

Create .gitea/workflows/link-check.yaml:

name: Link Check

on:
  push:
    branches:
      - main
      - dev
  pull_request:

jobs:
  link-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Setup Hugo
        uses: peaceiris/actions-hugo@v3
        with:
          hugo-version: '0.140.1'
          extended: true

      - name: Install Node dependencies
        run: npm install

      - name: Build site
        run: hugo

      - name: Install lychee
        run: |
          curl -fsSL https://github.com/lycheeverse/lychee/releases/download/lychee-v0.24.2/lychee-x86_64-unknown-linux-gnu.tar.gz \
            | tar xz --strip-components=1 -C /usr/local/bin

      - name: Check internal links
        run: lychee --offline --root-dir public --config .lychee.toml 'public/**/*.html'

      - name: Check external links
        continue-on-error: true
        run: lychee --root-dir public --config .lychee.toml 'public/**/*.html'

The workflow builds the Hugo site first (everything lands in the public folder) and then runs lychee against the generated HTML. Both lychee steps use --root-dir public so that root-relative links like /posts/some-post/ resolve to real files under public/posts/some-post/ rather than the filesystem root. Without that flag, lychee would not find any of your internal pages.
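As a rough illustration of what --root-dir changes, the sketch below joins a root-relative link onto a build folder and checks that the result exists on disk. resolve_link and link_exists are hypothetical helpers, and the temporary folder stands in for public. Lychee's own resolution logic may differ in detail.

```shell
#!/bin/sh
# resolve_link ROOT LINK
# Join a root-relative link onto the build folder, ignoring #fragment and ?query.
resolve_link() {
  root=$1
  link=${2%%#*}        # drop any #fragment
  link=${link%%\?*}    # drop any ?query
  printf '%s\n' "$root$link"
}

# link_exists ROOT LINK
# True when the resolved path is a real file or directory.
link_exists() {
  [ -e "$(resolve_link "$1" "$2")" ]
}

# Build a tiny stand-in for Hugo's public/ folder.
site=$(mktemp -d)
mkdir -p "$site/posts/some-post"
echo '<h1>Some post</h1>' > "$site/posts/some-post/index.html"

for link in /posts/some-post/ /posts/renamed-post/; do
  if link_exists "$site" "$link"; then
    echo "ok      $link"
  else
    echo "broken  $link"
  fi
done
rm -rf "$site"
```

Without a root to join onto, /posts/some-post/ would be looked up from the filesystem root, where it does not exist.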

The internal check uses --offline. Lychee makes zero network requests in this mode. It reads the built HTML and verifies that every root-relative link points to something that actually exists in the public folder.

This catches:

  • Pages that have been renamed or deleted but are still referenced somewhere
  • Typos in internal links
  • Anchors (#section) pointing to a heading that no longer exists, provided fragment checking is turned on with --include-fragments (it is off by default)
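For the anchor case, the check comes down to whether the target page still contains an element with the matching id. A minimal sketch, assuming double-quoted id attributes in the generated HTML. has_anchor is a hypothetical helper and not how lychee implements it.

```shell
#!/bin/sh
# has_anchor FILE FRAGMENT
# True when FILE contains id="FRAGMENT".
# Plain fixed-string match, no HTML parsing, so it is only an approximation.
has_anchor() {
  grep -Fq -- "id=\"$2\"" "$1"
}

# A small page with two headings, standing in for a built HTML file.
page=$(mktemp)
cat > "$page" <<'EOF'
<h2 id="setup">Setup</h2>
<h2 id="reading-the-output">Reading the output</h2>
EOF

for frag in setup configuration; do
  if has_anchor "$page" "$frag"; then
    echo "ok      #$frag"
  else
    echo "broken  #$frag"
  fi
done
rm -f "$page"
```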

If an internal link is broken the step fails and the workflow stops. You get the full list of broken links in the Actions log before it exits.

The external check runs without --offline, so lychee actually makes HTTP requests to every external URL found on the site.

One thing to note here: continue-on-error: true is set on this step. External links can break for reasons entirely outside your control: a third-party site goes down, a resource moves, a domain expires. You probably do not want any of those to block a push to your own site. With continue-on-error: true, the step is still marked as failed in the log and you can see exactly which URLs are returning errors, but the workflow keeps running and the deploy is not blocked.
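The effect of continue-on-error can be sketched as a small simulation. run_step is a hypothetical stand-in for the runner, the shell built-ins true and false stand in for lychee's exit status, and the Deploy step is illustrative only, since the workflow above does not define one.

```shell
#!/bin/sh
# run_step NAME TOLERATE COMMAND...
# TOLERATE is "yes" when the step has continue-on-error: true, otherwise "no".
# Once a non-tolerated step fails, later steps are skipped.
workflow_status=0

run_step() {
  name=$1; tolerate=$2; shift 2
  if [ "$workflow_status" -ne 0 ]; then
    echo "skipped  $name"
    return 0
  fi
  if "$@"; then
    echo "passed   $name"
  elif [ "$tolerate" = yes ]; then
    echo "failed   $name (continuing)"
  else
    echo "failed   $name"
    workflow_status=1
  fi
}

run_step "Check internal links" no  true    # internal check succeeds
run_step "Check external links" yes false   # external check fails, tolerated
run_step "Deploy"               no  true    # still runs
echo "workflow exit status: $workflow_status"
```

Changing the second call to use no instead of yes makes the simulated workflow stop there, and the Deploy step is reported as skipped.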

You still get full visibility into the problem; it just does not stop the world.

[Screenshot: Gitea Actions log showing a failed external link check step, with the specific URLs that returned errors and their HTTP status codes listed]

Lychee gives a clear summary at the end of each run. You get a total count of links checked, how many were successful, and the specific URLs that failed along with the status code returned and the page they were found on.

For an external link returning a 404 you will see the URL, the page it came from, and the response. Straightforward to track down and fix.

Two steps in a workflow and one config file. Broken internal links now fail the build, which is the right call because you control those. Broken external links get logged without stopping a deploy, because you do not control what other sites do.

Lychee is quick as well. On this site both checks together take under a minute.

If you are already running Gitea with Actions enabled, this is a simple addition. The full lychee documentation is at lychee.cli.rs, and the example config on GitHub covers every available option.
