Tag: selfhosted

  • Building a Local AI Search Page with SearXNG and Ollama

    Most search engines now bolt an AI answer box onto the top of the results page. That can be useful, but it also means your query and whatever the model does with it are happening on somebody else’s infrastructure.

    This project builds the same basic workflow locally:

    • SearXNG handles normal web search.
    • Ollama runs a local model.
    • A tiny Flask wrapper shows search results immediately.
    • AI answers are optional. You check a box when you want them.
    • Apache (or any other reverse proxy) publishes the whole thing under /search/ on an existing site.

    What the final setup looks like

    The public page is:

    https://YOUR_DOMAIN/search/

    The local services are:

    SearXNG:   http://127.0.0.1:8080
    Ollama:    http://127.0.0.1:11434
    AI search: http://0.0.0.0:5001  (listens on all interfaces)

    Of course, you can also use subdomains instead of directory-based paths; I started using directories ages ago and have too much momentum to care about changing now.

    Not every search needs a summary. Sometimes you just want results. The browser hits the AI search wrapper, which loads normal SearXNG results and only calls Ollama when the user asks for a summary.

    Assumptions

    This guide is based on my experience setting this up on my own rig. It should apply broadly to current Ubuntu and derivatives, possibly with some tinkering:

    • Ubuntu 25.10 or close enough.
    • Docker Engine and Compose v2.
    • Apache 2.4 as the reverse proxy (used in this doc; easily adapted to other reverse proxies).
    • A machine that can run Ollama locally (my machine, for reference: Ryzen 9 3900X, 128 GiB RAM, NVIDIA RTX 4060 Ti with 16 GiB VRAM).
    • A reverse proxy path of /search/.

    The commands use /opt/ai-search. Change that path if you want, but don’t scatter the files around. Future-you will hate present-you.

    Install Docker from the Docker repository

    Ubuntu’s Docker packages and Docker’s official packages can conflict. Pick one lane. I use Docker’s repository here.

    sudo apt update
    sudo apt install -y ca-certificates curl gnupg apache2
    
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
      -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc
    
    echo \
    "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
    https://download.docker.com/linux/ubuntu \
    $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    
    sudo apt update
    sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
    sudo systemctl enable --now docker

    Check it:

    sudo docker version
    docker compose version

    Create the project directory

    sudo mkdir -p /opt/ai-search/{searxng,ollama}
    sudo chown -R "$USER:$USER" /opt/ai-search
    cd /opt/ai-search

    Docker Compose: SearXNG and Ollama

    This setup uses Docker host networking. That is deliberate.

    I use a userspace VPN and system-wide Tailscale on the same rig (getting my money’s worth out of a gaming machine when I’m not gaming). Docker bridge networking and Docker’s embedded DNS can get weird and create frustrating, time-wasting conflicts. Host networking removes that whole layer for this project. The tradeoff is that ports bind directly on the host, so do not run this blindly on a shared machine.

    Create /opt/ai-search/docker-compose.yml:

    services:
      searxng:
        image: docker.io/searxng/searxng:latest
        container_name: searxng
        restart: unless-stopped
        network_mode: host
        volumes:
          - ./searxng:/etc/searxng
    
      ollama:
        image: docker.io/ollama/ollama:latest
        container_name: ollama
        restart: unless-stopped
        network_mode: host
        volumes:
          - ./ollama:/root/.ollama

    Start it:

    cd /opt/ai-search
    sudo docker compose up -d

    SearXNG settings

    Create /opt/ai-search/searxng/settings.yml:

    use_default_settings: true
    
    general:
      debug: false
      instance_name: "Search"
    
    search:
      safe_search: 0
      autocomplete: duckduckgo
      default_lang: en-US
      formats:
        - html
        - json
    
    server:
      secret_key: "CHANGE_THIS_TO_A_LONG_RANDOM_VALUE"
      base_url: http://127.0.0.1:8080/
      limiter: false
      image_proxy: false
      method: GET
    
    ui:
      infinite_scroll: false
      query_in_title: true
      results_on_new_tab: true
    
    plugins:
      searx.plugins.hostnames.SXNGPlugin:
        active: true
      searx.plugins.tracker_url_remover.SXNGPlugin:
        active: true
      searx.plugins.calculator.SXNGPlugin:
        active: true
    
    engines:
      - name: brave
        disabled: true
      - name: karmasearch
        disabled: true
      - name: karmasearch videos
        disabled: true
      - name: mojeek
        disabled: true
      - name: yahoo
        disabled: true

    Generate a real secret key:

    python3 - <<'PY'
    import secrets
    print(secrets.token_hex(32))
    PY

    Replace CHANGE_THIS_TO_A_LONG_RANDOM_VALUE with the generated value.
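    If you prefer to generate and replace in one step, here is a sketch. It is demonstrated against a scratch copy so nothing real is touched; point SETTINGS at /opt/ai-search/searxng/settings.yml when doing it for real.

    ```shell
    # Sketch: generate a key and patch the placeholder in one go.
    # Demonstrated on a scratch copy; set SETTINGS to the real settings.yml path.
    SETTINGS=$(mktemp)
    printf 'server:\n  secret_key: "CHANGE_THIS_TO_A_LONG_RANDOM_VALUE"\n' > "$SETTINGS"

    KEY=$(python3 -c 'import secrets; print(secrets.token_hex(32))')
    sed -i "s/CHANGE_THIS_TO_A_LONG_RANDOM_VALUE/$KEY/" "$SETTINGS"

    grep secret_key "$SETTINGS"
    ```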

    Restart SearXNG:

    cd /opt/ai-search
    sudo docker compose restart searxng
    curl -I http://127.0.0.1:8080

    SearXNG must have JSON enabled because the AI wrapper reads search results through /search?q=...&format=json.
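    For reference, the wrapper only cares about three fields per JSON result. A sketch against a fabricated payload (the field names match what the wrapper reads; the values are made up):

    ```python
    # Fabricated example of the shape returned by /search?q=...&format=json.
    data = {
        "results": [
            {"title": "Example Domain", "content": "Reserved for examples.", "url": "https://example.org"},
            {"title": "", "content": "", "url": "https://no-text.invalid"},
        ]
    }

    # Keep only results that have something worth summarizing, at most six.
    lines = [
        f"{r.get('title', '')}\n{r.get('content', '')}\n{r.get('url', '')}"
        for r in data.get("results", [])[:6]
        if r.get("title") or r.get("content")
    ]
    print(len(lines))  # → 1
    ```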

    Pull a local model with Ollama

    This uses Qwen 2.5 7B because it is small enough for normal local hardware and good enough for short search summaries. If you use a different model, update MODEL in ai_search_app.py to match.

    curl http://127.0.0.1:11434/api/pull \
      -d '{"model":"qwen2.5:7b","stream":false}'
    
    curl http://127.0.0.1:11434/api/tags

    Test generation:

    curl http://127.0.0.1:11434/api/generate \
      -d '{"model":"qwen2.5:7b","prompt":"Reply with exactly: model working","stream":false}'
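    With "stream": false, the generate endpoint returns a single JSON object, and the wrapper later reads its response field. A sketch of that parsing on a fabricated body:

    ```python
    import json

    # Fabricated non-streaming /api/generate response body.
    body = '{"model": "qwen2.5:7b", "response": "model working", "done": true}'

    text = json.loads(body).get("response", "").strip()
    print(text)  # → model working
    ```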

    Install Python dependencies

    The wrapper is a small Flask app. For the service, use Gunicorn instead of Flask’s development server.

    sudo apt update
    sudo apt install -y python3-flask python3-requests gunicorn

    The AI search wrapper

    Create /opt/ai-search/ai_search_app.py:

    from flask import Flask, request
    import html
    import requests
    
    app = Flask(__name__)
    
    SEARX_UI = "http://127.0.0.1:8080"
    SEARX_API = "http://127.0.0.1:8080/search"
    OLLAMA_API = "http://127.0.0.1:11434/api/generate"
    MODEL = "qwen2.5:7b"
    
    STYLE = """
    body {
      margin: 0;
      font-family: system-ui, sans-serif;
      background: #111;
      color: #eee;
    }
    .topbar {
      padding: 12px;
      background: #181818;
      border-bottom: 1px solid #333;
    }
    form {
      display: flex;
      gap: 8px;
    }
    input[type="text"] {
      flex: 1;
      padding: 10px;
      font-size: 16px;
    }
    button {
      padding: 10px 16px;
      font-size: 16px;
    }
    .ai-box {
      padding: 14px;
      margin: 12px;
      border: 1px solid #444;
      border-radius: 8px;
      background: #1b1b1b;
    }
    .ai-loading {
      opacity: 0.75;
    }
    #raw-results {
      background: #fff;
      color: #111;
      padding: 12px;
    }
    #raw-results a {
      color: #0645ad;
    }
    """
    
    SEARX_CSS = '<link rel="stylesheet" href="/search/raw/static/themes/simple/sxng-ltr.min.css" type="text/css">'
    
    @app.route("/")
    def index():
        q = request.args.get("q", "").strip()
        ai_enabled = request.args.get("ai") == "1"
        q_html = html.escape(q)
        checked = "checked" if ai_enabled else ""
    
        if not q:
            return f"""
    <html>
    <head><title>AI Search</title>{SEARX_CSS}<style>{STYLE}</style></head>
    <body>
      <div class="topbar">
        <form action="/search/" method="get">
          <input type="text" name="q" autofocus placeholder="Search..." />
          <label style="display:flex;align-items:center;gap:6px">
            <input type="checkbox" name="ai" value="1">
            include AI
          </label>
          <button type="submit">Search</button>
        </form>
      </div>
    </body>
    </html>
    """
    
        quoted_q = requests.utils.quote(q)
        raw_url = "/search/raw/search?q=" + quoted_q
    
        if ai_enabled:
            ai_block = """
      <div id="ai" class="ai-box ai-loading">
        <b>AI Answer</b><br><br>
        Working...
      </div>
    """
            ai_script = f"""
      <script>
        fetch("/search/answer?q=" + encodeURIComponent({q!r}))
          .then(r => r.text())
          .then(t => {{
            document.getElementById("ai").classList.remove("ai-loading");
            document.getElementById("ai").innerHTML = t;
          }})
          .catch(() => {{
            document.getElementById("ai").innerHTML = "<b>AI Answer</b><br><br>Unavailable.";
          }});
      </script>
    """
        else:
            ai_block = f"""
      <div class="ai-box">
        <b>AI Answer</b><br><br>
        <a style="color:#9cf" href="/search/?q={quoted_q}&ai=1">Generate AI summary</a>
      </div>
    """
            ai_script = ""
    
        return f"""
    <html>
    <head><title>{q_html} - AI Search</title>{SEARX_CSS}<style>{STYLE}</style></head>
    <body>
      <div class="topbar">
        <form action="/search/" method="get">
          <input type="text" name="q" value="{q_html}" />
          <label style="display:flex;align-items:center;gap:6px">
            <input type="checkbox" name="ai" value="1" {checked}>
            include AI
          </label>
          <button type="submit">Search</button>
          <a style="color:#9cf;padding:10px" href="{raw_url}" target="_blank">Open raw SearXNG</a>
        </form>
      </div>
    
    {ai_block}
    
      <div id="raw-results">Loading search results...</div>
    
      <script>
        fetch("/search/raw-html?q=" + encodeURIComponent({q!r}))
          .then(r => r.text())
          .then(t => {{
            document.getElementById("raw-results").innerHTML = t;
          }})
          .catch(() => {{
            document.getElementById("raw-results").innerHTML = "Search results unavailable.";
          }});
      </script>
    
    {ai_script}
    </body>
    </html>
    """
    
    @app.route("/raw-html")
    def raw_html():
        q = request.args.get("q", "").strip()
        if not q:
            return ""
    
        r = requests.get(SEARX_UI + "/search", params={"q": q}, timeout=45)
        r.raise_for_status()
        page = r.text
    
        start = page.find('<main id="main_results"')
        if start == -1:
            return page
    
        end = page.rfind("</main>")
        if end == -1:
            return page[start:]
    
        return page[start:end + len("</main>")]
    
    @app.route("/answer")
    def answer():
        q = request.args.get("q", "").strip()
        if not q:
            return ""
    
        try:
            sx = requests.get(SEARX_API, params={"q": q, "format": "json"}, timeout=30)
            sx.raise_for_status()
            data = sx.json()
    
            lines = []
            for r in data.get("results", [])[:6]:
                title = r.get("title", "")
                content = r.get("content", "")
                url = r.get("url", "")
                if title or content:
                    lines.append(f"{title}\n{content}\n{url}")
    
            prompt = (
                "User query:\n" + q +
                "\n\nSearch results:\n" + "\n\n".join(lines) +
                "\n\nWrite a concise answer in 3 bullet points. "
                "Use only the provided search results. "
                "If the results are weak or unrelated, say so."
            )
    
            ol = requests.post(
                OLLAMA_API,
                json={"model": MODEL, "prompt": prompt, "stream": False},
                timeout=120,
            )
            ol.raise_for_status()
    
            text = ol.json().get("response", "").strip()
            safe = html.escape(text).replace("\n", "<br>")
            return "<b>AI Answer</b><br><br>" + safe
    
        except Exception as e:
            return "<b>AI Answer</b><br><br>Unavailable: " + html.escape(str(e))
    
    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5001)
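    A note on raw_html(): it trims the SearXNG page down to its results by plain string slicing rather than full HTML parsing. The same slicing on a toy page (markup fabricated for illustration):

    ```python
    # Toy page mimicking the structure raw_html() slices.
    page = (
        '<html><head><title>q</title></head><body>'
        '<main id="main_results"><article>result</article></main>'
        '<footer>about</footer></body></html>'
    )

    start = page.find('<main id="main_results"')
    end = page.rfind("</main>")
    fragment = page[start:end + len("</main>")]
    print(fragment)
    # → <main id="main_results"><article>result</article></main>
    ```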

    Two small design choices are doing a lot of work here:

    • The search results load first.
    • AI only runs when ai=1 is present.

    That keeps normal searches quick.

    Run it as a service

    Create /etc/systemd/system/ai-search.service:

    [Unit]
    Description=Local AI Search Wrapper
    After=network.target docker.service
    Wants=docker.service
    
    [Service]
    Type=simple
    User=YOUR_LOCAL_USER
    WorkingDirectory=/opt/ai-search
    ExecStart=/usr/bin/gunicorn -w 2 -b 0.0.0.0:5001 ai_search_app:app
    Restart=always
    RestartSec=3
    
    [Install]
    WantedBy=multi-user.target

    Set your local username:

    sudo sed -i "s/User=YOUR_LOCAL_USER/User=$USER/" /etc/systemd/system/ai-search.service
    sudo systemctl daemon-reload
    sudo systemctl enable --now ai-search.service

    Test locally:

    curl -I http://127.0.0.1:5001
    curl -L "http://127.0.0.1:5001/?q=linux%20firewall" | head
    curl -L "http://127.0.0.1:5001/?q=linux%20firewall&ai=1" | head

    Apache reverse proxy

    This example publishes the wrapper at /search/ and raw SearXNG at /search/raw/.

    Set the backend IP before using the config:

    AI_SEARCH_HOST="192.168.1.50"

    Use the actual LAN or VPN IP of the machine running the AI search service.

    Put this inside your existing Apache TLS vhost:

    # Local AI Search
    # Public paths:
    #   /search/      -> AI wrapper
    #   /search/raw/  -> raw SearXNG assets and search page
    
    RedirectMatch 308 ^/search$ /search/
    
    ProxyPreserveHost On
    ProxyRequests Off
    
    # RAW SEARXNG MUST COME FIRST
    ProxyPass        /search/raw/ http://AI_SEARCH_HOST:8080/
    ProxyPassReverse /search/raw/ http://AI_SEARCH_HOST:8080/
    
    # AI WRAPPER SECOND
    ProxyPass        /search/ http://AI_SEARCH_HOST:5001/
    ProxyPassReverse /search/ http://AI_SEARCH_HOST:5001/
    
    <Location /search/raw/>
        Require all granted
    
        RequestHeader set X-Forwarded-Proto "https"
        RequestHeader set X-Forwarded-Host "YOUR_DOMAIN"
        RequestHeader set X-Forwarded-Prefix "/search/raw"
        RequestHeader set X-Scheme "https"
        RequestHeader set X-Script-Name "/search/raw"
    
        RequestHeader set X-Real-IP %{REMOTE_ADDR}s
        RequestHeader append X-Forwarded-For %{REMOTE_ADDR}s
    </Location>
    
    <Location /search/>
        Require all granted
    
        RequestHeader set X-Forwarded-Proto "https"
        RequestHeader set X-Forwarded-Host "YOUR_DOMAIN"
        RequestHeader set X-Forwarded-Prefix "/search"
        RequestHeader set X-Scheme "https"
        RequestHeader set X-Script-Name "/search"
    
        RequestHeader set X-Real-IP %{REMOTE_ADDR}s
        RequestHeader append X-Forwarded-For %{REMOTE_ADDR}s
    
        SetEnvIf Request_URI "^/search/" dontlog
    </Location>

    Replace AI_SEARCH_HOST and YOUR_DOMAIN before reloading Apache.

    The order matters. /search/raw/ must come before /search/, or Apache will send raw SearXNG requests to the wrapper.
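    ProxyPass directives are evaluated in configuration order and the first matching prefix wins. A toy first-match router (illustrative only, not Apache code) makes the failure mode concrete:

    ```python
    def route(path, rules):
        """First-match prefix routing, mirroring ProxyPass evaluation order."""
        for prefix, backend in rules:
            if path.startswith(prefix):
                return backend
        return None

    good = [("/search/raw/", "searxng:8080"), ("/search/", "wrapper:5001")]
    bad = [("/search/", "wrapper:5001"), ("/search/raw/", "searxng:8080")]

    print(route("/search/raw/static/style.css", good))  # → searxng:8080
    print(route("/search/raw/static/style.css", bad))   # → wrapper:5001 (wrong backend)
    ```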

    Enable modules and reload:

    sudo a2enmod proxy proxy_http headers rewrite ssl
    sudo apache2ctl configtest
    sudo systemctl reload apache2

    Test the final page

    Open:

    https://YOUR_DOMAIN/search/

    Search normally. Results should load without calling the model.

    Then check include AI and search again. Results should still load first, and the AI answer should appear after a few seconds.

    Notes from the build

    Do not start by hacking SearXNG plugins. That sounds cleaner than it is. The current SearXNG plugin system expects proper importable Python modules and fully qualified plugin class names. A wrapper avoids tying your project to SearXNG internals.

    Do not put 127.0.0.1 URLs in HTML that will be loaded by another machine. The user’s browser interprets 127.0.0.1 as the user’s own computer, not your server. Use public paths like /search/raw/search?... and let Apache proxy them.
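    If you ever do have to serve backend HTML that embeds loopback URLs, rewriting them to the public prefix is straightforward. A sketch (the snippet is fabricated and the regex is deliberately narrow):

    ```python
    import re

    # Fabricated backend HTML containing a loopback URL.
    snippet = '<a href="http://127.0.0.1:8080/search?q=x">next page</a>'

    # Rewrite loopback references to the public proxy prefix.
    public = re.sub(r"http://127\.0\.0\.1:8080", "/search/raw", snippet)
    print(public)  # → <a href="/search/raw/search?q=x">next page</a>
    ```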

    Do not run AI on every query unless you really want the latency (or you’re keeping warm in winter).

    If you use a VPN and Docker networking explodes, try host networking for this stack. It is blunt, but it avoids a lot of route and DNS drama.

    Useful references

    • SearXNG Search API (the JSON format must be enabled in settings.yml): https://docs.searxng.org/dev/search_api.html
    • Ollama generate API: https://ollama.readthedocs.io/en/api/
    • Ollama pull API: https://docs.ollama.com/api/pull
    • Docker host networking: https://docs.docker.com/engine/network/drivers/host/
    • Apache reverse proxy guide: https://httpd.apache.org/docs/2.4/howto/reverse_proxy.html
    • Apache mod_proxy docs: https://httpd.apache.org/docs/current/mod/mod_proxy.html