Tag: selfhosted

  • Building a Local AI Search Page with SearXNG and Ollama

    Most search engines now bolt an AI answer box onto the top of the results page. That can be useful, but it also means your query and whatever the model does with it are happening on somebody else’s infrastructure.

    This project builds the same basic workflow locally:

    • SearXNG handles normal web search.
    • Ollama runs a local model.
    • A tiny Flask wrapper shows search results immediately.
    • AI answers are optional. You check a box when you want them.
    • Apache (or any other reverse proxy) publishes the whole thing under /search/ on an existing site.

    What the final setup looks like

    The public page is:

    https://YOUR_DOMAIN/search/

    The local services are:

    SearXNG:   http://127.0.0.1:8080
    Ollama:    http://127.0.0.1:11434
    AI search: http://0.0.0.0:5001  (listens on all interfaces)

    Of course, you can also use subdomains instead of directory-based paths; I started using directories ages ago and have too much momentum to care about changing now.

    Not every search needs a summary. Sometimes you just want results. The browser hits the AI search wrapper, which loads normal SearXNG results and only calls Ollama when the user asks for a summary.

    Assumptions

    This guide is based on my experience setting this up on my own rig. It should apply broadly to current Ubuntu and derivatives, possibly with some tinkering:

    • Ubuntu 25.10 or close enough.
    • Docker Engine and Compose v2.
    • Apache 2.4 as the reverse proxy (used in this doc; easily adapted to other reverse proxies).
    • A machine that can run Ollama locally (my machine, for reference: Ryzen 9 3900X, 128 GiB RAM, NVIDIA RTX 4060 Ti with 16 GiB VRAM).
    • A reverse proxy path of /search/.

    The commands use /opt/ai-search. Change that path if you want, but don’t scatter the files around. Future-you will hate present-you.

    Install Docker from the Docker repository

    Ubuntu’s Docker packages and Docker’s official packages can conflict. Pick one lane. I use Docker’s repository here.

    sudo apt update
    sudo apt install -y ca-certificates curl gnupg apache2
    
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
      -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc
    
    echo \
    "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
    https://download.docker.com/linux/ubuntu \
    $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    
    sudo apt update
    sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
    sudo systemctl enable --now docker

    Check it:

    sudo docker version
    docker compose version

    Create the project directory

    sudo mkdir -p /opt/ai-search/{searxng,ollama}
    sudo chown -R "$USER:$USER" /opt/ai-search
    cd /opt/ai-search

    Docker Compose: SearXNG and Ollama

    This setup uses Docker host networking. That is deliberate.

    I use a userspace VPN and system-wide Tailscale on the same rig (getting my money’s worth out of a gaming machine when I’m not gaming). Docker bridge networking and Docker’s embedded DNS can get weird and create frustrating, time-wasting conflicts. Host networking removes that whole layer for this project. The tradeoff is that ports bind directly on the host, so do not run this blindly on a shared machine.

    Create /opt/ai-search/docker-compose.yml:

    services:
      searxng:
        image: docker.io/searxng/searxng:latest
        container_name: searxng
        restart: unless-stopped
        network_mode: host
        volumes:
          - ./searxng:/etc/searxng
    
      ollama:
        image: docker.io/ollama/ollama:latest
        container_name: ollama
        restart: unless-stopped
        network_mode: host
        volumes:
          - ./ollama:/root/.ollama

    Start it:

    cd /opt/ai-search
    sudo docker compose up -d

    SearXNG settings

    Create /opt/ai-search/searxng/settings.yml:

    use_default_settings: true
    
    general:
      debug: false
      instance_name: "Search"
    
    search:
      safe_search: 0
      autocomplete: duckduckgo
      default_lang: en-US
      formats:
        - html
        - json
    
    server:
      secret_key: "CHANGE_THIS_TO_A_LONG_RANDOM_VALUE"
      base_url: http://127.0.0.1:8080/
      limiter: false
      image_proxy: false
      method: GET
    
    ui:
      infinite_scroll: false
      query_in_title: true
      results_on_new_tab: true
    
    plugins:
      searx.plugins.hostnames.SXNGPlugin:
        active: true
      searx.plugins.tracker_url_remover.SXNGPlugin:
        active: true
      searx.plugins.calculator.SXNGPlugin:
        active: true
    
    engines:
      - name: brave
        disabled: true
      - name: karmasearch
        disabled: true
      - name: karmasearch videos
        disabled: true
      - name: mojeek
        disabled: true
      - name: yahoo
        disabled: true

    Generate a real secret key:

    python3 - <<'PY'
    import secrets
    print(secrets.token_hex(32))
    PY

    Replace CHANGE_THIS_TO_A_LONG_RANDOM_VALUE with the generated value.
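    If you prefer to generate and replace in one step, here is a sketch. It is demonstrated against a scratch copy so nothing real is touched; point SETTINGS at /opt/ai-search/searxng/settings.yml when doing it for real.

    ```shell
    # Sketch: generate a key and patch the placeholder in one go.
    # Demonstrated on a scratch copy; set SETTINGS to the real settings.yml path.
    SETTINGS=$(mktemp)
    printf 'server:\n  secret_key: "CHANGE_THIS_TO_A_LONG_RANDOM_VALUE"\n' > "$SETTINGS"

    KEY=$(python3 -c 'import secrets; print(secrets.token_hex(32))')
    sed -i "s/CHANGE_THIS_TO_A_LONG_RANDOM_VALUE/$KEY/" "$SETTINGS"

    grep secret_key "$SETTINGS"
    ```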

    Restart SearXNG:

    cd /opt/ai-search
    sudo docker compose restart searxng
    curl -I http://127.0.0.1:8080

    SearXNG must have JSON enabled because the AI wrapper reads search results through /search?q=...&format=json.
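    For reference, the wrapper only cares about three fields per JSON result. A sketch against a fabricated payload (the field names match what the wrapper reads; the values are made up):

    ```python
    # Fabricated example of the shape returned by /search?q=...&format=json.
    data = {
        "results": [
            {"title": "Example Domain", "content": "Reserved for examples.", "url": "https://example.org"},
            {"title": "", "content": "", "url": "https://no-text.invalid"},
        ]
    }

    # Keep only results that have something worth summarizing, at most six.
    lines = [
        f"{r.get('title', '')}\n{r.get('content', '')}\n{r.get('url', '')}"
        for r in data.get("results", [])[:6]
        if r.get("title") or r.get("content")
    ]
    print(len(lines))  # → 1
    ```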

    Pull a local model with Ollama

    This uses Qwen 2.5 7B because it is small enough for normal local hardware and good enough for short search summaries. If you use a different model, update MODEL in ai_search_app.py to match.

    curl http://127.0.0.1:11434/api/pull \
      -d '{"model":"qwen2.5:7b","stream":false}'
    
    curl http://127.0.0.1:11434/api/tags

    Test generation:

    curl http://127.0.0.1:11434/api/generate \
      -d '{"model":"qwen2.5:7b","prompt":"Reply with exactly: model working","stream":false}'
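    With "stream": false, the generate endpoint returns a single JSON object, and the wrapper later reads its response field. A sketch of that parsing on a fabricated body:

    ```python
    import json

    # Fabricated non-streaming /api/generate response body.
    body = '{"model": "qwen2.5:7b", "response": "model working", "done": true}'

    text = json.loads(body).get("response", "").strip()
    print(text)  # → model working
    ```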

    Install Python dependencies

    The wrapper is a small Flask app. For the service, use Gunicorn instead of Flask’s development server.

    sudo apt update
    sudo apt install -y python3-flask python3-requests gunicorn

    The AI search wrapper

    Create /opt/ai-search/ai_search_app.py:

    from flask import Flask, request
    import html
    import requests
    
    app = Flask(__name__)
    
    SEARX_UI = "http://127.0.0.1:8080"
    SEARX_API = "http://127.0.0.1:8080/search"
    OLLAMA_API = "http://127.0.0.1:11434/api/generate"
    MODEL = "qwen2.5:7b"
    
    STYLE = """
    body {
      margin: 0;
      font-family: system-ui, sans-serif;
      background: #111;
      color: #eee;
    }
    .topbar {
      padding: 12px;
      background: #181818;
      border-bottom: 1px solid #333;
    }
    form {
      display: flex;
      gap: 8px;
    }
    input[type="text"] {
      flex: 1;
      padding: 10px;
      font-size: 16px;
    }
    button {
      padding: 10px 16px;
      font-size: 16px;
    }
    .ai-box {
      padding: 14px;
      margin: 12px;
      border: 1px solid #444;
      border-radius: 8px;
      background: #1b1b1b;
    }
    .ai-loading {
      opacity: 0.75;
    }
    #raw-results {
      background: #fff;
      color: #111;
      padding: 12px;
    }
    #raw-results a {
      color: #0645ad;
    }
    """
    
    SEARX_CSS = '<link rel="stylesheet" href="/search/raw/static/themes/simple/sxng-ltr.min.css" type="text/css">'
    
    @app.route("/")
    def index():
        q = request.args.get("q", "").strip()
        ai_enabled = request.args.get("ai") == "1"
        q_html = html.escape(q)
        checked = "checked" if ai_enabled else ""
    
        if not q:
            return f"""
    <html>
    <head><title>AI Search</title>{SEARX_CSS}<style>{STYLE}</style></head>
    <body>
      <div class="topbar">
        <form action="/search/" method="get">
          <input type="text" name="q" autofocus placeholder="Search..." />
          <label style="display:flex;align-items:center;gap:6px">
            <input type="checkbox" name="ai" value="1">
            include AI
          </label>
          <button type="submit">Search</button>
        </form>
      </div>
    </body>
    </html>
    """
    
        quoted_q = requests.utils.quote(q)
        raw_url = "/search/raw/search?q=" + quoted_q
    
        if ai_enabled:
            ai_block = """
      <div id="ai" class="ai-box ai-loading">
        <b>AI Answer</b><br><br>
        Working...
      </div>
    """
            ai_script = f"""
      <script>
        fetch("/search/answer?q=" + encodeURIComponent({q!r}))
          .then(r => r.text())
          .then(t => {{
            document.getElementById("ai").classList.remove("ai-loading");
            document.getElementById("ai").innerHTML = t;
          }})
          .catch(() => {{
            document.getElementById("ai").innerHTML = "<b>AI Answer</b><br><br>Unavailable.";
          }});
      </script>
    """
        else:
            ai_block = f"""
      <div class="ai-box">
        <b>AI Answer</b><br><br>
        <a style="color:#9cf" href="/search/?q={quoted_q}&ai=1">Generate AI summary</a>
      </div>
    """
            ai_script = ""
    
        return f"""
    <html>
    <head><title>{q_html} - AI Search</title>{SEARX_CSS}<style>{STYLE}</style></head>
    <body>
      <div class="topbar">
        <form action="/search/" method="get">
          <input type="text" name="q" value="{q_html}" />
          <label style="display:flex;align-items:center;gap:6px">
            <input type="checkbox" name="ai" value="1" {checked}>
            include AI
          </label>
          <button type="submit">Search</button>
          <a style="color:#9cf;padding:10px" href="{raw_url}" target="_blank">Open raw SearXNG</a>
        </form>
      </div>
    
    {ai_block}
    
      <div id="raw-results">Loading search results...</div>
    
      <script>
        fetch("/search/raw-html?q=" + encodeURIComponent({q!r}))
          .then(r => r.text())
          .then(t => {{
            document.getElementById("raw-results").innerHTML = t;
          }})
          .catch(() => {{
            document.getElementById("raw-results").innerHTML = "Search results unavailable.";
          }});
      </script>
    
    {ai_script}
    </body>
    </html>
    """
    
    @app.route("/raw-html")
    def raw_html():
        q = request.args.get("q", "").strip()
        if not q:
            return ""
    
        r = requests.get(SEARX_UI + "/search", params={"q": q}, timeout=45)
        r.raise_for_status()
        page = r.text
    
        start = page.find('<main id="main_results"')
        if start == -1:
            return page
    
        end = page.rfind("</main>")
        if end == -1:
            return page[start:]
    
        return page[start:end + len("</main>")]
    
    @app.route("/answer")
    def answer():
        q = request.args.get("q", "").strip()
        if not q:
            return ""
    
        try:
            sx = requests.get(SEARX_API, params={"q": q, "format": "json"}, timeout=30)
            sx.raise_for_status()
            data = sx.json()
    
            lines = []
            for r in data.get("results", [])[:6]:
                title = r.get("title", "")
                content = r.get("content", "")
                url = r.get("url", "")
                if title or content:
                    lines.append(f"{title}\n{content}\n{url}")
    
            prompt = (
                "User query:\n" + q +
                "\n\nSearch results:\n" + "\n\n".join(lines) +
                "\n\nWrite a concise answer in 3 bullet points. "
                "Use only the provided search results. "
                "If the results are weak or unrelated, say so."
            )
    
            ol = requests.post(
                OLLAMA_API,
                json={"model": MODEL, "prompt": prompt, "stream": False},
                timeout=120,
            )
            ol.raise_for_status()
    
            text = ol.json().get("response", "").strip()
            safe = html.escape(text).replace("\n", "<br>")
            return "<b>AI Answer</b><br><br>" + safe
    
        except Exception as e:
            return "<b>AI Answer</b><br><br>Unavailable: " + html.escape(str(e))
    
    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5001)
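    A note on raw_html(): it trims the SearXNG page down to its results by plain string slicing rather than full HTML parsing. The same slicing on a toy page (markup fabricated for illustration):

    ```python
    # Toy page mimicking the structure raw_html() slices.
    page = (
        '<html><head><title>q</title></head><body>'
        '<main id="main_results"><article>result</article></main>'
        '<footer>about</footer></body></html>'
    )

    start = page.find('<main id="main_results"')
    end = page.rfind("</main>")
    fragment = page[start:end + len("</main>")]
    print(fragment)
    # → <main id="main_results"><article>result</article></main>
    ```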

    Two small design choices are doing a lot of work here:

    • The search results load first.
    • AI only runs when ai=1 is present.

    That keeps normal searches quick.

    Run it as a service

    Create /etc/systemd/system/ai-search.service:

    [Unit]
    Description=Local AI Search Wrapper
    After=network.target docker.service
    Wants=docker.service
    
    [Service]
    Type=simple
    User=YOUR_LOCAL_USER
    WorkingDirectory=/opt/ai-search
    ExecStart=/usr/bin/gunicorn -w 2 -b 0.0.0.0:5001 ai_search_app:app
    Restart=always
    RestartSec=3
    
    [Install]
    WantedBy=multi-user.target

    Set your local username:

    sudo sed -i "s/User=YOUR_LOCAL_USER/User=$USER/" /etc/systemd/system/ai-search.service
    sudo systemctl daemon-reload
    sudo systemctl enable --now ai-search.service

    Test locally:

    curl -I http://127.0.0.1:5001
    curl -L "http://127.0.0.1:5001/?q=linux%20firewall" | head
    curl -L "http://127.0.0.1:5001/?q=linux%20firewall&ai=1" | head

    Apache reverse proxy

    This example publishes the wrapper at /search/ and raw SearXNG at /search/raw/.

    Set the backend IP before using the config:

    AI_SEARCH_HOST="192.168.1.50"

    Use the actual LAN or VPN IP of the machine running the AI search service.

    Put this inside your existing Apache TLS vhost:

    # Local AI Search
    # Public paths:
    #   /search/      -> AI wrapper
    #   /search/raw/  -> raw SearXNG assets and search page
    
    RedirectMatch 308 ^/search$ /search/
    
    ProxyPreserveHost On
    ProxyRequests Off
    
    # RAW SEARXNG MUST COME FIRST
    ProxyPass        /search/raw/ http://AI_SEARCH_HOST:8080/
    ProxyPassReverse /search/raw/ http://AI_SEARCH_HOST:8080/
    
    # AI WRAPPER SECOND
    ProxyPass        /search/ http://AI_SEARCH_HOST:5001/
    ProxyPassReverse /search/ http://AI_SEARCH_HOST:5001/
    
    <Location /search/raw/>
        Require all granted
    
        RequestHeader set X-Forwarded-Proto "https"
        RequestHeader set X-Forwarded-Host "YOUR_DOMAIN"
        RequestHeader set X-Forwarded-Prefix "/search/raw"
        RequestHeader set X-Scheme "https"
        RequestHeader set X-Script-Name "/search/raw"
    
        RequestHeader set X-Real-IP %{REMOTE_ADDR}s
        RequestHeader append X-Forwarded-For %{REMOTE_ADDR}s
    </Location>
    
    <Location /search/>
        Require all granted
    
        RequestHeader set X-Forwarded-Proto "https"
        RequestHeader set X-Forwarded-Host "YOUR_DOMAIN"
        RequestHeader set X-Forwarded-Prefix "/search"
        RequestHeader set X-Scheme "https"
        RequestHeader set X-Script-Name "/search"
    
        RequestHeader set X-Real-IP %{REMOTE_ADDR}s
        RequestHeader append X-Forwarded-For %{REMOTE_ADDR}s
    
        SetEnvIf Request_URI "^/search/" dontlog
    </Location>

    Replace AI_SEARCH_HOST and YOUR_DOMAIN before reloading Apache.

    The order matters. /search/raw/ must come before /search/, or Apache will send raw SearXNG requests to the wrapper.
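    ProxyPass directives are evaluated in configuration order and the first matching prefix wins. A toy first-match router (illustrative only, not Apache code) makes the failure mode concrete:

    ```python
    def route(path, rules):
        """First-match prefix routing, mirroring ProxyPass evaluation order."""
        for prefix, backend in rules:
            if path.startswith(prefix):
                return backend
        return None

    good = [("/search/raw/", "searxng:8080"), ("/search/", "wrapper:5001")]
    bad = [("/search/", "wrapper:5001"), ("/search/raw/", "searxng:8080")]

    print(route("/search/raw/static/style.css", good))  # → searxng:8080
    print(route("/search/raw/static/style.css", bad))   # → wrapper:5001 (wrong backend)
    ```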

    Enable modules and reload:

    sudo a2enmod proxy proxy_http headers rewrite ssl
    sudo apache2ctl configtest
    sudo systemctl reload apache2

    Test the final page

    Open:

    https://YOUR_DOMAIN/search/

    Search normally. Results should load without calling the model.

    Then check include AI and search again. Results should still load first, and the AI answer should appear after a few seconds.

    Notes from the build

    Do not start by hacking SearXNG plugins. That sounds cleaner than it is. The current SearXNG plugin system expects proper importable Python modules and fully qualified plugin class names. A wrapper avoids tying your project to SearXNG internals.

    Do not put 127.0.0.1 URLs in HTML that will be loaded by another machine. The user’s browser interprets 127.0.0.1 as the user’s own computer, not your server. Use public paths like /search/raw/search?... and let Apache proxy them.
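    If you ever do have to serve backend HTML that embeds loopback URLs, rewriting them to the public prefix is straightforward. A sketch (the snippet is fabricated and the regex is deliberately narrow):

    ```python
    import re

    # Fabricated backend HTML containing a loopback URL.
    snippet = '<a href="http://127.0.0.1:8080/search?q=x">next page</a>'

    # Rewrite loopback references to the public proxy prefix.
    public = re.sub(r"http://127\.0\.0\.1:8080", "/search/raw", snippet)
    print(public)  # → <a href="/search/raw/search?q=x">next page</a>
    ```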

    Do not run AI on every query unless you really want the latency (or you’re keeping warm in winter).

    If you use a VPN and Docker networking explodes, try host networking for this stack. It is blunt, but it avoids a lot of route and DNS drama.

    Useful references

    • SearXNG Search API (the JSON format must be enabled in settings.yml): https://docs.searxng.org/dev/search_api.html
    • Ollama generate API: https://ollama.readthedocs.io/en/api/
    • Ollama pull API: https://docs.ollama.com/api/pull
    • Docker host networking: https://docs.docker.com/engine/network/drivers/host/
    • Apache reverse proxy guide: https://httpd.apache.org/docs/2.4/howto/reverse_proxy.html
    • Apache mod_proxy docs: https://httpd.apache.org/docs/current/mod/mod_proxy.html