Title: Run Large LLMs Locally on NVIDIA Spark (No Cloud, 100% Private)
Description: Want to run large language models locally without sending your data to the cloud? In this video, I show you exactly how to run powerful LLMs on NVIDIA Spark using Open WebUI and Ollama — completely private and hosted on your own hardware.
We walk step-by-step through setting up Open WebUI with Ollama using NVIDIA Sync and Docker, pulling models like LLaMA 70B, DeepSeek, GPT-OSS 20B, and Qwen, and configuring everything so it runs smoothly on an enterprise-class GPU. You’ll see how to install the container, configure ports, create an admin account, download models, and switch between them — all without relying on ChatGPT or any external cloud service.
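If you would rather download models from a terminal than through the web interface, here is a minimal sketch of how that could look. It assumes the container is named open-webui (as in the script further down) and that the bundled image exposes the ollama CLI inside the container. The function names pull_models and list_models are my own, made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pull Ollama models into the running Open WebUI container.
# Assumption: the container is named "open-webui" and ships the `ollama` CLI.

CONTAINER="open-webui"

# pull_models MODEL [MODEL...]
# Runs `ollama pull` inside the container once per model tag.
pull_models() {
  if [ "$#" -eq 0 ]; then
    echo "usage: pull_models MODEL [MODEL...]" >&2
    return 2
  fi
  local model
  for model in "$@"; do
    echo "Pulling ${model} into ${CONTAINER}..."
    docker exec "${CONTAINER}" ollama pull "${model}" || return 1
  done
}

# list_models: show the models already downloaded in the container.
list_models() {
  docker exec "${CONTAINER}" ollama list
}
```

Example: pull_models gpt-oss:20b, then list_models to confirm it arrived. The tag is only an example; check the Ollama model library for the exact tag of each model you want.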
I also cover real-world expectations like first-run delays, GPU memory usage, slower inference trade-offs, and how to update or stop containers to reclaim resources. By the end, you’ll have a fully private, local AI assistant capable of running surprisingly large models right at home.
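The stop and update steps mentioned above can be sketched with plain Docker commands. This is a rough outline, not the exact procedure from the video: it assumes the container name open-webui and the image tag used in the script further down, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: inspect, stop, or update the Open WebUI container.
# Assumption: container name and image match the launch script in this description.

NAME="open-webui"
IMAGE="ghcr.io/open-webui/open-webui:ollama"

# show_loaded_models: list the models Ollama currently holds in memory.
show_loaded_models() {
  docker exec "${NAME}" ollama ps
}

# stop_webui: stop the container so loaded models release GPU memory.
stop_webui() {
  docker stop "${NAME}"
}

# update_webui: fetch the newest image, then remove the old container so the
# next launch recreates it from that image. Named volumes are left alone,
# so downloaded models and account data are kept.
update_webui() {
  docker pull "${IMAGE}" || return 1
  docker stop "${NAME}" >/dev/null 2>&1 || true
  docker rm "${NAME}" >/dev/null 2>&1 || true
  echo "Updated image; re-run the launch script to recreate ${NAME}."
}
```

Run show_loaded_models before stop_webui to see what is occupying memory. After update_webui, start the container again with the launch script.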
If you care about data privacy, self-hosted AI, or running LLMs locally, this setup is one of the easiest and most powerful ways to get started.
👍 If this helped, like the video, subscribe, and drop a comment with models or features you’d like me to test next.
CODE SNIPPET for the Custom Port Configuration (Step Four)
#!/usr/bin/env bash
set -euo pipefail

NAME="open-webui"
IMAGE="ghcr.io/open-webui/open-webui:ollama"

# Stop the container when the script is interrupted or exits.
cleanup() {
  echo "Signal received; stopping ${NAME}..."
  docker stop "${NAME}" >/dev/null 2>&1 || true
  exit 0
}
trap cleanup INT TERM HUP QUIT EXIT

# Ensure Docker CLI and daemon are available
if ! docker info >/dev/null 2>&1; then
  echo "Error: Docker daemon not reachable." >&2
  exit 1
fi

# Already running?
if [ -n "$(docker ps -q --filter "name=^${NAME}$" --filter "status=running")" ]; then
  echo "Container ${NAME} is already running."
else
  # Exists but stopped? Start it.
  if [ -n "$(docker ps -aq --filter "name=^${NAME}$")" ]; then
    echo "Starting existing container ${NAME}..."
    docker start "${NAME}" >/dev/null
  else
    # Not present: create and start it.
    echo "Creating and starting ${NAME}..."
    # I added these settings for speed testing. They live in an array because
    # comments placed between backslash-continued lines break the command.
    TEST_ENV=(
      -e PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
      -e CUDA_VISIBLE_DEVICES=0
      -e NVIDIA_TF32_OVERRIDE=0
      -e COMMANDLINE_ARGS="--precision fp16" # intended to halve VRAM usage; much faster than FP32
    )
    # End of test settings.
    docker run -d -p 12000:8080 --gpus=all \
      "${TEST_ENV[@]}" \
      -v open-webui:/app/backend/data \
      -v open-webui-ollama:/root/.ollama \
      --name "${NAME}" "${IMAGE}" >/dev/null
  fi
fi

echo "Running. Press Ctrl+C to stop ${NAME}."

# Keep the script alive until a signal arrives
while :; do sleep 86400; done
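Because the first run can take a while before the page loads, a small readiness check can save some guessing. The sketch below polls the port published by the script (12000) until the web UI answers. The URL http://localhost:12000, the retry defaults, and the function name wait_for_webui are assumptions of mine, not something shown in the video.

```shell
#!/usr/bin/env bash
# Sketch: wait until Open WebUI answers on the published port.
# Assumption: the UI is reachable at http://localhost:12000 from this machine.

# wait_for_webui [URL] [TRIES] [DELAY_SECONDS]
wait_for_webui() {
  local url="${1:-http://localhost:12000}"
  local tries="${2:-60}"
  local delay="${3:-5}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    # -f: treat HTTP errors as failure, -s: no progress output
    if curl -fs -o /dev/null "${url}"; then
      echo "Open WebUI is responding at ${url}"
      return 0
    fi
    sleep "${delay}"
    i=$((i + 1))
  done
  echo "Open WebUI did not respond at ${url} after ${tries} attempts." >&2
  return 1
}
```

Calling wait_for_webui with no arguments retries for up to about five minutes. Pass a different URL, attempt count, or delay if your setup is slower or uses another port.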
⭐ Support the Channel ⭐
If you enjoy the video, please consider liking, subscribing, and sharing!
🌐 Follow Me
Facebook: https://www.facebook.com/madhouse74
X (Twitter): https://x.com/MadTc74
LinkedIn: https://www.linkedin.com/in/mad-tc-086046285/
GitLab: https://gitlab.com/MadTcTutorials
📦 Affiliates:
💡 Disclaimer: Some of these links are affiliate links. They cost you nothing but help support the channel. Thank you!
Amazon: https://amzn.to/3TEscdX
Unraid (Referral Discount): https://unraid.net/pricing?via=4e1dee
🎥 Equipment I Use:
Logitech Brio PRO X 4K Webcam: https://amzn.to/410ouxO
Shure SM4 Studio Recording Microphone
M-Audio M-Track Duo – USB Audio: https://amzn.to/49T5Cot
HyperX Headset: https://amzn.to/4gWESpp
U7-Pro AP WiFi7 PoE+: https://amzn.to/4qVGBRg
Ubiquiti Switch Enterprise 24 PoE: https://amzn.to/4qPjyHL
Ubiquiti Enterprise Security Gateway and Network Appliance with 10G SFP+: https://amzn.to/4anvURm
NVIDIA DGX Spark: https://amzn.to/46mImyl
💸 Free Stock Through Robinhood
Robinhood: Join Robinhood with my link and we'll both pick our own free stock 🤝 https://join.robinhood.com/travisb-707fa1
⏱️ CHAPTERS
00:00 Intro
00:10 Quick Demo
02:07 Getting Started
02:33 Instructions
04:04 Step One: Configure Docker Permissions
05:16 Step Two: Verify Docker Setup and Pull Container
05:39 Step Three: Open NVIDIA Sync
05:56 Step Four: Add Open WebUI Custom Port Configuration
07:25 Step Five: Launch Open WebUI
10:01 Step Six: Create Administrator Account
10:28 What is New
10:48 Step Seven: Download and Configure a Model
11:47 Step Eight: Test the Model
13:18 Change Profile Photo
13:58 Step Nine: Stop the Open WebUI
14:37 Step Ten: Next Steps - Download Other Models
17:12 Testing New Model
18:24 What's Next
19:15 Like and Subscribe Bro
🔑 Keywords:
run LLMs locally, NVIDIA Spark LLM, Open WebUI Ollama setup, local AI server, self hosted LLM, private AI assistant, Ollama Open WebUI, run LLaMA locally, local large language model, NVIDIA Spark AI, Docker LLM setup, enterprise GPU AI, offline AI models, private ChatGPT alternative, local AI workflow, run AI without cloud, GPU LLM inference, NVIDIA Sync Open WebUI, Ollama models local, Qwen LLM local, DeepSeek local AI, GPT OSS local
#️⃣ Hashtags:
#LocalLLM, #NVIDIASpark, #OpenWebUI, #Ollama, #SelfHostedAI, #PrivateAI, #RunLLMLocally, #LocalAI, #AIPrivacy, #DockerAI, #LLaMA, #HomeLabAI
