[{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/claude-code/","section":"Tags","summary":"","title":"Claude-Code"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/devops/","section":"Tags","summary":"","title":"Devops"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/homelab/","section":"Tags","summary":"","title":"Homelab"},{"content":"I nearly quit my homelab project over a markdown file.\nNot a broken deployment. Not a corrupted etcd cluster. Not a misconfigured NetworkPolicy at midnight. A markdown file. Specifically, a CLAUDE.md that described a cluster I no longer had, a roadmap I\u0026rsquo;d already executed, and a \u0026ldquo;current state\u0026rdquo; that was 10 sessions out of date. Every time I opened a new Claude Code session, I was handing my AI assistant a map of the wrong city and wondering why we kept ending up in the wrong place.\nTwo days of running in circles. Trying the same things. Re-explaining the same context. Hitting the same dead ends I\u0026rsquo;d already documented for myself and then lost track of. By the end of it I wasn\u0026rsquo;t frustrated at the tooling — I was frustrated at myself, which is somehow worse. That\u0026rsquo;s the kind of demotivation that makes you close the laptop and not open it again for a week.\nThis post is about what I built instead.\nThe actual problem\nI\u0026rsquo;m building a 3-node K3s HA cluster on Lenovo M910q mini PCs, studying for the CKA while doing it, and managing the whole thing as a GitOps portfolio piece. The cluster is real — kube-vip, Longhorn, cert-manager, Cloudflare Tunnel, ArgoCD app-of-apps, the works. I work on it across multiple machines, pick it up and put it down across days and weeks, and travel between Prague and Greece and Iraq in between.\nThe context problem compounds fast in that situation. Claude Code\u0026rsquo;s context window doesn\u0026rsquo;t persist between sessions. Every new session starts cold. 
So I was doing what everyone does — maintaining a CLAUDE.md to give Claude project context. The problem: a static markdown file rots. The cluster moves forward, the file doesn\u0026rsquo;t. And when they diverge badly enough, you get something worse than no context at all: wrong context, delivered confidently.\nI\u0026rsquo;d also made the classic mistake of putting everything in one place. CLAUDE.md had the architecture, the hardware, the stack, the conventions, the roadmap, my notes, my todo list. It was trying to be a brain dump AND a reference doc AND a status tracker. It did none of them well. And every session, all of that got loaded into the context window whether I needed it or not — eating tokens, going stale, eventually contradicting the actual state of the cluster.\nThe specific two-day incident: I spent most of it trying to debug a deployment that the static CLAUDE.md said was still planned, but that I\u0026rsquo;d already partially deployed three sessions ago and left in a broken state. There was nothing in my setup to tell me that. I just kept starting from the wrong assumption.\nWhat I actually needed\nWhen I stepped back and asked what would have prevented those two days, the answer was simple: something that remembered what had already been tried and failed.\nNot a context window. Not a chat history. A ledger. A running record of what I did, what broke, and why — that would survive a closed terminal, a new session, a week away, a different machine. Something I could walk up to cold and ask: \u0026ldquo;where did I leave off?\u0026rdquo;\nAnd then I realised I didn\u0026rsquo;t just need a memory for failures. 
I needed:\nA place to capture what I was learning (CKA is an exam — I should be building a study archive, not rediscovering the same concepts every session)\nA way to keep the AI assistant honest about what my cluster could actually support right now, before I went down a rabbit hole that required hardware I hadn\u0026rsquo;t built yet\nA reviewer that would catch my own convention mistakes before they landed in my public repo\nFour different jobs. Four different memory requirements. The answer was four (actually six) custom Claude Code subagents, each scoped to one job, each with its own persistent memory directory that survives across sessions.\nThe setup\nEach agent is a markdown file in .claude/agents/. Claude Code loads them at startup and routes tasks to them automatically, or you invoke them explicitly with @agent-name. The key property: each agent gets its own memory directory, and its MEMORY.md is loaded into context every time it runs. So the memory persists. The context window resets; the memory doesn\u0026rsquo;t.\nHere\u0026rsquo;s what I built:\nit-accountability — the one that would have saved me two days. An append-only ledger of what\u0026rsquo;s been done, what failed, and what\u0026rsquo;s next. The Failed/Dead Ends section is the part that matters most. It\u0026rsquo;s explicitly instructed never to delete that section, because that\u0026rsquo;s the record that stops you retrying broken approaches. Start any session with @it-accountability where did I leave off? and it reads the ledger and tells you. Its memory is local (gitignored) because it logs command output that might contain secrets.\nthe-student — turns what I learn while building into CKA study cases. Every case gets a \u0026ldquo;CKA angle\u0026rdquo; (how this maps to exam domains), exact commands copied verbatim, and an Anki export at the end. There\u0026rsquo;s a rollup deck at anki/cka-deck.tsv — Anki imports it directly. 
It also tracks a study streak and nudges me if I\u0026rsquo;ve gone three days without capturing anything. The nudge fires when I open any session, which is the only time it can actually reach me.\nthe-contrarian — the one I\u0026rsquo;m most proud of the name for. Before I start any deployment task, it reads the live cluster (kubectl get pods -A, kubectl top nodes, read-only only, no changes ever), compares it against the repo, checks the hardware todo list, and tells me whether what I\u0026rsquo;m about to do is actually sensible right now. It knows what hardware I have, what\u0026rsquo;s coming but not yet built, and the correct sequence for post-travel upgrades. If I try to plan a GPU workload deployment before the GPU box is on Tailscale, it tells me. If I try to add a 4th node before doing the RAM upgrade, it tells me. The \u0026ldquo;rabbit hole warning\u0026rdquo; format is concrete: what you think it\u0026rsquo;ll take, what it will actually take, the first unexpected thing that will appear.\nmanifest-reviewer — read-only pre-commit gate for kubernetes/. Checks for plaintext secrets about to be committed (highest priority), missing cert-manager.io/cluster-issuer annotations, forgotten subPath on Postgres mounts (my own documented gotcha — Longhorn and lost+found), orphaned ArgoCD apps that aren\u0026rsquo;t wired into the bootstrap. Never edits, just reports.\nthe-blogger — banks blog-worthy moments in a local seeds directory. Low-friction: a few lines of raw material, the hook, angle ideas. Drafts posts on request. Never publishes automatically. (This post was seeded by it during the session where I built the setup. The irony is not lost on me.)\nadr-writer — records architecture decisions in my existing ADR format (I have docs/adr/ in the repo). Does a risk check first: flags one-way doors, hidden day-2 costs, credible alternatives I might be dismissing. If the decision is genuinely fine, it says so and drafts. 
The escape hatch is \u0026ldquo;just record it\u0026rdquo; — say that and it writes without argument. Drafts land locally; I move them to docs/adr/ myself when they\u0026rsquo;re ready to commit.\nThe persistence story\nThe agents\u0026rsquo; memory directories live inside the repo under .claude/agent-memory/. They\u0026rsquo;re gitignored — they never reach GitHub. The repo is public, so my study cases, my blog seeds, and my done/failed ledger stay local. But they\u0026rsquo;re real files on disk, which means they survive session resets, survive closing VS Code, survive switching machines.\nThe switching-machines part matters because the dev environment itself is part of this setup. I run VS Code Remote-SSH into a Debian VM on my Unraid server. All work happens inside that VM — kubectl, helm, the agents, everything. The working files live on an NFS-mounted share backed by the Unraid array, not on the VM\u0026rsquo;s SSD. So if the VM\u0026rsquo;s disk dies (it\u0026rsquo;s a CWWK board, I\u0026rsquo;ve made my peace with the risk), nothing of value was on it. And because every laptop SSHs into the same VM, there\u0026rsquo;s no sync problem. The session state is a single source of truth. I switch from the MacBook to the Windows machine and the session is just there, because it never moved.\nThe agents\u0026rsquo; memory moves with the repo because it\u0026rsquo;s in the repo. Open Claude Code on any machine SSHed into the VM, and @it-accountability reads the same ledger. That\u0026rsquo;s the \u0026ldquo;continue where I left off\u0026rdquo; story — not session resumption, but persistent state that travels with the working directory.\nWhat I\u0026rsquo;d do differently\nThe CLAUDE.md lesson: keep it small and stable, and point the rest at agents. My current CLAUDE.md is about 60 lines — hardware, conventions, a pointer to each agent, and a note that the live status lives in it-accountability, not in the file itself. 
The agents handle the accumulation; the CLAUDE.md handles the durable facts that rarely change. That separation is what stops it rotting.\nThe agent I\u0026rsquo;d add if I were starting over: a runbook agent. Something that maintains step-by-step runbooks for recurring operations — drain a node, add a new app, rotate a SOPS key — so I\u0026rsquo;m not reconstructing the same sequence from scratch every time and inevitably missing the one step that matters. It\u0026rsquo;s close to what the-student does for concepts, but scoped to operational procedures. On the list for when I\u0026rsquo;m back from travel.\nThe thing I underestimated: how much the name of an agent matters for delegation. @the-contrarian is memorable and sets the right expectation immediately — this is the thing that argues with me. A generic @k8s-advisor would get used much less. If you build this kind of setup, spend time on the names. They\u0026rsquo;re the interface.\nThe bet I\u0026rsquo;m making\nThe two-day incident is why I built this. Whether the system actually prevents it — I\u0026rsquo;m about to find out.\nThe idea is that I open a session, @it-accountability where did I leave off?, and I\u0026rsquo;m back in context in under a minute instead of re-deriving everything from scratch. The ledger either works or it doesn\u0026rsquo;t — and if it doesn\u0026rsquo;t, at least it\u0026rsquo;ll log why.\nThe Contrarian already earned its keep once: it talked me out of starting the GPU box setup the night before travel by giving me an honest time estimate (three sessions minimum, not two hours) and pointing out that Zero Trust Access was still open and more important. One session in. Good sign.\nThe student, the manifest-reviewer, the blogger — those are hypotheses. Study cases get logged as I solve things without having to sit down and write them up. The reviewer catches convention mistakes before they hit the cluster. Maybe they work exactly as designed. 
Maybe I\u0026rsquo;ll discover the real friction point is somewhere I didn\u0026rsquo;t anticipate. I\u0026rsquo;m genuinely fine with either outcome — that\u0026rsquo;s what the accountability ledger is for.\nSix markdown files. The hard part was figuring out what each one\u0026rsquo;s job was and making the memory boundaries explicit. The rest I\u0026rsquo;ll know in a few weeks.\nIf you\u0026rsquo;re studying for CKA while building real infrastructure, I\u0026rsquo;d start with it-accountability and the-student. Even if the whole system half-works, those two are the ones I\u0026rsquo;d bet on.\nThe agent definitions are at github.com/Steficzko/homelab if you want to use them as a starting point — or see what I got wrong.\nBuilt on: K3s v1.35, ArgoCD, Longhorn, cert-manager, Cloudflare Tunnel. Agents: Claude Code custom subagents with project-scoped persistent memory. Dev seat: Debian 12 VM on Unraid, VS Code Remote-SSH over Tailscale.\n","date":"22 May 2026","permalink":"https://www.kostikidis.net/posts/how-a-bad-claude-md-cost-me-two-days/","section":"Posts","summary":"\u003cp\u003eI nearly quit my homelab project over a markdown file.\u003c/p\u003e\n\u003cp\u003eNot a broken deployment. Not a corrupted etcd cluster. Not a misconfigured\nNetworkPolicy at midnight. A markdown file. Specifically, a \u003ccode\u003eCLAUDE.md\u003c/code\u003e that\ndescribed a cluster I no longer had, a roadmap I\u0026rsquo;d already executed, and a\n\u0026ldquo;current state\u0026rdquo; that was 10 sessions out of date. 
Every time I opened a new\nClaude Code session, I was handing my AI assistant a map of the wrong city\nand wondering why we kept ending up in the wrong place.\u003c/p\u003e","title":"How a Bad CLAUDE.md Cost Me Two Days (And What I Built to Fix It)"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/","section":"kostikidis.net","summary":"","title":"kostikidis.net"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/kubernetes/","section":"Tags","summary":"","title":"Kubernetes"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/posts/","section":"Posts","summary":"","title":"Posts"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/productivity/","section":"Tags","summary":"","title":"Productivity"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/","section":"Tags","summary":"","title":"Tags"},{"content":"I\u0026rsquo;m Stefanos. Nomad, photographer, videographer — and someone who has been refusing to lose data since 1995.\nI run a homelab on three Lenovo M910q mini PCs in Prague. The cluster runs K3s with GitOps, Longhorn storage, and everything self-hosted. No Google, no iCloud, no subscriptions.\nBeyond infrastructure, I build self-hosted automations — connecting services, eliminating manual work, making data do things automatically. On-prem AI is the current frontier: running Ollama, Whisper, and local models entirely on my own hardware, no cloud APIs, no data leaving the building.\nThis is my business card. 
If you want to understand what I do — read the posts.\nInfrastructure: K3s · ArgoCD · Longhorn · cert-manager · Cloudflare Tunnel · SOPS+Age · Prometheus + Grafana\nAutomations: n8n · self-hosted workflows · on-prem AI (Ollama + Whisper via ROCm)\nHardware:\n3× Lenovo M910q — K3s cluster (control plane + workers)\nCWWK N305 — Unraid: NAS, Home Assistant, AdGuard, *arr stack, data array\nRyzen 9 5950X · RX 6700 XT · 64GB RAM — Proxmox: GPU/CPU ML workloads, Linux + Windows VMs ","date":null,"permalink":"https://www.kostikidis.net/about/","section":"kostikidis.net","summary":"\u003cp\u003eI\u0026rsquo;m Stefanos. Nomad, photographer, videographer — and someone who has been refusing to lose data since 1995.\u003c/p\u003e\n\u003cp\u003eI run a homelab on three Lenovo M910q mini PCs in Prague. The cluster runs K3s with GitOps, Longhorn storage, and everything self-hosted. No Google, no iCloud, no subscriptions.\u003c/p\u003e\n\u003cp\u003eBeyond infrastructure, I build self-hosted automations — connecting services, eliminating manual work, making data do things automatically. On-prem AI is the current frontier: running Ollama, Whisper, and local models entirely on my own hardware, no cloud APIs, no data leaving the building.\u003c/p\u003e","title":"About"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/argocd/","section":"Tags","summary":"","title":"Argocd"},{"content":"A homelab horror story about an Unraid update, a perfectly fine Nextcloud, and why you should never touch a working system at 10pm.\nHow It Actually Started\nIt started with an Unraid update.\nI upgraded my Unraid server to version 7.3 — a routine update, didn\u0026rsquo;t think twice about it — and the next thing I knew, Nextcloud wasn\u0026rsquo;t loading. My immediate assumption: something broke in Nextcloud. I had just finished deploying it to my K3s cluster, moved on to other homelab chores, and came back to find it dead.\nSo I started digging into Nextcloud. 
Logs, pod states, exec into containers. The more I looked, the more I found things I wanted to fix. And since I had just installed it fresh and hadn\u0026rsquo;t really locked it down yet, I thought: this is a good time to do a proper security audit.\nWhat followed was an entire day of chasing my own tail.\nThe Setup\nI run a 3-node Kubernetes cluster on three Lenovo M910q mini PCs in Prague. Nextcloud is backed by PostgreSQL and Redis, with data on an NFS share from an Unraid server. Everything is managed by ArgoCD with GitOps, secrets encrypted with SOPS+Age.\nIt was working fine. Then an Unraid update happened, I thought Nextcloud was broken, and I decided to make it more secure.\nAct 1: The Unraid Upgrade Nobody Asked About\nThe actual root cause revealed itself early: the Unraid 7.3 upgrade had silently broken my NFS server. The array wasn\u0026rsquo;t starting rpcbind on boot anymore, and the NFS exports had been reset to all_squash — which maps all clients to the anonymous user, meaning my init container\u0026rsquo;s chown commands were being silently ignored.\nForty minutes of digging through /etc/exports, showmount errors, and Unraid forums later, NFS was back. I added a startup hook to /boot/config/go so it survives reboots. Crisis one averted.\nBut by this point I had already decided to do the security audit. So I kept going.\nAct 2: The Security Audit\nThe findings were reasonable:\nNo Redis password (anyone on the cluster could connect)\nNo NetworkPolicies (pods could talk to anything)\nNo liveness/readiness probes\nNo securityContext (containers running with unnecessary privileges)\nlocalhost in trusted domains\nfederation app enabled unnecessarily\nGood findings. Let\u0026rsquo;s fix them all at once. What could go wrong?\nAct 3: Everything, Simultaneously\nProblem 1: The Redis Password Trap\nI generated a Redis password with openssl rand -base64 32. Redis rejected every connection.\nThe issue: base64 output contains +, /, and = characters. 
When Kubernetes base64-decodes the secret and injects it as an environment variable, the raw bytes aren\u0026rsquo;t valid UTF-8 — the container runtime refused to inject the variable at all.\nFix: openssl rand -hex 32. Hex is always valid ASCII.\nProblem 2: Dropping Capabilities Breaks Everything\nI added capabilities: drop: [\u0026quot;ALL\u0026quot;] to all three containers. Security textbooks say this is correct. The containers disagreed.\nPostgreSQL crashed immediately. The official postgres image runs its entrypoint as root, does chmod on the data directory, then drops to the postgres user. With ALL capabilities dropped, the chmod fails.\nNextcloud crashed more subtly. Apache prefork MPM starts as root to bind port 80, then uses setuid/setgid to drop to www-data. Without SETUID and SETGID, the workers can\u0026rsquo;t switch users. Apache fails silently.\nRedis was fine — the alpine image starts directly as the redis user. No privilege dropping needed.\nThe lesson: security contexts aren\u0026rsquo;t one-size-fits-all. You need to understand what the container\u0026rsquo;s entrypoint actually does before restricting it.\nProblem 3: NetworkPolicy Forgot UDP\nI wrote egress rules allowing DNS on port 53. TCP only.\nDNS uses UDP. Nextcloud couldn\u0026rsquo;t resolve postgres.nextcloud.svc.cluster.local. Twenty minutes of confusion before I noticed the missing protocol: UDP line.\nProblem 4: ArgoCD Is Always Watching\nI applied fixes directly with kubectl apply to test them quickly. ArgoCD has selfHeal: true. Within 3 minutes it reverted everything back to git.\nWith GitOps, the only source of truth is git. kubectl apply is not a fix — it\u0026rsquo;s a temporary hallucination that ArgoCD will cure.\nProblem 5: The Probes That Broke Everything\nI added readiness and liveness probes hitting /status.php. PHP hung on the first request. Apache\u0026rsquo;s worker pool filled up with stuck requests. 
The probe kept firing, kept hanging, and no workers were ever free to serve traffic.\nSymptoms: 503 Service Unavailable with the pod showing 1/1 Running.\nAct 4: The Real Problem\nAfter stripping everything back, the pod was running but every HTTP request timed out. PHP CLI worked fine. The logs told the story:\nMISCONF Redis is configured to save RDB snapshots, but it\u0026#39;s currently unable to persist to disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option).\nRedis was in a locked state. When a Redis BGSAVE fails, Redis refuses all write commands to protect data integrity. It had been silently failing to write its RDB snapshot because the container had no persistent volume for it.\nEvery PHP request touching Redis — sessions, file locks, caching — hung waiting for a write that would never be acknowledged. Apache workers filled up. Cloudflare returned a 524 timeout.\nThe fix was one flag:\nredis-server --requirepass $(REDIS_PASSWORD) --save \u0026#34;\u0026#34;\n--save \u0026quot;\u0026quot; disables RDB persistence. Redis for Nextcloud is a pure cache and session store — there\u0026rsquo;s no data to protect. 
The persistence feature was never needed and silently poisoned the whole stack.\nThe Full Damage Report\nWhat I tried | What broke | Why\nRedis password with base64 | Redis rejected connections | Raw bytes aren\u0026rsquo;t valid UTF-8 env vars\ncapabilities: drop: ALL on postgres | Crash loop | Entrypoint needs root to chmod data dir\ncapabilities: drop: ALL on Nextcloud | Apache silent failure | Needs SETUID/SETGID to drop to www-data\nallowPrivilegeEscalation: false | Permission denied on config.php | Same root cause\nDNS egress policy (TCP only) | Can\u0026rsquo;t resolve postgres | DNS is UDP\nkubectl apply to test changes | Changes silently reverted | ArgoCD selfHeal enforces git state\nReadiness/liveness probes | 503 with pod \u0026ldquo;healthy\u0026rdquo; | Hung workers fill the pool\nRedis RDB persistence (default) | All PHP requests hang | BGSAVE fails → Redis blocks all writes\nThe Takeaway\nThis whole day started because I didn\u0026rsquo;t check whether an update to a completely different system could affect a dependent service. Unraid updated, NFS broke, Nextcloud went down, and I assumed the problem was Nextcloud.\nSecuring a running system is harder than securing a new one. Every container has assumptions baked into its entrypoint. Every framework has defaults that are traps in containerized environments. And every automated system has opinions it will enforce aggressively.\nThe security audit was worth doing. Most of the findings were real. But \u0026ldquo;apply all fixes simultaneously after midnight\u0026rdquo; is how you spend a full day chasing your own tail.\nNext time: check the other systems first. 
Then one change, test, commit, repeat.\nPosted from Prague, sometime after midnight, while Nextcloud finally loads.\n","date":"20 May 2026","permalink":"https://www.kostikidis.net/posts/i-tried-to-secure-my-nextcloud/","section":"Posts","summary":"\u003cp\u003e\u003cem\u003eA homelab horror story about an Unraid update, a perfectly fine Nextcloud, and why you should never touch a working system at 10pm.\u003c/em\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"how-it-actually-started\" class=\"relative group\"\u003eHow It Actually Started \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-actually-started\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIt started with an Unraid update.\u003c/p\u003e\n\u003cp\u003eI upgraded my Unraid server to version 7.3 — a routine update, didn\u0026rsquo;t think twice about it — and the next thing I knew, Nextcloud wasn\u0026rsquo;t loading. My immediate assumption: something broke in Nextcloud. I had just finished deploying it to my K3s cluster, moved on to other homelab chores, and came back to find it dead.\u003c/p\u003e","title":"I Tried to Secure My Self-Hosted Nextcloud. It Didn't Go Well."},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/nextcloud/","section":"Tags","summary":"","title":"Nextcloud"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/redis/","section":"Tags","summary":"","title":"Redis"},{"content":"It started with a button.\nApply Update. 7.3.\nRoutine. Unraid does this all the time. I clicked it, watched the progress bar, saw the success message, and moved on. Didn\u0026rsquo;t think twice.\nAn hour later, Nextcloud was dead.\n503. No logs. No obvious reason. 
I had just deployed Nextcloud to my K3s cluster a few days earlier, moved on to other homelab chores, and came back to a blank error page. My first instinct: something broke in Nextcloud. I never questioned that assumption. I just started digging. I dug for an entire day.\nWhat I eventually found was that the Unraid 7.3 update had silently reset my NFS export settings and stopped rpcbind from starting on boot. Nextcloud wasn\u0026rsquo;t broken at all. It just couldn\u0026rsquo;t reach its data volume. The problem was in a completely different system that I hadn\u0026rsquo;t thought to check because I hadn\u0026rsquo;t connected the timing.\nBy the time I figured it out, I had also convinced myself to run a full security audit on Nextcloud. Which I did. Which broke everything in four different ways simultaneously. Which took the rest of the day to untangle.\nBy midnight I had a working Nextcloud, a set of hard-won lessons about Kubernetes security contexts, and a story that I genuinely wanted to tell.\nThe problem was I had nobody to tell it to.\nMy friends think this is crazy. My family doesn\u0026rsquo;t know what Kubernetes is. My colleagues work in a different field entirely. The homelab community online is great, but it\u0026rsquo;s not the same as actually processing something out loud with someone who gets it.\nSo I wrote it down instead.\nkostikidis.net is where I document what I\u0026rsquo;m building and what breaks while building it. The cluster runs on three Lenovo M910q mini PCs in Prague, managed through GitOps with ArgoCD, storage on Longhorn, secrets encrypted with SOPS+Age, public access through Cloudflare Tunnel. Production-grade infrastructure for a completely personal use case — which is exactly what makes it interesting.\nI\u0026rsquo;m not an expert. I\u0026rsquo;m learning by doing, targeting the CKA certification, and eventually a platform engineering role. The blog is part of that process. 
Writing about what I\u0026rsquo;m doing forces me to actually understand it.\nIf you\u0026rsquo;re running a homelab, chasing a DevOps career, or just the kind of person who can\u0026rsquo;t walk away from a 503 error at midnight — you\u0026rsquo;re in the right place.\nThe full Unraid → Nextcloud → Redis disaster story is next. It\u0026rsquo;s a good one.\n","date":"20 May 2026","permalink":"https://www.kostikidis.net/posts/this-blog-exists-because-of-an-unraid-update/","section":"Posts","summary":"\u003cp\u003eIt started with a button.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eApply Update. 7.3.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eRoutine. Unraid does this all the time. I clicked it, watched the progress bar, saw the success message, and moved on. Didn\u0026rsquo;t think twice.\u003c/p\u003e\n\u003cp\u003eAn hour later, Nextcloud was dead.\u003c/p\u003e\n\u003chr\u003e\n\u003cp\u003e503. No logs. No obvious reason. I had just deployed Nextcloud to my K3s cluster a few days earlier, moved on to other homelab chores, and came back to a blank error page. My first instinct: \u003cem\u003esomething broke in Nextcloud.\u003c/em\u003e I never questioned that assumption. I just started digging.\u003c/p\u003e\n\u003cp\u003eI dug for an entire day.\u003c/p\u003e","title":"This Blog Exists Because of an Unraid Update"},{"content":"","date":null,"permalink":"https://www.kostikidis.net/tags/unraid/","section":"Tags","summary":"","title":"Unraid"}]