Runtime is first class.
Open-source security observability for Kubernetes
Every syscall, pod, role, image, secret, and network flow in one attack graph. Built for platform engineers who have to keep clusters alive and explain the weird parts fast.
viceroy watch prod --graph --explain-path
Viceroy links a shell inside a pod to its service account, role binding, secret read, and outbound flow. The alert is the path.
The hole
Kubernetes security is not missing tools. It is missing context.
Scanners know images. Policy engines know YAML. SIEMs know logs. Runtime agents know syscalls. Attackers do not care about those org-chart boundaries.
The useful unit is the path: what ran, who it ran as, what it could touch, where it connected, what changed, and what to do before the evidence disappears.
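One way to picture the path as a unit is a single record that answers each of those questions. This is a minimal sketch, not Viceroy's graph schema: the class name, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttackPath:
    """Hypothetical record; field names are illustrative, not Viceroy's schema."""
    what_ran: str                                           # process or command
    ran_as: str                                             # identity it ran under
    could_touch: List[str] = field(default_factory=list)    # reachable resources
    connected_to: List[str] = field(default_factory=list)   # network peers
    changed: List[str] = field(default_factory=list)        # observed mutations
    next_action: str = ""                                   # suggested response

    def summary(self) -> str:
        """Render the path as one line, skipping empty parts."""
        parts = [f"{self.what_ran} ran as {self.ran_as}"]
        if self.could_touch:
            parts.append("could touch " + ", ".join(self.could_touch))
        if self.connected_to:
            parts.append("connected to " + ", ".join(self.connected_to))
        if self.changed:
            parts.append("changed " + ", ".join(self.changed))
        if self.next_action:
            parts.append("next: " + self.next_action)
        return "; ".join(parts)
```

The point of the shape is that an alert without `ran_as` or `could_touch` filled in is visibly incomplete.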
What changes
Viceroy should make the boring cases boring and the dangerous cases obvious. A CVE that cannot be reached is not the same as a pod that just read a secret and opened a new egress path.
eBPF signals, container runtime events, kubelet logs, and API audit events land in the same timeline.
Every process gets mapped back to pod, namespace, service account, RBAC, and cloud identity.
Vulnerabilities are ranked by reachability, loaded packages, live network exposure, and privilege.
Quarantine policy, token rotation, seccomp draft, and evidence capture are proposed from the graph.
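Ranking by reachability, loaded packages, live network exposure, and privilege could look roughly like the sketch below. The weights, the `Finding` type, and the placeholder CVE identifiers are assumptions made for illustration; the source does not specify a scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical vulnerability finding with the four ranking signals."""
    cve: str
    reachable: bool         # vulnerable code path can actually be hit
    package_loaded: bool    # package is loaded in a running process
    network_exposed: bool   # workload has live network exposure
    privileged: bool        # workload runs with elevated privilege


# Assumed weights: reachability counts most, privilege least.
WEIGHTS = {"reachable": 4, "package_loaded": 3, "network_exposed": 2, "privileged": 1}


def score(finding: Finding) -> int:
    """Sum the weights of every signal that is true for this finding."""
    return sum(w for name, w in WEIGHTS.items() if getattr(finding, name))


def rank(findings):
    """Highest score first; ties keep their input order."""
    return sorted(findings, key=score, reverse=True)
```

With these weights, an unreachable CVE in a privileged pod scores 1, while a reachable, loaded, exposed one scores 9 and sorts to the top.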
The product
Click a signal. The screen changes because the model changes. The landing page is a sketch; the product should be a queryable incident machine.
A shell spawned in a web pod is not an alert by itself. It becomes useful when you know the image, service account, namespace policy, secret access, and new outbound flow.
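As a rough sketch of that idea, the shell event can be treated as a starting node and its context gathered by walking a graph. The node names, the edge layout, and the "secret plus egress" rule here are hypothetical, chosen only to mirror the example in the text.

```python
from collections import deque

# Hypothetical graph: each node lists the nodes it links to.
GRAPH = {
    "process:sh": ["pod:web-7f9c"],
    "pod:web-7f9c": ["image:web:1.4", "serviceaccount:web", "flow:203.0.113.7:443"],
    "serviceaccount:web": ["rolebinding:web-secrets"],
    "rolebinding:web-secrets": ["secret:db-creds"],
}


def context_for(start, graph):
    """Breadth-first walk: every node reachable from start, in discovery order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen[1:]  # drop the starting node itself


def is_actionable(start, graph):
    """Assumed rule: a shell matters once it can reach a secret and has egress."""
    ctx = context_for(start, graph)
    touches_secret = any(n.startswith("secret:") for n in ctx)
    has_egress = any(n.startswith("flow:") for n in ctx)
    return touches_secret and has_egress
```

Calling `is_actionable("process:sh", GRAPH)` returns `True` for this sample graph; remove the outbound flow or the role binding and the same shell drops back to background noise.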
Architecture
The architecture is intentionally plain: DaemonSets collect truth, streams normalize it, the graph correlates it, detection scores it, and automation proposes the smallest safe action.
Collect: eBPF, audit API, CRI, kubelet, Prometheus, cloud logs, GitOps webhooks.
Normalize: timestamped events, schema-normalized, tenant-isolated, replayable by design.
Correlate: pods to processes, roles, secrets, images, endpoints, deployments, clusters.
Score: Falco-like rules, anomaly baselines, ATT&CK chains, reachability ranking.
Propose: quarantine namespace, kill pod, rotate token, draft seccomp, preserve evidence.
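The five stages above can be sketched as a handful of small functions. Everything here is an assumption for illustration: the raw event fields (`time`, `src`, `pod`, `kind`), the three-step chain, and the ordering of proposed actions from least to most disruptive are not taken from Viceroy itself.

```python
# Assumed detection chain: these kinds must occur in this order for one pod.
CHAIN = ["shell_spawn", "secret_read", "new_egress"]


def normalize(raw):
    """Map a source-specific event onto one shared schema (assumed fields)."""
    return {"ts": raw["time"], "source": raw["src"],
            "pod": raw.get("pod", ""), "kind": raw["kind"]}


def correlate(events):
    """Group event kinds by pod, ordered by timestamp."""
    by_pod = {}
    for e in sorted(events, key=lambda e: e["ts"]):
        by_pod.setdefault(e["pod"], []).append(e["kind"])
    return by_pod


def detect(kinds):
    """True when CHAIN appears in kinds as an ordered subsequence."""
    it = iter(kinds)
    return all(step in it for step in CHAIN)


def propose(pod):
    """Candidate actions, least disruptive first (ordering is an assumption)."""
    return [f"preserve evidence for {pod}",
            f"rotate token for {pod}",
            f"quarantine {pod}"]


def run(raw_events):
    """Collect -> normalize -> correlate -> score -> propose, for flagged pods."""
    timeline = correlate(normalize(r) for r in raw_events)
    return {pod: propose(pod) for pod, kinds in timeline.items() if detect(kinds)}
```

Because `correlate` sorts by timestamp before grouping, events from different collectors can arrive out of order and still form the chain.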
Open-core
Security infra with a closed agent is a weird ask. Viceroy is open-core: the collector, rule format, graph schema, CLI, and local control plane stay inspectable. The paid control plane wins on retention, team workflows, managed correlation, and enterprise evidence.
The core should be useful enough for a platform engineer to trust in prod. The business is hosted operations, long retention, collaboration, and fleet-scale correlation.
Single cluster or small fleet. Local graph, local rules, local evidence export. No vendor hostage move.
Managed multi-cluster correlation, long retention, team RBAC, alert routing, compliance exports, hosted upgrades.
Attack simulation packs that prove detections actually work: token theft, cryptomining, escape attempts, rogue ingress.
Roadmap
The first version should not pretend to solve every compliance acronym. It should catch obvious badness, explain it better than Falco plus dashboards, and generate the fix.
Decisions plus questions
The positioning is locked. These questions decide what the first waitlist users should believe and what the first demo must prove.
Platform engineers should expect a practical early build: install an agent, see attack paths, generate fixes, and export evidence.
Agent, rules, graph schema, API, CLI, and local console should be inspectable. Managed retention and fleet workflows can be paid.
Fast install, low overhead, clear blast radius, useful YAML output, and no dashboard that needs a dedicated operator.
Read-only recommendation, one-click response, or policy auto-apply after a confidence threshold. This is the scary part.
Steal a service account token, read a secret, open egress, then show Viceroy explaining and containing the whole chain.
Some teams cannot send runtime telemetry to a SaaS. The product needs a clean answer for local, hybrid, and hosted modes.
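The three response modes named above (read-only recommendation, one-click response, auto-apply after a confidence threshold) differ only in who pulls the trigger. A minimal sketch of that gate, with a hypothetical `Mode` enum and an assumed default threshold of 0.9:

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical names for the three response modes."""
    RECOMMEND = "read-only recommendation"
    ONE_CLICK = "one-click response"
    AUTO_APPLY = "policy auto-apply"


def may_execute(mode, confidence, threshold=0.9, approved=False):
    """Return True only when the proposed action is allowed to run.

    RECOMMEND never executes, ONE_CLICK needs a human approval,
    AUTO_APPLY needs confidence at or above the (assumed) threshold.
    """
    if mode is Mode.RECOMMEND:
        return False
    if mode is Mode.ONE_CLICK:
        return approved
    return confidence >= threshold
```

Keeping the decision in one function makes the scary mode easy to audit: there is exactly one line where automation acts without a human.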
For platform engineers running Kubernetes in anger. Early access should focus on install speed, attack-path clarity, and generated remediation.