Open source security observability for Kubernetes

Viceroy

Every syscall, pod, role, image, secret, and network flow in one attack graph. Built for platform engineers who have to keep clusters alive and explain the weird parts fast.

Join waitlist / See the system

$ viceroy watch prod --graph --explain-path

<2% node CPU overhead target

<1s event to incident target

6 signal planes normalized

OSS agent, rules, schema, API

cluster/prod-us-east-02 :: incident graph live

attack path: exec to secret egress

Viceroy links a shell inside a pod to its service account, role binding, secret read, and outbound flow. The alert is the path.

94 risk score

contain namespace: Draft default-deny policy from observed safe flows.

rotate token: ServiceAccount token used after suspicious exec.

preserve evidence: Snapshot pod fs, audit trail, and network trace.

The hole

Kubernetes security is not missing tools. It is missing context.

Scanners know images. Policy engines know YAML. SIEMs know logs. Runtime agents know syscalls. Attackers do not care about those org-chart boundaries.

The useful unit is the path: what ran, who it ran as, what it could touch, where it connected, what changed, and what to do before the evidence disappears.

EKS: cloud audit plus runtime

GKE: identity and workload graph

AKS: multi-cluster posture

k3s: small clusters count too

bare metal: no cloud lock-in

OpenShift: policy drift and evidence

What changes

Stop ranking alerts. Start explaining incidents.

Viceroy should make the boring cases boring and the dangerous cases obvious. A CVE that cannot be reached is not the same as a pod that just read a secret and opened a new egress path.

Runtime is first class.

eBPF signals, container runtime events, kubelet logs, and API audit events land in the same timeline.

Identity is attached.

Every process gets mapped back to pod, namespace, service account, RBAC, and cloud identity.

Noise gets filtered.

Vulnerabilities are ranked by reachability, loaded packages, live network exposure, and privilege.
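One way to picture that ranking is a weighted score over the four signals named above. This is a minimal sketch, not Viceroy's scoring: the `Finding` type, the weights, and the placeholder CVE labels are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cve: str
    reachable: bool        # vulnerable code can be reached from an entry point
    package_loaded: bool   # the package is loaded by a running process
    network_exposed: bool  # the workload has live network exposure
    privileged: bool       # the workload runs with elevated privileges


# Hypothetical weights, chosen only to make the ordering visible.
WEIGHTS = {
    "reachable": 40,
    "package_loaded": 25,
    "network_exposed": 20,
    "privileged": 15,
}


def rank_score(finding: Finding) -> int:
    """Sum the weights of every signal that is true for this finding."""
    return sum(w for name, w in WEIGHTS.items() if getattr(finding, name))


def rank(findings):
    """Return findings sorted from highest to lowest score."""
    return sorted(findings, key=rank_score, reverse=True)


findings = [
    Finding("CVE-EXAMPLE-1", reachable=False, package_loaded=False,
            network_exposed=False, privileged=False),
    Finding("CVE-EXAMPLE-2", reachable=True, package_loaded=True,
            network_exposed=True, privileged=False),
]

for f in rank(findings):
    print(f.cve, rank_score(f))
```

The unreachable, unloaded finding scores zero and sinks to the bottom, which is the "boring cases boring" behavior the page argues for.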

Response is generated.

Quarantine policy, token rotation, seccomp draft, and evidence capture are proposed from the graph.
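As an illustration of the quarantine proposal, the sketch below drafts a Kubernetes `NetworkPolicy` that denies everything in a namespace except egress flows already observed as safe. The function name, the policy name `viceroy-quarantine`, the namespace, and the sample flows are hypothetical; only the `networking.k8s.io/v1` object shape is standard Kubernetes.

```python
import json


def draft_default_deny(namespace, safe_egress):
    """Build a NetworkPolicy dict that allows only the observed egress flows.

    safe_egress is an iterable of (cidr, protocol, port) tuples.
    """
    rules = [
        {
            "to": [{"ipBlock": {"cidr": cidr}}],
            "ports": [{"protocol": protocol, "port": port}],
        }
        # De-duplicate and sort so the draft is stable between runs.
        for cidr, protocol, port in sorted(set(safe_egress))
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "viceroy-quarantine", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Ingress is listed with no ingress rules, so all ingress is denied.
            "policyTypes": ["Ingress", "Egress"],
            "egress": rules,
        },
    }


# Hypothetical observed flows: a database subnet and a DNS server.
policy = draft_default_deny(
    "shop",
    [("10.0.12.0/24", "TCP", 5432), ("10.0.0.10/32", "UDP", 53)],
)
print(json.dumps(policy, indent=2))
```

The draft is printed as JSON, which Kubernetes accepts alongside YAML, so a human can review it before anything is applied.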

The product

One graph. Six signal planes. No spiritual dashboards.

Click a signal. The screen changes because the model changes. The landing page is a sketch; the product should be a queryable incident machine.

Runtime behavior, with Kubernetes names.

A shell spawned in a web pod is not an alert by itself. It becomes useful when you know the image, service account, namespace policy, secret access, and new outbound flow.

12:41:01  execve /bin/sh  (runtime)

12:41:03  sa/frontend can list secrets  (rbac)

12:41:08  read secret/stripe-key  (audit)

12:41:10  new egress 185.199.109.0/24  (flow)

12:41:12  container image has reachable openssl CVE  (sbom)
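A timeline like the one above could be turned into an incident by matching an ordered chain of planes per pod inside a time window. The sketch below is a deliberately simple greedy matcher, not Viceroy's detection engine; the `Event` type, the pod name `frontend-0`, the date, and the 60-second window are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    ts: datetime
    pod: str
    plane: str   # runtime, rbac, audit, flow, or sbom
    detail: str


# Exec, then secret read, then new egress.
CHAIN = ["runtime", "audit", "flow"]


def find_chains(events, window=timedelta(seconds=60)):
    """Return {pod: matched events} for pods that complete CHAIN in order."""
    by_pod = {}
    for event in sorted(events, key=lambda e: e.ts):
        by_pod.setdefault(event.pod, []).append(event)

    hits = {}
    for pod, pod_events in by_pod.items():
        matched = []
        for event in pod_events:
            # Drop a partial match once its first event falls out of the window.
            if matched and event.ts - matched[0].ts > window:
                matched = []
            if event.plane == CHAIN[len(matched)]:
                matched.append(event)
                if len(matched) == len(CHAIN):
                    hits[pod] = matched
                    break
    return hits


def at(second):
    # Hypothetical date; only the clock times come from the timeline.
    return datetime(2024, 1, 1, 12, 41, second)


events = [
    Event(at(1), "frontend-0", "runtime", "execve /bin/sh"),
    Event(at(3), "frontend-0", "rbac", "sa/frontend can list secrets"),
    Event(at(8), "frontend-0", "audit", "read secret/stripe-key"),
    Event(at(10), "frontend-0", "flow", "new egress 185.199.109.0/24"),
]

for pod, chain in find_chains(events).items():
    print(pod, [e.detail for e in chain])
```

Events from other planes, such as the rbac line, are skipped by the matcher but could still be attached to the incident as context.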

Architecture

Small agents. Brutal correlation. Boring deployment.

The architecture is intentionally plain: DaemonSets collect truth, streams normalize it, the graph correlates it, detection scores it, and automation proposes the smallest safe action.

01 / collect

Agents

eBPF, audit API, CRI, kubelet, Prometheus, cloud logs, GitOps webhooks.

02 / ingest

Stream

Timestamped events, schema normalized, tenant isolated, replayable by design.

03 / connect

Graph

Pods to processes, roles, secrets, images, endpoints, deployments, clusters.

04 / detect

Rules plus ML

Falco-like rules, anomaly baselines, ATT&CK chains, reachability ranking.

05 / act

Playbooks

Quarantine namespace, kill pod, rotate token, draft seccomp, preserve evidence.
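The connect stage above can be sketched as a breadth-first walk over typed edges. This is an assumed, in-memory toy: the node naming scheme, the relation labels, and the role `secret-reader` are hypothetical and say nothing about the real graph schema.

```python
from collections import deque

# Hypothetical edges in the form (source, relation, target).
EDGES = [
    ("process:/bin/sh", "runs_in", "pod:frontend"),
    ("pod:frontend", "uses", "sa:frontend"),
    ("sa:frontend", "bound_to", "role:secret-reader"),
    ("role:secret-reader", "can_read", "secret:stripe-key"),
    ("pod:frontend", "connects_to", "endpoint:185.199.109.0/24"),
]


def build_adjacency(edges):
    """Index edges by source node for fast neighbor lookup."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the node list from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


adjacency = build_adjacency(EDGES)
path = shortest_path(adjacency, "process:/bin/sh", "secret:stripe-key")
print(" -> ".join(path))
```

The returned path is the explanation: each hop names the object that let the shell reach the secret.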

90%+ true positive target on ATT&CK container scenarios

<5% false alert target after graph ranking

100s of clusters per control plane target

0 manual log hunts for first-response evidence

Open-core

Open where trust matters. Hosted where fleet ops hurt.

Security infra with a closed agent is a weird ask. Viceroy is open-core: the collector, rule format, graph schema, CLI, and local control plane stay inspectable. The paid control plane wins on retention, team workflows, managed correlation, and enterprise evidence.

Open-core without the agent mystery.

The core should be useful enough for a platform engineer to trust in prod. The business is hosted operations, long retention, collaboration, and fleet-scale correlation.

viceroy-agent: DaemonSet

viceroy-rules: YAML

viceroy-graph-schema: OpenAPI

viceroy-console: Web

Self-hosted core

Single cluster or small fleet. Local graph, local rules, local evidence export. No vendor hostage move.

Cloud control plane

Managed multi-cluster correlation, long retention, team RBAC, alert routing, compliance exports, hosted upgrades.

Labs and validation

Attack simulation packs that prove detections actually work: token theft, cryptomining, escape attempts, rogue ingress.

Roadmap

Ship the wedge, then eat the category.

The first version should not pretend to solve every compliance acronym. It should catch obvious badness, explain it better than Falco plus dashboards, and generate the fix.

0-3 months: prove signal

Agent collecting audit, kubelet, runtime, eBPF basics
Rule engine and incident timeline
Minimal graph console

4-6 months: prove fleet

Multi-cluster aggregation
Cloud logs and GitOps webhooks
CIS and NSA-CISA posture mapping

7-12 months: prove action

Anomaly baselines by workload
NetworkPolicy and seccomp generation
One-click containment playbooks

13-18 months: prove category

Attack simulation lab
Advanced tenancy and evidence retention
Enterprise reporting without spreadsheet cosplay

Decisions plus questions

Viceroy, open-core, platform engineers, waitlist. Good. Now the hard parts.

The positioning is locked. These questions decide what the first waitlist users should believe and what the first demo must prove.

What does the waitlist promise?

Platform engineers should expect a practical early build: install an agent, see attack paths, generate fixes, and export evidence.

What stays open-core?

Agent, rules, graph schema, API, CLI, and local console should be inspectable. Managed retention and fleet workflows can be paid.

What does a platform engineer need first?

Fast install, low overhead, clear blast radius, useful YAML output, and no dashboard that needs a dedicated operator.

How aggressive can automation be?

The options are read-only recommendation, one-click response, or policy auto-apply above a confidence threshold. This is the scary part.
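To make the three options concrete, here is a minimal sketch of how a confidence score might select one of them. The thresholds, the `allow_auto` opt-in, and every name below are assumptions for discussion, not a decided design.

```python
from enum import Enum


class Mode(Enum):
    RECOMMEND = "read-only recommendation"
    ONE_CLICK = "one-click response"
    AUTO_APPLY = "policy auto-apply"


def choose_mode(confidence, click_threshold=0.70, auto_threshold=0.95,
                allow_auto=False):
    """Pick the most aggressive mode the confidence score permits.

    Auto-apply is off unless the operator opts in with allow_auto=True.
    The default thresholds are hypothetical.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if allow_auto and confidence >= auto_threshold:
        return Mode.AUTO_APPLY
    if confidence >= click_threshold:
        return Mode.ONE_CLICK
    return Mode.RECOMMEND


print(choose_mode(0.50).value)
print(choose_mode(0.99).value)
print(choose_mode(0.99, allow_auto=True).value)
```

Keeping auto-apply behind an explicit opt-in means a high score alone never changes a cluster.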

What is the first undeniable demo?

Steal a service account token, read a secret, open egress, then show Viceroy explaining and containing the whole chain.

Where does data live?

Some teams cannot send runtime telemetry to a SaaS. The product needs a clean answer for local, hybrid, and hosted modes.

Join the Viceroy waitlist.

For platform engineers running Kubernetes in anger. Early access should focus on install speed, attack-path clarity, and generated remediation.