AI Infra Dao

AI Infra Brief | KubeCon EU AI Inference, LiteLLM Supply Chain Attack & 6G AI-Native Networks (Apr. 7, 2026)

April 7, 2026 marked a pivotal moment as the cloud-native community confirmed Kubernetes is evolving from an application orchestration layer into an AI inference control plane, the AI inference security supply chain suffered a major breach, and the AI-native architecture roadmap for 6G networks came into sharper focus.

Key Highlights

☁️ KubeCon EU 2026: 66% of K8s users now running GenAI workloads, AI Conformance Program certifications surging

🔒 LiteLLM supply chain attack: TeamPCP exploits .pth files to steal cloud credentials from thousands of enterprises

💰 Sona closes $45M Series B for frontline economy AI operations platform

🌐 NVIDIA, Qualcomm, and Ericsson advance 6G AI-native network architecture

⭐ OpenClaw open-source AI assistant surpasses 215K GitHub stars, ecosystem accelerating

🧠 Dante-2B: bilingual open-source LLM trained from scratch on 2×H200

⛓️ Lithosphere expands developer toolchain, advancing AI-native contract language Lithic

Cloud Infrastructure & Kubernetes

☁️ KubeCon EU 2026: Kubernetes Officially Becomes the “Home OS” for AI Inference

According to recaps from Pulumi, Forbes, and Kubermatic, the core conversation at KubeCon EU 2026 shifted from “can AI run on K8s” to “how to run it at scale.” Key data points: CNCF surveys show 82% of container users run K8s in production, with 66% already deploying GenAI workloads; the Kubernetes AI Conformance Program has seen a surge in certified products; Google GKE, AWS EKS, and Microsoft Azure Arc all released scheduling and GPU management enhancements for AI workloads.

A key consensus at this KubeCon: K8s is no longer just an application orchestration platform — it’s the control plane for AI inference. The stateful nature of inference workloads (model weights, checkpoints, fine-tuning data) imposes new requirements on storage and backup, while GPU scheduling and multi-tenant isolation have become core challenges for platform engineering.
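The GPU scheduling and stateful-inference requirements above are concrete at the manifest level. As a hedged illustration (the pod name, container image, PVC name, and node label below are all hypothetical, not from any KubeCon talk), a GPU-scheduled inference pod typically requests an extended resource such as `nvidia.com/gpu` and mounts model weights from persistent storage:

```python
import json

# Sketch of a GPU-scheduled inference pod as a Kubernetes manifest.
# All names (llm-inference, model-weights-pvc, gpu-pool) are illustrative;
# nvidia.com/gpu is the standard NVIDIA device-plugin resource name.
inference_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "llm-inference", "labels": {"app": "llm-inference"}},
    "spec": {
        "containers": [{
            "name": "server",
            "image": "example.com/llm-server:latest",  # hypothetical image
            "resources": {
                # GPUs cannot be overcommitted: requests must equal limits
                "limits": {"nvidia.com/gpu": "1", "memory": "32Gi"},
                "requests": {"nvidia.com/gpu": "1", "memory": "32Gi"},
            },
            # The stateful part: model weights live on a persistent volume
            "volumeMounts": [{"name": "model-weights", "mountPath": "/models"}],
        }],
        "volumes": [{
            "name": "model-weights",
            "persistentVolumeClaim": {"claimName": "model-weights-pvc"},
        }],
        # Multi-tenant isolation: pin inference to a dedicated GPU node pool
        "nodeSelector": {"gpu-pool": "inference"},
    },
}

print(json.dumps(inference_pod, indent=2))
```

The requests-equal-limits constraint is what makes GPU bin-packing a scheduling problem rather than an overcommit problem, which is why GPU-aware schedulers were a recurring KubeCon topic.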

💰 Sona Closes $45M Series B for Frontline Economy AI Platform

According to Morningstar and SiliconANGLE, London-based Sona closed a $45M Series B round for its AI platform targeting frontline industries including retail, hospitality, and healthcare, providing AI-driven workforce forecasting and scheduling, HR management, and operational efficiency tools.

Sona’s positioning distinguishes it from general-purpose AI platforms — it focuses on the “frontline economy,” industries heavily reliant on manual scheduling and on-site operations. AI-driven predictive scheduling and operational optimization are critical needs in these sectors, representing some of the fastest AI adoption scenarios.

Security & Compliance

🔒 LiteLLM Supply Chain Attack: .pth File Backdoor Steals Cloud Credentials, Thousands of Enterprises Affected

According to the PyPI official incident report, Trend Micro, and an in-depth analysis by Arthur AI, the threat group TeamPCP pushed malicious versions of LiteLLM (1.82.7 and 1.82.8) to PyPI on March 24 using stolen credentials. The attack leveraged Python’s .pth file mechanism to automatically execute malicious code during package installation, stealing AWS, GCP, and Azure credentials along with SSH keys. AI hiring platform Mercor confirmed it was among the thousands of victim enterprises. The attack vector has been linked to a vulnerability in the security tool Trivy.

LiteLLM is a widely used AI Gateway and multi-model proxy, and this incident exposes the fragility of the AI inference toolchain supply chain: trusted open-source packages can become springboards for credential theft. The .pth file mechanism itself is a legitimate Python feature, but the lack of install-time auditing makes it an attack vector. For AI teams, dependency locking, installation auditing, and least-privilege principles are measures that must be strengthened immediately.
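The `.pth` mechanism at the center of the attack is straightforward to audit for. CPython's `site` module executes any line in a site-packages `.pth` file that begins with `import`, which is exactly the hook the malicious packages abused. A minimal audit sketch (the function names are illustrative, not from any incident report):

```python
import site
import sysconfig
from pathlib import Path

def executable_pth_lines(directory):
    """Map each .pth file in `directory` to the lines Python would execute.

    site.py runs any .pth line starting with "import " (or "import\t")
    at interpreter startup -- a legitimate feature, but also the LiteLLM
    attack vector.
    """
    findings = {}
    for pth in Path(directory).glob("*.pth"):
        executed = [
            line
            for line in pth.read_text(errors="replace").splitlines()
            if line.startswith(("import ", "import\t"))
        ]
        if executed:
            findings[pth.name] = executed
    return findings

def audit_environment():
    """Audit every site-packages directory of the running interpreter."""
    dirs = {sysconfig.get_paths()["purelib"], site.getusersitepackages(),
            *site.getsitepackages()}
    report = {}
    for d in dirs:
        if Path(d).is_dir():
            for name, lines in executable_pth_lines(d).items():
                report[f"{d}/{name}"] = lines
    return report

if __name__ == "__main__":
    for path, lines in audit_environment().items():
        print(path)
        for line in lines:
            print("  EXECUTES:", line)
```

Note that legitimate packages (e.g. editable installs) also ship executable `.pth` lines, so the output is a review list, not a verdict; pairing this with pip's hash-checking mode and pinned lockfiles covers the dependency-locking side.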

📋 HHS Releases AI Compliance Update, Driving Healthcare AI Regulatory Alignment

According to legislative tracking by the Transparency Coalition and Katten, the U.S. Department of Health and Human Services (HHS) continues advancing its AI compliance framework for healthcare, with recent updates involving an RFI on accelerating AI clinical adoption, changes to the Health IT certification program, and cross-agency coordination with FDA, NIH, and CMS.

HHS’s approach to AI regulation is “coordinate rather than restrict”: soliciting industry input through RFIs to identify barriers in existing regulations to AI adoption, then making targeted adjustments. For healthcare AI infrastructure builders, this means compliance is no longer an afterthought patch — it’s a front-loaded architectural constraint.

Networking & Edge Computing

🌐 6G AI-Native Network Architecture Roadmap Takes Shape

According to recent announcements from NVIDIA, Qualcomm, and Ericsson, the AI-native characteristics of 6G networks are moving from concept to concrete architecture design. NVIDIA and global telecom leaders committed to building 6G on open, secure AI-native platforms; Qualcomm introduced a device-to-data-center AI-native platform unifying connectivity, sensing, and compute; Ericsson demonstrated its “Intelligent Fabric” 6G architecture at MWC 2026, embedding AI into the radio access network, edge, and core network.

The core promise of 6G is “network as AI inference platform”: AI is no longer just an application running on the network — it’s the building material of the network itself. NVIDIA’s push for open platforms, Qualcomm’s unified device-to-data-center architecture, and Ericsson’s AI-RAN concept are all laying the groundwork for the first commercial 6G services around 2030.

Open Source Ecosystem

⭐ OpenClaw: Open-Source Personal AI Assistant Ecosystem Accelerates Expansion

According to GitHub and Towards AI, OpenClaw continues its growth as an open-source personal AI assistant project with GitHub stars surpassing 215K. The project supports Windows/macOS/Linux deployment, comes pre-loaded with the Kimi K2.5 model, and features a built-in service gateway, identity authentication, and the ClawHub skill marketplace (5,400+ skills listed). OpenClaw positions itself as a local-first AI Agent runtime, supporting multi-agent collaboration, persistent memory, and autonomous workflows.

OpenClaw’s growth trajectory reflects the broader shift of AI Agents from cloud SaaS toward local self-hosting. The ClawHub ecosystem with 5,400+ skills and surrounding projects like the OpenViking context database are building an agent skill distribution system analogous to npm.

🧠 Dante-2B: Bilingual Open-Source LLM Trained From Scratch on Two H200 GPUs

According to a Reddit r/LocalLLaMA community post, a developer shared progress on the Dante-2B project — a 2.1B parameter Italian/English bilingual open-source LLM, trained entirely from scratch using 2×H200 GPUs. Phase one is complete, with the core argument being “training good models doesn’t require massive clusters — it requires good data and clean training pipelines.”

The practical value of Dante-2B lies in demonstrating that training a competitive model from scratch on small-scale hardware is feasible. This is inspiring for resource-constrained teams and researchers, and aligns with the open-source community’s trend toward “small but excellent” models.
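One concrete reason small-hardware training is feasible is gradient accumulation: averaging gradients across several micro-batches before each optimizer step emulates the large effective batch sizes of bigger clusters. A toy sketch of the pattern (illustrative only; the Dante-2B pipeline itself is not published in this brief, and real gradients are tensors, not floats):

```python
# Generic gradient-accumulation loop. Scaling each micro-batch gradient by
# 1/accum_steps before summing makes the accumulated value a mean, so the
# update behaves like one large-batch step.
def train_with_accumulation(micro_batches, accum_steps, apply_update):
    """Average gradients over `accum_steps` micro-batches per update."""
    buffer, updates = 0.0, []
    for i, grad in enumerate(micro_batches, start=1):
        buffer += grad / accum_steps
        if i % accum_steps == 0:
            updates.append(apply_update(buffer))
            buffer = 0.0
    return updates

grads = [1.0, 3.0, 2.0, 6.0]  # toy scalar "gradients", one per micro-batch
updates = train_with_accumulation(grads, accum_steps=2, apply_update=lambda g: g)
print(updates)  # → [2.0, 4.0]
```

The trade is time for memory: two H200s process micro-batches sequentially that a cluster would process in parallel, which fits the project's "good data over big hardware" argument.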

📦 Open Source Project Updates: Kafka-ML, TurboQuant, and More

According to the kafka-ml GitHub repository and community discussions, the Kafka-ML framework manages ML model pipelines on Kubernetes, bridging data streams with TensorFlow/PyTorch training frameworks; TurboQuant introduces extreme quantization compression, reportedly reducing LLM memory footprint by 6×. Additionally, open-source projects including MemPalace (AI memory), Hippo, Ghost Pepper, and Freestyle have been gaining community traction.

These projects share a common direction: reducing the deployment and operational costs of AI infrastructure. Kafka-ML addresses the gap between data pipelines and model training, TurboQuant lowers inference hardware barriers through extreme compression, and memory and agent-related projects expand the capability boundaries of AI applications.
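To make the quantization claim concrete: TurboQuant's actual algorithm is not detailed here, but the memory arithmetic behind any low-bit scheme is similar. A generic symmetric per-tensor quantizer, as a sketch (not TurboQuant's implementation), stores small integer codes plus one float scale instead of full-precision weights:

```python
# Symmetric per-tensor quantization sketch. Storing b-bit integer codes plus
# one scale per tensor is what enables multi-x memory reductions vs fp16/fp32.
def quantize(weights, bits=4):
    """Map floats to integer codes in [-(2^(b-1)-1), 2^(b-1)-1] plus a scale."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

weights = [0.12, -0.7, 0.33, 0.05, -0.02]
codes, scale = quantize(weights)
recovered = dequantize(codes, scale)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))
print("codes:", codes, "scale:", scale, "max error:", max_err)
```

Rounding error per weight is bounded by half a quantization step (scale/2); real schemes add per-group scales and outlier handling to keep accuracy at very low bit widths.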

AI-Native Blockchain

⛓️ Lithosphere Expands Developer Toolchain, Advancing AI-Native Contract Language Lithic

According to MarketWatch and Barchart, following the Makalu testnet activation, Lithosphere further expanded its developer ecosystem with the launch of the Lithic toolchain. As an AI-native smart contract language, Lithic allows AI interactions to be defined as part of contract logic, supporting verifiable execution and cost parameter control, alongside the cross-chain interoperability protocol MultX and the LEP100 standard.

The launch of the Lithic toolchain represents a critical step for AI-native blockchains from “proof of concept” to “developer-ready.” The design of embedding AI inference capabilities directly at the contract layer has practical implications for on-chain AI decision-making scenarios such as DeFi risk control and automated governance.

🔍 Infra Insights

Today’s core trends: K8s becomes the de facto control plane for AI inference, AI toolchain supply chain security emerges as an unignorable risk surface, and 6G embeds AI inference capabilities into network infrastructure.

KubeCon EU 2026 delivered a clear signal: the question is no longer “can K8s run AI” — 66% of container users already deploy GenAI workloads on it. The surge in AI Conformance Program certifications and GPU scheduling enhancements from major cloud providers signals that K8s’s position as the AI inference control plane is established.

Meanwhile, the LiteLLM supply chain attack is a stark reminder: the more prevalent AI inference toolchains become, the larger the attack surface grows. The chain reaction of .pth file backdoors, Trivy vulnerability exploitation, and thousands of enterprises losing credentials demonstrates that AI security cannot focus on models alone — it must cover the entire dependency chain.

On the 6G front, the synchronized push from NVIDIA, Qualcomm, and Ericsson shows that the next generation of networks is being designed from the ground up as AI-native: the network is no longer AI’s transport pipe, but AI’s execution platform.