Architecture · Delivery · Operating discipline

    Principal engineer for teams shipping AI software

    Hands-on work on architecture, data paths, and delivery for AI-facing products: systems and teams, not model prompts in isolation.

    Book a conversation

    Speed is solved. Follow-through is not.

    Tooling and agents make it cheap to produce code and docs, and CI/CD is standard. The strain shows up in review load, in ownership of what was merged, and in operational surprises when volume goes up. What helps is process, clear guardrails, automation with explicit ownership, and supervision that is defined rather than improvised. Engagements work with leadership and engineering on those mechanics: release hygiene, risk boundaries, and how the org keeps pace without losing traceability.

    Context

    AI products: systems and organisation, not only the model

    In a crowded market, what matters is delivery reliability, cost and risk controls, and feedback loops into the org. Client work centres on engineering practice and system design; documentation and decks are secondary.

    Focus

    Work themes

    Delivery pace

    Pipelines, environments, and release cadence. Keeping changes reviewable and traceable when contribution volume goes up.

    System design

    Service and data design for products that combine conventional APIs with LLM features: context boundaries, retrieval layers, latency, cost, and compliance constraints.

    Guardrails and review

    Review, testing, and rollback patterns when automation or agents touch code or operations. Automation with clear owners and audit points.

    Team operating model

    Roles, ownership, and recurring rituals for small teams under high merge frequency: who approves what, how quality is defined, onboarding when the codebase churns.

    Architecture & platforms

    Platforms, data, and operations

    Typical focus: ingestion and query paths, failure attribution, production review. UI and service frameworks follow once boundaries and data flow are settled.

    Event streaming & service boundaries

    Kafka-style topologies when the load or ordering model warrants it: topics, consumer groups, backpressure, explicit contracts between producers and consumers. The aim is to keep services from implicitly sharing state through a single database.

    Telemetry where AI touches production

    OpenTelemetry, traces, and SLIs on paths that include models or tools. Failures and cost are attributed separately to model, retrieval, integration, and infrastructure instead of being collapsed into a generic error.

    Operational visibility

    Dashboards and alerts in the Grafana class: latency, errors, saturation, queue depth, inference spend, data freshness. Same signals for ops and engineering leads.

    Search & analytics surfaces

    Elasticsearch or equivalents for full-text search, log search, and large-index exploration. Owned as product surface area where search is part of the workflow.

    Vector stores & retrieval architecture

    Milvus-class vector databases when dense retrieval is in scope: collections, embedding refresh, hybrid filters on structured fields, invalidation of stale vectors.

    Ingestion pipelines

    Structured and unstructured inputs, schema and deduplication, enrichment steps, hand-off into indexing or analytics stores. Ingestion defects propagate downstream, so design and monitoring are treated as one piece of work.

    Semantic search & relevance

    Ranking and offline evaluation, separation of transactional data from the retrieval index, maintenance of corpora used in generation. Not limited to default RAG templates.

    Background work & analysis

    Async workers and long-running jobs: re-embedding, reconciliation, batch scoring, heavy transforms off the interactive request path.

    BI

    Views that combine usage, data quality signals, model cost, and delivery metrics for product and engineering planning.

    Application frameworks

    React, Next.js, Node, Python where the product needs that surface. Secondary to architecture, data layout, and observability.

    Engagements

    Advisory, embedded, or rescue

    Fractional principal engineer, scoped architecture or delivery work, or a reset after a period of high merge volume with weak process. Engagements are time-boxed with agreed outcomes, not open-ended staff augmentation.

    Confidential by default

    Select engagements

    Client work is usually under NDA. References or public write-ups only when the client agrees. Ask if you need named references for procurement.