Mohammad Aziz — Founding Engineer

Founding Engineer for AI systems.

I architect and ship production AI systems — agent infrastructure, multi-model orchestration, and serverless platforms — taking products from zero to one.

AI Agents · Multi-model Orchestration · MCP · RAG · Serverless · AWS · GraphQL · Multi-tenant SaaS
Selected Systems

Production platforms, owned end‑to‑end.

SarahAI
Founding-level Full-Stack Engineer · AI Systems · Venturenox
2025 → Now

An AI communication assistant with its own app, in-app chat, and a major WhatsApp integration, covering scheduling, reminders, summaries and natural voice. I own the AI architecture: multi-model orchestration, an MCP tool layer, persistent agent memory, and a serverless notification engine rebuilt to scale with demand rather than with a fixed worker fleet.

Overview

A multi-surface assistant — app, in-app chat and WhatsApp — backed by an agent runtime that routes across GPT-4o, Gemini and Claude by latency and cost.

Architecture

Django core, a dedicated MCP server, RAG memory with Langfuse, and a serverless Lambda + SQS + EventBridge scheduler.

Challenges

Multi-model orchestration, durable agent memory, Meta messaging compliance, and keeping inference cost predictable.

Impact

Eliminated CRON infra overhead, made time-based messaging scale with demand instead of worker count, and lifted answer accuracy via self-learning agents.

Informly
Founding Full-Stack Engineer
2023 → Now

A business-growth platform — surveys, analytics, automations and an AI assistant. As founding engineer I drove every architecture decision from day one: a GraphQL platform, a no-code workflow engine, and a schema-contracted AI service layer.

Overview

Multi-tenant SaaS giving non-technical teams surveys, metrics and automations with an AI agent woven through.

Architecture

Nx monorepo, Next.js + Apollo GraphQL, Inngest-powered workflow engine, and Pydantic-contracted AI.

Challenges

Multi-tenancy, a REST→GraphQL migration, fine-grained RBAC, and orchestrating user-defined workflows.

Impact

Cut over-fetching and server load, shipped drag-and-drop automation for non-coders, and delivered a low-latency UI.

Interesting Problems Solved

A look under the hood.

Time-based WhatsApp messaging ran on Celery CRON jobs: a single worker fleet that had to be kept warm and scaled by hand, and that quietly became the bottleneck for every reminder and scheduled send.

Challenge

CRON workers couldn't fan out to thousands of precisely-timed sends without over-provisioning idle compute.

Solution

Re-architected to a fully serverless stack — Lambda + SQS + EventBridge Scheduler — with each message as an independently scheduled event.

Tradeoffs

Accepted cold-start and distributed-debugging cost in exchange for eliminating a standing worker fleet entirely.

Outcome

Scheduling capacity now tracks demand with no standing fleet to maintain: nothing to keep warm, nothing to scale by hand.
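
To make "each message as an independently scheduled event" concrete, here is a minimal sketch of the request for one one-time EventBridge Scheduler schedule that drops a payload onto SQS. It is an illustration under assumptions, not the production code: the function name, the `msg-` naming scheme, the payload shape and the ARNs are all hypothetical.

```python
import json
from datetime import datetime, timezone


def build_schedule_request(message_id: str, send_at: datetime, payload: dict,
                           queue_arn: str, role_arn: str) -> dict:
    """Build keyword arguments for one one-time schedule targeting an SQS queue."""
    if send_at.tzinfo is None:
        raise ValueError("send_at must be timezone-aware")
    # One-time schedules use the expression form at(yyyy-mm-ddThh:mm:ss).
    fire_at = send_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    return {
        "Name": f"msg-{message_id}",             # hypothetical naming scheme
        "ScheduleExpression": f"at({fire_at})",
        "ScheduleExpressionTimezone": "UTC",
        "FlexibleTimeWindow": {"Mode": "OFF"},   # fire at the requested time
        "ActionAfterCompletion": "DELETE",       # remove the schedule after it fires
        "Target": {
            "Arn": queue_arn,                    # SQS queue receiving the message
            "RoleArn": role_arn,                 # role the scheduler assumes to send
            "Input": json.dumps(payload),        # becomes the SQS message body
        },
    }


# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("scheduler").create_schedule(**request)
```

Keeping the request builder pure makes the timing logic testable without AWS access; a Lambda consuming the queue would then perform the actual send.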

Sending every request to a frontier model is simple and expensive. Most turns don't need GPT-4.5 — but quality can't regress on the ones that do.

Challenge

Balance cost, latency and quality across proprietary and open-source models without hurting the user experience.

Solution

Built a multi-provider orchestration layer with a dedicated MCP server fronting open models on Groq & Cerebras, routing by intent, latency and cost.

Tradeoffs

Added routing complexity and a fallback matrix to maintain — justified by large, ongoing inference savings.

Outcome

Frontier models reserved for hard turns; the bulk served cheaply and fast, with no perceptible quality drop.
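
Routing by intent, latency and cost can be sketched in a few lines. Everything below is an assumption for illustration: the model names, prices, latencies, tiers and the intent-to-tier mapping are placeholders, and the real layer sits behind an MCP server that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # USD, placeholder value
    p95_latency_ms: int        # placeholder value
    tier: int                  # 1 = small open model, 3 = frontier


# Hypothetical catalogue; none of these numbers are measurements.
CATALOGUE = [
    ModelOption("open-small", 0.0002, 300, 1),
    ModelOption("open-large", 0.0010, 600, 2),
    ModelOption("frontier", 0.0100, 2500, 3),
]

# Hypothetical mapping from a classified intent to the minimum capable tier.
MIN_TIER_BY_INTENT = {"smalltalk": 1, "scheduling": 2, "reasoning": 3}


def route(intent: str, latency_budget_ms: int,
          unavailable: frozenset = frozenset()) -> list:
    """Return models to try in order: cheapest adequate first, then fallbacks."""
    min_tier = MIN_TIER_BY_INTENT.get(intent, 3)  # unknown intents get the top tier
    capable = [m for m in CATALOGUE
               if m.tier >= min_tier and m.name not in unavailable]
    within_budget = [m for m in capable if m.p95_latency_ms <= latency_budget_ms]
    # If no capable model meets the budget, keep quality and accept the latency.
    candidates = within_budget or capable
    return sorted(candidates, key=lambda m: m.cost_per_1k_tokens)
```

The returned list doubles as the fallback order: the caller tries the first model and moves down the list on error or timeout.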

Free-form model output is hostile to a typed product. The frontend needed guarantees, not vibes — survey schemas, metrics objects and agent actions that always parse.

Challenge

Guarantee AI output conforms to the exact shape the application expects, every single time.

Solution

Made Pydantic schemas the contract between frontend and AI backend — generation is validated against types before it ever reaches the UI.

Tradeoffs

Stricter prompts and a validation/repair loop in exchange for output the rest of the system can trust blindly.

Outcome

AI features became composable building blocks — surveys, metrics and chat actions all type-safe end to end.
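
The validation/repair loop can be sketched as follows. To stay dependency-free, this sketch swaps in a plain dataclass with hand-written checks where the real system uses a Pydantic model; with Pydantic v2, `parse_question` would be replaced by `Model.model_validate_json(raw)` and the caught exception by `pydantic.ValidationError`. The `SurveyQuestion` shape and the `generate` callable are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SurveyQuestion:
    prompt: str
    choices: list


def parse_question(raw: str) -> SurveyQuestion:
    """Parse and validate model output; raise ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    prompt, choices = data.get("prompt"), data.get("choices")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if (not isinstance(choices, list) or len(choices) < 2
            or not all(isinstance(c, str) for c in choices)):
        raise ValueError("choices must be a list of at least two strings")
    return SurveyQuestion(prompt=prompt, choices=choices)


def generate_validated(generate: Callable[[str], str], request: str,
                       max_attempts: int = 3) -> SurveyQuestion:
    """Call the model, validate the output, and feed errors back until it parses."""
    prompt = request
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            return parse_question(raw)
        except ValueError as err:
            # Repair step: describe the failure and ask the model again.
            prompt = (f"{request}\n\nPrevious output was invalid ({err}). "
                      "Return only valid JSON.")
    raise RuntimeError(f"no valid output after {max_attempts} attempts")
```

Nothing unvalidated leaves `generate_validated`: callers get either a typed object or an exception, which is what lets the UI skip defensive parsing.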

Business teams wanted to automate their own processes without filing an engineering ticket for every if-this-then-that.

Challenge

Let non-coders define, configure and trigger multi-step automations safely, without touching code.

Solution

Built a drag-and-drop workflow engine on Inngest — a visual canvas compiling to durable, observable background jobs.

Tradeoffs

Constrained the building blocks to keep workflows safe and debuggable, rather than exposing raw scripting.

Outcome

Teams ship their own automations; engineering is out of the critical path for routine business logic.
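
One way to picture "a visual canvas compiling to background jobs" is a graph that is validated against a fixed palette and then ordered for execution. This is a sketch under assumptions: the block type names and the `nodes`/`edges` canvas shape are hypothetical, and the hand-off to Inngest steps is deliberately left out.

```python
from graphlib import TopologicalSorter

# Hypothetical palette: the only block types the canvas is allowed to emit.
ALLOWED_BLOCKS = {
    "trigger.survey_submitted",
    "condition.score_below",
    "action.send_email",
    "action.add_tag",
}


def compile_workflow(canvas: dict) -> list:
    """Validate a canvas graph and return its blocks in execution order."""
    nodes = {n["id"]: n for n in canvas["nodes"]}
    for node in nodes.values():
        if node["type"] not in ALLOWED_BLOCKS:
            raise ValueError(f"unknown block type: {node['type']}")
    # Map each node to the set of nodes that must run before it.
    graph = {node_id: set() for node_id in nodes}
    for edge in canvas["edges"]:
        if edge["from"] not in nodes or edge["to"] not in nodes:
            raise ValueError(f"edge references a missing node: {edge}")
        graph[edge["to"]].add(edge["from"])
    # static_order() raises graphlib.CycleError if the canvas contains a loop.
    order = TopologicalSorter(graph).static_order()
    return [nodes[node_id] for node_id in order]
```

Rejecting unknown block types and cycles at compile time is the "constrained building blocks" tradeoff in practice: a workflow that passes this step can run without surprising the engine.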

An assistant that forgets is an assistant you stop trusting. Context had to survive across sessions — and get better with use.

Challenge

Maintain coherent long-term context across interactions without bloating every prompt.

Solution

RAG-backed memory with disciplined context management in Langfuse, plus agents that update their own knowledge base from past turns.

Tradeoffs

Invested in retrieval quality and write-back guards to stop memory from drifting or poisoning itself.

Outcome

Agents stay consistent and self-learning — accuracy compounds instead of resetting each session.
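
The write-back guards can be illustrated with a toy memory store. This is a sketch, not the production design: the confidence threshold, the `user_confirmed` flag and the duplicate check are hypothetical guards, and plain word overlap stands in for the embedding-based retrieval a real RAG pipeline would use.

```python
from dataclasses import dataclass, field


def _tokens(text: str) -> set:
    """Lowercased word set; a crude stand-in for an embedding."""
    return set(text.lower().split())


@dataclass
class MemoryStore:
    min_confidence: float = 0.8  # hypothetical threshold
    facts: list = field(default_factory=list)

    def write_back(self, fact: str, confidence: float, user_confirmed: bool) -> bool:
        """Store a fact only if it passes the guards; return whether it was stored."""
        if not user_confirmed and confidence < self.min_confidence:
            return False  # guard: unconfirmed, low-confidence inferences stay out
        if any(_tokens(fact) == _tokens(existing) for existing in self.facts):
            return False  # guard: do not store duplicates
        self.facts.append(fact)
        return True

    def retrieve(self, query: str, k: int = 2) -> list:
        """Return up to k stored facts ranked by word overlap with the query."""
        q = _tokens(query)
        scored = [(len(q & _tokens(f)), f) for f in self.facts]
        ranked = sorted((s for s in scored if s[0] > 0),
                        key=lambda s: s[0], reverse=True)
        return [fact for _, fact in ranked[:k]]
```

Retrieving only the top `k` facts keeps prompts small, while the guards on `write_back` limit what the agent is allowed to teach itself.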

Engineering Principles

How I make decisions.

P/01
Build systems, not features.
A feature solves today. A system solves the next ten variations of it.
P/02
Optimize for maintainability.
Code is read and changed far more than it's written. Future-me is the user.
P/03
Measure before optimizing.
Instinct points at the wrong bottleneck. Numbers point at the real one.
P/04
Automate repetitive work.
If a human does it twice, a machine should do it the third time onward.
P/05
Prefer architecture over patches.
A patch hides a problem. Architecture removes the class of problem.
P/06
Own outcomes, not tickets.
The job isn't "done" at merge. It's done when it works in production.
Technical Expertise

A connected system, not a checklist.

Five domains that compound. The interesting work lives in the edges between them — AI that respects infrastructure limits, frontends that trust their backends.

Let's build something ambitious.

Have a hard problem, a zero-to-one product, or an architecture that needs an owner? Let's talk.