Traba

Traba runs a light-industrial staffing marketplace: businesses (warehousing, manufacturing, distribution, logistics, fulfillment) post shifts, and vetted flexible workers claim them from a mobile app (engineering blog). The wedge is reliability, not just liquidity — a headline 98% show rate against a ~30% industry norm (home). Staffing is the beachhead: the job board now opens “Traba is the AI operating layer for the industrial supply chain” (Ashby). The engineering story is three-part — a marketplace that migrated its system of record from Firestore to Postgres, Scout (a multi-agent AI phone interviewer on ElevenLabs that now runs 85%+ of vetting), and a founding Agents team generalizing that into autonomous supply-chain workflows.

Vitals: founded 2021 · Founders Fund · Khosla · General Catalyst · ~12+ eng · NYC + SF (in-person) + offshore ops.

Business context — founders, funding, scale, positioning
  • Co-founders: CEO Mike Shebat (ex-Uber) and CTO Akshay Buddiga (about). Backed by Founders Fund, Khosla Ventures, and General Catalyst (Ashby, careers); Keith Rabois a visible backer (press).
  • 249K workers on platform, 996+ light-industrial customers, 1.0M workers connected to businesses, across 33 states (about); 5X revenue growth (careers); the wedge is a “$50B labor market” (Ashby).
  • Reliability metrics: 98% show rate vs ~30% industry norm, 55% less turnover, 150+ assessed skill attributes per worker (home).
  • Positioning is mid-pivot: marketing still leads with “Labor That Works” staffing, but every engineering role now opens with the “AI operating layer for the industrial supply chain” framing (Ashby) — the marketplace becoming the data + distribution layer under an agent platform. “We started in workforce—temp staffing … and used it to embed ourselves inside their daily operations … by connecting to the systems running across every facility … we are building applied AI” (Ashby).
  • Thousands of parallel phone interviews via multi-agent transfer. Scout splits an interview into separate intro / vetting / logistics / Q&A agents to dodge context degradation (“at a certain threshold of context, they begin to degrade”), on an ElevenLabs voice layer over SMS/VOIP (Scout post).
  • Vetting eval as a versioned, templated artifact. A single prompt template with variables injected per call is tested against continuously-updated human-annotated Langfuse datasets, turning prompt changes around “in minutes rather than hours” — autonomy widened to 85%+ on a measured 15% shift-completion lift (Scout post).
  • Trigger-based, zero-downtime Firestore→Postgres migration. A year-long live cutover via Cloud Function triggers translating documents into Postgres rows, behind per-collection feature flags, with self-healing reconciliation crons and 15+-attempt retry logic, reaching 99.99% replication precision (pg post).

A pragmatic, TypeScript-centric marketplace stack, with Python reserved for the AI/ML surface and a deliberate “buy the managed thing” bias on infrastructure.

| Layer | Choice | Evidence |
| --- | --- | --- |
| Web frontend | React / Node.js (business web app + internal Ops tools platform) | Ashby |
| Mobile | React Native (worker app) | Ashby |
| Backend | Node.js + TypeScript | pg post, Ashby |
| AI / ML services | Python | Ashby |
| Primary DB | PostgreSQL on Aiven (managed; chosen over GCP Cloud SQL) | pg post |
| ORM | Prisma (chosen over TypeORM) | pg post |
| Legacy / secondary DB | Firestore, retained for feature flags + remaining NoSQL cases | pg post |
| Cloud | GCP: Firebase, Cloud Functions, Datastream | pg post |
| Analytics warehouse | BigQuery | pg post |
| CDC / ingestion | GCP Datastream (Postgres → BigQuery), routed via Aiven to avoid a ~$500/mo Kafka bill | pg post |
| Batch / scheduled | Firebase cron jobs + scheduled tasks for pre-computing metrics | pg post |
| Internal tooling | Retool workflows for the Ops team | pg post |
| Load testing | Artillery | pg post |
| Foundation LLMs | Anthropic + OpenAI frontier models (agent reasoning, vetting, eval) | Ashby |
| Agent stack | tool use, sub-agents, retrieval, structured outputs, MCP servers, orchestration layer | Ashby |
| Voice AI | ElevenLabs: TTS/ASR, telephony, multi-agent transfer, language detection/switching | Scout post |
| LLM eval / datasets | Langfuse + Braintrust: human-annotated ground-truth datasets + prompt testing | Scout post, Ashby |
| System integration | customers’ WMS / TMS / ERP environments | Ashby |
| Worker comms | SMS + VOIP (telephony) | Scout post |

The parts an engineer would lose sleep over. Public signal is cited (verified); likely approach is labeled speculation — best-practice fill-in, hedged.

| Problem | Why it’s hard | Public signal | Likely approach (speculative) |
| --- | --- | --- | --- |
| Thousands of parallel real-time voice interviews | Sub-second voice latency, telephony, and language switching at fan-out, with LLM context that “begin[s] to degrade” past a threshold | “an AI recruiter that conducts thousands of real-time interviews in parallel”; degradation forced a single agent into multi-agent transfer (Scout post) | Likely lean on ElevenLabs’ managed telephony/turn-taking for the audio path and cap per-agent context by splitting phases (intro/vetting/logistics/Q&A), as confirmed; concurrency probably scaled horizontally per call |
| Evaluating a probabilistic vetter | A non-deterministic interviewer gates real shift assignments; regressions are invisible without ground truth and cannot be caught by a unit test | “a single prompt template” tested against continuously-updating human-annotated Langfuse datasets, turning prompt changes around “in minutes rather than hours” (Scout post) | Likely a versioned dataset + concurrent test harness scoring accuracy and per-question-type breakdowns on every prompt change, gating rollout behind the measured 15% shift-completion lift |
| Zero-downtime live DB migration | Cutting the relational system of record over from Firestore to Postgres with no freeze, while features kept shipping into a moving target | one year, zero downtime via trigger-based replication; race conditions met with reconciliation crons and “custom retry logic, 15+ attempts”; data-integrity bugs traced to connection-pool exhaustion (pg post) | Likely per-collection feature flags for incremental read/write cutover plus a prepared rollback script, with self-healing reconciliation as the durable safety net (largely confirmed) |

The infrastructure Traba doesn’t name publicly, inferred from the stack it does — low-surprise choices for a team on GCP + Aiven Postgres running an LLM-driven vetting product:

| Component | Likely choice | Basis |
| --- | --- | --- |
| Backend compute | Cloud Run (containers) | GCP-native, scales to zero, fits a small team avoiding cluster ops; matches the “managed over self-run” pattern |
| LLM gateway / router | thin internal abstraction over Anthropic + OpenAI | the providers are confirmed (Ashby); a routing layer that picks model per task/cost is the conventional pattern, but unstated |
| Agent-platform orchestration | in-house orchestrator over MCP + frontier models | “tool use, sub-agents, retrieval, structured outputs, MCP servers, and the orchestration layer” (Ashby); the design isn’t public |
| Semantic dedup | embeddings + pgvector in Postgres | “semantically similar questions” implies vector similarity; Postgres is already the system of record |
| Auth | a managed IdP (Auth0 / Stytch / Firebase Auth) | Firebase Auth is the zero-friction holdover given the Firebase history; an IdP is the conventional alternative |
| Secrets / config | GCP Secret Manager | GCP-native default; table stakes once off Firebase-only |
| Observability | Datadog or GCP Cloud Monitoring + Langfuse for LLM | Langfuse confirmed for prompt eval; app/infra telemetry unstated but conventional |
| Job / queue | Cloud Tasks / Pub/Sub | already inside GCP; Cloud Functions + triggers are confirmed, so event plumbing exists |

Traba booted in 2021 on Firebase/Firestore — “with the meeting quickly looming, they scrapped the schema and got set up on Firebase in a day” (pg post). As data turned relational (companies, workers, shifts, shift_signups, associations), Firestore’s limits bit: no fuzzy/LIKE matching, no counts without full scans, no multiple-inequality queries, no joins — operations like “workers who worked last week, aren’t currently busy, and aren’t blocked by company X” required pulling documents into Node memory and filtering by hand (pg post).
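To make that pain concrete, here is a minimal sketch of the hand-filtering the paragraph describes. The document shapes (`WorkerDoc`, `SignupDoc`, `CompanyBlock`) and their field names are hypothetical, not Traba's schema; the only point is that every collection has to be loaded before any condition can be applied.

```typescript
// Hypothetical document shapes; field names are illustrative, not Traba's schema.
interface WorkerDoc { id: string }
interface SignupDoc { workerId: string; startMs: number; endMs: number }
interface CompanyBlock { workerId: string; companyId: string }

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// With no joins and no multiple-inequality queries, the caller loads whole
// collections and applies every condition in application memory.
function eligibleWorkers(
  workers: WorkerDoc[],
  signups: SignupDoc[],
  blocks: CompanyBlock[],
  companyId: string,
  nowMs: number,
): WorkerDoc[] {
  const workedLastWeek = new Set<string>();
  const busyNow = new Set<string>();
  for (const s of signups) {
    // Finished a shift within the trailing seven days.
    if (s.endMs <= nowMs && s.endMs >= nowMs - WEEK_MS) workedLastWeek.add(s.workerId);
    // Currently inside a shift window.
    if (s.startMs <= nowMs && s.endMs > nowMs) busyNow.add(s.workerId);
  }
  const blocked = new Set(
    blocks.filter((b) => b.companyId === companyId).map((b) => b.workerId),
  );
  return workers.filter(
    (w) => workedLastWeek.has(w.id) && !busyNow.has(w.id) && !blocked.has(w.id),
  );
}
```

In a relational store the same question collapses into one query with joins and `NOT EXISTS` clauses, so the filtering happens in the database rather than in Node memory.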

The fix was a one-year, zero-downtime migration to PostgreSQL on Aiven, behind per-collection feature flags. Today the system runs Postgres and Firebase in conjunction — Postgres as the relational system of record, Firestore retained for feature flags and lightweight NoSQL, Firebase cron jobs still pre-computing metrics, and Datastream replicating Postgres into BigQuery for analytics (pg post).
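A per-collection cutover of this kind can be sketched as a small flag router. This is an illustration under stated assumptions, not Traba's implementation: the `CutoverRouter` class, the `readFrom` / `writeTo` flag shape, and the method names are all hypothetical, and the flags live in memory here rather than in Firestore.

```typescript
// Hypothetical flag model: one read flag and one write flag per collection.
type Store = "firestore" | "postgres";
interface CutoverFlags { readFrom: Store; writeTo: Store }

const LEGACY: CutoverFlags = { readFrom: "firestore", writeTo: "firestore" };
const MIGRATED: CutoverFlags = { readFrom: "postgres", writeTo: "postgres" };

class CutoverRouter {
  private flags = new Map<string, CutoverFlags>();

  // Collections without an explicit flag stay on the legacy store.
  flagsFor(collection: string): CutoverFlags {
    return this.flags.get(collection) ?? LEGACY;
  }

  // Move reads first while writes keep flowing through the legacy store
  // and are replicated across.
  moveReads(collection: string): void {
    this.flags.set(collection, { ...this.flagsFor(collection), readFrom: "postgres" });
  }

  // Flip reads and writes in one step, for collections where reading a
  // lagging replica could cause harm (the post cites shift overfills).
  cutOver(collection: string): void {
    this.flags.set(collection, MIGRATED);
  }

  // In this sketch rollback only restores the flags; a real rollback
  // would also have to reconcile data written in the meantime.
  rollBack(collection: string): void {
    this.flags.set(collection, LEGACY);
  }
}
```

Keeping the flag per collection is what lets low-risk collections move early while the riskiest ones wait until replication has been proven.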

Traba core architecture: React/React Native clients and a Node.js backend over Postgres (Aiven) as the relational system of record, with Firestore retained for feature flags and NoSQL, Datastream replicating to BigQuery, and Retool for internal ops.

Mermaid source
flowchart LR
classDef client fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
classDef core fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
classDef data fill:#e8f1fd,stroke:#2563eb,stroke-width:1.5px,color:#0f172a;
classDef legacy fill:#fdf1e7,stroke:#e0892f,stroke-width:1.5px,color:#0f172a;
Worker("Worker app<br/>React Native"):::client
Biz("Business platform + Ops<br/>React"):::client
API("Node.js + TypeScript backend<br/>Prisma ORM"):::core
Retool("Retool<br/>internal Ops workflows"):::client
PG[("PostgreSQL on Aiven<br/>relational system of record")]:::data
FS[("Firestore<br/>feature flags · residual NoSQL")]:::legacy
Cron("Firebase cron / scheduled jobs<br/>pre-computed metrics"):::legacy
BQ[("BigQuery<br/>analytics warehouse")]:::data
Worker --> API
Biz --> API
Retool --> API
API --> PG
API --> FS
Cron --> FS
PG -- "Datastream CDC" --> BQ

The migration’s engine was trigger-based replication (chosen over in-code “shadow writes”): a Firestore document change fires a Cloud Function that translates the document into a Postgres row insert/update/delete (pg post). Two race-condition classes had to be solved — out-of-order replications (fixed with per-collection reconciliation scripts run on a self-healing cron) and ordering/constraint failures (fixed with custom retry logic, 15+ attempts, plus a Slack alert that “amazingly pinged us a grand total of 0 times”) (pg post). The team backfilled history with the same translation logic, hit 99.99% replication precision by April, and migrated the riskiest collections (shifts, shift_signups, workers) last — flipping reads and writes simultaneously for shifts to avoid replication-lag overfills (pg post).
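The translate-and-retry loop can be sketched roughly as follows. Everything here is an assumption made for illustration: the `DocChange` shape, the in-memory `Map` standing in for a Postgres table, the `insertSignup` helper, and the exponential backoff (the post states only "15+ attempts", not the delay policy).

```typescript
// Sketch only: every name here is hypothetical, and the in-memory Map stands
// in for a Postgres table reached through an ORM.
type DocChange =
  | { kind: "upsert"; id: string; data: Record<string, unknown> }
  | { kind: "delete"; id: string };

type Table = Map<string, Record<string, unknown>>;

// Translate one Firestore-style document change into a row operation.
function applyChange(table: Table, change: DocChange): void {
  if (change.kind === "delete") {
    table.delete(change.id);
  } else {
    table.set(change.id, { id: change.id, ...change.data });
  }
}

const realSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retry an operation up to maxAttempts times. Exponential backoff is an
// assumption; the post only states "15+ attempts".
async function withRetry<T>(
  op: () => T | Promise<T>,
  maxAttempts = 15,
  baseDelayMs = 100,
  sleep: (ms: number) => Promise<void> = realSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await sleep(Math.min(baseDelayMs * 2 ** (attempt - 1), 30_000));
      }
    }
  }
  throw lastError; // a real handler would raise an alert at this point
}

// A child row that must wait for its parent, mimicking a foreign-key failure
// when a signup replicates before the shift it points at.
function insertSignup(shifts: Table, signups: Table, change: DocChange): void {
  if (change.kind === "upsert") {
    const shiftId = String(change.data["shiftId"]);
    if (!shifts.has(shiftId)) throw new Error(`missing parent shift ${shiftId}`);
  }
  applyChange(signups, change);
}
```

Retrying covers the case where a child row arrives before its parent; it does not repair rows that were applied in the wrong order, which is why the post pairs it with reconciliation scripts.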

Scout vets workers by phone at scale — “an AI recruiter that conducts thousands of real-time interviews in parallel” (Scout post). It runs on ElevenLabs for the voice/telephony layer (low-latency TTS/ASR, multilingual), reaches workers over SMS and VOIP, and hands back structured, decision-grade evaluations rather than raw transcripts (Scout post).

The design moved from a single agent (one large context prompt with role-specific content injected, one holistic evaluation prompt, English-only) to a multi-agent architecture with agent transfer — separate agents for job intro, vetting, shift-logistics confirmation, and worker Q&A — explicitly to dodge context degradation: “at a certain threshold of context, they begin to degrade” (Scout post). Around that core sit a semantic dedup pre-processor (omits 10–20% of questions for repeat applicants), a Spanish-capable agent via ElevenLabs language switching, and Custom Scout, an operator-trained per-question good/bad/great rubric (Scout post).
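One way to picture agent transfer is as a phase machine in which each agent opens with a fresh, small context. The sketch below is an assumption-heavy illustration: the four phase names follow the post, but the `AgentContext` shape, the prompt strings, and the idea of carrying short summaries between phases are hypothetical, since the post does not say how state is handed across agents.

```typescript
// Sketch only: phase names follow the post; everything else is hypothetical.
type Phase = "intro" | "vetting" | "logistics" | "qa";
const PHASE_ORDER: readonly Phase[] = ["intro", "vetting", "logistics", "qa"];

interface Turn { speaker: "agent" | "worker"; text: string }

interface AgentContext {
  phase: Phase;
  systemPrompt: string;       // phase-specific instructions only
  carriedSummaries: string[]; // short notes from earlier phases (assumption)
  turns: Turn[];              // this phase's dialogue only
}

const PROMPTS: Record<Phase, string> = {
  intro: "Introduce the job and confirm interest.",
  vetting: "Ask the vetting questions for this role.",
  logistics: "Confirm shift time, location, and transport.",
  qa: "Answer the worker's questions.",
};

function openPhase(phase: Phase, carriedSummaries: string[]): AgentContext {
  return { phase, systemPrompt: PROMPTS[phase], carriedSummaries: [...carriedSummaries], turns: [] };
}

// Returns the phase after the given one, or null when the interview is done.
function nextPhase(phase: Phase): Phase | null {
  return PHASE_ORDER[PHASE_ORDER.indexOf(phase) + 1] ?? null;
}

// Hand off: keep a one-line summary of the finished phase, drop its turns,
// and open the next agent with a fresh context.
function transfer(ctx: AgentContext, summary: string): AgentContext | null {
  const next = nextPhase(ctx.phase);
  if (next === null) return null;
  return openPhase(next, [...ctx.carriedSummaries, `${ctx.phase}: ${summary}`]);
}

// Rough size proxy: the characters the next model call would see.
function contextSize(ctx: AgentContext): number {
  const turnChars = ctx.turns.reduce((n, t) => n + t.text.length, 0);
  const summaryChars = ctx.carriedSummaries.reduce((n, s) => n + s.length, 0);
  return ctx.systemPrompt.length + summaryChars + turnChars;
}
```

The design choice this illustrates is that context grows with the number of phases completed rather than with the full transcript length, which is one way to stay under the degradation threshold the post mentions.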

Traba Scout pipeline: a worker applies, a semantic dedup pre-processor trims repeat questions, an ElevenLabs voice layer carries SMS/VOIP, multiple transferred agents conduct the interview, and a Langfuse-backed evaluation prompt produces a structured decision routed to auto-qualify or an operator.

Mermaid source
flowchart LR
classDef io fill:#eef2f8,stroke:#94a3b8,stroke-width:1.5px,color:#0f172a;
classDef ai fill:#eef0fe,stroke:#6366f1,stroke-width:1.5px,color:#0f172a;
classDef eval fill:#e8f1fd,stroke:#2563eb,stroke-width:1.5px,color:#0f172a;
classDef human fill:#fdecec,stroke:#e0564f,stroke-width:1.5px,color:#0f172a;
Apply("Worker applies to role(s)"):::io
Dedup("Semantic dedup pre-processor<br/>drop questions already answered<br/>(10–20% omitted for repeats)"):::ai
Voice("ElevenLabs voice layer<br/>TTS / ASR · SMS + VOIP · multilingual"):::io
subgraph Agents["Multi-agent interview · agent transfer"]
direction TB
Intro("Job intro agent"):::ai
Vet("Vetting agent<br/>+ Custom Scout rubric"):::ai
Logi("Shift-logistics agent"):::ai
QA("Worker Q&A agent"):::ai
Intro --> Vet --> Logi --> QA
end
Eval("Evaluation prompt<br/>single template, variables injected<br/>tested vs Langfuse datasets"):::eval
Auto("Auto-qualify → eligible for shift"):::eval
Op("Operator review / final vetting"):::human
Apply --> Dedup --> Voice --> Agents
Agents -- transcript --> Eval
Eval -- high confidence --> Auto
Eval -- needs review --> Op
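
The dedup stage at the head of the pipeline is not described at implementation level. A conventional way to drop "questions already answered" is embedding similarity, sketched below. The `EmbeddedQuestion` shape, the precomputed embedding vectors, and the 0.9 threshold are all assumptions for illustration; a production version would obtain embeddings from a model and might store them in a vector index.

```typescript
// Sketch only: embeddings are assumed to be precomputed elsewhere, and the
// 0.9 threshold is an arbitrary illustrative value.
interface EmbeddedQuestion { id: string; text: string; embedding: number[] }

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must have the same non-zero length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Keep only the planned questions that are not close to one already answered.
function dropAnswered(
  planned: EmbeddedQuestion[],
  answered: EmbeddedQuestion[],
  threshold = 0.9,
): EmbeddedQuestion[] {
  return planned.filter(
    (q) => !answered.some((a) => cosineSimilarity(q.embedding, a.embedding) >= threshold),
  );
}
```

The threshold trades interview length against the risk of skipping a question that only looks similar, so it would need tuning against real transcripts.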

The outcomes Traba reports: 250,000+ AI-led interviews to date; 85%+ of all vetting now AI-conducted (targeting ~100% within half a year); and — counterintuitively — AI-vetted workers are 15% more likely to complete their shifts than human-vetted ones, attributed to a more consistent, empirically-refined process (Scout post).

Scout is the first agent; the job board describes a founding Agents team generalizing the pattern into “production agent systems on top of frontier LLMs: tool use, sub-agents, retrieval, structured outputs, MCP servers, and the orchestration layer that ties them together,” integrating with customers’ WMS, TMS, and ERP environments (Ashby). The reasoning substrate is explicitly frontier model APIs (Anthropic, OpenAI), with evaluation run on Langfuse / Braintrust and internal harnesses (Ashby).
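An internal harness of the kind mentioned here, together with the single-template evaluation described earlier, might look like the following sketch. It is not Langfuse's or Braintrust's API: the `Verdict` labels, the `LabeledExample` shape, the `{{placeholder}}` syntax, and the `Evaluator` callback (a stand-in for the model call) are all hypothetical.

```typescript
// Sketch only: the verdict labels, dataset shape, and evaluator are hypothetical.
type Verdict = "qualified" | "needs_review";

interface LabeledExample {
  vars: Record<string, string>; // values injected into the template per call
  questionType: string;
  expected: Verdict;            // human-annotated ground truth
}

// Stand-in for the model call; a real harness would call an LLM API here.
type Evaluator = (prompt: string) => Verdict;

// Fill {{placeholders}} in a single shared template.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = vars[key];
    if (value === undefined) throw new Error(`missing template variable: ${key}`);
    return value;
  });
}

interface Tally { correct: number; total: number }

// Score one template version against the labeled dataset, overall and per
// question type.
function runEval(template: string, dataset: LabeledExample[], evaluate: Evaluator) {
  const byType = new Map<string, Tally>();
  let correct = 0;
  for (const example of dataset) {
    const verdict = evaluate(renderTemplate(template, example.vars));
    const tally = byType.get(example.questionType) ?? { correct: 0, total: 0 };
    tally.total += 1;
    if (verdict === example.expected) {
      tally.correct += 1;
      correct += 1;
    }
    byType.set(example.questionType, tally);
  }
  const accuracy = dataset.length === 0 ? 0 : correct / dataset.length;
  return { accuracy, byType };
}
```

Running this on every template change and comparing the report to the previous version is what turns a prompt edit into something reviewable, and the per-type tallies show which kind of question regressed.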

~12 engineers by the end of the Postgres migration (pg post); today the job board lists 24 open roles across New York City, San Francisco, and offshore ops hubs (Colombia, Ecuador, the Philippines), including a dedicated AI Agents track (Ashby). 100% in-person (NYC + SF), “Olympian’s Work Ethic,” a stated preference for “motors, not gears” (Ashby, careers).

| Role | Person | Source |
| --- | --- | --- |
| Co-founder / CEO | Mike Shebat (ex-Uber) | about |
| Co-founder / CTO | Akshay Buddiga | about, pg post |
| Founding engineer | Moreno Antunes | pg post |

The generalist role is full-stack founding-team product engineering — the React Native worker app, the React/Node business web app, and a Retool ops-tools platform — partnering directly with the CTO; a parallel AI Agents track ships “production agent systems on top of frontier LLMs” and treats “evaluation as a first-class discipline,” explicitly FDE-style — “embed with our customers and operators” (Ashby). The infra and Scout teams are credited by name in the engineering posts (pg post, Scout post). The house style is feature-flagged incremental cutover: both the year-long Postgres migration (per-collection read/write flags, self-healing reconciliation crons, 15+-retry triggers, a prepared rollback script, Artillery load tests) and Scout (operator safety-net → 85%+ autonomy on a measured 15% completion lift) ship behind a flag, validate against ground truth, then widen autonomy (pg post, Scout post).

Reconstructed from public sources only — no insider information. Crawled 2026-06-07. Claim tiers: verified (stated on a public page, linked) · inferred (reasoned from a cited signal, confidence flagged) · speculative (best-practice fill-in, labeled). Links are live; pages change, so the supporting quote for each claim is kept in this repo’s evidence map (evidence/traba-evidence-map.md).

| # | Source | Link |
| --- | --- | --- |
| S1 | Homepage | https://traba.work/ |
| S2 | About | https://traba.work/company/about |
| S3 | Careers | https://traba.work/company/careers |
| S4 | Press | https://traba.work/company/press |
| S5 | Engineering index | https://traba.work/company/engineering |
| S6 | Project Scout: Building an AI Interviewer | https://traba.work/company/engineering/building-scout |
| S7 | Out of the Fire(store): Traba’s Journey to Postgres | https://traba.work/company/engineering/firestore-postgres-migration |
| S8 | Job board (Ashby), incl. Software Engineer (Generalist) and Senior Software Engineer (AI Agents) | https://jobs.ashbyhq.com/traba |