What is the AI Role Taxonomy Engine?

The AI Role Taxonomy Engine is Orbyt's canonical catalog of 598 specialized roles. Each role has a stable identifier (slug), a BLS SOC mapping for government-data traceability, seniority bands from junior through VP, a related-role graph for adjacency lookups, and a methodology version. It is the substrate every other Orbyt engine builds on: every salary, every skill premium, every funding-stage band resolves through a role identity first.

How many roles are tracked?

598 specialized roles as of July 2026. The catalog adds new roles when at least three independent sources name a new role (job postings, comp-survey references, H-1B LCA filings) and removes a role only by deprecation with a successor mapping. We never delete a role from the live API: backwards compatibility is sacred.

What is a stable role identifier?

A stable role identifier is a slug like role:ai-engineer or role:llm-systems-engineer. It is permanent. The display name might change (Senior AI Engineer → Staff AI Engineer at a company), but the slug does not. Agents pass slugs to subsequent API calls so their queries survive role renames. Free-text role names also work: the engine resolves them to the canonical slug and returns it in the response.

How does BLS SOC mapping work?

Every Orbyt role maps to one or more BLS Standard Occupational Classification codes. SOC codes are the federal government's taxonomy of jobs: 867 detailed occupations in the 2018 revision, covering the entire U.S. labor market. Many AI roles do not have dedicated SOC codes (AI Agent Engineer, LLM Ops, etc.), so we map them to the closest general category (commonly 15-2051, Data Scientists) with explicit notes. The mapping is part of every response so agents and compliance teams can cite government data.

How are seniority bands defined?

Each role has eight seniority bands: Junior, Mid, Senior, Staff, Principal, Lead, Director, VP. The bands are calibrated against H-1B LCA wage levels and median salary ratios in the BLS OES dataset. Every band has a salary multiplier derived from the role's national median. Agents can request a specific band on every salary calculation.

What is the related-role graph?

The related-role graph is a directed graph of adjacency. AI Engineer relates to ML Engineer (similarity 0.82), Data Engineer (0.61), Software Engineer (0.55), Research Scientist (0.48), etc. Similarity is computed from skill overlap, comp band overlap, and observed career transitions in the job-history dataset. The graph powers /adjacent endpoints and the MCP find_adjacent_opportunities tool.
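A minimal sketch of how a client might hold a slice of this graph and run an adjacency lookup locally. The similarity values are the ones quoted above; the slugs other than role:ai-engineer and role:ml-engineer, the Edge type, and the findAdjacent helper are assumptions for illustration, not the engine's API.

```typescript
// One outgoing edge in the directed related-role graph.
type Edge = { to: string; similarity: number };

// Hypothetical in-memory slice of the graph. Weights come from the text above;
// the data-engineer, software-engineer and research-scientist slugs are assumed.
const relatedRoles = new Map<string, Edge[]>([
  [
    "role:ai-engineer",
    [
      { to: "role:ml-engineer", similarity: 0.82 },
      { to: "role:data-engineer", similarity: 0.61 },
      { to: "role:software-engineer", similarity: 0.55 },
      { to: "role:research-scientist", similarity: 0.48 },
    ],
  ],
]);

// Return outgoing edges at or above a similarity floor, strongest first.
// Unknown slugs yield an empty list rather than an error.
function findAdjacent(slug: string, minSimilarity: number): Edge[] {
  const edges = relatedRoles.get(slug) ?? [];
  return edges
    .filter((edge) => edge.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity);
}

console.log(findAdjacent("role:ai-engineer", 0.6).map((edge) => edge.to));
```

Because the graph is directed, a lookup from role:ml-engineer would need its own edge list; this sketch does not assume the reverse edge carries the same weight.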

How often does the taxonomy update?

Quarterly. Role identities are slow-moving by design: we do not want our customers' integrations to break every week because we renamed a role. New roles get added quarterly. Seniority band multipliers are refit quarterly against the latest H-1B LCA data. The methodology version increments on every release and is visible in every response.

How do I integrate the engine?

Two paths today: hit GET /api/v1/intelligence/salaries/roles directly with a Bearer key, or call the MCP server's discover_roles_and_cities tool from any agent (Claude, GPT, etc.). A TypeScript SDK (@orbyt/intelligence, orbyt.salaries.roles.list()) is in private preview and ships at general availability. All paths return the same envelope with the same role objects. See the API docs for full details.

Engine 1 of 6 · v1 stable · Build tier ($99/mo)

AI Role Taxonomy Engine.

598 specialized roles. Stable identifiers. BLS SOC mapping. Eight seniority bands. A related-role graph. The substrate every other Orbyt Intelligence engine builds on.


What it is, and why it exists.

The AI Role Taxonomy Engine is Orbyt's canonical catalog of the specialized roles the U.S. labor market keeps renaming. As of July 2026 it tracks 598 distinct roles, from role:ai-engineer and role:ml-engineer to emerging roles like role:llm-systems-engineer, role:ai-agent-engineer, and role:rag-platform-engineer that did not exist three years ago.

It exists because the labor market does not agree on names. The same job is called Machine Learning Engineer, ML Platform Engineer, AI Engineer, AI Infrastructure Engineer, Senior ML Engineer, and AI Systems Engineer at six different companies in the same week. Without a canonical layer, every downstream query (salaries, skill premiums, hiring velocity, projections) would silently pivot on the noun the customer happened to type.

Every other Orbyt engine resolves through this one first. When you ask analyze_compensation for an "AI engineer in San Francisco," the Role Taxonomy Engine runs first. It returns role:ai-engineer with a confidence score, a list of alternates that nearly matched (role:ml-engineer, 0.82 similarity), and a methodology version. Every subsequent call uses the stable slug so the answer is reproducible.

That is the difference between a tool that answers your question and a tool you can build a business on. Bloomberg-grade data starts with stable identifiers. So does ours.

The 598 roles.

The catalog is organized into eight role families. Each family clusters related roles by core competency, then by specialization within that family.

Modeling & Research

112 roles

Research Scientists, AI Scientists, Applied Researchers, Reinforcement Learning Engineers

Applied ML

98 roles

ML Engineers, MLOps Engineers, ML Platform Engineers, ML Infrastructure Engineers

LLM & Agents

76 roles

LLM Engineers, AI Agent Engineers, RAG Platform Engineers, Prompt Engineers, LLM Ops Engineers

Data for AI

84 roles

Data Engineers, ML Data Engineers, Annotation Leads, Data Quality Engineers

AI Infra

71 roles

AI Platform, AI Infrastructure, GPU Cluster, CUDA/Triton Engineers, AI SRE

Product & Design

58 roles

AI Product Managers, AI Designers, AI Strategists, Conversational Designers

Trust, Safety, Policy

47 roles

AI Safety Researchers, Red Teamers, AI Policy Analysts, Trust & Safety Engineers

Leadership

52 roles

Heads of AI, VPs of ML, Chief AI Officers, AI Engineering Managers

Adding a new role requires three independent signals: at least 200 distinct job postings using the title across the past 90 days, at least one comp-survey reference (Levels.fyi, Pave, or a published industry survey), and an H-1B LCA filing (DOL) using the title. This three-source threshold keeps the catalog from inflating with marketing vocabulary while still letting genuinely new categories enter quickly. The current rate of new role additions runs at 6-8 per quarter.
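The three-source threshold can be sketched as a single predicate. The RoleEvidence shape and its field names are hypothetical; only the thresholds (200 postings in 90 days, at least one comp-survey reference, at least one H-1B LCA filing) come from the paragraph above.

```typescript
// Evidence gathered for a candidate job title. Field names are hypothetical.
interface RoleEvidence {
  distinctPostingsLast90Days: number;
  compSurveyReferences: number;
  lcaFilings: number;
}

// Posting floor quoted in the text above.
const MIN_DISTINCT_POSTINGS = 200;

// All three signals must be present; a strong signal cannot offset a missing one.
function meetsAdmissionThreshold(evidence: RoleEvidence): boolean {
  return (
    evidence.distinctPostingsLast90Days >= MIN_DISTINCT_POSTINGS &&
    evidence.compSurveyReferences >= 1 &&
    evidence.lcaFilings >= 1
  );
}

// A title with heavy posting volume but no comp-survey reference is rejected.
console.log(
  meetsAdmissionThreshold({
    distinctPostingsLast90Days: 1200,
    compSurveyReferences: 0,
    lcaFilings: 4,
  }),
);
```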

Deprecating a role takes the opposite path. We never delete a role from the live API. Backwards compatibility is sacred. Instead, deprecated roles carry a successor_slug field, and lookups against the deprecated slug return a 200 with a deprecation_notice in the envelope. The role continues to resolve forever.
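A client that stores slugs long-term may want to follow successor_slug links to the current role. The sketch below assumes a role record exposes id and an optional successor_slug, as described above; the lookup function, the sample records, and both prompt-related slugs are hypothetical stand-ins for real API calls and data.

```typescript
// Minimal view of a role record. successor_slug is present only on deprecated roles.
interface RoleRecord {
  id: string;
  successor_slug?: string;
}

// Follow successor_slug links until reaching a role with no successor.
// The visited set guards against a malformed chain that loops back on itself.
function resolveCurrentSlug(
  slug: string,
  lookup: (slug: string) => RoleRecord | undefined,
): string {
  const visited = new Set<string>();
  let current = slug;
  while (!visited.has(current)) {
    visited.add(current);
    const successor = lookup(current)?.successor_slug;
    if (successor === undefined) return current;
    current = successor;
  }
  throw new Error(`successor cycle detected at ${current}`);
}

// Hypothetical records standing in for API responses.
const sampleRecords = new Map<string, RoleRecord>([
  ["role:prompt-writer", { id: "role:prompt-writer", successor_slug: "role:prompt-engineer" }],
  ["role:prompt-engineer", { id: "role:prompt-engineer" }],
]);

console.log(resolveCurrentSlug("role:prompt-writer", (s) => sampleRecords.get(s)));
```

Since deprecated slugs continue to resolve, following the chain is optional; it only matters when a caller wants to migrate stored identifiers to the successor.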

How role hierarchies work.

Each role lives at the intersection of three orthogonal hierarchies: family, seniority, and specialization. The taxonomy stores them as separate fields rather than flattening into one string, because each one moves independently.

{
  "id": "role:llm-platform-engineer",
  "display_name": "LLM Platform Engineer",
  "family": "llm-and-agents",
  "specialization": "platform",
  "seniority_bands": [
    { "level": "junior", "multiplier": 0.78 },
    { "level": "mid", "multiplier": 1.00 },
    { "level": "senior", "multiplier": 1.28 },
    { "level": "staff", "multiplier": 1.55 },
    { "level": "principal", "multiplier": 1.75 }
  ],
  "soc_mapping": {
    "primary": "15-2051",
    "primary_name": "Data Scientists",
    "secondary": "15-1232",
    "secondary_name": "Computer User Support Specialists"
  },
  "related": [
    { "id": "role:llm-systems-engineer", "similarity": 0.91 },
    { "id": "role:ml-platform-engineer", "similarity": 0.78 },
    { "id": "role:ai-infrastructure-engineer", "similarity": 0.74 }
  ],
  "methodology_version": "2026.2"
}

Three hierarchies, three queries. By family: list every role in the llm-and-agents family to map a team. By seniority: get the salary band for staff across every role family to benchmark a leveling conversation. By specialization: compare platform roles across families to understand the cross-domain premium for platform work. The taxonomy supports all three without any role-string magic.
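A sketch of the three queries run client-side over role objects shaped like the JSON above. The first catalog entry reuses values from that example; the other two entries, including their family, specialization, and multiplier values, are made up for illustration, and the helper names are hypothetical.

```typescript
interface Band {
  level: string;
  multiplier: number;
}

interface Role {
  id: string;
  family: string;
  specialization: string;
  seniority_bands: Band[];
}

const catalog: Role[] = [
  {
    id: "role:llm-platform-engineer",
    family: "llm-and-agents",
    specialization: "platform",
    seniority_bands: [
      { level: "senior", multiplier: 1.28 },
      { level: "staff", multiplier: 1.55 },
    ],
  },
  // Illustrative entries: field values below are assumptions, not catalog data.
  {
    id: "role:ml-platform-engineer",
    family: "applied-ml",
    specialization: "platform",
    seniority_bands: [{ level: "staff", multiplier: 1.5 }],
  },
  {
    id: "role:ai-agent-engineer",
    family: "llm-and-agents",
    specialization: "agents",
    seniority_bands: [{ level: "senior", multiplier: 1.3 }],
  },
];

// Query 1: every role in one family.
function byFamily(family: string): Role[] {
  return catalog.filter((role) => role.family === family);
}

// Query 2: one seniority level across every role that defines it.
function multipliersAtLevel(level: string): Array<{ id: string; multiplier: number }> {
  const out: Array<{ id: string; multiplier: number }> = [];
  for (const role of catalog) {
    for (const band of role.seniority_bands) {
      if (band.level === level) out.push({ id: role.id, multiplier: band.multiplier });
    }
  }
  return out;
}

// Query 3: one specialization across families.
function bySpecialization(specialization: string): Role[] {
  return catalog.filter((role) => role.specialization === specialization);
}

console.log(byFamily("llm-and-agents").map((role) => role.id));
console.log(multipliersAtLevel("staff"));
console.log(bySpecialization("platform").map((role) => role.id));
```

Each query reads a single field, which is the practical payoff of storing the hierarchies separately instead of parsing them out of a title string.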

Seniority multipliers are derived from H-1B LCA wage levels and validated against BLS OES percentile distributions for the role's primary SOC code. Junior maps to the OES 25th percentile, Mid to the median, and Senior to the 75th percentile; Staff and above are extrapolated using the role-family multiplier curves. The multiplier is the engine's output; the customer multiplies the role's national median by it to produce the band figure.
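The band arithmetic is a single multiplication. The sketch below uses the multipliers from the role:llm-platform-engineer example above; the $200,000 national median is a made-up input, and rounding to whole dollars is a choice of this sketch rather than documented engine behavior.

```typescript
// Multipliers copied from the role:llm-platform-engineer example above.
const multipliers: Record<string, number> = {
  junior: 0.78,
  mid: 1.0,
  senior: 1.28,
  staff: 1.55,
  principal: 1.75,
};

// Band figure = national median x band multiplier, rounded to whole dollars.
function bandFigure(nationalMedian: number, level: string): number {
  const multiplier = multipliers[level];
  if (multiplier === undefined) {
    throw new Error(`unknown seniority level: ${level}`);
  }
  return Math.round(nationalMedian * multiplier);
}

// Hypothetical national median, for illustration only.
const assumedNationalMedian = 200000;

for (const level of Object.keys(multipliers)) {
  console.log(level, bandFigure(assumedNationalMedian, level));
}
```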

BLS SOC mapping methodology.

The Bureau of Labor Statistics maintains the Standard Occupational Classification (SOC) system, which covers the entire U.S. labor market in 867 detailed occupations (2018 revision). SOC is the authoritative federal taxonomy. Every Orbyt role maps to one or more SOC codes so customers can cite a government data source for every salary figure we publish.

The mapping is two-tier: a primary SOC code that drives the salary baseline, and an optional secondary code that provides additional signal for ambiguous roles. For role:ai-engineer, the primary is 15-2051 (Data Scientists) because BLS does not yet have a dedicated SOC code for AI Engineers. The secondary is 15-1252 (Software Developers) because the role overlaps both categories.

Where BLS lacks a SOC code (most modern AI roles), we apply a documented premium adjustment on top of the primary SOC baseline. The adjustment is derived from the role's H-1B LCA distribution relative to the SOC's overall distribution. For AI Engineer in 2026, the H-1B LCA median for the role-specific filings runs ~22% above the SOC 15-2051 median, and that 22% becomes the premium multiplier.
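One way to read the premium adjustment is as a ratio of two medians applied to the SOC baseline. The sketch below assumes that reading, so a 22% premium corresponds to a factor of 1.22. The function names and all dollar figures are hypothetical, chosen only so the ratio comes out at 1.22.

```typescript
// Ratio of the role-specific LCA median to the SOC-wide median.
// A result of 1.22 means the role's filings run 22% above the SOC median.
function premiumMultiplier(roleLcaMedian: number, socMedian: number): number {
  if (socMedian <= 0) throw new Error("SOC median must be positive");
  return roleLcaMedian / socMedian;
}

// Apply the premium on top of the primary SOC baseline, in whole dollars.
function adjustedBaseline(socBaseline: number, multiplier: number): number {
  return Math.round(socBaseline * multiplier);
}

// Made-up inputs for illustration only.
const assumedRoleLcaMedian = 183000;
const assumedSocMedian = 150000;

const multiplier = premiumMultiplier(assumedRoleLcaMedian, assumedSocMedian);
console.log(multiplier.toFixed(2), adjustedBaseline(assumedSocMedian, multiplier));
```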

Every Orbyt response that returns salary data includes the SOC code, the SOC name, the premium multiplier, and the methodology version. The /lineage/[data_point_id] endpoint returns the full provenance trail: every source that fed into this specific salary, with timestamps and reconciliation rules. That is Bloomberg-grade traceability applied to U.S. labor data.

Use cases.

Recruiters writing job specs

Type a working title; the engine returns the canonical slug, the BLS SOC code, the seniority bands, and three to five adjacent roles to widen the search if the primary role has thin supply. Some customers wire this into their ATS so every requisition gets a canonical role tag at creation time. Bands and adjacent roles become the visible scaffolding around the spec.

Compensation teams building bands

Pull the eight seniority bands for every role in a family, compose them into your internal leveling, and peg the multipliers to your own market percentile. Every band comes back with a methodology version and a sample size, so your comp committee can defend the band against finance and legal scrutiny.

Researchers studying labor flows

Use the related-role graph to map career transitions. Combined with our hiring velocity engine, you can quantify the rate at which engineers are flowing from role:ml-engineer into role:llm-systems-engineer over the last twelve months. Stable slugs make longitudinal analyses possible.

Agents disambiguating role queries

Through the MCP server, the discover_roles_and_cities tool returns the top-3 role matches with confidence scores. If the agent's confidence on the first match is below the configured threshold, it can prompt the user with the alternates rather than guessing. Stable slugs make the agent's chained tool calls reproducible.
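The accept-or-ask decision can be sketched as a small pure function. The RoleMatch and Decision shapes, the 0.85 threshold, and the confidence values are assumptions for illustration; the tool's actual response fields may differ.

```typescript
// Hypothetical shape of one match returned by the tool.
interface RoleMatch {
  id: string;
  confidence: number;
}

type Decision =
  | { kind: "accept"; id: string }
  | { kind: "ask_user"; options: string[] }
  | { kind: "no_match" };

// Accept the best match when it clears the threshold; otherwise hand the
// ranked alternates back to the user instead of guessing.
function decide(matches: RoleMatch[], threshold: number): Decision {
  const ranked = matches.slice().sort((a, b) => b.confidence - a.confidence);
  const top = ranked[0];
  if (top === undefined) return { kind: "no_match" };
  if (top.confidence >= threshold) return { kind: "accept", id: top.id };
  return { kind: "ask_user", options: ranked.map((match) => match.id) };
}

// Made-up confidence values for illustration.
const sampleMatches: RoleMatch[] = [
  { id: "role:ml-engineer", confidence: 0.71 },
  { id: "role:ai-engineer", confidence: 0.78 },
];

console.log(decide(sampleMatches, 0.85));
```

Whichever branch is taken, the agent should carry the chosen slug, not the original free text, into subsequent tool calls.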

API integration.

Three paths: REST and MCP are available today, and the TypeScript SDK is in private preview. All three return identical envelopes. Pick the one that fits your stack.

REST

curl https://api.orbytjobs.ai/api/v1/intelligence/salaries/roles \
  -H "Authorization: Bearer intelligence_live_..." \
  -H "Orbyt-Version: 2026-05-10"

TypeScript SDK (@orbyt/intelligence)

// SDK in private preview (not yet on npm). Until GA, call the REST API directly.
import { OrbytIntelligence } from "@orbyt/intelligence";

const orbyt = new OrbytIntelligence({
  apiKey: process.env.ORBYT_API_KEY,
});

const roles = await orbyt.salaries.roles.list({
  family: "llm-and-agents",
  limit: 25,
});

for (const role of roles.data) {
  console.log(role.id, role.display_name, role.soc_mapping.primary);
}

MCP (any agent: Claude, GPT, Perplexity)

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "discover_roles_and_cities",
    "arguments": {
      "query": "llm platform engineer in san francisco",
      "limit": 3
    }
  }
}

Full reference lives at the Orbyt Intelligence API docs. The locked envelope shape and error model are documented per RFC-001.

Pricing and access.

The Role Taxonomy Engine ships on the Build tier at $99/mo. It is the substrate every other engine builds on, so we keep it at the floor price: a starting line that pays for its own compute without locking the ecosystem out. Build limits cap reads at 60 per minute per API key.
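A client can stay under the Build cap by throttling itself before sending. The sketch below is a generic sliding-window limiter, not a description of how the API enforces its limit, which may use a different window. The createLimiter name is hypothetical, and time is passed in explicitly so the logic is easy to test.

```typescript
// Returns a function that answers: may one more request be sent at time nowMs?
// A true result also records the request against the window.
function createLimiter(limit: number, windowMs: number): (nowMs: number) => boolean {
  const timestamps: number[] = [];
  return function tryAcquire(nowMs: number): boolean {
    const cutoff = nowMs - windowMs;
    // Drop requests that have aged out of the window.
    while (timestamps.length > 0 && timestamps[0]! <= cutoff) {
      timestamps.shift();
    }
    if (timestamps.length >= limit) return false;
    timestamps.push(nowMs);
    return true;
  };
}

// Build tier: 60 reads per minute per API key.
const buildTierLimiter = createLimiter(60, 60000);

if (buildTierLimiter(Date.now())) {
  console.log("ok to send a request");
}
```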

Build · $99/mo

60 req/min

intelligence:read

Roles list, single-role lookups

Pro · $299/mo

300 req/min

+ skills:read, market:read

Bulk endpoints, expand[]=*

Scale · $1,999/mo

1,500 req/min

+ company_data:read

Realtime change streams, custom MCP tools

Restricted API keys with per-engine scopes are available on all paid tiers. Grant a key only the scopes it needs (skills:read for a learning tool that does not need company data, company_data:read for a comp-consulting tool that does not need skill premiums, etc.). The full pricing page lives at /orbyt-intelligence/pricing.
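A sketch of least-privilege scope checks for restricted keys. The scope names and their tier assignments come from the tier list above, read cumulatively; the tierScopes table layout and both helper functions are hypothetical and do not describe how Orbyt validates keys.

```typescript
// Scopes available on each tier, read cumulatively from the tier list above.
const tierScopes: Record<string, string[]> = {
  build: ["intelligence:read"],
  pro: ["intelligence:read", "skills:read", "market:read"],
  scale: ["intelligence:read", "skills:read", "market:read", "company_data:read"],
};

// True when every required scope appears in the granted list.
function hasScopes(granted: string[], required: string[]): boolean {
  return required.every((scope) => granted.indexOf(scope) !== -1);
}

// True when a tier is allowed to issue a key carrying the requested scopes.
function tierCanGrant(tier: string, requested: string[]): boolean {
  const available = tierScopes[tier];
  if (available === undefined) return false;
  return hasScopes(available, requested);
}

// A learning tool needs skill data but not company data.
const learningToolKey = ["intelligence:read", "skills:read"];
console.log(tierCanGrant("pro", learningToolKey));
console.log(hasScopes(learningToolKey, ["company_data:read"]));
```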

See also.

All six engines

The full engine catalog

API docs

The canonical API reference

MCP integration

Six tools, one endpoint

Methodology

Sources, formulas, confidence

Methodology version 2026.2. Last updated July 2026. The role catalog is updated quarterly. Methodology versions are incremented on every change and visible in every API response. Full change history at the methodology document.