AI Role Taxonomy Engine.
598 AI-adjacent roles. Stable identifiers. BLS SOC mapping. Eight seniority bands. A related-role graph. The substrate every other Orbyt Intelligence engine builds on.
What it is, and why it exists.
The AI Role Taxonomy Engine is Orbyt's canonical catalog of every AI-adjacent role in the U.S. labor market. As of May 2026 it tracks 598 distinct roles, from role:ai-engineer and role:ml-engineer to emerging roles like role:llm-systems-engineer, role:ai-agent-engineer, and role:rag-platform-engineer that did not exist three years ago.
It exists because the labor market does not agree on names. The same job is called Machine Learning Engineer, ML Platform Engineer, AI Engineer, AI Infrastructure Engineer, Senior ML Engineer, and AI Systems Engineer at six different companies in the same week. Without a canonical layer, every downstream query (salaries, skill premiums, hiring velocity, projections) would silently pivot on the noun the customer happened to type.
Every other Orbyt engine resolves through this one first. When you ask analyze_compensation for an "AI engineer in San Francisco," the Role Taxonomy Engine runs first. It returns role:ai-engineer with a confidence score, a list of alternates that nearly matched (role:ml-engineer, 0.82 similarity), and a methodology version. Every subsequent call uses the stable slug so the answer is reproducible.
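A minimal sketch of the resolve-then-query pattern described above. The class names, field names, and the 0.93 confidence value are illustrative assumptions; only the slugs, the 0.82 similarity, and the methodology version come from this page, and the real envelope is defined in the API docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternate:
    slug: str
    similarity: float


@dataclass(frozen=True)
class RoleResolution:
    # Hypothetical shape of a resolution result; field names are assumptions.
    slug: str
    confidence: float
    alternates: tuple
    methodology_version: str


def compensation_query(resolution: RoleResolution, city: str) -> dict:
    """Build a downstream query keyed on the stable slug, not the typed title."""
    return {
        "role": resolution.slug,
        "city": city,
        "methodology_version": resolution.methodology_version,
    }


resolved = RoleResolution(
    slug="role:ai-engineer",
    confidence=0.93,  # illustrative value, not a published figure
    alternates=(Alternate("role:ml-engineer", 0.82),),
    methodology_version="2026.2",
)
query = compensation_query(resolved, "San Francisco")
```

Because the query carries the slug and the methodology version rather than the free-text title, rerunning it later asks the same question.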
That is the difference between a tool that answers your question and a tool you can build a business on. Bloomberg-grade data starts with stable identifiers. So does ours.
The 598 roles.
The catalog is organized into eight role families. Each family clusters related roles by core competency, then by specialization within that family.
Adding a new role requires three independent signals: at least 200 distinct job postings using the title across the past 90 days, at least one comp-survey reference (Levels.fyi, Pave, or a published industry survey), and either an SEC filing or H-1B LCA filing using the title. This three-source threshold keeps the catalog from inflating with marketing vocabulary while still letting genuinely new categories enter quickly. The current rate of new role additions runs at 6-8 per quarter.
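The three-source threshold can be expressed as a single predicate. This is a sketch under stated assumptions: the RoleSignals fields and the function name are hypothetical, and only the thresholds themselves come from the rule above.

```python
from dataclasses import dataclass

MIN_POSTINGS_90_DAYS = 200  # from the admission rule above


@dataclass(frozen=True)
class RoleSignals:
    # Hypothetical container for the evidence gathered about a candidate title.
    postings_last_90_days: int
    comp_survey_references: int
    has_sec_filing: bool
    has_h1b_lca_filing: bool


def qualifies_for_catalog(signals: RoleSignals) -> bool:
    """All three independent signals must be present."""
    enough_postings = signals.postings_last_90_days >= MIN_POSTINGS_90_DAYS
    has_comp_reference = signals.comp_survey_references >= 1
    has_filing = signals.has_sec_filing or signals.has_h1b_lca_filing
    return enough_postings and has_comp_reference and has_filing


candidate = RoleSignals(
    postings_last_90_days=250,
    comp_survey_references=1,
    has_sec_filing=False,
    has_h1b_lca_filing=True,
)
admitted = qualifies_for_catalog(candidate)
```

A title with heavy posting volume but no survey reference or filing fails the check, which is how marketing vocabulary stays out.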
Deprecating a role takes the opposite path. We never delete a role from the live API, because backwards compatibility is sacred. Instead, deprecated roles carry a successor_slug field, and lookups against the deprecated slug return a 200 with a deprecation_notice in the envelope. The role continues to resolve forever.
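A sketch of how a client might see deprecation in practice. The slugs role:example-legacy and role:example-current are placeholders, not real catalog entries, and the shape of deprecation_notice is an assumption; the page only states that the field exists and that the lookup still returns a 200.

```python
# Placeholder catalog; both slugs are hypothetical.
CATALOG = {
    "role:example-current": {"successor_slug": None},
    "role:example-legacy": {"successor_slug": "role:example-current"},
}


def lookup(slug: str) -> dict:
    """Resolve a slug. Deprecated slugs still resolve, with a notice attached."""
    record = CATALOG[slug]
    envelope = {"status": 200, "data": {"slug": slug}}
    successor = record.get("successor_slug")
    if successor is not None:
        # Assumed notice shape: point the caller at the replacement slug.
        envelope["deprecation_notice"] = {"successor_slug": successor}
    return envelope


def canonical_slug(slug: str) -> str:
    """Follow the successor pointer once, if there is one."""
    notice = lookup(slug).get("deprecation_notice")
    return notice["successor_slug"] if notice else slug
```

A client can keep storing the old slug and migrate on its own schedule, since the lookup never fails.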
How role hierarchies work.
Each role lives at the intersection of three orthogonal hierarchies: family, seniority, and specialization. The taxonomy stores them as separate fields rather than flattening into one string, because each one moves independently.
Three hierarchies, three queries. By family: list every role in the llm-and-agents family to map a team. By seniority: get the salary band for staff across every role family to benchmark a leveling conversation. By specialization: compare platform roles across families to understand the cross-domain premium for platform work. The taxonomy supports all three without any role-string magic.
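Keeping the three hierarchies as separate fields means each query is a plain filter. The sketch below assumes a flat record per role and seniority; the core-ml family, the applied specialization, and the row assignments are invented for illustration and do not reflect the real catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleRecord:
    slug: str
    family: str
    seniority: str
    specialization: str


# Illustrative rows only; the assignments are assumptions.
RECORDS = [
    RoleRecord("role:llm-systems-engineer", "llm-and-agents", "staff", "platform"),
    RoleRecord("role:ai-agent-engineer", "llm-and-agents", "senior", "applied"),
    RoleRecord("role:ml-engineer", "core-ml", "staff", "platform"),
    RoleRecord("role:ai-engineer", "core-ml", "mid", "applied"),
]


def select(records, **criteria):
    """Return records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in criteria.items())
    ]


team_map = select(RECORDS, family="llm-and-agents")
staff_rows = select(RECORDS, seniority="staff")
platform_rows = select(RECORDS, specialization="platform")
```

The same helper combines axes, for example select(RECORDS, family="llm-and-agents", specialization="platform"), with no parsing of a flattened role string.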
Seniority multipliers are derived from H-1B LCA wage levels and validated against BLS OES percentile distributions for the role's primary SOC code. Junior maps to the OES 25th percentile, Mid to the median, and Senior to the 75th percentile; Staff and above are extrapolated using the role-family multiplier curves. The multiplier is the engine's output; the customer multiplies the role's national median by it to produce the band figure.
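A worked sketch of the band arithmetic. It assumes each multiplier is the ratio of the anchoring percentile to the median, which is one plausible reading of the mapping above and not a statement of the engine's actual derivation. All wage figures are made up, and Staff and above are omitted because the extrapolation curves are not specified here.

```python
def seniority_multipliers(p25: float, median: float, p75: float) -> dict:
    """Assumed form: multiplier = anchoring percentile / median."""
    if median <= 0:
        raise ValueError("median must be positive")
    return {
        "junior": p25 / median,
        "mid": 1.0,
        "senior": p75 / median,
    }


def band_figure(national_median: float, multiplier: float) -> float:
    """The customer-side step: national median times the engine's multiplier."""
    return national_median * multiplier


# Made-up percentile wages for a role's primary SOC code.
multipliers = seniority_multipliers(p25=90_000, median=120_000, p75=150_000)

# Made-up national median for the role itself.
bands = {
    level: band_figure(160_000, m) for level, m in multipliers.items()
}
```

With these inputs the multipliers are 0.75, 1.0, and 1.25, giving bands of 120,000, 160,000, and 200,000.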
BLS SOC mapping methodology.
The Bureau of Labor Statistics maintains the Standard Occupational Classification (SOC) system, which covers the entire U.S. labor market in more than 800 detailed occupation categories. SOC is the authoritative federal taxonomy. Every Orbyt role maps to one or more SOC codes so customers can cite a government data source for every salary figure we publish.
The mapping is two-tier: a primary SOC code that drives the salary baseline, and an optional secondary code that provides additional signal for ambiguous roles. For role:ai-engineer, the primary is 15-2051 (Data Scientists) because BLS does not yet have a dedicated SOC code for AI Engineers. The secondary is 15-1252 (Software Developers) because the role overlaps both categories.
Where BLS lacks a SOC code (most modern AI roles), we apply a documented premium adjustment on top of the primary SOC baseline. The adjustment is derived from the role's H-1B LCA distribution relative to the SOC's overall distribution. For AI Engineer in 2026, the H-1B LCA median for the role-specific filings runs ~22% above the SOC 15-2051 median, so that 22% becomes the premium: a multiplier of 1.22 on the SOC baseline.
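The premium arithmetic, sketched with round numbers. The function names and dollar figures are hypothetical, and the sketch assumes the adjustment is a simple median-to-median ratio, which matches the 22% example above but may not capture the full documented method.

```python
def premium_multiplier(role_lca_median: float, soc_median: float) -> float:
    """Ratio of the role-specific H-1B LCA median to the SOC median."""
    if soc_median <= 0:
        raise ValueError("soc_median must be positive")
    return role_lca_median / soc_median


def adjusted_baseline(soc_baseline: float, multiplier: float) -> float:
    """Apply the premium on top of the primary SOC baseline."""
    return soc_baseline * multiplier


# Made-up figures chosen so the role median sits 22% above the SOC median.
multiplier = premium_multiplier(role_lca_median=122_000, soc_median=100_000)
baseline = adjusted_baseline(soc_baseline=100_000, multiplier=multiplier)
```

Here the multiplier works out to 1.22, so a 100,000 SOC baseline becomes 122,000 for the role.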
Every Orbyt response that returns salary data includes the SOC code, the SOC name, the premium multiplier, and the methodology version. The /lineage/[data_point_id] endpoint returns the full provenance trail: every source that fed into this specific salary, with timestamps and reconciliation rules. That is Bloomberg-grade traceability applied to U.S. labor data.
Use cases.
Type a working title; the engine returns the canonical slug, the BLS SOC code, the seniority bands, and three to five adjacent roles to widen the search if the primary role has thin supply. Some customers wire this into their ATS so every requisition gets a canonical role tag at creation time. Bands and adjacent roles become the visible scaffolding around the spec.
Pull the eight seniority bands for every role in a family, compose them into your internal leveling, and anchor the multipliers to your own market percentile. Every band comes back with a methodology version and a sample size, so your comp committee can defend the band against finance and legal scrutiny.
Use the related-role graph to map career transitions. Combined with our hiring velocity engine, you can quantify the rate at which engineers are flowing from role:ml-engineer into role:llm-systems-engineer over the last twelve months. Stable slugs make longitudinal analyses possible.
Through the MCP server, the discover_roles_and_cities tool returns the top-3 role matches with confidence scores. If the agent's confidence on the first match is below the configured threshold, it can prompt the user with the alternates rather than guessing. Stable slugs make the agent's chained tool calls reproducible.
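A sketch of the threshold check an agent might run on the returned matches. The match dictionaries, the 0.8 default threshold, and the action labels are assumptions for illustration; the tool's real response shape is in the API docs.

```python
def choose_role(matches: list, threshold: float = 0.8) -> dict:
    """Use the top match if it clears the threshold, otherwise ask the user."""
    if not matches:
        raise ValueError("no matches returned")
    ranked = sorted(matches, key=lambda m: m["confidence"], reverse=True)
    best = ranked[0]
    if best["confidence"] >= threshold:
        return {"action": "use", "slug": best["slug"]}
    # Below threshold: surface every candidate instead of guessing.
    return {"action": "ask_user", "options": [m["slug"] for m in ranked]}


confident = choose_role([
    {"slug": "role:ai-engineer", "confidence": 0.93},
    {"slug": "role:ml-engineer", "confidence": 0.82},
])

uncertain = choose_role([
    {"slug": "role:ml-engineer", "confidence": 0.50},
    {"slug": "role:ai-engineer", "confidence": 0.55},
    {"slug": "role:llm-systems-engineer", "confidence": 0.40},
])
```

The first call proceeds with role:ai-engineer; the second returns all three slugs, ordered by confidence, for the user to pick from.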
API integration.
Three paths. All three return identical envelopes. Pick the one that fits your stack.
The full reference lives at the Orbyt Intelligence API docs. The locked envelope shape and error model are documented in RFC-001.
Pricing and access.
The Role Taxonomy Engine ships on the Build tier at $99/mo. It is the substrate every other engine builds on, so we keep it at the floor price: a starting line that pays for its own compute without locking the ecosystem out. Build limits cap reads at 60 per minute per API key.
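To stay under the 60-reads-per-minute cap, a client can throttle itself before sending. The sketch below is a client-side sliding-window limiter. It assumes the server counts requests over a rolling 60-second window, which this page does not specify, and the class name is hypothetical.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Allow at most max_calls within any window_seconds span (client side)."""

    def __init__(self, max_calls=60, window_seconds=60.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._calls = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False
```

Call try_acquire() before each read with a given API key; when it returns False, wait and retry instead of spending a request that would be rejected.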
Restricted API keys with per-engine scopes are available on all paid tiers. Grant a key only the scopes it needs (skills:read for a learning tool that does not need company data, company_data:read for a comp-consulting tool that does not need skill premiums, etc.). The full pricing page lives at /orbyt-intelligence/pricing.
Methodology version 2026.2. Last updated May 2026. The role catalog is updated quarterly. Methodology versions are incremented on every change and visible in every API response. Full change history at the methodology document.