The Agentic Engineering Field Guide, Part 2: The Framework and Platform Landscape
A walk through every credible option in April 2026. What each one is built for. Who it wins for. What it locks you into.
The Three Layer Decision
Most framework debates skip the decision that matters more. Before you pick a framework, you have to pick a layer.
There are three. From most control to least.
Open framework plus your own runtime. You write the agent code in LangGraph, Microsoft Agent Framework, Mastra, or similar. You deploy it on infrastructure you own or operate. You wire in your own checkpointing, observability, and scaling. The framework does the orchestration work. Everything else is yours.
Managed platform. A hyperscaler runs the agent runtime. You write code in the matching open SDK, ship a container or config, and the platform handles deployment, scaling, identity, observability, and state. Microsoft Foundry, Google Vertex AI Agent Engine, and AWS Bedrock Agents all live here. You trade some control for a lot less plumbing.
SaaS agent layer. You do not write the agent. The vendor built it. You configure topics, actions, and data connectors inside their platform. Salesforce Agentforce and ServiceNow AI Agents live here. You trade most control for fastest time to value.
The rule of thumb I use with clients: take the most managed layer that still meets your control requirements. Each step toward the open framework layer buys control and costs speed.
Your reasoning should go in this order.
Is there a SaaS agent that already solves this? If your workflow is customer service routing inside Salesforce data or ITSM triage inside ServiceNow data, the SaaS layer probably already has an answer. Evaluate it first.
If not, does a managed platform give you enough control? If your team can live with the runtime the cloud provides and your data already lives on that cloud, the managed platform is faster than building your own runtime.
Only if the managed platform cannot meet your requirements do you go to open framework plus your own runtime. This is where differentiation happens, and where most of the engineering cost lives.
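That ordering can be encoded as a tiny helper. This is a sketch, with the three boolean predicates standing in for your own evaluation of each question:

```python
def pick_layer(saas_fit: bool, managed_control_ok: bool, data_on_cloud: bool) -> str:
    """Walk the three questions in order: SaaS first, then managed
    platform, then open framework plus your own runtime."""
    if saas_fit:
        return "saas"                       # a vendor agent already solves it
    if managed_control_ok and data_on_cloud:
        return "managed-platform"           # the cloud runtime is good enough
    return "open-framework-own-runtime"     # full control, full engineering cost

# Example: no SaaS fit, but the managed runtime meets the control bar
print(pick_layer(saas_fit=False, managed_control_ok=True, data_on_cloud=True))
```

The point of writing it down is the ordering, not the predicates: the expensive mistake is starting the evaluation at the bottom.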
The rest of this piece walks the three layers in that order. SaaS first. Platforms second. Open frameworks third. Then the protocols that tie them together.
Layer 1: The SaaS Agent Layer
Salesforce Agentforce
Agentforce is now the Agentforce 360 Platform, rebranded in 2025. Agent Builder covers low code, with pro code extension through Apex, JavaScript, Flows, Prompt Builder, and MuleSoft connectors. The Atlas Reasoning Engine is the orchestrator under the hood. Models are pluggable across OpenAI, Anthropic, Google, and Salesforce's own Einstein models.
The pricing is the thing. Agentforce has the most transparent unit economics of any enterprise SaaS agent product. Flex Credits at 500 dollars per 100,000 credits. An agent action costs 20 credits or 10 cents. A voice action costs 30 credits or 15 cents. Customer facing conversations are priced at 2 dollars each on a pre-purchase plan. Agentforce User License is 5 dollars per user per month with metered usage on top. The full Sales or Service add-on is 125 dollars per user per month unmetered. The Agentforce 1 Editions start at 550 dollars per user per month with a million Flex Credits per org per year included.
A worked example using their published rates: 100 users doing 3 case management tasks per day, 20 working days, 6 actions per task at 10 cents per action, comes to 3,600 dollars per month for that use case. That kind of math is what makes Agentforce a buy versus build decision rather than a platform evaluation.
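At the published rates, 500 dollars per 100,000 Flex Credits with 20 credits per agent action and 30 per voice action, the arithmetic is a one liner. A sketch, with an illustrative helper name:

```python
def monthly_cost(users, tasks_per_day, working_days, actions_per_task,
                 credits_per_action=20):
    """Monthly Flex Credit spend in dollars, at $500 per 100,000 credits."""
    credits = users * tasks_per_day * working_days * actions_per_task * credits_per_action
    return credits * 500 / 100_000

# 100 users, 3 case tasks a day, 20 working days, 6 actions per task
print(monthly_cost(100, 3, 20, 6))                           # 3600.0
# the same workload over voice actions at 30 credits each
print(monthly_cost(100, 3, 20, 6, credits_per_action=30))    # 5400.0
```

Running your own volumes through this before the sales call is cheap insurance.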
Where Agentforce wins: your system of record is already Salesforce, your data already lives in Data 360 (the renamed Data Cloud), and the workflow maps onto Service, Sales, SDR, or Commerce patterns. Pre built templates for all of these ship out of the box. AgentExchange is the partner marketplace for custom topics and actions.
Where Agentforce loses: you need novel multi agent topologies, your customer is not already a Salesforce shop, or you need to run the agent outside the Salesforce security perimeter. You also give up model choice in practice. Atlas uses the models Salesforce has integrated, which is a wide set but not every model.
Named enterprise customers include Workday, OpenTable, ADP, Wiley, Heathrow Airport, FedEx, and Saks Fifth Avenue. Systems integrators have dedicated Agentforce practices at every tier.
ServiceNow AI Agents
Three layered product on the Now Platform. Now Assist is the copilot layer, generally available across ITSM, HR, CSM, Creator Workflows, Security Operations, and Sourcing. AI Agents are autonomous, expanded significantly in the Zurich release in early 2026. AI Agent Studio is the low code builder for custom agents.
The structural advantage for ServiceNow is that agents run as first class Now Platform objects. They inherit the platform's identity, data model, and audit trail without any integration work. Flow Designer, Integration Hub, MID Server access for on-premises systems, full RBAC and ACL. For internal facing workflows where your system of record is already ServiceNow, this eliminates most of the engineering you would do in a custom build.
Model strategy is pluggable. Now LLM for common workflow tasks, Azure OpenAI and Anthropic for higher reasoning, bring your own LLM for customers with existing relationships.
Pricing is a per user SKU with Pro, Pro Plus, and Enterprise tiers layered on top of existing product licenses. ServiceNow has been shifting toward consumption based pricing for autonomous agents, based on its 2025 earnings commentary.
Where ServiceNow wins: internal facing automations, ITSM triage, HR employee services, SOC triage, and anywhere the Now Platform is already the system of record. Time to value is measured in weeks because the connectors, audit trails, and identity are already there.
Where it loses: customer facing experiences outside the Now data model, novel multi agent topologies, anything that requires custom observability or evaluation beyond what the platform exposes.
Named customers include Adobe, Hitachi, NVIDIA (large internal deployment), BT Group, AstraZeneca, Dell, and Equinix.
The SaaS Layer Reality
I have talked several clients out of building their own customer service agent because Agentforce already solves 80 percent of their problem at a fraction of the cost of building it well. The same goes for ITSM workflows and ServiceNow. Buy the SaaS layer where it fits. Build only where you differentiate. That advice sounds obvious. Most teams skip it because the SaaS layer is not where interesting engineering happens. Interesting engineering is not the goal. Outcomes are.
Layer 2: The Managed Platforms
Microsoft Foundry plus Microsoft Agent Framework
Microsoft Foundry is the rebrand of Azure AI Foundry that shipped at Ignite 2025. The old Hub plus OpenAI resource plus AI Services model collapsed into a single Foundry resource with projects. Assistants, Threads, Messages, and Runs became Responses, Conversations, Items, and Agent Versions under the Responses API. SDKs consolidated behind azure-ai-projects 2.x. The whole platform is in the middle of that naming shift. Expect procurement conversations in 2026 to include the phrase "what is this thing called now."
Three agent types ship. Prompt agents are generally available, low code, defined by instructions plus a model plus tools. Workflow agents are in preview, declarative YAML or visual designer. Hosted agents are in preview, code based, shipped as containers. Hosted agents explicitly accept MAF, LangGraph, or arbitrary code. That last detail reframes Foundry as a control plane rather than a Microsoft only runtime.
The Foundry model catalog has 1,900 plus models across foundation, reasoning, small, multimodal, domain, and industry categories. Azure Direct sold by Microsoft covers OpenAI, DeepSeek, Meta, Mistral, Cohere, NVIDIA, and Microsoft's Phi. Partners and community covers Anthropic Claude via Models as a Service, plus hundreds of Hugging Face models on managed compute. The tool catalog has 1,400 plus entries including MCP servers added directly from the portal.
Microsoft Agent Framework shipped 1.0 on April 2 for both Python and .NET. The .NET 1.1.0 landed April 10. It is the successor to AutoGen and Semantic Kernel, built by the same teams. The pairing with Foundry is the tightest in the industry. Only agent-framework-foundry is generally available. Every other provider package, Anthropic, Bedrock, Cosmos, AI Search, Durable Task, Azure Functions, Copilot Studio, Purview, is beta.
Identity through Entra. Every agent gets a dedicated Entra identity with RBAC scoped to the resources it needs. Entra Agent Registry catalogs the deployed agents. Defender for Foundry Tools surfaces prompt injection, jailbreak, and cross-prompt injection attack alerts. Application Insights and OpenTelemetry are built in.
Foundry Local is the quiet differentiator nobody is matching. Generally available in the 2026 wave. C#, JavaScript, Rust, Python SDKs. ONNX Runtime under the hood. OpenAI compatible API. No Azure subscription required. Runs on Windows, macOS, and Linux. Catalog includes GPT OSS, Qwen, DeepSeek, Mistral, Phi, Whisper. Same SDK patterns as cloud Foundry. Neither Vertex nor Bedrock has a first party on device counterpart with the same ergonomics. For healthcare on-premises, government air-gapped, or edge scenarios, this is the real story.
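Because Foundry Local exposes an OpenAI compatible API, any OpenAI style client can target it by swapping the base URL. A sketch; the localhost port and model name are assumptions to check against your local catalog:

```python
import json
import urllib.request

# Assumed local endpoint and model name -- both are illustrative; the
# actual port and catalog entries depend on your Foundry Local install.
ENDPOINT = "http://localhost:5273/v1/chat/completions"
payload = {
    "model": "phi-4-mini",
    "messages": [{"role": "user", "content": "Summarise this incident report."}],
    "temperature": 0.2,
}
request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request)  # uncomment with a running local service
print(json.dumps(payload, indent=2))
```

The payload shape is identical to what you would send to cloud Foundry, which is the whole point: the same client code covers the air-gapped and cloud cases.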
Compliance is the broadest portfolio in the cloud market. FedRAMP High via Azure Government. HIPAA with BAA. HITRUST. SOC 1, 2, 3. ISO 27001, 27017, 27018, 27701. PCI DSS. EU Data Boundary. Microsoft Cloud for Sovereignty for the EU sovereign stack. 21Vianet for China. India RBI, IRDAI, MeitY. When procurement asks for the compliance matrix, it already exists.
Limitations worth naming. Hosted agents preview does not yet support private networking, which matters for regulated scenarios. The evaluations SDK in MAF is still marked experimental. The rebrand has created documentation churn: classic portal, new portal, old service names in old tutorials. Expect confusion during Q2 2026 buying conversations.
Google Vertex AI Agent Engine plus ADK
Vertex AI Agent Engine is the managed agent runtime inside Vertex AI Agent Builder. The API object is still named ReasoningEngine for backward compatibility, a reminder that this product has been through two name changes.
Services inside Agent Engine, April 2026 state: Runtime is generally available, autoscaling, VPC Service Controls, configurable IAM, managed containerization. Sessions is generally available, durable per user conversation state. Memory Bank is generally available, cross session long term memory using Gemini models to generate memories, IAM Conditions support, regional ML processing. Code Execution is generally available, sandboxed code execution for agent generated code. Example Store is in preview, stores few shot examples. Quality and Evaluation is in preview, integrates the Gen AI Evaluation service and supports Gemini fine tuning for agent optimisation. Threat Detection is in preview, built into Security Command Center for attack pattern monitoring. Agent Identity is in preview.
The framework support tiers are explicit. Full integration covers ADK, LangChain, and LangGraph. Vertex AI SDK integration covers AG2 and LlamaIndex. Custom template covers CrewAI and everything else. Agent Engine runs agents speaking the A2A protocol natively. ADK itself has Python, Java, and Go SDKs, with Python the most mature.
Deployment is Terraform driven through the Agent Starter Pack. Pre built templates for ReAct, RAG, multi agent patterns, a playground UI, automated Cloud Build CI/CD, Cloud Trace and Cloud Logging wired in. Observability is Cloud Trace with OpenTelemetry, Cloud Monitoring, and Cloud Logging. Agent Engine specific dashboards surface latency, errors, token usage.
Models are any model accessible to Vertex AI. Gemini 2.x as first class. Model Garden includes Anthropic Claude Opus 4.6, Sonnet 4.6, and Haiku 4.5, Meta Llama, Mistral, and others. Non Gemini models are a first class option, not a workaround.
Compliance. HIPAA is explicitly supported. VPC Service Controls. Customer Managed Encryption Keys. Data Residency at rest. Access Transparency and Access Approval. Private Service Connect for private VPC egress. FedRAMP is covered through Google Cloud's broader FedRAMP High Assured Workloads programme.
Where Vertex wins: GCP native shops, Gemini first, Java or Go preferred languages, BigQuery data gravity, existing Vertex AI investment. The Memory Bank primitive is genuinely unique and worth studying even if you do not pick Vertex. The first party Gen AI Evaluation integration is the cleanest eval story any managed platform ships.
Where it loses: cross cloud deployments, Azure shops, and teams with zero Google footprint. The Agent Engine feature split between generally available and preview is the most complex among the three hyperscalers. Your feature selection affects your support tier.
AWS Bedrock Agents plus Strands plus AgentCore
AWS runs two paths in parallel and the split matters.
Bedrock Agents is the config first managed agent service, generally available. You define Action Groups as OpenAPI schemas plus Lambda functions or return of control callbacks. Knowledge Bases provide retrieval augmented generation as a service, with ingestion from S3, SharePoint, Confluence, Salesforce, or the web, and vector storage on OpenSearch Serverless, Aurora PostgreSQL pgvector, Pinecone, Redis Enterprise, MongoDB Atlas, or Neptune Analytics. Guardrails for Bedrock cover content filters, PII redaction, denied topics, word filters, and contextual grounding checks. Prompt Flows is the visual canvas orchestration tool. Multi agent collaboration is generally available, explicitly hierarchical supervisor and collaborator topology rather than a free form graph.
Strands Agents is the open, Apache 2.0 licensed agent SDK. Code first, Python only for now. Latest release is 1.35.0 on April 8, with Bedrock Service Tier control for Priority, Standard, and Flex as a unique feature. Strands is deliberately portable. You can run a Strands agent anywhere. Bedrock is the preferred deployment target but not a required one.
Bedrock AgentCore is the newer managed runtime that hosts Strands, LangGraph, and ADK agents on Bedrock infrastructure. It was announced in preview at the AWS Summit in New York in July 2025 and sits in the Bedrock top navigation as of April 2026. Its exact generally available status has been moving and is worth verifying with your AWS account team at contract time.
Models on Bedrock as of April 2026. Anthropic Claude Opus 4.6 at 5 dollars input and 25 dollars output per million tokens. Claude Sonnet 4.6 at 3 and 15. Claude Haiku 4.5 at 1 and 5. Amazon Nova family across Understanding, Creative, Speech to Speech, and Embeddings. Meta Llama 4. Mistral. Cohere Rerank 3.5. DeepSeek v3.2 at 62 cents input and 1.85 dollars output per million, which is notable. Google Gemma 3. MiniMax. Qwen. Stability AI. TwelveLabs. Writer. Z AI.
Pricing model. Agents themselves carry no separate per agent charge. You pay for the underlying model tokens plus Knowledge Base and Guardrails usage. Batch inference carries a 50 percent discount on most models. Service tiers: Standard; Flex at a 50 percent discount with best effort latency; Priority at a 75 percent premium for lower latency; Reserved for capacity commitments.
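Per token rates and tier multipliers compose multiplicatively, which makes fleet cost modelling simple. A sketch using the April 2026 figures quoted above; the helper and dictionary names are illustrative:

```python
# Dollars per million tokens (input, output), from the price list above
MODELS = {
    "claude-opus-4.6":   (5.00, 25.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-haiku-4.5":  (1.00,  5.00),
    "deepseek-v3.2":     (0.62,  1.85),
}
# Service tier multipliers on the base rate
TIER = {"standard": 1.00, "flex": 0.50, "priority": 1.75, "batch": 0.50}

def request_cost(model, in_tokens, out_tokens, tier="standard"):
    """Dollar cost of one call at the chosen model and service tier."""
    p_in, p_out = MODELS[model]
    base = (in_tokens * p_in + out_tokens * p_out) / 1_000_000
    return base * TIER[tier]

# A 10k-in / 2k-out call on Sonnet 4.6, standard versus flex
print(request_cost("claude-sonnet-4.6", 10_000, 2_000))           # 0.06
print(request_cost("claude-sonnet-4.6", 10_000, 2_000, "flex"))   # 0.03
```

The DeepSeek line in the same table is why mixed model fleets are an economic argument, not just a capability one.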
Compliance. HIPAA BAA eligible. SOC 1, 2, 3. PCI DSS. ISO 27001, 27017, 27018. Bedrock is available in AWS GovCloud with FedRAMP High, which is the primary pathway for U.S. federal agencies running Anthropic workloads. Regional availability covers all the major AWS regions plus the new European Sovereign region.
Where AWS wins: AWS native shops, existing Bedrock investment, Claude as the preferred model family with FedRAMP High coverage, strict latency SLAs (Service Tiers are real), and mixed model fleets with strong DeepSeek or Llama economics.
Where it loses: Azure native shops, GCP native shops, and teams that need a true graph orchestration framework rather than hierarchical supervisor patterns. The Bedrock Agents config model is less flexible than MAF's graph or LangGraph's Python code.
Anthropic Developer Platform
Anthropic's platform is the exception to the hyperscaler pattern. There is no Anthropic managed agent runtime. The SDK is the story. Claude runs on the Messages API plus tools, on Bedrock AgentCore, or on Vertex Agent Engine, whichever managed runtime you prefer. This is a deliberate positioning choice. Anthropic focuses its own engineering on models, the SDK, and safety. It leans on AWS and Google for managed deployment.
Model family April 2026, verified on platform.claude.com. Claude Opus 4.6 has 1 million token context, 128k output, 5 dollars input and 25 dollars output per million. Claude Sonnet 4.6 has 1 million token context, 64k output, 3 and 15 dollars. Claude Haiku 4.5 has 200k context, 64k output, 1 and 5 dollars. Extended thinking is supported across all 4.x models. Adaptive thinking is Opus and Sonnet only. Haiku 3 retires April 19.
Opus 4.6 and Sonnet 4.6 support 300k output tokens via the Batches API with the output-300k-2026-03-24 beta header. That is a material change for long report generation workflows.
Platform capabilities beyond the Messages API. Batches API at 50 percent discount. Prompt caching, with cache writes at roughly 1.25 times the base input rate and cache reads at roughly 10 percent of base. Files API for persistent cross request storage. Computer Use is still beta with a new zoom action on Opus 4.6, Sonnet 4.6, and Opus 4.5. Tool primitives include bash, text editor, and custom tools. MCP is first class, both on Claude.ai and the API.
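Those caching multipliers matter more than they look. A sketch of the arithmetic at Sonnet 4.6 input rates, assuming a long system prompt reused across calls; all names here are illustrative:

```python
BASE_IN = 3.00                     # Sonnet 4.6, dollars per million input tokens
CACHE_WRITE = 1.25 * BASE_IN       # cache write at ~1.25x the base input rate
CACHE_READ = 0.10 * BASE_IN        # cache read at ~10 percent of base

def cached_run_cost(prefix_tokens, fresh_tokens, calls):
    """First call writes the shared prefix to cache; later calls read it."""
    first = (prefix_tokens * CACHE_WRITE + fresh_tokens * BASE_IN) / 1e6
    rest = (prefix_tokens * CACHE_READ + fresh_tokens * BASE_IN) / 1e6
    return first + rest * (calls - 1)

def uncached_run_cost(prefix_tokens, fresh_tokens, calls):
    return (prefix_tokens + fresh_tokens) * BASE_IN / 1e6 * calls

# A 50k-token system prompt reused across 100 calls, 1k fresh tokens each
print(cached_run_cost(50_000, 1_000, 100))
print(uncached_run_cost(50_000, 1_000, 100))
```

On these assumed numbers the cached run comes in at a small fraction of the uncached one, which is why context engineering around cacheable prefixes is a cost lever and not a micro optimisation.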
Enterprise offerings. Claude for Enterprise provides SSO, audit logs, and fine grained access controls. Claude for Government is the dedicated product line for US national security customers, deployed in classified environments. Compliance certifications commonly cited include SOC 2 Type 2, ISO 27001, ISO 42001, HIPAA BAA through AWS Bedrock or GCP Vertex, and FedRAMP High via AWS GovCloud.
Notable customers from Anthropic press and partner announcements include Lyft, Snowflake, Notion, Pfizer, Quora, Robinhood, Asana, Zoom, LexisNexis, Intuit, Palo Alto Networks, and Palantir. The high profile 2025 announcement was Palantir plus Anthropic plus AWS for classified U.S. government workloads.
Where Anthropic wins: teams that want the best current model for agentic work, minimum vendor lock in, and the flexibility to run on any cloud's managed runtime. The Claude Agent SDK gives you the mature tool loop from Claude Code if you are building developer tools.
Where it loses: if you want a single vendor managed story end to end, you cannot get it from Anthropic alone. You pair them with a hyperscaler runtime.
Layer 3: The Open Frameworks
The platforms above are where production deployments land. The frameworks below are where the code is written. Some frameworks pair cleanly with a specific platform. Others run anywhere.
LangGraph plus LangSmith
The open cross cloud winner. Python and TypeScript near parity. The most mature graph semantics in the space. LangGraph 1.1.6 is the current stable. LangChain itself has repositioned as the high level wrapper built on LangGraph. 28,990 GitHub stars, largest ecosystem, trusted by Klarna, Replit, and Elastic per their own README.
State management is the strongest story in the field. langgraph-checkpoint-postgres for production durability, plus SQLite, Redis, and in memory checkpointers for lighter scenarios. Durable execution is the headline feature. Human in the loop through interrupt() and Command(resume=...). State can be inspected and modified mid flight.
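The durable execution idea is worth seeing in miniature. This is not LangGraph's implementation, just the shape of the pattern: checkpoint state per thread after every node, and let an interrupt pause the run until a resume value arrives:

```python
checkpoints: dict[str, dict] = {}          # thread_id -> last saved state

class Interrupt(Exception):
    """Raised by a node to pause the run for human input."""

def draft(state):
    state["draft"] = f"refund ${state['amount']}"
    return state

def approve(state):
    if "approved" not in state:
        raise Interrupt("needs human approval")
    return state

NODES = [("draft", draft), ("approve", approve)]

def run(thread_id, state=None, resume=None):
    state = checkpoints.get(thread_id, state or {})
    if resume is not None:
        state.update(resume)
    start = state.get("_next", 0)
    for i, (name, node) in enumerate(NODES[start:], start):
        try:
            state = node(state)
        except Interrupt:
            state["_next"] = i             # resume at this node later
            checkpoints[thread_id] = state
            return "paused"
        checkpoints[thread_id] = {**state, "_next": i + 1}
    return "done"

print(run("t1", {"amount": 40}))                 # pauses at approve
print(run("t1", resume={"approved": True}))      # resumes and finishes
```

LangGraph's interrupt() and Command(resume=...) give you this behaviour with real persistence, retries, and mid-flight state editing; the sketch is only the mental model.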
LangSmith is the flagship tracing and eval companion. Battle tested in production. The coupling is tight, which is a benefit and a tax. Production grade observability without LangSmith requires your own OTel pipeline work.
Where it wins: Python first multi cloud shops, graph oriented orchestration needs, teams with the ops maturity to run checkpointers and LangSmith. Hosted Agents on Foundry explicitly accept LangGraph. Vertex Agent Engine has LangGraph as a full integration tier.
Where it loses: .NET shops (TypeScript support exists but Python is first class), procurement contexts where open source plus SaaS observability is a harder sell than a hyperscaler contract, and single cloud shops where the hyperscaler's own platform is the easier path.
CrewAI
The "team of agents" metaphor, 48,643 GitHub stars, Python only. Two architectural modes: Crews for autonomous role playing agents, Flows for event driven single LLM call precision. CrewAI AMP Suite is the enterprise bundle with tracing, Control Plane, and on-premises or cloud deployment.
CrewAI is the fastest path to a multi agent proof of concept. The role, goal, and backstory metaphor maps onto a slide deck and a demo in hours. For prototyping and validating the idea, it is hard to beat.
Where it wins: role based automations (research, sales, operations), teams building "a team of agents" products, and proof of concept speed.
Where it loses: anywhere you need graph orchestration, enterprise procurement where open source plus AMP Suite pricing is a harder sell, and production systems where the looser orchestration becomes a constraint.
Pydantic AI
Best static typing story in the field, from the Pydantic team. Python only. 1.80.0 is the current stable. Recent releases shipped Capabilities, Agent Specs in YAML or JSON, and server side compaction for OpenAI and Anthropic. Pydantic Logfire is the companion observability product, OTel based.
Durable execution through DBOS and Temporal style backends. Human in the loop with per tool approval that can be conditional on call arguments, conversation history, or user preferences. MCP, A2A, and AG-UI all integrated natively.
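Conditional approval reduces to a predicate over the proposed tool call. This sketch is independent of Pydantic AI's actual API; the tool name, threshold, and argument shapes are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    args: dict

def needs_approval(call: ToolCall, user_is_admin: bool) -> bool:
    """Auto-approve small refunds; escalate large ones from non-admins."""
    if call.tool == "issue_refund":
        return call.args.get("amount", 0) > 100 and not user_is_admin
    return False

print(needs_approval(ToolCall("issue_refund", {"amount": 250}), user_is_admin=False))  # True
```

The design point is that the predicate sees the call arguments, not just the tool name, so approval policy can be as granular as the business rule demands.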
Where it wins: teams already using Pydantic or FastAPI, type safety as a design priority, novel approaches like YAML agent specs for code free deployment, and openness about where compaction and caching happen.
Where it loses: anywhere graph support is central, since graph is not the primary pattern. TypeScript and .NET shops.
Mastra
TypeScript first, from the team behind Gatsby. 22,906 stars. Y Combinator W25. Currently the only serious TypeScript option with graph workflows, MCP server authoring, and suspend and resume. Dual license with Apache 2.0 core and enterprise license for specific modules.
Where it wins: TypeScript and Next.js product teams, Node backend shops, teams shipping AI features into existing web apps.
Where it loses: non TypeScript stacks, enterprise contexts where the YC stage still matters for procurement.
OpenAI Agents SDK
Pre 1.0 after a year of public development. Version 0.13.6 as of April 2026. Python plus a separate TypeScript SDK. Provider agnostic. Primitives are Agent, Runner, Handoffs, Tools, Guardrails, Sessions, and Tracing. Realtime Agents for voice with gpt-realtime-1.5 are a differentiator.
Where it wins: teams already on the OpenAI Responses API who want minimum ceremony, voice agent builders.
Where it loses: anywhere you need durable execution or graph orchestration. The SDK is explicitly lightweight, not a production workflow engine.
Claude Agent SDK
Pre 1.0, Python, wrapping the Claude Code CLI. Built specifically to give developers programmatic access to the agent loop that powers Claude Code. Release cadence is near daily.
Where it wins: coding agents, internal developer tools, anything that wants the mature Read, Write, Edit, Bash tool ergonomics from Claude Code without rebuilding them.
Where it loses: multi agent orchestration, non coding use cases, production systems that need stability guarantees from a pre 1.0 SDK.
The Protocols That Tie It All Together
Model Context Protocol
MCP is the de facto standard for tool and context integration as of April 2026. Current spec revision is 2025-11-25. Every non legacy framework in this guide supports MCP: Claude Agent SDK, Microsoft Agent Framework, Google ADK, Pydantic AI, Mastra (bidirectional, consume and author), OpenAI Agents SDK, Strands Agents, Semantic Kernel, and CrewAI through adapters.
Practical implication: you can write your tool integrations once as MCP servers and call them from any framework. That changes the economics of framework choice. The lock in cost is lower than it looks.
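The portability comes from the wire format. Every MCP client emits the same JSON-RPC shapes regardless of framework; here is a minimal tools/call request, with the tool name and arguments as illustrative stand ins:

```python
import json

# JSON-RPC 2.0 request to invoke one tool on an MCP server
call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_order",              # illustrative tool name
        "arguments": {"order_id": "A-1001"},
    },
}
wire = json.dumps(call)
decoded = json.loads(wire)
print(decoded["method"])
```

A server that answers this message works for every framework in this guide, which is exactly why the integration work survives a framework change.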
Agent to Agent Protocol
A2A reached 1.0.0 on March 12, 2026, under Linux Foundation governance. The spec refactor separated application protocol from transport bindings, modernised OAuth 2.0 to remove implicit and password flows and add device code and PKCE, added multi tenancy via gRPC scope fields, and shipped tasks/list with filtering and pagination.
Adoption is narrower than MCP but growing. Microsoft Agent Framework advertises cross runtime interoperability via A2A. Google ADK has A2ATransport as a default supported transport. Vertex Agent Engine runs agents speaking the A2A protocol natively. Pydantic AI has A2A integration. For cross framework, cross cloud, cross team agent interoperability, A2A is where to bet.
The Protocol Bet
A bet on frameworks is a snapshot of one moment. A bet on protocols is durable. Design your agent system to speak MCP and A2A fluently. Your tools, your inter agent messages, and your external integrations all go through standardised protocols. The underlying framework becomes swappable. That is the architecture I recommend to every client planning a production build this year.
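One way to make the swappable framework claim concrete: keep tool definitions in a single neutral, MCP shaped schema and generate each framework's binding from it. A stdlib sketch; the two binding targets shown are the OpenAI and Anthropic tool shapes:

```python
# Neutral tool definition: name, description, JSON Schema for inputs
TOOLS = [{
    "name": "lookup_order",
    "description": "Fetch an order by id",
    "inputSchema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

def to_openai_format(tool):
    """Bind a neutral tool to the OpenAI style function-calling shape."""
    return {"type": "function",
            "function": {"name": tool["name"],
                         "description": tool["description"],
                         "parameters": tool["inputSchema"]}}

def to_anthropic_format(tool):
    """Bind the same tool to the Anthropic style tool shape."""
    return {"name": tool["name"],
            "description": tool["description"],
            "input_schema": tool["inputSchema"]}

print(to_openai_format(TOOLS[0])["function"]["name"])
```

The neutral definition is the asset; the bindings are throwaway glue. That inversion is what keeps the framework layer replaceable.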
Lock In Analysis
What is hardest to migrate away from, in order:
Agentforce and ServiceNow are hardest. Your agents are objects in someone else's metadata model. Migration means rebuild.
Bedrock Agents config first is next. Action Groups, Knowledge Bases, and Guardrails are AWS native objects. The Strands SDK path deliberately reduces this because Strands agents can move off Bedrock.
Vertex Agent Engine is similar. Sessions, Memory Bank, and Example Store are Google native. ADK itself is open and portable. The surrounding services are not.
Microsoft Foundry is similar. MAF SDK is open. The Foundry runtime, tool ecosystem, and Entra Agent Registry are not.
Open SDK plus your own runtime is the lowest lock in to infrastructure. You are still locked to the SDK's abstractions, which means a framework change is a rewrite, but the infrastructure is yours.
What is practically sticky across all layers are your prompts, evals, and tool schemas. These are portable in theory and rarely in practice. Teams underestimate how much implicit behaviour is encoded in prompt tuning for a specific orchestrator.
The Migration Test
Before committing to any layer, pick one agent. Build it twice. Once on the SaaS layer or managed platform you are considering. Once on an open SDK deployed to your own runtime. Measure four things.
Time to first production deployment.
Per conversation cost at ten times your current volume.
Evaluation score against your golden dataset.
Time to add a new tool, specifically a customer specific integration your vendor does not ship.
The answers will not match what the vendor decks suggested. They rarely do. The difference between what the decks say and what the numbers show is the single most valuable piece of evidence you can bring to a platform selection meeting.
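The four measurements fit in a small record so the two builds compare on identical terms. A sketch with invented numbers:

```python
from dataclasses import dataclass

@dataclass
class BuildResult:
    name: str
    days_to_prod: float          # time to first production deployment
    cost_per_conv_10x: float     # per conversation cost at 10x volume, dollars
    eval_score: float            # score against your golden dataset, 0 to 1
    days_to_new_tool: float      # time to add a vendor-unsupported integration

def compare(a: BuildResult, b: BuildResult) -> dict:
    return {
        "faster_to_prod":  min(a, b, key=lambda r: r.days_to_prod).name,
        "cheaper_at_10x":  min(a, b, key=lambda r: r.cost_per_conv_10x).name,
        "higher_eval":     max(a, b, key=lambda r: r.eval_score).name,
        "faster_new_tool": min(a, b, key=lambda r: r.days_to_new_tool).name,
    }

managed = BuildResult("managed", 10, 0.42, 0.81, 12)
owned = BuildResult("own-runtime", 35, 0.19, 0.84, 3)
print(compare(managed, owned))
```

Illustrative numbers, but a typical split: the managed build wins time to production, the owned build wins unit cost and extensibility, and the eval scores land close. The table the comparison prints is the artifact you bring to the platform selection meeting.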
What Is Coming in Part 3
This piece mapped the layers and named the options. The next piece goes into what you build inside whatever layer you pick.
Part 3 covers the building blocks. Memory patterns. RAG patterns including agentic, graph, and contextual retrieval. Tools and capabilities beyond MCP including computer use and code interpreters. Context engineering and prompt caching economics. Evaluation frameworks. Safety and guardrails. Cost management. Identity and authentication. Planning patterns.
These are the patterns every production agent build hits regardless of framework. Get them right and your architecture compounds. Get them wrong and you are rebuilding in six months.
Navneet Singh is the founder and CEO of Webority Technologies. He builds enterprise AI systems for clients in healthcare, financial services, and government, and writes weekly about what actually works.
