ARK: A Gateway for AI-Native Work

How we built ARK – an MCP gateway that connects models, tools, and permissions to enable the next generation of AI-powered work at Formation Bio

12 min. read · Jan. 7, 2026

AI continues to become more powerful and capable with each passing month, but its usefulness and impact remain limited if it can’t access your organization’s internal data and take action for you in the places you get work done.

At Formation Bio, we’re a diverse blend of biopharma and tech industry experts, and it’s important to us that we let each function work with the best-in-class tools of their respective domains. Because of this, you’ll find drug development squads working in SharePoint and Office, software engineers writing design specifications in Google Docs, data scientists running analysis on structured clinical data in Snowflake, talent operations managing hiring pipelines in Greenhouse, and everything in between. We also have proprietary systems like Forge, our clinical study design platform, that manage critical business context.

Furthermore, as an AI-native pharma company, we understand the jagged frontier of AI capabilities, and want to enable our growing number of AI power-users to use the best model for the job, including offerings from OpenAI, Anthropic, and Google.

And finally, we want not only to enable end users to leverage all these systems and data with AI, but also software developers who can build powerful AI applications that leverage these same capabilities.

When the data is so distributed, how do we provide AI access to it all? How do we do it in a way that doesn’t lock us into a single model provider or ecosystem? And how do we support both end users and software developers in doing so?

Our Solution: ARK

Formation Bio’s solution to these questions is called ARK (AI-ready Repository of Knowledge). ARK is an AI platform that combines:

  • A Model Context Protocol (MCP) Gateway with centralized governance controls and authentication/authorization.

  • A suite of MCP Servers, from official implementations (like GitHub’s remote MCP), to open-source (like BioMCP), to custom-built (like Forge) that connect AI to all the systems where we do work.

  • An agent builder, enabling even non-technical users to create AI agents that leverage our suite of MCP-powered integrations.

  • An AI of choice, by connecting ARK to ChatGPT, Claude, or chatting directly in its own interface.

ARK’s Backbone: MCP

The Model Context Protocol (MCP) was created by Anthropic in late 2024 to connect any LLM to any data source or system, and has seen widespread adoption across the industry since then.

We started developing our MCP Gateway in March 2025 to make the MCP ecosystem more accessible to non-technical users. We quickly saw the potential of MCP to have an impact across our business, but its implementation back then still required some software development knowledge. At the time, almost all MCP Servers were Python/NodeJS scripts that needed to be executed directly on somebody’s computer. (In official terms, these are servers that use the “stdio” transport). Most of these MCP Servers required users to manually provide API keys or access tokens when running them, which meant they had to locate and configure credentials in the developer settings of the services they wanted to connect to. Supporting non-technical users in utilizing MCP in this way would have been infeasible.

Our MCP Gateway solved these challenges in a couple of ways. First, it moved execution of MCP Servers from individual users’ computers into our AWS cloud environment. Critically, it did this in a way that respected the statefulness of the protocol—each session established with the gateway provides a private instance of that MCP Server for the user, as if they had run it themselves on their computer.

Second, our gateway centralizes authentication by managing OAuth on behalf of our users. Users connect ARK to all their downstream services and data via OAuth flows. Our gateway then manages all their access tokens, keeping them encrypted, secure, and automatically refreshing them in the background so the user doesn’t have to constantly re-authenticate. After this initial setup is complete, users need only authenticate with ARK, and the gateway automatically uses their relevant access tokens, passing them along via arguments/environment variables/etc. as needed to establish an authenticated session with the downstream service.
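A minimal sketch of this token-management pattern, assuming a caller-supplied refresh function; all names here are illustrative, not ARK’s actual implementation:

```python
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class Token:
    access_token: str
    refresh_token: str
    expires_at: float  # unix timestamp

class TokenStore:
    """Hypothetical per-(user, service) token store: callers always
    receive a valid access token and never re-authenticate themselves."""

    def __init__(self, refresh: Callable[[str], Token], skew: float = 60.0):
        self._tokens: dict[tuple[str, str], Token] = {}
        self._refresh = refresh  # exchanges a refresh token for a new Token
        self._skew = skew        # refresh this many seconds before expiry

    def put(self, user: str, service: str, token: Token) -> None:
        self._tokens[(user, service)] = token

    def get_access_token(self, user: str, service: str) -> str:
        token = self._tokens[(user, service)]
        # Refresh proactively so downstream sessions never see an expired token.
        if token.expires_at - time.time() < self._skew:
            token = self._refresh(token.refresh_token)
            self._tokens[(user, service)] = token
        return token.access_token
```

The gateway would then inject `get_access_token(...)` results into the arguments or environment of the downstream MCP session.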

By centrally managing our organization’s MCP Servers, and running them in the cloud, we’re able to strengthen our security posture by only providing access to MCP Servers that have been vetted by our team. We audit every MCP Server for security and quality issues and only add them to ARK if they meet our standards.

Of course, a lot has changed since March. Since then, official remote MCP Servers, accessible via HTTP directly from service providers, have gained popularity. But our gateway has set us up well for this transition, enabling us to instantly migrate our entire organization from unofficial stdio servers to official HTTP ones in a way that is completely transparent to our users.

Picking the Right Tools

When we first shipped ARK, we had only a handful of MCP Servers available, namely Snowflake, Airtable, and Atlassian. We now have twenty servers, and are adding new ones all the time.

With all these new servers came a problem: too many tools. Each MCP Server contains a collection of tools the AI can use to query data or even take actions like creating and updating records. Each tool includes a description of what it does and how it should be used. High-quality tool descriptions are vital for MCP Servers, as they effectively become part of the model’s prompt, guiding it in how to use the tools well.

But models are limited in the number of input tokens they can accept, and tool descriptions consume tokens. Our gateway required users to manually toggle MCP Servers on and off depending on the task at hand. If they turned everything on, our gateway would send every tool and its description to the model. The models would fail to use the right tool for the job, getting overwhelmed by the volume of tools, many of them completely irrelevant for the task at hand. All those tool descriptions would consume valuable tokens, causing our users to hit conversation length limits sooner. For these reasons, we advised our users to only enable at most a few servers at a time.

To solve this problem, we built a new feature into our gateway we call Everything Mode. When a user turns on Everything Mode, we replace the individual tools from each MCP Server with just three:

  • discover_tools

  • call_tool

  • call_read_tool

The special sauce is in discover_tools. The description of this tool includes a list of all the MCP Servers we have connected to ARK. We’ve also included company-specific context about how we use each system at Formation Bio, for example, instructing the model that proprietary system manuals can be found in Confluence, while drug development artifacts like draft protocols can be found in SharePoint.

The model uses all this context to call the discover_tools tool with the name of the system it would like to access and a description of the task it is trying to accomplish in that system. We then send the full list of tools and the description of the task to GPT-5-mini, asking it to return a list of only the tools that are relevant for the task at hand. The model can then call those tools via the call_tool or call_read_tool tool (for tools that only read data).
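A simplified sketch of this dispatch, with the GPT-5-mini relevance judgment stubbed out as an injected callable; the server and tool names below are illustrative:

```python
from typing import Callable

# Hypothetical catalog: server name -> {tool name: tool description}
CATALOG = {
    "snowflake": {
        "run_query": "Execute a read-only SQL query against the warehouse.",
        "list_tables": "List tables available in a schema.",
    },
    "github": {
        "create_issue": "Open a new issue in a repository.",
        "search_code": "Search code across repositories.",
    },
}

def discover_tools(server: str, task: str,
                   relevance: Callable[[str, str], bool]) -> list[str]:
    """Return only the tools relevant to `task`.

    In ARK this relevance judgment is made by a small model (GPT-5-mini);
    here it is an injected callable so the sketch stays self-contained.
    """
    return [name for name, desc in CATALOG[server].items()
            if relevance(desc, task)]
```

The chat model then invokes a selected tool through `call_tool` (or `call_read_tool` for read-only tools), which proxies the call to the underlying MCP Server.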

This technique has worked remarkably well. Users can connect everything once and not have to toggle things on and off manually. They can just chat with the AI and let it pick the right tools for the job. We reduced the total number of tools from 100+ to just three, leading to more efficient use of tokens, and therefore better quality and speed. And because we’ve augmented the context with company-specific guidance on how each system is used, our AI has become effective at directing users to the right place to get answers, even when the user doesn’t know where to look.

Building Agentic Systems on Top of MCP

At Formation Bio, we believe AI Agents will be crucial in freeing up our employees to focus on their most high-value work and deliver on our mission to bring new treatments to patients faster and more efficiently. An AI Agent can mean different things to different people, but we’ve internally aligned on Anthropic’s definition, which distinguishes between Agents and Workflows:

  • Workflows are systems where LLMs and tools are orchestrated through predefined code paths.

  • Agents are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

We realized that with our MCP Gateway, we had laid the foundation to solve some of the harder problems of agent development: tool connectivity and access control. With our growing list of MCP Servers connected, we had a wealth of tools to connect to our agents. And with our centralized authentication/authorization management, we could reuse the existing authorizations users had performed with our systems, ensuring our agents act with the permissions of the end user while providing a frictionless user experience.

Upon this foundation, we were able to quickly create an Agent Builder that lets even non-technical users build agents. The user needs only to configure a system prompt for their agent, instructing it on how to behave and respond, and select tools for the agent, choosing from a list of all the MCP Servers available via our gateway.
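Conceptually, an agent definition reduces to a small piece of configuration. The schema below is a hypothetical sketch, not ARK’s actual data model, and the example agent and server names are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class AgentConfig:
    """Hypothetical Agent Builder definition: a system prompt plus the
    MCP Servers (from the gateway's catalog) the agent may use."""
    name: str
    system_prompt: str
    mcp_servers: list[str] = field(default_factory=list)

bd_research_agent = AgentConfig(
    name="bd-research",
    system_prompt=(
        "You survey the pharma/biotech landscape for asset acquisition "
        "opportunities. Query Snowflake for vendor datasets and cite tables."
    ),
    mcp_servers=["snowflake"],
)
```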

We also support multi-agent architectures, enabling agents that delegate tasks to other agents, like a manager delegating tasks to their reports, or hand off the conversation to another agent, like a support rep transferring a phone call to a colleague.

These agents can then be deployed directly into our Slack workspace, where Formation employees can easily DM them like a coworker, or chat with them directly in the ARK user interface.

Let’s take a look at a few examples.

ARK in Action

Snowflake

All of Formation Bio’s structured data assets are stored in a Snowflake data warehouse. This data ranges from the NIH’s Osteoarthritis Initiative database, to market research databases from industry data vendors, to snapshots of our internal asset acquisition deal-flow pipeline. Our Snowflake MCP Server enables any Formation Bio employee to leverage all this data—even if they’re not fluent in SQL.

One of the most frequent users of our Snowflake MCP Server is our Business Development (BD) team. This team is constantly surveying the pharma/biotech landscape to hunt for new asset acquisition opportunities. Conducting this research would typically involve navigating web portals from a multitude of data vendors, then manually cross-checking and compiling information from each. However, because we’ve made these datasets available in Snowflake, the team can instead use our Snowflake MCP to have AI perform this research on their behalf:

As you may have noticed from the screenshot, we’re also leveraging the new Claude Agent Skills feature to help Claude better understand how to use our data in Snowflake to conduct market research. This is how Claude immediately knows to query a specific vendor when I ask it about asset acquisition/licensing deals.


Endpoint Agent

We are often looking to acquire drug candidates that have completed Phase I or II trials. As such, we often need to know if a trial for an asset we’re evaluating met its primary endpoint. To enable our team to do this quickly and accurately, we developed the Endpoint Agent using the ARK Agent Builder. This Agent:

  • Takes the name of the drug candidate in question

  • Searches the internet to find the NCTID of the relevant clinical trial for that asset

  • Looks up information about the trial in clinicaltrials.gov

  • Searches PubMed for published results

  • Searches the internet for relevant press releases and news

  • Presents a summary of its findings to the user
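For illustration, the steps above can be sketched as a fixed, workflow-style function with the tool calls injected; in the real agent, the LLM directs these tool calls dynamically rather than following predefined code:

```python
from typing import Callable

def summarize_endpoint(drug: str,
                       find_nctid: Callable[[str], str],
                       fetch_trial: Callable[[str], str],
                       search_pubmed: Callable[[str], str],
                       search_news: Callable[[str], str]) -> str:
    """Workflow-style sketch of the Endpoint Agent's steps.

    Each callable stands in for an MCP tool call (web search,
    clinicaltrials.gov lookup, PubMed search, news search).
    """
    nctid = find_nctid(drug)        # locate the relevant trial
    trial = fetch_trial(nctid)      # pull registry details
    papers = search_pubmed(drug)    # published results
    news = search_news(drug)        # press releases / coverage
    return f"{drug} ({nctid}): trial={trial}; pubmed={papers}; news={news}"
```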

We then benchmarked the accuracy of our agent using human-sourced ground truth data, finding it to have an accuracy of 85%, beating an industry vendor dataset’s accuracy of 74%.

BrightHire

Our talent team uses BrightHire to record and transcribe interview sessions (with participant consent), as well as internal debriefs, for candidates going through the hiring process. We’ve built and deployed a BrightHire MCP Server that allows an AI to find and process transcripts from all these sessions.

This enables our talent team to efficiently analyze these interviews to look for trends and insights regarding our interview process. For example, we can ask the model to analyze any concerns raised in the candidate debrief, and then find specific examples from the interview that either support or refute those concerns:

ARK for Developers

Connecting Apps to Our Data

Our team of software engineers has been able to leverage our MCP Gateway to more easily build AI apps and agents that can utilize all of the data sources and systems accessible via ARK. Much like end users, developers don’t need to manage the complexity of spawning sub-processes in their services to utilize MCP connectors; instead, they can simply initiate a connection to our gateway over HTTP.

But the real benefit comes in the simplicity of authentication/authorization: it is critical that AI applications and agents respect the access controls of the systems they access, and do not leak sensitive data to users who should not be able to access it. Developers get this out-of-the-box with our gateway. They need only authenticate the end user with the gateway via OAuth, and any data their app accesses via MCP is retrieved with the permissions of the end user using the app.
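As a sketch, the only credential a developer’s app ever handles is the user’s gateway token; downstream-service credentials stay inside the gateway. A hypothetical MCP `tools/call` JSON-RPC request against the gateway (the URL and tool name are illustrative) might be built like this:

```python
import json
import urllib.request

GATEWAY_URL = "https://ark.example.com/mcp"  # illustrative URL

def build_tool_call(user_access_token: str, tool: str,
                    arguments: dict) -> urllib.request.Request:
    """Build an MCP tools/call request authenticated as the end user."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }).encode()
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # The user's gateway token is the only credential the app sees.
            "Authorization": f"Bearer {user_access_token}",
        },
        method="POST",
    )
```

In practice an MCP client library would manage the session and streaming for you; the point is that authorization is a single Bearer header, not per-service credential plumbing.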


Connecting our Apps to ARK

For the developers of our proprietary systems like Forge, the gateway makes it simple to develop and deploy MCP Servers that provide access to their systems. Developers can create MCP Servers that run via the simple stdio transport and quickly ship them to production. Our gateway simplifies:

  • Deployment: MCP Servers are pushed as NodeJS/Python packages to our private AWS CodeArtifact repository. From there, the gateway can pull them down and execute them. Deploying a new version is as simple as pushing a new package version to the repository.

  • Authentication: Developers design their MCP Servers to access credentials (like OAuth access tokens) as environment variables/arguments to their CLI. The gateway automatically passes along the relevant credentials for the end user.

  • Session state: As previously mentioned, the gateway spawns a unique instance of each MCP Server for each connection established to it. This means that developers can safely keep session state in memory and not worry about persistence in databases or shared caches.
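For example, a stdio server can read the credentials the gateway injects at startup and fail fast if they are missing; the environment variable name below is hypothetical:

```python
import os
import sys

def load_credentials() -> dict[str, str]:
    """Read credentials injected by the gateway as environment variables.

    The gateway passes the end user's tokens when it spawns the stdio
    server, so the server never runs an OAuth flow itself.
    """
    token = os.environ.get("FORGE_ACCESS_TOKEN")  # hypothetical variable name
    if not token:
        sys.exit("FORGE_ACCESS_TOKEN not set; expected the gateway to inject it")
    return {"access_token": token}
```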

ARK Enables the Future of Work

As models become ever more capable, they’ll power agents that supercharge human operators and collaborators, enabling a single expert to do more. In drug development, a process that is bottlenecked by knowledge work, we believe this technology can usher in an era of abundance where fewer patients are left wanting for life- or health-saving treatments. This has always been Formation Bio’s mission: to bring new treatments to patients faster and more efficiently, and with the productivity gains enabled by AI, the path has never been clearer.

But even as models improve, providing them access to the systems where data lives and human operators work will be fundamental. With ARK, we’ve built the foundation that enables this future and allows us to take full advantage of advancements from new MCP features like apps and tasks, to the latest AI models. By building on top of MCP, we’ve chosen an architecture that will let us connect our agents to the systems of record and context graphs of the future, unlocking cutting-edge agentic performance while preserving a stable, secure interface for our data and systems.

Next Steps

We’re excited about where we go from here. Our focus will primarily be on further developing the capabilities of our agents, evolving them from chatbots into systems that can:

  • Work in the background

  • Run automatically on a schedule

  • React to events (like new press releases being ingested into our data warehouse)

We’ll also be continuing to add more MCP Servers to our gateway, including by working with engineers at the company to make more of our proprietary systems, like our ATLAS drug discovery and acquisition platform, accessible by ARK.

© Formation Bio 2026