The Technical Architecture Powering Our AI Recruitment Tool Muse

In this post, we’ll explore the high-level architecture and key components that power Muse, offering a behind-the-scenes look at how we leverage OpenAI’s LLMs to create an efficient and effective tool for crafting recruitment strategies and content.

Introduction

Patient recruitment has long been one of the toughest hurdles in running a clinical trial. Defining a patient recruitment strategy and crafting recruitment materials that engage participants within regulatory guidelines, while appealing to the diverse range of individuals a trial needs to deliver accurate and meaningful results, is no small feat, and typically takes many months of work across a number of distinct teams. This challenge is frequently a bottleneck and a major cost driver in drug development.

Muse, Formation Bio’s AI system for patient recruitment, takes a novel approach to this challenge, harnessing the latest in GenAI technology to streamline the recruitment process for clinical trials (Figure 1). Its architecture is built around OpenAI's Assistants API (Figure 2). From a small handful of trial-relevant documents, Muse can produce everything from prescreening questionnaires to fully realized ads in a matter of minutes. Most importantly, it synthesizes a wide range of incoming information into strategic insights and distinct target audiences, and adapts the content it creates to best align with each of those audiences. The application of GenAI continues after initial creation: conversational agents and sophisticated automatic validation let users edit and adapt this content to their target populations quickly and intuitively.

In this post, we’ll explore the high-level architecture and key components that power Muse. We'll start with our data inputs, give an overview of Assistants, explain how they are orchestrated and executed, describe how users interact with their outputs, and close with the challenges and opportunities that remain.

Figure 1. Screenshot from the Muse application detailing program overview information for a study.

Incoming Data

Muse’s outputs are entirely based on the data provided by the user, supplemented only by industry rules and guidelines that inform validation processes. Typically, user-provided documents include a study protocol, which outlines the eligibility criteria and objectives of a clinical trial. However, users can upload any relevant information to enhance Muse’s understanding.

Common sources of additional data include:

  • Disease research – Background on the disease, including symptoms and impacts that may not be detailed in the protocol.

  • Patient demographics – Statistical insights into the target patient population.

  • Market landscape – Information on existing treatments and competitors.

Once these files are processed and stored in a vector database, Muse’s AI assistants are ready to generate insights.
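
To make this ingestion step concrete, here is a minimal sketch using the OpenAI Python SDK. The file names are illustrative, and the exact client surface may vary across SDK versions (vector stores lived under the beta namespace during the Assistants API beta).

```python
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative user-provided documents: the protocol plus supplemental research.
doc_paths = [
    Path("study_protocol.pdf"),
    Path("disease_research.pdf"),
    Path("market_landscape.pdf"),
]

# Create a vector store for this study and upload the documents in one batch.
# upload_and_poll blocks until the files are chunked, embedded, and indexed,
# at which point assistants with file search can query them.
vector_store = client.beta.vector_stores.create(name="study-inputs")
file_streams = [path.open("rb") for path in doc_paths]
batch = client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id,
    files=file_streams,
)
print(batch.status, batch.file_counts)
```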

Assistants Overview

Each discrete task is handled by a dedicated OpenAI Assistant: a large language model configured with its own instructions, parameters, and tools. These assistants process user input, generate the most appropriate response, and can invoke tools such as file search or code execution to enhance their outputs.

Our implementation extends beyond OpenAI’s standard framework. In addition to configuring system prompts and key parameters (e.g., temperature), we design multi-step assistant workflows, chaining prompts together to generate more sophisticated and context-aware outputs.
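
As a simplified illustration of this chaining, the sketch below runs a sequence of prompts on a single Assistants API thread, so each step sees the output of the ones before it. The assistant ID and prompts are placeholders, not Muse's actual workflow definitions.

```python
from openai import OpenAI

client = OpenAI()

# Placeholder multi-step chain: each prompt builds on earlier answers
# because every run shares the same thread.
PROMPT_CHAIN = [
    "Summarize the study's eligibility criteria in plain language.",
    "From that summary, list the medical terms that need a glossary entry.",
    "Draft an eighth-grade-reading-level definition for each entry.",
]

def run_prompt_chain(assistant_id: str, prompts: list[str]) -> list[str]:
    thread = client.beta.threads.create()
    outputs = []
    for prompt in prompts:
        client.beta.threads.messages.create(thread.id, role="user", content=prompt)
        run = client.beta.threads.runs.create_and_poll(
            thread_id=thread.id, assistant_id=assistant_id
        )
        if run.status != "completed":
            raise RuntimeError(f"Run ended with status {run.status}")
        # The newest message on the thread is the assistant's reply.
        latest = client.beta.threads.messages.list(thread.id, order="desc", limit=1)
        outputs.append(latest.data[0].content[0].text.value)
    return outputs
```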

Figure 2. How assistants communicate with each other in Muse.

Our AI assistants are grouped into three key categories, each designed to handle distinct aspects of the workflow:

1. Strategic Assistants – Focused on research and analysis that inform content creation.

  • Glossary – Establishes a standardized set of terms to ensure medical terminology remains accessible across all materials.

  • Attributes – Extracts novel aspects of the trial from protocols and research, identifying unique characteristics that differentiate the study. For example, it may combine protocol details with treatment landscape data to highlight a drug’s distinctive effectiveness.

  • Segments – Uses attributes to identify meaningful patient groupings, allowing for more targeted and relevant outreach.

2. Content-Producing Assistants – Leverage strategic outputs to generate recruitment materials tailored to the identified segments.

3. Validation Assistants – Evaluate content through various lenses (e.g., IRB compliance; see the example from Muse below) and provide structured feedback.

These assistants use dedicated vector stores containing regulatory guidelines and industry best practices to deliver precise, context-aware recommendations—similar to a human reviewer.
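
In code, attaching a dedicated regulatory corpus to a validator looks roughly like the following. The store ID, model, and instructions are illustrative, not Muse's actual configuration.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical vector store holding IRB guidance and industry best practices;
# in Muse, each validator gets its own dedicated store.
GUIDELINES_STORE_ID = "vs_placeholder_irb_guidelines"

irb_validator = client.beta.assistants.create(
    name="IRB Compliance Validator",
    model="gpt-4o",
    instructions=(
        "You review clinical trial recruitment materials the way an IRB "
        "reviewer would. Cite the specific guideline behind each issue you "
        "raise and suggest a compliant rewording."
    ),
    tools=[{"type": "file_search"}],
    # Scope retrieval to the regulatory corpus so feedback is grounded in
    # the guidelines rather than general model knowledge.
    tool_resources={"file_search": {"vector_store_ids": [GUIDELINES_STORE_ID]}},
)
```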

Example: IRB Review Challenges

An Institutional Review Board (IRB) must approve all direct advertising materials for clinical trials before they can be used. In traditional workflows, this leads to significant back-and-forth revision to ensure compliance. Many regulatory requirements are subtle; for example:

  • Ensuring the tone doesn’t overpromise or imply a guaranteed positive outcome.

  • Avoiding excessive emphasis on financial incentives for participants.

By equipping an assistant with IRB regulatory documents and defining a multi-step review process, we can surface IRB-like feedback early in the content creation process. This reduces compliance issues and minimizes last-minute revisions when the material is formally reviewed.
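
A minimal sketch of one such loop, with placeholder assistant IDs, might run the validator on a draft and feed its feedback back to the content assistant:

```python
from openai import OpenAI

client = OpenAI()

def _latest_reply(thread_id: str) -> str:
    messages = client.beta.threads.messages.list(thread_id, order="desc", limit=1)
    return messages.data[0].content[0].text.value

def review_and_revise(writer_id: str, validator_id: str, draft: str) -> str:
    """Illustrative two-step loop: surface IRB-style feedback on a draft,
    then ask the content assistant to apply it."""
    # Step 1: the validator critiques the draft against its guideline corpus.
    review = client.beta.threads.create(
        messages=[{
            "role": "user",
            "content": f"Review this recruitment ad for IRB compliance:\n\n{draft}",
        }]
    )
    client.beta.threads.runs.create_and_poll(
        thread_id=review.id, assistant_id=validator_id
    )
    feedback = _latest_reply(review.id)

    # Step 2: the writer revises the draft to address that feedback.
    revision = client.beta.threads.create(
        messages=[{
            "role": "user",
            "content": (
                "Revise the ad below to address this feedback.\n\n"
                f"Feedback:\n{feedback}\n\nAd:\n{draft}"
            ),
        }]
    )
    client.beta.threads.runs.create_and_poll(
        thread_id=revision.id, assistant_id=writer_id
    )
    return _latest_reply(revision.id)
```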

Manually programming and managing dependencies across multiple assistants would be both time-consuming and brittle. To keep our application flexible and support the seamless addition of new agents at any level, we designed it with a loosely coupled architecture in two key ways:

1. Modular Assistant Definitions

Each assistant’s workflow is configurable through a no-code interface, allowing users to chain prompts together into multi-step processes. For example, when defining an assistant to generate patient segments, the workflow might include:

  • Evaluating attributes – Analyzing all generated attributes to identify meaningful patterns.

  • Discovering key patient groupings – Identifying underserved populations or traditionally hard-to-recruit groups that might be a good fit for the trial.

  • Expanding segment profiles – Enriching identified segments with expected race, age, and demographic insights to guide content adaptation.

To enhance efficiency, we enable dynamic branching, where each workflow step clones the parent thread and executes in parallel with other branches (see Efficient Processing via Dynamic Branching below).
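
We won't reproduce our actual configuration schema here, but a hypothetical, simplified definition for the segments workflow above might look like the following; every field name is invented for illustration.

```python
# Hypothetical, simplified workflow definition for the Segments assistant.
# In Muse, definitions like this are edited through a no-code interface
# rather than written by hand.
SEGMENTS_WORKFLOW = {
    "name": "patient_segments",
    "description": "Identifies meaningful patient groupings from trial attributes.",
    "steps": [
        {
            "id": "evaluate_attributes",
            "prompt": "Analyze the study attributes and identify recurring patterns.",
        },
        {
            "id": "discover_groupings",
            "prompt": (
                "Propose patient groupings, prioritizing underserved or "
                "traditionally hard-to-recruit populations."
            ),
            "depends_on": ["evaluate_attributes"],
        },
        {
            "id": "expand_profiles",
            "prompt": "Enrich each grouping with expected demographic insights.",
            "depends_on": ["discover_groupings"],
            # Dynamic branching: run once per grouping, each branch on a
            # clone of the parent thread.
            "branch_over": "groupings",
        },
    ],
}
```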

Example of generated patient segments in Muse

2. Dynamic Use of Strategic Context

Not all strategic elements are equally relevant to every assistant. Instead of manually defining these relationships, we allow assistants to self-determine relevant context at runtime by:

  • Analyzing their own description and comparing it with the descriptions of strategic assistants.

  • Automatically inserting relevant strategic outputs at the beginning of new threads to ensure contextual awareness.

This approach ensures assistants operate with the right strategic context without requiring explicit dependencies, making the system highly adaptable and scalable.
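
The matching mechanism is internal to Muse, but one straightforward way to implement the comparison step is with embedding similarity. The sketch below uses OpenAI embeddings; the model choice and similarity threshold are illustrative assumptions.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

def select_relevant_context(
    assistant_description: str,
    strategic_outputs: dict[str, str],  # strategic assistant description -> output
    threshold: float = 0.45,            # illustrative cutoff
) -> list[str]:
    """Return the strategic outputs whose source assistant's description is
    semantically close to this assistant's own description."""
    descriptions = list(strategic_outputs)
    vectors = embed([assistant_description] + descriptions)
    query, candidates = vectors[0], vectors[1:]
    # These embeddings are unit length, so a dot product is cosine similarity.
    scores = candidates @ query
    return [
        strategic_outputs[description]
        for description, score in zip(descriptions, scores)
        if score >= threshold
    ]
```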

To illustrate how dynamic context gathering works, consider a simplified version of the Prescreening Form creation process. The system automatically pulls in relevant strategic outputs while filtering out unnecessary ones.

For example, in this process:

  • It accepts outputs from strategic assistants that extract Inclusion/Exclusion (I/E) criteria and the glossary, ensuring consistency in terminology and regulatory compliance.

  • It excludes study-differentiating attributes, recognizing that these are more relevant to advertising than to the strict guidelines governing prescreening forms.

Efficient Processing via Dynamic Branching

After the question outlining step, we employ dynamic branching to streamline execution:

  • Rather than processing the entire questionnaire at once, each question is assigned its own parallel subprocess.

  • This allows multiple evaluation and rewriting prompts to run concurrently, ensuring faster, more focused refinement of each question.
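
A rough sketch of this branching pattern, using the Assistants API with asyncio (the thread-cloning helper and prompt are illustrative, not Muse's actual implementation):

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()

async def clone_thread(parent_thread_id: str) -> str:
    """Copy the parent thread's messages into a fresh thread so every
    branch starts from the same shared context."""
    messages = await client.beta.threads.messages.list(parent_thread_id, order="asc")
    thread = await client.beta.threads.create(
        messages=[
            {"role": m.role, "content": m.content[0].text.value}
            for m in messages.data
        ]
    )
    return thread.id

async def refine_question(parent_thread_id: str, assistant_id: str, question: str) -> str:
    # Each question gets its own branch thread and its own run.
    branch_id = await clone_thread(parent_thread_id)
    await client.beta.threads.messages.create(
        branch_id,
        role="user",
        content=f"Evaluate and rewrite this prescreening question:\n{question}",
    )
    await client.beta.threads.runs.create_and_poll(
        thread_id=branch_id, assistant_id=assistant_id
    )
    latest = await client.beta.threads.messages.list(branch_id, order="desc", limit=1)
    return latest.data[0].content[0].text.value

async def refine_all(parent_thread_id: str, assistant_id: str, questions: list[str]) -> list[str]:
    # All branches run concurrently rather than one question at a time.
    return await asyncio.gather(
        *(refine_question(parent_thread_id, assistant_id, q) for q in questions)
    )
```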

By leveraging context-aware selection and parallelized execution, our system maintains precision while optimizing efficiency.

This flexibility has significantly improved the quality of Muse’s outputs, allowing developers and domain experts to easily refine and adjust workflows without requiring Python expertise. By lowering the technical barrier, our system enables seamless collaboration and continuous optimization.

Assistant Orchestration

When generating content for a project, assistants are executed in a structured sequence:

1. Strategic Assistants Execute First: These assistants analyze key inputs, such as glossary terms, eligibility criteria, and study attributes, providing foundational context.

2. Content Assistants Run Next: Once strategic assistants complete, content assistants leverage their outputs dynamically, as described earlier.

3. Validators Review and Apply Feedback: As each content assistant completes its task, validation assistants evaluate the output and apply necessary modifications for compliance and clarity.

4. Segment-Specific Adaptation:

  • After generating the initial content, we refine it for each patient segment in an adaptation step.

  • Rather than recreating content from scratch, this phase adjusts language, tone, and emphasis to better align with the segment’s needs.

  • Early in development, we considered re-running the full content creation flow per segment, but found that LLMs excel at tailoring existing content to specific audiences, allowing us to optimize efficiency without sacrificing quality.
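
As a simple illustration, the adaptation step can be framed as a prompt over existing content rather than a fresh generation; the segment fields below are hypothetical.

```python
def build_adaptation_prompt(content: str, segment: dict) -> str:
    """Hypothetical prompt construction for the segment-adaptation step.
    The content already exists; we ask the model to adjust language, tone,
    and emphasis rather than recreate the material from scratch."""
    return (
        "Adapt the recruitment material below for the audience described. "
        "Adjust language, tone, and emphasis, but do not change factual or "
        "regulatory content.\n\n"
        f"Audience: {segment['name']}. {segment['description']}\n"
        f"Reading level: {segment.get('reading_level', '8th grade')}\n\n"
        f"Material:\n{content}"
    )
```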

Despite the complexity of this workflow—potentially involving hundreds of individual tasks across multiple segments—our system is designed to:

  • Run tasks concurrently where possible.

  • Trigger sub-tasks immediately once dependencies are met.

As a result, the entire process, which typically takes many pharma teams multiple months, can be completed in under 15 minutes.
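
The dependency-triggered scheduling can be sketched in a few lines of asyncio; the graph structure below is illustrative, not Muse's actual scheduler.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Each task is a coroutine that receives the results accumulated so far.
TaskFn = Callable[[dict[str, object]], Awaitable[object]]

async def run_graph(tasks: dict[str, tuple[list[str], TaskFn]]) -> dict[str, object]:
    """Run every task as soon as its dependencies finish.

    tasks maps a task name to (list of dependency names, coroutine fn),
    e.g. {"segments": (["attributes"], run_segments), ...}.
    """
    results: dict[str, object] = {}
    done = {name: asyncio.Event() for name in tasks}

    async def run(name: str) -> None:
        deps, fn = tasks[name]
        # Wait only on this task's own dependencies, so independent
        # branches of the graph execute concurrently.
        await asyncio.gather(*(done[d].wait() for d in deps))
        results[name] = await fn(results)
        done[name].set()

    await asyncio.gather(*(run(name) for name in tasks))
    return results
```

In Muse's terms, strategic assistants would be the root tasks, content assistants would depend on their outputs, and validators on the content they review.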

Conclusion

Muse represents a significant leap forward in leveraging GenAI for patient recruitment in clinical trials. By integrating OpenAI’s Assistants API, dynamic workflows, and strategic automation, we have created a system that not only accelerates content generation but also enhances compliance, adaptability, and precision. The modular and loosely coupled architecture allows for continuous evolution, ensuring that Muse remains a powerful and scalable tool for recruitment teams.

Looking ahead, we see opportunities to refine our AI models further, incorporate real-time learning from recruitment outcomes, and expand Muse’s capabilities to cover even more aspects of the trial lifecycle. As the industry continues to embrace AI-driven efficiencies, tools like Muse will play a pivotal role in reshaping how clinical trials recruit and engage diverse patient populations.

In Part 2 of this blog series we’ll take a deeper look at the execution layer that powers Muse’s assistants—focusing on how the Threads API ensures contextual continuity, how assistants orchestrate multi-step workflows, and how users interact with and refine AI-generated content.

Interested in joining our team to work on AI for clinical development? Check out our open roles here.
