Across all services
How we build
A deliberate hybrid. Probabilistic where unstructured language and judgement live; deterministic where defensibility lives. Loop is the orchestration layer between them, calling each existing system through its own API rather than building bespoke integration pipelines.
The cases where AI fails a charity are rarely cases where the model got an answer wrong. They're cases where the wrong shape of work was given to the model, where there was no audit of what it did, or where the system couldn't talk to the data it needed to.
The three sections below describe the architectural choices we make on every Loop deployment. They're the substrate underneath all five sector offerings, and they're why a Loop behaves differently from a chatbot built on the same models.
Agentic AI
A chatbot answers questions. An agent does work.
Loop deployments are agentic in the proper sense - the system has access to a defined set of tools (read this CRM record, look up this finance row, draft this comms message, write to this audit log, route to this reviewer), can plan a sequence of tool calls to complete a task, and can decide when it's done or when it needs to escalate.
That's a different engineering problem from prompting a chat model. The model still does the language work, but the architecture around it - the tool definitions, the schemas the tool calls have to return, the audit trail of what was called and what came back, the gates that decide whether the agent can act on its own or needs a human - is the bit that makes the system useful in production.
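As a rough illustration of that surrounding architecture, here is a minimal sketch of a tool definition with an output check, a human-approval gate, and an audit entry for every call. The names (`Tool`, `call_tool`, `EscalationRequired`) and the stubbed tool bodies are assumptions made for this example, not Loop's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class EscalationRequired(Exception):
    """Raised when a gated tool is called without human approval."""


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[dict], dict]
    required_output_keys: frozenset  # minimal stand-in for an output schema
    needs_human: bool                # gate: may the agent act on its own?


def call_tool(tool: Tool, args: dict, audit_log: list,
              approved_by: Optional[str] = None) -> dict:
    """Run one tool call, enforcing the gate and output shape, logging every outcome."""
    entry = {"tool": tool.name, "args": args, "approved_by": approved_by}
    if tool.needs_human and approved_by is None:
        audit_log.append({**entry, "outcome": "escalated"})
        raise EscalationRequired(tool.name)
    result = tool.run(args)
    missing = tool.required_output_keys - set(result)
    if missing:
        audit_log.append({**entry, "outcome": "schema_error"})
        raise ValueError(f"{tool.name} omitted {sorted(missing)}")
    audit_log.append({**entry, "outcome": "ok", "result": result})
    return result


# Hypothetical tools; the lambdas stand in for real API calls.
read_crm_record = Tool(
    name="read_crm_record",
    run=lambda args: {"id": args["id"], "name": "Example Supporter"},
    required_output_keys=frozenset({"id", "name"}),
    needs_human=False,
)
send_email = Tool(
    name="send_email",
    run=lambda args: {"sent": True},
    required_output_keys=frozenset({"sent"}),
    needs_human=True,
)
```

In this sketch the agent can read a CRM record unattended, while an attempt to send an email without a named approver is logged and raised as an escalation rather than executed.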
The agentic layer is also what makes the work defensible. Every tool call is logged. Every decision the agent made is attributable. When a trustee asks "how did the system decide to send this email", the answer is a query against the agent's own log, not a guess about what the prompt produced.
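To make "a query against the agent's own log" concrete, here is a small sketch using Python's built-in `sqlite3` module. The table layout, column names, and sample rows are all hypothetical; a real deployment's log schema would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tool_calls (
        id INTEGER PRIMARY KEY,
        task_id TEXT NOT NULL,
        step INTEGER NOT NULL,
        tool TEXT NOT NULL,
        args TEXT NOT NULL,        -- JSON-encoded arguments
        decided_by TEXT NOT NULL   -- 'agent' or a named reviewer
    )
    """
)

# Illustrative rows only: two tasks, one of which ends in a sent email.
conn.executemany(
    "INSERT INTO tool_calls (task_id, step, tool, args, decided_by) VALUES (?, ?, ?, ?, ?)",
    [
        ("task-17", 1, "read_crm_record", '{"id": "42"}', "agent"),
        ("task-17", 2, "draft_email", '{"template": "thank_you"}', "agent"),
        ("task-17", 3, "send_email", '{"to": "supporter-42"}', "reviewer:jo"),
        ("task-18", 1, "read_crm_record", '{"id": "43"}', "agent"),
    ],
)


def explain_task(conn: sqlite3.Connection, task_id: str) -> list:
    """Return the ordered tool calls behind one task, and who decided each step."""
    cursor = conn.execute(
        "SELECT step, tool, decided_by FROM tool_calls WHERE task_id = ? ORDER BY step",
        (task_id,),
    )
    return cursor.fetchall()
```

Calling `explain_task(conn, "task-17")` returns the three steps in order, including the fact that the send was approved by a reviewer rather than taken by the agent alone.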
Where this earns its keep: any operational job with more than one step. Triage and routing. Multi-stage extraction. Drafting that requires looking things up. Anything where the work has a shape - a flow with branches and checks - rather than a single prompt-and-response.
Hybrid (deterministic + probabilistic)
The most consequential architectural choice on every Loop deployment is the line between what the model does and what code does.
The model is good at:
- Reading unstructured language and turning it into structured data
- Drafting prose that sounds like a person wrote it
- Holding a conversation with appropriate tone and pacing
- Classifying ambiguous cases where the categories don't have hard edges
- Summarising long documents while preserving what matters
The model is bad at:
- Being the source of truth for things that have to be defensible
- Following a hard rule that must never bend
- Producing the same output for the same input
- Knowing what it doesn't know
- Computing anything more complicated than a simple count
Loop deployments use the model for the first list and keep the second list in deterministic code. We don't pretend the model is reliable for things it isn't reliable for.
Two concrete examples of where this line gets drawn:
In the Learning Lab clinical simulation, the patient's medical history, deterioration curve, vital signs, and valid emergency treatments live in JSON files that clinical experts can read and sign off on. The bounded LLM is the patient's voice - anxious, in pain, describing symptoms in their own language. Anyone who knows the medicine can audit the JSON. They don't need to read prompts. Keep the medicine in JSON. Let the model be the patient.
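A minimal sketch of that split, assuming a scenario file shaped roughly like the one below. Every field name and value here is a placeholder invented for illustration, not clinical content and not the Learning Lab's real format. The vital signs and treatment checks come from the JSON through plain code; the model would only receive a prompt asking it to voice the symptoms.

```python
import json

# Placeholder scenario. In practice this would be a file clinical experts sign off on.
SCENARIO = json.loads("""
{
  "history": ["placeholder condition"],
  "deterioration_curve": [
    {"minute": 0,  "heart_rate": 90,  "symptoms": ["feels a bit breathless"]},
    {"minute": 5,  "heart_rate": 110, "symptoms": ["chest feels tight"]},
    {"minute": 10, "heart_rate": 130, "symptoms": ["struggling to speak"]}
  ],
  "valid_treatments": ["treatment_a", "treatment_b"]
}
""")


def vitals_at(scenario: dict, minute: int) -> dict:
    """Deterministic lookup: the latest curve point at or before `minute`."""
    points = sorted(scenario["deterioration_curve"], key=lambda p: p["minute"])
    current = points[0]
    for point in points:
        if point["minute"] <= minute:
            current = point
    return current


def is_valid_treatment(scenario: dict, treatment: str) -> bool:
    """The JSON, not the model, decides which treatments count."""
    return treatment in scenario["valid_treatments"]


def patient_prompt(scenario: dict, minute: int) -> str:
    """Build the bounded instruction a language model would be given."""
    symptoms = "; ".join(vitals_at(scenario, minute)["symptoms"])
    return (
        "You are role-playing a patient. Describe only these symptoms, "
        f"in your own words, without stating any numbers: {symptoms}"
    )
```

The model never sees or decides the heart rate or the treatment list, so a reviewer can check the medicine by reading the JSON alone.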
In the Breast Cancer Now AIDA build, the schema each form is extracted against is generated and stored as structured data - every question has a defined type, every multiple-choice has a defined enum, every matrix has defined rows and columns. The LLM does the perception work (read this scan, fill these fields). Validation happens against the schema. Semantic equivalence across question variants is computed deterministically against the canonical schema. The model reads. The schema validates.
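The "model reads, schema validates" split might look something like the sketch below. The schema layout, field names, and `validate` function are assumptions for illustration, not the AIDA build's actual schema; the point is only that the check on the model's output is ordinary deterministic code.

```python
# Hypothetical form schema: each question has a defined type and allowed values.
SCHEMA = {
    "q1_age_band": {"type": "enum", "options": ["18-34", "35-54", "55+"]},
    "q2_visits": {"type": "integer", "min": 0},
    "q3_support": {
        "type": "matrix",
        "rows": ["nurse", "helpline"],
        "columns": ["poor", "ok", "good"],
    },
}


def validate(schema: dict, extracted: dict) -> list:
    """Return a list of human-readable errors; an empty list means the extraction passes."""
    errors = []
    for field, spec in schema.items():
        if field not in extracted:
            errors.append(f"{field}: missing")
            continue
        value = extracted[field]
        if spec["type"] == "enum":
            if value not in spec["options"]:
                errors.append(f"{field}: {value!r} is not an allowed option")
        elif spec["type"] == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not isinstance(value, int) or isinstance(value, bool) or value < spec["min"]:
                errors.append(f"{field}: expected an integer >= {spec['min']}")
        elif spec["type"] == "matrix":
            if not isinstance(value, dict) or set(value) != set(spec["rows"]):
                errors.append(f"{field}: rows do not match the schema")
            else:
                for row, column in value.items():
                    if column not in spec["columns"]:
                        errors.append(f"{field}.{row}: {column!r} is not a valid column")
    for field in extracted:
        if field not in schema:
            errors.append(f"{field}: not in schema")
    return errors
```

An extraction that fails validation can then be routed to a human reviewer instead of being written to the record.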
This is sometimes called neurosymbolic AI - neural network for perception and language, symbolic logic for rules and defensibility. We just call it "the right tool for the right job".
The line moves per Loop. A donor comms draft has more language work, less rule enforcement - more model, less code. A safeguarding triage has less language work, more rule enforcement - less model, more code. Designing the line for each Loop is one of the things a Sense Map produces.
Integrations as glue
Custom integrations are where most charity technology budgets quietly disappear. A bespoke pipeline between your CRM and your finance system costs £30-60k to build, breaks when one of the vendors changes an API, and adds a moving part that someone has to maintain forever. Charities own dozens of these.
Loop removes most of that work, and the reason is architectural rather than commercial.
External systems Loop talks to are exposed as tools the agent calls when it needs them. In simpler cases - read this CRM record, draft to this comms platform, look up this finance row - the call happens at runtime against the live API, no intermediate store. In more demanding cases - survey platforms with rate-limited APIs, datasets too big to fetch in one go, sources that need to be pulled on a schedule - a thin worker tier handles the fetching, keeps a small cache or projection, and exposes the same tool shape to the agent. That's not nothing. But it's a fraction of the bespoke ETL charity tech projects usually involve, and the adapters are reusable: the QuestionPro adapter we built once is the same one the next charity gets, not a from-scratch project.
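One way to picture "the same tool shape" is the sketch below: a live lookup and a cached lookup that the agent calls identically. `live_tool`, `CachedTool`, and the fetch functions are hypothetical names for this example. For simplicity the cache refreshes lazily when it goes stale; a real worker tier would more likely call `refresh` on a schedule.

```python
import time
from typing import Callable


def live_tool(fetch_one: Callable[[str], dict]) -> Callable[[str], dict]:
    """Simple case: every call goes straight to the live API, no intermediate store."""
    def tool(record_id: str) -> dict:
        return fetch_one(record_id)
    return tool


class CachedTool:
    """Demanding case: bulk-fetch from a slow or rate-limited source and keep a small cache."""

    def __init__(self, fetch_all: Callable[[], dict], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_all = fetch_all
        self._max_age = max_age_seconds
        self._clock = clock
        self._cache = {}
        self._fetched_at = None

    def refresh(self) -> None:
        """Pull everything once; this is the call a scheduled worker would make."""
        self._cache = dict(self._fetch_all())
        self._fetched_at = self._clock()

    def __call__(self, record_id: str) -> dict:
        stale = (self._fetched_at is None
                 or self._clock() - self._fetched_at > self._max_age)
        if stale:
            self.refresh()
        return self._cache[record_id]
```

Both objects are called as `tool(record_id)` and return a dict, so the agent's logic does not need to know which kind of source sits behind a given tool.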
What that buys:
- Your CRM stays the source of truth. Loop reads from it; Loop writes to it. The CRM doesn't lose ownership of the data, and the AI capability doesn't get bolted on as a sidecar.
- Adding a new system later is a tool definition plus, sometimes, a small adapter - not a months-long integration project. When you adopt a new survey platform, we write a tool wrapper. If it needs scheduled fetching, the worker tier handles that too. Days to weeks, not months.
- When an API changes, only the adapter changes. Loop's logic about what to do with the data is decoupled from how that data is fetched.
- A vendor changing its pricing or deprecating features doesn't trap you the way it does with bespoke pipelines. If you switch CRM in two years, you rewrite the CRM adapter. The Loops on top still work.
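The decoupling described in the last two points can be sketched as an adapter interface. `CrmAdapter`, both adapter classes, and the raw field names are invented for illustration and do not correspond to any real CRM's API; each adapter maps its vendor's record layout onto one shared shape, and the Loop logic depends only on that shape.

```python
from typing import Protocol


class CrmAdapter(Protocol):
    """The only thing Loop logic knows about a CRM in this sketch."""
    def get_supporter(self, supporter_id: str) -> dict: ...


class OldCrmAdapter:
    """Hypothetical adapter for a CRM that stores a single full-name field."""
    def __init__(self, records: dict):
        self._records = records

    def get_supporter(self, supporter_id: str) -> dict:
        raw = self._records[supporter_id]
        return {"id": supporter_id, "name": raw["FullName"], "email": raw["Email"]}


class NewCrmAdapter:
    """Hypothetical adapter for a replacement CRM with a different record layout."""
    def __init__(self, records: dict):
        self._records = records

    def get_supporter(self, supporter_id: str) -> dict:
        raw = self._records[supporter_id]
        return {
            "id": supporter_id,
            "name": f"{raw['first']} {raw['last']}",
            "email": raw["contact"]["email"],
        }


def thank_you_greeting(crm: CrmAdapter, supporter_id: str) -> str:
    """Loop-side logic: unchanged whichever adapter is plugged in."""
    supporter = crm.get_supporter(supporter_id)
    return f"Dear {supporter['name']},"
```

Switching CRM in this sketch means writing `NewCrmAdapter`; `thank_you_greeting` is untouched.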
There is a practical limit: the external system needs to have an API. Most modern charity-sector tools do (Salesforce, Beacon, Donorfy, Charitylog, QuestionPro, Mailchimp, Dotdigital, Xero, Sage, Microsoft 365, Google Workspace, the major MEAL platforms). For the ones that don't, we either build a small ingestion layer or - more commonly - recommend you replace the system. A tool without an API in 2026 is a deeper problem than any single Loop can solve.
The wider point is one we keep coming back to with charity tech directors: building an AI system on top of an existing data stack is much cheaper than building an integrated data stack and then putting AI on top of it. We do the integration work as part of the Loop, not as a separate project that delays the value.
Why this matters together
Each of these choices is defensible on its own. Together they're the shape of what a Loop deployment is.
The agentic layer is what makes the system do work rather than just answer questions. The hybrid model is what makes the system trustworthy to put in front of trustees and data protection leads. The integration approach is what makes the system affordable without rebuilding your data stack first.
None of these are novel ideas in the wider AI engineering world. What's less common is doing all three deliberately on every project, and explaining the choices honestly to charity boards. That's the part that takes ten years of doing this work.
Sound like the kind of work you'd like back?
A one-week shape-finding engagement is how we start. If you decide to go ahead, that fee comes off the build.