Bring Your Own Data to Copilot Studio: Grounding Generative Answers with API + Power Automate (My Go-To Pattern)
A lot of organisations ask the same question:
“What if our knowledge isn’t in SharePoint or Dataverse?”
Realistically, plenty of valuable knowledge lives in:
- internal APIs
- legacy systems
- custom databases
- third-party platforms
- compliance portals and registries
In those cases, my approach is:
retrieve from the system of record → normalise → pass the best content into generative answers.
The goal isn’t to “connect everything”; it’s to give the agent the right data in the right shape so it can reliably produce a grounded response.
The pattern I use
1) Retrieve relevant data (API / system of record)
I typically do this using a workflow (Power Automate) or an HTTP call, depending on the environment and governance.
Key point: do retrieval with a query that’s specific enough to avoid dumping loads of irrelevant data into the model.
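As a rough illustration of what “specific enough” means, here’s a sketch of building a narrow retrieval URL: push the filtering and the result cap to the API instead of pulling a whole table back. The endpoint and parameter names (`search`, `top`, `select`) are hypothetical, not a real connector’s contract.

```python
from urllib.parse import urlencode

def build_retrieval_url(base_url: str, question_terms: list[str], top: int = 5) -> str:
    """Build a narrow query: search terms plus a server-side result cap."""
    params = {
        "search": " ".join(question_terms),  # free-text search on the system of record
        "top": top,                          # cap results at the source, not in the flow
        "select": "title,body,url",          # only the fields the chunks will need
    }
    return f"{base_url}?{urlencode(params)}"

url = build_retrieval_url("https://example.internal/api/knowledge", ["control", "A-12"], top=5)
print(url)
```

The same idea applies in a Power Automate HTTP action: put the filter in the query string rather than filtering 200 records client-side afterwards.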
2) Normalise into “answer chunks”
This is the part people skip — and it’s why their agents hallucinate.
I reshape API responses into a small set of readable chunks, each chunk containing:
- the content (human readable)
- a source reference (URL/record link if possible)
- a title/label (optional but helpful)
Think of it like building a mini knowledge pack per question.
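A minimal sketch of that normalisation step, assuming a hypothetical API record shape (`name`, `description`, `link`), reshaped into the content / source / title chunks described above:

```python
def to_chunks(api_records: list[dict]) -> list[dict]:
    """Reshape raw API records into small, human-readable answer chunks."""
    chunks = []
    for rec in api_records:
        title = rec.get("name", "")
        body = rec.get("description", "")
        chunks.append({
            "title": title,                                  # optional but helpful label
            "content": f"{title}: {body}" if title else body,  # readable text, not JSON
            "source": rec.get("link", ""),                   # URL/record link if available
        })
    return chunks

records = [{
    "name": "Control A-12",
    "description": "Requires access-review evidence.",
    "link": "https://example/controls/a-12",
}]
chunks = to_chunks(records)
print(chunks)
```

In a real flow this would be a Select/Compose step rather than Python, but the shape of the output is the point.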
3) Keep it short and ranked
Even if your API returns 200 records, the agent doesn’t need 200.
I rank the results (most relevant first) and pass only the top content chunks. This keeps the model focused and improves answer quality.
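The ranking step can be sketched with a deliberately crude term-overlap score; in a real build the API’s own relevance ranking or a search index would do this better, so treat the scoring here as a placeholder:

```python
def top_chunks(question: str, chunks: list[dict], limit: int = 3) -> list[dict]:
    """Keep only the chunks most relevant to the question (crude overlap score)."""
    q_terms = set(question.lower().split())

    def score(chunk: dict) -> int:
        # Count shared words between question and chunk content.
        return len(q_terms & set(chunk["content"].lower().split()))

    return sorted(chunks, key=score, reverse=True)[:limit]

candidates = [
    {"content": "Control A-12 requires access review evidence"},
    {"content": "Office parking policy"},
]
ranked = top_chunks("what evidence for control A-12", candidates, limit=1)
print(ranked)
```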
4) Feed it into the generative answer step
Once you have clean chunks, Copilot Studio can summarise and answer in a grounded way.
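Before handing chunks to the generative answers step, I flatten them into one compact, labelled text block so the model can cite sources. The function and labelling scheme below are my own illustration, not a Copilot Studio API:

```python
def format_for_generative_answers(chunks: list[dict]) -> str:
    """Join chunks into a compact, numbered text block the model can cite."""
    parts = []
    for i, ch in enumerate(chunks, start=1):
        parts.append(
            f"[{i}] {ch.get('title', '')}\n"
            f"{ch['content']}\n"
            f"Source: {ch.get('source', '')}"
        )
    return "\n\n".join(parts)

context = format_for_generative_answers([{
    "title": "Control A-12",
    "content": "Requires access-review evidence.",
    "source": "https://example/controls/a-12",
}])
print(context)
```

That single string is what gets passed into the generative answer node as its grounding data.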
Why this works (and why “raw JSON” doesn’t)
If you give the agent raw JSON or a massive blob of unstructured output:
- it wastes tokens parsing noise
- it misses key fields
- the response becomes inconsistent
When you provide clean chunks:
- the model can reason over content instead of structure
- responses are shorter and more accurate
- citations/source references become meaningful
- the same question tends to produce consistent answers
Example: Compliance assistant agent
This pattern works brilliantly for compliance or policy portals.
User: “What evidence do we need for Control A-12?”
Agent:
- calls your compliance API (control + evidence requirements)
- returns 3–5 chunks (requirements + examples + policy link)
- generates a clear checklist response grounded in those chunks
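The deterministic half of that flow (retrieve, then shape into checklist-ready chunks) can be sketched with a mocked API response; every field name and evidence item here is illustrative, and the generative step would then phrase the final answer from this grounded content:

```python
def build_checklist(control_id: str, api_response: dict) -> str:
    """Turn a mocked compliance API response into checklist-ready grounding text."""
    lines = [f"Evidence checklist for {control_id}:"]
    for item in api_response.get("evidence", []):
        lines.append(f"- {item}")
    lines.append(f"Policy: {api_response.get('policy_url', '')}")
    return "\n".join(lines)

mock_response = {
    "evidence": ["Access review export", "Sign-off record"],
    "policy_url": "https://example/policies/a-12",
}
checklist = build_checklist("A-12", mock_response)
print(checklist)
```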
Practical tips (from real builds)
- If your API is slow, add caching for common queries
- If your data is sensitive, enforce permission filtering at retrieval time
- Always log the retrieved records so you can audit and improve
- Create reusable “chunking” functions in your workflow so chunking stays consistent across topics
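The caching tip, as a language-agnostic sketch: memoise common queries with a TTL so a slow API isn’t hit for every repeated question. The fetch function here is a stand-in for whatever your flow or connector actually calls.

```python
import time

_cache: dict = {}  # query -> (timestamp, result)

def cached_fetch(query: str, fetch_fn, ttl_seconds: float = 300):
    """Serve repeated queries from cache while the entry is still fresh."""
    now = time.monotonic()
    hit = _cache.get(query)
    if hit and now - hit[0] < ttl_seconds:
        return hit[1]                # fresh enough: skip the slow API call
    result = fetch_fn(query)         # the slow call happens only on a miss
    _cache[query] = (now, result)
    return result

calls = []
def fake_fetch(q):
    calls.append(q)                  # track how many real calls were made
    return f"result for {q}"

cached_fetch("control A-12", fake_fetch)
cached_fetch("control A-12", fake_fetch)   # second call served from cache
print(len(calls))
```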
Closing
Copilot Studio gets seriously powerful when you stop limiting yourself to “where the docs live” and instead treat your enterprise APIs as first-class knowledge sources — but only if you shape the data properly.
