From Natural Language to Graph API: Letting Non-Scripters Run Tier-3 M365 Admin Safely

Most government IT teams have the same bottleneck, and it is not technology. It is that one or two people know the PowerShell and the Graph API calls for tier-three M365 administration, and everyone else opens a ticket and waits. Assign a license, fix a mailbox permission, provision a shared mailbox, clean up a distribution group. Routine work, gated behind the two humans who can do it safely. The queue grows, the senior engineers burn out doing junior tasks, and the obvious fix, just give more people access, is the one thing you cannot responsibly do.

There is a better answer: a Copilot Studio agent, backed by Graph API and PowerShell automation, that lets a non-scripter run bounded admin tasks without ever holding the keys. This is a teardown of how that agent is built, generalized from patterns I have deployed in production GCC environments. No client specifics. Just the architecture.

The Wrong Way to Solve This

Before the right way, kill the tempting wrong ones. Do not hand the help desk a PowerShell console and a privileged account. Do not build a script that runs under a service account with global admin and let an agent fire it on any input. Both of those trade a workload problem for a security incident, and in a regulated environment that trade ends careers.

The goal is narrower and harder. A user describes what they want in plain language. The system performs only a specific, pre-approved operation, only with validated parameters, under an identity that the user never touches, and logs the whole thing for audit. Natural language goes in. A bounded, governed action comes out. Nothing in between is improvised.

Layer One: Intent, Not Execution

The front door is a Copilot Studio agent. A staffer types something like “give the new finance hire a G3 license and add her to the AP shared mailbox.” The agent’s job at this layer is interpretation only. It uses generative orchestration to map that sentence to one or more known operations and to extract the parameters: which user, which license SKU, which mailbox, what access level.

The critical design rule is that the language model never executes anything and never generates the command that runs. It produces a structured intent, a small validated object that says “operation: assign-license, target: user@agency.gov, sku: SPE_G3.” That object is the only thing that crosses into the next layer. The model is a translator, not an operator. If a prompt tries to talk the agent into something outside its known operations, there is nothing on the other side to receive it.
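A minimal sketch of what that structured intent could look like, assuming the model's output arrives as a plain dictionary. The names `Intent`, `parse_intent`, and `KNOWN_OPERATIONS` are hypothetical, and the operation list is illustrative rather than the set used in any real build.

```python
import re
from dataclasses import dataclass

# Hypothetical operation names; a real list would mirror the tooling layer.
KNOWN_OPERATIONS = frozenset(
    {"assign-license", "add-group-member", "grant-mailbox-permission"}
)

# Deliberately simple address check, for illustration only.
_UPN_PATTERN = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


@dataclass(frozen=True)
class Intent:
    """The only object allowed to cross from the model to the tooling layer."""
    operation: str
    target: str
    params: tuple  # sorted (name, value) pairs, all strings


def parse_intent(raw: dict) -> Intent:
    """Validate model output and return an Intent, or raise ValueError."""
    extra = set(raw) - {"operation", "target", "params"}
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")

    operation = raw.get("operation")
    if not isinstance(operation, str) or operation not in KNOWN_OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")

    target = raw.get("target")
    if not isinstance(target, str) or not _UPN_PATTERN.match(target):
        raise ValueError(f"target is not a valid address: {target!r}")

    params = raw.get("params", {})
    if not isinstance(params, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in params.items()
    ):
        raise ValueError("params must map string names to string values")

    return Intent(operation, target, tuple(sorted(params.items())))


# Usage: the model's structured output becomes a frozen, validated object.
intent = parse_intent(
    {"operation": "assign-license", "target": "user@agency.gov",
     "params": {"sku": "SPE_G3"}}
)
```

Anything the model emits that does not fit this shape is rejected before the next layer ever sees it, which is the point of keeping the model a translator.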

Layer Two: The Allowlist Is the Whole Security Model

Behind the agent sits a tooling layer, in my builds a custom MCP server, that exposes a fixed set of vetted operations and nothing else. Each operation is a parameterized function written and reviewed by an engineer: assign a license, add a member to a specific class of group, grant a named mailbox permission. There is no generic “run this PowerShell” endpoint. There is no path to an arbitrary Graph call. If an operation is not on the list, it cannot happen, no matter how the request is phrased.

This is what makes the thing safe enough to put in front of non-scripters. The blast radius is defined in code, not by trusting the user or the model. Every function validates its inputs before touching anything: confirm the target exists, confirm the SKU is one this operation is allowed to assign, confirm the group is in the permitted set rather than, say, a privileged security group. A request to add someone to a role-assignable group simply is not a function the layer offers.
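The allowlist idea can be sketched as a registry of reviewed functions plus a dispatcher that refuses anything unregistered. This is an illustration under stated assumptions, not the MCP server itself: the `allowlisted` decorator, `dispatch`, the SKU set, and the group names are all hypothetical, and the Graph calls and the "target exists" check are left as comments.

```python
from typing import Callable, Dict

# Hypothetical policy data; real values would be reviewed and version-controlled.
ASSIGNABLE_SKUS = frozenset({"SPE_G3"})
PERMITTED_GROUPS = frozenset({"ap-shared-mailbox-users", "finance-staff"})

# The registry is the allowlist: only functions registered here can run.
OPERATIONS: Dict[str, Callable[..., dict]] = {}


def allowlisted(name: str):
    """Register a reviewed function under a fixed operation name."""
    def register(func: Callable[..., dict]) -> Callable[..., dict]:
        OPERATIONS[name] = func
        return func
    return register


@allowlisted("assign-license")
def assign_license(target: str, sku: str) -> dict:
    if sku not in ASSIGNABLE_SKUS:
        raise PermissionError(f"SKU not assignable by this operation: {sku!r}")
    # A real implementation would confirm the user exists, then call Graph.
    return {"operation": "assign-license", "target": target, "sku": sku}


@allowlisted("add-group-member")
def add_group_member(target: str, group: str) -> dict:
    if group not in PERMITTED_GROUPS:
        raise PermissionError(f"group not in the permitted set: {group!r}")
    # A real implementation would confirm the user exists, then call Graph.
    return {"operation": "add-group-member", "target": target, "group": group}


def dispatch(operation: str, **params: str) -> dict:
    """Run an allowlisted operation; there is no fallback for anything else."""
    handler = OPERATIONS.get(operation)
    if handler is None:
        raise PermissionError(f"operation not on allowlist: {operation!r}")
    return handler(**params)


# Usage: a known operation with permitted parameters goes through.
result = dispatch("assign-license", target="user@agency.gov", sku="SPE_G3")
```

Note what is absent: no function accepts a script, a URL, or a raw Graph path, so there is nothing for a cleverly worded request to reach.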

Layer Three: Identity Separation

The user requesting the action and the identity performing it are not the same, and that separation is deliberate. The operations run under a managed identity or an app registration with narrowly scoped Graph permissions, governed by conditional access. The requester’s own token is never elevated. A help desk analyst can trigger a license assignment without ever being granted license-management rights on their own account.

This is least privilege applied to people and to the agent at the same time. The service identity holds only the specific Graph scopes the allowlisted operations require, and not one permission more. If the agent’s job is licensing and group membership, it has no business reading mailbox content, so it cannot. You scope the identity to the operations, then you scope the operations to the mission.
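One way to keep that scoping honest is to derive the required permissions from the allowlist and flag anything the service identity holds beyond them. The sketch below assumes a hand-maintained mapping; `REQUIRED_SCOPES`, `excess_scopes`, and `missing_scopes` are hypothetical names, and the permission strings are illustrative choices that should be checked against the Graph permissions reference for each call.

```python
from typing import Iterable, Set

# Hypothetical mapping from allowlisted operation to the Graph application
# permission it needs. Verify each entry against the Graph documentation.
REQUIRED_SCOPES = {
    "assign-license": {"User.ReadWrite.All"},
    "add-group-member": {"GroupMember.ReadWrite.All"},
}


def needed_scopes(operations: Iterable[str]) -> Set[str]:
    """Union of the permissions required by the given operations."""
    needed: Set[str] = set()
    for op in operations:
        needed |= REQUIRED_SCOPES[op]
    return needed


def excess_scopes(granted: Iterable[str], operations: Iterable[str]) -> Set[str]:
    """Permissions the identity holds that no allowlisted operation needs."""
    return set(granted) - needed_scopes(operations)


def missing_scopes(granted: Iterable[str], operations: Iterable[str]) -> Set[str]:
    """Permissions an allowlisted operation needs that were never granted."""
    return needed_scopes(operations) - set(granted)


# Usage: an identity that can also read mail is over-scoped for this agent.
granted = {"User.ReadWrite.All", "GroupMember.ReadWrite.All", "Mail.Read"}
ops = ["assign-license", "add-group-member"]
extra = excess_scopes(granted, ops)
```

Run as a deployment check, a non-empty `excess_scopes` result is a reason to fail the build rather than a warning to read later.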

Layer Four: Confirmation and the Human in the Loop

Not every action should fire the instant it is understood. Reversible, low-risk operations can execute directly. Anything consequential (removing access, touching anything that resembles a security boundary, bulk changes) gets a confirmation step in which the agent restates exactly what it is about to do and waits for a human to approve it.

The point is not friction for its own sake. It is that the moment of approval is also a moment of logging, and it puts a named person on the decision. For the genuinely irreversible category, deletions and the like, the agent does not act at all. It hands the request to a senior administrator with the context already assembled. The automation removes the routine toil; it does not pretend to own judgment that belongs to a person.
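The three tiers described above can be sketched as a small routing function. The risk classification, the `BULK_THRESHOLD` value, and the operation names are assumptions made for illustration; where the lines fall in a real tenant is a policy decision, not something this code settles.

```python
from enum import Enum
from typing import Optional


class Risk(Enum):
    LOW = "low"                      # reversible, routine
    CONSEQUENTIAL = "consequential"  # needs a named approver
    IRREVERSIBLE = "irreversible"    # never automated


# Hypothetical classification of request types.
OPERATION_RISK = {
    "assign-license": Risk.LOW,
    "remove-mailbox-permission": Risk.CONSEQUENTIAL,
    "delete-shared-mailbox": Risk.IRREVERSIBLE,
}

BULK_THRESHOLD = 5  # assumption: more targets than this counts as a bulk change


def next_step(operation: str, target_count: int = 1,
              approved_by: Optional[str] = None) -> str:
    """Return 'execute', 'await-confirmation', or 'escalate'."""
    # Unclassified requests default to the most restrictive tier.
    risk = OPERATION_RISK.get(operation, Risk.IRREVERSIBLE)
    if risk is Risk.IRREVERSIBLE:
        # Approval does not unlock this tier; a senior admin takes over.
        return "escalate"
    if risk is Risk.CONSEQUENTIAL or target_count > BULK_THRESHOLD:
        return "execute" if approved_by else "await-confirmation"
    return "execute"


# Usage: the same low-risk operation needs approval once it becomes bulk.
single = next_step("assign-license")
bulk = next_step("assign-license", target_count=50)
```

The `approved_by` value is also what the audit record captures, so the approval and the logging happen at the same moment.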

Layer Five: Audit-Ready by Construction

Every operation writes a complete record: who requested it, what natural-language input they gave, the structured intent it resolved to, the exact parameters, the timestamp, and the result. That record flows into the tenant’s audit logging and Purview, where it lives with the rest of your compliance evidence.
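As a rough illustration of that record, here is a sketch that captures the listed fields and serializes them to one JSON line. The `AuditRecord` class and `to_log_line` helper are hypothetical, and how the line reaches the tenant's audit log or Purview is out of scope here.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> str:
    """Timezone-aware UTC timestamp in ISO 8601 form."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditRecord:
    requester: str                   # who asked
    utterance: str                   # the natural-language input, verbatim
    intent: dict                     # the structured intent it resolved to
    parameters: dict                 # the exact parameters that were used
    result: str                      # outcome of the operation
    approver: Optional[str] = None   # set when a confirmation step applied
    timestamp: str = field(default_factory=_utc_now)


def to_log_line(record: AuditRecord) -> str:
    """Serialize to a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


# Usage: one record per operation, written whether it succeeded or failed.
record = AuditRecord(
    requester="analyst@agency.gov",
    utterance="give the new finance hire a G3 license",
    intent={"operation": "assign-license", "target": "user@agency.gov"},
    parameters={"sku": "SPE_G3"},
    result="success",
)
line = to_log_line(record)
```

Keeping the original utterance next to the resolved intent is what lets a reviewer check the translation as well as the action.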

This matters more in government than the convenience does. When an assessor asks how a permission was granted, the answer is not “someone ran a script we cannot reconstruct.” It is a queryable trail that ties an action to a requester, an approver, and a defined operation. The agent does not just do the work. It produces the proof that the work was done correctly, which is the part that survives an audit.

Why GCC Changes the Build

None of this ports cleanly from a commercial tenant. Graph endpoints, app registration consent, and conditional access behave differently inside Government Community Cloud (GCC); connector availability is narrower; and assumptions carried over from the commercial docs will quietly fail. The tooling layer has to be built against the government cloud from the start, not adapted to it after a commercial prototype works on someone’s laptop. That is the difference between an agent that demos and one that runs in your environment.
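One small, concrete piece of building against the government cloud from the start is making the cloud environment explicit configuration instead of a hardcoded host. The sketch below is an assumption-laden illustration: `CloudEndpoints`, `CLOUDS`, and `graph_url` are hypothetical names, and the host values reflect Microsoft's published national-cloud endpoints as commonly documented (GCC sharing the worldwide hosts, GCC High and DoD using `.us` hosts). Verify them against current Microsoft documentation for your tenant before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudEndpoints:
    graph_base: str       # root URL for Microsoft Graph
    login_authority: str  # root URL for token acquisition


# Assumed values; confirm against Microsoft's national cloud documentation.
_WORLDWIDE = CloudEndpoints(
    "https://graph.microsoft.com", "https://login.microsoftonline.com"
)
CLOUDS = {
    "commercial": _WORLDWIDE,
    "gcc": _WORLDWIDE,
    "gcc-high": CloudEndpoints(
        "https://graph.microsoft.us", "https://login.microsoftonline.us"
    ),
    "dod": CloudEndpoints(
        "https://dod-graph.microsoft.us", "https://login.microsoftonline.us"
    ),
}


def graph_url(cloud: str, path: str) -> str:
    """Build a v1.0 Graph URL for the named cloud; unknown clouds raise KeyError."""
    endpoints = CLOUDS[cloud]
    return f"{endpoints.graph_base}/v1.0/{path.lstrip('/')}"


# Usage: the same operation code targets whichever cloud the tenant lives in.
url = graph_url("gcc-high", "/users/user@agency.gov/assignLicense")
```

Endpoints are only the most visible difference. Consent, conditional access, and connector availability still need to be tested in the target tenant itself.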

Built this way, the payoff is real. Tier-one staff safely handle work that used to bottleneck on your senior engineers, the ticket queue shrinks, and the people who actually know the platform get their hours back for the work only they can do. The agent is bounded, governed, and traceable, which is the only version of this that belongs in a regulated tenant.

If your team is the bottleneck and you want an admin agent that is safe by construction rather than safe by hope, that is exactly the kind of workflow I build. The piece on aligning government AI with NIST 800-171 and zero trust covers the governance side in depth. When you are ready, let’s talk.
