Enterprise AI Governance

Identity Modernisation as the Foundation for AI Success

Every enterprise rushing to deploy Microsoft 365 Copilot is discovering the same uncomfortable truth: the AI is only as good as the identity infrastructure underneath it. THRSP956 at Microsoft Ignite 2025 laid out why identity modernisation is not just a security prerequisite for AI but the foundational layer that determines whether AI deployments deliver value or become an expensive source of wrong answers and data leaks.

Session: THRSP956 | Date: Wednesday, Nov 19, 2025 | Time: 11:00 AM - 11:30 AM PST | Location: Moscone South, The Hub, Theater C

This session reframed identity modernisation from a security initiative to an AI readiness initiative. The argument is compelling and, for many platform engineering teams, uncomfortable: the identity shortcuts and technical debt accumulated over years of "good enough" IAM management now directly limit AI capability.


Identity is not authentication: The AI-era redefinition

For twenty years, identity in enterprise IT has been predominantly about one question: are you who you claim to be, and what may you access? Authentication, authorisation, and access management. The entire identity industry, from on-premises Active Directory to cloud-native Entra ID, has been built around this paradigm.

The session argued that the AI era requires a fundamentally broader conception of identity. When Microsoft 365 Copilot processes a user's request, it does not just check whether the user has permission to access a document. It uses identity information to understand:

Organisational context: Where does this person sit in the organisation? Who are their peers, their reports, their leadership chain? What department, function, and business unit do they belong to? This context shapes how Copilot prioritises information and frames responses.

Relationship mapping: Who does this person collaborate with? What projects involve them? Which teams do they participate in? Relationship data — derived from email, calendar, Teams interactions, and document collaboration — enables Copilot to surface relevant information from collaborators' work that the user might not know exists.

Expertise signals: What does this person know about? What topics do they write about, present on, or get consulted on? Copilot uses identity-linked activity data to build expertise profiles that inform response relevance and information routing.

Information boundaries: What should this person see, and critically, what should they not see? In the AI context, information boundaries are not just about access control. They determine the data that Copilot can reference, summarise, and surface in responses. Overly permissive boundaries mean Copilot might surface information the user should not see. Overly restrictive boundaries mean Copilot cannot provide useful responses.

The redefinition: Identity in the AI era is not just "who are you and what can you access?" It is "who are you, what do you do, who do you work with, what do you know, and what information is appropriate for you?" This is a materially different scope, and most organisations' identity infrastructure was not designed for it.


The organisational graph: Why Microsoft Graph is the real AI infrastructure

The session positioned Microsoft Graph as the infrastructure layer that transforms identity data into AI-usable organisational intelligence. This is not a new technology; Microsoft Graph has existed for years. What has changed is its significance.

What Microsoft Graph contains:

  • Directory data: User profiles, group memberships, organisational hierarchy, reporting relationships
  • Collaboration signals: Email interactions, meeting patterns, Teams conversations, document co-authorship
  • Activity data: Application usage, content creation, search queries, file access patterns
  • Relationship data: Manager-report chains, project team compositions, cross-functional collaboration patterns

Why this matters for AI:

When a user asks Copilot "What did my team discuss about the Henderson project last week?", Copilot needs to:

  1. Identify who constitutes "my team" (organisational graph)
  2. Determine what content relates to "the Henderson project" (activity and content graph)
  3. Filter to "last week" (temporal data)
  4. Apply information boundaries (access control)
  5. Synthesise a response from relevant emails, Teams messages, documents, and meeting transcripts
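The resolution steps above can be sketched as pure functions over a toy directory and content store. The data, field names, and "shares a manager" team heuristic are illustrative stand-ins for Microsoft Graph, not Copilot's actual implementation:

```python
from datetime import date, timedelta

# Illustrative stand-ins for Microsoft Graph data; the real pipeline
# queries Graph, not local dicts.
DIRECTORY = {
    "alice": {"manager": "dana"},
    "bob":   {"manager": "dana"},
    "carol": {"manager": "erik"},
    "dana":  {"manager": "erik"},
}

CONTENT = [
    {"id": 1, "author": "bob",   "project": "henderson",
     "date": date(2025, 11, 14), "acl": {"alice", "bob", "dana"}},
    {"id": 2, "author": "carol", "project": "henderson",
     "date": date(2025, 11, 13), "acl": {"carol"}},
    {"id": 3, "author": "bob",   "project": "atlas",
     "date": date(2025, 11, 14), "acl": {"alice", "bob"}},
    {"id": 4, "author": "dana",  "project": "henderson",
     "date": date(2025, 10, 1),  "acl": {"alice", "dana"}},
]

def my_team(user):
    # Step 1: a simple heuristic, everyone sharing the user's manager,
    # plus the manager. A wrong manager field changes this set directly.
    mgr = DIRECTORY[user]["manager"]
    return {u for u, rec in DIRECTORY.items() if rec["manager"] == mgr} | {mgr}

def team_discussion(user, project, today):
    team = my_team(user)
    week_ago = today - timedelta(days=7)
    return [
        item for item in CONTENT
        if item["author"] in team           # step 1: team scope
        and item["project"] == project      # step 2: topic match
        and item["date"] >= week_ago        # step 3: temporal filter
        and user in item["acl"]             # step 4: information boundary
    ]

hits = team_discussion("alice", "henderson", today=date(2025, 11, 19))
```

Each filter depends on a different slice of identity data being correct: a stale manager field changes the team set, a stale ACL changes what the boundary filter admits.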

Every step depends on identity data being accurate, complete, and current. If the organisational hierarchy is wrong, Copilot identifies the wrong team members. If group memberships are stale, Copilot misses relevant content. If access permissions are over-permissioned, Copilot surfaces information the user should not see.

The uncomfortable truth: Most organisations' Microsoft Graph data is incomplete, inconsistent, or stale. Manager fields are not populated. Group memberships are not maintained. Distribution lists contain people who left the organisation years ago. Job titles are inconsistent. Department names vary between systems. This identity debt, accumulated over years of "good enough" administration, directly degrades AI capability.


The identity debt problem: What "good enough" IAM really costs

The session described identity debt as the gap between the identity data an organisation has and the identity data its AI systems need. Most organisations have significant identity debt, and until the AI era, the consequences were manageable.

Pre-AI identity debt consequences:

  • A user has access to a SharePoint site they should not. Risk: low-probability data exposure.
  • A manager field is wrong. Consequence: approval routing goes to the wrong person occasionally.
  • Group memberships are stale. Impact: some users receive emails they should not. Some do not receive emails they should.

These are administrative inconveniences. They generate helpdesk tickets, not security incidents.

AI-era identity debt consequences:

  • A user has access to a SharePoint site they should not. Copilot now actively surfaces documents from that site in response to the user's queries. Low-probability exposure becomes high-probability exposure because the AI is actively seeking relevant content across everything the user can access.
  • A manager field is wrong. Copilot provides incorrect organisational context in responses, misidentifies team members, and routes information to the wrong audience.
  • Group memberships are stale. Copilot includes content from departed employees in team summaries, misses content from new team members, and generates responses based on an incorrect understanding of team composition.

The amplification effect: AI does not create new identity problems. It amplifies existing ones. Every identity data quality issue that existed before Copilot deployment still exists after. But the frequency and severity of consequences increase dramatically because AI systems actively consume and act on identity data in every interaction.

The security dimension the session emphasised: Over-permissioned access is the most dangerous form of identity debt in the AI era. When a user has access to files they do not need — a common result of accumulated permissions over time — Copilot can surface those files' contents in responses. The user may not even know they have access. Pre-AI, the risk was that a user might stumble across sensitive content. Post-AI, the risk is that Copilot delivers sensitive content directly to the user in response to an unrelated query.

The session put it bluntly: "Copilot is the best audit tool you have ever deployed. It will find every permission you wish you had never granted."


The modernisation roadmap: Five capabilities before AI deployment

The session outlined a modernisation path that treats identity readiness as a prerequisite for AI deployment, not a parallel initiative that can proceed independently.

Capability 1: Accurate organisational data

The requirement: Every user record must have an accurate manager, department, job title, location, and cost centre. Organisational hierarchy must reflect actual reporting relationships, not historical artefacts.

Why this is hard: Organisational data is typically maintained by HR systems, synchronised to directory services, and consumed by downstream applications. Synchronisation failures, delayed updates, and inconsistent data models across systems create persistent inaccuracies. Most organisations have never audited the accuracy of their directory data because, until AI, it did not matter enough to justify the effort.

The practical approach: Automate directory data validation against the HR system of record. Flag discrepancies for resolution. Establish SLAs for data freshness after organisational changes. This is not glamorous work, but it is foundational.
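A minimal sketch of that validation step, assuming hypothetical record shapes for the HR system of record and the directory:

```python
# Hypothetical records: HR system of record vs. the directory (e.g. Entra ID).
HR = {
    "u1": {"manager": "u9", "department": "Finance", "title": "Analyst"},
    "u2": {"manager": "u9", "department": "Finance", "title": "Controller"},
}
DIRECTORY = {
    "u1": {"manager": "u9", "department": "Finance", "title": "Analyst"},
    "u2": {"manager": "u7", "department": "Fin.",    "title": "Controller"},
}

def validate(hr, directory, fields=("manager", "department", "title")):
    # Flag every field where the directory disagrees with the HR
    # system of record, plus users missing from the directory entirely.
    discrepancies = []
    for user, hr_rec in hr.items():
        dir_rec = directory.get(user)
        if dir_rec is None:
            discrepancies.append((user, "missing", None, None))
            continue
        for f in fields:
            if dir_rec.get(f) != hr_rec[f]:
                discrepancies.append((user, f, hr_rec[f], dir_rec.get(f)))
    return discrepancies
```

Each discrepancy tuple names the user, the field, and the expected versus observed values, which is exactly what a remediation queue needs.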

Data Quality Priority Matrix:
                        Used by AI         Not Used by AI
Inaccurate              FIX IMMEDIATELY    PLAN REMEDIATION
Incomplete              POPULATE NOW       BACKLOG
Accurate                MAINTAIN           MAINTAIN

Capability 2: Permission hygiene

The requirement: Users should have access to exactly the data they need for their current role. No more, no less. Stale permissions from previous roles, project assignments, or inherited group memberships must be removed.

Why this is hard: Permission accumulation is the default state in most organisations. Users gain access to resources over time, and access is rarely revoked when it is no longer needed. Access reviews exist in most organisations, but they are periodic (quarterly at best), self-attested (managers certify without actually checking), and incomplete (not all resources are in scope).

The AI-specific urgency: Before Copilot, over-permissioned access was a latent risk. After Copilot, it is an active risk because the AI surfaces content from every accessible source. Deploying Copilot without first cleaning up permissions is equivalent to giving every user an assistant whose job is to find and present information from every corner of the organisation — including corners the user should not access.

The practical approach: Run access reviews with AI-assisted anomaly detection. Identify users whose access patterns differ significantly from their peers in similar roles. These anomalies often indicate accumulated permissions that should be revoked. Implement just-in-time access for sensitive resources rather than standing access.
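The peer-comparison idea can be sketched with Jaccard similarity over permission sets. The threshold, role grouping, and data are illustrative; a production tool would use richer signals:

```python
def jaccard(a, b):
    # Similarity of two permission sets (1.0 means identical).
    return len(a & b) / len(a | b) if (a | b) else 1.0

ACCESS = {
    "u1": {"crm", "wiki"},
    "u2": {"crm", "wiki"},
    "u3": {"crm", "wiki"},
    "u4": {"crm", "wiki", "hr-investigations", "legal-hold"},  # accumulated extras
}
ROLE_PEERS = {"sales": ["u1", "u2", "u3", "u4"]}

def anomalous_users(role, threshold=0.6):
    # Flag users whose access overlaps too little with their role peers.
    # These are candidates for review, not automatic revocation.
    users = ROLE_PEERS[role]
    flagged = []
    for u in users:
        peers = [p for p in users if p != u]
        mean_sim = sum(jaccard(ACCESS[u], ACCESS[p]) for p in peers) / len(peers)
        if mean_sim < threshold:
            flagged.append(u)
    return flagged
```

Here u4's hold-over permissions pull their mean similarity below the threshold, which is the pattern accumulated access produces in practice.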

Capability 3: Group and team management

The requirement: Microsoft 365 groups, distribution lists, security groups, and Teams memberships must accurately reflect current collaboration patterns. Orphaned groups, stale memberships, and duplicate groups for the same purpose must be cleaned up.

Why this is hard: Group proliferation is universal. Users create groups for projects, events, and ad-hoc collaboration. When the project ends, the group persists. When team members leave, their memberships remain. Over years, organisations accumulate thousands of groups, many of which are inactive or redundant.

The AI impact: Copilot uses group membership data to determine information relevance. Stale groups mean stale context. Copilot may surface information from defunct project groups or include departed employees in team summaries. The data quality of group management directly affects AI response quality.
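The clean-up signal is straightforward to sketch: flag groups with no recent activity or with members who have left. The dates, the 90-day window, and the active-user list are illustrative:

```python
from datetime import date

ACTIVE_USERS = {"alice", "bob"}

GROUPS = [
    {"name": "proj-atlas",     "last_activity": date(2025, 11, 10),
     "members": {"alice", "bob"}},
    {"name": "proj-henderson", "last_activity": date(2024, 3, 2),
     "members": {"alice", "carol"}},
]

def hygiene_report(groups, today):
    # Flag groups that are stale (no activity in 90 days) or that
    # still list members who are no longer active in the organisation.
    report = []
    for g in groups:
        stale = (today - g["last_activity"]).days > 90
        departed = g["members"] - ACTIVE_USERS
        if stale or departed:
            report.append((g["name"], stale, sorted(departed)))
    return report
```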

Capability 4: Sensitivity labelling and information classification

The requirement: Documents and data must be classified by sensitivity level, with labels that determine how AI systems can reference, summarise, and surface the content. Classification must be consistent, comprehensive, and machine-readable.

Why this is hard: Information classification has been an aspiration for most organisations, not an operational practice. Manual labelling is labour-intensive and inconsistent. Automated classification requires training data and tuning. The consequence of under-classification is over-exposure; the consequence of over-classification is AI systems that cannot access enough data to be useful.

The balance: The session recommended a pragmatic approach: classify the most sensitive content first (executive communications, M&A materials, HR investigations, legal matters) and apply default classifications to everything else. This provides meaningful protection for high-risk content while allowing AI systems to operate effectively with general business information.
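That pragmatic rule is simple enough to sketch. The site names and label taxonomy here are hypothetical, not Microsoft Purview defaults:

```python
HIGH_SENSITIVITY_SITES = {"executive", "hr-investigations", "legal", "m-and-a"}

def default_label(site, explicit_label=None):
    # Pragmatic first pass: an explicit label always wins; content in
    # high-risk locations defaults to Confidential; everything else
    # gets a General default so AI systems can still use it.
    if explicit_label:
        return explicit_label
    if site in HIGH_SENSITIVITY_SITES:
        return "Confidential"
    return "General"
```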

Capability 5: Lifecycle management

The requirement: User identities, group memberships, and access permissions must be managed across the full lifecycle: onboarding, role changes, project assignments, and offboarding. Each lifecycle event must trigger appropriate identity updates within defined SLAs.

Why this is hard: Lifecycle management requires integration between HR systems, identity platforms, access management tools, and application-level provisioning. Most organisations have automated some lifecycle events (onboarding and offboarding) but not others (role changes, project transitions). The gaps create identity debt that accumulates over time.

The AI-era consequence: A user who changes roles but retains their previous role's access permissions continues to receive Copilot responses informed by their previous role's data. This is not just an over-permissioning problem. It actively confuses Copilot's understanding of what the user needs, leading to responses that blend irrelevant historical context with current role requirements.
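The diff that lifecycle automation should compute on a role change can be sketched as set arithmetic over role entitlements. The role-to-entitlement mappings are hypothetical:

```python
ROLE_PERMISSIONS = {
    "analyst":    {"crm-read", "reports-read"},
    "controller": {"reports-read", "ledger-write", "close-approve"},
}

def role_change(current_access, old_role, new_role):
    # Revoke the old role's entitlements that the new role does not
    # share, and grant whatever the new role adds.
    revoke = (ROLE_PERMISSIONS[old_role] - ROLE_PERMISSIONS[new_role]) & current_access
    grant = ROLE_PERMISSIONS[new_role] - current_access
    return (current_access - revoke) | grant, revoke, grant
```

Note what the diff does not touch: an ad-hoc grant such as `wiki-read` survives because it belongs to no role, which is one reason access reviews remain necessary alongside role-based automation.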


Permission boundaries for AI agents

The session introduced a dimension of identity governance that most organisations have not yet considered: permission boundaries specifically for AI agents acting on behalf of users.

The distinction that matters: A user reading a confidential document is one thing. An AI system summarising that document and including the summary in an email to external recipients is a fundamentally different risk. The access is the same; the action is different. Permission boundaries for AI need to consider not just data access but data use.

The emerging model:

User Permission:   "Can read HR performance reviews for direct reports"
AI Agent Permission: "Can access HR data for direct reports but cannot
                      summarise, quote, or include in external communications"

This is the "delegation boundary" concept — AI agents should have explicit, scoped permissions that are narrower than the delegating user's permissions. The implementation in Entra ID is still maturing, but the architectural direction is clear.

The Conditional Access gap

Current Conditional Access policies operate at the application level, not the data level. You can require MFA to access Copilot, but you cannot use Conditional Access to prevent Copilot from surfacing specific documents. That granularity requires sensitivity labels and information barriers, which are separate capabilities with separate configuration requirements. The session acknowledged this gap but positioned it as an area of active development.

What AI-specific access policies should address:

  • Copilot access scope: Define which resources Copilot can access on behalf of different user groups. Senior leadership might have Copilot access to strategic planning documents; individual contributors should not.
  • Agent delegation limits: When AI agents act on behalf of users, define the maximum permission scope the agent can exercise. An agent that schedules meetings should not be able to access financial data, even if the delegating user can.
  • Data residency for AI processing: Ensure that AI systems process data within appropriate geographic boundaries, particularly for organisations subject to data sovereignty requirements.
  • Audit and logging: Enable comprehensive audit logging for all AI system access to user data. This creates the evidence trail needed for compliance and investigation.
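The audit requirement amounts to emitting a structured record for every AI access decision. A minimal illustrative shape (field names are not a Microsoft schema):

```python
import json
from datetime import datetime, timezone

def audit_event(actor, on_behalf_of, resource, action, decision):
    # Minimal structured audit record for AI access to user data.
    # Recording both the agent and the delegating user is what makes
    # delegated access investigable later.
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,                # the AI agent or service principal
        "on_behalf_of": on_behalf_of,  # the delegating user
        "resource": resource,
        "action": action,
        "decision": decision,          # "allow" or "deny"
    })
```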

The Copilot readiness assessment

The session provided a practical framework for assessing identity readiness for Copilot deployment. The metrics that matter:

Data completeness: What percentage of user records have populated and accurate manager, department, and location fields? Target: greater than 95%.

Permission freshness: What is the average time between a role change and the corresponding access permission update? Target: less than 5 business days.

Group hygiene: What percentage of groups have had activity in the last 90 days? What percentage have had a membership review in the last 180 days?

Over-permission indicators: How many users have access to resources outside their department's normal scope? How does each user's access profile compare to their peers in similar roles?

Sensitivity labelling coverage: What percentage of documents in high-sensitivity locations (executive SharePoint sites, HR sites, legal sites) have sensitivity labels applied?

The recommendation: Measure these metrics before Copilot deployment. Organisations that skip this assessment and deploy Copilot onto an unmodernised identity infrastructure will discover their identity debt through user complaints, data exposure incidents, and AI responses that reveal information boundaries are not where they should be.
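The data completeness metric, at least, is trivial to compute from a directory export. A minimal sketch with illustrative records:

```python
USERS = [
    {"id": "u1", "manager": "m1", "department": "Finance", "location": "London"},
    {"id": "u2", "manager": None, "department": "Finance", "location": "London"},
    {"id": "u3", "manager": "m1", "department": "",        "location": "Leeds"},
    {"id": "u4", "manager": "m2", "department": "Sales",   "location": "York"},
]

def completeness(users, fields=("manager", "department", "location")):
    # Percentage of records with every required field populated.
    # None and empty strings both count as missing.
    complete = sum(1 for u in users if all(u.get(f) for f in fields))
    return 100.0 * complete / len(users)
```

Two of the four records here fail the check, giving 50%, well short of the session's greater-than-95% target.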


What the session got right and what it missed

What it got right:

Identity as AI infrastructure. The framing of identity modernisation as AI readiness, rather than as a standalone security initiative, is correct and strategically important. It changes the business case from risk reduction (which is hard to fund) to capability enablement (which is easier to fund).

The amplification thesis. AI systems amplify identity data quality problems. This is demonstrably true and underappreciated. Most organisations deploying Copilot have not fully reckoned with the consequences of their existing identity debt.

The modernisation roadmap. The five capabilities described — accurate organisational data, permission hygiene, group management, sensitivity labelling, and lifecycle management — are the right priorities in roughly the right order.

What it missed:

The effort required. Identity modernisation at the level described is a multi-quarter initiative for most organisations. The session presented it as a readiness checklist. In practice, it is a programme of work that requires dedicated staffing, executive sponsorship, and sustained attention. Cleaning up years of identity debt is unglamorous, laborious work that competes with more visible AI initiatives for attention and resources.

Multi-cloud and hybrid identity. The session assumed an all-Microsoft identity stack. Organisations using Okta, Ping, or other identity providers alongside Entra ID face additional complexity that was not addressed. The principles apply, but the implementation is significantly harder in heterogeneous environments.

The cost of getting it wrong. The session described the benefits of identity modernisation but did not quantify the cost of deploying AI without it. Real-world examples of data exposure through Copilot — anonymised if necessary — would have made the case more compelling than theoretical risk frameworks.

Third-party AI systems. Copilot is not the only AI system that enterprises deploy. ChatGPT Enterprise, GitHub Copilot, Salesforce Einstein, and dozens of other AI tools all interact with identity and permissions. A comprehensive identity modernisation programme needs to address all AI systems, not just Microsoft's.

The timeline reality. The session implied that identity modernisation could be completed in preparation for AI deployment. For large enterprises with decades of permission sprawl, the remediation effort is measured in years, not months. The practical question is not "how do we modernise before deploying AI?" but "how do we manage risk while modernising in parallel with AI deployment?"


The verdict

THRSP956 delivered a message that every platform engineering team deploying AI needs to hear: your AI is only as good as your identity infrastructure. The session correctly identified identity modernisation as a prerequisite for AI success, not a parallel initiative.

The practical challenge is prioritisation. Identity modernisation competes with AI development, model deployment, and prompt engineering for attention and resources. It is less visible, less exciting, and harder to demonstrate in a sprint review. But it is foundational in a way that the more visible work is not. A brilliantly engineered AI application deployed onto a poorly maintained identity infrastructure will produce responses that are contextually wrong, over-permissioned, or both.

For organisations planning Copilot deployments or broader AI adoption, the action is clear: assess your identity data quality honestly, prioritise the most impactful modernisation work (permission hygiene and organisational data accuracy), and invest in the operational practices that sustain data quality over time. Do this before you deploy AI broadly, not after.

The organisations that treat identity modernisation as an AI programme prerequisite will deploy AI that actually understands their organisation. The organisations that skip this step will deploy AI that misunderstands their organisation at scale, which is considerably worse than not deploying AI at all.


What to watch

Copilot readiness assessment tooling. Microsoft and partners are developing tools that assess organisational readiness for Copilot based on identity data quality metrics. Watch for tooling that provides actionable remediation guidance, not just readiness scores.

AI-driven identity governance. The same AI capabilities being deployed for business productivity can be applied to identity management itself. Watch for identity governance tools that use AI to detect anomalous access patterns, recommend permission cleanup, and automate lifecycle management.

Information barrier evolution. As AI systems become more capable at accessing and synthesising information across the organisation, information barriers will become critical governance controls. Watch for more granular and flexible barrier configurations.

Cross-platform identity for AI. Organisations using AI systems across multiple platforms (Microsoft, Google, Salesforce, custom) need consistent identity data across all of them. Watch for identity standards and integration patterns that ensure AI systems on every platform have the same organisational context.

Regulatory requirements for AI-accessible data. As regulators develop frameworks for AI in enterprise settings, identity and access management requirements for AI-accessible data will likely become more prescriptive. Organisations that modernise identity proactively will be better positioned for compliance than those that wait for mandates.


