Office Agent PM — Take-Home Assignment

Microsoft Office Agent is building agentic workflows that let information workers delegate real tasks to AI agents. We're exploring a new scenario: Email Processing Agent. This Agent goes beyond smart replies — it needs to understand the user's work context (calendar, to-dos, recent conversations), determine each email's priority and handling action (ignore, flag, draft a reply, create a to-do, forward to a colleague), and execute upon user confirmation.

Please complete the following four parts.

Part 1: Product Decisions

~25% weight

User scenarios & value judgment: Who is the core user of this Agent? How do different roles (IC, manager, executive, support, sales) differ in their email handling behavior? Who benefits most? Which user segment should V1 target, and why?
MVP trade-offs: Your team has 2 engineers and 2 months. From the candidate features below, define what your MVP includes, what it excludes, and why:
- (A) Priority triage — Scan the inbox and classify emails into "Act now / Later / Ignore"
- (B) Smart reply drafts — Auto-generate reply drafts for emails that need a response
- (C) Action item extraction — Extract action items from emails and create to-dos or calendar events
- (D) Thread summary — Summarize long email threads so the user doesn't have to read the entire chain
- (E) Follow-up reminders — Track sent emails that haven't received a reply and remind the user to follow up
Risk judgment: What happens when the Email Agent gets it wrong? Which error is more dangerous — marking an important email as "Ignore" vs. marking a routine email as "Act now"? How would you reduce these risks through product design?
Kill criteria: Under what conditions would you recommend not building this product, or delaying it?

Part 2: Agent Design

~25% weight

For your chosen MVP scope, design the complete Agent workflow:

Decision logic: How does the Agent determine an email's priority and handling action? What signals does it use (sender, content, context, user history)? Which operations can be automated, which require user confirmation? What are your principles?
Failure modes: List at least three critical failure modes and their product-level mitigations — note that email failures are often social (wrong tone, missing a key recipient, replying at the wrong time) rather than purely technical.
Evaluation framework: How do you define "the Agent got it right"? What do you measure, how do you measure it, and what metrics determine whether this is ready to ship? How do you track changes in user trust?
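
For illustration only, the per-email decision described above could be captured as a small record like the one below. This is a minimal sketch, not a required design. The names `Priority`, `Action`, `TriageDecision`, and `needs_confirmation` are hypothetical; the label and action sets are taken from the options listed in this assignment; and the rule that only flagging may run without confirmation is an assumption you are free to replace with your own principles.

```python
from dataclasses import dataclass, field
from enum import Enum


class Priority(Enum):
    ACT_NOW = "act_now"
    LATER = "later"
    IGNORE = "ignore"


class Action(Enum):
    IGNORE = "ignore"
    FLAG = "flag"
    DRAFT_REPLY = "draft_reply"
    CREATE_TODO = "create_todo"
    FORWARD = "forward"


# Assumption: only a non-destructive action (flagging) may run unconfirmed.
AUTO_ALLOWED = {Action.FLAG}


@dataclass
class TriageDecision:
    email_id: str
    priority: Priority
    action: Action
    # Human-readable evidence behind the decision, shown to the user on review.
    signals: list[str] = field(default_factory=list)

    def needs_confirmation(self) -> bool:
        """Return True when the user must approve the action before it runs."""
        return self.action not in AUTO_ALLOWED


decision = TriageDecision(
    "msg-001",
    Priority.ACT_NOW,
    Action.DRAFT_REPLY,
    signals=["sender is direct manager", "deadline today"],
)
print(decision.needs_confirmation())  # prints True
```

Keeping the signals alongside the decision is one way to make the Agent's reasoning reviewable, which may help when you analyze failure modes and user trust.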

Part 3: Prompt Design

~15% weight

Design a Prompt for one core capability of the Email Agent (e.g., priority classification, reply draft generation, action item extraction — pick the one you consider most critical). Include:

The Prompt itself
Expected failure scenarios — what inputs will cause this Prompt to break?
The iteration path from prototype to production
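
As a rough illustration of the level of detail we have in mind, the sketch below assembles a classification prompt and validates the model's reply. Everything in it is hypothetical: the template wording, the helper names `build_prompt` and `parse_label`, and the choice to fall back to "later" when the reply is not an allowed label (an assumption made so that an unparseable reply never silently becomes "ignore"). It makes no model call; plug in whichever model or tool you choose.

```python
ALLOWED_LABELS = ("act_now", "later", "ignore")

# Hypothetical prompt template; the wording is illustrative, not prescribed.
PROMPT_TEMPLATE = (
    "You are an email triage assistant.\n"
    "Classify the email into exactly one label: {labels}.\n"
    "Reply with the label only.\n\n"
    "Sender: {sender}\nSubject: {subject}\nBody:\n{body}\n"
)


def build_prompt(sender: str, subject: str, body: str) -> str:
    """Fill the template with one email's fields."""
    return PROMPT_TEMPLATE.format(
        labels=", ".join(ALLOWED_LABELS),
        sender=sender,
        subject=subject,
        body=body,
    )


def parse_label(model_output: str, fallback: str = "later") -> str:
    """Normalize the model's reply; use the fallback if it is not an allowed label."""
    label = model_output.strip().lower().strip(".\"'")
    return label if label in ALLOWED_LABELS else fallback


print(build_prompt("boss@example.com", "Budget", "Please approve by Friday."))
print(parse_label(" Act_Now. "))             # prints act_now
print(parse_label("I think it is urgent"))   # prints later
```

Replies that fall outside the allowed labels are one example of an expected failure scenario worth documenting.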

Part 4: Prototype

~35% weight

Using any tool of your choice (Coze, Dify, GPTs, Cursor, Python, Replit, etc.), build a working prototype of at least one core component of your Agent workflow.

Requirements:

Choose the component you consider most critical or highest-risk — the choice itself is a product judgment
The prototype must be runnable — we will test it; screenshots or document descriptions alone are not accepted

Deliverables

A link to the prototype or a runnable code repository
Build log: What approaches did you try? What problems did you encounter? How did you adjust? We want to see the iteration process, not just the final result
Prototype vs. design gap analysis: After running it, what parts of your Part 2 design need revision? What issues did the prototype reveal that you didn't foresee on paper?
Eval dry-run: Using the evaluation framework from Part 2, run 5–10 test cases against your prototype. Record actual performance and your analysis
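
A dry-run can be as simple as the sketch below, which runs labeled test cases through a classifier and records the results. It is a minimal example under stated assumptions: the test emails are invented, `classify` is a trivial keyword rule standing in for your prototype, and the report fields (`accuracy`, `missed_important`, `confusion`) are hypothetical names rather than required metrics.

```python
from collections import Counter

# Invented test cases: (email text, expected label).
CASES = [
    ("Contract must be signed by 5pm today", "act_now"),
    ("Weekly newsletter: product tips", "ignore"),
    ("Can we meet next week to review the plan?", "later"),
    ("URGENT: production outage affecting customers", "act_now"),
    ("Your receipt from the cafeteria", "ignore"),
    ("Reminder: board deck due tomorrow morning", "act_now"),
]


def classify(text: str) -> str:
    """Stand-in for the prototype under test (a trivial keyword rule)."""
    lowered = text.lower()
    if "urgent" in lowered or "today" in lowered:
        return "act_now"
    if "newsletter" in lowered or "receipt" in lowered:
        return "ignore"
    return "later"


def dry_run(cases, predict):
    """Count (expected, actual) pairs and summarize them."""
    confusion = Counter()
    for text, expected in cases:
        confusion[(expected, predict(text))] += 1
    correct = sum(n for (exp, act), n in confusion.items() if exp == act)
    return {
        "accuracy": correct / len(cases),
        # Important emails wrongly marked "ignore", tracked separately.
        "missed_important": confusion[("act_now", "ignore")],
        "confusion": dict(confusion),
    }


report = dry_run(CASES, classify)
print(report)
```

Here the keyword rule misses the last case (it returns "later" for an email expected to be "act_now"), which is the kind of result your analysis should explain rather than hide.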

Bonus (not required)

Prototype covers multiple components or demonstrates end-to-end chaining
Comparison of different models or prompt strategies with recorded results

Submission Guidelines

Timeline: 3–4 days
Format: Open (5–8 pages suggested)
Language: English or Chinese
Most Important: Part 4 (Prototype)

We don't expect a perfect solution. We want to see how you think about a real, complex product problem. The Prototype section is the most important part — we will actually use it, so please make sure it runs.

Please note which AI tools you used during the assignment, and how you verified and modified the AI's output.
