AI Agents Are Escaping the IDE: What Happens When Cursor-Style Tooling Meets Productivity Apps
The agent pattern that made Cursor powerful is moving into docs, spreadsheets, and email. The engineering challenges are harder than you think.
A developer named Keith Curtis shipped something interesting recently: an AI agent layer for LibreOffice, complete with voice input and agentic task execution. You can find the write-up on his site. The Hacker News thread treated it mostly as a curiosity: "neat project, but who uses LibreOffice?" I think that response completely missed the point.
The interesting thing about this project isn't LibreOffice. It's what it represents: the underlying agent interface pattern that made Cursor transformative in code editors is now escaping the IDE. It's going to show up in spreadsheets, document editors, presentation tools, data analysis environments, and every other productivity context where humans work with structured information.
For developers building agent-powered tools, this is worth thinking through carefully. The technical challenges are different from what you face in a code editor, and the failure modes are more dangerous.
Why Code Editors Got There First
Cursor and similar tools succeeded in code editors for reasons that seem obvious in retrospect:
Code is already structured text. The context the agent needs (function signatures, variable names, file dependencies) is machine-readable without interpretation. The agent can traverse an AST, read imports, and follow call graphs.
Failure is recoverable. If the AI generates bad code, it often doesn't compile. If it does compile, tests may catch it. If tests don't catch it, code review might. The feedback loops are fast and the blast radius of a mistake is usually contained.
The action space is relatively constrained. The agent writes text into a file. It runs commands. It reads file contents. The set of things it can do is large but bounded, and most actions are reversible.
The user population is technical. Developers using Cursor understand what the AI is doing, can read the output critically, and have the background to evaluate whether the generated code is correct.
None of these properties hold in a general productivity context.
What Changes When You Leave the IDE
Documents Don't Have an AST
A Word document or a spreadsheet is structured at the UI level: sections, paragraphs, cells, sheets. But the semantic structure is implicit. When someone asks an AI agent to "update the executive summary to reflect the Q3 results," the agent needs to understand what the executive summary is, where it starts and ends, what "reflecting Q3 results" means in context, and what data constitutes the Q3 results.
In a code editor, this kind of context is mechanically accessible. In a document, it requires natural language understanding plus document understanding plus task decomposition. The failure surface is much larger.
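To make the gap concrete, here is a minimal sketch of the only part that is mechanically accessible in a document: finding a section by its heading. The `Block` type and `find_section` function are hypothetical names for illustration, and the sketch assumes the editor can export a flat list of paragraphs tagged with heading levels. Everything the paragraph above calls implicit (what "reflecting Q3 results" means, which data counts) is still left to the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    """One paragraph of a document (hypothetical export format)."""
    text: str
    heading_level: Optional[int] = None  # None means body text


def find_section(blocks: list[Block], title: str) -> Optional[tuple[int, int]]:
    """Return (start, end) indices of the section whose heading matches title.

    end is exclusive: the section runs until the next heading at the same
    or a higher level, or to the end of the document.
    """
    wanted = title.strip().lower()
    for i, block in enumerate(blocks):
        if block.heading_level is None or block.text.strip().lower() != wanted:
            continue
        end = len(blocks)
        for j in range(i + 1, len(blocks)):
            level = blocks[j].heading_level
            if level is not None and level <= block.heading_level:
                end = j
                break
        return i, end
    return None  # no explicit heading; the agent would have to infer the section


doc = [
    Block("Q3 Report", 1),
    Block("Executive Summary", 2),
    Block("Revenue grew in Q2."),
    Block("Headcount was flat."),
    Block("Timeline", 2),
    Block("Launch is planned for November."),
]

span = find_section(doc, "executive summary")
if span is not None:
    start, end = span
    print([b.text for b in doc[start:end]])
```

When the author never typed a heading called "Executive Summary", this lookup returns nothing, and that is exactly where the failure surface grows.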
Undo Is Not a Safety Net
In a code editor, if the AI makes a mess, you revert the file. Version control is a reset button.
In a document workflow, the consequences are more complicated. If the AI agent sends an email on your behalf, you can't unsend it. If it updates a financial model and you save and close the file, the history might be gone. If it modifies a shared document that other people are actively reading, the change propagates immediately.
Agent actions in productivity contexts are more likely to be irreversible, more likely to affect other people, and more likely to have consequences outside the software itself.
Voice Compounds the Risk
The LibreOffice project includes voice input, and this is where things get genuinely interesting from an engineering perspective. Voice introduces a new class of failure:
Transcription errors become action errors. "Delete the second paragraph" and "delete to the second paragraph" are transcriptions that an ASR system might easily confuse. One deletes a paragraph, the other deletes a range.
Natural language is ambiguous. "Move the budget section after the timeline" sounds unambiguous. But which timeline? The document has three sections with timeline-related content. What if the budget section is longer than the space available? Does "after" mean immediately after or at the end of the page?
Intent verification is harder without a visual confirmation step. In a code editor, the AI makes a change and you see it highlighted in a diff. With voice commands, the loop is: speak, AI acts, observe result. If the action is irreversible, the observation step comes too late.
Building voice-driven agents for productivity contexts requires explicit confirmation patterns that feel natural in speech: "I'm going to delete the second paragraph, is that right?" The trick is doing this without being so confirmation-heavy that the interface becomes annoying.
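One way to keep confirmations from becoming noise is to read a command back only when it is destructive or when the transcription itself is in doubt. The sketch below assumes a hypothetical ASR system that returns several candidate transcriptions with confidence scores; the `Hypothesis` type, the `margin` value, and the function names are illustrative, not taken from the LibreOffice project.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One candidate transcription (hypothetical ASR output)."""
    text: str
    confidence: float  # 0.0 to 1.0


def needs_readback(hypotheses: list[Hypothesis], destructive: bool,
                   margin: float = 0.2) -> bool:
    """Decide whether to read the command back before acting."""
    if destructive:
        return True  # always confirm deletes and other destructive commands
    if len(hypotheses) < 2:
        return False
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    # A narrow gap between the top two candidates means the ASR is unsure.
    return ranked[0].confidence - ranked[1].confidence < margin


def readback_prompt(hypotheses: list[Hypothesis]) -> str:
    """Phrase the best candidate as a spoken confirmation question."""
    best = max(hypotheses, key=lambda h: h.confidence)
    return f"I heard: {best.text}. Is that right?"


candidates = [
    Hypothesis("delete the second paragraph", 0.55),
    Hypothesis("delete to the second paragraph", 0.45),
]
if needs_readback(candidates, destructive=True):
    print(readback_prompt(candidates))
```

The threshold is a tuning knob, not a known-good value: too low and ambiguous commands slip through, too high and the interface nags.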
The Technical Challenges Worth Solving
If you're building agent-powered tooling outside the IDE, here's where to focus your engineering effort:
Context Window Management for Non-Code Documents
A large document is a lot of tokens. A complex spreadsheet is a lot of tokens. If you're passing the entire document context to the model on every action, you'll hit limits quickly and you'll be paying for context that isn't relevant to the current task.
You need a retrieval strategy: what parts of the document are relevant to the current instruction? For document editors, this might mean extracting just the relevant sections. For spreadsheets, it might mean providing the schema (column headers, sheet names, named ranges) plus the rows adjacent to where the agent is working.
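For the spreadsheet case, a rough sketch of "schema plus adjacent rows" might look like the following. It assumes the sheet is already loaded as a list of rows, and `spreadsheet_context` and its `window` parameter are hypothetical names. Sheet names and named ranges are left out to keep it short.

```python
def spreadsheet_context(headers: list[str], rows: list[list],
                        focus_row: int, window: int = 2) -> dict:
    """Build a compact context: column headers plus rows near the focus row.

    focus_row is a 0-based index into rows; window is how many rows to
    include on each side of it.
    """
    start = max(0, focus_row - window)
    end = min(len(rows), focus_row + window + 1)
    return {
        "headers": headers,
        "row_range": (start, end),  # end is exclusive
        "rows": rows[start:end],
    }


headers = ["id", "amount"]
rows = [[i, i * 10] for i in range(10)]

# The agent is editing row 5, so send rows 3 through 7 instead of all ten.
context = spreadsheet_context(headers, rows, focus_row=5)
print(context)
```

A fixed window is the crudest possible retrieval policy. It ignores formulas that reference distant cells, which is one reason the problem is harder than it looks.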
This is an active research area and there's no clean solution yet. The agents that win in this space will be the ones that reliably retrieve the right context for each instruction.
Action Representation and Reversibility
In a code editor, the canonical action is "write text to a file." In a document editor, the action space includes: insert paragraph, delete paragraph, format text, move section, insert table, update cell value, add comment, change heading level. Each of these needs a reversible representation.
Build your agent actions as operations that can be undone, and maintain an operation log that can be traversed backward. This is more complex than "track file changes." You need semantic operations, not byte diffs.
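Here is a minimal sketch of that idea, treating a document as a plain list of paragraph strings. The class names (`InsertParagraph`, `DeleteParagraph`, `OperationLog`) are hypothetical, and a real editor would need many more operation types, but the shape is the same: each operation records enough to undo itself, and the log is walked backward.

```python
from dataclasses import dataclass


@dataclass
class InsertParagraph:
    index: int
    text: str

    def apply(self, doc: list[str]) -> None:
        doc.insert(self.index, self.text)

    def undo(self, doc: list[str]) -> None:
        del doc[self.index]


@dataclass
class DeleteParagraph:
    index: int
    removed: str = ""  # filled in by apply() so undo() can restore it

    def apply(self, doc: list[str]) -> None:
        self.removed = doc.pop(self.index)

    def undo(self, doc: list[str]) -> None:
        doc.insert(self.index, self.removed)


class OperationLog:
    """Applies semantic operations and undoes them in reverse order."""

    def __init__(self, doc: list[str]) -> None:
        self.doc = doc
        self.history: list = []

    def execute(self, op) -> None:
        op.apply(self.doc)
        self.history.append(op)

    def undo_last(self) -> bool:
        """Undo the most recent operation; return False if there is none."""
        if not self.history:
            return False
        self.history.pop().undo(self.doc)
        return True


paragraphs = ["Intro", "Budget", "Timeline"]
log = OperationLog(paragraphs)
log.execute(DeleteParagraph(index=1))
log.undo_last()
print(paragraphs)
```

Undo is only correct here because operations are reverted strictly in reverse order. Undoing an arbitrary earlier operation would require index rebasing, which this sketch does not attempt.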
Confirmation Patterns
For any action that is irreversible or that affects content outside the current document (sending emails, updating shared resources, triggering workflows), the agent should require explicit confirmation before acting. This isn't optional. It's the difference between a useful tool and a liability.
The UX challenge is making confirmation feel natural. "I'm about to send this email to [email protected] with the attached budget report. Should I proceed?" That's the right pattern. It's specific, it's actionable, and it puts the human in control of the irreversible step.
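As a sketch of how that gate could sit in code, the function below runs reversible, in-document actions directly and routes anything irreversible or external through a confirmation callback first. The `Risk` enum, `run_action`, and the `confirm` callback are hypothetical names; how the prompt reaches the user (dialog, speech) is deliberately left open.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    REVERSIBLE = "reversible"  # e.g. an edit inside the current document
    EXTERNAL = "external"      # e.g. sending email, updating a shared resource


def run_action(description: str, risk: Risk,
               action: Callable[[], None],
               confirm: Callable[[str], bool]) -> bool:
    """Run action, asking for confirmation first when the risk calls for it.

    Returns True if the action ran, False if the user declined.
    """
    if risk is Risk.EXTERNAL:
        prompt = f"I'm about to {description}. Should I proceed?"
        if not confirm(prompt):
            return False
    action()
    return True


outbox: list[str] = []

# A stand-in for the user saying "no" to the confirmation prompt.
ran = run_action(
    "send this email with the attached budget report",
    Risk.EXTERNAL,
    action=lambda: outbox.append("budget report"),
    confirm=lambda prompt: False,
)
print(ran, outbox)
```

Classifying each action's risk is the part this sketch assumes away. In practice that classification is a design decision made per action type, not something to infer at runtime.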
Trust and Verification for Non-Technical Users
This is the hardest problem. Technical users of Cursor can read the output and evaluate whether the AI's code is correct. Non-technical users of a document agent often cannot. They may not know whether the AI correctly understood their instruction until the consequences are visible, which might be after the document is published or the email is sent.
The interface needs to communicate uncertainty clearly. If the AI is not confident it understood the instruction, it should say so before acting, not after. If the action it's about to take has significant downstream consequences, it should surface that information.
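A small sketch of "say so before acting" follows. It assumes the agent can attach a confidence score to its interpretation of an instruction and can list known downstream effects. Both are assumptions, and `preflight_notice` and its `threshold` are hypothetical names.

```python
from typing import Optional


def preflight_notice(interpretation: str, confidence: float,
                     consequences: list[str],
                     threshold: float = 0.8) -> Optional[str]:
    """Return a plain-language notice to show before acting, or None.

    A notice is produced when confidence is below threshold or when the
    action has downstream consequences worth surfacing.
    """
    parts = []
    if confidence < threshold:
        parts.append(
            f"I'm not sure I understood. My best reading is: {interpretation}."
        )
    if consequences:
        parts.append("This will also affect: " + "; ".join(consequences) + ".")
    return " ".join(parts) if parts else None


notice = preflight_notice(
    "update the executive summary with Q3 figures",
    confidence=0.5,
    consequences=["the shared copy other people are reading"],
)
if notice is not None:
    print(notice)
```

The wording avoids jargon on purpose: a non-technical user needs to see the agent's reading of the instruction, not a probability.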
What This Means for Agent Builders
The agent interface pattern is genuinely powerful. Cursor showed that. But the pattern needs to be adapted, not just transplanted, as it moves into other contexts.
The key adaptations:
- Treat reversibility as a first-class design constraint, not an afterthought
- Build confirmation patterns that are proportional to the consequence of the action
- Invest in context retrieval rather than passing the entire document on every call
- Design for non-technical users who cannot evaluate AI output critically
- Be explicit about uncertainty before acting, not after
The LibreOffice agent project is an early experiment in a direction the whole industry is moving. The productivity app that builds this pattern well (not just fast, but safe, recoverable, and trust-appropriate) will have a significant advantage.