Meta will record employees’ keystrokes and use it to train its AI models
Summary (EN)
Meta plans to collect employees' mouse movements and keystrokes in certain internal applications and use that data to help train AI systems designed to operate computers, according to a TechCrunch article citing reporting first published by Reuters. Meta's explanation is that agents meant to complete everyday computer tasks need examples of how people actually interact with software, including clicking buttons, navigating menus, and moving through application workflows. The company says it is launching an internal tool to capture those interaction patterns, that safeguards are in place to protect sensitive content, and that the data will not be used for unrelated purposes. The development is significant because it points to a widening search for high-quality behavioral training data as AI companies move from text generation toward computer-use agents. Rather than relying only on public internet data or synthetic traces, Meta is turning to live operational behavior generated inside the firm. That makes the story notable both technically and socially. Technically, it reflects the data demands of UI-operating models and the difficulty of sourcing realistic interaction traces at scale. Socially, it raises fresh questions about consent, workplace surveillance, and the boundaries of acceptable internal data collection for model training. Among developments from the last 24 hours, it stands out as a concrete controversy tied directly to the next generation of agentic AI systems and to the increasingly invasive forms of data gathering those systems may incentivize.
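Summary (ZH, translated to English)
According to TechCrunch, citing a Reuters report, Meta plans to collect employees' mouse movements and keyboard input in some internal applications and use the data to train AI models that can operate computers. Meta's reasoning is that if it wants to build AI agents that can complete everyday software tasks on a person's behalf, the models must learn how real users click buttons, switch between menus, browse interfaces, and complete workflows. To that end, Meta is rolling out an internal tool that records these interaction traces. The company also says that safeguards are in place to keep sensitive content from being misused, and that the data will not be used for other, unrelated purposes. The event matters because it clearly shows that AI companies are looking for new sources of high-quality training data for computer-operating agents. Unlike the traditional reliance on web text or synthetic data, Meta is directly incorporating real human-computer interaction from inside the company into its training material. Technically, this indicates that UI-operating models have a strong need for real operation traces, and that such data is not easy to obtain at scale on the external market. Socially, the practice has also sparked new controversy over employee awareness and consent, the limits of workplace monitoring, and whether companies can expand internal behavioral data collection in the name of model training. Among the high-value developments of the past 24 hours, this is a representative event that directly touches both the data boundaries and the ethical boundaries of agentic AI.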
Source