H Company advances the frontier of computer-use agents with Surfer 2, a unified architecture built for real and impactful use cases. Our mission is to move from models that know to agents that do: we create capable, context-aware systems that operate reliably in digital environments.
H is thrilled to present Surfer 2: a cross-platform computer-use agent that runs seamlessly on desktop, web, and mobile environments. Surfer 2 surpasses existing state-of-the-art agents on four major agentic benchmarks spanning multiple platforms, outperforming systems developed by other leading AI labs, such as OpenAI, Anthropic, and Google.
Our original web browsing agent, Surfer-H, delivered Pareto-optimal performance on web browsing tasks from the WebVoyager benchmark. In just four months, we built on what we learned to achieve state-of-the-art results across platforms with Surfer 2.
Surfer 2 is an agent architecture for computer use. It separates strategic planning from tactical execution, with an orchestrator module managing planning and coordination while sub-agents act across interfaces.
Surfer 2 can be configured with or without the orchestrator module. When enabled, the orchestrator decomposes the primary goal into a set of sub-tasks assigned to sub-agents. Upon completion, each sub-agent reports its outcome and intermediate state back to the orchestrator. Based on these reports, the orchestrator determines its own next step: producing a final output, advancing to the next sub-task, or replanning in the event of failure.
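To make the orchestrator's control flow concrete, here is a minimal sketch in Python. It is an illustration of the loop described above, not Surfer 2's actual implementation: the names `Report`, `orchestrate`, `plan`, `run_subagent`, and `max_replans` are all hypothetical, and the replanning budget is an assumption we add so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Report:
    """What a sub-agent hands back to the orchestrator (hypothetical shape)."""
    success: bool
    outcome: str


def orchestrate(
    goal: str,
    plan: Callable[[str, List[Report]], List[str]],  # assumed planner: goal + history -> sub-tasks
    run_subagent: Callable[[str], Report],           # assumed executor for a single sub-task
    max_replans: int = 2,                            # assumed budget, not from the source
) -> str:
    subtasks = plan(goal, [])          # decompose the primary goal into sub-tasks
    history: List[Report] = []
    replans = 0
    while subtasks:
        subtask = subtasks.pop(0)
        report = run_subagent(subtask)  # sub-agent reports its outcome back
        history.append(report)
        if report.success:
            continue                    # advance to the next sub-task
        if replans >= max_replans:
            return f"failed: {report.outcome}"
        replans += 1
        subtasks = plan(goal, history)  # replan in the event of failure
    # No sub-tasks left: produce the final output from the successful reports.
    return "; ".join(r.outcome for r in history if r.success)
```

In this sketch the three possible next steps map onto the `continue` (advance), the second call to `plan` (replan), and the final `return` (produce output).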
Surfer 2 follows a ReAct (reason+act) loop. At each step, it assesses its progress toward the task and determines its next action. Reliable performance across environments relies on dedicated components for visual grounding (see more below), task validation, and failure recovery, along with a robust integration layer that ensures actions translate accurately into system control. Because an agent is only as capable as the actions it can reliably perform, these layers are central to Surfer 2’s consistency.
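The general shape of a ReAct loop can be sketched in a few lines of Python. This is a generic illustration under our own assumptions, not Surfer 2's code: `observe`, `decide`, and `act` are hypothetical callables standing in for perception, the reasoning model, and the integration layer, and the `"done"` sentinel and `max_steps` cap are conventions chosen for the example.

```python
from typing import Callable, List, Tuple


def react_loop(
    task: str,
    observe: Callable[[], str],                       # assumed: returns the current screen state
    decide: Callable[[str, str, List[Tuple[str, str]]], Tuple[str, str]],  # assumed: -> (thought, action)
    act: Callable[[str], None],                       # assumed: executes an action in the environment
    max_steps: int = 20,                              # assumed step budget
) -> List[Tuple[str, str]]:
    trace: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        observation = observe()
        # Reason: assess progress toward the task and pick the next action.
        thought, action = decide(task, observation, trace)
        trace.append((thought, action))
        if action == "done":  # hypothetical sentinel meaning "task complete"
            break
        # Act: translate the chosen action into system control.
        act(action)
    return trace
```

The returned trace of (thought, action) pairs is what a validation or failure-recovery component could inspect after the fact.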
This design yields a resilient agent that continuously benefits from advances in frontier models and extracts maximum performance from them.
OSWorld evaluates an agent’s ability to control a full desktop environment on Ubuntu systems across diverse applications via human-like interaction. In the Foundation E2E GUI category, where only visual perception and interaction are allowed, Surfer 2 achieves a state-of-the-art pass@1 of 60.1%. With pass@10, our system reaches 77.0%, surpassing the human baseline score of 72.4%.
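For readers unfamiliar with the metric, pass@k is commonly read as the fraction of tasks solved by at least one of k independent attempts. The sketch below assumes that simple definition; the exact evaluation protocol behind the numbers above is not detailed here, and the function name and data layout are ours, chosen for illustration.

```python
from typing import Dict, List


def pass_at_k(attempts: Dict[str, List[bool]], k: int) -> float:
    """Fraction of tasks where any of the first k attempts succeeded.

    `attempts` maps a task id to its per-attempt success flags
    (a hypothetical layout used only for this example).
    """
    solved = sum(1 for outcomes in attempts.values() if any(outcomes[:k]))
    return solved / len(attempts)


# Toy data: four tasks, two attempts each.
toy = {
    "t1": [True, False],
    "t2": [False, True],
    "t3": [False, False],
    "t4": [False, False],
}
print(pass_at_k(toy, 1))  # 0.25
print(pass_at_k(toy, 2))  # 0.5
```

Under this reading, pass@10 can only be greater than or equal to pass@1, which is why the two figures are reported separately.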
WebArena evaluates agents in a sandboxed web environment spanning five categories (e-commerce, social forum, collaborative development, maps, and CMS platforms), with functionality and data mimicking their real-world equivalents. Surfer 2 achieves a state-of-the-art score of 69.6% by decoupling planning and execution.
WebVoyager assesses web information retrieval on dynamic live websites. On this benchmark Surfer 2 reaches a success rate of 97.1%, surpassing Magnitude's previous state-of-the-art of 93.9%.
AndroidWorld evaluates an agent's capability to control an Android device and use 20 real applications. On this suite, Surfer 2 achieves a success rate of 87.1%, reaching state-of-the-art level in the visual interaction category. Relying on vision and human-like interaction (swipe, press, type) allows Surfer 2 to be proficient across applications and Android versions, surpassing the human baseline of 80.0%.
Surfer 2 is the culmination of what is currently possible using a mix of third-party and internal foundation models. By combining our proprietary agent training methods and infrastructure with the best external models available, we’ve achieved state-of-the-art performance across desktop, web, and mobile benchmarks, often matching or exceeding human capabilities.
These results didn’t come cheap: Surfer 2 runs are extremely costly. That’s why we’re now focused on training Holo2, our next-generation proprietary model designed to deliver the same breakthrough performance at a fraction of the cost, bringing truly scalable and accessible AI agents within reach.
In just the past few months, we’ve open-sourced Surfer-H, launched Holo1.5, and set new state-of-the-art records in desktop, web, and mobile performance with Surfer 2. And we’re just getting started. H Company is committed to pushing these boundaries even further, making AI more capable, reliable, and accessible for everyone.