Microsoft Launches Fara-7B, Its New On-Device AI Computer Agent

Microsoft has announced Fara-7B, a new “agentic” small language model built to run directly on a PC and carry out tasks on screen, marking a significant move towards practical AI agents that can operate computers rather than simply generate text.

What Is Fara-7B?

Fara-7B is Microsoft’s first computer-use small language model (SLM), designed to act as an on-device operator that sees the screen, understands what is visible and performs actions with the mouse and keyboard. It does not read hidden interface structures and does not rely on multiple models stitched together. Instead, Microsoft says it works in the same visual way a person would, interpreting screenshots and deciding what to click, type or scroll next.

Compact

The model has 7 billion parameters, which is small compared with leading large language models. However, Microsoft says Fara-7B delivers state-of-the-art performance for its size and is competitive with some larger systems used for browser automation. The focus on a compact model is deliberate: smaller models offer lower energy requirements, faster response times and the ability to run locally, which has become increasingly important for both privacy and reliability.

Where Can You Get It?

Microsoft has positioned Fara-7B as an experimental release intended to accelerate development of practical computer-use agents. It is openly available through Microsoft Foundry and Hugging Face, can be explored through the Magentic-UI environment and will run on Copilot+ PCs using a silicon-optimised version.

Why Build A Computer-Use SLM?

Microsoft’s announcement of Fara-7B is not surprising, given the wider trend in AI development. The industry has moved beyond text-only chat models to models that can act, reason about their environment and automate digital tasks. This reflects growing demand from businesses and users for assistants that can complete work rather than merely describe how to do it.

There is also a strategic element. Microsoft has invested heavily in AI across Windows, Azure, Copilot and its device ecosystem. Building a capable agentic model that runs directly on Windows strengthens this position and gives Microsoft a competitive answer to similar tools emerging from OpenAI, Google and other major players.

By releasing the model with open weights and permissive licensing, Microsoft is also encouraging researchers and developers to experiment, build new tools and benchmark new methods. This approach has the potential to shape the direction of computer-use agents across the industry.

How Fara-7B Has Been Developed

One of the biggest challenges in creating computer-use agents is the lack of large, high-quality datasets showing how people interact with websites and applications. A typical task might involve dozens of small actions, from locating a button to entering text in the correct field. Gathering this data manually would be too slow and expensive at the scale needed.

Microsoft says its team tackled this by creating a synthetic data pipeline built on the company’s earlier Magentic-One framework. The pipeline generates tasks from real public webpages, then uses a multi-agent system to explore each page, plan actions, carry out those actions and record every observation and step. These recordings, known as trajectories, are passed through verifier agents that confirm the tasks were completed successfully. Only verified attempts are used to train the model.
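The verification step described above can be pictured as a simple filter over recorded trajectories. The following Python sketch is illustrative only and is not Microsoft's pipeline: the `Step` and `Trajectory` types, the two toy verifiers and the rule that every verifier must agree are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    observation: str  # hypothetical: e.g. a screenshot file name
    action: str       # hypothetical: e.g. "click", "type", "done"


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)


# A verifier inspects a trajectory and returns True if it judges the task complete.
Verifier = Callable[[Trajectory], bool]


def filter_verified(trajectories: List[Trajectory],
                    verifiers: List[Verifier]) -> List[Trajectory]:
    """Keep only trajectories accepted by every verifier (an assumed rule)."""
    return [t for t in trajectories if all(v(t) for v in verifiers)]


# Toy verifiers standing in for the verifier agents mentioned in the article.
def has_steps(t: Trajectory) -> bool:
    return len(t.steps) > 0


def ends_with_done(t: Trajectory) -> bool:
    return bool(t.steps) and t.steps[-1].action == "done"


recorded = [
    Trajectory("book a table", [Step("shot1.png", "click"), Step("shot2.png", "done")]),
    Trajectory("find a price", [Step("shot1.png", "click")]),  # never finished
    Trajectory("apply for a job", []),                          # nothing recorded
]

training_set = filter_verified(recorded, [has_steps, ends_with_done])
```

In this toy run only the first trajectory survives, which mirrors the idea that unverified attempts never reach the training set.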

In total, Fara-7B was trained on around 145,000 trajectories containing around one million individual steps. These tasks cover e-commerce, travel, job applications, restaurant bookings, information look-ups and many other common activities. The base model, Qwen2.5-VL-7B, was selected for its strong multimodal grounding abilities and its support for long context windows, which allows Fara-7B to consider multiple screenshots and previous actions at once.

How Fara-7B Works In Practice

During use, Fara-7B receives screenshots of the browser window, the task description and a history of actions. It then predicts its next move, such as clicking on a button, typing text or visiting a new URL. The model outputs a short internal reasoning message and the exact action it intends to take.
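The observe, predict and act cycle described above can be sketched as a short loop. This is a hedged illustration rather than Fara-7B's actual interface: `Action`, `run_agent`, `scripted_policy` and the action names are hypothetical, and a scripted function stands in for the model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    kind: str           # hypothetical action names: "visit_url", "type", "click", "done"
    argument: str = ""


# A policy maps (task, screenshots so far, actions so far) to (reasoning, next action).
Policy = Callable[[str, List[bytes], List[Action]], Tuple[str, Action]]


def run_agent(task: str,
              policy: Policy,
              take_screenshot: Callable[[], bytes],
              execute: Callable[[Action], None],
              max_steps: int = 10) -> List[Action]:
    """Repeat: capture the screen, ask the policy for one action, carry it out."""
    screenshots: List[bytes] = []
    history: List[Action] = []
    for _ in range(max_steps):
        screenshots.append(take_screenshot())
        _reasoning, action = policy(task, screenshots, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


def scripted_policy(task, screenshots, history):
    """Stand-in for the model: replays a fixed plan, one step per call."""
    script = [
        ("Open the site first.", Action("visit_url", "https://example.com")),
        ("The search box is visible.", Action("type", "train tickets")),
        ("Results are shown, so the task is complete.", Action("done")),
    ]
    return script[len(history)]


executed: List[Action] = []
history = run_agent("Find train tickets", scripted_policy,
                    take_screenshot=lambda: b"", execute=executed.append)
```

The `max_steps` cap is a design choice for the sketch: it stops a confused agent from looping indefinitely.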

Mirrors Human Behaviour By Just Looking At The Screen

This is all designed to mirror human behaviour. The model sees only what is on the screen and must work out what to do based on that view. This avoids the need for extra data sources and means the model’s decisions can be inspected and audited.

Strong Results

Evaluations published by Microsoft appear to show strong results. On well-known web automation benchmarks such as WebVoyager and Online-Mind2Web, Fara-7B outperforms other models in its size range and in some cases matches or exceeds the performance of larger systems. Independent testing by Browserbase also recorded a 62 per cent success rate on WebVoyager under human verification.

What Fara-7B Can Be Used For

The current release is aimed at developers, researchers and technical users who want to explore automated web tasks. Typical examples include:

– Filling out online forms.
– Searching for information.
– Making bookings.
– Managing online accounts.
– Navigating support pages.
– Comparing product prices.
– Extracting or summarising content from websites.

These tasks reflect everyday processes that take time in workplaces. Automating them could, therefore, reduce repetitive admin, speed up routine workflows and improve consistency when handling high-volume digital tasks.

Because the model is open weight, organisations can also fine-tune it or build custom versions for internal use. For example, a business could adapt it to handle specialist web portals, internal booking systems or industry-specific interfaces.

Who Can Use It And When?

Fara-7B is available now through Microsoft Foundry, Hugging Face and the Magentic-UI research environment. A quantised and silicon-optimised version is available for Copilot+ PCs running Windows 11, allowing early adopters to test the model directly on their devices.

However, it’s not yet a consumer feature and should be used for controlled experimentation rather than in production environments. Microsoft recommends running it in a sandboxed environment where users can observe its actions and intervene if needed.

The Benefits For Business Users

Many organisations have been cautious about browser automation due to concerns about data privacy, vendor lock-in and cloud dependency. Fara-7B’s on-device design appears to directly address these issues by keeping data local. This is especially relevant for sectors where regulatory requirements restrict the movement of sensitive information.

Running the model locally also reduces latency. For example, an agent that is reading the screen and clicking through a webpage must respond quickly, and any delay can disrupt the experience. An on-device agent avoids these delays and provides more predictable performance.

Benefits For Microsoft

For Microsoft, Fara-7B strengthens its position in agentic AI, supports its Windows and Copilot+ hardware strategy and provides a foundation for future systems that combine device-side reasoning with cloud-based intelligence.

Developers

For developers and researchers, the open-weight release lowers barriers to experimentation, allowing new techniques to be tested and new evaluation methods to be developed. This may accelerate progress in areas such as safe automation, grounding accuracy and long-horizon task completion.

Challenges And Criticisms

Microsoft is clear that Fara-7B remains an experimental model with limitations. It can misinterpret interfaces, struggle with unfamiliar layouts or fail partway through a complex task. Like other agents that control computers, it remains vulnerable to malicious webpages, prompt-based attacks and unpredictable site behaviour.

There are some notable governance and security questions too. For example, businesses will need to consider how to monitor and log agent actions, how credentials are managed and how to prevent incorrect or undesired operations.
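As one possible starting point for the monitoring question, the sketch below writes each agent action to a log as a line of JSON and masks any text flagged as sensitive. It is an assumption-laden illustration, not a feature of Fara-7B: the `record_action` helper, its field names and the `sensitive` flag are all invented for the example.

```python
import io
import json
from datetime import datetime, timezone


def record_action(log, step, kind, argument, sensitive=False):
    """Append one agent action to `log` as a JSON line and return the entry.

    `sensitive` is a hypothetical flag the caller sets for typed secrets
    such as passwords, so they never reach the log in clear text.
    """
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "kind": kind,
        "argument": "[REDACTED]" if sensitive else argument,
    }
    log.write(json.dumps(entry) + "\n")
    return entry


# An in-memory buffer stands in for a real log file.
audit_log = io.StringIO()
record_action(audit_log, 1, "visit_url", "https://example.com/login")
record_action(audit_log, 2, "type", "hunter2", sensitive=True)

entries = [json.loads(line) for line in audit_log.getvalue().splitlines()]
```

One entry per line keeps the log easy to append to and easy to search afterwards, which is why the JSON Lines layout is used here.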

That said, Microsoft has introduced several safety systems to address these risks. The model has been trained to stop at “Critical Points”, such as payment stages or permission prompts, and will refuse to proceed without confirmation. The company also notes that the model achieved an 82 per cent refusal rate on red-team tasks designed to solicit harmful behaviour.
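Developers who want an extra safeguard around the agent could add a confirmation gate of their own. The sketch below is not how Fara-7B implements Critical Points, which the article describes as trained behaviour inside the model. It shows a separate, rule-based check, and the action names in `CRITICAL_KINDS` and the `may_proceed` function are hypothetical.

```python
from typing import Callable

# Hypothetical action names that should never run without a human saying yes.
CRITICAL_KINDS = {"submit_payment", "grant_permission"}


def may_proceed(action_kind: str, confirm: Callable[[str], bool]) -> bool:
    """Allow ordinary actions; ask `confirm` before any critical one."""
    if action_kind in CRITICAL_KINDS:
        return bool(confirm(action_kind))
    return True


# Stand-ins for a real confirmation prompt shown to the user.
def always_decline(kind: str) -> bool:
    return False


def always_approve(kind: str) -> bool:
    return True


click_allowed = may_proceed("click", always_decline)
payment_blocked = may_proceed("submit_payment", always_decline)
payment_approved = may_proceed("submit_payment", always_approve)
```

Keeping the gate outside the model means it still applies if the model fails to recognise a critical step on an unfamiliar page.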

Early commentary has also highlighted that benchmark success does not necessarily translate directly into strong real-world performance, since live websites can behave unpredictably. Developers will need to conduct extensive testing before deploying any form of autonomous web agent in operational settings.

What Does This Mean For Your Business?

Fara-7B brings the idea of practical, controllable computer-use agents much closer to everyday reality, and the implications reach far beyond its immediate research release. The model shows that meaningful on-device automation is now possible with compact architectures rather than sprawling cloud systems. That alone will interest UK businesses that want to streamline manual web-based tasks without handing sensitive data to external services. These organisations have long relied on browser-driven processes in areas such as procurement, HR, finance and customer administration, so a tool that can take on repeatable workflows locally could offer genuine operational value if it proves reliable enough.

The wider AI market is likely to view the launch as a clear signal that Microsoft intends to compete directly in the emerging space for agentic automation. Fara-7B gives the company a foothold that it controls end to end, from the hardware and operating system through to developer tools and safety frameworks. This matters in a landscape where other players have approached computer-use agents with more closed or cloud-first designs. The open-weight release also sets a tone for how Microsoft wants the community to interact with the model, and it encourages a level of scrutiny that could shape future iterations.

In Fara-7B, developers and researchers gain a flexible platform that they can adapt, test and benchmark in their own environments. The training methodology itself, built on large-scale synthetic tasks, raises important questions about how best to model digital behaviour and how to ensure that agents can generalise beyond curated datasets. These questions will continue to surface as more organisations explore automation that depends on visual reasoning rather than structured APIs.

It’s likely that stakeholders across government, regulation and security will now be assessing the risks as closely as the opportunities. For example, a system capable of taking actions on a live machine introduces new oversight challenges, from governance and auditing to resilience against hostile prompts or malicious web content. Microsoft’s emphasis on safety, refusal behaviour and Critical Points is a start, although much will depend on how reliably these mechanisms perform once the model is exposed to diverse real-world environments.

The release ultimately gives the industry a clearer view of what agentic AI might look like when it is embedded directly into personal devices rather than controlled entirely in the cloud. If the technology matures, it could affect expectations about digital assistance in the workplace, reduce friction in routine operations and extend automation to tasks that currently have no clean API-based alternative. The coming months will show whether developers and early adopters can turn this experimental foundation into stable, responsible tools that benefit businesses, consumers and the wider ecosystem.

Mike Knight