Core

Full Browser Automation in Your Side Panel

Harness multimodal large language models to navigate and manage web tasks using simple, intuitive commands. Unlike many other web agents, Fuji Chat is available as a browser extension, so you can call it whenever you need it. Even in the middle of a task, you can hand it off to the agent and let it take care of the rest.

Prior Knowledge Augmentation System

Fuji Chat draws on past experience when navigating websites, which improves its understanding of how each site behaves. Users can also inject domain-specific insights in real time by adding instructions in the settings menu.
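As a rough illustration of how domain-specific instructions could be combined with a task, the sketch below selects the instructions whose domain matches the current page and prepends them to the prompt. The `Instruction` type and the `selectInstructions` and `buildPrompt` functions are hypothetical names used for illustration only; they are not Fuji Chat's actual internals.

```typescript
// Hypothetical shape of a user-supplied instruction (an assumption, not Fuji Chat's data model).
interface Instruction {
  domain: string; // e.g. "github.com"; "*" applies to every site
  text: string;
}

// Return the instructions that apply to the given hostname.
// A domain matches the host itself and any of its subdomains.
function selectInstructions(hostname: string, all: Instruction[]): Instruction[] {
  const host = hostname.toLowerCase();
  return all.filter((ins) => {
    const domain = ins.domain.toLowerCase();
    return domain === "*" || host === domain || host.endsWith("." + domain);
  });
}

// Prepend matching instructions to the task so the model sees them as prior knowledge.
function buildPrompt(task: string, hostname: string, all: Instruction[]): string {
  const lines = selectInstructions(hostname, all).map((ins) => `- ${ins.text}`);
  if (lines.length === 0) {
    return `Task: ${task}`;
  }
  return `Prior knowledge:\n${lines.join("\n")}\n\nTask: ${task}`;
}
```

Matching on the hostname keeps instructions written for one site from leaking into prompts for another, which is one plausible way to keep injected knowledge relevant.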

In the future, Fuji Chat will support additional mechanisms to enhance its web interaction capabilities, including:

Custom text/CSS selectors
JavaScript execution

Enhancing Website Understanding

When analyzing a webpage and composing a prompt, Fuji Chat keeps only the relevant elements, which improves the success rate of its interactions.

The system leverages HTML semantics and WAI-ARIA roles to accurately identify interactive elements:

| Website | Interactive Elements (HTML Tags) | Interactive Elements (HTML Tags + WAI-ARIA) |
| --- | --- | --- |
| amazon.com | 534 | 547 |
| twitter.com | 121 | |
| github.com | 1364 | 1446 |

Combining both sources lets Fuji Chat capture interactive elements that tag-based detection alone would miss, while filtering out redundant or hidden components. This improves its web navigation accuracy.
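The difference between the two detection strategies can be sketched as follows. This is a minimal illustration, not Fuji Chat's actual filter: `PageElement` is a simplified stand-in for a DOM element so the code runs outside a browser, and the tag and role lists are assumed subsets chosen for the example.

```typescript
// Simplified stand-in for a DOM element (an assumption made for this sketch).
interface PageElement {
  tag: string;      // lowercase tag name, e.g. "button"
  role?: string;    // WAI-ARIA role attribute, if any
  href?: string;    // present on links
  hidden?: boolean; // true if the element is not rendered
}

// Assumed subsets of natively interactive tags and interactive WAI-ARIA roles.
const INTERACTIVE_TAGS = new Set(["a", "button", "input", "select", "textarea"]);
const INTERACTIVE_ROLES = new Set([
  "button", "link", "checkbox", "radio", "tab",
  "menuitem", "combobox", "textbox", "switch",
]);

// Detection based on HTML semantics only.
function isInteractiveByTag(el: PageElement): boolean {
  if (el.tag === "a") {
    return el.href !== undefined; // an <a> without href is not a link
  }
  return INTERACTIVE_TAGS.has(el.tag);
}

// Detection based on HTML semantics plus WAI-ARIA roles.
function isInteractive(el: PageElement): boolean {
  const byRole = el.role !== undefined && INTERACTIVE_ROLES.has(el.role);
  return isInteractiveByTag(el) || byRole;
}

// Count visible interactive elements under both strategies.
function countInteractive(elements: PageElement[]): { tagsOnly: number; withAria: number } {
  const visible = elements.filter((el) => !el.hidden);
  return {
    tagsOnly: visible.filter((el) => isInteractiveByTag(el)).length,
    withAria: visible.filter((el) => isInteractive(el)).length,
  };
}
```

A `<div role="button">` is counted only by the second strategy, which is the kind of element that accounts for the gap between the two columns in the table.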

Benchmarks

Fuji Chat's ability to complete real-world tasks has been compared with other agents using industry benchmarks. Fuji Chat achieves the highest success rate on six of the seven websites tested; WebVoyager leads on Apple.

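| Model | Allrecipes | ArXiv | Apple | Google Search | BBC News | GitHub | Cambridge Dictionary |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-4 (All Tools) | 11.1% | 17.1% | 44.2% | 60.5% | 9.5% | 48.8% | 25.6% |
| WebVoyager | 53.3% | 51.2% | 65.1% | 76.7% | 61.9% | 63.4% | 65.1% |
| Fuji Chat | 64.4% | 65.1% | 60.4% | 81.4% | 76.2% | 73.2% | 86.0% |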

Note: Fuji Chat was benchmarked using the GPT-4o model, while WebVoyager results are from their February 2024 report using GPT-4V.

A more detailed benchmark report will be released soon.
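For readers who want to compare the figures themselves, the following sketch computes an unweighted mean across the seven listed sites and finds the best-scoring model per site. The numbers are copied from the table above; the helper names `meanRate` and `bestPerSite` are introduced here for illustration. An unweighted mean ignores how many tasks each site contributed, so it should not be read as an official overall score.

```typescript
// Site names in the same order as the table columns.
const SITES = [
  "Allrecipes", "ArXiv", "Apple", "Google Search",
  "BBC News", "GitHub", "Cambridge Dictionary",
];

// Success rates (%) copied from the benchmark table.
const RESULTS: Record<string, number[]> = {
  "GPT-4 (All Tools)": [11.1, 17.1, 44.2, 60.5, 9.5, 48.8, 25.6],
  "WebVoyager": [53.3, 51.2, 65.1, 76.7, 61.9, 63.4, 65.1],
  "Fuji Chat": [64.4, 65.1, 60.4, 81.4, 76.2, 73.2, 86.0],
};

// Unweighted mean across sites, rounded to one decimal place.
function meanRate(rates: number[]): number {
  const sum = rates.reduce((acc, r) => acc + r, 0);
  return Math.round((sum / rates.length) * 10) / 10;
}

// Name of the best-scoring model for each site.
function bestPerSite(): Record<string, string> {
  const best: Record<string, string> = {};
  SITES.forEach((site, i) => {
    let leader = "";
    let top = -Infinity;
    for (const model of Object.keys(RESULTS)) {
      const rate = RESULTS[model][i];
      if (rate > top) {
        top = rate;
        leader = model;
      }
    }
    best[site] = leader;
  });
  return best;
}
```

On these figures, `bestPerSite()` reports Fuji Chat for every site except Apple, where WebVoyager scores higher.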

Limitations

While Fuji Chat offers advanced automation, some limitations exist:

1. Missing Semantics

Fuji Chat relies on semantic HTML and accessibility roles to identify interactive elements. Websites that do not follow accessibility standards may cause inconsistencies in detection.

2. Non-Semantic Web Technologies

Some applications, like Google Sheets, rely on Canvas/WebGL instead of standard HTML elements, making interaction difficult.

3. Limited Interaction Types

Currently, Fuji Chat can scroll entire webpages, but it may struggle with:

Scrolling within specific containers
Dropdown menus with excessive options
Drag-and-drop interactions

Workaround: Users can leverage Fuji Chat's "instructions" feature to manually guide interactions when necessary.

Future Development

Fuji Chat is more than a tool for automating online tasks; it aims to be a state-of-the-art web automation agent for complex workflows.

1. Supporting Programmatic Usage

Fuji Chat will introduce a JavaScript API to facilitate integration with automation frameworks like:

Puppeteer
Playwright
Selenium

This API will enable:

Automated performance benchmarking
Fuji Chat as a sub-agent in larger AI systems
Task execution triggered by external signals (e.g., scheduled tasks, email triggers, etc.)
Cloud-based Fuji Chat services
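Since the API has not been published, the following is only a sketch of what programmatic usage might look like. Every name here (`TaskRequest`, `TaskResult`, `AgentClient`, `ExternalEvent`, `TriggerRule`, `tasksFor`) is hypothetical. The example shows how external signals such as a schedule or an incoming email could be mapped to tasks for the agent to run.

```typescript
// Hypothetical request and result shapes; the real API may differ entirely.
interface TaskRequest {
  instruction: string;
  startUrl: string;
}

interface TaskResult {
  success: boolean;
  summary: string;
}

// Hypothetical client that an automation framework or parent agent would call.
interface AgentClient {
  run(task: TaskRequest): Promise<TaskResult>;
}

// External signals that could trigger a task (assumed for this sketch).
type ExternalEvent =
  | { kind: "schedule"; name: string }
  | { kind: "email"; subject: string };

interface TriggerRule {
  matches: (event: ExternalEvent) => boolean;
  task: TaskRequest;
}

// Rule helper: fire on a named schedule.
function onSchedule(name: string): (event: ExternalEvent) => boolean {
  return (event) => event.kind === "schedule" && event.name === name;
}

// Rule helper: fire when an email subject contains a fragment.
function onEmailSubject(fragment: string): (event: ExternalEvent) => boolean {
  return (event) => event.kind === "email" && event.subject.indexOf(fragment) !== -1;
}

// Collect every task whose rule matches the incoming event.
function tasksFor(event: ExternalEvent, rules: TriggerRule[]): TaskRequest[] {
  return rules.filter((rule) => rule.matches(event)).map((rule) => rule.task);
}
```

Keeping trigger matching separate from task execution means the same rules could feed a local extension, a sub-agent in a larger system, or a cloud service.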

2. Cross-Tab Workflows

Most real-world automation spans multiple websites and requires context awareness across tabs. Fuji Chat will introduce:

Cross-tab memory to retain information between sessions
Seamless automation even when switching tabs
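One way to picture cross-tab memory is a shared store in which facts recorded in one tab can be read from any other, and which can be serialized so it survives between sessions. The `CrossTabMemory` class below is a hypothetical sketch under that assumption, not a description of the planned implementation.

```typescript
// A single remembered fact and the tab that recorded it (hypothetical shape).
interface MemoryEntry {
  tabId: number;
  key: string;
  value: string;
}

class CrossTabMemory {
  private entries: MemoryEntry[] = [];

  // Record a fact; a newer value from the same tab replaces the older one.
  remember(tabId: number, key: string, value: string): void {
    this.entries = this.entries.filter((e) => !(e.tabId === tabId && e.key === key));
    this.entries.push({ tabId, key, value });
  }

  // Most recently stored value for a key, regardless of which tab recorded it.
  recall(key: string): string | undefined {
    for (let i = this.entries.length - 1; i >= 0; i--) {
      if (this.entries[i].key === key) {
        return this.entries[i].value;
      }
    }
    return undefined;
  }

  // Serialize so the memory can be persisted between sessions.
  serialize(): string {
    return JSON.stringify(this.entries);
  }

  // Rebuild a memory from previously serialized data.
  static restore(json: string): CrossTabMemory {
    const memory = new CrossTabMemory();
    memory.entries = JSON.parse(json) as MemoryEntry[];
    return memory;
  }
}
```

For example, an order number read in one tab could be recalled when filling in a support form in another.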

3. Copilot Mode

Fuji Chat will proactively seek user input when necessary, such as:

Logging in
Entering verification codes
Reviewing actions before proceeding
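The decision of when to pause for the user could be sketched as a simple check on each planned action. The `PlannedAction` and `Decision` types and the `reviewAction` function are hypothetical; the three pause conditions mirror the cases listed above.

```typescript
// Hypothetical description of an action the agent is about to take.
type PlannedAction =
  | { kind: "click"; target: string }
  | { kind: "type"; target: string; fieldType: "text" | "password" | "one-time-code" }
  | { kind: "submit"; target: string; irreversible: boolean };

type Decision =
  | { proceed: true }
  | { proceed: false; reason: string };

// Decide whether the agent may continue or should hand control to the user.
function reviewAction(action: PlannedAction): Decision {
  if (action.kind === "type" && action.fieldType === "password") {
    return { proceed: false, reason: "login required" };
  }
  if (action.kind === "type" && action.fieldType === "one-time-code") {
    return { proceed: false, reason: "verification code required" };
  }
  if (action.kind === "submit" && action.irreversible) {
    return { proceed: false, reason: "review before proceeding" };
  }
  return { proceed: true };
}
```

Returning a reason alongside the decision lets the side panel tell the user why the agent stopped.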

4. Long-Term & Decentralized Memory

To improve efficiency, Fuji Chat is extending its "Prior Knowledge Augmentation" system to store task-specific insights. Planned improvements include:

Task saving for quick re-use
Task & instruction sharing
A knowledge extraction tool to create general automation rules
A Wikipedia-like knowledge base where users can collaboratively enhance Fuji Chat’s capabilities
Autonomous site exploration to generate useful automation instructions

Fuji Chat is committed to building a powerful, intelligent AI partner for modern web automation. 🚀

