The web agent that sees, thinks,
and clicks for you

Opticlick Engine is a Chrome Extension that autonomously navigates any website. Describe your goal in plain English — it handles every click.

Add to Chrome (it's free) · View on GitHub

opticlick-engine · agent log

00:00.124 [THINK] Goal received: "Book the cheapest flight from NYC to London next week"

00:00.381 [ACT] Annotating 47 interactable elements, piercing 3 Shadow DOM roots

00:00.692 [OBSERVE] Screenshot captured (2560×1600). Sending to Gemini 3.1 Pro…

00:01.204 [THINK] Model identified target #12: "Search flights" button (confidence 0.97)

00:01.209 [ACT] CDP Input.dispatchMouseEvent → {x: 724, y: 432} scaled for devicePixelRatio 2×

00:01.480 [OBSERVE] DOM idle confirmed. 12 results rendered. Proceeding to Step 2…

How it works

Think. Act. Observe.
Repeat until done.

Opticlick runs a continuous perception–action loop, powered by multimodal AI, until your goal is complete.

🗣️

You describe the goal

Type a plain-English instruction in the extension popup. No selectors, no scripting required.

🎯

Page is annotated

A canvas overlay numbers every clickable element on the page, piercing Shadow DOM and iframes.

🧠

AI picks the target

A screenshot is sent to Gemini 3.1 Pro with your prompt. The model returns the ID of the element to interact with.

🖱️

Hardware-level click

The Chrome DevTools Protocol fires a real mouse event sequence — bypassing React, Vue, and Angular synthetic event guards.

🔁

Loop until complete

After each action the agent observes the new page state and decides the next move, autonomously.
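The five steps above can be sketched as a single loop. This is a minimal illustration, not Opticlick's actual code: the `AgentDeps` interface, the `Decision` type, the `runAgent` function, and the `maxSteps` cap are all hypothetical names introduced here, and the real annotation, model call, and click are assumed to be supplied by the caller.

```typescript
// Hypothetical types for illustration; none of these names come from Opticlick.
type Decision = { done: true } | { done: false; targetId: number };

interface AgentDeps {
  waitForIdle: () => Promise<void>; // resolves once the page has settled
  annotate: () => Promise<number>; // draws numbered marks, returns their count
  decide: (goal: string, step: number) => Promise<Decision>; // model call
  click: (targetId: number) => Promise<void>; // dispatches the click
}

// Runs observe -> think -> act until the model reports completion.
// Returns the number of clicks performed before the goal was reached.
async function runAgent(
  goal: string,
  deps: AgentDeps,
  maxSteps = 20, // assumed safety cap so a confused model cannot loop forever
): Promise<number> {
  for (let step = 0; step < maxSteps; step++) {
    await deps.waitForIdle();
    await deps.annotate();
    const decision = await deps.decide(goal, step);
    if (decision.done) return step;
    await deps.click(decision.targetId);
  }
  throw new Error(`Goal not reached within ${maxSteps} steps`);
}
```

Passing the page operations in as functions keeps the loop itself independent of any browser API, so it can be exercised with stubs.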

Features

Built for the modern web

Designed to handle dynamic SPAs, cross-origin iframes, and high-DPI displays out of the box.

👁️

Set-of-Mark Vision

Numbered bounding boxes rendered on a unified canvas give the LLM a precise, unambiguous spatial map of every interactable element.

🌐

Cross-origin iframe support

Content scripts are injected into all frames, including sandboxed third-party iframes, so embedded widgets are never out of reach.

🔮

Shadow DOM traversal

Recursively pierces open Shadow DOM roots to discover components hidden inside Web Components and design-system libraries.
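A rough sketch of what recursive traversal through open shadow roots could look like follows. The `TreeNode` interface and the `collectDeep` function are hypothetical names used only for this example. The interface is kept structural so that real DOM `Element` and `ShadowRoot` objects should satisfy it, while plain objects can stand in for them outside a browser.

```typescript
// Minimal structural type (an assumption for this sketch). In a browser,
// element.shadowRoot is null for closed roots, so only open roots are entered.
interface TreeNode {
  children: ArrayLike<TreeNode>;
  shadowRoot?: TreeNode | null;
}

// Depth-first walk that descends into a node's shadow root before its
// light-DOM children and collects every node accepted by isTarget.
function collectDeep(
  root: TreeNode,
  isTarget: (node: TreeNode) => boolean,
): TreeNode[] {
  const found: TreeNode[] = [];
  const visit = (node: TreeNode): void => {
    if (isTarget(node)) found.push(node);
    if (node.shadowRoot) visit(node.shadowRoot);
    for (const child of Array.from(node.children)) visit(child);
  };
  visit(root);
  return found;
}
```

In a page, a call such as `collectDeep(document.body, (n) => n instanceof HTMLButtonElement)` would be one possible way to use it; the predicate that defines "clickable" is left to the caller.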

⚡

CDP hardware simulation

Uses the chrome.debugger API to dispatch real mouseMoved → mousePressed → mouseReleased events, indistinguishable from physical input.
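As an illustration of that event sequence, the sketch below builds the three parameter objects for the CDP method `Input.dispatchMouseEvent`. The `MouseEventParams` type and `buildClickSequence` function are hypothetical names, and the choice of a single left click is an assumption made for the example.

```typescript
// Subset of the Input.dispatchMouseEvent parameters used by this sketch.
interface MouseEventParams {
  type: "mouseMoved" | "mousePressed" | "mouseReleased";
  x: number; // CSS pixels, relative to the viewport
  y: number;
  button: "none" | "left";
  clickCount: number;
}

// Builds the move -> press -> release sequence for one left click at (x, y).
function buildClickSequence(x: number, y: number): MouseEventParams[] {
  return [
    { type: "mouseMoved", x, y, button: "none", clickCount: 0 },
    { type: "mousePressed", x, y, button: "left", clickCount: 1 },
    { type: "mouseReleased", x, y, button: "left", clickCount: 1 },
  ];
}
```

Inside an extension, each object could then be sent in order with `chrome.debugger.sendCommand({ tabId }, "Input.dispatchMouseEvent", params)` once the debugger has been attached to the tab with `chrome.debugger.attach`.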

📐

Retina-safe coordinates

Click coordinates are automatically divided by devicePixelRatio before dispatch, ensuring pixel-perfect accuracy on any display.
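The arithmetic behind this is small enough to show directly. The sketch assumes that screenshot coordinates are in device pixels while the dispatched event expects CSS pixels; `Point` and `toCssPixels` are hypothetical names for this example.

```typescript
interface Point {
  x: number;
  y: number;
}

// Converts a point measured on the screenshot (device pixels) into the
// CSS-pixel coordinates used when dispatching the click.
function toCssPixels(screenshotPoint: Point, devicePixelRatio: number): Point {
  if (!(devicePixelRatio > 0)) {
    throw new RangeError("devicePixelRatio must be a positive number");
  }
  return {
    x: screenshotPoint.x / devicePixelRatio,
    y: screenshotPoint.y / devicePixelRatio,
  };
}
```

For example, on a 2× display a target found at (1448, 864) on the screenshot maps to (724, 432) in CSS pixels.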

💾

Persistent state across restarts

MV3 service workers are ephemeral — Opticlick uses chrome.storage.session and IndexedDB so the agent never loses its place.
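One way such checkpointing could work is sketched below. The `AgentCheckpoint` shape, the `agentCheckpoint` key, and both function names are assumptions made for illustration. Only the promise-returning `get` and `set` calls are modeled on the `chrome.storage` areas, so `chrome.storage.session` could plausibly be passed in as the `area` argument.

```typescript
// Narrow view of a storage area; chrome.storage areas expose get/set like this.
interface StorageArea {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Hypothetical checkpoint shape: just enough to resume the loop.
interface AgentCheckpoint {
  goal: string;
  step: number;
}

const CHECKPOINT_KEY = "agentCheckpoint"; // assumed key name

// Persists the current progress so a restarted service worker can resume.
async function saveCheckpoint(
  area: StorageArea,
  checkpoint: AgentCheckpoint,
): Promise<void> {
  await area.set({ [CHECKPOINT_KEY]: checkpoint });
}

// Returns the last saved checkpoint, or null when nothing has been stored.
async function loadCheckpoint(
  area: StorageArea,
): Promise<AgentCheckpoint | null> {
  const items = await area.get(CHECKPOINT_KEY);
  const value = items[CHECKPOINT_KEY] as AgentCheckpoint | undefined;
  return value ?? null;
}
```

A service worker would call `loadCheckpoint` on startup and `saveCheckpoint` after each completed step.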

🛡️

Input blocking

Capturing event listeners prevent accidental user interference while the agent is mid-task, with a clear visual indicator when active.

⏳

DOM idle detection

A MutationObserver-based idle gate ensures annotations are drawn only once network and DOM activity have settled.
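The core of an idle gate is a quiet-period check, sketched here without any browser dependency. `createIdleGate`, `recordActivity`, and `isIdle` are hypothetical names, and the 300 ms quiet period in the usage note is an arbitrary assumption.

```typescript
// Tracks the most recent activity timestamp and reports idleness once
// quietMs milliseconds have passed without further activity.
function createIdleGate(quietMs: number, startedAt: number) {
  let lastActivity = startedAt;
  return {
    // Call whenever activity is seen, e.g. from a MutationObserver callback.
    recordActivity(at: number): void {
      lastActivity = Math.max(lastActivity, at);
    },
    // True when the quiet period has fully elapsed at time `now`.
    isIdle(now: number): boolean {
      return now - lastActivity >= quietMs;
    },
  };
}
```

In a page, a gate created with `createIdleGate(300, performance.now())` could be fed by `new MutationObserver(() => gate.recordActivity(performance.now()))` observing `document` with `{ childList: true, subtree: true, attributes: true }`. A MutationObserver only reports DOM changes, so network activity would need a separate signal feeding the same `recordActivity` call.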

🔒

Minimal permissions

Requests only activeTab, scripting, debugger, and storage. No broad host permissions. Your API key stays local.

Engineered on open standards

Ready to automate
your browsing?
