D2Snap

Example of downsampling on an image (top) and a DOM (bottom)

D2Snap is a first-of-its-kind DOM downsampling algorithm, designed for use with LLM-based web agents.

Integrate

D2Snap.d2Snap(
  dom: DOM,
  rE: number, rA: number, rT: number,
  options?: Options
): Promise<{
  html: string;
  meta: {};
}>

D2Snap.adaptiveD2Snap(
  dom: DOM,
  maxTokens: number = 4096,
  maxIterations: number = 5,
  options?: Options
): Promise<{
  html: string;
  meta: {};
}>

type DOM = Document | Element | string;
type Options = {
  debug?: boolean;                      // false
  groundTruth?: object;                 // compare src/types.ts:GroundTruthJSON
  groundTruthReplaceDefault?: boolean;  // false
  skipMarkdown?: boolean;               // false
  uniqueIDs?: boolean;                  // false
};

The downsampling ground truth can be overridden via options.groundTruth (full replacement via groundTruthReplaceDefault: true). Wildcards for aria and data attributes are supported ({aria-|data-}*).
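The signature of adaptiveD2Snap suggests a search for parameters that bring the snapshot under a token budget within a bounded number of attempts. The following is a minimal sketch of how such a loop could look, not the library's actual implementation. The names fitToBudget and estimateTokens, the linear ratio schedule, the 4-characters-per-token estimate, and the idea that a larger ratio means stronger downsampling are all assumptions made for illustration.

```typescript
// Hypothetical local types mirroring the documented signature (string input only).
type Snapshot = { html: string; meta: {} };
type DownsampleFn = (
  dom: string,
  rE: number,
  rA: number,
  rT: number
) => Promise<Snapshot>;

// Rough token estimate (assumption: about 4 characters per token).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Raise all three ratios in equal steps until the snapshot fits the budget.
// Assumption: a larger ratio means stronger downsampling.
async function fitToBudget(
  downsample: DownsampleFn,
  dom: string,
  maxTokens = 4096,
  maxIterations = 5
): Promise<Snapshot & { ratio: number; fits: boolean }> {
  let last: Snapshot = { html: dom, meta: {} };
  let ratio = 0;
  for (let i = 1; i <= maxIterations; i++) {
    ratio = i / maxIterations; // 0.2, 0.4, ..., 1.0 for five iterations
    last = await downsample(dom, ratio, ratio, ratio);
    if (estimateTokens(last.html) <= maxTokens) {
      return { ...last, ratio, fits: true };
    }
  }
  // Budget not reached: return the most aggressive attempt and say so.
  return { ...last, ratio, fits: false };
}
```

Passing the downsampling function as a parameter keeps the sketch independent of the package; in a project, D2Snap.d2Snap could be supplied in its place, provided its signature matches the one documented above.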

Browser

<script src="https://cdn.jsdelivr.net/gh/webfuse-com/D2Snap@main/dist.browser/D2Snap.js"></script>

Module

npm install webfuse-com/D2Snap

Install jsdom to use the library with Node.js:

npm install jsdom

import * as D2Snap from "@webfuse-com/d2snap";
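Once the module is imported, a call might look like the sketch below. It relies only on the two signatures documented in this README. The D2SnapLike interface is a hypothetical local stand-in so the snippet type-checks without the package installed, the helper names buildAgentPrompt and fixedSnapshot are made up for illustration, and the ratio value 0.5 is arbitrary.

```typescript
// Local stand-ins for the types documented in this README (string input only).
type Options = { debug?: boolean; skipMarkdown?: boolean; uniqueIDs?: boolean };
type Snapshot = { html: string; meta: {} };

interface D2SnapLike {
  d2Snap(
    dom: string,
    rE: number,
    rA: number,
    rT: number,
    options?: Options
  ): Promise<Snapshot>;
  adaptiveD2Snap(
    dom: string,
    maxTokens?: number,
    maxIterations?: number,
    options?: Options
  ): Promise<Snapshot>;
}

// Build an agent prompt from a downsampled snapshot of the page.
async function buildAgentPrompt(
  api: D2SnapLike,
  pageHtml: string,
  task: string
): Promise<string> {
  // Token budget and iteration count are the README defaults, spelled out.
  const { html } = await api.adaptiveD2Snap(pageHtml, 4096, 5);
  return `Task: ${task}\n\nPage snapshot:\n${html}`;
}

// Alternative: fixed parameters. The value 0.5 is purely illustrative.
async function fixedSnapshot(api: D2SnapLike, pageHtml: string): Promise<string> {
  // Options are optional; debug defaults to false per the README.
  const { html } = await api.d2Snap(pageHtml, 0.5, 0.5, 0.5, { debug: false });
  return html;
}
```

In a project, the imported namespace would be passed as the first argument, for example buildAgentPrompt(D2Snap, pageHtml, "Add a Margherita to the cart"), assuming the exported functions match the signatures above.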

Example

<main class="container" tabindex="3" required="true">
  <div class="mx-auto" data-topic="products" required="false">
    <h1>Our Pizza</h1>
    <div aria-description="Choose one product">
      <strong>Choose one</strong>
      <section class="shadow-lg">
        <h2>Margherita</h2>
        <p>
          A simple classic: mozzarella, tomatoes and basil.
          An everyday choice!
        </p>
        <button type="button">Add</button>
      </section>
      <section class="shadow-lg">
        <h2>Capricciosa</h2>
        <p>
          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.
          A true favourite!
        </p>
        <button type="button">Add</button>
      </section>
    </div>
  </div>
</main>

↓ D2Snap ↓

<main class="container" required="true">
  # Our Pizza
  <section aria-description="Choose one product" class="shadow-lg">
    **Choose one**
    ## Margherita
    A simple classic mozzarella tomatoes and basil
    <button>
      Add
    </button>
    ## Capricciosa
    A rich taste
    A true favourite
    <button>
      Add
    </button>
  </section>
</main>

↓ D2Snap ↓

# Our Pizza
**Choose one**
## Margherita
A simple classic
<button>Add</button>
## Capricciosa
A rich taste
<button>Add</button>

Experiment

Setup

npm install
npm install jsdom

Build

npm run build

Test

npm run test

Evaluate

Add your LLM provider API key(s) to .env (see the provided example).

npm run eval:<snapshot>

<snapshot> ∈ { gui, dom, bu, D2Snap }

npm run eval:D2Snap -- --verbose --split 10,20 --provider openai --model gpt-4o

Re-create Snapshots

npm run snapshots:create

Beyond Pixels: Exploring DOM Downsampling for LLM-Based Web Agents
Thassilo M. Schiepanski, Nicholas Piël
Surfly BV
