.. _embedding:

Embedding Buckaroo
==================

Buckaroo started life as a Jupyter widget. It still works that way: the table
that pops up after ``import buckaroo`` is the same component you'll be
embedding. But there are now several other ways to render that component
outside of a notebook: static HTML files, custom web pages, a standalone
server, and Solara apps. This guide is a map of those options so you can pick
the one that fits your use case.

The decision comes down to two axes:

1. **Which widget?** Full Buckaroo UI (status bar, summary stats, command UI,
   sampling toggle) vs. a plain DFViewer table. Eager-loaded base vs.
   infinite-scrolling.
2. **Which deployment?** Notebook kernel, static HTML, custom HTML + JS,
   Buckaroo server, or Solara.

Pick a widget and a deployment; almost any combination works.

Widget types
------------

There are two orthogonal choices that produce four widget classes.

**Buckaroo vs. DFViewer** (how much UI shows up):

- **BuckarooWidget** is the full experience. Above the table is the status bar
  with toggles for summary statistics (``Σ``), command-edit mode (``λ``),
  sampling (``Ξ``), and help (``?``). Below the status bar there's a tabbed
  display switcher. Use this when you want users to *explore and clean* data.
- **DFViewer** is just the data grid: sortable columns, formatting, histograms
  in the header, but no status bar, no command UI, no summary stats panel. Use
  this when you want a styled read-only table inside a larger app or page.

**Base vs. Infinite** (how rows reach the browser):

- **Base** widgets serialize the entire (sampled) DataFrame up front and ship
  it to the browser in one shot. Sampling kicks in around 10k rows by default
  to keep payloads reasonable.
- **Infinite** widgets stream rows on demand. The browser asks for a row range;
  the Python side serializes that slice as parquet and sends it back. Sorting
  is also pushed to the server.
  This scales to dataframes that won't fit in the browser, at the cost of a
  live Python connection.

The four classes are:

.. list-table::
   :header-rows: 1
   :widths: 24 38 38

   * -
     - **Base** (eager)
     - **Infinite** (lazy)
   * - **Buckaroo** (full UI)
     - ``BuckarooWidget``
     - ``BuckarooInfiniteWidget``
   * - **DFViewer** (table only)
     - ``DFViewer`` (helper)
     - ``DFViewerInfinite``

For polars, swap the prefix: ``PolarsBuckarooWidget``,
``PolarsBuckarooInfiniteWidget``, ``PolarsDFViewer``.

For xorq (ibis expressions): ``XorqBuckarooWidget``,
``XorqBuckarooInfiniteWidget``. The xorq path doesn't currently expose a
DFViewer-only variant; it ships with the full Buckaroo UI.

Picking between them:

- Default to ``BuckarooWidget`` in notebooks. It's the full pitch.
- Use ``DFViewer`` when Buckaroo is a component of a larger UI you've already
  built (a Solara dashboard, a static report page).
- Use the Infinite variants when the dataframe is too big to ship eagerly, or
  when you want server-side sorting on the full set rather than only the
  sampled subset.

Backends: pandas, polars, and xorq
----------------------------------

Buckaroo supports three backends. The selection happens at the import path:

.. code-block:: python

   # Pandas
   from buckaroo import BuckarooWidget, BuckarooInfiniteWidget, DFViewer

   # Polars
   from buckaroo.polars_buckaroo import (
       PolarsBuckarooWidget, PolarsBuckarooInfiniteWidget, PolarsDFViewer)

   # Xorq / ibis expressions
   from buckaroo.xorq_buckaroo import (
       XorqBuckarooWidget, XorqBuckarooInfiniteWidget)

The user-facing UI is identical across all three: same status bar, same column
histograms, same command UI. What differs is internal: the analysis classes
(mean, median, null counts, histograms, etc.) are implemented against each
library's native API, so neither pandas nor polars pays a conversion cost to
render, and xorq pushes computation down to whatever backend is behind the
expression.

A few entry points accept either pandas or polars frames and dispatch by type.
The static-embed helpers (``prepare_buckaroo_artifact``, ``to_html``) inspect
the input and pick the right widget class for you. ``LazyFrame`` is collected
to a ``DataFrame`` first.

Polars is an optional dependency: ``pip install buckaroo[polars]``. Without it,
the polars import paths simply aren't there, and the pandas classes work the
same.

**xorq** is a third backend, built on xorq/ibis, that takes an *expression*
rather than a materialized frame. The stat pipeline compiles to a small, fixed
number of batched SQL queries: one ``expr.aggregate(...)`` for length /
null-count / min / max / distinct-count across every column, plus the histogram
queries. Computation stays in the engine; the only thing pulled into Python is
a display-sized sample (``expr.limit(N).execute()``). This means Buckaroo can
render summary statistics over DuckDB, Postgres, Snowflake, BigQuery, and any
other ibis-supported engine without materializing the table.

.. code-block:: python

   import xorq.api as xo
   from buckaroo.xorq_buckaroo import XorqBuckarooInfiniteWidget

   con = xo.connect()  # built-in datafusion
   expr = con.read_parquet("citibike-trips.parquet")
   XorqBuckarooInfiniteWidget(expr)

The default backend is xorq's built-in datafusion engine. Swap to duckdb,
postgres, etc. by registering the table on the relevant connection:

.. code-block:: python

   import xorq.api as xo
   from buckaroo.xorq_buckaroo import XorqBuckarooInfiniteWidget

   con = xo.duckdb.connect("warehouse.db")
   trips = con.table("trips")  # bind once so the filter refers to the same table
   expr = trips.filter(trips.year == 2024)
   XorqBuckarooInfiniteWidget(expr)

The Infinite variant is usually what you want for xorq: each scroll window
pushes a ``LIMIT/OFFSET`` to the backend and streams the resulting Arrow window
straight to the browser. Postprocessing is expression-to-expression: register a
function that takes the current expression and returns a new one, and stats
keep pushing down to the engine.

Install with ``pip install 'buckaroo[xorq]'``. See :doc:`xorq-stats` for a
walkthrough of the underlying stat pipeline and how to add custom aggregates.
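
The two query shapes described in this section (one batched aggregate for
per-column stats, and one ``LIMIT/OFFSET`` window per scroll request) can be
sketched with the standard-library ``sqlite3`` module standing in for the
engine. This is an illustration only, not Buckaroo's or xorq's implementation:
``batched_stats`` and ``fetch_window`` are hypothetical helper names, and the
``trips`` table is made-up sample data.

```python
import sqlite3


def batched_stats(con, table, columns):
    """Length, null count, min, max and distinct count for every column,
    computed in one aggregate query (hypothetical helper, not Buckaroo API)."""
    parts = ["COUNT(*)"]
    for col in columns:
        parts += [
            f'SUM("{col}" IS NULL)',       # IS NULL yields 0/1 in SQLite
            f'MIN("{col}")',
            f'MAX("{col}")',
            f'COUNT(DISTINCT "{col}")',    # NULLs are not counted
        ]
    row = con.execute(f'SELECT {", ".join(parts)} FROM "{table}"').fetchone()
    stats = {}
    for i, col in enumerate(columns):
        # Each column contributes four consecutive values after COUNT(*).
        null_count, lo, hi, distinct = row[1 + 4 * i : 5 + 4 * i]
        stats[col] = {
            "length": row[0],
            "null_count": null_count,
            "min": lo,
            "max": hi,
            "distinct_count": distinct,
        }
    return stats


def fetch_window(con, table, start, end):
    """Return rows [start, end) using LIMIT/OFFSET (hypothetical helper)."""
    sql = f'SELECT * FROM "{table}" ORDER BY rowid LIMIT ? OFFSET ?'
    return con.execute(sql, (end - start, start)).fetchall()


# Made-up sample data standing in for a real warehouse table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trips (duration INTEGER, station TEXT)")
con.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [(300, "A"), (120, "B"), (None, "A"), (450, None), (120, "C")],
)

stats = batched_stats(con, "trips", ["duration", "station"])
window = fetch_window(con, "trips", 1, 3)  # the rows a scroll request might ask for
```

The point of the sketch is that the number of queries does not grow with the
number of columns, and that only the requested window of rows ever leaves the
engine. The ``ORDER BY rowid`` is there only to make the window deterministic
in SQLite; how a real backend orders its windows is outside this sketch.
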

Embedding modes
---------------

The Python widget has the same surface area in every mode. What changes is
*where* the JS bundle runs and *how* data gets to it.

1. Notebook (anywidget)
~~~~~~~~~~~~~~~~~~~~~~~

This is the original deployment. Buckaroo is an anywidget, so it works in any
notebook host that speaks the Jupyter widget protocol: JupyterLab, classic
Notebook 7, marimo, VS Code, JupyterLite, Google Colab.

.. code-block:: python

   import pandas as pd
   from buckaroo import BuckarooWidget

   df = pd.read_csv("sales.csv")
   BuckarooWidget(df)

The kernel runs your Python; ``anywidget`` ships ``widget.js`` to the front end
and wires up the bidirectional traitlet sync. For Infinite widgets the kernel
also handles row-range requests over the comm channel.

When to use it: you're already in a notebook. ``import buckaroo`` also installs
Buckaroo as the default DataFrame display, so a bare ``df`` cell renders the
widget, with no widget class needed.

2. Static HTML
~~~~~~~~~~~~~~

``buckaroo.to_html()`` renders a complete HTML document with the data embedded
as base64-encoded parquet inside a ``