Clean Room for Data Scientists

Version 1.0 | March 2026

Beta Access — Clean Room is currently in beta and not yet generally available. To request early access, email [email protected].

The Gap Between Exploration and Delivery

Data science moves fast. A typical analysis starts as a few lines in a notebook, grows into a sprawling series of cells with experiments, dead ends, and half-commented code blocks—each one a record of the thinking that got you to the answer. That's not messiness. That's the scientific process.

The problem isn't the exploration. The problem is what happens after you've found something worth sharing.

You've built something valuable: a model, a visualization, an analysis that decision-makers need to understand. Now you face a different kind of work—translating that discovery into something reproducible, explainable, and interactive. Usually that means:

Cleaning up the notebook to make it presentable
Answering endless "what if we changed X?" questions from stakeholders
Rerunning the whole analysis every time someone wants a slightly different cut
Hoping whoever runs it next has the same package versions you did

GoFigr Clean Room is built to close that gap—without disrupting how you actually work.

What Clean Room Does

Clean Room transforms a Python function into a self-contained, browser-based interactive application. You write a function, decorate it with @reproducible, and that's it. GoFigr creates a clean boundary between this function and the rest of the notebook: it only has access to the variables you give it, the packages you declare, and nothing else.

When you call the function and it produces a figure, GoFigr packages the function and the figure together and publishes them as an interactive web application. No manual step is needed: in Jupyter, figure capture is fully automatic.
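To make the "clean boundary" idea concrete, here is a minimal sketch in plain Python. It is not GoFigr's implementation; `run_isolated` is a hypothetical helper that only illustrates what it means for a function to see its arguments and nothing else from the notebook.

```python
import textwrap


def run_isolated(source: str, func_name: str, **inputs):
    """Define `func_name` from `source` in a fresh namespace and call it.

    Hypothetical helper for illustration only. The function can reach
    Python builtins, its own imports, and the arguments passed in, but
    none of the caller's global variables.
    """
    namespace = {}  # exec() adds __builtins__ to an empty namespace automatically
    exec(textwrap.dedent(source), namespace)
    return namespace[func_name](**inputs)


notebook_scale = 10  # a notebook-level variable the function should not see

# This version silently depends on notebook state.
leaky_source = """
def scaled(values):
    return [v * notebook_scale for v in values]
"""

# This version declares everything it needs as a parameter.
clean_source = """
def scaled(values, scale):
    return [v * scale for v in values]
"""

try:
    run_isolated(leaky_source, "scaled", values=[1, 2, 3])
    leaked = True
except NameError:
    # The hidden dependency is caught as soon as the function runs in isolation.
    leaked = False

clean_result = run_isolated(
    clean_source, "scaled", values=[1, 2, 3], scale=notebook_scale
)
```

The leaky version fails as soon as it is separated from the notebook, which is the kind of hidden dependency an isolated function is meant to surface early.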

How It Works

Setup: One Magic, One Decorator

In your notebook, load the GoFigr extension once. This enables automatic capture whenever a @reproducible function runs:

# Cell 1 — imports and GoFigr setup
from typing import Literal
import seaborn as sns

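# Enables automatic figure capture and injects reproducible, SliderParam, DropdownParam, etc.
# (The comment sits on its own line because IPython passes everything after a line magic to the magic.)
%load_ext gofigr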

Then decorate your analysis function with @reproducible(interactive=True):

# Cell 2 — define the analysis as a reproducible function

@reproducible(interactive=True)
def flipper_length_distribution(
    data,                    # DataFrames are passed in from the notebook -- not embedded in source
    bins: int     = SliderParam(20, min=5, max=100, step=5),
    alpha: float  = SliderParam(0.7, min=0.1, max=1.0, step=0.05),
    show_kde: Literal["yes", "no", "auto"] = "yes",  # Literal types become dropdowns automatically
    species: str  = DropdownParam("Adelie", choices=["Adelie", "Chinstrap", "Gentoo"]),
    show_grid: bool = True,  # booleans become checkboxes
    title: str    = "Flipper Length Distribution"
):
    filtered = data[data['species'] == species]
    # Map the dropdown choice onto the value passed to histplot's kde argument
    kde = True if show_kde == "yes" else (False if show_kde == "no" else None)

    ax = sns.histplot(
        data=filtered,
        x='flipper_length_mm',
        bins=bins,
        alpha=alpha,
        kde=kde,
    )
    ax.set_title(title)
    if show_grid:
        ax.grid(True, alpha=0.3)


# Cell 3 — call it like a normal function
# GoFigr captures the output and packages everything automatically
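penguins_df = sns.load_dataset("penguins")  # assumption: seaborn's sample penguins dataset; substitute your own DataFrame
flipper_length_distribution(penguins_df)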

That's the entire integration. There's no separate publish step, no export, no post-processing. Running the function in Jupyter is all it takes.

What Gets Captured

When the function runs, GoFigr captures:

Source code — the function body, extracted cleanly from the notebook cell
Parameters — types, defaults, and widget configuration for every parameter
Data — DataFrames passed as arguments, serialized and stored alongside the revision
Environment — package names and versions, imports, Python version
Output — the figures produced by the run

Each run creates a new revision. Every revision is immutable and traceable. You always know exactly what produced a given figure.
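As a rough illustration of the parameter and environment items in that list, the sketch below uses only the Python standard library. It does not show how GoFigr performs the capture; `describe_parameters` and `describe_environment` are hypothetical helper names.

```python
import inspect
import platform
from importlib import metadata


def describe_parameters(func):
    """Record the annotation and default of each parameter of `func`."""
    described = {}
    for name, param in inspect.signature(func).parameters.items():
        has_annotation = param.annotation is not inspect.Parameter.empty
        has_default = param.default is not inspect.Parameter.empty
        described[name] = {
            "annotation": (
                getattr(param.annotation, "__name__", str(param.annotation))
                if has_annotation
                else None
            ),
            "required": not has_default,
            "default": param.default if has_default else None,
        }
    return described


def describe_environment(package_names):
    """Record the Python version and the installed version of each package."""
    packages = {}
    for name in package_names:
        try:
            packages[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            packages[name] = None  # not installed in this environment
    return {"python": platform.python_version(), "packages": packages}


def histogram(data, bins: int = 20, title: str = "Flipper Length Distribution"):
    return len(data), bins, title


snapshot = {
    "parameters": describe_parameters(histogram),
    "environment": describe_environment(["seaborn", "package-that-is-not-installed"]),
}
```

Storing a snapshot like this next to each output is what makes it possible to say later which package versions and defaults produced a given figure.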

Parameters and Widgets

GoFigr maps Python types to interactive controls automatically. You can use type annotations and Literal for the common cases, or use explicit parameter classes when you need more control:

Type                          | Widget              | Example
------------------------------+---------------------+----------------------------------------------------
int / float with SliderParam  | Slider with bounds  | bins: int = SliderParam(20, min=5, max=100, step=5)
str with Literal[...]         | Dropdown (inferred) | show_kde: Literal["yes", "no", "auto"] = "yes"
str with DropdownParam        | Dropdown (explicit) | species = DropdownParam("Adelie", choices=[...])
bool                          | Checkbox            | show_grid: bool = True
Free-form str                 | Text input          | title: str = "My Chart"
pd.DataFrame                  | Static (read-only)  | Passed in at call time, available in studio

One thing worth noting: data is passed in at call time from your notebook, not hardcoded in the function. GoFigr serializes it automatically. The function stays general; the data is bound to the specific revision.
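The type-to-widget mapping described in this section can be sketched with the standard `inspect` and `typing` modules. This is an illustration of the idea only, not GoFigr's code: the `Slider` class is a hypothetical stand-in for an explicit parameter class, and the widget names are plain strings chosen for the example.

```python
import inspect
from dataclasses import dataclass
from typing import Literal, get_args, get_origin


@dataclass
class Slider:
    """Hypothetical stand-in for an explicit slider default (not GoFigr's SliderParam)."""
    value: float
    min: float
    max: float
    step: float


def infer_widget(param: inspect.Parameter) -> str:
    """Pick a widget kind from a parameter's default and annotation."""
    if isinstance(param.default, Slider):
        return "slider"    # explicit parameter class wins over the annotation
    if get_origin(param.annotation) is Literal:
        return "dropdown"  # Literal[...] choices become dropdown options
    if param.annotation is bool:
        return "checkbox"
    if param.annotation is str:
        return "text"
    return "static"        # e.g. data passed in at call time


def widgets_for(func):
    """Map every parameter name of `func` to a widget kind."""
    return {
        name: infer_widget(param)
        for name, param in inspect.signature(func).parameters.items()
    }


def plot(
    data,
    bins: int = Slider(20, min=5, max=100, step=5),
    show_kde: Literal["yes", "no", "auto"] = "yes",
    show_grid: bool = True,
    title: str = "My Chart",
):
    pass


widgets = widgets_for(plot)
# The dropdown options can be read straight off the Literal annotation.
dropdown_choices = get_args(inspect.signature(plot).parameters["show_kde"].annotation)
```

Checking the default before the annotation lets an explicit parameter class override whatever the type hint alone would suggest.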

The Provenance Model

Every time a Clean Room figure is re-run with new parameters and saved, a new revision is created that is:

Linked to the original — the full revision history is preserved
Watermarked — each output image contains a QR code linking back to the exact revision that produced it
Parameterized — the parameter values used for that run are stored with the revision

This means you can answer "where did this chart come from?" with precision: the code, the data, the packages, the parameter values, the timestamp, and the user who ran it.

No more hunting through Slack to find which version of the notebook produced the slide in the board deck.
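One way to picture this provenance model is as a chain of immutable records, each pointing back at its parent. The sketch below is an assumption-laden illustration, not GoFigr's data model: the `Revision` fields, the hash-based identifier, and the `rerun` and `history` helpers are all hypothetical.

```python
import hashlib
import json
from dataclasses import FrozenInstanceError, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Revision:
    """Hypothetical immutable record of one run."""
    source: str
    parameters: tuple  # sorted (name, value) pairs, kept as a tuple so they cannot be mutated
    user: str
    parent_id: Optional[str] = None
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def revision_id(self) -> str:
        # Identifier derived from the record's content (an assumption for this sketch).
        payload = json.dumps(
            [self.source, self.parameters, self.user, self.parent_id, self.created_at]
        )
        return hashlib.sha256(payload.encode()).hexdigest()[:12]


def rerun(parent: Revision, user: str, **changes) -> Revision:
    """Create a child revision with some parameters changed; the parent is untouched."""
    params = dict(parent.parameters)
    params.update(changes)
    return Revision(
        parent.source, tuple(sorted(params.items())), user, parent_id=parent.revision_id
    )


def history(revision: Revision, store: dict) -> list:
    """Walk parent links from `revision` back to the original."""
    chain = [revision]
    while chain[-1].parent_id is not None:
        chain.append(store[chain[-1].parent_id])
    return chain


original = Revision(
    "def f(bins, species): ...", (("bins", 20), ("species", "Adelie")), "analyst"
)
derived = rerun(original, "stakeholder", bins=40)
store = {rev.revision_id: rev for rev in (original, derived)}
lineage = history(derived, store)

try:
    original.user = "someone-else"
    mutated = True
except FrozenInstanceError:
    mutated = False  # frozen dataclasses reject attribute assignment
```

Because each record is frozen and carries its parent's identifier, answering "where did this chart come from?" reduces to walking the chain.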

The Workflow

Explore — work however you normally work in Jupyter: experiment freely, iterate fast.
Distill — pull the core logic into a @reproducible function. This is the moment of crystallization: you're extracting the essential analysis from the surrounding scaffolding.
Run — call the function as normal. GoFigr captures and packages everything automatically.
Share — enable link sharing and send the URL to whoever needs it.

From that point, stakeholders interact with the Clean Room studio directly. They adjust sliders, change dropdowns, re-run—and optionally save the result as a new revision. You get notified. You don't have to re-run anything yourself unless the underlying logic needs to change.

The Studio Environment

When someone opens a Clean Room figure, they land in the studio:

Code editor — the full function source, editable, with syntax highlighting
Parameter panel — generated controls for every parameter
Figure output — live rendering of whatever the function produces
Console — stdout and stderr from the execution
Environment inspector — browse live variables, preview DataFrames, inspect imports
AI assistant — request code modifications in natural language

The runtime runs in the browser via WebAssembly Python. Packages are installed on-demand. No server-side execution, no infrastructure to manage.

Supported Visualization Backends

Clean Room supports the output formats you're already using:

Matplotlib / Seaborn — PNG, SVG, HTML
Plotly — interactive HTML figures (coming soon)
Plotnine — ggplot2-style static plots

What You Don't Have to Do Anymore

Once a function is a Clean Room figure:

You don't rerun the analysis every time someone wants a different filter
You don't share notebooks with a page of setup instructions
You don't maintain a separate "presentable" version of your notebook for stakeholders
You don't try to reconstruct which parameters produced a specific output six months later

The exploration lives in your notebook. The deliverable lives in GoFigr.

When to Use Clean Room

Clean Room is the right tool when:

The analysis will be revisited with different parameters, by you or someone else
Stakeholders need to explore "what if" scenarios without your involvement
Reproducibility matters (regulatory, audit, or internal review)
You want to deliver an interactive result, not a static slide

It's less appropriate for:

Pure exploration that won't be revisited
Analyses that depend on local databases or custom infrastructure not available in browser Python
Code that requires packages not available in WebAssembly Python
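Before moving a function into Clean Room, it can help to check which packages it imports. The sketch below uses the standard `ast` module to list them. The set of packages assumed to be available in the browser runtime is a made-up placeholder, not a statement about what Clean Room actually supports.

```python
import ast


def imported_modules(source: str) -> set:
    """Return the top-level package names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return modules


# Hypothetical placeholder: replace with the packages your runtime really provides.
ASSUMED_BROWSER_PACKAGES = {"seaborn", "pandas", "numpy", "matplotlib"}

source = '''
import seaborn as sns
from matplotlib import pyplot as plt
import psycopg2

def report(data):
    import numpy as np
    return np.mean(data)
'''

unavailable = imported_modules(source) - ASSUMED_BROWSER_PACKAGES
```

Here the database driver is flagged, which matches the guidance above about analyses that depend on local databases. Standard-library modules would also be flagged unless they are added to the allowed set.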

Summary

Clean Room lets you move from rapid, exploratory iteration to a shareable, reproducible, interactive asset without changing how you work in Jupyter. The @reproducible decorator captures your function's full context—code, data, parameters, environment—automatically when you run it. Every subsequent run produces a traceable revision. Stakeholders interact directly with the studio, adjusting parameters and re-running without your involvement.

Your exploration stays exploratory. Your deliverables become durable.

GoFigr Clean Room — gofigr.io
