# Clean Room for Data Scientists

**Version 1.0 | March 2026**

***

> **Beta Access** — Clean Room is currently in beta and not yet generally available. To request early access, email <info@gofigr.io>.

***

## The Gap Between Exploration and Delivery

Data science moves fast. A typical analysis starts as a few lines in a notebook and grows into a sprawling series of cells with experiments, dead ends, and half-commented code blocks, each one a record of the thinking that got you to the answer. That's not messiness. That's the scientific process.

The problem isn't the exploration. The problem is what happens *after* you've found something worth sharing.

You've built something valuable: a model, a visualization, an analysis that decision-makers need to understand. Now you face a different kind of work—translating that discovery into something reproducible, explainable, and interactive. Usually that means:

* Cleaning up the notebook to make it presentable
* Answering endless "what if we changed X?" questions from stakeholders
* Rerunning the whole analysis every time someone wants a slightly different cut
* Hoping whoever runs it next has the same package versions you did

GoFigr Clean Room is built to close that gap—without disrupting how you actually work.

***

## What Clean Room Does

Clean Room transforms a Python function into a self-contained, browser-based interactive application. You write a function and decorate it with `@reproducible`. GoFigr then creates a clean boundary between the function and the rest of the notebook: the function can see only the variables you pass to it and the packages you declare.

When you call the function and it produces a figure, GoFigr packages the function and the figure together and publishes them as an interactive web application. In Jupyter, figure capture is fully automatic, so no extra step is needed.
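To make the idea of a clean boundary concrete, here is a minimal sketch in plain Python. It is not GoFigr's implementation: it simply runs a function's source in a fresh namespace so that stray notebook globals are invisible to it. The helper name `run_isolated` and both sample sources are hypothetical.

```python
# Illustrative sketch only: NOT how GoFigr implements isolation.
# `run_isolated` is a hypothetical helper name.

def run_isolated(source: str, func_name: str, **inputs):
    """Define `func_name` from `source` in a fresh namespace and call it with `inputs`."""
    namespace = {}                      # fresh globals: no notebook variables leak in
    exec(source, namespace)             # defines the function inside the clean namespace
    return namespace[func_name](**inputs)


threshold = 10  # a stray "notebook" global that the isolated code should not see

explicit_source = """
def count_above(values, threshold):
    return sum(v > threshold for v in values)
"""

leaky_source = """
def count_above(values):
    return sum(v > threshold for v in values)   # relies on a global that was never passed in
"""

# Works: everything the function needs is passed explicitly.
result = run_isolated(explicit_source, "count_above", values=[5, 12, 20], threshold=10)

# Fails: the hidden dependency on `threshold` surfaces as a NameError.
try:
    run_isolated(leaky_source, "count_above", values=[5, 12, 20])
    leaked = True
except NameError:
    leaked = False
```

The second call fails precisely because the function depended on state it was never given, which is the kind of hidden coupling a clean boundary is meant to expose early.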

***

## How It Works

### Setup: One Magic, One Decorator

In your notebook, load the GoFigr extension once. This enables automatic capture whenever a `@reproducible` function runs:

```python
# Cell 1 — imports and GoFigr setup
from typing import Literal
import seaborn as sns

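# Enables automatic figure capture and injects reproducible, SliderParam, DropdownParam, etc.
# (comment kept on its own line: IPython passes the rest of a line-magic line to the magic as its argument)
%load_ext gofigr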
```

Then decorate your analysis function with `@reproducible(interactive=True)`:

```python
# Cell 2 — define the analysis as a reproducible function

@reproducible(interactive=True)
def flipper_length_distribution(
    data,                    # DataFrames are passed in from the notebook -- not embedded in source
    bins: int     = SliderParam(20, min=5, max=100, step=5),
    alpha: float  = SliderParam(0.7, min=0.1, max=1.0, step=0.05),
    show_kde: Literal["yes", "no", "auto"] = "yes",  # Literal types become dropdowns automatically
    species: str  = DropdownParam("Adelie", choices=["Adelie", "Chinstrap", "Gentoo"]),
    show_grid: bool = True,  # booleans become checkboxes
    title: str    = "Flipper Length Distribution"
):
    filtered = data[data['species'] == species]
    kde = True if show_kde == "yes" else (False if show_kde == "no" else None)

    ax = sns.histplot(
        data=filtered,
        x='flipper_length_mm',
        bins=bins,
        alpha=alpha,
        kde=kde,
    )
    ax.set_title(title)
    if show_grid:
        ax.grid(True, alpha=0.3)


# Cell 3 — call it like a normal function
# GoFigr captures the output and packages everything automatically
penguins_df = sns.load_dataset("penguins")  # example dataset from seaborn (fetched on first use)
flipper_length_distribution(penguins_df)
```

That's the entire integration. There's no separate publish step, no export, no post-processing. Running the function in Jupyter is all it takes.
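For readers curious how a decorator can observe a call without changing how the function is used, the following is a rough sketch in plain Python. `record_runs` is a hypothetical stand-in written for illustration; it is not GoFigr's `@reproducible` and records only the resolved arguments of each call.

```python
# Illustrative sketch only: `record_runs` is a hypothetical stand-in,
# not GoFigr's `@reproducible`.
import functools
import inspect


def record_runs(func):
    """Wrap `func` so every completed call stores its fully resolved arguments."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()                      # include defaults the caller did not override
        result = func(*args, **kwargs)
        wrapper.runs.append(dict(bound.arguments))  # one record per completed call
        return result

    wrapper.runs = []
    return wrapper


@record_runs
def summarize(values, bins=20, title="Distribution"):
    return f"{title}: {len(values)} values in {bins} bins"


summarize([1, 2, 3])            # called like any ordinary function
summarize([1, 2, 3], bins=5)

print(summarize.runs)
```

The caller never interacts with the recording machinery, which mirrors the "no separate publish step" experience described above.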

### What Gets Captured

When the function runs, GoFigr captures:

* **Source code** — the function body, extracted cleanly from the notebook cell
* **Parameters** — types, defaults, and widget configuration for every parameter
* **Data** — DataFrames passed as arguments, serialized and stored alongside the revision
* **Environment** — package names and versions, imports, Python version
* **Output** — the figures produced by the run

Each run creates a new revision. Every revision is immutable and traceable. You always know exactly what produced a given figure.
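As a rough illustration of what capturing source, parameters, and environment can involve, the sketch below collects those pieces using only the Python standard library. The function `build_manifest` and the layout of its result are assumptions made for this example; GoFigr's actual capture format is not described here.

```python
# Illustrative sketch only: `build_manifest` and its output layout are hypothetical.
import inspect
import platform
from importlib import metadata


def build_manifest(func, packages):
    """Collect source, parameter defaults, and environment details for `func`."""
    try:
        source = inspect.getsource(func)
    except (OSError, TypeError):
        source = None                   # source is unavailable in some interactive contexts

    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None       # declared but not installed in this environment

    parameters = {
        name: "<required>" if p.default is inspect.Parameter.empty else p.default
        for name, p in inspect.signature(func).parameters.items()
    }
    return {
        "name": func.__name__,
        "source": source,
        "parameters": parameters,
        "python": platform.python_version(),
        "packages": versions,
    }


def histogram(data, bins=20, show_grid=True):
    return len(data) // bins


manifest = build_manifest(histogram, packages=["seaborn", "pandas"])
```

Recording package versions next to the code is what lets a later run detect that the environment has drifted from the one that produced a figure.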

***

## Parameters and Widgets

GoFigr maps Python types to interactive controls automatically. You can use type annotations and `Literal` for the common cases, or use explicit parameter classes when you need more control:

| Type                               | Widget              | Example                                               |
| ---------------------------------- | ------------------- | ----------------------------------------------------- |
| `int` / `float` with `SliderParam` | Slider with bounds  | `bins: int = SliderParam(20, min=5, max=100, step=5)` |
| `str` with `Literal[...]`          | Dropdown (inferred) | `show_kde: Literal["yes", "no", "auto"] = "yes"`      |
| `str` with `DropdownParam`         | Dropdown (explicit) | `species = DropdownParam("Adelie", choices=[...])`    |
| `bool`                             | Checkbox            | `show_grid: bool = True`                              |
| Free-form `str`                    | Text input          | `title: str = "My Chart"`                             |
| `pd.DataFrame`                     | Static (read-only)  | Passed in at call time, available in studio           |

Note that `data` is passed in at call time from your notebook rather than hardcoded in the function, and GoFigr serializes it automatically. The function stays general; the data is bound to the specific revision.
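The mapping in the table can be approximated with standard-library introspection. The sketch below is an assumption about how such inference might work, not GoFigr's logic: `Slider` is a hypothetical stand-in for `SliderParam`, and `infer_widget` and `describe_widgets` are names invented for this example.

```python
# Illustrative sketch only: `Slider`, `infer_widget`, and `describe_widgets` are
# hypothetical stand-ins, not GoFigr's SliderParam or its actual inference logic.
import inspect
from dataclasses import dataclass
from typing import Literal, get_args, get_origin


@dataclass
class Slider:
    default: float
    min: float
    max: float
    step: float


def infer_widget(annotation, default):
    """Map one parameter's annotation and default to a widget description."""
    if isinstance(default, Slider):
        return {"widget": "slider", "min": default.min, "max": default.max, "step": default.step}
    if get_origin(annotation) is Literal:
        return {"widget": "dropdown", "choices": list(get_args(annotation))}
    if annotation is bool:
        return {"widget": "checkbox"}
    if annotation is str:
        return {"widget": "text"}
    return {"widget": "static"}          # e.g. data passed in at call time


def describe_widgets(func):
    """Build a widget description for every parameter of `func`."""
    return {
        name: infer_widget(p.annotation, p.default)
        for name, p in inspect.signature(func).parameters.items()
    }


def plot(
    data,
    bins: int = Slider(20, min=5, max=100, step=5),
    show_kde: Literal["yes", "no", "auto"] = "yes",
    show_grid: bool = True,
    title: str = "My Chart",
):
    ...


widgets = describe_widgets(plot)
```

Checking for an explicit parameter object before looking at the annotation reflects the table's ordering: explicit classes give more control and take precedence over inferred defaults.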

***

## The Provenance Model

Every time a Clean Room figure is re-run with new parameters and saved, a new revision is created that is:

* **Linked to the original** — the full revision history is preserved
* **Watermarked** — each output image contains a QR code linking back to the exact revision that produced it
* **Parameterized** — the parameter values used for that run are stored with the revision

This means you can answer "where did this chart come from?" with precision: the code, the data, the packages, the parameter values, the timestamp, and the user who ran it.

No more hunting through Slack to find which version of the notebook produced the slide in the board deck.
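One way to picture linked, immutable, parameterized revisions is as a chain of frozen records, each pointing at its parent. The sketch below is a simplified model under that assumption; `Revision`, `new_revision`, and `history` are hypothetical names, and the QR-code watermark is left out.

```python
# Illustrative sketch only: `Revision`, `new_revision`, and `history` are hypothetical,
# not GoFigr's data model.
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)              # frozen: fields cannot be reassigned after creation
class Revision:
    revision_id: str
    parent_id: Optional[str]
    source: str
    parameters: str                  # canonical JSON, so the stored record is itself immutable


def new_revision(source, parameters, parent=None):
    """Create a revision whose ID is derived from its content and its parent."""
    params_json = json.dumps(parameters, sort_keys=True)
    parent_id = parent.revision_id if parent else None
    digest = hashlib.sha256(f"{parent_id}|{source}|{params_json}".encode()).hexdigest()
    return Revision(digest[:12], parent_id, source, params_json)


def history(revision, by_id):
    """Walk parent links back to the first revision."""
    chain = []
    while revision is not None:
        chain.append(revision)
        revision = by_id.get(revision.parent_id)
    return chain


src = "def plot(data, bins=20): ..."
first = new_revision(src, {"bins": 20})
second = new_revision(src, {"bins": 40}, parent=first)
by_id = {r.revision_id: r for r in (first, second)}

lineage = [r.revision_id for r in history(second, by_id)]
```

Deriving the ID from the content means two runs with identical code, parameters, and parent resolve to the same identifier, while any change produces a new one.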

***

## The Workflow

1. **Explore** — work however you normally work in Jupyter: experiment freely, iterate fast.
2. **Distill** — pull the core logic into a `@reproducible` function. This is the moment of crystallization: you're extracting the essential analysis from the surrounding scaffolding.
3. **Run** — call the function as normal. GoFigr captures and packages everything automatically.
4. **Share** — enable link sharing and send the URL to whoever needs it.

From that point, stakeholders interact with the Clean Room studio directly. They adjust sliders, change dropdowns, re-run—and optionally save the result as a new revision. You get notified. You don't have to re-run anything yourself unless the underlying logic needs to change.

***

## The Studio Environment

When someone opens a Clean Room figure, they land in the studio:

* **Code editor** — the full function source, editable, with syntax highlighting
* **Parameter panel** — generated controls for every parameter
* **Figure output** — live rendering of whatever the function produces
* **Console** — stdout and stderr from the execution
* **Environment inspector** — browse live variables, preview DataFrames, inspect imports
* **AI assistant** — request code modifications in natural language

The function executes in the browser via WebAssembly Python, and packages are installed on demand. There is no server-side execution and no infrastructure to manage.

***

## Supported Visualization Backends

Clean Room works with the plotting libraries you're already using:

* **Matplotlib / Seaborn** — PNG, SVG, HTML
* **Plotly** — interactive HTML figures (coming soon)
* **Plotnine** — ggplot2-style static plots

***

## What You Don't Have to Do Anymore

Once a function is a Clean Room figure:

* You don't rerun the analysis every time someone wants a different filter
* You don't share notebooks with a page of setup instructions
* You don't maintain a separate "presentable" version of your notebook for stakeholders
* You don't try to reconstruct which parameters produced a specific output six months later

The exploration lives in your notebook. The deliverable lives in GoFigr.

***

## When to Use Clean Room

Clean Room is the right tool when:

* The analysis will be revisited with different parameters, by you or someone else
* Stakeholders need to explore "what if" scenarios without your involvement
* Reproducibility matters (regulatory, audit, or internal review)
* You want to deliver an interactive result, not a static slide

It's less appropriate for:

* Pure exploration that won't be revisited
* Analyses that depend on local databases or custom infrastructure not available in browser Python
* Code that requires packages not available in WebAssembly Python
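Before moving a function into Clean Room, it can help to list what it imports and compare that against the packages you have confirmed are usable in browser Python. The sketch below does this with the standard `ast` module. The `AVAILABLE` set is a hypothetical allowlist you would maintain yourself, not a list published by GoFigr.

```python
# Illustrative sketch only: `AVAILABLE` is a hypothetical allowlist you would maintain
# yourself; it is not a list published by GoFigr.
import ast

AVAILABLE = {"math", "json", "numpy", "pandas", "matplotlib", "seaborn"}


def imported_modules(source: str) -> set:
    """Return the top-level module names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])   # skip relative imports (level > 0)
    return modules


source = """
import numpy as np
import matplotlib.pyplot as plt
from psycopg2 import connect

def load_and_plot(dsn):
    import seaborn as sns
    ...
"""

missing = imported_modules(source) - AVAILABLE
print(missing)
```

Here the database driver is flagged, which matches the first limitation above: analyses that reach into local databases need that dependency pulled out, with the data passed in as an argument instead.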

***

## Summary

Clean Room lets you move from rapid, exploratory iteration to a shareable, reproducible, interactive asset without changing how you work in Jupyter. The `@reproducible` decorator captures your function's full context—code, data, parameters, environment—automatically when you run it. Every subsequent run produces a traceable revision. Stakeholders interact directly with the studio, adjusting parameters and re-running without your involvement.

Your exploration stays exploratory. Your deliverables become durable.

***

*GoFigr Clean Room —* [*gofigr.io*](https://gofigr.io)

