Get started with DataFramer on Databricks
This page summarizes the main steps. For the full notebook and code, see the official documentation.
1. Prerequisites
- Run the notebook with a service principal that has Unity Catalog access (USE CATALOG, USE SCHEMA, CREATE TABLE, SELECT, MODIFY on the schema).
- Get a DataFramer API key from app.aimon.ai → Account → Keys.
- Create a Databricks secret scope (e.g. dataframer) containing these keys: DATAFRAMER_API_KEY, DATABRICKS_HTTP_PATH, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET, DATABRICKS_SERVER_HOSTNAME.
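The secret scope above can be set up from a terminal; this is a sketch assuming the current Databricks CLI is installed and authenticated against your workspace (the scope name is just an example and must match what you pass to dbutils.secrets.get):

```shell
# Create the scope (example name; reuse it in dbutils.secrets.get calls)
databricks secrets create-scope dataframer

# Store each key; the CLI prompts for the secret value
databricks secrets put-secret dataframer DATAFRAMER_API_KEY
databricks secrets put-secret dataframer DATABRICKS_HTTP_PATH
databricks secrets put-secret dataframer DATABRICKS_CLIENT_ID
databricks secrets put-secret dataframer DATABRICKS_CLIENT_SECRET
databricks secrets put-secret dataframer DATABRICKS_SERVER_HOSTNAME
```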
2. Install SDK
In a Databricks notebook cell:
%pip install --upgrade pydataframer pydataframer-databricks pyyaml tenacity
3. Initialize client and connector
Use the Dataframer client with your API key from secrets, and the DatabricksConnector for Unity Catalog:
from dataframer import Dataframer
from pydataframer_databricks import DatabricksConnector
databricks_connector = DatabricksConnector(dbutils, scope="dataframer")
client = Dataframer(api_key=dbutils.secrets.get("dataframer", "DATAFRAMER_API_KEY"))
4. Upload seed data and generate spec
Fetch a sample from your Unity Catalog table, upload it as a seed dataset to DataFramer, then create a specification (optionally using a Databricks model like databricks/databricks-claude-sonnet-4-5 with databricks_connector.serving_credentials()).
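The sampling step can be sketched as below. The SQL-building helper is plain Python; the upload and spec calls are shown only as commented pseudocode, because the exact pydataframer method names here are assumptions, not the SDK's documented API:

```python
def sample_query(table_path: str, n: int = 100) -> str:
    """Build a SQL statement that samples n rows from a Unity Catalog table.

    table_path is a fully qualified catalog.schema.table name
    (the value used below is an example, not a real table).
    """
    return f"SELECT * FROM {table_path} LIMIT {n}"

# Hypothetical flow (illustrative names, not the documented SDK API):
#   rows = spark.sql(sample_query("main.demo.customers")).toPandas()
#   dataset = client.datasets.upload(rows.to_csv(index=False))   # seed data
#   spec = client.specs.create(dataset_id=dataset.id,
#                              model="databricks/databricks-claude-sonnet-4-5")
```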
5. Generate samples and load into Delta
Create a run with your spec and Databricks generation model, poll for completion, then download the generated files and use databricks_connector.load_generated_data() to write to a Delta table.
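The poll-for-completion step can be sketched as a small generic helper. The terminal status strings and the get_status callable are assumptions for illustration; in the real notebook they would wrap whatever run-status call the SDK exposes:

```python
import time


def wait_for_run(get_status, timeout_s: int = 600, interval_s: int = 5) -> str:
    """Poll get_status() until it returns a terminal state or the timeout expires.

    get_status: any zero-argument callable returning a status string.
    The terminal states ("completed", "failed") are assumed values.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"run did not finish within {timeout_s}s")
```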
Full documentation
The complete notebook with all code cells, retries, and evaluation steps is in the DataFramer docs.
Open Databricks integration docs →