A modern web application for managing Apache Iceberg tables via a REST Catalog.
Recent updates:

- Multi-Catalog Support: Connect to multiple catalogs simultaneously and switch between them
- Cross-Catalog Joins: Query and join tables from different catalogs in a single SQL statement
- DML Operations: Execute INSERT and DELETE statements directly on Iceberg tables
- File Uploads: Upload CSV, JSON, and Parquet files to append data to your tables
- Enhanced UI: Improved catalog management with dropdown selector and logout functionality
Key features:

- Multi-Catalog Support: Connect to and manage multiple Iceberg catalogs simultaneously
- Table Management: Browse namespaces and tables across all connected catalogs
- SQL Querying: Run SQL queries with Apache DataFusion, including cross-catalog joins
- DML Operations: Execute INSERT and DELETE statements on Iceberg tables
- File Uploads: Upload CSV, JSON, and Parquet files to append data to tables
- Metadata Viewer: View table schema, snapshots, properties, and statistics
- Table Maintenance: Perform snapshot expiration and other maintenance tasks
- Schema Evolution: Add, rename, drop, and update table columns
- Time Travel: Query historical table snapshots
- Modern UI: Built with React and Material UI, featuring Light/Dark modes
- Cross-Catalog Joins: Query and join tables from different catalogs in a single SQL statement
- Query Caching: Automatic caching of query results for improved performance
- Export Results: Export query results to CSV, JSON, or Parquet formats
- Query History: Track and reuse previous queries
- Saved Queries: Save frequently used queries for quick access
Architecture:

- Backend: Python (FastAPI)
  - PyIceberg: Handles interactions with the Iceberg REST Catalog.
  - DataFusion: Provides a high-performance query engine.
- Frontend: React (Vite)
  - Material UI: For a polished, responsive user interface.
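To make the data flow concrete, here is a minimal sketch of how PyIceberg and DataFusion can be combined in the spirit of this stack. It is illustrative, not the app's actual code: it assumes the `pyiceberg` and `datafusion` packages, a catalog named `default` whose connection properties come from PyIceberg's configuration, and a hypothetical `db.orders` table.

```python
from pyiceberg.catalog import load_catalog
from datafusion import SessionContext

# PyIceberg talks to the catalog and reads table data into Arrow.
catalog = load_catalog("default")  # connection details from PyIceberg config
arrow_table = catalog.load_table("db.orders").scan(limit=1_000).to_arrow()

# DataFusion registers the Arrow batches and runs SQL over them.
ctx = SessionContext()
ctx.register_record_batches("orders", [arrow_table.to_batches()])
print(ctx.sql("SELECT COUNT(*) FROM orders").to_pandas())
```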
In the UI:

- Explorer: Browse namespaces and tables. Use the dropdown to switch catalogs.
- Query Editor: Write and execute SQL. Supports multiple tabs.
- Metadata Viewer: Inspect Schema, Snapshots, Files, and Manifests.
- Dark Mode: Toggle the theme using the sun/moon icon in the header.
You can run Iceberg UI using Docker (recommended for quick start) or by setting it up locally (recommended for development).
You can run the application easily using the pre-built Docker image. Run the UI on port 8000:

```bash
docker run -p 8000:8000 alexmerced/iceberg-ui
```

Access the UI at http://localhost:8000.
To spin up a complete testing environment with a Nessie Catalog and MinIO S3 storage, use the provided docker-compose.yml:

```bash
docker-compose up -d
```

Prerequisites for local setup:

- Python 3.9+
- Node.js 16+
- (Optional) An Iceberg catalog server
Backend:

- Navigate to the `backend` directory: `cd backend`
- Create a virtual environment and install dependencies:

  ```bash
  python -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt
  ```

- Start the server: `uvicorn main:app --reload`
Frontend:

- Navigate to the `frontend` directory: `cd frontend`
- Install dependencies: `npm install`
- Start the development server: `npm run dev`

Note: You can configure the port using the `PORT` environment variable: `PORT=3000 npm run dev`
For the backend, you can also set the port: `PORT=8001 python main.py`
You can configure the application using the following environment variables:

Frontend:

- `PORT` or `FRONTEND_PORT`: Port to run the frontend server (default: 5173)
- `VITE_BACKEND_URL`: URL of the backend API (default: `http://localhost:8000`)

Backend:

- `PORT` or `BACKEND_PORT`: Port to run the backend server (default: 8000)
- `FRONTEND_URL`: Comma-separated list of allowed frontend URLs for CORS (default: `*`)
Copy `example.env.json` to `env.json` and update it with your catalog details. The application will automatically connect to this catalog on startup.
```json
{
  "catalogs": {
    "default": {
      "uri": "https://catalog.example.com/api/iceberg",
      "oauth2-server-uri": "https://auth.example.com/oauth/token",
      "token": "your-token-here",
      "warehouse": "s3://your-warehouse",
      "type": "rest"
    }
  }
}
```

Note: `env.json` is gitignored for security. Never commit credentials to version control.
You can also connect to catalogs directly through the UI without pre-configuring `env.json`. This allows you to:
- Connect to multiple catalogs in a single session
- Give each catalog a friendly name
- Switch between catalogs easily
Supported catalog types:

- REST: Iceberg REST Catalog (Dremio, Polaris, Nessie, etc.)
- Hive: Hive Metastore
- Glue: AWS Glue Data Catalog
- DynamoDB: AWS DynamoDB Catalog
- SQL: PostgreSQL, MySQL, SQLite catalogs
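For reference, these map onto PyIceberg catalog types on the backend. A minimal sketch of connecting to two of them with PyIceberg directly (all property values are placeholders; the Glue example assumes the `pyiceberg[glue]` extra and standard AWS credentials):

```python
from pyiceberg.catalog import load_catalog

# REST catalog (Dremio, Polaris, Nessie, ...)
rest_catalog = load_catalog("production", **{
    "type": "rest",
    "uri": "https://catalog.example.com/api/iceberg",
    "token": "your-token-here",
    "warehouse": "s3://your-warehouse",
})

# AWS Glue Data Catalog; credentials come from the usual AWS config chain.
glue_catalog = load_catalog("glue_cat", **{"type": "glue"})
```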
To connect:

- Open your browser to the frontend URL (usually `http://localhost:5173`).
- Click "Connect" and enter your catalog connection details:
  - Catalog Name: A friendly name for this connection (e.g., "production", "staging")
  - Catalog Type: REST, Hive, Glue, etc.
  - URI: The catalog endpoint URL
  - Warehouse: The warehouse location (S3, HDFS, etc.)
  - Authentication: Choose "OAuth2", "Bearer Token", or "None" (for no-auth catalogs)
  - Credentials: Authentication details if required
- You can connect to multiple catalogs and switch between them using the catalog selector.
- Use the sidebar explorer to browse namespaces and tables.
- Click on a table to view its metadata, schema, and snapshots.
- Use the upload button (cloud icon) next to any table to upload data files.
- Use the "Play" button (
▶️ ) next to any table to instantly populate aSELECT *query in the editor.
Execute SQL queries in the Query Editor:
```sql
-- Simple query
SELECT * FROM my_namespace.my_table LIMIT 10;

-- Cross-catalog join
SELECT u.name, o.amount
FROM catalog1.db.users u
JOIN catalog2.db.orders o ON u.id = o.user_id;

-- INSERT data
INSERT INTO my_namespace.my_table VALUES (1, 'Alice'), (2, 'Bob');

-- DELETE data
DELETE FROM my_namespace.my_table WHERE id > 100;

-- Time travel
SELECT * FROM my_namespace.my_table
FOR SYSTEM_TIME AS OF TIMESTAMP '2024-01-01 00:00:00';
```

You can query Iceberg metadata tables by appending `$` to the table name:

- `$snapshots`: History of table states
- `$files`: Data files in the current snapshot
- `$manifests`: Manifest files
- `$partitions`: Partition statistics

```sql
SELECT * FROM db.orders$snapshots;
```
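The backend's PyIceberg library exposes the same metadata programmatically, if you ever want it outside the UI. A minimal sketch (illustrative catalog and table names; assumes `pyiceberg` and `pandas` are installed):

```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog("default")      # connection details from PyIceberg config
tbl = catalog.load_table("db.orders")  # hypothetical table

# Each inspect method returns a PyArrow table mirroring the SQL $ tables.
print(tbl.inspect.snapshots().to_pandas())   # like db.orders$snapshots
print(tbl.inspect.partitions().to_pandas())  # like db.orders$partitions
```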
To upload files:

- Append Data: Navigate to an existing table and click the upload icon (cloud) next to the table name.
- Create Table: Click the upload icon on a Namespace folder.
- Select a CSV, JSON, or Parquet file.
- If creating a new table, enter a name. The schema will be automatically inferred from the file.
- The data will be uploaded and the table created/updated.
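For context on what an append like this involves, here is a minimal PyIceberg sketch of appending a CSV file to an existing table. It is a sketch of the general approach, not necessarily the app's exact code path; names are illustrative, and the file's columns must be compatible with the table schema:

```python
import pyarrow.csv as pacsv
from pyiceberg.catalog import load_catalog

catalog = load_catalog("default")                  # illustrative catalog name
tbl = catalog.load_table("my_namespace.my_table")  # illustrative table

# Read the CSV into an Arrow table, then append it as a new snapshot.
data = pacsv.read_csv("data.csv")
tbl.append(data)
```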
Iceberg supports full schema evolution. You can modify table schemas using SQL commands (if supported by your catalog); a programmatic sketch follows this list:

- Add Column: `ALTER TABLE ... ADD COLUMN`
- Drop Column: `ALTER TABLE ... DROP COLUMN`
- Rename Column: `ALTER TABLE ... RENAME COLUMN`
- Update Type: `ALTER TABLE ... ALTER COLUMN ... TYPE`
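Equivalently, the same operations can be driven through PyIceberg's schema-evolution API rather than SQL. A minimal sketch with illustrative names (the type change assumes a legal promotion, e.g. int to long):

```python
from pyiceberg.catalog import load_catalog
from pyiceberg.types import LongType, StringType

catalog = load_catalog("default")
tbl = catalog.load_table("my_namespace.my_table")

# Changes are committed atomically when the context manager exits.
with tbl.update_schema() as update:
    update.add_column("email", StringType())           # add column
    update.rename_column("name", "full_name")          # rename column
    update.delete_column("legacy_flag")                # drop column
    update.update_column("id", field_type=LongType())  # widen the type
```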
After running a query:
- Click the "Export" button.
- Choose your format (CSV, JSON, or Parquet).
- The file will be downloaded to your browser.
- Switch Catalogs: Use the dropdown in the explorer to switch between connected catalogs.
- Log Out: Click "Log Out" in the header to disconnect from all catalogs.
Performance tips:

- Filter Early: Always use WHERE clauses on partition columns to prune data.
- Limit Results: Use `LIMIT` when exploring data to avoid fetching huge datasets.
- Use Metadata Tables: Check `$files` to see how many files your query might scan (see the sketch below).
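As a way to see file pruning in action, PyIceberg can report how many data files a given filter would scan. A minimal sketch with illustrative names (`db.orders`, `order_date`):

```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog("default")
tbl = catalog.load_table("db.orders")

# Count the data files each scan would read; a good partition filter
# should plan far fewer files than a full-table scan.
all_files = list(tbl.scan().plan_files())
pruned = list(tbl.scan(row_filter="order_date >= '2024-01-01'").plan_files())
print(f"{len(all_files)} files unfiltered vs {len(pruned)} after pruning")
```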
Maintenance and best practices:

- Compaction: Regularly compact small files to improve read performance.
- Expire Snapshots: Remove old snapshots to free up storage space.
- Environment Separation: Use separate catalogs for Prod, Dev, and Staging.
- Configuration: Use `env.json` to share configuration with your team (but don't commit secrets!).
- Naming: Use descriptive names for your catalogs to avoid confusion in cross-catalog joins.
The project includes an End-to-End (E2E) testing suite using Playwright.
```bash
cd frontend
npx playwright test
```