A comprehensive migration tool that analyzes Tableau workbooks (.twb files) and generates ThoughtSpot TML (ThoughtSpot Modeling Language) files to facilitate seamless migration from Tableau to ThoughtSpot.
- Overview
- Features
- Architecture
- Installation
- Quick Start
- Usage
- Migration Behavior
- Formula Conversion
- Database Setup
- Migration Analysis Liveboard
- Supported Features
- Contributing
- License
This tool provides automated migration capabilities from Tableau to ThoughtSpot by:
- Analyzing Tableau workbooks (.twb) for migration feasibility
- Generating ThoughtSpot TML files for live connections
- Creating SQL files for extract-based data sources
- Converting Tableau calculated fields to ThoughtSpot formulas
- Supporting comprehensive migration reports and analytics
The tool includes an ETL pipeline that processes Tableau metadata. Snowflake is used in this demo, but the pipeline supports integration with most major cloud data warehouses (CDWs) for comprehensive migration analysis.
- Data Connections: Live and Extract connections
- Data Sources: Tables with join relationships
- Calculated Fields: Formula conversion using ANTLR grammar
- Logic: Formulas and filters
- Output:
- Live connections: TML files (tables, models, SQL views)
- Extract connections: SQL files + Model TMLs
- SQL Proxy Support: Generation of Table.tml files from SQL Proxy as a Datasource
- Note: It is a “Tableau Server‑hosted published datasource proxy”, a logical wrapper that hides the real database objects and joins from the workbook XML.
Not yet supported (planned for future releases):
- Parameters
- Groups
- Bins
- Sets
The migration tool follows a three-tier architecture:
- Input Layer: Tableau workbook (.twb) file processing
- Processing Layer: Metadata extraction, formula conversion, and transformation
- Output Layer: TML/SQL file generation and migration-reporting Liveboard (sketched below)
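To make the tiers concrete, here is an illustrative skeleton of how they fit together. This is a sketch only; the class and function names (`WorkbookMetadata`, `parse_workbook`, and so on) are hypothetical and do not reflect the tool's actual module layout.

```python
# Illustrative three-tier skeleton; all names are hypothetical,
# not the tool's actual API.
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class WorkbookMetadata:
    """Processing-layer view of a parsed .twb file."""
    name: str
    connection_type: str  # "live" or "extract"
    tables: list[str] = field(default_factory=list)
    calculated_fields: dict[str, str] = field(default_factory=dict)

def parse_workbook(twb_path: Path) -> WorkbookMetadata:
    """Input layer: extract metadata from the workbook XML."""
    raise NotImplementedError

def convert_formulas(meta: WorkbookMetadata) -> WorkbookMetadata:
    """Processing layer: Tableau calculated fields -> ThoughtSpot formulas."""
    raise NotImplementedError

def write_outputs(meta: WorkbookMetadata, out_dir: Path) -> None:
    """Output layer: emit TML (live) or SQL + model TML (extract)."""
    raise NotImplementedError
```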
- Python 3.10.x to 3.12.x (recommended: 3.11.5)
- CDW account for migration analytics and data sources (Snowflake in this case)
- ThoughtSpot cluster access
1. Clone the repository

   ```bash
   git clone https://github.com/YOUR_USERNAME/ts_migration.git
   cd ts_migration
   ```

2. Create a virtual environment

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Configure environment variables: create a `.env` file in the `twb_parser` directory with your Snowflake credentials:

   ```
   SNOWFLAKE_USER=your_username
   SNOWFLAKE_PASSWORD=your_password
   SNOWFLAKE_ACCOUNT=your_account
   SNOWFLAKE_WAREHOUSE=your_warehouse
   SNOWFLAKE_DATABASE=your_database
   SNOWFLAKE_SCHEMA=your_schema
   SNOWFLAKE_ROLE=your_role
   ```
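Before the first run, it can help to confirm the credentials work. A minimal connectivity check, assuming `python-dotenv` and `snowflake-connector-python` are among the installed dependencies (this helper script is not part of the tool):

```python
# check_snowflake.py -- hypothetical helper to verify .env credentials.
import os
from dotenv import load_dotenv
import snowflake.connector

load_dotenv("twb_parser/.env")  # credentials created in the step above

conn = snowflake.connector.connect(
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
    database=os.environ["SNOWFLAKE_DATABASE"],
    schema=os.environ["SNOWFLAKE_SCHEMA"],
    role=os.environ["SNOWFLAKE_ROLE"],
)
print(conn.cursor().execute("SELECT CURRENT_VERSION()").fetchone())
conn.close()
```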
5. Database setup: before running the migration tool, you must create the necessary schema in Snowflake and execute the setup stored procedures once to initialize the tables and views.

   Note: `Schema.sql` is provided in the repository and can be run directly in Snowflake.
- Database: `TB_2_TS`
- Schema: `TTTM_RAW`
Ensure the following structure is created:
Tables:
- MIGRATION_EXECUTION_HEADER
- MIGRATION_EXECUTION_DETAIL
- TWB_FILE
- DATASOURCE_HEADER
- DATASOURCE_DETAIL
- WORKSHEET_HEADER
- WORKSHEET_DETAIL
- WORKSHEET_DATASOURCE_XREF
- liveboard_HEADER
- liveboard_DETAIL
- TABLE_HEADER
- TABLE_DETAIL
- TABLE_OUTPUT_FILE
- DATASOURCE_TABLE_XREF
- DATASOURCE_COLUMN_XREF
- WORKSHEET_HEADER2

Stored Procedures: you must run the following stored procedures to handle the migration logic and data population:

- MIGRATION_EXECUTION_HEADER
- twb_file
- Header_table
- Detail_table
- View_Model
- POPULATE_WORKSHEET_HEADER2

Data Model: View ERD
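If you prefer to script the one-time initialization rather than paste `Schema.sql` into the Snowflake UI, a sketch follows. `run_schema_setup` is a hypothetical helper reusing the connection from the check script above; the naive split on `;` works for plain DDL but not for stored-procedure bodies, so run those through the UI.

```python
# Hypothetical one-time setup: run Schema.sql against TB_2_TS.TTTM_RAW.
def run_schema_setup(conn, path: str = "Schema.sql") -> None:
    cur = conn.cursor()
    cur.execute("CREATE DATABASE IF NOT EXISTS TB_2_TS")
    cur.execute("CREATE SCHEMA IF NOT EXISTS TB_2_TS.TTTM_RAW")
    cur.execute("USE SCHEMA TB_2_TS.TTTM_RAW")
    with open(path) as f:
        # Naive split on ';' -- fine for simple DDL, wrong for procedure
        # bodies, which contain semicolons of their own.
        for stmt in f.read().split(";"):
            if stmt.strip():
                cur.execute(stmt)
```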
```bash
# Basic migration (default: convert operation)
python main.py ./input_folder ./output_folder

# Explicit convert operation
python main.py convert ./input_folder ./output_folder

# Migration with live connection preference for extracts
python main.py convert ./input_folder ./output_folder --live_flag
```
## Usage
### Command Line Interface
```bash
python main.py [operation] input_folder output_folder [--live_flag]
```

Arguments:

- `operation` (optional): `feasibility` or `convert` (default: `convert`)
- `input_folder`: directory containing Tableau (.twb) files
- `output_folder`: directory for generated output files
- `--live_flag`: force live connections for extract-based Tableau workbooks
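For reference, the argument contract above maps onto `argparse` roughly as follows; this is a sketch of the interface, not the actual `main.py`:

```python
# Sketch of the CLI contract (not the tool's actual main.py).
import argparse

parser = argparse.ArgumentParser(
    description="Tableau -> ThoughtSpot migration tool")
parser.add_argument("operation", nargs="?", default="convert",
                    choices=["feasibility", "convert"],
                    help="analysis only, or full conversion (default)")
parser.add_argument("input_folder", help="directory containing .twb files")
parser.add_argument("output_folder", help="directory for generated output")
parser.add_argument("--live_flag", action="store_true",
                    help="treat extract connections as live (TML, not SQL)")
args = parser.parse_args()
```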
Feasibility Analysis:

- Parses Tableau workbooks and analyzes migration compatibility
- Uploads metadata to Snowflake for analysis
- Provides a migration feasibility report via a Liveboard link

Convert:

- Performs the full migration with TML/SQL file generation
- Uploads metadata to Snowflake for analysis
- Provides a migration feasibility report via a Liveboard link
- Generates output files based on connection type
IMPORTANT: Once the output files (.tml) are generated, they can be manually imported into the ThoughtSpot cluster after setting up the connection between the CDW and ThoughtSpot.
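Manual import through the ThoughtSpot UI is the documented path; if you want to script it, ThoughtSpot's public REST API v2.0 exposes a TML import endpoint. A hedged sketch (verify the endpoint and payload against your cluster's API documentation; the cluster URL and token are placeholders):

```python
# Sketch: programmatic TML import via ThoughtSpot REST API v2.0.
# Verify the endpoint and payload shape for your cluster version.
from pathlib import Path
import requests

CLUSTER = "https://your-cluster.thoughtspot.cloud"  # placeholder
TOKEN = "your-bearer-token"                         # placeholder

tmls = [p.read_text() for p in Path("output_folder/Live").rglob("*.tml")]
resp = requests.post(
    f"{CLUSTER}/api/rest/2.0/metadata/tml/import",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"metadata_tmls": tmls, "import_policy": "PARTIAL"},
)
resp.raise_for_status()
print(resp.json())
```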
For Tableau workbooks with live connections:
Output Structure:
```
output_folder/
├── Live/
│   ├── Table TML/
│   │   └── *.table.tml
│   ├── SQL View TML/
│   │   └── *.sqlview.tml
│   └── Model TML/
│       └── *.model.tml
└── combined_output.csv
```
Generated Files:
- Table TMLs: Data source definitions for tables
- SQL View TMLs: Custom SQL query definitions
- Model TMLs: Data model configurations with joins and formulas
For Tableau workbooks with extract connections:
Output Structure:
```
output_folder/
├── Extract/
│   ├── SQL Files/
│   │   └── *.sql
│   └── Model TML/
│       └── *.model.tml
└── combined_output.csv
```
Generated Files:
- SQL Files: query files for use with Mode or similar tools; these .sql files can be imported into Analyst Studio and published as a dataset in the ThoughtSpot cluster
- Model TMLs: Data model configurations for the extracted data
Use `--live_flag` to force extract connections to be treated as live connections, generating TML files instead of SQL files.
The tool includes sophisticated formula conversion capabilities, using an ANTLR grammar to parse and convert Tableau calculated fields:
- Aggregate Functions: SUM, AVG, MIN, MAX, COUNT, COUNTD
- Mathematical Functions: ABS, CEILING, FLOOR, ROUND, POWER, SQRT
- String Functions: LEN, MID, LEFT, REPLACE
- Date Functions: DATE, DAY, MONTH, YEAR, TODAY, NOW
- Logical Functions: IF, CASE, WHEN, THEN, ELSE
- Statistical Functions: MEDIAN, STDEV, VAR
| Tableau Formula | ThoughtSpot Formula |
|---|---|
| `CEILING([Sales])` | `ceil([Sales])` |
| `LEN([Product])` | `strlen([Product])` |
| `POWER([Sales], 2)` | `pow([Sales], 2)` |
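The real converter is grammar-driven; purely to illustrate the name mapping in the table above, a toy regex pass might look like this. It deliberately ignores nesting, precedence, and context, which are exactly what the ANTLR parser exists to handle:

```python
# Toy illustration of the function-name mapping above -- NOT the tool's
# ANTLR-based converter, which handles nesting and context properly.
import re

NAME_MAP = {"CEILING": "ceil", "LEN": "strlen", "POWER": "pow"}
_PATTERN = re.compile(r"\b(" + "|".join(NAME_MAP) + r")\s*\(")

def map_function_names(formula: str) -> str:
    return _PATTERN.sub(lambda m: NAME_MAP[m.group(1)] + "(", formula)

print(map_function_names("CEILING([Sales]) + POWER([Sales], 2)"))
# -> ceil([Sales]) + pow([Sales], 2)
```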
Formulas marked as "TBD" (To Be Determined) require manual conversion after migration.
Important: The tool targets all major cloud data warehouses (Snowflake is used in this demo); using another CDW requires appropriate changes to the code.
The tool uses Snowflake for both:
- Data Sources: Source tables for migration
- Analytics: Migration metadata storage
Staging views:

- STG_WORKSHEET_NOT_IN_liveboard
- STG_DATASOURCE_NOT_IN_WORKSHEET
- STG_TABLES_CONNECTED_WITH_UNSUPPORTED_DATA_PLATFORM
1. Data Extraction: Tableau workbooks are parsed and metadata extracted
2. Raw Data Load: metadata is ingested into the `RAW_DATA_DUMP` table
3. Data Transformation: six SQL stored procedures transform and load the data (see the sketch after this list):
   - MIGRATION_EXECUTION_HEADER
   - twb_file
   - Header_table
   - Detail_table
   - View_Model
   - POPULATE_WORKSHEET_HEADER2
4. Analytics Ready: data is available for the migration analysis Liveboard
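The transformation step boils down to invoking the six procedures in order. A sketch, assuming argument-less `CALL` signatures (verify against `Schema.sql`) and a connection like the one in the installation section:

```python
# Sketch: run the six transformation procedures in sequence.
# CALL signatures are assumed argument-less; check Schema.sql.
TRANSFORM_PROCS = [
    "MIGRATION_EXECUTION_HEADER",
    "twb_file",
    "Header_table",
    "Detail_table",
    "View_Model",
    "POPULATE_WORKSHEET_HEADER2",
]

def run_transformations(conn) -> None:
    cur = conn.cursor()
    for proc in TRANSFORM_PROCS:
        cur.execute(f"CALL {proc}()")
```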
- Configure Connection: set up the connection between Snowflake and ThoughtSpot
- Import TML Files: import all files from the `Tableau Evaluation Report TML` folder into your ThoughtSpot cluster
- Access Liveboard: the tool provides a direct link to the staging Liveboard after execution
The tool uses a predefined staging environment:
https://ps-internal.thoughtspot.cloud/?param1=Execution_ID&paramVal1={value}&#/pinboard/caf7e1f5-823a-41ce-9462-bfbd07bd7903
RUNTIME PARAMETER in OBJECT URL: DOCUMENTATION
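The runtime parameter simply carries the execution ID into the Liveboard filter, so the link can be rebuilt for any run. A small sketch (the execution ID value is whatever your run produced):

```python
# Rebuild the staging Liveboard link for a given execution ID.
from urllib.parse import quote

def liveboard_url(execution_id: str) -> str:
    base = "https://ps-internal.thoughtspot.cloud/"
    pinboard = "caf7e1f5-823a-41ce-9462-bfbd07bd7903"
    return (f"{base}?param1=Execution_ID&paramVal1={quote(execution_id)}"
            f"&#/pinboard/{pinboard}")
```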
- Migration feasibility analysis
- Unsupported feature identification
- Data source complexity assessment
- Conversion success metrics
- Formula conversion status
- Join compatibility analysis
| Feature Category | Live Connections | Extract Connections | Details |
|---|---|---|---|
| Data Sources | ✅ TML files | ✅ SQL files | Tables with join relationships |
| Custom SQL | ✅ SQL View TML, Table TML for SQL Proxy | ✅ Included in SQL | Custom query support |
| Calculated Fields | ✅ Model formulas | ✅ Model formulas | ANTLR-based conversion |
| Joins | ✅ Model joins | ✅ SQL joins | Complete relationship support |
| Filters | ✅ Model/Live filters | ✅ SQL WHERE clauses | Basic filtering capabilities |
| Data Platform | ✅ All major CDWs | ✅ All major CDWs | -- |
| Advanced Features | ❌ Parameters, Groups, Bins, Sets | ❌ Parameters, Groups, Bins, Sets | Planned for future releases |
Generated TML files include:

- Hardcoded connection name: `dgprojectTS` (requires manual update; see the sketch below)
- Index type: `DONT_INDEX` for all columns
- Default aggregation: based on column type mapping
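Because the connection name is hardcoded, a small post-processing pass can retarget the generated files before import. A sketch using plain text replacement (it assumes `dgprojectTS` only ever appears as the connection name; a TML-aware library would be more robust):

```python
# Sketch: retarget the hardcoded connection name in generated TML files.
# Plain text replacement -- assumes "dgprojectTS" only ever appears as
# the connection name in these files.
from pathlib import Path

def retarget_connection(out_dir: str, new_name: str) -> None:
    for tml in Path(out_dir).rglob("*.tml"):
        text = tml.read_text()
        if "dgprojectTS" in text:
            tml.write_text(text.replace("dgprojectTS", new_name))

retarget_connection("output_folder/Live", "my_snowflake_connection")
```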
We welcome contributions from the community! Please see our Contributing Guidelines below.
- Fork the repository and clone your fork
- Create a feature branch: `git checkout -b feature/your-feature-name`
- Set up the development environment following the installation instructions
- Code Style: Follow PEP 8 guidelines
- Type Hints: Use type annotations where appropriate
- Documentation: Add docstrings for new functions/classes
- Testing: Write tests for new features in the `tests/` directory
1. Run Tests: ensure all tests pass with `python -m pytest tests/`
2. Commit Changes: use conventional commit messages:
   - `feat:` for new features
   - `fix:` for bug fixes
   - `docs:` for documentation updates
   - `test:` for test additions
   - `refactor:` for code refactoring
3. Submit Pull Request: include a clear description of changes and testing performed
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Documentation update
- [ ] Test addition
- [ ] Refactoring
## Testing
Description of testing performed
## Checklist
- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Documentation updated
- [ ] Tests pass
- [ ] No new warnings generated

Known limitations:

- Hardcoded Values: connection names and URLs require manual updates
- Formula Coverage: Some complex formulas may need manual conversion
- Extract Behavior: Generates SQL files, not TMLs for direct ThoughtSpot import
Please refer to the LICENSE file for details.
- Issues: Report bugs and request features via GitHub Issues
- Documentation: Refer to the code documentation and this README
- Community: Join our community discussions for help and best practices
Supported Python Versions: 3.10.x - 3.12.x (Recommended: 3.11.5)
