Data Forge
Data Forge generates business-valid, cross-table, time-consistent synthetic data for databases, APIs, and pipelines.
Not random fixtures: test-ready data that respects schemas, foreign keys, and business rules, with optional anomaly injection. Built for demos, UAT, integration testing, and pipeline development.
Schema-driven: Use your DDL, JSON Schema, or OpenAPI
10+ domain packs: SaaS, e-commerce, fintech, healthcare, IoT
ETL modes: full snapshot, incremental, CDC, bronze/silver/gold
Export to Parquet, JSON, CSV, or SQL; load into Postgres, Snowflake, or BigQuery
Integrations: dbt seeds, Great Expectations, Airflow DAGs
Open source. Python backend. Next.js UI. Local-first.
https://github.com/ojasshukla01/data-forge
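To make "cross-table, time-consistent" concrete, here is a minimal standalone sketch of the idea in plain Python. It does not use Data Forge's API. The table names, fields, and the helpers `generate` and `find_violations` are all made up for illustration. Every order points at a real customer (foreign key) and is never dated before that customer signed up (time consistency).

```python
import random
from datetime import datetime, timedelta


def generate(n_customers=5, max_orders=3, seed=42):
    """Build two related tables: customers and orders (hypothetical schema)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    start = datetime(2024, 1, 1)
    customers, orders = [], []
    for cid in range(1, n_customers + 1):
        signed_up_at = start + timedelta(days=rng.randint(0, 180))
        customers.append({"id": cid, "signed_up_at": signed_up_at})
        for _ in range(rng.randint(0, max_orders)):
            # Orders are offset forward from signup, so they can never precede it.
            created_at = signed_up_at + timedelta(
                days=rng.randint(0, 90), minutes=rng.randint(0, 1439)
            )
            orders.append({
                "id": len(orders) + 1,
                "customer_id": cid,  # always references an existing customer
                "created_at": created_at,
                "amount_cents": rng.randint(500, 50_000),
            })
    return customers, orders


def find_violations(customers, orders):
    """Return a list of (order_id, reason) for every broken rule."""
    signup_by_id = {c["id"]: c["signed_up_at"] for c in customers}
    violations = []
    for o in orders:
        if o["customer_id"] not in signup_by_id:
            violations.append((o["id"], "orphan foreign key"))
        elif o["created_at"] < signup_by_id[o["customer_id"]]:
            violations.append((o["id"], "order predates signup"))
        if o["amount_cents"] <= 0:
            violations.append((o["id"], "non-positive amount"))
    return violations


if __name__ == "__main__":
    customers, orders = generate()
    print(len(customers), "customers,", len(orders), "orders")
    print("violations:", find_violations(customers, orders))
```

Generating child rows relative to their parent row, rather than sampling each table independently, is what keeps the constraints true by construction. The validator is there to catch regressions, and it would also flag deliberately injected anomalies.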
Would love feedback, especially from other data engineers. What would make this more useful for your workflows?

