Time-Series Synthetic Data Generation (Internship at Govtech)
This post covers some highlights of my internship projects as a Research Engineer at Govtech.
This post is a work in progress, as the internship is still ongoing.
Govtech Singapore is a government agency that serves as the tech arm of Singapore’s government. It builds tech-based products to make lives better for citizens, businesses, international audiences and the public service sector.
During my internship, I was attached to the Data Engineering Practice, making data-focused products to help other government agencies.
I can publish more specific details about the project once my work is open-sourced as an educational resource for other government agencies (expected around May 2025).
Some expected technical learnings from the project include:
- Large-scale ML experiment design
- Literature review of sequential (mostly time-series) synthetic data generation techniques
- In-depth study and exploration of relevant ML models (probabilistic autoregressive models, GANs, Fourier analysis-based models, diffusion models, Transformer hybrid models, etc.)
- White paper research, writing, and publication
The white paper is expected to be open-sourced at the end of the project as well.