Data Engineer @ datastack.tv
Alexandra is a Google Cloud Certified Data Engineer & Architect and the founder of datastack.tv, a learning platform for the modern data stack. At datastack.tv, she and a handful of other technical instructors create concise screencast video tutorials for data engineers. Alexandra created the Modern Data Engineer Roadmap 2020, which gained 2k stars on GitHub in just a few weeks.
With the release of Cloud Workflows, it is now possible to build a fully serverless workflow scheduler on Google Cloud Platform.
Today, Apache Airflow is the mainstream workflow scheduler at most companies. On Google Cloud, you have two options for setting up Apache Airflow infrastructure: use Cloud Composer, a managed Kubernetes-based Apache Airflow service, or provision your own infrastructure. Even with the managed option, you need to decide on the number of Kubernetes nodes, the machine type, and the disk size.
In this experiment, Alexandra attempts to build a fully serverless workflow management solution on Google Cloud by combining Cloud Workflows, Cloud Scheduler, and Cloud Functions. Because none of these services requires provisioning nodes or disks, the solution is expected to drastically reduce the cost and effort needed to provision and maintain a workflow scheduler.
Alexandra will share her learnings from this experiment.
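To give a rough idea of how the three services could fit together, here is a minimal sketch, not the setup from the talk. It assumes each task is an HTTP-triggered Cloud Function and generates a Cloud Workflows definition (Workflows accepts YAML or JSON) that calls the functions one after another. The function name build_workflow, the step names, and the example URLs are all hypothetical.

```python
import json


def build_workflow(function_urls):
    """Build a Cloud Workflows-style definition that calls each URL in order.

    Assumption: every URL points at an HTTP-triggered Cloud Function that
    accepts an authenticated GET request.
    """
    steps = []
    for i, url in enumerate(function_urls):
        steps.append({
            f"call_task_{i}": {
                "call": "http.get",  # Workflows' built-in HTTP GET call
                "args": {"url": url, "auth": {"type": "OIDC"}},
                "result": f"task_{i}_result",  # variable holding the response
            }
        })
    # Final step: end the workflow and return a fixed marker value.
    steps.append({"finish": {"return": "done"}})
    return {"main": {"steps": steps}}


if __name__ == "__main__":
    # Hypothetical function URLs, for illustration only.
    urls = [
        "https://example.com/extract",
        "https://example.com/load",
    ]
    print(json.dumps(build_workflow(urls), indent=2))
```

In a setup like this, the generated definition would be deployed as a workflow, and a Cloud Scheduler job would start an execution on a cron schedule, so no cluster has to be sized or kept running between runs.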