Skills & Keywords
Apache Airflow, Apache Spark, Automated testing, Data Lakes, Data Warehouses, Dataset validation, Distributed Computing, Flink, Flyte, Machine Learning, Machine Learning Pipelines, Observability
Job Description
Automate testing for data pipelines; Build a large-scale offline data platform; Design and operate data pipelines for training datasets; Develop batch and stream data processing infrastructure; Enable large-scale experimentation and model iteration; Improve dataset validation and monitoring; Integrate data pipelines with workflow orchestration; Lead architectural improvements for scalability, reliability, and cost; Optimize distributed compute performance and resource utilization