Machine Learning Systems Engineer
Eight Sleep
- Architected a model training stack to scale up data and compute: self-hosted Anyscale on AWS using aws-cdk, dockerized environments for CI, EC2, and Anyscale, and built a config-driven training paradigm using omegaconf and ray
- Saved ~$60K/year by reducing the memory footprint of a production Kinesis pipeline through signal processing algorithm optimizations
- Sped up dataset preparation and model evaluations ~100x by moving local workflows to AWS ECS
- Developed observability pipelines that detected and patched failures in 10% of devices in the field
- Built a live data streamer and several data parsers to enable experiments with new sensors
aws-cdk, ec2, ecs, kinesis, sqs, dynamo-db, postgresql, s3, node, python, torch, onnx, ray, anyscale, docker, datadog