10.19.2021

Cheaper and 3X Faster Parallel Model Inference with Ray Serve

Wildlife Studios’s ML team was deploying sets of ensemble models using Flask. That setup quickly became too hard and too expensive to scale. By switching to Ray Serve, Wildlife Studios improved latency and throughput while reducing cost. Ray Serv...