Metaflow: Master Learnings from Netflix

Sep 1, 2022, by Pooja S Kumar

Machine Learning

Most of us believe in Netflix supremacy when it comes to streaming movies and series online. Kudos to the masterminds behind it! Now how about considering Netflix’s recommendation system as a synonym for Machine Learning?

Yes, you heard it right. It’s not just their innovative ideas and appealing user interface that made them a hit, but their decision to leverage machine learning before their competitors did. 

With machine learning, a system can learn from data and carry out tasks with little human intervention. Besides Netflix, several other companies use machine learning to improve their operations, including Amazon, Walmart, Apple, Google, and Facebook. Our team at Dexlock also enjoys assisting clients with machine learning projects. If your company has a dream project that uses an ML framework, we are at your disposal.

Why Metaflow?

Netflix is famous for its sophisticated recommendation system. Its interface reflects your personal preferences: when you log in, the TV shows and movies recommended to you are the output of its machine learning system.

As one of the biggest studios, Netflix relies on data science for efficient content production. Apart from that, it uses ML in diverse areas, from filtering the right content for its subscribers to detecting payment fraud. As the company grew, so did the number and complexity of its machine learning use cases. Gradually, its data scientists ran into difficulties with data access and processing. That was when they realised the need for an advanced, human-centric framework to ease their tasks. This was the motivation behind the human-centric Python framework “Metaflow”.

Metaflow is an open-source platform that assists data scientists in managing, deploying, and running their code in a production environment, which makes it a real rescue for them. Netflix uses Metaflow to build and manage hundreds of data science projects, from natural language processing to operations research. The framework helps data scientists with many aspects of their work, such as automation, parallelization, orchestration, and failover.

What do they do?

Data is accessed from a data warehouse, which can be a folder of files, a database, or a multi-petabyte data lake. The modeling code that crunches the data runs in a compute environment, and a job scheduler orchestrates multiple units of work. The team then has to architect the code to be executed by structuring it as an object hierarchy, Python modules, or packages.

From early on, this machine learning code could be described as a DAG (Directed Acyclic Graph). In short, a DAG represents a workflow: it describes the steps you take to process and transform data. In other words, it is an abstraction for a data pipeline.
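To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use Metaflow itself; the step names are hypothetical, and the ordering comes from the standard library's `graphlib` module (Python 3.9+). Each step lists the steps it depends on, and a topological sort gives one valid order in which to run them.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of
# steps that must finish before that step can start.
pipeline = {
    "load_data": set(),
    "clean_data": {"load_data"},
    "train_model": {"clean_data"},
    "compute_stats": {"clean_data"},
    "report": {"train_model", "compute_stats"},
}


def execution_order(dag):
    """Return one valid order in which the steps of the DAG can run."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is what the "acyclic" in DAG rules out.
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

Note that `train_model` and `compute_stats` depend only on `clean_data`, so they could run in either order, or in parallel. That freedom is what a scheduler can exploit.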

The ‘Flow’ in Metaflow is an outline of the steps you want to execute and the order in which they should be executed. Metaflow also adds the concept of a ‘run’, a single execution of a flow, which was something new and smart.

Each time you run a flow, either locally or on AWS, all the metadata related to the flow will be automatically stored in S3 or a central database for later retrieval.
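The idea of recording every run can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how Metaflow stores its metadata: the functions `record_run` and `load_run`, the JSON layout, the flow name, and the temporary directory standing in for S3 or a central database are all hypothetical.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


def record_run(store_dir, flow_name, params):
    """Write one JSON metadata record for a run and return its run ID."""
    run_id = uuid.uuid4().hex  # a unique ID per run (assumed scheme)
    record = {
        "flow": flow_name,
        "run_id": run_id,
        "started_at": time.time(),
        "params": params,
    }
    path = Path(store_dir) / flow_name / f"{run_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record))
    return run_id


def load_run(store_dir, flow_name, run_id):
    """Read back the metadata of an earlier run by its ID."""
    path = Path(store_dir) / flow_name / f"{run_id}.json"
    return json.loads(path.read_text())


# A temporary local directory stands in for the shared metadata store.
metadata_dir = tempfile.mkdtemp()
first_run = record_run(metadata_dir, "TrainingFlow", {"learning_rate": 0.01})
second_run = record_run(metadata_dir, "TrainingFlow", {"learning_rate": 0.1})
print(load_run(metadata_dir, "TrainingFlow", first_run)["params"])
```

Because each run gets its own record, an earlier run can be looked up later without being overwritten by a newer one.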

Metaflow allows you to pass variables across steps. It abstracts away (de-)serialization using Pickle and S3, which works fine for most Python-based machine learning workflows. Because this is abstracted away, state can also be serialized in different ways, for example when running locally. This makes things a lot easier, right?
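As a rough picture of what passing variables across steps involves, the sketch below pickles values into a small datastore and reads them back in the next step. It is an assumption-laden illustration rather than Metaflow's implementation: `InMemoryDatastore`, `prepare`, and `train` are hypothetical names, and a dictionary stands in for the local disk or S3.

```python
import pickle


class InMemoryDatastore:
    """Hypothetical stand-in for a datastore such as a local folder or S3."""

    def __init__(self):
        self._blobs = {}

    def save(self, key, value):
        # Serialize the Python object to bytes before storing it.
        self._blobs[key] = pickle.dumps(value)

    def load(self, key):
        # Deserialize the stored bytes back into a Python object.
        return pickle.loads(self._blobs[key])


def prepare(datastore):
    """First step: produce some data and persist it for later steps."""
    datastore.save("features", [[1.0, 2.0], [3.0, 4.0]])


def train(datastore):
    """Second step: read the previous step's output and derive a result."""
    features = datastore.load("features")
    datastore.save("row_sums", [sum(row) for row in features])


datastore = InMemoryDatastore()
prepare(datastore)
train(datastore)
print(datastore.load("row_sums"))  # prints [3.0, 7.0]
```

The steps only call `save` and `load`, so swapping the dictionary for a different backend would not change the step code. That is the kind of convenience the abstraction buys.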

That’s enough of the Metaflow backstory. Now let’s take a look at the use cases.

Metaflow can be used for tracking, deployment, cloud integration, scaling, versioning, debugging, resuming failed flows, and much more.

For Netflix, Metaflow was indeed their best decision. It was as important for them as using Git for the code. The data scientists were able to manage their code all the way through deployment, giving the engineers more time to focus on other parts of the system. The Metaflow framework aims for “seamless scalability”, and it achieves that very well. Getting it to work on AWS requires a bit of manual setup, but the process is well documented.

Metaflow is definitely worth a look if you are a data scientist looking for help with managing, deploying, and tracking your work. If you have a project that utilises this framework, connect with us here.

Disclaimer: The opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Dexlock.