Practical Green MLOps

July 21, 2022

A manifesto on environmentally sustainable AI infrastructure.

When we consider the environmental impact of AI and MLOps, our attention usually focuses on the significant energy consumed by the compute cycles required for training. After all, a single high-performance GPU can draw 300 watts of power, which means that running 5,000 of them at full power for two weeks (to train a very large language model, for example) would use approximately 500,000 kWh – or roughly the energy needed to power 50 houses for an entire year.
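The back-of-the-envelope arithmetic can be reproduced in a few lines; the 300 W draw, 5,000-GPU fleet, two-week run, and 10,000 kWh-per-year household figure are all illustrative assumptions, not measurements:

```python
# Back-of-the-envelope estimate of training energy (illustrative figures).
gpu_power_w = 300                 # per-GPU draw at full load
num_gpus = 5_000
hours = 14 * 24                   # two weeks

energy_kwh = gpu_power_w * num_gpus * hours / 1_000
print(f"{energy_kwh:,.0f} kWh")   # 504,000 kWh

household_kwh_per_year = 10_000   # rough annual usage of one house
houses = energy_kwh / household_kwh_per_year
print(f"about {houses:.0f} houses powered for a year")  # about 50 houses
```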

But the power efficiency of individual GPUs is not the most critical factor in the overall power requirements of real-world AI development and deployment. The operational and workflow best practices known under the umbrella term MLOps can change the development process's efficiency – and therefore its overall power requirements – by over three orders of magnitude.

AI as a force for good

First, let’s zoom out for a moment and talk about why any of this matters. To put it briefly, AI has the potential to meaningfully change the world for the better for all of humanity.

To give just a few examples, in the near future AI systems developed by Google’s DeepMind may help solve the mystery of how to produce self-sustaining fusion reactions, providing the world with a safe and clean new energy source that neither destroys the environment nor produces dangerous pollution. [1]

And even sooner than that, we may even see real-world AI systems controlling distributed fleets of automated robotaxis at a fraction of the cost of current transportation; and AI systems that allow intelligent robots to safely carry out routine tasks around the average home (just to name two of Elon Musk’s most recently announced AI projects). [2]

In healthcare and medicine, there are likewise revolutionary advances that are waiting to be made by AI researchers in terms of new drug discoveries and medical technologies with the potential to improve lifespan and quality of life for millions worldwide. [3]

That being said, as practitioners we also need to make sure that this powerful technology is used ethically and in a way that does not destroy the environment.

Ethical AI, Green MLOps, and Climate Change

Ethical AI is concerned with ensuring that AI adheres to fundamental ethical values such as privacy, fairness, and freedom of choice; as well as with minimizing the potential for violation of rights, harm, or injustice that could result from the operation of AI systems, either because of poor design or deliberate or accidental misuse. It is also a belief in the potential for AI to lead to fundamental positive and beneficial change for humanity and the world.

In the most general terms, if Ethical AI is a guiding philosophy for why to do AI, then Green MLOps is more concerned with how to do AI. Green MLOps is concerned with promoting zero-emission options for AI development and deployment as well as supporting efficiency improvements that impact its overall energy usage.

This brings us back to one of the most important factors that can affect the overall power requirements of real-world AI development and deployment.

While the power efficiency and AI capabilities of GPUs are incredibly important, there are actually other factors that can affect the efficiency and thereby the power requirements of the development process itself by up to 3000 times.

We refer to this excess CO2e expenditure resulting from unoptimized ML development practices as “Digital Waste.”

For example, recent research from the University of Massachusetts Amherst shows that inefficient hyperparameter tuning and experimentation procedures can increase the energy usage for training a single model by 2000x.

In addition, they showed that wasted time and energy resulting from a poorly managed neural architecture search phase can increase energy usage for training a large Transformer-based NLP model by 3000x.

Robust MLOps best practices that simplify pipeline creation and modification and allow for instrumentation and automation at all stages significantly reduce the overhead for tuning, experimentation, and neural architecture search, dramatically reducing this digital waste produced by ML development processes.

In the hyperparameter tuning example from the research above, the researchers used the grid search technique. While this method is thorough (it evaluates every possible combination of hyperparameters), it is also the most expensive. As more efficient algorithms for hyperparameter tuning and neural architecture search mature, researchers can adopt these advanced approaches; see https://github.com/microsoft/nni for an example of a framework that implements many of them.
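To make the cost difference concrete, here is a small sketch comparing the number of full training runs an exhaustive grid search requires against a fixed random-search budget; the search space and budget are made-up values for illustration:

```python
import itertools
import random

# Hypothetical search space: 4 hyperparameters with 4 candidate values each.
space = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [32, 64, 128, 256],
    "dropout": [0.0, 0.1, 0.3, 0.5],
    "num_layers": [2, 4, 6, 8],
}

# Grid search trains one model for every point in the Cartesian product.
grid_runs = len(list(itertools.product(*space.values())))
print(grid_runs)     # 256 full training runs

# Random search caps the budget at a fixed number of sampled configurations.
budget = 20
random.seed(0)
sampled = [{k: random.choice(v) for k, v in space.items()} for _ in range(budget)]
print(len(sampled))  # 20 training runs – over 12x fewer
```

Every candidate value added to any hyperparameter multiplies the grid-search cost, while the random-search budget stays fixed – which is why the energy gap grows so quickly with search-space size.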

Furthermore, recent research at Google AI and elsewhere has shown that the use of sparsity strategies (where a model has a very large capacity, but only some parts of the model are activated for a given task, example, or token), as well as other strategies including better model architectures, can improve the energy efficiency of models by 100x and produce 650x less CO2e emissions compared to baseline transformer models. These strategies can be combined for maximum efficiency gains and include:

  • Using a sparsely-gated mixture-of-experts layer
  • Using switch transformers that pair a mixture-of-experts architecture with transformers, or
  • Using the GLaM model, which uses multiple mixture-of-experts-style layers to produce a model that exceeds the accuracy of the GPT-3 model
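The core idea behind the first of these strategies can be sketched in a few lines of NumPy: a gating network scores the experts for each token, and only the top-k experts actually run. The dimensions, expert count, and random toy weights below are assumptions for illustration, not any published architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, k = 16, 8, 2

# Toy parameters: one gating matrix, plus one weight matrix per expert.
W_gate = rng.standard_normal((d_model, num_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]

def moe_forward(x):
    """Route each token to its top-k experts; the other experts never run."""
    logits = x @ W_gate                          # (tokens, num_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts
    out = np.zeros_like(x)
    used = set()
    for t in range(x.shape[0]):
        # Softmax over just the selected experts' gate scores.
        sel = logits[t, topk[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()
        for g, e in zip(gates, topk[t]):
            out[t] += g * (x[t] @ experts[e])    # only k matmuls per token
            used.add(int(e))
    return out, used

tokens = rng.standard_normal((4, d_model))
y, used = moe_forward(tokens)
print(y.shape, f"{len(used)}/{num_experts} experts activated")
```

The compute per token scales with k rather than with the total number of experts, which is how these models keep a very large capacity while spending only a fraction of the energy of an equally large dense model.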

Another source of digital waste is repeatedly training models on the same data and parameters due to poor data and model management. Proper management of data and lineage, experiment tracking, and storage of models and metadata can significantly reduce this excess energy consumption.
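One minimal way to avoid such repeated runs is to fingerprint each training job by its data version and parameters and reuse the stored result when the fingerprint matches. The cache layout and helper names below are illustrative, not any specific tool's API:

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path

CACHE = Path(tempfile.mkdtemp())  # in practice: a shared, persistent artifact store

def run_key(data_version: str, params: dict) -> str:
    """Stable fingerprint of everything that determines the training result."""
    blob = json.dumps({"data": data_version, "params": params}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()[:16]

def train_or_load(data_version, params, train_fn):
    """Skip training entirely when an identical run has already been stored."""
    path = CACHE / f"{run_key(data_version, params)}.pkl"
    if path.exists():
        return pickle.loads(path.read_bytes()), True  # cache hit: no GPU time spent
    model = train_fn(params)
    path.write_bytes(pickle.dumps(model))
    return model, False

# Toy "training" step; in practice this is the expensive GPU job.
toy_train = lambda p: {"weights": [p["lr"]] * 3}
m1, cached1 = train_or_load("v1", {"lr": 0.01}, toy_train)
m2, cached2 = train_or_load("v1", {"lr": 0.01}, toy_train)
print(cached1, cached2)  # False True – the second call reuses the stored model
```

Experiment-tracking platforms implement the same idea at scale, keying stored artifacts on data lineage and configuration so that identical runs are looked up rather than retrained.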

The research above, together with a growing body of anecdotal evidence from the AI development projects of our enterprise customers and university colleagues, suggests that adopting MLOps best practices can have a greater impact on the energy footprint of AI development than any other single decision.

Recognizing the limited options available to AI practitioners who want to pursue responsible and ethical AI at scale, Neu.ro announced its commitment to Green MLOps in 2021 – supported by our Zero-Emissions AI Cloud and our high-efficiency, waste-reducing cloud and on-prem MLOps software offerings.

Neu.ro’s Green MLOps Initiative had its genesis in our ongoing client work supporting AI workloads for cloud service providers and developing custom AI systems in a wide range of industries. As we saw the power of AI to produce sustained growth and drive operational efficiencies first-hand, we became increasingly convinced of the need for businesses to both dramatically scale-up near-term AI development and deployments and to do so in a sustainable and carbon-neutral manner. 

Neu.ro invites AIIA members to join the Green MLOps initiative and will be proud to share its experience and approach with a broader audience. 

REFERENCES

[1]  Google’s DeepMind project has been working with fusion researchers to achieve stable plasmas in tokamak reactors at progressively higher energies by dynamically controlling magnetic actuator coils in the reactors with deep learning systems. As a result of this research, the estimate for the invention of a sustainable fusion energy reactor has been reduced from thirty years to ten years.

https://www.deepmind.com/blog/accelerating-fusion-science-through-learned-plasma-control

[2] Autoweek, Elon Musk Promises Robots and Robotaxis https://www.autoweek.com/news/technology/a39797326/elon-musk-tesla-robots-robotaxis/

[3] TechCrunch, The next healthcare revolution will have AI at its center https://tcrn.ch/2XAiONX