Are data and AI sustainable?
The concept of sustainable development is defined by the World Commission on Environment and Development, in its report Our Common Future, as “development that meets the needs of the present without compromising the ability of future generations to meet their own needs”. In this concept, the environment, the economy and society are identified as three interrelated elements that together contribute to sustainability.
So the question is: what impact do data and AI have on the environment? We will divide it into three questions.
First, how do data and AI contribute to solving environmental issues?
Second, what are the challenges associated with using data and AI to solve environmental issues?
Third, how does the data and AI value chain put pressure on the environment?
In a sense, to borrow the pharmakon metaphor, data can be considered both a poison and a remedy for sustainability.
Data and AI used to solve environmental issues
Many studies have investigated how data and AI can be used to solve sustainability issues. In particular, a recent one focused on the chemical industry and reviewed the main research and use cases of AI applied to sustainability issues. Its authors reported several positive impacts, for example increased energy efficiency, the choice of more efficient energy sources and reduced emissions to air.
Another study analysed AI-driven engineering solutions and showcased other positive use cases: machine learning models to predict stream flow and examine water quality parameters, natural language processing solutions to predict ecosystem services, and artificial neural networks applied to the energy use of buildings and to energy production and distribution.
In a recent review of 287 papers on AI and sustainability, Kar, Choudary and Singh (2022) identify the sectors that use AI the most for sustainability: transportation, energy, medical, IT, education, food, services, construction and manufacturing. The same review lists the methods used and the main use cases.
One hypothesis is common to these three studies: because data and AI offer unique benefits (such as automating tasks, identifying new insights in vast amounts of unstructured information, and performing complex calculations and simulations), they can be used to solve complex environmental issues. Given the number of studies covered by the three reviews, we can reasonably assert that data and AI are widely used to address environmental issues.
But is this enough to conclude? Probably not.
Challenges to secure positive impact
Even though the technology works well and the use cases show significant results, some challenges prevent it from fully delivering on its promises. Interestingly, not all of them concern the technology. Most often they relate to how humans use the technology and to the scope chosen to calculate the impacts.
Five major challenges are identified in recent research by Nishant, Kennedy and Corbett (2020):
- reliance on historical data: historical datasets (on temperature and air composition, for example) may have limited value because past data reflect eras and climate cycles that predate extensive human activity,
- uncertain human behavioral responses to AI-based interventions: AI applications are often static and isolated, and fail to take into account the complex interactions between human, ecological, political, economic and social stakeholders and subsystems. Rebound effects are a good example of such challenges: many studies have shown that eco-efficiency innovations often resulted in an increase in the total consumption of limited resources,
- increased cybersecurity risks: because using AI techniques for environmental sustainability requires integrating data sets from a wide range of data owners, formats and structures, the risk of hacking increases,
- adverse impacts of AI applications: this is the direct environmental footprint of AI applications, which we describe in more detail in the next section,
- inadequate measurements of performance or intervention strategies: measuring both the positive and negative impacts of AI solutions for sustainability will be difficult, because the success of AI for sustainability requires a holistic metric that combines technical and predictive performance models, measures acceptance by target audiences and is cognizant of AI’s technical limitations and the complexity of climate crises.
Although AI solutions are inherently technical, their success will be determined by how well they navigate and influence psychological, sociological, and organisational factors that currently impede human progress in this area. (Nishant, Kennedy and Corbett, 2020).
So, the impact on the environment will not only be realised through technical solutions. Sociological and psychological dimensions will come into play.
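To make the rebound effect mentioned among the challenges more concrete, here is a minimal arithmetic sketch. The function name and the 20% and 30% figures are hypothetical and chosen only for illustration; they do not come from the studies cited above.

```python
def total_consumption_change(efficiency_gain: float, usage_increase: float) -> float:
    """Return the relative change in total resource consumption.

    efficiency_gain: fraction by which consumption per unit of service falls
                     (0.20 means 20% less per unit).
    usage_increase:  fraction by which use of the service grows in response
                     (0.30 means 30% more use).
    """
    per_unit = 1.0 - efficiency_gain   # consumption per unit after the innovation
    usage = 1.0 + usage_increase       # volume of use after the innovation
    return per_unit * usage - 1.0      # net change relative to the starting point


# Hypothetical figures, for illustration only.
change = total_consumption_change(efficiency_gain=0.20, usage_increase=0.30)
print(f"Change in total consumption: {change:+.1%}")
```

With these assumed numbers, a 20% efficiency gain is more than offset by a 30% increase in use, so total consumption ends up slightly higher than before. This is why the scope of the measurement matters as much as the technology itself.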
But is this enough to conclude? Not quite: we need to address one last aspect of the question, the direct environmental footprint of data and AI.
The data value chain exerts significant pressure on the environment
Delivering that positive impact on the environment and society requires a whole value chain, which itself exerts pressure on the environment. From raw data to insight generation and decision-making, four main blocks make up the data and AI value chain:
- Data generation refers to the activities and assets necessary to capture and record data. It blends data directly accessible to the company (ERP, social media, website and applications, connected device data, …) and data the company sources from other companies or organisations (open data sets, bilateral data procurement agreements, …).
- Data collection consists of the activities and assets used to collect, validate and store data. E.g.: cleansing, reduction, integration, storage infrastructure and models, security, …
- Data analysis refers to the activities and assets used to analyse data and generate insights. E.g.: semantic analysis, models (predictive, descriptive, prescriptive), visualisation (graphs, maps, 3D, …), …
- Data exchange refers to the activities that expose outputs internally and externally. E.g.: decision making, trading.
This value chain involves devices and sensors to capture the data, networks to transmit it and data centers to store it. Building and transporting these devices and products requires natural resources and energy. Additional effort is required to manage waste at the end of their life cycle. Beyond the products themselves, energy is required to operate them and to run machine learning algorithms. Lastly, greenhouse gases, including CO2, are emitted throughout the life cycle.
A recent report assessed France along these dimensions. Two of its findings stand out:
- compared with networks and data centers, devices (smartphones, computers, …) account for the biggest share of the impact (64% of energy, 84% of greenhouse gas emissions, 91% of water, 79% of resources),
- when analysing the life cycle, the majority of impacts occur during the manufacturing phase (not the use phase). For example, of the 84% of total greenhouse gas emissions attributable to devices, 76 percentage points occur during manufacturing and 8 during use.
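The greenhouse gas figures quoted above can be read as shares of the total footprint. A small sketch of the arithmetic, using only those figures (the variable names are ours), shows what fraction of the device footprint comes from manufacturing:

```python
# Shares of the total greenhouse gas footprint, in percent, as quoted in the text.
devices_total = 84          # all device-related emissions
devices_manufacturing = 76  # of which emitted during manufacturing
devices_use = 8             # of which emitted during use

# The two phases should add up to the device total.
assert devices_manufacturing + devices_use == devices_total

# Fraction of the device footprint that comes from manufacturing.
manufacturing_share_of_devices = devices_manufacturing / devices_total

# What is left for networks and data centers combined.
networks_and_data_centers = 100 - devices_total

print(f"Manufacturing share of device emissions: {manufacturing_share_of_devices:.1%}")
print(f"Networks and data centers combined: {networks_and_data_centers}% of total emissions")
```

In other words, roughly nine tenths of the device-related emissions are incurred before the device is ever switched on, which suggests that extending device lifetimes matters more than optimising their use.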
Specifically on energy, Kate Saenko, associate professor of computer science at Boston University, warns: “AI is getting more expensive in terms of power to train the newer models.”
The power consumption of neural networks has two main sources: the computing used to train the model and the computing used to run inference, that is, to apply the trained model to new data. Training the model takes a lot of computing. According to researchers associated with OpenAI, it increases by a factor of 10 each year. The search for maximum accuracy requires a lot of training. A language processing model might be able to understand 95 per cent of what people say, but wouldn’t it be great if it could handle exotic words that hardly anyone uses? More importantly, your autonomous vehicle must be able to stop in dangerous conditions that rarely ever arise.
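To get a feel for what a tenfold increase per year implies, the sketch below compounds the growth factor quoted above and converts it into an equivalent doubling time. It assumes the growth is a steady exponential, which is a simplification; the function names are ours.

```python
import math

ANNUAL_GROWTH_FACTOR = 10  # figure quoted in the text


def compute_multiplier(years: float, annual_factor: float = ANNUAL_GROWTH_FACTOR) -> float:
    """Training compute after `years`, relative to today, under steady growth."""
    return annual_factor ** years


def doubling_time_months(annual_factor: float = ANNUAL_GROWTH_FACTOR) -> float:
    """Months needed for training compute to double under steady growth."""
    return 12 * math.log(2) / math.log(annual_factor)


for years in (1, 2, 3):
    print(f"After {years} year(s): x{compute_multiplier(years):,.0f}")
print(f"Equivalent doubling time: {doubling_time_months():.1f} months")
```

Under this assumption, training compute doubles roughly every three and a half months and is a thousand times larger after three years, unless hardware and algorithmic efficiency improve at a comparable pace.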
Here are some comparison points to get a sense of the orders of magnitude:
- in 2021, Bitcoin’s electricity consumption equalled that of Argentina,
- the emissions generated during the research and development phase of large-scale language models are equivalent to the emissions of five cars throughout their lifecycle,
- in Frankfurt, Germany’s financial center, data centers are today responsible for 20 percent of all electricity consumption.
For other examples, you can refer to AlgorithmWatch which recently published a very interesting report on sustainable AI.
So, the data value chain exerts significant pressure on the environment, and analysts predict that this pressure will grow in the near future.
How to overcome the challenges and reduce the environmental footprint
Data and AI applications help solve environmental issues, but we raised two main concerns. First, some benefits are offset by rebound effects and human practices. Second, the direct negative impact of the data and AI value chain is significant.
Ensuring that data and AI have a net positive environmental impact does not seem an easy problem to solve.
At a corporate level, Nishant, Kennedy and Corbett (2020) make five propositions to overcome the challenges:
- adopt a multi-level view to capture the complexity of the real world (and limit rebound effects for example),
- use a system dynamics perspective to capture interactions and feedback loops among the technology, users and other stakeholders,
- follow a design thinking approach to minimize potential unintended consequences and improve effectiveness of AI solutions,
- understand psychological and sociological underpinnings of human response for effective long term solutions,
- examine the economic value of AI for sustainability to develop our understanding of how AI differs from conventional IT.
The true value of AI will not be in how it enables society to reduce its energy, water and land use intensities but rather, at a higher level, how it facilitates and fosters environmental governance. (Nishant, Kennedy and Corbett, 2020)
At a global and government level, several reports urge governments to take the road of sustainable AI. One example is Climate Change and AI: Recommendations for Government Action, published in 2021, which lists recommendations for governments organised in three categories:
- Supporting AI applications in climate change mitigation and adaptation
- Reducing AI’s negative impacts on the climate
- Building implementation, evaluation, and governance capabilities
What does it mean for strategy and competition dynamics?
From this short review we can draw the following implications:
- As the volume of data and the use of AI increase and spread across industries, company sizes and countries, corporations will be increasingly challenged by the public, and most likely by regulators, on the environmental implications of data and AI.
- As the public, companies and governments become more mature on the topic, counting only the benefits of using AI to solve environmental issues will no longer be enough. It will become necessary to weigh them against the organisation’s other negative impacts (beyond digital) and to take the direct negative impact of data and AI into account.
- Achieving that measurement will require a systemic view of environmental impact, going well beyond the purely digital impact (whether direct or indirect) and including stakeholders outside the company (clients and suppliers, for example).
- Aiming for a positive effect will require expertise that was previously either absent or isolated from engineering expertise (behavioral science, design thinking, …).
From a competition dynamics perspective:
- It’s likely that a few companies will go beyond standards and expectations to build a distinctive positioning. For those, the upside will be in branding and in supporting a premium positioning. However, it will require hard commitments from them on many dimensions to avoid accusations of greenwashing. This would entail significant changes to the business model.
- An even smaller number of companies will put the environment first and redefine their operations and models accordingly. They would question, for example, whether they need all the data they have access to and whether they need energy-intensive models.
- Some companies, more numerous, will define strategies of radical energy savings for cost reasons. The higher the share of energy in total costs, the more efficiency constitutes an advantage. Similarly, some companies will try to secure cheaper access to clean energy (as big tech firms are already doing).
Data and AI obviously contribute to solving environmental problems by revealing new insights and supporting smarter decisions. However, companies should look beyond this direct positive effect to secure a real positive impact and build a consistent positioning.
This is a difficult though necessary conversation to have. It’s necessary because we are reminded every day of the urgency of the decisions to be made. It’s difficult because it’s not purely a technological conversation but a governance one.