The Role of Data Scientists in Helping to Fight Wildfires
2017 was a devastating year for wildfires. In California alone, we witnessed some of the deadliest and most destructive fires in state history. 98 civilians and 6 firefighters lost their lives. In total, a staggering 1.9 million acres burned and more than 10,300 structures were lost, at a cost of more than $3.5 billion in damages. A further $1.8 billion was spent by the various agencies to control and extinguish the fires. Wildfires are costly events, in so many ways.
The heroic efforts of the first responders - from firefighters to medical personnel - are without parallel. They put themselves in harm's way to protect lives and property, tend to the injured, and assist the distressed and displaced. No words can adequately describe the debt we owe to these individuals. However, they are also supported by a large number of behind-the-scenes personnel, without whom the front-line responders could not be as effective as they are. That support team includes data scientists.
Data lies at the heart of decision making - all decision making. In the case of wildfires, those decisions can mean the difference between life and death, or whether a community is saved or utterly destroyed. With global warming, longer, hotter summers, and sprawling urbanization changing the landscape, the myriad factors affecting wildfires are extremely complex. The stakes are extremely high, and data analysis has to rise to the challenge. It's the analysis of data that helps us make sense of this overwhelming complexity.
When it comes to wildfires, the role of the data scientist can be divided into two broad phases:
- Predicting when and where fires are most likely to start
- Predicting what fires will do, and when, once they have started
Let's consider the pre-fire phase first. Predicting when and/or where wildfires are most likely to occur involves data scientists modeling a broad range of data:
- Combustible materials, often identified from satellite imagery:
  - Vegetation: what is the tree coverage like; what type of undergrowth exists; is the terrain grass-covered or rocky
  - Man-made combustibles: is there a build-up of flammable waste (e.g., wooden packing cases or waste paper); are flammable or explosive materials being stored (e.g., paint, gasoline, or gas canisters)
- Potential heat sources:
  - Campgrounds and other BBQ sites, or places with open fires; hothouses or other glass structures; parks and other locations where heat may arise (e.g., from discarded bottles, cigarette lighters, etc.)
- Weather conditions:
  - How hot/dry has it been in the past few days/weeks/months
  - What are the weather projections for the coming days/weeks/months
- History:
  - What has happened in the past when the above conditions have unfolded
  - What caused past fires - and where
These and many other data are used by data scientists to develop predictive models of when and where wildfires are most likely to occur. These models enable the appropriate organizations to take preventative action, such as cutting firebreaks, and they inform planning and preparedness activities, such as where to store fire retardant.
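To make the idea of a predictive model concrete, here is a minimal sketch in Python of how inputs like those listed above might be combined into a single risk score. It is an illustration only, not a description of any agency's actual model: the feature names, the weights, the intercept, and the three sample areas are all hypothetical, and a real model would learn its weights from historical fire records.

```python
import math

# Hypothetical feature weights, chosen only for illustration. A real model
# would learn these from historical fire data.
WEIGHTS = {
    "fuel_load": 1.5,             # 0..1, density of combustible material
    "days_since_rain": 0.05,      # number of consecutive dry days
    "max_temp_c": 0.08,           # forecast daily high in degrees Celsius
    "near_ignition_source": 1.0,  # 1 if close to campgrounds etc., else 0
}
BIAS = -6.0  # hypothetical intercept


def ignition_risk(features):
    """Return a score between 0 and 1 using a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_by_risk(areas):
    """Return area names sorted from highest to lowest risk score."""
    return sorted(areas, key=lambda name: ignition_risk(areas[name]), reverse=True)


# Made-up sample areas, not real locations or measurements.
areas = {
    "dry_forest_campground": {"fuel_load": 0.9, "days_since_rain": 40,
                              "max_temp_c": 38, "near_ignition_source": 1},
    "rocky_ridge": {"fuel_load": 0.1, "days_since_rain": 40,
                    "max_temp_c": 38, "near_ignition_source": 0},
    "wet_meadow": {"fuel_load": 0.5, "days_since_rain": 2,
                   "max_temp_c": 22, "near_ignition_source": 0},
}
ranking = rank_by_risk(areas)
```

A ranking like this is the kind of output that could help decide where to cut firebreaks or pre-position fire retardant first.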
Once a wildfire has started, the role of the data scientist is far from over. They now switch to a different form of predictive modeling: predicting what the fire is most likely to do now that it is ablaze. In this phase of fighting wildfires, data scientists again draw upon a wide range of data to develop their predictive models. This includes most of the data above, but with a greater focus on forward-looking data, such as hour-by-hour weather predictions, along with real-time data such as observations from firefighters on the ground, social media reports, and aircraft/drone footage. These models take into account the ground topography (fires tend to follow the lay of the land), precipitation and humidity projections, and perhaps most importantly: wind! The direction and speed of the wind, and its projected changes, are among the most critical pieces of data in helping data scientists model what the fire behavior is likely to be.
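As a rough illustration of how wind and terrain can enter such a model, the Python sketch below scores each compass direction by how well it lines up with the wind, with an extra bonus for the uphill direction. This is a toy, not a validated fire-behavior model: the function name `spread_scores` and the constants `wind_weight` and `slope_bonus` are assumptions made up for this example.

```python
import math

# The eight compass directions a fire front could move in, as (dx, dy) steps
# on a grid where +x is east and +y is north.
DIRECTIONS = {
    "N": (0, 1), "NE": (1, 1), "E": (1, 0), "SE": (1, -1),
    "S": (0, -1), "SW": (-1, -1), "W": (-1, 0), "NW": (-1, 1),
}


def spread_scores(wind_toward, wind_speed, uphill_toward=None,
                  wind_weight=0.1, slope_bonus=0.5):
    """Score each direction; a higher score means faster expected spread.

    wind_toward is the direction the wind is blowing toward (not from).
    wind_weight and slope_bonus are hypothetical tuning constants.
    """
    wx, wy = DIRECTIONS[wind_toward]
    wind_len = math.hypot(wx, wy)
    scores = {}
    for name, (dx, dy) in DIRECTIONS.items():
        # Cosine of the angle between this direction and the wind direction:
        # 1 when moving with the wind, -1 when moving against it.
        alignment = (dx * wx + dy * wy) / (math.hypot(dx, dy) * wind_len)
        score = 1.0 + wind_weight * wind_speed * alignment
        if name == uphill_toward:
            score += slope_bonus  # fire tends to climb slopes
        scores[name] = max(score, 0.0)  # a spread score cannot be negative
    return scores


# Made-up conditions: wind blowing toward the east, ground rising to the northeast.
scores = spread_scores(wind_toward="E", wind_speed=8, uphill_toward="NE")
fastest = max(scores, key=scores.get)
```

With these made-up inputs, the northeast direction scores highest because it is both partly downwind and uphill, while spread against the wind (west) scores lowest.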
As with forecasting where fires are likely to occur, trying to predict what they will do once alight is extremely complex. It is, perhaps, misleading to refer to 'a wildfire.' Fire can create its own 'local' weather conditions, with powerful vortexes that spiral the fire up into 'firenadoes.' A large wildfire can, therefore, consist of a great many individual fires, each of which may need to be modeled to understand its likely trajectory.
Data scientists use a range of highly sophisticated and powerful tools to develop these models. First, all of the data, which come from an extensive range of disparate systems, must be collected, cleaned, and merged to provide an integrated suite of data with which to work. That data must be kept up to date, often on a minute-by-minute basis. Once the data are cleaned, they can be fed into the predictive models that the data scientists have prepared. Often this is done using highly sophisticated artificial intelligence/machine learning tools known as neural networks. These tools, in turn, provide a variety of outputs that inform decision makers - such as graphs and topographical maps showing where and when the fire is likely to progress.
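The collect, clean, and merge step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the record layout (`location`, `timestamp`, `kind`, `value`), the validity rule for humidity, and the sample feeds are all invented for the example, and real pipelines handle far more sources and edge cases.

```python
def clean(readings):
    """Drop records with missing or physically implausible values."""
    cleaned = []
    for r in readings:
        value = r.get("value")
        if value is None:
            continue  # nothing was actually reported
        if r["kind"] == "humidity_pct" and not 0 <= value <= 100:
            continue  # relative humidity must lie between 0 and 100 percent
        cleaned.append(r)
    return cleaned


def merge(*sources):
    """Combine several sources into one record per (location, timestamp)."""
    merged = {}
    for source in sources:
        for r in clean(source):
            key = (r["location"], r["timestamp"])
            merged.setdefault(key, {})[r["kind"]] = r["value"]
    return merged


# Made-up sample feeds, not real observations.
weather_feed = [
    {"location": "sector_7", "timestamp": "14:00", "kind": "wind_kmh", "value": 45},
    {"location": "sector_7", "timestamp": "14:00", "kind": "humidity_pct", "value": 140},
]
ground_reports = [
    {"location": "sector_7", "timestamp": "14:00", "kind": "flame_height_m", "value": 6},
    {"location": "sector_7", "timestamp": "14:00", "kind": "smoke_density", "value": None},
]
snapshot = merge(weather_feed, ground_reports)
```

Here the impossible humidity reading and the empty smoke report are discarded, and the remaining wind and flame-height values end up in a single record for that place and time, ready to be passed to a model.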
The models developed by data scientists are crucial in ensuring that the front-line firefighters are deployed to the right locations to have the maximum impact, while simultaneously minimizing the risk to their personal safety. Similarly, an army of other people use these models to determine the safest evacuation routes, how many fire-retardant planes will be needed, and how much food will be needed to support all of those fighting the fires and looking after the injured and the displaced.
Once a wildfire has been contained, and eventually extinguished, the data scientist has one last job: post-fire analysis. This stage, too, is crucial in helping to fight future fires. Data scientists gather as much data as they can about what the fire actually did, versus what it was predicted to do, and look for the reasons behind any differences. It may be that the predictive model needs to assign more importance to the topography of the landscape, or to the rate at which fire spreads between oak trees vs. pine trees. Or maybe the weather changed unexpectedly. Whatever the reason, these post-fire data help data scientists perform a forensic analysis of how their models worked, so they can improve them for the next time a wildfire occurs.
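One simple way to picture this comparison is to divide the map into grid cells and check which cells the model predicted would burn against which ones actually did. The Python sketch below does exactly that. The function name `compare_burn_areas`, the grid-cell representation, and the sample cells are assumptions for illustration, not a description of any particular agency's method.

```python
def compare_burn_areas(predicted, actual):
    """Compare predicted and actual burned grid cells, given as (row, col) pairs."""
    predicted, actual = set(predicted), set(actual)
    hits = predicted & actual          # predicted to burn, and did
    missed = actual - predicted        # burned, but the model did not predict it
    false_alarms = predicted - actual  # predicted to burn, but did not
    union = predicted | actual
    return {
        "hits": len(hits),
        "missed": sorted(missed),
        "false_alarms": sorted(false_alarms),
        # Share of all involved cells where prediction and reality agree.
        "overlap": len(hits) / len(union) if union else 1.0,
    }


# Made-up example grids, not data from a real fire.
predicted_cells = {(0, 0), (0, 1), (1, 1), (2, 1)}
actual_cells = {(0, 0), (0, 1), (1, 1), (1, 2), (1, 3)}
report = compare_burn_areas(predicted_cells, actual_cells)
```

The list of missed cells is where the forensic work starts: if the misses cluster on steep slopes or in one type of vegetation, that hints at which part of the model needs more weight.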
And then they start all over again - predicting when and where the next fire is likely to occur. A data scientist's work is never done...
Want to learn more about data science and wildfires? Listen to an interview with MS Data Science program director Rick Hutley in which he discusses the data science of wildfires.
Rick Hutley is the Program Director and Clinical Professor of Analytics at the University of the Pacific, CEO of StrataThought, a frequent keynote speaker, and a former Vice President of Innovation at Cisco Systems.