Reshaping Capacity Planning: The Unseen Impact of Weather Data

Andrea Vasco
7 min read · Jan 27, 2024

Disclaimer: The views and opinions expressed in my article are solely my own and do not necessarily reflect the official policy or position of Google or any of its affiliates. The content provided is for informational purposes only and is based on my personal experiences and insights as a Google Cloud Data Engineer. It should not be construed as representing the strategies, plans, or opinions of Google.

Introduction

Quite a long time ago, I found myself part of a forward-thinking capacity planning team. Our success in identifying major business events impacting IT infrastructure was commendable, yet we realized a crucial piece was missing from our analytical puzzle: the integration of ‘non-capacity data’. So began the quest to broaden our horizon, pushing beyond traditional metrics to encompass a wider spectrum of influences.

Our choice fell on weather data. Its ubiquitous nature and apparent disconnect from capacity planning made it a perfect candidate. This decision wasn’t just about adding another dataset; it was about challenging our own understanding of what truly affects business operations. Based in Chicago, a city known for its dramatic weather swings, we turned to local weather data as our starting point. Our hypothesis? That weather patterns could significantly influence customer interactions, employee behavior, and even the utilization of our IT infrastructure.

As we embarked on this journey, our initial assumptions were challenged, and our findings took us in unexpected directions. This is the story of how incorporating weather data into capacity planning not only reshaped our approach but also brought to light the nuanced interplay between environmental factors and corporate infrastructure.

Method

Our journey into integrating weather data with capacity planning began with the quest for the right source. Initially, we turned to freely available weather datasets, a choice driven by convenience and accessibility. Over time, however, the landscape has evolved, and what was once freely available now often hides behind paywalls. Yet, the plethora of projects correlating business metrics with weather conditions bears testimony to the increasing interest in this data.

Our weapon of choice for data format was CSV — straightforward and compatible with our tools. The approach we adopted was methodical yet adaptable: a cycle of hypothesis formulation, rigorous testing, and continuous refinement. This process led us on a path of discovery, redefining our understanding of existing data, and compelling us to dive deeper into the intricacies of our systems.

As we scrutinized and juxtaposed weather data against our metrics, our initial theories were systematically dismantled. Daily weather patterns, contrary to our expectations, showed no significant correlation with the overall usage of IT resources. This revelation was partly due to the geographic dispersion of our operations, spanning five states. The diversity in weather conditions across these regions diluted any potential impact on our global operational metrics.
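For readers who want to try this kind of comparison themselves, here is a minimal sketch of the idea, not the tooling we actually used. It joins two CSV texts on a shared date column and computes a Pearson correlation between one weather column and one usage column. The column names (`date`, `temp_f`, `cpu_hours`) and the sample rows are made up for illustration.

```python
import csv
import io
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlate_csv(weather_csv, metric_csv, weather_col, metric_col):
    """Join two CSV texts on their 'date' column and correlate one column from each."""
    weather = {row["date"]: float(row[weather_col])
               for row in csv.DictReader(io.StringIO(weather_csv))}
    metric = {row["date"]: float(row[metric_col])
              for row in csv.DictReader(io.StringIO(metric_csv))}
    shared = sorted(weather.keys() & metric.keys())  # only dates present in both files
    return pearson([weather[d] for d in shared], [metric[d] for d in shared])


# Made-up sample rows (hypothetical columns and values, not our real data).
WEATHER = """date,temp_f
2024-01-01,30
2024-01-02,28
2024-01-03,25
2024-01-04,20
"""
USAGE = """date,cpu_hours
2024-01-01,100
2024-01-02,110
2024-01-03,120
2024-01-04,140
"""

r = correlate_csv(WEATHER, USAGE, "temp_f", "cpu_hours")
print(f"Pearson r between temp_f and cpu_hours: {r:.3f}")
```

A coefficient near zero over a long enough period is what "no significant correlation" looks like in practice. Four rows, as in this toy sample, are far too few to conclude anything.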

However, this exploration wasn’t in vain. It succeeded in broadening the scope of our data collection, allowing us to integrate nearly a hundred ‘non-capacity’ metrics into our analysis framework. This expansion provided a more comprehensive view, enabling finer capacity planning and more nuanced insights.

Results

Our journey into the realm of weather data and capacity planning yielded intriguing results, defying our initial expectations. Our hypothesis, which postulated a direct correlation between daily weather conditions and IT resource usage, was upended. It became evident that the daily weather had negligible impact on our IT operations. The primary reason for this was the geographic diversity of our operations across five states, rendering any localized weather effects inconsequential on a global scale.

However, one of the most surprising findings pertained to human behavior and accidents. Contrary to common assumptions, we discovered that the propensity of people to get injured does not significantly depend on weather conditions. This insight led us to a rather humorous yet profound realization:

Eventually, people will always find creative ways to hurt themselves,
regardless of the weather

This observation underscores the unpredictability and ingenuity of human behavior in ways that defy even the most thorough data analysis. If you don’t believe me, look into Unicycle Hockey or Cycleball.

This exploration wasn’t without its merits, though. One of our pivotal achievements was the successful expansion of our data collection scope. We managed to integrate an array of nearly a hundred ‘non-capacity’ metrics, which significantly enhanced our capacity planning analyses. These additional metrics allowed us to uncover more nuanced patterns and correlations.

Discussion

Armed with a wealth of weather data from Chicago, our analysis ventured into comparing this data with various business metrics such as claims processed and website traffic. However, we quickly encountered a significant challenge: our business data was not geographically segmented, making it difficult to draw direct correlations with localized weather patterns.

Sample weather data for one site

This led us to a crucial realization: the type of weather data that mattered most was not immediately obvious. Was it rainfall, temperature, humidity, or barometric pressure that had the most significant impact? The lack of clear correlation prompted us to broaden our scope, incorporating weather data from other major cities within our operational footprint.

Our analysis aimed to answer two seemingly straightforward questions:

Does weather influence insurance claim volumes?
And does it affect work-from-home infrastructure utilization?

However, these questions proved to be anything but simple. The quest for answers deepened our understanding of our systems and metrics, prompting us to ask better, more informed questions. It became clear that the term ‘weather data’ was deceptively complex, encompassing a variety of factors such as temperature, precipitation, and more. Moreover, the impact of weather conditions varied greatly depending on the season and location.

We also faced the challenge of defining ‘weather’ for each location. A hot day with high humidity differs significantly from a similarly hot day with low humidity. Moreover, a pleasant day in winter is not the same as a pleasant day in summer. These nuances highlighted the complexity of accurately scoring weather conditions and their diverse impacts across different geographical locations.
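One common way to handle this, offered here as an illustrative sketch and not as the scoring we settled on, is to score each reading against the typical values for the same site and the same month. A 50°F day then stands out in January but not in July. The dictionary keys (`site`, `month`, `value`) and the sample readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def seasonal_zscores(observations):
    """Score each reading against the mean and spread of its own (site, month) group.

    observations: list of dicts with hypothetical keys 'site', 'month', 'value'.
    Returns z-scores in the same order; 0.0 when a group has no spread.
    """
    groups = defaultdict(list)
    for obs in observations:
        groups[(obs["site"], obs["month"])].append(obs["value"])

    # Baseline per group: (mean, population standard deviation).
    baselines = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    scores = []
    for obs in observations:
        mu, sigma = baselines[(obs["site"], obs["month"])]
        scores.append(0.0 if sigma == 0 else (obs["value"] - mu) / sigma)
    return scores


# Made-up temperature readings in °F, for illustration only.
readings = [
    {"site": "Chicago", "month": 1, "value": 20},
    {"site": "Chicago", "month": 1, "value": 30},
    {"site": "Chicago", "month": 1, "value": 50},  # mild for January
    {"site": "Chicago", "month": 7, "value": 70},  # cool for July
    {"site": "Chicago", "month": 7, "value": 80},
    {"site": "Chicago", "month": 7, "value": 90},
]

scores = seasonal_zscores(readings)
for obs, score in zip(readings, scores):
    print(obs["site"], obs["month"], obs["value"], round(score, 2))
```

The same function can be run separately on humidity, rainfall, or pressure, so each variable gets its own seasonal baseline. How to combine those per-variable scores into one 'weather' number remains a judgment call.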

GOES Water Vapor Imagery from January 31st. https://www.weather.gov/lot/2015_Feb01_Snow

In our quest to correlate business metrics with local weather data, we faced another hurdle: our business metrics were not organized by location but rather by business unit. This made it challenging to match them with corresponding local weather conditions. The learnings from this endeavor were profound: data needs to be grouped similarly to draw meaningful comparisons. When grouping is imprecise, so too are the insights derived from the data.
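To make the grouping point concrete, here is a small sketch of what a comparable grouping could look like, assuming each business event carried a location field, which ours did not. Events are summed per (date, location) so they share a key with local weather readings, and any total without a matching reading is reported instead of silently dropped. All field names and values are hypothetical.

```python
from collections import defaultdict


def daily_totals_by_location(events):
    """Sum a metric per (date, location) so it shares a key with local weather."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["date"], event["location"])] += event["count"]
    return dict(totals)


def join_with_weather(totals, weather):
    """Pair each (date, location) total with the weather reading for the same key.

    Returns (pairs, unmatched): pairs maps key -> (total, reading);
    unmatched lists keys that have a total but no weather reading.
    """
    pairs = {}
    unmatched = []
    for key, total in totals.items():
        if key in weather:
            pairs[key] = (total, weather[key])
        else:
            unmatched.append(key)
    return pairs, sorted(unmatched)


# Made-up events; the 'location' field is the assumption our real data lacked.
events = [
    {"date": "2024-01-01", "location": "Chicago", "business_unit": "claims", "count": 5},
    {"date": "2024-01-01", "location": "Chicago", "business_unit": "web", "count": 3},
    {"date": "2024-01-01", "location": "Site B", "business_unit": "claims", "count": 2},
]
# Made-up weather readings keyed the same way (for example, inches of snowfall).
weather = {("2024-01-01", "Chicago"): 12.0}

totals = daily_totals_by_location(events)
pairs, unmatched = join_with_weather(totals, weather)
print("paired:", pairs)
print("no weather reading for:", unmatched)
```

When the metrics are keyed only by business unit, there is no key like this to join on, which is exactly the wall we ran into.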

Through this intricate process, we learned that significant weather events correlate more easily with global total daily counts. If our business data had been more granular, enabling location-by-location comparisons, we might have unearthed valuable insights. The key takeaway was that for effective correlation, smaller geographic areas must be compared with corresponding local weather data.

Conclusion

Our foray into integrating non-capacity data like weather into capacity planning has been a journey of discovery, challenges, and unexpected insights. Initially, we sought to answer two specific questions about the influence of weather on insurance claim volumes and work-from-home infrastructure utilization. However, our exploration took us much further, revealing the complexities and limitations of our data, and pushing us to rethink our approach.

This journey underscored the importance of matching the precision of data grouping with the precision of the metrics themselves. When the grouping is imprecise, the derived insights are equally vague. Our attempt to correlate weather with global claim counts highlighted this point, showing that significant weather events have a clearer correlation with global totals than with more localized metrics.

The exploration into weather data not only provided specific insights but also catalyzed a broader understanding of our business processes and metrics. This process of continual questioning, data addition, and analysis reshaped our understanding of well-established metrics, leading to new questions and insights. It demonstrated the value of incorporating diverse data sets into capacity planning, even those that might not initially seem relevant.

Our work revealed that liberating ourselves from the constraint of using only traditional ‘capacity data’ in our tools can lead to richer insights and a more comprehensive understanding of the metrics and systems we evaluate. The integration of work-from-home technologies into our business practices, as a response to external conditions, proved to be just as crucial for handling these situations as our ability to predict demand and adjust capacity.

In conclusion, our exploration into the intersection of weather data and capacity planning has been a testament to the power of interdisciplinary analysis. It has shown us that the answers we seek often lie beyond the confines of conventional data sets and that the pursuit of these answers can yield unexpected but invaluable insights.

Bibliography

Historical Weather Events and Corporate Impacts. Additional context on notable events can be found on Wikipedia: 2015 North American Blizzard, 2012 Chicago Summit, and Chicago Blackhawks Stanley Cup Wins.
