Big data's big impact
by Rick Hutley
Pacific Review, fall 2016
From shining light on the presidential candidates to becoming vital for our careers, data analytics makes a big difference in our lives.
If you voted in this year's presidential election, you were influenced by it. If you work in the San Francisco Bay Area or Sacramento or anywhere, really, it makes a difference in your world.
Data analytics — the result of analyzing extremely large and diverse data sets (big data) to reveal patterns, trends and associations, especially relating to human behavior — is touching every aspect of our lives, the economy and our society.
First, let's look at one of the most important events our country has experienced in recent history: the 2016 presidential election. Thanks to the vast scope of the internet, we can obtain a wide variety of data, such as voter preferences, which can give us an understanding of what people actually think; campaign profiles; corporate and foundation annual reports; and corporate tax information. As I'm teaching my data science students, this broad range of factual data allows us to do our own analysis of the candidates, even as the campaigns analyze us.
Debate transcripts are like court transcripts — they are an accurate, factual rendition of who said what. That makes them a very reliable source of information about candidates — devoid of bias or other influence that may be presented in third-party blogging or reporting about the debate. Similarly, social media postings from the candidate directly or via official campaign accounts are excellent sources of data. When we subject them to computer analysis, we can learn many things about the candidates based on how they express themselves.
The transcript can certainly tell us who spoke most, but how much someone talks isn't enough. We also need to consider what they are talking about, the style of language they use to discuss their topics, and the emotion behind it.
A simple count of the words spoken during the 16 primary debates that took place up to February 2016 suggests that Hillary Clinton spoke about 20 percent more words than did Donald Trump. By a simple count, she was the most prolific speaker of all of the candidates in these debates. But that's not the whole picture.
Some candidates may have fielded more questions than others, or been given more leeway to speak at length. When we account for these and other factors — such as how many debates a candidate attended and how many other participants there were — a very different picture emerges: Trump was, in fact, the most verbose candidate, and exceeded Clinton by around 18 percent.
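To make the adjustment concrete, here is a minimal sketch in Python of one way such a normalization could work. It is not the method used for the figures above: the `DebateAppearance` record, the `normalized_verbosity` function and every number in it are hypothetical, chosen only to show how a raw count and an adjusted measure can point in opposite directions.

```python
from dataclasses import dataclass


@dataclass
class DebateAppearance:
    """One candidate's showing in one debate (hypothetical structure)."""
    words_spoken: int   # words this candidate spoke in the debate
    total_words: int    # words spoken by all candidates in the debate
    participants: int   # number of candidates on stage


def raw_word_count(appearances):
    """Simple count: total words across every debate attended."""
    return sum(a.words_spoken for a in appearances)


def normalized_verbosity(appearances):
    """Average of (actual share of words) / (equal share of words).

    A value of 1.0 means the candidate spoke exactly an equal share;
    values above 1.0 mean more than an equal share. Averaging per debate
    accounts for how many debates were attended.
    """
    ratios = [
        (a.words_spoken / a.total_words) * a.participants
        for a in appearances
    ]
    return sum(ratios) / len(ratios)


# Made-up numbers for illustration only, not real debate data.
candidate_a = [
    DebateAppearance(words_spoken=6000, total_words=18000, participants=3),
    DebateAppearance(words_spoken=5500, total_words=16500, participants=3),
]
candidate_b = [
    DebateAppearance(words_spoken=3000, total_words=20000, participants=10),
    DebateAppearance(words_spoken=3600, total_words=20000, participants=10),
]

print(raw_word_count(candidate_a), raw_word_count(candidate_b))
print(normalized_verbosity(candidate_a), normalized_verbosity(candidate_b))
```

In this toy data, candidate A speaks more words in total, yet candidate B takes well over an equal share of a crowded stage, so B comes out as the more verbose speaker once stage size is considered.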
Quantity alone isn't enough. We also need to look at the issues the candidates talk about, their vocabulary and the emotions they convey. Clinton used a wider vocabulary: Across the combined data from these primary debates, she used around 2,300 distinct word bases or stems (counting related terms such as "vote," "voter" and "voting" as a single term). Trump used a much smaller vocabulary of only 1,750 stems.
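Counting stems rather than words can be sketched in a few lines of Python. The `crude_stem` function below is a deliberately naive suffix stripper written for this illustration. It is an assumption, not the stemmer behind the figures above, and real analyses typically rely on an established stemming algorithm.

```python
import re

# Suffixes to strip, checked in this order (illustrative, far from complete).
SUFFIXES = ("ing", "ers", "er", "ed", "es", "s", "e")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def distinct_stems(text):
    """Return the set of distinct stems found in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {crude_stem(w) for w in words}


# An invented sentence, not a quotation from any debate.
sample = "Every voter is voting, and every vote counts."
stems = distinct_stems(sample)
print(sorted(stems))
print(len(stems))
```

Here "voter," "voting" and "vote" all collapse to the same stem, so the eight-word sentence contributes only five distinct stems to the vocabulary count.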
Clinton used lengthier, more sophisticated sentence constructions — scoring around 12 on the Gunning fog index, which measures the complexity of language — while Trump used tweet-like short phrases that score a 7. This suggests Clinton was seeking to communicate with a more educated and socially sophisticated audience, while Trump made an effort to be readily understood at all socioeconomic levels.
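For readers curious how such a score is produced, the Gunning fog index is 0.4 times the sum of the average sentence length and the percentage of "complex" words, meaning words of three or more syllables. The Python sketch below is a simplified version under stated assumptions: the syllable counter is a rough heuristic that counts vowel groups, and the standard exclusions (proper nouns, common suffixes, compound words) are ignored. The sample texts are invented for the illustration.

```python
import re


def count_syllables(word):
    """Rough estimate: count groups of consecutive vowels (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog(text):
    """Simplified Gunning fog index: 0.4 * (words per sentence + % complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + percent_complex)


# Invented sample texts, not quotations from any candidate.
short_text = "The cat sat. The dog ran. We went home."
long_text = (
    "Understanding complicated legislation requires considerable preparation "
    "and deliberate, systematic evaluation of every alternative."
)

print(gunning_fog(short_text))
print(gunning_fog(long_text))
```

Short sentences built from short words score low, while a single long sentence packed with polysyllabic words scores far higher. That is the contrast the index is designed to capture.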
It doesn't matter what field you are in or even if you want to work as a data scientist or not...Every single job is going to depend on data and how we use it to our advantage.
We can also use sentiment analysis to get a sense of the language and emotion in the debate. We can determine whether a candidate is under stress or remaining calm by looking at the tone of the words used and whether they are imparting a positive or negative message. Analysis of the first presidential debate shows the two candidates were close: Clinton used 53 percent negative terms while Trump used 55 percent. She was also more positive when tweeting.
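One simple form of sentiment analysis looks up each word in lists of positive and negative terms. The Python sketch below assumes that approach and assumes the percentage is taken over sentiment-bearing words only. The article's figures may rest on a different method, and the tiny word lists and sample sentence here are hypothetical stand-ins for a real sentiment lexicon.

```python
import re

# Tiny hypothetical lexicons; real ones contain thousands of terms.
POSITIVE = {"good", "great", "hope", "strong", "win", "better"}
NEGATIVE = {"bad", "wrong", "fail", "weak", "lose", "worse"}


def negative_share(text):
    """Percentage of sentiment-bearing words that are negative."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_count = sum(w in POSITIVE for w in words)
    negative_count = sum(w in NEGATIVE for w in words)
    scored = positive_count + negative_count
    if scored == 0:
        return 0.0  # no sentiment-bearing words found
    return 100 * negative_count / scored


# An invented sentence, not a quotation from any debate.
sample = "The plan is bad and wrong, but we hope for a better, strong result."
print(negative_share(sample))
```

Two of the five scored words in the sample are negative, so its negative share is 40 percent. Running the same calculation over each speaker's portion of a transcript gives figures that can be compared between speakers.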
The election is just one example of the importance and power of analytics. It applies to every industry and every company, and to every leader of every function as well.
Let's look at the impact of data analytics on our regions and our lives.
There's a fundamental shift in the economy, and it's vital that we all take it personally. The San Francisco Bay Area, where University of the Pacific has a campus and where we launched the university's first analytics program last year, has been for decades a hub of technological innovation.
The Sacramento region, where the university also has a campus and is launching an analytics program early next year, has its own unique needs and contributions, too. This is important. For each and every person, it means career development and career opportunity.
It doesn't matter what field you are in or whether you want to work as a data scientist. It's important for everyone: We all need to better understand how data can be gathered and analyzed, and how powerful analytic insights can be developed to help us make informed decisions that lead to business or personal success.
As we move from the technological era to the data era, it will be vital for San Francisco, Sacramento and their regions to keep developing a data and analytics infrastructure and a workforce with data skills, from data-savvy managers to deep data scientists who know how to create those powerful insights. That is how the region will keep growing jobs and the economy and remain the place the world turns to for innovation and leadership, not only in technology but also in analytics.
Recently, I sat on a panel at the San Francisco Chamber of Commerce's biztechSF event at the Oracle OpenWorld convention at the Moscone Center. I joined innovators from LinkedIn and HireMojo to discuss using analytics to grow businesses — a practice that is becoming increasingly crucial no matter what industry you work in.
University of the Pacific, with its three campuses that span Northern California, is uniquely positioned to provide these critical skills to the Bay Area and Sacramento regions. And beyond its regional advantage, Pacific has a broad focus and vision for this field that draws from and enhances the university's existing programs in health, law, sports and business. It's important to continue to take data and analytics seriously — and personally — as we move into the data era.
Data analytics is already influencing every aspect of our lives. And it's evident that personal, corporate and regional success will depend on it.
Rick Hutley is the program director and clinical professor of analytics at University of the Pacific, CEO of Stratathought LLC, and former vice president of innovation at Cisco Systems.