Sentiment Analysis of Financial Blog Posts: State of the U.S. Housing Market

By Jett Hollister

Technical explanations of procedures contained in this analysis can be viewed here.

During the lead-up to the collapse of the U.S. housing market, few economic forecasts identified the growing imbalances in the American economy. Before the fire sale of Bear Stearns, the collapse of Lehman Brothers, and failing subprime mortgage investments rippled throughout the global economy, intimations of the crisis to come were obscured by all-time-high U.S. homeownership rates[1] and massively profitable mortgage-backed securities[2]. Although the general public remained largely unaware of the signals of impending crisis, some economists and financial pundits paid closer attention to them than others. One can trace recognition of looming problems in the housing market, as well as shifting attitudes toward economic conditions, through sentiment analysis of financial blogs.

We examined the attitudes of four economic pundits through their commentary on the financial blogs Calculated Risk, Grasping Reality, Café Hayek, and Marginal Revolution. These blogs represent a wide range of backgrounds and ideologies. The economic opinions and ideologies of each blog can be found under data descriptions.

The sentiment analysis, which tracks the number of positive and negative words in the blog posts, allows us to quantify the emotion of these experts over the course of the housing boom and subsequent collapse. We focused on how the language of the financially aware aligned with, reacted to, and possibly foresaw the collapse of the subprime mortgage market. Although we analyzed the sentiment of these posts alongside key milestones during this period, these milestones serve only to anchor changes in sentiment to events and do not imply any causal relationship.
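To make the word-count approach described above concrete, here is a minimal sketch in Python. It is not the procedure used in this analysis: the word lists, the function names, and the normalization by total word count are all assumptions made for illustration, and a real study would rely on an established sentiment lexicon.

```python
import re
from collections import defaultdict

# Hypothetical, tiny word lists for illustration only; they are NOT the
# lexicon used in this analysis.
POSITIVE = {"growth", "gain", "strong", "recovery", "profit"}
NEGATIVE = {"collapse", "crisis", "loss", "default", "weak"}


def score_post(text):
    """Return (positive_count, negative_count, total_words) for one post."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos, neg, len(words)


def monthly_net_sentiment(posts):
    """Aggregate posts into a net sentiment score per month.

    posts: iterable of (month, text) pairs, with month as 'YYYY-MM'.
    Net score (an assumed definition) = (positive - negative) / total words.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for month, text in posts:
        pos, neg, n = score_post(text)
        totals[month][0] += pos
        totals[month][1] += neg
        totals[month][2] += n
    return {
        month: (pos - neg) / n if n else 0.0
        for month, (pos, neg, n) in sorted(totals.items())
    }


# Made-up example posts, not drawn from the blogs studied here.
example_posts = [
    ("2007-09", "Strong growth ahead"),
    ("2008-03", "Crisis and collapse; default risk"),
]
example_scores = monthly_net_sentiment(example_posts)
```

Dividing by the total word count keeps a month with many long posts from dominating the series; whether the original analysis normalized in this way is not stated.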

As the above graph demonstrates, the collective sentiment across all four blogs exhibited a consistent and considerable negative trend after Fall 2007, which accompanied signs of the negative impact that the U.S. subprime mortgage market was having on the domestic and global economy[3]. The collapse of Bear Stearns in March 2008 corresponds with the peak of negative language across all observed months, with sentiment remaining consistently negative thereafter. From that point on, the impacts of the U.S. housing market collapse were not restricted to the financial sector; they were felt across all industries. The consistently negative sentiment in the blogs reflected broader public pessimism, as little was left to speculation by then: the global economy had been brought to its knees and showed almost no signs of a smooth recovery ahead. The emotion displayed in the blogs over this period is consistent not only with what we would expect from financial experts, but also with what we would expect in the sentiment of the general public.

While post-2007 sentiment generally follows the abysmal economic performance, the ebbs and flows of emotion during the prior years reveal volatile and uncertain opinions. From 2005 to 2007, sentiment oscillated between strongly positive and strongly negative language on a monthly basis. The first half of 2005 alone presents some of the strongest pre-2007 expressions of emotion in both directions. As more concrete indicators of the impending collapse became evident, both the height of the positive peaks and the variance of sentiment decreased heading into late 2007. One might read the distance between positive and negative sentiment peaks as the range of economic outcomes these bloggers considered possible. As speculation and projections gave way to news headlines and press releases, that range rapidly narrowed and the number of positive projections approached zero. The oscillatory nature of the sentiment reflected how susceptible these experts were to current events and highly conflicting perceptions.
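The narrowing of sentiment described above can be measured with a rolling window over the monthly scores. The sketch below is only one possible way to do so: the function name, the window length, and the sample values are assumptions for illustration, and the numbers are made up rather than taken from the blogs.

```python
from statistics import pvariance


def rolling_spread(scores, window=6):
    """For each full window of monthly scores, return (range, variance).

    range    = highest score minus lowest score within the window
    variance = population variance of the scores within the window
    """
    out = []
    for i in range(len(scores) - window + 1):
        chunk = scores[i:i + window]
        out.append((max(chunk) - min(chunk), pvariance(chunk)))
    return out


# Made-up monthly scores: wide swings at first, then settling at negative values.
sample_scores = [0.4, -0.4, 0.3, -0.3, -0.1, -0.2, -0.2, -0.3]
sample_spread = rolling_spread(sample_scores, window=4)
```

In a series shaped like this one, both the range and the variance shrink from the first window to the last, which is the pattern the paragraph above describes for the months leading into late 2007.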

