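Tall towers and low halls, the crowds rising and falling,
fame fought over, profit chased, the joys and sorrows, partings and reunions of ten million homes.

An idle cloud drifts past, the new moon first appears,
lamps blaze over the city by the sea, and between heaven and earth I am left alone.

Old histories raised again, old books reread,
a cold eye gazing idly afar: the passes and mountains unchanged, and lonely!

I think of a man grown old in his wanderings, heart broken for home and country;
a hundred years pass in an instant, and gain and loss are a single grain in the vast sea!

Xu Xu (徐訏), "New Year Reflections" (《新年偶感》)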

Tuesday, February 19, 2013

Saving for a rainy day: Keith Chen on language that forecasts weather — and behavior


How are China, Estonia and Germany different from India, Greece and the UK? To an economist, one answer is obvious: savings rates. Germans save 10 percentage points more than the British do (as a fraction of GDP), while Estonians and Chinese save a whopping 20 percentage points more than Greeks and Indians. Economists think a lot about what drives people to save, but many of these international differences remain unexplained. In a recent paper of mine, I find that these countries differ not only in how much their residents save for the future, but also in how their native speakers talk about the future.

In late 2011, an idea struck me while reading several papers in psychology that link a person’s language with differences in how they think about space, color, and movement. As a behavioral economist, I am interested in understanding how people make decisions. Could a person’s language subtly affect his or her everyday decisions? In particular, could the way a person’s language marks the future affect their propensity to save for the future?

In a nutshell, this is precisely what I found. After scouring many datasets with millions of records on individual household savings behavior—along with a number of peculiar health performance metrics like grip strength and walking speed—I find that languages that oblige speakers to grammatically separate the future from the present lead them to invest less in the future. Speakers of such languages save less, retire with less wealth, smoke more, practice more unsafe sex and are more obese. Surprisingly, this effect persists even after controlling for a speaker’s education, income, family structure and religion.

Back when my first paper on this topic circulated, many linguists were appropriately skeptical of the work. Their concerns are concisely explained in two well-thought-out posts (here and here) by the linguists Mark Liberman and Geoffrey Pullum on the blog they founded, Language Log. Mark and Geoffrey also invited me to write a guest post explaining the work. In that post, I discuss which of their possible concerns are unlikely given the patterns I find across the world in people’s savings and health behaviors, and also try to clarify which of their concerns I was not yet able to address.

This exchange prompted a broad set of discussions as to what different types of data, analyses and experiments could, in principle, answer the questions raised by the patterns I find. Cross-disciplinary discussions took place in a subsequent post by Julie Sedivy and follow-up posts by Mark Liberman, and also at the Linguistic Data Consortium’s 20th Anniversary Workshop. Several new avenues of investigation and work came out of these interactions, three of which are now ongoing projects.

One new idea that I’ve begun to explore entails measuring a language’s time reference by scraping the web—to search for natural patterns in language—in addition to using linguistic classifications. This led me to search the web for the simplest form of writing about the future I could find: weather forecasts. Why weather forecasts? Well, forecasts rarely talk about the past, so they’re a natural place to look for speech about the future. Weather forecasters also generally communicate in natural, straightforward language, and often convey similar content across different settings. Can patterns in weather forecasts measure how languages structure the future, and can these differences predict how people save for the future? Amazingly, they can.

A team of linguistics and economics students assisted with this analysis, and managed to scrape the web for weather forecasts in 39 languages from around the world. The figure below summarizes what we found: wide variation in how often, when talking about future weather, forecasts in a particular language grammatically mark the future as something distinct from the present. In English, for example, this comes down to the relative frequency of sentences like:

Rain is likely this weekend.                (present tense “is”)
It will likely rain this weekend.          (future tense “will rain”)
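
As a rough illustration of the kind of measurement described above, here is a minimal Python sketch that computes the share of forecast sentences carrying an explicit future marker. The marker list, the function name `future_marking_share` and the sample sentences are all assumptions made up for this example, and it handles English only; it is not the coding scheme the team actually used.

```python
import re

# Hypothetical English-only marker list, chosen for illustration.
# The study's real classification rules are not described in this post.
FUTURE_MARKERS = re.compile(r"\b(will|shall|won't|going to)\b|'ll\b", re.IGNORECASE)


def future_marking_share(sentences):
    """Return the fraction of non-empty sentences with an explicit future marker."""
    sentences = [s for s in sentences if s.strip()]
    if not sentences:
        raise ValueError("no sentences to classify")
    marked = sum(1 for s in sentences if FUTURE_MARKERS.search(s))
    return marked / len(sentences)


# Made-up sample forecasts, including the two example sentences above.
forecasts = [
    "Rain is likely this weekend.",
    "It will likely rain this weekend.",
    "Snow is expected on Monday.",
    "Temperatures will drop overnight.",
]

share = future_marking_share(forecasts)
print(f"{share:.0%} of sentences mark the future")  # prints "50% of sentences mark the future"
```

A keyword match like this is only a crude stand-in. Comparing 39 languages would need language-specific rules, since many languages mark the future with verb morphology rather than a separate word.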

What’s surprising is that when I repeat the statistical analysis I did in the paper, I find an incredibly strong relationship between how forecasters talk about weather and how much people choose to save. Essentially, a 20 percentage point decrease in the frequency of future tenses results in 1% more of GDP saved. This finding holds even after taking into account a country’s level of development, rate of growth, demographics, social security protections and major religions.

What does this mean? I don’t believe it demonstrates extreme weather forecaster persuasion. Rather, I think it shows that many different ways of measuring how languages mark time share a strong and striking relationship with how speakers of those languages save. In short, I believe more than ever that the data suggests a strong and robust relationship between linguistic and economic data, a relationship that leaves us at an exciting crossroads: one where economists have a tremendous amount to learn from linguists.

The figure below measures the percent of time weather forecasts use future vs. present tenses (download a larger version as a PDF). See the paper here for details.

Graph of Future Tense Use