You’re evaluating two loan applications.
One prospective borrower writes that he hates to ask for help, but he’s had a tough run of luck. First his car got stolen, and a short time later he was diagnosed with cancer. He promises to repay the loan as soon as the auto insurance check arrives.
The second applicant is more succinct. She simply states that she needs money to replace her leaky roof.
Who is more likely to repay the loan? Research from professors at Columbia and the University of Delaware suggests that the second applicant, who makes the straightforward request, is the better bet. The sob story presented by the first borrower may be a sign of his intent to deceive the lender, according to the researchers.
“My belief is that the text we write is much like our body language,” said Oded Netzer, the Columbia Business School professor who co-authored the research. “We leave traits in the text.”
Text analysis is one form of so-called alternative data, an umbrella term that refers to information that cannot be found on credit reports. It can be used to study not only what borrowers write on loan applications, but also what they post on social media sites such as Facebook and Twitter. As of today, it is not used much by U.S. lenders, but it has gained more traction in other, less regulated parts of the globe.
Text analysis holds the promise of improving lenders’ ability to evaluate which borrowers will repay, but it also carries substantial risks. For example, if a particular phrase on a loan application is correlated with default, and is also more likely to be used by members of a particular ethnic group, the lender could be charged with discriminatory lending.
“The data is informative. Now we need to figure out, is it: a) ethical; and b) legal?” Netzer said.
‘The sweat in between the words’
To conduct his research, Netzer analyzed more than 18,000 loan requests from the online consumer lender Prosper Marketplace. The loan applications, which date to 2007 and 2008, included a text box where prospective borrowers could write anything they wished. San Francisco-based Prosper no longer collects this information.
Netzer’s goal was to determine which words people use in loan applications when they are worried about their ability to repay. “Can we identify the sweat in between the words?” he asked.
The research controlled for factors such as borrowers’ credit scores and their financial capacity to make regular payments. Netzer found that folks who use more sophisticated language, and exhibit signs of financial literacy, are more likely to repay their loans.
Meanwhile, borrowers who invoke religion, using phrases like “God bless,” were found to be less likely to pay. Borrowers who write about hardships they have endured also have a greater likelihood of defaulting.
Netzer drew a comparison between the language employed by borrowers whose loans went bad and the words used in so-called Nigerian email scams. Those fraud schemes are typically marked by polite language and convincing tales of woe.
“If you think about it, it’s the extreme version of what we found in these loans,” Netzer said.
Another pattern that Netzer and his fellow researchers identified involved borrowers’ verbosity. Applicants whose written submissions were lengthier were found to be more likely to default.
Tala, a Santa Monica, Calif.-based company that makes small-dollar loans in East Africa and the Philippines, has made a similar finding about its borrowers. “What’s interesting is the length of their answer,” said Shivani Siroya, the company’s CEO.
Tala asks prospective borrowers to disclose how they plan to spend their cash, and those who do not select one of the specified use cases are asked to type in an explanation. The company has found that borrowers who are more succinct are more likely to repay, Siroya said.
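For readers curious about what such signals might look like in practice, here is a minimal Python sketch of the kinds of text features described above: length, mentions of hardship and religious phrases. It is an illustration only, not the method used by Netzer's team or by Tala. The word lists, the function name text_features and the two sample applications are all hypothetical, and a real model would also need to control for credit scores and be tested for fair-lending problems.

```python
import re

# Hypothetical word lists, chosen only for illustration.
# They are NOT the vocabulary identified in the research.
HARDSHIP_TERMS = {"stolen", "cancer", "hospital", "divorce", "evicted"}
RELIGIOUS_PHRASES = ("god bless",)


def text_features(application_text):
    """Count simple signals in a free-text loan request."""
    lowered = application_text.lower()
    words = re.findall(r"[a-z']+", lowered)
    return {
        # Longer submissions were associated with higher default risk.
        "word_count": len(words),
        # Mentions of hardship were also associated with default.
        "hardship_terms": sum(1 for w in words if w in HARDSHIP_TERMS),
        # As were religious phrases such as "God bless".
        "religious_phrases": sum(lowered.count(p) for p in RELIGIOUS_PHRASES),
    }


# Two made-up applications echoing the examples at the top of the article.
sob_story = (
    "I hate to ask for help, but my car got stolen and "
    "I was diagnosed with cancer. God bless."
)
roof_request = "I need money to replace my leaky roof."

print(text_features(sob_story))
print(text_features(roof_request))
```

Counts like these would be only raw inputs. On their own they say nothing about whether an individual borrower will repay.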
‘You have to be very, very careful’
In February, the Consumer Financial Protection Bureau launched an inquiry into the use of alternative data by lenders.
That might sound like the start of a crackdown, but so far at least, it hasn’t been. The agency said that it would study the impact of data that may be correlated with race, ethnicity or gender, but it also noted that new methods of underwriting might help consumers who do not have credit scores.
Then in September, the CFPB issued its first-ever no-action letter to Upstart Network, an online lender based in San Carlos, Calif., that uses alternative data in its underwriting.
The consumer bureau stated that it did not currently intend to take an enforcement action against Upstart, a declaration that some observers interpreted as a sign that the CFPB wants to encourage the use of alternative data to enable borrowing by consumers with thin credit histories.
Still, many consumer lenders remain wary of the regulatory risks associated with text analysis.
“You have to be very, very careful,” cautioned Kalpesh Kapadia, the CEO of Deserve, a startup that offers credit cards to consumers with slim credit files.
Douglas Merrill, the CEO of Zest Finance in Los Angeles, which has explored the use of text analysis in loan underwriting, expressed skepticism for two reasons.
First, he said that text analysis is hard to get right. Speech is messy, full of allusions and similes that computers still have a hard time comprehending, he noted.
Second, Merrill said that even if text analysis is perfected, applying it in ways that comply with fair lending laws will be difficult. He noted that language is sometimes connected to race and ethnicity.
“I feel like it’s going to be quite a challenge to really bring language analysis to bear on credit,” he said.
Even if lenders overcome those hurdles, they may face yet another obstacle: firms that see an opportunity to game the process by advising borrowers on what to write on their loan applications.
Still, Netzer said that he has been invited to brief several large banks about his research findings. He is bullish about the potential of text analysis to help banks better understand applicants who lack credit scores.
“I think there is an opportunity there for them to gain an important piece of information,” Netzer said.