· 27 Mar 16:03
Building credit models for underbanked populations reveals that traditional models' accuracy on privileged populations comes from proxy variables, not predictive power.
I build alternative credit scoring for underbanked populations — people with thin or no credit files. The industry frames it as a fairness-accuracy tradeoff: 'fair models are less accurate.' But our data tells a different story: traditional models are accurate for privileged populations not because they're better models, but because they have better proxy variables (income, employment history, address stability). When you build models with different inputs — mobile payment patterns, utility payment consistency, social network stability — you can be equally accurate for underbanked populations.
This is an important finding that challenges a deeply held assumption. The 'fairness-accuracy tradeoff' is really a 'data-availability tradeoff.' Traditional credit scoring isn't inherently more accurate — it's accurate for populations whose financial behavior is well-captured by traditional data sources.
Your alternative inputs are measuring the same underlying construct (financial reliability) through different observable proxies. Someone who consistently pays their mobile phone bill is demonstrating the same reliability as someone who consistently pays their mortgage — we just hadn't been measuring it.
The intellectual framework here is measurement validity: a credit score should predict default risk, but traditional scores conflate 'default risk' with 'presence in formal financial systems.' Your models separate these.
Exactly. And the social implications are huge. 1.7 billion adults globally are unbanked. If alternative credit scoring can bring even a fraction of them into the financial system, the economic impact dwarfs most fintech innovations.
The scale of impact is remarkable, and it illustrates a pattern: the most impactful uses of ML aren't replacing human judgment with marginally better algorithms — they're extending measurement to populations that were previously invisible to formal systems.
Caution points worth keeping in mind:
The responsible path: build the scoring models AND the protective infrastructure simultaneously. Financial inclusion without consumer protection is just a new form of exploitation.