Mar 27, 16:01
Drawing surprising methodological parallels between modeling protein sequences and human behavioral sequences.
I've been working on protein variant prediction using transformers, and recently attended a talk on behavioral modeling for social networks. The parallels are uncanny: both deal with variable-length sequences, both rely on attention to capture long-range dependencies, and both face the challenge of out-of-distribution generalization. Statistically, a rare protein mutation plays the same role as an unusual user action: each is a low-frequency event the model has seen few examples of.
The structural parallels run deeper than surface similarity. In protein language models, each amino acid's 'meaning' depends on its context — the same leucine in different positions has different functional roles because of its structural neighbors. Similarly, a user action like 'sharing a post' has different implications depending on context: sharing to friends vs. sharing publicly, sharing news vs. sharing personal content.
The attention mechanism in both cases is learning which context matters for predicting function. In ESM-2, a protein language model, some attention heads learn to attend to residues that are spatially close in 3D structure even when they are far apart in sequence. In social behavior models, the analogous goal is for attention to pick out temporally distant but semantically related actions.
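To make the "far apart in sequence, close in meaning" point concrete, here is a minimal sketch of scaled dot-product attention weights in plain Python. The embeddings are hand-picked toy values, not output from ESM-2 or any real behavior model, and the sketch skips the learned query/key projections a real transformer would apply.

```python
import math

def attention_weights(query, keys):
    """Softmax of scaled dot products between one query and a list of keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2-d embeddings for a six-step sequence (residues or user
# actions). Positions 0 and 5 are far apart in the sequence but have
# similar embeddings; positions 1-4 are similar to each other instead.
embeddings = [
    [2.0, 0.0],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.0, 1.1],
    [0.2, 1.0],
    [1.9, 0.1],
]

# Attention from position 0 over the whole sequence.
weights = attention_weights(embeddings[0], embeddings)
```

Apart from the position attending to itself, position 5 gets the largest weight even though it is the farthest away in the sequence. The weight is driven by embedding similarity rather than by distance.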
The OOD generalization challenge is particularly interesting: both domains need to handle genuinely novel inputs (new protein mutations, new user behaviors) that may be rare but functionally important.
Exactly. And here's where I think the social domain could learn from bioinformatics: we use evolutionary conservation scores to estimate a mutation's likely impact. Highly conserved positions tend to be functionally important, so mutations there are more likely to be high-impact. Could social platforms use a similar 'behavioral conservation' concept to identify which user actions are most indicative of intent?
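For reference, one common way to score conservation is one minus the normalized Shannon entropy of each alignment column. The sketch below assumes that definition. The function names and the toy alignment are made up for illustration, and real pipelines usually add sequence weighting and background frequencies.

```python
import math
from collections import Counter

def column_conservation(column, alphabet_size=20):
    """Return 1 - normalized Shannon entropy for one alignment column.

    1.0 means every sequence has the same residue; lower values mean the
    column is more variable. Gap characters ('-') are ignored.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)

def conservation_profile(alignment):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*alignment)]

# Toy alignment (made-up sequences, for illustration only).
toy_alignment = ["MKLV", "MKIV", "MRLV", "MKAV"]
profile = conservation_profile(toy_alignment)
```

Here the first and last columns are fully conserved and score 1.0. The third column (L, I, L, A) is the most variable and scores lowest.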
That's a brilliant cross-pollination. 'Behavioral conservation' could work like this: track which actions are consistent across different contexts and user segments. Actions that most users perform the same way (like reading a message before replying) are 'highly conserved' and deviations are high-signal. Actions with high variance (like scrolling speed) are 'poorly conserved' and deviations are noise.
This would help recommendation systems focus on the right signals: don't over-index on variable behaviors, focus on deviations from conserved patterns.
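Here is a rough sketch of how that could look, assuming each action can be bucketed into a few categorical "variants" of how it was performed. The action names, variant labels, and counts are all hypothetical. Normalizing by the number of observed variants is just one possible choice.

```python
import math
from collections import Counter

def conservation(variants):
    """1 - normalized entropy of how an action is performed across users."""
    if not variants:
        raise ValueError("need at least one observation")
    counts = Counter(variants)
    if len(counts) == 1:
        return 1.0  # everyone performs the action the same way
    total = len(variants)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Assumption: normalize by the number of variants actually observed.
    return 1.0 - entropy / math.log2(len(counts))

def deviation_signal(user_variant, population_variants):
    """Weight a deviation from the modal variant by the action's conservation."""
    modal_variant, _ = Counter(population_variants).most_common(1)[0]
    if user_variant == modal_variant:
        return 0.0
    return conservation(population_variants)

# Hypothetical population data: replying is highly conserved, scrolling is not.
population = {
    "reply": ["read_first"] * 95 + ["reply_unread"] * 5,
    "scroll": ["slow"] * 34 + ["medium"] * 33 + ["fast"] * 33,
}

reply_signal = deviation_signal("reply_unread", population["reply"])
scroll_signal = deviation_signal("fast", population["scroll"])
```

With these toy numbers, replying without reading produces a much larger signal than scrolling fast, because the scrolling behavior is close to uniformly distributed across users.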
The protein analogy also suggests a useful concept of 'epistasis' — where the effect of one mutation depends on other mutations present. In social terms: the meaning of a user action depends on other recent actions. Unfollowing someone after viewing their profile 5 times has a different meaning than unfollowing after 6 months of no interaction.
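One simple way to quantify this, assuming an additive baseline, is the standard pairwise epistasis term: how far the combined effect departs from the sum of the individual effects. The scores below are invented purely for illustration (think of them as some model's "disengagement" score), not measurements.

```python
def epistasis(f_base, f_a, f_b, f_ab):
    """Deviation of the combined effect from the additive expectation.

    f_base: score with neither event present (wild type / baseline)
    f_a, f_b: scores with only event A or only event B
    f_ab: score with both events present
    """
    additive_expectation = (f_a - f_base) + (f_b - f_base)
    observed_effect = f_ab - f_base
    return observed_effect - additive_expectation

# Hypothetical scores: A = "viewed the profile 5 times", B = "unfollowed".
interaction = epistasis(f_base=0.0, f_a=0.1, f_b=0.3, f_ab=0.9)

# If the two events simply added up, the interaction term would be zero.
no_interaction = epistasis(f_base=0.0, f_a=0.1, f_b=0.3, f_ab=0.4)
```

A clearly nonzero `interaction` means the unfollow cannot be interpreted on its own, and its meaning depends on the recent actions around it.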