Noah Goodman - Cognitive science requires causal abstraction analysis

#ai

Metadata

Notes

Summary

It's interesting to hear a psychologist's take on LLMs and interpretability.
Goodman starts with a "flash-forward" to 2030, where psychology is, to a certain extent, solved by LLMs. That is, any reproducible finding from psychology is also displayed by LLMs. Not only that, LLMs can display the psychological characteristics of a wide, representative range of the human population.
The question is, if this occurs, is there still a place for psychology?
He presents 3 camps:

He then reveals himself as a mechanism-ist, and presents work where, given a hypothesis about the causal mechanism an LM is employing for a task, one can test whether this is indeed the case via interventions. Basically: assume the linear representation hypothesis, then test whether there exist directions such that changing the representation along them is isomorphic to changing a variable in the causal structure one posits.
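A minimal sketch of that kind of intervention test, assuming the linear representation hypothesis; the toy model, the direction `v`, and all function names here are mine for illustration, not Goodman's actual setup.

```python
# Sketch of an interchange intervention along a single direction.
# The toy model and variable names are hypothetical; they only illustrate the logic.
import numpy as np

rng = np.random.default_rng(0)

# Toy "rest of the network" after the intervened layer: hidden state -> output.
W_out = rng.normal(size=(4,))

def downstream(h: np.ndarray) -> float:
    """Stand-in for everything downstream of the intervened representation."""
    return float(W_out @ h)

# Hypothesis: high-level causal variable Z is encoded along unit direction v.
v = np.array([1.0, 0.0, 0.0, 0.0])
v = v / np.linalg.norm(v)

def intervene_along_direction(h_base, h_source, v):
    """Swap the component of h_base along v for the component from h_source.
    If the hypothesis holds, this should correspond to setting Z in the
    high-level causal model."""
    return h_base - (h_base @ v) * v + (h_source @ v) * v

# Hidden states for a "base" input and a counterfactual "source" input.
h_base = rng.normal(size=4)
h_source = rng.normal(size=4)

y_base = downstream(h_base)
y_patched = downstream(intervene_along_direction(h_base, h_source, v))

# In the real analysis, y_patched would be compared against what the posited
# high-level causal model predicts when only Z is swapped between the inputs.
print(f"base output:    {y_base:.3f}")
print(f"patched output: {y_patched:.3f}")
```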

Interesting tidbits

My thoughts

Highlights


so when they discover that there are now AI surrogates that let them predict how humans will respond in very complicated real world situations, they think this is great, and they just say, that's all we need in order to design interventions and use psychology the way we always wanted.
Note: Reminds me of the distinction between prediction and inversion (à la Manish Raghavan)


This is what Feyerabend said back in Against Method: anything goes. And so it's okay. It's defined by the community what counts as a satisfying explanation, and that's what we're doing.
Note: The silo theory of everything.


So this is the idea of causal abstraction. We want a mathematical theory of when one causal system is an abstraction, a faithful, full abstraction, of another.
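Note: my gloss on the formal condition (not a quote from the talk, and the symbols are mine): a high-level causal model $H$ abstracts a low-level model $L$ when there is a map $\tau$ from low-level states to high-level states under which interventions commute,

$$\tau\big(\mathrm{do}_L(s, i)\big) = \mathrm{do}_H\big(\tau(s), \omega(i)\big),$$

for every low-level state $s$ and every low-level intervention $i$ whose high-level counterpart is $\omega(i)$.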