Posts
- Shattered compositionality: how transformers learn arithmetic rules
- Do you interpret your t-SNE and UMAP visualization correctly?
- Imbalance troubles: Why is the minority class hurt more by overfitting?
- Can LLMs solve novel tasks? Induction heads, composition, and out-of-distribution generalization
- Hidden Geometry of Large Language Models