Food for thought
2024-03-16

Some key takeaways from interesting things I've read. Sometimes the insight is completely orthogonal to the source material.

source reference
The winners from the scaling hypothesis are not genius ML researchers writing clever algorithms. Current advancements in deep learning come almost entirely from the dumbest and most unsophisticated approach possible: do everything larger. In that sense, deep learning is fundamentally an engineering problem, not a research problem. The sophisticated research papers, mathematical proofs, and PhD credentials are a massive distraction from the crux of the matter.
source reference
LLMs really are only a degree separated from n-gram Markov chains, sampling from a high-dimensional latent space instead of from n-gram probabilities. Their power lies in the blessings of dimensionality in these latent spaces, not any intrinsic quality of their neural networks. It's more informative to extrapolate based on the latent space of a model than based on its emergent capabilities. (At a large enough scale, would the two converge or diverge?)
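The n-gram baseline the analogy leans on is simple enough to sketch in full; this is a minimal illustration (function names and the toy corpus are mine, not from any source), showing that generation is just a random walk over next-token counts keyed by the previous n−1 tokens:

```python
import random
from collections import defaultdict

def build_ngram_model(tokens, n=2):
    """Count next-token frequencies for each (n-1)-token context."""
    model = defaultdict(lambda: defaultdict(int))
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model

def sample(model, context, length=10, seed=None):
    """Random walk: repeatedly sample the next token from the counts
    stored for the current context window."""
    rng = random.Random(seed)
    out = list(context)
    for _ in range(length):
        counts = model.get(tuple(out[-len(context):]))
        if not counts:
            break  # unseen context: the chain has nowhere to go
        next_tokens, weights = zip(*counts.items())
        out.append(rng.choices(next_tokens, weights=weights)[0])
    return out

corpus = "the cat sat on the mat and the cat slept".split()
model = build_ngram_model(corpus, n=2)
print(sample(model, ("the",), length=5, seed=0))
```

An LLM replaces the count table with a learned distribution conditioned on a continuous latent representation of the context, but the sampling loop itself has the same shape.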
wip