Bookmarks Menu

Bookmarks Toolbar

machinelearning

Deep Learning with PyTorch: A 60 Minute Blitz — PyTorch Tutorials 2.2.0+cu121 documentation
The End of Finetuning — with Jeremy Howard of Fast.ai

llmpretrainingspeedup

Cramming: Training a Language Model on a Single GPU in One Day - 2212.14034.pdf
Liuhong99/Sophia: The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
Transcending Scaling Laws with 0.1% Extra Compute
[2307.05695] ReLoRA: High-Rank Training Through Low-Rank Updates
early weight averaging + high lr
grow length
No train no gain (review)
General transformer optimizations from RWKV
Single Headed Attention RNN: Stop Thinking With Your Head - 1911.11423v1.pdf
[2106.10860] Multiplying Matrices Without Multiplying
Entropy | Free Full-Text | Acceleration of Approximate Matrix Multiplications on GPUs

llminferencespeedup

Towards 100x Speedup: Full Stack Transformer Inference Optimization
GitHub - pytorch-labs/gpt-fast: Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

llmmemoryefficiency

MeZO/large_models at main · princeton-nlp/MeZO
mobiusml/hqq: Official implementation of Half-Quadratic Quantization (HQQ)
LASER
[2312.05821] ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Bandwidth Efficient Inference
Paper page - LLM in a flash: Efficient Large Language Model Inference with Limited Memory
101_for_distillation_tokens_are_no.pdf

llmknowledgeediting

GitHub - zjunlp/EasyEdit: An Easy-to-use Knowledge Editing Framework for LLMs.
Personality Edit

humandatasets

databricks/databricks-dolly-15k · Datasets at Hugging Face
HuggingFaceH4/no_robots · Datasets at Hugging Face
GAIR/lima · Datasets at Hugging Face
OpenAssistant/oasst2 · Datasets at Hugging Face
bsd_ja_en · Datasets at Hugging Face
Japanese-English Bilingual Corpus
Asian Language Treebank (ALT) Project
ParaNatCom
Index of /pub/archives/usenet/utzoo
2007 Gopherspace Mirror : John Goerzen : Free Download, Borrow, and Streaming : Internet Archive

datasetcleaning

anon8231489123/ShareGPT_Vicuna_unfiltered · Datasets at Hugging Face
Minipile
dolma/docs/deduplication.md at main · allenai/dolma
roberta-base-openai-detector · Hugging Face

interesting

The Melancholy of Subculture Society · Gwern.net
Who wants to play the status game? | Hacker News
The Economics of Status (2006) | Hacker News
The Cursed Computer Iceberg Meme