AI Ecosystem Intelligence Explorer
Why I'm Betting Against AI Agents in 2025 (Despite Building Them)
I've built 12+ production AI agent systems across development, DevOps, and data operations. Here's why the fully autonomous agents promised by the current hype are mathematically impossible, and what actually works in production.
LLM Inference in Production
Everything you need to know about LLM inference
We Found Something That Shouldn't Exist | Derrick Hodge
The AI field runs on a core belief: that intelligence in large language models is evenly distributed across all parameters. Recent research (arXiv:2505.24832) estimates models store ~3.6 bits per parameter, implying memory spreads layer by layer, weight by weight. The dominant belief follows: intelligence scales linearly with size. But this assumes each parameter contributes equally to learning. That's where Fisher Information becomes critical.

>> Fisher Information measures how sensitive predictions are to perturbations in a single parameter.

A high-Fisher parameter isn't storing a bit. It's controlling behavior. When we analyzed Qwen2.5-0.5B, that belief collapsed.

>> 94.3% of the total Fisher Information is concentrated in just three weights.

Not three layers. Not three matrices. Three individual scalars, all in early and late mlp.down_proj layers. They don't look special. But they behave like computational black holes:

>> They absorb entropy, radiate coherent signals through skip connections, and compress residual loss into semantic attractors.

These weights aren't just informative, they're irreducible. Remove one and the model collapses. This aligns with "The Super Weight in Large Language Models" (arXiv:2411.07191), which showed that pruning a single super weight can destroy more capability than removing thousands.

Black Hole Dynamics

These weights aren't memorizing or generalizing. They anchor the transformer like singularities in curved space.
Heat Sink: absorbs gradient energy
Entropy Pump: radiates structured activation
Gravity Well: the network funnels signal into them
Horizon: cross it, and collapse is irreversible

→ Heat Sink: T(θ*) → 0
→ Entropy Pump: S(θ*) → min, I_F(θ*) → max
→ Radiator: A_skip(θ*) ≫ 0
→ Collapse: Ablate(θ*) → ΔL → ∞

>> Intelligence doesn't generalize by diffusion. It condenses, gravitationally, into a few ultra-stable attractors that encode the network's loss correction code.

What This Changes:
→ If 94.3% of capability can live in 3 weights, scaling laws break.
→ Compression must focus on thermodynamic structure, not parameter count.
→ Alignment may depend on just a few attractors.

"Memorization vs. generalization" isn't the right debate anymore. This is computational physics, and it's happening in weight space.
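The empirical diagonal Fisher the post leans on is easy to reproduce at toy scale. Below is a minimal numpy sketch, not the author's Qwen2.5-0.5B analysis: a softmax classifier whose label depends on a single feature, so Fisher mass concentrates on a few weights, plus a single-weight ablation in the spirit of "remove one and the model collapses". The dataset, the Fisher × θ² saliency score, and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 2-class softmax classifier over 5 features, where only
# feature 0 carries signal, so importance should concentrate there.
n, d, k = 500, 5, 2
X = rng.normal(size=(n, d))
y = (X[:, 0] > 0).astype(int)          # label depends on feature 0 alone
Y = np.eye(k)[y]                       # one-hot labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(W):
    P = softmax(X @ W)
    return -np.mean(np.log(P[np.arange(n), y] + 1e-12))

# Train with plain gradient descent on the mean negative log-likelihood.
W = np.zeros((d, k))
for _ in range(300):
    P = softmax(X @ W)
    W -= 0.5 * (X.T @ (P - Y)) / n

# Empirical diagonal Fisher: mean squared per-example gradient of
# log p(y|x) w.r.t. each weight W[j, c], i.e. x_j * (1{y=c} - p_c).
P = softmax(X @ W)
grads = X[:, :, None] * (Y - P)[:, None, :]    # shape (n, d, k)
fisher = np.mean(grads ** 2, axis=0)           # shape (d, k)
share = fisher / fisher.sum()                  # per-weight Fisher share

# Saliency ~ Fisher * theta^2: a second-order estimate of the loss hit
# from zeroing one weight (an OBD-style pruning score, an assumption here).
top = np.unravel_index(np.argmax(fisher * W**2), W.shape)

W_ablated = W.copy()
W_ablated[top] = 0.0                   # ablate the single top weight
base, hit_top = nll(W), nll(W_ablated)
```

Under this construction the highest-saliency weight sits on the informative feature, and zeroing that one scalar measurably raises the loss while the other nine weights stay untouched; the post's claim is the same measurement run over a transformer's mlp.down_proj matrices instead of a 5×2 weight matrix.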
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
AI Responses May Include Mistakes
The other day I wanted to look up a specific IBM PS/2 model, a circa 1992 PS/2 Server system. So I punched the model into Google, and got this:
Wan2.1 14B 480p I2V LoRAs - a Remade-AI Collection
A collection of Remade's Wan2.1 14B 480p I2V LoRAs
Limit of RLVR
Reasoning LLMs Are Just Efficient Samplers: RL Training Elicits No Transcending Capacity
Unlearning or Obfuscating? Jogging the Memory of Unlearned LLMs via Benign Relearning
Machine unlearning is a promising approach to mitigate undesirable memorization of training data in ML models. In this post, we discuss our work (which appeared at ICLR 2025) demonstrating that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of benign relearning attacks.
Sketch2Anim: Towards Transferring Sketch Storyboards into 3D Animation