-
May 26, 2026 · 20 min read · Technical Blog Post · Inverse RL (Part 1)
Show me what you do, and I will tell you what you want. That is the bet of inverse reinforcement learning, and it is also the bet behind every modern LLM alignment pipeline, every robot learning from human demonstration, and every recommendation system trying to figure out what users actually care about. This post is about how the field first learned to take that bet seriously, in 2004 and 2006, on examples small enough to fit in a 12x12 gridworld, with interactive widgets at every key step.
-
Apr 16, 2026 · 10 min read · Technical Blog Post · LLM Art Auctions (Part 1)
Four frontier multimodal models (Gemini 3.1 Pro, Claude Sonnet 4.6, GPT-5.4, Qwen 3.6 Plus) appraised fifteen paintings worth $1.46 billion in two conditions: image only, and image plus a four-word metadata label. Visual recognition is largely solved at the frontier; what separates the models is what they do with that recognition. This is the first post in my LLM Art Auctions research series.
-
Mar 19, 2026 · 10 min read · Technical Blog Post · Lossfunk Letters
We present EsoLang-Bench, a benchmark using esoteric programming languages where training data is virtually nonexistent. Five frontier models scored 85-95% on standard benchmarks but reached at most 11.2% on EsoLang-Bench, with most below 5%. All models scored exactly 0% beyond the "Easy" difficulty tier, a uniform failure that suggests fundamental limitations rather than gradual degradation.
-
Technical Blog Post
A deep dive into the Engram module, exploring its architecture, memory mechanisms, and the elegant design principles behind continual learning in neural networks.
-
Technical Blog Post
A technical walkthrough of importance sampling, covering how to estimate expectations under one distribution using samples from another, and why it matters for reinforcement learning and probabilistic inference.
-
Nov 6, 2025 · 8 min read · Technical Blog Post · Lossfunk Letters
Sequential reasoning wins in 95.6% of configurations at matched compute, with accuracy gains up to 46.7%. On AIME-2025 with Qwen3-235B, sequential reasoning reaches 76.7% against 30.0% for parallel. We introduce inverse-entropy weighted voting, a training-free aggregation method that achieved optimal performance in 97% of sequential runs.
-
Oct 29, 2025 · 6 min read · Technical Blog Post · Lossfunk Letters
We demonstrate that post-trained models can recognize correct solutions through output entropy analysis. Sequence-level entropy cleanly separates correct from incorrect reasoning, but only in reward-trained models, not instruction-tuned ones. This enables 25-50% token reduction without sacrificing accuracy.
-
Technical Blog Post · Medium
An introduction to Flux.jl, Julia's machine learning library, covering how to build custom neural network architectures from scratch with a clean, composable API that makes deep learning in Julia intuitive and flexible.