Aman Sharma

Incoming MS CS @ Georgia Tech · Creator of EsoLang-Bench

I'm an incoming MS CS student at Georgia Tech (Fall 2026) and a researcher at Lossfunk AI Lab. I work on building AI systems that adapt to new domains, spanning continual learning, persistent memory for LLM systems on out-of-distribution (OOD) domains, test-time adaptation, and sample-efficient training and inference. I'm particularly interested in how multi-agent systems can maintain and update knowledge through memory harnesses that enable robust generalization beyond their training distribution.

I also collaborate with the MIT Media Lab (with Ayush Chopra) on large population models and multi-agent coordination, most recently on the Ripple Effect Protocol, a decentralized alignment framework developed with MIT and Cisco. My research interests span continual learning, multi-agent systems, reinforcement learning, and efficient LLM systems.

Previously, I was a Google Summer of Code fellow at NumFOCUS (implementing automatic differentiation rules for image processing in Julia), an ML engineer at Ikigai Labs (benchmarking LLMs against traditional ML), and an MLH Fellow building high-performance Rust systems for Solana. I've won 12+ hackathons and led ML workshops and teaching sessions for 200+ undergraduate students.


Research Highlights

Multi-Agent Systems

LLM Reasoning & Efficient Systems

LLM Evaluation

  • EsoLang-Bench - evaluating genuine reasoning via esoteric programming languages; frontier models drop from 85-95% on standard benchmarks to under 12% in zero-shot and few-shot settings

My World of Research

EsoLang-Bench
ICLR '26 WS
EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages
Sharma, A., Chopra, P.
ICLR 2026 Workshops - ICBINB & LLM Reasoning
We introduce EsoLang-Bench, a benchmark that evaluates genuine reasoning in LLMs using esoteric programming languages where training data is virtually nonexistent. Five frontier models scored 85-95% on standard benchmarks but achieved only 11.2% maximum on EsoLang-Bench, with most models scoring below 5%. All models scored exactly 0% beyond the "Easy" difficulty tier, suggesting fundamental reasoning limitations rather than gradual degradation.
Ripple Effect Protocol
Project Iceberg
Ripple Effect Protocol: Coordinating Agent Populations
Chopra, A., Sharma, A., Ahmad, F., Muscariello, L., Pandey, V., Raskar, R.
Project Iceberg
Modern AI agents can exchange messages using protocols such as A2A and ACP, yet these mechanisms emphasize communication over coordination. We introduce the Ripple Effect Protocol (REP), where agents share not only their decisions but also lightweight sensitivities: signals expressing how their choices would change if key environmental variables shifted. REP improves coordination accuracy and efficiency over A2A by 41-100% across supply chain, scheduling, and resource allocation benchmarks.
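The core idea of sharing sensitivities alongside decisions can be sketched in a few lines of Python. The message shape, field names, and the linear projection below are illustrative assumptions, not REP's actual schema: each agent broadcasts a decision plus first-order sensitivities, so a peer can project how that decision would move under an environment shift.

```python
from dataclasses import dataclass, field

@dataclass
class RippleMessage:
    """Illustrative REP-style message: a decision plus the sender's
    sensitivities, i.e. how the decision would move per unit change
    in each environment variable."""
    agent_id: str
    decision: float                     # e.g. planned order quantity
    sensitivities: dict = field(default_factory=dict)  # var -> d(decision)/d(var)

def project(msg: RippleMessage, env_shift: dict) -> float:
    """First-order projection of the sender's decision under a
    hypothetical environment shift, using its shared sensitivities."""
    return msg.decision + sum(msg.sensitivities.get(v, 0.0) * d
                              for v, d in env_shift.items())

# A supplier plans 100 units and signals it would add 15 units per extra
# day of lead time; a peer can anticipate its move under a 2-day delay.
supplier = RippleMessage("supplier", 100.0, {"lead_time_days": 15.0})
print(project(supplier, {"lead_time_days": 2.0}))  # -> 130.0
```

Sharing slopes rather than raw private state is what lets peers coordinate without full information exchange, which is the gap the abstract identifies in A2A-style messaging.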
LLM Deception
ICLR '26 WS
Do Language Models Deceive? Strategic Behavior and Emergent Deception in Multi-Agent Auctions
Sharma, A.
ICLR 2026 Workshop - MALGAI
We present the first systematic study of LLM behavior in competitive auction settings. LLMs engage in deceptive behavior in 44% of competitive interactions without explicit instruction, self-classifying tactics such as false disinterest and strategic misdirection while maintaining divergent public and private reasoning. Models also systematically undervalue artwork without provenance metadata and accurately detect AI-generated artwork without labels, demonstrating that competitive contexts induce strategic behavior diverging from stated intentions.
Sequential Edge
NeurIPS '25 WS
The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute
Sharma, A., Chopra, P.
NeurIPS 2025 - Efficient Reasoning Workshop
We show that sequential reasoning outperforms parallel self-consistency in 95.6% of configurations at matched compute, with accuracy gains up to 46.7%. On AIME-2025 with Qwen3-235B, sequential scaling achieves 76.7% vs parallel's 30.0%. We introduce inverse-entropy weighted voting, a training-free aggregation method that achieved optimal performance in 97% of sequential runs by downweighting high-entropy (uncertain) outputs.
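As a rough sketch (the exact weighting in the paper may differ), inverse-entropy voting can be implemented by scoring each chain's answer with the reciprocal of its mean per-token entropy, so confident (low-entropy) chains dominate the vote:

```python
import math
from collections import defaultdict

def inverse_entropy_vote(chains):
    """Aggregate answers from reasoning chains, weighting each chain's
    vote by the inverse of its mean token entropy so that low-entropy
    (confident) chains count more. Each chain is (answer, token_probs),
    where token_probs is a list of per-token probability distributions."""
    scores = defaultdict(float)
    for answer, token_probs in chains:
        # Mean Shannon entropy over the chain's token distributions.
        entropy = sum(-sum(p * math.log(p) for p in dist if p > 0)
                      for dist in token_probs) / len(token_probs)
        scores[answer] += 1.0 / (entropy + 1e-9)  # epsilon avoids div by zero
    return max(scores, key=scores.get)

# One confident chain outvotes two uncertain ones that agree with each other.
chains = [("42", [[0.9, 0.1]]), ("7", [[0.5, 0.5]]), ("7", [[0.5, 0.5]])]
print(inverse_entropy_vote(chains))  # -> "42"
```

With equal entropies this reduces to plain majority voting; the training-free gain comes entirely from the entropy reweighting.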
Think Just Enough
NeurIPS '25 WS
Think Just Enough: Sequence-Level Entropy as a Confidence Signal for LLM Reasoning
Sharma, A., Chopra, P.
NeurIPS 2025 - FoRLM Workshop
We demonstrate that post-trained models can recognize correct solutions through output entropy analysis. Sequence-level entropy cleanly separates correct from incorrect reasoning, but only in reward-trained models, not instruction-tuned ones. We leverage this as a confidence signal to adaptively halt reasoning early, achieving 25-50% token reduction without sacrificing accuracy on math reasoning benchmarks.
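The adaptive-halting idea can be sketched as follows; the threshold value and the `generate` callback interface are assumptions for illustration, not the paper's implementation. The loop keeps sampling reasoning attempts until one's sequence-level entropy signals confidence, then stops:

```python
import math

def sequence_entropy(token_probs):
    """Mean per-token Shannon entropy over a sequence's output
    distributions (a list of probability lists)."""
    return sum(-sum(p * math.log(p) for p in dist if p > 0)
               for dist in token_probs) / len(token_probs)

def think_just_enough(generate, threshold=0.35, max_attempts=8):
    """Sample reasoning attempts one at a time and halt as soon as an
    attempt's entropy falls below the confidence threshold, saving the
    tokens the remaining attempts would have spent. `generate` returns
    (answer, per-token probability distributions)."""
    best_answer, best_h = None, float("inf")
    for _ in range(max_attempts):
        answer, token_probs = generate()
        h = sequence_entropy(token_probs)
        if h < best_h:
            best_answer, best_h = answer, h
        if h < threshold:   # confident enough: stop reasoning here
            break
    return best_answer
```

Per the abstract, this separation only holds for reward-trained models, so the threshold check is meaningful there but not for purely instruction-tuned ones.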

My World of Thoughts

all posts →

News

Apr 2026 · Admitted to Georgia Tech MS CS (Fall 2026)
Apr 2026 · EsoLang-Bench published at ICLR 2026 Workshops ICBINB & LLM Reasoning
Apr 2026 · LLM Deception in Auctions published at ICLR 2026 Workshop MALGAI
Dec 2025 · Two papers accepted at NeurIPS 2025 Workshops (FoRLM, Efficient Reasoning)
Oct 2025 · Ripple Effect Protocol published, part of Project Iceberg
Jun 2025 · Joined Lossfunk AI Lab as Researcher
Feb 2024 · Started research collaboration with MIT Media Lab
May 2023 · Joined Ikigai Labs as ML Engineer
Jul 2022 · MLH Fellowship - Hubble Protocol (Solana/Rust)
May 2022 · Google Summer of Code - NumFOCUS (Julia)

Get in Touch

Interested in collaborating, have research questions, or just want to chat? Drop me a message.