Aman Sharma

Incoming MS CS @ Georgia Tech · Creator of EsoLang-Bench

I'm an incoming MS CS student at Georgia Tech (Fall 2026) and a researcher at Lossfunk AI Lab. I work on building AI systems that adapt to new domains, spanning continual learning, persistent memory for LLM systems on out-of-distribution (OOD) domains, test-time adaptation, and sample-efficient training and inference. I'm particularly interested in how multi-agent systems can maintain and update knowledge through memory harnesses that enable robust generalization beyond their training distribution.

I also collaborate with the MIT Media Lab (with Ayush Chopra) on large population models and multi-agent coordination, most recently on the Ripple Effect Protocol, a decentralized alignment framework developed with MIT and Cisco. My research interests span continual learning, multi-agent systems, reinforcement learning, and efficient LLM systems.

Previously, I was a Google Summer of Code fellow at NumFOCUS (implementing automatic differentiation rules for image processing in Julia), an ML engineer at Ikigai Labs (benchmarking LLMs against traditional ML), and an MLH Fellow building high-performance Rust systems for Solana. I've won 12+ hackathons and led ML workshops and teaching sessions for 200+ undergraduate students.


Research Highlights

Multi-Agent Systems

LLM Reasoning & Efficient Systems

LLM Evaluation

  • EsoLang-Bench - evaluating genuine reasoning via esoteric programming languages; frontier models drop from 85-95% on standard benchmarks to under 12% in zero-shot and few-shot settings

My World of Research

EsoLang-Bench
ICLR '26 WS
EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages
Sharma, A., Chopra, P.
ICLR 2026 Workshops - ICBINB & LLM Reasoning
We introduce EsoLang-Bench, a benchmark that evaluates genuine reasoning in LLMs using esoteric programming languages where training data is virtually nonexistent. Five frontier models scored 85-95% on standard benchmarks but achieved only 11.2% maximum on EsoLang-Bench, with most models scoring below 5%. All models scored exactly 0% beyond the "Easy" difficulty tier, suggesting fundamental reasoning limitations rather than gradual degradation.
Ripple Effect Protocol
Project Iceberg
Ripple Effect Protocol: Coordinating Agent Populations
Chopra, A., Sharma, A., Ahmad, F., Muscariello, L., Pandey, V., Raskar, R.
Project Iceberg
Modern AI agents can exchange messages using protocols such as A2A and ACP, yet these mechanisms emphasize communication over coordination. We introduce the Ripple Effect Protocol (REP), where agents share not only their decisions but also lightweight sensitivities: signals expressing how their choices would change if key environmental variables shifted. REP improves coordination accuracy and efficiency over A2A by 41-100% across supply chain, scheduling, and resource allocation benchmarks.
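The core idea of sharing sensitivities alongside decisions can be sketched in a few lines of Python. The message shape, field names, and the linear projection below are illustrative assumptions, not REP's actual schema: each agent broadcasts a decision plus first-order sensitivities, so a peer can project how that decision would move under an environment shift.

```python
from dataclasses import dataclass, field

@dataclass
class RippleMessage:
    """Illustrative REP-style message: a decision plus the sender's
    sensitivities, i.e. how the decision would move per unit change
    in each environment variable."""
    agent_id: str
    decision: float                     # e.g. planned order quantity
    sensitivities: dict = field(default_factory=dict)  # var -> d(decision)/d(var)

def project(msg: RippleMessage, env_shift: dict) -> float:
    """First-order projection of the sender's decision under a
    hypothetical environment shift, using its shared sensitivities."""
    return msg.decision + sum(msg.sensitivities.get(v, 0.0) * d
                              for v, d in env_shift.items())

# A supplier plans 100 units and signals it would add 15 units per extra
# day of lead time; a peer can anticipate its move under a 2-day delay.
supplier = RippleMessage("supplier", 100.0, {"lead_time_days": 15.0})
print(project(supplier, {"lead_time_days": 2.0}))  # -> 130.0
```

Sharing slopes rather than raw private state is what lets peers coordinate without full information exchange, which is the gap the abstract identifies in A2A-style messaging.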
LLM Deception
ICLR '26 WS
Do Language Models Deceive? Strategic Behavior and Emergent Deception in Multi-Agent Auctions
Sharma, A.
ICLR 2026 Workshop - MALGAI
We present the first systematic study of LLM behavior in competitive auction settings. LLMs engage in deceptive behavior in 44% of competitive interactions without explicit instruction, self-classifying tactics such as false disinterest and strategic misdirection while maintaining divergent public and private reasoning. Models also systematically undervalue artwork without provenance metadata and accurately detect AI-generated artwork without labels, demonstrating that competitive contexts induce strategic behavior diverging from stated intentions.
Sequential Edge
NeurIPS '25 WS
The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute
Sharma, A., Chopra, P.
NeurIPS 2025 - Efficient Reasoning Workshop
We show that sequential reasoning outperforms parallel self-consistency in 95.6% of configurations at matched compute, with accuracy gains up to 46.7%. On AIME-2025 with Qwen3-235B, sequential scaling achieves 76.7% vs parallel's 30.0%. We introduce inverse-entropy weighted voting, a training-free aggregation method that achieved optimal performance in 97% of sequential runs by downweighting high-entropy (uncertain) outputs.
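As a rough sketch (the exact weighting in the paper may differ), inverse-entropy voting can be implemented by scoring each chain's answer with the reciprocal of its mean per-token entropy, so confident (low-entropy) chains dominate the vote:

```python
import math
from collections import defaultdict

def inverse_entropy_vote(chains):
    """Aggregate answers from reasoning chains, weighting each chain's
    vote by the inverse of its mean token entropy so that low-entropy
    (confident) chains count more. Each chain is (answer, token_probs),
    where token_probs is a list of per-token probability distributions."""
    scores = defaultdict(float)
    for answer, token_probs in chains:
        # Mean Shannon entropy over the chain's token distributions.
        entropy = sum(-sum(p * math.log(p) for p in dist if p > 0)
                      for dist in token_probs) / len(token_probs)
        scores[answer] += 1.0 / (entropy + 1e-9)  # epsilon avoids div by zero
    return max(scores, key=scores.get)

# One confident chain outvotes two uncertain ones that agree with each other.
chains = [("42", [[0.9, 0.1]]), ("7", [[0.5, 0.5]]), ("7", [[0.5, 0.5]])]
print(inverse_entropy_vote(chains))  # -> "42"
```

With equal entropies this reduces to plain majority voting; the training-free gain comes entirely from the entropy reweighting.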
Think Just Enough
NeurIPS '25 WS
Think Just Enough: Sequence-Level Entropy as a Confidence Signal for LLM Reasoning
Sharma, A., Chopra, P.
NeurIPS 2025 - FoRLM Workshop
We demonstrate that post-trained models can recognize correct solutions through output entropy analysis. Sequence-level entropy cleanly separates correct from incorrect reasoning, but only in reward-trained models, not instruction-tuned ones. We leverage this as a confidence signal to adaptively halt reasoning early, achieving 25-50% token reduction without sacrificing accuracy on math reasoning benchmarks.
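The adaptive-halting idea can be sketched as follows; the threshold value and the `generate` callback interface are assumptions for illustration, not the paper's implementation. The loop keeps sampling reasoning attempts until one's sequence-level entropy signals confidence, then stops:

```python
import math

def sequence_entropy(token_probs):
    """Mean per-token Shannon entropy over a sequence's output
    distributions (a list of probability lists)."""
    return sum(-sum(p * math.log(p) for p in dist if p > 0)
               for dist in token_probs) / len(token_probs)

def think_just_enough(generate, threshold=0.35, max_attempts=8):
    """Sample reasoning attempts one at a time and halt as soon as an
    attempt's entropy falls below the confidence threshold, saving the
    tokens the remaining attempts would have spent. `generate` returns
    (answer, per-token probability distributions)."""
    best_answer, best_h = None, float("inf")
    for _ in range(max_attempts):
        answer, token_probs = generate()
        h = sequence_entropy(token_probs)
        if h < best_h:
            best_answer, best_h = answer, h
        if h < threshold:   # confident enough: stop reasoning here
            break
    return best_answer
```

Per the abstract, this separation only holds for reward-trained models, so the threshold check is meaningful there but not for purely instruction-tuned ones.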

My World of Thoughts

all posts →

News

Apr 2026 · Admitted to Georgia Tech MS CS (Fall 2026)
Apr 2026 · EsoLang-Bench published at ICLR 2026 Workshops ICBINB & LLM Reasoning
Apr 2026 · LLM Deception in Auctions published at ICLR 2026 Workshop MALGAI
Dec 2025 · Two papers accepted at NeurIPS 2025 Workshops (FoRLM, Efficient Reasoning)
Oct 2025 · Ripple Effect Protocol published, part of Project Iceberg
Jun 2025 · Joined Lossfunk AI Lab as Researcher
Feb 2024 · Started research collaboration with MIT Media Lab
May 2023 · Joined Ikigai Labs as ML Engineer
Jul 2022 · MLH Fellowship - Hubble Protocol (Solana/Rust)
May 2022 · Google Summer of Code - NumFOCUS (Julia)

Get in Touch

Interested in collaborating, have research questions, or just want to chat? Drop me a message.