Jaehee Kim

I study language models beyond static tasks: how they reason, retrieve, and operate over time.

I believe the hard part is not raising a score, but naming what a capability is and making it visible.

Open to research internships in LLM reasoning, agents, RL, and retrieval.

About

I am a Ph.D. student in Industrial Engineering at Seoul National University, advised by Prof. Pilsung Kang in the Data Science and Business Analytics Lab.

Many of my projects start from a small discomfort in existing work, such as a comparison that feels incomplete or an evaluation claim that seems too narrow. I then turn that discomfort into a setting where a vague capability becomes measurable.

I like simple sentences for difficult problems, and I usually want to know what we mean by a capability before trying to improve it.

Selected research

AlienLM (ICML 2026)

Studies language transformed to look alien: useful to the model, hard for an outside observer to recover. Project page with a live demo.

ContAccum (NeurIPS 2024)

Improves contrastive learning by accumulating comparison queries, not just comparison targets.

CheckEval (EMNLP 2025)

Reframes LLM-as-a-judge around stability under model and sample changes, not human correlation alone.

See all publications

Ongoing direction

The output is not the answer. The output is the trajectory.

Most agent benchmarks ask whether an agent finishes a task. I care more about whether it keeps doing the job over time. OperationBench is my early-stage attempt to evaluate that.

Beyond research

Outside research, I like books, films, dry humor, and diving. Scuba and freediving have recently become my favorite ways to get out of my head. I also run and play badminton when I want to stop overthinking for a while.

I share my days with two cats, Luce and hoyang, who have firm opinions about my working-from-home schedule.

Luce

hoyang

Get in touch

If you work on LLM reasoning, agents, RL, or retrieval, I would be glad to talk. I am looking for a research internship.

Reach me at jaehee_kim@snu.ac.kr.

News

Jul 2026: AlienLM project page with an in-browser tokenizer demo is up. [project page] [arXiv] [code]
May 2026: AlienLM was accepted to ICML 2026.
May 2026: Updated my CV with new under-review manuscripts and archived publications.
Mar 2026: New manuscripts on belief-shift rewards and semiconductor equipment log retrieval are under review.
Jan 2026: AlienLM is now public on arXiv, with code released. [arXiv] [code]
Nov 2025: CheckEval appeared at EMNLP 2025.
Jul 2025: Verbosity-Aware Rationale Reduction appeared at ACL 2025.
Dec 2024: ContAccum, a memory-efficient dense retriever training method, appeared at NeurIPS 2024.