Research

Research Focus

Reasoning Signals & Process Supervision: Designing process-level training signals and reinforcement learning methods that teach models to reason step by step — moving beyond outcome-based rewards toward belief-aware supervision.
Semantic Embedding & Retrieval: Training and evaluating dense retrievers for scientific, industrial, and domain-specialized corpora where data is sparse, heterogeneous, or operationally complex.
Trustworthy LLM Systems: Building privacy-aware evaluation and deployment pipelines — including API-boundary privacy via transformed representation spaces and agent benchmarks where reliability and auditability matter.

Data Science and Business Analytics Lab

I am a member of the Data Science and Business Analytics Lab led by Prof. Pilsung Kang. We collaborate on projects spanning trustworthy AI, healthcare analytics, and user-centered evaluation methodologies.

Selected Sponsored Projects

Development of Worker-Friendly Innovative AI Agents for Autonomous Manufacturing (IITP, Apr 2025–Dec 2025)
Building a multi-agent, multi-party engine and orchestration pipeline that supports retrieval-augmented workflows for factory operations.
Data Construction and Optimization for Training/Inference of AI Chatbot (Samsung Fire & Marine Insurance, Mar 2025–Mar 2026)
Designing a document preprocessing architecture, PDF parsing modules, and hierarchical training datasets to improve in-house chatbots.
Information Retrieval System for Scholarly Achievement and Research Projects (College of Engineering, SNU, Jan–Jul 2025)
Developing evaluation protocols for finance-domain LLMs, assembling metadata-driven corpora, and training dense retrievers for scientific content.
Large Language Model Evaluation Framework for the Financial Domain (KakaoBank, Nov 2023–Aug 2024)
Constructing safety, truthfulness, and numerical reasoning benchmarks and deploying finance-specific LLM evaluation pipelines.
Developing Customer Content via Data Analysis Techniques (Stages 1 & 2) (LG Electronics, Apr 2022–Nov 2023)
Delivering unsupervised dense retrieval, review clustering, anomaly detection, and self-training pipelines for customer insight discovery.

If you are interested in collaborations or internships, feel free to reach out.

Jaehee Kim
