Research
Research Focus
- Reasoning Signals & Process Supervision: Designing process-level training signals and reinforcement learning methods that teach models to reason step by step — moving beyond outcome-based rewards toward belief-aware supervision.
- Semantic Embedding & Retrieval: Training and evaluating dense retrievers for scientific, industrial, and domain-specialized corpora where data is sparse, heterogeneous, or operationally complex.
- Trustworthy LLM Systems: Building privacy-aware evaluation and deployment pipelines — including API-boundary privacy via transformed representation spaces and agent benchmarks where reliability and auditability matter.
Data Science and Business Analytics Lab
I am a member of the Data Science and Business Analytics Lab led by Prof. Pilsung Kang. We collaborate on projects spanning trustworthy AI, healthcare analytics, and user-centered evaluation methodologies.
Selected Sponsored Projects
- Development of Worker-Friendly Innovative AI Agents for Autonomous Manufacturing (IITP, Apr 2025–Dec 2025)
  Building a multi-agent, multi-party engine and orchestration pipeline that supports retrieval-augmented workflows for factory operations.
- Data Construction and Optimization for Training/Inference of AI Chatbot (Samsung Fire & Marine Insurance, Mar 2025–Mar 2026)
  Designing a document preprocessing architecture, PDF parsing modules, and hierarchical training datasets that strengthen in-house chatbots.
- Information Retrieval System for Scholarly Achievement and Research Projects (College of Engineering, SNU, Jan–Jul 2025)
  Developing evaluation protocols for finance-domain LLMs, assembling metadata-driven corpora, and training dense retrievers for scientific content.
- Large Language Model Evaluation Framework for the Financial Domain (KakaoBank, Nov 2023–Aug 2024)
  Constructing safety, truthfulness, and numerical reasoning benchmarks and deploying finance-specific LLM evaluation pipelines.
- Developing Customer Content via Data Analysis Techniques (Stages 1 & 2) (LG Electronics, Apr 2022–Nov 2023)
  Delivering unsupervised dense retrieval, review clustering, anomaly detection, and self-training pipelines for customer insight discovery.
If you are interested in collaborations or internships, feel free to reach out.