Hi, I'm Dongjun Kim

AI Safety · Model Evaluation · Mechanistic Interpretability

I build transparent, reliable AI through LLM Evaluation and Mechanistic Interpretability. I am a master's student at Korea University advised by Dr. Heuiseok Lim.

About

I’m a master’s student and researcher at Korea University advised by Dr. Heuiseok Lim. My work centers on AI Safety through Mechanistic Interpretability and rigorous Model Evaluation.

I design reproducible, time‑aware evaluation systems to measure capabilities, surface failures, and track regressions; and I study model internals to explain and steer behavior.

I’m experienced across the LLM training stack: dataset curation/quality, instruction tuning and post‑training (including RL with GRPO), and end‑to‑end evaluation and safety.

I work across evaluation, interpretability, safety, and reasoning. Understanding how models work is key to unlocking capability and reducing risk. If you’re exploring these areas, I’d love to connect.

Publications

Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks

Dongjun Kim, Gyuho Shim, Yong Chan Chun, Minhyuk Kim, Chanjun Park, Heuiseok Lim

Proceedings of EMNLP 2025 (Oral Presentation)

KoLEG: On-the-Fly Korean Legal Knowledge Editing with Continuous Retrieval

Jaehyung Seo, Dahyun Jung, Jaewook Lee, Yong Chan Chun, Dongjun Kim, Hwijung Ryu, Donghoon Shin, Heuiseok Lim

Findings of EMNLP 2025

MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation

Weihua Zheng, ..., Dongjun Kim, ...

arXiv Preprint, 2025 — ICLR 2026 Under Review

Exploring Coding Spot: Understanding Parametric Contributions to LLM Coding Performance

Dongjun Kim, Minhyuk Kim, Yong Chan Chun, Chanjun Park, Heuiseok Lim

arXiv Preprint, 2024 — EACL 2026 Under Review

From Snapshot to Stream: A Self-Improving Leaderboard for Robust and Evolving Natural Language Processing (NLP) Evaluation

Chanjun Park, Hyeonseok Moon, Dongjun Kim, Seolhwa Lee, Jaehyung Seo, Sugyeong Eo, Heuiseok Lim

arXiv Preprint, 2025

Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval

Yong Chan Chun, Minhyuk Kim, Dongjun Kim, Chanjun Park, Heuiseok Lim

Findings of ACL 2025

KITE: A Benchmark for Evaluating Korean Instruction-Following Abilities in Large Language Models

Dongjun Kim, Chanhee Park, Chanjun Park, Heuiseok Lim

arXiv Preprint, 2025 — IEEE Access 2026 Under Review

Exploring Inherent Biases in LLMs within Korean Social Context: A Comparative Analysis of ChatGPT and GPT-4

Seungyoon Lee, Dongjun Kim, Dahyun Jung, Chanjun Park, Heuiseok Lim

NAACL 2024 SRW

CitySEIRCast: an agent-based city digital twin for pandemic analysis and simulation

Shakir Bilal, Wajdi Zaatour, Yilian Alonso Otano, Arindam Saha, Ken Newcomb, Soo Kim, Dongjun Kim, Raveena Ginjala, Derek Groen, Edwin Michael

Complex & Intelligent Systems, 2024

Projects

Independent AI Foundation Model Project (WBL)

Evaluation Data

Collaboration with NC AI, ETRI (NC AI Consortium)

  • Implemented an evaluation framework (50+ benchmarks) covering reasoning, safety, and robustness with reproducible pipelines and CI triggers.
  • Designed a unified metric layer and Weights & Biases dashboards for time series tracking, regression checks, and cross-model slice analysis.
  • Conducted pre- and post-training analyses on successive checkpoints, identified failure modes, and provided recommendations used for data and recipe updates.
  • Introduced contamination checks (deduplication and overlap scans) and standardized runbooks to support fair, comparable evaluations across systems.
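The contamination checks above amount to scanning each benchmark item against training documents for heavy n-gram overlap and flagging near-duplicates. A minimal sketch of that idea, assuming whitespace tokenization and Jaccard similarity over word n-grams; the threshold and function names are illustrative, not the project's actual implementation:

```python
from typing import Iterable, List, Set, Tuple


def word_ngrams(text: str, n: int = 8) -> Set[Tuple[str, ...]]:
    """Lowercase, whitespace-tokenize, and collect word n-grams."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_score(item: str, doc: str, n: int = 8) -> float:
    """Jaccard similarity between the n-gram sets of a benchmark item and a training doc."""
    a, b = word_ngrams(item, n), word_ngrams(doc, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_contaminated(items: Iterable[str], corpus: List[str],
                      threshold: float = 0.3) -> List[str]:
    """Flag benchmark items whose overlap with any training document exceeds the threshold."""
    return [item for item in items
            if any(overlap_score(item, doc) > threshold for doc in corpus)]
```

In practice, MinHash or exact long-n-gram matching scales far better than this quadratic scan, but the flagging decision works the same way.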

KoLEG: Korean Legal Knowledge Editing

Training Interpretability Data

Collaboration with KT Corporation

  • Co-developed an on-the-fly legal knowledge editing framework with continuous retrieval and timestamp-aware sequential updates (team of 8); a sketch of the timestamp-aware retrieval step follows this list.
  • Led crawling and construction of the Korean Legislative Amendment dataset, aligning implementation periods and applying high-precision filtering.
  • Contributed mechanistic interpretability analyses on locality and generalization in edited knowledge.
  • Built expert evaluation demos and protocols for human assessment, enabling attorney review and iterative error analysis.
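One ingredient of on-the-fly legal editing is retrieving only the article version that was in force on a question's reference date. A minimal sketch of that timestamp-aware filtering step; the data structures here are illustrative and far simpler than the actual KoLEG pipeline:

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ArticleVersion:
    """One version of a legal article, valid from effective_date until repealed or superseded."""
    article_id: str
    text: str
    effective_date: date
    repealed_date: Optional[date] = None  # None means still in force


def retrieve_in_force(versions: List[ArticleVersion],
                      article_id: str,
                      as_of: date) -> Optional[ArticleVersion]:
    """Return the version of the requested article that was in force on the given date."""
    candidates = [
        v for v in versions
        if v.article_id == article_id
        and v.effective_date <= as_of
        and (v.repealed_date is None or as_of < v.repealed_date)
    ]
    # If several versions qualify, prefer the most recently effective one.
    return max(candidates, key=lambda v: v.effective_date, default=None)
```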

KULLM 3 & KULLM R & Ko Gemma Model Training

Training Data

NLP&AI Lab, Korea University

  • Contributed to post-training on a team of 10, covering instruction tuning, the training framework, and multilingual and code-switching datasets.
  • Curated code and math corpora, established quality gates, and ran capability evaluations for coding and math.
  • KULLM Reasoning: implemented reinforcement learning with GRPO and custom reward functions, applying adaptive response length (concise answers on easier tasks, deeper chains on harder ones); a sketch of one such reward follows this list.
  • Observed improved Korean reasoning and math accuracy versus a Qwen3 reasoning baseline while reducing unnecessary verbosity in internal evaluations.
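The adaptive response length can be read as reward shaping: correctness dominates, and tokens spent beyond a difficulty-dependent budget are mildly penalized. A minimal sketch of one such custom reward; the budgets, weights, and difficulty labels are illustrative, not the actual KULLM recipe:

```python
def length_aware_reward(is_correct: bool,
                        response_tokens: int,
                        difficulty: str,
                        easy_budget: int = 256,
                        hard_budget: int = 2048) -> float:
    """Reward correctness first, then nudge response length toward a difficulty-dependent budget."""
    reward = 1.0 if is_correct else 0.0
    budget = easy_budget if difficulty == "easy" else hard_budget
    # Penalize only the tokens spent beyond the budget, scaled so the penalty
    # stays small relative to the correctness signal.
    overflow = max(0, response_tokens - budget)
    return reward - min(0.5, 0.0005 * overflow)
```

GRPO then normalizes such rewards within each group of sampled responses, so relative advantages rather than absolute values drive the update.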

Self Improving Leaderboard

Evaluation Data

Automated Benchmark Generation from Real-Time Data

  • Implemented daily crawlers across news categories, real-time QA generation, and automated multi-LLM evaluation on daily refreshed data.
  • Launched a live leaderboard with time-aware ranking and quarterly stability/volatility metrics to track consistency over time; a sketch of the volatility computation follows this list.
  • Maintained scheduling, monitoring, and data hygiene to support regular refreshes and clear longitudinal comparisons.
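The quarterly stability/volatility metrics summarize how much a model's daily scores fluctuate within each quarter. A minimal sketch of that computation, assuming a series of (YYYY-MM-DD, score) pairs per model; using the standard deviation as the volatility measure is an illustrative choice:

```python
from statistics import pstdev
from typing import Dict, List, Tuple


def quarterly_volatility(series: List[Tuple[str, float]]) -> Dict[str, float]:
    """Group one model's (YYYY-MM-DD, score) pairs by quarter and report the score spread per quarter."""
    by_quarter: Dict[str, List[float]] = {}
    for day, score in series:
        year, month = day.split("-")[:2]
        quarter = f"{year}-Q{(int(month) - 1) // 3 + 1}"
        by_quarter.setdefault(quarter, []).append(score)
    # Population standard deviation of daily scores within each quarter.
    return {quarter: pstdev(scores) for quarter, scores in by_quarter.items()}
```

A low value means the model scores consistently on freshly generated data; a high value signals sensitivity to the daily topic mix.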

LLM Lesion Brain Mapping

Interpretability

NLP&AI Lab, Korea University

  • Studied how targeted parameter perturbations relate to language deficits in LLMs across syntax, semantics, and retrieval behaviors.
  • Explored correspondences between internal subspaces and neuro-linguistic functionality with layer- and module-level analysis.
  • Prototyped a lesion-based causal tracing method linking internal changes to observable behaviors with reproducible analysis tooling.
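Lesion-style tracing here means ablating a chosen parameter subset, re-running the same evaluation, and attributing any behavioral drop to the ablated component. A minimal sketch using PyTorch; the parameter name and the evaluate callback are placeholders, not the lab's actual tooling:

```python
import torch


@torch.no_grad()
def lesion_and_score(model: torch.nn.Module, param_name: str, evaluate) -> float:
    """Zero-ablate one named parameter tensor, score the model, then restore the original weights."""
    param = dict(model.named_parameters())[param_name]
    original = param.detach().clone()
    param.zero_()                    # apply the "lesion"
    try:
        score = evaluate(model)      # any behavioral metric: accuracy, log-likelihood, ...
    finally:
        param.copy_(original)        # restore weights even if evaluation fails
    return score


# Hypothetical usage: lesion one MLP projection and compare against the intact baseline.
# score = lesion_and_score(model, "model.layers.10.mlp.down_proj.weight", my_eval_fn)
```

Comparing each lesioned score against the intact baseline, parameter group by parameter group, gives a coarse causal map of which components carry which behavior.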

Education

2024 – Present

Korea University, Seoul, South Korea

Master of Science in Computer Science — Expected Feb 2026

  • First-author, EMNLP 2025 Main — Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
  • Built an automated LLM evaluation framework (≈50 benchmarks) for pre/post-training model diagnostics
  • Co-developed KoLEG (legal knowledge editing) with continuous retrieval; constructed amendment dataset
  • Explored lesion-style causal tracing for mechanistic interpretability of language deficits
2019 – 2023

University of South Florida, Tampa, FL

Bachelor of Science in Computer Science — May 2023

  • Worked with Dr. Edwin Michael on building a city-scale digital twin for Hillsborough County, developing agent-based models to simulate pandemic dynamics and inform public health policies
  • Collaborated with Dr. Wanwan Li on automatic room mapping in augmented reality (AR) systems, focusing on view direction-driven SLAM algorithms for dynamic environments
  • Explored neural scene representation networks for AR applications, optimizing spatial intelligence systems for real-time use cases

Contact