Curriculum Vitae

Education

  • 2015–2019
    B.S. in Computer Science
    University of California, Davis, CA, USA
    • GPA: 3.68/4.0 (Major GPA: 3.91/4.0)
    • Dean’s Honor List: Spring 2017, Fall 2017, Winter 2018, Spring 2018, Fall 2018, Winter 2019, Spring 2019

Research Experience

  • 2025–Present
    Researcher
    Vision & Learning Lab, Seoul National University
    • Conduct research on spoken dialogue systems and general audio processing under Prof. Gunhee Kim.
    • Created a benchmark for evaluating omnimodal LLMs’ multimodal reference resolution capabilities.
    • Developed a speech tokenization method aligned with LLM vocabularies for spoken language modeling.
    • Designed a benchmark assessing fine-grained acoustic perception in audio-language models.
    • Implemented incremental response rewriting for spoken dialogue systems.
  • 2019
    Undergraduate Researcher
    RUbiNet Lab, University of California, Davis
    • Contributed to ML-powered clinical decision support systems under Prof. Chen-Nee Chuah.
    • Built an iOS client for monitoring ventilator data and anomaly notifications.
  • 2018–2019
    Undergraduate Researcher
    DECAL Lab, University of California, Davis
    • Conducted experiments on entropy scoring for genetic programming–based automated software repair.

Employment

  • 2025–Present
    Research Intern
    SK Telecom, Seoul, South Korea
    • Contribute to Korea’s Sovereign AI Foundation Model Project.
    • Design and implement spoken language interfaces for multimodal large language models.
  • 2022–2023
    NLP Engineer
    Mindlogic Inc., Seoul, South Korea
    • Developed persona-grounded dialogue systems using historical chat data.
    • Built LLM-based chatbot modules, including prompt-engineering pipelines.
  • 2022
    Senior AI Scientist
    MINDsLab Inc., Seongnam, South Korea
    • Led speech recognition research achieving a 68% relative error-rate reduction.
    • Built multilingual TTS service integrated with a talking-face generation system.
    • Developed Japanese and Chinese grapheme-to-phoneme pipelines.
  • 2020–2022
    AI Scientist
    MINDsLab Inc., Seongnam, South Korea
    • Built a text preprocessing pipeline for language model pretraining.
    • Developed transformer-based speech recognition systems.
  • 2019
    Undergraduate Reader
    University of California, Davis
    • Graded assignments and exams.
    • Held office hours to assist students.

Projects

  • 2023–2024
    EnCLAP / EnCLAP++
    Audio Captioning Research
    • Developed state-of-the-art audio captioning models using pretrained audio and language models.
    • Placed 2nd in the DCASE 2024 Challenge Task 6 (Automated Audio Captioning).
    • Published results at ICASSP 2024 and the DCASE 2024 Workshop.

Skills

  • Languages
    • English (Native)
    • Korean (Native)
    • Japanese (Fluent)
  • Programming
    • Python
    • Java
    • Kotlin
    • Swift
    • SQL
  • Tools
    • Git
    • Docker
  • Frameworks & Libraries
    • PyTorch
    • TensorFlow/Keras
    • Hugging Face Transformers
    • gRPC
    • FastAPI
    • LangChain
    • vLLM

Publications (* denotes equal contribution)

  • DExTER: Can Omnimodal Language Models Resolve Audio-Visual Deixis?
    Sehun Lee, Yoonji Nam, Sang Hoon Woo, Gunhee Kim
    Under review
  • SubAlign: Speech Tokenization Aligned with LLM Vocabularies for Spoken Language Modeling
    Kang-wook Kim, Sehun Lee, Sang Hoon Woo, Gunhee Kim
    To be submitted to ARR (January 2026 cycle)
  • WoW-Bench: Evaluating Fine-Grained Acoustic Perception in Audio-Language Models via Marine Mammal Vocalizations
    Jaeyeon Kim, Heeseung Yun, Sang Hoon Woo, Chao-Han Huck Yang, Gunhee Kim
    Submitted to ARR (October 2025 cycle)
  • Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech
    Sang Hoon Woo*, Sehun Lee*, Kang-wook Kim, Gunhee Kim
    EMNLP 2025
  • EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance
    Jaeyeon Kim, Minjeong Jeon, Jaeyoon Jung, Sang Hoon Woo, Jinjoo Lee
    DCASE Workshop 2024
  • Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
    Jaeyeon Kim, Jaeyoon Jung, Minjeong Jeon, Sang Hoon Woo, Jinjoo Lee
    DCASE 2024 Task 6 (2nd Place)
  • EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
    Jaeyeon Kim, Jaeyoon Jung, Jinjoo Lee, Sang Hoon Woo
    ICASSP 2024
  • SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
    Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo
    Interspeech 2022
  • Talking Face Generation with Multilingual TTS
    Hyoung-Kyu Song*, Sang Hoon Woo*, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim
    CVPR 2022 (Demo)
  • Leveraging IoTs and Machine Learning for Patient Diagnosis and Ventilation Management in the Intensive Care Unit
    Gregory B. Rehm, Sang Hoon Woo, Xin Luigi Chen, Brooks T. Kuhn, Irene Cortes-Puch, Nicholas R. Anderson, Jason Y. Adams, Chen-Nee Chuah
    IEEE Pervasive Computing (2020)
  • Mining Vehicle Failure Consumer Reports for Enhanced Service Efficiency
    Ali Khodadadi, Chen-Nee Chuah, Sang Hoon Woo, Ashish Dalal
    IEEE VTC 2019-Fall