Curriculum Vitae

Education

  • 2015–2019
    B.S. in Computer Science
    University of California, Davis, CA, USA
    • GPA: 3.68/4.0 (Major GPA: 3.91/4.0)
    • Dean’s Honor List: Spring 2017, Fall 2017, Winter 2018, Spring 2018, Fall 2018, Winter 2019, Spring 2019

Research Experience

  • 2025–Present
    Researcher
    Vision & Learning Lab, Seoul National University
    • Conduct research on spoken dialogue systems and general audio processing under Prof. Gunhee Kim.
    • Created a benchmark for evaluating omnimodal LLMs’ multimodal reference resolution capabilities.
    • Developed a speech tokenization method aligned with LLM vocabularies for spoken language modeling.
    • Designed a benchmark assessing fine-grained acoustic perception in audio-language models.
    • Implemented incremental response rewriting for spoken dialogue systems.
  • 2019
    Undergraduate Researcher
    RUbiNet Lab, University of California, Davis
    • Contributed to ML-powered clinical decision support systems under Prof. Chen-Nee Chuah.
    • Built an iOS client for monitoring ventilator data and anomaly notifications.
  • 2018–2019
    Undergraduate Researcher
    DECAL Lab, University of California, Davis
    • Conducted experiments on entropy scoring for genetic programming–based automated software repair.

Employment

  • 2025–Present
    Research Intern
    SK Telecom, Seoul, South Korea
    • Contribute to Korea’s Sovereign AI Foundation Model Project.
    • Design and implement spoken language interfaces for multimodal large language models.
  • 2022–2023
    NLP Engineer
    Mindlogic Inc., Seoul, South Korea
    • Developed persona-grounded dialogue systems using historical chat data.
    • Built LLM-based chatbot modules, including prompt-engineering pipelines.
  • 2022
    Senior AI Scientist
    MINDsLab Inc., Seongnam, South Korea
    • Led speech recognition research achieving a 68% relative error-rate reduction.
    • Built multilingual TTS service integrated with a talking-face generation system.
    • Developed Japanese and Chinese grapheme-to-phoneme pipelines.
  • 2020–2022
    AI Scientist
    MINDsLab Inc., Seongnam, South Korea
    • Built a text preprocessing pipeline for language model pretraining.
    • Developed transformer-based speech recognition systems.
  • 2019
    Undergraduate Reader
    University of California, Davis
    • Graded assignments and exams.
    • Held office hours to assist students.

Projects

  • 2023–2024
    EnCLAP / EnCLAP++
    Audio Captioning Research
    • Developed state-of-the-art audio captioning models using pretrained audio and language models.
    • Placed 2nd in the DCASE 2024 Challenge Task 6 (Automated Audio Captioning).
    • Published results at ICASSP 2024 and the DCASE 2024 Workshop.

Skills

  • Languages
    • English (Native)
    • Korean (Native)
    • Japanese (Fluent)
  • Programming
    • Python
    • Java
    • Kotlin
    • Swift
    • SQL
  • Tools
    • Git
    • Docker
  • Frameworks & Libraries
    • PyTorch
    • TensorFlow/Keras
    • Hugging Face Transformers
    • gRPC
    • FastAPI
    • LangChain
    • vLLM

Publications (* denotes equal contribution)

  • DExTER: Can Omnimodal Language Models Resolve Audio-Visual Deixis?
    Sehun Lee, Yoonji Nam, Sang Hoon Woo, Gunhee Kim
    Under review
  • SubAlign: Speech Tokenization Aligned with LLM Vocabularies for Spoken Language Modeling
    Kang-wook Kim, Sehun Lee, Sang Hoon Woo, Gunhee Kim
    To be submitted to ARR (January 2026 cycle)
  • WoW-Bench: Evaluating Fine-Grained Acoustic Perception in Audio-Language Models via Marine Mammal Vocalizations
    Jaeyeon Kim, Heeseung Yun, Sang Hoon Woo, Chao-Han Huck Yang, Gunhee Kim
    Submitted to ARR (October 2025 cycle)
  • Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech
    Sang Hoon Woo*, Sehun Lee*, Kang-wook Kim, Gunhee Kim
    EMNLP 2025
  • EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance
    Jaeyeon Kim, Minjeong Jeon, Jaeyoon Jung, Sang Hoon Woo, Jinjoo Lee
    DCASE Workshop 2024
  • Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
    Jaeyeon Kim, Jaeyoon Jung, Minjeong Jeon, Sang Hoon Woo, Jinjoo Lee
    DCASE 2024 Task 6 (2nd Place)
  • EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
    Jaeyeon Kim, Jaeyoon Jung, Jinjoo Lee, Sang Hoon Woo
    ICASSP 2024
  • SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
    Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo
    Interspeech 2022
  • Talking Face Generation with Multilingual TTS
    Hyoung-Kyu Song*, Sang Hoon Woo*, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim
    CVPR 2022 (Demo)
  • Leveraging IoTs and Machine Learning for Patient Diagnosis and Ventilation Management in the Intensive Care Unit
    Gregory B. Rehm, Sang Hoon Woo, Xin Luigi Chen, Brooks T. Kuhn, Irene Cortes-Puch, Nicholas R. Anderson, Jason Y. Adams, Chen-Nee Chuah
    IEEE Pervasive Computing (2020)
  • Mining Vehicle Failure Consumer Reports for Enhanced Service Efficiency
    Ali Khodadadi, Chen-Nee Chuah, Sang Hoon Woo, Ashish Dalal
    IEEE VTC 2019-Fall