Harim Kim

I am an incoming PhD student in Electrical Engineering at Boston University. I received my M.S. degree in Computer Science from Handong Global University under the supervision of Prof. Charmgil Hong in the Handong Artificial Intelligence Lab (HAIL).

My current research interests lie in developing intent-driven deep learning frameworks to address fundamental challenges in medical AI.

In particular, I am interested in:

Exploring multimodal data fusion strategies for robust and informative representation learning
Constructing anomaly detection techniques informed by latent space understanding

For more details about my academic background, publications, and research experience, please refer to the Resume page.

Selected Projects

This section introduces a selection of representative research projects. For each topic, I provide a brief overview and a description of the most recent publication.

Integrating Multimodal Medical Data

To build more reliable and explainable medical AI, this project focuses on designing deep learning mechanisms that effectively fuse and utilize multimodal medical data. The goal is to improve disease prediction, enable early detection, and support the discovery of potential biomarkers.

• Most Recent Publication •

[Kim et al.] Harnessing EHRs for Diffusion-based Anomaly Detection on Chest X-rays (MICCAI, 2025) - Early Accepted (Top 9%)

Unsupervised anomaly detection (UAD) in medical imaging is crucial for identifying pathological abnormalities without requiring extensive labeled data. However, existing diffusion-based UAD models rely solely on imaging features, limiting their ability to distinguish between normal anatomical variations and pathological anomalies. To address this, we propose Diff3M, a multi-modal diffusion-based framework that integrates chest X-rays and structured Electronic Health Records (EHRs) for enhanced anomaly detection. Specifically, we introduce a novel Image-EHR Cross-Attention module to incorporate structured clinical context into the image generation process, improving the model’s ability to differentiate normal from abnormal features. Additionally, we develop a static masking strategy to enhance the reconstruction of normal-like images from anomalies. Extensive evaluations on CheXpert and MIMIC-CXR/IV demonstrate that Diff3M achieves state-of-the-art performance, outperforming existing UAD methods in medical imaging. Our implementation is available at https://github.com/nth221/Diff3M.

Reinterpreting for Enhanced Anomaly Detection

To detect unusual patterns in real-world scenarios, this project proposes new anomaly detection frameworks by reinterpreting existing deep learning mechanisms. The proposed frameworks can be applied to identify various abnormalities, such as pathological features in radiological images and suspicious activities in surveillance footage.

• Most Recent Publication •

[Kim et al.] Transformer for Point Anomaly Detection (CIKM, 2024)

In data analysis, unsupervised anomaly detection holds an important position for identifying statistical outliers that signify atypical behavior, erroneous readings, or interesting patterns within data. The Transformer model, known for its ability to capture dependencies within sequences, has revolutionized areas such as text and image data analysis. However, its potential for tabular data, where sequence dependencies are not inherently present, remains underexplored. This paper introduces Transformer for Point Anomaly Detection (TransPAD), a novel Transformer-based AutoEncoder framework specifically designed for point anomaly detection. Our method captures interdependencies across entire datasets, addressing the challenges posed by non-sequential, tabular data. It incorporates unique random and criteria sampling strategies for effective training and anomaly identification, and avoids the common pitfall of trivial generalization that affects many conventional methods. By leveraging an attention weight-based anomaly scoring system, TransPAD offers a more precise approach to detecting anomalies. Extensive testing on a range of benchmark tabular datasets shows that TransPAD consistently outperforms existing methods. Our source code is available at https://github.com/nth221/TransPAD.

Development for System-level Application

To address real-world challenges, this project aims to design task-specific deep learning frameworks and build end-to-end pipelines for system-level applications.

• Most Recent Publication •

[Kim et al.] AREST: Attention-Based Red-Light Violation Detection for Safety Technology (IEEE AVSS, 2024) - Best Paper Award

As car-sharing services evolve, there is a growing effort to analyze users’ safe driving behaviors and effectively manage shared vehicles. Unlike previous research that focuses on simple situations like sudden acceleration and lane departure using cameras with additional sensors, we introduce a new approach that detects more complex traffic rule violations, especially red-light violations, using only monocular dashcam videos. The proposed framework employs the attention mechanism of the Transformer and effectively encodes the traffic signal objects and contextual information within the video. It utilizes a novel method, POISE (Positional Object Information by Spatial Encoding), to handle the positional information of traffic signal objects. Our quantitative and qualitative evaluations demonstrate the effectiveness of our proposed framework in detecting red-light violations compared to existing methods.