Publications

A connected view of my research trajectory, followed by accepted papers and arXiv preprints.

Research Trajectory

My research has evolved from efficient representation learning to multimodal modeling and uncertainty, then to geometric interpretation of representation spaces, multimodal agentic reasoning, and internal analysis for controllable agentic systems.

Efficient Representations

Linear Algebra, Efficiency, Embedded Devices

Exploiting Boosting in Hyperdimensional Computing for Enhanced Reliability in Healthcare

→

Multimodal Learning

Multimodal, Cross-Modal Alignment, Event Streams

Cross-Modal Event Encoder: Bridging Image-Text Knowledge to Event Streams

→

Probabilistic Fusion

Probabilistic, Bayesian Uncertainty, Robustness

Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection

→

Geometric Interpretation

Linear Algebra, Geometry, Vector Space

Understanding the Visual Projection Space of Multimodal LLMs

→

Multimodal Agents

Multimodal, Agentic Expert Orchestration, Visual Reasoning

Draft and Refine with Visual Experts

→

Hallucination Analysis

Linear Algebra, Geometry, Flow Signatures

Internal Flow Signatures for Self-Checking and Refinement in LLMs

Controllable Agents

AI Agent, Control, Interpretability, State-Centric Reasoning

State-Centric Decision Process

Research Blueprint

A conceptual map of how my research themes connect across papers. Time flows from top to bottom.

Themes: Efficiency, Linear Algebra, Multimodal, Probability Theory, Hallucination, Agentic

Papers: [1] Boosting in HDC, [2] Cross-Modal Event Encoder, [3] Geometric Interpretation, [4] Uncertainty Fusion, [5] DnR, [6] LLM Signature flow, [7] SDP

Legend: Green: efficiency / linear algebra; Blue: multimodal; Purple: probability; Teal: hallucination / agentic reasoning

Accepted Publications

[4] Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection

SungHeon Jeong, Jihong Park, Mohsen Imani

ECCV 2026

arXiv, Code

A video anomaly detection framework that synthesizes event representations from RGB videos and fuses them with image features through an uncertainty-aware process.

[5] Draft and Refine with Visual Experts

SungHeon Jeong, Ryozo Masukawa, Jihong Park, Sanggeon Yun, Wenjun Huang, Hanning Chen, Mahdi Imani, Mohsen Imani

CVPR 2026

CVF, CVPR Virtual, arXiv, Code

An agent framework that improves multimodal reasoning by measuring visual reliance and refining responses with feedback from visual experts.

[3] Understanding the Visual Projection Space of Multimodal LLMs

SungHeon Jeong, Yoojeong Song, Hyungjoon Kim

WACV 2026

CVF

A geometric probing study of the projected visual token in multimodal LLMs, analyzing latent-token alignment, intrinsic dimensionality, and perturbation sensitivity.

[2] Cross-Modal Event Encoder: Bridging Image-Text Knowledge to Event Streams

SungHeon Jeong, Hanning Chen, Sanggeon Yun, Suhyeon Cho, Wenjun Huang, Xiangjian Liu, Mohsen Imani

WACV 2026

CVF, arXiv, Code

A cross-modal event encoder that adapts CLIP's image-text representation space to event streams while preserving zero-shot learning and text alignment.

[1] Exploiting Boosting in Hyperdimensional Computing for Enhanced Reliability in Healthcare

SungHeon Jeong, Hamza Errahmouni Barkam, Sanggeon Yun, Yeseong Kim, Shaahin Angizi, Mohsen Imani

DATE 2025

IEEE

A hyperdimensional computing framework that applies boosting to improve reliability and robustness in healthcare-oriented learning tasks.

arXiv Preprints

[7] State-Centric Decision Process

SungHeon Jeong, Ryozo Masukawa, Sanggeon Yun, Mahdi Imani, Mohsen Imani

arXiv 2026

arXiv

A state-centric framework for agent decision-making that represents reasoning trajectories through certified state transitions and supports analysis such as credit assignment, failure localization, and modular operator replacement.

[6] Internal Flow Signatures for Self-Checking and Refinement in LLMs

SungHeon Jeong, Sanggeon Yun, Ryozo Masukawa, Wenjun Huang, Hanning Chen, Mohsen Imani

arXiv 2026

arXiv, Code

A self-checking and refinement framework that audits internal decision dynamics of LLMs and enables targeted correction without modifying the base model.