👋 Hello there, I’m Hairui (海睿)
👨‍🔬 Hi! I’m a second-year graduate student at UMD College Park, advised by Dr. Abdirisak Mohamed. I’m also an intern at UMass Amherst, advised by Prof. Chuang Gan and mentored by Jiaben Chen. Before graduate study, I received my bachelor’s degree in Computer Science and Technology from SIST at ShanghaiTech University. During my junior and senior years, I had the privilege of working with Prof. Haipeng Zhang.
🔬 My research interests lie in bridging vision, language, and the real world.
📖 Recently, I have been working on 3D video scene generation, robot simulation, aerial manipulation, and image generation control using counterfactual methods.
🎯 My short-term goal (next 12 months) is to grow into a reliable and qualified researcher and obtain a PhD position for Fall 2026.
🎯 My medium-term goal (2–3 years) is to publish interesting, high-impact work and collaborate with many outstanding researchers.
🎯 My long-term goal is to enable machines to understand and reason about causal relations in the physical world through vision, prior knowledge, and physics-aware intelligence.
🎼 Here are some demos of selected works:
Education
University of Maryland, College Park
University of Wisconsin, Madison
ShanghaiTech University
Research Experience
- Egocentric: First-person AR scene world model (08/2025 – Present)
  Intern · Advised by Prof. Chuang Gan, mentored by Jiaben Chen, UMass Amherst
  - Extending EgoGen with hand–object interaction for first-person AR video generation.
  - Designing and implementing an egocentric data pipeline in Blender and Unreal with automated synthesis.
- Physics-based Motion Video Generation (07/2025 – Present)
  Intern · Advised by Prof. Chuang Gan, mentored by Jiaben Chen, UMass Amherst
  - Extending physics-based humanoid control frameworks (InterMimic, PHC) to human-object interaction for motion video generation.
  - Constructed a novel dataset for video-motion integration in Unreal Engine 5 using the Bedlam framework.
- Structural Causal Model-Based Diffusion (03/2025 – Present)
  Advised by Prof. Abdirisak Mohamed, University of Maryland
  - Removed restrictive assumptions in counterfactual backtracking methods.
  - Developing causally consistent image editing in diffusion models.
- Tool-Oriented Prompt Injection Attacks on LLM Agents (05/2025 – 06/2025)
  Advised by Udari Madhushani Sehwag
  - Extended multi-agent frameworks (AutoGen, AgentDojo) to support tool injection attack scenarios.
- Multimodal Extraction of Genealogy Images (09/2023 – 05/2024)
  Research Assistant · Advised by Prof. Haipeng Zhang, ShanghaiTech University
  - Built a large-scale multimodal genealogy dataset (2.8 TB) for sociological analysis.
  - Applied OCR, vision models, and LLMs to analyze historical demographic patterns.
Professional Experience
- Assistant Data Engineer · Glodon, Shanghai (01/2024 – 07/2024)
  - Enhanced CV-based safety monitoring with YOLO and Faster R-CNN.
  - Developed a scalable synthetic data pipeline with Blender, 3D cloud models, and OpenCV.
- Security Engineer Intern · NSFOCUS, Shanghai (06/2022 – 08/2022)
  - Implemented security data verification and communication inspection.
Other things about me (🚫 Research)
I love teaching, and I am currently working as a TA for the course ‘Decision Making for Information Science’ at UMD.
I love sports🏸🏓🎾. Though I am not playing 🏀 right now, I was a basketball athlete in my teenage years. At that time, I also won first prize in the 1500-meter race in Changning District, Shanghai.
🎤 I sang in choirs in primary school, high school, and my undergraduate university.