CV
Maowei Jiang / 蒋茂苇
Email: jmw24@mails.tsinghua.edu.cn
GitHub: Zero-coder
Google Scholar: Profile
Research Interests
Large Language Models, Multimodal Learning, Vision-Language-Action Models, Robot Learning, Embodied World Models, Reinforcement Learning, Policy Optimization, Long-Horizon Decision Making, LLM/VLM Agents.
Education
- Tsinghua University, School of Future Human Habitats, 2024.09 - Present
Research focus: Large models, agents, embodied intelligence.
- University of Chinese Academy of Sciences / Shenyang Institute of Automation, Chinese Academy of Sciences, 2021.08 - 2024.06
Research focus: Deep learning and intelligent perception.
- Mianyang Teachers’ College, School of Mechanical and Electrical Engineering, 2017.07 - 2021.06
Research focus: Electrical engineering and embedded automation.
Selected Publications and Submissions
TAPO: Dynamic Teacher and Perturbed Answer Injection for Policy Optimization
AAAI 2026 Oral. First author. LLM policy optimization with dynamic teacher signals and perturbed answer injection.
FutureVLA: Acting on Predicted Futures with Vision-Language-Action Models
NeurIPS 2026 under review. First author. Future-conditioned VLA framework connecting visual world-model prediction and robot action generation.
ReCon: Reference-Conditioned Online Refinement for Vision-Language-Action Policies
NeurIPS 2026 under review. First author. Online residual correction for frozen VLA policies on real-robot contact-rich tasks.
Prompt2Act: Mapping Prompts into Sequence of Robotic Actions with Large Foundation Models
Information Fusion, IF 15.5, Q1 Top, CCF-B. First author. Natural-language prompts to executable robot action sequences.
FDVLA: A Flow-Diffusion Vision-Language-Action Framework with Dual Reasoning Modulation
Information Fusion under review. First author. Flow-diffusion VLA modeling and reasoning modulation for robot manipulation.
RL2VLA: Reinforcement Learning Fine-tuning for Vision-Language-Action Models
ACM MM 2026 under review. First author. Reinforcement learning fine-tuning for VLA models.
DAAC: Discrepancy-Aware Adaptive Contrastive Learning
NeurIPS 2025. Co-first / project lead. Adaptive contrastive learning under distribution discrepancy.
CARD / MRED-14 / GreenPlanner
NeurIPS 2024 Workshop on Open-World Agents; ACM MM; CVPR. Contributions to cross-modal agents, generative residential design, and constraint-aware layout generation.
Research and Engineering Experience
- Embodied AI and VLA: Future-conditioned VLA, online policy refinement, prompt-to-action generation, robot action representation, contact-rich real-robot tasks, long-horizon decision making.
- LLM / Agent / RL: SFT, LoRA, GRPO, PPO, SAC, reward design, reward hacking analysis, policy optimization, agent evaluation, credit assignment.
- Multimodal Learning: LLaMA / LLaVA-style fine-tuning, modality alignment, multimodal representation learning, visual grounding, future-state prediction.
- Engineering: Python, PyTorch, Transformers, DeepSpeed, TRL / VERL, Docker, Git, Linux, multi-GPU training, reproducible experiment management.
Open Source
- Awesome-LLM-Robotics: contributor to a 4.4k+ star resource list for LLM/MLLM + Robotics/RL.
- Second-Me: contributor to a 15.5k+ star personalized AI self system.
- Prompt2Act, RL2VLA, FDVLA: research repositories for VLA and robot action generation.
- Zero-coder: 98 public repositories covering LLMs, VLA, multimodal learning, and computer vision.
Honors and Competitions
- Kaggle CMI Child Mind Institute: Silver Medal, global rank 75 / 1878.
- Huawei Ascend AI Innovation Competition: Excellent Solution Award.
- BMW Hackathon: Finalist / second place.
- Alibaba Tianchi Few-shot Trademark Detection: Global rank 239 / 2135.
- Asia-Pacific Ophthalmology Big Data Competition: Global rank 142 / 10006.
