Yifei Xu




E-mail: yifei.xu@weride.ai

Cell Phone: +1 (424) 535-6710

Personal Website: https://yfxu.top

Research Interests
Autonomous Driving, Generative Models, Reinforcement Learning
Education
2017.09 - 2022.03
  • University of California, Los Angeles, United States
  • Ph.D. in Statistics (Youngest Ph.D. awardee in UCLA class 2022)
  • Working at the Center for Vision, Cognition, Learning, and Autonomy (VCLA)
  • Advisor: Prof. Ying Nian Wu
  • GPA: 3.99 / 4.0
2013.09 - 2017.06
  • Shanghai Jiao Tong University, China
  • B. S. Eng. in Computer Science (Top 1% degree thesis in class 2017)
  • ACM Honor Class, Zhiyuan College (a pilot CS class in China)
  • GPA: 3.83 / 4.0 (Major GPA)
2016.07 - 2016.09
  • University of California, Los Angeles, United States
  • Cross-disciplinary Scholars in Science and Technology Program
  • Department of Statistics
  • GPA: 4.0 / 4.0
2017.09 - 2022.03

The Center for Vision, Cognition, Learning, and Autonomy (VCLA)

The Center for Vision, Cognition, Learning, and Autonomy (VCLA) is affiliated with the Departments of Statistics and Computer Science at UCLA. We started from Computer Vision and have expanded to other disciplines. Our objective is to pursue a unified framework for representation, learning, inference, and reasoning, and to build intelligent computer systems for real-world applications.

Professor Ying Nian Wu

Professor Wu is a professor of Statistics at the University of California, Los Angeles. He is interested in statistical modeling, computing, and learning, in particular generative models and unsupervised learning.

  • GPA: 3.99 / 4.0 (First 4 Years)
  • Major courses with grade A+ / A:
    • Applied Probability
    • Statistical Programming
    • Computer Vision and Pattern Recognition
    • Modeling and Learning
    • Matrix Algebra and Optimization
    • Modeling and Learning in Vision
    • Advanced Modeling and Inference
    • Non-parametric Models
    • Probabilistic Programming
    • Statistical Computing and Inference
    • Machine Learning in Natural Language Processing
    2013.09 - 2017.06

    Shanghai Jiao Tong University

    B. S. Eng. in Computer Science, Zhiyuan College

    SJTU Excellent Bachelor's Degree Thesis (Top 1% in 3600 Undergraduates)

    ACM Honor Class

    The ACM Honor Class is a pilot computer science class in China.

    Over the past 10 years, ACM Class students have received hundreds of honors and awards; they won the ACM International Collegiate Programming Contest (ICPC) World Finals three times, in 2002, 2005, and 2010.

    ACM Class students have published more than 40 academic papers as first authors at NIPS, WWW, SIGIR, SIGMOD, SIGKDD, ICML, AAAI, and other major international conferences and journals.

    Zhiyuan College

    Zhiyuan College, within Shanghai Jiao Tong University, is an institute that provides an elite education for its students. It aims to train them to become future leaders in science and technology.

    To be admitted to Zhiyuan College, a student must rank at the top of SJTU's more than 17,000 undergraduate students. Currently, 461 students are enrolled in Zhiyuan College.

    By September 2016, 359 students had graduated from Zhiyuan College: 327 (91%) went on to further studies, 273 (76%) were admitted by universities in the top 100 of the QS World University Rankings 2016, and 250 (70%) went on to pursue Ph.D. degrees.

    Shanghai Jiao Tong University

    Shanghai Jiao Tong University (SJTU), one of the higher-education institutions that enjoy a long history and a world-renowned reputation in China, is a key university directly under the administration of the Ministry of Education (MOE) of the People's Republic of China, co-constructed by the MOE and the Shanghai Municipal Government. SJTU has become a comprehensive, research-oriented, and internationalized top university in China.

    GPA: 3.83 / 4.0 (Major GPA) (A+ = 4.3)

    Major courses with grade A+ / A:
    • Programming
    • Linear Algebra
    • Mathematical Analysis
    • University Physics
    • Science and Technology Innovation
    • Computer Architecture
    • Computer System
    • Course Design on Computer System
    • Computing Complexity
    • Machine Learning (Including Statistics)
    • Natural Language Processing
    • Database Systems
    • Lab Practice
    2016.07 - 2016.09
    • University of California, Los Angeles, United States
    • Department of Statistics

    Cross-disciplinary Scholars in Science and Technology Program

    The CSST office administers the CSST Summer Program, which brings outstanding third-year undergraduate students interested in Ph.D. studies, nominated by top-tier universities in the People's Republic of China (PRC) and Japan, to conduct 10 weeks of intensive research training with UCLA faculty mentors. This 10-week program offers emerging scholars premier research training in a cutting-edge scientific environment that fosters cross-disciplinary collaborations.

    GPA: 4.0 / 4.0

    Courses with grade A+ / A:
    • CSST Project
    • Directed Research
    Research Experience

    Center for Vision, Cognition, Learning and Autonomy

    University of California, LA

    2016.07 - 2022.03
    Advisor: Ying Nian Wu

    • Inverse Reinforcement Learning with Energy-based Models
    • Combined model-based and model-free approaches
    • Applied to various tasks in reinforcement learning and optimal control
    • Learning Generative ConvNet with Continuous Latent Factors
    • Model: a non-linear generalization of factor analysis where the mapping is parametrized by a CNN
    • Optimized image-synthesis training on large-scale images with batch normalization
    • Inferred latent factors by gradient descent / Langevin dynamics via alternating back-propagation
    • Generative Hierarchical Structure Learning of Sparse FRAME Models
    • Model: Sparse FRAME, a multi-layer probability model that captures part deformation
    • Designed experiments for the Sparse FRAME model on detection and clustering
    • Compared the Sparse FRAME model with DPM and And-Or Graphs on point-, part-, and object-level detection

    Cognitive Computing Lab --- Research Intern (part-time)

    Baidu Research USA

    2021.09 - 2022.03
    Advisor: Ping Li

    • Designed a unified framework for energy-based models on PaddlePaddle and PyTorch
    • Supported multiple data pipelines: texture, image, voxel, point cloud, etc.
    • Supported multiple EBM training methods: short-run MCMC, ABP, ABP through OT, etc.

    Creative Vision team --- Research Intern

    Snap Research

    2021.06 - 2021.09
    Advisor: Sergey Tulyakov

    • Energy-based Implicit Function for 3D Shape Representation
    • Model: an energy-based model that represents objects in 3D space
    • Improved generative capability by combining a VAE with the EBM
    • Better versatility and easier preprocessing compared to DeepSDF

    Decision Intelligence Lab --- Research Intern

    Alibaba DAMO Academy USA

    2020.07 - 2020.09
    Advisor: Jingqiao Zhang

    • SAS: Self-Augmented Strategy for Language Model Pre-training
    • Model: a self-generating strategy for contextualized data augmentation without a separate generator
    • Researched unsupervised pre-training of Transformers for NLP
    • Outperformed the state-of-the-art ELECTRA with 30% less computation cost

    Research --- Research Intern

    Hikvision Research USA

    2019.06 - 2019.09
    Advisor: Jianwen Xie

    • Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets
    • Model: an energy-based model applied to 3D point-cloud generation via Langevin dynamics
    • Compared against PointFlow, PointGlow, autoencoders, etc.

    Planning Group --- Research Intern

    Isee Inc.

    2018.07 - 2018.09
    Advisor: Chris Baker

    • Continuous Inverse Optimal Control via Langevin sampling to learn trajectory prediction
    • Model: a sample-based inverse reinforcement learning model driven by an energy-based model and Langevin dynamics
    • The energy function combines a hand-crafted cost function with a neural network
    • Langevin dynamics generates trajectories by sampling from the energy-based distribution

    Computer and Machine Intelligence Lab

    Shanghai Jiao Tong University

    2015.07 - 2017.06
    Advisor: Liqing Zhang

    • Large-scale Image Retrieval Competition
    • Model: a pipeline with saliency detection, image classification, and image retrieval
    • Implemented saliency detection combining dense and sparse reconstruction via Bayesian integration
    • Classified large-scale images with SVMs and Convolutional Neural Networks
    • Interactive Image Search for Clothing Recommendation
    • Model: Hybrid Topics Model, an LDA-based model integrating both visual and textual information
    • Used multi-trained Fast-RCNN to localize regions
    • Extracted 3 types of visual descriptors: HOG, LBP, and color
    • Implemented the Hybrid Topics Model and introduced a demand-adaptive retrieval strategy

    Visual Computing Lab --- Research Intern

    Microsoft Research in Asia

    2016.09 - 2017.02
    Advisor: Fang Wen

    • Joint Face Detection and Alignment via Cascaded Compositional Learning
    • Joined cascaded face detection and alignment through an advanced boosting algorithm
    • Considered multiple domains to handle unconstrained face data
    • Trained multiple domains on the same random forest, with detection and alignment in parallel

    Center for Vision, Cognition, Learning and Autonomy

    University of California, LA

    2016.07 - 2016.09
    Advisor: Ying Nian Wu

    Research Intern

    Learning Generative ConvNet with Continuous Latent Factors

    • Model: a non-linear generalization of factor analysis where the mapping is parametrized by a CNN
    • Optimized image-synthesis training on large-scale images with batch normalization
    • Inferred latent factors by gradient descent / Langevin dynamics via alternating back-propagation

    Paper Abstract

    This paper proposes an alternating back-propagation algorithm for learning the generator network model. The model is a non-linear generalization of factor analysis. In this model, the mapping from the latent factors to the observed vector is parametrized by a convolutional neural network. The alternating back-propagation algorithm iterates between the following two steps: (1) Inferential back-propagation, which infers the latent factors by Langevin dynamics or gradient descent. (2) Learning back-propagation, which updates the parameters given the inferred latent factors by gradient descent.
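The two alternating steps can be illustrated with a deliberately tiny NumPy sketch. Note that this is only a toy: the paper's ConvNet generator is replaced by a linear map g(z; W) = Wz, and all sizes, learning rates, and iteration counts below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Toy alternating back-propagation (ABP): the paper's ConvNet generator
# is replaced by a linear map g(z; W) = W @ z for readability.
rng = np.random.default_rng(0)
d_x, d_z, n = 5, 2, 200
X = rng.normal(size=(d_x, d_z)) @ rng.normal(size=(d_z, n))  # synthetic data

W = 0.1 * rng.normal(size=(d_x, d_z))  # generator parameters
Z = np.zeros((d_z, n))                 # latent factors, one column per example
lr_z, lr_w = 0.05, 0.05

mse0 = np.mean((X - W @ Z) ** 2)
for _ in range(300):
    # (1) Inferential back-propagation: gradient step on log p(z | x);
    #     adding Gaussian noise here would turn it into Langevin dynamics.
    Z += lr_z * (W.T @ (X - W @ Z) - Z)
    # (2) Learning back-propagation: gradient step on log p(x | z; W)
    #     with the inferred latent factors held fixed.
    W += lr_w * ((X - W @ Z) @ Z.T) / n
mse = np.mean((X - W @ Z) ** 2)        # reconstruction error shrinks
```

In the paper both steps back-propagate through the same ConvNet; the linear stand-in only keeps the alternating structure visible.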

    • The project page : Link
    • The paper online : Link *My name is listed in the Acknowledgements
    • The poster : Link
    • The presentation : Link

    Generative Hierarchical Structure Learning of Sparse FRAME Models

    • Model: Sparse FRAME, a multi-layer probability model that captures part deformation
    • Designed experiments for the Sparse FRAME model on detection and clustering
    • Compared the Sparse FRAME model with DPM and And-Or Graphs on point-, part-, and object-level detection

    Paper Abstract

    This paper proposes a framework for generative learning of hierarchical structure of visual objects, based on training hierarchical random field models. The resulting model, which we call structured sparse FRAME model, is a straightforward variation on decomposing the original sparse FRAME model into multiple parts that are allowed to shift their locations, orientations and scales, so that the resulting model becomes a reconfigurable template.

    • The project page : Link
    • The paper online : Link
    • The poster : Link

    Cognitive Computing Lab --- Research Intern (part-time)

    Baidu Research USA

    2021.09 - 2022.03
    Advisor: Ping Li

    • Designed a unified framework for energy-based models on PaddlePaddle and PyTorch
    • Supported multiple data pipelines: texture, image, voxel, point cloud, etc.
    • Supported multiple EBM training methods: short-run MCMC, ABP, ABP through OT, etc.

    Creative Vision team --- Research Intern

    Snap Research

    2021.06 - 2021.09
    Advisor: Sergey Tulyakov

    • Energy-based Implicit Function for 3D Shape Representation
    • Model: an energy-based model that represents objects in 3D space
    • Improved generative capability by combining a VAE with the EBM
    • Better versatility and easier preprocessing compared to DeepSDF

    Decision Intelligence Lab

    Alibaba DAMO Academy USA

    2020.07 - 2020.09
    Advisor: Jingqiao Zhang

    Research Intern

    SAS: Self-Augmented Strategy for Language Model Pre-training

      • A novel generating strategy for contextualized data augmentation
      • Researched unsupervised pre-training of Transformers for NLP
      • Outperformed the state-of-the-art ELECTRA with 30% less computation cost

    Paper Abstract

    The core of a self-supervised learning method for pre-training language models includes the design of appropriate data augmentation and corresponding pre-training task(s). Most data augmentations in language model pre-training are context-independent. The seminal contextualized augmentation recently proposed by the ELECTRA requires a separate generator, which leads to extra computation cost as well as the challenge in adjusting the capability of its generator relative to that of the other model component(s). We propose a self-augmented strategy (SAS) that uses a single forward pass through the model to augment the input data for model training in the next epoch. Essentially our strategy eliminates a separate generator network and uses only one network to generate the data augmentation and undertake two pre-training tasks (the MLM task and the RTD task) jointly, which naturally avoids the challenge in adjusting the generator's capability as well as reduces the computation cost. Additionally, our SAS is a general strategy such that it can seamlessly incorporate many new techniques emerging recently or in the future, such as the disentangled attention mechanism recently proposed by the DeBERTa model. Our experiments show that our SAS is able to outperform the ELECTRA and other state-of-the-art models in the GLUE tasks with the same or less computation cost.

    • The paper online : Arxiv

    Research Group

    Hikvision Research USA

    2019.06 - 2019.09
    Advisor: Jianwen Xie

    Research Intern

    Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets

    • Model: an energy-based model applied to 3D point-cloud generation
    • Compared against PointFlow, PointGlow, autoencoders, etc.

    Paper Abstract

    We propose a generative model of point clouds in the forms of an energy-based model, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network. The energy function learns a coordinate encoding of each point and then aggregates all individual point features into an energy for the whole point cloud. We show that our model can be derived from the discriminative PointNet. The model is trained by MCMC-based maximum likelihood learning (as well as its variants), without the help of any assisting networks like those in GANs and VAEs. Our model does not rely on hand-crafting distance metric for point clouds in generation. It synthesizes point clouds that match to the observed examples. The learned point cloud representation can be useful for point cloud classification. Experiments demonstrate the advantages of the proposed model. Furthermore, we can learn a short-run MCMC toward the energy-based model as a flow-like generator for point cloud reconstruction and interpretation.
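The MCMC-based maximum likelihood learning described above can be illustrated in one dimension, with a toy quadratic energy standing in for the point-cloud network; the energy form, step sizes, and chain lengths below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Toy MCMC-based maximum likelihood for a 1-D energy-based model with
# E(x; theta) = theta * x**2 / 2, i.e. a Gaussian with precision theta.
rng = np.random.default_rng(0)
data = rng.normal(scale=0.5, size=4000)  # true precision is 1 / 0.5**2 = 4

theta, lr, eps = 1.0, 0.5, 0.1
for _ in range(200):
    # Short-run Langevin chains, initialized from noise, sample the model
    x = rng.normal(size=1000)
    for _ in range(100):
        x = x - 0.5 * eps**2 * theta * x + eps * rng.normal(size=x.shape)
    # Log-likelihood gradient: E_model[dE/dtheta] - E_data[dE/dtheta]
    theta += lr * (np.mean(x**2) - np.mean(data**2)) / 2
```

With the true precision equal to 4, the learned theta settles near 4; the short, noise-initialized chains leave a small bias, which is the price of the short-run MCMC variant.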

    • The paper online : Arxiv

    Planning Group

    Isee Inc.

    2018.07 - 2018.09
    Advisor: Chris Baker

    Research Intern

    Continuous Inverse Optimal Control by Energy-based Model

    • Model: a sample-based inverse reinforcement learning model driven by an energy-based model and Langevin dynamics
    • The energy function combines a hand-crafted cost function with a neural network
    • Langevin dynamics generates trajectories by sampling from the energy-based distribution

    Paper Abstract

    Autonomous driving is a challenging multiagent domain which requires optimizing complex, mixed cooperative-competitive interactions. Learning to predict contingent distributions over other vehicles' trajectories simplifies the problem, allowing approximate solutions by trajectory optimization with dynamic constraints. We take a model-based approach to prediction, in order to make use of structured prior knowledge of vehicle kinematics, and the assumption that other drivers plan trajectories to minimize an unknown cost function. We introduce a novel inverse optimal control (IOC) algorithm to learn other vehicles' cost functions in an energy-based generative model. Langevin Sampling, a Monte Carlo based sampling algorithm, is used to directly sample the control sequence. Our algorithm provides greater flexibility than standard IOC methods, and can learn higher-level, non-Markovian cost functions defined over entire trajectories. We extend weighted feature-based cost functions with neural networks to obtain NN-augmented cost functions, which combine the advantages of both model-based and model-free learning. Results show that model-based IOC can achieve state-of-the-art vehicle trajectory prediction accuracy, and naturally take scene information into account.
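The Langevin sampling step at the core of this approach can be shown with a one-dimensional toy energy; the energy, step size, and chain length here are illustrative assumptions, not the paper's trajectory model.

```python
import numpy as np

# Langevin dynamics draws samples from p(x) ∝ exp(-E(x)); here
# E(x) = x**2 / 2, so the stationary distribution is roughly N(0, 1).
rng = np.random.default_rng(0)

def grad_E(x):
    return x  # gradient of the toy quadratic energy

eps = 0.1                        # Langevin step size (illustrative)
x = 2.0 * rng.normal(size=5000)  # over-dispersed initialization
for _ in range(500):
    x = x - 0.5 * eps**2 * grad_E(x) + eps * rng.normal(size=x.shape)

mean, var = x.mean(), x.var()    # ≈ 0 and ≈ 1 after burn-in
```

In the paper the same update runs over entire control sequences, with the energy given by the NN-augmented cost function.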

    • The paper online : Link

    Computer and Machine Intelligence Lab

    Shanghai Jiao Tong University

    2015.07 - 2017.06
    Advisor: Liqing Zhang

    Research Assistant


    Large-scale Image Retrieval Competition

    • Model: a pipeline with saliency detection, image classification, and image retrieval
    • Implemented saliency detection combining dense and sparse reconstruction via Bayesian integration
    • Classified large-scale images with SVMs and Convolutional Neural Networks

    This was a competition held by Alibaba: given a query picture, output the most similar picture from a database of one million web images. Our model had three parts: saliency detection, CNN classification, and text matching. I was in charge of saliency detection and classification. Our team ranked in the top 16 out of over 2,000 teams.

    Interactive Image Search for Clothing Recommendation

    • Model: Hybrid Topics Model, an LDA-based model integrating both visual and textual information
    • Used multi-trained Fast-RCNN to localize regions
    • Extracted 3 types of visual descriptors: HOG, LBP, and color
    • Implemented the Hybrid Topics Model and introduced a demand-adaptive retrieval strategy

    Paper Abstract

    This paper proposes a novel approach to meet users' multi-dimensional requirements in clothing image retrieval. We propose the Hybrid Topic (HT) model to learn the intricate semantic representation of the descriptors above. The model provides an effective multi-dimensional representation of clothes and is able to perform automatic image annotation by probabilistic reasoning from image search. Furthermore, we develop a demand-adaptive retrieval strategy which refines users' specific requirements and removes users' unwanted features. Our experiments show that the HT method significantly outperforms the deep neural network methods.

    Visual Computing Lab

    Microsoft Research in Asia

    2016.09 - 2017.02
    Advisor: Fang Wen

    Research Intern


    Joint Face Detection and Alignment via Cascaded Compositional Learning

    • Joined cascaded face detection and alignment through an advanced boosting algorithm
    • Considered multiple domains to handle unconstrained face data
    • Trained multiple domains on the same random forest, with detection and alignment in parallel

    This work builds on "Joint Cascade Face Detection and Alignment" and "Unconstrained Face Alignment via Cascaded Compositional Learning". We aimed to add domain partitioning to the joint cascade face detection and alignment method.

    Publication
    2022.03

    Yifei Xu, Zeng Huang, Ying Nian Wu, Sergey Tulyakov, "Energy-based Implicit Function for 3D Shape Representation", in review.

    2023.02

    Jianwen Xie, Yaxuan Zhu, Yifei Xu, Dingcheng Li, Ping Li, "Generative Learning with Latent Space Flow-based Prior Model", in Proc. 37th AAAI Conference on Artificial Intelligence (AAAI), 2023.

    2022.02

    Yifei Xu, Jingqiao Zhang, Ru He, Liangzhu Ge, Chao Yang, Cheng Yang, Ying Nian Wu, "SAS: Self-Augmented Strategy for Language Model Pre-training", in Proc. 36th AAAI Conference on Artificial Intelligence (AAAI), 2022.

    2021.06

    Yifei Xu, Jianwen Xie, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu, "Generative PointNet: Deep Energy-Based Learning on Point Sets for 3D Generation and Reconstruction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

    2020.02

    Yifei Xu, Jianwen Xie, Tianyang Zhao, Chris Baker, Yibiao Zhao, Ying Nian Wu, "Energy-based Continuous Inverse Optimal Control", IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022; NeurIPS Workshop on Machine Learning for Autonomous Driving, 2020.

    2018.11

    Tianyang Zhao, Yifei Xu, Mathew Monfort, Wongun Choi, Chris Baker, Yibiao Zhao, Yizhou Wang, Ying Nian Wu, "Convolutional Spatial Fusion for Multi-Agent Trajectory Prediction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

    2017.03

    Jianwen Xie, Yifei Xu, Erik Nijkamp, Ying Nian Wu, Song-Chun Zhu, "Generative Hierarchical Structure Learning of Sparse FRAME Models", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

    2016.10

    Zhengzhong Zhou, Yifei Xu, Jingjin Zhou, and Liqing Zhang, "Interactive Image Search for Clothing Recommendation", in Proc. 24th ACM International Conference on Multimedia (MM), 2016.

    Honors and Awards
    2014-2016

    Academic Excellence Scholarship at SJTU, Prizes B, C, B (top 10%, 20%, 10% in the university)

    2016.04

    Interdisciplinary Contest in Modeling 2016, Meritorious Winner

    2016.07

    UCLA CSST Scholarship and CSST Award (2 in CSST Program CS Major)

    2016.10

    'ele' Scholarship for outstanding CS students (6 per year in the university)

    2016.12

    'YuanKang' Scholarship for outstanding research (5 per year in the university)

    2017.06

    SJTU Excellent Bachelor's Degree Thesis (Top 1% in 3600 Undergraduates)

    Project Experience
    AI
    • "FishTank" Game AI
    • "Texas Hold'em" Game AI
    System
    • Compiler for simplified C
    • Simulated Pipeline CPU
    • C++ STL Container
    • Virus for Linux
    • SQL System
    Web Dev.
    • Bookex System
    • ACM New Website
    • POI System
    Machine Learning
    • Trajectory Compression
    • Implicit Discourse Parsing
    • Multi-label Text Classification
    • ML Toolkit (Regression, Clustering, Boosting...)
    • Clustering / Monte Carlo Algorithm
    • Generative Model (VAE, GAN, DCGAN, ...)
    • Descriptive Model (EBM, ABP, ...)
    • EM Algorithm
    2013.12
    AI "FishTank" Game AI
    C++

    Project of "Programming"

    2014.08
    AI "Texas Hold'em" Game AI
    C++

    Project of "Programming Practice"

    2014.06
    System C++ STL Container
    C++

    Project of "Data Struct" which include AVL tree, Hashmap, Linklist, etc.

    2015.04
    System Compiler for simplified C
    Java

    A compiler that transforms C code into MIPS assembly.

    2015.06
    System Simulated Pipeline CPU
    Verilog

    Simulates the execution of MIPS code on a Verilog simulator.

    2015.10
    System Virus for Linux
    Linux C

    A virus that runs on Linux to gain superuser privileges.

    2016.05
    System SQL System
    C++

    A SQL System.

    2014.03
    Web Dev. Bookex System (Part)
    Html PHP javascript

    A recommendation system for a second-hand book market.

    2014.08
    Web Dev. ACM New Website
    Html PHP javascript

    A new, responsive website for the ACM Class.

    2016.05
    Web Dev. POI System
    Html JSP Javascript

    A Yelp-like website.

    2015.08
    ML Trajectory Compression
    C++

    Compresses trajectories with lossless and lossy methods.

    2016.06
    ML Implicit Discourse Parsing
    Python Matlab

    The implementation of "Recognizing Implicit Discourse Relations in the Penn Discourse Treebank".

    2018.12
    ML Machine learning Toolkit
    Python R

    Multiple machine learning algorithms, including regression, clustering, boosting, MCMC, VAE, GAN, EBM, EM, ...

    2019.06
    ML Multi-label Text Classification
    Python

    Multi-label text classification via ELMo and label attention

    2017-2019
    ML Advanced ML network implementation
    Python

    Including clustering algorithms, Monte Carlo algorithms, VAE, GAN, DCGAN, EBM, ABP, the EM algorithm, etc.

    Extracurricular Experience
    Extracurricular
    • 2013-14: Member of the Zhiyuan Pandeng (leadership) Project
    • 2015-16: Minister of the college publicity center
    • 2014-17: Publicity commissary and vice monitor of the ACM 2013 class
    Teaching Assistant
    • 2015-16: Data Structure
    • 2018, 2019 Fall: Statistical Programming (STAT 202A)
    • 2019, 2020 Winter: Methods of Machine Learning (STAT 231B)
    • 2019, 2020 Spring: Introduction to Probability (STAT 100A)
    • 2020, 2021 Spring: Machine Learning (STAT 413)
    • 2020 Fall: Pattern Recognition and Machine Learning (STAT M231A)
    • 2021 Winter: Matrix Algebra and Optimization (STAT 202B)
    • 2021 Winter: Introduction to Computational Statistics with R (STAT 102A)
    Reviewer
    • 2019, 2020, 2021, 2022: Conference on Computer Vision and Pattern Recognition (CVPR)
    • 2020, 2021: Conference on Neural Information Processing Systems (NeurIPS)
    • 2021, 2022: AAAI Conference on Artificial Intelligence (AAAI)
    • 2021: The 24th International Conference on Artificial Intelligence and Statistics (AISTATS)
    • 2022: International Conference on Learning Representations (ICLR)