Video | July 20, 2023

NLU Workshop Talk: Towards Practical Use of Large Pre-Trained Language Models: Addressing Errors and Inconsistencies

Authors: Chris Manning (Stanford University)

Related readings and updates.

BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning

May 11, 2026 | Research areas: Computer Vision, Methods and Algorithms

Image captioning is one of the most fundamental tasks in computer vision. Owing to its open-ended nature, it has received significant attention in the era of multimodal large language models (MLLMs). In pursuit of ever more detailed and accurate captions, recent work has increasingly turned to reinforcement learning (RL). However, existing captioning-RL methods and evaluation metrics often emphasize a narrow notion of caption quality, inducing…

Large-Scale High-Quality 3D Gaussian Head Reconstruction from Multi-View Captures

May 8, 2026 | Research area: Computer Vision

We propose HeadsUp, a scalable feed-forward method for reconstructing high-quality 3D Gaussian heads from large-scale multi-camera setups. Our method employs an efficient encoder-decoder architecture that compresses input views into a compact latent representation. This latent representation is then decoded into a set of UV-parameterized 3D Gaussians anchored to a neutral head template. This UV representation decouples the number of 3D Gaussians…