Microsoft Research: Graphics and Multimedia Advancements in AI

Graphics and Multimedia Research at Microsoft

This page showcases a collection of research publications, projects, and groups within Microsoft Research focused on Graphics and Multimedia. The content is organized by research area (Intelligence, Systems, Theory, and Other Sciences), with an emphasis on Artificial Intelligence, Computer Vision, and Audio & Acoustics.

Key Research Areas:

Intelligence:
- Artificial Intelligence
- Audio & Acoustics
- Computer Vision
- Graphics & Multimedia
- Human-Computer Interaction
- Human Language Technologies
- Search & Information Retrieval

Systems:
- Data Platforms and Analytics
- Hardware & Devices
- Programming Languages & Software Engineering
- Quantum Computing
- Security, Privacy & Cryptography
- Systems & Networking

Theory:
- Algorithms
- Mathematics

Other Sciences:
- Ecology & Environment
- Economics
- Medical, Health & Genomics
- Social Sciences
- Technology for Emerging Markets

Featured Publications and Projects:

Exploring Invariance in Images through One-way Wave Equations: An ICML 2025 publication (October 2024) by Yinpeng Chen et al. that studies invariance in images using one-way wave equations.
BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI: Presented at User Interface Software and Technology (UIST) in October 2024, this work by Shwetha Rajaram et al. explores how generative AI can let end users customize video-conferencing environments.
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing: An ICLR 2025 publication (October 2024) by Kaizhi Zheng et al. on composable 3D room layout editing using LLM-parameterized graph diffusion.
Interactive Multimodal AI Systems (IMAIS): This group focuses on creating interactive systems that blend human and real-world complexity with advanced technology, leveraging multimodal generative AI.
ASL STEM Wiki: A project providing a dataset and benchmark for interpreting STEM articles into American Sign Language.
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling: Presented at AVGenL (ECCV 2024 workshop) in September 2024, this research by Ruihan Yang et al. applies contrastive multi-modal diffusion to video-audio conditional modeling.
Scribble: Auto-Generated 2D Avatars with Diverse and Inclusive Art-Direction: A SIGGRAPH publication from July 2024 by Lohit Petikam et al. on automatically generating 2D avatars with diverse and inclusive art direction.
GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions: This work from Computer Vision and Pattern Recognition 2024 Workshops (June 2024) by Salvatore Esposito et al. presents geometry-aware generative modeling based on signed distance functions.
Kahani: A research prototype for creating stories with visually striking and culturally nuanced images through language descriptions, with a downloadable tool and video available.
What’s Your Story: Ivan Tashev: A podcast episode featuring Ivan Tashev discussing his work in audio signal processing for Microsoft products.

Filtering and Sorting:

The page allows users to refine results by content type (Publications, Videos, Projects, Blog posts, Tools, Events, Groups, Career Opportunities), author, lab (Redmond, Asia, Cambridge, India, etc.), and published date.

Social Media and Engagement:

Links are provided to follow Microsoft Research on X, Facebook, LinkedIn, YouTube, and Instagram, and to subscribe to its RSS feed. Sharing options for X, Facebook, and LinkedIn are also available.

Image:

A prominent image at the top of the content section depicts "Graphics and multimedia" with abstract shapes and a green gradient background.

Navigation and Footer:

The page includes extensive navigation links to various Microsoft Research sections, programs, events, and about pages. The footer contains links to Microsoft products, services, and legal information, including privacy choices and terms of use.