Microsoft Research Highlights: AI, Weather Prediction, Adaptive Training, and Influencer Industry Insights

Research Focus: Week of October 23, 2023
This post highlights recent work from Microsoft Research spanning multimodal AI, weather prediction, adaptive training, repository-level coding with LLMs, and relational labor in the influencer industry.
NEW RESEARCH: Kosmos-2.5: A Multimodal Literate Model
Challenge: Current large language models (LLMs) primarily process textual information and struggle with visual understanding. Multimodal large language models (MLLMs) aim to bridge this gap by integrating visual and textual data within a single model.
Advancement: Microsoft researchers introduce Kosmos-2.5, an MLLM specifically designed for machine reading of text-intensive images. Unlike previous MLLMs that focused on natural images, Kosmos-2.5 is pre-trained on a large-scale dataset of text-intensive images.
Key Capabilities:
- Spatially-Aware Text Blocks: Assigns spatial coordinates to text blocks within an image.
- Structured Text Output: Captures styles and structures, outputting them in markdown format.
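The two output modes above can be illustrated with a small sketch. It is not Kosmos-2.5's actual output format or API: the `TextBlock` type, its `style` field, and both rendering functions are hypothetical names chosen for illustration, and the coordinates are assumed to be pixel positions of each block's bounding box.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """One block of recognized text (hypothetical structure, not the model's real schema)."""
    text: str
    x0: int  # left edge of the bounding box
    y0: int  # top edge
    x1: int  # right edge
    y1: int  # bottom edge
    style: str = "paragraph"  # assumed labels: "heading", "item", "paragraph"


def reading_order(blocks):
    """Sort blocks top-to-bottom, then left-to-right."""
    return sorted(blocks, key=lambda b: (b.y0, b.x0))


def to_layout_lines(blocks):
    """Spatially-aware output: each line carries its block's coordinates."""
    return [f"[{b.x0},{b.y0},{b.x1},{b.y1}] {b.text}" for b in reading_order(blocks)]


def to_markdown(blocks):
    """Structured output: style labels become markdown markers."""
    prefix = {"heading": "# ", "item": "- ", "paragraph": ""}
    return "\n".join(prefix[b.style] + b.text for b in reading_order(blocks))
```

Calling `to_layout_lines` and `to_markdown` on the same list of blocks shows how a single page can yield either a coordinate-tagged transcript or a markdown document.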
Impact: Kosmos-2.5 can be adapted for various text-intensive image understanding tasks through fine-tuning, paving the way for future scaling of MLLMs.
NEW RESEARCH: Evaluation of Dependency Structure for Multivariate Weather Predictors using Copulas
Context: Climate change is increasing the frequency and severity of extreme weather events, particularly impacting the Global South. Accurate weather forecasting is crucial but challenging due to complex variable interactions.
Approach: Researchers explore vine copulas to model the complex relationships among weather variables. A copula separates each variable's marginal distribution from the dependency structure that links the variables, which allows more flexible modeling in support of risk assessment.
Methodology: Vine copulas, built from various bivariate copulas (Gaussian, Student's t, Clayton, Gumbel, Frank), are effective for high-dimensional problems. The research applies this framework to subseasonal forecasting models to enhance predictions.
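As a rough illustration of a single bivariate building block (not a full vine, and not the researchers' code), the sketch below uses only the Python standard library. It samples from a Clayton copula, applies arbitrary increasing transforms to mimic different marginals, strips the marginals again with a rank transform, and recovers the dependence parameter from Kendall's tau using the standard Clayton relation theta = 2*tau / (1 - tau). All function names are hypothetical.

```python
import math
import random


def pseudo_observations(xs):
    """Rank-transform a sample into (0, 1), discarding its marginal distribution."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def kendall_tau(xs, ys):
    """Kendall's tau by direct pair counting (O(n^2); fine for small samples)."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2), which holds for the Clayton family."""
    return 2.0 * tau / (1.0 - tau)


def sample_clayton(theta, n, rng):
    """Draw n pairs (u, v) from a Clayton copula by conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # in (0, 1], avoids raising 0 to a negative power
        w = 1.0 - rng.random()
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def demo(seed=0, n=800):
    """Sample with theta = 2, distort the marginals, then re-estimate theta."""
    rng = random.Random(seed)
    pairs = sample_clayton(2.0, n, rng)
    xs = [u ** 3 for u, _ in pairs]       # arbitrary increasing transform
    ys = [math.exp(v) for _, v in pairs]  # another increasing transform
    tau = kendall_tau(pseudo_observations(xs), pseudo_observations(ys))
    return tau, clayton_theta_from_tau(tau)
```

Because the rank transform ignores the marginals, the estimate returned by `demo()` depends only on the dependence structure, which is the separation that copula models rely on. A vine copula would chain many such pairwise fits, possibly with a different family for each pair.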
Goal: To improve predictions of weather events and variables by modeling the dependency structure among multivariate weather predictors.
NEW RESEARCH: Adaptive Training System
Concept: Adaptive training adjusts tasks or stimuli based on trainee performance, leading to faster and more effective learning compared to fixed training methods.
Application: Virtual reality (VR) provides new opportunities for adaptive training. By using computational models of the training process, optimal scenario difficulty can be recommended.
Microsoft's Contribution: Researchers propose an adaptive training algorithm that accelerates learning by recommending scenario difficulty on a trial-by-trial basis. The system is applied to pilot training in a VR flight simulator, where scenarios range from easy conditions to challenging ones such as fog and side winds.
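The idea of trial-by-trial difficulty recommendation can be sketched as follows. This is not the algorithm from the paper: it swaps in a simple Elo-style skill estimate and a fixed target success rate, both of which are assumptions, and every name and parameter (`AdaptiveTrainer`, `target`, `lr`) is hypothetical.

```python
import math


def p_success(skill, difficulty):
    """Assumed performance model: logistic in (skill - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))


class AdaptiveTrainer:
    """Recommends the scenario whose predicted success rate is closest to a target."""

    def __init__(self, difficulties, target=0.7, lr=0.5, skill=0.0):
        self.difficulties = list(difficulties)
        self.target = target  # assumed "productive" success rate
        self.lr = lr          # step size of the skill update
        self.skill = skill    # current estimate of the trainee's skill

    def recommend(self):
        """Pick the difficulty for the next trial."""
        return min(
            self.difficulties,
            key=lambda d: abs(p_success(self.skill, d) - self.target),
        )

    def update(self, difficulty, succeeded):
        """Elo-style update: move the skill estimate toward the observed outcome."""
        outcome = 1.0 if succeeded else 0.0
        self.skill += self.lr * (outcome - p_success(self.skill, difficulty))
```

In use, each trial calls `recommend()`, runs the scenario, and passes the outcome to `update()`. A trainee who keeps succeeding is moved toward harder scenarios, and one who fails is moved back toward easier ones.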
Outcome: The system aims to maximize improvements in a trainee's absolute skill level.
NEW RESEARCH: CodePlan: Repository-level Coding using LLMs and Planning
Problem: Software engineering tasks like package migration or bug fixing require editing entire code repositories. While LLM-powered assistants excel at localized coding problems, repository-level tasks are more complex due to interdependencies and repository size.
Solution: Researchers introduce CodePlan, a task-agnostic framework that frames LLM-driven repository-level coding as a planning problem. It synthesizes a multi-step chain of edits, where each step calls an LLM on a single code location with context derived from the entire repository, previous code changes, and task-specific instructions.
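A toy sketch of edits-as-a-plan, under stated assumptions: it is not CodePlan's implementation. The dependency map is assumed to be given, the LLM call is replaced by a hypothetical `edit_fn` stub that reports whether a location changed, and the planner is a plain breadth-first worklist.

```python
from collections import deque


def plan_edits(seed, dependents, edit_fn):
    """Build an ordered chain of edits starting from one seed location.

    seed       -- the code location where the task starts
    dependents -- maps a location to the locations that depend on it (assumed given)
    edit_fn    -- stand-in for an LLM call; receives the location and the edits
                  made so far, and returns True if it changed that location
    """
    plan = []
    queue = deque([seed])
    seen = {seed}
    while queue:
        location = queue.popleft()
        changed = edit_fn(location, list(plan))
        if not changed:
            continue  # nothing changed here, so nothing propagates further
        plan.append(location)
        for dep in dependents.get(location, []):
            if dep not in seen:  # schedule each affected location once
                seen.add(dep)
                queue.append(dep)
    return plan
```

The point of the sketch is that the plan grows as edits are made: a location is only scheduled once something it depends on has actually changed, so the chain of edits adapts to what earlier steps did.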
Evaluation: CodePlan is evaluated on package migration (C#) and temporal code edits (Python), demonstrating stronger alignment with ground truth compared to baseline methods.
NEW ARTICLE: The intimacy triple bind: Structural inequalities and relational labor in the influencer industry
Focus: This article examines the concept of "relational labor" within the influencer industry, where content creators commodify their personalities to build audience intimacy and authentic self-brands.
Key Findings: Drawing on ethnographic research, the article highlights how structural inequalities disproportionately affect marginalized creators. Managing audience relationships is more challenging for them, and they face a heightened risk of trolling and harassment.
Tactics for Creators:
- Focusing on content creation rather than personal branding.
- Using silence to manage interactions with "anti-fans."
- Retreating to private community spaces.
- Disabling public comments.
Research Areas Highlighted:
- Artificial intelligence
- Computer vision
- Ecology and environment
- Human language technologies
- Human-computer interaction
- Mathematics
- Programming languages and software engineering
Related Research Groups:
- General Artificial Intelligence
- Audio and Acoustics Research Group
Related Projects:
- AI-Driven Software Engineering
- Document AI (Intelligent Document Processing)
- Brain-Computer Interfaces
Original article available at: https://www.microsoft.com/en-us/research/blog/research-focus-week-of-october-23-2023/?lang=fr_ca&locale=fr-ca