Call for Papers

Submit your research on Agentic AI for Visual Media and contribute to the advancement of autonomous visual content creation.

Submission Guidelines

We invite submissions to the Workshop on Agentic AI for Visual Media, which will be held in conjunction with CVPR 2026. This workshop focuses on the emerging intersection of computer vision, multimodal large language models, and agentic AI systems.

The goal is to explore how autonomous, tool-using AI agents can perform complex image and video processing, editing, generation, and evaluation tasks in real-world workflows.

Paper Tracks

Long Papers

Length: Up to 8 pages (excluding references)
Content: No appendix or supplementary material
Proceedings: Included in official CVPR proceedings
Scope: Full research contributions

Extended Abstracts

Length: Up to 4 pages (excluding references)
Content: No appendix or supplementary material
Proceedings: Not included in official proceedings
Scope: Work-in-progress and preliminary results

Topics of Interest

We invite contributions on (but not limited to):

Image/Video Processing and Generation — Restoration, enhancement, editing, synthesis, and neural rendering for visual media, including video production, storyboarding, and storytelling.
Generative AI for Creative Visual Media — Foundation models (GANs, diffusion, multi-modal LLMs) and specialized generative models (fine-tuned models, fine-tuning methods) for content creation, customization, and large-scale deployment.
Image/Video Understanding and Representation — Captioning, summarization, quality and aesthetic assessment, semantic understanding, alignment with human perception, and the use of LLMs as judges for evaluation and decision support.
LLM/VLM Post-Training and Adaptation — Knowledge editing and injection, supervised fine-tuning, reinforcement learning based post-training, agent-oriented training strategies, construction and curation of training data, and approaches for learning from raw experience, with a focus on enabling more adaptive and trustworthy multi-modal agents.
Agentic AI Foundations — Agent architectures, orchestration frameworks, multi-modal reasoning, tool use, and adaptive workflow design.
Multi-Agent Systems — Collaboration, coordination, communication, and reinforcement learning methods for multi-agent and distributed settings.
Human–Agent Collaboration — Interactive interfaces, trust, transparency, controllability, and socially responsible design of agentic systems.
Benchmarks, Datasets, and Evaluation — Standardized datasets, evaluation metrics, and protocols for multi-step, agent-driven workflows, including quality, efficiency, and safety.
Agent Applications and Deployment — Embodied agents, OS/desktop agents, creative copilots, GUI agents, and integration in media production pipelines.
Infrastructure and Tooling — Real-time systems, scalable deployment, integration with APIs and existing software ecosystems, and resource-efficient implementations.
Trust, Safety, and Ethics — Reliability, accountability, fairness, and responsible deployment of agentic AI systems for visual media.

Submission Requirements

Important Notes:

Submissions must use the CVPR 2026 Author Kit (LaTeX/Word)
Follow all CVPR 2026 author instructions and submission policies
All submissions must be anonymized for double-blind review
Papers accepted to the CVPR 2026 main conference may be submitted to the Extended Abstracts track
Dual submission to another CVPR 2026 workshop is not permitted

Submit Paper

Submit your paper through OpenReview

Important Dates

Paper Submission Deadline

20 March 2026

Notification of Acceptance

31 March 2026

Camera-ready Deadline

[Date to be announced]

Workshop Date

03 - 07 June 2026

Contact

For questions about paper submissions, please contact:

Jinjin Gu

Organizer

jinjin.gu [at] insait.ai

Lei Sun

Organizer

lei.sun [at] insait.ai