CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models

Kavana Venkatesh*, Connor Dunlop*, Pinar Yanardag
Virginia Tech
*Equal contribution

TL;DR We introduce CREA, the first creative image editing tool using a collaborative multi-agent framework. CREA simulates the human creative process to autonomously perform novel image generation and editing tasks. By coordinating specialized AI agents like a Creative Director, Prompt Architect, and Art Critic, CREA plans, critiques, and refines creative outputs in a disentangled and interpretable way. Beyond static images, CREA supports personalization, user-guided editing, and even creative video generation.

Teaser Image

We introduce CREA, an agentic framework that emulates the human creative process for creative image editing and generation. Our approach is driven by collaborative interactions between specialized agents, such as a Creative Director and an Art Critic, that communicate to refine and enhance creative output. The approach can also be extended to the video domain for creative video generation, and it can be integrated with personalization techniques to further enrich and expand creative workflows.

Abstract

Creativity in AI-generated imagery remains a fundamental challenge, requiring not only the generation of visually compelling content but also the capacity to add novel, expressive, and artistically rich transformations to images. Unlike conventional editing tasks that rely on direct prompt-based modifications, creative image editing demands an autonomous, iterative approach that balances originality, coherence, and artistic intent. To address this, we introduce CREA, a novel multi-agent collaborative framework that mimics the human creative process. Our framework leverages a team of specialized AI agents who dynamically collaborate to conceptualize, generate, critique, and enhance images. Through extensive qualitative and quantitative evaluations, we demonstrate that CREA significantly outperforms state-of-the-art methods in diversity, semantic alignment, and creative transformation. By structuring creativity as a dynamic, agentic process, CREA redefines the intersection of AI and art, paving the way for autonomous AI-driven artistic exploration, generative design, and human-AI co-creation. To the best of our knowledge, CREA is the first work to introduce a disentangled approach and the first to employ an agentic framework for editing tasks.

Explore CREA Images

Explore CREA Videos

To demonstrate the versatility of our agentic framework, we extend CREA to creative video generation using CogVideoX. CREA can generate diverse and highly creative videos.

Explore CREA Edits

Before/after edit pairs: Couch, Guitar, Helmet, Table

Method

Paper Method Diagram

CREA Framework. We propose CREA, a collaborative multi-agent framework designed to emulate the human creative process for image editing and generation. CREA operates through four key stages. (1.a) Pre-Generation Planning: agents such as the Creative Director and Prompt Architect define a creativity blueprint and synthesize a contrastive, high-creativity prompt grounded in six creativity principles (originality, expressiveness, aesthetic appeal, technical execution, unexpected associations, and interpretability & depth). (1.b) Creative Image Generation/Editing: the Generative Executor uses text-to-image diffusion models or ControlNet to either generate new images or perform disentangled edits. (2) Post-Generation Evaluation: an Art Critic agent, powered by a multimodal LLM, scores the output on the creativity dimensions to compute a Creativity Index (CI). (3) Self-Enhancement: the Refinement Strategist suggests targeted improvements based on the critic's feedback, and the agents iterate on the output through collaborative feedback loops, optionally incorporating user guidance. This structured and collaborative workflow enables CREA to generate and refine outputs that are novel, expressive, and artistically coherent, maximizing creativity and diversity.
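
The plan, generate, critique, and refine stages described above can be read as a simple control loop. The following Python sketch is only an illustration of that loop, not the authors' implementation: the agent callables are toy stand-ins for LLM and diffusion calls, the Creativity Index is assumed to be an unweighted mean of the six principle scores (this page does not give the formula), and `ci_threshold` and `max_rounds` are hypothetical parameters.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# The six creativity principles named in the method description.
PRINCIPLES = [
    "originality", "expressiveness", "aesthetic_appeal",
    "technical_execution", "unexpected_associations", "interpretability_depth",
]


@dataclass
class Attempt:
    prompt: str
    image: object
    scores: Dict[str, float]
    ci: float


def creativity_index(scores: Dict[str, float]) -> float:
    # Assumption: unweighted mean over the six principles.
    return mean(scores[p] for p in PRINCIPLES)


def run_creative_loop(
    concept: str,
    plan: Callable[[str], str],                           # stage 1.a
    generate: Callable[[str], object],                    # stage 1.b
    critique: Callable[[object, str], Dict[str, float]],  # stage 2
    refine: Callable[[str, Dict[str, float]], str],       # stage 3
    ci_threshold: float = 0.8,  # hypothetical stopping score
    max_rounds: int = 3,        # hypothetical iteration budget
) -> List[Attempt]:
    prompt = plan(concept)
    history: List[Attempt] = []
    for _ in range(max_rounds):
        image = generate(prompt)
        scores = critique(image, prompt)
        ci = creativity_index(scores)
        history.append(Attempt(prompt, image, scores, ci))
        if ci >= ci_threshold:
            break
        prompt = refine(prompt, scores)
    return history


# Toy stand-ins so the sketch runs without any model.
def toy_plan(concept: str) -> str:
    return f"a surreal {concept}"


def toy_generate(prompt: str) -> str:
    return f"<image for: {prompt}>"


def toy_critique(image: object, prompt: str) -> Dict[str, float]:
    # Toy critic: the score grows with the number of refinements applied.
    level = min(1.0, 0.5 + 0.2 * prompt.count("more"))
    return {p: level for p in PRINCIPLES}


def toy_refine(prompt: str, scores: Dict[str, float]) -> str:
    # Target the weakest principle (the first one on ties).
    weakest = min(PRINCIPLES, key=lambda p: scores[p])
    return f"{prompt}, more {weakest}"


history = run_creative_loop("couch", toy_plan, toy_generate, toy_critique, toy_refine)
for attempt in history:
    print(round(attempt.ci, 2), attempt.prompt)
```

Passing the agents in as callables keeps the loop independent of any particular LLM or diffusion backend, so optional user guidance could be added by wrapping `refine`.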

Qualitative Results

Image Generation

Qualitative results for creative image editing and generation tasks illustrate CREA’s disentangled creative edits and generation across diverse objects and domains. For more results, please check out our paper’s Supplementary Material.

Qualitative Comparison


Qualitative Comparison of Creative Image Editing Task: We compare CREA with state-of-the-art editing methods. As shown, CREA successfully reimagines objects into creative variants in a disentangled manner, whereas other approaches either fail to produce distinctly creative edits or introduce unintended alterations.


Qualitative Comparison of Creative Image Generation Task: We compare CREA with ConceptLab, SDXL and Flux. CREA consistently produces diverse and creative generations across multiple domains.

Quantitative Results

Traditional T2I Metrics

Our method surpasses state-of-the-art methods across multiple metrics for both editing and generation tasks. Note that DINO scores cannot be computed for image generation: they rely on image-image similarity, and no reference image is available for this task. An asterisk (*) indicates a score that is interpreted in opposition to its conventional usage, because the creative generation task benefits from greater perceptual distance between the original and edited images. Q1 measures generation usability and editing consistency, while Q2 measures novelty/uniqueness.

Traditional Metrics Table

MLLM‑As‑a‑Judge Metrics

Conventional metrics miss nuanced creative qualities. We therefore apply a multimodal LLM judge (Qwen2.5-VL) to score key aspects of creativity: Originality, Expressiveness, Aesthetic Appeal, Technical Execution, Unexpected Associations, Interpretability and Depth, and Overall Creativity. This simulates human-like subjective assessment. CREA achieves the highest scores across all creativity dimensions, validating its superior creative performance.

LLM Judge Table
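
As a rough illustration of how per-dimension judge scores like those described above might be collected, the Python sketch below builds a rubric prompt, parses a JSON reply, and averages scores across images. It is a sketch under stated assumptions, not the evaluation code used for the table: the 1 to 10 scale, the JSON reply format, and all function names are hypothetical, the call to the judge model is omitted, and the replies are dummy strings rather than real results.

```python
import json
from statistics import mean
from typing import Dict, List

# The seven dimensions listed for the MLLM judge.
DIMENSIONS = [
    "originality", "expressiveness", "aesthetic_appeal", "technical_execution",
    "unexpected_associations", "interpretability_and_depth", "overall_creativity",
]
SCALE_MAX = 10  # assumption: the scoring scale is not stated on this page


def build_judge_prompt() -> str:
    # Hypothetical rubric text to send to the judge together with the image.
    keys = ", ".join(f'"{d}"' for d in DIMENSIONS)
    return (
        f"Rate the image from 1 to {SCALE_MAX} on each creativity dimension. "
        f"Reply with a single JSON object with the keys: {keys}."
    )


def parse_judge_reply(reply: str) -> Dict[str, float]:
    # Tolerate text around the JSON object, which chat models often add.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in judge reply")
    raw = json.loads(reply[start:end + 1])
    scores: Dict[str, float] = {}
    for dim in DIMENSIONS:
        value = float(raw[dim])  # raises KeyError if a dimension is missing
        if not 1 <= value <= SCALE_MAX:
            raise ValueError(f"{dim} score {value} is outside 1..{SCALE_MAX}")
        scores[dim] = value
    return scores


def average_by_dimension(per_image: List[Dict[str, float]]) -> Dict[str, float]:
    return {d: mean(s[d] for s in per_image) for d in DIMENSIONS}


# Dummy replies for illustration only; these are not results from the paper.
dummy_replies = [
    "Scores: " + json.dumps({d: 8 for d in DIMENSIONS}),
    json.dumps({d: 7 for d in DIMENSIONS}) + " (end of reply)",
]
averages = average_by_dimension([parse_judge_reply(r) for r in dummy_replies])
print(averages)
```

Validating the range and the presence of every dimension before averaging keeps a malformed judge reply from silently skewing a table entry.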

BibTeX

@misc{venkatesh2025creacollaborativemultiagentframework,
  title={CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models}, 
  author={Kavana Venkatesh and Connor Dunlop and Pinar Yanardag},
  year={2025},
  eprint={2504.05306},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2504.05306},
}