PosterGen:
Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
Zhilin Zhang1,2 ★ Xiang Zhang3 ★ Jiaqi Wei4 Yiwei Xu5 Chenyu You1
1Stony Brook University
2New York University
3University of British Columbia
4Zhejiang University
5University of California, Los Angeles
★Equal Contribution
Abstract
Multi-agent systems built on large language models (LLMs) and multimodal large language models (MLLMs) have recently shown strong capabilities on complex compositional tasks. We apply this paradigm to paper-to-poster generation, a practical but time-consuming step in research communication. Most existing methods ignore core design and aesthetic principles and often produce posters that still require substantial manual refinement. To address these limitations, we propose PosterGen, a multi-agent framework that embeds fundamental design principles into a specialized agent workflow mirroring that of professional designers. Our system uses collaborative agents to distill a paper's narrative (CuratorAgent), map content to a balanced spatial layout (LayoutAgent), apply a cohesive visual system of color and typography (StylistAgents), and render the final poster in PPTX format (Renderer). This design-centric approach produces posters that are both semantically grounded and visually compelling. To systematically evaluate visual quality, we introduce a comprehensive VLM-based rubric measuring information fidelity and aesthetic design, which we validate against human preferences in a user study. Experimental results show that PosterGen achieves content fidelity and design aesthetics comparable to human-designed posters while significantly and consistently outperforming state-of-the-art methods.
Method Overview
1. ParserAgent processes the input paper, extracting all text and visual assets and organizing them into a structured format focusing on an ABT narrative.
2. Aesthetic-Aware LLMs then transform this content into a styled layout: CuratorAgent creates a narrative-based storyboard, LayoutAgent calculates the precise spatial arrangement and balances the columns, and StylistAgents apply a harmonious color palette and a hierarchical typographic system.
3. The Renderer module takes the styled metadata and produces the output poster via python-pptx.
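The three steps above can be read as a chain of stages that pass a shared state forward. The following is a minimal sketch of that data flow only, under stated assumptions: the `PosterState` fields, the stub return values, the palette, and the font sizes are all hypothetical placeholders, each agent is a plain function rather than an LLM call, and the renderer returns a dictionary instead of writing a PPTX file.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PosterState:
    """Shared state handed from stage to stage (hypothetical schema)."""
    paper_path: str
    sections: List[Dict] = field(default_factory=list)    # parsed text and assets
    storyboard: List[Dict] = field(default_factory=list)  # curated narrative
    layout: List[Dict] = field(default_factory=list)      # section placement
    style: Dict = field(default_factory=dict)             # color and typography


def parser_agent(state: PosterState) -> PosterState:
    # Stub: a real parser would extract text and figures from the paper.
    state.sections = [
        {"title": "Motivation", "text": "placeholder", "figures": []},
        {"title": "Method", "text": "placeholder", "figures": ["fig1.png"]},
        {"title": "Results", "text": "placeholder", "figures": ["fig2.png"]},
    ]
    return state


def curator_agent(state: PosterState) -> PosterState:
    # Stub: turn parsed sections into an ordered storyboard.
    state.storyboard = [
        {"section": s["title"], "figures": s["figures"]} for s in state.sections
    ]
    return state


def layout_agent(state: PosterState) -> PosterState:
    # Stub: assign storyboard items to three columns in round-robin order.
    state.layout = [
        {"section": item["section"], "column": i % 3}
        for i, item in enumerate(state.storyboard)
    ]
    return state


def stylist_agents(state: PosterState) -> PosterState:
    # Stub: placeholder palette and font sizes, not values from the paper.
    state.style = {"palette": ["#1F3A5F", "#FFFFFF"], "title_pt": 72, "body_pt": 28}
    return state


def renderer(state: PosterState) -> Dict:
    # Stand-in for PPTX output: return the styled metadata unchanged.
    return {"layout": state.layout, "style": state.style}


STAGES = [parser_agent, curator_agent, layout_agent, stylist_agents]


def run_pipeline(paper_path: str) -> Dict:
    """Run every stage in order, then render the final state."""
    state = PosterState(paper_path=paper_path)
    for stage in STAGES:
        state = stage(state)
    return renderer(state)
```

Calling `run_pipeline("paper.pdf")` returns the styled metadata that a real renderer would consume. Keeping each stage a function of the shared state makes it straightforward to disable one stage at a time when studying its contribution.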
Aesthetic-Aware LLMs
Our method embeds the design principles as core logic within each specialized agent to establish a cascade of structured design constraints.
JSON-formatted storyboard
Three-column and left-aligned layout structure
CSS-like box model within a single section
Different highlighting styles
Harmonious color palette
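Two of the constraints listed above, the balanced three-column structure and the CSS-like box model, can be illustrated with a short sketch. This is not the LayoutAgent's actual algorithm: the exhaustive search over cut points, the `Box` class, and the assumption that each section's height is known in advance are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass
class Box:
    """A section rectangle with CSS-like margin and padding (hypothetical)."""
    x: float
    y: float
    width: float
    height: float
    margin: float = 0.0
    padding: float = 0.0

    def content_rect(self) -> Tuple[float, float, float, float]:
        # The content area is the outer box inset by margin plus padding.
        inset = self.margin + self.padding
        return (
            self.x + inset,
            self.y + inset,
            self.width - 2 * inset,
            self.height - 2 * inset,
        )


def balance_columns(heights: List[float], n_columns: int = 3) -> List[List[int]]:
    """Split sections into contiguous columns, minimizing the tallest column.

    Reading order is preserved: each column holds a consecutive run of
    section indices. Every placement of the cut points is tried.
    """
    n = len(heights)
    if n < n_columns:
        raise ValueError("need at least one section per column")
    best_bounds, best_tallest = None, float("inf")
    for cuts in combinations(range(1, n), n_columns - 1):
        bounds = (0, *cuts, n)
        tallest = max(sum(heights[a:b]) for a, b in zip(bounds, bounds[1:]))
        if tallest < best_tallest:
            best_bounds, best_tallest = bounds, tallest
    return [list(range(a, b)) for a, b in zip(best_bounds, best_bounds[1:])]
```

For example, `balance_columns([4, 3, 2, 5, 1, 6])` groups the six sections into three columns of equal total height, and `Box(0, 0, 10, 8, margin=0.5, padding=0.25).content_rect()` gives the inner rectangle available for text and figures. The exhaustive search is only practical for the small section counts typical of a poster.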
Evaluation Metrics
We evaluate the generated posters using a Vision-Language Model (VLM) as an expert judge.
Poster Content
This part verifies that the poster is an accurate, concise, and coherent narrative of the source paper, free from factual errors or overly dense text.
Poster Design
This part evaluates the poster's visual execution, i.e., the application of design principles (Alignment, Proximity, Repetition, Contrast), effective spatial organization, a clear information hierarchy, and the use of a limited and consistent color palette and a disciplined typographic system.
Quantitative Results
VLM-as-Judge results on content metrics across different poster generation methods, with scores averaged over 30 posters rated on a 1-5 scale.
VLM-as-Judge results on aesthetic metrics across different poster generation methods, with scores averaged over 30 posters rated on a 1-5 scale.
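The averaging described in these captions can be sketched as follows. The metric names and ratings in the usage note are made up for illustration and are not results from the paper; the function only assumes that each poster receives one integer rating per metric on the 1-5 scale.

```python
from statistics import mean
from typing import Dict, List


def average_scores(ratings: List[Dict[str, int]]) -> Dict[str, float]:
    """Average per-metric judge ratings over a set of posters.

    Each element of `ratings` maps a metric name to one poster's rating.
    """
    if not ratings:
        raise ValueError("no ratings to average")
    for poster in ratings:
        for metric, value in poster.items():
            # Reject anything outside the 1-5 rubric scale.
            if not 1 <= value <= 5:
                raise ValueError(f"{metric} rating {value} is outside 1-5")
    metrics = ratings[0].keys()
    return {m: round(mean(p[m] for p in ratings), 2) for m in metrics}
```

With two hypothetical posters rated `{"alignment": 4, "contrast": 5}` and `{"alignment": 3, "contrast": 4}`, the function returns 3.5 for alignment and 4.5 for contrast.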
Ablation Study
We conduct an ablation study to isolate the contributions of the three core agents in our PosterGen pipeline: the Curator Agent, the Layout Agent, and the Stylist Agents, on the same paper (Cao et al. 2025).
(a) Curator Agent
(b) Curator Agent + Layout Agent
(c) Curator Agent + Layout Agent + Stylist Agents
BibTeX
@article{zhang2025postergen,
title={Postergen: Aesthetic-aware paper-to-poster generation via multi-agent llms},
author={Zhang, Zhilin and Zhang, Xiang and Wei, Jiaqi and Xu, Yiwei and You, Chenyu},
journal={arXiv preprint arXiv:2508.17188},
year={2025}
}