Diagram
closed
Introduction
This task targets a common need in academic writing: designing figures that illustrate complex methods in research papers. The evaluated model receives a technical description of a specific method or model architecture and is asked to synthesize a self-contained figure with a caption.
Example
I'm creating a figure for my paper to illustrate how the OmniDocBench dataset was constructed. The figure should show two main processes:
- Data Acquisition: 200k PDFs are sourced from the web and internal repositories. From this, 6k visually diverse pages are sampled using feature clustering, and ~1k pages are selected with attribute labels via manual balancing.
- Data Annotation: A 3-stage annotation pipeline is used. In stage 1, state-of-the-art vision models automatically annotate selected pages. In stage 2, human annotators verify and correct the annotations. In stage 3, PhD-level experts inspect and refine the results. Annotations include layout detection (bbox, attributes, reading order, affiliations) and content recognition (text, formulas, tables).
Please generate:
1. A visual diagram showing how these components interact.
2. A separate caption summarizing the key idea of the figure.
Please provide the image and caption separately.
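
For reference, the structure this example prompt describes can be written down as a small directed graph. The sketch below is not part of the prompt or of the task's evaluation. It is one possible encoding, assuming that the acquisition steps feed the three annotation stages in sequence and that the last stage yields both annotation types. The names `ACQUISITION`, `ANNOTATION`, `OUTPUTS`, `chain`, `build_edges`, and `to_dot` are hypothetical, and Graphviz DOT is used only as a convenient text format for the result.

```python
# Hypothetical encoding of the pipeline from the example prompt as a directed graph.
# Node labels paraphrase the prompt; the linear ordering is an assumption.

ACQUISITION = [
    "200k PDFs (web + internal repositories)",
    "6k visually diverse pages (feature clustering)",
    "~1k pages with attribute labels (manual balancing)",
]

ANNOTATION = [
    "Stage 1: automatic annotation by vision models",
    "Stage 2: human verification and correction",
    "Stage 3: inspection and refinement by PhD-level experts",
]

OUTPUTS = [
    "Layout detection (bbox, attributes, reading order, affiliations)",
    "Content recognition (text, formulas, tables)",
]


def chain(steps):
    """Return the edges that link each step to the next one."""
    return list(zip(steps, steps[1:]))


def build_edges():
    """Link acquisition to annotation, then fan out to the two annotation types."""
    edges = chain(ACQUISITION + ANNOTATION)
    edges += [(ANNOTATION[-1], out) for out in OUTPUTS]
    return edges


def to_dot(edges):
    """Serialize the edges as Graphviz DOT text (left-to-right box layout)."""
    lines = ["digraph pipeline {", "  rankdir=LR;", "  node [shape=box];"]
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(build_edges()))
```

Running the script prints DOT text that a Graphviz renderer could turn into a rough box-and-arrow layout. A generated figure would be expected to convey the same components and connections, with its own visual design.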