While AI excels at structured creative tasks, this study shows that human collaboration is needed to unlock AI's full creative potential, especially where originality and cross-disciplinary thinking are required.
Research: Can AI Enhance its Creativity to Beat Humans? Image Credit: DALL·E 3
In an article recently submitted to the arXiv preprint* server, researchers explored how artificial intelligence (AI) compares to humans in creative performance. They examined two prompting strategies, naive and expert, across three tasks (text, drawing, and alternative uses) and evaluated the creative outputs using both human evaluators and objective metrics. The authors found that AI outperformed humans in some aspects of creativity, particularly in structured tasks and certain objective criteria, though human feedback remained essential for optimizing AI's creative potential, especially where novelty and cross-disciplinary thinking were required.
Background
AI has become integrated into many aspects of professional and personal life, significantly advancing fields like natural language processing and image generation. A major breakthrough in AI development is the transformer architecture, particularly large language models (LLMs) such as generative pre-trained transformers (GPT), which leverage attention mechanisms to improve performance.
These models have become crucial tools for generating creative outputs like art, music, and writing. However, AI's role in creativity, traditionally considered a uniquely human trait, raises questions about whether AI can compete with or complement human creativity. While AI models can efficiently generate creative outputs based on vast amounts of data, the degree to which they can rival human originality remains a subject of debate.
Previous studies have explored human-AI collaboration and substitution in creative tasks, but gaps remain, particularly in assessing creative performance across varied criteria. The present study also addressed the role of prompting strategies in shaping AI performance, noting that AI's ability to generate creative outputs depends heavily on the quality and specificity of the human-provided prompt. Moreover, AI's reliance on human prompts and its limitations in complex tasks without human input have not been fully examined. This paper filled these gaps by evaluating AI's creative performance across multiple tasks, emphasizing the impact of prompt engineering on AI's success, and advocating for human-machine collaboration to enhance creative outputs.
The Experiment of Creativity
The experiment was divided into two parts: collecting and evaluating creative outputs from humans and AI. Human creativity was measured through three tasks: a text task, a drawing task, and an alternative uses task (AUT). These tasks captured both convergent and divergent thinking processes.
AI-generated outputs were created using GPT-4 (text and AUT) and DALL-E 2 (drawing) with two prompting strategies, naive and expert. Human evaluators, unaware of which outputs were AI-generated, assessed them on creativity criteria such as validity, originality, and elaboration. A total of 199 evaluators participated, with each task evaluated by at least two people.
Task Type and Prompting Strategies
The authors explored AI's creative capabilities compared to humans, focusing on how task type and prompting strategies influence performance. Based on existing literature and observations, they formulated three hypotheses.
First, AI was expected to outperform humans in close-ended tasks, where identifying optimal solutions based on learned patterns was key, while humans were believed to excel in open-ended tasks that demanded novelty and divergent thinking. Second, expert prompting strategies were predicted to enhance AI performance across all tasks, as tailored prompts allowed AI to align its outputs with desired outcomes. However, the study found that while expert prompting improved AI’s creative performance in some cases, it also led to over-fixation on task constraints in others, reducing creativity in more open-ended tasks.
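To make the two strategies concrete, the sketch below contrasts a naive prompt with an expert prompt using the OpenAI chat API. The prompt wording and the example task are illustrative assumptions, not the authors' actual instructions.

```python
# Minimal sketch of naive vs. expert prompting with the OpenAI chat API.
# The prompt wording is an illustrative assumption, not the study's
# actual instructions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TASK = "List unusual alternative uses for a paperclip."

# Naive strategy: state the task with no additional framing.
naive_messages = [{"role": "user", "content": TASK}]

# Expert strategy: frame the model as a creativity specialist and spell
# out the criteria (originality, variety, feasibility) to optimize for.
expert_messages = [
    {"role": "system",
     "content": ("You are an expert in creative ideation. Maximize "
                 "originality and variety while keeping each idea feasible.")},
    {"role": "user", "content": TASK},
]

for name, messages in [("naive", naive_messages), ("expert", expert_messages)]:
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    print(f"--- {name} ---\n{response.choices[0].message.content}\n")
```

Note how the expert prompt encodes the task constraints explicitly; as the findings above suggest, this can sharpen outputs in some tasks but lead the model to over-fixate on those constraints in more open-ended ones.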
The methodology included creating two AI agents, a naive AI and an expert AI, both implemented with GPT-4 and calibrated to produce coherent creative output. Performance was assessed through a mix of human-based evaluations, theme-based metrics (variety, balance, and diversity), and embedding-based metrics, using models such as CamemBERT (a French bidirectional encoder representations from transformers, or BERT, model) for text and contrastive language-image pre-training (CLIP) for images.
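These metrics can be illustrated with a short Python sketch. The definitions below (variety as the distinct theme count, balance as normalized entropy over theme frequencies, and semantic diversity as mean pairwise cosine distance over CamemBERT embeddings) are plausible formalizations under stated assumptions, not the paper's exact measures; the theme labels and example responses are hypothetical.

```python
# Plausible formalizations of theme-based and embedding-based metrics.
# These are illustrative assumptions, not the paper's exact definitions.
import math
from collections import Counter

import numpy as np
import torch
from scipy.spatial.distance import pdist
from transformers import AutoModel, AutoTokenizer

def variety(themes):
    """Theme-based variety: number of distinct themes in a set of outputs."""
    return len(set(themes))

def balance(themes):
    """Normalized Shannon entropy: 1.0 when all themes appear equally often."""
    counts = Counter(themes).values()
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return entropy / math.log(len(counts)) if len(counts) > 1 else 0.0

tokenizer = AutoTokenizer.from_pretrained("camembert-base")
model = AutoModel.from_pretrained("camembert-base")

def embed(texts):
    """Mean-pooled CamemBERT embeddings, one vector per text."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state        # (batch, tokens, dim)
    mask = inputs["attention_mask"].unsqueeze(-1).float()  # zero out padding
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

def semantic_diversity(texts):
    """Mean pairwise cosine distance: higher = more semantically spread out."""
    return float(np.mean(pdist(embed(texts), metric="cosine")))

# Hypothetical outputs: theme labels and French responses, for illustration.
themes = ["animaux", "outils", "animaux", "espace", "outils"]
ideas = ["Un trombone peut servir de crochet.",
         "On peut en faire une sculpture miniature.",
         "Il remplace une aiguille pour redémarrer un appareil."]
print(variety(themes), round(balance(themes), 2),
      round(semantic_diversity(ideas), 3))
```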
The researchers also applied analysis of variance (ANOVA) and non-parametric tests to compare human and AI creativity across tasks. Although AI efficiently processed vast knowledge and generated varied outputs, it struggled to produce original and unexpected ideas without human intervention; human creativity tended to surpass AI in tasks requiring originality and cross-disciplinary thinking.
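As a rough illustration of this statistical step, the sketch below runs a one-way ANOVA alongside its non-parametric counterpart, the Kruskal-Wallis test, on scores for the three groups. The numbers are placeholders for illustration, not the study's data.

```python
# Minimal sketch of the group comparison described above: one-way ANOVA
# plus the non-parametric Kruskal-Wallis test across the three groups.
# The scores below are hypothetical placeholders, not the study's data.
from scipy.stats import f_oneway, kruskal

# Hypothetical creativity scores per group (e.g., originality ratings).
human_scores     = [3.8, 4.2, 3.5, 4.0, 3.9]
naive_ai_scores  = [4.1, 3.9, 4.3, 4.0, 4.2]
expert_ai_scores = [3.4, 3.6, 3.2, 3.7, 3.5]

f_stat, p_anova = f_oneway(human_scores, naive_ai_scores, expert_ai_scores)
h_stat, p_kw = kruskal(human_scores, naive_ai_scores, expert_ai_scores)

print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.3f}")
print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_kw:.3f}")
```

Running the non-parametric test alongside ANOVA is a standard safeguard when rating data may violate ANOVA's normality assumptions.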
Creative Task Performance Analysis
The results were divided into three creative tasks: text, alternative uses, and drawing, each evaluated through objective and subjective measures. For the text task, naive AI outperformed expert AI and humans in variety, showcasing more themes and unusual combinations. However, expert AI, which was prompted to optimize for creativity, sometimes became too constrained by its instructions, reducing its performance. Humans produced greater semantic diversity, as measured by cosine distance between text embeddings. The subjective evaluation revealed that humans and naive AI scored higher in validity and originality than expert AI.
In the alternative uses task, AI agents (both naive and expert) outperformed humans in variety and diversity, while humans excelled in group-level semantic distance. Expert AI performed better in originality, though there was no significant difference between expert and naive AI for feasibility and validity. The study highlighted that in this task, AI’s success was heavily influenced by the quantity of output, which led human evaluators to perceive it as more creative, even when originality was less evident.
Performance and Implications
The authors examined AI's creative performance compared to humans across three tasks and two prompting strategies. They found that AI generally outperformed humans, though results varied by task and criterion. While humans excelled in originality and content-related measures, AI performed better in formal aspects such as structure.
The researchers highlighted AI's potential in creative endeavors but noted the importance of human intervention in improving AI outcomes, especially to avoid repetitive or constrained outputs. Human involvement was deemed crucial not only to refine prompts but also to introduce a level of unpredictability and cross-disciplinary connections that AI struggled to achieve on its own. Organizations might favor AI for its consistency and lower risk, though human involvement remains critical for maximizing AI's creative potential and overcoming its limitations.
Conclusion
In conclusion, the researchers comprehensively analyzed AI's creative performance relative to humans across diverse tasks. While AI often surpassed human creativity in structured outputs, human involvement remained crucial for enhancing originality and addressing complex challenges. The findings underscored that AI's performance is heavily task-dependent: AI excels when creativity is defined in terms of variety and efficiency but struggles in tasks requiring profound originality. They also highlighted the importance of human-AI collaboration, suggesting that AI can complement human creativity rather than replace it.
Future research should explore how human-AI collaboration can address AI's limitations in open-ended creative tasks and investigate AI's role in different professional domains, particularly those requiring high degrees of originality. By understanding the interplay between human and machine creativity, organizations can better harness AI's potential while preserving the unique strengths of human creativity.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
Journal reference:
- Preliminary scientific report.
Maltese, A., Pelletier, P., & Guichardaz, R. (2024). Can AI Enhance its Creativity to Beat Humans? arXiv. DOI: 10.48550/arXiv.2409.18776, https://arxiv.org/abs/2409.18776