How to Fix the AI Idea Machine | Harvard Business School AI Institute

New research pushes language models beyond predictable outputs.

Lately, we’ve all developed a sixth sense for the “AI writing voice,” with its polished, repetitive cadence and the obligatory “it’s not just X—it’s Y.” This signals a deeper challenge for using AI to brainstorm, design, research, write, and make decisions. We want creative, novel, and even surprising ideas from AI that stand out from the crowd and from our competitors. When we prompt AI for 10 different ideas on a topic, we want a variety of original options, not to be steered toward the same predictable middle. The new paper, “Inducing Sustained Creativity and Diversity in Large Language Models,” co-written by HBS AI Institute Associate Gary King, takes up the challenge of how AI is currently misaligned with our hunt for breakthroughs. The good news: the researchers have a fix.

Key Insight: The Wrong Tool for the Right Question

“[C]urrent LLMs are designed to converge to the single ‘correct’ or conventional answer.” [1]

The authors focus on what they call “search quests,” open-ended tasks where users aren’t trying to find a known fact, but to discover something unique, captivating, or meaningful. Unlike fact retrieval (“What’s the capital of France?”), a search quest depends on the journey itself. Consider the authors’ example of looking for a wedding dress: someone might not know the “right” one until they have explored enough ideas to recognize and develop explicit, concrete preferences. The authors argue that the results of a search quest need to be relevant, diverse, creative, and sustained. The “sustained” part matters: it’s not enough for a model to produce five different ideas once; it needs to keep producing conceptually distinct possibilities long enough for the user to learn the landscape and come to a decision. LLMs struggle here, the authors argue, because they often end up repeating the same high-probability ideas. Post-training alignment through human feedback tends to compound the problem: models are rewarded for matching majority preferences, which makes them systematically better at returning the expected answer and worse at surfacing unexpected ones.
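A toy simulation can make the convergence problem concrete. The sketch below is not from the paper and does not model a real LLM. It assumes a hypothetical pool of 100 ideas and compares repeated sampling from a sharply peaked distribution (a stand-in for a model that favors conventional answers) with sampling from a flat one. The function name, the pool size, and the 1/rank^2 weighting are all illustrative assumptions.

```python
import random


def count_unique_ideas(weights, n_draws, rng):
    """Sample idea indices with the given weights and count the distinct ones."""
    ideas = range(len(weights))
    draws = rng.choices(ideas, weights=weights, k=n_draws)
    return len(set(draws))


N_IDEAS = 100  # hypothetical size of the idea pool

# Assumed "peaked" distribution: a few top-ranked ideas carry most of the weight.
peaked = [1.0 / (rank ** 2) for rank in range(1, N_IDEAS + 1)]
# Flat distribution: every idea is equally likely.
flat = [1.0] * N_IDEAS

rng = random.Random(0)
unique_peaked = count_unique_ideas(peaked, 200, rng)
unique_flat = count_unique_ideas(flat, 200, rng)

print("distinct ideas, peaked sampler:", unique_peaked)
print("distinct ideas, flat sampler:  ", unique_flat)
```

Under these assumptions, the peaked sampler keeps returning the same top-ranked ideas, so asking again and again adds few new options. That is the pattern the authors describe for search quests.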

Key Insight: Nudging the Model Away from the Obvious Path

“RD directs generation toward less traversed but still meaningful regions of the model’s knowledge space without [sic] in a way that can be easily adapted with any LLM to elicit diverse knowledge.” [2]

To solve this, the authors propose Recoding-Decoding (RD), which introduces controlled randomness during generation so the model moves away from its most predictable route while staying within the user’s search space. First, it adds a random priming phrase at the beginning of the prompt, for example “Related to FOOD” for a search quest about book topics on world history. [3] Second, it inserts a random three-letter diverting token at the start of each new sentence. This approach exploits the “positional bias” of LLMs, the tendency of a model to pay the most attention to tokens at the very beginning or end of a sequence. Crucially, the method requires neither access to the model’s internals nor retraining. In plain terms, RD acts like a set of small nudges that keep the model from settling too quickly on an obvious answer, expanding the option set without collapsing into repetition.
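The two steps described above can be sketched in a few lines of Python. This is a minimal illustration based only on the description in this article, not the authors’ implementation. The list of priming topics, every function name, and the `continue_sentence` callable (a stand-in for whatever model call you use) are hypothetical. How the paper samples tokens or treats them in the final output is not covered here.

```python
import random
import string

# Hypothetical priming topics; the article only gives "FOOD" as an example.
PRIMING_TOPICS = ["FOOD", "MUSIC", "WEATHER", "ARCHITECTURE"]


def priming_phrase(rng):
    """Pick a random priming phrase to put at the very start of the prompt."""
    return f"Related to {rng.choice(PRIMING_TOPICS)}"


def diverting_token(rng):
    """Draw a random three-letter token to open the next sentence."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(3))


def build_prompt(user_request, rng):
    """Step 1: prepend the priming phrase to the user's request."""
    return f"{priming_phrase(rng)}. {user_request}"


def generate_with_diversion(prompt, continue_sentence, n_sentences, rng):
    """Step 2: start each new sentence with a fresh diverting token.

    `continue_sentence(context, start)` is a hypothetical stand-in for a model
    call that returns one sentence beginning with `start`.
    """
    sentences = []
    for _ in range(n_sentences):
        start = diverting_token(rng)
        context = " ".join([prompt] + sentences)
        sentences.append(continue_sentence(context, start))
    return sentences


def stub_model(context, start):
    """Placeholder so the sketch runs without any model; swap in a real call."""
    return f"{start} (model continuation goes here)."


rng = random.Random(42)
prompt = build_prompt("Suggest book topics on world history.", rng)
ideas = generate_with_diversion(prompt, stub_model, n_sentences=3, rng=rng)
```

Because everything happens in the prompt and at sentence boundaries, a wrapper like this needs only text-in, text-out access to a model, which fits the article’s point that no retraining is required.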

Key Insight: RD Produces More Ideas, and Keeps Producing Them

“Put differently, two independent users are far less likely to ‘show up to the same party with the same dress,’ so to speak, under RD.” [4]

Across 50 brainstorming topics, RD achieved diversity scores of 0.94 to 0.98, compared with 0.47 to 0.69 for standard methods, and the novel outputs weren’t noise: they were on-topic and substantive. Applied to bridal dress design, RD surfaced jumpsuit gowns, Mongolian-inspired brocade, and music-themed motifs alongside conventional options. In a collective diversity test simulating two independent users running the same prompt, a standard method produced 35 unique conceptual clusters; RD produced 244. Critically, the researchers found that RD’s diversity is sustained: unlike other diversity-boosting approaches, which degrade or repeat themselves over many iterations, RD continues generating conceptually distant ideas.
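This article does not say how the paper’s diversity score is computed, so the following is only a rough stand-in for readers who want to gauge diversity in their own outputs. It assumes a simple word-overlap measure (mean pairwise Jaccard distance), where 0 means every idea uses the same words and 1 means no two ideas share a word. The function names are hypothetical, and the scores are not comparable to the figures reported above.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the share of words two ideas have in common (case-insensitive)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(words_a | words_b)


def mean_pairwise_diversity(ideas):
    """Average Jaccard distance over every pair of ideas in the list."""
    pairs = list(combinations(ideas, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


repeated = ["lace ball gown", "lace ball gown"]
varied = ["lace ball gown", "silk jumpsuit"]
print(mean_pairwise_diversity(repeated))  # identical ideas score 0.0
print(mean_pairwise_diversity(varied))    # no shared words scores 1.0
```

Word overlap misses ideas that are phrased differently but mean the same thing. A measure based on conceptual clusters, like the one the authors report, is the more faithful test.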

Why This Matters

For business professionals and executives, these findings suggest that relying on standard AI prompts for strategy and execution means operating within a groupthink bubble shared with every competitor using the same tools. As AI becomes a staple in corporate decision-making, the risk is that organizations will inadvertently converge on identical, average strategies, eroding the diversity that drives market innovation and advantage. Implementing approaches like Recoding-Decoding allows leaders and their teams to break free from the AI echo chamber and explore the full spectrum of possibilities rather than just the most probable ones. In an era of the monotonous AI voice, the ability to intentionally induce sustained creativity is a necessity for any firm looking for a unique competitive edge.

Bonus

As this research shows, engineered workflows can make AI output considerably more useful. For another look at how AI-assisted exploration can help people see more options, interrogate evidence, and develop better judgment, check out AI Tools That Rewrite How We “See” Stories.

References

[1] Luo, Queenie, Gary King, Michael Puett, and Michael D. Smith, “Inducing Sustained Creativity and Diversity in Large Language Models,” arXiv preprint arXiv:2603.19519 (2026), 1.

[2] Luo et al., “Inducing Sustained Creativity and Diversity in Large Language Models,” 2.

[3] Luo et al., “Inducing Sustained Creativity and Diversity in Large Language Models,” 4.

[4] Luo et al., “Inducing Sustained Creativity and Diversity in Large Language Models,” 7.

Meet the Authors

Queenie Luo is a PhD candidate in religion and philosophy at Harvard University.

Gary King is the Albert J. Weatherhead III University Professor at Harvard University, and an HBS AI Institute Associate.

Michael Puett is the Walter C. Klein Professor of Chinese History and Anthropology at Harvard University.

Michael D. Smith is the John H. Finley, Jr. Professor of Engineering and Applied Sciences at the Harvard John A. Paulson School of Engineering and Applied Sciences.
