Marketing leaders often face a dilemma: Deriving the insights they need to make confident decisions can cost tens of thousands of dollars and require months of data gathering and analysis, by which time market conditions may have shifted. Can generative AI fundamentally reshape this calculus?
Drawing on recent research, including our own study published in the Journal of Marketing, as well as interviews with marketing leaders from major organizations, we have identified five ways that large language models (LLMs) are beginning to transform the marketing function and reshape the $153 billion insights industry.1 LLMs can viably compress marketing research timelines from months to days by introducing new approaches for rapid concept testing, such as the use of synthetic consumer “digital twins,” and enabling qualitative research at scale. These techniques allow companies to better harness unstructured data and smaller research teams to conduct much larger studies than they could previously.
Organizations conduct marketing research to uncover consumer insights that guide strategic and tactical business decisions. Historically, insight generation has been a multistage, time-consuming, and labor-intensive process.
A typical marketing research pipeline includes problem definition, research design, study design, sample selection, data collection, data analysis, and insights delivery. Some aspects of marketing research are qualitative (such as interviews and focus groups), and others (surveys, for example) are quantitative in nature. These studies may be conducted by in-house marketing research teams or outsourced to agencies with specialized expertise. A research project can take a few weeks to several months, depending on its scope, and can cost anywhere from tens to hundreds of thousands of dollars.
Generative AI is making the consumer insight generation process substantially more efficient while also presenting novel ways to make the research more effective. In short, it is making the marketing research process faster and cheaper.
Just as AI-driven drug discovery has shortened the timeline from candidate screening to clinical-trial readiness, generative AI is shortening timelines from exploration to insights.2 AI is being integrated into the market research process with humans in the loop, as illustrated in the figure “How AI Is Integrated Into the Marketing Research Process.” In the early stages of research, problem definition and design are primarily guided by the decision maker. This is because critical factors — such as client experience, market intuition, and practical constraints like budget and timing — are human-led and challenging for AI to infer. Although the AI can help refine problem statements or brainstorm design options, its role during these early stages is typically minimal. In contrast, AI serves as an excellent collaborator in the remaining stages of marketing research.
In the study design phase of qualitative research, LLMs can be used to generate initial drafts of discussion guides for exploratory work. During sample selection, they can help identify respondent characteristics that align with the research goals. In the analysis phase, LLMs summarize long interviews, extract themes, and organize unstructured text into interpretable insights. As Paul Metz, CEO of C+R Research, said, “AI tools process and synthesize large volumes of transcript data within hours, detecting patterns and themes that previously took days to uncover.”
Such efficiencies allow teams to handle large volumes of qualitative data and work more productively. The speed and cost savings allow companies to shift from large, infrequent studies that take months to complete to smaller, more frequent studies aligned with decision cycles. This also empowers managers to test more ideas, iterate quickly, and adopt an experimentation-oriented mindset.
For quantitative research, LLMs can be used to quickly generate the first draft of a survey, report summary statistics, visualize the data, and debug analysis code as needed. These GenAI use cases allow the research team to delegate many of the rote tasks to the AI, use that time to focus on answering the business questions more effectively, and deliver insights faster.
Generate Consumer Insights With Synthetic Digital Twins
One important way in which LLMs enable data generation for consumer insights is through digital twins. A digital twin is a synthetic, data-driven representation of an object or process that enables simulation and what-if experimentation at low cost. A range of fields, such as drug discovery, climate science, and supply chain management, were using digital twins well before the rise of LLMs.
In marketing, LLMs are enabling the use of consumer digital twins — personas that can simulate decision-making, preference shifts, and responses to marketing stimuli — as testbeds for premarket experimentation.3 Instead of waiting for new data collection, analysts can simulate concept tests, assortment decisions, pricing moves, or campaign reactions in silico before making a significant financial commitment.
AI market research companies like Evidenza and academic initiatives such as Columbia University’s digital twin data set highlight the growing ecosystem around AI-driven consumer emulation.4 Evidenza partnered with a German information and communications technology company to study whether B2B buyers would trust the company to handle cybersecurity and cloud infrastructure for sensitive data. The research team used synthetic samples of decision makers to simulate a study and quickly test hypotheses around spending trajectories, the products most likely to drive vendor switching, and other questions. Validation against an existing human survey revealed strong correlations (0.75-0.88) across metrics, confirming that the synthetic samples provided directionally accurate insights. The synthetic approach enabled the B2B company to obtain valuable input at a fraction of the time and cost of traditional marketing research.
Consumer digital twins can be generated from a variety of demographic, psychographic, and behavioral data from various internal and external sources that companies may have access to. To generate digital twins in our study, we obtained detailed profiles of respondents in our research partner’s original study, including their demographics and product use. We then prompted the LLM by providing it with the research context and the persona we wanted it to assume based on a human respondent’s profile. Finally, we asked it to perform a task, such as giving a detailed answer to an open-ended question or picking from multiple response options for a survey question. We generated hundreds of synthetic respondents in that manner using the API for an LLM.
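The prompting flow described above can be sketched in a few lines of code. This is an illustrative outline, not the study's exact protocol: the profile fields, prompt wording, and the `call_llm` placeholder are all assumptions standing in for a real respondent data set and a provider's chat-completion API.

```python
# Sketch of the digital-twin prompting flow: context + persona + task.
# Profile fields and prompt wording are illustrative assumptions.

def build_twin_prompt(profile: dict, research_context: str, task: str) -> str:
    """Compose a persona prompt from a human respondent's profile."""
    persona = ", ".join(f"{k}: {v}" for k, v in profile.items())
    return (
        f"Research context: {research_context}\n"
        f"Assume the persona of a consumer with this profile -- {persona}.\n"
        f"Task: {task}"
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (any chat-completions endpoint)."""
    raise NotImplementedError("Wire this to your LLM provider's API.")

# One synthetic respondent per profile; loop over profiles to build a sample.
profile = {"age": 34, "region": "Midwest", "product use": "weekly"}
prompt = build_twin_prompt(
    profile,
    research_context="Concept test for a new snack brand.",
    task="Answer in detail: What would make you switch from your current brand?",
)
```

Generating hundreds of synthetic respondents then amounts to iterating this call over a set of profiles via the LLM's API.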
Our study found that LLMs can generate high-quality, information-rich qualitative data. LLM- and human-generated responses look and feel remarkably similar, although LLM responses are deeper and more insightful because they are unconstrained by time or a willingness to elaborate. They can also help reach niche or hard-to-reach segments, complementing human respondents in meaningful ways. For quantitative survey research, we found that an LLM replicates the direction and magnitude of human answers well.
Additionally, our findings revealed that digital twins add significant value to the research process itself. An LLM can generate synthetic responses to a survey before it is administered to human respondents. By turning the typical research flow on its head, this “backward” marketing research approach allows researchers to test their survey design before fielding it.5 They can examine the synthetic results to answer fundamental questions, such as what quality of insights the survey is likely to reveal and which questions could be removed or added. In some circumstances, synthetic data may even obviate the need to conduct the survey at all; this could occur, for example, when one concept clearly dominates all of the concepts tested, or when the main insight from the survey is not new.
The gains from digital twin data are likely to be higher for hard-to-reach respondents, such as doctors or senior managers. Decision makers would much rather work with data from digital twins than have no data at all for these hard-to-reach groups. An attractive aspect of digital twins is that they do not get tired or have time constraints and can provide lengthy answers for many questions.
In addition to generating useful data, LLMs can be helpful in collecting and analyzing unstructured data from human or synthetic participants.
Unlock Qualitative Research at Scale
The traditional model for conducting marketing research is to begin with unstructured qualitative research (such as ethnographies, in-depth interviews, or focus groups) involving a small number of respondents and use it as the foundation for a large-sample survey. Because qualitative data comes from small samples and is labor-intensive, and therefore expensive, to collect and analyze, companies have historically relied more heavily on survey data. However, LLMs are making qualitative data much easier to collect and analyze.
AI as the data collection engine. An impressive use case for generative AI in data collection is as an interviewer of human respondents, where it is used to perform three key tasks:
- Interviewer: The LLM follows a discussion guide to ask specific questions.
- Scorer: The LLM then evaluates the human answer against metrics such as clarity and depth, and provides a score on a scale of 1-100.
- Prober: If the evaluation score is below a preestablished threshold, the LLM asks the respondent to elaborate further.
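The interviewer-scorer-prober loop above can be sketched as a short control flow. This is a minimal illustration under stated assumptions: in practice both `get_answer` and `score_answer` would be LLM calls, and the function names, the threshold of 70, and the probe limit are all hypothetical choices, not a vendor's or the authors' specification.

```python
# Minimal sketch of the three-step AI-moderated interview loop:
# ask (interviewer), evaluate (scorer), and re-ask while the score is low (prober).

def run_interview(questions, get_answer, score_answer, threshold=70, max_probes=2):
    """Ask each question, score the answer, and probe until it clears the threshold."""
    transcript = []
    for q in questions:
        answer = get_answer(q)                                 # Interviewer step
        probes = 0
        while score_answer(answer) < threshold and probes < max_probes:
            answer = get_answer(f"Could you elaborate? {q}")   # Prober step
            probes += 1
        transcript.append({"question": q, "answer": answer, "probes": probes})
    return transcript

# Deterministic stand-ins for illustration: terse answers score low and trigger a probe.
answers = iter(["Yes.", "I like the taste and the price."])
get_answer = lambda q: next(answers)
score_answer = lambda a: min(100, len(a) * 5)                  # Scorer step (stand-in)

transcript = run_interview(["Why do you buy this brand?"], get_answer, score_answer)
```

In the stand-in run, the one-word first answer scores below the threshold, so the moderator probes once and accepts the elaborated answer.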
This three-step approach is not limited to conducting interviews with humans; it can also be applied to generating synthetic data. In testing this idea, we determined that synthetic data from AI-moderated interviews preserves the meaning and essence of human-generated data. Importantly, independent evaluation by human raters scored the AI-generated data significantly higher on measures of depth and insight.
AI-moderated interviews are powerful additions to a marketing researcher’s toolkit and permit data collection for qualitative research at scale. Unlike a human moderator, an AI moderator can collect detailed unstructured data (video, audio, or text) from many respondents across the globe, and at a fraction of the cost of a traditional in-person in-depth interview. Although an experienced human moderator may be better at reading respondents’ tone, body language, and visual cues, the advantage of AI moderators is the ability to quickly conduct interviews at scale, across geographical boundaries. AI moderators may offer an additional advantage in situations where humans feel uncomfortable talking about a product because of social desirability biases or fear of judgment.
Suppliers such as Outset and Nexxt Intelligence have commercially available products with AI-moderated functionality for conducting interviews. In one case study, Outset claimed to have completed 100 interviews in just a few days — a task that normally would have taken weeks. The resulting qualitative data revealed problems its client had not known existed and helped shape messaging for its brand campaigns. The AI moderator approach also gave the client the ability to conduct research continuously rather than just once or twice a year.
AI as the analysis engine. The traditional approach to qualitative data analysis is largely manual and performed by expert analysts, who sort through large volumes of unstructured text and audiovisual data. The analysis task for text data, for example, involves thematic concept analysis, which includes reading the text, excluding fillers, highlighting key phrases or sentences, clustering them into related concepts or themes, iterating to remove repetitive ideas, and consolidating the themes into a concise summary. Our research finds that LLMs have made many of these analysis tasks easier to perform without sacrificing quality.
At the process level, we find that humans tend to highlight more sentences than LLMs when analyzing data and that there is significant overlap in the sentences that humans and the LLM highlight as important. LLMs uncover most of the same themes that humans do and identify new themes that humans do not. Overall, LLMs are comparable to humans in identifying key ideas, grouping them into themes, and summarizing them. In practice, suppliers such as Voxpopme offer excellent tools to analyze multimodal (video, audio, and text) qualitative data. In one case study, Voxpopme claimed a 30% to 50% reduction in the cost of qualitative research projects, a 50% increase in the use of existing research insights, and an impressive 60-times-faster research analysis.
AI-enabled marketing research makes it possible to conduct both qualitative and quantitative research at scale. This was previously infeasible with traditional qualitative research (small samples, deep insights) and quantitative research (large samples, broad insights). Given LLMs’ effectiveness, low cost, and ease of use, we expect that they will play an increasingly critical role during the data collection and analysis stages for unstructured data. Companies, in turn, are quickly discovering how much more they can do with unstructured data than was previously possible.
In addition to traditional qualitative research data (from in-depth interviews and focus groups, for example), there is also rich information in unstructured data such as online reviews, call center transcripts, and social media posts. Chauncey Holder, a senior expert at McKinsey, noted that “AI agents can interrogate multimodal data — like social media, category features, and behavioral signals — to uncover unmet needs and emerging trends, identifying white-space opportunities more efficiently than traditional methods.” The inability to mine this information-rich data quickly and inexpensively was a constraint for marketing researchers because past natural language processing models relied heavily on expensive, labor-intensive human labeling.6 Pretrained LLMs have changed this by enabling low-cost semantic summarization, topic extraction, sentiment classification, and narrative insight generation from massive multimodal data far more easily than previously available tools could. This change marks a massive shift in how the field of marketing research can unlock the value of unstructured data to inform business decisions.
Connect Siloed Data Using Retrieval-Augmented Generation
Although today’s LLMs have an impressive set of capabilities, their performance on complex tasks that require domain knowledge (in-house marketing research by a brand, for example) can be limited. For situations in which the LLM lacks the requisite information, retrieval-augmented generation (RAG) is a cost-effective method that can improve its output quality. RAG incorporates information from an external knowledge source, such as a company’s existing qualitative data, as input in addition to the user prompt.
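The RAG pattern just described — retrieve relevant snippets from an internal knowledge base, then prepend them to the user prompt — can be illustrated with a toy example. Production systems use embedding-based vector search; the word-overlap scoring, snippet texts, and function names below are simplifying assumptions for the sketch only.

```python
# Toy RAG sketch: rank internal snippets by relevance to the query,
# then inject the top matches into the prompt sent to the LLM.

def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Rank snippets by word overlap with the query and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda s: len(q_words & set(s.lower().split())),
                    reverse=True)
    return scored[:k]

def build_rag_prompt(query: str, knowledge_base: list[str]) -> str:
    """Prepend retrieved context to the user question."""
    context = "\n".join(f"- {s}" for s in retrieve(query, knowledge_base))
    return f"Use this internal research context:\n{context}\n\nQuestion: {query}"

# Hypothetical siloed sources a brand team might hold.
knowledge_base = [
    "Survey tracker: brand awareness rose 4 points last quarter.",
    "CRM note: repeat buyers cite healthy ingredients as the top reason.",
    "Social listening: price complaints spiked after the January change.",
]
prompt = build_rag_prompt("Why do repeat buyers stay with the brand?", knowledge_base)
```

The assembled prompt would then go to the LLM, grounding its answer in the company's own data rather than only its pretraining.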
In our own research, we had mixed results when generating synthetic survey data using an LLM alone (without RAG). Although the LLM correctly captured the direction and magnitude of consumer attitudes, it exhibited two key weaknesses evident in many basic AI applications. First, the responses lacked heterogeneity; there was less variation in the AI’s answers compared with the human data. Second, the LLM answers lacked the internal consistency found in human answers; for example, the LLM’s answers did not rate attributes such as “healthy ingredients” and “safest food” similarly, as humans would. Both of those shortcomings were partially overcome when we used RAG to draw on existing qualitative data.
More broadly, RAG can be particularly useful for marketing research, where managers rely on multiple external information sources for decision-making. Effectively integrating siloed insight streams is a challenging task for marketing organizations: Survey trackers, customer relationship management (CRM) systems, social listening, and third-party intelligence rarely “speak” to one another in a cohesive way. LLMs using RAG offer “connective tissue” across disparate sources and enable cross-source synthesis. RAG can also be used to integrate multiple sources of information — such as in-house CRM, survey, and demographic data — and create an AI-enabled chatbot, or persona bot, that brand managers can use to gain a deeper understanding of their customers.
Lisa Gudding, president of strategic growth at consulting firm Ipsos, echoed the argument above, adding that “companies are now blending their own behavioral data with syndicated studies and trend signals that we supply to build richer, more dynamic insight ecosystems. This shift has given rise to data as a service [DaaS], where AI is enabling a new kind of consultative intelligence.” Market Logic and Stravito are two examples of DaaS-based knowledge management companies that integrate multiple sources of information to deliver insights to market researchers.
Although RAG is useful for integrating siloed, multimodal marketing data, it is not without limitations. First, it faces scalability challenges where retrieval accuracy and processing speed degrade as the knowledge base gets very large. Second, the inherent complexity and inconsistency of integrating real-time, multiformat marketing data require extensive preprocessing, which can restrict the volume and fidelity of information the LLM can effectively use. Finally, if the retrieval mechanism identifies information that is incomplete, is irrelevant, or lacks proper context, the quality of insights will be compromised, regardless of how good the LLM’s generative capabilities are.
On this issue, Chuck Hwang, vice president of analytics and insights at Procter & Gamble, observed that “some of the knowledge created, especially in marketing and research, is not fully preserved [and] is often embedded in slide decks or shared verbally, making it difficult for AI to fully capture the institutional context.” Therefore, the effectiveness of a RAG system depends on the underlying information retrieval architecture and data completeness. When these infrastructural and data quality challenges are successfully addressed, this knowledge integration aspect of generative AI can prove to be a source of significant value creation.
Human Oversight Is Essential
While we see immense value in using AI for both qualitative and quantitative research, we find it essential to underscore that humans are still the drivers of the insight-generation process.
At the data collection phase of qualitative research, companies can design human-AI teams to generate insights efficiently and effectively. LLMs are excellent assistants that can take the first pass at analyzing vast amounts of text and audiovisual data. This gives the experts time for higher-order tasks, such as ensuring that the insights answer the research questions. In our research, we found that more unique insights emerged from AI-human hybrids than from the human-only or LLM-only approaches. Experienced qualitative researchers and LLMs complement each other well.
Much along the same lines, in quantitative survey research, an LLM can rapidly generate a strong first draft of a survey that can serve as an efficient starting point in the design process. A human expert can begin with this draft survey and perform tasks like adding skip logic and programming instructions, and assessing respondent experience, before signing off on the final version. In this reimagined research pipeline, the LLM focuses on the laborious, repetitive, and uninteresting tasks while the human expert uses the time saved to think more creatively about the business questions to be answered and the quality of the insights the research should deliver.
As Microsoft senior director of consumer, brand, and AI insights Kajoli Tankha noted, “In our own work, GenAI has become a powerful collaborator — accelerating synthesis, enabling scale, and broadening what teams can take on. At the same time, human expertise remains essential for framing the right questions and translating outputs into insight.”
As with any disruptive innovation, we encourage companies to be thoughtful and strategic when adopting LLMs for marketing research. To calibrate and uncover the true value of an LLM for their business, companies should run multiple validation checks before fully embracing LLM-generated outcomes. Such a test-and-learn approach may reveal areas in which an LLM shines and those in which it is inappropriate.
Researchers must develop AI literacy so that they know how to prompt, evaluate, and govern models, and their companies must implement quality guardrails, bias checks, and strict protocols for working with AI. The adoption of generative AI increases the value of human judgment by elevating the researcher to the role of curator of truth rather than just a producer of tables, graphs, and slide decks.
GenAI and Marketing Research: Implementation Risks and Considerations
Like any technology, generative AI comes with significant negative externalities. Many are structural (such as intellectual property violation, impact on climate, and job displacements) and outside the scope of this article, but others are squarely related to marketing research and deserve full consideration within the insights function.
First, LLMs are prone to gender, race, and cultural biases because of the data on which they are trained. Modern-day marketing researchers should be trained to spot these limitations when incorporating LLMs into the research pipeline. This issue further reinforces the need for critical human oversight in marketing research.
Second, LLMs make it much easier not only to produce good marketing research but also to produce credible-looking research of low quality. Most of the experts with whom we spoke expressed concern about the marketing industry’s growing appetite for speed at the expense of truly meaningful insights.
Third, there is some early evidence of entry-level job losses in marketing because of AI.7 The tasks that can most easily be automated by LLMs have historically served as training opportunities for junior talent. Most of the experts with whom we spoke echoed concerns about AI’s impact on the talent pipeline. Without hands-on experience in tasks that AI can automate, they noted, emerging talent may struggle to develop the deep analytical thinking and contextual judgment required to interpret data meaningfully and challenge assumptions.
Finally, although digital twins have a tremendous upside, they could be misused to generate fraudulent data that is hard to detect. For example, human respondents to online surveys could use LLMs to generate realistic answers in order to earn compensation.
Although the risks outlined above are real, they can be mitigated through the rigorous oversight and AI literacy we advocated for earlier. GenAI is a powerful ally of marketers, and the next generation of marketing research will be defined by a symbiotic partnership led by humans and fully supported by AI.