How can recent advances in generative AI tools be applied to transform pricing decisions? By lowering technical and financial barriers, such tools democratize access to sophisticated pricing capabilities, empowering even small businesses to benefit from artificial intelligence without the need for costly, bespoke solutions. There are fundamental differences between GenAI-driven approaches and traditional algorithmic pricing that should inform how the technology is used. Effective pricing recommendations depend heavily on how prompts are crafted (for now, at least). In this article, drawn from my forthcoming book on pricing in the age of AI, I’ll discuss the recent, promising trend of using large language models to support pricing decisions.
Here, pricing refers to simple, static recommendations of price points for products or services. Prompting an LLM to recommend a price requires no technical skill and can be done for any type of product or service at very low cost. More complex pricing strategies that are highly dynamic and adaptive still require algorithmic development and are not discussed in this article, although AI agents and multi-AI-agent systems can be very useful for such tasks as well.
How GenAI Democratizes Pricing
For years, companies have used algorithmic and AI pricing models to optimize prices with precision and speed. But a new player has entered the scene: generative AI, powered by LLMs like ChatGPT.
How do LLM-based tools compare with traditional algorithmic tools when used for pricing?
Traditional algorithmic pricing tools have been used extensively in sectors like e-commerce, travel, hospitality, and ride-hailing. To optimize prices, these models rely on high-quality historical data, carefully built rules, and reasonable assumptions about the market. These are custom-built solutions that are powerful and data-driven, but building, maintaining, and deploying them is often expensive and requires technical expertise. These tools typically offer (1) reliable and repeatable results, (2) precision in complex, multivariable pricing environments (such as offering dynamic discounts and personalized promotions), and (3) the ability to scale rapidly and learn over time. But their drawbacks include high setup and maintenance costs, limited flexibility (if they’re built around predefined logic and data frameworks, for example), and concerns around bias and explainability (which is true for LLMs as well).
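For contrast, here is a minimal sketch of the kind of predefined logic such an engine encodes. The rules and numbers are hypothetical; real systems estimate demand and elasticity from historical data rather than hard-coding them.

```python
# Hypothetical rule-based pricing logic: cost-plus baseline, capped near
# competitors, nudged up or down by a demand signal.
def rule_based_price(unit_cost: float, target_margin: float,
                     competitor_price: float, demand_index: float) -> float:
    base = unit_cost * (1 + target_margin)               # cost-plus baseline
    capped = min(base, competitor_price * 1.05)          # stay within 5% of competitors
    adjusted = capped * (1 + 0.10 * (demand_index - 1))  # demand nudge
    return round(adjusted, 2)


print(rule_based_price(unit_cost=40, target_margin=0.35,
                       competitor_price=59.99, demand_index=1.2))
```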
LLMs offer an entirely different approach. Instead of hard-coding logic and data, managers simply feed in a prompt (i.e., an instruction or question) in natural language, describing the market, product, cost and margin information, and strategic goals, and the model generates pricing recommendations on the spot (a minimal example of such a prompt appears after the list below). It does so without requiring proprietary data or custom code, and delivers insights based on its vast training knowledge. Of course, any available proprietary data can be used by LLMs (either directly in the prompt or by fine-tuning the model) to obtain more specialized, contextually refined answers. Using LLMs for pricing has several advantages:
Ease of use. No technical expertise is needed.
Low-cost and fast deployment. There is no need to purchase an expensive license — just write a prompt.
Flexible thinking. An LLM isn’t constrained by preprogrammed logic, which means that it can propose creative pricing strategies and think outside the box.
Democratization of pricing tools. Businesses of any size can now access insights that were once reserved for large, data-rich organizations.
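As promised above, here is a minimal sketch of this prompt-driven workflow. It assumes access to the OpenAI Python SDK and an API key in the environment; the model name, product details, and prompt wording are all illustrative.

```python
# Minimal sketch: asking an LLM for a price recommendation in plain language.
from openai import OpenAI

client = OpenAI()

prompt = """You are a pricing analyst for a mid-sized kitchenware brand.
Product: stainless-steel chef's knife, unit cost $18, target gross margin 55%.
Market: main competitors sell comparable knives at $49-$65.
Positioning: premium but accessible; sold direct to consumer.
Goal: recommend a launch price and briefly explain the trade-offs."""

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```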
But while generative AI unlocks new opportunities, it also introduces new concerns that need to be carefully considered and mitigated:
Lack of explainability. It’s difficult to trace how GenAI models arrive at pricing suggestions (unlike several traditional algorithmic approaches that follow specific rules). This can be risky in regulated markets, such as finance and health care.
Lack of consistency. GenAI responses vary depending on how a question is phrased, and even identical prompts can yield different answers across runs. This lack of consistency reduces confidence in the results.
Potential presence of biases. GenAI models are trained on internet-scale data, which may include skewed perspectives or regional/cultural biases.
Prompt quality equals outcome quality. A weak prompt leads to weak pricing advice. At the same time, recent advances in LLMs are decreasing the reliance on prompt quality.
LLMs aren’t replacing pricing teams (at least not yet), but they are augmenting them. Generative AI offers a quick, affordable way to brainstorm or validate pricing strategies before investing in deeper analytics. In industries with limited data, fast-moving markets, or small budgets, LLMs offer game-changing access to pricing intelligence. However, for mission-critical pricing decisions, especially those requiring legal or financial transparency, traditional algorithmic tools still provide greater control and reliability. Future pricing strategies may involve a hybrid approach: using GenAI for ideation and flexibility, and algorithms for precision and execution. For such an approach to be effective, users must master how to prompt properly.
How to Prompt for Pricing
Given that GenAI-based pricing is primarily context- and language-driven, the prompt will determine the kind of answer you get in return. The way you frame the question, how much detail you include, what context you provide, and how clearly you guide the AI can all dramatically affect the quality, accuracy, and pertinence of the pricing recommendation you receive.
I once advised a retailer who wanted to use ChatGPT to help price a new product line. Their initial prompt returned a lowball price that, based on their intuition and experience, risked undercutting the brand. But after they refined the prompt to include details about target margins, competitive benchmarks, and product positioning, the LLM suggested a much more strategic price that aligned with both business goals and market expectations.
Similarly, a digital subscription business I worked with used an LLM for testing different bundle configurations. When the manager asked, “What price should I charge for a bundle of services?” the AI provided vague suggestions. But when the question was reframed to include past churn data, willingness-to-pay insights, and tiered usage patterns, the model suggested highly strategic price points and even proposed a new “light” tier that hadn’t been considered.
These cases illustrate a broader truth: You should treat GenAI not as a magic answer machine but as a collaborative partner that needs proper direction. Prompting should be viewed as an iterative dialogue rather than a one-shot request. Crafting thoughtful, structured, and data-informed prompts will make the difference between helpful assistance and a misguided guess.
A key principle in effective prompting is to structure the prompt in distinct parts, each serving a specific purpose. A commonly used approach is the RISEN five-step framework.
- Role: Defines the AI’s role and expertise in relation to the task.
- Input (or Instructions): Specifies the data or information the AI should consider.
- Steps: Outlines the process or sequence of actions the AI should follow.
- Expectations: Clearly states the desired output, outcome, or goal.
- Narrowing (or Novelty): Adds constraints or areas of focus to guide a more precise, targeted response.
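To make the RISEN structure concrete, here is one way to assemble such a prompt programmatically. The product, market figures, and wording are illustrative.

```python
# A RISEN-structured pricing prompt assembled from its five parts.
risen_prompt = "\n\n".join([
    # Role: who the model should act as
    "Role: Act as a senior pricing strategist for consumer electronics.",
    # Input: the data the model should consider
    "Input: Wireless earbuds, unit cost $31, competitor prices $79-$129, "
    "target gross margin of at least 55%, brand positioned as mid-premium.",
    # Steps: the process to follow
    "Steps: (1) Summarize the competitive landscape, (2) assess customer "
    "value perception, (3) propose a launch price, (4) justify it against "
    "the margin target.",
    # Expectations: the desired output
    "Expectations: One recommended price, a short rationale, and two "
    "alternative price points with their trade-offs.",
    # Narrowing: constraints that focus the answer
    "Narrowing: Assume a U.S. direct-to-consumer launch; ignore promotional "
    "discounts and bundles.",
])
print(risen_prompt)
```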
While the effectiveness of the RISEN framework varies considerably with context, it’s generally regarded as good practice for crafting high-quality prompts. Still, it’s important to recognize that LLMs are inherently stochastic, meaning that their output can vary even when the same prompt is used with the same LLM. This randomness means the model will sometimes produce inconsistent answers. LLMs don’t retrieve fixed answers from a database or apply clear, well-defined rules; instead, they generate text by predicting the next word in a sequence based on probabilities. On repeated runs, LLMs may yield varying pricing suggestions or different justifications for similar price points.
To mitigate this variability, pricing teams can adopt the following best practices:
- Run multiple iterations of the same prompt, and analyze the range of outputs to help identify a pricing consensus and filter out outliers (called a stability test; see the sketch after this list).
- Use structured, step-by-step prompting (also known as chain-of-thought prompting), where you break down the decision-making process and guide the AI through it in a clear, logical order, or you explicitly ask it to explain its reasoning step by step. For example, you could first instruct the AI to evaluate the competitive landscape, then to assess the customer value perception, and finally to propose a price.
- Consider saving the top-performing prompt versions that consistently yield clear, well-reasoned outputs, and document the assumptions that were made, for traceability and internal validation.
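The stability test described in the first point lends itself to a short script. Below is a minimal sketch, assuming a hypothetical recommend_price helper that wraps one LLM call and returns a single numeric recommendation; the outlier threshold is arbitrary.

```python
# Stability test sketch: run the same pricing prompt many times, drop
# extreme outliers, and summarize the remaining recommendations.
# `recommend_price(prompt)` is a hypothetical helper that calls an LLM once
# and parses a single numeric price from the response.
import random
import statistics

def stability_test(prompt: str, recommend_price, runs: int = 30) -> dict:
    prices = [recommend_price(prompt) for _ in range(runs)]
    median = statistics.median(prices)
    # Treat recommendations more than 25% away from the median as outliers.
    kept = [p for p in prices if abs(p - median) / median <= 0.25]
    return {
        "consensus": round(statistics.mean(kept), 2),
        "range": (min(kept), max(kept)),
        "outliers_removed": len(prices) - len(kept),
    }

# Example with a stand-in function in place of a real LLM call:
print(stability_test("Recommend a price for ...",
                     recommend_price=lambda _: random.gauss(52, 3)))
```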
Ultimately, you should not expect generative AI to offer one “perfect” price every time. Instead, treat it as a smart brainstorming partner that can validate intuitions. Its variability is a feature, not a bug, and can spark deeper strategic thinking and uncover pricing angles that might otherwise go unexplored.
Recent advances in GenAI are making prompt engineering easier and potentially less critical to success. Newer models increasingly engage users in interactive dialogues, asking follow-up questions to clarify intent and refine the prompt, effectively enabling chain-of-thought prompting by default.
How GenAI Pricing Works in Practice
Here, let’s consider two specific examples of using generative AI for pricing in everyday contexts: pricing a used car and setting the hourly rate for contracting services.
Used-car pricing. I collected 100,000 car listings from a popular U.S. online platform and asked four different LLMs to recommend appropriate selling prices. To investigate the impact of prompting, I tested four levels of sophistication — from a vague prompt providing only the vehicle model to a comprehensive prompt incorporating vehicle specifications, market intelligence, macroeconomic factors, and strategic negotiation considerations. The results were striking: When the most sophisticated prompt was used, all of the tested LLMs recommended prices that were, on average, within 3% of the human-set (seller’s) price.
The improvement across prompt levels was dramatic. GPT-4o’s recommendations improved from a 14.75% deviation (relative to the human-set prices) when prompted with only the vehicle model to just 2.95% when provided with a more comprehensive prompt that included more information about the vehicle, macroeconomic factors, and a reminder to leave room for negotiation, while Claude 3.7 Sonnet improved from 11.26% to 2.61%. This demonstrates that prompting properly is crucial. I also conducted a stability test, running the same prompt 1,000 times for each of 1,000 randomly selected sales listings. While I observed some variation — with maximum deviations averaging 11.12% — after I removed extreme outliers, the prices recommended for 93.7% of the cars were within 5% of the average human-set value.
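For readers who want to run this kind of evaluation on their own listings, the core metrics are simple to compute once AI recommendations and seller prices are paired. Here is a minimal sketch; the numbers are illustrative stand-ins, not the study’s data.

```python
# Sketch: measuring how far AI-recommended prices fall from sellers' prices.
ai_prices = [18900, 24250, 31100, 15750, 27600]
seller_prices = [19500, 23900, 29800, 16200, 28400]

deviations = [abs(a - s) / s for a, s in zip(ai_prices, seller_prices)]
mean_deviation = sum(deviations) / len(deviations)
share_within_5pct = sum(d <= 0.05 for d in deviations) / len(deviations)

print(f"Mean deviation from seller price: {mean_deviation:.2%}")
print(f"Share of listings within 5% of seller price: {share_within_5pct:.2%}")
```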
The practical takeaway is clear: GenAI can provide remarkably accurate price recommendations for used cars but only with well-crafted prompts. Users should run multiple iterations to test the model’s stability, challenge the AI’s recommendations through follow-up questions, and be mindful that occasional hallucinations can occur. (Extreme outliers were observed in rare cases.)
Hourly rates for contractors. In a 2025 study that I conducted with colleagues at McGill University, we analyzed 60,000 freelancer listings across six job categories (accounting, full stack development, virtual assistance, data analytics, graphic design, and social media marketing) to test how generative AI recommends hourly rates freelancers should charge. Unlike in the previous example, we found that the rates recommended by LLMs, based on the information in the freelancers’ profiles and the models’ broad training data, were systematically higher than the rates set by freelancers themselves: Human rates averaged $23.60, while AI recommendations ranged from $30.72 to $36.18 across different models. This consistent upward bias held across all job categories and experience levels.
We also conducted three important bias and discrimination tests. First, we tested for gender-based discrimination by creating three versions of each listing: one with a common male name, one with a common female name, and one with no name specified. Reassuringly, we found no gender-based price discrimination across any of the eight tested LLMs. Second, we tested for geography-based disparities by duplicating listings while varying only the freelancer’s location, and this test revealed significant gaps: When we took identical freelancer profiles and changed only the location to either the U.S. or the Philippines, the AI recommended rates that were 53% to 106% higher for U.S.-based freelancers, even though skills and experience were identical. Third, we tested for age-based discrimination by setting the freelancers’ ages at 22, 37, or 60 years old. Once again, we observed clear price disparities: Sixty-year-old freelancers were priced 46% higher than 22-year-olds and 8.1% higher than 37-year-olds with the same profile.
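The bias tests follow the same duplicate-and-vary logic. As an illustration, here is a minimal sketch of the location comparison, assuming a hypothetical recommend_rate helper that returns one hourly rate per LLM call; the profile fields are illustrative.

```python
# Sketch of the location test: duplicate an identical freelancer profile,
# vary only the location, and compare mean recommended hourly rates.
# `recommend_rate(profile)` is a hypothetical helper wrapping one LLM call.
import random
import statistics

base_profile = {
    "category": "graphic design",
    "experience_years": 5,
    "skills": ["branding", "Figma", "illustration"],
}

def location_gap(recommend_rate, runs: int = 20) -> float:
    """Ratio of the mean U.S. rate to the mean Philippines rate."""
    mean_rate = {}
    for location in ("United States", "Philippines"):
        profile = {**base_profile, "location": location}
        mean_rate[location] = statistics.mean(
            recommend_rate(profile) for _ in range(runs)
        )
    return mean_rate["United States"] / mean_rate["Philippines"]

# Example with a stand-in function in place of a real LLM call:
print(location_gap(lambda p: random.gauss(
    35 if p["location"] == "United States" else 20, 2)))
```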
The implications here are nuanced. On the one hand, generative AI can provide useful hourly rate recommendations and appears to be free from gender bias. On the other hand, the systematic overpricing and location- and age-based disparities raise important questions about fairness in the global gig economy.
For anyone using generative AI to price products or services, the lesson is to treat AI recommendations as one input among many, and to carefully test for potential biases. The technology can help users validate their intuitions and explore market positioning, but they must remain vigilant about unintended discrimination patterns, particularly those tied to geography, demographics, and economic development levels.
While traditional algorithmic pricing approaches do raise concerns about collusion, several results based on market simulations suggest that, overall, GenAI-driven pricing is more likely to enhance competition than to harm it, provided that businesses use LLMs wisely and mindfully.
LLMs offer an exciting, highly accessible new tool for pricing. Their impact on market prices depends largely on human input (prompting strategies), and they can serve as a useful complement to traditional algorithmic approaches to pricing.