Step-by-step guide with examples

Advanced Prompting for Reasoning Models: Few-Shot, Scratchpad, and Self-Consistency

This guide breaks down advanced prompting techniques for large language models focused on reasoning tasks. It covers few-shot prompting, scratchpad methods, and self-consistency, illustrating each with detailed examples for enterprise AI practitioners.

In this guide · 5 steps

01Few-Shot Prompting: Contextual Learning Through Examples
02Scratchpad Prompting: Improving Reasoning via Stepwise Explanation
03Self-Consistency: Leveraging Multiple Reasoning Paths
04Putting It All Together: Composite Prompting Strategies
05Checklist: Implementing Advanced Prompting for Reasoning Models

Advanced prompting techniques enhance the ability of large language models (LLMs) to perform reasoning tasks. These methods help models produce more accurate, interpretable, and consistent outputs, which are critical for enterprise-grade decision support systems.

1. Few-Shot Prompting: Contextual Learning Through Examples

Few-shot prompting provides the model with a handful of carefully engineered examples within the prompt. By demonstrating the task in context, the model learns to generalize the pattern without explicit fine-tuning. This approach, introduced by Brown et al. in the GPT-3 paper (2020), is widely used for complex reasoning problems.

Example: To classify the sentiment of a sentence, a prompt might include two examples — one positive, one negative — followed by the query sentence. The model infers the classification pattern from these examples and applies it to the new input.

Example prompt snippet for sentiment classification: """ Review: 'The movie was fantastic and thrilling.' Sentiment: Positive Review: 'I found the plot dull and predictable.' Sentiment: Negative Review: 'The acting was superb and immersive.' Sentiment:"""

Best practice: Limit the number of examples to fit within the model’s token limit while maintaining task diversity. Research from OpenAI suggests 8-12 examples often strike a balance for models like GPT-3 (175B parameters).

2. Scratchpad Prompting: Improving Reasoning via Stepwise Explanation

Scratchpad prompting asks the model to generate step-by-step intermediate computations or reasoning traces before producing the final output. This technique guides the model to 'think aloud,' which often improves accuracy on multi-step reasoning problems.

A seminal example came from Nye et al. (2021), who showed that instructing language models to output their reasoning steps improved performance on arithmetic and logic tasks by over 15 percentage points compared to direct answer prompting.

Example: Given a math problem, the prompt instructs the model: """ Calculate 23 multiplied by 17 step-by-step. Step 1: Multiply 20 by 17 = 340 Step 2: Multiply 3 by 17 = 51 Step 3: Sum results = 340 + 51 = 391 Answer: 391"""

This explicit breakdown helps the model avoid shortcut reasoning and provides a transparent audit trail for downstream verification or human review.

3. Self-Consistency: Leveraging Multiple Reasoning Paths

Self-consistency extends scratchpad prompting by sampling multiple reasoning chains and aggregating their final answers. This method, introduced by Wang et al. (2022), improves robustness and reduces model hallucinations by selecting the most consistent output across diverse reasoning paths.

Implementation involves generating several output completions with reasoning steps using temperature-controlled sampling. The final answer is chosen by majority vote or most frequent conclusion, improving the probability of correctness.

For example, on a multi-step logic puzzle, a model might produce 10 distinct reasoning solutions. If 7 solutions conclude answer "C" and 3 solutions choose "A," the self-consistent answer returned is "C."

4. Putting It All Together: Composite Prompting Strategies

Enterprises can combine these techniques to optimize reasoning with LLMs. For example, start with few-shot prompting using diverse examples, add scratchpad instructions to encourage intermediate reasoning, and apply self-consistency sampling at inference to validate answers.

Sample composite prompt for a reasoning task might look like this: """ Problem: If 3 workers complete a task in 6 hours, how long for 5 workers? Answer step-by-step: Step 1: Total work units = 3 workers * 6 hours = 18 Step 2: Time for 5 workers = 18 / 5 = 3.6 hours Answer: 3.6 hours"""

Run this prompt multiple times with stochastic sampling, then select the answer most consistently produced. This approach balances example-driven contextual learning, stepwise reasoning clarity, and output reliability.

5. Checklist: Implementing Advanced Prompting for Reasoning Models

Key steps for enterprise teams implementing advanced reasoning prompting

Curate diverse, high-quality few-shot examples specific to the reasoning domain.
Instruct the model explicitly to produce step-by-step reasoning (scratchpad) before the final answer.
Use model temperature controls to enable diverse output sampling for self-consistency.
Aggregate multiple sampled reasoning paths and select the majority or consensus answer.
Validate generated intermediate steps for transparency and error detection.
Measure accuracy improvements on benchmark tasks aligned with enterprise needs.
Optimize example count to remain within model token limits while maximizing task coverage.

Advanced prompting techniques—few-shot, scratchpad, and self-consistency—provide practical levers to improve reasoning with LLMs, yielding more reliable and interpretable outputs for enterprise AI decision-making.