🏗️ Scaffoldings
Scaffolding in machine learning, and particularly in large language models, is the process of providing a model with additional context on how to generate its output at inference time.
The term comes from traditional education, where it refers to support given to students that is gradually reduced as they become independent learners.
This is a powerful technique that can be used to steer the model in a particular direction without training or fine-tuning it.
Scaffolding Techniques
All of these techniques can be categorized as "prompt engineering".
I personally use the term "prompt engineering" for the phase where you iterate on prompts as a developer, while "scaffolding" is the automated process of providing the model with those prompts you've constructed, along with the other automated processes that determine how you query your model(s).
Role Assignment
By assigning the model a role, we can guide it to maintain a consistent approach.
You are an assistant that helps the user respond to their mails.
Act as a math teacher.
You are an expert in linguistics and grammar.
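As a minimal sketch, role assignment usually means prepending a system message to the conversation before it is sent to the model. The message structure below follows the common chat-message convention; `build_role_messages` is a name I've made up for illustration:

```python
def build_role_messages(role: str, user_message: str) -> list[dict]:
    """Prepend a system message that assigns the model a role."""
    return [
        {"role": "system", "content": role},
        {"role": "user", "content": user_message},
    ]

messages = build_role_messages(
    "You are an expert in linguistics and grammar.",
    "Is 'data' singular or plural?",
)
```

The role stays fixed across turns, which is what keeps the model's approach consistent.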
Step-by-Step Instructions
Providing the model with step-by-step instructions affects both output format and logic. This works best on complex tasks that you want to break down for the model.
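One way to automate this is to turn a task into an explicit numbered list before prompting. A sketch, with a made-up helper name:

```python
def build_stepwise_prompt(task: str, steps: list[str]) -> str:
    """Spell the task out as explicit numbered steps for the model to follow."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{task}\n\nFollow these steps in order:\n{numbered}"

prompt = build_stepwise_prompt(
    "Summarize the attached report.",
    ["List the key findings.",
     "Group related findings together.",
     "Write a three-sentence summary."],
)
```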
Expose Chain of Thought
Encourage the model to generate intermediate steps or thoughts that lead to the final output. This helps both in debugging and steering the model logic.
This is similar to step-by-step, but focused on why rather than how.
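In practice this often means asking for the reasoning first and then parsing out only the final answer, keeping the intermediate steps for debugging. A sketch, assuming the model respects an `ANSWER:` marker (which it may not always do):

```python
def with_reasoning(prompt: str) -> str:
    """Ask the model to show intermediate reasoning before the answer."""
    return (
        f"{prompt}\n\n"
        "Think through the problem step by step, explaining why each step "
        "follows, then give your final answer after the line 'ANSWER:'."
    )

def extract_answer(output: str) -> str:
    """Keep only the final answer; the reasoning is for debugging/steering."""
    return output.split("ANSWER:", 1)[-1].strip()
```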
Leading Examples
Provide examples to the model so that it generates output similar to them. This can be used to guide the model towards a certain style or format.
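This is commonly called few-shot prompting: input/output pairs go before the actual query so the model imitates their pattern. A sketch with an invented helper:

```python
def build_few_shot_prompt(instruction: str,
                          examples: list[tuple[str, str]],
                          query: str) -> str:
    """Lead with input/output pairs so the model imitates their style."""
    shots = "\n\n".join(f"Input: {inp}\nOutput: {out}" for inp, out in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"

prompt = build_few_shot_prompt(
    "Rewrite the sentence in a formal tone.",
    [("hey, what's up?", "Hello, how are you?"),
     ("gotta go, bye", "I must leave now. Goodbye.")],
    "thx for the help",
)
```

Ending the prompt with a dangling `Output:` nudges the model to complete the pattern rather than chat about it.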
Retrieval-Augmented Generation
Based on the user query, retrieve relevant information from a knowledge base and provide it to the model as context.
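A toy sketch of the idea: retrieve the most relevant documents, then paste them into the prompt as context. The word-overlap retrieval below is deliberately naive — real systems typically rank by embedding similarity — and all names are made up:

```python
def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Naive retrieval: rank documents by word overlap with the query.
    Real systems usually use embedding similarity instead."""
    q = set(query.lower().split())
    return sorted(knowledge_base,
                  key=lambda doc: -len(q & set(doc.lower().split())))[:k]

def build_rag_prompt(query: str, knowledge_base: list[str]) -> str:
    """Inject the retrieved documents into the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, knowledge_base))
    return (f"Use only the context below to answer.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")
```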
Format Assertion
Tell the model the expected format of the output, for example that it should be a poem or written in Markdown.
Template Filling
Provide a template to the model and ask it to fill in the blanks. This can be used to generate output in a certain format.
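A sketch: pre-fill the slots you already know, then hand the rest to the model. The template and helper name are illustrative only:

```python
TEMPLATE = """Dear {recipient},

{body}

Best regards,
{sender}"""

def build_template_prompt(template: str, known: dict) -> str:
    """Fill known slots ourselves; ask the model only for the rest."""
    partial = template
    for key, value in known.items():
        partial = partial.replace("{" + key + "}", value)
    return ("Fill in the remaining {placeholders} in this template, "
            "keeping everything else verbatim:\n\n" + partial)
```

Filling the known slots in code keeps them exact; the model only generates the genuinely open parts.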
Providing Constraints
By providing constraints, like "the output should be less than 100 words" or listing topics that shouldn't be mentioned, we can guide the model.
You must follow the instructions below:
- The output should be less than 100 words.
- Do not mention any names.
- The output should be in the form of a poem.
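Constraints like the word limit above can also be verified in code rather than trusted to the model. A sketch, with made-up helper names:

```python
def with_constraints(prompt: str, constraints: list[str]) -> str:
    """Append an explicit list of rules the model must follow."""
    rules = "\n".join(f"- {c}" for c in constraints)
    return f"{prompt}\n\nYou must follow the instructions below:\n{rules}"

def check_word_limit(output: str, limit: int = 100) -> bool:
    """Hard-verify the length constraint instead of trusting the model."""
    return len(output.split()) < limit
```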
Self-assessment
By asking the model about its own previous output, we can achieve more consistency and coherence.
After generating its output, the model can be asked to summarize, rate, fact check, or assess clarity of its own output.
You shouldn't rely too heavily on self-assessment, but it can work wonders as a secondary criterion in some use cases.
This can also scale up in complexity: as in reinforcement learning setups, the assessor can be a separate model that learns to evaluate outputs over time, though such systems can become very hard to reason about.
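The simple version is just a second pass over the first output. A sketch, where `query_model` is a hypothetical stand-in for your actual LLM call and the stub exists only for demonstration:

```python
def self_assess(output: str, criteria: str, query_model) -> str:
    """Second pass: ask a model (the same one, or a separate assessor)
    to rate the previous output. `query_model` is a stand-in for your LLM call."""
    return query_model(
        f"Rate the following output on {criteria} from 1 to 5. "
        f"Start your reply with the number.\n\nOutput:\n{output}"
    )

def accept(output: str, criteria: str, query_model, threshold: int = 4) -> bool:
    """Use the rating only as a secondary filter, not the main criterion."""
    rating = self_assess(output, criteria, query_model)
    return int(rating.split()[0]) >= threshold

# Stub model for demonstration; a real call would hit your LLM API.
stub_model = lambda prompt: "5 - clear and well structured"
```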
Split Prompting
More consistent guardrails can be achieved by breaking complex prompts into smaller, more manageable parts. Each part can then be prompted separately, even against separate, more restrained models.
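A sketch of such a pipeline: each sub-task is its own prompt, and each step could be routed to a different model. `query_model` is again a hypothetical stand-in; the stub below only exists so the flow can be demonstrated:

```python
def split_pipeline(document: str, query_model) -> str:
    """Break one complex request into three smaller, separately prompted steps."""
    facts = query_model(f"Extract the key facts:\n{document}")
    summary = query_model(f"Summarize these facts in one paragraph:\n{facts}")
    return query_model(f"Rewrite the summary in plain language:\n{summary}")

calls = []
def stub_model(prompt: str) -> str:  # stand-in for a real LLM call
    calls.append(prompt)
    return f"step-{len(calls)} result"

final = split_pipeline("Quarterly revenue grew while costs fell.", stub_model)
```

Because each step has a narrow job, its prompt (and its guardrails) can be much tighter than one monolithic prompt.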
Chain-of-Thought Prompting
You essentially guide the model through each step you've laid out before it, asking questions about its answers and asking it to reflect on them further.
This is similar to how OpenAI's o1 and other "reasoning" models work: there, the chain-of-thought prompting is embedded into the model itself.
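The multi-turn version can be sketched as a loop that feeds each answer back with a reflective follow-up. `query_model` is a hypothetical stand-in for a real LLM call, stubbed here for demonstration:

```python
def chain_of_thought(question: str, probes: list[str], query_model) -> list[str]:
    """Multi-turn chain of thought: feed each answer back to the model
    with a reflective follow-up question."""
    transcript = [query_model(question)]
    for probe in probes:
        transcript.append(
            query_model(f"{probe}\n\nYour previous answer:\n{transcript[-1]}")
        )
    return transcript

seen = []
def stub_model(prompt: str) -> str:  # stand-in for a real LLM call
    seen.append(prompt)
    return f"answer {len(seen)}"

turns = chain_of_thought(
    "Why is the sky blue?",
    ["Check each step of your reasoning.", "Can you simplify that?"],
    stub_model,
)
```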