Evaluator-Optimizer
The Evaluator-Optimizer pattern is a design in which a generator produces an output, an evaluator assesses its quality against defined criteria, and an optimizer revises the output based on that feedback. The cycle can repeat until the output meets a quality threshold.
Basic Structure
```yaml
states:
  - id: start
    type: initial
    next: generate_initial

  - id: generate_initial
    type: llm
    model: generator
    prompt: |
      Generate content for:
      {topic}
    input:
      topic: "${input.topic}"
    next: evaluate

  - id: evaluate
    type: llm
    model: evaluator
    prompt: |
      Evaluate this content:
      {content}
      Provide scores for:
      - Clarity
      - Accuracy
      - Completeness
    input:
      content: "${previous.output}"
    next: optimize

  - id: optimize
    type: llm
    model: optimizer
    prompt: |
      Improve this content based on evaluation:
      Content: {content}
      Evaluation: {evaluation}
      Focus on improving low-scoring aspects.
    input:
      content: "${states.generate_initial.output}"
      evaluation: "${previous.output}"
    next: end

  - id: end
    type: output
    value: "${previous.output}"
```
Common Use Cases
1. Content Generation
```yaml
states:
  - id: start
    type: initial
    next: generate_content

  - id: generate_content
    type: llm
    model: generator
    prompt: |
      Write an article about:
      {topic}
      Include:
      - Introduction
      - Main points
      - Conclusion
    input:
      topic: "${input.topic}"
    next: evaluate_content

  - id: evaluate_content
    type: llm
    model: evaluator
    prompt: |
      Evaluate this article:
      {content}
      Score (1-10):
      - Structure
      - Clarity
      - Engagement
      - Grammar
    input:
      content: "${previous.output}"
    next: optimize_content

  - id: optimize_content
    type: llm
    model: optimizer
    prompt: |
      Improve this article based on evaluation:
      Article: {content}
      Scores: {scores}
      Focus on improving areas with scores below 8.
    input:
      content: "${states.generate_content.output}"
      scores: "${previous.output}"
    next: end
```
2. Code Generation
```yaml
states:
  - id: start
    type: initial
    next: generate_code

  - id: generate_code
    type: llm
    model: generator
    prompt: |
      Generate code for:
      {function_description}
      Use {language} and follow best practices.
    input:
      function_description: "${input.description}"
      language: "${input.language}"
    next: evaluate_code

  - id: evaluate_code
    type: llm
    model: evaluator
    prompt: |
      Evaluate this code:
      {code}
      Check for:
      - Correctness
      - Performance
      - Security
      - Style
    input:
      code: "${previous.output}"
    next: optimize_code

  - id: optimize_code
    type: llm
    model: optimizer
    prompt: |
      Improve this code based on evaluation:
      Code: {code}
      Issues: {issues}
      Fix all identified issues.
    input:
      code: "${states.generate_code.output}"
      issues: "${previous.output}"
    next: end
```
3. Data Analysis
```yaml
states:
  - id: start
    type: initial
    next: analyze_data

  - id: analyze_data
    type: llm
    model: analyzer
    prompt: |
      Analyze this dataset:
      {data}
      Provide:
      - Statistical analysis
      - Key insights
      - Visualizations
    input:
      data: "${input.dataset}"
    next: evaluate_analysis

  - id: evaluate_analysis
    type: llm
    model: evaluator
    prompt: |
      Evaluate this analysis:
      {analysis}
      Check for:
      - Statistical validity
      - Insight quality
      - Visualization effectiveness
    input:
      analysis: "${previous.output}"
    next: optimize_analysis

  - id: optimize_analysis
    type: llm
    model: optimizer
    prompt: |
      Improve this analysis based on evaluation:
      Analysis: {analysis}
      Feedback: {feedback}
      Enhance weak areas.
    input:
      analysis: "${states.analyze_data.output}"
      feedback: "${previous.output}"
    next: end
```
Advanced Patterns
1. Iterative Optimization
```yaml
states:
  - id: start
    type: initial
    next: generate_initial

  - id: generate_initial
    type: llm
    model: generator
    prompt: "Generate: {task}"
    input:
      task: "${input.task}"
    next: evaluate

  - id: evaluate
    type: llm
    model: evaluator
    prompt: "Evaluate: {content}"
    input:
      content: "${previous.output}"
    transitions:
      - when: "${output.score >= 8}"
        next: end
      - next: optimize

  - id: optimize
    type: llm
    model: optimizer
    prompt: |
      Improve based on evaluation:
      Content: {content}
      Feedback: {feedback}
    input:
      content: "${states.generate_initial.output}"
      feedback: "${states.evaluate.output}"
    next: evaluate
```
2. Multi-criteria Evaluation
```yaml
states:
  - id: start
    type: initial
    next: generate

  - id: generate
    type: llm
    model: generator
    prompt: "Generate: {task}"
    input:
      task: "${input.task}"
    next: parallel_evaluation

  - id: parallel_evaluation
    type: parallel
    branches:
      - id: quality_eval
        states:
          - id: start
            type: initial
            next: evaluate_quality
          - id: evaluate_quality
            type: llm
            model: quality_evaluator
            prompt: "Evaluate quality: {content}"
      - id: technical_eval
        states:
          - id: start
            type: initial
            next: evaluate_technical
          - id: evaluate_technical
            type: llm
            model: technical_evaluator
            prompt: "Evaluate technical aspects: {content}"
    next: optimize

  - id: optimize
    type: llm
    model: optimizer
    prompt: |
      Improve based on evaluations:
      Content: {content}
      Quality Feedback: {quality}
      Technical Feedback: {technical}
    input:
      content: "${states.generate.output}"
      quality: "${branches.quality_eval.output}"
      technical: "${branches.technical_eval.output}"
    next: end
```
Best Practices
- Clear Criteria: Define explicit, measurable evaluation criteria so scores are consistent across runs
- Balanced Feedback: Have the evaluator report strengths as well as weaknesses
- Iterative Improvement: Allow multiple optimization cycles, but bound them (see the sketch after this list)
- Specific Guidance: Ask the evaluator for concrete, actionable improvement suggestions
- Quality Thresholds: Set an explicit score threshold that ends the loop
- Documentation: Document the evaluation criteria and the optimization process
- Testing: Test the evaluator-optimizer pipeline end to end
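Below is a minimal sketch of the threshold and iteration-cap ideas, reusing the `transitions` syntax from the Iterative Optimization example. The `${iterations}` counter does not appear anywhere above, so treat it as a hypothetical variable that your runtime would need to provide.

```yaml
# Sketch only: combines a quality threshold with a cap on optimization cycles.
# "${iterations}" is a hypothetical counter -- the examples above do not show
# how this DSL tracks loop counts, so adapt it to whatever your runtime offers.
- id: evaluate
  type: llm
  model: evaluator
  prompt: "Evaluate against the agreed criteria: {content}"
  input:
    content: "${previous.output}"
  transitions:
    - when: "${output.score >= 8}"   # quality threshold reached: stop optimizing
      next: end
    - when: "${iterations >= 3}"     # assumed counter: avoid over-optimization
      next: end
    - next: optimize                 # otherwise run another optimization cycle
```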
Common Pitfalls
- Vague Criteria: Unclear evaluation criteria
- Over-optimization: Too many optimization cycles
- Biased Evaluation: Unbalanced feedback
- Missing Context: Giving the optimizer too little context, for example only the evaluation without the original task (see the sketch after this list)
- Poor Documentation: Inadequate process documentation
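One way to avoid the missing-context pitfall is to hand the optimizer the original task alongside the draft and the evaluation. The sketch below follows the same style as the examples above; the state and variable names (`optimize`, `{task}`, and so on) are illustrative only.

```yaml
# Sketch only: give the optimizer the original task, the current draft, and the
# evaluation, so it is not revising the content blind.
- id: optimize
  type: llm
  model: optimizer
  prompt: |
    Task: {task}
    Draft: {content}
    Evaluation: {evaluation}
    Revise the draft to address the evaluation while staying on task.
  input:
    task: "${input.task}"                           # original request
    content: "${states.generate_initial.output}"    # draft being improved
    evaluation: "${states.evaluate.output}"         # evaluator feedback
  next: evaluate
```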
Next Steps
- Learn about Autonomous Agent
- Explore Best Practices
- Read about Testing