# Working with Advanced Pipeline Examples

This guide demonstrates how to load, modify, and run an existing advanced pipeline example, focusing on the two-step justified confidence model for tossup questions.

## Loading the Two-Step Justified Confidence Example

1. Navigate to the "Tossup Agents" tab at the top of the interface.
2. Click the "Select Pipeline to Import..." dropdown and choose "two-step-justified-confidence.yaml".
3. Click "Import Pipeline" to load the example into the interface.

## Understanding the Two-Step Pipeline Structure

The loaded pipeline has two distinct steps:

1. **Step A: Answer Generator**
   - Uses OpenAI/gpt-4o-mini
   - Takes question text as input
   - Generates an answer candidate
   - Uses a focused system prompt for answer generation only
2. **Step B: Confidence Evaluator**
   - Uses Cohere/command-r-plus
   - Takes the question text AND the generated answer from Step A
   - Evaluates confidence and provides justification
   - Uses a specialized system prompt for confidence evaluation

This separation of concerns allows each model to focus on a specific task:

- The first model concentrates solely on generating the most accurate answer
- The second model evaluates how confident we should be in that answer

## Modifying the Pipeline for Better Performance

Here are some ways to enhance the pipeline:

1. **Upgrade the Answer Generator**:
   - Click on Step A in the interface
   - Change the model from gpt-4o-mini to a more powerful model like gpt-4o
   - Modify the system prompt to include more specific instructions about quizbowl answer formatting
2. **Improve the Confidence Evaluator**:
   - Click on Step B
   - Add specific domain knowledge to the system prompt
   - For example, add: "Consider question length when evaluating confidence. Shorter, incomplete questions with less information revealed typically result in lower confidence scores."
   - Change the order of the output variables so that the model produces the justification before the confidence score, and therefore conditions the confidence score on that justification.

## Running and Testing Your Modified Pipeline

1. After making your modifications, scroll down to adjust the buzzer settings:
   - Consider changing the confidence threshold based on the performance of your enhanced model
   - You might want to lower it slightly if you've improved the confidence evaluator
2. Test your modified pipeline:
   - Select a Question ID or use the provided sample question
   - Click "Run on Tossup Question"
   - Observe the answer, confidence score, and justification
3. Check the "Buzz Confidence" chart to see how confidence evolved during question processing

## Advantages of Multi-Step Pipelines

Multi-step pipelines offer several benefits:

1. **Specialized Models**: Use different models for different tasks (e.g., GPT for general knowledge, Claude for reasoning)
2. **Focused Prompting**: Each step can have a targeted system prompt optimized for its specific task
3. **Chain of Thought**: Build sophisticated reasoning by connecting steps in a logical sequence
4. **Better Confidence Calibration**: Dedicated confidence evaluation typically results in more reliable buzzing
5. **Transparency**: The justification output helps you understand why the model made certain decisions
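
To make the data flow between the two steps concrete, here is a minimal Python sketch of the same idea outside the interface. It is not the interface's implementation: `run_two_step`, `StepResult`, the two `fake_*` functions, and the default threshold of 0.85 are all hypothetical names and values chosen for illustration, and the stand-in functions replace real model calls with simple rules. The sketch only shows that Step B receives both the question text and Step A's answer, that the justification is produced before the confidence score, and that the buzz decision compares the confidence against a threshold.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class StepResult:
    answer: str
    justification: str
    confidence: float
    buzz: bool


def run_two_step(
    question_text: str,
    answer_generator: Callable[[str], str],
    confidence_evaluator: Callable[[str, str], Tuple[str, float]],
    threshold: float = 0.85,  # assumed value, not taken from the guide
) -> StepResult:
    # Step A: generate an answer candidate from the question text alone.
    answer = answer_generator(question_text)
    # Step B: evaluate the answer using the question text AND Step A's output.
    # The evaluator returns the justification first, then the confidence.
    justification, confidence = confidence_evaluator(question_text, answer)
    # Clamp to [0, 1] so a misbehaving evaluator cannot break the comparison.
    confidence = min(1.0, max(0.0, confidence))
    return StepResult(answer, justification, confidence, confidence >= threshold)


# Hypothetical stand-ins for the two model calls.
def fake_answer_generator(question_text: str) -> str:
    return "Isaac Newton" if "gravitation" in question_text else "unknown"


def fake_confidence_evaluator(question_text: str, answer: str) -> Tuple[str, float]:
    revealed = len(question_text.split())
    if answer == "unknown":
        return ("No clue in the revealed text supports an answer.", 0.1)
    # Shorter, incomplete questions yield lower confidence.
    return (f"{revealed} words revealed and the clue matches the answer.", revealed / 10)


question = (
    "This scientist formulated the law of universal gravitation "
    "and three laws of motion"
)
words = question.split()

# Reveal the question progressively, as happens during a tossup.
for n in (5, len(words)):
    partial = " ".join(words[:n])
    result = run_two_step(partial, fake_answer_generator, fake_confidence_evaluator)
    print(n, result.answer, round(result.confidence, 2), result.buzz)
```

Passing the two steps in as plain callables keeps the sketch independent of any provider SDK, so either stand-in can be swapped for a real model call without touching the buzz logic.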