Agent Test Bench System¤
Overview¤
The Agent Test Bench System provides testing capabilities for all AI agents and system features. It includes API endpoints for programmatic testing and a developer-friendly web interface for interactive testing.
Backend API Endpoints¤
All test bench endpoints are prefixed with /api/test-bench.
Agent Testing¤
Test Individual Agent¤
- Endpoint: POST /api/test-bench/agent/{agentType}/test
- Purpose: Test a specific agent with a custom message
- Request Body:
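A minimal sketch of building such a request with Python's standard library. The base URL `http://localhost:3000` and the body fields `message`, `userId`, and `conversationHistory` are assumptions chosen to match the parameters named under Usage; the server's route handler is the authority on the actual schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: local dev server address


def build_agent_test_request(agent_type, message, user_id="test-user-1"):
    """Build (but do not send) a POST request for a single-agent test."""
    # Field names below are hypothetical; the real schema may differ.
    body = {"message": message, "userId": user_id, "conversationHistory": []}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/test-bench/agent/{agent_type}/test",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_agent_test_request("joke", "Tell me a joke")
print(req.get_method(), req.full_url)
# Send with urllib.request.urlopen(req) once the dev server is running.
```

Building the request separately from sending it makes the payload easy to inspect or log before anything touches the network.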
Bulk Test All Agents¤
- Endpoint: POST /api/test-bench/bulk-test
- Purpose: Test all agents with the same message
- Request Body:
Feature Testing¤
Message Classification¤
- Endpoint: POST /api/test-bench/classifier/test
- Purpose: Test message classification system
- Request Body:
RAG Service¤
- Endpoint: POST /api/test-bench/rag/test
- Purpose: Test RAG (Retrieval-Augmented Generation) service
- Request Body:
Response Validation¤
- Endpoint: POST /api/test-bench/validator/test
- Purpose: Test response validation system
- Request Body:
Joke Learning System¤
- Endpoint: POST /api/test-bench/joke-learning/test
- Purpose: Test adaptive joke learning system
- Request Body:
Goal-Seeking System¤
- Endpoint: POST /api/test-bench/goal-seeking/test
- Purpose: Test proactive goal-seeking behavior
- Request Body:
Conversation Manager¤
- Endpoint: POST /api/test-bench/conversation-manager/test
- Purpose: Test conversation flow and agent handoffs
- Request Body:
Comprehensive System Test¤
- Endpoint: POST /api/test-bench/comprehensive/test
- Purpose: Test full system integration (goal-seeking + conversation management)
- Request Body:
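The feature endpoints in this section all follow one pattern: a POST to the common prefix plus a feature-specific path. For scripting, those paths can be kept in a single lookup table. The sketch below copies the paths from the list above; the helper name `feature_test_path` is hypothetical and not part of the test bench itself.

```python
PREFIX = "/api/test-bench"

# Feature name -> path suffix, copied from the endpoints listed above.
FEATURE_TESTS = {
    "classifier": "/classifier/test",
    "rag": "/rag/test",
    "validator": "/validator/test",
    "joke-learning": "/joke-learning/test",
    "goal-seeking": "/goal-seeking/test",
    "conversation-manager": "/conversation-manager/test",
    "comprehensive": "/comprehensive/test",
}


def feature_test_path(feature):
    """Return the full path for a feature test, or raise on unknown names."""
    try:
        return PREFIX + FEATURE_TESTS[feature]
    except KeyError:
        raise ValueError(f"unknown feature: {feature!r}") from None


for name in FEATURE_TESTS:
    print("POST", feature_test_path(name))
```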
System Information¤
Get Available Agents¤
- Endpoint: GET /api/test-bench/agents/list
- Purpose: Retrieve list of all available agents
- Response: Returns agent metadata including names and descriptions
System Health Check¤
- Endpoint: GET /api/test-bench/health
- Purpose: Check health status of all system components
- Response: Returns operational status of each service
Agent Types¤
The system supports the following 14 agent types:
- general - General assistant for casual conversation and everyday tasks
- joke - Adaptive joke master with learning capabilities
- trivia - Trivia master for fascinating facts and knowledge
- gif - GIF master for entertaining visual content
- account_support - Account-related issues and authentication
- billing_support - Billing, payments, and financial matters
- website_support - Website functionality and technical issues
- operator_support - General customer service coordination
- hold_agent - Hold experience management with entertainment
- story_teller - Creative storytelling and narratives
- riddle_master - Riddles, puzzles, and brain teasers
- quote_master - Inspirational quotes and wisdom
- game_host - Interactive games and challenges
- music_guru - Music recommendations and discussions
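Because the agent type is part of the URL path, a client can reject typos before making a request. A small sketch using the 14 identifiers listed above; the helper name `agent_test_path` is hypothetical.

```python
AGENT_TYPES = frozenset({
    "general", "joke", "trivia", "gif",
    "account_support", "billing_support", "website_support",
    "operator_support", "hold_agent", "story_teller",
    "riddle_master", "quote_master", "game_host", "music_guru",
})


def agent_test_path(agent_type):
    """Return the test path for a known agent type; reject anything else."""
    if agent_type not in AGENT_TYPES:
        raise ValueError(f"unknown agent type: {agent_type!r}")
    return f"/api/test-bench/agent/{agent_type}/test"


print(len(AGENT_TYPES))           # 14
print(agent_test_path("trivia"))  # /api/test-bench/agent/trivia/test
```

In a real client, the list could instead be fetched from GET /api/test-bench/agents/list so it stays in sync with the server.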
Frontend Test Bench Interface¤
The Developer Test Bench provides a web interface for testing all system components.
Features¤
- Tabbed Interface: Organized testing panels for different system components
- Real-time Results: Live feedback with success/failure indicators
- Execution Timing: Performance metrics for each test
- Result Export: Download test results as JSON for analysis
- System Health Dashboard: Monitor component operational status
- Agent Directory: View all available agents and their capabilities
Usage¤
1. Access: Navigate to the test bench interface in your development environment
2. Select Test Type: Choose from agent testing, classifier, system health, etc.
3. Configure Test: Set parameters like agent type, message, user ID
4. Execute: Run individual tests or bulk tests across all agents
5. Review Results: View detailed response data and execution metrics
6. Export Data: Download results for further analysis
Test Categories¤
Agent Testing Tab¤
- Test individual agents with custom messages
- Bulk test all agents simultaneously
- Configure user ID and conversation history
- View agent-specific responses and confidence levels
Classifier Tab¤
- Test message classification accuracy
- See which agent type is selected for different messages
- View classification confidence and reasoning
System Health Tab¤
- Monitor operational status of all services
- View OpenAI API key configuration status
- Browse available agents and their descriptions
- Check system component health indicators
Testing Best Practices¤
1. Agent Response Testing¤
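Agent replies can be checked in code rather than by eye. The sketch below assumes a reply shaped like `{"success": ..., "response": ..., "executionTime": ...}`. Those field names and the 5000 ms budget are hypothetical, so adjust them to whatever the agent test endpoint actually returns.

```python
def check_agent_reply(reply, max_ms=5000):
    """Return a list of problems found in one agent test reply (empty if OK)."""
    problems = []
    # Assumed fields: "success", "response", "executionTime" (milliseconds).
    if not reply.get("success"):
        problems.append("agent reported failure")
    text = reply.get("response")
    if not isinstance(text, str) or not text.strip():
        problems.append("empty or missing response text")
    elapsed = reply.get("executionTime")
    if isinstance(elapsed, (int, float)) and elapsed > max_ms:
        problems.append(f"slow response: {elapsed} ms > {max_ms} ms")
    return problems


good = {"success": True, "response": "Why did the chicken...", "executionTime": 820}
bad = {"success": False, "response": "", "executionTime": 9000}
print(check_agent_reply(good))  # []
print(check_agent_reply(bad))
```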
2. Classification Testing¤
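Classification accuracy is easiest to track with a fixed set of labelled messages. The sketch below takes the classifier as a plain function so it can run without a server. `fake_classify` is a stand-in for illustration only; a real version would POST each message to /api/test-bench/classifier/test and read the selected agent type from the reply, whose field name is not specified here.

```python
def classification_accuracy(classify, cases):
    """Run labelled messages through `classify` and return (accuracy, misses)."""
    misses = []
    for message, expected in cases:
        actual = classify(message)
        if actual != expected:
            misses.append((message, expected, actual))
    accuracy = 1 - len(misses) / len(cases) if cases else 0.0
    return accuracy, misses


# Stand-in classifier for illustration; not the system's real logic.
def fake_classify(message):
    return "billing_support" if "invoice" in message.lower() else "general"


cases = [
    ("Where is my invoice?", "billing_support"),
    ("Hello there", "general"),
    ("I can't log in", "account_support"),
]
accuracy, misses = classification_accuracy(fake_classify, cases)
print(f"accuracy={accuracy:.2f}, misses={len(misses)}")
```

Returning the misses alongside the score shows which messages were routed to the wrong agent, not just how many.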
3. System Health Check¤
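A health check is most useful when it names the components that are down. This sketch assumes the health reply can be reduced to a flat mapping of service name to status string, with `"operational"` meaning healthy; both the shape and the status value are assumptions.

```python
def unhealthy_services(health, ok_status="operational"):
    """Return the sorted names of services whose status is not `ok_status`."""
    return sorted(name for name, status in health.items() if status != ok_status)


# Hypothetical sample of what GET /api/test-bench/health might report.
sample = {"classifier": "operational", "rag": "degraded", "validator": "operational"}
print(unhealthy_services(sample))  # ['rag']
```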
4. Bulk Agent Testing¤
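With 14 agents in one run, a short pass/fail summary is easier to scan than the raw output. The sketch below assumes the bulk reply can be viewed as a mapping from agent type to a per-agent result carrying a boolean `success` field; that shape is hypothetical.

```python
def summarize_bulk(results):
    """Split a bulk-test result mapping into passed and failed agent types."""
    passed = sorted(a for a, r in results.items() if r.get("success"))
    failed = sorted(a for a, r in results.items() if not r.get("success"))
    return {"passed": passed, "failed": failed, "total": len(results)}


# Hypothetical results for three of the agents.
results = {
    "joke": {"success": True},
    "trivia": {"success": True},
    "gif": {"success": False},
}
print(summarize_bulk(results))
```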
Integration with CI/CD¤
The test bench can be integrated into continuous integration pipelines:
Example Test Script¤
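A minimal sketch of a smoke-test script a pipeline could run after starting the application. It only confirms that the two GET endpoints respond with JSON. The base URL is an assumption, and `fetch` is injectable so the logic can be exercised without a running server.

```python
import json
import sys
import urllib.error
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: where CI starts the app


def fetch_json(path, timeout=10):
    """GET a test bench path and decode the JSON body."""
    with urllib.request.urlopen(BASE_URL + path, timeout=timeout) as resp:
        return json.load(resp)


def run_checks(fetch=fetch_json):
    """Return 0 if the health and agent-list endpoints respond, else 1."""
    for path in ("/api/test-bench/health", "/api/test-bench/agents/list"):
        try:
            fetch(path)
        except (urllib.error.URLError, ValueError, OSError) as exc:
            print(f"FAIL {path}: {exc}", file=sys.stderr)
            return 1
        print(f"ok   {path}")
    return 0


# In a CI job, end the script with: sys.exit(run_checks())
```

A non-zero exit code is what makes the pipeline step fail, so the script reports problems through its return value rather than only printing them.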
Error Handling¤
All test endpoints return consistent error responses:
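A consistent error shape means clients can handle failures in one place. The sketch below assumes errors look like `{"success": false, "error": "<message>"}`. That shape, and the names `BenchError` and `raise_for_error`, are hypothetical; substitute the fields the endpoints actually return.

```python
class BenchError(Exception):
    """Raised when a test bench reply reports a failure."""


def raise_for_error(reply):
    """Return `reply` unchanged, or raise if it is flagged as an error."""
    # Assumed error shape: {"success": false, "error": "<message>"}.
    if reply.get("success") is False:
        raise BenchError(reply.get("error", "unknown error"))
    return reply


try:
    raise_for_error({"success": False, "error": "Unknown agent type: foo"})
except BenchError as exc:
    print("test failed:", exc)
```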
Performance Monitoring¤
The test bench tracks:
- Response times for each endpoint
- Success/failure rates
- Agent performance comparisons
- System resource utilization
- Error frequency and types
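Response times and success rates can also be measured on the client side from recorded runs. The following is a sketch of that calculation, not the test bench's own implementation; the helper names are hypothetical.

```python
import statistics
import time


def timed(fn, *args, **kwargs):
    """Call `fn` and return (result_or_None, succeeded, elapsed_ms)."""
    start = time.perf_counter()
    try:
        result, ok = fn(*args, **kwargs), True
    except Exception:
        result, ok = None, False
    return result, ok, (time.perf_counter() - start) * 1000


def summarize(samples):
    """Summarize (succeeded, elapsed_ms) pairs into rate and latency stats."""
    if not samples:
        raise ValueError("no samples to summarize")
    times = [ms for _, ms in samples]
    return {
        "runs": len(samples),
        "success_rate": sum(ok for ok, _ in samples) / len(samples),
        "median_ms": statistics.median(times),
        "max_ms": max(times),
    }


samples = [(True, 120.0), (True, 340.0), (False, 95.0), (True, 210.0)]
print(summarize(samples))
```

The median is reported instead of the mean so that a single slow call does not dominate the figure.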
Security Considerations¤
- Test endpoints are intended for development environments
- Production deployments should disable these endpoints or restrict access to them
- Test data should not contain sensitive information
- User IDs in tests should be clearly marked as test accounts
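One way to keep test accounts clearly marked is to generate their IDs through a single helper. The `test-` prefix below is an assumption for illustration; the project may already have its own convention.

```python
import uuid

TEST_USER_PREFIX = "test-"  # assumption: the project may use another marker


def new_test_user_id():
    """Generate a unique user ID that is visibly a test account."""
    return f"{TEST_USER_PREFIX}{uuid.uuid4().hex[:12]}"


def is_test_user(user_id):
    """True if the ID carries the test marker."""
    return user_id.startswith(TEST_USER_PREFIX)


user_id = new_test_user_id()
print(user_id, is_test_user(user_id))
```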
Development Workflow¤
- Feature Development: Use individual agent tests during feature development
- Integration Testing: Use comprehensive tests for full system validation
- Performance Testing: Use bulk tests to identify performance bottlenecks
- Regression Testing: Use automated test suites for continuous validation
- Debugging: Use detailed test results to diagnose issues
This test bench system provides comprehensive coverage for all AI agents and system features, enabling developers to validate functionality, performance, and integration across the entire platform.