RAG (Retrieval-Augmented Generation) System Documentation¤
Overview¤
The RAG system provides curated, high-quality content for entertainment agents, ensuring consistent and reliable responses even in demo mode without OpenAI API keys. This system combines intelligent content retrieval with quality-rated entertainment content.
Architecture¤
graph TB
subgraph "RAG System Architecture"
UserQuery[User Query] --> AgentService[Agent Service]
subgraph "Content Retrieval Layer"
AgentService --> RAGService[RAG Service]
RAGService --> ContentDB[(Content Database<br/>30 curated items)]
RAGService --> SearchEngine[Search Engine]
SearchEngine --> TagMatcher[Tag Matcher<br/>0.3pts per match]
SearchEngine --> PhraseMatcher[Phrase Matcher<br/>0.8pts exact match]
SearchEngine --> CategoryMatcher[Category Matcher<br/>0.2pts per match]
SearchEngine --> QualityBooster[Quality Booster<br/>+0.1pts for rating]
end
subgraph "Content Types"
ContentDB --> JokeContent[Jokes Database<br/>🎭 10 premium jokes]
ContentDB --> TriviaContent[Trivia Database<br/>🧠 10 fascinating facts]
ContentDB --> GIFContent[GIF Database<br/>🎬 10 curated GIFs]
end
subgraph "Search Processing"
TagMatcher --> RelevanceScorer[Relevance Scorer]
PhraseMatcher --> RelevanceScorer
CategoryMatcher --> RelevanceScorer
QualityBooster --> RelevanceScorer
RelevanceScorer --> ResultFilter[Result Filter<br/>Min 0.1 threshold]
ResultFilter --> ResultRanker[Result Ranker<br/>Score-based sorting]
end
subgraph "Content Delivery"
ResultRanker --> ContentResponse[Content Response]
ContentResponse --> FallbackHandler[Fallback Handler<br/>Random if no matches]
FallbackHandler --> QualityValidation[Quality Validation<br/>4-5 star ratings only]
end
subgraph "Agent Integration"
QualityValidation --> JokeAgent[Joke Agent<br/>😄 Humor delivery]
QualityValidation --> TriviaAgent[Trivia Agent<br/>🧠 Fact sharing]
QualityValidation --> GIFAgent[GIF Agent<br/>🎬 Visual entertainment]
end
end
classDef service fill:#e1f5fe,stroke:#01579b,color:#000
classDef data fill:#e8f5e8,stroke:#2e7d32,color:#000
classDef external fill:#fff3e0,stroke:#ef6c00,color:#000
classDef content fill:#f3e5f5,stroke:#7b1fa2,color:#000
classDef agent fill:#e8eaf6,stroke:#3f51b5,color:#000
class RAGService,SearchEngine,TagMatcher,PhraseMatcher,CategoryMatcher,QualityBooster,RelevanceScorer,ResultFilter,ResultRanker,FallbackHandler,QualityValidation service
class ContentDB,JokeContent,TriviaContent,GIFContent data
class UserQuery,ContentResponse external
class JokeAgent,TriviaAgent,GIFAgent agent
Core Components¤
- RAGService - Main service class managing content database and search
- ContentItem - Individual content pieces (jokes, trivia, GIFs)
- SearchQuery - Query interface for content retrieval
- SearchResult - Scored search results with relevance metrics
Content Database¤
Current Content Statistics¤
- 10 Premium Jokes (Dad jokes, tech humor, story jokes)
- 10 Fascinating Trivia Facts (Science, animals, space, history)
- 10 Curated GIFs (Reactions, emotions, celebrations)
- Quality Ratings: All content rated 4-5 stars
- 15+ Categories for organized content discovery
- 100+ Search Tags for intelligent content matching
Content Structure¤
Content Categories¤
Jokes¤
dad_joke
- Classic dad humor and punstech_joke
- Programming and technology humorstory_joke
- Narrative-style jokes
Trivia¤
animals
- Animal facts and biologyspace
- Astronomy and space explorationscience
- Scientific discoveries and phenomenahistory
- Historical facts and eventsfood
- Food science and culinary factshuman_body
- Human biology and healthmathematics
- Mathematical concepts and paradoxes
GIFs¤
funny
- General humor and comedycute
- Adorable and heartwarming contentexcited
- Celebration and joy reactionssurprised
- Shock and amazement reactionsapplause
- Approval and congratulationsparty
- Celebration and festive contentthumbs_up
- Positive approvalfacepalm
- Disappointment reactionsshrug
- Confusion and uncertaintymind_blown
- Astonishment reactions
API Reference¤
RAGService Methods¤
search(query: SearchQuery): SearchResult[]
¤
Searches for content based on query parameters.
Parameters:
Returns: Array of SearchResult objects with relevance scores.
searchForAgent(agentType: AgentType, query: string, fallbackToRandom?: boolean): ContentItem | null
¤
Simplified search for specific agent types.
Parameters:
agentType
- Agent requesting content ('joke', 'trivia', 'gif')query
- User's message for contextfallbackToRandom
- Whether to return random content if no matches (default: true)
getRandomContent(type: 'joke' | 'trivia' | 'gif', category?: string): ContentItem | null
¤
Retrieves random content of specified type.
addContent(item: ContentItem): void
¤
Dynamically adds new content to the database.
getStats(): { [type: string]: number }
¤
Returns content statistics by type.
getTopRated(type?: string, limit?: number): ContentItem[]
¤
Gets highest-rated content, optionally filtered by type.
Content Retrieval Flow¤
sequenceDiagram
participant Agent as Entertainment Agent
participant RAG as RAG Service
participant Search as Search Engine
participant DB as Content Database
participant Quality as Quality Filter
Agent->>+RAG: searchForAgent(type, query, fallback)
Note over RAG: Query Processing
RAG->>RAG: Parse agent type and context
RAG->>RAG: Extract search keywords
RAG->>RAG: Determine content filters
Note over RAG,Search: Search Execution
RAG->>+Search: search(query, type, filters)
Search->>+DB: Get all content by type
DB-->>-Search: Content items array
Note over Search: Relevance Scoring
Search->>Search: Calculate phrase matches (0.8pts)
Search->>Search: Calculate tag matches (0.3pts each)
Search->>Search: Calculate category matches (0.2pts)
Search->>Search: Apply quality boost (0.1pts)
Search->>Search: Filter by threshold (≥0.1)
Search->>Search: Sort by relevance score
Search-->>-RAG: Ranked results array
alt Results Found
Note over RAG,Quality: Quality Assurance
RAG->>+Quality: Validate top result
Quality->>Quality: Check rating (4-5 stars)
Quality->>Quality: Verify content appropriateness
Quality-->>-RAG: Validated content
RAG-->>Agent: High-quality content item
else No Results & Fallback Enabled
Note over RAG,DB: Fallback Strategy
RAG->>+DB: getRandomContent(type)
DB-->>-RAG: Random quality content
RAG-->>Agent: Fallback content item
else No Results & No Fallback
RAG-->>-Agent: null (no content found)
end
Note over Agent,Quality: Content Delivered with Context
Search Algorithm Architecture¤
graph TB
subgraph "Search Algorithm Processing"
Query[Search Query] --> Preprocessor[Query Preprocessor]
subgraph "Text Processing"
Preprocessor --> Tokenizer[Text Tokenizer<br/>Split into keywords]
Preprocessor --> Normalizer[Text Normalizer<br/>Lowercase, trim spaces]
Preprocessor --> StopWords[Stop Word Filter<br/>Remove common words]
end
subgraph "Matching Strategies"
Tokenizer --> ExactPhrase[Exact Phrase Matching<br/>0.8 points maximum]
Tokenizer --> TagMatching[Tag Matching<br/>0.3 points per tag]
Normalizer --> CategoryMatch[Category Matching<br/>0.2 points per match]
StopWords --> KeywordMatch[Keyword Matching<br/>0.1 points per word]
end
subgraph "Content Analysis"
ExactPhrase --> ContentScanner[Content Scanner]
TagMatching --> TagDatabase[(Tag Index<br/>100+ searchable tags)]
CategoryMatch --> CategoryIndex[(Category Index<br/>15+ content categories)]
KeywordMatch --> ContentIndex[(Full-text Index<br/>All content searchable)]
end
subgraph "Scoring Pipeline"
ContentScanner --> BaseScore[Base Relevance Score<br/>Sum of all matches]
TagDatabase --> BaseScore
CategoryIndex --> BaseScore
ContentIndex --> BaseScore
BaseScore --> QualityMultiplier[Quality Multiplier<br/>Rating-based boost]
QualityMultiplier --> FinalScore[Final Relevance Score<br/>0.0 - 1.0+ range]
end
subgraph "Result Processing"
FinalScore --> ThresholdFilter[Threshold Filter<br/>Minimum 0.1 score]
ThresholdFilter --> ScoreSorter[Score-based Sorting<br/>Highest relevance first]
ScoreSorter --> LimitApplier[Result Limiter<br/>Top N results]
end
subgraph "Output"
LimitApplier --> RankedResults[Ranked Results<br/>Scored content items]
RankedResults --> TopResult[Top Result<br/>Best match for agent]
end
end
classDef service fill:#e1f5fe,stroke:#01579b,color:#000
classDef data fill:#e8f5e8,stroke:#2e7d32,color:#000
classDef external fill:#fff3e0,stroke:#ef6c00,color:#000
classDef processing fill:#f3e5f5,stroke:#7b1fa2,color:#000
classDef scoring fill:#e8eaf6,stroke:#3f51b5,color:#000
class Preprocessor,Tokenizer,Normalizer,StopWords,ContentScanner,QualityMultiplier,ThresholdFilter,ScoreSorter,LimitApplier service
class TagDatabase,CategoryIndex,ContentIndex data
class Query,RankedResults,TopResult external
class ExactPhrase,TagMatching,CategoryMatch,KeywordMatch processing
class BaseScore,FinalScore scoring
Search Algorithm Scoring¤
The RAG system uses intelligent relevance scoring:
- Exact Phrase Match (0.8 points) - Direct content matches
- Tag Matching (0.3 points per tag) - Contextual tag alignment
- Category Matching (0.2 points) - Category relevance
- Content Keywords (0.1 points per word) - General content relevance
- Quality Boost (up to 0.1 points) - Based on content rating
Minimum Threshold: 0.1 relevance score required for results.
Agent Integration Patterns¤
graph TB
subgraph "Multi-Agent RAG Integration"
AgentRequest[Agent Content Request] --> AgentRouter[Agent Router]
subgraph "Agent-Specific Processing"
AgentRouter --> JokePath[Joke Agent Path<br/>😄 Humor context]
AgentRouter --> TriviaPath[Trivia Agent Path<br/>🧠 Educational context]
AgentRouter --> GIFPath[GIF Agent Path<br/>🎬 Visual context]
end
subgraph "Content Customization"
JokePath --> JokeRAG[Joke RAG Service<br/>Dad jokes, tech humor, stories]
TriviaPath --> TriviaRAG[Trivia RAG Service<br/>Science, animals, space, history]
GIFPath --> GIFRAG[GIF RAG Service<br/>Reactions, emotions, celebrations]
end
subgraph "Fallback Strategies"
JokeRAG --> JokeFallback[Joke Fallback<br/>Random high-quality joke]
TriviaRAG --> TriviaFallback[Trivia Fallback<br/>Random fascinating fact]
GIFRAG --> GIFFallback[GIF Fallback<br/>Random appropriate GIF]
end
subgraph "Response Enhancement"
JokeFallback --> JokeEnhancer[Joke Response Enhancer<br/>Add reaction prompts & emojis]
TriviaFallback --> TriviaEnhancer[Trivia Response Enhancer<br/>Add follow-up questions]
GIFFallback --> GIFEnhancer[GIF Response Enhancer<br/>Add context & alt text]
end
subgraph "Quality Assurance"
JokeEnhancer --> QualityGate[Quality Gate<br/>4-5 star content only]
TriviaEnhancer --> QualityGate
GIFEnhancer --> QualityGate
QualityGate --> FinalResponse[Enhanced Agent Response<br/>✅ Curated & contextual]
end
subgraph "Demo Mode Integration"
FinalResponse --> DemoCheck{Demo Mode?}
DemoCheck -->|Yes| DemoResponse[Demo Mode Response<br/>RAG content + demo notice]
DemoCheck -->|No| ProductionResponse[Production Response<br/>RAG fallback if API fails]
end
end
classDef service fill:#e1f5fe,stroke:#01579b,color:#000
classDef data fill:#e8f5e8,stroke:#2e7d32,color:#000
classDef external fill:#fff3e0,stroke:#ef6c00,color:#000
classDef agent fill:#f3e5f5,stroke:#7b1fa2,color:#000
classDef enhancement fill:#e8eaf6,stroke:#3f51b5,color:#000
class AgentRouter,JokeRAG,TriviaRAG,GIFRAG,QualityGate,DemoCheck service
class JokeFallback,TriviaFallback,GIFFallback data
class AgentRequest,FinalResponse,DemoResponse,ProductionResponse external
class JokePath,TriviaPath,GIFPath agent
class JokeEnhancer,TriviaEnhancer,GIFEnhancer enhancement
Content Management Workflow¤
graph TB
subgraph "Content Management System"
ContentCreation[Content Creation] --> ContentValidation[Content Validation]
subgraph "Quality Assurance"
ContentValidation --> RatingCheck[Rating Check<br/>Require 4-5 stars]
ContentValidation --> FamilyFriendly[Family-Friendly Check<br/>Appropriate for all ages]
ContentValidation --> AccuracyCheck[Accuracy Verification<br/>Fact-checking for trivia]
ContentValidation --> AccessibilityCheck[Accessibility Check<br/>Alt text for GIFs]
end
subgraph "Content Processing"
RatingCheck --> TagGeneration[Tag Generation<br/>Extract searchable keywords]
FamilyFriendly --> CategoryAssignment[Category Assignment<br/>Assign to content buckets]
AccuracyCheck --> MetadataCreation[Metadata Creation<br/>Add descriptions & context]
AccessibilityCheck --> ContentFormatting[Content Formatting<br/>Standardize structure]
end
subgraph "Database Integration"
TagGeneration --> TagIndex[Tag Index Update<br/>100+ searchable tags]
CategoryAssignment --> CategoryIndex[Category Index Update<br/>15+ content categories]
MetadataCreation --> ContentDatabase[(Content Database<br/>Persistent storage)]
ContentFormatting --> SearchIndex[Search Index Update<br/>Full-text indexing]
end
subgraph "Validation & Testing"
TagIndex --> SearchTesting[Search Testing<br/>Verify findability]
CategoryIndex --> RelevanceTesting[Relevance Testing<br/>Check scoring accuracy]
ContentDatabase --> QualityTesting[Quality Testing<br/>User satisfaction validation]
SearchIndex --> PerformanceTesting[Performance Testing<br/>Search response times]
end
subgraph "Deployment"
SearchTesting --> ProductionDeployment[Production Deployment<br/>Live content activation]
RelevanceTesting --> ProductionDeployment
QualityTesting --> ProductionDeployment
PerformanceTesting --> ProductionDeployment
ProductionDeployment --> MonitoringSetup[Monitoring Setup<br/>Usage analytics & feedback]
end
subgraph "Continuous Improvement"
MonitoringSetup --> UsageAnalytics[Usage Analytics<br/>Track content performance]
UsageAnalytics --> ContentOptimization[Content Optimization<br/>Update based on feedback]
ContentOptimization --> ContentCreation
end
end
classDef service fill:#e1f5fe,stroke:#01579b,color:#000
classDef data fill:#e8f5e8,stroke:#2e7d32,color:#000
classDef external fill:#fff3e0,stroke:#ef6c00,color:#000
classDef quality fill:#f3e5f5,stroke:#7b1fa2,color:#000
classDef testing fill:#e8eaf6,stroke:#3f51b5,color:#000
class ContentValidation,TagGeneration,CategoryAssignment,MetadataCreation,ContentFormatting,ProductionDeployment,MonitoringSetup service
class TagIndex,CategoryIndex,ContentDatabase,SearchIndex data
class ContentCreation,UsageAnalytics,ContentOptimization external
class RatingCheck,FamilyFriendly,AccuracyCheck,AccessibilityCheck quality
class SearchTesting,RelevanceTesting,QualityTesting,PerformanceTesting testing
Integration with Agents¤
Entertainment Agent Integration¤
Each entertainment agent uses RAG for consistent content delivery:
Demo Mode Enhancement¤
RAG provides reliable content when OpenAI API is unavailable:
- Consistent Quality - Pre-curated, rated content
- No API Dependencies - Works offline
- Contextual Relevance - Smart search matches user intent
- Professional Experience - Enterprise-grade content delivery
Content Management¤
Adding New Content¤
Content Guidelines¤
Quality Standards¤
- Rating 4-5: Only high-quality, tested content
- Family Friendly: All content appropriate for general audiences
- Engaging: Content should entertain and delight users
- Accurate: Trivia facts must be factually correct
- Accessible: GIFs should include alt text and descriptions
Tag Strategy¤
- Descriptive Tags: Clear, searchable keywords
- Multiple Contexts: Include various relevant tags
- Consistent Naming: Use standardized tag formats
- Semantic Grouping: Related concepts should share tags
Performance Monitoring¤
Available Metrics¤
Usage Analytics¤
The system automatically logs:
- Search queries and result counts
- Content relevance scores
- Agent usage patterns
- Content performance metrics
Best Practices¤
For Developers¤
- Always Provide Fallback: Use
fallbackToRandom: true
for agent searches - Cache Frequently Used Content: Store popular items for quick access
- Monitor Search Performance: Review logs for optimization opportunities
- Test Content Quality: Validate new content before adding
- Update Regularly: Keep content fresh and relevant
For Content Creators¤
- Write Clear Content: Ensure jokes/facts are easy to understand
- Use Descriptive Tags: Include all relevant keywords
- Rate Honestly: Use 1-5 scale based on entertainment value
- Test with Users: Validate content with target audience
- Include Metadata: Provide context and accessibility information
Troubleshooting¤
Common Issues¤
No Search Results:
- Check minimum relevance threshold (0.1)
- Verify tag spelling and categories
- Use broader search terms
Poor Content Relevance:
- Review and improve tag assignments
- Add more contextual tags
- Consider content categorization
Performance Issues:
- Monitor database size (current: 30 items)
- Consider search result limits
- Review complex query patterns
Debug Mode¤
Enable detailed logging:
Future Enhancements¤
Planned Features¤
- Vector Embeddings - Semantic search capabilities
- Machine Learning - Personalized content recommendations
- Content Analytics - Detailed usage and performance metrics
- Dynamic Learning - Content rating based on user reactions
- External Integration - API connections to content services
- Advanced Filtering - Complex query capabilities
- Content Validation - Automated quality checking
- A/B Testing - Content performance comparison
Scalability Considerations¤
- Database Growth: Current in-memory storage suitable up to ~1000 items
- Search Performance: O(n) search acceptable for current scale
- Content Expansion: Easy addition of new content types
- Multi-language Support: Framework ready for localization
Contributing¤
Adding Content Types¤
- Update
ContentItem
type definition - Add new categories and tags
- Update search algorithms if needed
- Add validation rules
- Update documentation
Content Submission Process¤
- Create content following quality guidelines
- Add appropriate tags and ratings
- Test with multiple search queries
- Submit for review and validation
- Add to production database
This RAG system provides a robust foundation for reliable, high-quality entertainment content that enhances the customer service experience while maintaining professional standards.