A Coding Guide to Build an Intelligent Conversational AI Agent with Agent Memory Using Cognee and Free Hugging Face Models

In this tutorial, we build an advanced AI agent with agent memory using Cognee and Hugging Face models, relying entirely on free, open-source tools that run in Google Colab and other notebook environments. We configure Cognee for memory storage and retrieval, integrate a lightweight conversational model for generating responses, and bring it all together into an agent that learns, reasons, and interacts naturally. Whether it’s processing documents across domains or engaging in dialogue with contextual understanding, we walk through each step to create a capable agent without relying on paid APIs. Check out the Full Codes here. Feel free to check other AI Agent and Agentic AI Codes and Tutorial for various applications.

!pip install cognee transformers torch sentence-transformers accelerate


import asyncio
import os
import json
from typing import List, Dict, Any
from datetime import datetime
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch


import cognee

We begin by installing all the essential libraries, including Cognee, Transformers, Torch, and Sentence-Transformers, to power our AI agent. We then import the required modules to handle tokenization, model loading, asynchronous tasks, and memory integration. This setup ensures we have everything ready to build and interact with our intelligent agent.
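Before moving on in a fresh runtime, it can help to confirm that each installed package actually resolves. The following is a minimal sketch rather than part of the original tutorial; the helper name `missing_packages` is hypothetical. Note that the `sentence-transformers` pip package is imported as `sentence_transformers`.

```python
import importlib.util
from typing import List


def missing_packages(import_names: List[str]) -> List[str]:
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


# Import names (not pip names) for the packages installed above.
required = ["cognee", "transformers", "torch", "sentence_transformers", "accelerate"]
print("Missing packages:", missing_packages(required))
```

An empty list means the install step succeeded; any name that is printed needs to be reinstalled before the remaining cells will run.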

async def setup_cognee():
   """Setup Cognee with proper configuration"""
   try:
       await cognee.config.set("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
       await cognee.config.set("EMBEDDING_PROVIDER", "sentence_transformers")
       print("✅ Cognee configured successfully")
       return True
   except Exception as e:
       print(f"⚠️ Cognee config error: {e}")
       try:
           os.environ["EMBEDDING_MODEL"] = "sentence-transformers/all-MiniLM-L6-v2"
           os.environ["EMBEDDING_PROVIDER"] = "sentence_transformers"
           print("✅ Cognee configured via environment")
           return True
       except Exception as e2:
           print(f"⚠️ Alternative config failed: {e2}")
           return False

We set up Cognee by configuring the embedding model and provider to use all-MiniLM-L6-v2, a lightweight and efficient sentence-transformer. If the primary method fails, we fall back to setting the same values as environment variables, giving Cognee a second way to pick up the embedding settings before it processes and stores embeddings.
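The try-then-fall-back pattern described above can be isolated from Cognee and exercised on its own. This is a sketch under stated assumptions: `apply_settings` and `broken_setter` are hypothetical names, and the setter passed in merely stands in for whatever configuration call the installed Cognee version exposes.

```python
import os
from typing import Callable, Dict, MutableMapping

EMBEDDING_SETTINGS: Dict[str, str] = {
    "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
    "EMBEDDING_PROVIDER": "sentence_transformers",
}


def apply_settings(
    primary_setter: Callable[[str, str], None],
    environ: MutableMapping[str, str] = os.environ,
) -> str:
    """Try the primary setter for every key; on any failure, write all keys to environ.

    Returns "primary" or "environment" so the caller knows which path was taken.
    """
    try:
        for key, value in EMBEDDING_SETTINGS.items():
            primary_setter(key, value)
        return "primary"
    except Exception as exc:
        print(f"Primary configuration failed ({exc}); falling back to environment variables")
        environ.update(EMBEDDING_SETTINGS)
        return "environment"


def broken_setter(key: str, value: str) -> None:
    # Hypothetical stand-in for a config call that is unavailable in the installed version
    raise AttributeError("config.set is not available")


fake_env: Dict[str, str] = {}  # a plain dict keeps the demo from touching os.environ
print(apply_settings(broken_setter, fake_env))
print(fake_env["EMBEDDING_PROVIDER"])
```

Returning which path was taken, instead of a bare `True`, makes it easier to tell later whether the settings reached the library directly or only through the environment.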

class HuggingFaceLLM:
   def __init__(self, model_name="microsoft/DialoGPT-medium"):
       print(f"🤖 Loading Hugging Face model: {model_name}")
       self.device = "cuda" if torch.cuda.is_available() else "cpu"
       print(f"📱 Using device: {self.device}")
      
       if "DialoGPT" in model_name:
           self.tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
           # Move the model to the detected device so a GPU runtime is actually used
           self.model = AutoModelForCausalLM.from_pretrained(model_name).to(self.device)
           if self.tokenizer.pad_token is None:
               self.tokenizer.pad_token = self.tokenizer.eos_token
       else:
           # Any other model name (for example "distilgpt2") goes through the generic pipeline
           self.generator = pipeline(
               "text-generation",
               model=model_name,
               device=0 if self.device == "cuda" else -1,
               max_length=150,
               do_sample=True,
               temperature=0.7
           )
           self.tokenizer = None
           self.model = None
      
       print("✅ Model loaded successfully!")
  
   def generate_response(self, prompt: str, max_length: int = 100) -> str:
       try:
           if self.model is not None:
               # Keep the input tensor on the same device as the model
               inputs = self.tokenizer.encode(
                   prompt + self.tokenizer.eos_token, return_tensors="pt"
               ).to(self.device)
              
               with torch.no_grad():
                   outputs = self.model.generate(
                       inputs,
                       max_length=inputs.shape[1] + max_length,
                       num_return_sequences=1,
                       temperature=0.7,
                       do_sample=True,
                       pad_token_id=self.tokenizer.eos_token_id
                   )
              
               response = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
               response = response[len(prompt):].strip()
               return response if response else "I understand."
          
           else:
               result = self.generator(prompt, max_length=max_length, truncation=True)
               return result[0]['generated_text'][len(prompt):].strip()
              
       except Exception as e:
           print(f"⚠️ Generation error: {e}")
           return "I'm processing that information."


hf_llm = None

We define the HuggingFaceLLM class to handle text generation using lightweight Hugging Face models, such as DialoGPT or DistilGPT2. We detect whether a GPU is available and load the appropriate tokenizer and model accordingly. This setup enables our agent to generate intelligent and context-aware responses to user queries.
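Causal language models return the prompt followed by the newly generated text, which is why `generate_response` slices the decoded string by the prompt length. The sketch below shows that step in isolation with a small guard added; `strip_prompt` is a hypothetical helper, not a method from the tutorial.

```python
def strip_prompt(decoded: str, prompt: str, fallback: str = "I understand.") -> str:
    """Return only the text generated after the prompt, or a fallback if nothing was added."""
    # Slice only when the decoded text really starts with the prompt. A tokenizer
    # round-trip can alter whitespace, and slicing by length would then cut mid-word.
    continuation = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    continuation = continuation.strip()
    return continuation if continuation else fallback


prompt = "Question: What is Python?\nAnswer:"
print(strip_prompt(prompt + " A programming language.", prompt))
print(strip_prompt(prompt, prompt))
```

The second call covers the case where the model emits nothing new, where the class likewise substitutes a short default reply.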

class AdvancedAIAgent:
   """
   Advanced AI Agent with persistent memory, learning capabilities,
   and multi-domain knowledge processing using Cognee
   """
  
   def __init__(self, agent_name: str = "CogneeAgent"):
       self.name = agent_name
       self.memory_initialized = False
       self.knowledge_domains = []
       self.conversation_history = []
       self.manual_memory = [] 
      
   async def initialize_memory(self):
       """Initialize the agent's memory system and HF model"""
       global hf_llm
       if hf_llm is None:
           hf_llm = HuggingFaceLLM("microsoft/DialoGPT-medium")
      
       setup_success = await setup_cognee()
      
       try:
           await cognee.prune() 
           print(f"✅ {self.name} memory system initialized")
           self.memory_initialized = True
       except Exception as e:
           print(f"⚠️ Memory initialization warning: {e}")
           self.memory_initialized = True
  
   async def learn_from_text(self, text: str, domain: str = "general"):
       """Add knowledge to the agent's memory with domain tagging"""
       if not self.memory_initialized:
           await self.initialize_memory()
      
       enhanced_text = f"[DOMAIN: {domain}] [TIMESTAMP: {datetime.now().isoformat()}]\n{text}"
      
       try:
           await cognee.add(enhanced_text)
           await cognee.cognify() 
           if domain not in self.knowledge_domains:
               self.knowledge_domains.append(domain)
           print(f"📚 Learned new knowledge in domain: {domain}")
           return True
       except Exception as e:
           print(f"❌ Learning error: {e}")
           try:
               await cognee.add(text)
               await cognee.cognify()
               if domain not in self.knowledge_domains:
                   self.knowledge_domains.append(domain)
               print(f"📚 Learned (simplified): {domain}")
               return True
           except Exception as e2:
               print(f"❌ Simplified learning failed: {e2}")
               if not hasattr(self, 'manual_memory'):
                   self.manual_memory = []
               self.manual_memory.append({"text": text, "domain": domain})
               if domain not in self.knowledge_domains:
                   self.knowledge_domains.append(domain)
               print(f"📚 Stored in manual memory: {domain}")
               return True
  
   async def learn_from_documents(self, documents: List[Dict[str, str]]):
       """Batch learning from multiple documents"""
       print(f"📖 Processing {len(documents)} documents...")
      
       for i, doc in enumerate(documents):
           text = doc.get("content", "")
           domain = doc.get("domain", "general")
           title = doc.get("title", f"Document_{i+1}")
          
           enhanced_content = f"Title: {title}\n{text}"
           await self.learn_from_text(enhanced_content, domain)
          
           if i % 3 == 0:
               print(f"  Processed {i+1}/{len(documents)} documents")
  
   async def query_knowledge(self, question: str, domain_filter: str = None) -> List[str]:
       """Query the agent's knowledge base with optional domain filtering"""
       try:
           if domain_filter:
               enhanced_query = f"[DOMAIN: {domain_filter}] {question}"
           else:
               enhanced_query = question
              
           search_results = await cognee.search("SIMILARITY", enhanced_query)
          
           results = []
           for result in search_results:
               if hasattr(result, 'text'):
                   results.append(result.text)
               elif hasattr(result, 'content'):
                   results.append(result.content)
               elif hasattr(result, 'value'):
                   results.append(str(result.value))
               elif isinstance(result, dict):
                   content = result.get('text') or result.get('content') or result.get('data') or result.get('value')
                   if content:
                       results.append(str(content))
                   else:
                       results.append(str(result))
               elif isinstance(result, str):
                   results.append(result)
               else:
                   result_str = str(result)
                   if len(result_str) > 10: 
                       results.append(result_str)
          
           if not results and hasattr(self, 'manual_memory'):
               for item in self.manual_memory:
                   if domain_filter and item['domain'] != domain_filter:
                       continue
                   if any(word.lower() in item['text'].lower() for word in question.split()):
                       results.append(item['text'])
          
           return results[:5] 
          
       except Exception as e:
           print(f"🔍 Search error: {e}")
           results = []
           if hasattr(self, 'manual_memory'):
               for item in self.manual_memory:
                   if domain_filter and item['domain'] != domain_filter:
                       continue
                   if any(word.lower() in item['text'].lower() for word in question.split()):
                       results.append(item['text'])
           return results[:5]
  
   async def reasoning_chain(self, question: str) -> Dict[str, Any]:
       """Advanced reasoning using retrieved knowledge"""
       print(f"🤔 Processing question: {question}")
      
       relevant_info = await self.query_knowledge(question)
      
       analysis = {
           "question": question,
           "relevant_knowledge": relevant_info,
           "domains_searched": self.knowledge_domains,
           "confidence": min(len(relevant_info) / 3.0, 1.0), 
           "timestamp": datetime.now().isoformat()
       }
      
       if relevant_info and len(relevant_info) > 0:
           reasoning = self._synthesize_answer(question, relevant_info)
           analysis["reasoning"] = reasoning
           analysis["answer"] = self._extract_key_points(relevant_info)
       else:
           analysis["reasoning"] = "No relevant knowledge found in memory"
           analysis["answer"] = "I don't have information about this topic in my current knowledge base."
      
       return analysis




   def _synthesize_answer(self, question: str, knowledge_pieces: List[str]) -> str:
       """AI-powered answer synthesis using Hugging Face model"""
       global hf_llm
      
       if not knowledge_pieces:
           return "No relevant information found in my knowledge base."
      
       context = " ".join(knowledge_pieces[:2]) 
       context = context[:300] 
      
       prompt = f"Based on this information: {context}\n\nQuestion: {question}\nAnswer:"
      
       try:
           if hf_llm:
               synthesized = hf_llm.generate_response(prompt, max_length=80)
               return synthesized if synthesized else f"Based on my knowledge: {context[:100]}..."
           else:
               return f"From my analysis: {context[:150]}..."
       except Exception as e:
           print(f"⚠️ Synthesis error: {e}")
           return f"Based on my knowledge: {context[:100]}..."
  
   def _extract_key_points(self, knowledge_pieces: List[str]) -> List[str]:
       """Extract key points from retrieved knowledge"""
       key_points = []
       for piece in knowledge_pieces:
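           # Remove each whole "[DOMAIN: ...]" / "[TIMESTAMP: ...]" tag, not only its prefix;
           # otherwise the timestamp (which contains a ".") becomes the first "sentence"
           clean_piece = piece
           for tag in ("[DOMAIN:", "[TIMESTAMP:"):
               start = clean_piece.find(tag)
               end = clean_piece.find("]", start)
               if start != -1 and end != -1:
                   clean_piece = clean_piece[:start] + clean_piece[end + 1:]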
           sentences = clean_piece.split('.')
           if len(sentences) > 0 and len(sentences[0].strip()) > 10:
               key_points.append(sentences[0].strip() + ".")
      
       return key_points[:3] 


   async def conversational_agent(self, user_input: str) -> str:
       """Main conversational interface with HF model integration"""
       global hf_llm
       self.conversation_history.append({"role": "user", "content": user_input})
      
       if any(word in user_input.lower() for word in ["learn", "remember", "add", "teach"]):
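           # Strip the trigger phrase case-insensitively ("Learn this:" as well as "learn this:")
           content_to_learn = user_input
           for trigger in ("learn this:", "remember:"):
               if content_to_learn.lower().startswith(trigger):
                   content_to_learn = content_to_learn[len(trigger):]
           content_to_learn = content_to_learn.strip()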
           await self.learn_from_text(content_to_learn, "conversation")
           response = "I've stored that information in my memory! What else would you like to teach me?"
          
       elif user_input.lower().startswith(("what", "how", "why", "when", "where", "who", "tell me")):
           analysis = await self.reasoning_chain(user_input)
          
           if analysis["relevant_knowledge"] and hf_llm:
               context = " ".join(analysis["relevant_knowledge"][:2])[:200]
               prompt = f"Question: {user_input}\nKnowledge: {context}\nFriendly response:"
               ai_response = hf_llm.generate_response(prompt, max_length=60)
               response = ai_response if ai_response else "Here's what I found in my knowledge base."
           else:
               response = "I don't have specific information about that topic in my current knowledge base."
              
       else:
           relevant_context = await self.query_knowledge(user_input)
          
           if hf_llm:
               context_info = ""
               if relevant_context:
                   context_info = f" I know that: {relevant_context[0][:100]}..."
              
               conversation_prompt = f"User says: {user_input}{context_info}\nI respond:"
               response = hf_llm.generate_response(conversation_prompt, max_length=50)
              
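               # NOTE: the source listing is cut off at this condition. The length threshold
               # and the fallback replies below are assumptions that complete the method.
               if not response or len(response.strip()) < 3:
                   response = "That's interesting! I'm still learning from our conversation."
           else:
               response = "I'm listening and learning from our conversation."
      
       self.conversation_history.append({"role": "assistant", "content": response})
       return response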

We now define the core of our system, the AdvancedAIAgent class, which brings together Cognee’s memory, domain-aware learning, knowledge retrieval, and Hugging Face-powered reasoning. We empower our agent to learn from both text and documents, retrieve contextually relevant knowledge, and respond to queries with synthesized answers. Whether it’s remembering facts, answering questions, or engaging in conversation, the agent stores what it is taught and draws on that memory when it responds.
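Two small pieces of the class are easy to study without Cognee or a model: the keyword match used when search returns nothing, and the confidence heuristic in `reasoning_chain`. The sketch below reproduces both as standalone functions; the names `keyword_fallback` and `confidence` are hypothetical, while the logic mirrors the listing above.

```python
from typing import Dict, List, Optional


def keyword_fallback(
    memory: List[Dict[str, str]],
    question: str,
    domain_filter: Optional[str] = None,
    limit: int = 5,
) -> List[str]:
    """Return stored texts that contain at least one word from the question."""
    words = [w.lower() for w in question.split()]
    hits = []
    for item in memory:
        if domain_filter and item["domain"] != domain_filter:
            continue
        text = item["text"].lower()
        # Substring test, as in the agent: short words can match inside longer ones
        if any(word in text for word in words):
            hits.append(item["text"])
    return hits[:limit]


def confidence(num_results: int) -> float:
    """Heuristic from reasoning_chain: three or more results count as full confidence."""
    return min(num_results / 3.0, 1.0)


memory = [
    {"text": "Python is a high-level programming language.", "domain": "programming"},
    {"text": "Renewable energy reduces carbon emissions.", "domain": "environment"},
]
hits = keyword_fallback(memory, "Tell me about Python", domain_filter="programming")
print(hits, confidence(len(hits)))
```

Because the confidence value depends only on how many results came back, it measures coverage rather than correctness, which is worth keeping in mind when reading the demo output.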

async def main():
   print("🚀 Advanced AI Agent with Cognee Tutorial")
   print("=" * 50)
  
   agent = AdvancedAIAgent("TutorialAgent")
   await agent.initialize_memory()
  
   print("\n📚 DEMO 1: Multi-domain Learning")
   sample_documents = [
       {
           "title": "Python Basics",
           "content": "Python is a high-level programming language known for its simplicity and readability.",
           "domain": "programming"
       },
       {
           "title": "Climate Science",
           "content": "Climate change",
           "domain": "science"
       },
       {
           "title": "AI Ethics",
           "content": "AI ethics involves ensuring artificial intelligence systems are developed and deployed responsibly, considering fairness, transparency, accountability, and potential societal impacts.",
           "domain": "technology"
       },
       {
           "title": "Sustainable Energy",
           "content": "Renewable energy sources are crucial for reducing carbon emissions",
           "domain": "environment"
       }
   ]
  
   await agent.learn_from_documents(sample_documents)
  
   print("\n🔍 DEMO 2: Knowledge Retrieval & Reasoning")
   test_questions = [
       "What do you know about Python programming?",
       "How does climate change relate to energy?",
       "What are the ethical considerations in AI?"
   ]
  
   for question in test_questions:
       print(f"\n❓ Question: {question}")
       analysis = await agent.reasoning_chain(question)
       print(f"💡 Answer: {analysis.get('answer', 'No answer generated')}")
       print(f"🎯 Confidence: {analysis.get('confidence', 0):.2f}")
  
   print("\n💬 DEMO 3: Conversational Agent")
   conversation_inputs = [
       "Learn this: Machine learning is a subset of AI",
       "What is machine learning?",
       "How does it relate to Python?",
       "Remember that neural networks are inspired by biological neurons"
   ]
  
   for user_input in conversation_inputs:
       print(f"\n🗣️ User: {user_input}")
       response = await agent.conversational_agent(user_input)
       print(f"🤖 Agent: {response}")
  
   print(f"\n📊 DEMO 4: Agent Knowledge Summary")
   print(f"Knowledge domains: {agent.knowledge_domains}")
   print(f"Conversation history: {len(agent.conversation_history)} messages")
  
   print(f"\n🎯 Domain-specific search:")
   programming_results = await agent.query_knowledge("programming concepts", "programming")
   print(f"Programming knowledge: {len(programming_results)} results found")


if __name__ == "__main__":
   print("Starting Advanced AI Agent Tutorial with Hugging Face Models...")
   print("🤗 Using free models from Hugging Face Hub")
   print("📱 GPU acceleration available!" if torch.cuda.is_available() else "💻 Running on CPU")
  
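   # Top-level `await` works in notebooks such as Colab, where an event loop is already running.
   # In a plain Python script, call asyncio.run(main()) directly instead.
   try: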
       await main()
   except RuntimeError:
       import nest_asyncio
       nest_asyncio.apply()
       asyncio.run(main())
  
   print("\n✅ Tutorial completed! You've learned:")
   print("• How to set up Cognee with Hugging Face models")
   print("• AI-powered response generation")
   print("• Multi-domain knowledge management")
   print("• Advanced reasoning and retrieval")
   print("• Conversational agent with memory")
   print("• Free GPU-accelerated inference")

We conclude the tutorial by running a comprehensive demonstration of our AI agent in action. We first teach it from multi-domain documents, then test its ability to retrieve knowledge and reason intelligently. Next, we engage it in a natural conversation, watching it learn and recall information taught by users. Finally, we view a summary of its memory, showcasing how it organizes and filters knowledge by domain, all with real-time inference using free Hugging Face models.
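The demonstration is launched with a top-level `await`, which notebooks allow but plain Python scripts do not. One way to support both is to check for a running event loop first. This is a sketch under that assumption; `run_entrypoint` and `demo_main` are hypothetical names, and `demo_main` only stands in for the tutorial's `main()`.

```python
import asyncio
from typing import Any, Callable, Coroutine


def run_entrypoint(coro_factory: Callable[[], Coroutine[Any, Any, Any]]) -> Any:
    """Run an async entry point from synchronous code.

    Without a running event loop (a plain script), asyncio.run drives the coroutine
    to completion and returns its result. With a running loop (a notebook), a task
    is scheduled and returned so the caller can await it.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro_factory())
    return asyncio.ensure_future(coro_factory())


async def demo_main() -> str:
    # Hypothetical stand-in for the tutorial's main(); deliberately tiny
    await asyncio.sleep(0)
    return "demo finished"


result = run_entrypoint(demo_main)
print(result)
```

In a script this prints the coroutine's return value. In a notebook it prints a task object instead, and `await result` yields the value.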

In conclusion, we’ve built a working AI agent that can learn from text documents, recall and reason with stored knowledge, and converse using Hugging Face models. We configured Cognee for memory, demonstrated domain-specific queries, and simulated real conversations with the agent.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.