In the last post, I wrote about how to create a GenAI chatbot from scratch, as instructed in a DataMites workshop. Some of the code in it was new to me, so in this post I explore what that code does, with the help of Claude. This code review is best read alongside the previous post: https://avtnshm.github.io/2025/02/07/GenAI-Chatbot.html

In [ ]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_chroma import Chroma

loader = TextLoader("text.txt")
text_docs = loader.load()  #loads the text file

splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=30)
final_docs = splitter.split_documents(text_docs)  #splits the text file contents using the RCTS splitter

myembeddings = OllamaEmbeddings(model="gemma2:2b")  #creates embeddings using the locally loaded Gemma 2 model with 2B parameters

vectordb = Chroma.from_documents(
    documents=final_docs,
    embedding=myembeddings,
    persist_directory="./chroma_db",
)  #creates the vector database and saves it to disk for future use via the persist_directory parameter

Now, let's review and understand the above code, which loads the text file, splits it, embeds the chunks, and finally stores them in the vector database for later retrieval


  • from langchain_community.document_loaders import TextLoader
  • from langchain_text_splitters import RecursiveCharacterTextSplitter
  • from langchain_community.embeddings import OllamaEmbeddings
  • from langchain_chroma import Chroma

As we can see, this part is pretty straightforward: we are importing various modules from the LangChain library to work with our data, in this case a text file.

RecursiveCharacterTextSplitter is useful because it helps preserve context: it splits text on a hierarchy of separators, trying paragraphs first, then lines, then words, and only falling back to individual characters if it has to. Since we give it a chunk size of 100 characters with a chunk overlap of 30 characters, related content tends to stay together within a chunk, which keeps the relevant context intact. A small sketch of the splitter in action follows below.
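As a rough illustration (the sample text here is made up, and the exact chunk boundaries depend on which separators the splitter finds in your file), we can call split_text directly and watch chunks of at most 100 characters come out, with consecutive chunks repeating a few trailing words because of the 30-character overlap:

In [ ]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# hypothetical sample text, just to watch the splitter work
sample = (
    "A's Caffe serves freshly roasted coffee, baked goods and light lunches. "
    "We are open from 8 AM to 9 PM every day, and our signature drink is the "
    "hazelnut mocha, made with locally sourced beans."
)

demo_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=30)
for i, chunk in enumerate(demo_splitter.split_text(sample)):
    print(i, len(chunk), repr(chunk))  # no chunk is longer than 100 characters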

In [ ]:
import os #importing the OS module
from dotenv import load_dotenv  #loads environment variables from a .env file
from langchain_community.embeddings import OllamaEmbeddings 
from langchain_chroma import Chroma 
from langchain_core.prompts import ChatPromptTemplate #template for convo with LLM
from langchain_groq import ChatGroq #lets us use models hosted on Groq
from langchain.chains import create_retrieval_chain #used for retrieving relevant content
from langchain.chains.combine_documents import create_stuff_documents_chain
#stuffs the retrieved document chunks into the prompt sent to the LLM
import streamlit as st #web application for data

load_dotenv()

groq_api_key ="insert your key here" #insert the secret key
model = ChatGroq(model="gemma2-9b-it", groq_api_key=groq_api_key) 
embeddings = OllamaEmbeddings(model='gemma2:2b')
mydb = Chroma(persist_directory='./chroma_db', embedding_function=embeddings)
retriever = mydb.as_retriever(search_type='similarity', search_kwargs={"k":6})
#using similarity search, we retrieve 6 most similar document chunks
st.title("Welcome to A's Caffe") query = st.chat_input("Ask me anything- ") #title of the app and display message

system_prompt = (
    "You are an assistant for question answering tasks for a restaurant called A's Caffe. "
    "Use the following pieces of retrieved context to answer the question. "
    "Make sure you talk very politely with the customer and don't write anything bad about the restaurant. "
    "Your tone of reply should always be exciting and enticing to the customers."
    "\n\n"
    "{context}"
)
#instructions to LLM for the kind of response we want

prompt = ChatPromptTemplate.from_messages([ ('system', system_prompt),('human',"{input}") ]) #combines human and system prompt

if query:
    question_answer_chain = create_stuff_documents_chain(model, prompt) #stuffs the retrieved chunks, along with the question, into the prompt for the LLM
    rag_chain = create_retrieval_chain(retriever, question_answer_chain) #chains the retriever with the question-answering step
    response = rag_chain.invoke({'input': query}) #runs the full RAG pipeline on the user's query
    st.write(response['answer']) #LLM's reply to the query
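One optional extra, not part of the workshop code, just a sketch assuming the same rag_chain, query and Streamlit app as above: the dictionary returned by create_retrieval_chain also carries the retrieved chunks under the 'context' key, so the app could show the customer which parts of the text file the answer was based on.

In [ ]:
if query:
    question_answer_chain = create_stuff_documents_chain(model, prompt)
    rag_chain = create_retrieval_chain(retriever, question_answer_chain)
    response = rag_chain.invoke({'input': query})
    st.write(response['answer'])  # the LLM's reply, as before

    # 'context' holds the document chunks the retriever passed to the LLM
    with st.expander("Show retrieved context"):
        for doc in response['context']:
            st.write(doc.page_content)

Since this script uses Streamlit widgets, it is meant to be saved as a file (say app.py, the name here is just a placeholder) and launched with streamlit run app.py, rather than executed inside the notebook.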