# **Conversational Retrieval Chatbot Documentation**
## **Project Overview**
The Conversational Retrieval Chatbot is a sophisticated AI-powered system designed to interact with users through a chatbot interface. It leverages document embeddings and advanced language models to provide context-aware responses. Key features include document uploading, conversational querying, text-to-speech functionality, and personality-based responses.
### **Features**
- **Upload and Process Documents**: Upload and store documents for querying.
- **Handle Conversational Queries**: Use a retrieval-based chain to respond to user queries.
- **Generate and Stream Text-to-Speech Responses**: Convert text responses to audio and stream them.
- **Personality Queries**: Respond to queries about the bot’s personality.
## **Setup and Installation**
### **Prerequisites**
- **Python 3.8 or higher**
- **`pip`** (Python package installer)
- **API Keys**: Obtain API keys for Deepgram and Groq.
### **1. Clone the Repository**
First, clone the repository to your local machine:
```bash
git clone <repository-url>
cd <repository-directory>
```
### **2. Create and Activate a Virtual Environment**
Create a virtual environment to manage dependencies:
```bash
python -m venv venv
```
Activate the virtual environment:
- **On Unix/macOS**:
```bash
source venv/bin/activate
```
- **On Windows**:
```bash
venv\Scripts\activate
```
### **3. Install Dependencies**
Install the required packages using `pip`:
```bash
pip install -r requirements.txt
```
### **4. Set Up Environment Variables**
Create a `.env` file in the root directory of the project with the following content:
```
GROQ_API_KEY=your_groq_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
```
Replace `your_groq_api_key` and `your_deepgram_api_key` with your actual API keys.
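If the application loads these values with `python-dotenv` (a common pattern, assumed here rather than confirmed by the source), the keys become available to `app.py` roughly like this:
```python
# Minimal sketch, assuming python-dotenv is used to load the .env file.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

GROQ_API_KEY = os.getenv("GROQ_API_KEY")
DEEPGRAM_API_KEY = os.getenv("DEEPGRAM_API_KEY")

if not GROQ_API_KEY or not DEEPGRAM_API_KEY:
    raise RuntimeError("Missing GROQ_API_KEY or DEEPGRAM_API_KEY in .env")
```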
### **5. Initialize Vector Store**
Run the server once to initialize the vector store:
```bash
python app.py
```
The vector store will be created or loaded based on existing files.
## **API Endpoints**
### **1. Upload Documents**
**Endpoint:** `/upload`
**Method:** `POST`
**Description:** Upload documents to be processed and stored for querying.
**Request:**
- `multipart/form-data`
- `files[]`: List of documents to upload. Supported formats: PDF, DOCX, DOC, TXT.
**Response:**
- **Success (200 OK):**
```json
{
"message": "Files processed successfully"
}
```
- **Error (400 Bad Request):**
```json
{
"error": "No file part"
}
```
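For illustration, a Flask route matching the behaviour described above might look like the sketch below; the actual handler in `app.py` may validate and process files differently (the `uploads` folder and variable names here are assumptions):
```python
# Hedged sketch of an /upload handler; not the project's exact implementation.
import os
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt"}
UPLOAD_DIR = "uploads"  # assumed temporary location
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route("/upload", methods=["POST"])
def upload_files():
    if "files[]" not in request.files:
        return jsonify({"error": "No file part"}), 400
    saved_paths = []
    for f in request.files.getlist("files[]"):
        ext = os.path.splitext(f.filename)[1].lower()
        if ext in ALLOWED_EXTENSIONS:
            path = os.path.join(UPLOAD_DIR, secure_filename(f.filename))
            f.save(path)
            saved_paths.append(path)
    # ...hand saved_paths to the document-processing pipeline here...
    return jsonify({"message": "Files processed successfully"}), 200
```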
### **2. Chat with the Bot**
**Endpoint:** `/chat`
**Method:** `POST`
**Description:** Send a message to the bot and receive a text and audio response.
**Request:**
- `application/json`
- `message`: The user’s query.
**Response:**
- **Success (200 OK):** Streams a response containing text and audio data.
**Text Response:**
```json
{
"text": "Bot’s response text"
}
```
**Audio Response:** Audio content streamed as `application/octet-stream`.
- **Error (400 Bad Request):**
```json
{
"error": "Please upload documents first"
}
```
```json
{
"error": "No message provided"
}
```
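A minimal client call might look like the following sketch; how the text and audio parts are framed in the stream depends on `app.py`, so the parsing here is deliberately naive:
```python
# Naive streaming client for /chat; adjust parsing to the server's actual stream framing.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/chat",
    json={"message": "Hello, bot!"},
    stream=True,
)
resp.raise_for_status()

# Write everything the server streams back to disk for inspection.
with open("chat_response.bin", "wb") as out:
    for chunk in resp.iter_content(chunk_size=4096):
        out.write(chunk)
```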
### **3. Personality Queries**
The bot responds to queries about its personality using predefined responses based on the input. Example queries include:
- "What is your name?"
- "Who created you?"
- "What can you do?"
The responses are customized based on the bot’s personality traits.
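A minimal sketch of this kind of keyword-based matching follows; the actual trigger phrases and replies used by `handle_personality_query` may differ:
```python
# Illustrative keyword matching; the real predefined responses live in app.py.
PERSONALITY_RESPONSES = {
    "what is your name": "I'm the Conversational Retrieval Chatbot.",
    "who created you": "I was built as part of this project.",
    "what can you do": "I can answer questions about the documents you upload.",
}

def handle_personality_query(user_input: str):
    """Return a canned reply if the query matches a personality question, else None."""
    normalized = user_input.lower().strip().rstrip("?")
    for trigger, reply in PERSONALITY_RESPONSES.items():
        if trigger in normalized:
            return reply
    return None
```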
## **File Upload and Processing Workflow**
1. **Upload Files**: Use the `/upload` endpoint to upload documents. Supported formats include `.pdf`, `.docx`, `.doc`, and `.txt`.
2. **Process Files**: Uploaded files are saved temporarily. The appropriate loader (e.g., PyPDFLoader, Docx2txtLoader, TextLoader) extracts text from each file.
3. **Split Text into Chunks**: The text is divided into smaller chunks using a `CharacterTextSplitter` for more manageable processing.
4. **Generate Embeddings**: Convert text chunks into embeddings using a pre-trained Hugging Face model (`sentence-transformers/all-MiniLM-L6-v2`).
5. **Store in Vector Database**: Save the embeddings in a Chroma vector database for efficient retrieval.
6. **Create Conversational Chain**: Construct a `ConversationalRetrievalChain` using the embeddings, allowing the chatbot to fetch relevant information based on user queries.
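Taken together, steps 2–6 correspond roughly to the LangChain sketch below (import paths vary between LangChain versions, and the chunk sizes, persist directory, and Groq model name are assumptions):
```python
# Illustrative pipeline sketch; not the exact code in app.py.
import os

from langchain.document_loaders import PyPDFLoader, Docx2txtLoader, TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain_groq import ChatGroq

LOADERS = {".pdf": PyPDFLoader, ".docx": Docx2txtLoader, ".doc": Docx2txtLoader, ".txt": TextLoader}

def build_chain(file_paths):
    # 2. Extract text from each uploaded file with the matching loader
    documents = []
    for path in file_paths:
        loader_cls = LOADERS[os.path.splitext(path)[1].lower()]
        documents.extend(loader_cls(path).load())

    # 3. Split the text into smaller chunks
    splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(documents)

    # 4. Embed the chunks with the MiniLM sentence-transformer
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

    # 5. Store the embeddings in a persistent Chroma vector database
    vector_store = Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")

    # 6. Build the retrieval chain on top of a ChatGroq model (model name is illustrative)
    llm = ChatGroq(model_name="llama3-8b-8192", temperature=0)
    return ConversationalRetrievalChain.from_llm(llm=llm, retriever=vector_store.as_retriever())
```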
## **Chat Workflow**
1. **Receive User Query**:
- **Endpoint**: `/chat`
- **Method**: POST
- **Description**: The user sends a query to this endpoint in JSON format.
**Example Request:**
```json
{
"message": "Tell me about World War II."
}
```
2. **Retrieve Relevant Chunks**:
- **Action**: Fetch the most relevant text chunks from the vector database based on the user query.
3. **Generate Response**:
- **Action**: Pass the relevant chunks and conversation history to the `ChatGroq` language model to generate a text response.
**Example Text Response:**
```json
{
"text": "World War II was a global conflict that lasted from 1939 to 1945..."
}
```
4. **Generate Audio Response**:
- **Action**: Convert the text response to audio using Deepgram’s text-to-speech service.
**Example Request to Deepgram:**
```json
{
"text": "World War II was a global conflict that lasted from 1939 to 1945...",
"voice": "en_us_male"
}
```
5. **Stream Audio Response**:
- **Action**: Stream the generated audio back to the client. The response includes both text and audio.
**Example Response:**
```json
{
"text": "World War II was a global conflict that lasted from 1939 to 1945...",
"audio_url": "<url_to_audio_stream>"
}
```
The client can use this URL to stream the audio response.
6. **Handle Errors**:
- **Action**: Return appropriate error messages for issues such as invalid input or external service failures.
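An illustrative server-side sketch of steps 2–5 follows; the chain invocation format and the Deepgram endpoint details are assumptions, so check `app.py` and Deepgram's API reference for the exact calls:
```python
# Hedged sketch of the answer-and-speak flow; not the project's exact implementation.
import os
import requests

def answer_and_speak(chain, question, chat_history):
    # Steps 2-3: retrieve relevant chunks and generate the text answer
    result = chain.invoke({"question": question, "chat_history": chat_history})
    text_response = result["answer"]

    # Step 4: convert the answer to speech with Deepgram's TTS REST API
    tts = requests.post(
        "https://api.deepgram.com/v1/speak?model=aura-asteria-en",  # voice/model is illustrative
        headers={
            "Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={"text": text_response},
        stream=True,
    )
    tts.raise_for_status()

    # Step 5: return the text plus an iterator of audio chunks for streaming to the client
    return text_response, tts.iter_content(chunk_size=4096)
```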
## **Code Overview**
### **Main Components**
- **Flask App**: Initializes and runs the web server.
- **`generateAudio(text)`**: Generates speech from text using Deepgram and streams the audio back to the client.
- **`create_conversational_chain(vector_store)`**: Creates a conversational retrieval chain using the `ChatGroq` model and a vector store.
- **`load_or_create_vector_store()`**: Loads or creates a vector store for document embeddings.
- **`upload_files()`**: Endpoint to upload and process documents.
- **`chat()`**: Endpoint to handle user queries and generate responses.
- **`remove_emojis(text)`**: Removes emojis from text to ensure cleaner responses.
- **`handle_personality_query(user_input)`**: Provides predefined responses based on user queries about the bot’s personality.
- **`add_personality_to_response(response, personality)`**: Adjusts the response based on the bot’s personality.
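As one example of these helpers, `remove_emojis(text)` can be implemented with a Unicode-range regex; the ranges covered in `app.py` may be broader or narrower:
```python
# Possible shape of remove_emojis; the exact ranges in app.py may differ.
import re

EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # symbols, pictographs, emoticons, extended pictographs
    "\U00002700-\U000027BF"  # dingbats
    "\U0001F1E6-\U0001F1FF"  # regional indicator (flag) symbols
    "]+",
    flags=re.UNICODE,
)

def remove_emojis(text: str) -> str:
    """Strip emoji characters so text-to-speech output stays clean."""
    return EMOJI_PATTERN.sub("", text)
```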
### **Debug Statements**
Throughout the code, debug statements are included to trace the flow and verify operations.
## **Running the Server**
Start the Flask server in debug mode:
```bash
python app.py
```
The server will be accessible at `http://127.0.0.1:5000`.
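The entry point in `app.py` typically looks like this (a conventional Flask pattern, assumed here):
```python
# Conventional Flask entry point; app.py may configure host/port differently.
if __name__ == "__main__":
    app.run(debug=True)  # serves on http://127.0.0.1:5000 by default
```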
## **Testing the API**
You can test the API using tools like `curl`, Postman, or directly via Python scripts.
**Example using `curl` to upload a file:**
```bash
curl -X POST -F "files[]=@path/to/your/file.pdf" http://127.0.0.1:5000/upload
```
**Example using `curl` to chat:**
```bash
curl -X POST -H "Content-Type: application/json" -d '{"message": "Hello, bot!"}' http://127.0.0.1:5000/chat
```
## **Troubleshooting**
- **Application Not Starting**:
- Ensure all environment variables are correctly set.
- Check for missing dependencies in `requirements.txt`.
- **File Upload Issues**:
- Verify the file formats and sizes.
- Check server logs for errors related to file processing.
- **API Errors**:
- Validate API requests and responses.
- Ensure external services (Groq, Deepgram) are accessible and operational.
- **Audio Streaming Problems**:
- Check the URL returned by Deepgram and ensure it is correct.
- Verify audio content is properly formatted and streamed.
## **License**
This project is licensed under the MIT License. See the `LICENSE` file for more details.