Craftech360-projects/AI-Bot-Server


# **Conversational Retrieval Chatbot Documentation**

## **Project Overview**

The Conversational Retrieval Chatbot is a Flask-based system that answers user questions from uploaded documents. It embeds the document text, retrieves the passages most relevant to each query, and passes them to a language model to produce context-aware responses. Key features include document uploading, conversational querying, text-to-speech output, and personality-based responses.

### **Features**

- **Upload and Process Documents**: Upload documents, which are split into chunks, embedded, and stored for querying.
- **Handle Conversational Queries**: Use a retrieval-based chain to respond to user queries.
- **Generate and Stream Text-to-Speech Responses**: Convert text responses to audio and stream them.
- **Personality Queries**: Respond to queries about the bot’s personality.

## **Setup and Installation**

### **Prerequisites**

- **Python 3.8 or higher**
- **`pip`** (Python package installer)
- **API Keys**: Obtain API keys for Deepgram and Groq.

### **1. Clone the Repository**

First, clone the repository to your local machine:

```bash
git clone <repository-url>
cd <repository-directory>
```

### **2. Create and Activate a Virtual Environment**

Create a virtual environment to manage dependencies:

```bash
python -m venv venv
```

Activate the virtual environment:

- **On Unix/macOS**:

  ```bash
  source venv/bin/activate
  ```

- **On Windows**:

  ```bash
  venv\Scripts\activate
  ```

### **3. Install Dependencies**

Install the required packages using `pip`:

```bash
pip install -r requirements.txt
```

### **4. Set Up Environment Variables**

Create a `.env` file in the root directory of the project with the following content:

```
GROQ_API_KEY=your_groq_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
```

Replace `your_groq_api_key` and `your_deepgram_api_key` with your actual API keys.
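
A missing key usually only surfaces on the first call to Groq or Deepgram, so it can help to check for both keys at startup. The following is a minimal sketch, not part of the project: `missing_keys` and `check_environment` are hypothetical helper names, and it assumes the values from `.env` have already been loaded into the process environment (for example by a dotenv loader, if the project uses one).

```python
import os

# Key names taken from the .env example above.
REQUIRED_KEYS = ("GROQ_API_KEY", "DEEPGRAM_API_KEY")


def missing_keys(env):
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def check_environment():
    """Raise early if a required key is missing from the process environment."""
    missing = missing_keys(os.environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `check_environment()` before the server starts turns a late API failure into an immediate, readable error.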

### **5. Initialize Vector Store**

Run the server once to initialize the vector store:

```bash
python app.py
```

On startup, the server loads the vector store if one already exists on disk; otherwise it creates a new one.
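
The load-or-create decision can be pictured as a check on the persistence directory. This is only a sketch of the idea: `vector_store_action` is a hypothetical helper, and the directory that `load_or_create_vector_store()` actually inspects is not documented here.

```python
import os


def vector_store_action(persist_dir):
    """Return "load" if persist_dir already holds files, otherwise "create"."""
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return "load"
    return "create"
```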

## **API Endpoints**

### **1. Upload Documents**

**Endpoint:** `/upload`  
**Method:** `POST`  
**Description:** Upload documents to be processed and stored for querying.

**Request:**

- `multipart/form-data`
  - `files[]`: List of documents to upload. Supported formats: PDF, DOCX, DOC, TXT.

**Response:**

- **Success (200 OK):**

  ```json
  {
    "message": "Files processed successfully"
  }
  ```

- **Error (400 Bad Request):**

  ```json
  {
    "error": "No file part"
  }
  ```
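
Because only four formats are supported, a client or the server can filter filenames before any loader runs. The sketch below is illustrative: `is_supported` and `partition_uploads` are hypothetical names, and the project's own validation may differ.

```python
import os

# Formats listed in the request description above.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt"}


def is_supported(filename):
    """Return True if the filename has a supported extension (case-insensitive)."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in ALLOWED_EXTENSIONS


def partition_uploads(filenames):
    """Split filenames into (accepted, rejected) lists, preserving order."""
    accepted = [name for name in filenames if is_supported(name)]
    rejected = [name for name in filenames if not is_supported(name)]
    return accepted, rejected
```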

### **2. Chat with the Bot**

**Endpoint:** `/chat`  
**Method:** `POST`  
**Description:** Send a message to the bot and receive a text and audio response.

**Request:**

- `application/json`
  - `message`: The user’s query.

**Response:**

- **Success (200 OK):** Streams a response containing text and audio data.

  **Text Response:**

  ```json
  {
    "text": "Bot’s response text"
  }
  ```

  **Audio Response:** Audio content streamed as `application/octet-stream`.

- **Error (400 Bad Request):**

  ```json
  {
    "error": "Please upload documents first"
  }
  ```

  ```json
  {
    "error": "No message provided"
  }
  ```
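
The two error cases above can be expressed as a small validation step that runs before the chain is invoked. This is a sketch under assumptions: `validate_chat_request` is a hypothetical helper, and the order in which the real `chat()` handler performs these checks is not specified in this document.

```python
def validate_chat_request(payload, chain_ready):
    """Return (status_code, body); body is None when the request may proceed.

    payload: the parsed JSON body (or None if parsing failed).
    chain_ready: True once documents have been uploaded and the chain exists.
    """
    message = (payload or {}).get("message")
    if not isinstance(message, str) or not message.strip():
        return 400, {"error": "No message provided"}
    if not chain_ready:
        return 400, {"error": "Please upload documents first"}
    return 200, None
```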

### **3. Personality Queries**

The bot responds to queries about its personality using predefined responses based on the input. Example queries include:

- "What is your name?"
- "Who created you?"
- "What can you do?"

The responses are customized based on the bot’s personality traits.
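
One simple way to implement predefined responses is a phrase lookup that runs before the retrieval chain. The sketch below reuses the function name `handle_personality_query` from the code overview, but the matching strategy and the placeholder answers are assumptions, not the project's actual logic or wording.

```python
# Placeholder answers: the real wording lives in the project's code.
PERSONALITY_RESPONSES = {
    "your name": "<the bot's name>",
    "created you": "<the bot's creator>",
    "what can you do": "<a summary of the bot's capabilities>",
}


def handle_personality_query(user_input):
    """Return a predefined answer if the input contains a known phrase, else None."""
    normalized = user_input.lower().strip()
    for phrase, answer in PERSONALITY_RESPONSES.items():
        if phrase in normalized:
            return answer
    return None
```

Returning `None` for unmatched input lets the caller fall through to the normal document-retrieval path.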

## **File Upload and Processing Workflow**

1. **Upload Files**: Use the `/upload` endpoint to upload documents. Supported formats include `.pdf`, `.docx`, `.doc`, and `.txt`.

2. **Process Files**: Uploaded files are saved temporarily. The appropriate loader (e.g., PyPDFLoader, Docx2txtLoader, TextLoader) extracts text from each file.

3. **Split Text into Chunks**: The text is divided into smaller chunks using a `CharacterTextSplitter` for more manageable processing.

4. **Generate Embeddings**: Convert text chunks into embeddings using a pre-trained Hugging Face model (`sentence-transformers/all-MiniLM-L6-v2`).

5. **Store in Vector Database**: Save the embeddings in a Chroma vector database for efficient retrieval.

6. **Create Conversational Chain**: Construct a `ConversationalRetrievalChain` that uses the vector store as its retriever, allowing the chatbot to fetch relevant chunks for each user query.
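
To make step 3 concrete, the following sketch splits text into fixed-size character windows with overlap. It is a simplified stand-in, not LangChain's `CharacterTextSplitter` (which splits on a separator and then merges pieces), and the `chunk_size` and `chunk_overlap` defaults are assumptions rather than the project's settings.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which tends to help retrieval.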

## **Chat Workflow**

1. **Receive User Query**:

   - **Endpoint**: `/chat`
   - **Method**: POST
   - **Description**: The user sends a query to this endpoint in JSON format.

   **Example Request:**

   ```json
   {
     "message": "Tell me about World War II."
   }
   ```

2. **Retrieve Relevant Chunks**:

   - **Action**: Fetch the most relevant text chunks from the vector database based on the user query.

3. **Generate Response**:

   - **Action**: Pass the relevant chunks and conversation history to the `ChatGroq` language model to generate a text response.

   **Example Text Response:**

   ```json
   {
     "text": "World War II was a global conflict that lasted from 1939 to 1945..."
   }
   ```

4. **Generate Audio Response**:

   - **Action**: Convert the text response to audio using Deepgram’s text-to-speech service.

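   **Example Request to Deepgram** (illustrative; the exact payload and how the voice or model is selected depend on the Deepgram API version the project uses):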

   ```json
   {
     "text": "World War II was a global conflict that lasted from 1939 to 1945...",
     "voice": "en_us_male"
   }
   ```

5. **Stream Audio Response**:

   - **Action**: Stream the generated audio back to the client. The response includes both text and audio.

   **Example Response:**

   ```json
   {
     "text": "World War II was a global conflict that lasted from 1939 to 1945...",
     "audio_url": "<url_to_audio_stream>"
   }
   ```

   The client can use this URL to stream the audio response.

6. **Handle Errors**:
   - **Action**: Return appropriate error messages for issues such as invalid input or external service failures.
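
The retrieval in step 2 amounts to ranking stored chunks by how close their embeddings are to the query embedding. The sketch below shows that idea with cosine similarity on toy vectors. In the project this search is delegated to the Chroma vector store, so the function names and the choice of metric here are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, indexed_chunks, k=2):
    """indexed_chunks is a list of (text, vector); return the k most similar texts."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```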

## **Code Overview**

### **Main Components**

- **Flask App**: Initializes and runs the web server.
- **`generateAudio(text)`**: Generates speech from text using Deepgram and streams the resulting audio.
- **`create_conversational_chain(vector_store)`**: Creates a conversational retrieval chain using the `ChatGroq` model and a vector store.
- **`load_or_create_vector_store()`**: Loads or creates a vector store for document embeddings.
- **`upload_files()`**: Endpoint to upload and process documents.
- **`chat()`**: Endpoint to handle user queries and generate responses.
- **`remove_emojis(text)`**: Removes emojis from text to ensure cleaner responses.
- **`handle_personality_query(user_input)`**: Provides predefined responses based on user queries about the bot’s personality.
- **`add_personality_to_response(response, personality)`**: Adjusts the response based on the bot’s personality.
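
As an illustration of what `remove_emojis(text)` might look like, the sketch below strips characters from common emoji Unicode blocks with a regular expression. The pattern is an approximation chosen for this example, not the project's implementation, and it does not cover every emoji sequence.

```python
import re

# Common emoji blocks; an approximation, not a complete list.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, supplemental symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F"                 # variation selector that follows some emoji
    "]+"
)


def remove_emojis(text):
    """Strip emoji characters and collapse the runs of spaces they leave behind."""
    without_emoji = EMOJI_PATTERN.sub("", text)
    return re.sub(r" {2,}", " ", without_emoji).strip()
```

Removing emojis before text-to-speech avoids sending symbols that a speech engine may skip or read aloud unexpectedly.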

### **Debug Statements**

The code includes debug statements that trace the request flow and help verify each step.

## **Running the Server**

Start the Flask server in debug mode:

```bash
python app.py
```

The server will be accessible at `http://127.0.0.1:5000`.

## **Testing the API**

You can test the API using tools like `curl`, Postman, or directly via Python scripts.

**Example using `curl` to upload a file:**

```bash
curl -X POST -F "files[]=@path/to/your/file.pdf" http://127.0.0.1:5000/upload
```

**Example using `curl` to chat:**

```bash
curl -X POST -H "Content-Type: application/json" -d '{"message": "Hello, bot!"}' http://127.0.0.1:5000/chat
```
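
For testing from a Python script, the chat request can be built with the standard library alone. This is a sketch: `build_chat_request` and `send_chat` are hypothetical helper names, the base URL is the default address given under "Running the Server", and `send_chat` returns the raw response bytes because the exact framing of the streamed text and audio is not specified here.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # default address from "Running the Server"


def build_chat_request(message, base_url=BASE_URL):
    """Build a POST request for /chat with a JSON body."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message, base_url=BASE_URL, timeout=60):
    """Send the request and return the raw response bytes (needs a running server)."""
    request = build_chat_request(message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the server running, `send_chat("Hello, bot!")` mirrors the `curl` chat example above.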

## **Troubleshooting**

- **Application Not Starting**:

  - Ensure all environment variables are correctly set.
  - Check for missing dependencies in `requirements.txt`.

- **File Upload Issues**:

  - Verify the file formats and sizes.
  - Check server logs for errors related to file processing.

- **API Errors**:

  - Validate API requests and responses.
  - Ensure external services (Groq, Deepgram) are accessible and operational.

- **Audio Streaming Problems**:
  - Check the URL returned by Deepgram and ensure it is correct.
  - Verify audio content is properly formatted and streamed.

## **License**

This project is licensed under the MIT License. See the `LICENSE` file for more details.
