TextToSQL is a powerful tool designed to convert natural human language into SQL queries. Leveraging LangChain and Ollama, it can generate, explain, and execute SQL queries against a specified database. This project streamlines database interactions, making it easier for users without SQL expertise to retrieve and manipulate data.
- Natural Language to SQL Conversion: Converts user input in plain English into a valid SQL query.
- Query Explanation: Provides a detailed explanation of the generated SQL query.
- Query Execution: Executes the generated SQL query against the connected database.
- User Interaction: Allows users to choose whether to execute the query, get an explanation, or abort the operation.
- Dynamic few-shot prompting: Dynamically chooses examples that best matches with the user question by using similarity search. Uses these examples to generate a few-shot prompt.
- Data augmentation: Creates a set of natural language questions and their corresponding SQL queries for a specified database, used as examples for dynamic few-shot prompting in SQL query generation.
-
Query Generation:
- The user inputs a question in natural language.
query_chain
validates the question and generates an SQL query usingwrite_query_chain
andllm_approval
.
-
User Interaction:
- The system asks the user to choose an action:
- Execute the query.
- Get an explanation of the query.
- Abort the operation.
- The system asks the user to choose an action:
-
Action Execution:
- If the user chooses to execute the query,
execute_query_chain
runs the SQL query against the database and returns the results. - If the user asks for an explanation,
explain_chain
provides a detailed breakdown of the SQL query. - The loop continues until the user decides to execute the query or abort the operation.
- If the user chooses to execute the query,
-
Input:
What is the total revenue generated from selling Ford Mustang?
-
Generated SQL Query:
SELECT SUM(price * quantity) as total_revenue FROM sales JOIN cars ON sales.carid = cars.carid WHERE cars.make = 'Ford' AND cars.model = 'Mustang';
-
User Actions:
- Execute the query to get results.
[(Decimal('112051.12'),)]
- Get an explanation:
This query calculates the total revenue for Ford Mustand car sales by summing the product of price and qunatity sold from the 'sales' table, joining it with the 'cars' table where the car ID matches, and filtering results based on make ('Ford') and model ('Mustang').
This project is licensed under the MIT License.
Feel free to customize this README to better fit your project's specifics and structure.