The frontend of DailyBriefs is built using HTML, CSS, and JavaScript. It provides a user-friendly interface for displaying summarized news articles in various categories such as sports, technology, and stock market news.
- HTML Structure: The main structure of the frontend is defined in the `index.html` file. It includes a header, a main section with a dropdown for selecting news categories, and a section for displaying the articles.
- CSS Styling: The styling is handled by the `style.css` file, which ensures a clean and responsive design. Key elements such as the header, dropdown, and article summaries are styled for a better user experience.
- JavaScript Functionality: The core functionality is implemented in the `script.js` file. It includes event listeners and functions to fetch and display articles based on the selected category.
When a user selects a category (e.g., sports or technology) from the dropdown, the frontend fetches the latest news summaries from the Supabase database for the current date. This is done through an API endpoint that the backend provides.
The backend has a route `/get_articles/<category>` that handles fetching articles from Supabase. The `get_articles_from_supabase` function queries the database for articles with the current date.
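A minimal sketch of what this route could look like with FastAPI and the `supabase` Python client is shown below; the one-table-per-category layout and the column names are assumptions, not the project's exact schema.

```python
import os
from datetime import date

from fastapi import FastAPI
from supabase import create_client

app = FastAPI()
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def get_articles_from_supabase(category: str) -> list[dict]:
    """Query the category's table for rows dated today (schema is assumed)."""
    today = date.today().isoformat()
    response = (
        supabase.table(category)      # assumed: one table per news category
        .select("*")
        .eq("date", today)
        .execute()
    )
    return response.data

@app.get("/get_articles/{category}")
def get_articles(category: str):
    return get_articles_from_supabase(category)
```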
When the user selects "Sports" from the dropdown, the frontend fetches the latest sports news summaries from the backend and displays them. Below is an example of how the sports news is presented:
This project is a backend service that automatically fetches, summarizes, and stores news articles from various categories using AI-powered summarization.
The backend follows a pipeline to process news articles (an end-to-end sketch follows the list below):
- Fetching News URLs: The system searches for news articles using Google News.
- Web Scraping: It then scrapes the content from the news websites.
- AI Summarization: The scraped content is summarized using AI models.
- Database Storage: Finally, the summaries are stored in a Supabase database.
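Putting these four steps together, a rough pipeline driver for one category might look like the sketch below. The function names come from the sections that follow, but their exact signatures and the returned field names (`link`, `text`, `title`, `url`) are assumptions.

```python
# Illustrative pipeline driver for the stock market category; signatures and
# returned dictionary keys are assumed, not taken from the actual source.
from ai_agent import getNewsData, get_news_data_from_url, sm_summary
from app import insert_summary

def run_stock_pipeline(query: str = "stock market news", num_results: int = 5) -> None:
    # 1. Fetching News URLs
    for item in getNewsData(query, num_results):
        # 2. Web Scraping
        article = get_news_data_from_url(item["link"])
        # 3. AI Summarization
        summary = sm_summary(article["text"])
        # 4. Database Storage (skip articles the model flags as off-topic)
        if "NOT RELEVANT" not in summary:
            insert_summary("stock_market", summary, article["title"], article["url"])
```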
- The `getNewsData` function in `ai_agent.py` searches Google News for articles based on a query and a number of results (see the sketch after this list).
- For stock news, `get_stock_news_data` is used to fetch Yahoo Finance articles specifically.
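A hypothetical version of `getNewsData` built on the third-party `GoogleNews` package is sketched below; the real implementation in `ai_agent.py` may use a different search client.

```python
# Hypothetical sketch only; ai_agent.py may query Google News differently.
from GoogleNews import GoogleNews

def getNewsData(query: str, num_results: int) -> list[dict]:
    """Search Google News and return up to num_results article entries."""
    googlenews = GoogleNews(lang="en")
    googlenews.search(query)
    results = googlenews.results()  # entries carry title, link, date, source
    return results[:num_results]

# Example: collect links for the top five technology headlines
tech_links = [item["link"] for item in getNewsData("technology news", 5)]
```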
- The `get_news_data_from_url` function in `ai_agent.py` scrapes the content from each news URL (a sketch follows this list).
- It uses BeautifulSoup to parse the HTML and extract relevant information such as the title, text, date, and source.
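The scraping step might look roughly like the sketch below; the generic `<p>`-tag extraction is an assumption, since real article layouts vary by site and the actual function also pulls out the date and source.

```python
# Rough scraping sketch; selectors are generic assumptions, not the real ones.
import requests
from bs4 import BeautifulSoup

def get_news_data_from_url(url: str) -> dict:
    """Fetch a news page and extract its title and body text."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
    return {"url": url, "title": title, "text": "\n".join(paragraphs)}
```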
- Three summarization functions are used for different news categories: `sm_summary` for stock market news, `sports_summary` for sports news, and `tech_summary` for technology news.
- These functions use OpenAI's GPT model via the LangChain library to generate summaries.
- Custom prompts are used to guide the AI in creating relevant summaries for each category.
- The `insert_summary` function in `app.py` handles storing summaries in Supabase (see the sketch after this list).
- It creates a data object with the current date, summary, news title, and URL.
- The data is then inserted into the appropriate table in Supabase.
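A minimal sketch of `insert_summary`, assuming one Supabase table per category and the column names shown; the actual table and column names in `app.py` may differ.

```python
import os
from datetime import date

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def insert_summary(category: str, summary: str, title: str, url: str) -> None:
    """Store one summary row in the category's table (names are assumed)."""
    data = {
        "date": date.today().isoformat(),
        "summary": summary,
        "title": title,
        "url": url,
    }
    supabase.table(category).insert(data).execute()
```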
- The `/upload_all/` endpoint triggers the summarization and upload process for all news categories (a sketch follows this list).
- It uses FastAPI's `BackgroundTasks` to run the processes asynchronously.
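Scheduling the work with `BackgroundTasks` could look like the sketch below; `run_pipeline` stands in for the project's real per-category processing functions, and the HTTP method and category names are assumptions.

```python
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def run_pipeline(category: str) -> None:
    """Placeholder for the real fetch -> scrape -> summarize -> store pipeline."""
    ...

@app.post("/upload_all/")
def upload_all(background_tasks: BackgroundTasks):
    # Queue each category; FastAPI runs the tasks after the response is sent.
    for category in ("stock_market", "sports", "technology"):
        background_tasks.add_task(run_pipeline, category)
    return {"status": "summarization started for all categories"}
```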
This project utilizes OpenAI's GPT-3.5-turbo model for summarizing news articles. We employ the map-reduce method to efficiently process longer texts while staying within the model's token limit.
The map-reduce approach is implemented using LangChain's `load_summarize_chain` function. This method involves:
- Splitting: The input text is split into smaller chunks.
- Mapping: Each chunk is summarized independently.
- Reducing: The individual summaries are combined into a final summary (see the sketch after this list).
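A sketch of this map-reduce summarization step is shown below, assuming a `ChatOpenAI` LLM and the chunking parameters given (both are assumptions). The `map_prompt` and `combine_prompt` strings would be category prompts such as the stock market prompt quoted later in this section.

```python
from langchain.chains.summarize import load_summarize_chain
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI  # older LangChain: langchain.chat_models

def summarize(article_text: str, map_prompt: str, combine_prompt: str) -> str:
    """Map-reduce summary of one article (chunk sizes are assumed values)."""
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # Splitting: break the article into chunks that fit the context window
    splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=200)
    docs = splitter.create_documents([article_text])
    # Mapping and Reducing are handled by the map_reduce chain
    chain = load_summarize_chain(
        llm,
        chain_type="map_reduce",
        map_prompt=PromptTemplate(template=map_prompt, input_variables=["text"]),
        combine_prompt=PromptTemplate(template=combine_prompt, input_variables=["text"]),
    )
    return chain.run(docs)
```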
We use carefully crafted prompts for each news category (stock market, sports, and technology). These prompts are designed to:
- Guide the model: Instruct the model to act as a specific type of analyst (e.g., "As a stock market analyst...").
- Ensure accuracy: Include instructions like "YOU DO NOT MAKE UP CONTENTS!" to prevent fabrication.
- Retain important information: Request that "all important numbers, statistics, and quotes are retained."
- Filter irrelevant content: Use the instruction "If the content is not related to [category], please respond with NOT RELEVANT."
- Focus on key points: Provide a list of specific items to include in the summary if mentioned in the original text.
Example of a map prompt (for stock market news):
map_prompt = """
As a stock market analyst, please summarize the following news content and extract key market insights and data:
"{text}"
YOU DO NOT MAKE UP CONTENTS!
Please include the following in your summary ONLY if mentioned:
1. Changes in major market indices (if mentioned)
2. Performance and reasons for important stocks
3. Key factors affecting the market (e.g., economic data, company earnings, policy changes, etc.)
4. Important statistics and percentage changes
5. Important quotes from analysts or experts
6. Predictions or opinions on future market trends
Please ensure all important numbers, statistics, and quotes are retained.
If the content is not related to the stock market, please respond with *NOT RELEVANT*.
Summary:
"""
Below is an example of the summarization result for stock market news:
- Install dependencies: `pip install -r requirements.txt`
- Set up environment variables in a `.env` file: `SUPABASE_URL`, `SUPABASE_KEY`, `OPENAI_API_KEY`
- Run the FastAPI server: `python app.py`