A CLI tool to transcribe a given .m4a audio file, generate thumbnails, and produce blog assets (title, description, blog post, LinkedIn post).
This is the current workflow for creating a blog post (for my newsletter, Ambitious x Driven):
As you can see, the major bottleneck for content production is preparing the blog assets from the recording.
As this is a side project (that I'm doing alongside my full-time CS Master's @ ETH Zurich), this would be unsustainable in the long run.
Thus, I decided to build a Python CLI tool leveraging the latest LLMs & prompt-engineering techniques that would automate this workflow.
Here is a diagram that shows the processing pipeline for the blog writer.
To generate the blog assets, you need to input an audio file, a PDF resume, and a photo of the guest. It then:
- Extracts the guest details from the resume
- Transcribes the audio file (using AssemblyAI w/ speaker diarization)
- Generates landscape + square thumbnails in the Ambitious x Driven style by using an 🤗 Hugging Face Image Segmentation model & the resume details
- Generates a blog post, title, description and LinkedIn post using the transcript and the guest details (to reduce hallucinations + fix transcription errors).
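The transcription step can be sketched with the AssemblyAI Python SDK as below. `format_utterances` is a hypothetical helper for illustration; my actual pipeline configuration may differ:

```python
def format_utterances(utterances):
    """Render diarized utterances as 'Speaker X: text' lines."""
    return "\n".join(f"Speaker {u['speaker']}: {u['text']}" for u in utterances)

def transcribe_with_diarization(audio_path, api_key):
    """Transcribe an .m4a/.mp3 file with speaker labels via AssemblyAI."""
    import assemblyai as aai  # deferred import: pip install assemblyai
    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(speaker_labels=True)
    transcript = aai.Transcriber().transcribe(audio_path, config=config)
    return [{"speaker": u.speaker, "text": u.text} for u in transcript.utterances]
```

With two speakers (guest + interviewer), the speaker labels let the downstream prompts distinguish questions from answers.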
[Design] Below is the system design for the blog writer.
- I created a separate FileHelper class + FileHandlers that directly interact with the file system (as it's more convenient for a personal project + viewing the outputs).
- I encapsulated each Blog 'component' into a separate schema and helper class ('Files', 'Metadata', 'Thumbnails', 'Blog').
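A minimal sketch of how such component schemas could be modeled (field names here are illustrative assumptions, not the project's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Metadata:
    """Guest details extracted from the resume; reused to ground the LLM."""
    guest_name: str = ""
    title: str = ""
    description: str = ""

@dataclass
class Thumbnails:
    landscape_path: str = ""
    square_path: str = ""

@dataclass
class Blog:
    """One blog bundle, keyed by the Zoom recording name."""
    name: str
    transcript: str = ""
    metadata: Metadata = field(default_factory=Metadata)
    thumbnails: Thumbnails = field(default_factory=Thumbnails)
```

Keeping each component in its own schema means a generation step only touches the fields it owns.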
Here is how a typical file change happens (e.g. generating a title):
- User requests a file from the CLI (we are requesting files as there are multiple Zoom recordings in the folder)
- The BlogEditor class requests the file from the FileHelper
- The FileHelper parses together all the files using the relevant FileHandlers, then returns it to the BlogEditor (then CLI for viewing)
- Then, through the CLI, the user can apply relevant functions (e.g. generate all, transcribe, edit, ...). The BlogEditor gets the latest file, applies the requested function, saves the file, then returns it to the CLI for viewing.
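The load → apply → save round-trip described above can be sketched as follows. This is a simplified sketch: the real FileHelper parses files from the Zoom folder via its FileHandlers, and the in-memory stand-in exists only for illustration:

```python
class BlogEditor:
    """Applies a requested function to the latest version of a blog file."""

    def __init__(self, file_helper):
        self.file_helper = file_helper

    def apply(self, blog_name, fn):
        blog = self.file_helper.load(blog_name)  # parsed via the FileHandlers
        blog = fn(blog)                          # e.g. generate a title
        self.file_helper.save(blog_name, blog)   # persist the change
        return blog                              # returned to the CLI for viewing


class InMemoryFileHelper:
    """Stand-in for the real file-system-backed FileHelper."""

    def __init__(self, blogs):
        self.blogs = blogs

    def load(self, name):
        return dict(self.blogs[name])

    def save(self, name, blog):
        self.blogs[name] = blog
```

Because every operation re-loads before applying, the editor always works on the latest state on disk.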
On MacOS:
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
- Update the Zoom folder path in blog_editor.py
- You'll have to write your own prompts.py file (I can provide a skeleton if needed - reach out here: http://linkedin.com/in/anirudhhramesh/ or anirudhh.ramesh[AT]gmail.com)
- You'll have to include a .env file following the env.example file
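For orientation, a .env for a pipeline like this would look roughly like the following - the variable names below are my guesses, not the real ones, so follow the repo's env.example:

```shell
# Hypothetical variable names - check env.example for the actual ones
ASSEMBLYAI_API_KEY=your-assemblyai-key   # transcription + speaker diarization
OPENAI_API_KEY=your-llm-key              # blog/title/description generation
NOTION_API_KEY=your-notion-key           # used by the publish command
```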
- Provide .m4a/.mp3 2-person interview (with one guest and one interviewer)
- Provide resume.pdf (of the guest)
- Provide photo.png (of the guest) - the filename needs to end with photo.png
Then:
python cli.py
This will run & open the CLI interface. I recommend going into full-screen terminal mode before running this.
In the CLI interface:
- list (see all the blogs found in your Zoom folder)
- get <blog_name> (get the blog with the given name, this will be your 'working blog')
- generate all (generate all the attributes for the blog)
- edit (edit the attribute with the given value)
- publish (publish the blog to notion)
- reset all (reset all the attributes for the blog) - (NOT YET IMPLEMENTED)
- reset (reset the attribute with the given value) - (NOT YET IMPLEMENTED)
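A sketch of how parsing for these commands could work (a hypothetical parser for illustration, not the tool's actual implementation):

```python
# Commands the CLI understands; two-word commands are matched first
KNOWN_COMMANDS = {"list", "get", "generate", "generate all",
                  "edit", "publish", "reset", "reset all"}

def parse_command(line):
    """Split an input line into (command, args), e.g. 'get ep1' -> ('get', ['ep1'])."""
    parts = line.strip().split()
    if not parts:
        return None, []
    two_word = " ".join(parts[:2])
    if two_word in KNOWN_COMMANDS:   # 'generate all', 'reset all'
        return two_word, parts[2:]
    if parts[0] in KNOWN_COMMANDS:   # single-word commands
        return parts[0], parts[1:]
    return None, parts               # unknown command
```

Matching the two-word form first keeps 'generate all' from being read as 'generate' with the argument 'all'.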
- I probably will write a paper on this, so I'll share more details once that's prepared :)
- In the meantime, I'm looking for labs/researchers to advise the paper-writing - if you're interested, please email me: anirudhh.ramesh[AT]gmail.com
I use the BiRefNet image segmentation model from Hugging Face; it seemed to work best out of the couple I tried.
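A sketch of how BiRefNet can be loaded for background removal, based on its public model card; `mask_to_alpha` is a hypothetical helper, and my actual thumbnail code differs:

```python
def mask_to_alpha(mask_values):
    """Scale float mask probabilities (0..1) into 8-bit alpha values."""
    return [min(255, max(0, round(v * 255))) for v in mask_values]

def cut_out_guest(photo_path):
    """Segment the guest photo's foreground with BiRefNet (heavy imports deferred)."""
    import torch
    from PIL import Image
    from torchvision import transforms
    from transformers import AutoModelForImageSegmentation

    model = AutoModelForImageSegmentation.from_pretrained(
        "ZhengPeng7/BiRefNet", trust_remote_code=True)
    model.eval()

    preprocess = transforms.Compose([
        transforms.Resize((1024, 1024)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    image = Image.open(photo_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)

    with torch.no_grad():
        # The model returns multi-scale logits; the last one is the final mask
        mask = model(batch)[-1].sigmoid().cpu()[0].squeeze()

    alpha = Image.fromarray((mask.numpy() * 255).astype("uint8")).resize(image.size)
    image.putalpha(alpha)  # transparent background for thumbnail compositing
    return image
```

The cut-out guest can then be composited onto the landscape and square Ambitious x Driven templates.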