I have created a semantic segmentation model that segments potholes in an input video.
- I trained a U-Net model from scratch using TensorFlow and Keras.
- A custom data generator feeds the images and their masks to the U-Net model.
- The depth of the U-Net model can be customised. The recommended depth is 3 to 5 for (256,256,3) images.
- The maximum allowed depth of the U-Net model is determined by the dimensions of the input frames.
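
To illustrate how the frame dimensions limit the depth, here is a minimal sketch. It assumes each U-Net level halves the height and width with a 2x2 pooling step, and that both dimensions must stay even at every level so the skip connections line up without cropping or padding. The function name `max_unet_depth` and the `min_size` parameter are hypothetical and are not taken from this project's code.

```python
def max_unet_depth(height: int, width: int, min_size: int = 1) -> int:
    """Count the 2x2 pooling steps that fit before a dimension turns odd
    or would drop below min_size (assumed rule, not the project's own check)."""
    depth = 0
    while (
        height % 2 == 0
        and width % 2 == 0
        and height // 2 >= min_size
        and width // 2 >= min_size
    ):
        height //= 2
        width //= 2
        depth += 1
    return depth


print(max_unet_depth(256, 256))  # 8
print(max_unet_depth(360, 640))  # 3
```

Under these assumptions a 256x256 frame allows up to 8 halvings, so the recommended depth of 3 to 5 fits comfortably, while a 360x640 frame stops at 3 because 45 is odd.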
The following library versions were used:
- TensorFlow - 2.10.1
- OpenCV (cv2) - 4.6.0
- NumPy - 1.23.5
- MoviePy - 1.0.3
Clone the project
git clone https://github.com/lilNewbie/SemanticSegmentation.git
Go to the project directory
cd SemanticSegmentation
Important! Set the paths to the input and output video files before running the script
Run segfile.py
python segfile.py
- The U-Net model was trained in Google Colab.
- You will need your Kaggle API credentials to access the dataset from Colab.
Input Video
input_video.mp4
Output Video
output_video.mp4
The output video is less sharp than the input because each input frame is resized down for prediction and then resized back to its original dimensions, which loses fine detail.
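
The loss of detail from this round trip can be shown with a small sketch. It uses plain NumPy block averaging and pixel repetition as a stand-in for the resizing done in the project (the actual interpolation method is not stated here), and the 8x8 checkerboard frame and the helper names `downscale` and `upscale` are hypothetical.

```python
import numpy as np


def downscale(frame: np.ndarray, factor: int) -> np.ndarray:
    """Shrink a 2-D frame by averaging non-overlapping factor x factor blocks.
    Height and width must be divisible by factor."""
    h, w = frame.shape
    return frame.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def upscale(frame: np.ndarray, factor: int) -> np.ndarray:
    """Enlarge a 2-D frame by repeating each pixel factor times along both axes."""
    return np.repeat(np.repeat(frame, factor, axis=0), factor, axis=1)


# Hypothetical 8x8 grayscale frame with fine detail (a one-pixel checkerboard).
original = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)

# Resize down for "prediction", then back up to the original dimensions.
restored = upscale(downscale(original, 2), 2)

print(restored.shape == original.shape)    # True: dimensions are recovered
print(np.abs(original - restored).max())   # 0.5: the fine pattern is not
```

The restored frame has the original dimensions, but the checkerboard has been averaged into a flat grey, which is the same kind of blurring visible in the output video.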