I have created a Semantic Segmentation model to segment potholes from an input video.
- I have trained a U-Net model from scratch using TensorFlow and Keras.
- I have used a custom Data Generator to feed the images and the masks as input to the U-Net model.
- The depth of the U-Net model can be customised (recommended depth: 3 to 5 for (256, 256, 3) images).
- The maximum allowed depth of the U-Net model is determined by the dimensions of the input frames, since each downsampling step in a U-Net reduces the spatial size of the feature maps.
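To illustrate why the input dimensions cap the depth, here is a minimal sketch. It assumes each level of the U-Net halves the height and width with 2x2 pooling (the usual U-Net design, though the exact layers in this repository may differ) and that both dimensions must divide evenly at every step so the skip connections line up. The function name `max_pooling_steps` is hypothetical and not part of the project.

```python
def max_pooling_steps(height: int, width: int) -> int:
    """Count how many times both dimensions can be halved exactly.

    Assumption: every U-Net level halves height and width (2x2 pooling),
    and an odd dimension cannot be halved without padding or cropping.
    """
    if height < 1 or width < 1:
        raise ValueError("height and width must be positive")
    steps = 0
    # Stop as soon as either dimension becomes odd.
    while height % 2 == 0 and width % 2 == 0:
        height //= 2
        width //= 2
        steps += 1
    return steps


print(max_pooling_steps(256, 256))  # square frame matching the recommended input size
print(max_pooling_steps(360, 640))  # a non-square frame, limited by its height
```

Under these assumptions a 256x256 frame allows 8 halvings, so a depth of 3 to 5 stays well inside the limit and leaves feature maps between 32x32 and 8x8 at the deepest level.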
The following library versions were used:
- TensorFlow - 2.10.1
- OpenCV (cv2) - 4.6.0
- NumPy - 1.23.5
- MoviePy - 1.0.3
Clone the project
git clone https://github.com/lilNewbie/SemanticSegmentation.git
Go to the project directory
cd SemanticSegmentation
Important: set the paths to the input and output video files before running the script.
Run segfile.py
python segfile.py
- The U-Net model was trained in Google Colab.
- You will need a Kaggle API token to download the dataset from Colab through Kaggle's public API.
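One way to make the Kaggle credentials available in a Colab session is sketched below. The Kaggle command-line client reads a `kaggle.json` file containing `username` and `key` fields from its configuration directory (by default `~/.kaggle`). The helper name `write_kaggle_credentials` and the placeholder values are assumptions for illustration, and the dataset identifier is not given in this README, so the download step itself is left out.

```python
import json
from pathlib import Path


def write_kaggle_credentials(username: str, key: str, config_dir: Path) -> Path:
    """Write a kaggle.json token file into config_dir and return its path.

    Hypothetical helper: the Kaggle client expects a JSON object with
    "username" and "key", readable only by the current user.
    """
    config_dir.mkdir(parents=True, exist_ok=True)
    token_path = config_dir / "kaggle.json"
    token_path.write_text(json.dumps({"username": username, "key": key}))
    # Restrict permissions so the client does not warn about a readable token.
    token_path.chmod(0o600)
    return token_path


# Example call in a Colab cell (placeholder values, replace with your own):
# write_kaggle_credentials("your-username", "your-api-key", Path.home() / ".kaggle")
```

Avoid committing the token to the repository; uploading it to the Colab session or entering it at runtime keeps it out of version control.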
Input Video
Output Video
The difference in clarity between the input and output videos is due to resizing the input frames for prediction and later resizing them back to their original dimensions.