Mask R-CNN using TensorFlow and OpenCV to increase inference performance on NVIDIA GPUs

Mask R-CNN is a deep neural network for instance segmentation. In other words, it can separate the different objects in an image or a video. You give it an image, and it gives you back the object bounding boxes, classes and masks.

You can read the full paper explaining Mask R-CNN here:

Basically, Mask R-CNN is an extension of the Faster R-CNN object detection network: it adds a branch that predicts a segmentation mask for every detected object. The model generates bounding boxes and segmentation masks for each instance of an object in the image.

You can find an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow, based on a Feature Pyramid Network (FPN) and a ResNet101 backbone, here:

With this implementation you can easily train Mask R-CNN, or apply transfer learning, on your own dataset with TensorFlow/Keras. I will talk about Mask R-CNN training with Python 3 and TF/Keras in another article.

For this example we are going to use the default Mask R-CNN weights trained on the COCO dataset, loaded through the DNN module included in OpenCV 4.2.0.

First of all you have to download the sources and compile OpenCV 4.2.0.

My workstation runs Ubuntu 18.04 with an NVIDIA GeForce RTX 2080, NVIDIA driver 440.59, CUDA 10.2 and cuDNN 7.5.0, which is the minimum cuDNN version required to build OpenCV 4.2.0 with cuDNN support.

Create a directory, for example with mkdir OpenCV-4.2.0 in your home directory, and move into it.

Download both opencv and opencv_contrib :

wget -O

wget -O

Unzip both archives and rename directories to opencv and opencv_contrib :

mv opencv-4.2.0 opencv

mv opencv_contrib-4.2.0 opencv_contrib

cd opencv

mkdir build

cd build

Now you are ready to run cmake for OpenCV 4.2.0 inside the build directory. You need cmake version 3.10 or newer, otherwise the build will fail, so update cmake on your system before proceeding.
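
A quick way to check the cmake requirement before configuring is sketched below. The helper name version_ge is hypothetical, and the comparison relies on GNU sort -V, which is assumed to be available (it is on Ubuntu 18.04).

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="3.10"
if command -v cmake >/dev/null 2>&1; then
    # The first line of the output looks like "cmake version 3.10.2".
    installed="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$installed" "$required"; then
        echo "cmake $installed is recent enough"
    else
        echo "cmake $installed is older than $required, please upgrade"
    fi
else
    echo "cmake is not installed"
fi
```

If the check reports an older version, upgrade cmake first and run it again.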

These are the flags I used on my system for the cmake command to build OpenCV 4.2.0 with CUDA and cuDNN. I have excluded the Python builds because I am not interested in a Python environment on this machine, but you can set INSTALL_PYTHON_EXAMPLES=ON to build them:

-D CMAKE_INSTALL_PREFIX=/usr/local/opencv-4.2.0 \
-D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
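
As a sketch of what a complete configure call could look like, the snippet below combines the install prefix and contrib path with the standard OpenCV switches for the CUDA DNN backend. The CUDA-related flags (WITH_CUDA, WITH_CUDNN, OPENCV_DNN_CUDA, CUDA_ARCH_BIN), the build type and the Python switches are assumptions about a typical setup, not a record of the exact flags used on this machine. It only runs cmake when started from a build directory that sits inside the OpenCV sources.

```shell
# Assumed flag set for a CUDA/cuDNN build; adjust to your own system.
# CUDA_ARCH_BIN=7.0 is chosen to match a reported "NVIDIA GPU arch: 70";
# an RTX 2080 natively has compute capability 7.5.
CMAKE_FLAGS="-D CMAKE_BUILD_TYPE=Release \
-D CMAKE_INSTALL_PREFIX=/usr/local/opencv-4.2.0 \
-D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_ARCH_BIN=7.0 \
-D BUILD_opencv_python2=OFF \
-D BUILD_opencv_python3=OFF \
-D INSTALL_PYTHON_EXAMPLES=OFF"

if [ -f ../CMakeLists.txt ] && command -v cmake >/dev/null 2>&1; then
    # CMAKE_FLAGS is left unquoted on purpose so it splits into separate arguments.
    cmake $CMAKE_FLAGS ..
else
    echo "Run this from opencv/build; it would execute: cmake $CMAKE_FLAGS .."
fi
```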

If cmake successfully recognized CUDA and cuDNN you should see something like this:

-- NVIDIA GPU arch: 70
-- NVIDIA PTX archs:
-- cuDNN: YES (ver 7.5.0)

Now you can run the make command. I suggest checking how many processors your machine has with the nproc command, so you can speed up the OpenCV build with parallel jobs; on my machine I can go up to make -j24.
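
The job count can also be taken straight from nproc instead of typing it by hand. This is a minimal sketch: the variable name njobs is arbitrary, and the Makefile check is only there so the snippet does nothing harmful when run outside the build directory.

```shell
# Use every core reported by nproc, or fall back to a single job.
njobs="$(nproc 2>/dev/null || echo 1)"
echo "Building with ${njobs} parallel jobs"

if [ -f Makefile ]; then
    make -j"${njobs}"
else
    echo "No Makefile in this directory, run cmake first"
fi
```

Using all cores can make the machine sluggish during the build, so a lower number is a reasonable choice on a desktop you are working on.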

Once compilation has finished you can install OpenCV 4.2.0 on your machine with:

sudo make install
sudo ldconfig

Now that you have OpenCV installed on your system you can download the mask_rcnn.cpp OpenCV demo source here:

Clone the Git repository or download the archive, and follow the instructions on the page to download the TensorFlow Mask R-CNN model and weights trained on the COCO dataset.

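You have to modify mask_rcnn.cpp to enable GPU and cuDNN inference at the point where the network is loaded. In OpenCV 4.2.0 this means calling net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA) and net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA) on the loaded network, where net stands for the cv::dnn::Net object in the sample: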


You can download my modified mask_rcnn.cpp, which also fixes a bug in the bounding box drawing, here:

To build you can use Eclipse or any other IDE you are used to, or just compile from the command line.

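Remember to add the OpenCV headers to your include paths with -I: /usr/local/include/opencv4 (or <prefix>/include/opencv4 if you installed under a different prefix, such as /usr/local/opencv-4.2.0 above).
Link the following libraries with -l: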


Set the library search path with -L: /usr/local/opencv-4.2.0/lib
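
Put together on the command line, the include path, library path and libraries could look like the sketch below. The library list is an assumption about what a DNN video sample typically needs (core, imgproc, imgcodecs, highgui, videoio, dnn), not a confirmed list for this demo. The -std=c++11 and rpath options are additions of mine as well, and the include path assumes the headers were installed under the same prefix as the libraries. The command is printed first and only executed when mask_rcnn.cpp and g++ are present.

```shell
# Assumed install prefix, taken from CMAKE_INSTALL_PREFIX.
OPENCV_PREFIX="/usr/local/opencv-4.2.0"

# Assumed library list for a DNN video sample; adjust if the linker reports missing symbols.
LIBS="opencv_core opencv_imgproc opencv_imgcodecs opencv_highgui opencv_videoio opencv_dnn"

# Turn the library names into -l flags.
LDLIBS=""
for lib in $LIBS; do
    LDLIBS="$LDLIBS -l$lib"
done

# The rpath option lets the binary find the libraries at run time
# even if the prefix is not known to the system linker cache.
CMD="g++ -std=c++11 mask_rcnn.cpp -o mask_rcnn.out -I$OPENCV_PREFIX/include/opencv4 -L$OPENCV_PREFIX/lib$LDLIBS -Wl,-rpath,$OPENCV_PREFIX/lib"
echo "$CMD"

if [ -f mask_rcnn.cpp ] && command -v g++ >/dev/null 2>&1; then
    # CMD is left unquoted on purpose so it splits into separate arguments.
    $CMD
fi
```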

Then run the test program:

./mask_rcnn.out --video=<path to your video file>

Here’s my sample video of the test program running on an NVIDIA RTX 2080 GPU at 20-25 fps using CUDA and cuDNN acceleration. Enjoy:
