On my post "OpenCV DNN speed compare in Python, C#, C++", Blaise Thunderbytes asked me to implement pjreddie's YOLO with OpenCvSharp, which is why this post exists :P
Since OpenCV 3.3.1, the DNN module has supported parsing YOLO models, so we can easily use pre-trained YOLO models now. The OpenCV docs have a YOLO object detection tutorial written in C++; if you use C++, you can check it out. I will use C# with OpenCvSharp.
Better Speed
Compared with my previous test, YOLOv2 544x544 was almost 2x faster than SSD 512x512 (1000 ms vs 1900 ms, using CPU and OpenCvSharp), and about 1.5x faster than SSD with Python (1000 ms vs 1500 ms).
Let's look at the code. Because we use the DNN module to load the Darknet model, the code template is similar to before.
var cfg = "yolo-voc.cfg";
var model = "yolo-voc.weights"; //YOLOv2 544x544
var threshold = 0.3;
We use the YOLOv2 VOC 544x544 model.
var blob = CvDnn.BlobFromImage(org, 1 / 255.0, new Size(544, 544), new Scalar(), true, false);
var net = CvDnn.ReadNetFromDarknet(cfg, model);
net.SetInput(blob, "data");
Set the blob. Remember that the parameter values are important; different values give different results.
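To make the effect of those arguments a bit more concrete, here is a minimal sketch in plain C# (no OpenCvSharp needed) of what happens to a single pixel with the settings used above: a scale factor of 1/255, a zero mean, and swapRB set to true. The BlobMath class and NormalizePixel method are hypothetical names made up for illustration. The real BlobFromImage also resizes the image to 544x544 and rearranges it into a 4-D blob, which this sketch leaves out.

```csharp
using System;

// Hypothetical helper, for illustration only: mimics the per-pixel part of
// BlobFromImage(image, 1 / 255.0, size, new Scalar(), swapRB: true, crop: false).
public static class BlobMath
{
    // OpenCV stores pixels as B, G, R. With swapRB = true the blob channels
    // come out as R, G, B; the mean (0 here) is subtracted, then the scale is applied.
    public static float[] NormalizePixel(byte b, byte g, byte r,
                                         double scale = 1 / 255.0,
                                         double mean = 0.0,
                                         bool swapRB = true)
    {
        double first = swapRB ? r : b;
        double last = swapRB ? b : r;
        return new[]
        {
            (float)((first - mean) * scale),
            (float)((g - mean) * scale),
            (float)((last - mean) * scale),
        };
    }
}
```

For example, BlobMath.NormalizePixel(255, 0, 0) (a pure blue pixel in BGR order) gives values close to (0, 0, 1). Passing a different scale or swapRB = false feeds the network different numbers, which is why the detections change.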
const int prefix = 5; //skip 0~4
for (int i = 0; i < prob.Rows; i++)
{
    var confidence = prob.At<float>(i, 4);
    if (confidence > threshold)
    {
        //get classes probability
        Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out _, out Point max);
        var classes = max.X;
        var probability = prob.At<float>(i, classes + prefix);
        if (probability > threshold) //more accurate
        {
            //get center and width/height
            var centerX = prob.At<float>(i, 0) * w;
            var centerY = prob.At<float>(i, 1) * h;
            var width = prob.At<float>(i, 2) * w;
            var height = prob.At<float>(i, 3) * h;
            //label formatting
            var label = $"{Labels[classes]} {probability * 100:0.00}%";
            Console.WriteLine($"confidence {confidence * 100:0.00}% {label}");
            var x1 = (centerX - width / 2) < 0 ? 0 : centerX - width / 2; //avoid left side over edge
            //draw result
            org.Rectangle(new Point(x1, centerY - height / 2), new Point(centerX + width / 2, centerY + height / 2), Colors[classes], 2);
            var textSize = Cv2.GetTextSize(label, HersheyFonts.HersheyTriplex, 0.5, 1, out var baseline);
            Cv2.Rectangle(org, new Rect(new Point(x1, centerY - height / 2 - textSize.Height - baseline),
                new Size(textSize.Width, textSize.Height + baseline)), Colors[classes], Cv2.FILLED);
            Cv2.PutText(org, label, new Point(x1, centerY - height / 2 - baseline), HersheyFonts.HersheyTriplex, 0.5, Scalar.Black);
        }
    }
}
YOLO's output format is like this:
0,1 : Center x, y
2,3 : Width, Height
4 : Confidence
rest : Individual class probabilities
In this case, VOC has 20 classes, so columns 5~24 are the class probabilities.
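As a rough illustration of that layout, the sketch below decodes one output row held in a plain float array, without any OpenCvSharp types. The YoloRow class and its method names are hypothetical, and it assumes the row has at least one class column after the first five values.

```csharp
using System;

// Hypothetical helper, for illustration only: decodes one row of the
// YOLOv2 output laid out as [cx, cy, w, h, confidence, class0, class1, ...].
public static class YoloRow
{
    public const int Prefix = 5; // columns 0~4 hold the box and the confidence

    // Returns the index of the highest-scoring class and its probability.
    public static (int ClassId, float Probability) BestClass(float[] row)
    {
        int best = 0;
        for (int c = 1; c < row.Length - Prefix; c++)
        {
            if (row[Prefix + c] > row[Prefix + best]) best = c;
        }
        return (best, row[Prefix + best]);
    }

    // Converts the relative center/size values into pixel corner coordinates.
    public static (float Left, float Top, float Right, float Bottom) ToCorners(
        float[] row, int imageWidth, int imageHeight)
    {
        float cx = row[0] * imageWidth, cy = row[1] * imageHeight;
        float w = row[2] * imageWidth, h = row[3] * imageHeight;
        return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2);
    }
}
```

The box values are relative to the image size, which is why they are multiplied by the image width and height before drawing. BestClass does by hand what the MinMaxLoc call does on the class columns.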
After taking some time to figure that out, the rest just draws the results like before.
BTW, I added an if check (probability > threshold) to make the result look better; without it, the result will look like this.
And here is the final result.
And some other pictures.
The full code is here, or you can get it from GitHub.
using System;
using System.Diagnostics;
using System.Linq;
using OpenCvSharp;
using OpenCvSharp.Dnn;

namespace OpenCvDnnYolo
{
    class Program
    {
        private static readonly string[] Labels = { "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor" };
        private static readonly Scalar[] Colors = Enumerable.Repeat(false, 20).Select(x => Scalar.RandomColor()).ToArray();

        static void Main()
        {
            var file = "bali.jpg";
            // https://pjreddie.com/darknet/yolo/
            var cfg = "yolo-voc.cfg";
            var model = "yolo-voc.weights"; //YOLOv2 544x544
            var threshold = 0.3;
            var org = Cv2.ImRead(file);
            var w = org.Width;
            var h = org.Height;
            //setting blob, parameters are important
            var blob = CvDnn.BlobFromImage(org, 1 / 255.0, new Size(544, 544), new Scalar(), true, false);
            var net = CvDnn.ReadNetFromDarknet(cfg, model);
            net.SetInput(blob, "data");
            Stopwatch sw = new Stopwatch();
            sw.Start();
            //forward model
            var prob = net.Forward();
            sw.Stop();
            Console.WriteLine($"Runtime:{sw.ElapsedMilliseconds} ms");
            /* YOLO2 VOC output
               0 1 : center        2 3 : w/h
               4 : confidence      5 ~24 : class probability */
            const int prefix = 5; //skip 0~4
            for (int i = 0; i < prob.Rows; i++)
            {
                var confidence = prob.At<float>(i, 4);
                if (confidence > threshold)
                {
                    //get classes probability
                    Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out _, out Point max);
                    var classes = max.X;
                    var probability = prob.At<float>(i, classes + prefix);
                    if (probability > threshold) //more accurate
                    {
                        //get center and width/height
                        var centerX = prob.At<float>(i, 0) * w;
                        var centerY = prob.At<float>(i, 1) * h;
                        var width = prob.At<float>(i, 2) * w;
                        var height = prob.At<float>(i, 3) * h;
                        //label formatting
                        var label = $"{Labels[classes]} {probability * 100:0.00}%";
                        Console.WriteLine($"confidence {confidence * 100:0.00}% {label}");
                        var x1 = (centerX - width / 2) < 0 ? 0 : centerX - width / 2; //avoid left side over edge
                        //draw result
                        org.Rectangle(new Point(x1, centerY - height / 2), new Point(centerX + width / 2, centerY + height / 2), Colors[classes], 2);
                        var textSize = Cv2.GetTextSize(label, HersheyFonts.HersheyTriplex, 0.5, 1, out var baseline);
                        Cv2.Rectangle(org, new Rect(new Point(x1, centerY - height / 2 - textSize.Height - baseline),
                            new Size(textSize.Width, textSize.Height + baseline)), Colors[classes], Cv2.FILLED);
                        Cv2.PutText(org, label, new Point(x1, centerY - height / 2 - baseline), HersheyFonts.HersheyTriplex, 0.5, Scalar.Black);
                    }
                }
            }
            using (new Window("died.tw", org))
            {
                Cv2.WaitKey();
            }
        }
    }
}
Hope you enjoy it.
It seems lots of people can't get the right YOLO weight file after it was upgraded to version 3, so I uploaded my solution, including the weight file.
Download Here
It should be click and run, I hope :)
2019/1/10 Update: I have a new post on YOLO v3; you can try it if needed.
I got this exception on line 26 with your code. Do you know how I could solve it?
OpenCvSharp.OpenCVException
HResult=0x80131500
Message=ifile.is_open()
Source=OpenCvSharp
StackTrace:
at OpenCvSharp.NativeMethods.<>c.<.cctor>b__1579_0(ErrorCode status, String funcName, String errMsg, String fileName, Int32 line, IntPtr userdata)
at OpenCvSharp.NativeMethods.dnn_readNetFromDarknet(String cfgFile, String darknetModel)
at OpenCvSharp.Dnn.Net.ReadNetFromDarknet(String cfgFile, String darknetModel)
at OpenCvSharp.Dnn.CvDnn.ReadNetFromDarknet(String cfgFile, String darknetModel)
at OpenCvDnnYolo.Program.Main() in F:\Icarian\Escritorio\OpenCvSharpDnnYolo-master\OpenCvDnnYolo\Program.cs:line 26
Nvm. Weights file was missing.
Glad you found the reason.
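Since the "ifile.is_open()" message in this thread turned out to mean a missing weights file, one possible way to get a clearer error is to check the paths before loading the network. This is only a sketch: the ModelFiles class and its methods are hypothetical names, not part of OpenCvSharp.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only: verifies that the model files
// exist before they are handed to CvDnn.ReadNetFromDarknet.
public static class ModelFiles
{
    // Returns the paths that are missing on disk (an empty array if all exist).
    public static string[] FindMissing(params string[] paths)
    {
        return Array.FindAll(paths, p => !File.Exists(p));
    }

    // Throws a descriptive exception instead of relying on the native error.
    public static void EnsureExist(params string[] paths)
    {
        var missing = FindMissing(paths);
        if (missing.Length > 0)
        {
            throw new FileNotFoundException(
                "Missing model file(s): " + string.Join(", ", missing)
                + " (looked in " + Directory.GetCurrentDirectory() + ")");
        }
    }
}
```

Calling ModelFiles.EnsureExist(cfg, model) just before ReadNetFromDarknet would then report which file is missing and which working directory was searched.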
Is there a way to only detect 1 kind of object, to get better FPS processing?
As far as I know, retraining the model with one class can give better speed. If you're using YOLO, you can try tiny YOLO; it's a lot faster but has lower mAP.
Hi again,
Trying to get this working but getting this error "OpenCvSharp.OpenCVException: 'separator_index < line.size()'" at this line var net = CvDnn.ReadNetFromDarknet(cfg, model);
I have tried a few yolov2 and yolov3 config and weight file with the same exception. Could you upload the exact files you used?
I downloaded yolov2 from https://pjreddie.com/darknet/yolo/ , you can try it.
Hi Derek,
I uploaded my solution, you can download it from https://mega.nz/#!knhiwT4Z!aVqbGvDjl__wAPIcJVsq1CM8OhjhFPKHZv6aiaNOKUc
Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out _, out Point max); I get an error that Point is a type but is used like a variable. Can you help me?
Maybe your .NET/C# version is too low?
Hey, just had this issue as well but this was fixed by declaring this variable outside of the line, like so:
Point min;
Point max;
Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out min, out max);
var classes = max.X;
I'm actually having some trouble myself where everything is working mostly fine, however I'm getting an issue with
org.Rectangle(new Point(x1, centerY - height / 2), new Point(centerX + width / 2, centerY + height / 2), Colors[classes], 2); with the error ArgumentException: right > left
Any ideas? I'm using Unity for this implementation and it's working up until this line, going so far as correctly identifying the images in a Debug.Log, but failing to draw the rectangle
First point should be centerX-width/2 instead of x1
Tried that but sadly to no avail, been messing around fruitlessly but I can't seem to figure it out.
Here's a snippet of my results, https://imgur.com/a/hcowSNY , it's my first time working with neural nets and I can't tell if these are off or correct (huge numbers for an image only around 700x400). If anyone could compare to their own results it would be much appreciated, using the picture horses.jpg for this one and it would be the first resultset.
I've fixed the issue, in my version of C# I was required to make casts to several data types which I naively assumed were integers. I downloaded your project and adjusted my types to be the same as yours and she works perfectly!
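For anyone hitting similar drawing errors, one defensive option is to convert the float box into an integer rectangle that is clamped to the image before drawing. The sketch below is an assumption about what might help, not the fix used in this thread; BoxMath and ToClampedRect are hypothetical names.

```csharp
using System;

// Hypothetical helper, for illustration only: turns a float center/size box
// into an integer rectangle that stays inside the image.
public static class BoxMath
{
    // Rounds to the nearest pixel and clamps into [min, max].
    private static int Clamp(double value, int min, int max)
    {
        return (int)Math.Max(min, Math.Min(max, Math.Round(value)));
    }

    public static (int X, int Y, int Width, int Height) ToClampedRect(
        float centerX, float centerY, float width, float height,
        int imageWidth, int imageHeight)
    {
        int left = Clamp(centerX - width / 2, 0, imageWidth - 1);
        int top = Clamp(centerY - height / 2, 0, imageHeight - 1);
        int right = Clamp(centerX + width / 2, 0, imageWidth - 1);
        int bottom = Clamp(centerY + height / 2, 0, imageHeight - 1);
        // Never return a negative size, even if the input box was inverted.
        return (left, top, Math.Max(0, right - left), Math.Max(0, bottom - top));
    }
}
```

The four integers can then be passed to new Rect(x, y, width, height), so the left edge cannot end up to the right of the right edge and no implicit float conversions are involved.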
Anyone have any tips on using this for YOLO3? I did some research and I think it's necessary to rebuild the OpenCV DLLs in OpenCvSharp.
Hi Died, could you share how to use GPU-accelerated computing with OpenCvSharp?
I tried, but got an error message: "cannot use cuda model"
Sorry for the late reply.
OpenCvSharp3 can enable GPU, but you have to build it yourself. It's hard; I never succeeded, lol.
As for OpenCvSharp4, it only supports CPU.
thank you!!!