├── .gitattributes
├── .gitignore
├── LICENSE
├── ObjectDetection.sln
├── ObjectDetection
│   ├── DataStructures
│   │   ├── ImageNetData.cs
│   │   └── ImageNetPrediction.cs
│   ├── Form1.Designer.cs
│   ├── Form1.cs
│   ├── Form1.resx
│   ├── ObjectDetection.csproj
│   ├── OnnxModelScorer.cs
│   ├── Program.cs
│   ├── YoloParser
│   │   ├── DimensionsBase.cs
│   │   ├── YoloBoundingBox.cs
│   │   └── YoloOutputParser.cs
│   └── assets
│       └── Model
│           └── TinyYolo2_model.onnx
├── Properties
│   ├── Resources.Designer.cs
│   └── Resources.resx
├── README.md
└── docs
    └── Netron
        └── netron.PNG
/.gitattributes: -------------------------------------------------------------------------------- 1 | *.onnx filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 3 | ## 4 | ## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore 5 | 6 | # User-specific files 7 | *.rsuser 8 | *.suo 9 | *.user 10 | *.userosscache 11 | *.sln.docstates 12 | 13 | # User-specific files (MonoDevelop/Xamarin Studio) 14 | *.userprefs 15 | 16 | # Mono auto generated files 17 | mono_crash.* 18 | 19 | # Build results 20 | [Dd]ebug/ 21 | [Dd]ebugPublic/ 22 | [Rr]elease/ 23 | [Rr]eleases/ 24 | x64/ 25 | x86/ 26 | [Aa][Rr][Mm]/ 27 | [Aa][Rr][Mm]64/ 28 | bld/ 29 | [Bb]in/ 30 | [Oo]bj/ 31 | [Ll]og/ 32 | [Ll]ogs/ 33 | 34 | # Visual Studio 2015/2017 cache/options directory 35 | .vs/ 36 | # Uncomment if you have tasks that create the project's static files in wwwroot 37 | #wwwroot/ 38 | 39 | # Visual Studio 2017 auto generated files 40 | Generated\ Files/ 41 | 42 | # MSTest test Results 43 | [Tt]est[Rr]esult*/ 44 | [Bb]uild[Ll]og.* 45 | 46 | # NUnit 47 | *.VisualState.xml 48 | TestResult.xml 49 | nunit-*.xml 50 | 51 | # Build Results of an ATL Project 52 | [Dd]ebugPS/ 53 |
[Rr]eleasePS/ 54 | dlldata.c 55 | 56 | # Benchmark Results 57 | BenchmarkDotNet.Artifacts/ 58 | 59 | # .NET Core 60 | project.lock.json 61 | project.fragment.lock.json 62 | artifacts/ 63 | 64 | # StyleCop 65 | StyleCopReport.xml 66 | 67 | # Files built by Visual Studio 68 | *_i.c 69 | *_p.c 70 | *_h.h 71 | *.ilk 72 | *.meta 73 | *.obj 74 | *.iobj 75 | *.pch 76 | *.pdb 77 | *.ipdb 78 | *.pgc 79 | *.pgd 80 | *.rsp 81 | *.sbr 82 | *.tlb 83 | *.tli 84 | *.tlh 85 | *.tmp 86 | *.tmp_proj 87 | *_wpftmp.csproj 88 | *.log 89 | *.vspscc 90 | *.vssscc 91 | .builds 92 | *.pidb 93 | *.svclog 94 | *.scc 95 | 96 | # Chutzpah Test files 97 | _Chutzpah* 98 | 99 | # Visual C++ cache files 100 | ipch/ 101 | *.aps 102 | *.ncb 103 | *.opendb 104 | *.opensdf 105 | *.sdf 106 | *.cachefile 107 | *.VC.db 108 | *.VC.VC.opendb 109 | 110 | # Visual Studio profiler 111 | *.psess 112 | *.vsp 113 | *.vspx 114 | *.sap 115 | 116 | # Visual Studio Trace Files 117 | *.e2e 118 | 119 | # TFS 2012 Local Workspace 120 | $tf/ 121 | 122 | # Guidance Automation Toolkit 123 | *.gpState 124 | 125 | # ReSharper is a .NET coding add-in 126 | _ReSharper*/ 127 | *.[Rr]e[Ss]harper 128 | *.DotSettings.user 129 | 130 | # TeamCity is a build add-in 131 | _TeamCity* 132 | 133 | # DotCover is a Code Coverage Tool 134 | *.dotCover 135 | 136 | # AxoCover is a Code Coverage Tool 137 | .axoCover/* 138 | !.axoCover/settings.json 139 | 140 | # Visual Studio code coverage results 141 | *.coverage 142 | *.coveragexml 143 | 144 | # NCrunch 145 | _NCrunch_* 146 | .*crunch*.local.xml 147 | nCrunchTemp_* 148 | 149 | # MightyMoose 150 | *.mm.* 151 | AutoTest.Net/ 152 | 153 | # Web workbench (sass) 154 | .sass-cache/ 155 | 156 | # Installshield output folder 157 | [Ee]xpress/ 158 | 159 | # DocProject is a documentation generator add-in 160 | DocProject/buildhelp/ 161 | DocProject/Help/*.HxT 162 | DocProject/Help/*.HxC 163 | DocProject/Help/*.hhc 164 | DocProject/Help/*.hhk 165 | DocProject/Help/*.hhp 166 | DocProject/Help/Html2 167 
| DocProject/Help/html 168 | 169 | # Click-Once directory 170 | publish/ 171 | 172 | # Publish Web Output 173 | *.[Pp]ublish.xml 174 | *.azurePubxml 175 | # Note: Comment the next line if you want to checkin your web deploy settings, 176 | # but database connection strings (with potential passwords) will be unencrypted 177 | *.pubxml 178 | *.publishproj 179 | 180 | # Microsoft Azure Web App publish settings. Comment the next line if you want to 181 | # checkin your Azure Web App publish settings, but sensitive information contained 182 | # in these scripts will be unencrypted 183 | PublishScripts/ 184 | 185 | # NuGet Packages 186 | *.nupkg 187 | # NuGet Symbol Packages 188 | *.snupkg 189 | # The packages folder can be ignored because of Package Restore 190 | **/[Pp]ackages/* 191 | # except build/, which is used as an MSBuild target. 192 | !**/[Pp]ackages/build/ 193 | # Uncomment if necessary however generally it will be regenerated when needed 194 | #!**/[Pp]ackages/repositories.config 195 | # NuGet v3's project.json files produces more ignorable files 196 | *.nuget.props 197 | *.nuget.targets 198 | 199 | # Microsoft Azure Build Output 200 | csx/ 201 | *.build.csdef 202 | 203 | # Microsoft Azure Emulator 204 | ecf/ 205 | rcf/ 206 | 207 | # Windows Store app package directories and files 208 | AppPackages/ 209 | BundleArtifacts/ 210 | Package.StoreAssociation.xml 211 | _pkginfo.txt 212 | *.appx 213 | *.appxbundle 214 | *.appxupload 215 | 216 | # Visual Studio cache files 217 | # files ending in .cache can be ignored 218 | *.[Cc]ache 219 | # but keep track of directories ending in .cache 220 | !?*.[Cc]ache/ 221 | 222 | # Others 223 | ClientBin/ 224 | ~$* 225 | *~ 226 | *.dbmdl 227 | *.dbproj.schemaview 228 | *.jfm 229 | *.pfx 230 | *.publishsettings 231 | orleans.codegen.cs 232 | 233 | # Including strong name files can present a security risk 234 | # (https://github.com/github/gitignore/pull/2483#issue-259490424) 235 | #*.snk 236 | 237 | # Since there are multiple 
workflows, uncomment next line to ignore bower_components 238 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 239 | #bower_components/ 240 | 241 | # RIA/Silverlight projects 242 | Generated_Code/ 243 | 244 | # Backup & report files from converting an old project file 245 | # to a newer Visual Studio version. Backup files are not needed, 246 | # because we have git ;-) 247 | _UpgradeReport_Files/ 248 | Backup*/ 249 | UpgradeLog*.XML 250 | UpgradeLog*.htm 251 | ServiceFabricBackup/ 252 | *.rptproj.bak 253 | 254 | # SQL Server files 255 | *.mdf 256 | *.ldf 257 | *.ndf 258 | 259 | # Business Intelligence projects 260 | *.rdl.data 261 | *.bim.layout 262 | *.bim_*.settings 263 | *.rptproj.rsuser 264 | *- [Bb]ackup.rdl 265 | *- [Bb]ackup ([0-9]).rdl 266 | *- [Bb]ackup ([0-9][0-9]).rdl 267 | 268 | # Microsoft Fakes 269 | FakesAssemblies/ 270 | 271 | # GhostDoc plugin setting file 272 | *.GhostDoc.xml 273 | 274 | # Node.js Tools for Visual Studio 275 | .ntvs_analysis.dat 276 | node_modules/ 277 | 278 | # Visual Studio 6 build log 279 | *.plg 280 | 281 | # Visual Studio 6 workspace options file 282 | *.opt 283 | 284 | # Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 
285 | *.vbw 286 | 287 | # Visual Studio LightSwitch build output 288 | **/*.HTMLClient/GeneratedArtifacts 289 | **/*.DesktopClient/GeneratedArtifacts 290 | **/*.DesktopClient/ModelManifest.xml 291 | **/*.Server/GeneratedArtifacts 292 | **/*.Server/ModelManifest.xml 293 | _Pvt_Extensions 294 | 295 | # Paket dependency manager 296 | .paket/paket.exe 297 | paket-files/ 298 | 299 | # FAKE - F# Make 300 | .fake/ 301 | 302 | # CodeRush personal settings 303 | .cr/personal 304 | 305 | # Python Tools for Visual Studio (PTVS) 306 | __pycache__/ 307 | *.pyc 308 | 309 | # Cake - Uncomment if you are using it 310 | # tools/** 311 | # !tools/packages.config 312 | 313 | # Tabs Studio 314 | *.tss 315 | 316 | # Telerik's JustMock configuration file 317 | *.jmconfig 318 | 319 | # BizTalk build output 320 | *.btp.cs 321 | *.btm.cs 322 | *.odx.cs 323 | *.xsd.cs 324 | 325 | # OpenCover UI analysis results 326 | OpenCover/ 327 | 328 | # Azure Stream Analytics local run output 329 | ASALocalRun/ 330 | 331 | # MSBuild Binary and Structured Log 332 | *.binlog 333 | 334 | # NVidia Nsight GPU debugger configuration file 335 | *.nvuser 336 | 337 | # MFractors (Xamarin productivity tool) working folder 338 | .mfractor/ 339 | 340 | # Local History for Visual Studio 341 | .localhistory/ 342 | 343 | # BeatPulse healthcheck temp database 344 | healthchecksdb 345 | 346 | # Backup folder for Package Reference Convert tool in Visual Studio 2017 347 | MigrationBackup/ 348 | 349 | # Ionide (cross platform F# VS Code tools) working folder 350 | .ionide/ 351 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Vadim Frolov 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including 
without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /ObjectDetection.sln: -------------------------------------------------------------------------------- 1 |  2 | Microsoft Visual Studio Solution File, Format Version 12.00 3 | # Visual Studio Version 16 4 | VisualStudioVersion = 16.0.29521.150 5 | MinimumVisualStudioVersion = 10.0.40219.1 6 | Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ObjectDetection", "ObjectDetection\ObjectDetection.csproj", "{72FC4E1F-F147-4938-8656-A0B679BFA7EC}" 7 | EndProject 8 | Global 9 | GlobalSection(SolutionConfigurationPlatforms) = preSolution 10 | Debug|Any CPU = Debug|Any CPU 11 | Release|Any CPU = Release|Any CPU 12 | EndGlobalSection 13 | GlobalSection(ProjectConfigurationPlatforms) = postSolution 14 | {72FC4E1F-F147-4938-8656-A0B679BFA7EC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU 15 | {72FC4E1F-F147-4938-8656-A0B679BFA7EC}.Debug|Any CPU.Build.0 = Debug|Any CPU 16 | {72FC4E1F-F147-4938-8656-A0B679BFA7EC}.Release|Any CPU.ActiveCfg = Release|Any CPU 17 | {72FC4E1F-F147-4938-8656-A0B679BFA7EC}.Release|Any CPU.Build.0 = 
Release|Any CPU 18 | EndGlobalSection 19 | GlobalSection(SolutionProperties) = preSolution 20 | HideSolutionNode = FALSE 21 | EndGlobalSection 22 | GlobalSection(ExtensibilityGlobals) = postSolution 23 | SolutionGuid = {ECE6F180-85FD-45A8-869F-5FF05E00CB23} 24 | EndGlobalSection 25 | EndGlobal 26 | -------------------------------------------------------------------------------- /ObjectDetection/DataStructures/ImageNetData.cs: -------------------------------------------------------------------------------- 1 | using System.Drawing; 2 | using Microsoft.ML.Transforms.Image; 3 | 4 | namespace ObjectDetection.DataStructures 5 | { 6 | public class ImageNetData 7 | { 8 | // Dimensions provided here seem not to play an important role 9 | [ImageType(480, 640)] 10 | public Bitmap InputImage { get; set; } 11 | 12 | public string Label { get; set; } 13 | 14 | public ImageNetData() 15 | { 16 | InputImage = null; 17 | Label = ""; 18 | } 19 | } 20 | } 21 | -------------------------------------------------------------------------------- /ObjectDetection/DataStructures/ImageNetPrediction.cs: -------------------------------------------------------------------------------- 1 | using Microsoft.ML.Data; 2 | 3 | namespace ObjectDetection.DataStructures 4 | { 5 | public class ImageNetPrediction 6 | { 7 | [ColumnName("grid")] 8 | public float[] PredictedLabels; 9 | } 10 | } 11 | -------------------------------------------------------------------------------- /ObjectDetection/Form1.Designer.cs: -------------------------------------------------------------------------------- 1 | namespace ObjectDetection 2 | { 3 | partial class Form1 4 | { 5 | /// 6 | /// Required designer variable. 7 | /// 8 | private System.ComponentModel.IContainer components = null; 9 | 10 | /// 11 | /// Clean up any resources being used. 12 | /// 13 | /// true if managed resources should be disposed; otherwise, false. 
14 | protected override void Dispose(bool disposing) 15 | { 16 | if (disposing && (components != null)) 17 | { 18 | components.Dispose(); 19 | } 20 | base.Dispose(disposing); 21 | } 22 | 23 | #region Windows Form Designer generated code 24 | 25 | /// 26 | /// Required method for Designer support - do not modify 27 | /// the contents of this method with the code editor. 28 | /// 29 | private void InitializeComponent() 30 | { 31 | this.btnStart = new System.Windows.Forms.Button(); 32 | this.pictureBox1 = new System.Windows.Forms.PictureBox(); 33 | this.btnStop = new System.Windows.Forms.Button(); 34 | ((System.ComponentModel.ISupportInitialize)(this.pictureBox1)).BeginInit(); 35 | this.SuspendLayout(); 36 | // 37 | // btnStart 38 | // 39 | this.btnStart.BackColor = System.Drawing.Color.Green; 40 | this.btnStart.Location = new System.Drawing.Point(51, 52); 41 | this.btnStart.Margin = new System.Windows.Forms.Padding(6, 7, 6, 7); 42 | this.btnStart.Name = "btnStart"; 43 | this.btnStart.Size = new System.Drawing.Size(197, 81); 44 | this.btnStart.TabIndex = 0; 45 | this.btnStart.Text = "Start"; 46 | this.btnStart.UseVisualStyleBackColor = false; 47 | this.btnStart.Click += new System.EventHandler(this.btnStart_Click); 48 | // 49 | // pictureBox1 50 | // 51 | this.pictureBox1.Location = new System.Drawing.Point(332, 52); 52 | this.pictureBox1.Name = "pictureBox1"; 53 | this.pictureBox1.Size = new System.Drawing.Size(1026, 820); 54 | this.pictureBox1.TabIndex = 1; 55 | this.pictureBox1.TabStop = false; 56 | // 57 | // btnStop 58 | // 59 | this.btnStop.BackColor = System.Drawing.Color.Red; 60 | this.btnStop.Location = new System.Drawing.Point(51, 157); 61 | this.btnStop.Name = "btnStop"; 62 | this.btnStop.Size = new System.Drawing.Size(201, 81); 63 | this.btnStop.TabIndex = 2; 64 | this.btnStop.Text = "Stop"; 65 | this.btnStop.UseVisualStyleBackColor = false; 66 | this.btnStop.Click += new System.EventHandler(this.btnStop_Click); 67 | // 68 | // Form1 69 | // 70 | 
this.AutoScaleDimensions = new System.Drawing.SizeF(15F, 37F); 71 | this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font; 72 | this.ClientSize = new System.Drawing.Size(1431, 928); 73 | this.Controls.Add(this.pictureBox1); 74 | this.Controls.Add(this.btnStop); 75 | this.Controls.Add(this.btnStart); 76 | this.Margin = new System.Windows.Forms.Padding(6, 7, 6, 7); 77 | this.Name = "Form1"; 78 | this.Text = "Realtime object detection"; 79 | this.FormClosing += new System.Windows.Forms.FormClosingEventHandler(this.Form1_FormClosing); 80 | ((System.ComponentModel.ISupportInitialize)(this.pictureBox1)).EndInit(); 81 | this.ResumeLayout(false); 82 | 83 | } 84 | 85 | #endregion 86 | 87 | private System.Windows.Forms.Button btnStart; 88 | private System.Windows.Forms.PictureBox pictureBox1; 89 | private System.Windows.Forms.Button btnStop; 90 | } 91 | } 92 | 93 | -------------------------------------------------------------------------------- /ObjectDetection/Form1.cs: -------------------------------------------------------------------------------- 1 | using System; 2 | using System.IO; 3 | using System.Collections.Generic; 4 | using System.Data; 5 | using System.Drawing; 6 | using System.Drawing.Drawing2D; 7 | using System.Linq; 8 | using System.Windows.Forms; 9 | using OpenCvSharp; 10 | using Microsoft.ML; 11 | using ObjectDetection.YoloParser; 12 | using ObjectDetection.DataStructures; 13 | 14 | namespace ObjectDetection 15 | { 16 | public partial class Form1 : Form 17 | { 18 | private VideoCapture _capture; 19 | private bool _isRunning; 20 | private Image _mySharpImage; 21 | 22 | private string assetsRelativePath = @"assets"; 23 | private string assetsPath; 24 | private string modelFilePath; 25 | 26 | public Form1() 27 | { 28 | InitializeComponent(); 29 | 30 | _capture = new VideoCapture(0); 31 | _isRunning = false; 32 | 33 | assetsPath = GetAbsolutePath(assetsRelativePath); 34 | modelFilePath = Path.Combine(assetsPath, "Model", "TinyYolo2_model.onnx"); 35 | 36 
| //Mat image = new Mat(); 37 | //_capture.Read(image); 38 | //Console.WriteLine($"image size (height; width) = ({image.Height}; {image.Width})"); 39 | } 40 | 41 | private void btnStop_Click(object sender, EventArgs e) 42 | { 43 | _isRunning = false; 44 | // Uncomment if you want to clear output upon stop 45 | //pictureBox1.Image = null; 46 | } 47 | 48 | private void btnStart_Click(object sender, EventArgs e) 49 | { 50 | // Frame image buffer 51 | Mat image = new Mat(); 52 | _isRunning = true; 53 | btnStart.Enabled = false; 54 | 55 | var mlContext = new MLContext(); 56 | 57 | // Create instance of model scorer 58 | var modelScorer = new OnnxModelScorer(modelFilePath, mlContext); 59 | // Load model only once 60 | var model = modelScorer.LoadModel(); 61 | 62 | while (_isRunning) 63 | { 64 | _capture.Read(image); // read frame from webcam 65 | 66 | if (image.Empty()) 67 | break; 68 | 69 | // Store frame as in-memory source for ML.NET 70 | ImageNetData[] inMemoryCollection = new ImageNetData[] 71 | { 72 | new ImageNetData 73 | { 74 | InputImage = OpenCvSharp.Extensions.BitmapConverter.ToBitmap(image), 75 | Label = "", 76 | } 77 | }; 78 | var imageDataView = mlContext.Data.LoadFromEnumerable(inMemoryCollection); 79 | 80 | // Make another copy of the frame; we will draw bounding boxes on it 81 | _mySharpImage = (Image)OpenCvSharp.Extensions.BitmapConverter.ToBitmap(image); 82 | 83 | // Use model to score data 84 | IEnumerable<float[]> probabilities = modelScorer.Score(model, imageDataView); 85 | 86 | // Post-process model output 87 | YoloOutputParser parser = new YoloOutputParser(); 88 | 89 | var boundingBoxes = 90 | probabilities 91 | .Select(probability => parser.ParseOutputs(probability)) 92 | .Select(boxes => parser.FilterBoundingBoxes(boxes, 5, .5F)); 93 | // Since we only have a single frame, it is OK to have i = 0. Otherwise we would need 94 | // to iterate through images.
95 | var i = 0; 96 | IList<YoloBoundingBox> detectedObjects = boundingBoxes.ElementAt(i); 97 | DrawBoundingBox(ref _mySharpImage, detectedObjects); 98 | 99 | pictureBox1.SizeMode = PictureBoxSizeMode.StretchImage; 100 | pictureBox1.Image = _mySharpImage; 101 | Cv2.WaitKey(1); 102 | 103 | _mySharpImage.Dispose(); 104 | inMemoryCollection[0].InputImage.Dispose(); 105 | } 106 | btnStart.Enabled = true; 107 | } 108 | public static string GetAbsolutePath(string relativePath) 109 | { 110 | FileInfo _dataRoot = new FileInfo(typeof(Program).Assembly.Location); 111 | string assemblyFolderPath = _dataRoot.Directory.FullName; 112 | 113 | string fullPath = Path.Combine(assemblyFolderPath, relativePath); 114 | 115 | return fullPath; 116 | } 117 | private static void LogDetectedObjects(IList<YoloBoundingBox> boundingBoxes) 118 | { 119 | if (boundingBoxes.Count == 0) 120 | { 121 | return; 122 | } 123 | 124 | Console.WriteLine("The objects detected in the image are:"); 125 | 126 | foreach (var box in boundingBoxes) 127 | { 128 | Console.WriteLine($"{box.Label} with confidence score {box.Confidence}"); 129 | } 130 | 131 | Console.WriteLine(""); 132 | } 133 | 134 | private static void DrawBoundingBox(ref Image image, IList<YoloBoundingBox> filteredBoundingBoxes) 135 | { 136 | var originalImageHeight = image.Height; 137 | var originalImageWidth = image.Width; 138 | 139 | foreach (var box in filteredBoundingBoxes) 140 | { 141 | // Get Bounding Box Dimensions 142 | var x = (uint)Math.Max(box.Dimensions.X, 0); 143 | var y = (uint)Math.Max(box.Dimensions.Y, 0); 144 | var width = (uint)Math.Min(originalImageWidth - x, box.Dimensions.Width); 145 | var height = (uint)Math.Min(originalImageHeight - y, box.Dimensions.Height); 146 | 147 | // Resize To Image 148 | x = (uint)originalImageWidth * x / OnnxModelScorer.ImageNetSettings.imageWidth; 149 | y = (uint)originalImageHeight * y / OnnxModelScorer.ImageNetSettings.imageHeight; 150 | width = (uint)originalImageWidth * width / OnnxModelScorer.ImageNetSettings.imageWidth
151 | height = (uint)originalImageHeight * height / OnnxModelScorer.ImageNetSettings.imageHeight; 152 | 153 | // Bounding Box Text 154 | string text = $"{box.Label} ({(box.Confidence * 100).ToString("0")}%)"; 155 | 156 | using (Graphics thumbnailGraphic = Graphics.FromImage(image)) 157 | { 158 | thumbnailGraphic.CompositingQuality = CompositingQuality.HighQuality; 159 | thumbnailGraphic.SmoothingMode = SmoothingMode.HighQuality; 160 | thumbnailGraphic.InterpolationMode = InterpolationMode.HighQualityBicubic; 161 | 162 | // Define Text Options 163 | Font drawFont = new Font("Arial", 12, FontStyle.Bold); 164 | SizeF size = thumbnailGraphic.MeasureString(text, drawFont); 165 | SolidBrush fontBrush = new SolidBrush(Color.Black); 166 | System.Drawing.Point atPoint = new System.Drawing.Point((int)x, (int)y - (int)size.Height - 1); 167 | 168 | // Define BoundingBox options 169 | Pen pen = new Pen(box.BoxColor, 3.2f); 170 | SolidBrush colorBrush = new SolidBrush(box.BoxColor); 171 | 172 | // Draw text on image 173 | thumbnailGraphic.FillRectangle(colorBrush, (int)x, (int)(y - size.Height - 1), (int)size.Width, (int)size.Height); 174 | thumbnailGraphic.DrawString(text, drawFont, fontBrush, atPoint); 175 | 176 | // Draw bounding box on image 177 | thumbnailGraphic.DrawRectangle(pen, x, y, width, height); 178 | } 179 | } 180 | } 181 | 182 | private void Form1_FormClosing(object sender, FormClosingEventArgs e) 183 | { 184 | _capture.Release(); 185 | } 186 | } 187 | } 188 | -------------------------------------------------------------------------------- /ObjectDetection/Form1.resx: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | text/microsoft-resx 51 | 52 | 53 | 2.0 54 | 55 | 56 | 
System.Resources.ResXResourceReader, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089 57 | 58 | 59 | System.Resources.ResXResourceWriter, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089 60 | 61 | -------------------------------------------------------------------------------- /ObjectDetection/ObjectDetection.csproj: -------------------------------------------------------------------------------- 1 |  2 | 3 | 4 | WinExe 5 | netcoreapp3.1 6 | true 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Always 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | True 31 | True 32 | Resources.resx 33 | 34 | 35 | 36 | 37 | 38 | ResXFileCodeGenerator 39 | Resources.Designer.cs 40 | 41 | 42 | 43 | -------------------------------------------------------------------------------- /ObjectDetection/OnnxModelScorer.cs: -------------------------------------------------------------------------------- 1 | using System; 2 | using System.Collections.Generic; 3 | using System.Drawing; 4 | using System.Linq; 5 | using Microsoft.ML; 6 | using Microsoft.ML.Data; 7 | using ObjectDetection.DataStructures; 8 | using ObjectDetection.YoloParser; 9 | 10 | namespace ObjectDetection 11 | { 12 | class OnnxModelScorer 13 | { 14 | private readonly string modelLocation; 15 | private readonly MLContext mlContext; 16 | 17 | private IList<YoloBoundingBox> _boundingBoxes = new List<YoloBoundingBox>(); 18 | 19 | public OnnxModelScorer(string modelLocation, MLContext mlContext) 20 | { 21 | this.modelLocation = modelLocation; 22 | this.mlContext = mlContext; 23 | } 24 | 25 | public struct ImageNetSettings 26 | { 27 | public const int imageHeight = 416; 28 | public const int imageWidth = 416; 29 | } 30 | 31 | public struct TinyYoloModelSettings 32 | { 33 | // To check the Tiny YOLOv2 model's input and output parameter names, 34 | // you can use a tool such as Netron, 35 | // which is installed by Visual Studio AI Tools 36 | 37 | // input tensor name 38 | public
const string ModelInput = "image"; 39 | 40 | // output tensor name 41 | public const string ModelOutput = "grid"; 42 | } 43 | 44 | public ITransformer LoadModel() 45 | { 46 | // Create an IDataView from a one-element dummy list to obtain the input data schema 47 | ImageNetData[] inMemoryCollection = new ImageNetData[] 48 | { 49 | new ImageNetData 50 | { 51 | InputImage = null, 52 | Label = "" 53 | } 54 | }; 55 | var data = mlContext.Data.LoadFromEnumerable(inMemoryCollection); 56 | 57 | // Define scoring pipeline 58 | var pipeline = mlContext.Transforms.ResizeImages(outputColumnName: "image", imageWidth: ImageNetSettings.imageWidth, imageHeight: ImageNetSettings.imageHeight, inputColumnName: "InputImage") 59 | .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image")) 60 | .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: modelLocation, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput })); 61 | 62 | // Fit scoring pipeline 63 | var model = pipeline.Fit(data); 64 | 65 | return model; 66 | } 67 | 68 | private IEnumerable<float[]> PredictDataUsingModel(IDataView testData, ITransformer model) 69 | { 70 | IDataView scoredData = model.Transform(testData); 71 | 72 | IEnumerable<float[]> probabilities = scoredData.GetColumn<float[]>(TinyYoloModelSettings.ModelOutput); 73 | 74 | return probabilities; 75 | } 76 | 77 | public IEnumerable<float[]> Score(ITransformer model, IDataView data) 78 | { 79 | return PredictDataUsingModel(data, model); 80 | } 81 | } 82 | } 83 | 84 | -------------------------------------------------------------------------------- /ObjectDetection/Program.cs: -------------------------------------------------------------------------------- 1 | using System; 2 | using System.Collections.Generic; 3 | using System.Linq; 4 | using System.Threading.Tasks; 5 | using System.Windows.Forms; 6 | 7 | namespace ObjectDetection 8 | { 9 | static class Program 10 | { 11 | /// 12 | /// The main entry point for the application.
13 | /// 14 | [STAThread] 15 | static void Main() 16 | { 17 | Application.SetHighDpiMode(HighDpiMode.SystemAware); 18 | Application.EnableVisualStyles(); 19 | Application.SetCompatibleTextRenderingDefault(false); 20 | Application.Run(new Form1()); 21 | } 22 | } 23 | } 24 | -------------------------------------------------------------------------------- /ObjectDetection/YoloParser/DimensionsBase.cs: -------------------------------------------------------------------------------- 1 | namespace ObjectDetection.YoloParser 2 | { 3 | public class DimensionsBase 4 | { 5 | public float X { get; set; } 6 | public float Y { get; set; } 7 | public float Height { get; set; } 8 | public float Width { get; set; } 9 | } 10 | } -------------------------------------------------------------------------------- /ObjectDetection/YoloParser/YoloBoundingBox.cs: -------------------------------------------------------------------------------- 1 | using System.Drawing; 2 | 3 | namespace ObjectDetection.YoloParser 4 | { 5 | public class BoundingBoxDimensions : DimensionsBase { } 6 | 7 | public class YoloBoundingBox 8 | { 9 | public BoundingBoxDimensions Dimensions { get; set; } 10 | 11 | public string Label { get; set; } 12 | 13 | public float Confidence { get; set; } 14 | 15 | public RectangleF Rect 16 | { 17 | get { return new RectangleF(Dimensions.X, Dimensions.Y, Dimensions.Width, Dimensions.Height); } 18 | } 19 | 20 | public Color BoxColor { get; set; } 21 | } 22 | 23 | } -------------------------------------------------------------------------------- /ObjectDetection/YoloParser/YoloOutputParser.cs: -------------------------------------------------------------------------------- 1 | using System; 2 | using System.Collections.Generic; 3 | using System.Drawing; 4 | using System.Linq; 5 | 6 | namespace ObjectDetection.YoloParser 7 | { 8 | class YoloOutputParser 9 | { 10 | class CellDimensions : DimensionsBase { } 11 | 12 | public const int ROW_COUNT = 13; 13 | public const int COL_COUNT = 
13; 14 | public const int CHANNEL_COUNT = 125; 15 | public const int BOXES_PER_CELL = 5; 16 | public const int BOX_INFO_FEATURE_COUNT = 5; 17 | public const int CLASS_COUNT = 20; 18 | public const float CELL_WIDTH = 32; 19 | public const float CELL_HEIGHT = 32; 20 | 21 | private int channelStride = ROW_COUNT * COL_COUNT; 22 | 23 | private float[] anchors = new float[] 24 | { 25 | 1.08F, 1.19F, 3.42F, 4.41F, 6.63F, 11.38F, 9.42F, 5.11F, 16.62F, 10.52F 26 | }; 27 | 28 | private string[] labels = new string[] 29 | { 30 | "aeroplane", "bicycle", "bird", "boat", "bottle", 31 | "bus", "car", "cat", "chair", "cow", 32 | "diningtable", "dog", "horse", "motorbike", "person", 33 | "pottedplant", "sheep", "sofa", "train", "tvmonitor" 34 | }; 35 | 36 | private static Color[] classColors = new Color[] 37 | { 38 | Color.Khaki, 39 | Color.Fuchsia, 40 | Color.Silver, 41 | Color.RoyalBlue, 42 | Color.Green, 43 | Color.DarkOrange, 44 | Color.Purple, 45 | Color.Gold, 46 | Color.Red, 47 | Color.Aquamarine, 48 | Color.Lime, 49 | Color.AliceBlue, 50 | Color.Sienna, 51 | Color.Orchid, 52 | Color.Tan, 53 | Color.LightPink, 54 | Color.Yellow, 55 | Color.HotPink, 56 | Color.OliveDrab, 57 | Color.SandyBrown, 58 | Color.DarkTurquoise 59 | }; 60 | 61 | private float Sigmoid(float value) 62 | { 63 | var k = (float)Math.Exp(value); 64 | return k / (1.0f + k); 65 | } 66 | 67 | private float[] Softmax(float[] values) 68 | { 69 | var maxVal = values.Max(); 70 | var exp = values.Select(v => Math.Exp(v - maxVal)); 71 | var sumExp = exp.Sum(); 72 | 73 | return exp.Select(v => (float)(v / sumExp)).ToArray(); 74 | } 75 | 76 | private int GetOffset(int x, int y, int channel) 77 | { 78 | // YOLO outputs a tensor that has a shape of 125x13x13, which 79 | // WinML flattens into a 1D array. 
To access a specific channel 80 | // for a given (x,y) cell position, we need to calculate an offset 81 | // into the array 82 | return (channel * this.channelStride) + (y * COL_COUNT) + x; 83 | } 84 | 85 | private BoundingBoxDimensions ExtractBoundingBoxDimensions(float[] modelOutput, int x, int y, int channel) 86 | { 87 | return new BoundingBoxDimensions 88 | { 89 | X = modelOutput[GetOffset(x, y, channel)], 90 | Y = modelOutput[GetOffset(x, y, channel + 1)], 91 | Width = modelOutput[GetOffset(x, y, channel + 2)], 92 | Height = modelOutput[GetOffset(x, y, channel + 3)] 93 | }; 94 | } 95 | 96 | private float GetConfidence(float[] modelOutput, int x, int y, int channel) 97 | { 98 | return Sigmoid(modelOutput[GetOffset(x, y, channel + 4)]); 99 | } 100 | 101 | private CellDimensions MapBoundingBoxToCell(int x, int y, int box, BoundingBoxDimensions boxDimensions) 102 | { 103 | return new CellDimensions 104 | { 105 | X = ((float)x + Sigmoid(boxDimensions.X)) * CELL_WIDTH, 106 | Y = ((float)y + Sigmoid(boxDimensions.Y)) * CELL_HEIGHT, 107 | Width = (float)Math.Exp(boxDimensions.Width) * CELL_WIDTH * anchors[box * 2], 108 | Height = (float)Math.Exp(boxDimensions.Height) * CELL_HEIGHT * anchors[box * 2 + 1], 109 | }; 110 | } 111 | 112 | public float[] ExtractClasses(float[] modelOutput, int x, int y, int channel) 113 | { 114 | float[] predictedClasses = new float[CLASS_COUNT]; 115 | int predictedClassOffset = channel + BOX_INFO_FEATURE_COUNT; 116 | for (int predictedClass = 0; predictedClass < CLASS_COUNT; predictedClass++) 117 | { 118 | predictedClasses[predictedClass] = modelOutput[GetOffset(x, y, predictedClass + predictedClassOffset)]; 119 | } 120 | return Softmax(predictedClasses); 121 | } 122 | 123 | private ValueTuple<int, float> GetTopResult(float[] predictedClasses) 124 | { 125 | return predictedClasses 126 | .Select((predictedClass, index) => (Index: index, Value: predictedClass)) 127 | .OrderByDescending(result => result.Value) 128 | .First(); 129 | } 130 | 131 | private
float IntersectionOverUnion(RectangleF boundingBoxA, RectangleF boundingBoxB) 132 | { 133 | var areaA = boundingBoxA.Width * boundingBoxA.Height; 134 | 135 | if (areaA <= 0) 136 | return 0; 137 | 138 | var areaB = boundingBoxB.Width * boundingBoxB.Height; 139 | 140 | if (areaB <= 0) 141 | return 0; 142 | 143 | var minX = Math.Max(boundingBoxA.Left, boundingBoxB.Left); 144 | var minY = Math.Max(boundingBoxA.Top, boundingBoxB.Top); 145 | var maxX = Math.Min(boundingBoxA.Right, boundingBoxB.Right); 146 | var maxY = Math.Min(boundingBoxA.Bottom, boundingBoxB.Bottom); 147 | 148 | var intersectionArea = Math.Max(maxY - minY, 0) * Math.Max(maxX - minX, 0); 149 | 150 | return intersectionArea / (areaA + areaB - intersectionArea); 151 | } 152 | 153 | public IList<YoloBoundingBox> ParseOutputs(float[] yoloModelOutputs, float threshold = .3F) 154 | { 155 | var boxes = new List<YoloBoundingBox>(); 156 | 157 | for (int row = 0; row < ROW_COUNT; row++) 158 | { 159 | for (int column = 0; column < COL_COUNT; column++) 160 | { 161 | for (int box = 0; box < BOXES_PER_CELL; box++) 162 | { 163 | var channel = (box * (CLASS_COUNT + BOX_INFO_FEATURE_COUNT)); 164 | 165 | BoundingBoxDimensions boundingBoxDimensions = ExtractBoundingBoxDimensions(yoloModelOutputs, row, column, channel); 166 | 167 | float confidence = GetConfidence(yoloModelOutputs, row, column, channel); 168 | 169 | CellDimensions mappedBoundingBox = MapBoundingBoxToCell(row, column, box, boundingBoxDimensions); 170 | 171 | if (confidence < threshold) 172 | continue; 173 | 174 | float[] predictedClasses = ExtractClasses(yoloModelOutputs, row, column, channel); 175 | 176 | var (topResultIndex, topResultScore) = GetTopResult(predictedClasses); 177 | var topScore = topResultScore * confidence; 178 | 179 | if (topScore < threshold) 180 | continue; 181 | 182 | boxes.Add(new YoloBoundingBox() 183 | { 184 | Dimensions = new BoundingBoxDimensions 185 | { 186 | X = (mappedBoundingBox.X - mappedBoundingBox.Width / 2), 187 | Y = (mappedBoundingBox.Y -
mappedBoundingBox.Height / 2), 188 | Width = mappedBoundingBox.Width, 189 | Height = mappedBoundingBox.Height, 190 | }, 191 | Confidence = topScore, 192 | Label = labels[topResultIndex], 193 | BoxColor = classColors[topResultIndex] 194 | }); 195 | } 196 | } 197 | } 198 | return boxes; 199 | } 200 | 201 | public IList<YoloBoundingBox> FilterBoundingBoxes(IList<YoloBoundingBox> boxes, int limit, float threshold) 202 | { 203 | var activeCount = boxes.Count; 204 | var isActiveBoxes = new bool[boxes.Count]; 205 | 206 | for (int i = 0; i < isActiveBoxes.Length; i++) 207 | isActiveBoxes[i] = true; 208 | 209 | var sortedBoxes = boxes.Select((b, i) => new { Box = b, Index = i }) 210 | .OrderByDescending(b => b.Box.Confidence) 211 | .ToList(); 212 | 213 | var results = new List<YoloBoundingBox>(); 214 | 215 | for (int i = 0; i < boxes.Count; i++) 216 | { 217 | if (isActiveBoxes[i]) 218 | { 219 | var boxA = sortedBoxes[i].Box; 220 | results.Add(boxA); 221 | 222 | if (results.Count >= limit) 223 | break; 224 | 225 | for (var j = i + 1; j < boxes.Count; j++) 226 | { 227 | if (isActiveBoxes[j]) 228 | { 229 | var boxB = sortedBoxes[j].Box; 230 | 231 | if (IntersectionOverUnion(boxA.Rect, boxB.Rect) > threshold) 232 | { 233 | isActiveBoxes[j] = false; 234 | activeCount--; 235 | 236 | if (activeCount <= 0) 237 | break; 238 | } 239 | } 240 | } 241 | 242 | if (activeCount <= 0) 243 | break; 244 | } 245 | } 246 | return results; 247 | } 248 | 249 | } 250 | } -------------------------------------------------------------------------------- /ObjectDetection/assets/Model/TinyYolo2_model.onnx: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:a29cc82c91355e0198e98a839c7a072a8b07bea5976abf4406d32e8a3dc3b179 3 | size 63481471 4 | -------------------------------------------------------------------------------- /Properties/Resources.Designer.cs: -------------------------------------------------------------------------------- 1 |
//------------------------------------------------------------------------------ 2 | // <auto-generated> 3 | // This code was generated by a tool. 4 | // Runtime Version:4.0.30319.42000 5 | // 6 | // Changes to this file may cause incorrect behavior and will be lost if 7 | // the code is regenerated. 8 | // </auto-generated> 9 | //------------------------------------------------------------------------------ 10 | 11 | namespace ObjectDetection.Properties { 12 | using System; 13 | 14 | 15 | /// <summary> 16 | /// A strongly-typed resource class, for looking up localized strings, etc. 17 | /// </summary> 18 | // This class was auto-generated by the StronglyTypedResourceBuilder 19 | // class via a tool like ResGen or Visual Studio. 20 | // To add or remove a member, edit your .ResX file then rerun ResGen 21 | // with the /str option, or rebuild your VS project. 22 | [global::System.CodeDom.Compiler.GeneratedCodeAttribute("System.Resources.Tools.StronglyTypedResourceBuilder", "16.0.0.0")] 23 | [global::System.Diagnostics.DebuggerNonUserCodeAttribute()] 24 | [global::System.Runtime.CompilerServices.CompilerGeneratedAttribute()] 25 | internal class Resources { 26 | 27 | private static global::System.Resources.ResourceManager resourceMan; 28 | 29 | private static global::System.Globalization.CultureInfo resourceCulture; 30 | 31 | [global::System.Diagnostics.CodeAnalysis.SuppressMessageAttribute("Microsoft.Performance", "CA1811:AvoidUncalledPrivateCode")] 32 | internal Resources() { 33 | } 34 | 35 | /// <summary> 36 | /// Returns the cached ResourceManager instance used by this class. 37 | /// </summary> 38 | [global::System.ComponentModel.EditorBrowsableAttribute(global::System.ComponentModel.EditorBrowsableState.Advanced)] 39 | internal static global::System.Resources.ResourceManager ResourceManager { 40 | get { 41 | if (object.ReferenceEquals(resourceMan, null)) { 42 | global::System.Resources.ResourceManager temp = new global::System.Resources.ResourceManager("ObjectDetection.Properties.Resources", typeof(Resources).Assembly); 43 | resourceMan = temp; 44 | } 45 | return resourceMan; 46 | } 47 | } 48 | 49 | /// <summary> 50 | /// Overrides the current thread's CurrentUICulture property for all 51 | /// resource lookups using this strongly typed resource class. 52 | /// </summary> 53 | [global::System.ComponentModel.EditorBrowsableAttribute(global::System.ComponentModel.EditorBrowsableState.Advanced)] 54 | internal static global::System.Globalization.CultureInfo Culture { 55 | get { 56 | return resourceCulture; 57 | } 58 | set { 59 | resourceCulture = value; 60 | } 61 | } 62 | } 63 | } 64 | -------------------------------------------------------------------------------- /Properties/Resources.resx: -------------------------------------------------------------------------------- 1 | <?xml version="1.0" encoding="utf-8"?> 2 | <root> 3 | <resheader name="resmimetype"> 4 | <value>text/microsoft-resx</value> 5 | </resheader> 6 | <resheader name="version"> 7 | <value>2.0</value> 8 | </resheader> 9 | <resheader name="reader"> 10 | <value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value> 11 | </resheader> 12 | <resheader name="writer"> 13 | <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value> 14 | </resheader> 15 | </root> -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Realtime Object Detection from
a webcam 2 | 3 | | ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms | 4 | |----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------| 5 | | v1.4 | Dynamic API | Up-to-date | GUI app | in-memory | Object Detection | Deep Learning | Tiny Yolo2 ONNX model | 6 | 7 | 8 | This is an extension of the [Object Detection](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) sample from Microsoft. 9 | Instead of loading prediction data from disk, this application uses images from a webcam. Other than that, it is the same application. 10 | 11 | For a detailed explanation of how to build the original Object Detection application, see the accompanying [tutorial](https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/object-detection-onnx) on the Microsoft Docs site. 12 | 13 | ## System Requirements 14 | 15 | Make sure that [Git LFS](https://git-lfs.github.com/) is installed on your system **before you clone** this repository. 16 | 17 | Apart from the obvious .NET Core >= 3.1 and ML.NET dependencies, this application also depends on the [OpenCvSharp4](https://github.com/shimat/opencvsharp) library. 18 | 19 | OpenCvSharp4 will be installed automatically as a NuGet package. 20 | 21 | ## Problem 22 | Object detection is one of the classical problems in computer vision: recognize what the objects inside a given image are, and also where they are in the image. For these cases, you can either use pre-trained models or train your own model to classify images specific to your custom domain. 23 | 24 | ## DataSet 25 | There is no dataset involved. Live images will be used. 26 | 27 | ## Pre-trained model 28 | There are multiple models which are pre-trained for identifying multiple objects in images. Here we are using the pre-trained **Tiny Yolo2** model in **ONNX** format.
This model is a real-time neural network for object detection that detects 20 different classes. It is made up of 9 convolutional layers and 6 max-pooling layers and is a smaller version of the more complex full [YOLOv2](https://pjreddie.com/darknet/yolov2/) network. 29 | 30 | The Open Neural Network eXchange, i.e. [ONNX](http://onnx.ai/), is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners. 31 | 32 | The model is downloaded from the [ONNX Model Zoo](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny_yolov2), which is a collection of pre-trained, state-of-the-art models in the ONNX format. 33 | 34 | The Tiny YOLO2 model was trained on the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. Below are the model's input and output requirements. 35 | 36 | **Model input and output** 37 | 38 | **Input** 39 | 40 | Input image of the shape (3x416x416) 41 | 42 | **Output** 43 | 44 | Output is a (1x125x13x13) array 45 | 46 | **Pre-processing steps** 47 | 48 | Resize the input image to a (3x416x416) array of type float32. 49 | 50 | **Post-processing steps** 51 | 52 | The output is a (125x13x13) tensor where 13x13 is the number of grid cells that the image gets divided into. Each grid cell corresponds to 125 channels, made up of the 5 bounding boxes predicted by the grid cell and the 25 data elements that describe each bounding box (5x25=125). For more information on how to derive the final bounding boxes and their corresponding confidence scores, refer to this [post](http://machinethink.net/blog/object-detection-with-yolo/). 53 | 54 | ## Solution 55 | The GUI application project `ObjectDetection` can be used to identify objects in live images based on the **Tiny Yolo2 ONNX** model.
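The arithmetic behind the post-processing layout above can be sanity-checked with a short sketch. Python is used here purely to illustrate the math; the actual implementation is the C# `GetOffset` method in `YoloOutputParser.cs`, whose constants are mirrored below.

```python
# Tiny YOLO2 output layout: 125 channels over a 13x13 grid,
# flattened into a single array of 125 * 13 * 13 = 21125 floats.
ROW_COUNT = COL_COUNT = 13
BOXES_PER_CELL = 5
BOX_INFO_FEATURE_COUNT = 5   # x, y, w, h, objectness confidence
CLASS_COUNT = 20

# 5 boxes x (5 box features + 20 class scores) = 125 channels per cell
CHANNEL_COUNT = BOXES_PER_CELL * (BOX_INFO_FEATURE_COUNT + CLASS_COUNT)

def get_offset(x, y, channel):
    # Channel-major layout: all 13x13 values of channel 0 come first,
    # then all values of channel 1, and so on.
    channel_stride = ROW_COUNT * COL_COUNT
    return channel * channel_stride + y * COL_COUNT + x

total = CHANNEL_COUNT * ROW_COUNT * COL_COUNT
print(CHANNEL_COUNT)            # 125
print(total)                    # 21125
print(get_offset(12, 12, 124))  # 21124, the last valid index
```

The same indexing is what `ExtractBoundingBoxDimensions` and `ExtractClasses` rely on when they read consecutive channels for a given cell.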
56 | 57 | Again, note that this sample only uses/consumes a pre-trained ONNX model with the ML.NET API. Therefore, it does **not** train any ML.NET model. Currently, ML.NET only supports scoring/detecting with existing pre-trained ONNX models. 58 | 59 | ## Code Walkthrough 60 | There is a single project in the solution named `ObjectDetection`, which is responsible for loading the model in Tiny Yolo2 ONNX format and then detecting objects in the webcam stream. 61 | 62 | ### ML.NET: Model Scoring 63 | 64 | Define the schema of the data in a class type and refer to that type while loading data. Here the class type is **ImageNetData**. 65 | 66 | ```csharp 67 | public class ImageNetData 68 | { 69 | // Dimensions provided here seem not to play an important role 70 | [ImageType(480, 640)] 71 | public Bitmap InputImage { get; set; } 72 | 73 | public string Label { get; set; } 74 | 75 | public ImageNetData() 76 | { 77 | InputImage = null; 78 | Label = ""; 79 | } 80 | } 81 | ``` 82 | 83 | ### ML.NET: Configure the model 84 | 85 | Code for working with the model is found in the `OnnxModelScorer.cs` file, in the `LoadModel` method. 86 | 87 | The first step is to create an empty dataview, as we only need the schema of the data when configuring the model. 88 | 89 | ```csharp 90 | ImageNetData[] inMemoryCollection = new ImageNetData[] 91 | { 92 | new ImageNetData 93 | { 94 | InputImage = null, 95 | Label = "" 96 | } 97 | }; 98 | var data = mlContext.Data.LoadFromEnumerable(inMemoryCollection); 99 | ``` 100 | 101 | It is important to highlight that the `Label` in the `ImageNetData` class is not really used when scoring with the Tiny Yolo2 Onnx model. 102 | 103 | The second step is to define the estimator pipeline. Usually, when dealing with deep neural networks, you must adapt the images to the format expected by the network. This is the reason images are resized and then transformed (mainly, pixel values are normalized across all R,G,B channels).
104 | 105 | ```csharp 106 | var pipeline = mlContext.Transforms.ResizeImages(outputColumnName: "image", imageWidth: ImageNetSettings.imageWidth, imageHeight: ImageNetSettings.imageHeight, inputColumnName: "InputImage") 107 | .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image")) 108 | .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: modelLocation, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput })); 109 | ``` 110 | 111 | You also need to inspect the neural network to find the names of its input and output nodes. To inspect the model, you can use a tool like [Netron](https://github.com/lutzroeder/netron), which is automatically installed with [Visual Studio Tools for AI](https://visualstudio.microsoft.com/downloads/ai-tools-vs/). 112 | These names are used later in the definition of the estimator pipeline: in the case of the Tiny Yolo2 network, the input tensor is named 'image' and the output is named 'grid'. 113 | 114 | Define the **input** and **output** parameters of the Tiny Yolo2 Onnx Model. 115 | 116 | ```csharp 117 | public struct TinyYoloModelSettings 118 | { 119 | // For checking Tiny Yolo2 model input and output parameter names, 120 | // you can use tools like Netron, 121 | // which is installed by Visual Studio AI Tools 122 | 123 | // input tensor name 124 | public const string ModelInput = "image"; 125 | 126 | // output tensor name 127 | public const string ModelOutput = "grid"; 128 | } 129 | ``` 130 | 131 | ![inspecting neural network with netron](./docs/Netron/netron.PNG) 132 | 133 | Finally, we return the trained model after *fitting* the estimator pipeline. 134 | 135 | ```csharp 136 | var model = pipeline.Fit(data); 137 | return model; 138 | ``` 139 | When obtaining the prediction, we get an array of floats in the property `PredictedLabels`. The array is a float array of size **21125**. This is the output of the model, i.e. 125x13x13 as discussed earlier.
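The decoding applied to those raw floats follows the standard YOLO formulas: the raw x/y offsets are squashed with a sigmoid and added to the grid-cell index, while the raw width/height are exponentiated and scaled by the matching anchor-box dimensions. Below is a minimal sketch of that math; Python is used here for illustration only, while the real implementation is the C# `MapBoundingBoxToCell` method in `YoloOutputParser.cs`, whose constants and anchors are reused.

```python
import math

CELL_WIDTH = CELL_HEIGHT = 32.0  # 416-pixel input / 13 grid cells
# Anchor (width, height) pairs, as defined in YoloOutputParser.
anchors = [1.08, 1.19, 3.42, 4.41, 6.63, 11.38, 9.42, 5.11, 16.62, 10.52]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def map_box_to_cell(x, y, box, tx, ty, tw, th):
    """(x, y) is the grid-cell index, box selects the anchor pair,
    and (tx, ty, tw, th) are the raw model outputs for that box."""
    return (
        (x + sigmoid(tx)) * CELL_WIDTH,    # box centre x, in pixels
        (y + sigmoid(ty)) * CELL_HEIGHT,   # box centre y, in pixels
        math.exp(tw) * CELL_WIDTH * anchors[box * 2],       # width
        math.exp(th) * CELL_HEIGHT * anchors[box * 2 + 1],  # height
    )

# Raw outputs of 0 put the box centre in the middle of cell (6, 6),
# with width/height equal to the anchor size scaled to pixels.
cx, cy, w, h = map_box_to_cell(6, 6, 0, 0.0, 0.0, 0.0, 0.0)
print(cx, cy)  # 208.0 208.0, i.e. the centre of the 416x416 image
```

Confidence is handled the same way in the parser: the raw objectness score goes through the sigmoid, and the 20 class scores through a softmax, before thresholding.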
This output is interpreted by the `YoloOutputParser` class, which returns a number of bounding boxes for each image. These boxes are then filtered so that we retrieve only the 5 bounding boxes with the best confidence (how certain we are that a box contains an object) for each object in the image. We display the label value of each bounding box. 140 | 141 | ## Detect objects in the image 142 | 143 | After the model is configured, we need to pass each image to the model to detect objects. The prediction output is then parsed and filtered as described above. 144 | 145 | ```csharp 146 | IEnumerable<float[]> probabilities = modelScorer.Score(imageDataView); 147 | 148 | YoloOutputParser parser = new YoloOutputParser(); 149 | 150 | var boundingBoxes = 151 | probabilities 152 | .Select(probability => parser.ParseOutputs(probability)) 153 | .Select(boxes => parser.FilterBoundingBoxes(boxes, 5, .5F)); 154 | ``` 155 | 156 | **Note:** The Tiny Yolo2 model is less accurate than the full YOLO2 model. As this is a sample application, we use the tiny version of the Yolo model, i.e. Tiny_Yolo2. 157 | 158 | 159 | -------------------------------------------------------------------------------- /docs/Netron/netron.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fralik/ObjectDetection-MLNet/77e48ab57a52ea5ec75d9928ba9500810bc93add/docs/Netron/netron.PNG --------------------------------------------------------------------------------