├── .gitignore ├── CMakeLists.txt ├── README.md ├── content ├── data └── output.gif ├── include ├── Detector │   ├── object_detector.h │   └── yolo_object_detector.h ├── Tracker │   ├── hungarian.h │   ├── kalman_filter.h │   ├── kalman_track.h │   ├── object_tracker.h │   ├── track.h │   └── tracker.h ├── frame_grabber.h ├── frame_writer.h ├── message_queue.h ├── point2D.h └── tracking_msg.h ├── model ├── coco.names └── yolov3.cfg ├── src ├── frame_grabber.cpp ├── frame_writer.cpp ├── hungarian.cpp ├── kalman_filter.cpp ├── kalman_track.cpp ├── main.cpp ├── object_tracker.cpp ├── tracker.cpp └── yolo_object_detector.cpp └── videos └── input_video.mp4 /.gitignore: -------------------------------------------------------------------------------- 1 | .vscode/ 2 | build/ 3 | videos/project_track_and_detect.avi 4 | model/yolov3.weights 5 | -------------------------------------------------------------------------------- /CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.7) 2 | set(CMAKE_CXX_STANDARD 14) 3 | project(ObjectDetectionAndTracking) 4 | 5 | set(CXX_FLAGS "-Wall") 6 | set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CXX_FLAGS} -std=c++14 -pthread -g") 7 | 8 | include_directories(include 9 | ${PROJECT_SOURCE_DIR}/include 10 | ${PROJECT_SOURCE_DIR}/include/Detector 11 | ${PROJECT_SOURCE_DIR}/include/Tracker 12 | ) 13 | 14 | find_package(OpenCV 4.1 REQUIRED) 15 | find_package(Eigen3 REQUIRED) 16 | include_directories(${OpenCV_INCLUDE_DIRS} ${EIGEN3_INCLUDE_DIRS}) 17 | link_directories(${OpenCV_LIBRARY_DIRS}) 18 | add_definitions(${OpenCV_DEFINITIONS}) 19 | 20 | add_executable(detect_and_track src/main.cpp 21 | src/frame_grabber.cpp 22 | src/yolo_object_detector.cpp 23 | src/frame_writer.cpp 24 | src/hungarian.cpp 25 | src/kalman_filter.cpp 26 | src/kalman_track.cpp 27 | src/tracker.cpp 28 | src/object_tracker.cpp 29 | ) 30 | 31 | target_link_libraries(detect_and_track ${OpenCV_LIBRARIES}) 32 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # MultiThreaded Object Detection and Tracking 2 | 3 | ## Description 4 | 5 | This repository contains a multithreaded application for detecting and tracking objects in a user-specified video. Example output of running the application on the input video (`videos/input_video.mp4`) is the resulting video (`videos/project_track_and_detect.avi`). 6 | 7 | The whole detection and tracking application runs as four asynchronous tasks. The first task reads frames using a Frame Grabber object (`frame_grabber.h`) and pushes them into a message queue (`message_queue.h`), a thread-safe implementation based on condition variables. The second task pulls frames from that queue as they become available and detects objects in them using an Object Detector (`object_detector.h`), which is an abstract class. After detecting each frame, the detector puts the detected frame, along with the input data required for tracking, into another queue whose message type is a tracking message (`tracking_msg.h`). These two tasks run in parallel. 8 | Once they are completed, another task is started, which pulls the tracking messages from that queue and does the tracking using an Object Tracker (`object_tracker.h`). The Object Tracker uses the Tracker (`tracker.h`), which provides the functionality to update the tracks. The tracks are implemented as an abstract class (`track.h`). Each frame is put into the output queue after tracking. 9 | After the tracking task completes, the final task is started using a Frame Writer object (`frame_writer.h`), which pulls the frames from the output queue and writes them to the user-specified output video file. A minimal sketch of this task wiring is shown below.
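Below is a minimal, self-contained sketch of that producer/consumer wiring. It is illustrative only: `SimpleQueue` and the lambdas are stand-ins for the repository's `MessageQueue`, `FrameGrabber`, and `CustomFrameWriter`, which appear later in this listing.

```cpp
// Sketch of the pipeline's producer/consumer pattern using std::async
// and a condition-variable-protected queue (stand-in names throughout).
#include <condition_variable>
#include <deque>
#include <future>
#include <iostream>
#include <mutex>

template <class T>
class SimpleQueue {
public:
    void send(T&& msg) {
        std::lock_guard<std::mutex> lock(_mtx);
        _queue.push_back(std::move(msg));
        _cond.notify_one();                      // wake one waiting consumer
    }
    T receive() {
        std::unique_lock<std::mutex> lock(_mtx);
        _cond.wait(lock, [this] { return !_queue.empty(); });
        T msg = std::move(_queue.front());
        _queue.pop_front();
        return msg;
    }
private:
    std::deque<T> _queue;
    std::mutex _mtx;
    std::condition_variable _cond;
};

int main() {
    SimpleQueue<int> queue;
    // Producer task: stands in for FrameGrabber::grabFramesFromVideo.
    auto producer = std::async(std::launch::async, [&queue] {
        for (int frame = 0; frame < 5; ++frame) queue.send(std::move(frame));
    });
    // Consumer task: stands in for the detector/tracker/writer stages.
    auto consumer = std::async(std::launch::async, [&queue] {
        for (int i = 0; i < 5; ++i)
            std::cout << "processed frame " << queue.receive() << '\n';
    });
    producer.wait();
    consumer.wait();
}
```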
10 | 11 | ### The red dot on each object shows the tracked position; the green dot shows the measured position, obtained by detecting the object and computing the center of its bounding box 12 | 13 | ![Example output](data/output.gif) 14 | 15 | ## Detectors: 16 | 17 | This project provides an abstract implementation of the object detector, so any custom object detector can be added. Currently, a YOLOv3 object detector is implemented (`yolo_object_detector.h`). 18 | 19 | ### YOLO3-Object-Detector 20 | 21 | This object detector is trained on the COCO dataset and can recognize up to 80 different classes of objects in each frame. 22 | 23 | ## Trackers: Kalman 24 | 25 | The project provides an abstract implementation of the tracker in the form of tracks, so any custom object tracker can be added. Currently, a Kalman tracker (`kalman_track.h`) is implemented using a Kalman filter (`kalman_filter.h`). An assignment problem is solved to associate the objects detected by the detectors with the objects maintained by the trackers; here it is solved using the Hungarian algorithm. 26 | 27 | Tracking consists of two major steps (a sketch of the prediction step follows this list): 28 | 1. Prediction: Predict the object locations in the next frame. 29 | 2. Data association: Use the predicted locations to associate detections across frames to form tracks.
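To make the prediction step concrete, here is a hedged sketch of one constant-velocity Kalman predict/update cycle in Eigen. The matrix values are assumptions chosen for illustration; the repository's actual filter lives in `include/Tracker/kalman_filter.h` and `src/kalman_filter.cpp`.

```cpp
// Sketch: one predict/update cycle of a constant-velocity Kalman filter.
// State x = [px, py, vx, vy]; all numeric values here are assumptions.
#include <Eigen/Dense>
#include <iostream>

int main() {
    double dt = 1.0;
    Eigen::Matrix4d F = Eigen::Matrix4d::Identity();         // state transition
    F(0, 2) = dt;                                            // px += vx * dt
    F(1, 3) = dt;                                            // py += vy * dt
    Eigen::Matrix4d H = Eigen::Matrix4d::Identity();         // full-state measurement
    Eigen::Matrix4d Q = Eigen::Matrix4d::Identity() * 0.1;   // process noise
    Eigen::Matrix4d R = Eigen::Matrix4d::Identity() * 1.0;   // measurement noise
    Eigen::Matrix4d P = Eigen::Matrix4d::Identity();         // estimate covariance

    Eigen::Vector4d x(0, 0, 1, 1);                           // initial state guess

    // Prediction step: propagate state and covariance forward.
    x = F * x;
    P = F * P * F.transpose() + Q;

    // Update step with a measurement z.
    Eigen::Vector4d z(1.2, 0.9, 1.0, 1.0);
    Eigen::Matrix4d K = P * H.transpose() * (H * P * H.transpose() + R).inverse();
    x = x + K * (z - H * x);
    P = (Eigen::Matrix4d::Identity() - K * H) * P;

    std::cout << "estimated state: " << x.transpose() << std::endl;
}
```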
30 | 31 | ## Dependencies for Running Locally 32 | * cmake >= 3.7 33 | * All OSes: [click here for installation instructions](https://cmake.org/install/) 34 | * make >= 4.1 (Linux, Mac), 3.81 (Windows) 35 | * Linux: make is installed by default on most Linux distros 36 | * Mac: [install Xcode command line tools to get make](https://developer.apple.com/xcode/features/) 37 | * Windows: [Click here for installation instructions](http://gnuwin32.sourceforge.net/packages/make.htm) 38 | * gcc/g++ >= 5.4 39 | * Linux: gcc / g++ is installed by default on most Linux distros 40 | * Mac: same deal as make - [install Xcode command line tools](https://developer.apple.com/xcode/features/) 41 | * Windows: recommend using [MinGW](http://www.mingw.org/) 42 | * OpenCV >= 4.1 43 | * The OpenCV 4.1.0 source code can be found [here](https://github.com/opencv/opencv/tree/4.1.0) 44 | * Eigen3 45 | * Install using `sudo apt-get install libeigen3-dev` 46 | 47 | ## Basic Build Instructions 48 | 49 | 1. Clone this repo. 50 | 51 | 2. Run the following commands to download the object detection model from [Darknet](https://pjreddie.com/darknet/) 52 | 53 | ``` 54 | cd CppND-Capstone 55 | mkdir model && cd model 56 | wget https://pjreddie.com/media/files/yolov3.weights 57 | wget https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg?raw=true -O ./yolov3.cfg 58 | ``` 59 | 60 | 3. Make a build directory in the top level directory: `mkdir build && cd build` 61 | 62 | 4. Compile: `cmake .. && make` 63 | 64 | 5. Run it: `./detect_and_track` (**Runtime can vary depending on the number of frames in the video**) 65 | 66 | 6. Progress can be tracked while the program is running 67 | ``` 68 | Model Loaded Successfully! 69 | Loaded 80 Class Names 70 | Initialised the tracker 71 | Read 10 frames from total 1260 frames 72 | Read 20 frames from total 1260 frames 73 | Read 30 frames from total 1260 frames 74 | Read 40 frames from total 1260 frames 75 | Read 50 frames from total 1260 frames 76 | .... 77 | .... 78 | Detected 10 frames 79 | Read 520 frames from total 1260 frames 80 | Detected 20 frames 81 | Read 530 frames from total 1260 frames 82 | Detected 30 frames 83 | Read 540 frames from total 1260 frames 84 | .... 85 | .... 86 | Tracked 1240 frames 87 | Tracked 1250 frames 88 | Written 1253 frames 89 | ------- Done !! ------- 90 | ``` 91 | 7. The video can be played from (`videos/project_track_and_detect.avi`) after the program terminates 92 | 93 | ## Code Instructions 94 | 95 | 1. Change the `model_config` and `model_weights` file paths in main.cpp 96 | 2. Change the input and output video file paths in main.cpp 97 | 3. Set the track flag to false in main.cpp if you want to run detection only 98 | 99 | ## Key Rubric Points 100 | 101 | ### The application reads data from a file and processes the data. 102 | 103 | * The project reads frames from a video file using the `FrameGrabber` class (`src/frame_grabber.cpp`) and loads the class names from a file in the `ObjectDetector` class (`line 67 include/Detector/object_detector.h`). 104 | 105 | ### The application implements abstract classes and pure virtual functions. 106 | 107 | * The object detector class (`include/Detector/object_detector.h`) and the tracks class (`include/Tracker/track.h`). 108 | 109 | ### Use of Object Oriented Programming techniques. 110 | 111 | * The classes `FrameGrabber` (`src/frame_grabber.cpp` , `include/frame_grabber.h`), `FrameWriter` (`src/frame_writer.cpp` , `include/frame_writer.h`), `MessageQueue` (`include/message_queue.h`), `YOLODetector` (`include/Detector/yolo_object_detector.h`, `src/yolo_object_detector.cpp`), `KalmanFilter` (`include/Tracker/kalman_filter.h` , `src/kalman_filter.cpp`), `Tracker` (`include/Tracker/tracker.h` , `src/tracker.cpp`) and `KalmanTrack` (`include/Tracker/kalman_track.h` , `src/kalman_track.cpp`). 112 | 113 | ### Use of Inheritance techniques. 114 | 115 | * The class `YOLODetector` inherits from the parent class `ObjectDetector` (`line 15 include/Detector/yolo_object_detector.h`). 116 | * The class `KalmanTrack` inherits from the parent class `Track` (`line 8 include/Tracker/kalman_track.h`). 117 | 118 | ### Classes use appropriate access specifiers for class members. 119 | 120 | * Example of class-specific access specifiers in the `Tracker` class definition (`include/Tracker/tracker.h`). 121 | 122 | ### Use of Overloaded Functions. 123 | 124 | * The function `getTracks()` is overloaded in the `Tracker` class (`lines 47,49 include/Tracker/tracker.h`). 125 | 126 | ### Templates generalize functions. 127 | 128 | * The `MessageQueue` is a templated class (`include/message_queue.h`); a short usage sketch follows. 129 |
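A hypothetical usage sketch of `MessageQueue<T>`: the queue itself is defined in `include/message_queue.h` (reproduced later in this listing), but the calling code below is made up for illustration.

```cpp
// Hypothetical caller of the repository's MessageQueue<T>:
// one thread sends, the main thread blocks in receive().
#include <iostream>
#include <string>
#include <thread>
#include "message_queue.h"

int main() {
    MessageQueue<std::string> queue;
    std::thread producer([&queue] {
        std::string msg = "frame-0";
        queue.send(std::move(msg));          // moved into the queue, not copied
    });
    std::string received = queue.receive();  // blocks until a message arrives
    std::cout << received << std::endl;
    producer.join();
}
```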
130 | ### Use of references in function declarations. 131 | 132 | * An example of a method that uses references in its declaration is the implementation of the `YOLODetector` constructor (`line 4 src/yolo_object_detector.cpp`). 133 | 134 | ### Use of scope / Resource Acquisition Is Initialization (RAII) where appropriate. 135 | 136 | * Example use of RAII can be seen in the acquisition and release of locks in `MessageQueue` (`lines 19,55,62 include/message_queue.h`). 137 | 138 | ### Use of move semantics to move data, instead of copying it, where possible. 139 | 140 | * Example use of move semantics is the pushing and removing of items in `MessageQueue` (`line 41 include/message_queue.h`). 141 | 142 | ### Use of smart pointers. 143 | 144 | * Example uses of smart pointers (std::unique_ptr) are (`line 7 src/yolo_object_detector.cpp`) and (`line 15 src/tracker.cpp`). 145 | 146 | ### Use of multithreading. 147 | 148 | * The project uses four asynchronous tasks (std::async) (`lines 54,57,63,68 src/main.cpp`). 149 | 150 | ### Use of condition variable. 151 | 152 | * A condition variable is used in the project in the implementation of the message queue (`lines 28,38 include/message_queue.h`). 153 | 154 | ## References 155 | 156 | * The video used in this repository was taken from the repository [udacity/CarND-Vehicle-Detection](https://github.com/udacity/CarND-Vehicle-Detection). 157 | 158 | * OpenCV YOLO Object Detection (https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.cpp) 159 | 160 | * Kalman Filter - [Artificial Intelligence for Robotics](https://www.udacity.com/course/artificial-intelligence-for-robotics--cs373#) Udacity Course 161 | 162 | * Hungarian Algorithm - [here](http://www.mathworks.com/matlabcentral/fileexchange/6543-functions-for-the-rectangular-assignment-problem) 163 | 164 | * Motion-Based Multiple Object Tracking - [here](https://in.mathworks.com/help/vision/examples/motion-based-multiple-object-tracking.html) 165 | 166 | * Multiple Object Tracking - [here](https://in.mathworks.com/help/vision/ug/multiple-object-tracking.html) 167 | 168 | * Computer Vision for tracking - [here](https://towardsdatascience.com/computer-vision-for-tracking-8220759eee85) 169 | -------------------------------------------------------------------------------- /content: -------------------------------------------------------------------------------- 1 | # Description 2 | 3 | Obstacle detection, or tracking moving objects, is one of the most interesting topics in computer vision. 4 | This problem can be solved in two steps: 5 | 6 | 1. Detecting moving objects in each frame 7 | 2. Tracking the historical objects with some tracking algorithm 8 | 9 | An assignment problem is solved to associate the objects detected by the detectors with the objects held by the trackers. 10 | 11 | An introduction to this framework can be found here: 12 | https://towardsdatascience.com/computer-vision-for-tracking-8220759eee85 13 | 14 | Another, more detailed example with MATLAB code (the detectors and trackers may differ): 15 | https://www.mathworks.com/help/vision/examples/motion-based-multiple-object-tracking.html 16 | 17 | Here I implemented a highly efficient and scalable C++ framework that combines state-of-the-art 18 | deep-learning-based detectors (YOLOv3 is demoed here) and correlation-filter-based trackers 19 | (KCF; Kalman filters are also implemented). The assignment problem is solved with the Hungarian algorithm. 20 | 21 | # Detectors: Yolo3 22 | 23 | Yolo3 is trained for detecting bottles, cans and hands in this demo. It is trained with Keras 24 | and compiled with TensorFlow C++ into a DLL (YOLO3.DLL under the bin folder). CUDA 9.2 is used to 25 | compile the TensorFlow C++ library. 26 | 27 | YOLO3.DLL can be compiled from the code under the detectors folder and the TensorFlow C++ library, if 28 | you have the TensorFlow C++ library compiled. 29 | 30 | # Trackers: Kalman Filter and KCF 31 | 32 | The Kalman filter is fast but less accurate. KCF is accurate but much slower. 33 | They are implemented with exactly the same interface, so we can easily switch from one to the other 34 | in the project (see the sketch after this file's listing). 35 | 36 | # Live Camera Capture: OpenCV 37 | 38 | OpenCV is used to capture live video frames and for image preprocessing. 39 | 40 | # Misc 41 | 42 | YOLO3.DLL and the model file are too big. They can be downloaded from the following link: 43 | https://pan.baidu.com/s/1CPYU2o59vutoq-OJewObRw 44 |
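A minimal sketch of that switch-by-interface idea, written against this repository's `Track` interface. `KalmanTrack` exists in this codebase; `KCFTrack` is hypothetical and shown only to illustrate the switch.

```cpp
// Sketch: choosing a tracker implementation behind the common Track
// interface (include/Tracker/track.h). KCFTrack is hypothetical.
#include <memory>
#include <string>
#include "kalman_track.h"

std::unique_ptr<Track> makeTrack(const std::string& kind)
{
    // Track needs a virtual destructor for safe deletion through Track*.
    if (kind == "kalman")
        return std::make_unique<KalmanTrack>();
    // else: return std::make_unique<KCFTrack>(); once such a tracker exists
    return nullptr;
}
```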
-------------------------------------------------------------------------------- /data/output.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AvnishGupta143/MultiThreaded-Object-Detection-and-Tracking-cpp/f27949909b99b31ceb4c353e68732976049c0e2f/data/output.gif -------------------------------------------------------------------------------- /include/Detector/object_detector.h: -------------------------------------------------------------------------------- 1 | #ifndef OBJECT_DETECTOR_H 2 | #define OBJECT_DETECTOR_H 3 | 4 | #include <opencv2/opencv.hpp> 5 | #include <fstream> 6 | #include <iostream> 7 | #include <memory> 8 | #include <string> 9 | #include <vector> 10 | #include "message_queue.h" 11 | #include "point2D.h" 12 | #include "tracking_msg.h" 13 | 14 | // Abstract class which provides functionality to detect objects in frames, plus helper functions to annotate the detected frames 15 | class ObjectDetector 16 | { 17 | public: 18 | 19 | // Constructor / Destructor 20 | ObjectDetector(){}; 21 | virtual ~ObjectDetector(){}; 22 | 23 | // Method to detect objects in the given frame 24 | virtual void detectObject(cv::Mat &frame, std::vector<cv::Mat> &info) = 0; 25 | 26 | // Method to detect objects in the queue having frames 27 | virtual void detectObjectInQueue(MessageQueue<cv::Mat> &msgq, MessageQueue<cv::Mat> &outq) = 0; 28 | 29 | // Method to detect objects in the msgq having frames and put the detected frame and detected points in trackq 30 | virtual void detectObjectInQueueAndTrack(MessageQueue<cv::Mat> &msgq, MessageQueue<TrackingMsg> &trackq) = 0; 31 | 32 | // Method to find the bounding boxes over the confident predictions 33 | virtual void postProcessDetectedObjectFrame(cv::Mat &frame, const std::vector<cv::Mat> &info) = 0; 34 | 35 | // Function to add inference time to the detected object frame 36 | virtual void addDetectionTimeToFrame(cv::Mat &frame) = 0; 37 | 38 | // Draws detected object bounding boxes on the supplied video frame 39 | inline void drawBoundingBoxToFrame(int left, int top, int right, int bottom, cv::Mat& frame) 40 | { 41 | //Draw a rectangle displaying the bounding box 42 | cv::rectangle(frame, cv::Point(left, top), cv::Point(right, bottom), cv::Scalar(0, 0, 255)); 43 | } 44 | 45 | // Function to add the object class to the detected object frame 46 | inline void addObjectClassToFrame(int classId, float conf, int left, int top, cv::Mat& frame) 47 | { 48 | std::string label = std::to_string(conf); 49 | if (!_classes.empty()) 50 | { 51 | label = _classes[classId] + ":" + label; 52 | } 53 | else 54 | { 55 | return; 56 | } 57 | 58 | //Add the label at the top of the bounding box 59 | int baseLine; 60 | cv::Size labelSize = cv::getTextSize(label, cv::FONT_HERSHEY_SIMPLEX, 0.6, 1, &baseLine); 61 | top = std::max(top, labelSize.height); 62 | cv::rectangle(frame, cv::Point(left, top), cv::Point(left+labelSize.width, top - std::min(top, labelSize.height)), cv::Scalar(255, 0, 255), -1); 63 | cv::putText(frame, label, cv::Point(left, top), cv::FONT_HERSHEY_SIMPLEX, 0.6, cv::Scalar(255,255,255)); 64 | } 65 | 66 | // Function to load the class names in a vector 67 | inline void loadClasses(std::string classesFilename) 68 | { 69 | std::fstream file(classesFilename); 70 | std::string line; 71 | while(std::getline(file, line)) 72 | { 73 | _classes.push_back(line); 74 | } 75 | std::cout << "Loaded " << _classes.size() <<" Class Names" << std::endl; 76 | } 77 | 78 | // Confidence threshold for a class 79 | float _confidenceThresh; 80 | 81 | // Non-max
Suppression threshold to remove the excess bounding boxes 82 | float _nonMaxSuppressionThresh; 83 | 84 | // Unique pointer holding the object detection network 85 | std::unique_ptr<cv::dnn::Net> _detector; 86 | 87 | // Vector containing class names 88 | std::vector<std::string> _classes; 89 | 90 | // Frames Detected 91 | int _numFramesDetected{0}; 92 | 93 | }; 94 | 95 | #endif 96 | -------------------------------------------------------------------------------- /include/Detector/yolo_object_detector.h: -------------------------------------------------------------------------------- 1 | #ifndef YOLO_DETECTOR_H 2 | #define YOLO_DETECTOR_H 3 | 4 | #include <opencv2/opencv.hpp> 5 | #include <fstream> 6 | #include <iostream> 7 | #include <memory> 8 | #include <string> 9 | #include <vector> 10 | #include "object_detector.h" 11 | #include "point2D.h" 12 | #include "tracking_msg.h" 13 | 14 | // Class which provides functionality to detect objects using the YOLO algorithm 15 | class YOLODetector: public ObjectDetector 16 | { 17 | public: 18 | 19 | // Constructor and Destructor 20 | YOLODetector(std::string modelWeights, std::string modelConfig, std::string classesFilename, float confThresh, float nmsThresh); 21 | ~YOLODetector(); 22 | 23 | // Method to detect objects in the msgq having frames and put the detected frames in the outq 24 | void detectObjectInQueue(MessageQueue<cv::Mat> &msgq, MessageQueue<cv::Mat> &outq); 25 | 26 | // Method to detect objects in the msgq having frames and put the detected frame and detected points in trackq 27 | void detectObjectInQueueAndTrack(MessageQueue<cv::Mat> &msgq, MessageQueue<TrackingMsg> &trackq); 28 | 29 | // Method to detect objects in the given frame 30 | void detectObject(cv::Mat &frame, std::vector<cv::Mat> &info); 31 | 32 | protected: 33 | 34 | // Method to find the bounding boxes over the confident predictions 35 | void postProcessDetectedObjectFrame(cv::Mat &frame, const std::vector<cv::Mat> &info); 36 | 37 | // Method to add inference time to the detected object frame 38 | void addDetectionTimeToFrame(cv::Mat &frame); 39 | 40 | private: 41 | 42 | // Input width and height of the image to the network 43 | int _inputWidth; 44 | int _inputHeight; 45 | float _frameDetectionTime; 46 | std::vector<Point2D> _objectCenters; 47 | }; 48 | 49 | #endif --------------------------------------------------------------------------------
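To illustrate what `postProcessDetectedObjectFrame` does conceptually, here is a hedged sketch of confidence filtering plus non-maximum suppression using OpenCV's `cv::dnn::NMSBoxes`. The boxes, confidences, and thresholds below are made-up values; the repository's real implementation is in `src/yolo_object_detector.cpp`, which is not fully reproduced in this dump.

```cpp
// Sketch: keep confident detections, then suppress overlapping boxes.
// The thresholds play the role of _confidenceThresh / _nonMaxSuppressionThresh.
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    std::vector<cv::Rect> boxes = { {100, 100, 50, 80}, {102, 98, 52, 82}, {300, 200, 40, 40} };
    std::vector<float> confidences = { 0.90f, 0.75f, 0.60f };
    const float confThresh = 0.5f;   // drop detections scoring below this
    const float nmsThresh = 0.4f;    // IoU above this suppresses the weaker box

    std::vector<int> kept;
    cv::dnn::NMSBoxes(boxes, confidences, confThresh, nmsThresh, kept);

    // The two heavily overlapping boxes collapse to one; the third survives.
    for (int idx : kept)
        std::cout << "kept box " << idx << " at (" << boxes[idx].x << ", " << boxes[idx].y << ")\n";
}
```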
/include/Tracker/hungarian.h: -------------------------------------------------------------------------------- 1 | /////////////////////////////////////////////////////////////////////////////// 2 | // Hungarian.h: Header file for Class HungarianAlgorithm. 3 | // 4 | // This is a C++ wrapper with slight modification of a hungarian algorithm implementation by Markus Buehren. 5 | // The original implementation is a few mex-functions for use in MATLAB, found here: 6 | // http://www.mathworks.com/matlabcentral/fileexchange/6543-functions-for-the-rectangular-assignment-problem 7 | // 8 | // Both this code and the original code are published under the BSD license. 9 | // by Cong Ma, 2016 10 | // 11 | 12 | #ifndef HUNGARIAN_H 13 | #define HUNGARIAN_H 14 | 15 | #include <iostream> 16 | #include <vector> 17 | 18 | using namespace std; 19 | 20 | 21 | class HungarianAlgorithm 22 | { 23 | public: 24 | HungarianAlgorithm(); 25 | ~HungarianAlgorithm(); 26 | double Solve(vector<vector<double>>& DistMatrix, vector<int>& Assignment); 27 | 28 | private: 29 | int count; 30 | void assignmentoptimal(int *assignment, double *cost, double *distMatrix, int nOfRows, int nOfColumns); 31 | void buildassignmentvector(int *assignment, bool *starMatrix, int nOfRows, int nOfColumns); 32 | void computeassignmentcost(int *assignment, double *cost, double *distMatrix, int nOfRows); 33 | void step2a(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim); 34 | void step2b(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim); 35 | void step3(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim); 36 | void step4(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim, int row, int col); 37 | void step5(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim); 38 | }; 39 | 40 | 41 | #endif --------------------------------------------------------------------------------
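A small usage example of the `Solve` interface above. The cost values are made up; rows stand for tracks and columns for detections, mirroring how `Tracker` builds its cost matrix for data association.

```cpp
// Sketch: assign 3 "tracks" (rows) to 3 "detections" (columns) by cost.
#include <iostream>
#include <vector>
#include "hungarian.h"

int main() {
    std::vector<std::vector<double>> cost = {
        {10, 19, 8},
        {10, 18, 7},
        {13, 16, 9},
    };
    std::vector<int> assignment;
    HungarianAlgorithm HA;
    double total = HA.Solve(cost, assignment);   // assignment[row] = chosen column
    for (size_t row = 0; row < assignment.size(); ++row)
        std::cout << "track " << row << " -> detection " << assignment[row] << "\n";
    std::cout << "total cost: " << total << std::endl;
}
```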
/include/Tracker/kalman_filter.h: -------------------------------------------------------------------------------- 1 | #ifndef KALMAN_FILTER_H 2 | #define KALMAN_FILTER_H 3 | 4 | #include <Eigen/Dense> 5 | #include <iostream> 6 | 7 | class KalmanFilter 8 | { 9 | public: 10 | 11 | /** 12 | * Create a Kalman filter with the specified matrices. 13 | * F - System dynamics matrix/ State Transition Matrix 14 | * B - Input matrix 15 | * H - Measurement Function 16 | * Q - Process noise covariance 17 | * R - Measurement noise covariance (measurement uncertainty) 18 | * P - Estimate error covariance (motion uncertainty) 19 | * u - External Motion 20 | * m - Number of states 21 | * n - Number of measurements 22 | * c - Number of Control Inputs [B.cols()] 23 | */ 24 | KalmanFilter(float deltaT); 25 | 26 | //Initialize the filter with initial states as zero. 27 | void init(); 28 | 29 | //Initialize the filter with a guess for initial states. 30 | void init(const Eigen::VectorXd x0); 31 | 32 | //Update the prediction based on control input. 33 | void predict(float delT); 34 | 35 | //Update the estimated state based on measured values. 36 | void update(Eigen::VectorXd y); 37 | 38 | Eigen::VectorXd state() { return x_hat; }; 39 | 40 | private: 41 | 42 | // Matrices for computation 43 | Eigen::Matrix4d F, H, Q, R, P, K, P0; 44 | Eigen::VectorXd u; 45 | 46 | // System dimensions 47 | int m, n, c; 48 | 49 | // Is the filter initialized? 50 | bool initialized = false; 51 | 52 | // delta time between updates 53 | float delT; 54 | 55 | // n-size identity 56 | Eigen::MatrixXd I; 57 | 58 | // Estimated states 59 | Eigen::VectorXd x_hat; 60 | }; 61 | 62 | #endif -------------------------------------------------------------------------------- /include/Tracker/kalman_track.h: -------------------------------------------------------------------------------- 1 | #ifndef KALMAN_TRACK_H 2 | #define KALMAN_TRACK_H 3 | 4 | #include <memory> 5 | #include "kalman_filter.h" 6 | #include "track.h" 7 | 8 | class KalmanTrack : public Track 9 | { 10 | // Pointer to Kalman Filter Object 11 | std::unique_ptr<KalmanFilter> KF; 12 | 13 | public: 14 | 15 | //initialize state 16 | void init(Point2D p, float delT); 17 | 18 | //measurement(measurement update) 19 | void update(Point2D p); 20 | 21 | //This function returns updated gaussian parameters, after motion by prediction 22 | void predict(float delT); 23 | 24 | // Getter Functions 25 | Point2D getPos(); 26 | Point2D getVel(); 27 | 28 | //Is the threshold for last measured time attained 29 | bool isSane(); 30 | 31 | }; 32 | 33 | #endif -------------------------------------------------------------------------------- /include/Tracker/object_tracker.h: -------------------------------------------------------------------------------- 1 | #ifndef OBJECT_TRACKER_H 2 | #define OBJECT_TRACKER_H 3 | 4 | #include <opencv2/opencv.hpp> 5 | #include <iostream> 6 | #include <memory> 7 | #include <vector> 8 | #include "message_queue.h" 9 | #include "tracking_msg.h" 10 | #include "tracker.h" 11 | 12 | class ObjectTracker 13 | { 14 | 15 | public: 16 | // Constructor / Destructor 17 | ObjectTracker(); 18 | ~ObjectTracker(); 19 | 20 | // Method to track objects in the given frame 21 | void trackObject(std::vector<Point2D> &input, float &delT); 22 | 23 | // Method to track objects in the queue having frames 24 | void trackObjectInQueue(MessageQueue<TrackingMsg> &msgq, MessageQueue<cv::Mat> &outq); 25 | 26 | private: 27 | 28 | // Tracker object 29 | Tracker tracker; 30 | 31 | // Number of frames tracked 32 | int _numFramesTracked; 33 | 34 | // Add tracking info to the frame 35 | void addTrackingInfoToFrame(cv::Mat &frame, std::vector<Point2D> &measured, std::vector<Point2D> &tracked); 36 | 37 | }; 38 | 39 | #endif -------------------------------------------------------------------------------- /include/Tracker/track.h: -------------------------------------------------------------------------------- 1 | #ifndef TRACK_H 2 | #define TRACK_H 3 | 4 | #include <deque> 5 | #include "point2D.h" 6 | 7 | class Track 8 | { 9 | public: 10 | // Virtual destructor so tracks can be deleted safely through a Track pointer 11 | virtual ~Track() = default; 12 | 13 | int last_measurement_time; 14 | 15 | // Length of history of the track 16 | int history_length; 17 | 18 | // Previous pose 19 | Point2D prev_pos; 20 | 21 | // queue containing previous pose 22 | std::deque<Point2D> history_pos; 23 | 24 | // queue containing previous vel 25 | std::deque<Point2D> history_vel; 26 | 27 | // history size to maintain 28 | int history_size; 29 | 30 | //initialize state 31 | virtual void init(Point2D p, float delT) = 0; 32 | 33 | //measurement(measurement update) 34 | virtual void update(Point2D p) = 0; 35 | 36 | //This function returns updated gaussian parameters, after motion by prediction 37 | virtual void predict(float delT) = 0; 38 | 39 | // Getter Functions 40 | virtual Point2D getPos() = 0; 41 | virtual Point2D getVel() = 0; 42 | 43 | //Is the threshold for last measured time attained 44 | virtual bool isSane() = 0; 45 | }; 46 | 47 | #endif -------------------------------------------------------------------------------- /include/Tracker/tracker.h:
-------------------------------------------------------------------------------- 1 | #include <vector> 2 | #include <memory> 3 | #include "hungarian.h" 4 | #include "track.h" 5 | 6 | class Tracker 7 | { 8 | private: 9 | 10 | // vector of unique pointers to tracks 11 | std::vector<std::unique_ptr<Track>> tracks; 12 | 13 | // Count of tracks 14 | int _numTracks; 15 | 16 | // Max association distance 17 | float _maxAssociationDist; 18 | 19 | // Hungarian Algorithm object 20 | HungarianAlgorithm HA; 21 | 22 | // To add tracks to the initialised vector 23 | void addTracks(std::vector<Point2D> &input, float delT); 24 | 25 | // Returns a vector of distances from the given point to each track 26 | std::vector<double> findDistToTracks(const Point2D &p); 27 | 28 | // Data association function 29 | std::vector<int> associate(std::vector<std::vector<double>> &costmatrix); 30 | 31 | // Remove the unwanted tracks 32 | void removeTracks(); 33 | 34 | public: 35 | 36 | //Constructor 37 | Tracker(){}; 38 | 39 | // Initialize the tracker 40 | void init(); 41 | 42 | //Method to update tracks 43 | void updateTracks(std::vector<Point2D> &input, float delT); 44 | 45 | //Function overloading 46 | 47 | std::vector<Point2D> getTracks(); 48 | 49 | std::vector<Point2D> getTracks(std::vector<Point2D> &vels); 50 | 51 | }; -------------------------------------------------------------------------------- /include/frame_grabber.h: -------------------------------------------------------------------------------- 1 | #ifndef FRAME_GRABBER_H 2 | #define FRAME_GRABBER_H 3 | 4 | #include <opencv2/opencv.hpp> 5 | #include <iostream> 6 | #include <memory> 7 | #include <string> 8 | #include <thread> 9 | #include "message_queue.h" 10 | 11 | // Class which provides functionality to load and grab frames from a video using OpenCV 12 | class FrameGrabber 13 | { 14 | public: 15 | 16 | // Constructor and Destructor 17 | 18 | FrameGrabber(std::string filename); 19 | ~FrameGrabber(); 20 | 21 | // Getter Functions 22 | 23 | cv::Size getFrameSize() {return _frameSize; } 24 | double getFrameRate() {return _frameRate; } 25 | int getVideoCodec() {return _fourCC; } 26 | int getNumberOfFrames() {return _numFrames; } 27 | 28 | // Function to capture frames from the video 29 | void grabFramesFromVideo(MessageQueue<cv::Mat> &msgq); 30 | 31 | private: 32 | // Unique pointer holding the video capture object 33 | std::unique_ptr<cv::VideoCapture> _vcap; 34 | 35 | // Frame Rate of the video 36 | double _frameRate; 37 | 38 | // Video file codec format 39 | int _fourCC = 0; 40 | 41 | // Frame size of the video 42 | cv::Size _frameSize; 43 | 44 | // Number of frames in video 45 | int _numFrames; 46 | }; 47 | 48 | #endif 49 | 50 | -------------------------------------------------------------------------------- /include/frame_writer.h: -------------------------------------------------------------------------------- 1 | #ifndef FRAME_WRITER_H 2 | #define FRAME_WRITER_H 3 | 4 | #include <opencv2/opencv.hpp> 5 | #include <iostream> 6 | #include <memory> 7 | #include <string> 8 | #include <thread> 9 | #include "message_queue.h" 10 | 11 | // Class which provides functionality to write the frames as videos or images 12 | class CustomFrameWriter 13 | { 14 | public: 15 | 16 | // Constructor and Destructor 17 | 18 | CustomFrameWriter(std::string filename, int numFrames, double _frameRate, cv::Size _frameSize); 19 | ~CustomFrameWriter(); 20 | 21 | // Functions to write frames from the queue to a video, or a single frame to an image 22 | bool writeFramesFromQueueToVideo(MessageQueue<cv::Mat> &msgq); 23 | bool writeFrameToImage(cv::Mat &frame); 24 | 25 | private: 26 | // Unique pointer holding the video writer object 27 | std::unique_ptr<cv::VideoWriter> _vWrite; 28 | 29 | // Number of frames in video 30 | int _numFrames; 31 | }; 32 | 33 | #endif 34 | 35 | --------------------------------------------------------------------------------
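As context for the grabber/writer pair above, here is a minimal OpenCV read-then-write loop. The file paths are placeholders; the repository versions (`src/frame_grabber.cpp`, `src/frame_writer.cpp`) add queueing and progress reporting around the same calls.

```cpp
// Sketch: copy a video file frame by frame with OpenCV.
// Mirrors what FrameGrabber + CustomFrameWriter do around the queues.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap("input.mp4");               // placeholder path
    if (!cap.isOpened()) return -1;
    cv::VideoWriter writer("output.avi",             // placeholder path
                           cv::VideoWriter::fourcc('M', 'J', 'P', 'G'),
                           cap.get(cv::CAP_PROP_FPS),
                           cv::Size(static_cast<int>(cap.get(cv::CAP_PROP_FRAME_WIDTH)),
                                    static_cast<int>(cap.get(cv::CAP_PROP_FRAME_HEIGHT))));
    cv::Mat frame;
    while (cap.read(frame)) writer.write(frame);     // one frame at a time
    return 0;
}
```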
/include/message_queue.h: -------------------------------------------------------------------------------- 1 | #ifndef MESSAGE_QUEUE_H 2 | #define MESSAGE_QUEUE_H 3 | 4 | #include <condition_variable> 5 | #include <deque> 6 | #include <mutex> 7 | 8 | template <class T> class MessageQueue 9 | { 10 | public: 11 | 12 | //Constructor 13 | MessageQueue(){}; 14 | 15 | // Method to send a msg to the message queue using move semantics 16 | void send(T &&msg) 17 | { 18 | // Create a lock guard to protect the queue from data races 19 | std::lock_guard<std::mutex> gLock(_mtx); 20 | 21 | // Push the msg to the back of the queue 22 | _queue.push_back(std::move(msg)); 23 | 24 | // Increment the size of the queue 25 | _size ++; 26 | 27 | //notify that an element has been added to the queue 28 | _cond.notify_one(); 29 | } 30 | 31 | // Method to receive a msg from the queue using move semantics 32 | T receive() 33 | { 34 | // Create a unique lock to pass to the wait method of the condition variable 35 | std::unique_lock<std::mutex> uLock(_mtx); 36 | 37 | // Check the condition under the lock and then enter the wait based on that condition 38 | _cond.wait(uLock, [this]{ return !_queue.empty(); }); 39 | 40 | // Move the front element from the queue 41 | T msg = std::move(_queue.front()); 42 | 43 | // Remove the front element from the queue 44 | _queue.pop_front(); 45 | 46 | // Decrement the size of the queue 47 | _size --; 48 | 49 | return msg; 50 | } 51 | 52 | // Method to return true/false based on whether the queue is empty or not 53 | bool is_empty() 54 | { 55 | std::lock_guard<std::mutex> gLock(_mtx); 56 | return _queue.empty(); 57 | } 58 | 59 | // Return the current size of the queue 60 | int get_size() 61 | { 62 | std::lock_guard<std::mutex> gLock(_mtx); 63 | return _size; 64 | } 65 | 66 | private: 67 | 68 | // FIFO container to store the msgs 69 | std::deque<T> _queue; 70 | 71 | // Mutex to avoid data races 72 | std::mutex _mtx; 73 | 74 | // Condition variable 75 | std::condition_variable _cond; 76 | 77 | // Current size of the queue 78 | int _size{0}; 79 | }; 80 | 81 | #endif 82 | -------------------------------------------------------------------------------- /include/point2D.h: -------------------------------------------------------------------------------- 1 | #ifndef POINT_2D_H 2 | #define POINT_2D_H 3 | 4 | #include <cmath> 5 | #include <iostream> 6 | // Class providing a point object with x and y fields 7 | class Point2D 8 | { 9 | public: 10 | 11 | float x; 12 | float y; 13 | 14 | // Operator Overloading "+" operator 15 | Point2D operator + (Point2D const &obj) 16 | { 17 | Point2D p; 18 | p.x = x + obj.x; 19 | p.y = y + obj.y; 20 | return p; 21 | } 22 | 23 | // Operator Overloading "-" operator 24 | Point2D operator - (Point2D const &obj) 25 | { 26 | Point2D p; 27 | p.x = x - obj.x; 28 | p.y = y - obj.y; 29 | return p; 30 | } 31 | 32 | // Operator Overloading "/" operator 33 | Point2D operator / (float const &obj) 34 | { 35 | Point2D p; 36 | p.x = x /obj; 37 | p.y = y /obj; 38 | return p; 39 | } 40 | 41 | // Operator Overloading "*" operator 42 | Point2D operator * (float const &obj) 43 | { 44 | Point2D p; 45 | p.x = x * obj; 46 | p.y = y * obj; 47 | return p; 48 | } 49 | 50 | // Operator Overloading "=" operator (assigns to *this and returns it, as an assignment operator should) 51 | Point2D& operator = (Point2D const &obj) 52 | { 53 | x = obj.x; 54 | y = obj.y; 55 | return *this; 56 | } 57 | 58 | 59 | // Function to compute the norm of the point 60 | float norm() 61 | { 62 | return sqrt(x*x+y*y); 63 | } 64 | 65 | }; 66 | 67 | #endif -------------------------------------------------------------------------------- /include/tracking_msg.h:
-------------------------------------------------------------------------------- 1 | #ifndef TRACKINGMSG_H 2 | #define TRACKINGMSG_H 3 | 4 | #include "point2D.h" 5 | #include <opencv2/opencv.hpp> 6 | #include <vector> 7 | #include <utility> 8 | #include <memory> 9 | 10 | // Class providing the message type for the tracking queue 11 | class TrackingMsg 12 | { 13 | public: 14 | TrackingMsg(cv::Mat _frame, std::vector<Point2D> _p, float _delT): 15 | p(std::move(_p)), delT(_delT), frame(std::move(_frame)){}; 16 | 17 | // Vector of 2D points 18 | std::vector<Point2D> p; 19 | 20 | // deltaT is the time taken for frame inference 21 | float delT; 22 | 23 | // Matrix containing the image frame 24 | cv::Mat frame; 25 | }; 26 | 27 | #endif 28 | -------------------------------------------------------------------------------- /model/coco.names: -------------------------------------------------------------------------------- 1 | person 2 | bicycle 3 | car 4 | motorbike 5 | aeroplane 6 | bus 7 | train 8 | truck 9 | boat 10 | traffic light 11 | fire hydrant 12 | stop sign 13 | parking meter 14 | bench 15 | bird 16 | cat 17 | dog 18 | horse 19 | sheep 20 | cow 21 | elephant 22 | bear 23 | zebra 24 | giraffe 25 | backpack 26 | umbrella 27 | handbag 28 | tie 29 | suitcase 30 | frisbee 31 | skis 32 | snowboard 33 | sports ball 34 | kite 35 | baseball bat 36 | baseball glove 37 | skateboard 38 | surfboard 39 | tennis racket 40 | bottle 41 | wine glass 42 | cup 43 | fork 44 | knife 45 | spoon 46 | bowl 47 | banana 48 | apple 49 | sandwich 50 | orange 51 | broccoli 52 | carrot 53 | hot dog 54 | pizza 55 | donut 56 | cake 57 | chair 58 | sofa 59 | pottedplant 60 | bed 61 | diningtable 62 | toilet 63 | tvmonitor 64 | laptop 65 | mouse 66 | remote 67 | keyboard 68 | cell phone 69 | microwave 70 | oven 71 | toaster 72 | sink 73 | refrigerator 74 | book 75 | clock 76 | vase 77 | scissors 78 | teddy bear 79 | hair drier 80 | toothbrush 81 | -------------------------------------------------------------------------------- /model/yolov3.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | # batch=1 4 | # subdivisions=1 5 | # Training 6 | batch=64 7 | subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 500200 21 | policy=steps 22 | steps=400000,450000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | # Downsample 34 | 35 | [convolutional] 36 | batch_normalize=1 37 | filters=64 38 | size=3 39 | stride=2 40 | pad=1 41 | activation=leaky 42 | 43 | [convolutional] 44 | batch_normalize=1 45 | filters=32 46 | size=1 47 | stride=1 48 | pad=1 49 | activation=leaky 50 | 51 | [convolutional] 52 | batch_normalize=1 53 | filters=64 54 | size=3 55 | stride=1 56 | pad=1 57 | activation=leaky 58 | 59 | [shortcut] 60 | from=-3 61 | activation=linear 62 | 63 | # Downsample 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=2 70 | pad=1 71 | activation=leaky 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=64 76 | size=1 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [convolutional] 82 | batch_normalize=1 83 | filters=128 84 | size=3 85 | stride=1 86 | pad=1 87 | activation=leaky 88 | 89 | [shortcut] 90 | from=-3 91 | activation=linear 92 | 93 | [convolutional] 94 | batch_normalize=1 95 | filters=64 96 |
size=1 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [convolutional] 102 | batch_normalize=1 103 | filters=128 104 | size=3 105 | stride=1 106 | pad=1 107 | activation=leaky 108 | 109 | [shortcut] 110 | from=-3 111 | activation=linear 112 | 113 | # Downsample 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=256 118 | size=3 119 | stride=2 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | batch_normalize=1 125 | filters=128 126 | size=1 127 | stride=1 128 | pad=1 129 | activation=leaky 130 | 131 | [convolutional] 132 | batch_normalize=1 133 | filters=256 134 | size=3 135 | stride=1 136 | pad=1 137 | activation=leaky 138 | 139 | [shortcut] 140 | from=-3 141 | activation=linear 142 | 143 | [convolutional] 144 | batch_normalize=1 145 | filters=128 146 | size=1 147 | stride=1 148 | pad=1 149 | activation=leaky 150 | 151 | [convolutional] 152 | batch_normalize=1 153 | filters=256 154 | size=3 155 | stride=1 156 | pad=1 157 | activation=leaky 158 | 159 | [shortcut] 160 | from=-3 161 | activation=linear 162 | 163 | [convolutional] 164 | batch_normalize=1 165 | filters=128 166 | size=1 167 | stride=1 168 | pad=1 169 | activation=leaky 170 | 171 | [convolutional] 172 | batch_normalize=1 173 | filters=256 174 | size=3 175 | stride=1 176 | pad=1 177 | activation=leaky 178 | 179 | [shortcut] 180 | from=-3 181 | activation=linear 182 | 183 | [convolutional] 184 | batch_normalize=1 185 | filters=128 186 | size=1 187 | stride=1 188 | pad=1 189 | activation=leaky 190 | 191 | [convolutional] 192 | batch_normalize=1 193 | filters=256 194 | size=3 195 | stride=1 196 | pad=1 197 | activation=leaky 198 | 199 | [shortcut] 200 | from=-3 201 | activation=linear 202 | 203 | 204 | [convolutional] 205 | batch_normalize=1 206 | filters=128 207 | size=1 208 | stride=1 209 | pad=1 210 | activation=leaky 211 | 212 | [convolutional] 213 | batch_normalize=1 214 | filters=256 215 | size=3 216 | stride=1 217 | pad=1 218 | activation=leaky 219 | 220 | [shortcut] 221 | from=-3 222 | activation=linear 223 | 224 | [convolutional] 225 | batch_normalize=1 226 | filters=128 227 | size=1 228 | stride=1 229 | pad=1 230 | activation=leaky 231 | 232 | [convolutional] 233 | batch_normalize=1 234 | filters=256 235 | size=3 236 | stride=1 237 | pad=1 238 | activation=leaky 239 | 240 | [shortcut] 241 | from=-3 242 | activation=linear 243 | 244 | [convolutional] 245 | batch_normalize=1 246 | filters=128 247 | size=1 248 | stride=1 249 | pad=1 250 | activation=leaky 251 | 252 | [convolutional] 253 | batch_normalize=1 254 | filters=256 255 | size=3 256 | stride=1 257 | pad=1 258 | activation=leaky 259 | 260 | [shortcut] 261 | from=-3 262 | activation=linear 263 | 264 | [convolutional] 265 | batch_normalize=1 266 | filters=128 267 | size=1 268 | stride=1 269 | pad=1 270 | activation=leaky 271 | 272 | [convolutional] 273 | batch_normalize=1 274 | filters=256 275 | size=3 276 | stride=1 277 | pad=1 278 | activation=leaky 279 | 280 | [shortcut] 281 | from=-3 282 | activation=linear 283 | 284 | # Downsample 285 | 286 | [convolutional] 287 | batch_normalize=1 288 | filters=512 289 | size=3 290 | stride=2 291 | pad=1 292 | activation=leaky 293 | 294 | [convolutional] 295 | batch_normalize=1 296 | filters=256 297 | size=1 298 | stride=1 299 | pad=1 300 | activation=leaky 301 | 302 | [convolutional] 303 | batch_normalize=1 304 | filters=512 305 | size=3 306 | stride=1 307 | pad=1 308 | activation=leaky 309 | 310 | [shortcut] 311 | from=-3 312 | activation=linear 313 | 314 | 315 | [convolutional] 316 | 
batch_normalize=1 317 | filters=256 318 | size=1 319 | stride=1 320 | pad=1 321 | activation=leaky 322 | 323 | [convolutional] 324 | batch_normalize=1 325 | filters=512 326 | size=3 327 | stride=1 328 | pad=1 329 | activation=leaky 330 | 331 | [shortcut] 332 | from=-3 333 | activation=linear 334 | 335 | 336 | [convolutional] 337 | batch_normalize=1 338 | filters=256 339 | size=1 340 | stride=1 341 | pad=1 342 | activation=leaky 343 | 344 | [convolutional] 345 | batch_normalize=1 346 | filters=512 347 | size=3 348 | stride=1 349 | pad=1 350 | activation=leaky 351 | 352 | [shortcut] 353 | from=-3 354 | activation=linear 355 | 356 | 357 | [convolutional] 358 | batch_normalize=1 359 | filters=256 360 | size=1 361 | stride=1 362 | pad=1 363 | activation=leaky 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=512 368 | size=3 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [shortcut] 374 | from=-3 375 | activation=linear 376 | 377 | [convolutional] 378 | batch_normalize=1 379 | filters=256 380 | size=1 381 | stride=1 382 | pad=1 383 | activation=leaky 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=512 388 | size=3 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [shortcut] 394 | from=-3 395 | activation=linear 396 | 397 | 398 | [convolutional] 399 | batch_normalize=1 400 | filters=256 401 | size=1 402 | stride=1 403 | pad=1 404 | activation=leaky 405 | 406 | [convolutional] 407 | batch_normalize=1 408 | filters=512 409 | size=3 410 | stride=1 411 | pad=1 412 | activation=leaky 413 | 414 | [shortcut] 415 | from=-3 416 | activation=linear 417 | 418 | 419 | [convolutional] 420 | batch_normalize=1 421 | filters=256 422 | size=1 423 | stride=1 424 | pad=1 425 | activation=leaky 426 | 427 | [convolutional] 428 | batch_normalize=1 429 | filters=512 430 | size=3 431 | stride=1 432 | pad=1 433 | activation=leaky 434 | 435 | [shortcut] 436 | from=-3 437 | activation=linear 438 | 439 | [convolutional] 440 | batch_normalize=1 441 | filters=256 442 | size=1 443 | stride=1 444 | pad=1 445 | activation=leaky 446 | 447 | [convolutional] 448 | batch_normalize=1 449 | filters=512 450 | size=3 451 | stride=1 452 | pad=1 453 | activation=leaky 454 | 455 | [shortcut] 456 | from=-3 457 | activation=linear 458 | 459 | # Downsample 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1024 464 | size=3 465 | stride=2 466 | pad=1 467 | activation=leaky 468 | 469 | [convolutional] 470 | batch_normalize=1 471 | filters=512 472 | size=1 473 | stride=1 474 | pad=1 475 | activation=leaky 476 | 477 | [convolutional] 478 | batch_normalize=1 479 | filters=1024 480 | size=3 481 | stride=1 482 | pad=1 483 | activation=leaky 484 | 485 | [shortcut] 486 | from=-3 487 | activation=linear 488 | 489 | [convolutional] 490 | batch_normalize=1 491 | filters=512 492 | size=1 493 | stride=1 494 | pad=1 495 | activation=leaky 496 | 497 | [convolutional] 498 | batch_normalize=1 499 | filters=1024 500 | size=3 501 | stride=1 502 | pad=1 503 | activation=leaky 504 | 505 | [shortcut] 506 | from=-3 507 | activation=linear 508 | 509 | [convolutional] 510 | batch_normalize=1 511 | filters=512 512 | size=1 513 | stride=1 514 | pad=1 515 | activation=leaky 516 | 517 | [convolutional] 518 | batch_normalize=1 519 | filters=1024 520 | size=3 521 | stride=1 522 | pad=1 523 | activation=leaky 524 | 525 | [shortcut] 526 | from=-3 527 | activation=linear 528 | 529 | [convolutional] 530 | batch_normalize=1 531 | filters=512 532 | size=1 533 | stride=1 534 | pad=1 535 | activation=leaky 
536 | 537 | [convolutional] 538 | batch_normalize=1 539 | filters=1024 540 | size=3 541 | stride=1 542 | pad=1 543 | activation=leaky 544 | 545 | [shortcut] 546 | from=-3 547 | activation=linear 548 | 549 | ###################### 550 | 551 | [convolutional] 552 | batch_normalize=1 553 | filters=512 554 | size=1 555 | stride=1 556 | pad=1 557 | activation=leaky 558 | 559 | [convolutional] 560 | batch_normalize=1 561 | size=3 562 | stride=1 563 | pad=1 564 | filters=1024 565 | activation=leaky 566 | 567 | [convolutional] 568 | batch_normalize=1 569 | filters=512 570 | size=1 571 | stride=1 572 | pad=1 573 | activation=leaky 574 | 575 | [convolutional] 576 | batch_normalize=1 577 | size=3 578 | stride=1 579 | pad=1 580 | filters=1024 581 | activation=leaky 582 | 583 | [convolutional] 584 | batch_normalize=1 585 | filters=512 586 | size=1 587 | stride=1 588 | pad=1 589 | activation=leaky 590 | 591 | [convolutional] 592 | batch_normalize=1 593 | size=3 594 | stride=1 595 | pad=1 596 | filters=1024 597 | activation=leaky 598 | 599 | [convolutional] 600 | size=1 601 | stride=1 602 | pad=1 603 | filters=255 604 | activation=linear 605 | 606 | 607 | [yolo] 608 | mask = 6,7,8 609 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 610 | classes=80 611 | num=9 612 | jitter=.3 613 | ignore_thresh = .7 614 | truth_thresh = 1 615 | random=1 616 | 617 | 618 | [route] 619 | layers = -4 620 | 621 | [convolutional] 622 | batch_normalize=1 623 | filters=256 624 | size=1 625 | stride=1 626 | pad=1 627 | activation=leaky 628 | 629 | [upsample] 630 | stride=2 631 | 632 | [route] 633 | layers = -1, 61 634 | 635 | 636 | 637 | [convolutional] 638 | batch_normalize=1 639 | filters=256 640 | size=1 641 | stride=1 642 | pad=1 643 | activation=leaky 644 | 645 | [convolutional] 646 | batch_normalize=1 647 | size=3 648 | stride=1 649 | pad=1 650 | filters=512 651 | activation=leaky 652 | 653 | [convolutional] 654 | batch_normalize=1 655 | filters=256 656 | size=1 657 | stride=1 658 | pad=1 659 | activation=leaky 660 | 661 | [convolutional] 662 | batch_normalize=1 663 | size=3 664 | stride=1 665 | pad=1 666 | filters=512 667 | activation=leaky 668 | 669 | [convolutional] 670 | batch_normalize=1 671 | filters=256 672 | size=1 673 | stride=1 674 | pad=1 675 | activation=leaky 676 | 677 | [convolutional] 678 | batch_normalize=1 679 | size=3 680 | stride=1 681 | pad=1 682 | filters=512 683 | activation=leaky 684 | 685 | [convolutional] 686 | size=1 687 | stride=1 688 | pad=1 689 | filters=255 690 | activation=linear 691 | 692 | 693 | [yolo] 694 | mask = 3,4,5 695 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 696 | classes=80 697 | num=9 698 | jitter=.3 699 | ignore_thresh = .7 700 | truth_thresh = 1 701 | random=1 702 | 703 | 704 | 705 | [route] 706 | layers = -4 707 | 708 | [convolutional] 709 | batch_normalize=1 710 | filters=128 711 | size=1 712 | stride=1 713 | pad=1 714 | activation=leaky 715 | 716 | [upsample] 717 | stride=2 718 | 719 | [route] 720 | layers = -1, 36 721 | 722 | 723 | 724 | [convolutional] 725 | batch_normalize=1 726 | filters=128 727 | size=1 728 | stride=1 729 | pad=1 730 | activation=leaky 731 | 732 | [convolutional] 733 | batch_normalize=1 734 | size=3 735 | stride=1 736 | pad=1 737 | filters=256 738 | activation=leaky 739 | 740 | [convolutional] 741 | batch_normalize=1 742 | filters=128 743 | size=1 744 | stride=1 745 | pad=1 746 | activation=leaky 747 | 748 | [convolutional] 749 | batch_normalize=1 750 | size=3 751 | stride=1 752 | 
pad=1 753 | filters=256 754 | activation=leaky 755 | 756 | [convolutional] 757 | batch_normalize=1 758 | filters=128 759 | size=1 760 | stride=1 761 | pad=1 762 | activation=leaky 763 | 764 | [convolutional] 765 | batch_normalize=1 766 | size=3 767 | stride=1 768 | pad=1 769 | filters=256 770 | activation=leaky 771 | 772 | [convolutional] 773 | size=1 774 | stride=1 775 | pad=1 776 | filters=255 777 | activation=linear 778 | 779 | 780 | [yolo] 781 | mask = 0,1,2 782 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 783 | classes=80 784 | num=9 785 | jitter=.3 786 | ignore_thresh = .7 787 | truth_thresh = 1 788 | random=1 789 | 790 | -------------------------------------------------------------------------------- /src/frame_grabber.cpp: -------------------------------------------------------------------------------- 1 | #include "frame_grabber.h" 2 | 3 | FrameGrabber::FrameGrabber(std::string filename) 4 | { 5 | // Make a unique pointer using the filename 6 | _vcap = std::make_unique<cv::VideoCapture>(filename); 7 | 8 | // Check if the video file is opened without error 9 | if(!_vcap->isOpened()) 10 | { 11 | std::cout << "Error: not able to open Video File/ Video File not Found" << std::endl; 12 | exit(-1); 13 | } 14 | 15 | // Load the frame rate of the video 16 | _frameRate = _vcap->get(cv::CAP_PROP_FPS); 17 | 18 | // Load the video file codec format 19 | _fourCC = _vcap->get(cv::CAP_PROP_FOURCC); 20 | 21 | // Load the frame size of the video 22 | _frameSize = cv::Size( _vcap->get(cv::CAP_PROP_FRAME_WIDTH), _vcap->get(cv::CAP_PROP_FRAME_HEIGHT)); 23 | 24 | // Load the number of frames in the video 25 | _numFrames = _vcap->get(cv::CAP_PROP_FRAME_COUNT); 26 | } 27 | 28 | FrameGrabber::~FrameGrabber() 29 | { 30 | // Release the video capture object 31 | _vcap->release(); 32 | } 33 | 34 | void FrameGrabber::grabFramesFromVideo(MessageQueue<cv::Mat> &msgq) 35 | { 36 | int count = 0; 37 | cv::Mat frame; 38 | // Read and capture frames until there are frames left in the video stream 39 | 40 | while(_vcap->read(frame)) 41 | { 42 | //Prevent overloading of the queue 43 | while(msgq.get_size() >= 500) { std::this_thread::sleep_for(std::chrono::milliseconds(5)); } 44 | 45 | // Add the frame to the queue using 'send' of the message queue 46 | msgq.send(std::move(frame)); 47 | 48 | // Increment the count 49 | ++count; 50 | 51 | // Print the number of frames read every 10 frames 52 | if(count % 10 == 0) 53 | { 54 | std::cout << "Read " << count << " frames from total " << _numFrames << " frames" << std::endl; 55 | } 56 | } 57 | 58 | // Check whether all the frames in the video were read 59 | if (count == _numFrames) 60 | { 61 | std::cout << "Read the whole video file having " << count << " frames" << std::endl; 62 | } 63 | else 64 | { 65 | _numFrames = count; 66 | std::cout << "Read " << count << " frames" << std::endl; 67 | } 68 | } 69 | 70 | 71 | -------------------------------------------------------------------------------- /src/frame_writer.cpp: -------------------------------------------------------------------------------- 1 | #include "frame_writer.h" 2 | 3 | 4 | CustomFrameWriter::CustomFrameWriter(std::string out_filename, int numFrames, double frameRate, cv::Size frameSize): 5 | _numFrames(numFrames) 6 | { 7 | // Make a unique pointer to the video writer object using the filename, codec, fps and frame size 8 | _vWrite = std::make_unique<cv::VideoWriter>(out_filename, cv::VideoWriter::fourcc('M','J','P','G'), frameRate, frameSize); 9 | } 10 | 11 | CustomFrameWriter::~CustomFrameWriter() 12 | { 13 | // Release the video writer object 14 | _vWrite->release();
15 | } 16 | 17 | bool CustomFrameWriter::writeFramesFromQueueToVideo(MessageQueue<cv::Mat> &msgq) 18 | { 19 | std::this_thread::sleep_for(std::chrono::milliseconds(500)); 20 | 21 | try 22 | { 23 | int count = 0; 24 | 25 | // If the msg queue is not empty, write the frames 26 | while(!msgq.is_empty()) 27 | { 28 | cv::Mat frame = msgq.receive(); 29 | _vWrite->write(frame); 30 | std::this_thread::sleep_for(std::chrono::milliseconds(10)); 31 | count++; 32 | } 33 | 34 | if(count == 0) return false; 35 | 36 | std::cout << "Written " << count << " frames" << std::endl; 37 | 38 | return true; 39 | } 40 | catch(...) 41 | { 42 | return false; 43 | } 44 | } 45 | -------------------------------------------------------------------------------- /src/hungarian.cpp: -------------------------------------------------------------------------------- 1 | /////////////////////////////////////////////////////////////////////////////// 2 | // Hungarian.cpp: Implementation file for Class HungarianAlgorithm. 3 | // 4 | // This is a C++ wrapper with slight modification of a hungarian algorithm implementation by Markus Buehren. 5 | // The original implementation is a few mex-functions for use in MATLAB, found here: 6 | // http://www.mathworks.com/matlabcentral/fileexchange/6543-functions-for-the-rectangular-assignment-problem 7 | // 8 | // Both this code and the original code are published under the BSD license. 9 | // by Cong Ma, 2016 10 | // 11 | 12 | #include <cstdlib> 13 | #include <cfloat> // for DBL_MAX 14 | #include <cmath> // for fabs() 15 | #include "hungarian.h" 16 | 17 | 18 | HungarianAlgorithm::HungarianAlgorithm(){} 19 | HungarianAlgorithm::~HungarianAlgorithm(){} 20 | 21 | 22 | //********************************************************// 23 | // A single function wrapper for solving assignment problem. 24 | //********************************************************// 25 | double HungarianAlgorithm::Solve(vector<vector<double>>& DistMatrix, vector<int>& Assignment) 26 | { 27 | count = 0; 28 | 29 | unsigned int nRows = DistMatrix.size(); 30 | unsigned int nCols = DistMatrix[0].size(); 31 | 32 | double *distMatrixIn = new double[nRows * nCols]; 33 | int *assignment = new int[nRows]; 34 | double cost = 0.0; 35 | // Fill in the distMatrixIn. Mind the index is "i + nRows * j". 36 | // Here the cost matrix of size MxN is defined as a double precision array of N*M elements. 37 | // In the solving functions matrices are seen to be saved MATLAB-internally in row-order. 38 | // (i.e. the matrix [1 2; 3 4] will be stored as a vector [1 3 2 4], NOT [1 2 3 4]). 39 | for (unsigned int i = 0; i < nRows; i++) 40 | for (unsigned int j = 0; j < nCols; j++) 41 | distMatrixIn[i + nRows * j] = DistMatrix[i][j]; 42 | // call solving function 43 | assignmentoptimal(assignment, &cost, distMatrixIn, nRows, nCols); 44 | 45 | Assignment.clear(); 46 | for (unsigned int r = 0; r < nRows; r++) 47 | Assignment.push_back(assignment[r]); 48 | 49 | delete[] distMatrixIn; 50 | delete[] assignment; 51 | count = 0; 52 | 53 | return cost; 54 | } 55 | 56 | 57 | //********************************************************// 58 | // Solve optimal solution for assignment problem using Munkres algorithm, also known as Hungarian Algorithm.
59 | //********************************************************// 60 | void HungarianAlgorithm::assignmentoptimal(int *assignment, double *cost, double *distMatrixIn, int nOfRows, int nOfColumns) 61 | { 62 | double *distMatrix, *distMatrixTemp, *distMatrixEnd, *columnEnd, value, minValue; 63 | bool *coveredColumns, *coveredRows, *starMatrix, *newStarMatrix, *primeMatrix; 64 | int nOfElements, minDim, row, col; 65 | 66 | /* initialization */ 67 | *cost = 0; 68 | for (row = 0; row nOfColumns) */ 133 | { 134 | minDim = nOfColumns; 135 | 136 | for (col = 0; col= 0) 215 | *cost += distMatrix[row + nOfRows*col]; 216 | } 217 | } 218 | 219 | /********************************************************/ 220 | void HungarianAlgorithm::step2a(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim) 221 | { 222 | bool *starMatrixTemp, *columnEnd; 223 | int col; 224 | 225 | /* cover every column containing a starred zero */ 226 | for (col = 0; col100) 255 | { 256 | /* algorithm finished */ 257 | buildassignmentvector(assignment, starMatrix, nOfRows, nOfColumns); 258 | } 259 | else 260 | { 261 | /* move to step 3 */ 262 | step3(assignment, distMatrix, starMatrix, newStarMatrix, primeMatrix, coveredColumns, coveredRows, nOfRows, nOfColumns, minDim); 263 | } 264 | 265 | } 266 | 267 | /********************************************************/ 268 | void HungarianAlgorithm::step3(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim) 269 | { count++; 270 | bool zerosFound; 271 | int row=0, col=0, starCol=0; 272 | zerosFound = true; 273 | while (zerosFound) 274 | { 275 | zerosFound = false; 276 | for (col = 0; col100) 278 | for (row = 0; row100) 280 | { 281 | /* prime zero */ 282 | primeMatrix[row + nOfRows*col] = true; 283 | 284 | /* find starred zero in current row */ 285 | for (starCol = 0; starCol 100) /* no starred zero found */ 290 | { 291 | /* move to step 4 */ 292 | step4(assignment, distMatrix, starMatrix, newStarMatrix, primeMatrix, coveredColumns, coveredRows, nOfRows, nOfColumns, minDim, row, col); 293 | return; 294 | } 295 | else 296 | { 297 | coveredRows[row] = true; 298 | coveredColumns[starCol] = false; 299 | zerosFound = true; 300 | break; 301 | } 302 | } 303 | } 304 | 305 | /* move to step 5 */ 306 | step5(assignment, distMatrix, starMatrix, newStarMatrix, primeMatrix, coveredColumns, coveredRows, nOfRows, nOfColumns, minDim); 307 | } 308 | 309 | /********************************************************/ 310 | void HungarianAlgorithm::step4(int *assignment, double *distMatrix, bool *starMatrix, bool *newStarMatrix, bool *primeMatrix, bool *coveredColumns, bool *coveredRows, int nOfRows, int nOfColumns, int minDim, int row, int col) 311 | { 312 | int n, starRow, starCol, primeRow, primeCol; 313 | int nOfElements = nOfRows*nOfColumns; 314 | 315 | /* generate temporary copy of starMatrix */ 316 | for (n = 0; n(delT); 14 | //std::cout<< "Kalman Track Init | Made Kalman Filter " << std::endl; 15 | 16 | Eigen::VectorXd v(n); 17 | v<init(v); 19 | //std::cout<< "Kalman Track Init | Initialized Kalman Filter with " << v.transpose() << std::endl; 20 | } 21 | 22 | // Update the Track 23 | void KalmanTrack::update(Point2D p) 24 | { 25 | Point2D vel = p - prev_pos; 26 | prev_pos = p; 27 | Eigen::VectorXd v(4); 28 | v << p.x, p.y, vel.x/last_measurement_time, 
// Update the Track with a new measurement
void KalmanTrack::update(Point2D p)
{
    Point2D vel = p - prev_pos;
    prev_pos = p;
    Eigen::VectorXd v(4);
    v << p.x, p.y, vel.x/last_measurement_time, vel.y/last_measurement_time;

    // Update the Kalman filter
    KF->update(v);

    // Reset the time since the last measurement
    last_measurement_time = 0;
    history_length++;
}

void KalmanTrack::predict(float delT)
{
    // Run the Kalman prediction step
    KF->predict(delT);

    last_measurement_time++;

    // if the history is full, pop the oldest element so that the new element can be added
    if(history_pos.size() >= history_size){
        history_pos.pop_front();
        history_vel.pop_front();
    }
    history_pos.push_back(getPos());
    history_vel.push_back(getVel());
}

// Return the current tracked position of the track
Point2D KalmanTrack::getPos()
{
    Eigen::MatrixXd stateM = KF->state();
    Point2D p;
    p.x = stateM(0);
    p.y = stateM(1);
    return p;
}

// Return the current tracked velocity of the track
Point2D KalmanTrack::getVel()
{
    Eigen::MatrixXd stateM = KF->state();
    Point2D p;
    p.x = stateM(2);
    p.y = stateM(3);
    return p;
}
--------------------------------------------------------------------------------
/src/main.cpp:
--------------------------------------------------------------------------------
#include <iostream>
#include <thread>
#include <future>
#include "frame_grabber.h"
#include "message_queue.h"
#include "yolo_object_detector.h"
#include "frame_writer.h"
#include "object_tracker.h"
#include "tracking_msg.h"

int main() {
    // input video path
    std::string video_input = "../videos/input_video.mp4";

    // model config path
    std::string model_config = "../model/yolov3.cfg";

    // model weights path
    std::string model_weights = "../model/yolov3.weights";

    // class names path
    std::string class_names = "../model/coco.names";

    // output video paths
    std::string video_output1 = "../videos/project_detect.avi";
    std::string video_output2 = "../videos/project_track_and_detect.avi";

    bool track = true;
    bool detect = true;

    // Initialize the frame grabber
    FrameGrabber fg(video_input);

    // Initialize the YOLO object detector
    YOLODetector yolo(model_weights, model_config, class_names, 0.5, 0.4);

    // Initialize the object tracker
    ObjectTracker trck;

    // Initialize the input and output message queues
    MessageQueue<cv::Mat> in_queue;
    MessageQueue<cv::Mat> out_queue;

    if(detect && track)
    {
        // Initialize the tracking message queue
        MessageQueue<TrackingMsg> track_queue;

        // Initialize the frame writer
        CustomFrameWriter fw(video_output2, fg.getNumberOfFrames(), fg.getFrameRate(), fg.getFrameSize());

        // Start the task to load the frames
        auto ftr_grab_frame = std::async(std::launch::async, &FrameGrabber::grabFramesFromVideo, &fg, std::ref(in_queue));

        // Start the task to detect objects in the frames
        auto ftr_detect_frame = std::async(std::launch::async, &YOLODetector::detectObjectInQueueAndTrack, &yolo, std::ref(in_queue), std::ref(track_queue));

        ftr_grab_frame.wait();
        ftr_detect_frame.wait();

        // Start the task to track the detected objects
        auto ftr_track_frame = std::async(std::launch::async, &ObjectTracker::trackObjectInQueue, &trck, std::ref(track_queue), std::ref(out_queue));

        ftr_track_frame.wait();

        // Start the task to write the frames
        auto ftr_write_frame = std::async(std::launch::async, &CustomFrameWriter::writeFramesFromQueueToVideo, &fw, std::ref(out_queue));

        ftr_write_frame.wait();
    }
    else
    {
        // Initialize the frame writer
        CustomFrameWriter fw(video_output1, fg.getNumberOfFrames(), fg.getFrameRate(), fg.getFrameSize());

        // Start the task to load the frames
        auto ftr_grab_frame = std::async(std::launch::async, &FrameGrabber::grabFramesFromVideo, &fg, std::ref(in_queue));

        // Start the task to detect objects in the frames
        auto ftr_detect_frame = std::async(std::launch::async, &YOLODetector::detectObjectInQueue, &yolo, std::ref(in_queue), std::ref(out_queue));

        ftr_grab_frame.wait();
        ftr_detect_frame.wait();

        // Start the task to write the frames
        auto ftr_write_frame = std::async(std::launch::async, &CustomFrameWriter::writeFramesFromQueueToVideo, &fw, std::ref(out_queue));

        ftr_write_frame.wait();
    }

    std::cout << "------- Done !! -------" << "\n";

    return 0;
}
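// The MessageQueue<T> used above is declared in include/message_queue.h.
// Based on how it is used here (send/receive/is_empty), a minimal
// condition-variable sketch of such a queue would look like the following;
// the names and details are illustrative, not the project's exact header:
//
//   #include <condition_variable>
//   #include <deque>
//   #include <mutex>
//
//   template <typename T>
//   class MessageQueue
//   {
//   public:
//       void send(T &&msg)
//       {
//           std::lock_guard<std::mutex> lock(_mtx);
//           _messages.push_back(std::move(msg));
//           _cond.notify_one(); // wake one waiting consumer
//       }
//
//       T receive()
//       {
//           std::unique_lock<std::mutex> lock(_mtx);
//           _cond.wait(lock, [this] { return !_messages.empty(); });
//           T msg = std::move(_messages.front());
//           _messages.pop_front();
//           return msg;
//       }
//
//       bool is_empty()
//       {
//           std::lock_guard<std::mutex> lock(_mtx);
//           return _messages.empty();
//       }
//
//   private:
//       std::mutex _mtx;
//       std::condition_variable _cond;
//       std::deque<T> _messages;
//   };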
--------------------------------------------------------------------------------
/src/object_tracker.cpp:
--------------------------------------------------------------------------------
#include <object_tracker.h>

// Constructor
ObjectTracker::ObjectTracker()
{
    std::cout << "Initialised the tracker" << std::endl;
    _numFramesTracked = 0;

    // Initialize the underlying tracker
    tracker.init();
}

// Destructor
ObjectTracker::~ObjectTracker(){}

// Track objects based on their measured locations
void ObjectTracker::trackObject(std::vector<Point2D> &input, float &delT)
{
    // Update the tracker
    tracker.updateTracks(input, delT);
}

// Track objects in the queue of detected frames
void ObjectTracker::trackObjectInQueue(MessageQueue<TrackingMsg> &msgq, MessageQueue<cv::Mat> &outq)
{
    while(!msgq.is_empty())
    {
        // Receive a msg from the queue
        TrackingMsg tracking_msg = msgq.receive();

        // Track the objects
        trackObject(tracking_msg.p, tracking_msg.delT);

        // Get the predicted positions of the tracked objects
        std::vector<Point2D> tracked_points = tracker.getTracks();

        if (tracked_points.size() > 0)
        {
            // Draw the tracking info onto the frame
            addTrackingInfoToFrame(tracking_msg.frame, tracking_msg.p, tracked_points);
        }

        // Send the tracked frame to the output queue
        outq.send(std::move(tracking_msg.frame));
        _numFramesTracked++;

        if(_numFramesTracked%10 == 0)
        {
            std::cout << "Tracked " << _numFramesTracked << " frames" << std::endl;
        }
    }
}

void ObjectTracker::addTrackingInfoToFrame(cv::Mat &frame, std::vector<Point2D> &measured, std::vector<Point2D> &tracked)
{
    // Add a red circle at each tracked location
    for(auto &p: tracked)
    {
        cv::circle(frame, cv::Point(p.x, p.y), 6, cv::Scalar(0,0,255), -1);
    }

    // Add a green circle at each detected location
    for(auto &p: measured)
    {
        cv::circle(frame, cv::Point(p.x, p.y), 3, cv::Scalar(0,255,0), -1);
    }
}
--------------------------------------------------------------------------------
/src/tracker.cpp:
--------------------------------------------------------------------------------
#include <tracker.h>

// Initialize the tracker
void Tracker::init(){
    _numTracks = 0;
    _maxAssociationDist = 0.5;
}

// Add new tracks to the tracker and initialize them with the input
void Tracker::addTracks(std::vector<Point2D> &input, float delT)
{
    for(int i = 0; i < input.size(); i++)
    {
        //std::cout << "Adding Track " << _numTracks << std::endl;
        tracks.push_back(std::make_unique<KalmanTrack>());
        tracks[tracks.size() - 1]->init(input[i], delT);
        //std::cout << "Added Track " << _numTracks << std::endl;
        _numTracks++;
    }
}

// Calculate the assignment using the Hungarian algorithm
std::vector<int> Tracker::associate(std::vector<std::vector<double>> &costmatrix){
    std::vector<int> assignment;
    if(costmatrix.size() == 0)
    {
        return assignment;
    }
    HA.Solve(costmatrix, assignment);
    return assignment;
}
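// Illustrative example for associate() (values invented): with three
// detections (rows) and two live tracks (columns), Solve() picks the
// cheapest one-to-one mapping and marks the leftover row with -1:
//
//   std::vector<std::vector<double>> costmatrix = { {0.2, 3.1},
//                                                   {2.8, 0.4},
//                                                   {1.5, 1.9} };
//   std::vector<int> mapping = associate(costmatrix);
//   // mapping == {0, 1, -1}: detections 0 and 1 update tracks 0 and 1;
//   // detection 2 is unmatched and becomes a new track in updateTracks().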
// Compute the association cost between two points: the Euclidean distance
// weighted by how far the detection departs from the track's velocity estimate
inline double computeCost(Point2D p1, Point2D p2, Point2D vel){
    return (std::sqrt(std::pow(p1.x-p2.x, 2) + std::pow(p1.y-p2.y, 2))*(vel - (p2-p1)).norm());
}

// Find the cost of associating point p with each existing track
std::vector<double> Tracker::findDistToTracks(const Point2D &p){
    std::vector<double> out;
    for(auto &t : tracks){
        out.push_back(computeCost(t->getPos(), p, t->getVel()));
    }
    return out;
}
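// Worked example for computeCost() (values invented): for a track at
// p1 = (0,0) with velocity estimate vel = (1,0) and a detection at p2 = (3,4):
//
//   distance term          : sqrt(3^2 + 4^2)                   = 5
//   velocity-mismatch term : ||vel - (p2 - p1)|| = ||(-2,-4)|| = sqrt(20)
//   cost                   : 5 * sqrt(20)                      ~ 22.36
//
// so a detection is cheap to associate only when it is both near the track
// and consistent with the track's velocity estimate. (This assumes Point2D
// provides operator- and norm(), as used above.)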
// Update the tracker with the new input
void Tracker::updateTracks(std::vector<Point2D> &input, float delT){
    //std::cout<< "--- Updating Tracks | input size = " << input.size() << " tracks " << tracks.size() << std::endl;

    // Run prediction on the available tracks
    for(auto &t : tracks)
        t->predict(delT);

    // Calculate the cost matrix: one row per detection, one column per track
    std::vector<std::vector<double>> costmatrix;
    std::vector<Point2D> new_tracks;
    for (auto i : input){
        std::vector<double> d = findDistToTracks(i);
        costmatrix.push_back(d);
    }

    // Find the mapping using the Hungarian algorithm on the costmatrix
    std::vector<int> mapping;
    try
    {
        mapping = associate(costmatrix);
    }
    catch(...)
    {
        return;
    }

    // Add the new tracks and update the existing tracks based on the mapping
    for(int i = 0; i < mapping.size(); i++){
        if(mapping[i] == -1){
            // no track was available for this detection
            new_tracks.push_back(input[i]);
        }
        else{
            if(costmatrix[i][mapping[i]] < _maxAssociationDist){
                // association is close enough: update the matched track
                tracks[mapping[i]]->update(input[i]);
            }
            else{
                // association cost exceeds _maxAssociationDist: start a new track
                new_tracks.push_back(input[i]);
            }
        }
    }

    // Add the new tracks to the Tracker
    addTracks(new_tracks, delT);

    // Remove the tracks which have not been updated for a long time
    removeTracks();
}

// Remove stale tracks from the tracker
void Tracker::removeTracks(){
    for(int t = 0; t < tracks.size();){
        if(!tracks[t]->isSane()){
            tracks.erase(tracks.begin() + t);
            _numTracks--;
        }
        else{
            t++;
        }
    }
}

// Return the tracked positions of all the available tracks
std::vector<Point2D> Tracker::getTracks(){
    std::vector<Point2D> out;
    for(auto &t : tracks){
        out.push_back(t->getPos());
    }
    return out;
}

// Return the tracked positions and velocities of all the available tracks
std::vector<Point2D> Tracker::getTracks(std::vector<Point2D> &vels){
    std::vector<Point2D> out;
    for(auto &t : tracks){
        out.push_back(t->getPos());
        vels.push_back(t->getVel());
    }
    return out;
}
--------------------------------------------------------------------------------
/src/yolo_object_detector.cpp:
--------------------------------------------------------------------------------
#include "yolo_object_detector.h"

// Constructor
YOLODetector::YOLODetector(std::string modelWeights, std::string modelConfig, std::string classesFilename, float confThresh, float nmsThresh)
{
    _confidenceThresh = confThresh;
    _nonMaxSuppressionThresh = nmsThresh;
    _detector = std::make_unique<cv::dnn::Net>(cv::dnn::readNetFromDarknet(modelConfig, modelWeights));
    _detector->setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
    _detector->setPreferableTarget(cv::dnn::DNN_TARGET_CPU);

    if(_detector->empty())
    {
        std::cout << "Model Load Error" << std::endl;
        exit(-1);
    }
    else
    {
        std::cout << "Model Loaded Successfully!" << std::endl;
    }

    loadClasses(classesFilename);

    // Input blob size; set 416 for better detection at the cost of speed
    _inputWidth = 320;
    _inputHeight = 320;
}
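// loadClasses() is declared in yolo_object_detector.h; its body is not part
// of this file. A minimal sketch of what it has to do, namely read one label
// per line from coco.names, could look like this (illustrative only; the
// member name _classes and the use of <fstream> are assumptions):
//
//   void YOLODetector::loadClasses(std::string classesFilename)
//   {
//       std::ifstream ifs(classesFilename.c_str());
//       std::string line;
//       while (std::getline(ifs, line))
//           _classes.push_back(line);
//       std::cout << "Loaded " << _classes.size() << " Class Names" << std::endl;
//   }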
// Destructor
YOLODetector::~YOLODetector(){}

void YOLODetector::detectObjectInQueue(MessageQueue<cv::Mat> &msgq, MessageQueue<cv::Mat> &outq)
{
    // wait until the msg queue is filled with some frames
    while(msgq.is_empty()) std::this_thread::sleep_for(std::chrono::milliseconds(10));

    while(!msgq.is_empty())
    {
        cv::Mat frame = msgq.receive();
        std::vector<cv::Mat> info;
        this->detectObject(frame, info);
        _numFramesDetected++;
        this->postProcessDetectedObjectFrame(frame, info);

        // Move the detected frame to the output queue
        outq.send(std::move(frame));

        if(_numFramesDetected%10 == 0)
        {
            std::cout << "Detected " << _numFramesDetected << " frames" << std::endl;
        }
    }
}

void YOLODetector::detectObjectInQueueAndTrack(MessageQueue<cv::Mat> &msgq, MessageQueue<TrackingMsg> &trackq)
{
    // wait until the msg queue is filled with some frames
    while(msgq.is_empty()) std::this_thread::sleep_for(std::chrono::milliseconds(10));

    while(!msgq.is_empty())
    {
        cv::Mat frame = msgq.receive();
        std::vector<cv::Mat> info;
        this->detectObject(frame, info);
        _numFramesDetected++;
        this->postProcessDetectedObjectFrame(frame, info);

        // Move the detected frame, the measured object centers and the detection time to the tracking queue
        TrackingMsg trackmsg(frame, _objectCenters, _frameDetectionTime);
        trackq.send(std::move(trackmsg));
        _objectCenters.clear();

        if(_numFramesDetected%10 == 0)
        {
            std::cout << "Detected " << _numFramesDetected << " frames" << std::endl;
        }
    }
}

void YOLODetector::detectObject(cv::Mat &frame, std::vector<cv::Mat> &info)
{
    // Create a blob from the image
    cv::Mat blob;
    cv::dnn::blobFromImage(frame, blob, 1/255.0, cv::Size(_inputWidth, _inputHeight), cv::Scalar(0,0,0), true, false);

    // Set the input to the network
    _detector->setInput(blob);

    // Run the forward pass to get the output of the output layers
    _detector->forward(info, _detector->getUnconnectedOutLayersNames());
}
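// Layout of the forward() outputs consumed below: each row of each output
// Mat is one candidate detection in the Darknet YOLO format
//
//   [center_x, center_y, width, height, objectness, class_score_0, ..., class_score_79]
//
// with the four geometry values normalised to [0,1] relative to the input
// blob. That is why postProcessDetectedObjectFrame() rescales them by
// frame.cols / frame.rows and reads the class scores from column 5 onwards.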
void YOLODetector::postProcessDetectedObjectFrame(cv::Mat &frame, const std::vector<cv::Mat> &info)
{
    this->addDetectionTimeToFrame(frame);

    // vector to hold class ids
    std::vector<int> classIds;

    // vector to hold confidences
    std::vector<float> confidences;

    // vector to hold the bounding box positions
    std::vector<cv::Rect> boxes;

    for(int i = 0; i < info.size(); i++)
    {
        // scan through all the bounding boxes and keep only those with high confidence scores
        float* data = (float*)info[i].data;
        for (int j = 0; j < info[i].rows; ++j, data += info[i].cols)
        {
            cv::Mat scores = info[i].row(j).colRange(5, info[i].cols);
            cv::Point classIdPoint;
            double confidence;

            // Get the value and location of the maximum score
            cv::minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);

            // if the confidence is greater than the threshold, then add the classId, confidence and box to their respective vectors
            if (confidence > _confidenceThresh)
            {
                int centerX = (int)(data[0] * frame.cols);
                int centerY = (int)(data[1] * frame.rows);
                int width = (int)(data[2] * frame.cols);
                int height = (int)(data[3] * frame.rows);
                int left = centerX - width / 2;
                int top = centerY - height / 2;

                classIds.push_back(classIdPoint.x);
                confidences.push_back((float)confidence);
                boxes.push_back(cv::Rect(left, top, width, height));
            }
        }
    }

    // Apply non-maximum suppression to drop overlapping boxes
    std::vector<int> box_indices;
    cv::dnn::NMSBoxes(boxes, confidences, _confidenceThresh, _nonMaxSuppressionThresh, box_indices);

    for (size_t i = 0; i < box_indices.size(); ++i)
    {
        int idx = box_indices[i];
        cv::Rect box = boxes[idx];
        this->drawBoundingBoxToFrame(box.x, box.y, box.x + box.width, box.y + box.height, frame);
        this->addObjectClassToFrame(classIds[idx], confidences[idx], box.x, box.y, frame);

        // store the bounding box center as the measured object position
        Point2D p;
        p.x = (box.x + box.width + box.x)/2;
        p.y = (box.y + box.height + box.y)/2;
        _objectCenters.push_back(p);
    }
}

void YOLODetector::addDetectionTimeToFrame(cv::Mat &frame)
{
    std::vector<double> layersTimes;
    float freq = cv::getTickFrequency() / 1000;
    float t = _detector->getPerfProfile(layersTimes) / freq;
    std::string label = "Inference time for the frame : " + std::to_string(t) + " ms";
    cv::putText(frame, label, cv::Point(0, 15), cv::FONT_HERSHEY_SIMPLEX, 1, cv::Scalar(255, 0, 0));
    _frameDetectionTime = t;
}
--------------------------------------------------------------------------------
/videos/input_video.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AvnishGupta143/MultiThreaded-Object-Detection-and-Tracking-cpp/f27949909b99b31ceb4c353e68732976049c0e2f/videos/input_video.mp4
--------------------------------------------------------------------------------