diff --git a/README.md b/README.md index 33d0f8be..869fa51c 100644 --- a/README.md +++ b/README.md @@ -37,7 +37,7 @@ * [x] Age Gender Recognition * [x] Emotion Recognition * [x] Head Pose Estimation -* [x] Object Segmentation +* [x] Object Segmentation (Semantic & Instance) * [x] Person Re-Identification * [x] Vehicle Attribute Detection * [x] Vehicle License Plate Detection @@ -54,6 +54,7 @@ # Introduction ## Design Architecture +

Architecture Design From the view of hierarchical architecture design, the package is divided into different functional components, as shown in the picture below. ![OpenVINO_Architecture](./data/images/design_arch.PNG "OpenVINO RunTime Architecture") @@ -94,8 +95,10 @@ See [here](https://github.com/openvinotoolkit/openvino) for more about the Intel OpenVINO™ toolkit. - **Optimized Models** provided by the Model Optimizer component of the Intel® OpenVINO™ toolkit. It imports trained models from various frameworks (Caffe*, Tensorflow*, MxNet*, ONNX*, Kaldi*) and converts them to a unified intermediate representation file. It also optimizes topologies through node merging, horizontal fusion, batch-normalization elimination, and quantization, and supports graph freezing and graph summarization along with dynamic input freezing.
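
As a concrete illustration of how such an optimized IR file is consumed at runtime, the sketch below loads and compiles a converted model with the OpenVINO Runtime API 2.0. This is a minimal, self-contained example, not part of this package; the model path `mobilenet-ssd.xml` is only a placeholder for whatever IR the Model Optimizer produced:

```cpp
#include <iostream>
#include "openvino/openvino.hpp"

int main()
{
  ov::Core core;
  // Read the intermediate representation produced by the Model Optimizer
  // (placeholder path; substitute your own converted model).
  std::shared_ptr<ov::Model> model = core.read_model("mobilenet-ssd.xml");

  // Compile the model for a target device and create an inference request.
  ov::CompiledModel compiled = core.compile_model(model, "CPU");
  ov::InferRequest request = compiled.create_infer_request();

  std::cout << "Inputs: " << model->inputs().size()
            << ", Outputs: " << model->outputs().size() << std::endl;
  return 0;
}
```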

+

## Logic Flow +

Logic Flow From the view of logic implementation, the package introduces the definitions of parameter manager, pipeline and pipeline manager. The following picture depicts how these entities work together when the corresponding program is launched. ![Logic_Flow](./data/images/impletation_logic.PNG "OpenVINO RunTime Logic Flow") @@ -119,6 +122,7 @@ The contents in **.yaml config file** should be well structured and follow the s **Pipeline manager** manages all the created pipelines according to the inference requests or external demands (say, a system exception, resource limitation, or an end user's operation). Because it co-works with resource management and is aware of the whole framework, it can optimize performance by sharing system resources between pipelines and reducing data copies.
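
To make this flow concrete, here is a rough C++ sketch of how a sample program ties these entities together. It is illustrative only: it assumes the `Params::ParamManager` and `PipelineManager` singletons declared in this package's headers, and the header paths and config file name are placeholders; the real sample entry point may differ:

```cpp
#include "rclcpp/rclcpp.hpp"
#include "openvino_param_lib/param_manager.hpp"      // assumed header path
#include "openvino_wrapper_lib/pipeline_manager.hpp"

int main(int argc, char * argv[])
{
  rclcpp::init(argc, argv);

  // Parameter manager: parse the .yaml config file into pipeline descriptions.
  auto & params = Params::ParamManager::getInstance();
  params.parse("pipeline_people.yaml");  // placeholder config path
  params.print();

  // Pipeline manager: create one pipeline per description, then run them all.
  for (auto & pipeline_params : params.getPipelines()) {
    PipelineManager::getInstance().createPipeline(pipeline_params);
  }
  PipelineManager::getInstance().runAll();
  PipelineManager::getInstance().joinAll();

  rclcpp::shutdown();
  return 0;
}
```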

+

# Supported Features ## Multiple Input Components @@ -152,12 +156,13 @@ Currently, the corresponding relation of supported inference features, models us |Emotion Recognition| Emotion recognition based on detected face image.|[pipeline_image.yaml](./sample/param/pipeline_image.yaml)
[pipeline_image_video.yaml](./sample/param/pipeline_image_video.yaml)
[pipeline_people.yaml](./sample/param/pipeline_people.yaml)
[pipeline_people_ip.yaml](./sample/param/pipeline_people_ip.yaml)|[emotions-recognition-retail-0003](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/emotions-recognition-retail-0003)| |Age & Gender Recognition| Age and gender recognition based on detected face image.|[pipeline_image.yaml](./sample/param/pipeline_image.yaml)
[pipeline_image_video.yaml](./sample/param/pipeline_image_video.yaml)
[pipeline_people.yaml](./sample/param/pipeline_people.yaml)
[pipeline_people_ip.yaml](./sample/param/pipeline_people_ip.yaml)|[age-gender-recognition-retail-0013](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/age-gender-recognition-retail-0013)| |Head Pose Estimation| Head pose estimation based on detected face image.|[pipeline_image.yaml](./sample/param/pipeline_image.yaml)
[pipeline_image_video.yaml](./sample/param/pipeline_image_video.yaml)
[pipeline_people.yaml](./sample/param/pipeline_people.yaml)
[pipeline_people_ip.yaml](./sample/param/pipeline_people_ip.yaml)|[head-pose-estimation-adas-0001](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/head-pose-estimation-adas-0001)| -|Object Detection| Object detection based on SSD-based trained models.|[pipeline_object.yaml](./sample/param/pipeline_object.yaml)
[pipeline_object_topic.yaml](./sample/param/pipeline_object_topic.yaml)|[mobilenet-ssd](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/mobilenet-ssd)| +|Object Detection| Object detection based on SSD-based trained models.|[pipeline_object.yaml](./sample/param/pipeline_object.yaml)
[pipeline_object_topic.yaml](./sample/param/pipeline_object_topic.yaml)|[mobilenet-ssd](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/mobilenet-ssd)
[yolov5](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/111-yolov5-quantization-migration)
[yolov7](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/226-yolov7-optimization)
[yolov8](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/230-yolov8-optimization)| |Vehicle and License Detection| Vehicle and license detection based on Intel models.|[pipeline_vehicle_detection.yaml](./sample/param/pipeline_vehicle_detection.yaml)|[vehicle-license-plate-detection-barrier-0106](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/vehicle-license-plate-detection-barrier-0106)
[vehicle-attributes-recognition-barrier-0039](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/vehicle-attributes-recognition-barrier-0039)
[license-plate-recognition-barrier-0001](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/license-plate-recognition-barrier-0001)| -|Object Segmentation| Object segmentation.|[pipeline_segmentation.yaml](./sample/param/pipeline_segmentation.yaml)
[pipeline_segmentation_image.yaml](./sample/param/pipeline_segmentation_image.yaml)
[pipeline_video.yaml](./sample/param/pipeline_video.yaml)|[semantic-segmentation-adas-0001](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/semantic-segmentation-adas-0001)
[deeplabv3](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/deeplabv3)| +|Object Segmentation - Semantic| Semantic segmentation: assigns a class label to each pixel in an image. |[pipeline_segmentation.yaml](./sample/param/pipeline_segmentation.yaml)
[pipeline_segmentation_image.yaml](./sample/param/pipeline_segmentation_image.yaml)
[pipeline_video.yaml](./sample/param/pipeline_video.yaml)|[semantic-segmentation-adas-0001](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/semantic-segmentation-adas-0001)
[deeplabv3](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/deeplabv3)| +| Object Segmentation - Instance | Instance segmentation: a combination of semantic segmentation and object detection. | [pipeline_segmentation_instance.yaml](./sample/param/pipeline_segmentation_instance.yaml) | [yolov8-seg](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/230-yolov8-optimization)
[mask_rcnn_inception_v2_coco_2018_01_28](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/mask_rcnn_inception_resnet_v2_atrous_coco)| |Person Attributes| Person attributes based on object detection.|[pipeline_person_attributes.yaml](./sample/param/pipeline_person_attributes.yaml)|[person-attributes-recognition-crossroad-0230](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/person-attributes-recognition-crossroad-0230)
[person-detection-retail-0013](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/person-detection-retail-0013)| |Person Reidentification|Person reidentification based on object detection.|[pipeline_person_reidentification.yaml](./sample/param/pipeline_reidentification.yaml)|[person-detection-retail-0013](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/person-detection-retail-0013)
[person-reidentification-retail-0277](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/intel/person-reidentification-retail-0277)| -|Object Segmentation Maskrcnn| Object segmentation and detection based on maskrcnn model.|[pipeline_segmentation_maskrcnn.yaml](./sample/param/pipeline_segmentation_maskrcnn.yaml)|[mask_rcnn_inception_v2_coco_2018_01_28](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/mask_rcnn_inception_resnet_v2_atrous_coco)| +|Object Segmentation Maskrcnn| Object segmentation and detection based on the Mask R-CNN model. [_Deprecated; it is recommended to try `Object Segmentation - Instance` first._]|[pipeline_segmentation_maskrcnn.yaml](./sample/param/pipeline_segmentation_maskrcnn.yaml)|[mask_rcnn_inception_v2_coco_2018_01_28](https://github.com/openvinotoolkit/open_model_zoo/tree/releases/2022/3/models/public/mask_rcnn_inception_resnet_v2_atrous_coco)|

@@ -212,6 +217,7 @@ OpenCV based image window is natively supported by the package. To enable the window, Image Window output should be added to the output choices in the .yaml config file. Refer to [the config file guidance](./doc/quick_start/yaml_configuration_guide.md) for more information about checking/adding this feature in your launch configuration. ## Demo Result Snapshots +

Demo Snapshots For snapshots of the demo results, refer to the following pictures. * Face detection input from standard camera @@ -225,6 +231,7 @@ For snapshots of the demo results, refer to the following pictures. * Person reidentification input from standard camera ![person_reidentification_demo_video](./data/images/person-reidentification.gif "person reidentification demo video") +

# Installation and Launching ## Deploy in Local Environment @@ -240,11 +247,28 @@ For snapshots of the demo results, refer to the following pictures. * OpenVINO API 2.0: Refer to the OpenVINO document [OpenVINO_api_2.0](https://docs.openvino.ai/latest/openvino_2_0_transition_guide.html) for the latest API 2.0 transition guide. # FAQ -* [How to get the IR file for yolov5?](./doc/quick_start/tutorial_for_yolov5_converted.md) +* How to get the IR file for [yolov5](./doc/quick_start/tutorial_for_yolov5_converted.md) | [yolov7](./doc/quick_start/tutorial_for_yolov7_converted.md) | [yolov8](./doc/quick_start/tutorial_for_yolov8_converted.md)? * [How to build OpenVINO from source?](https://github.com/openvinotoolkit/openvino/wiki#how-to-build) * [How to build RealSense from source?](https://github.com/IntelRealSense/librealsense/blob/master/doc/installation.md) * [What are the basic commands of the Docker CLI?](https://docs.docker.com/engine/reference/commandline/docker/) * [What is the canonical C++ API for interacting with ROS?](https://docs.ros2.org/latest/api/rclcpp/) +

How to change the logging level? + This project provides two logging levels: *DEBUG* & *INFO*.
+ You may follow these steps to change the logging level:
+ + - Update ./openvino_wrapper_lib/CMakeLists.txt by uncommenting (for DEBUG level) or commenting out (for INFO level) this line: + ```code + #add_definitions(-DLOG_LEVEL_DEBUG) + ``` + - Rebuild the project
+ Refer to the corresponding quick-start documents to rebuild this project, e.g.:
+ ```code + source /opt/ros/<ROS_DISTRO>/setup.bash + colcon build --symlink-install + ``` + - Launch the OpenVINO node
+ You will see that the logging output has changed. +
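
For reference, the behavior behind this switch is an ordinary compile-time gate. The toy program below mimics the effect; it is a simplified sketch, and the real gating lives in this package's slog implementation and may differ in detail:

```cpp
#include <iostream>

// Build with -DLOG_LEVEL_DEBUG for the verbose branch,
// or without it to stay at INFO level.
int main()
{
#ifdef LOG_LEVEL_DEBUG
  std::cout << "[DEBUG] verbose logging enabled" << std::endl;
#endif
  std::cout << "[INFO] normal logging output" << std::endl;
  return 0;
}
```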

# Feedback * Report questions, issues and suggestions, using: [issue](https://github.com/intel/ros2_openvino_toolkit/issues). diff --git a/data/labels/object_segmentation/frozen_inference_graph.labels b/data/labels/object_segmentation/frozen_inference_graph.labels index b4427edc..744de27d 100644 --- a/data/labels/object_segmentation/frozen_inference_graph.labels +++ b/data/labels/object_segmentation/frozen_inference_graph.labels @@ -1,3 +1,4 @@ +_background person bicycle car @@ -87,4 +88,4 @@ vase scissors teddy_bear hair_drier -toothbrush \ No newline at end of file +toothbrush diff --git a/openvino_param_lib/src/param_manager.cpp b/openvino_param_lib/src/param_manager.cpp index 89527c95..bebf3d7f 100644 --- a/openvino_param_lib/src/param_manager.cpp +++ b/openvino_param_lib/src/param_manager.cpp @@ -191,6 +191,7 @@ void ParamManager::print() const for (auto & infer : pipeline.infers) { slog::info << "\t\tName: " << infer.name << slog::endl; slog::info << "\t\tModel: " << infer.model << slog::endl; + slog::info << "\t\tModel-Type: " << infer.model_type << slog::endl; slog::info << "\t\tEngine: " << infer.engine << slog::endl; slog::info << "\t\tLabel: " << infer.label << slog::endl; slog::info << "\t\tBatch: " << infer.batch << slog::endl; diff --git a/openvino_wrapper_lib/CMakeLists.txt b/openvino_wrapper_lib/CMakeLists.txt index f6ea6126..131607a5 100644 --- a/openvino_wrapper_lib/CMakeLists.txt +++ b/openvino_wrapper_lib/CMakeLists.txt @@ -30,7 +30,7 @@ set(CMAKE_CXX_FLAGS "-std=c++17 ${CMAKE_CXX_FLAGS}") #################################### ## to get verbose log, ## then, uncomment below line -add_definitions(-DLOG_LEVEL_DEBUG) +#add_definitions(-DLOG_LEVEL_DEBUG) #################################### # environment variable OpenVINO_DIR can be use instead of relaive path to specify location of configuration file @@ -189,6 +189,7 @@ add_library(${PROJECT_NAME} SHARED src/inferences/head_pose_detection.cpp src/inferences/object_segmentation.cpp src/inferences/object_segmentation_maskrcnn.cpp + src/inferences/object_segmentation_instance.cpp src/inferences/person_reidentification.cpp src/inferences/person_attribs_detection.cpp #src/inferences/landmarks_detection.cpp @@ -209,6 +210,8 @@ add_library(${PROJECT_NAME} SHARED src/models/head_pose_detection_model.cpp src/models/object_segmentation_model.cpp src/models/object_segmentation_maskrcnn_model.cpp + src/models/object_segmentation_instance_model.cpp + src/models/object_segmentation_instance_maskrcnn_model.cpp src/models/person_reidentification_model.cpp src/models/person_attribs_detection_model.cpp #src/models/landmarks_detection_model.cpp @@ -217,6 +220,7 @@ add_library(${PROJECT_NAME} SHARED src/models/license_plate_detection_model.cpp src/models/object_detection_ssd_model.cpp src/models/object_detection_yolov5_model.cpp + src/models/object_detection_yolov8_model.cpp src/outputs/image_window_output.cpp src/outputs/ros_topic_output.cpp src/outputs/rviz_output.cpp diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/inferences/object_segmentation_instance.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/inferences/object_segmentation_instance.hpp new file mode 100644 index 00000000..f1dbc548 --- /dev/null +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/inferences/object_segmentation_instance.hpp @@ -0,0 +1,151 @@ +// Copyright (c) 2023 Intel Corporation +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#ifndef OPENVINO_WRAPPER_LIB__INFERENCES__OBJECT_SEGMENTATION_INSTANCE_HPP_ +#define OPENVINO_WRAPPER_LIB__INFERENCES__OBJECT_SEGMENTATION_INSTANCE_HPP_ +#include +#include +#include +#include +#include +#include +#include +#include "openvino_wrapper_lib/models/object_segmentation_instance_model.hpp" +#include "openvino_wrapper_lib/engines/engine.hpp" +#include "openvino_wrapper_lib/inferences/base_inference.hpp" +#include "openvino/openvino.hpp" +#include "opencv2/opencv.hpp" +// namespace +namespace openvino_wrapper_lib +{ +/** + * @class ObjectSegmentationInstanceResult + * @brief Class for storing and processing object segmentation result. + */ +class ObjectSegmentationInstanceResult : public Result +{ +public: + friend class ObjectSegmentationInstance; + explicit ObjectSegmentationInstanceResult(const cv::Rect & location); + inline std::string getLabel() const + { + return label_; + } + inline void setLabel(const std::string& label) + { + label_ = label; + } + /** + * @brief Get the confidence that the detected area is a face. + * @return The confidence value. + */ + inline float getConfidence() const + { + return confidence_; + } + inline void setConfidence(float conf) + { + confidence_ = conf; + } + inline cv::Mat getMask() const + { + return mask_; + } + inline void setMask(const cv::Mat& mask) + { + mask_ = mask; + } + +private: + std::string label_ = ""; + float confidence_ = -1; + cv::Mat mask_; +}; +/** + * @class ObjectSegmentation + * @brief Class to load object segmentation model and perform object segmentation. + */ +class ObjectSegmentationInstance : public BaseInference +{ +public: + using Result = openvino_wrapper_lib::ObjectSegmentationInstanceResult; + explicit ObjectSegmentationInstance(double); + ~ObjectSegmentationInstance() override; + /** + * @brief Load the object segmentation model. + */ + void loadNetwork(std::shared_ptr); + /** + * @brief Enqueue a frame to this class. + * The frame will be buffered but not infered yet. + * @param[in] frame The frame to be enqueued. + * @param[in] input_frame_loc The location of the enqueued frame with respect + * to the frame generated by the input device. + * @return Whether this operation is successful. + */ + bool enqueue(const cv::Mat &, const cv::Rect &) override; + + /** + * @brief Start inference for all buffered frames. + * @return Whether this operation is successful. + */ + bool submitRequest() override; + /** + * @brief This function will fetch the results of the previous inference and + * stores the results in a result buffer array. All buffered frames will be + * cleared. + * @return Whether the Inference object fetches a result this time + */ + bool fetchResults() override; + /** + * @brief Get the length of the buffer result array. + * @return The length of the buffer result array. + */ + int getResultsLength() const override; + /** + * @brief Get the location of result with respect + * to the frame generated by the input device. + * @param[in] idx The index of the result. 
+ */ + const openvino_wrapper_lib::Result * getLocationResult(int idx) const override; + /** + * @brief Show the observed detection result either through image window + or ROS topic. + */ + void observeOutput(const std::shared_ptr & output); + /** + * @brief Get the name of the Inference instance. + * @return The name of the Inference instance. + */ + const std::string getName() const override; + const std::vector getFilteredROIs( + const std::string filter_conditions) const override; + +private: + std::shared_ptr valid_model_; + std::vector results_; + int width_ = 0; + int height_ = 0; + double show_output_thresh_ = 0; + + std::vector colors_ = { + {128, 64, 128}, {232, 35, 244}, {70, 70, 70}, {156, 102, 102}, {153, 153, 190}, + {153, 153, 153}, {30, 170, 250}, {0, 220, 220}, {35, 142, 107}, {152, 251, 152}, + {180, 130, 70}, {60, 20, 220}, {0, 0, 255}, {142, 0, 0}, {70, 0, 0}, + {100, 60, 0}, {90, 0, 0}, {230, 0, 0}, {32, 11, 119}, {0, 74, 111}, + {81, 0, 81} + }; +}; +} // namespace openvino_wrapper_lib +#endif // OPENVINO_WRAPPER_LIB__INFERENCES__OBJECT_SEGMENTATION_HPP_ diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/attributes/base_attribute.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/attributes/base_attribute.hpp index 7f36c061..abf8b3aa 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/attributes/base_attribute.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/attributes/base_attribute.hpp @@ -1,4 +1,4 @@ -// Copyright (c) 2018-2022 Intel Corporation +// Copyright (c) 2018-2023 Intel Corporation // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. @@ -39,14 +39,24 @@ class ModelAttribute { public: using Ptr = std::shared_ptr; + const char* DefaultInputName {"input0"}; + const char* DefaultOutputName = "output0"; struct ModelAttr { - int max_proposal_count = 0; - int object_size = 0; + // Input Tensor Size int input_height = 0; int input_width = 0; - std::string model_name; + + //Input/Output Tensor Info + int input_tensor_count = 1; // The number of input tensors + int output_tensor_count = 1; // The number of output tensors + bool has_confidence_output = true; //Yolov5~7 have a float for confidence, while yolov8 hasn't. 
+ bool need_transpose = false; // If the output tensor needs transpose + int max_proposal_count = 0; // The max number of objects in inference output tensor + int object_size = 0; //The size of each object in inference output tensor std::map input_names; std::map output_names; + + std::string model_name; std::vector labels; }; @@ -62,12 +72,19 @@ class ModelAttribute } inline void printAttribute() { - slog::info << "----Attributes for Model " << attr_.model_name << "----" << slog::endl; + slog::info << "-------------------- Attributes for Model ----------------------" << slog::endl; slog::info << "| model_name: " << attr_.model_name << slog::endl; slog::info << "| max_proposal_count: " << attr_.max_proposal_count << slog::endl; slog::info << "| object_size: " << attr_.object_size << slog::endl; slog::info << "| input_height: " << attr_.input_height << slog::endl; slog::info << "| input_width: " << attr_.input_width << slog::endl; + slog::info << "| input_tensor_count: " << attr_.input_tensor_count << slog::endl; + slog::info << "| output_tensor_count: " << attr_.output_tensor_count << slog::endl; + slog::info << "| need_transpose (max_proposal_count < object_size): " << std::boolalpha + << attr_.need_transpose << slog::endl; + slog::info << "| has_confidence_output: " << std::boolalpha << attr_.has_confidence_output << + slog::endl; + slog::info << "| input_names: " << slog::endl; for (auto & item: attr_.input_names) { slog::info << "| " << item.first << "-->" << item.second << slog::endl; @@ -77,13 +94,31 @@ class ModelAttribute slog::info << "| " << item.first << "-->" << item.second << slog::endl; } + slog::info << "| lables:" << slog::endl; + for (size_t i = 0; i(attr_.input_names.size())){ + slog::info << "--------" << slog::endl; + slog::warn << "The count of input_tensor(s) is not aligned with input names!" + << slog::endl; + } + if( attr_.output_tensor_count != static_cast(attr_.output_names.size())){ + slog::info << "--------" << slog::endl; + slog::warn << "The count of output_tensor(s) is not aligned with output names!" 
+ << slog::endl; + } + slog::info << "-------------------- Attributes for Model ----------------------" << slog::endl; } virtual bool updateLayerProperty( @@ -100,7 +135,7 @@ class ModelAttribute attr_.model_name = name; } - inline std::string getInputName(std::string name = "input") const + inline std::string getInputName(std::string name = "input0") const { auto it = attr_.input_names.find(name); if(it == attr_.input_names.end()){ @@ -111,7 +146,7 @@ class ModelAttribute return it->second; } - inline std::string getOutputName(std::string name = "output") const + inline std::string getOutputName(std::string name = "output0") const { auto it = attr_.output_names.find(name); if(it == attr_.output_names.end()){ @@ -135,10 +170,10 @@ class ModelAttribute inline void loadLabelsFromFile(const std::string file_path) { std::ifstream input_file(file_path); - std::copy(std::istream_iterator(input_file), - std::istream_iterator(), - std::back_inserter(attr_.labels)); + for(std::string name; std::getline(input_file, name);){ + attr_.labels.push_back(name); } + } inline std::vector& getLabels() { @@ -150,6 +185,11 @@ class ModelAttribute attr_.input_names[key] = value; } + inline const std::string getInputInfo(std::string key) + { + return attr_.input_names[key]; + } + inline void addOutputInfo(std::string key, std::string value) { attr_.output_names[key] = value; @@ -185,6 +225,72 @@ class ModelAttribute attr_.object_size = size; } + inline void setHasConfidenceOutput(const bool has) + { + attr_.has_confidence_output = has; + } + + inline bool hasConfidenceOutput() const + { + return attr_.has_confidence_output; + } + + inline void setCountOfInputs(const int count) + { + attr_.input_tensor_count = count; + } + + inline int getCountOfInputs() const + { + return attr_.input_tensor_count; + } + + inline void setCountOfOutputs(const int count) + { + attr_.output_tensor_count = count; + } + + inline int getCountOfOutputs() const + { + return attr_.output_tensor_count; + } + + inline void setTranspose(bool trans) + { + attr_.need_transpose = trans; + } + + inline bool needTranspose() const + { + return attr_.need_transpose; + } + + inline bool _renameMapKeyByValue(std::map& map, + const std::string& value, const std::string& new_key) + { + for (auto& item: map){ + auto n = item.second.find(value); + if (std::string::npos != n) { + //if(item.second.contains(value)){ + auto nh = map.extract(item.first); + nh.key() = new_key; + map.insert(std::move(nh)); + return true; + } + } + + return false; + } + + inline bool retagOutputByValue(const std::string& value, const std::string& new_tag) + { + return _renameMapKeyByValue(attr_.output_names, value, new_tag); + } + + inline bool retagInputByValue(const std::string& value, const std::string& new_tag) + { + return _renameMapKeyByValue(attr_.input_names, value, new_tag); + } protected: ModelAttr attr_; std::string input_tensor_name_; diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/base_model.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/base_model.hpp index ce0a0ac0..fbc84dcb 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/base_model.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/base_model.hpp @@ -107,6 +107,12 @@ namespace Models * @param[in] network_reader The reader of the network to be set. 
*/ virtual bool updateLayerProperty(std::shared_ptr& network_reader) = 0; + + virtual bool matToBlob( + const cv::Mat &orig_image, const cv::Rect &, float scale_factor, + int batch_index, const std::shared_ptr &engine); + + cv::Mat extendFrameToInputRatio(const cv::Mat); ov::Core engine; std::shared_ptr model_; void setFrameSize(const int &w, const int &h) @@ -119,11 +125,57 @@ namespace Models return frame_size_; } + inline void setFrameResizeeRatioWidth(const float r) + { + frame_resize_ratio_width_ = r; + } + + inline void setFrameResizeeRatioHeight(const float r) + { + frame_resize_ratio_height_ = r; + } + + inline float getFrameResizeRatioWidth() const + { + return frame_resize_ratio_width_; + } + + inline float getFrameResizeRatioHeight() const + { + return frame_resize_ratio_height_; + } + + inline void setKeepInputShapeRatio(bool keep) + { + keep_input_shape_ratio_ = keep; + } + + inline bool isKeepInputRatio() const + { + return keep_input_shape_ratio_; + } + + inline void setExpectedFrameSize(cv::Size expect) + { + expected_frame_size_ = expect; + } + + inline cv::Size getExpectedFrameSize() const + { + return expected_frame_size_; + } + private: int max_batch_size_; std::string model_loc_; std::string label_loc_; + + //Information about Input Data cv::Size frame_size_; + cv::Size expected_frame_size_ {224, 224}; + float frame_resize_ratio_width_ = 1.0; + float frame_resize_ratio_height_ = 1.0; + bool keep_input_shape_ratio_ = false; }; class ObjectDetectionModel : public BaseModel @@ -135,9 +187,6 @@ namespace Models std::vector &result, const float &confidence_thresh = 0.3, const bool &enable_roi_constraint = false) = 0; - virtual bool matToBlob( - const cv::Mat &orig_image, const cv::Rect &, float scale_factor, - int batch_index, const std::shared_ptr &engine) = 0; }; } // namespace Models diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov5_model.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov5_model.hpp index 8e2f177e..e5cd757e 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov5_model.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov5_model.hpp @@ -53,21 +53,12 @@ class ObjectDetectionYolov5Model : public ObjectDetectionModel const cv::Mat & frame, const cv::Rect & input_frame_loc) override; - bool matToBlob( - const cv::Mat & orig_image, const cv::Rect &, float scale_factor, - int batch_index, const std::shared_ptr & engine) override; - /** * @brief Get the name of this detection model. * @return Name of the model. */ const std::string getModelCategory() const override; bool updateLayerProperty(std::shared_ptr&) override; - static Resize_t pre_process_ov(const cv::Mat &input_image); - - cv::Mat input_image; - Resize_t resize_img; - }; } // namespace Models #endif // OPENVINO_WRAPPER_LIB__MODELS__OBJECT_DETECTION_YOLOV5_MODEL_HPP_ diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov8_model.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov8_model.hpp new file mode 100644 index 00000000..c3cdd828 --- /dev/null +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_detection_yolov8_model.hpp @@ -0,0 +1,35 @@ +// Copyright (c) 2023 Intel Corporation +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#ifndef OPENVINO_WRAPPER_LIB__MODELS__OBJECT_DETECTION_YOLOV8_MODEL_HPP_ +#define OPENVINO_WRAPPER_LIB__MODELS__OBJECT_DETECTION_YOLOV8_MODEL_HPP_ +#include +#include +#include +#include "openvino_wrapper_lib/models/object_detection_yolov5_model.hpp" + +namespace Models +{ + +class ObjectDetectionYolov8Model : public ObjectDetectionYolov5Model +{ + using Result = openvino_wrapper_lib::ObjectDetectionResult; + +public: + explicit ObjectDetectionYolov8Model(const std::string& label_loc, const std::string & model_loc, + int batch_size = 1); + +}; +} // namespace Models +#endif // OPENVINO_WRAPPER_LIB__MODELS__OBJECT_DETECTION_YOLOV8_MODEL_HPP_ diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_segmentation_instance_maskrcnn_model.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_segmentation_instance_maskrcnn_model.hpp new file mode 100644 index 00000000..8b32826d --- /dev/null +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_segmentation_instance_maskrcnn_model.hpp @@ -0,0 +1,43 @@ +// Copyright (c) 2023 Intel Corporation +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. +#ifndef OPENVINO_WRAPPER_LIB__MODELS__OBJECT_SEGMENTATION_INSTANCE_MASKRCNN_MODEL_HPP_ +#define OPENVINO_WRAPPER_LIB__MODELS__OBJECT_SEGMENTATION_INSTANCE_MASKRCNN_MODEL_HPP_ +#include +#include +#include "openvino_wrapper_lib/models/base_model.hpp" + +namespace Models +{ + +/** + * @class ObjectSegmentationInstanceMaskrcnnModel + * @brief This class generates the object segmentation model. 
+ */ +class ObjectSegmentationInstanceMaskrcnnModel : public ObjectSegmentationInstanceModel +{ + using Result = openvino_wrapper_lib::ObjectSegmentationInstanceResult; +public: + ObjectSegmentationInstanceMaskrcnnModel(const std::string& label_loc, const std::string & model_loc, int batch_size = 1); + + bool fetchResults( + const std::shared_ptr & engine, + std::vector & results, + const float & confidence_thresh = 0.3, + const bool & enable_roi_constraint = false); + + bool updateLayerProperty(std::shared_ptr&) override; + +}; +} // namespace Models +#endif // OPENVINO_WRAPPER_LIB__MODELS__OBJECT_SEGMENTATION_INSTANCE_MASKRCNN_MODEL_HPP_ diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_segmentation_instance_model.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_segmentation_instance_model.hpp new file mode 100644 index 00000000..2a566197 --- /dev/null +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/models/object_segmentation_instance_model.hpp @@ -0,0 +1,56 @@ +// Copyright (c) 2023 Intel Corporation +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. +#ifndef OPENVINO_WRAPPER_LIB__MODELS__OBJECT_SEGMENTATION_INSTANCE_MODEL_HPP_ +#define OPENVINO_WRAPPER_LIB__MODELS__OBJECT_SEGMENTATION_INSTANCE_MODEL_HPP_ +#include +#include +#include "openvino_wrapper_lib/models/base_model.hpp" + +namespace openvino_wrapper_lib +{ + class ObjectSegmentationInstanceResult; +} + +namespace Models +{ + +/** + * @class ObjectSegmentationInstanceModel + * @brief This class generates the object segmentation model. + */ +class ObjectSegmentationInstanceModel : public BaseModel +{ + using Result = openvino_wrapper_lib::ObjectSegmentationInstanceResult; +public: + ObjectSegmentationInstanceModel(const std::string& label_loc, const std::string & model_loc, int batch_size = 1); + + virtual bool fetchResults( + const std::shared_ptr & engine, + std::vector & results, + const float & confidence_thresh = 0.3, + const bool & enable_roi_constraint = false); + + bool enqueue(const std::shared_ptr & ,const cv::Mat &, + const cv::Rect & ) override; + + /** + * @brief Get the name of this segmentation model. + * @return Name of the model. 
+ */ + const std::string getModelCategory() const override; + virtual bool updateLayerProperty(std::shared_ptr&) override; + +}; +} // namespace Models +#endif // OPENVINO_WRAPPER_LIB__MODELS__OBJECT_SEGMENTATION_INSTANCE_MODEL_HPP_ diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/base_output.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/base_output.hpp index 67971a47..150c0a69 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/base_output.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/base_output.hpp @@ -51,6 +51,7 @@ #include "openvino_wrapper_lib/inferences/vehicle_attribs_detection.hpp" #include "openvino_wrapper_lib/inferences/license_plate_detection.hpp" #include "openvino_wrapper_lib/inferences/object_segmentation_maskrcnn.hpp" +#include "openvino_wrapper_lib/inferences/object_segmentation_instance.hpp" #include "opencv2/opencv.hpp" class Pipeline; @@ -115,6 +116,12 @@ class BaseOutput virtual void accept(const std::vector &) { } + /** + * @brief Generate output content according to the object segmentation result for instance models. + */ + virtual void accept(const std::vector &) + { + } /** * @brief Generate output content according to the object detection result. */ diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/image_window_output.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/image_window_output.hpp index d7c39ea1..2ee11789 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/image_window_output.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/image_window_output.hpp @@ -104,6 +104,12 @@ class ImageWindowOutput : public BaseOutput * @param[in] An obejct segmentation result objetc. */ void accept(const std::vector &) override; + /** + * @brief Generate image window output content according to + * the object segmentation instance result. + * @param[in] An obejct segmentation result objetc. + */ + void accept(const std::vector &) override; /** * @brief Generate image window output content according to * the face detection result. @@ -152,6 +158,7 @@ class ImageWindowOutput : public BaseOutput void mergeMask(const std::vector &); void mergeMask(const std::vector &); + void mergeMask(const std::vector &); struct OutputData { diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/ros_topic_output.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/ros_topic_output.hpp index 74285c0e..ff3dc28d 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/ros_topic_output.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/outputs/ros_topic_output.hpp @@ -120,6 +120,12 @@ class RosTopicOutput : public BaseOutput * @param[in] results a bundle of object segmentation maskrcnn results. */ void accept(const std::vector &) override; + /** + * @brief Generate ros topic infomation according to + * the object segmentation result. + * @param[in] results a bundle of object segmentation maskrcnn results. + */ + void accept(const std::vector &) override; /** * @brief Generate ros topic infomation according to * the object detection result. 
diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_manager.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_manager.hpp index e4a3d485..b4bef25d 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_manager.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_manager.hpp @@ -127,6 +127,8 @@ class PipelineManager std::shared_ptr createObjectSegmentationMaskrcnn(const Params::ParamManager::InferenceRawData & infer); std::shared_ptr + createObjectSegmentationInstance(const Params::ParamManager::InferenceRawData & infer); + std::shared_ptr createPersonReidentification(const Params::ParamManager::InferenceRawData & infer); std::shared_ptr createPersonAttribsDetection(const Params::ParamManager::InferenceRawData & infer); diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_params.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_params.hpp index bcb2991a..e1cf80a2 100644 --- a/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_params.hpp +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/pipeline_params.hpp @@ -54,8 +54,12 @@ const char kInferTpye_HeadPoseEstimation[] = "HeadPoseEstimation"; const char kInferTpye_ObjectDetection[] = "ObjectDetection"; const char kInferTpye_ObjectSegmentation[] = "ObjectSegmentation"; const char kInferTpye_ObjectSegmentationMaskrcnn[] = "ObjectSegmentationMaskrcnn"; +const char kInferTpye_ObjectSegmentationInstance[] = "ObjectSegmentationInstance"; +const char kInferTpye_ObjectSegmentationTypeYolo[] = "yolo"; +const char kInferTpye_ObjectSegmentationTypeMaskrcnn[] = "maskrcnn"; const char kInferTpye_ObjectDetectionTypeSSD[] = "SSD"; const char kInferTpye_ObjectDetectionTypeYolov5[] = "yolov5"; +const char kInferTpye_ObjectDetectionTypeYolov8[] = "yolov8"; const char kInferTpye_PersonReidentification[] = "PersonReidentification"; const char kInferTpye_PersonAttribsDetection[] = "PersonAttribsDetection"; const char kInferTpye_LandmarksDetection[] = "LandmarksDetection"; diff --git a/openvino_wrapper_lib/include/openvino_wrapper_lib/utils/common.hpp b/openvino_wrapper_lib/include/openvino_wrapper_lib/utils/common.hpp new file mode 100644 index 00000000..75877d44 --- /dev/null +++ b/openvino_wrapper_lib/include/openvino_wrapper_lib/utils/common.hpp @@ -0,0 +1,385 @@ +// Copyright (C) 2018-2019 Intel Corporation +// SPDX-License-Identifier: Apache-2.0 +// + +/** + * @brief a header file with common samples functionality + * @file common.hpp + */ + +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#ifndef UNUSED + #ifdef _WIN32 + #define UNUSED + #else + #define UNUSED __attribute__((unused)) + #endif +#endif + +template +constexpr std::size_t arraySize(const T (&)[N]) noexcept { + return N; +} + +// Helpers to print IE version information. +// We don't directly define operator<< for InferenceEngine::Version +// and such, because that won't get picked up by argument-dependent lookup +// due to not being in the same namespace as the Version class itself. +// We need ADL to work in order to print these objects using slog. +// So instead, we define wrapper classes and operator<< for those classes. 
+ +class PrintableIeVersion { +public: + using ref_type = const InferenceEngine::Version &; + + PrintableIeVersion(ref_type version) : version(version) {} + + friend std::ostream &operator<<(std::ostream &os, const PrintableIeVersion &p) { + ref_type version = p.version; + + return os << "\t" << version.description << " version ......... " + << IE_VERSION_MAJOR << "." << IE_VERSION_MINOR + << "\n\tBuild ........... " << IE_VERSION_PATCH; + } + +private: + ref_type version; +}; + +inline PrintableIeVersion printable(PrintableIeVersion::ref_type version) { + return { version }; +} + +class PrintableIeVersionMap { +public: + using ref_type = const std::map &; + + PrintableIeVersionMap(ref_type versions) : versions(versions) {} + + friend std::ostream &operator<<(std::ostream &os, const PrintableIeVersionMap &p) { + ref_type versions = p.versions; + + for (const auto &version : versions) { + os << "\t" << version.first << std::endl + << printable(version.second) << std::endl; + } + + return os; + } + +private: + ref_type versions; +}; + +inline PrintableIeVersionMap printable(PrintableIeVersionMap::ref_type versions) { + return { versions }; +} + +/** + * @class Color + * @brief A Color class stores channels of a given color + */ +class Color { +private: + unsigned char _r; + unsigned char _g; + unsigned char _b; + +public: + /** + * A default constructor. + * @param r - value for red channel + * @param g - value for green channel + * @param b - value for blue channel + */ + Color(unsigned char r, + unsigned char g, + unsigned char b) : _r(r), _g(g), _b(b) {} + + inline unsigned char red() const { + return _r; + } + + inline unsigned char blue() const { + return _b; + } + + inline unsigned char green() const { + return _g; + } +}; + +// Known colors for training classes from the Cityscapes dataset +static UNUSED const Color CITYSCAPES_COLORS[] = { + { 128, 64, 128 }, + { 232, 35, 244 }, + { 70, 70, 70 }, + { 156, 102, 102 }, + { 153, 153, 190 }, + { 153, 153, 153 }, + { 30, 170, 250 }, + { 0, 220, 220 }, + { 35, 142, 107 }, + { 152, 251, 152 }, + { 180, 130, 70 }, + { 60, 20, 220 }, + { 0, 0, 255 }, + { 142, 0, 0 }, + { 70, 0, 0 }, + { 100, 60, 0 }, + { 90, 0, 0 }, + { 230, 0, 0 }, + { 32, 11, 119 }, + { 0, 74, 111 }, + { 81, 0, 81 } +}; + +static std::vector> +perfCountersSorted(std::map perfMap) { + using perfItem = std::pair; + std::vector sorted; + for (auto &kvp : perfMap) sorted.push_back(kvp); + + std::stable_sort(sorted.begin(), sorted.end(), + [](const perfItem& l, const perfItem& r) { + return l.second.execution_index < r.second.execution_index; + }); + + return sorted; +} + +static UNUSED void printPerformanceCounts(const std::map& performanceMap, + std::ostream &stream, const std::string &deviceName, + bool bshowHeader = true) { + long long totalTime = 0; + // Print performance counts + if (bshowHeader) { + stream << std::endl << "performance counts:" << std::endl << std::endl; + } + + auto performanceMapSorted = perfCountersSorted(performanceMap); + + for (const auto & it : performanceMapSorted) { + std::string toPrint(it.first); + const int maxLayerName = 30; + + if (it.first.length() >= maxLayerName) { + toPrint = it.first.substr(0, maxLayerName - 4); + toPrint += "..."; + } + + + stream << std::setw(maxLayerName) << std::left << toPrint; + switch (it.second.status) { + case InferenceEngine::InferenceEngineProfileInfo::EXECUTED: + stream << std::setw(15) << std::left << "EXECUTED"; + break; + case InferenceEngine::InferenceEngineProfileInfo::NOT_RUN: + stream << std::setw(15) << 
std::left << "NOT_RUN"; + break; + case InferenceEngine::InferenceEngineProfileInfo::OPTIMIZED_OUT: + stream << std::setw(15) << std::left << "OPTIMIZED_OUT"; + break; + } + stream << std::setw(30) << std::left << "layerType: " + std::string(it.second.layer_type) + " "; + stream << std::setw(20) << std::left << "realTime: " + std::to_string(it.second.realTime_uSec); + stream << std::setw(20) << std::left << "cpu: " + std::to_string(it.second.cpu_uSec); + stream << " execType: " << it.second.exec_type << std::endl; + if (it.second.realTime_uSec > 0) { + totalTime += it.second.realTime_uSec; + } + } + stream << std::setw(20) << std::left << "Total time: " + std::to_string(totalTime) << " microseconds" << std::endl; + std::cout << std::endl; + std::cout << "Full device name: " << deviceName << std::endl; + std::cout << std::endl; +} + +static UNUSED void printPerformanceCounts(InferenceEngine::InferRequest request, std::ostream &stream, std::string deviceName, bool bshowHeader = true) { + auto performanceMap = request.GetPerformanceCounts(); + printPerformanceCounts(performanceMap, stream, deviceName, bshowHeader); +} + +inline std::map getMapFullDevicesNames(InferenceEngine::Core& ie, std::vector devices) { + std::map devicesMap; + InferenceEngine::Parameter p; + for (std::string& deviceName : devices) { + if (deviceName != "") { + try { + p = ie.GetMetric(deviceName, METRIC_KEY(FULL_DEVICE_NAME)); + devicesMap.insert(std::pair(deviceName, p.as())); + } + catch (InferenceEngine::Exception &) { + } + } + } + return devicesMap; +} + +inline std::string getFullDeviceName(std::map& devicesMap, std::string device) { + std::map::iterator it = devicesMap.find(device); + if (it != devicesMap.end()) { + return it->second; + } else { + return ""; + } +} + +inline std::string getFullDeviceName(InferenceEngine::Core& ie, std::string device) { + InferenceEngine::Parameter p; + try { + p = ie.GetMetric(device, METRIC_KEY(FULL_DEVICE_NAME)); + return p.as(); + } + catch (InferenceEngine::Exception &) { + return ""; + } +} + +inline std::size_t getTensorWidth(const InferenceEngine::TensorDesc& desc) { + const auto& layout = desc.getLayout(); + const auto& dims = desc.getDims(); + const auto& size = dims.size(); + if ((size >= 2) && + (layout == InferenceEngine::Layout::NCHW || + layout == InferenceEngine::Layout::NHWC || + layout == InferenceEngine::Layout::NCDHW || + layout == InferenceEngine::Layout::NDHWC || + layout == InferenceEngine::Layout::OIHW || + layout == InferenceEngine::Layout::CHW || + layout == InferenceEngine::Layout::HW)) { + // Regardless of layout, dimensions are stored in fixed order + return dims.back(); + } else { + throw std::runtime_error("Tensor does not have width dimension"); + } + return 0; +} + +inline std::size_t getTensorHeight(const InferenceEngine::TensorDesc& desc) { + const auto& layout = desc.getLayout(); + const auto& dims = desc.getDims(); + const auto& size = dims.size(); + if ((size >= 2) && + (layout == InferenceEngine::Layout::NCHW || + layout == InferenceEngine::Layout::NHWC || + layout == InferenceEngine::Layout::NCDHW || + layout == InferenceEngine::Layout::NDHWC || + layout == InferenceEngine::Layout::OIHW || + layout == InferenceEngine::Layout::CHW || + layout == InferenceEngine::Layout::HW)) { + // Regardless of layout, dimensions are stored in fixed order + return dims.at(size - 2); + } else { + throw std::runtime_error("Tensor does not have height dimension"); + } + return 0; +} + +inline std::size_t getTensorChannels(const InferenceEngine::TensorDesc& 
desc) { + const auto& layout = desc.getLayout(); + if (layout == InferenceEngine::Layout::NCHW || + layout == InferenceEngine::Layout::NHWC || + layout == InferenceEngine::Layout::NCDHW || + layout == InferenceEngine::Layout::NDHWC || + layout == InferenceEngine::Layout::C || + layout == InferenceEngine::Layout::CHW || + layout == InferenceEngine::Layout::NC || + layout == InferenceEngine::Layout::CN) { + // Regardless of layout, dimensions are stored in fixed order + const auto& dims = desc.getDims(); + switch (desc.getLayoutByDims(dims)) { + case InferenceEngine::Layout::C: return dims.at(0); + case InferenceEngine::Layout::NC: return dims.at(1); + case InferenceEngine::Layout::CHW: return dims.at(0); + case InferenceEngine::Layout::NCHW: return dims.at(1); + case InferenceEngine::Layout::NCDHW: return dims.at(1); + case InferenceEngine::Layout::SCALAR: // [[fallthrough]] + case InferenceEngine::Layout::BLOCKED: // [[fallthrough]] + default: + throw std::runtime_error("Tensor does not have channels dimension"); + } + } else { + throw std::runtime_error("Tensor does not have channels dimension"); + } + return 0; +} + +inline std::size_t getTensorBatch(const InferenceEngine::TensorDesc& desc) { + const auto& layout = desc.getLayout(); + if (layout == InferenceEngine::Layout::NCHW || + layout == InferenceEngine::Layout::NHWC || + layout == InferenceEngine::Layout::NCDHW || + layout == InferenceEngine::Layout::NDHWC || + layout == InferenceEngine::Layout::NC || + layout == InferenceEngine::Layout::CN) { + // Regardless of layout, dimensions are stored in fixed order + const auto& dims = desc.getDims(); + switch (desc.getLayoutByDims(dims)) { + case InferenceEngine::Layout::NC: return dims.at(0); + case InferenceEngine::Layout::NCHW: return dims.at(0); + case InferenceEngine::Layout::NCDHW: return dims.at(0); + case InferenceEngine::Layout::CHW: // [[fallthrough]] + case InferenceEngine::Layout::C: // [[fallthrough]] + case InferenceEngine::Layout::SCALAR: // [[fallthrough]] + case InferenceEngine::Layout::BLOCKED: // [[fallthrough]] + default: + throw std::runtime_error("Tensor does not have channels dimension"); + } + } else { + throw std::runtime_error("Tensor does not have channels dimension"); + } + return 0; +} + +inline void showAvailableDevices() { + InferenceEngine::Core ie; + std::vector devices = ie.GetAvailableDevices(); + + std::cout << std::endl; + std::cout << "Available target devices:"; + for (const auto& device : devices) { + std::cout << " " << device; + } + std::cout << std::endl; +} + +inline std::string fileNameNoExt(const std::string &filepath) { + auto pos = filepath.rfind('.'); + if (pos == std::string::npos) return filepath; + return filepath.substr(0, pos); +} + +static inline ov::Layout getLayoutFromShape(const ov::Shape& shape) { + if (shape.size() == 2) { + return "NC"; + } + else if (shape.size() == 3) { + return (shape[0] >= 1 && shape[0] <= 4) ? "CHW" : + "HWC"; + } + else if (shape.size() == 4) { + return (shape[1] >= 1 && shape[1] <= 4) ? 
"NCHW" : + "NHWC"; + } + else { + throw std::runtime_error("Usupported " + std::to_string(shape.size()) + "D shape"); + } +} diff --git a/openvino_wrapper_lib/src/inferences/object_segmentation_instance.cpp b/openvino_wrapper_lib/src/inferences/object_segmentation_instance.cpp new file mode 100644 index 00000000..2355519f --- /dev/null +++ b/openvino_wrapper_lib/src/inferences/object_segmentation_instance.cpp @@ -0,0 +1,135 @@ +// Copyright (c) 2023 Intel Corporation +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#include +#include +#include +#include +#include +#include +#include "openvino_wrapper_lib/inferences/object_segmentation_instance.hpp" +#include "openvino_wrapper_lib/outputs/base_output.hpp" +#include "openvino_wrapper_lib/slog.hpp" + +openvino_wrapper_lib::ObjectSegmentationInstanceResult::ObjectSegmentationInstanceResult(const cv::Rect &location) + : Result(location) +{ +} + +openvino_wrapper_lib::ObjectSegmentationInstance::ObjectSegmentationInstance(double show_output_thresh) + : show_output_thresh_(show_output_thresh), openvino_wrapper_lib::BaseInference() +{ +} + +openvino_wrapper_lib::ObjectSegmentationInstance::~ObjectSegmentationInstance() = default; + +void openvino_wrapper_lib::ObjectSegmentationInstance::loadNetwork( + const std::shared_ptr network) +{ + slog::info << "Loading Network: " << network->getModelCategory() << slog::endl; + valid_model_ = network; + setMaxBatchSize(network->getMaxBatchSize()); +} + +bool openvino_wrapper_lib::ObjectSegmentationInstance::enqueue( + const cv::Mat &frame, + const cv::Rect &input_frame_loc) +{ + if (width_ == 0 && height_ == 0) + { + width_ = frame.cols; + height_ = frame.rows; + } + + if (valid_model_ == nullptr || getEngine() == nullptr) + { + throw std::logic_error("Model or Engine is not set correctly!"); + return false; + } + + if (enqueued_frames_ >= valid_model_->getMaxBatchSize()) + { + slog::warn << "Number of " << getName() << "input more than maximum(" << + max_batch_size_ << ") processed by inference" << slog::endl; + return false; + } + + if (!valid_model_->enqueue(getEngine(), frame, input_frame_loc)) + { + return false; + } + + enqueued_frames_ += 1; + return true; +} + +bool openvino_wrapper_lib::ObjectSegmentationInstance::submitRequest() +{ + return openvino_wrapper_lib::BaseInference::submitRequest(); +} + +bool openvino_wrapper_lib::ObjectSegmentationInstance::fetchResults() +{ + + bool can_fetch = openvino_wrapper_lib::BaseInference::fetchResults(); + if (!can_fetch) + { + return false; + } + results_.clear(); + + return (valid_model_ != nullptr) && valid_model_->fetchResults( + getEngine(), results_, show_output_thresh_); +} + +int openvino_wrapper_lib::ObjectSegmentationInstance::getResultsLength() const +{ + return static_cast(results_.size()); +} + +const openvino_wrapper_lib::Result * +openvino_wrapper_lib::ObjectSegmentationInstance::getLocationResult(int idx) const +{ + return &(results_[idx]); +} + +const std::string openvino_wrapper_lib::ObjectSegmentationInstance::getName() const +{ + 
return valid_model_->getModelCategory(); +} + +void openvino_wrapper_lib::ObjectSegmentationInstance::observeOutput( + const std::shared_ptr &output) +{ + if (output != nullptr) + { + output->accept(results_); + } +} + +const std::vector openvino_wrapper_lib::ObjectSegmentationInstance::getFilteredROIs( + const std::string filter_conditions) const +{ + if (!filter_conditions.empty()) + { + slog::err << "Object segmentation does not support filtering now! " + << "Filter conditions: " << filter_conditions << slog::endl; + } + std::vector filtered_rois; + for (auto res : results_) + { + filtered_rois.push_back(res.getLocation()); + } + return filtered_rois; +} diff --git a/openvino_wrapper_lib/src/models/age_gender_detection_model.cpp b/openvino_wrapper_lib/src/models/age_gender_detection_model.cpp index ac3eea86..c15eda7f 100644 --- a/openvino_wrapper_lib/src/models/age_gender_detection_model.cpp +++ b/openvino_wrapper_lib/src/models/age_gender_detection_model.cpp @@ -48,7 +48,7 @@ bool Models::AgeGenderDetectionModel::updateLayerProperty( return false; } - addInputInfo("input", input_tensor_name_); + addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_); const ov::Layout tensor_layout{"NCHW"}; input_info.tensor(). set_element_type(ov::element::f32). diff --git a/openvino_wrapper_lib/src/models/attributes/ssd_model_attr.cpp b/openvino_wrapper_lib/src/models/attributes/ssd_model_attr.cpp index 1340d490..f79b5d07 100644 --- a/openvino_wrapper_lib/src/models/attributes/ssd_model_attr.cpp +++ b/openvino_wrapper_lib/src/models/attributes/ssd_model_attr.cpp @@ -44,7 +44,7 @@ bool Models::SSDModelAttr::updateLayerProperty( input_tensor_name_ = model->input().get_any_name(); ov::preprocess::InputInfo& input_info = ppp.input(input_tensor_name_); input_info.tensor().set_element_type(ov::element::u8); - addInputInfo("input", input_tensor_name_); + addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_); ov::Shape input_dims = input_info_map[0].get_shape(); setInputHeight(input_dims[2]); @@ -60,7 +60,7 @@ bool Models::SSDModelAttr::updateLayerProperty( } ov::preprocess::OutputInfo& output_info = ppp.output(); - addOutputInfo("output", model->output().get_any_name()); + addOutputInfo(ModelAttribute::DefaultOutputName, model->output().get_any_name()); slog::info << "Checking Object Detection output ... 
Name=" << model->output().get_any_name() << slog::endl; diff --git a/openvino_wrapper_lib/src/models/base_model.cpp b/openvino_wrapper_lib/src/models/base_model.cpp index 70647b5a..e7746504 100644 --- a/openvino_wrapper_lib/src/models/base_model.cpp +++ b/openvino_wrapper_lib/src/models/base_model.cpp @@ -25,6 +25,9 @@ #include #include "openvino_wrapper_lib/models/base_model.hpp" #include "openvino_wrapper_lib/slog.hpp" +#include "openvino_wrapper_lib/utils/common.hpp" +#include "openvino_wrapper_lib/engines/engine.hpp" +#include "openvino_wrapper_lib/models/attributes/base_attribute.hpp" #include "openvino_wrapper_lib/models/attributes/base_attribute.hpp" // Validated Base Network @@ -63,29 +66,117 @@ void Models::BaseModel::modelInit() updateLayerProperty(model_); } -#if 0 +Models::ObjectDetectionModel::ObjectDetectionModel( + const std::string& label_loc, + const std::string& model_loc, + int max_batch_size) +: BaseModel(label_loc, model_loc, max_batch_size) {} + +bool Models::BaseModel::matToBlob( + const cv::Mat & orig_image, const cv::Rect &, float scale_factor, + int batch_index, const std::shared_ptr & engine) +{ + if (engine == nullptr) + { + slog::err << "A frame is trying to be enqueued in a NULL Engine." << slog::endl; + return false; + } + + ov::InferRequest infer_request = engine->getRequest(); + ov::Tensor input_tensor = infer_request.get_tensor(getInputName("input0")); + ov::Shape input_shape = input_tensor.get_shape(); + + OPENVINO_ASSERT(input_shape.size() == 4); + const auto layout = getLayoutFromShape(input_shape); + const size_t width = input_shape[ov::layout::width_idx(layout)]; //input_shape[2]; + const size_t height = input_shape[ov::layout::height_idx(layout)]; //input_shape[1]; + const size_t channels = input_shape[ov::layout::channels_idx(layout)]; //input_shape[3]; + + slog::debug <<"width is:"<< width << slog::endl; + slog::debug <<"height is:"<< height << slog::endl; + slog::debug <<"channels is:"<< channels << slog::endl; + slog::debug <<"origin channels is:"<< orig_image.channels() << slog::endl; + slog::debug <<"input shape is:"<< input_shape << slog::endl; + + unsigned char* data = input_tensor.data(); + cv::Size size = {(int)width, (int)height}; + cv::Mat resized_image(size, CV_8UC3, data); + + if ( isKeepInputRatio()){ + slog::debug << "keep Input Shape Ratio is ENABLED!" 
+    cv::Mat extend_image = extendFrameToInputRatio(orig_image);
+    cv::resize(extend_image, resized_image, size);
+    frame_resize_ratio_width_ = static_cast<float>(extend_image.cols) / width;
+    frame_resize_ratio_height_ = static_cast<float>(extend_image.rows) / height;
+  } else {
+    cv::resize(orig_image, resized_image, size);
+    frame_resize_ratio_width_ = static_cast<float>(orig_image.cols) / width;
+    frame_resize_ratio_height_ = static_cast<float>(orig_image.rows) / height;
+  }
+
+  return true;
+}
+
+cv::Mat Models::BaseModel::extendFrameToInputRatio(const cv::Mat orig)
+{
+  auto orig_width = orig.cols;
+  auto orig_height = orig.rows;
+  const auto target_width = getInputWidth();
+  const auto target_height = getInputHeight();
+  const float orig_ratio = static_cast<float>(orig_width) / orig_height;
+  const float target_ratio = static_cast<float>(target_width) / target_height;
+
+  slog::debug << "extend Ratio: orig_ratio:" << orig_ratio << ", target_ratio:" << target_ratio <<
+    ", orig_width:" << orig_width << ", orig_height:" << orig_height << slog::endl;
+  if (orig_ratio < target_ratio) {
+    orig_width = (int)(orig_height * target_ratio);
+  } else {
+    orig_height = (int)(orig_width / target_ratio);
+  }
+
+  slog::debug << "extend Image to: " << orig_width << "x" << orig_height << slog::endl;
+  cv::Mat result = cv::Mat::zeros(orig_height, orig_width, CV_8UC3);
+  orig.copyTo(result(cv::Rect(0, 0, orig.cols, orig.rows)));
+
+  return result;
+}
+
 bool Models::BaseModel::updateLayerProperty(
-  InferenceEngine::CNNNetReader::Ptr model)
+  std::shared_ptr<ov::Model> & model)
 {
-#if 0
-  if (!updateLayerProperty(model)){
-    slog::warn << "The model(name: " << getModelName() << ") failed to update Layer Property!"
+  slog::info << "Checking INPUTS & OUTPUTS for Model " << getModelName() << slog::endl;
+  inputs_info_ = model->inputs();
+  slog::debug << "input size=" << inputs_info_.size() << slog::endl;
+  auto outputs_info = model->outputs();
+  slog::debug << "output size=" << outputs_info.size() << slog::endl;
diff --git a/openvino_wrapper_lib/src/models/object_detection_ssd_model.cpp b/openvino_wrapper_lib/src/models/object_detection_ssd_model.cpp
index 2c938ca8..f9cb5448 100644
--- a/openvino_wrapper_lib/src/models/object_detection_ssd_model.cpp
+++ b/openvino_wrapper_lib/src/models/object_detection_ssd_model.cpp
@@ -162,7 +162,7 @@ bool Models::ObjectDetectionSSDModel::updateLayerProperty(
   ov::preprocess::InputInfo& input_info = ppp.input(input_tensor_name_);
   input_info.tensor().set_element_type(ov::element::u8);
-  addInputInfo("input", input_tensor_name_);
+  addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_);
 
   ov::Shape input_dims = input_info_map[0].get_shape();
@@ -193,7 +193,7 @@ bool Models::ObjectDetectionSSDModel::updateLayerProperty(
     return false;
   }
   ov::preprocess::OutputInfo& output_info = ppp.output();
-  addOutputInfo("output", model->output().get_any_name());
+  addOutputInfo(ModelAttribute::DefaultOutputName, model->output().get_any_name());
   slog::info << "Checking Object Detection output ... Name=" << model->output().get_any_name()
     << slog::endl;
diff --git a/openvino_wrapper_lib/src/models/object_detection_yolov5_model.cpp b/openvino_wrapper_lib/src/models/object_detection_yolov5_model.cpp
index 8dd7176d..b89fda66 100644
--- a/openvino_wrapper_lib/src/models/object_detection_yolov5_model.cpp
+++ b/openvino_wrapper_lib/src/models/object_detection_yolov5_model.cpp
@@ -1,4 +1,4 @@
-// Copyright (c) 2022 Intel Corporation
+// Copyright (c) 2022-2023 Intel Corporation
 //
 // Licensed under the Apache License, Version 2.0 (the "License");
 // you may not use this file except in compliance with the License.
@@ -13,77 +13,81 @@
 // limitations under the License.
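// NOTE (editor's sketch, not part of the patch): a quick worked example of the
// keep-ratio extension implemented by extendFrameToInputRatio() above,
// assuming a 1280x720 frame and a 640x640 input tensor:
//
//   orig_ratio   = 1280 / 720 ≈ 1.78
//   target_ratio = 640 / 640  = 1.0
//   orig_ratio > target_ratio -> extend the height: 1280 / 1.0 = 1280
//
// The frame is zero-padded to 1280x1280, then resized to 640x640, giving
// frame_resize_ratio_width_ = frame_resize_ratio_height_ = 2.0, which
// fetchResults() later uses to map detected boxes back onto the original image.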
/** - * @brief a header file with declaration of ObjectDetectionModel class + * @brief a header file with declaration of ObjectDetectionYolov5Model class * @file object_detection_yolov5_model.cpp */ #include #include #include #include +#include #include "openvino_wrapper_lib/slog.hpp" +#include "openvino_wrapper_lib/utils/common.hpp" #include "openvino_wrapper_lib/engines/engine.hpp" #include "openvino_wrapper_lib/inferences/object_detection.hpp" #include "openvino_wrapper_lib/models/object_detection_yolov5_model.hpp" +using namespace cv; +using namespace dnn; // Validated Object Detection Network Models::ObjectDetectionYolov5Model::ObjectDetectionYolov5Model( const std::string & label_loc, const std::string & model_loc, int max_batch_size) : ObjectDetectionModel(label_loc, model_loc, max_batch_size) { + setKeepInputShapeRatio(true); } bool Models::ObjectDetectionYolov5Model::updateLayerProperty( std::shared_ptr& model) { - slog::info << "Checking INPUTs for model " << getModelName() << slog::endl; - auto input_info_map = model->inputs(); - if (input_info_map.size() != 1) { - slog::warn << "This model seems not Yolo-like, which has only one input, but we got " - << std::to_string(input_info_map.size()) << "inputs" << slog::endl; - return false; - } - // set input property - ov::Shape input_dims = input_info_map[0].get_shape(); + Models::BaseModel::updateLayerProperty(model); + ov::preprocess::PrePostProcessor ppp = ov::preprocess::PrePostProcessor(model); - input_tensor_name_ = model->input().get_any_name(); - ov::preprocess::InputInfo& input_info = ppp.input(input_tensor_name_); - const ov::Layout input_tensor_layout{"NHWC"}; - setInputHeight(input_dims[2]); - setInputWidth(input_dims[3]); + + // preprocess image inputs + ov::preprocess::InputInfo& input_info = ppp.input(getInputInfo("input0")); + ov::Layout tensor_layout = ov::Layout("NHWC"); + + if( model->input(0).get_partial_shape().is_dynamic()){ + auto expected_size = getExpectedFrameSize(); + slog::info << "Model's input has dynamic shape, set to expected size: " + << expected_size << slog::endl; + input_info.tensor().set_shape({1, expected_size.height, expected_size.width, 3}); + } + input_info.tensor(). - set_element_type(ov::element::u8). - set_layout(input_tensor_layout). - set_color_format(ov::preprocess::ColorFormat::BGR); + set_element_type(ov::element::u8). + set_layout(tensor_layout). + set_color_format(ov::preprocess::ColorFormat::BGR); + input_info.preprocess(). convert_element_type(ov::element::f32). convert_color(ov::preprocess::ColorFormat::RGB).scale({255., 255., 255.}); ppp.input().model().set_layout("NCHW"); - addInputInfo("input", input_tensor_name_); - // set output property - auto output_info_map = model->outputs(); - if (output_info_map.size() != 1) { - slog::warn << "This model seems not Yolo-like! We got " - << std::to_string(output_info_map.size()) << "outputs, but Yolov5 has only one." - << slog::endl; - return false; - } - output_tensor_name_ = model->output().get_any_name(); - ov::preprocess::OutputInfo& output_info = ppp.output(); - addOutputInfo("output", output_tensor_name_); - output_info.tensor().set_element_type(ov::element::f32); - slog::info << "Checking Object Detection output ... 
Name=" << output_tensor_name_ - << slog::endl; + ppp.output().tensor().set_element_type(ov::element::f32); model = ppp.build(); - ov::Shape output_dims = output_info_map[0].get_shape(); - setMaxProposalCount(static_cast(output_dims[1])); - - auto object_size = static_cast(output_dims[2]); - setObjectSize(object_size); + ov::Shape input_shape = model->input(getInputInfo("input0")).get_shape(); + slog::debug<<"image_tensor shape is:"<< input_shape.size() <outputs(); + ov::Shape output_dims = output_info_map[0].get_shape(); + if (output_dims[1] < output_dims[2]){ + slog::info << "Object-Size bigger than Proposal-Count, Outputs need Transform!" << slog::endl; + setTranspose(true); + setMaxProposalCount(static_cast(output_dims[2])); + setObjectSize(static_cast(output_dims[1])); + } else { + setTranspose(false); + setMaxProposalCount(static_cast(output_dims[1])); + setObjectSize(static_cast(output_dims[2])); + } printAttribute(); slog::info << "This model is Yolo-like, Layer Property updated!" << slog::endl; return true; @@ -107,26 +111,6 @@ bool Models::ObjectDetectionYolov5Model::enqueue( return true; } - -bool Models::ObjectDetectionYolov5Model::matToBlob( - const cv::Mat & orig_image, const cv::Rect &, float scale_factor, - int batch_index, const std::shared_ptr & engine) -{ - resize_img = pre_process_ov(orig_image); - input_image = orig_image; - - size_t height = resize_img.resized_image.rows; - size_t width = resize_img.resized_image.cols; - size_t channels = orig_image.channels(); - auto *input_data = (float *) resize_img.resized_image.data; - - ov::Tensor input_tensor; - input_tensor = ov::Tensor(ov::element::u8, {1, height, width, channels}, input_data); - engine->getRequest().set_input_tensor(input_tensor); - - return true; -} - bool Models::ObjectDetectionYolov5Model::fetchResults( const std::shared_ptr & engine, std::vector & results, @@ -140,19 +124,35 @@ bool Models::ObjectDetectionYolov5Model::fetchResults( const ov::Tensor &output_tensor = request.get_output_tensor(); ov::Shape output_shape = output_tensor.get_shape(); auto *detections = output_tensor.data(); + int rows = output_shape.at(1); + int dimentions = output_shape.at(2); + Mat output_buffer(output_shape[1], output_shape[2], CV_32F, detections); + //Check if transpose is needed + if (output_shape.at(2) > output_shape.at(1) && + output_shape.at(2) > 300){ // 300 is just a random number(bigger than the number of classes) + transpose(output_buffer, output_buffer); //[8400,84] for yolov8 + detections = (float*)output_buffer.data; + rows = output_shape.at(2); + dimentions = output_shape.at(1); + } + //slog::debug << "AFTER calibration: rows->" << rows << ", dimentions->" << dimentions << slog::endl; + std::vector boxes; std::vector class_ids; std::vector confidences; std::vector & labels = getLabels(); - for (size_t i = 0; i < output_shape.at(1); i++) { - float *detection = &detections[i * output_shape.at(2)]; - float confidence = detection[4]; - if (confidence < confidence_thresh) - continue; + for (size_t i = 0; i < rows; i++) { + float *detection = &detections[i * dimentions]; + if (hasConfidenceOutput()) { + float confidence = detection[4]; + if (confidence < confidence_thresh) + continue; + } - float *classes_scores = &detection[5]; - int col = static_cast(output_shape.at(2) - 5); + const int classes_scores_start_pos = hasConfidenceOutput()? 
+    float * classes_scores = &detection[classes_scores_start_pos];
+    int col = static_cast<int>(dimensions - classes_scores_start_pos);
 
     cv::Mat scores(1, col, CV_32FC1, classes_scores);
     cv::Point class_id;
@@ -160,7 +160,7 @@ bool Models::ObjectDetectionYolov5Model::fetchResults(
     cv::minMaxLoc(scores, nullptr, &max_class_score, nullptr, &class_id);
 
     if (max_class_score > confidence_thresh) {
-      confidences.emplace_back(confidence);
+      confidences.emplace_back(max_class_score);
       class_ids.emplace_back(class_id.x);
 
       float x = detection[0];
@@ -177,12 +177,12 @@ bool Models::ObjectDetectionYolov5Model::fetchResults(
   std::vector<int> nms_result;
   cv::dnn::NMSBoxes(boxes, confidences, confidence_thresh, NMS_THRESHOLD, nms_result);
   for (int idx : nms_result) {
-    double rx = (double) input_image.cols / (resize_img.resized_image.cols - resize_img.dw);
-    double ry = (double) input_image.rows / (resize_img.resized_image.rows - resize_img.dh);
-    double vx = rx * boxes[idx].x;
-    double vy = ry * boxes[idx].y;
-    double vw = rx * boxes[idx].width;
-    double vh = ry * boxes[idx].height;
+    double rx = getFrameResizeRatioWidth();
+    double ry = getFrameResizeRatioHeight();
+    int vx = int(rx * boxes[idx].x);
+    int vy = int(ry * boxes[idx].y);
+    int vw = int(rx * boxes[idx].width);
+    int vh = int(ry * boxes[idx].height);
     cv::Rect rec(vx, vy, vw, vh);
     Result result(rec);
     result.setConfidence(confidences[idx]);
@@ -194,31 +194,3 @@ bool Models::ObjectDetectionYolov5Model::fetchResults(
 
   return true;
 }
-
-Models::Resize_t Models::ObjectDetectionYolov5Model::pre_process_ov(const cv::Mat &input_image) {
-  const float INPUT_WIDTH = 640.f;
-  const float INPUT_HEIGHT = 640.f;
-  auto width = (float) input_image.cols;
-  auto height = (float) input_image.rows;
-  auto r = float(INPUT_WIDTH / std::max(width, height));
-  int new_unpadW = int(round(width * r));
-  int new_unpadH = int(round(height * r));
-  Resize_t resize_img{};
-
-  cv::resize(input_image, resize_img.resized_image, {new_unpadW, new_unpadH},
-    0, 0, cv::INTER_AREA);
-
-  resize_img.dw = (int) INPUT_WIDTH - new_unpadW;
-  resize_img.dh = (int) INPUT_HEIGHT - new_unpadH;
-  cv::Scalar color = cv::Scalar(100, 100, 100);
-  cv::copyMakeBorder(resize_img.resized_image,
-    resize_img.resized_image,
-    0,
-    resize_img.dh,
-    0,
-    resize_img.dw,
-    cv::BORDER_CONSTANT,
-    color);
-
-  return resize_img;
-}
diff --git a/openvino_wrapper_lib/src/models/object_detection_yolov8_model.cpp b/openvino_wrapper_lib/src/models/object_detection_yolov8_model.cpp
new file mode 100644
index 00000000..c37c6777
--- /dev/null
+++ b/openvino_wrapper_lib/src/models/object_detection_yolov8_model.cpp
@@ -0,0 +1,25 @@
+// Copyright (c) 2023 Intel Corporation
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
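// NOTE (editor's sketch, not part of the patch): the transpose logic above
// exists because the two supported YOLO generations lay out their detection
// output differently. Roughly, for a 640x640, 80-class COCO model:
//
//   yolov5: [1, 25200, 85] -> rows are proposals, cols are [x,y,w,h,conf,80 scores]
//   yolov8: [1, 84, 8400]  -> rows are [x,y,w,h,80 scores], cols are proposals,
//                             with no separate objectness/confidence column
//
// Transposing the yolov8 tensor to [8400, 84] lets both variants share one
// parsing loop; hasConfidenceOutput() then selects whether class scores start
// at column 5 (yolov5) or column 4 (yolov8).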
+ +#include "openvino_wrapper_lib/models/object_detection_yolov8_model.hpp" + +Models::ObjectDetectionYolov8Model::ObjectDetectionYolov8Model( + const std::string & label_loc, const std::string & model_loc, int max_batch_size) +: ObjectDetectionYolov5Model(label_loc, model_loc, max_batch_size) +{ + //setKeepInputShapeRatio(true); + setHasConfidenceOutput(false); + setExpectedFrameSize({640, 640}); +} + diff --git a/openvino_wrapper_lib/src/models/object_segmentation_instance_maskrcnn_model.cpp b/openvino_wrapper_lib/src/models/object_segmentation_instance_maskrcnn_model.cpp new file mode 100644 index 00000000..531f5971 --- /dev/null +++ b/openvino_wrapper_lib/src/models/object_segmentation_instance_maskrcnn_model.cpp @@ -0,0 +1,221 @@ +// Copyright (c) 2023 Intel Corporation +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +/** + * @brief a source code file with declaration of ObjectSegmentationInstanceMaskrcnnModel class + * It is a child of class ObjectSegmentationInstanceModel. + */ +#include +#include +#include +#include "openvino_wrapper_lib/inferences/object_segmentation_instance.hpp" +#include "openvino_wrapper_lib/utils/common.hpp" +#include "openvino_wrapper_lib/models/object_segmentation_instance_maskrcnn_model.hpp" +#include "openvino_wrapper_lib/slog.hpp" +#include "openvino_wrapper_lib/engines/engine.hpp" + +// Validated Object Segmentation Network +Models::ObjectSegmentationInstanceMaskrcnnModel::ObjectSegmentationInstanceMaskrcnnModel( + const std::string & label_loc, + const std::string & model_loc, + int max_batch_size) + : ObjectSegmentationInstanceModel(label_loc, model_loc, max_batch_size) +{ + setHasConfidenceOutput(true); + setKeepInputShapeRatio(true); + setCountOfInputs(2); + setCountOfOutputs(2); + setExpectedFrameSize({640, 360}); +} + +bool Models::ObjectSegmentationInstanceMaskrcnnModel::updateLayerProperty( + std::shared_ptr& model) +{ + Models::BaseModel::updateLayerProperty(model); + + slog::debug << "in Models' PrePostProcessor:" << slog::endl; + ov::preprocess::PrePostProcessor ppp = ov::preprocess::PrePostProcessor(model); + slog::debug << "Model's input size=" << model->inputs().size() << slog::endl; + //1. preprocess image inputs + for(int i=0; i" << getInputInfo(name) << slog::endl; + auto& input_info = ppp.input(getInputInfo(name)); + ov::Layout tensor_layout = ov::Layout("NHWC"); + auto input_shape = model->input(getInputInfo(name)).get_partial_shape(); + + if (input_shape.size() == 4) { // first input contains images + slog::debug << "handling Input[image_tensor]..." << slog::endl; + input_info.tensor(). + set_element_type(ov::element::u8). 
+      //addInputInfo(ModelAttribute::DefaultInputName, name);
+      //retagInputByValue(getInputInfo(name), "image_tensor");
+
+      if (input_shape.is_dynamic()) {
+        auto expected_size = getExpectedFrameSize();
+        slog::info << "Model's input has dynamic shape, fix it to " << expected_size << slog::endl;
+        input_info.tensor().set_shape({1, expected_size.height, expected_size.width, 3});
+      }
+    } else if (input_shape.size() == 2) {  // second input contains image info
+      slog::debug << "handling Input[image_info]..." << slog::endl;
+      input_info.tensor().set_element_type(ov::element::f32);
+      //addInputInfo("input2", info_name_);
+      //retagInputByValue(getInputInfo(name), "image_info");
+    } else {
+      throw std::logic_error("Unsupported input shape with size = " + std::to_string(input_shape.size()));
+    }
+  }
+
+  //ppp.input(0).model().set_layout("NCHW");
+
+  model = ppp.build();
+  ppp = ov::preprocess::PrePostProcessor(model);
+
+  ov::Shape input0_shape = model->input(getInputName("input0")).get_shape();
+  slog::debug << "image_tensor shape is:" << input0_shape.size() << slog::endl;
+
+  //2. postprocess outputs: find and retag the mask and detection tensors
+  auto check_output_and_rename = [&](const std::string & output) {
+    auto output_info = model->output(output);
+    auto shape = output_info.get_partial_shape();
+    slog::info << "Output shape for [" << output << "] is: " << shape << slog::endl;
+    if (shape.size() == 4) {
+      slog::info << "found output tensor - [masks]" << slog::endl;
+      retagOutputByValue(output, "masks");
+    } else if (shape.size() == 2) {
+      slog::info << "found output tensor - [detection]" << slog::endl;
+      retagOutputByValue(output, "detection");
+    } else {
+      throw std::logic_error("The shape of output tensors is wrong, must be 4 or 2!");
+    }
+  };
+  check_output_and_rename(getOutputName("output0"));
+  check_output_and_rename(getOutputName("output1"));
+
+  ov::preprocess::OutputInfo & output_info = ppp.output(getOutputName("masks"));
+  output_info.tensor().set_element_type(ov::element::f32);
+
+  model = ppp.build();
+
+  if (model->is_dynamic()) {
+    slog::warn << "Model is still dynamic !!!!" << slog::endl;
+  } else {
+    auto output_info_map = model->outputs();
+    ov::Shape output_dims = output_info_map[0].get_shape();
+    if (output_dims[1] < output_dims[2]) {
+      slog::info << "Object-Size bigger than Proposal-Count, Outputs need Transform!" << slog::endl;
+      setTranspose(true);
+      setMaxProposalCount(static_cast<int>(output_dims[2]));
+      setObjectSize(static_cast<int>(output_dims[1]));
+    } else {
+      setTranspose(false);
+      setMaxProposalCount(static_cast<int>(output_dims[1]));
+      setObjectSize(static_cast<int>(output_dims[2]));
+    }
+  }
+
+  printAttribute();
+  slog::info << "Layer Property updated!" << slog::endl;
+  return true;
+}
+
+bool Models::ObjectSegmentationInstanceMaskrcnnModel::fetchResults(
+  const std::shared_ptr<Engines::Engine> & engine,
+  std::vector<openvino_wrapper_lib::ObjectSegmentationInstanceResult> & results,
+  const float & confidence_thresh,
+  const bool & enable_roi_constraint)
+{
+  ov::InferRequest infer_request = engine->getRequest();
+  slog::debug << "Analyzing Detection results..." << slog::endl;
+  std::string detection_output = getOutputName("detection");
+  std::string mask_output = getOutputName("masks");
+  slog::debug << "Detection_output=" << detection_output << ", Mask_output=" << mask_output << slog::endl;
+
+  //get detection data
+  ov::Tensor do_tensor = infer_request.get_tensor(detection_output);
+  const auto do_data = do_tensor.data<float>();
+  ov::Shape do_shape = do_tensor.get_shape();
+  slog::debug << "Detection Blob getDims = " << do_shape.size() << slog::endl;
+
+  //get mask data
+  ov::Tensor mask_tensor = infer_request.get_tensor(mask_output);
+  const auto mask_data = mask_tensor.data<float>();
+  ov::Shape mask_shape = mask_tensor.get_shape();
+
+  // determine models
+  size_t box_description_size = do_shape.back();
+  OPENVINO_ASSERT(mask_shape.size() == 4);
+  size_t box_num = mask_shape[0];
+  size_t C = mask_shape[1];
+  size_t H = mask_shape[2];
+  size_t W = mask_shape[3];
+  size_t box_stride = W * H * C;
+  slog::debug << "box_description is:" << box_description_size << slog::endl;
+  slog::debug << "box_num is:" << box_num << slog::endl;
+  slog::debug << "C is:" << C << slog::endl;
+  slog::debug << "H is:" << H << slog::endl;
+  slog::debug << "W is:" << W << slog::endl;
+
+  for (size_t box = 0; box < box_num; ++box) {
+    // box description: batch, label, prob, x1, y1, x2, y2
+    float * box_info = do_data + box * box_description_size;
+    auto batch = static_cast<int>(box_info[0]);
+    slog::debug << "batch =" << batch << slog::endl;
+    if (batch < 0) {
+      slog::warn << "Batch size should be greater than 0. [batch=" << batch << "]." << slog::endl;
+      break;
+    }
+    float prob = box_info[2];
+    const double rx = getFrameResizeRatioWidth();
+    const double ry = getFrameResizeRatioHeight();
+    //slog::debug << "FrameResizeRatio W:" << rx << ", H:" << ry << slog::endl;
+    const auto iW = getInputWidth();
+    const auto iH = getInputHeight();
+
+    if (prob > confidence_thresh) {
+      float x1 = std::min(std::max(0.0f, box_info[3] * iW), static_cast<float>(iW)) * rx;
+      float y1 = std::min(std::max(0.0f, box_info[4] * iH), static_cast<float>(iH)) * ry;
+      float x2 = std::min(std::max(0.0f, box_info[5] * iW), static_cast<float>(iW)) * rx;
+      float y2 = std::min(std::max(0.0f, box_info[6] * iH), static_cast<float>(iH)) * ry;
+      int box_width = static_cast<int>(x2 - x1);
+      int box_height = static_cast<int>(y2 - y1);
+      slog::debug << "Box[" << box_width << "x" << box_height << "]" << slog::endl;
+      if (box_width <= 0 || box_height <= 0) break;
+      int class_id = static_cast<int>(box_info[1] + 1e-6f);
+      float * mask_arr = mask_data + box_stride * box + H * W * (class_id - 1);
+      slog::info << "Detected class " << class_id << " with probability " << prob << " from batch " << batch
+        << ": [" << x1 << ", " << y1 << "], [" << x2 << ", " << y2 << "]" << slog::endl;
+      cv::Mat mask_mat(H, W, CV_32FC1, mask_arr);
+      cv::Rect roi = cv::Rect(static_cast<int>(x1), static_cast<int>(y1), box_width, box_height);
+      cv::Mat resized_mask_mat(box_height, box_width, CV_32FC1);
+      cv::resize(mask_mat, resized_mask_mat, cv::Size(box_width, box_height));
+      Result result(roi);
+      result.setConfidence(prob);
+      std::vector<std::string> & labels = getLabels();
+      std::string label = class_id < labels.size() ? labels[class_id] :
+        std::string("label #") + std::to_string(class_id);
+      result.setLabel(label);
+      result.setMask(resized_mask_mat);
+      slog::debug << "adding one segmentation Box ..." << slog::endl;
+      results.emplace_back(result);
+    }
+  }
+
+  return true;
+}
diff --git a/openvino_wrapper_lib/src/models/object_segmentation_instance_model.cpp b/openvino_wrapper_lib/src/models/object_segmentation_instance_model.cpp
new file mode 100644
index 00000000..47293bdf
--- /dev/null
+++ b/openvino_wrapper_lib/src/models/object_segmentation_instance_model.cpp
@@ -0,0 +1,300 @@
+// Copyright (c) 2023 Intel Corporation
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+/**
+ * @brief a source code file with declaration of ObjectSegmentationInstanceModel class
+ * By default yolov8 segmentation models are supported.
+ */
+#include
+#include
+#include
+#include "openvino_wrapper_lib/inferences/object_segmentation_instance.hpp"
+#include "openvino_wrapper_lib/utils/common.hpp"
+#include "openvino_wrapper_lib/models/object_segmentation_instance_model.hpp"
+#include "openvino_wrapper_lib/slog.hpp"
+#include "openvino_wrapper_lib/engines/engine.hpp"
+
+// Validated Object Segmentation Network
+Models::ObjectSegmentationInstanceModel::ObjectSegmentationInstanceModel(
+  const std::string & label_loc,
+  const std::string & model_loc,
+  int max_batch_size)
+  : BaseModel(label_loc, model_loc, max_batch_size)
+{
+  setHasConfidenceOutput(false);
+  setKeepInputShapeRatio(true);
+  setCountOfInputs(1);
+  setCountOfOutputs(2);
+  setExpectedFrameSize({640, 640});
+}
+
+bool Models::ObjectSegmentationInstanceModel::enqueue(
+  const std::shared_ptr<Engines::Engine> & engine,
+  const cv::Mat & frame,
+  const cv::Rect & input_frame_loc)
+{
+  if (engine == nullptr)
+  {
+    slog::err << "A frame is trying to be enqueued in a NULL Engine." << slog::endl;
+    return false;
+  }
+
+  setFrameSize(frame.cols, frame.rows);
+
+  for (const auto & inputInfoItem : inputs_info_)
+  {
+    auto dims = inputInfoItem.get_partial_shape();
+    slog::debug << "input tensor shape is:" << dims.size() << slog::endl;
+    // first input: fill the 4-D tensor with the image data
+    if (dims.size() == 4)
+    {
+      matToBlob(frame, input_frame_loc, 1.0, 0, engine);
+    }
+    // second input (if present): fill the 2-D tensor with the image info
+    if (dims.size() == 2)
+    {
+      ov::Tensor in_tensor = engine->getRequest().get_tensor(inputInfoItem);
+      auto data = in_tensor.data<float>();
+      data[0] = static_cast<float>(frame.rows);  // height
+      data[1] = static_cast<float>(frame.cols);  // width
+      data[2] = 1;
+    }
+  }
+
+  return true;
+}
+
+const std::string Models::ObjectSegmentationInstanceModel::getModelCategory() const
+{
+  return "Object Segmentation - Yolo-Like";
+}
+
+bool Models::ObjectSegmentationInstanceModel::updateLayerProperty(
+  std::shared_ptr<ov::Model> & model)
+{
+  Models::BaseModel::updateLayerProperty(model);
+
+  ov::preprocess::PrePostProcessor ppp = ov::preprocess::PrePostProcessor(model);
+
+  //1. preprocess image inputs
+  ov::preprocess::InputInfo & input_info = ppp.input(getInputInfo("input0"));
+  ov::Layout tensor_layout = ov::Layout("NHWC");
+
+  if (model->input(0).get_partial_shape().is_dynamic())
+  {
+    auto expected_size = getExpectedFrameSize();
+    slog::info << "Model's input has dynamic shape, fix it to " << expected_size << slog::endl;
+    input_info.tensor().set_shape({1, expected_size.height, expected_size.width, 3});
+  }
+
+  input_info.tensor().
+    set_element_type(ov::element::u8).
+    set_layout(tensor_layout).
+    set_color_format(ov::preprocess::ColorFormat::BGR);
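// NOTE (editor's sketch, not part of the patch): the PrePostProcessor chain
// above bakes the preprocessing into the compiled model. Per frame it is
// conceptually equivalent to doing this on the host with OpenCV:
//
//   cv::Mat f32;
//   bgr.convertTo(f32, CV_32F, 1.0 / 255.0);    // u8 -> f32, scale by 255
//   cv::cvtColor(f32, f32, cv::COLOR_BGR2RGB);  // BGR -> RGB
//   // plus the NHWC -> NCHW layout change declared via set_layout()
//
// Moving these steps into the model lets the OpenVINO plugin fuse them with
// inference instead of paying for them in application code.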
+
+  input_info.preprocess().
+    convert_element_type(ov::element::f32).
+    convert_color(ov::preprocess::ColorFormat::RGB).scale({255., 255., 255.});
+  ppp.input().model().set_layout("NCHW");
+
+  model = ppp.build();
+  ppp = ov::preprocess::PrePostProcessor(model);
+
+  ov::Shape input_shape = model->input(getInputInfo("input0")).get_shape();
+  slog::debug << "image_tensor shape is:" << input_shape.size() << slog::endl;
+
+  //2. postprocess outputs: find and retag the mask and detection tensors
+  auto check_output_and_rename = [&](const std::string & output) {
+    auto output_info = model->output(output);
+    auto shape = output_info.get_partial_shape();
+    slog::info << "Output shape for [" << output << "] is: " << shape << slog::endl;
+    if (shape.size() == 4) {
+      slog::info << "found output tensor - [masks]" << slog::endl;
+      retagOutputByValue(output, "masks");
+    } else if (shape.size() == 3) {
+      slog::info << "found output tensor - [detection]" << slog::endl;
+      retagOutputByValue(output, "detection");
+    } else {
+      throw std::logic_error("The shape of output tensors is wrong, must be 4 or 3!");
+    }
+  };
+  check_output_and_rename(getOutputName("output0"));
+  check_output_and_rename(getOutputName("output1"));
+
+  ov::preprocess::OutputInfo & output_info = ppp.output(getOutputName("masks"));
+  output_info.tensor().set_element_type(ov::element::f32);
+
+  model = ppp.build();
+
+  if (model->is_dynamic()) {
+    slog::warn << "Model is still dynamic !!!!" << slog::endl;
+  } else {
+    auto output_info_map = model->outputs();
+    ov::Shape output_dims = output_info_map[0].get_shape();
+    if (output_dims[1] < output_dims[2]) {
+      slog::info << "Object-Size bigger than Proposal-Count, Outputs need Transform!" << slog::endl;
+      setTranspose(true);
+      setMaxProposalCount(static_cast<int>(output_dims[2]));
+      setObjectSize(static_cast<int>(output_dims[1]));
+    } else {
+      setTranspose(false);
+      setMaxProposalCount(static_cast<int>(output_dims[1]));
+      setObjectSize(static_cast<int>(output_dims[2]));
+    }
+  }
+
+  printAttribute();
+  slog::info << "Layer Property updated!" << slog::endl;
+  return true;
+}
+
+bool Models::ObjectSegmentationInstanceModel::fetchResults(
+  const std::shared_ptr<Engines::Engine> & engine,
+  std::vector<openvino_wrapper_lib::ObjectSegmentationInstanceResult> & results,
+  const float & confidence_thresh,
+  const bool & enable_roi_constraint)
+{
+  const float NMS_THRESHOLD = 0.45;  // threshold for removing overlapping bounding boxes
+
+  ov::InferRequest request = engine->getRequest();
+  std::string det_output = getOutputName("detection");
+  const ov::Tensor det_output_tensor = request.get_tensor(det_output);
+  ov::Shape det_output_shape = det_output_tensor.get_shape();
+  auto * detections = det_output_tensor.data<float>();
+  int rows = det_output_shape.at(1);
+  int dimensions = det_output_shape.at(2);
+  cv::Mat output_buffer(det_output_shape[1], det_output_shape[2], CV_32F, detections);
+  //Check if transpose is needed
+  //if (needTranspose()) {  //DON'T use needTranspose(): it is not correctly set when calling updateLayerProperty()
+  if (det_output_shape.at(2) > det_output_shape.at(1) &&
+    det_output_shape.at(2) > 300) {  // 300 is just a heuristic bound (bigger than the number of classes)
+    cv::transpose(output_buffer, output_buffer);  //[8400, 84+32] for yolov8 seg
+    detections = (float *)output_buffer.data;
+    rows = det_output_shape.at(2);
+    dimensions = det_output_shape.at(1);
+  }
+  slog::debug << "AFTER calibration: rows->" << rows << ", dimensions->" << dimensions << slog::endl;
+
+  std::vector<cv::Rect> boxes;
+  std::vector<cv::Mat> mask_confs;
+  std::vector<int> class_ids;
+  std::vector<float> confidences;
+  std::vector<std::string> & labels = getLabels();
+
+  for (int i = 0; i < rows; i++) {
+    //float * detection = &detections[i * dimensions];
+    if (hasConfidenceOutput()) {
+      float confidence = output_buffer.at<float>(i, 4);
+      if (confidence < confidence_thresh)
+        continue;
+    }
+
+    const int classes_scores_start_pos = hasConfidenceOutput() ? 5 : 4;
+    cv::Mat classes_scores = output_buffer.row(i).colRange(classes_scores_start_pos, dimensions - 32);  //4, 84
+
+    cv::Point class_id;
+    double max_class_score;
+    cv::minMaxLoc(classes_scores, nullptr, &max_class_score, nullptr, &class_id);
+
+    if (max_class_score > confidence_thresh) {
+      confidences.emplace_back(max_class_score);
+      class_ids.emplace_back(class_id.x);
+
+      float x = output_buffer.at<float>(i, 0);
+      float y = output_buffer.at<float>(i, 1);
+      float w = output_buffer.at<float>(i, 2);
+      float h = output_buffer.at<float>(i, 3);
+      auto x_min = x - (w / 2);
+      auto y_min = y - (h / 2);
+
+      boxes.emplace_back(x_min, y_min, w, h);
+      cv::Mat mask_conf = output_buffer.row(i).colRange(dimensions - 32, dimensions);  //84, 116
+      mask_confs.emplace_back(mask_conf);
+    }
+  }
+
+  std::vector<int> nms_result;
+  cv::dnn::NMSBoxes(boxes, confidences, confidence_thresh, NMS_THRESHOLD, nms_result);
+
+  const ov::Tensor mask_output_tensor = request.get_tensor(getOutputName("masks"));
+  ov::Shape mask_output_shape = mask_output_tensor.get_shape();
+  //const ov::Layout mask_layout {"NCHW"};  //must be "NCHW"?
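// NOTE (editor's sketch, not part of the patch): the tensor shapes involved
// here, assuming a stock 640x640 yolov8n-seg model with 80 classes:
//
//   detection: [1, 116, 8400] -> transposed above to [8400, 116], where
//              116 = 4 box coords + 80 class scores + 32 mask coefficients
//   masks:     [1, 32, 160, 160] -> 32 prototype masks of 160x160 each
//
// A per-instance mask is then sigmoid(coeffs[1x32] * prototypes[32x25600]),
// reshaped back to 160x160, cropped to the (scaled) box, and resized to the
// detection box size -- exactly the steps in the loop below.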
+  const auto MASK_CHANNEL = mask_output_shape[1];
+  const auto MASK_HEIGHT = mask_output_shape[2];  //mask_output_shape[ov::layout::height_idx(mask_layout)];
+  const auto MASK_WIDTH = mask_output_shape[3];   //mask_output_shape[ov::layout::width_idx(mask_layout)];
+  slog::debug << "mask_output_shape: " << mask_output_shape << ", MASK_HEIGHT:" << MASK_HEIGHT << ", MASK_WIDTH:" << MASK_WIDTH << slog::endl;
+  //cv::Mat proto(32, 25600, CV_32F, mask_output_tensor.data());  //[32,25600]
+  cv::Mat proto(MASK_CHANNEL, MASK_HEIGHT * MASK_WIDTH, CV_32F, mask_output_tensor.data());  //[32,25600]
+
+  for (int idx : nms_result) {
+    double rx = getFrameResizeRatioWidth();
+    double ry = getFrameResizeRatioHeight();
+    slog::debug << "Detection-Ratio (Input Image to Input Tensor): " << rx << "x" << ry << slog::endl;
+
+    //Bounding-Box in Input Tensor Size
+    int vx = std::max(0, int(boxes[idx].x));
+    int vy = std::max(0, int(boxes[idx].y));
+    int vw = std::min(std::max(0, int(boxes[idx].width)), getInputWidth() - vx - 1);
+    int vh = std::min(std::max(0, int(boxes[idx].height)), getInputHeight() - vy - 1);
+
+    cv::Rect vrec(vx, vy, vw, vh);
+    slog::debug << "Detection Rectangle in Input Tensor Size: " << vrec << slog::endl;
+    const int det_bb_w = vw * rx;
+    const int det_bb_h = vh * ry;
+    cv::Rect det_bb(vx * rx, vy * ry, det_bb_w, det_bb_h);
+    slog::debug << "Detection Rectangle in Input Image Size: " << det_bb << slog::endl;
+    Result result(det_bb);
+    result.setConfidence(confidences[idx]);
+    std::string label = class_ids[idx] < labels.size() ?
+      labels[class_ids[idx]] : std::string("label #") + std::to_string(class_ids[idx]);
+    result.setLabel(label);
+
+    //Mask data operation
+    auto sigmoid = [](float a) {return 1. / (1. + exp(-a));};
+    cv::Mat m = mask_confs[idx] * proto;
+    for (int col = 0; col < m.cols; col++) {
+      m.at<float>(0, col) = sigmoid(m.at<float>(0, col));
+    }
+    cv::Mat reshaped_m = m.reshape(1, MASK_HEIGHT);  //1x25600 --> 160x160, mask_output_shape: NCHW
+
+    double mask_rx = static_cast<double>(MASK_WIDTH) / getInputWidth();
+    double mask_ry = static_cast<double>(MASK_HEIGHT) / getInputHeight();
+    slog::debug << "Mask-Ratio (Mask Tensor to Input Tensor): " << mask_rx << "x" << mask_ry << slog::endl;
+    int mask_x = int(mask_rx * vx);
+    int mask_y = int(mask_ry * vy);
+    int mask_w = int(mask_rx * vw);
+    int mask_h = int(mask_ry * vh);
+    if (mask_x + mask_w >= MASK_WIDTH) {
+      mask_w = MASK_WIDTH - mask_x - 1;
+    }
+    if (mask_y + mask_h >= MASK_HEIGHT) {
+      mask_h = MASK_HEIGHT - mask_y - 1;
+    }
+    cv::Rect roi{mask_x, mask_y, mask_w, mask_h};
+    slog::debug << "Mask ROI:" << roi << slog::endl;
+    cv::Mat roi_mask = reshaped_m(roi);
+    cv::Mat resized_mask;
+    cv::resize(roi_mask, resized_mask, cv::Size(det_bb_w, det_bb_h));
+    result.setMask(resized_mask);
+
+    results.push_back(result);
+  }
+
+  return true;
+}
diff --git a/openvino_wrapper_lib/src/models/object_segmentation_maskrcnn_model.cpp b/openvino_wrapper_lib/src/models/object_segmentation_maskrcnn_model.cpp
index d08b5a57..37ab35b4 100644
--- a/openvino_wrapper_lib/src/models/object_segmentation_maskrcnn_model.cpp
+++ b/openvino_wrapper_lib/src/models/object_segmentation_maskrcnn_model.cpp
@@ -82,7 +82,7 @@ bool Models::ObjectSegmentationMaskrcnnModel::matToBlob(
   }
 
   ov::InferRequest infer_request = engine->getRequest();
-  ov::Tensor input_tensor = infer_request.get_tensor(getInputName("input"));
+  ov::Tensor input_tensor = infer_request.get_tensor(getInputName());
   ov::Shape input_shape = input_tensor.get_shape();
 
   OPENVINO_ASSERT(input_shape.size() == 4);
@@ -183,7 +183,7 @@ bool
Models::ObjectSegmentationMaskrcnnModel::updateLayerProperty( input_info.tensor(). set_element_type(ov::element::u8). set_layout(tensor_layout); - addInputInfo("input", input_tensor_name_); + addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_); } else if (info_shape.size() == 2) { // second input contains image info image_info.tensor().set_element_type(ov::element::f32); addInputInfo("input2", info_name_); @@ -192,7 +192,7 @@ bool Models::ObjectSegmentationMaskrcnnModel::updateLayerProperty( } } - std::string inputName = getInputName("input"); + std::string inputName = getInputName(); slog::debug << "input name is:" << inputName << slog::endl; OPENVINO_ASSERT (input_shape.size()== 4); size_t netBatchSize = input_shape[0]; diff --git a/openvino_wrapper_lib/src/models/object_segmentation_model.cpp b/openvino_wrapper_lib/src/models/object_segmentation_model.cpp index 1d076757..be720437 100644 --- a/openvino_wrapper_lib/src/models/object_segmentation_model.cpp +++ b/openvino_wrapper_lib/src/models/object_segmentation_model.cpp @@ -97,7 +97,7 @@ bool Models::ObjectSegmentationModel::matToBlob( } #else ov::InferRequest infer_request = engine->getRequest(); - ov::Tensor input_tensor = infer_request.get_tensor(getInputName("input")); + ov::Tensor input_tensor = infer_request.get_tensor(getInputName()); ov::Shape input_shape = input_tensor.get_shape(); OPENVINO_ASSERT(input_shape.size() == 4); @@ -167,7 +167,7 @@ bool Models::ObjectSegmentationModel::updateLayerProperty( input_info.preprocess(). convert_layout(expect_layout). resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR); - addInputInfo("input", input_tensor_name_); + addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_); auto outputs_info = model->outputs(); if (outputs_info.size() != 1) { diff --git a/openvino_wrapper_lib/src/models/person_attribs_detection_model.cpp b/openvino_wrapper_lib/src/models/person_attribs_detection_model.cpp index 405740d6..30a1e739 100644 --- a/openvino_wrapper_lib/src/models/person_attribs_detection_model.cpp +++ b/openvino_wrapper_lib/src/models/person_attribs_detection_model.cpp @@ -48,7 +48,7 @@ bool Models::PersonAttribsDetectionModel::updateLayerProperty( } model = ppp.build(); - addInputInfo("input", input_tensor_name_); + addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_); addOutputInfo("attributes_output_",output_info_map[0].get_any_name()); addOutputInfo("top_output_", output_info_map[1].get_any_name()); addOutputInfo("bottom_output_", output_info_map[2].get_any_name()); diff --git a/openvino_wrapper_lib/src/models/vehicle_attribs_detection_model.cpp b/openvino_wrapper_lib/src/models/vehicle_attribs_detection_model.cpp index e5a65947..11629ba9 100644 --- a/openvino_wrapper_lib/src/models/vehicle_attribs_detection_model.cpp +++ b/openvino_wrapper_lib/src/models/vehicle_attribs_detection_model.cpp @@ -47,7 +47,7 @@ bool Models::VehicleAttribsDetectionModel::updateLayerProperty( set_layout(tensor_layout); model = ppp.build(); - addInputInfo("input", input_tensor_name_); + addInputInfo(ModelAttribute::DefaultInputName, input_tensor_name_); // set input and output layer name addOutputInfo("color_output_", output_info_map[1].get_any_name()); diff --git a/openvino_wrapper_lib/src/outputs/image_window_output.cpp b/openvino_wrapper_lib/src/outputs/image_window_output.cpp index 0f2b93e2..6c81a348 100644 --- a/openvino_wrapper_lib/src/outputs/image_window_output.cpp +++ b/openvino_wrapper_lib/src/outputs/image_window_output.cpp @@ -180,6 +180,7 @@ void 
Outputs::ImageWindowOutput::accept(
   mergeMask(results);
 }
 
+//TODO: Deprecated, will merge the impl into the func for instanceResult.
 void Outputs::ImageWindowOutput::mergeMask(
   const std::vector<openvino_wrapper_lib::ObjectSegmentationResult> & results)
 {
@@ -202,6 +203,31 @@ void Outputs::ImageWindowOutput::mergeMask(
     cv::addWeighted(colored_mask, alpha, roi_img, 1.0f - alpha, 0.0f, roi_img);
   }
 }
+
+void Outputs::ImageWindowOutput::mergeMask(
+  const std::vector<openvino_wrapper_lib::ObjectSegmentationInstanceResult> & results)
+{
+  std::map<std::string, int> class_color;
+  for (unsigned i = 0; i < results.size(); i++) {
+    std::string class_label = results[i].getLabel();
+    if (class_color.find(class_label) == class_color.end()) {
+      class_color[class_label] = class_color.size();
+    }
+    auto & color = colors_[class_color[class_label] % colors_.size()];
+    const float alpha = 0.7f;
+    const float MASK_THRESHOLD = 0.5;
+
+    cv::Rect location = results[i].getLocation();
+    cv::Mat roi_img = frame_(location);
+    cv::Mat mask = results[i].getMask();
+    cv::Mat colored_mask(location.height, location.width, frame_.type(),
+      cv::Scalar(color[2], color[1], color[0]));
+    roi_img.copyTo(colored_mask, mask <= MASK_THRESHOLD);
+    cv::addWeighted(colored_mask, alpha, roi_img, 1.0f - alpha, 0.0f, roi_img);
+  }
+}
+
+//TODO: Deprecated, will merge the impl into the func for instanceResult.
 void Outputs::ImageWindowOutput::accept(
   const std::vector<openvino_wrapper_lib::ObjectSegmentationResult> & results)
 {
@@ -221,7 +247,24 @@ void Outputs::ImageWindowOutput::accept(
   mergeMask(results);
 }
 
-
+void Outputs::ImageWindowOutput::accept(
+  const std::vector<openvino_wrapper_lib::ObjectSegmentationInstanceResult> & results)
+{
+  for (unsigned i = 0; i < results.size(); i++) {
+    cv::Rect result_rect = results[i].getLocation();
+    unsigned target_index = findOutput(result_rect);
+    outputs_[target_index].rect = result_rect;
+    auto fd_conf = results[i].getConfidence();
+    if (fd_conf >= 0) {
+      std::ostringstream ostream;
+      ostream << "[" << std::fixed << std::setprecision(3) << fd_conf << "]";
+      outputs_[target_index].desc += ostream.str();
+    }
+    auto label = results[i].getLabel();
+    outputs_[target_index].desc += "[" + label + "]";
+  }
+  mergeMask(results);
+}
 
 void Outputs::ImageWindowOutput::accept(
diff --git a/openvino_wrapper_lib/src/outputs/ros_topic_output.cpp b/openvino_wrapper_lib/src/outputs/ros_topic_output.cpp
index a0f8a531..3cd16eb8 100644
--- a/openvino_wrapper_lib/src/outputs/ros_topic_output.cpp
+++ b/openvino_wrapper_lib/src/outputs/ros_topic_output.cpp
@@ -229,6 +229,29 @@ void Outputs::RosTopicOutput::accept(
   }
 }
 
+void Outputs::RosTopicOutput::accept(
+  const std::vector<openvino_wrapper_lib::ObjectSegmentationInstanceResult> & results)
+{
+  segmented_objects_topic_ = std::make_shared<object_msgs::msg::ObjectsInMasks>();
+  object_msgs::msg::ObjectInMask object;
+  for (auto & r : results) {
+    auto loc = r.getLocation();
+    object.roi.x_offset = loc.x;
+    object.roi.y_offset = loc.y;
+    object.roi.width = loc.width;
+    object.roi.height = loc.height;
+    object.object_name = r.getLabel();
+    object.probability = r.getConfidence();
+    cv::Mat mask = r.getMask();
+    for (int h = 0; h < mask.size().height; ++h) {
+      for (int w = 0; w < mask.size().width; ++w) {
+        object.mask_array.push_back(mask.at<float>(h, w));
+      }
+    }
+    segmented_objects_topic_->objects_vector.push_back(object);
+  }
+}
+
 void Outputs::RosTopicOutput::accept(
   const std::vector<openvino_wrapper_lib::ObjectSegmentationResult> & results)
 {
diff --git a/openvino_wrapper_lib/src/pipeline_manager.cpp b/openvino_wrapper_lib/src/pipeline_manager.cpp
index 184ac922..18106cdf 100644
--- a/openvino_wrapper_lib/src/pipeline_manager.cpp
+++ b/openvino_wrapper_lib/src/pipeline_manager.cpp
@@ -23,7 +23,7 @@
 #include
 #include
 
-#if 0
+#if 0 //in current versions, these models are not supported any more.
#include "openvino_wrapper_lib/inferences/landmarks_detection.hpp" #include "openvino_wrapper_lib/inferences/face_reidentification.hpp" #include "openvino_wrapper_lib/models/face_reidentification_model.hpp" @@ -47,9 +47,13 @@ #include "openvino_wrapper_lib/inferences/head_pose_detection.hpp" #include "openvino_wrapper_lib/models/head_pose_detection_model.hpp" #include "openvino_wrapper_lib/models/object_detection_yolov5_model.hpp" +#include "openvino_wrapper_lib/models/object_detection_yolov8_model.hpp" #include "openvino_wrapper_lib/models/object_detection_ssd_model.hpp" #include "openvino_wrapper_lib/inferences/object_segmentation.hpp" +#include "openvino_wrapper_lib/inferences/object_segmentation_instance.hpp" #include "openvino_wrapper_lib/models/object_segmentation_model.hpp" +#include "openvino_wrapper_lib/models/object_segmentation_instance_model.hpp" +#include "openvino_wrapper_lib/models/object_segmentation_instance_maskrcnn_model.hpp" #include "openvino_wrapper_lib/inputs/base_input.hpp" #include "openvino_wrapper_lib/inputs/image_input.hpp" #include "openvino_wrapper_lib/inputs/realsense_camera.hpp" @@ -211,6 +215,8 @@ PipelineManager::parseInference(const Params::ParamManager::PipelineRawData & pa object = createObjectSegmentation(infer); } else if (infer.name == kInferTpye_ObjectSegmentationMaskrcnn) { object = createObjectSegmentationMaskrcnn(infer); + } else if (infer.name == kInferTpye_ObjectSegmentationInstance) { + object = createObjectSegmentationInstance(infer); } else if (infer.name == kInferTpye_PersonReidentification) { object = createPersonReidentification(infer); } else if (infer.name == kInferTpye_PersonAttribsDetection) { @@ -300,6 +306,11 @@ PipelineManager::createObjectDetection( std::make_shared(infer.label, infer.model, infer.batch); } + if (infer.model_type == kInferTpye_ObjectDetectionTypeYolov8) { + object_detection_model = + std::make_shared(infer.label, infer.model, infer.batch); + } + slog::debug << "for test in createObjectDetection(), Created SSDModel" << slog::endl; object_inference_ptr = std::make_shared( infer.enable_roi_constraint, infer.confidence_threshold); // To-do theshold configuration @@ -332,6 +343,7 @@ PipelineManager::createObjectSegmentation(const Params::ParamManager::InferenceR return segmentation_inference_ptr; } +// TODO: Deprecated std::shared_ptr PipelineManager::createObjectSegmentationMaskrcnn(const Params::ParamManager::InferenceRawData & infer) { @@ -350,6 +362,30 @@ PipelineManager::createObjectSegmentationMaskrcnn(const Params::ParamManager::In return segmentation_inference_ptr; } +std::shared_ptr +PipelineManager::createObjectSegmentationInstance(const Params::ParamManager::InferenceRawData & infer) +{ + std::shared_ptr model; + if (infer.model_type == kInferTpye_ObjectSegmentationTypeMaskrcnn) { + slog::info << "Model Typle: kInferType_ObjectSegmentationTypeMaskrcnn" << slog::endl; + model = std::make_shared(infer.label, infer.model, infer.batch); + } else { + slog::info << "Model Typle: kInferType_ObjectSegmentationTypeYolo" << slog::endl; + model = std::make_shared(infer.label, infer.model, infer.batch); + } + model->modelInit(); + slog::info << "Instance Segmentation model initialized." << slog::endl; + auto engine = engine_manager_.createEngine(infer.engine, model); + slog::info << "Engine initialized for Instance Segmentation." << slog::endl; + auto segmentation_inference_ptr = std::make_shared( + infer.confidence_threshold); + slog::info << "Segmentation Inference instanced." 
+  segmentation_inference_ptr->loadNetwork(model);
+  segmentation_inference_ptr->loadEngine(engine);
+
+  return segmentation_inference_ptr;
+}
+
 std::shared_ptr<openvino_wrapper_lib::BaseInference>
 PipelineManager::createPersonReidentification(
   const Params::ParamManager::InferenceRawData & infer)
diff --git a/sample/launch/pipeline_segmentation_instance.launch.py b/sample/launch/pipeline_segmentation_instance.launch.py
new file mode 100644
index 00000000..ec9da321
--- /dev/null
+++ b/sample/launch/pipeline_segmentation_instance.launch.py
@@ -0,0 +1,55 @@
+# Copyright 2018 Open Source Robotics Foundation, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Launch the instance segmentation pipeline (rviz2 can be launched separately)."""
+
+import os
+
+from ament_index_python.packages import get_package_share_directory
+from launch import LaunchDescription
+import launch_ros.actions
+
+from launch.substitutions import LaunchConfiguration, PythonExpression
+import launch
+
+def generate_launch_description():
+    #default_yaml = os.path.join(get_package_share_directory('openvino_node'), 'param',
+    #'pipeline_segmentation.yaml')
+    default_rviz = os.path.join(get_package_share_directory('openvino_node'), 'launch',
+                                'rviz/default.rviz')
+    return LaunchDescription([
+        launch.actions.DeclareLaunchArgument(name='yaml_path', default_value=
+            os.path.join(get_package_share_directory('openvino_node'), 'param', 'pipeline_segmentation_instance.yaml')),
+        # Realsense
+        # NOTE: Split realsense_node launching from OpenVINO package, which
+        # will be launched by RDK launching file or manually.
+
+        # Openvino detection
+        launch_ros.actions.Node(
+            package='openvino_node',
+            executable='pipeline_with_params',
+            arguments=['-config', LaunchConfiguration('yaml_path')],
+            remappings=[
+                ('/openvino_toolkit/image_raw', '/camera/color/image_raw'),
+                ('/openvino_toolkit/segmentation/segmented_obejcts',
+                 '/ros2_openvino_toolkit/segmented_obejcts'),
+                ('/openvino_toolkit/segmentation/images', '/ros2_openvino_toolkit/image_rviz')],
+            output='screen'),
+
+        # Rviz
+        #launch_ros.actions.Node(
+        #    package='rviz2',
+        #    executable='rviz2', output='screen',
+        #    arguments=['--display-config', default_rviz]),
+    ])
diff --git a/sample/launch/pipeline_segmentation_maskrcnn.launch.py b/sample/launch/pipeline_segmentation_maskrcnn.launch.py
index b07d5476..e1901a3e 100644
--- a/sample/launch/pipeline_segmentation_maskrcnn.launch.py
+++ b/sample/launch/pipeline_segmentation_maskrcnn.launch.py
@@ -48,8 +48,8 @@ def generate_launch_description():
             output='screen'),
 
         # Rviz
-        launch_ros.actions.Node(
-            package='rviz2',
-            executable='rviz2', output='screen',
-            arguments=['--display-config', default_rviz]),
+        #launch_ros.actions.Node(
+        #    package='rviz2',
+        #    executable='rviz2', output='screen',
+        #    arguments=['--display-config', default_rviz]),
     ])
diff --git a/sample/launch/rviz2.launch.py b/sample/launch/rviz2.launch.py
new file mode 100644
index 00000000..a0fb3a65
--- /dev/null
+++ b/sample/launch/rviz2.launch.py
@@ -0,0 +1,35 @@
+# Copyright 2018 Open Source Robotics Foundation, Inc.
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Launch rviz2 for ROS2-OpenVINO.""" + +import os + +from ament_index_python.packages import get_package_share_directory +from launch import LaunchDescription +import launch_ros.actions + +from launch.substitutions import LaunchConfiguration, PythonExpression +import launch + +def generate_launch_description(): + default_rviz = os.path.join(get_package_share_directory('openvino_node'), 'launch', + 'rviz/default.rviz') + return LaunchDescription([ + # Rviz + launch_ros.actions.Node( + package='rviz2', + executable='rviz2', output='screen', + arguments=['--display-config', default_rviz]), + ]) diff --git a/sample/param/pipeline_object_yolo.yaml b/sample/param/pipeline_object_yolo.yaml index 2d5e479b..a80fc4c3 100644 --- a/sample/param/pipeline_object_yolo.yaml +++ b/sample/param/pipeline_object_yolo.yaml @@ -1,11 +1,13 @@ Pipelines: - name: object - inputs: [Image] + inputs: [StandardCamera] input_path: to/be/set/image_path infers: - name: ObjectDetection - model: /opt/openvino_toolkit/models/convert/public/yolov5n/FP32/yolov5n.xml - model_type: yolov5 + #model: /opt/openvino_toolkit/models/convert/public/yolov5n/FP32/yolov5n.xml + model: /home/lewis/develop/openvino/models/models/yolo/yolov7/yolov7_int8.xml + #model: /home/lewis/develop/openvino/models/models/yolo/yolov8/yolov8n_openvino_int8_model/yolov8n.xml + model_type: yolov5 #yolov8 engine: CPU #MYRIAD label: to/be/set/xxx.labels batch: 1 @@ -13,7 +15,7 @@ Pipelines: enable_roi_constraint: true # set enable_roi_constraint to false if you don't want to make the inferred ROI (region of interest) constrained into the camera frame outputs: [ImageWindow, RosTopic, RViz] connects: - - left: Image + - left: StandardCamera right: [ObjectDetection] - left: ObjectDetection right: [ImageWindow] diff --git a/sample/param/pipeline_segmentation_instance.yaml b/sample/param/pipeline_segmentation_instance.yaml new file mode 100644 index 00000000..96ab1c3d --- /dev/null +++ b/sample/param/pipeline_segmentation_instance.yaml @@ -0,0 +1,29 @@ +Pipelines: +- name: segmentation + inputs: [StandardCamera] + infers: + - name: ObjectSegmentationInstance + # for Yolov8 Seg models ----------------- + model: /opt/openvino_toolkit/models/convert/public/yolov8n-seg/FP32/yolov8n-seg.xml + model_type: yolo + label: /opt/openvino_toolkit/labels/object_detection/coco.names + # for maskrcnn inception resnet ----------------- + #model: /opt/openvino_toolkit/models/convert/public/mask_rcnn_inception_resnet_v2_atrous_coco/FP32/mask_rcnn_inception_resnet_v2_atrous_coco.xml + #model_type: maskrcnn + #label: /opt/openvino_toolkit/labels/object_segmentation/frozen_inference_graph.labels #for maskrcnn + #---------------------- + engine: CPU #"HETERO:CPU,GPU," #"HETERO:CPU,GPU,MYRIAD" + batch: 1 + confidence_threshold: 0.5 + outputs: [ImageWindow, RosTopic, RViz] + connects: + - left: StandardCamera + right: [ObjectSegmentationInstance] + - left: ObjectSegmentationInstance + right: [ImageWindow] + - left: 
ObjectSegmentationInstance + right: [RosTopic] + - left: ObjectSegmentationInstance + right: [RViz] + +Common:
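# NOTE (editor's sketch, not part of the patch): the Common section is
# truncated in this excerpt. Assuming the sample package name used by the
# launch files above (openvino_node), the new instance-segmentation pipeline
# can be started with either of:
#
#   ros2 launch openvino_node pipeline_segmentation_instance.launch.py
#   ros2 launch openvino_node pipeline_segmentation_instance.launch.py \
#     yaml_path:=/path/to/pipeline_segmentation_instance.yaml
#
# with rviz2 launched separately via rviz2.launch.py if visualization is wanted.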