3 changes: 2 additions & 1 deletion README.md
@@ -1,4 +1,4 @@
English | [简体中文](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/README_cn.md)
English | [简体中文](README_cn.md)

<h1 align="center">📚 OpenVINO™ Notebooks</h1>

@@ -131,6 +131,7 @@ Demos that demonstrate inference on a particular model.
|[233-blip-visual-language-processing](notebooks/233-blip-visual-language-processing/)<br>| Visual Question Answering and Image Captioning using BLIP and OpenVINO™ | <img src=https://user-images.githubusercontent.com/29454499/221933762-4ff32ecb-5e5d-4484-80e1-e9396cb3c511.png width=225> |
|[234-encodec-audio-compression](notebooks/234-encodec-audio-compression/)<br>| Audio compression with EnCodec and OpenVINO™ | <img src=https://github.com/facebookresearch/encodec/raw/main/thumbnail.png width=225> |
|[235-controlnet-stable-diffusion](notebooks/235-controlnet-stable-diffusion/)<br>| Text-to-Image Generation with ControlNet Conditioning and OpenVINO™ | <img src=https://user-images.githubusercontent.com/29454499/224541412-9d13443e-0e42-43f2-8210-aa31820c5b44.png width=225> |
|[238-cyclegan-photo2cartoon](notebooks/238-cyclegan-photo2cartoon/)<br>| Convert the Photo2Cartoon ONNX model to OpenVINO™ IR | <img src=https://user-images.githubusercontent.com/52159774/225785418-3e4d223e-572e-4bb5-bb3f-508e80d08a70.png width=225> |

<div id='-model-training'/>

1 change: 1 addition & 0 deletions README_cn.md
@@ -131,6 +131,7 @@ Jupyter notebooks are divided into four categories; choose the one that matches your needs to get started
|[233-blip-visual-language-processing](notebooks/233-blip-visual-language-processing/)<br>| Visual question answering and image captioning with BLIP and OpenVINO™ | <img src=https://user-images.githubusercontent.com/29454499/221933762-4ff32ecb-5e5d-4484-80e1-e9396cb3c511.png width=225> |
|[234-encodec-audio-compression](notebooks/234-encodec-audio-compression/)<br>| Audio compression with EnCodec and OpenVINO™ | <img src=https://github.com/facebookresearch/encodec/raw/main/thumbnail.png width=225> |
|[235-controlnet-stable-diffusion](notebooks/235-controlnet-stable-diffusion/)<br>| Text-to-image generation with Stable Diffusion using ControlNet conditioning | <img src=https://user-images.githubusercontent.com/29454499/224541412-9d13443e-0e42-43f2-8210-aa31820c5b44.png width=225> |
|[238-cyclegan-photo2cartoon](notebooks/238-cyclegan-photo2cartoon/)<br>| Convert the Photo2Cartoon ONNX model to OpenVINO™ IR | <img src=https://user-images.githubusercontent.com/52159774/225785418-3e4d223e-572e-4bb5-bb3f-508e80d08a70.png width=225> |

<div id='-model-training'/>

340 changes: 340 additions & 0 deletions notebooks/238-cyclegan-photo2cartoon/238-cyclegan-photo2cartoon.ipynb
@@ -0,0 +1,340 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "5d44ad85",
"metadata": {},
"source": [
"# Convert the Photo2Cartoon ONNX Model to OpenVINO™ IR"
]
},
{
"cell_type": "markdown",
"id": "29def0ea",
"metadata": {},
"source": [
"Portrait cartoon stylization aims to transform real photos into cartoon images while preserving the subject's identity (ID) information and texture details. We use a Generative Adversarial Network (GAN) to map photos to cartoons. Because paired data is difficult to obtain and the shapes of input and output do not correspond, we adopt an unpaired image translation approach.\n",
"\n",
"The results of CycleGAN, a classic unpaired image translation method, often have obvious artifacts and are unstable. More recently, Kim et al. proposed a novel normalization function (AdaLIN) and an attention module in the paper \"U-GAT-IT\" and achieved impressive selfie2anime results.\n",
"\n",
"Unlike the exaggerated anime style, our cartoon style is more realistic and retains clear ID information. To this end, we add a Face ID Loss (the cosine distance between the ID features of the input image and the cartoon image) to achieve identity invariance.\n",
"\n",
"We propose a Soft Adaptive Layer-Instance Normalization (Soft-AdaLIN) method, which fuses the statistics of the encoding features and the decoding features during de-standardization.\n",
"\n",
"Building on U-GAT-IT, two hourglass modules are added, one before the encoder and one after the decoder, to improve performance progressively.\n",
"\n",
"We also pre-process the data into a fixed pattern to make optimization easier. For details, see below.\n",
"\n",
"This tutorial demonstrates the results of running photo2cartoon and converts the model from ONNX to the IR format used by OpenVINO.\n",
"\n",
"Requirements:\n",
"\n",
"- Python 3.6\n",
"- PyTorch 1.4\n",
"- tensorflow-gpu 1.14\n",
"- face-alignment\n",
"- dlib\n",
"- onnxruntime\n"
]
},
{
"cell_type": "markdown",
"id": "3a6337d8",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "markdown",
"id": "eca78da0",
"metadata": {},
"source": [
"The default OpenVINO notebooks environment does not include all the packages this model needs, so run the following command first."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d0f6a00",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%pip install face-alignment dlib onnxruntime"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77f1859c",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"from PIL import Image\n",
"import cv2\n",
"import numpy as np\n",
"from pathlib import Path"
]
},
{
"cell_type": "markdown",
"id": "e96ff29e",
"metadata": {},
"source": [
"## Prerequisites"
]
},
{
"cell_type": "markdown",
"id": "afd5c30c",
"metadata": {},
"source": [
"The PyTorch model is not hosted on GitHub. If you want it, download it in advance from [Google Drive](https://drive.google.com/file/d/1PhwKDUhiq8p-UqrfHCqj257QnqBWD523/view?usp=sharing) or [Baidu Cloud](https://pan.baidu.com/share/init?surl=MsT3-He3UGipKhUi4OcCJw) (access code: y2ch). \n",
"\n",
"Now clone the photo2cartoon project."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eedf6c2c",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"if not Path('photo2cartoon').exists():\n",
" !git clone https://github.com/sususama/photo2cartoon.git\n",
"%cd photo2cartoon"
]
},
{
"cell_type": "markdown",
"id": "285f447f",
"metadata": {},
"source": [
"## Check model inference"
]
},
{
"cell_type": "markdown",
"id": "d8c39c87",
"metadata": {},
"source": [
"The `test_onnx.py` script runs the ONNX model to convert a photo into a cartoon.\n",
"\n",
"For best results, use a photo of a young Asian woman, since the model was trained on such photos."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2b1a574",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!python test_onnx.py --photo_path images/photo_test.jpg --save_path images/cartoon_result.png"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "397cc371",
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"original_img = Image.open('images/photo_test.jpg')\n",
"generate_img = Image.open('images/cartoon_result.png')\n",
"fig, ax = plt.subplots(1, 2)\n",
"ax[0].imshow(original_img)\n",
"ax[0].set_title('original')\n",
"ax[1].imshow(generate_img)\n",
"ax[1].set_title('generate')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "ee70e45c",
"metadata": {},
"source": [
"This part shows how to use the ONNX model to generate a cartoon image.\n",
"\n",
"This open-source model was trained on photos of young Asian women. For groups that are not well covered, you can collect data for those groups, depending on your use case, and train the model on it."
]
},
{
"cell_type": "markdown",
"id": "24af352b",
"metadata": {},
"source": [
"## Convert ONNX Model to OpenVINO Intermediate Representation (IR)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "442ec572",
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"from openvino.tools import mo\n",
"from openvino.runtime import serialize\n",
"\n",
"model = mo.convert_model('models/photo2cartoon_weights.onnx')\n",
"serialize(model, 'models/photo2cartoon.xml')"
]
},
{
"cell_type": "markdown",
"id": "b4aab180",
"metadata": {},
"source": [
"## Verify model inference"
]
},
{
"cell_type": "markdown",
"id": "252c86d9",
"metadata": {},
"source": [
"To run inference, read the model with `Core.read_model()` and compile it for a target device with `Core.compile_model()`. The resulting compiled model can be called directly with the input data, and indexing the returned result with an output port gives the output tensor."
]
},
{
"cell_type": "markdown",
"id": "e096e9dc",
"metadata": {},
"source": [
"- Step 1: Load the model. Use `ie.read_model()` to read the model and `ie.compile_model()` to compile it."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09b7ed71",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from utils.preprocess import Preprocess\n",
"from openvino.runtime import Core\n",
"\n",
"ie = Core()\n",
"pre = Preprocess()\n",
"print(\"Load network\")\n",
"photo2cartoon_model_xml = \"models/photo2cartoon.xml\"\n",
"model = ie.read_model(model=photo2cartoon_model_xml)\n",
"compiled_model = ie.compile_model(model=model, device_name=\"CPU\")\n",
"input_layer = compiled_model.input(0)\n",
"output_layer = compiled_model.output(0)\n",
"print('Model Input and Output Info')\n",
"print(f\"- input shape: {input_layer.shape}\")\n",
"print(f\"- input precision: {input_layer.element_type}\")\n",
"print(f\"- output shape: {output_layer.shape}\")\n",
"print(f\"- output precision: {output_layer.element_type}\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9b22b3eb",
"metadata": {},
"source": [
"- Step 2: Load the image, preprocess it, and convert it to the input shape. To propagate an image through the network, you need to load it into an array, resize it to the shape the network expects, and convert it to the network's input layout. We read the required height and width from the network's input layer and resize the image to that size. Finally, we convert the image to N, C, H, W layout (where N=1): `np.transpose()` reorders H, W, C into C, H, W, and `np.expand_dims()` adds the batch dimension."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10870056",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"print(\"Load input image\")\n",
"image_filename = 'images/photo_test.jpg'\n",
"image = cv2.cvtColor(cv2.imread(image_filename), cv2.COLOR_BGR2RGB)\n",
"face_rgba = pre.process(image)\n",
"face_rgba = cv2.resize(face_rgba, (256, 256), interpolation=cv2.INTER_AREA)\n",
"face = face_rgba[:, :, :3].copy()\n",
"mask = face_rgba[:, :, 3][:, :, np.newaxis].copy() / 255\n",
"image = (face * mask + (1 - mask) * 255) / 127.5 - 1\n",
"print(\"- input image shape: {}\".format(image.shape))\n",
"N, C, H, W = input_layer.shape\n",
"resized_image = cv2.resize(src=image, dsize=(W, H))\n",
"print(\"- resize image into shape: {}\".format(resized_image.shape))\n",
"input_data = np.expand_dims(np.transpose(\n",
" resized_image, (2, 0, 1)), 0).astype(np.float32)\n",
"print(\"- align image shape same as network input: {}\".format(input_data.shape))"
]
},
{
"cell_type": "markdown",
"id": "4650a27a",
"metadata": {},
"source": [
"- Step 3: Run inference. Calling `compiled_model([input_data])[output_layer]` returns the inference result directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "055a1d40",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
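"print(\"Inference\")\n",
"result = compiled_model([input_data])[output_layer]\n",
"print(\"- generate image[0] shape: {}\".format(result[0].shape))\n",
"print(\"- generate image precision: {}\".format(result.dtype))\n",
"result_path = 'images/openvino_result.jpg'\n",
"# Reorder C, H, W to H, W, C\n",
"generate_image = result[0].transpose((1, 2, 0))\n",
"# Assumes the output is in [-1, 1], mirroring the input normalization; map it back to [0, 255]\n",
"generate_image = np.clip((generate_image + 1) * 127.5, 0, 255).astype(np.uint8)\n",
"# The image is RGB, but cv2.imwrite expects BGR\n",
"cv2.imwrite(result_path, cv2.cvtColor(generate_image, cv2.COLOR_RGB2BGR))\n",
"plt.imshow(generate_image)\n",
"plt.show()"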
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
},
"vscode": {
"interpreter": {
"hash": "ab7fe26dc67756aa660507b29f71723258621efc45b8385ca452d01a79e3d7b7"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
21 changes: 21 additions & 0 deletions notebooks/238-cyclegan-photo2cartoon/README.md
@@ -0,0 +1,21 @@
# Convert the Photo2Cartoon ONNX Model to OpenVINO™ IR

![Photo2Cartoon results](https://user-images.githubusercontent.com/52159774/225785418-3e4d223e-572e-4bb5-bb3f-508e80d08a70.png)

This tutorial explains how to convert the [photo2cartoon](https://github.com/minivision-ai/photo2cartoon) ONNX model to OpenVINO™ IR.
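
The conversion itself takes only a few lines. The sketch below wraps the two calls the notebook uses (`mo.convert_model` and `serialize`) in a helper. The helper name `convert_to_ir` and the skip-if-already-converted check are illustrative assumptions, not part of the notebook, and the `openvino.tools.mo` import path may differ in other OpenVINO releases.

```python
from pathlib import Path


def convert_to_ir(onnx_path: str, xml_path: str) -> Path:
    """Convert an ONNX model to OpenVINO IR unless the IR file already exists.

    Hypothetical helper: the name and the caching check are assumptions.
    """
    xml_file = Path(xml_path)
    if xml_file.exists():
        # Reuse the IR produced by a previous run
        return xml_file

    # Imported lazily so the cached path does not need OpenVINO loaded.
    # These are the same calls the notebook makes.
    from openvino.tools import mo
    from openvino.runtime import serialize

    model = mo.convert_model(onnx_path)
    xml_file.parent.mkdir(parents=True, exist_ok=True)
    # Writes the .xml file; the weights go to a .bin file next to it
    serialize(model, str(xml_file))
    return xml_file
```

With the paths from the notebook, the call would be `convert_to_ir('models/photo2cartoon_weights.onnx', 'models/photo2cartoon.xml')`.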


## Notebook Contents

This tutorial demonstrates the results of running photo2cartoon and converts the model from ONNX to the IR format used by OpenVINO.

The tutorial consists of the following steps:
- Prepare the environment required to run the notebook
- Download the pre-trained model
- Validate the original model
- Convert the ONNX model to OpenVINO IR
- Verify model inference
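
The data handling around the inference step can be sketched with NumPy alone. This is a minimal sketch, assuming the model takes a 1 x C x H x W float32 input scaled to [-1, 1] and returns an output in the same range. The function names are hypothetical, and the face alignment and mask blending done by the notebook are left out.

```python
import numpy as np


def to_nchw(image_hwc: np.ndarray) -> np.ndarray:
    """Scale an H x W x C uint8 image to [-1, 1] and reorder it to 1 x C x H x W."""
    scaled = image_hwc.astype(np.float32) / 127.5 - 1.0
    # H, W, C -> C, H, W, then add the batch dimension N=1
    return np.expand_dims(np.transpose(scaled, (2, 0, 1)), 0)


def to_uint8_hwc(output_nchw: np.ndarray) -> np.ndarray:
    """Map a 1 x C x H x W array in [-1, 1] back to an H x W x C uint8 image."""
    image = np.transpose(output_nchw[0], (1, 2, 0))
    return np.clip(np.rint((image + 1.0) * 127.5), 0, 255).astype(np.uint8)


# Round trip on a random stand-in image (not a real photo)
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
batch = to_nchw(photo)
restored = to_uint8_hwc(batch)
print(batch.shape, batch.dtype)
print(np.array_equal(photo, restored))
```

Skipping the second function and saving the raw model output directly would write values in [-1, 1] to disk, which shows up as a nearly black image.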

## Installation Instructions

If you have not installed all required dependencies, follow the [Installation Guide](../../README.md).