diff --git a/README.md b/README.md
index 2a9a8cfd787..0fa14c497e6 100644
--- a/README.md
+++ b/README.md
@@ -208,7 +208,7 @@ Intel® Neural Compressor validated 420+ [examples](./examples) for quantization
         Distillation for Quantization
-        Neural Coder
+        Neural Coder
@@ -219,9 +219,8 @@ Intel® Neural Compressor validated 420+ [examples](./examples) for quantization
-        Adaptor
-        Strategy
-        Reference Example
+        Adaptor
+        Strategy
diff --git a/docs/Makefile b/docs/Makefile
index cf810c3c2a1..7653b4d006c 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -19,11 +19,11 @@ help:
 html:
 	# cp README.md to docs, modify response-link
-	cp -f "../README.md" "./source/getting_started.md"
+	cp -f "../README.md" "./source/Welcome.md"
 	cp -f "../SECURITY.md" "./source/SECURITY.md"
-	cp -f "./source/getting_started.md" "./source/getting_started.md.tmp"
-	sed 's/.md/.html/g; s/.\/docs\/source\//.\//g; s/.\/neural_coder\/extensions\/screenshots/imgs/g; s/.\/docs\/source\/_static/..\/\/_static/g; s/.\/examples/https:\/\/github.com\/intel\/neural-compressor\/tree\/master\/examples/g; s/.md/.html/g; ' "./source/getting_started.md.tmp" > "./source/getting_started.md"
-	rm -f "./source/getting_started.md.tmp"
+	cp -f "./source/Welcome.md" "./source/Welcome.md.tmp"
+	sed 's/.md/.html/g; s/.\/docs\/source\//.\//g; s/.\/neural_coder\/extensions\/screenshots/imgs/g; s/.\/docs\/source\/_static/..\/\/_static/g; ' "./source/Welcome.md.tmp" > "./source/Welcome.md"
+	rm -f "./source/Welcome.md.tmp"
 	# make sure other png can display normal
 	$(SPHINXBUILD) -b html "$(SOURCEDIR)" "$(BUILDDIR)/html" $(SPHINXOPTS) $(O)
diff --git a/docs/source/SECURITY.md b/docs/source/SECURITY.md
new file mode 100644
index 00000000000..71a71eff1b6
--- /dev/null
+++ b/docs/source/SECURITY.md
@@ -0,0 +1,13 @@
+Security Policy
+===============
+
+## Report a Vulnerability
+
+Please report security issues or vulnerabilities to the [Intel® Security Center].
+
+For more information on how Intel® works to resolve security issues, see
+[Vulnerability Handling Guidelines].
+
+[Intel® Security Center]:https://www.intel.com/security
+
+[Vulnerability Handling Guidelines]:https://www.intel.com/content/www/us/en/security-center/vulnerability-handling-guidelines.html
diff --git a/docs/source/Welcome.md b/docs/source/Welcome.md
new file mode 100644
index 00000000000..35d9e3841a9
--- /dev/null
+++ b/docs/source/Welcome.md
@@ -0,0 +1,249 @@
+Intel® Neural Compressor
+===========================
+
+An open-source Python library supporting popular model compression techniques on all mainstream deep learning frameworks (TensorFlow, PyTorch, ONNX Runtime, and MXNet)
+
+[![python](https://img.shields.io/badge/python-3.7%2B-blue)](https://github.com/intel/neural-compressor)
+[![version](https://img.shields.io/badge/release-1.14-green)](https://github.com/intel/neural-compressor/releases)
+[![license](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/intel/neural-compressor/blob/master/LICENSE)
+[![coverage](https://img.shields.io/badge/coverage-90%25-green)](https://github.com/intel/neural-compressor)
+[![Downloads](https://static.pepy.tech/personalized-badge/neural-compressor?period=total&units=international_system&left_color=grey&right_color=green&left_text=downloads)](https://pepy.tech/project/neural-compressor)
+
+---
+
+Intel® Neural Compressor, formerly known as Intel® Low Precision Optimization Tool, is an open-source Python library that runs on Intel CPUs and GPUs. It delivers unified interfaces across multiple deep-learning frameworks for popular network compression technologies such as quantization, pruning, and knowledge distillation. The tool supports automatic accuracy-driven tuning strategies to help users quickly find the best quantized model, implements different weight-pruning algorithms to generate pruned models with a predefined sparsity goal, and supports knowledge distillation from a teacher model to a student model.
+Intel® Neural Compressor is a critical AI software component in the [Intel® oneAPI AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).
+
+**Visit the Intel® Neural Compressor online documentation at: <https://intel.github.io/neural-compressor>.**
+
+## Installation
+
+### Prerequisites
+
+Python version: 3.7, 3.8, 3.9, 3.10
+
+### Install on Linux
+- Release binary install
+  ```Shell
+  # install stable basic version from pip
+  pip install neural-compressor
+  # or install stable full version from pip (including GUI)
+  pip install neural-compressor-full
+  ```
+- Nightly binary install
+  ```Shell
+  git clone https://github.com/intel/neural-compressor.git
+  cd neural-compressor
+  pip install -r requirements.txt
+  # install nightly basic version from pip
+  pip install -i https://test.pypi.org/simple/ neural-compressor
+  # or install nightly full version from pip (including GUI)
+  pip install -i https://test.pypi.org/simple/ neural-compressor-full
+  ```
+More installation methods can be found at the [Installation Guide](./installation_guide.html). Please check out our [FAQ](./faq.html) for more details.
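+
+As a quick sanity check after either install path (a minimal sketch, not part of the official guide), you can confirm that the package resolves and imports:
+```Shell
+# verify the installed package and its version
+pip show neural-compressor
+# verify the library imports cleanly
+python -c "import neural_compressor"
+```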
+
+## Getting Started
+### Quantization with Python API
+
+```shell
+# A TensorFlow Example
+pip install tensorflow
+# Prepare fp32 model
+wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_6/mobilenet_v1_1.0_224_frozen.pb
+```
+```python
+import tensorflow as tf
+from neural_compressor.experimental import Quantization, common
+quantizer = Quantization()
+quantizer.model = './mobilenet_v1_1.0_224_frozen.pb'
+dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
+quantizer.calib_dataloader = common.DataLoader(dataset)
+quantizer.fit()
+```
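+
+The call to `fit()` also returns the quantized model object. A minimal follow-up sketch (the output path is illustrative), consistent with the `q_model.save()` usage shown in the API documentation removed later in this patch:
+```python
+# keep the returned quantized model and save it for deployment
+q_model = quantizer.fit()
+q_model.save('./int8_mobilenet_v1')
+```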
+
+### Quantization with [JupyterLab Extension](./neural_coder/extensions/neural_compressor_ext_lab/README.html)
+Search for `jupyter-lab-neural-compressor` in the Extension Manager in JupyterLab and install with one click:
+
+*(image: Extension)*
+
+### Quantization with [GUI](./bench.html)
+```shell
+# An ONNX Example
+pip install onnx==1.12.0 onnxruntime==1.12.1 onnxruntime-extensions
+# Prepare fp32 model
+wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx
+# Start GUI
+inc_bench
+```
+
+*(image: Architecture)*
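+
+As an illustrative alternative to the GUI, the same ONNX model can be quantized through the Python API. This sketch mirrors the TensorFlow example above; the dummy-dataset input shape is an assumption for resnet50-v1-12:
+```python
+from neural_compressor.experimental import Quantization, common
+quantizer = Quantization()
+quantizer.model = './resnet50-v1-12.onnx'
+# NCHW dummy data shaped for ResNet-50 (assumed input layout)
+dataset = quantizer.dataset('dummy', shape=(1, 3, 224, 224))
+quantizer.calib_dataloader = common.DataLoader(dataset)
+q_model = quantizer.fit()
+```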
+
+## System Requirements
+
+### Validated Hardware Environment
+#### Intel® Neural Compressor supports CPUs based on [Intel 64 architecture or compatible processors](https://en.wikipedia.org/wiki/X86-64):
+
+* Intel Xeon Scalable processors (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)
+* Future Intel Xeon Scalable processors (code name Sapphire Rapids)
+
+#### Intel® Neural Compressor supports GPUs built on Intel's Xe architecture:
+
+* [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html)
+
+#### Intel® Neural Compressor quantized ONNX models support multiple hardware vendors through ONNX Runtime:
+
+* Intel CPU, AMD/ARM CPU, and NVIDIA GPU. Please refer to the validated model [list](./validated_model_list.html#Validated-ONNX-QDQ-INT8-models-on-multiple-hardware-through-ONNX-Runtime).
+
+### Validated Software Environment
+
+* OS version: CentOS 8.4, Ubuntu 20.04
+* Python version: 3.7, 3.8, 3.9, 3.10
+
+| Framework | TensorFlow | Intel TensorFlow | PyTorch | Intel® Extension for PyTorch* | ONNX Runtime | MXNet |
+|---|---|---|---|---|---|---|
+| Version | 2.10.0<br>2.9.1<br>2.8.2 | 2.10.0<br>2.9.1<br>2.8.0 | 1.12.1+cpu<br>1.11.0+cpu<br>1.10.0+cpu | 1.12.0<br>1.11.0<br>1.10.0 | 1.12.1<br>1.11.0<br>1.10.0 | 1.8.0<br>1.7.0<br>1.6.0 |
+
+> **Note:**
+> Set the environment variable ``TF_ENABLE_ONEDNN_OPTS=1`` to enable oneDNN optimizations if you are using TensorFlow v2.6 to v2.8. oneDNN is the default for TensorFlow v2.9.
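+
+For example, in a shell session (a one-line illustration of the note above):
+```shell
+# opt in to oneDNN kernels on stock TensorFlow v2.6-v2.8
+export TF_ENABLE_ONEDNN_OPTS=1
+```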
+
+### Validated Models
+Intel® Neural Compressor validated 420+ [examples](./examples) for quantization with a performance speedup geomean of 2.2x and up to 4.2x on VNNI while minimizing accuracy loss. Over 30 pruning and knowledge distillation samples are also available. More details for validated models are available [here](./validated_model_list.html).
+
+*(image: Architecture)*
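+
+To reproduce this kind of throughput comparison on your own model, the experimental `Benchmark` class (its skeleton appears in the API documentation removed later in this patch) is driven in the same attribute style. A sketch assuming a user-provided yaml config and the illustrative output path from the example above:
+```python
+from neural_compressor.experimental import Benchmark
+evaluator = Benchmark('conf.yaml')      # yaml path is illustrative
+evaluator.model = './int8_mobilenet_v1' # quantized model saved earlier
+evaluator()                             # runs the benchmark defined in the yaml
+```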
+
+## Documentation
+
+| Section | Topics |
+|---|---|
+| **Overview** | Architecture, Examples, GUI, APIs, Intel oneAPI AI Analytics Toolkit, AI and Analytics Samples |
+| **Basic API** | Transform, Dataset, Metric, Objective |
+| **Deep Dive** | Quantization, Pruning (Sparsity), Knowledge Distillation, Mixed Precision, Orchestration, Benchmarking, Distributed Training, Model Conversion, TensorBoard, Distillation for Quantization, Neural Coder |
+| **Advanced Topics** | Adaptor, Strategy |
+
+## Selected Publications/Events
+* [Neural Compressor: an open-source Python library for network compression](https://cloud.tencent.com/developer/article/2165895) (Nov 2022)
+* [Running Fast Transformers on CPUs: Intel Approach Achieves Significant Speed Ups and SOTA Performance](https://medium.com/syncedreview/running-fast-transformers-on-cpus-intel-approach-achieves-significant-speed-ups-and-sota-448521704c5e) (Nov 2022)
+* [Personalized Stable Diffusion with Few-Shot Fine-Tuning](https://medium.com/intel-analytics-software/personalized-stable-diffusion-with-few-shot-fine-tuning-on-a-single-cpu-f01a3316b13) (Nov 2022)
+* [Meet the Innovation of Intel AI Software: Intel® Extension for TensorFlow*](https://www.intel.com/content/www/us/en/developer/articles/technical/innovation-of-ai-software-extension-tensorflow.html) (Oct 2022)
+* [PyTorch* Inference Acceleration with Intel® Neural Compressor](https://www.intel.com/content/www/us/en/developer/articles/technical/pytorch-inference-with-intel-neural-compressor.html#gs.gnq0cj) (Oct 2022)
+* Neural Coder, a new plug-in for Intel Neural Compressor, was covered by [Twitter](https://twitter.com/IntelDevTools/status/1583629213697212416), [LinkedIn](https://www.linkedin.com/posts/intel-software_oneapi-ai-deeplearning-activity-6989377309917007872-Dbzg?utm_source=share&utm_medium=member_desktop), and [Intel Developer Zone](https://mp.weixin.qq.com/s/LL-4eD-R0YagFgODM23oQA) from Intel, and by [Twitter](https://twitter.com/IntelDevTools/status/1583629213697212416/retweets) and [LinkedIn](https://www.linkedin.com/feed/update/urn:li:share:6990377841435574272/) from Hugging Face. (Oct 2022)
+* Intel Neural Compressor successfully landed on the [GCP](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [AWS](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel) marketplaces. (Oct 2022)
+
+> View our [full publication list](./publication_list.html).
+
+## Additional Content
+
+* [Release Information](./releases_info.html)
+* [Contribution Guidelines](./contributions.html)
+* [Legal Information](./legal_information.html)
+* [Security Policy](SECURITY.html)
+* [Intel® Neural Compressor Website](https://intel.github.io/neural-compressor)
+
+## Hiring
+
+We are actively hiring. Send your resume to inc.maintainers@intel.com if you are interested in model compression techniques.
diff --git a/docs/source/_static/index.html b/docs/source/_static/index.html
index 22a56287809..eb630c63d1f 100644
--- a/docs/source/_static/index.html
+++ b/docs/source/_static/index.html
@@ -1 +1 @@
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/docs/source/_templates/layout.html b/docs/source/_templates/layout.html
index 3e517410702..8154cad6356 100644
--- a/docs/source/_templates/layout.html
+++ b/docs/source/_templates/layout.html
@@ -7,7 +7,7 @@
diff --git a/docs/source/api-introduction.md b/docs/source/api-introduction.md
deleted file mode 100644
index e23fbaf8d62..00000000000
--- a/docs/source/api-introduction.md
+++ /dev/null
@@ -1,210 +0,0 @@
-API Documentation
-=================
-
-## Introduction
-
-Intel® Neural Compressor is an open-source Python library designed to help users quickly deploy low-precision inference solutions on popular deep learning (DL) frameworks such as TensorFlow*, PyTorch*, MXNet, and ONNX Runtime. It automatically optimizes low-precision recipes for deep learning models in order to achieve optimal product objectives, such as inference performance and memory usage, with expected accuracy criteria.
-
-## User-facing APIs
-
-These APIs are intended to unify low-precision quantization interfaces across multiple DL frameworks for the best out-of-the-box experience.
-
-> **Note**
->
-> Neural Compressor is continuously improving user-facing APIs to create a better user experience.
-
-> Two sets of user-facing APIs exist. One is the default set, supported since Neural Compressor v1.0 for backwards compatibility. The other consists of the new APIs in the `neural_compressor.experimental` package.
-
-> We recommend that you use the APIs located in `neural_compressor.experimental`. All examples have been updated to use the experimental APIs.
-
-The major differences between the default user-facing APIs and the experimental APIs are:
-
-1. The experimental APIs abstract the `neural_compressor.experimental.common.Model` concept to cover those cases whose weight and graph files are stored separately.
-2. The experimental APIs unify the calling style of the `Quantization`, `Pruning`, and `Benchmark` classes by setting model, calibration dataloader, evaluation dataloader, and metric through class attributes rather than passing them as function inputs.
-3. The experimental APIs refine Neural Compressor built-in transforms/datasets/metrics by unifying the APIs across different framework backends.
-
-## Experimental user-facing APIs
-
-Experimental user-facing APIs consist of the following components:
-
-### Quantization-related APIs
-
-```python
-# neural_compressor.experimental.Quantization
-class Quantization(object):
-    def __init__(self, conf_fname_or_obj):
-        ...
-
-    def __call__(self):
-        ...
-
-    @property
-    def calib_dataloader(self):
-        ...
-
-    @property
-    def eval_dataloader(self):
-        ...
-
-    @property
-    def model(self):
-        ...
-
-    @property
-    def metric(self):
-        ...
-
-    @property
-    def postprocess(self, user_postprocess):
-        ...
-
-    @property
-    def q_func(self):
-        ...
-
-    @property
-    def eval_func(self):
-        ...
-```
-The `conf_fname_or_obj` parameter used in the class initialization is the path to a user yaml configuration file or a `Quantization_Conf` object. This yaml file controls the entire tuning behavior on the model.
-
-**Neural Compressor User YAML Syntax**
-
-> Intel® Neural Compressor provides template yaml files for [Post-Training Quantization](../neural_compressor/template/ptq.yaml), [Quantization-Aware Training](../neural_compressor/template/qat.yaml), and [Pruning](../neural_compressor/template/pruning.yaml) scenarios. Refer to these template files to understand the meaning of each field.
-
-> Note that most fields in the yaml templates are optional. View the [HelloWorld Yaml](../examples/helloworld/tf_example2/conf.yaml) example for reference.
-
-```python
-# Typical launcher code
-from neural_compressor.experimental import Quantization, common
-
-# optional if a Neural Compressor built-in dataset can be used as model input in yaml
-class dataset(object):
-    def __init__(self, *args):
-        ...
-
-    def __getitem__(self, idx):
-        # return a single (sample, label) tuple without collation; label should be 0 for the label-free case
-        ...
-
-    def __len__(self):
-        ...
-
-# optional if a Neural Compressor built-in metric can be used for accuracy evaluation on model output in yaml
-class custom_metric(object):
-    def __init__(self):
-        ...
-
-    def update(self, predict, label):
-        # metric update per mini-batch
-        ...
-
-    def result(self):
-        # final metric calculation, invoked only once after all mini-batches are evaluated
-        # return a scalar to neural_compressor for accuracy-driven tuning.
-        # by default the scalar is higher-is-better. if not, set tuning.accuracy_criterion.higher_is_better to false in yaml.
-        ...
-
-quantizer = Quantization('conf.yaml')
-quantizer.model = '/path/to/model'
-# below two lines are optional if a Neural Compressor built-in dataset is used as model calibration input in yaml
-cal_dl = dataset('/path/to/calibration/dataset')
-quantizer.calib_dataloader = common.DataLoader(cal_dl, batch_size=32)
-# below two lines are optional if a Neural Compressor built-in dataset is used as model evaluation input in yaml
-dl = dataset('/path/to/evaluation/dataset')
-quantizer.eval_dataloader = common.DataLoader(dl, batch_size=32)
-# optional if a Neural Compressor built-in metric can be used for accuracy evaluation in yaml
-quantizer.metric = common.Metric(custom_metric)
-q_model = quantizer.fit()
-q_model.save('/path/to/output/dir')
-```
-
-The `model` attribute in the `Quantization` class is an abstraction of model formats across different frameworks. Neural Compressor supports passing the path of a `keras model`, `frozen pb`, `checkpoint`, `saved model`, `torch.nn.model`, `mxnet.symbol.Symbol`, `gluon.HybridBlock`, or `onnx model` to instantiate a `neural_compressor.experimental.common.Model` and set it to `quantizer.model`.
-
-The `calib_dataloader` and `eval_dataloader` attributes in the `Quantization` class are used to set up calibration and evaluation dataloaders by code. They are optional if the user sets the corresponding fields in yaml.
-
-The `metric` attribute in the `Quantization` class is used to set up a custom metric by code. It is optional if a Neural Compressor built-in metric can be used with the model and the corresponding fields are set in yaml.
-
-The `postprocess` attribute in the `Quantization` class is not necessary in most use cases. It is only needed when the user wants to use a built-in metric but the model output cannot be directly handled by Neural Compressor built-in metrics. In this case, the user can register a transform to convert the model output to the form expected by the built-in metric.
-
-The `q_func` attribute in the `Quantization` class is only for the `Quantization Aware Training` case, in which the user needs to register a function that takes `model` as the input parameter and executes the entire training process with self-contained training hyper-parameters.
-
-The `eval_func` attribute in the `Quantization` class is reserved for special cases. If the user already has an evaluation function when training a model, the user should implement a `calib_dataloader`, leave `eval_dataloader` as None, and modify the evaluation function to take `model` as the input parameter and return a higher-is-better scalar. In some scenarios, this may reduce development effort.
-
-### Pruning-related APIs (POC)
-
-```python
-class Pruning(object):
-    def __init__(self, conf_fname_or_obj):
-        ...
-
-    def on_epoch_begin(self, epoch):
-        ...
-
-    def on_step_begin(self, batch_id):
-        ...
-
-    def on_step_end(self):
-        ...
-
-    def on_epoch_end(self):
-        ...
-
-    def __call__(self):
-        ...
-
-    @property
-    def model(self):
-        ...
-
-    @property
-    def q_func(self):
-        ...
-```
-
-This API is used for sparsity pruning. Currently, it is a proof of concept; Neural Compressor only supports `magnitude pruning` on PyTorch.
-
-To learn how to use this API, refer to the [pruning document](../docs/pruning.md).
-
-### Benchmarking-related APIs
-
-```python
-class Benchmark(object):
-    def __init__(self, conf_fname_or_obj):
-        ...
-
-    def __call__(self):
-        ...
-
-    @property
-    def model(self):
-        ...
-
-    @property
-    def metric(self):
-        ...
-
-    @property
-    def b_dataloader(self):
-        ...
-
-    @property
-    def postprocess(self, user_postprocess):
-        ...
-```
-
-This API is used to measure model performance and accuracy.
-
-To learn how to use this API, refer to the [benchmarking document](../docs/benchmark.md).
-
-## Default user-facing APIs
-
-The default user-facing APIs exist for backwards compatibility from the v1.0 release. Refer to the [v1.1 API](https://github.com/intel/neural-compressor/blob/v1.1/docs/introduction.md) to understand how the default user-facing APIs work.
-
-View the [HelloWorld example](/examples/helloworld/tf_example6) that uses the default user-facing APIs for reference.
-
-Full examples using the default user-facing APIs can be found [here](https://github.com/intel/neural-compressor/tree/v1.1/examples).
diff --git a/docs/source/index.rst b/docs/source/index.rst
index afcf722f21d..b742e5360f3 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -10,7 +10,7 @@ Sections
 .. toctree::
    :maxdepth: 1
 
-   README.md
+   Welcome.md
    examples_readme.md
    api-documentation/apis.rst
    releases_info.md
diff --git a/docs/source/legal_information.md b/docs/source/legal_information.md
index c9ede70d378..511a04b7a58 100644
--- a/docs/source/legal_information.md
+++ b/docs/source/legal_information.md
@@ -16,7 +16,7 @@ See the accompanying [license](https://github.com/intel/neural-compressor/tree/m
 
 ## Citation
 
-If you use Intel® Neural Compressor in your research or you wish to refer to the tuning results published in the [Validated Models](getting_started.md), use the following BibTeX entry.
+If you use Intel® Neural Compressor in your research or you wish to refer to the tuning results published in the [Validated Models](validated_model_list.md), use the following BibTeX entry.
 
 ```
 @misc{Intel® Neural Compressor,
diff --git a/docs/source/reference_examples.md b/docs/source/reference_examples.md
deleted file mode 100644
index 4fa1dc38a42..00000000000
--- a/docs/source/reference_examples.md
+++ /dev/null
@@ -1,149 +0,0 @@
-Reference Examples
-===
-## Validated Models
-| Model | Accuracy (INT8) | Accuracy (FP32) | Acc Ratio [(INT8-FP32)/FP32] | Performance (INT8) | Performance (FP32) | Performance Ratio [INT8/FP32] |
-|---|---|---|---|---|---|---|
-| bert_large_squad_static | 90.78% | 90.87% | -0.11% | 49.08 | 13.48 | 3.64x |
-| bert_base_mrpc_static | 82.35% | 83.09% | -0.89% | 497.28 | 151.16 | 3.29x |
-| bert_base_nli_mean_tokens_stsb_static | 89.23% | 89.55% | -0.36% | 546.97 | 151.77 | 3.60x |
-| bert_base_sparse_mrpc_static | 70.59% | 70.59% | 0.00% | 551.90 | 153.80 | 3.59x |
-| bert_mini_mrpc_static | 78.19% | 78.68% | -0.62% | 6962.58 | 3252.14 | 2.14x |
-| bert_mini_sst2_static | 87.16% | 86.93% | 0.26% | 6850.38 | 3218.98 | 2.13x |
-| distilbert_base_uncased_sst2_static | 90.14% | 90.25% | -0.12% | 1086.13 | 306.45 | 3.54x |
-| distilbert_base_uncased_mrpc_static | 83.82% | 84.07% | -0.30% | 1091.99 | 303.92 | 3.59x |
-| distilbert_base_uncased_emotion_static | 93.90% | 94.20% | -0.32% | 1081.35 | 306.33 | 3.53x |
-| minilm_l6_h384_uncased_sst2_static | 89.33% | 90.14% | -0.90% | 2594.77 | 1083.84 | 2.39x |
-| roberta_base_mrpc_static | 88.24% | 88.97% | -0.82% | 508.14 | 153.37 | 3.31x |
-| distilroberta_base_wnli_static | 56.34% | 56.34% | 0.00% | 1097.22 | 315.94 | 3.47x |
-| paraphrase_xlm_r_multilingual_v1_stsb_static | 86.66% | 87.23% | -0.65% | 552.44 | 153.74 | 3.59x |
-| finbert_financial_phrasebank_static | 82.57% | 82.80% | -0.28% | 999.94 | 292.55 | 3.42x |
-
-Note: measured at batch size 1, 4 cores/instance, 10 instances on 1 socket of an Intel Xeon Platinum 8380 Scalable processor
diff --git a/docs/source/welcome.md b/docs/source/welcome.md
deleted file mode 100644
index 3531bf0e052..00000000000
--- a/docs/source/welcome.md
+++ /dev/null
@@ -1,26 +0,0 @@
-Introduction to Intel® Neural Compressor
-==========================
-
-Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool) is an open-source Python library running on Intel CPUs and GPUs, which delivers unified interfaces across multiple deep learning frameworks for popular network compression technologies such as quantization, pruning, and knowledge distillation. This tool supports automatic accuracy-driven tuning strategies to help users quickly find the best quantized model. It also implements different weight pruning algorithms to generate pruned models with a predefined sparsity goal and supports knowledge distillation to distill the knowledge from the teacher model to the student model.
-
-> **Note**: GPU support is under development.
-
-| Architecture | Workflow |
-| - | - |
-| ![Architecture](./_static/imgs/architecture.png "Architecture") | ![Workflow](./_static/imgs/workflow.png "Workflow") |
-
-Supported deep learning frameworks are:
-
-* [TensorFlow\*](https://github.com/Intel-tensorflow/tensorflow), including [1.15.0 UP3](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up3), [1.15.0 UP2](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up2), [1.15.0 UP1](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up1), [2.1.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.1.0), [2.2.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.2.0), [2.3.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.3.0), [2.4.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.4.0), [2.5.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.5.0), [Official TensorFlow 2.6.0](https://github.com/tensorflow/tensorflow/tree/v2.6.0)
-
-> **Note**: Intel Optimized TensorFlow 2.5.0 requires setting the environment variable TF_ENABLE_MKL_NATIVE_FORMAT=0 before running the quantization process or deploying the quantized model.
-
-> **Note**: From the official TensorFlow 2.6.0, oneDNN support has been upstreamed. Download the official TensorFlow 2.6.0 binary for the CPU device and set the environment variable TF_ENABLE_ONEDNN_OPTS=1 before running the quantization process or deploying the quantized model.
-
-* [PyTorch\*](https://pytorch.org/), including [1.5.0+cpu](https://download.pytorch.org/whl/torch_stable.html), [1.6.0+cpu](https://download.pytorch.org/whl/torch_stable.html), [1.8.0+cpu](https://download.pytorch.org/whl/torch_stable.html)
-* [Apache\* MXNet](https://mxnet.apache.org), including [1.6.0](https://github.com/apache/incubator-mxnet/tree/1.6.0), [1.7.0](https://github.com/apache/incubator-mxnet/tree/1.7.0), [1.8.0](https://github.com/apache/incubator-mxnet/tree/1.8.0)
-* [ONNX\* Runtime](https://github.com/microsoft/onnxruntime), including [1.6.0](https://github.com/microsoft/onnxruntime/tree/v1.6.0), [1.7.0](https://github.com/microsoft/onnxruntime/tree/v1.7.0), [1.8.0](https://github.com/microsoft/onnxruntime/tree/v1.8.0)
-
-[Get started](getting_started.md) with installation, tutorials, examples, and more!
-
-View the Intel® Neural Compressor repo at: <https://github.com/intel/neural-compressor>.
diff --git a/examples/README.md b/examples/README.md
index 3cbfd9758d9..3edba4b2212 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -1,6 +1,6 @@
 Examples
-===
-Intel® Neural Compressor validated examples with multiple compression techniques, including quantization, pruning, knowledge distillation and orchestration. Part of the validated cases can be found in the example tables, and the release data is available [here](../docs/validated_model_list.md).
+==========
+Intel® Neural Compressor validated examples with multiple compression techniques, including quantization, pruning, knowledge distillation and orchestration. Part of the validated cases can be found in the example tables, and the release data is available [here](../docs/source/validated_model_list.md).
 
 ## Helloworld Examples