%% Cell type:markdown id: tags:
# Quantize models with Optimum Intel
%% Cell type:code id: tags:
``` python
%%sh
pip -q uninstall torch -y
pip -q install torch==1.11.0+cpu --extra-index-url https://download.pytorch.org/whl/cpu
```
%% Cell type:code id: tags:
``` python
%%sh
pip -q install transformers datasets evaluate scipy --upgrade
pip -q install "optimum[intel]" --upgrade
```
%% Cell type:code id: tags:
``` python
import evaluate
from datasets import load_dataset, load_from_disk
from transformers import AutoModelForSequenceClassification, AutoTokenizer
```
%% Cell type:code id: tags:
``` python
# Run this cell only if you need to build or resize the dataset; otherwise skip it
dataset = load_dataset("amazon_us_reviews", "Shoes_v1_00", split="train[:1%]")
dataset = dataset.remove_columns(
    [
        "marketplace",
        "customer_id",
        "review_id",
        "product_id",
        "product_parent",
        "product_title",
        "product_category",
        "helpful_votes",
        "total_votes",
        "vine",
        "verified_purchase",
        "review_headline",
        "review_date",
    ]
)
dataset = dataset.rename_column("star_rating", "labels")
dataset = dataset.rename_column("review_body", "text")

# Shift star ratings from 1-5 to 0-4 to match the model's label ids
def decrement_stars(row):
    return {"labels": row["labels"] - 1}

dataset = dataset.map(decrement_stars)
dataset_split = dataset.train_test_split(test_size=0.01, shuffle=True, seed=59)
dataset_split["test"].save_to_disk("data_quantize/test")
dataset_split["test"]
```
%% Output
Reusing dataset amazon_us_reviews (/home/ec2-user/.cache/huggingface/datasets/amazon_us_reviews/Shoes_v1_00/0.1.0/17b2481be59723469538adeb8fd0a68b0ba363bbbdd71090e72c325ee6c7e563)
Dataset({
    features: ['labels', 'text'],
    num_rows: 437
})
%% Cell type:code id: tags:
``` python
eval_dataset = load_from_disk("./data_quantize/test")
```
%% Cell type:code id: tags:
``` python
model_name = "juliensimon/distilbert-amazon-shoe-reviews"
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=5)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
%% Output
%% Cell type:markdown id: tags:
We define an evaluation function that the quantizer will call during the tuning process: it computes the mean squared error (MSE) of the model's predictions on the evaluation set.
%% Cell type:code id: tags:
``` python
def eval_func(model):
    # Build a text-classification evaluator that pipelines the model
    # and tokenizer over the evaluation dataset
    task_evaluator = evaluate.evaluator("text-classification")
    results = task_evaluator.compute(
        model_or_pipeline=model,
        tokenizer=tokenizer,
        data=eval_dataset,
        metric=evaluate.load("mse"),
        label_column="labels",
        label_mapping=model.config.label2id,
    )
    return results["mse"]
```
%% Cell type:code id: tags:
``` python
eval_func(model)
```
%% Output
0.425629290617849
%% Cell type:markdown id: tags:
Then, we set up quantization. We allow a maximum relative accuracy loss (```accuracy_criterion.relative```) of 3%, and we let the tuning job run for up to 30 trials (```tuning.exit_policy.max_trials```).
You can learn about tuning strategies in the Intel Neural Compressor [documentation](https://intel.github.io/neural-compressor/docs/tuning_strategies.html); a sketch of how a strategy would be selected follows the config listing below.
%% Cell type:code id: tags:
``` python
!pygmentize intel_neural_compressor/quantize.yml
```
%% Output
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
version: 1.0
model:                                     # mandatory.
  name: bert_prune
  framework: pytorch                       # mandatory. possible values are pytorch and pytorch_fx.

device: cpu

quantization:                              # optional.
  approach: post_training_dynamic_quant

tuning:
  accuracy_criterion:
    relative: 0.03                         # optional. default mode is relative, other value is absolute. this example allows a 3% relative accuracy loss.
  exit_policy:
    timeout: 0                             # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with the max_trials field to decide when to exit.
    max_trials: 30
  random_seed: 9527                        # optional. random seed for deterministic tuning.
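%% Cell type:markdown id: tags:
The config above relies on the default tuning strategy. To try another one, such as bayesian optimization, you would add a ```tuning.strategy``` section to the YAML. This is a sketch based on the Intel Neural Compressor schema linked above, not part of the original config:
``` yaml
tuning:
  strategy:
    name: bayesian   # other documented values include basic (default), mse, exhaustive, random
  accuracy_criterion:
    relative: 0.03
```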
%% Cell type:code id: tags:
``` python
# The original cell is truncated here; the import below is a plausible completion
# based on the optimum-intel 1.x API of the time (class names are assumptions)
from optimum.intel.neural_compressor import IncOptimizer, IncQuantizationConfig, IncQuantizer
```
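%% Cell type:markdown id: tags:
The original notebook ends at the import above. As a hedged sketch of the remaining steps, based on the optimum-intel 1.x API of the era (```IncQuantizationConfig```, ```IncQuantizer``` and ```IncOptimizer``` are assumptions, not verbatim from the source), the configuration and evaluation function would be wired together along these lines:
%% Cell type:code id: tags:
``` python
# Sketch only: names and signatures follow the optimum-intel 1.x API as
# documented at the time, and are not taken from the original notebook.

# Load the YAML shown above from the intel_neural_compressor directory
quantization_config = IncQuantizationConfig.from_pretrained(
    "intel_neural_compressor", config_file_name="quantize.yml"
)

# The quantizer drives tuning with our MSE-based evaluation function
quantizer = IncQuantizer(quantization_config, eval_func=eval_func)
optimizer = IncOptimizer(model, quantizer=quantizer)

# Run post-training dynamic quantization, then compare with the baseline MSE
quantized_model = optimizer.fit()
eval_func(quantized_model)
```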