This repository was archived by the owner on Nov 16, 2023. It is now read-only.
Mismatch in output of onnx exported CharTokenizer model #477
Open
Description
The ONNX export test for CharTokenizer is failing in the current test suite, so it has been disabled (link). The outputs from ML.NET, OnnxRunner, and ORT on that test all differ.
Here is a repro script and its output. Note the differences in both values and dtypes across the three outputs.
NOTE: The DataFrameTool is the one found here in the repository.
Repro
import pandas as pd
import tempfile
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing.text import CharTokenizer
from nimbusml.preprocessing import OnnxRunner
from data_frame_tool import DataFrameTool as DFT
file_path = get_dataset("wiki_detox_train").as_filepath()
dataset = pd.read_csv(file_path, sep='\t')
dataset = dataset.head(10)
estimator = CharTokenizer(columns={'SentimentText_Transform': 'SentimentText'})
estimator.fit(dataset)
print("\n\nML.NET RESULT")
result_expected = estimator.transform(dataset)
print(estimator.model_)
print(result_expected)
print(result_expected.dtypes)
print("\n\nORT RESULT")  # NOTE: label as printed; the block below calls NimbusML's OnnxRunner, not DataFrameTool
onnx_path = "C:\\Users\\anvelazq\\Desktop\\is29chartokenizer\\chartokenizer.onnx"
estimator.export_to_onnx(onnx_path, 'com.microsoft.ml')
onnxrunner = OnnxRunner(model_file=onnx_path)
result_onnx = onnxrunner.fit_transform(dataset)
print(result_onnx)
print(result_onnx.dtypes)
print("\n\nONNX RUNNER RESULT")  # NOTE: label as printed; the block below calls DataFrameTool, not OnnxRunner
df_tool = DFT(onnx_path)
result_ort = df_tool.execute(dataset, [])
print(result_ort)
print(result_ort.dtypes)
Output
ML.NET RESULT
C:\Users\anvelazq\AppData\Local\Temp\tmp4dd2p6jl.model.bin
Sentiment SentimentText SentimentText_Transform.000 ... SentimentText_Transform.419 SentimentText_Transform.420 SentimentText_Transform.421
0 1 ==RUDE== Dude, you are rude upload that carl... 1.0 ... NaN NaN NaN
1 1 == OK! == IM GOING TO VANDALIZE WILD ONES W... 1.0 ... NaN NaN NaN
2 1 Stop trolling, zapatancas, calling me a lia... 1.0 ... NaN NaN NaN
3 1 ==You're cool== You seem like a really cool... 1.0 ... NaN NaN NaN
4 1 ::::: Why are you threatening me? I'm not bei... 1.0 ... NaN NaN NaN
5 1 == hey waz up? == hey ummm... the fif four ... 1.0 ... NaN NaN NaN
6 0 ::::::::::I'm not sure either. I think it has... 1.0 ... NaN NaN NaN
7 0 *::Your POV and propaganda pushing is dully n... 1.0 ... 45.0 31.0 2.0
8 0 == File:Hildebrandt-Greg and Tim.jpg listed ... 1.0 ... NaN NaN NaN
9 0 ::::::::This is a gross exaggeration. Nobody... 1.0 ... NaN NaN NaN
[10 rows x 424 columns]
Sentiment int64
SentimentText object
SentimentText_Transform.000 float64
SentimentText_Transform.001 float64
SentimentText_Transform.002 float64
...
SentimentText_Transform.417 float64
SentimentText_Transform.418 float64
SentimentText_Transform.419 float64
SentimentText_Transform.420 float64
SentimentText_Transform.421 float64
Length: 424, dtype: object
ORT RESULT
Sentiment SentimentText SentimentText_Transform.000 ... SentimentText_Transform.419 SentimentText_Transform.420 SentimentText_Transform.421
0 1 ==RUDE== Dude, you are rude upload that carl... 2.0 ... NaN NaN NaN
1 1 == OK! == IM GOING TO VANDALIZE WILD ONES W... 2.0 ... NaN NaN NaN
2 1 Stop trolling, zapatancas, calling me a lia... 2.0 ... NaN NaN NaN
3 1 ==You're cool== You seem like a really cool... 2.0 ... NaN NaN NaN
4 1 ::::: Why are you threatening me? I'm not bei... 2.0 ... NaN NaN NaN
5 1 == hey waz up? == hey ummm... the fif four ... 2.0 ... NaN NaN NaN
6 0 ::::::::::I'm not sure either. I think it has... 2.0 ... NaN NaN NaN
7 0 *::Your POV and propaganda pushing is dully n... 2.0 ... 46.0 32.0 3.0
8 0 == File:Hildebrandt-Greg and Tim.jpg listed ... 2.0 ... NaN NaN NaN
9 0 ::::::::This is a gross exaggeration. Nobody... 2.0 ... NaN NaN NaN
[10 rows x 424 columns]
Sentiment int64
SentimentText object
SentimentText_Transform.000 float32
SentimentText_Transform.001 float32
SentimentText_Transform.002 float32
...
SentimentText_Transform.417 float32
SentimentText_Transform.418 float32
SentimentText_Transform.419 float32
SentimentText_Transform.420 float32
SentimentText_Transform.421 float32
Length: 424, dtype: object
ONNX RUNNER RESULT
Sentiment.output SentimentText.output SentimentText_Transform.output.0 ... SentimentText_Transform.output.419 SentimentText_Transform.output.420 SentimentText_Transform.output.421
0 1 ==RUDE== Dude, you are rude upload that carl... 2 ... 65535 65535 65535
1 1 == OK! == IM GOING TO VANDALIZE WILD ONES W... 2 ... 65535 65535 65535
2 1 Stop trolling, zapatancas, calling me a lia... 2 ... 65535 65535 65535
3 1 ==You're cool== You seem like a really cool... 2 ... 65535 65535 65535
4 1 ::::: Why are you threatening me? I'm not bei... 2 ... 65535 65535 65535
5 1 == hey waz up? == hey ummm... the fif four ... 2 ... 65535 65535 65535
6 0 ::::::::::I'm not sure either. I think it has... 2 ... 65535 65535 65535
7 0 *::Your POV and propaganda pushing is dully n... 2 ... 46 32 3
8 0 == File:Hildebrandt-Greg and Tim.jpg listed ... 2 ... 65535 65535 65535
9 0 ::::::::This is a gross exaggeration. Nobody... 2 ... 65535 65535 65535
[10 rows x 424 columns]
Sentiment.output int64
SentimentText.output object
SentimentText_Transform.output.0 uint16
SentimentText_Transform.output.1 uint16
SentimentText_Transform.output.2 uint16
...
SentimentText_Transform.output.417 uint16
SentimentText_Transform.output.418 uint16
SentimentText_Transform.output.419 uint16
SentimentText_Transform.output.420 uint16
SentimentText_Transform.output.421 uint16
Length: 424, dtype: object
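The columns visible in the truncated output above hint at two separate discrepancies: the exported model's values are one higher than ML.NET's (1 vs 2, 45 vs 46, 31 vs 32, 2 vs 3), and missing slots show up as NaN in the two NimbusML frames but as 65535 in the uint16 frame. The sketch below is one way to check whether those two differences account for everything. It uses toy frames holding only the values shown above, not the real 422 columns. The helper `normalize_tokens`, the treatment of 65535 as a "missing" sentinel, and the offset of 1 are assumptions made for illustration; the output alone does not establish why the offset exists.

```python
import numpy as np
import pandas as pd

# Assumption: the uint16 maximum (65535) marks a missing token slot.
UINT16_MISSING = np.iinfo(np.uint16).max


def normalize_tokens(frame, sentinel=None, offset=0):
    """Hypothetical helper: cast to float64, map an optional sentinel
    value to NaN, then subtract a constant offset."""
    out = frame.astype("float64")
    if sentinel is not None:
        out = out.mask(frame == sentinel)
    return out - offset


# Toy data copied from the visible cells of rows 0 and 7
# (columns .000, .419, .420, .421) in the three printed results.
mlnet = pd.DataFrame([[1.0, np.nan, np.nan, np.nan],
                      [1.0, 45.0, 31.0, 2.0]])
float32_result = pd.DataFrame([[2.0, np.nan, np.nan, np.nan],
                               [2.0, 46.0, 32.0, 3.0]], dtype="float32")
uint16_result = pd.DataFrame([[2, 65535, 65535, 65535],
                              [2, 46, 32, 3]], dtype="uint16")

a = normalize_tokens(mlnet)
b = normalize_tokens(float32_result, offset=1)
c = normalize_tokens(uint16_result, sentinel=UINT16_MISSING, offset=1)

# DataFrame.equals treats NaNs in the same positions as equal.
print(a.equals(b), a.equals(c))
```

On the real frames, the same helper could be applied to the `SentimentText_Transform.*` columns after aligning the column names, since the third result uses a `.output.N` naming scheme. If the comparison still fails after normalization, the remaining mismatch would be worth reporting separately.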