merge bench code into benchgc #199
Conversation
Commits ed55792 to c3c5441
Added an MLP case to the correctness check script as an example; more will be added in the future.
@yifeizh2 @zhczhong @crazydemo @niuxiaog @BRUCE11111 The bench tool is now part of benchgc; please follow the new README to run benchmarks: https://github.com/intel/graph-compiler/blob/xurui/merge_into_benchgc/test/benchgc/README.md Note: the current benchgc does not provide the DLTI attr for the MLIR module; that will be added in the next PR.
* 3 : COMPARE_VERBOSE, + print threshold for comparison
* 4 : ERROR_OUTPUT_VERBOSE, + print all error data points if failed
* 5 : OUTPUT_VERBOSE, + print all results, including passed tensors
* 6 : INPUT_VERBOSE, + print input torch tensors
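The cumulative verbosity levels above could be modelled like this (a minimal sketch with illustrative names, not benchgc's actual implementation):

```python
# Hypothetical sketch: each verbosity level includes everything the
# lower levels print, so a simple >= comparison decides what to emit.
COMPARE_VERBOSE = 3       # + print threshold for comparison
ERROR_OUTPUT_VERBOSE = 4  # + print all error data points if failed
OUTPUT_VERBOSE = 5        # + print all results, including passed tensors
INPUT_VERBOSE = 6         # + print input torch tensors

def should_print(event_level: int, verbose: int) -> bool:
    """An event is printed when the configured verbosity reaches its level."""
    return verbose >= event_level

print(should_print(COMPARE_VERBOSE, 4))  # True: level 4 includes level 3
print(should_print(INPUT_VERBOSE, 4))    # False: inputs need level 6
```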
Do you think saving the tensor to a file would be better than printing it in the terminal?
I agree with you; if the tensor is large, dumping it to a file sounds better than printing. The printing was not added by this PR; I only documented it in the README.txt. I can discuss with @WangJialei-A, and maybe we could provide another option to dump. For this PR, let's keep the printing.
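One possible shape for the dump option discussed here (a sketch only, using numpy as a stand-in for benchgc's torch tensors; the function name and size threshold are hypothetical, not part of this PR):

```python
import os
import tempfile
from typing import Optional

import numpy as np

def report_tensor(t: np.ndarray, dump_dir: Optional[str] = None,
                  name: str = "out") -> None:
    """Print small tensors; dump large ones to a .npy file instead."""
    if dump_dir is not None and t.size > 16:  # threshold is arbitrary
        path = os.path.join(dump_dir, f"{name}.npy")
        np.save(path, t)
        print(f"{name}: dumped to {path}")
    else:
        print(f"{name}: {t}")

# A large tensor goes to a file rather than flooding the terminal.
with tempfile.TemporaryDirectory() as d:
    report_tensor(np.arange(100.0), dump_dir=d, name="result")
```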
@ciyongch @xurui1995 This needs more discussion; we should design this part carefully.
@ciyongch @xurui1995 This needs more discussion; we should design this part carefully.
For debuggability, we should have a more convenient and flexible way to get intermediate results.
assert (
    len(module.operation.regions) == 1
), "Expected kernel module to have only one region"
assert (
    len(module.operation.regions[0].blocks) == 1
), "Expected kernel module to have only one block"
Why do we have such a limitation?
Removed now. Here I try to find the entry among the top-level functions of a module; in fact, I have not seen any module with more than one region or block.
Why did you remove the original get_entry?
They are almost the same; in the original get_entry, you have to pass the entry name as '"entry"', which is kind of annoying. In addition, a function that gets a FuncOp by name also covers the case of getting only the entry.
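The point made here can be sketched in plain Python, independent of the MLIR bindings (the function names and the dict stand-in for a module's symbol table are hypothetical): a lookup by arbitrary name subsumes the fixed-name entry lookup.

```python
def get_func_by_name(module_funcs: dict, name: str):
    """Return the function registered under `name`, or raise if missing."""
    if name not in module_funcs:
        raise KeyError(f"no function named {name!r} in module")
    return module_funcs[name]

def get_entry(module_funcs: dict):
    """The old fixed-name helper is just a special case of the lookup."""
    return get_func_by_name(module_funcs, "entry")

# A toy "module" mapping symbol names to function bodies.
funcs = {"entry": "func.func @entry", "mlp": "func.func @mlp"}
print(get_entry(funcs))                # the fixed-name case
print(get_func_by_name(funcs, "mlp"))  # the general case
```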
LGTM, with a minor comment.
arg.fill_param = [
    "matmul",
    "wei",
    arglist[0].dtype,
Shall we use a different idx here for wei?
Thank you for pointing this out. Previously, I tried to use a single matmul op's filling strategy in the MLP, but later I encountered some issues. Now, I have aligned the filling and compare strategies of the MLP with the MLP validation script in GC v1. In the future, a separate PR might be proposed to optimize the MLP's filling.
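As a sketch of what the suggested per-argument indexing could look like (the helper name and list layout are assumptions based on the `fill_param` snippet above, not this PR's implementation):

```python
# Hypothetical sketch: give each weight tensor its own index so two
# "wei" arguments filled with the same matmul-style strategy do not
# end up with identical data.
def make_fill_param(driver: str, role: str, dtype: str, idx: int) -> list:
    return [driver, role, dtype, str(idx)]

# Two weight tensors in an MLP get distinct indices.
params = [make_fill_param("matmul", "wei", "f32", i) for i in range(2)]
print(params)
```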
merge bench code into benchgc

* mode=P for performance testing
* driver=pattern, case=mlp for the mlp pattern

issue: #172