Extending UniformQuantizedType with interface-based support for new storage types in Quant dialect #149
base: npu/release/19.x
int64_t getDefaultMinimum(bool isSigned, unsigned integralWidth) const {
  return -getDefaultMaximum(isSigned, integralWidth);
}
std::string printStorageType([[maybe_unused]] bool isSigned, [[maybe_unused]] unsigned storageWidth) const {
Not sure we should have a print method here. Isn't there a more canonical way to stringify the type name?
It's more like getStorageTypeName.
  }
  return llvm::maxUIntN(integralWidth);
}
std::string printStorageType(bool isSigned, unsigned storageWidth) const {
Do we need to pass these arguments from outside?
Isn't there some existing interface we can use here to get these directly from `this`?
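One way to read the two review comments together is sketched below: the storage type owns its signedness and width, so the name and default bounds are derived from the object's own state rather than passed in. This is a minimal, standalone illustration, not the PR's code. `IntegerStorage` and its members are hypothetical names, the 32-bit width limit is an assumption made to keep the arithmetic inside `int64_t`, and the `i8`/`u8` spelling follows the quant dialect's textual form.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical storage-type descriptor (not an MLIR class).
// It carries its own signedness and width, so callers pass nothing in.
struct IntegerStorage {
  bool isSigned;
  unsigned width;  // storage width in bits; assumed to be in [1, 32]

  // Name derived from the object's own state, e.g. "i8" or "u8".
  std::string getStorageTypeName() const {
    return (isSigned ? "i" : "u") + std::to_string(width);
  }

  // Largest representable value for this width and signedness.
  int64_t getDefaultMaximum() const {
    assert(width >= 1 && width <= 32 && "width outside assumed range");
    return isSigned ? (int64_t{1} << (width - 1)) - 1
                    : (int64_t{1} << width) - 1;
  }

  // Smallest representable value; zero for unsigned storage.
  int64_t getDefaultMinimum() const {
    assert(width >= 1 && width <= 32 && "width outside assumed range");
    return isSigned ? -(int64_t{1} << (width - 1)) : 0;
  }
};
```

With this shape, a call site would write `IntegerStorage{true, 8}.getStorageTypeName()` and never repeat the signedness or width that the type already knows.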
…ck) (intel#142) Add a new API to access all blobs that are stored in the blob manager. The main purpose (as of now) is to allow users of dialect resources to iterate over all blobs, especially when the blobs are no longer used in IR (e.g. the operation that uses the blob is deleted) and thus cannot be easily accessed without manual tracking of keys.
d6153a3 to 9ad9c9f
ZoranZomborat left a comment
Proposal looks great! Let's also get some feedback from LLVM Discourse and motivate it with our use cases.
Summary
Currently, UniformQuantizedType supports only built-in MLIR storage types such as Integer. Recent LLM quantization research has introduced NF4 as a low-precision data type (see https://arxiv.org/pdf/2305.14314). As more such types are added, the system needs to be extensible and maintainable. Supporting NF4 natively in MLIR through a clean, extensible interface is essential for both current and future quantization workflows.
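To make the idea of an interface-based storage type concrete, here is a rough standalone sketch in plain C++, written without MLIR so it can be compiled on its own. Every name in it (`StorageTypeInterface`, `IntegerStorageType`, `SymmetricFourBitStorageType`, `isValidStorageRange`) is hypothetical and is not taken from the PR or from MLIR. The 4-bit type is only a placeholder with made-up bounds; it is not NF4 and does not reflect NF4's actual value set. It borrows just the pattern from the diff in which the default minimum is the negated default maximum.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical interface: what a quantized type needs from its storage type.
class StorageTypeInterface {
public:
  virtual ~StorageTypeInterface() = default;
  virtual int64_t getDefaultMinimum() const = 0;
  virtual int64_t getDefaultMaximum() const = 0;
  virtual std::string getStorageTypeName() const = 0;
};

// Integer storage: bounds follow from signedness and bit width.
class IntegerStorageType : public StorageTypeInterface {
public:
  IntegerStorageType(bool isSigned, unsigned width)
      : isSigned(isSigned), width(width) {
    assert(width >= 1 && width <= 32 && "width outside assumed range");
  }
  int64_t getDefaultMinimum() const override {
    return isSigned ? -(int64_t{1} << (width - 1)) : 0;
  }
  int64_t getDefaultMaximum() const override {
    return isSigned ? (int64_t{1} << (width - 1)) - 1
                    : (int64_t{1} << width) - 1;
  }
  std::string getStorageTypeName() const override {
    return (isSigned ? "i" : "u") + std::to_string(width);
  }

private:
  bool isSigned;
  unsigned width;
};

// Placeholder non-integer storage with an illustrative symmetric range.
// The bounds and the name "sym4" are invented for this sketch only.
class SymmetricFourBitStorageType : public StorageTypeInterface {
public:
  int64_t getDefaultMaximum() const override { return 7; }
  int64_t getDefaultMinimum() const override { return -getDefaultMaximum(); }
  std::string getStorageTypeName() const override { return "sym4"; }
};

// Generic consumer: checks a requested range using only the interface,
// so adding a new storage type needs no change here.
bool isValidStorageRange(const StorageTypeInterface &type, int64_t min,
                         int64_t max) {
  return min < max && min >= type.getDefaultMinimum() &&
         max <= type.getDefaultMaximum();
}
```

The point of the design is that `isValidStorageRange` never switches on the concrete type, so supporting another storage type means adding one class rather than editing every consumer.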
Current Approach and Its Limitations:
Proposed Interface-Based Approach:
Benefits: