[mlir][tensor] Improve FoldTensorCastProducerOp (dynamic shapes)
#114559

Conversation
Currently, `FoldTensorCastProducerOp` incorrectly folds the following:
```mlir
%pack = tensor.pack %src
padding_value(%pad : i32)
inner_dims_pos = [0, 1]
inner_tiles = [%c8, 1]
into %cast : tensor<7x?xi32> -> tensor<1x1x?x1xi32>
%res = tensor.cast %pack : tensor<1x1x?x1xi32> to tensor<1x1x8x1xi32>
```
as follows (note the static trailing dim in the result and the dynamic
inner tile size that corresponds to it):
```mlir
%res = tensor.pack %src
padding_value(%pad : i32)
inner_dims_pos = [0, 1]
inner_tiles = [%c8, 1]
into %cast : tensor<7x?xi32> -> tensor<1x1x8x1xi32>
```
This triggers an Op verification failure because the folder does not
update the inner tile sizes in the pack Op. This PR addresses that.
Note: supporting other Ops with size-like attributes is left as a TODO.
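For reference, the correct fold keeps the inner tile size in sync with the now-static result dim and consumes the uncasted destination directly (this matches the new canonicalization test added in this PR):

```mlir
%res = tensor.pack %src
    padding_value(%pad : i32)
    inner_dims_pos = [0, 1]
    inner_tiles = [8, 1]
    into %dest : tensor<7x?xi32> -> tensor<1x1x8x1xi32>
```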
@llvm/pr-subscribers-mlir-tensor

Author: Andrzej Warzyński (banach-space)

Full diff: https://github.com/llvm/llvm-project/pull/114559.diff (2 Files Affected):
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index c2d6bc610cd92a..406b557b0f0e39 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -4756,8 +4756,50 @@ struct FoldTensorCastProducerOp
newResultTypes[dpsInitIdx++] = newOperands.back().getType();
}
- // Clone op.
- Operation *newOp = clone(rewriter, op, newResultTypes, newOperands);
+ // For ops that have sizes-like attribute, update these accordingly.
+ // For now, only `tensor.pack` is supported.
+ // TODO: Generalize to make it work with other ops as well (e.g.
+ // `tensor.unpack`)
+ SmallVector<OpFoldResult> newMixedTileSizes;
+ if (auto pack = dyn_cast_or_null<tensor::PackOp>(*op)) {
+ for (auto it : llvm::zip(cast<ShapedType>(newResultTypes[0])
+ .getShape()
+ .take_back(pack.getMixedTiles().size()),
+ pack.getMixedTiles())) {
+
+ int64_t shape = std::get<0>(it);
+ if (shape == ShapedType::kDynamic) {
+ newMixedTileSizes.push_back(std::get<1>(it));
+ continue;
+ }
+
+ if (Attribute attr =
+ llvm::dyn_cast_if_present<Attribute>(std::get<1>(it))) {
+ // Already a constant
+ newMixedTileSizes.push_back(std::get<1>(it));
+ } else {
+ auto tileSize = getConstantIntValue(std::get<1>(it));
+ assert(tileSize == shape && "tile size and dim size don't match!");
+ newMixedTileSizes.push_back(
+ (rewriter.getIntegerAttr(rewriter.getIndexType(), shape)));
+ }
+ }
+ }
+
+  // Clone op. For ops that have a size-like attribute, make sure to update
+ // those as well. For now, only `tensor.pack` is supported.
+ // TODO: Generalize to make it work with other ops as well (e.g.
+ // `tensor.unpack`)
+ Operation *newOp;
+ if (auto pack = dyn_cast_or_null<tensor::PackOp>(*op)) {
+ newOp = rewriter.create<PackOp>(
+ pack.getLoc(), newOperands[0], newOperands[1], pack.getInnerDimsPos(),
+ newMixedTileSizes, pack.getPaddingValue(), pack.getOuterDimsPerm());
+ } else {
+ newOp = clone(rewriter, op, newResultTypes, newOperands);
+ }
+
SmallVector<Value, 4> replacements;
replacements.reserve(newOp->getNumResults());
for (auto [oldResult, newResult] :
diff --git a/mlir/test/Dialect/Tensor/canonicalize.mlir b/mlir/test/Dialect/Tensor/canonicalize.mlir
index 693079c3aa2fac..ebcc69250ad56d 100644
--- a/mlir/test/Dialect/Tensor/canonicalize.mlir
+++ b/mlir/test/Dialect/Tensor/canonicalize.mlir
@@ -2718,18 +2718,37 @@ func.func @dim_out_of_bounds() -> vector<7xi32> {
// -----
-// CHECK-LABEL: func.func @test_destination_multiple_result(
+// CHECK-LABEL: func.func @fold_cast_multiple_results(
// CHECK-SAME: %[[ARG1:.*]]: tensor<2x2xf32>,
// CHECK-SAME: %[[ARG2:.*]]: tensor<2x2xf32>) -> index {
// CHECK: %[[RES:.*]]:2 = test.destination_style_op ins(%[[ARG1]] : tensor<2x2xf32>)
// CHECK-SAME: outs(%[[ARG2]] : tensor<2x2xf32>) -> tensor<2x2xf32>, index
// CHECK: return %[[RES]]#1 : index
-func.func @test_destination_multiple_result(%arg0: tensor<2x2xf32>, %arg1: tensor<2x2xf32>) -> index {
+func.func @fold_cast_multiple_results(%arg0: tensor<2x2xf32>, %arg1: tensor<2x2xf32>) -> index {
%cast = tensor.cast %arg0 : tensor<2x2xf32> to tensor<?x2xf32>
%cast_0 = tensor.cast %arg1 : tensor<2x2xf32> to tensor<?x2xf32>
%0:2 = test.destination_style_op ins(%cast : tensor<?x2xf32>) outs(%cast_0 : tensor<?x2xf32>) -> tensor<?x2xf32>, index
return %0#1 : index
}
+// -----
+
+// CHECK-LABEL: func.func @fold_cast_pack_dynamic_tile_size
+// CHECK-SAME: %[[DEST:.*]]: tensor<1x1x8x1xi32>,
+// CHECK-SAME: %[[SRC:.*]]: tensor<7x?xi32>,
+// CHECK-SAME: %[[PAD:.*]]: i32) -> tensor<1x1x8x1xi32> {
+// CHECK: %[[PACK:.*]] = tensor.pack %[[SRC]] padding_value(%[[PAD]] : i32) inner_dims_pos = [0, 1] inner_tiles = [8, 1] into %[[DEST]] : tensor<7x?xi32> -> tensor<1x1x8x1xi32>
+// CHECK: return %[[PACK]] : tensor<1x1x8x1xi32>
+func.func @fold_cast_pack_dynamic_tile_size(
+ %dest: tensor<1x1x8x1xi32>,
+ %src: tensor<7x?xi32>,
+ %pad: i32) -> tensor<1x1x8x1xi32> {
+
+ %cast = tensor.cast %dest : tensor<1x1x8x1xi32> to tensor<1x1x?x1xi32>
+ %c8 = arith.constant 8 : index
+ %pack = tensor.pack %src padding_value(%pad : i32) inner_dims_pos = [0, 1] inner_tiles = [%c8, 1] into %cast : tensor<7x?xi32> -> tensor<1x1x?x1xi32>
+ %res = tensor.cast %pack : tensor<1x1x?x1xi32> to tensor<1x1x8x1xi32>
+ return %res : tensor<1x1x8x1xi32>
+}
// -----
@llvm/pr-subscribers-mlir

Author: Andrzej Warzyński (banach-space)
```cpp
// For now, only `tensor.pack` is supported.
// TODO: Generalize to make it work with other ops as well (e.g.
// `tensor.unpack`)
SmallVector<OpFoldResult> newMixedTileSizes;
```
This seems like it is mixing something related to PackOp into a method that is for general destination passing style operations. Can we instead just make this a separate pattern?
> This seems like it is mixing something related to PackOp into a method that is for general destination passing style operations.
No, this logic applies to all Tensor ops that take a size-like attribute. Here's an example with `tensor.extract_slice`:

```mlir
%1 = tensor.extract_slice %t[%c0, %c0, %c0][%c2, %c2, %c2][%c1, %c1, %c1] : tensor<8x16x4xf32> to tensor<?x?x?xf32>
%res = tensor.cast %1 : tensor<?x?x?xf32> to tensor<2x2x2xf32>
```

The folder should return this (i.e. update the result type and the sizes attribute):

```mlir
%res = tensor.extract_slice %t[%c0, %c0, %c0][2, 2, 2][%c1, %c1, %c1] : tensor<8x16x4xf32> to tensor<2x2x2xf32>
```

instead of this (i.e. update only the result type):

```mlir
%res = tensor.extract_slice %t[%c0, %c0, %c0][%c2, %c2, %c2][%c1, %c1, %c1] : tensor<8x16x4xf32> to tensor<2x2x2xf32>
```

The latter fails verification (that's the correct and intended behaviour, as discussed in this thread).
I only happen to implement the logic for tensor.pack as that's what's blocking me right now. I am happy to extend this to other Ops, but want to avoid sending a large patch - these tend to get longer to get reviewed ;-)
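The reconciliation the pattern performs can be sketched outside of MLIR as follows. This is a minimal Python sketch, assuming constant tile sizes are modeled as plain ints and dynamic SSA tile sizes as strings; the real implementation works on `OpFoldResult` and MLIR attributes instead:

```python
DYNAMIC = -1  # stand-in for ShapedType::kDynamic


def update_tile_sizes(new_result_dims, mixed_tiles):
    """Reconcile inner tile sizes with the (possibly newly static) result dims.

    new_result_dims: trailing dims of the new (uncasted) result type.
    mixed_tiles: one entry per inner tile; an int models a constant
    attribute, a string models a dynamic SSA value.
    """
    new_tiles = []
    for dim, tile in zip(new_result_dims, mixed_tiles):
        if dim == DYNAMIC:
            # Result dim stays dynamic: keep the tile as-is.
            new_tiles.append(tile)
        elif isinstance(tile, int):
            # Tile is already a constant: nothing to update.
            new_tiles.append(tile)
        else:
            # Result dim became static while the tile is a dynamic SSA
            # value: replace the tile with the matching static size.
            new_tiles.append(dim)
    return new_tiles
```

With the PR's example, the trailing dims of `tensor<1x1x8x1xi32>` are `[8, 1]` and the mixed tiles are `[%c8, 1]`, so the reconciled tiles become `[8, 1]`.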
I read your discourse thread on that. I agree that we need to do this for all ops, but the way the "size" operands map to result types varies per op. I don't think we can combine them all into a single pattern. Let's just split the pattern. If you find a way to combine it then that's great, but I don't see one.
👍🏻 Updated, let me know whether that's what you had in mind.
[mlir][tensor] Improve FoldTensorCastProducerOp (dynamic shapes): Split into two patterns.
Max191 left a comment:

Overall makes sense to me, thanks for the detailed discussion in the thread! Just a few nits.
```cpp
for (auto it : llvm::zip(cast<ShapedType>(newResultTypes[0])
                             .getShape()
                             .take_back(op.getMixedTiles().size()),
                         op.getMixedTiles())) {
```
nit:

```cpp
for (auto [shape, innerTile] : llvm::zip_equal(
         cast<ShapedType>(newResultTypes[0])
             .getShape()
             .take_back(op.getMixedTiles().size()),
         op.getMixedTiles())) {
```
```cpp
auto tileSize = getConstantIntValue(std::get<1>(it));
assert(tileSize == shape && "tile size and dim size don't match!");
```
nit: I think it is safe to assume the size should be constant here, so you can use `int64_t`. This makes the types more clear.

```cpp
int64_t tileSize = getConstantIntValue(std::get<1>(it)).value();
assert(tileSize == shape && "tile size and dim size don't match!");
```
```cpp
SmallVector<Value, 4> replacements;
replacements.reserve(newOp->getNumResults());
for (auto [oldResult, newResult] :
     llvm::zip(op->getResults(), newOp->getResults())) {
  newResult.getType() != oldResult.getType()
      ? replacements.push_back(rewriter.create<tensor::CastOp>(
            op->getLoc(), oldResult.getType(), newResult))
      : replacements.push_back(newResult);
}
```
nit: This doesn't need a loop, since PackOp has only one result.
```cpp
// Exclude DPS ops that are also LoopLike from this interface as they
// might need special handling of attached regions.
if (isa<LoopLikeOpInterface>(op.getOperation()))
```
Nit: You can combine this condition with the other one.
[mlir][tensor] Improve FoldTensorCastProducerOp (dynamic shapes): Final tweaks.
…115698) Adds an end-to-end test for `tensor.pack` with dynamic inner tile sizes. While relatively simple (e.g., no vectorization), this example required a few non-trivial fixes in handling `tensor.pack`: * #114315, #114559, #113108. The end goal for this test is to incrementally increase its complexity and to work towards scalable tile sizes.
This patch specializes `FoldTensorCastProducerOp` for `tensor::UnPackOp` by introducing a dedicated pattern: `FoldTensorCastUnPackOp`. This mirrors a similar update made for `tensor::PackOp` in #114559. Below is the updated rationale tailored to `tensor::UnPackOp`.

ISSUE DESCRIPTION

Currently, `FoldTensorCastProducerOp` incorrectly folds the following:

```mlir
%cast = tensor.cast %dest : tensor<1x1x8x1xi32> to tensor<1x1x?x1xi32>
// Note: `%c8` and `?`.
%unpack = tensor.unpack %cast inner_dims_pos = [0, 1] inner_tiles = [%c8, 1] into %res : tensor<1x1x?x1xi32> -> tensor<7x?xi32>
```

as:

```mlir
// Note: `%c8` and `8`.
%unpack = tensor.unpack %cast inner_dims_pos = [0, 1] inner_tiles = [%c8, 1] into %res : tensor<1x1x8x1xi32> -> tensor<7x?xi32>
```

This triggers an Op verification failure because the folder does not update the inner tile sizes in the unpack Op. This patch addresses the issue by ensuring proper handling of inner tile sizes.

ADDITIONAL CHANGES

* invalid.mlir: Fixed a typo.
* TensorOps.cpp:
  * Removed unnecessary `(void)tileSize`.
  * Added comments following the discussion in PR #115772.
  * Made minor updates to `FoldTensorCastPackOp` for consistency with the newly introduced `FoldTensorCastUnPackOp`.
* Tensor/canonicalize.mlir: Ensured consistent usage of `test_attr` (e.g., replaced mixed use of `test_attr` and `some_attr`).
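By analogy with the `tensor.pack` fix, the fold that `FoldTensorCastUnPackOp` is expected to produce makes the tile size static together with the source type and consumes the uncasted source directly. This is a sketch based on the example above, not the verbatim test IR:

```mlir
%unpack = tensor.unpack %dest
    inner_dims_pos = [0, 1]
    inner_tiles = [8, 1]
    into %res : tensor<1x1x8x1xi32> -> tensor<7x?xi32>
```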
This is merely moving code around; no new functionality is added.

PATCH 5: Create `LinalgRelayoutOpInterface` to be able to exclude `linalg::PackOp` and `linalg::UnpackOp` from patterns/folders outside the Linalg dialect, e.g. `FoldTensorCastProducerOp` from the Tensor dialect. Note that there are `FoldTensorCastUnPackOp` and `FoldTensorCastPackOp` in LinalgOps.cpp (i.e. the Linalg dialect) that provide similar folders (but which fold "correctly"). See e.g. llvm#121393 and llvm#114559 for context.

CONTEXT: This change was discussed in the following RFC:
* https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg