[SPARK-15962][SQL] Introduce implementation with a dense format for UnsafeArrayData #13680

kiszk · 2016-06-15T06:22:50Z

What changes were proposed in this pull request?

This PR introduces more compact representation for UnsafeArrayData.

UnsafeArrayData needs to accept null value in each entry of an array. In the current version, it has three parts

[numElements] [offsets] [values]

Offsets has the number of numElements, and represents null if its value is negative. It may increase memory footprint, and introduces an indirection for accessing each of values.

This PR uses bitvectors to represent nullability for each element like UnsafeRow, and eliminates an indirection for accessing each element. The new UnsafeArrayData has four parts.

[numElements][null bits][values or offset&length][variable length portion]

In the null bits region, we store 1 bit per element, represents whether an element is null. Its total size is ceil(numElements / 8) bytes, and it is aligned to 8-byte boundaries.
In the values or offset&length region, we store the content of elements. For fields that hold fixed-length primitive types, such as long, double, or int, we store the value directly in the field. For fields with non-primitive or variable-length values, we store a relative offset (w.r.t. the base address of the array) that points to the beginning of the variable-length field and length (they are combined into a long). Each is word-aligned. For variable length portion, each is aligned to 8-byte boundaries.

The new format can reduce memory footprint and improve performance of accessing each element. An example of memory foot comparison:
1024x1024 elements integer array
Size of baseObject for UnsafeArrayData: 8 + 1024x1024 + 1024x1024 = 2M bytes
Size of baseObject for UnsafeArrayData: 8 + 1024x1024/8 + 1024x1024 = 1.25M bytes

In summary, we got 1.0-2.6x performance improvements over the code before applying this PR.
Here are performance results of benchmark programs:

Read UnsafeArrayData: 1.7x and 1.6x performance improvements over the code before applying this PR

OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64
Intel Xeon E3-12xx v2 (Ivy Bridge)

Without SPARK-15962
Read UnsafeArrayData:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            430 /  436        390.0           2.6       1.0X
Double                                         456 /  485        367.8           2.7       0.9X

With SPARK-15962
Read UnsafeArrayData:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            252 /  260        666.1           1.5       1.0X
Double                                         281 /  292        597.7           1.7       0.9X

Write UnsafeArrayData: 1.0x and 1.1x performance improvements over the code before applying this PR

OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.0.4-301.fc22.x86_64
Intel Xeon E3-12xx v2 (Ivy Bridge)

Without SPARK-15962
Write UnsafeArrayData:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            203 /  273        103.4           9.7       1.0X
Double                                         239 /  356         87.9          11.4       0.8X

With SPARK-15962
Write UnsafeArrayData:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            196 /  249        107.0           9.3       1.0X
Double                                         227 /  367         92.3          10.8       0.9X

Get primitive array from UnsafeArrayData: 2.6x and 1.6x performance improvements over the code before applying this PR

OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.0.4-301.fc22.x86_64
Intel Xeon E3-12xx v2 (Ivy Bridge)

Without SPARK-15962
Get primitive array from UnsafeArrayData: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            207 /  217        304.2           3.3       1.0X
Double                                         257 /  363        245.2           4.1       0.8X

With SPARK-15962
Get primitive array from UnsafeArrayData: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            151 /  198        415.8           2.4       1.0X
Double                                         214 /  394        293.6           3.4       0.7X

Create UnsafeArrayData from primitive array: 1.7x and 2.1x performance improvements over the code before applying this PR

OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.0.4-301.fc22.x86_64
Intel Xeon E3-12xx v2 (Ivy Bridge)

Without SPARK-15962
Create UnsafeArrayData from primitive array: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            340 /  385        185.1           5.4       1.0X
Double                                         479 /  705        131.3           7.6       0.7X

With SPARK-15962
Create UnsafeArrayData from primitive array: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Int                                            206 /  211        306.0           3.3       1.0X
Double                                         232 /  406        271.6           3.7       0.9X

1.7x and 1.4x performance improvements in UDTSerializationBenchmark over the code before applying this PR

OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64
Intel Xeon E3-12xx v2 (Ivy Bridge)

Without SPARK-15962
VectorUDT de/serialization:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
serialize                                      442 /  533          0.0      441927.1       1.0X
deserialize                                    217 /  274          0.0      217087.6       2.0X

With SPARK-15962
VectorUDT de/serialization:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
serialize                                      265 /  318          0.0      265138.5       1.0X
deserialize                                    155 /  197          0.0      154611.4       1.7X

How was this patch tested?

Added unit tests into UnsafeArraySuite

SparkQA · 2016-06-15T06:59:55Z

Test build #60556 has finished for PR 13680 at commit 639b32d.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-15T10:55:34Z

Test build #60566 has finished for PR 13680 at commit d06d200.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-15T13:55:45Z

Test build #60568 has finished for PR 13680 at commit 6c09d72.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-06-23T03:54:11Z

Thanks for working on it! One of my concern is: do we really need 2 unsafe array implementations? For the UnsafeArrayDataDense, can we follow unsafe row and introduce a null-bits to make it support null value? Then we can have a unified format: [numElements] [null bits] [offset or primitive values] [values]

kiszk · 2016-06-23T05:09:50Z

@cloud-fan thank you for your good comment. I also read previous proposal.
I love to have only single format (or implementation). Since I thought that there are some reasons to keep the old format, I introduced a new dense format.

IMHO, a new unified format should have three properties.

Remove indirect offset (for performance and footprint)
Have capability of presence of nullbit (for generality)
Quickly get information on existence of null value in an array (for performance, in particular, primitive array)

Based on them, how about this single format?

[numElements] [all zero in null bits?] [null bits] [values] [variable length portion]

If we want to reduce memory footprint in the case of primitive array, we can drop [null bits] part if [all zero in null bits?] has a special value.

cloud-fan · 2016-06-23T05:37:41Z

null bits won't take a lot of memory(1 bit per element), and having the all zero in null bits? flag will slow down elements retrievement: we need an extra if branch to check this flag, and make the format more complex. I'd like to always have the null bits region, cc @davies

kiszk · 2016-06-23T05:59:21Z

It is OK to always keep [null bits]

One question: Is this format to keep fixed space for [values]? I mean if [null bit] is true, the corresponding element in [value] occupy space. It allow us to always access all of the elements by 4 + (null bits size) + (element size) * offset.

If an answer of the above question is yes, I agree with no [all zero in null bits?]. Then, I will revise this PR.

cc @hvanhovell

SparkQA · 2016-06-23T06:22:30Z

Test build #61095 has finished for PR 13680 at commit 85f862c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-06-23T06:29:05Z

@kiszk yea, even the null bits is true, the element still take space at [offset or fixed-length values] region.

kiszk · 2016-06-23T06:43:41Z

Good to hear. I will make an implementation for single format. If I would meet some issues, I will raise them here.

cloud-fan · 2016-06-23T06:56:45Z

Thanks! Feel free to ask any questions!

kiszk · 2016-06-23T14:26:34Z

@cloud-fan , I have one question about null field. Should we put zero into the corresponding field to position where setNullAt() is called as UnsafeRow does.

If we avoid to put zero, this avoidance affects two properties.

row equality
undetermined value may be included in the array returned by UnsafeArrayData.to<PrimitiveType>Array()

In my current implementation, a width of each element depends on element type (4: Int, 8: Double, etc). Thus, it is hard to do the same approach as it did. Since UnsafeRow always use 8 bytes per field. Since we want to make data conversion fast between primitive array and unsafe array, we have to keep the type-based element-width (e.g. 4: Int, 8: Double, etc).

What do you think? Should we have to keep the above two property by clearing a field?

kiszk · 2016-06-23T15:47:59Z

One potential performance issue is that we have to always clear all of null bits at UnsafeArrayWriter.initialize(). IIUC, this is because holder.buffer is reused for each row. If one row has more than one variable length arrays, holder.cursor may have different offset for the same array among different rows. Thus, we have to clear all of null bits.
This overhead would be larger in machine learning use cases that use large arrays.

If we have a flag to show whether there is an area of [null bits] or we have two implementations with [null bits] and without [null bits], we may alleviate performance overhead.

What do you think?
cc: @hvanhovell , @mengxr

cloud-fan · 2016-06-24T01:28:46Z

@kiszk we should definitely put zero into the corresponding field when set null. It will be a little harder than UnsafeRow, as we need setNullBoolean, setNullInt, etc. but it's still doable.

about clear out all null bits, yea, it's a big overhead for array with small element like boolean array, but I'm not sure this worth 2 different implementations, cc @rxin

kiszk · 2016-06-24T04:32:45Z

@cloud-fan , for the first issue, we are on the same page. Your proposal is what I am thinking about as possible solutions. I will do that.

For the second issue, it seems to be design choice between

introduce one conditional branch in isNullAt() in one implementation
have two implementations without conditional branch at isNullAt()

cloud-fan · 2016-06-24T04:59:58Z

having 2 implementations is also kind of a branch: the virtual function call need to be dispatched between these 2 implementations, while the only one implementation can be marked as final and doesn't have this overhead.

kiszk · 2016-06-24T08:45:11Z

I see. I assumed that virtual call will be devirtualized by declaring final method and by optimistically propagating type information in the JIT compiler. Would it be better to add a flag like containNull ?

cloud-fan · 2016-06-24T11:02:12Z

I don't have a strong preference here, each choice has its advantage and weakness:

alway have the null bits region: faster element access(read the null bits and then read the element), slower writing(always need to clear out the null bit region for primitive array)
only have null bits region if containsNull flag is true: slower element access(read flag then read null bits then read the element), faster writing(check the containsNull flag and don't need to clear out null bits if flag is false)

cc @davies

hvanhovell · 2016-06-24T17:17:04Z

I am kind in favor of the single implementation for a couple of reasons:

Declaring methods final is not a magic bullet. If you are invokingisNull or get* on the common ancestor it is still a virtual call. If you are only using two classes the JVM is clever enough to add a branch in the code and call one of the two methods depending of the type (it is probably also clever enough to inline the methods). We could get around this if we can use the concrete type during code generation (which should be doable).
Nulling out the null region is IMO not really a problem. JVMs nowadays are very good at optimizing array fill like operations. See this post for some more information: http://psy-lob-saw.blogspot.com/2015/04/on-arraysfill-intrinsics-superword-and.html
On our side code generation should be clever enough not to do null checks for non-nullable fields (is it?). So we are not incurring useless branch here.

It would help to have a number of performance benchmarks.

rxin · 2016-06-24T17:37:08Z

We should do this more holistically, i.e. thinking about what we want to do with primitive arrays for machine learning and how to handle everything end to end. Let's not rush an implementation change just for the low level data structure.

SparkQA · 2016-06-25T02:55:47Z

Test build #61218 has finished for PR 13680 at commit 517aa72.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-25T05:36:35Z

Test build #61222 has finished for PR 13680 at commit bf12e72.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-25T13:19:20Z

Test build #61233 has finished for PR 13680 at commit 682c397.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-25T14:43:47Z

Test build #61235 has finished for PR 13680 at commit 687be4d.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-25T17:04:35Z

Test build #61236 has finished for PR 13680 at commit 138810b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-25T18:52:01Z

Test build #61239 has finished for PR 13680 at commit 500e978.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2016-06-26T06:55:25Z

@cloud-fan and @hvanhovell thank you for your comments. Based on your comments, I implemented UnsafeArrayData by using one implementation with explicit clearing null bits by Arrays.fill().
I would appreciate it if you review this PR.

some refinements

kiszk · 2016-09-26T19:42:00Z

Jenkins, retest this please

SparkQA · 2016-09-26T20:37:39Z

Test build #65920 has finished for PR 13680 at commit 2ef6e3b.

This patch fails from timeout after a configured wait of 250m.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-09-26T21:58:25Z

Test build #65930 has finished for PR 13680 at commit 2ef6e3b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2016-09-27T01:07:38Z

@cloud-fan could you please review this again?
I addressed your comments. After rebasing, the performance issue has been solved.

cloud-fan · 2016-09-27T06:18:48Z

thanks for your great work! merging to master!

## What changes were proposed in this pull request? Waiting for merging #13680 This PR optimizes `SerializeFromObject()` for an primitive array. This is derived from #13758 to address one of problems by using a simple way in #13758. The current implementation always generates `GenericArrayData` from `SerializeFromObject()` for any type of an array in a logical plan. This involves a boxing at a constructor of `GenericArrayData` when `SerializedFromObject()` has an primitive array. This PR enables to generate `UnsafeArrayData` from `SerializeFromObject()` for a primitive array. It can avoid boxing to create an instance of `ArrayData` in the generated code by Catalyst. This PR also generate `UnsafeArrayData` in a case for `RowEncoder.serializeFor` or `CatalystTypeConverters.createToCatalystConverter`. Performance improvement of `SerializeFromObject()` is up to 2.0x ``` OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64 Intel Xeon E3-12xx v2 (Ivy Bridge) Without this PR Write an array in Dataset: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ Int 556 / 608 15.1 66.3 1.0X Double 1668 / 1746 5.0 198.8 0.3X with this PR Write an array in Dataset: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ Int 352 / 401 23.8 42.0 1.0X Double 821 / 885 10.2 97.9 0.4X ``` Here is an example program that will happen in mllib as described in [SPARK-16070](https://issues.apache.org/jira/browse/SPARK-16070). ``` sparkContext.parallelize(Seq(Array(1, 2)), 1).toDS.map(e => e).show ``` Generated code before applying this PR ``` java /* 039 */ protected void processNext() throws java.io.IOException { /* 040 */ while (inputadapter_input.hasNext()) { /* 041 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next(); /* 042 */ int[] inputadapter_value = (int[])inputadapter_row.get(0, null); /* 043 */ /* 044 */ Object mapelements_obj = ((Expression) references[0]).eval(null); /* 045 */ scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj; /* 046 */ /* 047 */ boolean mapelements_isNull = false || false; /* 048 */ int[] mapelements_value = null; /* 049 */ if (!mapelements_isNull) { /* 050 */ Object mapelements_funcResult = null; /* 051 */ mapelements_funcResult = mapelements_value1.apply(inputadapter_value); /* 052 */ if (mapelements_funcResult == null) { /* 053 */ mapelements_isNull = true; /* 054 */ } else { /* 055 */ mapelements_value = (int[]) mapelements_funcResult; /* 056 */ } /* 057 */ /* 058 */ } /* 059 */ mapelements_isNull = mapelements_value == null; /* 060 */ /* 061 */ serializefromobject_argIsNulls[0] = mapelements_isNull; /* 062 */ serializefromobject_argValue = mapelements_value; /* 063 */ /* 064 */ boolean serializefromobject_isNull = false; /* 065 */ for (int idx = 0; idx < 1; idx++) { /* 066 */ if (serializefromobject_argIsNulls[idx]) { serializefromobject_isNull = true; break; } /* 067 */ } /* 068 */ /* 069 */ final ArrayData serializefromobject_value = serializefromobject_isNull ? null : new org.apache.spark.sql.catalyst.util.GenericArrayData(serializefromobject_argValue); /* 070 */ serializefromobject_holder.reset(); /* 071 */ /* 072 */ serializefromobject_rowWriter.zeroOutNullBytes(); /* 073 */ /* 074 */ if (serializefromobject_isNull) { /* 075 */ serializefromobject_rowWriter.setNullAt(0); /* 076 */ } else { /* 077 */ // Remember the current cursor so that we can calculate how many bytes are /* 078 */ // written later. /* 079 */ final int serializefromobject_tmpCursor = serializefromobject_holder.cursor; /* 080 */ /* 081 */ if (serializefromobject_value instanceof UnsafeArrayData) { /* 082 */ final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes(); /* 083 */ // grow the global buffer before writing data. /* 084 */ serializefromobject_holder.grow(serializefromobject_sizeInBytes); /* 085 */ ((UnsafeArrayData) serializefromobject_value).writeToMemory(serializefromobject_holder.buffer, serializefromobject_holder.cursor); /* 086 */ serializefromobject_holder.cursor += serializefromobject_sizeInBytes; /* 087 */ /* 088 */ } else { /* 089 */ final int serializefromobject_numElements = serializefromobject_value.numElements(); /* 090 */ serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 4); /* 091 */ /* 092 */ for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements; serializefromobject_index++) { /* 093 */ if (serializefromobject_value.isNullAt(serializefromobject_index)) { /* 094 */ serializefromobject_arrayWriter.setNullInt(serializefromobject_index); /* 095 */ } else { /* 096 */ final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index); /* 097 */ serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element); /* 098 */ } /* 099 */ } /* 100 */ } /* 101 */ /* 102 */ serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor, serializefromobject_holder.cursor - serializefromobject_tmpCursor); /* 103 */ } /* 104 */ serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize()); /* 105 */ append(serializefromobject_result); /* 106 */ if (shouldStop()) return; /* 107 */ } /* 108 */ } /* 109 */ } ``` Generated code after applying this PR ``` java /* 035 */ protected void processNext() throws java.io.IOException { /* 036 */ while (inputadapter_input.hasNext()) { /* 037 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next(); /* 038 */ int[] inputadapter_value = (int[])inputadapter_row.get(0, null); /* 039 */ /* 040 */ Object mapelements_obj = ((Expression) references[0]).eval(null); /* 041 */ scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj; /* 042 */ /* 043 */ boolean mapelements_isNull = false || false; /* 044 */ int[] mapelements_value = null; /* 045 */ if (!mapelements_isNull) { /* 046 */ Object mapelements_funcResult = null; /* 047 */ mapelements_funcResult = mapelements_value1.apply(inputadapter_value); /* 048 */ if (mapelements_funcResult == null) { /* 049 */ mapelements_isNull = true; /* 050 */ } else { /* 051 */ mapelements_value = (int[]) mapelements_funcResult; /* 052 */ } /* 053 */ /* 054 */ } /* 055 */ mapelements_isNull = mapelements_value == null; /* 056 */ /* 057 */ boolean serializefromobject_isNull = mapelements_isNull; /* 058 */ final ArrayData serializefromobject_value = serializefromobject_isNull ? null : org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.fromPrimitiveArray(mapelements_value); /* 059 */ serializefromobject_isNull = serializefromobject_value == null; /* 060 */ serializefromobject_holder.reset(); /* 061 */ /* 062 */ serializefromobject_rowWriter.zeroOutNullBytes(); /* 063 */ /* 064 */ if (serializefromobject_isNull) { /* 065 */ serializefromobject_rowWriter.setNullAt(0); /* 066 */ } else { /* 067 */ // Remember the current cursor so that we can calculate how many bytes are /* 068 */ // written later. /* 069 */ final int serializefromobject_tmpCursor = serializefromobject_holder.cursor; /* 070 */ /* 071 */ if (serializefromobject_value instanceof UnsafeArrayData) { /* 072 */ final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes(); /* 073 */ // grow the global buffer before writing data. /* 074 */ serializefromobject_holder.grow(serializefromobject_sizeInBytes); /* 075 */ ((UnsafeArrayData) serializefromobject_value).writeToMemory(serializefromobject_holder.buffer, serializefromobject_holder.cursor); /* 076 */ serializefromobject_holder.cursor += serializefromobject_sizeInBytes; /* 077 */ /* 078 */ } else { /* 079 */ final int serializefromobject_numElements = serializefromobject_value.numElements(); /* 080 */ serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 4); /* 081 */ /* 082 */ for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements; serializefromobject_index++) { /* 083 */ if (serializefromobject_value.isNullAt(serializefromobject_index)) { /* 084 */ serializefromobject_arrayWriter.setNullInt(serializefromobject_index); /* 085 */ } else { /* 086 */ final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index); /* 087 */ serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element); /* 088 */ } /* 089 */ } /* 090 */ } /* 091 */ /* 092 */ serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor, serializefromobject_holder.cursor - serializefromobject_tmpCursor); /* 093 */ } /* 094 */ serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize()); /* 095 */ append(serializefromobject_result); /* 096 */ if (shouldStop()) return; /* 097 */ } /* 098 */ } /* 099 */ } ``` ## How was this patch tested? Added a test in `DatasetSuite`, `RowEncoderSuite`, and `CatalystTypeConvertersSuite` Author: Kazuaki Ishizaki <[email protected]> Closes #15044 from kiszk/SPARK-17490.

Waiting for merging #13680 This PR optimizes `SerializeFromObject()` for an primitive array. This is derived from #13758 to address one of problems by using a simple way in #13758. The current implementation always generates `GenericArrayData` from `SerializeFromObject()` for any type of an array in a logical plan. This involves a boxing at a constructor of `GenericArrayData` when `SerializedFromObject()` has an primitive array. This PR enables to generate `UnsafeArrayData` from `SerializeFromObject()` for a primitive array. It can avoid boxing to create an instance of `ArrayData` in the generated code by Catalyst. This PR also generate `UnsafeArrayData` in a case for `RowEncoder.serializeFor` or `CatalystTypeConverters.createToCatalystConverter`. Performance improvement of `SerializeFromObject()` is up to 2.0x ``` OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64 Intel Xeon E3-12xx v2 (Ivy Bridge) Without this PR Write an array in Dataset: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ Int 556 / 608 15.1 66.3 1.0X Double 1668 / 1746 5.0 198.8 0.3X with this PR Write an array in Dataset: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ Int 352 / 401 23.8 42.0 1.0X Double 821 / 885 10.2 97.9 0.4X ``` Here is an example program that will happen in mllib as described in [SPARK-16070](https://issues.apache.org/jira/browse/SPARK-16070). ``` sparkContext.parallelize(Seq(Array(1, 2)), 1).toDS.map(e => e).show ``` Generated code before applying this PR ``` java /* 039 */ protected void processNext() throws java.io.IOException { /* 040 */ while (inputadapter_input.hasNext()) { /* 041 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next(); /* 042 */ int[] inputadapter_value = (int[])inputadapter_row.get(0, null); /* 043 */ /* 044 */ Object mapelements_obj = ((Expression) references[0]).eval(null); /* 045 */ scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj; /* 046 */ /* 047 */ boolean mapelements_isNull = false || false; /* 048 */ int[] mapelements_value = null; /* 049 */ if (!mapelements_isNull) { /* 050 */ Object mapelements_funcResult = null; /* 051 */ mapelements_funcResult = mapelements_value1.apply(inputadapter_value); /* 052 */ if (mapelements_funcResult == null) { /* 053 */ mapelements_isNull = true; /* 054 */ } else { /* 055 */ mapelements_value = (int[]) mapelements_funcResult; /* 056 */ } /* 057 */ /* 058 */ } /* 059 */ mapelements_isNull = mapelements_value == null; /* 060 */ /* 061 */ serializefromobject_argIsNulls[0] = mapelements_isNull; /* 062 */ serializefromobject_argValue = mapelements_value; /* 063 */ /* 064 */ boolean serializefromobject_isNull = false; /* 065 */ for (int idx = 0; idx < 1; idx++) { /* 066 */ if (serializefromobject_argIsNulls[idx]) { serializefromobject_isNull = true; break; } /* 067 */ } /* 068 */ /* 069 */ final ArrayData serializefromobject_value = serializefromobject_isNull ? null : new org.apache.spark.sql.catalyst.util.GenericArrayData(serializefromobject_argValue); /* 070 */ serializefromobject_holder.reset(); /* 071 */ /* 072 */ serializefromobject_rowWriter.zeroOutNullBytes(); /* 073 */ /* 074 */ if (serializefromobject_isNull) { /* 075 */ serializefromobject_rowWriter.setNullAt(0); /* 076 */ } else { /* 077 */ // Remember the current cursor so that we can calculate how many bytes are /* 078 */ // written later. /* 079 */ final int serializefromobject_tmpCursor = serializefromobject_holder.cursor; /* 080 */ /* 081 */ if (serializefromobject_value instanceof UnsafeArrayData) { /* 082 */ final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes(); /* 083 */ // grow the global buffer before writing data. /* 084 */ serializefromobject_holder.grow(serializefromobject_sizeInBytes); /* 085 */ ((UnsafeArrayData) serializefromobject_value).writeToMemory(serializefromobject_holder.buffer, serializefromobject_holder.cursor); /* 086 */ serializefromobject_holder.cursor += serializefromobject_sizeInBytes; /* 087 */ /* 088 */ } else { /* 089 */ final int serializefromobject_numElements = serializefromobject_value.numElements(); /* 090 */ serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 4); /* 091 */ /* 092 */ for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements; serializefromobject_index++) { /* 093 */ if (serializefromobject_value.isNullAt(serializefromobject_index)) { /* 094 */ serializefromobject_arrayWriter.setNullInt(serializefromobject_index); /* 095 */ } else { /* 096 */ final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index); /* 097 */ serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element); /* 098 */ } /* 099 */ } /* 100 */ } /* 101 */ /* 102 */ serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor, serializefromobject_holder.cursor - serializefromobject_tmpCursor); /* 103 */ } /* 104 */ serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize()); /* 105 */ append(serializefromobject_result); /* 106 */ if (shouldStop()) return; /* 107 */ } /* 108 */ } /* 109 */ } ``` Generated code after applying this PR ``` java /* 035 */ protected void processNext() throws java.io.IOException { /* 036 */ while (inputadapter_input.hasNext()) { /* 037 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next(); /* 038 */ int[] inputadapter_value = (int[])inputadapter_row.get(0, null); /* 039 */ /* 040 */ Object mapelements_obj = ((Expression) references[0]).eval(null); /* 041 */ scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj; /* 042 */ /* 043 */ boolean mapelements_isNull = false || false; /* 044 */ int[] mapelements_value = null; /* 045 */ if (!mapelements_isNull) { /* 046 */ Object mapelements_funcResult = null; /* 047 */ mapelements_funcResult = mapelements_value1.apply(inputadapter_value); /* 048 */ if (mapelements_funcResult == null) { /* 049 */ mapelements_isNull = true; /* 050 */ } else { /* 051 */ mapelements_value = (int[]) mapelements_funcResult; /* 052 */ } /* 053 */ /* 054 */ } /* 055 */ mapelements_isNull = mapelements_value == null; /* 056 */ /* 057 */ boolean serializefromobject_isNull = mapelements_isNull; /* 058 */ final ArrayData serializefromobject_value = serializefromobject_isNull ? null : org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.fromPrimitiveArray(mapelements_value); /* 059 */ serializefromobject_isNull = serializefromobject_value == null; /* 060 */ serializefromobject_holder.reset(); /* 061 */ /* 062 */ serializefromobject_rowWriter.zeroOutNullBytes(); /* 063 */ /* 064 */ if (serializefromobject_isNull) { /* 065 */ serializefromobject_rowWriter.setNullAt(0); /* 066 */ } else { /* 067 */ // Remember the current cursor so that we can calculate how many bytes are /* 068 */ // written later. /* 069 */ final int serializefromobject_tmpCursor = serializefromobject_holder.cursor; /* 070 */ /* 071 */ if (serializefromobject_value instanceof UnsafeArrayData) { /* 072 */ final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes(); /* 073 */ // grow the global buffer before writing data. /* 074 */ serializefromobject_holder.grow(serializefromobject_sizeInBytes); /* 075 */ ((UnsafeArrayData) serializefromobject_value).writeToMemory(serializefromobject_holder.buffer, serializefromobject_holder.cursor); /* 076 */ serializefromobject_holder.cursor += serializefromobject_sizeInBytes; /* 077 */ /* 078 */ } else { /* 079 */ final int serializefromobject_numElements = serializefromobject_value.numElements(); /* 080 */ serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 4); /* 081 */ /* 082 */ for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements; serializefromobject_index++) { /* 083 */ if (serializefromobject_value.isNullAt(serializefromobject_index)) { /* 084 */ serializefromobject_arrayWriter.setNullInt(serializefromobject_index); /* 085 */ } else { /* 086 */ final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index); /* 087 */ serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element); /* 088 */ } /* 089 */ } /* 090 */ } /* 091 */ /* 092 */ serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor, serializefromobject_holder.cursor - serializefromobject_tmpCursor); /* 093 */ } /* 094 */ serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize()); /* 095 */ append(serializefromobject_result); /* 096 */ if (shouldStop()) return; /* 097 */ } /* 098 */ } /* 099 */ } ``` Added a test in `DatasetSuite`, `RowEncoderSuite`, and `CatalystTypeConvertersSuite` Author: Kazuaki Ishizaki <[email protected]> Closes #15044 from kiszk/SPARK-17490. (cherry picked from commit 19cf208) Signed-off-by: Herman van Hovell <[email protected]>

## What changes were proposed in this pull request? Waiting for merging apache#13680 This PR optimizes `SerializeFromObject()` for an primitive array. This is derived from apache#13758 to address one of problems by using a simple way in apache#13758. The current implementation always generates `GenericArrayData` from `SerializeFromObject()` for any type of an array in a logical plan. This involves a boxing at a constructor of `GenericArrayData` when `SerializedFromObject()` has an primitive array. This PR enables to generate `UnsafeArrayData` from `SerializeFromObject()` for a primitive array. It can avoid boxing to create an instance of `ArrayData` in the generated code by Catalyst. This PR also generate `UnsafeArrayData` in a case for `RowEncoder.serializeFor` or `CatalystTypeConverters.createToCatalystConverter`. Performance improvement of `SerializeFromObject()` is up to 2.0x ``` OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64 Intel Xeon E3-12xx v2 (Ivy Bridge) Without this PR Write an array in Dataset: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ Int 556 / 608 15.1 66.3 1.0X Double 1668 / 1746 5.0 198.8 0.3X with this PR Write an array in Dataset: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------ Int 352 / 401 23.8 42.0 1.0X Double 821 / 885 10.2 97.9 0.4X ``` Here is an example program that will happen in mllib as described in [SPARK-16070](https://issues.apache.org/jira/browse/SPARK-16070). ``` sparkContext.parallelize(Seq(Array(1, 2)), 1).toDS.map(e => e).show ``` Generated code before applying this PR ``` java /* 039 */ protected void processNext() throws java.io.IOException { /* 040 */ while (inputadapter_input.hasNext()) { /* 041 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next(); /* 042 */ int[] inputadapter_value = (int[])inputadapter_row.get(0, null); /* 043 */ /* 044 */ Object mapelements_obj = ((Expression) references[0]).eval(null); /* 045 */ scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj; /* 046 */ /* 047 */ boolean mapelements_isNull = false || false; /* 048 */ int[] mapelements_value = null; /* 049 */ if (!mapelements_isNull) { /* 050 */ Object mapelements_funcResult = null; /* 051 */ mapelements_funcResult = mapelements_value1.apply(inputadapter_value); /* 052 */ if (mapelements_funcResult == null) { /* 053 */ mapelements_isNull = true; /* 054 */ } else { /* 055 */ mapelements_value = (int[]) mapelements_funcResult; /* 056 */ } /* 057 */ /* 058 */ } /* 059 */ mapelements_isNull = mapelements_value == null; /* 060 */ /* 061 */ serializefromobject_argIsNulls[0] = mapelements_isNull; /* 062 */ serializefromobject_argValue = mapelements_value; /* 063 */ /* 064 */ boolean serializefromobject_isNull = false; /* 065 */ for (int idx = 0; idx < 1; idx++) { /* 066 */ if (serializefromobject_argIsNulls[idx]) { serializefromobject_isNull = true; break; } /* 067 */ } /* 068 */ /* 069 */ final ArrayData serializefromobject_value = serializefromobject_isNull ? null : new org.apache.spark.sql.catalyst.util.GenericArrayData(serializefromobject_argValue); /* 070 */ serializefromobject_holder.reset(); /* 071 */ /* 072 */ serializefromobject_rowWriter.zeroOutNullBytes(); /* 073 */ /* 074 */ if (serializefromobject_isNull) { /* 075 */ serializefromobject_rowWriter.setNullAt(0); /* 076 */ } else { /* 077 */ // Remember the current cursor so that we can calculate how many bytes are /* 078 */ // written later. /* 079 */ final int serializefromobject_tmpCursor = serializefromobject_holder.cursor; /* 080 */ /* 081 */ if (serializefromobject_value instanceof UnsafeArrayData) { /* 082 */ final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes(); /* 083 */ // grow the global buffer before writing data. /* 084 */ serializefromobject_holder.grow(serializefromobject_sizeInBytes); /* 085 */ ((UnsafeArrayData) serializefromobject_value).writeToMemory(serializefromobject_holder.buffer, serializefromobject_holder.cursor); /* 086 */ serializefromobject_holder.cursor += serializefromobject_sizeInBytes; /* 087 */ /* 088 */ } else { /* 089 */ final int serializefromobject_numElements = serializefromobject_value.numElements(); /* 090 */ serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 4); /* 091 */ /* 092 */ for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements; serializefromobject_index++) { /* 093 */ if (serializefromobject_value.isNullAt(serializefromobject_index)) { /* 094 */ serializefromobject_arrayWriter.setNullInt(serializefromobject_index); /* 095 */ } else { /* 096 */ final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index); /* 097 */ serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element); /* 098 */ } /* 099 */ } /* 100 */ } /* 101 */ /* 102 */ serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor, serializefromobject_holder.cursor - serializefromobject_tmpCursor); /* 103 */ } /* 104 */ serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize()); /* 105 */ append(serializefromobject_result); /* 106 */ if (shouldStop()) return; /* 107 */ } /* 108 */ } /* 109 */ } ``` Generated code after applying this PR ``` java /* 035 */ protected void processNext() throws java.io.IOException { /* 036 */ while (inputadapter_input.hasNext()) { /* 037 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next(); /* 038 */ int[] inputadapter_value = (int[])inputadapter_row.get(0, null); /* 039 */ /* 040 */ Object mapelements_obj = ((Expression) references[0]).eval(null); /* 041 */ scala.Function1 mapelements_value1 = (scala.Function1) mapelements_obj; /* 042 */ /* 043 */ boolean mapelements_isNull = false || false; /* 044 */ int[] mapelements_value = null; /* 045 */ if (!mapelements_isNull) { /* 046 */ Object mapelements_funcResult = null; /* 047 */ mapelements_funcResult = mapelements_value1.apply(inputadapter_value); /* 048 */ if (mapelements_funcResult == null) { /* 049 */ mapelements_isNull = true; /* 050 */ } else { /* 051 */ mapelements_value = (int[]) mapelements_funcResult; /* 052 */ } /* 053 */ /* 054 */ } /* 055 */ mapelements_isNull = mapelements_value == null; /* 056 */ /* 057 */ boolean serializefromobject_isNull = mapelements_isNull; /* 058 */ final ArrayData serializefromobject_value = serializefromobject_isNull ? null : org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.fromPrimitiveArray(mapelements_value); /* 059 */ serializefromobject_isNull = serializefromobject_value == null; /* 060 */ serializefromobject_holder.reset(); /* 061 */ /* 062 */ serializefromobject_rowWriter.zeroOutNullBytes(); /* 063 */ /* 064 */ if (serializefromobject_isNull) { /* 065 */ serializefromobject_rowWriter.setNullAt(0); /* 066 */ } else { /* 067 */ // Remember the current cursor so that we can calculate how many bytes are /* 068 */ // written later. /* 069 */ final int serializefromobject_tmpCursor = serializefromobject_holder.cursor; /* 070 */ /* 071 */ if (serializefromobject_value instanceof UnsafeArrayData) { /* 072 */ final int serializefromobject_sizeInBytes = ((UnsafeArrayData) serializefromobject_value).getSizeInBytes(); /* 073 */ // grow the global buffer before writing data. /* 074 */ serializefromobject_holder.grow(serializefromobject_sizeInBytes); /* 075 */ ((UnsafeArrayData) serializefromobject_value).writeToMemory(serializefromobject_holder.buffer, serializefromobject_holder.cursor); /* 076 */ serializefromobject_holder.cursor += serializefromobject_sizeInBytes; /* 077 */ /* 078 */ } else { /* 079 */ final int serializefromobject_numElements = serializefromobject_value.numElements(); /* 080 */ serializefromobject_arrayWriter.initialize(serializefromobject_holder, serializefromobject_numElements, 4); /* 081 */ /* 082 */ for (int serializefromobject_index = 0; serializefromobject_index < serializefromobject_numElements; serializefromobject_index++) { /* 083 */ if (serializefromobject_value.isNullAt(serializefromobject_index)) { /* 084 */ serializefromobject_arrayWriter.setNullInt(serializefromobject_index); /* 085 */ } else { /* 086 */ final int serializefromobject_element = serializefromobject_value.getInt(serializefromobject_index); /* 087 */ serializefromobject_arrayWriter.write(serializefromobject_index, serializefromobject_element); /* 088 */ } /* 089 */ } /* 090 */ } /* 091 */ /* 092 */ serializefromobject_rowWriter.setOffsetAndSize(0, serializefromobject_tmpCursor, serializefromobject_holder.cursor - serializefromobject_tmpCursor); /* 093 */ } /* 094 */ serializefromobject_result.setTotalSize(serializefromobject_holder.totalSize()); /* 095 */ append(serializefromobject_result); /* 096 */ if (shouldStop()) return; /* 097 */ } /* 098 */ } /* 099 */ } ``` ## How was this patch tested? Added a test in `DatasetSuite`, `RowEncoderSuite`, and `CatalystTypeConvertersSuite` Author: Kazuaki Ishizaki <[email protected]> Closes apache#15044 from kiszk/SPARK-17490.

kiszk mentioned this pull request Jun 23, 2016

[SPARK-16043][SQL] Prepare GenericArrayData implementation specialized for a primitive array #13758

Closed

kiszk changed the title ~~[SPARK-15962][SQL] Introduce additonal implementation with a dense format for UnsafeArrayData~~ [SPARK-15962][SQL] Introduce implementation with a dense format for UnsafeArrayData Jun 25, 2016

kiszk mentioned this pull request Jun 26, 2016

[SPARK-16215][SQL] Reduce runtime overhead of a program that writes an primitive array in Dataframe/Dataset #13911

Closed

kiszk added 18 commits September 26, 2016 17:09

addressed comments

3fa7052

addressed comments

4c094c2

update test cases

9887171

address comments

9fe7ad0

address comments for test cases and benchmark

e4b4b52

addressed comments

585ca7b

addressed review comments

9933a06

fixed test failures

919e832

update test suites

0886e3a

align each of variable length elements to 8 bytes

c385bf4

fixed test failures

c8813db

some refinements

fixed test failures

aa7cfdb

address review comments

0b7867b

address review comments

ab9a16a

address review comments

515701b

change benchmark size

8169abd

addressed comments

e356a79

update performance results

2ef6e3b

kiszk force-pushed the SPARK-15962 branch from f19020f to 2ef6e3b Compare September 26, 2016 16:25

asfgit closed this in 85b0a15 Sep 27, 2016

[SPARK-15962][SQL] Introduce implementation with a dense format for UnsafeArrayData #13680

[SPARK-15962][SQL] Introduce implementation with a dense format for UnsafeArrayData #13680

Uh oh!

Conversation

kiszk commented Jun 15, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

How was this patch tested?

Uh oh!

SparkQA commented Jun 15, 2016

Uh oh!

SparkQA commented Jun 15, 2016

Uh oh!

SparkQA commented Jun 15, 2016

Uh oh!

cloud-fan commented Jun 23, 2016

Uh oh!

kiszk commented Jun 23, 2016

Uh oh!

cloud-fan commented Jun 23, 2016

Uh oh!

kiszk commented Jun 23, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SparkQA commented Jun 23, 2016

Uh oh!

cloud-fan commented Jun 23, 2016

Uh oh!

kiszk commented Jun 23, 2016

Uh oh!

cloud-fan commented Jun 23, 2016

Uh oh!

kiszk commented Jun 23, 2016

Uh oh!

kiszk commented Jun 23, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cloud-fan commented Jun 24, 2016

Uh oh!

kiszk commented Jun 24, 2016

Uh oh!

cloud-fan commented Jun 24, 2016

Uh oh!

kiszk commented Jun 24, 2016

Uh oh!

cloud-fan commented Jun 24, 2016

Uh oh!

hvanhovell commented Jun 24, 2016

Uh oh!

rxin commented Jun 24, 2016

Uh oh!

SparkQA commented Jun 25, 2016

Uh oh!

SparkQA commented Jun 25, 2016

Uh oh!

SparkQA commented Jun 25, 2016

Uh oh!

SparkQA commented Jun 25, 2016

Uh oh!

SparkQA commented Jun 25, 2016

Uh oh!

SparkQA commented Jun 25, 2016

Uh oh!

kiszk commented Jun 26, 2016

Uh oh!

kiszk commented Sep 26, 2016

Uh oh!

SparkQA commented Sep 26, 2016

Uh oh!

SparkQA commented Sep 26, 2016

Uh oh!

kiszk commented Sep 27, 2016

Uh oh!

cloud-fan commented Sep 27, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

kiszk commented Jun 15, 2016 •

edited

Loading

kiszk commented Jun 23, 2016 •

edited

Loading

kiszk commented Jun 23, 2016 •

edited

Loading