Improve Nullable support during dataframe arithmetic operations  #6825

@asmirnov82

During arithmetic operations, the DataFrame clones the left-side column into the result so that the result has a validity bitmap, and then checks the right-side validity bitmap for NULL values.

For example, Multiply clones the column when the inPlace parameter is false (the default behavior):

PrimitiveDataFrameColumn<U> newColumn = inPlace ? primitiveColumn : primitiveColumn.Clone();
newColumn._columnContainer.Multiply(column._columnContainer);

and inside the container we check validity for each value:

 for (int i = 0; i < span.Length; i++)
 {
     if (BitmapHelper.IsValid(right.NullBitMapBuffers[b].ReadOnlySpan, i))
     {
         span[i] = (double)(span[i] * otherSpan[i]);
     }
     else
     {
         left[index] = null;
     }

     index++;
 }

The per-element validity check is a very slow operation. It is possible to calculate the raw values first and then use a bitwise AND to compute the validity bitmap a whole byte (eight elements) at a time.

// Calculate raw values
for (int i = 0; i < span.Length; i++)
{
    resultSpan[i] = (double)(span[i] * otherSpan[i]);
}

// Calculate validity (nulls)
resultValidityBitmap = Bitmap.ElementWiseAnd(validityBitmap, otherValidityBitmap);
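
To make the idea concrete, here is a minimal self-contained sketch of the two-pass approach. It is not the library's implementation: the class `NullableMultiplySketch` and its `ElementWiseAnd`, `IsValid`, and `Multiply` methods are hypothetical names used only for illustration, and the bitmap layout (one bit per element, least significant bit first, set bit = valid) is an assumption modeled on Arrow-style validity bitmaps.

```csharp
using System;

// Hypothetical illustration only; none of these names exist in Microsoft.Data.Analysis.
public static class NullableMultiplySketch
{
    // ANDs two validity bitmaps byte by byte, so eight elements
    // are combined per iteration instead of one.
    public static byte[] ElementWiseAnd(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Bitmaps must have the same length.");

        var result = new byte[left.Length];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = (byte)(left[i] & right[i]);
        }
        return result;
    }

    // Assumed layout: bit (index % 8) of byte (index / 8) is set when the element is valid.
    public static bool IsValid(ReadOnlySpan<byte> bitmap, int index)
        => (bitmap[index / 8] & (1 << (index % 8))) != 0;

    public static (double[] Values, byte[] Validity) Multiply(
        ReadOnlySpan<double> left, ReadOnlySpan<byte> leftValidity,
        ReadOnlySpan<double> right, ReadOnlySpan<byte> rightValidity)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Columns must have the same length.");

        // Pass 1: multiply raw values unconditionally, ignoring nulls.
        var values = new double[left.Length];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = left[i] * right[i];
        }

        // Pass 2: an element is valid only if it is valid on both sides.
        return (values, ElementWiseAnd(leftValidity, rightValidity));
    }
}
```

With three elements where the right side has a null at index 1 (left validity `0b0000_0111`, right validity `0b0000_0101`), the resulting validity byte is `0b0000_0101`: the raw product at index 1 is still computed but is masked out by the bitmap. One caveat with this design is that raw values in null slots are processed too, which is harmless for floating-point multiplication but would need care for operations such as integer division, where a zero stored in a null slot would throw.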

Labels: enhancement (New feature or request)