6 changes: 6 additions & 0 deletions python/pyspark/sql/functions.py
@@ -2909,6 +2909,12 @@ def pandas_udf(f=None, returnType=None, functionType=None):
can fail on special rows, the workaround is to incorporate the condition into the functions.

.. note:: The user-defined functions do not take keyword arguments on the calling side.

.. note:: The data type of the `pandas.Series` returned by the user-defined functions should
match the defined returnType (see :meth:`types.to_arrow_type` and
:meth:`types.from_arrow_type`). When there is a mismatch between them, Spark might do a
conversion on the returned data. The conversion is not guaranteed to be correct and
results should be checked for accuracy by users.
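
To make the caveat concrete, here is a minimal sketch (hypothetical function names; assuming Spark 2.3+ with PyArrow installed): the UDF below is declared with an integer returnType but returns a pandas.Series of floats, so Spark converts the values on the way back and the fractional parts can be silently lost.

from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf, PandasUDFType

spark = SparkSession.builder.getOrCreate()

# Declared returnType is 'integer', but the function returns float64 values.
@pandas_udf('integer', PandasUDFType.SCALAR)
def plus_half(v):
    return v + 0.5

df = spark.range(3)
# Spark casts the returned floats to integers; the cast is not guaranteed
# to be correct (here the .5 fractions are dropped), which is why the note
# advises checking results for accuracy.
df.select(plus_half(df['id'])).show()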
Member
I am merging this since it describes the current status, but let's make it clearer and try to get rid of this note within 3.0.

Member Author
Yeah, agreed. If a newer release of PyArrow becomes available, we may be able to provide an option to raise an error on an unsafe cast.
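
For reference, a minimal sketch of the PyArrow behavior alluded to above (assuming a PyArrow version whose Array.from_pandas accepts the safe flag): with safe=True the lossy float-to-int cast raises instead of truncating silently.

import pandas as pd
import pyarrow as pa

s = pd.Series([1.5, 2.0])

# safe=False mirrors the current lenient behavior: 1.5 is truncated to 1.
print(pa.Array.from_pandas(s, type=pa.int32(), safe=False))

# safe=True rejects the lossy cast, which is the kind of opt-in error
# raising the comment suggests could be exposed to users.
try:
    pa.Array.from_pandas(s, type=pa.int32(), safe=True)
except pa.ArrowInvalid as e:
    print("unsafe cast rejected:", e)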

"""
# decorator @pandas_udf(returnType, functionType)
is_decorator = f is None or isinstance(f, (str, DataType))