@frankfqchen frankfqchen commented Sep 23, 2016

What changes were proposed in this pull request?

Replace type() comparisons with isinstance() checks.

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@AmplabJenkins

Can one of the admins verify this patch?

@HyukjinKwon
Member

I think we need a JIRA because type and isinstance are not exactly the same. Also, it would be better if the PR description explained the bug and how this PR tries to resolve it.
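To illustrate the difference between type and isinstance, here is a minimal sketch. LabeledList is a hypothetical class made up for this example and is not part of PySpark.

```python
class LabeledList(list):
    """Hypothetical list subclass, used only for illustration."""


values = LabeledList([1, 2, 3])

# An exact type comparison rejects subclasses of list.
exact_match = type(values) == list

# isinstance accepts list itself and any subclass of list.
subclass_match = isinstance(values, list)

print(exact_match, subclass_match)
```

So switching from type to isinstance changes behavior for any caller that passes a subclass, which is why the intent should be stated explicitly.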

BTW, it seems you intend to support sub-classes via isinstance consistently across the API, right?

If so, there are some instances similar to this one. Maybe we should check those as well.

$ grep -r "type(.*) [=|\!]" . | grep .python | grep -v "tests.py"

./python/pyspark/ml/linalg/__init__.py:    elif type(v) == np.ndarray:
./python/pyspark/ml/linalg/__init__.py:        if type(other) == np.ndarray:
./python/pyspark/ml/linalg/__init__.py:            if type(pairs) == dict:
./python/pyspark/ml/param/__init__.py:        if type(value) == list:
./python/pyspark/ml/param/__init__.py:        elif type(value) == np.unicode_:
./python/pyspark/ml/param/__init__.py:        if type(value) == bool:
./python/pyspark/mllib/linalg/__init__.py:    elif type(v) == np.ndarray:
./python/pyspark/mllib/linalg/__init__.py:        if type(other) == np.ndarray:
./python/pyspark/mllib/linalg/__init__.py:            if type(pairs) == dict:
./python/pyspark/mllib/stat/_statistics.py:        if type(y) == str:
./python/pyspark/sql/column.py:        if type(startPos) != type(length):
./python/pyspark/sql/readwriter.py:            if type(path) != list:
./python/pyspark/sql/readwriter.py:        if type(path) == list:
./python/pyspark/sql/streaming.py:        if type(interval) != str or len(interval.strip()) == 0:
./python/pyspark/sql/streaming.py:            if type(path) != str or len(path.strip()) == 0:
./python/pyspark/sql/streaming.py:        if not outputMode or type(outputMode) != str or len(outputMode.strip()) == 0:
./python/pyspark/sql/streaming.py:        if not queryName or type(queryName) != str or len(queryName.strip()) == 0:
./python/pyspark/sql/streaming.py:            if type(processingTime) != str or len(processingTime.strip()) == 0:
./python/pyspark/sql/types.py:        return type(self) == type(other)
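Not every match above is necessarily a mechanical replacement. As a rough sketch under my own assumptions (Base and Derived are hypothetical classes, not the actual PySpark types), an equality check written as type(self) == type(other) is symmetric, while an isinstance-based version is not:

```python
class Base(object):
    def same_type(self, other):
        # Symmetric: true only when both objects have exactly the same class.
        return type(self) == type(other)

    def accepts(self, other):
        # Asymmetric: true when other is an instance of self's class
        # or of one of its subclasses.
        return isinstance(other, type(self))


class Derived(Base):
    pass


base, derived = Base(), Derived()

print(base.same_type(derived), derived.same_type(base))
print(base.accepts(derived), derived.accepts(base))

# bool is a subclass of int, so an isinstance check against int
# also accepts True and False.
print(isinstance(True, int), type(True) == int)
```

Cases like these would need to be reviewed one by one rather than changed in a single sweep.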

    """
    if isinstance(v, Vector):
        return len(v)
    elif type(v) in (array.array, list, tuple, xrange):
Member

If this change is legitimate, we should change this to isinstance(v, (array.array, list, tuple, xrange))
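A minimal sketch of what that check could look like. The sequence_length name is hypothetical and is not the function touched by this PR, and range is used as the Python 3 stand-in for xrange:

```python
import array


def sequence_length(v):
    """Hypothetical helper: return len(v) for supported sequence types."""
    # isinstance accepts a tuple of types and also matches their subclasses.
    if isinstance(v, (array.array, list, tuple, range)):
        return len(v)
    raise TypeError("Cannot determine length of type %s" % type(v).__name__)


class Point(tuple):
    """Hypothetical tuple subclass, used only for illustration."""


print(sequence_length([1, 2, 3]))
print(sequence_length(array.array("d", [1.0, 2.0])))
print(sequence_length(Point((1, 2))))
```

With the original type(v) in (...) check, the Point instance would not match, because its exact type is not one of the listed types.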

@holdenk
Contributor

holdenk commented Oct 7, 2016

Thanks for looking to get involved @frankfqchen :)

I think this PR definitely needs a JIRA and maybe some more description about what the intent of the change is rather than just the contents of the change. The Spark Contributing guide can help you out - https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

@HyukjinKwon
Member

HyukjinKwon commented Feb 9, 2017

@frankfqchen Could you follow up on the comments above? If you are not able to proceed further, I think this might be better closed for now. Actually, IMHO, I am not sure it is worth sweeping all of these.
