Commit 39ee2ac
BryanCutler authored and Felix Cheung committed
[SPARK-23163][DOC][PYTHON] Sync ML Python API with Scala
## What changes were proposed in this pull request?

This syncs the ML Python API with Scala for differences found after the 2.3 QA audit.

## How was this patch tested?

NA

Author: Bryan Cutler <[email protected]>

Closes #20354 from BryanCutler/pyspark-ml-doc-sync-23163.
1 parent e29b08a commit 39ee2ac

3 files changed: +9 -3 lines changed

python/pyspark/ml/evaluation.py

Lines changed: 7 additions & 1 deletion
@@ -334,7 +334,13 @@ class ClusteringEvaluator(JavaEvaluator, HasPredictionCol, HasFeaturesCol,
     .. note:: Experimental
 
     Evaluator for Clustering results, which expects two input
-    columns: prediction and features.
+    columns: prediction and features. The metric computes the Silhouette
+    measure using the squared Euclidean distance.
+
+    The Silhouette is a measure for the validation of the consistency
+    within clusters. It ranges between 1 and -1, where a value close to
+    1 means that the points in a cluster are close to the other points
+    in the same cluster and far from the points of the other clusters.
 
     >>> from pyspark.ml.linalg import Vectors
     >>> featureAndPredictions = map(lambda x: (Vectors.dense(x[0]), x[1]),
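
The new docstring text describes the Silhouette measure only in words. A minimal pure-Python sketch of the idea, assuming the standard per-point definition `(b - a) / max(a, b)` averaged over all points, could look like the following. This is not Spark's implementation; the function names `squared_euclidean` and `silhouette_score` are hypothetical, and scoring singleton clusters as 0 is a convention chosen for this sketch.

```python
from collections import defaultdict


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def silhouette_score(points, labels):
    """Mean Silhouette over all points (illustrative sketch, not Spark's code)."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("Silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Assumed convention: a singleton cluster contributes 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points of the same cluster.
        # The distance to the point itself is 0, so dividing by len - 1 excludes it.
        a = sum(squared_euclidean(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(squared_euclidean(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

With two tight, well-separated clusters the score comes out close to 1; assigning the same points to clusters that mix the two groups pushes it below 0, which matches the range stated in the docstring.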

python/pyspark/ml/feature.py

Lines changed: 1 addition & 1 deletion
@@ -3440,7 +3440,7 @@ class ChiSqSelector(JavaEstimator, HasFeaturesCol, HasOutputCol, HasLabelCol, Ja
 
     selectorType = Param(Params._dummy(), "selectorType",
                          "The selector type of the ChisqSelector. " +
-                         "Supported options: numTopFeatures (default), percentile and fpr.",
+                         "Supported options: numTopFeatures (default), percentile, fpr, fdr, fwe.",
                          typeConverter=TypeConverters.toString)
 
     numTopFeatures = \
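
The corrected doc string lists five selector types without saying how they differ. The sketch below illustrates one reading of them, applied to a plain list of per-feature p-values: `numTopFeatures` and `percentile` keep the best-ranked features, `fpr` keeps p-values below a threshold, `fdr` applies the Benjamini-Hochberg procedure, and `fwe` scales the threshold by 1/numFeatures. It is an assumption-laden illustration, not the selector's code; `select_features` and its `alpha` parameter are hypothetical names, and the default values are placeholders.

```python
def select_features(p_values, selector_type,
                    num_top_features=50, percentile=0.1, alpha=0.05):
    """Return sorted indices of the features kept under each selector type.

    Illustrative only: `alpha` stands in for the fpr/fdr/fwe threshold.
    """
    n = len(p_values)
    # Feature indices ordered from smallest to largest p-value.
    ranked = sorted(range(n), key=lambda i: p_values[i])

    if selector_type == "numTopFeatures":
        chosen = ranked[:num_top_features]
    elif selector_type == "percentile":
        chosen = ranked[:int(n * percentile)]
    elif selector_type == "fpr":
        chosen = [i for i in ranked if p_values[i] < alpha]
    elif selector_type == "fdr":
        # Benjamini-Hochberg: keep the k best features, where k is the
        # largest rank whose p-value is at most alpha * rank / n.
        k = 0
        for rank, i in enumerate(ranked, start=1):
            if p_values[i] <= alpha * rank / n:
                k = rank
        chosen = ranked[:k]
    elif selector_type == "fwe":
        # Threshold scaled by the number of features.
        chosen = [i for i in ranked if p_values[i] < alpha / n]
    else:
        raise ValueError("Unknown selector type: %s" % selector_type)
    return sorted(chosen)
```

For the same p-values, `fwe` is the strictest of the three threshold-based types and `fpr` the most permissive, with `fdr` in between.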

python/pyspark/ml/fpm.py

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ def freqItemsets(self):
     @since("2.2.0")
     def associationRules(self):
         """
-        Data with three columns:
+        DataFrame with three columns:
         * `antecedent` - Array of the same type as the input column.
         * `consequent` - Array of the same type as the input column.
         * `confidence` - Confidence for the rule (`DoubleType`).
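
To make the three columns concrete, here is a small pure-Python sketch of how `antecedent`, `consequent`, and `confidence` relate, assuming the usual definition of confidence as freq(antecedent ∪ consequent) / freq(antecedent) and single-item consequents. The function `association_rules`, its `min_confidence` default, and the dict-of-frozensets input are hypothetical and only stand in for the frequent-itemsets table.

```python
def association_rules(freq_itemsets, min_confidence=0.8):
    """Derive (antecedent, consequent, confidence) rows from itemset counts.

    freq_itemsets maps frozenset(items) -> frequency (hypothetical input shape).
    """
    rules = []
    for itemset, freq in freq_itemsets.items():
        for item in itemset:
            antecedent = itemset - {item}
            # A rule needs a non-empty antecedent with a known frequency.
            if antecedent and antecedent in freq_itemsets:
                confidence = freq / freq_itemsets[antecedent]
                if confidence >= min_confidence:
                    rules.append((sorted(antecedent), [item], confidence))
    return rules
```

For counts `{a}: 3, {b}: 2, {a, b}: 2`, the rule `[b] -> [a]` has confidence 2/2 and passes the default threshold, while `[a] -> [b]` has confidence 2/3 and is dropped.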
