-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-24747][ML] Make Instrumentation class more flexible #21719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Instrumentation class. Updated LogisticRegression to use this API as an example.
|
Test build #92653 has finished for PR 21719 at commit
|
mengxr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made one pass
| protected[spark] def train( | ||
| dataset: Dataset[_], | ||
| handlePersistence: Boolean): LogisticRegressionModel = { | ||
| handlePersistence: Boolean): LogisticRegressionModel = Instrumentation.instrumented { instr => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid line too wide, we might want to import instrumented and save "Instrumentation" from this line.
| if (handlePersistence) instances.persist(StorageLevel.MEMORY_AND_DISK) | ||
|
|
||
| val instr = Instrumentation.create(this, dataset) | ||
| instr.logContext(this, dataset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't log anything. I think we should auto-generate prefix and keep it as a constant. So logs would appear as:
[PREFIX]: instrumentation started
[PREFIX]: using estimator logReg-abc128
[PREFIX]: using dataset some hashcode
[PREFIX]: param maxIter=10
[PREFIX]: ...
[PREFIX]: run succeeded/failed
[PREFIX]: instrumentation ended
We can generate 8 random chars as the PREFIX. This is sufficient for correlate metrics from the same run. The issue with making it mutable is that we do not have a way to guarantee logContext is always called.
So I would suggest replacing logContext with the following:
- logEstimator or logPipelineStage
- logDataset
Btw, we can by default log call site. It provides more info for instrumentation, not necessary in this PR.
| * @param estimator the estimator that is being fit | ||
| * @param dataset the training dataset | ||
| */ | ||
| def logContext(estimator: Estimator[_], dataset: RDD[_]): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my comment above
| } | ||
|
|
||
| def logSuccess(): Unit = { | ||
| log("training finished") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't have this log alias. I was wondering which log level it uses. Just use logInfo and remove log(.
| */ | ||
| def logFailure(e: Throwable): Unit = { | ||
| val msg = e.getStackTrace.mkString("\n") | ||
| super.logInfo(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failures should go to ERROR level.
|
|
||
| def instrumented[T](body: (Instrumentation => T)): T = { | ||
| val instr = new Instrumentation() | ||
| Try(body(new Instrumentation())) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use already constructed instr
| case Failure(NonFatal(e)) => | ||
| instr.logFailure(e) | ||
| throw e | ||
| case Success(model) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
model -> result, it doesn't need to be a model
40c8b41 to
3a6537d
Compare
|
Test build #92906 has finished for PR 21719 at commit
|
|
jenkins test this please |
|
Test build #92948 has finished for PR 21719 at commit
|
jkbradley
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than a couple of nits
| train(dataset, handlePersistence) | ||
| } | ||
|
|
||
| import Instrumentation.instrumented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put import at top of file with the other imports (just to make imports easier to track).
| private val prefix = s"[$shortId] " | ||
|
|
||
| // TODO: update spark.ml to use new Instrumentation APIs and remove this constructor | ||
| var stage: Params = _ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend we either plan to remove "stage" or change "logPipelineStage" so it only allows setting "stage" once. If we go with the former, how about leaving a note to remove "stage" once spark.ml code is migrated to use the new logParams() method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the plan is to remove stage once we port switch over to the new APIs
|
Test build #93189 has finished for PR 21719 at commit
|
|
LGTM |
## What changes were proposed in this pull request? Followup for #21719. Update spark.ml training code to fully wrap instrumented methods and remove old instrumentation APIs. ## How was this patch tested? existing tests. Please review http://spark.apache.org/contributing.html before opening a pull request. Author: Bago Amirbekian <[email protected]> Closes #21799 from MrBago/new-instrumentation-apis2.
What changes were proposed in this pull request?
This PR updates the Instrumentation class to make it more flexible and a little bit easier to use. When these APIs are merged, I'll followup with a PR to update the training code to use these new APIs so we can remove the old APIs. These changes are all to private APIs so this PR doesn't make any user facing changes.
How was this patch tested?
Existing tests.
Please review http://spark.apache.org/contributing.html before opening a pull request.