@dimitris-athanasiou
Contributor

No description provided.

@dimitris-athanasiou dimitris-athanasiou added the :ml Machine learning label Dec 7, 2018
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@droberts195 droberts195 left a comment

Looks good. I just noted a few minor things.

* end of data message.
*/
- private static final String END_OF_DATA_MESSAGE_CODE = "r";
+ private static final String END_OF_DATA_MESSAGE_CODE = "$";

I think it's still 'r' on the C++ side.

Contributor Author

OK cool. I missed that.

process.flushStream();

- LOGGER.debug("[{}] Closing process", jobId);
+ LOGGER.info("[{}] Waiting for result processor to complete", jobId);

I don't think we'll want this at INFO level in production. If you want to leave it as is on the feature branch, please add a TODO to downgrade it.

Contributor Author

Indeed, this shouldn't be an INFO message. I left it on purpose because I think it's useful during development. I had in mind that we'd review all logging when this is refactored into persistent tasks. I'm not sure I'd add a TODO for each one of them, but I can if you think it's best.

} catch (IOException e) {
LOGGER.error(new ParameterizedMessage("[{}] Error writing data to the process", jobId), e);
} finally {
LOGGER.info("[{}] Closing process", jobId);

Ditto: either downgrade this now or add a TODO to do so before merging the feature branch.
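
To illustrate why the level matters, here is a minimal sketch of how a logger threshold filters messages. It deliberately uses java.util.logging rather than the Log4j 2 logger in the PR so that it runs with the standard library alone; FINE stands in for DEBUG, and the class name LogLevelSketch and the logger name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LogLevelSketch {

    // Collects every record that passes the logger's level check.
    static class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getLevel() + ": " + record.getMessage());
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    // Logs one INFO and one FINE message and returns what was actually emitted
    // when the logger is configured with the given threshold.
    static List<String> run(Level threshold) {
        Logger logger = Logger.getLogger("sketch." + threshold);
        logger.setUseParentHandlers(false); // keep output out of the console
        logger.setLevel(threshold);
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);

        // TODO downgrade to FINE before merging the feature branch
        logger.info("Waiting for result processor to complete");
        logger.fine("Closing process");

        logger.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println(run(Level.INFO)); // only the INFO message
        System.out.println(run(Level.FINE)); // both messages
    }
}
```

With a typical production threshold of INFO, anything logged at INFO is always emitted, which is why the reviewer asks for a downgrade or a TODO.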

}
AnalyticsResult result = currentResults.get(i);
SearchHit hit = row.getHit();
Map<String, Object> source = new HashMap(hit.getSourceAsMap());

Do we want the new fields to come at the end of the source document? If so, this should use LinkedHashMap instead of HashMap. (Maybe we don't care though.)

Contributor Author

Good question. I'm not sure what the order actually affects. But I think it makes sense to append them at the end, as the least invasive way to annotate the original rows.
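
For reference, a minimal sketch of the ordering point raised above, assuming the source map handed in already iterates in document order. The class name SourceFieldOrder, the helper appendResults, and the field name outlier_score are hypothetical and not taken from the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SourceFieldOrder {

    // Copies the source into a LinkedHashMap, which keeps the source's
    // iteration order, then appends the result fields after it.
    // A result key that already exists in the source keeps its original
    // position; only its value is replaced.
    static Map<String, Object> appendResults(Map<String, Object> source,
                                             Map<String, Object> results) {
        Map<String, Object> merged = new LinkedHashMap<>(source);
        merged.putAll(results);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("zeta", 1);
        source.put("alpha", 2);

        Map<String, Object> results = new LinkedHashMap<>();
        results.put("outlier_score", 0.9); // hypothetical result field

        List<String> keys = new ArrayList<>(appendResults(source, results).keySet());
        System.out.println(keys); // prints [zeta, alpha, outlier_score]
    }
}
```

A plain HashMap gives no ordering guarantee, so the appended fields could land anywhere when the document is serialized.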

public Iterator<AutodetectResult> parseResults(InputStream in) throws ElasticsearchParseException {
public class ProcessResultsParser<T> {

private static final Logger LOGGER = LogManager.getLogger(ProcessResultsParser.class);

I think the convention is a lowercase logger name for static loggers, since the object is not immutable.

inOrder.verify(lengthEncodedWriter).writeNumFields(4);
inOrder.verify(lengthEncodedWriter, times(3)).writeField("");
- inOrder.verify(lengthEncodedWriter).writeField("r");
+ inOrder.verify(lengthEncodedWriter).writeField("$");

Still 'r' in C++.


@droberts195 droberts195 left a comment

LGTM

@dimitris-athanasiou dimitris-athanasiou merged commit fd3e6a9 into elastic:feature-ml-data-frame-analytics Dec 11, 2018
@dimitris-athanasiou dimitris-athanasiou deleted the parse-results-and-join-them-in-data-frame-copy-index branch December 11, 2018 12:17