Other tokenizers (like the `standard` tokenizer) support overriding the `max_token_length` parameter, but the `whitespace` tokenizer doesn't appear to, even though the underlying Lucene `WhitespaceTokenizer` seems to support a maximum token length. We should probably expose this parameter in Elasticsearch as well.
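For comparison, here's a sketch of how `max_token_length` is configured today on the `standard` tokenizer, followed by what an equivalent `whitespace` configuration might look like if this were supported (the second block is the proposal, not something that currently works):

```json
PUT my_index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_standard": {
          "type": "standard",
          "max_token_length": 10
        },
        "my_whitespace": {
          "type": "whitespace",
          "max_token_length": 10
        }
      }
    }
  }
}
```

The idea would be to pass the setting through to the Lucene tokenizer the same way the standard tokenizer factory does, so tokens longer than `max_token_length` get split at that length instead of being emitted whole.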