Add LST20 Part-Of-Speech tagger model #464
Hello @wannaphong! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-08-20 10:32:19 UTC
I am training a new model. I will combine the eval set with the train set.
Done
It would be nice to have some comparisons between these taggers and the existing ones in PyThaiNLP.
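One simple way to make such a comparison is token-level accuracy against a gold-tagged test set. The snippet below is only a sketch: `token_accuracy`, the tagger names, and the tag sequences are all hypothetical placeholders, not measurements from the models in this PR.

```python
from typing import Dict, List, Sequence


def token_accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Made-up data for illustration only; real numbers would come from running
# each tagger over the same held-out sentences.
gold_tags: List[str] = ["NN", "VV", "NN", "PU"]
outputs: Dict[str, List[str]] = {
    "tagger_a": ["NN", "VV", "NN", "PU"],
    "tagger_b": ["NN", "NN", "NN", "PU"],
}

scores = {name: token_accuracy(gold_tags, tags) for name, tags in outputs.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2%}")
```

For a fair comparison, every tagger would need to be scored on the same tokenization and the same tag set (for example, after mapping to Universal Dependencies).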
Use list comprehension in _orchid_to_ud and _lst20_to_ud
Use list comprehension for _postag_clean
Looks good
…modules (lst20/orchid) to tagger submodules (unigram/perceptron).
I refactored some of the code: I moved the tagger-related code out of the corpus data files (lst20.py and orchid.py) and into the tagger files (perceptron.py and unigram.py).
The LST20 Corpus is from the National Electronics and Computer Technology Center (NECTEC), Thailand. The dataset can be downloaded from https://aiforthai.in.th/corpus.php.
Supported models
and a tag map from LST20 to Universal Dependencies.
Models trained by Mr. Wannaphong Phatthiyaphaibun
Model license: CC-0
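To illustrate what a tag map from LST20 to Universal Dependencies does, here is a minimal sketch using a list comprehension. The dictionary `LST20_TO_UD_SAMPLE`, the helper `to_ud`, and the fallback tag are assumptions made for this example; the mapping is a small illustrative subset and is not the tag map added by this PR.

```python
from typing import List, Tuple

# Illustrative subset only: these LST20 -> UD pairs are assumptions for this
# sketch, not the complete tag map shipped with the PR.
LST20_TO_UD_SAMPLE = {
    "NN": "NOUN",
    "VV": "VERB",
    "NU": "NUM",
    "PU": "PUNCT",
}


def to_ud(
    tagged: List[Tuple[str, str]], fallback: str = "X"
) -> List[Tuple[str, str]]:
    """Map each (word, LST20 tag) pair to a UD tag; unknown tags get `fallback`."""
    return [(word, LST20_TO_UD_SAMPLE.get(tag, fallback)) for word, tag in tagged]


# Hand-written input standing in for tagger output.
tagged_lst20 = [("แมว", "NN"), ("กิน", "VV"), ("ปลา", "NN"), ("3", "NU")]
print(to_ud(tagged_lst20))
```

Using `dict.get` with a fallback keeps the conversion from raising on tags missing from the map; whether the actual implementation handles unknown tags this way is not stated in this PR.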
Code
TODO