Add PhayaThaiBERT model into PyThaiNLP [WIP]

The recently released paper **PhayaThaiBERT: Enhancing a Pretrained Thai Language Model with Unassimilated Loanwords** reports impressive results, with better handling of foreign words than existing Thai encoder-based models.


I think it would be great to add it to the downstream tasks PyThaiNLP supports, e.g. token classification, to strengthen the library. What do you think? If everyone agrees, I can help integrate it as a new engine as soon as possible.

## New features 
Here are the tasks that, after reading the paper, I found can be integrated into PyThaiNLP. The list below shows the current progress and the contributors developing each model (a ✅ check mark means the task has already been added to the source code; I will open a complete PR once all of them are done):

- [x] Part-of-speech tagging on blackboard corpus by @MpolaarbearM 
- [x] Named-entity recognition on Thainer-v2 corpus by @pavaris-pm 
- [x] Tokenization by @pavaris-pm 
- [x] Data Augmentation (Text) by @pavaris-pm 
- [ ] Word Correction (currently under research and development)


etc ... (I will keep adding to the list based on what I find during experiments)

If you are interested, feel free to leave a comment below if you want to develop a model for any task that interests you. After that, you can make a PR to the same branch as in PR #873.



Add PhayaThaiBERT model into PyThaiNLP [WIP] #868
