Commit 52524c4

Merge pull request #845 from BLKSerene/update_code_comments
Update code comments and clean up codes
2 parents 73b17e3 + c63e568 commit 52524c4

154 files changed: +1126 -1178 lines changed

CONTRIBUTING.md

Lines changed: 25 additions & 25 deletions
@@ -7,36 +7,36 @@ Please refer to our [Contributor Covenant Code of Conduct](https://github.com/Py
 ## Issue Report and Discussion

 - Discussion: https://github.com/PyThaiNLP/pythainlp/discussions
-- GitHub issues (problems and suggestions): https://github.com/PyThaiNLP/pythainlp/issues
-- Facebook group (not specific to PyThaiNLP, can be Thai NLP discussion in general): https://www.facebook.com/groups/thainlp
+- GitHub issues (for problems and suggestions): https://github.com/PyThaiNLP/pythainlp/issues
+- Facebook group (not specific to PyThaiNLP, for Thai NLP discussion in general): https://www.facebook.com/groups/thainlp


 ## Code

 ## Code Guidelines

-- Follows [PEP8](http://www.python.org/dev/peps/pep-0008/), use [black](https://github.com/ambv/black) with `--line-length` = 79;
+- Follow [PEP8](http://www.python.org/dev/peps/pep-0008/), use [black](https://github.com/ambv/black) with `--line-length` = 79;
 - Name identifiers (variables, classes, functions, module names) with meaningful
 and pronounceable names (`x` is always wrong);
 - Please follow this [naming convention](https://namingconvention.org/python/). For example, global constant variables must be in `ALL_CAPS`;
 <img src="https://i.stack.imgur.com/uBr10.png" />
-- Write tests for your new features. Test suites are in `tests/` directory. (see "Testing" section below);
+- Write tests for your new features. The test suite is in `tests/` directory. (see "Testing" section below);
 - Run all tests before pushing (just execute `tox`) so you will know if your
 changes broke something;
-- Commented code is [dead
-code](http://www.codinghorror.com/blog/2008/07/coding-without-comments.html);
+- Commented out codes are [dead
+codes](http://www.codinghorror.com/blog/2008/07/coding-without-comments.html);
 - All `#TODO` comments should be turned into [issues](https://github.com/pythainlp/pythainlp/issues) in GitHub;
-- When appropriate, use [f-String](https://www.python.org/dev/peps/pep-0498/)
+- When appropriate, use [f-string](https://www.python.org/dev/peps/pep-0498/)
 (use `f"{a} = {b}"`, instead of `"{} = {}".format(a, b)` and `"%s = %s' % (a, b)"`);
-- All text files, including source code, must be ended with one empty line. This is [to please git](https://stackoverflow.com/questions/5813311/no-newline-at-end-of-file#5813359) and [to keep up with POSIX standard](https://stackoverflow.com/questions/729692/why-should-text-files-end-with-a-newline).
+- All text files, including source codes, must end with one empty line. This is [to please git](https://stackoverflow.com/questions/5813311/no-newline-at-end-of-file#5813359) and [to keep up with POSIX standard](https://stackoverflow.com/questions/729692/why-should-text-files-end-with-a-newline).

 ### Version Control System

 - We use [Git](http://git-scm.com/) as our [version control system](http://en.wikipedia.org/wiki/Revision_control),
 so it may be a good idea to familiarize yourself with it.
 - You can start with the [Pro Git book](http://git-scm.com/book/) (free!).

-### Commit Comment
+### Commit Message

 - [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/)
 - [Commit Verbs 101: why I like to use this and why you should also like it.](https://chris.beams.io/posts/git-commit/)
@@ -45,24 +45,24 @@ so it may be a good idea to familiarize yourself with it.

 - We use the famous [gitflow](http://nvie.com/posts/a-successful-git-branching-model/)
 to manage our branches.
-- When you do pull request on GitHub, Travis CI and AppVeyor will run tests
+- When you create pull requests on GitHub, Github Actions and AppVeyor will run tests
 and several checks automatically. Click the "Details" link at the end of
 each check to see what needs to be fixed.


 ## Documentation

 - We use [Sphinx](https://www.sphinx-doc.org/en/master/) to generate API document
-automatically from "docstring" comments in source code. This means the comment
-section in the source code is important for the quality of documentation.
-- A docstring should start with one summary line, ended the line with a full stop (period),
-then followed by a blank line before the start new paragraph.
-- A commit to release branches (e.g. `2.2`, `2.1`) with a title **"(build and deploy docs)"** (without quotes) will trigger the system to rebuild the documentation files and upload them to the website https://pythainlp.github.io/docs
+automatically from "docstring" comments in source codes. This means the comment
+section in the source codes is important for the quality of documentation.
+- A docstring should start with one summary line, end with one line with a full stop (period),
+then be followed by a blank line before starting a new paragraph.
+- A commit to release branches (e.g. `2.2`, `2.1`) with a title **"(build and deploy docs)"** (without quotes) will trigger the system to rebuild the documentation files and upload them to the website https://pythainlp.github.io/docs.


 ## Testing

-We use standard Python `unittest`. Test suites are in `tests/` directory.
+We use standard Python `unittest`. The test suite is in `tests/` directory.

 To run unit tests locally together with code coverage test:

@@ -81,12 +81,12 @@ Generate code coverage test in HTML (files will be available in `htmlcov/` direc
 coverage html
 ```

-Make sure the same tests pass on Travis CI and AppVeyor.
+Make sure the tests pass on both Github Actions and AppVeyor.


 ## Releasing
 - We use [semantic versioning](https://semver.org/): MAJOR.MINOR.PATCH, with development build suffix: MAJOR.MINOR.PATCH-devBUILD
-- Use [`bumpversion`](https://github.com/c4urself/bump2version/#installation) to manage versioning.
+- We use [`bumpversion`](https://github.com/c4urself/bump2version/#installation) to manage versioning.
 - `bumpversion [major|minor|patch|release|build]`
 - Example:
 ```
@@ -129,18 +129,18 @@ Make sure the same tests pass on Travis CI and AppVeyor.
 <img src="https://contributors-img.firebaseapp.com/image?repo=PyThaiNLP/pythainlp" />
 </a>

-Thanks all the [contributors](https://github.com/PyThaiNLP/pythainlp/graphs/contributors). (Image made with [contributors-img](https://contributors-img.firebaseapp.com))
+Thanks to all [contributors](https://github.com/PyThaiNLP/pythainlp/graphs/contributors). (Image made with [contributors-img](https://contributors-img.firebaseapp.com))

-### Development Lead
-- Wannaphong Phatthiyaphaibun <[email protected]> - founder, distribution and maintainance
-- Korakot Chaovavanich - initial tokenization and soundex code
+### Development Leads
+- Wannaphong Phatthiyaphaibun <[email protected]> - foundation, distribution and maintenance
+- Korakot Chaovavanich - initial tokenization and soundex codes
 - Charin Polpanumas - classification and benchmarking
 - Peeradej Tanruangporn - documentation
-- Arthit Suriyawongkul - refactoring, packaging, distribution, and maintainance
+- Arthit Suriyawongkul - refactoring, packaging, distribution, and maintenance
 - Chakri Lowphansirikul - documentation
 - Pattarawat Chormai - benchmarking
-- Thanathip Suntorntip - nlpO3 maintainance, Rust Developer
-- Can Udomcharoenchaikit - documentation and code
+- Thanathip Suntorntip - nlpO3 maintenance, Rust Developer
+- Can Udomcharoenchaikit - documentation and codes

 ### Maintainers
 - Arthit Suriyawongkul
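
Two of the code guidelines touched by this CONTRIBUTING.md diff (prefer f-strings over `str.format` and `%` formatting, and start a docstring with a one-line summary that ends in a full stop and is followed by a blank line) can be illustrated with a minimal sketch. The function `format_pair` and its arguments are hypothetical names chosen for illustration; they are not part of PyThaiNLP.

```python
def format_pair(name: str, value: int) -> str:
    """Return a "name = value" string built with an f-string.

    The summary line above ends with a full stop and is followed by a
    blank line, as the docstring guideline asks. This helper is a
    hypothetical illustration, not PyThaiNLP code.
    """
    return f"{name} = {value}"


# The three spellings produce the same text; the guideline prefers the first.
preferred = format_pair("tokens", 3)
with_format = "{} = {}".format("tokens", 3)
with_percent = "%s = %s" % ("tokens", 3)
print(preferred == with_format == with_percent)
```

Since all three forms yield the same string, the preference for f-strings is a readability choice rather than a behavioral one.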

INTHEWILD.md

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 # Who uses PyThaiNLP?

-We'd like to keep track of who is using the package. Please send a PR with your company name or @githubhandle or company name with @githubhandle.
+We'd like to keep track of who are using the package. Please send a PR with your company name or @githubhandle or both company name and @githubhandle.

-Currently, officially using PyThaiNLP:
+Currently, those who are officially using PyThaiNLP are as follows:

 1. [Hope Data Annotations Co., Ltd.](https://hopedata.org) ([@hopedataannotations](https://github.com/hopedataannotaions))
 2. [Codustry (Thailand) Co., Ltd.](https://codustry.com) ([@codustry](https://github.com/codustry))

README.md

Lines changed: 21 additions & 21 deletions
@@ -13,13 +13,13 @@
 <a href="https://matrix.to/#/#thainlp:matrix.org" rel="noopener" target="_blank"><img src="https://matrix.to/img/matrix-badge.svg" alt="Chat on Matrix"></a>
 </div>

-PyThaiNLP is a Python package for text processing and linguistic analysis, similar to [NLTK](https://www.nltk.org/) with focus on Thai language.
+PyThaiNLP is a Python package for text processing and linguistic analysis, similar to [NLTK](https://www.nltk.org/) with a focus on the Thai language.

 PyThaiNLP เป็นไลบารีภาษาไพทอนสำหรับประมวลผลภาษาธรรมชาติ คล้ายกับ NLTK โดยเน้นภาษาไทย [ดูรายละเอียดภาษาไทยได้ที่ README_TH.MD](https://github.com/PyThaiNLP/pythainlp/blob/dev/README_TH.md)

 **News**

-> Now, You can contact or ask any questions with the PyThaiNLP team. <a href="https://matrix.to/#/#thainlp:matrix.org" rel="noopener" target="_blank"><img src="https://matrix.to/img/matrix-badge.svg" alt="Chat on Matrix"></a>
+> Now, You can contact with or ask any questions of the PyThaiNLP team. <a href="https://matrix.to/#/#thainlp:matrix.org" rel="noopener" target="_blank"><img src="https://matrix.to/img/matrix-badge.svg" alt="Chat on Matrix"></a>

 | Version | Description | Status |
 |:------:|:--:|:------:|
@@ -37,7 +37,7 @@ PyThaiNLP เป็นไลบารีภาษาไพทอนสำหร

 ## Capabilities

-PyThaiNLP provides standard NLP functions for Thai, for example part-of-speech tagging, linguistic unit segmentation (syllable, word, or sentence). Some of these functions are also available via command-line interface.
+PyThaiNLP provides standard NLP functions for Thai, for example part-of-speech tagging, linguistic unit segmentation (syllable, word, or sentence). Some of these functions are also available via the command-line interface.

 <details>
 <summary>List of Features</summary>
@@ -48,11 +48,11 @@ PyThaiNLP provides standard NLP functions for Thai, for example part-of-speech t
 - Thai spelling suggestion and correction (`spell` and `correct`)
 - Thai transliteration (`transliterate`)
 - Thai soundex (`soundex`) with three engines (`lk82`, `udom83`, `metasound`)
-- Thai collation (sort by dictionary order) (`collate`)
+- Thai collation (sorted by dictionary order) (`collate`)
 - Read out number to Thai words (`bahttext`, `num_to_thaiword`)
 - Thai datetime formatting (`thai_strftime`)
 - Thai-English keyboard misswitched fix (`eng_to_thai`, `thai_to_eng`)
-- Command-line interface for basic functions, like tokenization and pos tagging (run `thainlp` in your shell)
+- Command-line interface for basic functions, like tokenization and POS tagging (run `thainlp` in your shell)
 </details>


@@ -67,7 +67,7 @@ This will install the latest stable release of PyThaiNLP.
 Install different releases:

 - Stable release: `pip install --upgrade pythainlp`
-- Pre-release (near ready): `pip install --upgrade --pre pythainlp`
+- Pre-release (nearly ready): `pip install --upgrade --pre pythainlp`
 - Development (likely to break things): `pip install https://github.com/PyThaiNLP/pythainlp/archive/dev.zip`

 ### Installation Options
@@ -92,27 +92,27 @@ pip install pythainlp[extra1,extra2,...]
 - `wordnet` (for Thai WordNet API)
 </details>

-For dependency details, look at `extras` variable in [`setup.py`](https://github.com/PyThaiNLP/pythainlp/blob/dev/setup.py).
+For dependency details, look at the `extras` variable in [`setup.py`](https://github.com/PyThaiNLP/pythainlp/blob/dev/setup.py).


-## Data directory
+## Data Directory

-- Some additional data, like word lists and language models, may get automatically download during runtime.
+- Some additional data, like word lists and language models, may be automatically downloaded during runtime.
 - PyThaiNLP caches these data under the directory `~/pythainlp-data` by default.
-- Data directory can be changed by specifying the environment variable `PYTHAINLP_DATA_DIR`.
+- The data directory can be changed by specifying the environment variable `PYTHAINLP_DATA_DIR`.
 - See the data catalog (`db.json`) at https://github.com/PyThaiNLP/pythainlp-corpus


 ## Command-Line Interface

-Some of PyThaiNLP functionalities can be used at command line, using `thainlp` command.
+Some of PyThaiNLP functionalities can be used via command line with the `thainlp` command.

-For example, displaying a catalog of datasets:
+For example, to display a catalog of datasets:
 ```sh
 thainlp data catalog
 ```

-Showing how to use:
+To show how to use:
 ```sh
 thainlp help
 ```
@@ -122,16 +122,16 @@ thainlp help

 | | License |
 |:---|:----|
-| PyThaiNLP Source Code and Notebooks | [Apache Software License 2.0](https://github.com/PyThaiNLP/pythainlp/blob/dev/LICENSE) |
+| PyThaiNLP source codes and notebooks | [Apache Software License 2.0](https://github.com/PyThaiNLP/pythainlp/blob/dev/LICENSE) |
 | Corpora, datasets, and documentations created by PyThaiNLP | [Creative Commons Zero 1.0 Universal Public Domain Dedication License (CC0)](https://creativecommons.org/publicdomain/zero/1.0/)|
 | Language models created by PyThaiNLP | [Creative Commons Attribution 4.0 International Public License (CC-by)](https://creativecommons.org/licenses/by/4.0/) |
-| Other corpora and models that may included with PyThaiNLP | See [Corpus License](https://github.com/PyThaiNLP/pythainlp/blob/dev/pythainlp/corpus/corpus_license.md) |
+| Other corpora and models that may be included in PyThaiNLP | See [Corpus License](https://github.com/PyThaiNLP/pythainlp/blob/dev/pythainlp/corpus/corpus_license.md) |


 ## Contribute to PyThaiNLP

-- Please do fork and create a pull request :)
-- For style guide and other information, including references to algorithms we use, please refer to our [contributing](https://github.com/PyThaiNLP/pythainlp/blob/dev/CONTRIBUTING.md) page.
+- Please fork and create a pull request :)
+- For style guides and other information, including references to algorithms we use, please refer to our [contributing](https://github.com/PyThaiNLP/pythainlp/blob/dev/CONTRIBUTING.md) page.

 ## Who uses PyThaiNLP?

@@ -140,13 +140,13 @@ You can read [INTHEWILD.md](https://github.com/PyThaiNLP/pythainlp/blob/dev/INTH

 ## Citations

-If you use `PyThaiNLP` in your project or publication, please cite the library as follows
+If you use `PyThaiNLP` in your project or publication, please cite the library as follows:

 ```
 Wannaphong Phatthiyaphaibun, Korakot Chaovavanich, Charin Polpanumas, Arthit Suriyawongkul, Lalita Lowphansirikul, & Pattarawat Chormai. (2016, Jun 27). PyThaiNLP: Thai Natural Language Processing in Python. Zenodo. http://doi.org/10.5281/zenodo.3519354
 ```

-or BibTeX entry:
+or by BibTeX entry:

 ``` bib
 @misc{pythainlp,
@@ -166,7 +166,7 @@ or BibTeX entry:
 | Logo | Description |
 | --- | ----------- |
 | [![VISTEC-depa Thailand Artificial Intelligence Research Institute](https://airesearch.in.th/assets/img/logo/airesearch-logo.svg)](https://airesearch.in.th/) | Since 2019, our contributors Korakot Chaovavanich and Lalita Lowphansirikul have been supported by [VISTEC-depa Thailand Artificial Intelligence Research Institute](https://airesearch.in.th/). |
-| [![MacStadium](https://i.imgur.com/rKy1dJX.png)](https://www.macstadium.com) | We get support free Mac Mini M1 from [MacStadium](https://www.macstadium.com) for doing Build CI. |
+| [![MacStadium](https://i.imgur.com/rKy1dJX.png)](https://www.macstadium.com) | We get support of free Mac Mini M1 from [MacStadium](https://www.macstadium.com) for running CI builds. |

 ------

@@ -181,5 +181,5 @@ or BibTeX entry:
 </div>

 <div align="center">
-<strong>Beware of malware if you use code from mirrors other than the official two at GitHub and GitLab.</strong>
+<strong>Beware of malware if you use codes from mirrors other than the official two on GitHub and GitLab.</strong>
 </div>
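
The data directory rule described in the README diff above (use `PYTHAINLP_DATA_DIR` when it is set, otherwise fall back to `~/pythainlp-data`) can be sketched as below. The helper `resolve_data_dir` is a hypothetical name that only mirrors the documented rule; it is not PyThaiNLP's own implementation, and the example path `/tmp/thai-data` is an assumption.

```python
import os


def resolve_data_dir(env=None):
    """Return the data directory implied by the README rule.

    Hypothetical helper, not part of PyThaiNLP: it checks the
    PYTHAINLP_DATA_DIR environment variable first and falls back to
    ~/pythainlp-data when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    default_dir = os.path.join(os.path.expanduser("~"), "pythainlp-data")
    return env.get("PYTHAINLP_DATA_DIR") or default_dir


# Passing a plain dict keeps the example independent of the real environment.
print(resolve_data_dir({"PYTHAINLP_DATA_DIR": "/tmp/thai-data"}))
print(resolve_data_dir({}))
```

Taking the environment as an optional argument is a design choice for this sketch only: it makes the rule easy to check without modifying `os.environ`.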
