@peternewman
Collaborator

No description provided.

@peternewman peternewman added the dictionary Changes to the dictionary label Aug 14, 2020
@peternewman peternewman requested a review from larsoner August 14, 2020 18:55
@peternewman peternewman changed the title Add correspomding->corresponding and friends Add various new words Aug 28, 2020
@peternewman peternewman changed the title Add various new words Add various new words and suggestions for existing typos Aug 28, 2020
@peternewman peternewman marked this pull request as draft August 28, 2020 04:20
@peternewman
Collaborator Author

Blocked behind #1647 as I'd like to test it catches these ones too.

@peternewman peternewman marked this pull request as ready for review September 2, 2020 23:10
@peternewman
Collaborator Author

> Blocked behind #1647 as I'd like to test it catches these ones too.

It does indeed.

Collaborator Author

@peternewman peternewman left a comment

All done @lurch and @sebweb3r . Do you want to re-review?

automatizes->automates
backword->backward
backwords->backwards
bale->able
Contributor

Maybe pale?

Collaborator Author

And kale, dale, male etc. None of those letters are close to B though. I really just wanted to cover the transposing of two characters from one perfectly valid word ( https://en.wikipedia.org/wiki/Baler ) to another more likely one. I suspect male might be more likely than pale anyway if we went down that route.

Contributor

@sebweb3r sebweb3r Sep 14, 2020

Pale just came to mind because there is a German dialect that doesn't distinguish between b and p, or between d and t 😄
Close on the keyboard would be vale. I just wanted to prevent people from auto-correcting something they didn't want to (yes, I know, interactive mode).
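
For anyone wanting to check the "close on the keyboard" argument, here is a minimal sketch. It assumes a US QWERTY layout, and `qwerty_neighbours` is a hypothetical helper, not something codespell provides. It reports which of the candidate words from this thread start with a key that physically touches B.

```python
# Assumption: a US QWERTY layout; other layouts give different neighbours.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def qwerty_neighbours(key):
    """Return the set of keys physically touching `key` on the assumed layout."""
    key = key.lower()
    for r, row in enumerate(QWERTY_ROWS):
        if key not in row:
            continue
        c = row.index(key)
        # (row offset, column offset): left/right on the same row, plus the two
        # keys above and the two below, allowing for the usual row stagger.
        offsets = [(0, -1), (0, 1), (-1, 0), (-1, 1), (1, -1), (1, 0)]
        found = set()
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(QWERTY_ROWS) and 0 <= cc < len(QWERTY_ROWS[rr]):
                found.add(QWERTY_ROWS[rr][cc])
        return found
    return set()  # not a letter key on this layout


# Candidate words mentioned in this thread, all differing from "bale" in the first letter.
candidates = ["pale", "kale", "dale", "male", "vale", "sale"]
near_b = [word for word in candidates if word[0] in qwerty_neighbours("b")]
print(near_b)  # ['vale']
```

On this assumed layout only vale starts with a key next to B, which matches the point made above.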

Contributor

Could also suggest bale->bald ?

Collaborator Author

Personally, I don't think vaguely infrequent, rare valid words should suggest similar but unrelated ones unless there are very strong arguments for it; bald seems the only likely typo we've come up with. I'd suggest vale is as infrequent as bale.

I think also the frequency stats are very telling (I wasn't too far off with vale; bald is still pretty rare):
http://app.aspell.net/lookup?dict=en_US&words=bale%0D%0Aable%0D%0Avale%0D%0Abald%0D%0Apale%0D%0Amale%0D%0Adale%0D%0Akale%0D%0Asale

Between those and the fact it's just a transposition, so not even a real typo, I'd really like to leave it as is.

I think the auto-correct solution is to add back what we dropped when we went to multi-dictionary, and have everything in rare (and names and informal) offer corrections to itself automatically. I can pull it out to a separate PR until that's been done if that would keep you two happy.
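
To make the "it's just a transposition" point concrete, here is a small sketch. `is_adjacent_transposition` is a hypothetical helper written for illustration and is not part of codespell. It checks whether one word is another with a single pair of neighbouring characters swapped.

```python
def is_adjacent_transposition(a, b):
    """True if `b` is `a` with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    # Positions where the two words disagree.
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1          # the two positions are neighbours
        and a[diffs[0]] == b[diffs[1]]        # and the characters are swapped
        and a[diffs[1]] == b[diffs[0]]
    )


print(is_adjacent_transposition("bale", "able"))  # True
print(is_adjacent_transposition("bale", "pale"))  # False (substitution)
print(is_adjacent_transposition("bale", "bald"))  # False (substitution)
```

Of the suggestions raised here, only able is a pure transposition of bale; the others each need a substituted letter.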

Contributor

You obviously understand this project much better than I do; I don't have strong opinions either way.

accessoirez->accessorize, accessories,
accessort->accessor
accesss->access
accesssor->accessor
Contributor

Could also suggest `accesssor->access or` (if that's not too far outside the scope of this PR)
Should `accessor->access or` maybe be in the code dictionary?

Collaborator Author

> Could also suggest `accesssor->access or` (if that's not too far outside the scope of this PR)

I think that should be a separate PR if it goes in, but typing an s instead of a space seems rather unlikely. Some real-world examples would be good again.

> Should `accessor->access or` maybe be in the code dictionary?

Yes if it was to go in. There seem to be enough examples of those words together that it's plausible. But probably one for a separate PR given this one is getting big. Why don't you do the honours?

Contributor

Sorry if I made too many suggestions, I'm happy for you to just ignore any comments I made, as you see fit.

ninimum->minimum
ninjs->ninja
ninteenth->nineteenth
ninties->1990s
Contributor

@lurch lurch Sep 17, 2020

Oh wow, this seems like a bit of an extreme "correction"! Should probably just be ninties->nineties ?
You wouldn't really want "My grandma lived into her ninties" to be corrected to "My grandma lived into her 1990s" 🤣

EDIT: Also, perhaps the ninty->ninety just below here should be ninty->ninety, minty,?

Collaborator Author

Good spot, do you want to open a new PR for them given they're entirely unrelated to this one and it's growing rather large already? I suspect they should be in informal probably. There are probably 80s/70s etc variants too.

Collaborator Author

> Should probably just be ninties->nineties ?

I've added that option as the primary choice.

> EDIT: Also, perhaps the ninty->ninety just below here should be ninty->ninety, minty,?

Done this.

Contributor

> There are probably 80s/70s etc variants too.

Nope 🙂

$ grep '[0-9]' codespell_lib/data/dictionary.txt 
1nd->1st
2rd->2nd
2st->2nd
3nd->3rd
3st->3rd
nd->and, 2nd,
ninties->1990s
p0enis->penis
sh1sum->sha1sum
UTF8ness->UTF-8-ness
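
The same check can be sketched in Python. This assumes the entry shape seen in the lines quoted in this thread (`typo->fix`, or `typo->fix1, fix2,` with a trailing comma when there are several suggestions). `parse_entry` and `entries_with_digits` are hypothetical helpers, not codespell functions, and the sample list is a small subset of entries quoted above, not the real dictionary file.

```python
import re


def parse_entry(line):
    """Split 'typo->fix1, fix2,' into (typo, [fix1, fix2])."""
    typo, _, rhs = line.strip().partition("->")
    # Drop the empty piece left behind by a trailing comma.
    fixes = [part.strip() for part in rhs.split(",") if part.strip()]
    return typo, fixes


def entries_with_digits(lines):
    """Keep only the entries containing a digit, like grep '[0-9]'."""
    return [line for line in lines if re.search(r"[0-9]", line)]


# A few entries quoted in this thread, for illustration only.
sample = [
    "2st->2nd",
    "nd->and, 2nd,",
    "ninties->1990s",
    "ninty->ninety, minty,",
    "calender->calendar",
]

print(entries_with_digits(sample))
# ['2st->2nd', 'nd->and, 2nd,', 'ninties->1990s']
print(parse_entry("ninty->ninety, minty,"))
# ('ninty', ['ninety', 'minty'])
```

This simple split does not handle anything beyond the plain forms shown here, so treat it as a reading aid for the entries above.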

calcutated->calculated
caleed->called
calender->calendar
calenders->calendars
Member

Oh wow, new one for me

@larsoner larsoner merged commit 20b30ed into master Nov 3, 2020
@larsoner larsoner deleted the peternewman-correspomding branch November 3, 2020 15:12