New feature: uppercase first letter in sentence, code cleanup #20
ivoras wants to merge 2 commits into oliverguhr:main from
Thanks for the PR! I always had the idea to add true casing to the model. However, I see an issue here. For example, given the following text:
the output would be:
This true casing would only work after a "." or "?". It would not work for the first sentence of the text, and it would not work after "!", as we don't detect them.
I don't know what you mean by "!", as the patch doesn't use it, but I've also noticed that it doesn't capitalise the first sentence of the text, so I've updated the patch. I know this is not proper true-casing, as that would probably also involve applying it to possible names inside sentences, but it's good enough for my needs. There's a model on HF that attempts to do that (1-800-BAD-CODE/xlm-roberta_punctuation_fullstop_truecase), but it's too buggy.
Major change: making the first letter in the word following a "." or "?" uppercase (optional, defaults to off).
Minor changes: code cleanup, whitespace removal.
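
For illustration, the behaviour discussed in this thread could be sketched roughly as follows. This is not the code from the patch: the function name `uppercase_sentence_starts` and the `include_text_start` flag are hypothetical, and the sketch assumes the text has already had punctuation restored.

```python
import re

# Hypothetical helper, not part of the library's API.
def uppercase_sentence_starts(text: str, include_text_start: bool = True) -> str:
    """Uppercase the first letter of the word following a "." or "?"."""
    # A "." or "?" followed by whitespace and then a word character.
    # "!" is deliberately left out, since it is not among the detected marks.
    after_mark = re.compile(r"([.?]\s+)(\w)")
    result = after_mark.sub(lambda m: m.group(1) + m.group(2).upper(), text)

    if include_text_start:
        # Also capitalise the very first word of the text (assumed option).
        result = re.sub(
            r"^(\s*)(\w)",
            lambda m: m.group(1) + m.group(2).upper(),
            result,
            count=1,
        )
    return result


print(uppercase_sentence_starts("hello there. how are you? fine! ok"))
```

As noted in the conversation, this only handles sentence starts; names inside a sentence stay lowercase, so it is not full true-casing.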