
Added two modes for simple_word_tokenize: compact and full #7

Merged
hadikhamoud merged 1 commit into main from simplifying-word-tokenize on Dec 21, 2025

Conversation

@hadikhamoud
Member

Compact mode:

  • includes only the Arabic and English charsets

Full mode:

  • contains the full charset (provided by camel_tools)
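
The difference between the two modes can be illustrated with a minimal, standard-library sketch. This is not the code from this PR: the name `tokenize_sketch`, the `mode` parameter, and both character ranges are assumptions made for illustration, and Python's Unicode-aware `\w` class merely stands in for the full charset that camel_tools provides.

```python
import re

# Hypothetical character sets; the PR does not show the actual ones.
# Compact: ASCII letters and digits plus Arabic letters and diacritics
# (U+0621-U+0652).
_COMPACT_WORD = r"[A-Za-z0-9\u0621-\u0652]+"
# Full: any Unicode letter, mark or digit matched by \w (underscore excluded),
# used here only as a stand-in for the camel_tools charset.
_FULL_WORD = r"[^\W_]+"

# Each pattern matches a run of word characters, or else a single
# non-whitespace character (punctuation, symbols, out-of-charset letters).
_PATTERNS = {
    "compact": re.compile(_COMPACT_WORD + r"|\S"),
    "full": re.compile(_FULL_WORD + r"|\S"),
}


def tokenize_sketch(text, mode="compact"):
    """Split text into word and symbol tokens using the chosen charset."""
    if mode not in _PATTERNS:
        raise ValueError(f"unknown mode: {mode!r}")
    return _PATTERNS[mode].findall(text)


if __name__ == "__main__":
    print(tokenize_sketch("hello, world"))
    print(tokenize_sketch("caf\u00e9", mode="compact"))
    print(tokenize_sketch("caf\u00e9", mode="full"))
```

In this sketch, a letter outside the compact charset (such as the accented letter in the last two calls) is split off as its own token in compact mode but stays inside the word in full mode.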

@hadikhamoud
Member Author

WIP on adding the two modes to the CLI and options.

@hadikhamoud hadikhamoud merged commit 5e00041 into main Dec 21, 2025
5 checks passed