9 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Mark Backman | 26c937af87 | Update match_endofsentence to use NLTK sentence tokenizer | 2025-07-22 20:19:29 -04:00 |
| Aleix Conchillo Flaqué | 1a3a268c9d | utils(string): add new function parse_start_end_tags() | 2025-03-19 10:57:29 -07:00 |
| Aleix Conchillo Flaqué | 11984b89b7 | utils(string): add support for floating point numbers | 2025-03-19 10:57:29 -07:00 |
| Aleix Conchillo Flaqué | 1dbad2326a | utils(string): support email addresses in end of sentence matching | 2025-03-19 10:57:27 -07:00 |
| Mark Backman | b5662520aa | Add one additional ellipsis test to test_utils_string | 2025-02-23 11:04:24 -05:00 |
| Aleix Conchillo Flaqué | 12bce2e8c0 | utils: add support for ellipses in match_endofsentence() | 2025-02-21 15:05:50 -08:00 |
| Aleix Conchillo Flaqué | f6912c0f9a | utils: don't consider colon an end of sentence | 2025-02-14 18:47:33 -08:00 |
| vengadanathan srinivasan | 7a0cfc8d3d | Adding hindi danda symbol as end of sentence marker | 2025-01-25 14:55:51 +05:30 |
| Aleix Conchillo Flaqué | a27fe4bde2 | tests: move test_ai_services to test_utils_string | 2025-01-21 10:06:14 -08:00 |