initial utf-8 support

Kyryl Melekhin
2021-07-16 22:05:33 +00:00
parent 9a7b9d1498
commit 45e331dc79
3 changed files with 88 additions and 20 deletions

3
README

@@ -38,11 +38,12 @@ so that the user does not need to waste time taking strlen()
* Support for "match" vs "search" operations, as common in other regex APIs.
* Support for named character classes: \d \D \s \S \w \W.
* Support for repetition operator {n} and {n,m}.
* Support for Unicode (UTF-8).
* Unlike other engines, the output is a byte-level offset (which is more useful).
TODO
====
* Support for Unicode (UTF-8). (trivial to do, because of int type sized code)
* Support for matching flags like case-insensitive, dot matches all,
multiline, etc.
* Support for more assertions like \A, \Z.
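
To illustrate the two README items above (UTF-8 input and byte-level offsets), here is a minimal sketch in C. It is not taken from this commit: utf8_decode and find_codepoint are hypothetical names, and the code assumes NUL-terminated input and does no strict validation (overlong forms and surrogates are not rejected). Each sequence is decoded into an int-sized code point, while positions are still counted in bytes.

```c
/* Hypothetical helper, not from the commit: decode one UTF-8 sequence
 * starting at s into *cp and return how many bytes it occupies (1..4).
 * A malformed or truncated sequence is treated as a single raw byte. */
int utf8_decode(const char *s, int *cp)
{
    const unsigned char *u = (const unsigned char *)s;
    int len, i, c;

    if (u[0] < 0x80)                { len = 1; c = u[0]; }
    else if ((u[0] & 0xE0) == 0xC0) { len = 2; c = u[0] & 0x1F; }
    else if ((u[0] & 0xF0) == 0xE0) { len = 3; c = u[0] & 0x0F; }
    else if ((u[0] & 0xF8) == 0xF0) { len = 4; c = u[0] & 0x07; }
    else { *cp = u[0]; return 1; }  /* stray continuation or invalid lead byte */

    /* Continuation bytes are checked one at a time, so the loop stops at
     * the terminating NUL instead of reading past the end of the string. */
    for (i = 1; i < len; i++) {
        if ((u[i] & 0xC0) != 0x80) { *cp = u[0]; return 1; }
        c = (c << 6) | (u[i] & 0x3F);
    }
    *cp = c;
    return len;
}

/* Hypothetical helper: return the BYTE offset of the first occurrence of
 * code point `want` in s, or -1 if it does not occur. */
int find_codepoint(const char *s, int want)
{
    int off = 0, cp;

    while (s[off]) {
        int n = utf8_decode(s + off, &cp);
        if (cp == want)
            return off;
        off += n;  /* advance by the encoded length, not by one character */
    }
    return -1;
}
```

For the input "h\xC3\xA9llo" (the word "héllo"), find_codepoint(s, 'l') returns 3, because "é" occupies two bytes; an API that counted characters would report 2. A byte offset can be used directly to index or slice the original buffer.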