mirror of
https://github.com/nodejs/node.git
synced 2025-08-15 21:58:48 +02:00
![]() PR-URL: https://github.com/nodejs/node/pull/58070 Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com> Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com> Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> |
||
---|---|---|
.. | ||
generalized-utf8-decoder.h | ||
LICENSE | ||
OWNERS | ||
README.v8 | ||
utf8-decoder.h |
Name: DFA UTF-8 Decoder Short Name: utf8-decoder URL: http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ Version: N/A Date: 2010-06-25 License: MIT License File: LICENSE Security Critical: yes Shipped: yes Description: Decodes UTF-8 bytes using a fast and simple definite finite automata. Local modifications: - Rejection state has been mapped to row 0 (instead of row 1) of the DFA, saving some 50 bytes and making the table easier to reason about. - The transitions have been remapped to represent both a state transition and a bit mask for the incoming byte. - The caller must now zero out the code point buffer after successful or unsuccessful state transitions. - Specifically for generalized-utf8-decoder.h: we adapt the original decoder to decode and validate "generalized UTF-8", a variant of UTF-8 used in WTF-8 that can encode surrogates. See https://simonsapin.github.io/wtf-8/#generalized-utf8. There is one fewer state and so the transition table is smaller by one in both dimensions.