encoding
Character Encoding Utilities
Functions
| Function | Description |
|---|---|
| enc_utf8_valid | Validate UTF-8 byte sequence. Returns 1 if valid, 0 if invalid. |
| enc_utf8_char_count | Count UTF-8 codepoints in buffer. |
| enc_utf8_byte_len | Byte length of first nchars codepoints. |
| enc_is_ascii | Check if all bytes are ASCII (< 128). |
| enc_to_upper | Convert ASCII lowercase to uppercase. Returns length. |
| enc_to_lower | Convert ASCII uppercase to lowercase. Returns length. |
| enc_latin1_to_utf8 | Convert Latin-1 (ISO 8859-1) to UTF-8. Returns output length. |
| enc_utf8_codepoint_at | Read UTF-8 codepoint at byte offset. Returns codepoint value. |
| enc_utf8_next | Get next byte offset after codepoint at given offset. |
Details
enc_utf8_valid
`fn enc_utf8_valid(buf: &i8, len: i64) -> i64`

Validate UTF-8 byte sequence. Returns 1 if valid, 0 if invalid.
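For readers who want to see what the 1/0 result means in practice, here is a minimal Rust sketch. It is not the library's implementation: `utf8_valid_sketch` is a hypothetical name, it takes a byte slice instead of a pointer plus length, and it delegates to the standard library's strict validator. Whether `enc_utf8_valid` rejects overlong forms and surrogates the same way is an assumption.

```rust
/// Hypothetical stand-in for `enc_utf8_valid`, written against a byte slice.
/// Returns 1 for well-formed UTF-8 and 0 otherwise.
fn utf8_valid_sketch(buf: &[u8]) -> i64 {
    // std::str::from_utf8 rejects truncated sequences, overlong encodings,
    // and encoded surrogates. The real function may be more lenient.
    if std::str::from_utf8(buf).is_ok() {
        1
    } else {
        0
    }
}
```

Under these assumptions, a lone lead byte such as `0xC3` is invalid because its continuation byte is missing, while `0xC3 0xA9` (é) is valid.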
enc_utf8_char_count
`fn enc_utf8_char_count(buf: &i8, len: i64) -> i64`

Count UTF-8 codepoints in buffer.
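One common way to count codepoints is to count every byte that is not a continuation byte (`10xxxxxx`). The Rust sketch below illustrates that technique. `utf8_char_count_sketch` is a hypothetical name, the slice parameter replaces the pointer and length pair, and the input is assumed to be valid UTF-8. How `enc_utf8_char_count` treats malformed input is not documented here.

```rust
/// Hypothetical stand-in for `enc_utf8_char_count`.
/// Assumes `buf` holds valid UTF-8.
fn utf8_char_count_sketch(buf: &[u8]) -> i64 {
    // Each codepoint has exactly one non-continuation byte (its first byte),
    // so counting bytes whose top two bits are not `10` counts codepoints.
    buf.iter().filter(|&&b| (b & 0xC0) != 0x80).count() as i64
}
```

The count is in codepoints, not bytes: the six bytes of "héllo" yield 5.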
enc_utf8_byte_len
`fn enc_utf8_byte_len(buf: &i8, len: i64, nchars: i64) -> i64`

Byte length of first nchars codepoints.
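The relationship between `nchars` and the returned byte length can be sketched as a scan for the first byte of codepoint number `nchars + 1`. This is an illustration only: `utf8_byte_len_sketch` is a hypothetical name, the input is assumed to be valid UTF-8, and returning the full buffer length when the buffer holds fewer than `nchars` codepoints is an assumption about the edge case.

```rust
/// Hypothetical stand-in for `enc_utf8_byte_len`.
/// Assumes `buf` holds valid UTF-8.
fn utf8_byte_len_sketch(buf: &[u8], nchars: i64) -> i64 {
    if nchars <= 0 {
        return 0;
    }
    let mut seen: i64 = 0;
    for (i, &b) in buf.iter().enumerate() {
        // A non-continuation byte starts a new codepoint.
        if (b & 0xC0) != 0x80 {
            if seen == nchars {
                // `i` is where codepoint number nchars + 1 begins,
                // which is also the byte length of the first nchars.
                return i as i64;
            }
            seen += 1;
        }
    }
    // Assumption: fewer than nchars codepoints means "the whole buffer".
    buf.len() as i64
}
```

A typical use is truncating a string to a display width in codepoints without cutting a multi-byte sequence in half.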
enc_is_ascii
`fn enc_is_ascii(buf: &i8, len: i64) -> i64`

Check if all bytes are ASCII (< 128).
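Because the signature takes signed bytes, the "< 128" test is worth spelling out: a byte of 0x80 or above reinterpreted as `i8` is negative, so on signed data the equivalent check is "non-negative". The Rust sketch below shows that reading. `is_ascii_sketch` is a hypothetical name, and both the 1/0 return convention and the result for an empty buffer are assumptions.

```rust
/// Hypothetical stand-in for `enc_is_ascii`, using signed bytes as in the
/// documented signature. Returns 1 if every byte is ASCII, 0 otherwise.
fn is_ascii_sketch(buf: &[i8]) -> i64 {
    // Bytes 0x80..=0xFF map to -128..=-1 as i8, so ASCII means b >= 0.
    if buf.iter().all(|&b| b >= 0) {
        1
    } else {
        0
    }
}
```

In this sketch an empty buffer counts as ASCII, since no byte violates the condition.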
enc_to_upper
`fn enc_to_upper(buf: &i8, len: i64, out: &i8) -> i64`

Convert ASCII lowercase to uppercase. Returns length.
enc_to_lower
`fn enc_to_lower(buf: &i8, len: i64, out: &i8) -> i64`

Convert ASCII uppercase to lowercase. Returns length.
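The two case conversions share a shape: copy each input byte to `out`, changing only ASCII letters. A minimal Rust sketch follows, with hypothetical names `to_upper_sketch` and `to_lower_sketch` and mutable slices in place of raw buffers. Treating the returned length as the number of bytes written, and stopping at the shorter of the two slices, are assumptions.

```rust
/// Hypothetical stand-in for `enc_to_upper`.
fn to_upper_sketch(buf: &[u8], out: &mut [u8]) -> i64 {
    // zip stops at the shorter slice, so `out` is never overrun.
    for (dst, &b) in out.iter_mut().zip(buf.iter()) {
        // Only b'a'..=b'z' change; every other byte is copied as-is.
        *dst = b.to_ascii_uppercase();
    }
    buf.len().min(out.len()) as i64
}

/// Hypothetical stand-in for `enc_to_lower`.
fn to_lower_sketch(buf: &[u8], out: &mut [u8]) -> i64 {
    for (dst, &b) in out.iter_mut().zip(buf.iter()) {
        // Only b'A'..=b'Z' change; every other byte is copied as-is.
        *dst = b.to_ascii_lowercase();
    }
    buf.len().min(out.len()) as i64
}
```

Since bytes at or above 0x80 pass through untouched, an ASCII-only conversion like this leaves multi-byte UTF-8 sequences intact; accented letters such as É are simply not converted.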
enc_latin1_to_utf8
`fn enc_latin1_to_utf8(src: &i8, slen: i64, out: &i8) -> i64`

Convert Latin-1 (ISO 8859-1) to UTF-8. Returns output length.
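Latin-1 maps byte values directly to codepoints U+0000 through U+00FF, so the conversion is mechanical: bytes below 0x80 are copied, and each byte at or above 0x80 becomes a two-byte UTF-8 sequence. The Rust sketch below illustrates this. `latin1_to_utf8_sketch` is a hypothetical name, and it returns a `Vec<u8>` rather than writing into a caller-supplied `out` buffer.

```rust
/// Hypothetical stand-in for `enc_latin1_to_utf8`; the output length is
/// the length of the returned vector.
fn latin1_to_utf8_sketch(src: &[u8]) -> Vec<u8> {
    // Worst case: every input byte expands to two output bytes.
    let mut out = Vec::with_capacity(src.len() * 2);
    for &b in src {
        if b < 0x80 {
            out.push(b);
        } else {
            // Codepoints U+0080..=U+00FF encode as 110000xx 10xxxxxx.
            out.push(0xC0 | (b >> 6));
            out.push(0x80 | (b & 0x3F));
        }
    }
    out
}
```

The output can be up to twice as long as the input, so a caller of a buffer-based version would need `out` to hold at least `2 * slen` bytes. Whether `enc_latin1_to_utf8` checks that capacity is not stated here.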
enc_utf8_codepoint_at
`fn enc_utf8_codepoint_at(buf: &i8, off: i64) -> i64`

Read UTF-8 codepoint at byte offset. Returns codepoint value.
enc_utf8_next
`fn enc_utf8_next(buf: &i8, off: i64, len: i64) -> i64`

Get next byte offset after codepoint at given offset.
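`enc_utf8_codepoint_at` and `enc_utf8_next` pair naturally into a loop: decode at the current offset, then advance. The Rust sketch below shows one way such a pair could work. All names in it (`seq_len`, `codepoint_at_sketch`, `next_sketch`, `decode_all_sketch`) are hypothetical, the input is assumed to be valid and complete UTF-8, and stepping one byte past an unexpected lead byte is an assumption, not documented behavior.

```rust
/// Sequence length implied by a lead byte.
fn seq_len(lead: u8) -> usize {
    match lead {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF7 => 4,
        _ => 1, // assumption: skip a stray or invalid byte one at a time
    }
}

/// Hypothetical stand-in for `enc_utf8_codepoint_at`.
/// Panics on a truncated sequence; the input is assumed valid.
fn codepoint_at_sketch(buf: &[u8], off: usize) -> i64 {
    let lead = buf[off];
    let n = seq_len(lead);
    // Keep only the payload bits of the lead byte.
    let mut cp: u32 = match n {
        1 => lead as u32,
        2 => (lead & 0x1F) as u32,
        3 => (lead & 0x0F) as u32,
        _ => (lead & 0x07) as u32,
    };
    // Each continuation byte contributes its low six bits.
    for &b in &buf[off + 1..off + n] {
        cp = (cp << 6) | (b & 0x3F) as u32;
    }
    cp as i64
}

/// Hypothetical stand-in for `enc_utf8_next`, clamped to the buffer end.
fn next_sketch(buf: &[u8], off: usize) -> usize {
    (off + seq_len(buf[off])).min(buf.len())
}

/// Walks the buffer codepoint by codepoint using the two helpers above.
fn decode_all_sketch(buf: &[u8]) -> Vec<i64> {
    let mut cps = Vec::new();
    let mut off = 0;
    while off < buf.len() {
        cps.push(codepoint_at_sketch(buf, off));
        off = next_sketch(buf, off);
    }
    cps
}
```

In this sketch, the bytes `68 E2 82 AC` decode to U+0068 followed by U+20AC, with the offset moving from 0 to 1 to 4.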