encoding

Character Encoding Utilities

Functions

Function                 Description
enc_utf8_valid           Validate UTF-8 byte sequence. Returns 1 if valid, 0 if invalid.
enc_utf8_char_count      Count UTF-8 codepoints in buffer.
enc_utf8_byte_len        Byte length of first nchars codepoints.
enc_is_ascii             Check if all bytes are ASCII (< 128).
enc_to_upper             Convert ASCII lowercase to uppercase. Returns length.
enc_to_lower             Convert ASCII uppercase to lowercase. Returns length.
enc_latin1_to_utf8       Convert Latin-1 (ISO 8859-1) to UTF-8. Returns output length.
enc_utf8_codepoint_at    Read UTF-8 codepoint at byte offset. Returns codepoint value.
enc_utf8_next            Get next byte offset after codepoint at given offset.

Details

enc_utf8_valid

fn enc_utf8_valid(buf: &i8, len: i64) -> i64

Validate UTF-8 byte sequence. Returns 1 if valid, 0 if invalid.
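
The check described above can be modelled in a few lines of Python. This is only a sketch: the name utf8_valid_sketch is hypothetical, and it assumes strict RFC 3629 rules (overlong forms, surrogates, and values above U+10FFFF are rejected). The page does not say how strict enc_utf8_valid itself is.

```python
def utf8_valid_sketch(buf: bytes) -> int:
    """Return 1 if buf is well-formed UTF-8 under RFC 3629, else 0."""
    i, n = 0, len(buf)
    while i < n:
        b = buf[i]
        if b < 0x80:                      # single-byte (ASCII) codepoint
            i += 1
            continue
        # Pick the number of continuation bytes and the allowed range of
        # the first continuation byte, which depends on the lead byte.
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif 0xE0 <= b <= 0xEF:
            need = 2
            lo = 0xA0 if b == 0xE0 else 0x80   # reject overlong 3-byte forms
            hi = 0x9F if b == 0xED else 0xBF   # reject UTF-16 surrogates
        elif 0xF0 <= b <= 0xF4:
            need = 3
            lo = 0x90 if b == 0xF0 else 0x80   # reject overlong 4-byte forms
            hi = 0x8F if b == 0xF4 else 0xBF   # reject values above U+10FFFF
        else:
            return 0                      # stray continuation or invalid lead
        if i + need >= n:
            return 0                      # sequence truncated by end of buffer
        if not lo <= buf[i + 1] <= hi:
            return 0
        for k in range(2, need + 1):
            if not 0x80 <= buf[i + k] <= 0xBF:
                return 0
        i += need + 1
    return 1
```

For example, utf8_valid_sketch("héllo".encode("utf-8")) gives 1, while the overlong sequence b"\xc0\x80" gives 0.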

enc_utf8_char_count

fn enc_utf8_char_count(buf: &i8, len: i64) -> i64

Count UTF-8 codepoints in buffer.
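
One common way to count codepoints is to count every byte that is not a continuation byte (bit pattern 10xxxxxx). The Python sketch below illustrates that idea. The name utf8_char_count_sketch is hypothetical, and the sketch assumes the input is already valid UTF-8; how enc_utf8_char_count treats invalid input is not documented here.

```python
def utf8_char_count_sketch(buf: bytes) -> int:
    """Count codepoints by counting bytes that start a sequence."""
    # Continuation bytes look like 10xxxxxx; every other byte starts a codepoint.
    return sum(1 for b in buf if (b & 0xC0) != 0x80)
```

With "héllo €" encoded as UTF-8 (10 bytes), the sketch returns 7.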

enc_utf8_byte_len

fn enc_utf8_byte_len(buf: &i8, len: i64, nchars: i64) -> i64

Returns the byte length of the first nchars codepoints in the buffer.
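
A possible reading of this function, sketched in Python: walk the buffer one codepoint at a time and report how many bytes the first nchars codepoints occupy. The name utf8_byte_len_sketch is hypothetical. The sketch assumes valid UTF-8 and assumes the result is clamped to the buffer length when nchars exceeds the number of codepoints; the library's behavior in that case is not stated.

```python
def utf8_byte_len_sketch(buf: bytes, nchars: int) -> int:
    """Bytes occupied by the first nchars codepoints of buf (valid UTF-8 assumed)."""
    off = 0
    seen = 0
    while off < len(buf) and seen < nchars:
        off += 1                                   # consume the lead byte
        while off < len(buf) and (buf[off] & 0xC0) == 0x80:
            off += 1                               # consume continuation bytes
        seen += 1
    return off                                     # clamped to len(buf) (assumption)
```

This is the kind of value needed to truncate a string to a given number of characters without splitting a multi-byte sequence.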

enc_is_ascii

fn enc_is_ascii(buf: &i8, len: i64) -> i64

Check if all bytes are ASCII (< 128).
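
Because the buffer is declared as &i8, the "< 128" test is presumably applied to the unsigned value of each byte, which is the same as checking that the high bit is clear. A minimal Python sketch of that check follows. The name is_ascii_sketch is hypothetical, and the 1/0 return convention is an assumption borrowed from enc_utf8_valid.

```python
def is_ascii_sketch(buf: bytes) -> int:
    """Return 1 if every byte has its high bit clear (value < 128), else 0."""
    return 1 if all((b & 0x80) == 0 for b in buf) else 0
```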

enc_to_upper

fn enc_to_upper(buf: &i8, len: i64, out: &i8) -> i64

Convert ASCII lowercase to uppercase. Returns length.

enc_to_lower

fn enc_to_lower(buf: &i8, len: i64, out: &i8) -> i64

Convert ASCII uppercase to lowercase. Returns length.
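
The two case-conversion functions can be modelled as below. This is a Python sketch with hypothetical names. It assumes that bytes outside a-z or A-Z (including UTF-8 multi-byte sequences) are copied unchanged, that out has room for at least len bytes, and that the returned length is the number of bytes written; none of these details are confirmed on this page.

```python
def to_upper_sketch(buf: bytes, out: bytearray) -> int:
    """Copy buf into out, mapping ASCII a-z to A-Z. Returns bytes written."""
    for i, b in enumerate(buf):
        out[i] = b - 32 if 0x61 <= b <= 0x7A else b
    return len(buf)


def to_lower_sketch(buf: bytes, out: bytearray) -> int:
    """Copy buf into out, mapping ASCII A-Z to a-z. Returns bytes written."""
    for i, b in enumerate(buf):
        out[i] = b + 32 if 0x41 <= b <= 0x5A else b
    return len(buf)
```

Under these assumptions, UTF-8 text stays valid after conversion, but accented letters such as é are not case-mapped.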

enc_latin1_to_utf8

fn enc_latin1_to_utf8(src: &i8, slen: i64, out: &i8) -> i64

Convert Latin-1 (ISO 8859-1) to UTF-8. Returns output length.
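
Latin-1 byte values map directly to Unicode codepoints U+0000 to U+00FF, so each input byte becomes either one or two UTF-8 bytes. The Python sketch below shows the conversion. The name latin1_to_utf8_sketch is hypothetical, and sizing out at 2 * slen bytes is a worst-case assumption, since the page does not state the required output capacity.

```python
def latin1_to_utf8_sketch(src: bytes, out: bytearray) -> int:
    """Write the UTF-8 encoding of Latin-1 src into out. Returns bytes written."""
    j = 0
    for b in src:
        if b < 0x80:
            out[j] = b                       # ASCII: copied as a single byte
            j += 1
        else:
            out[j] = 0xC0 | (b >> 6)         # lead byte: 110000xx
            out[j + 1] = 0x80 | (b & 0x3F)   # continuation byte: 10xxxxxx
            j += 2
    return j


src = "café".encode("latin-1")               # 4 bytes in Latin-1
out = bytearray(2 * len(src))                # worst case: every byte doubles
n = latin1_to_utf8_sketch(src, out)          # n == 5, because é takes 2 bytes
```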

enc_utf8_codepoint_at

fn enc_utf8_codepoint_at(buf: &i8, off: i64) -> i64

Read UTF-8 codepoint at byte offset. Returns codepoint value.
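
Decoding a codepoint means reading the lead byte at off, determining the sequence length from its high bits, and combining the payload bits of the continuation bytes. The Python sketch below uses a hypothetical name and assumes that off points at the first byte of a valid sequence. The signature takes no length argument, so the real function presumably trusts the caller on this point as well.

```python
def utf8_codepoint_at_sketch(buf: bytes, off: int) -> int:
    """Decode the codepoint whose first byte is at buf[off] (valid input assumed)."""
    b0 = buf[off]
    if b0 < 0x80:                                  # 0xxxxxxx: 1 byte
        return b0
    if (b0 & 0xE0) == 0xC0:                        # 110xxxxx: 2 bytes
        return ((b0 & 0x1F) << 6) | (buf[off + 1] & 0x3F)
    if (b0 & 0xF0) == 0xE0:                        # 1110xxxx: 3 bytes
        return (((b0 & 0x0F) << 12)
                | ((buf[off + 1] & 0x3F) << 6)
                | (buf[off + 2] & 0x3F))
    # 11110xxx: 4 bytes
    return (((b0 & 0x07) << 18)
            | ((buf[off + 1] & 0x3F) << 12)
            | ((buf[off + 2] & 0x3F) << 6)
            | (buf[off + 3] & 0x3F))
```

For the UTF-8 encoding of "a€", offset 0 yields 0x61 and offset 1 yields 0x20AC.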

enc_utf8_next

fn enc_utf8_next(buf: &i8, off: i64, len: i64) -> i64

Returns the byte offset immediately after the codepoint at the given offset.
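
Stepping to the next offset only requires the lead byte, whose high bits give the sequence length. The Python sketch below models that and uses it to list the starting offset of every codepoint in a buffer. Both function names are hypothetical. Clamping the result to len and skipping a single byte on an invalid lead byte are assumptions, not documented behavior.

```python
def utf8_next_sketch(buf: bytes, off: int, length: int) -> int:
    """Offset just past the codepoint starting at off, clamped to length."""
    if off >= length:
        return length
    b0 = buf[off]
    if b0 < 0x80:
        step = 1
    elif (b0 & 0xE0) == 0xC0:
        step = 2
    elif (b0 & 0xF0) == 0xE0:
        step = 3
    elif (b0 & 0xF8) == 0xF0:
        step = 4
    else:
        step = 1          # stray continuation or invalid lead byte (assumption)
    return min(off + step, length)


def codepoint_offsets_sketch(buf: bytes) -> list:
    """Collect the starting byte offset of each codepoint in buf."""
    offsets = []
    off = 0
    while off < len(buf):
        offsets.append(off)
        off = utf8_next_sketch(buf, off, len(buf))
    return offsets
```

A loop of this shape, paired with a codepoint reader such as enc_utf8_codepoint_at, is the usual way to iterate over a UTF-8 buffer one character at a time.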