htmlparser

Simple HTML Tag Parser (SAX-style events)

Functions

Function            Description
_html_byte_at       Internal byte reader.
_html_copy          Internal: copy n bytes from src+soff to dst+doff.
_html_add_evt       Internal: add an event to the handle.
_html_find_gt       Internal: find ‘>’ starting from pos. Returns index of ‘>’ or len if not found.
_html_find_lt       Internal: find ‘<’ starting from pos. Returns index of ‘<’ or len if not found.
_html_tag_end       Internal: find end of tag name (space or end). Returns index.
html_parse          Parse HTML buffer into events.
html_event_count    Return event count.
html_event_type     Return event type at index (1=open, 2=close, 3=text).
html_event_tag      Copy tag name at event idx to out. Return length.
html_event_text     Copy text content at event idx to out. Return length.
html_strip_tags     Strip all HTML tags, return plain text length.
_html_poke_amp      Internal: write the literal byte sequence of an HTML escape into the output buffer.
_html_poke_lt
_html_poke_gt
_html_poke_quot
html_escape         Escape HTML special characters: & < > "

Details

_html_byte_at

fn _html_byte_at(buf: &i8, idx: i64) -> i64

Internal byte reader.

_html_copy

fn _html_copy(dst: &i8, doff: i64, src: &i8, soff: i64, n: i64) -> i64

Internal: copy n bytes from src+soff to dst+doff.
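
The offset-based copy can be illustrated with a short Python model. This is only a sketch of the described behavior, not the library's source: the name copy_bytes is hypothetical, and returning n is an assumption, since the meaning of the i64 return value is not documented here.

```python
def copy_bytes(dst: bytearray, doff: int, src: bytes, soff: int, n: int) -> int:
    """Copy n bytes from src starting at soff into dst starting at doff."""
    for i in range(n):
        dst[doff + i] = src[soff + i]
    # Assumption: the real return value is undocumented; this model returns n.
    return n


out = bytearray(8)
copied = copy_bytes(out, 2, b"<title>", 1, 5)  # copies b"title" to out[2:7]
print(copied, bytes(out))
```

The caller is responsible for making dst large enough to hold doff + n bytes; the model performs no bounds checks beyond Python's own.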

_html_add_evt

fn _html_add_evt(h: &i64, typ: i64, off: i64, ln: i64) -> i64

Internal: add an event to the handle.

_html_find_gt

fn _html_find_gt(data: &i8, pos: i64, len: i64) -> i64

Internal: find ‘>’ starting from pos. Returns index of ‘>’ or len if not found.

_html_find_lt

fn _html_find_lt(data: &i8, pos: i64, len: i64) -> i64

Internal: find ‘<’ starting from pos. Returns index of ‘<’ or len if not found.
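
Both scanners (_html_find_gt and _html_find_lt) share one contract: return the index of the first match at or after pos, or len when there is none. A minimal Python model of that contract follows; the helper names are hypothetical and the loop is an assumed implementation, not the library's code.

```python
def find_byte(data: bytes, pos: int, length: int, target: int) -> int:
    """Return the index of target in data[pos:length], or length if absent."""
    i = pos
    while i < length:
        if data[i] == target:
            return i
        i += 1
    return length  # sentinel: "not found" is reported as the buffer length


def find_gt(data: bytes, pos: int, length: int) -> int:
    return find_byte(data, pos, length, ord(">"))


def find_lt(data: bytes, pos: int, length: int) -> int:
    return find_byte(data, pos, length, ord("<"))


doc = b"<p>hi</p>"
first_gt = find_gt(doc, 0, len(doc))      # end of the opening tag
next_lt = find_lt(doc, 1, len(doc))       # start of the closing tag
missing = find_lt(b"plain", 0, 5)         # no '<' present, so the length comes back
print(first_gt, next_lt, missing)
```

Returning len rather than a negative value lets a caller use the result directly as the end of a text run without a separate "not found" branch.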

_html_tag_end

fn _html_tag_end(data: &i8, start: i64, limit: i64) -> i64

Internal: find end of tag name (space or end). Returns index.
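
A rough Python model of the tag-name scan is shown below. It assumes that start points at the first byte of the name and that limit is the index of the closing ‘>’; neither is stated in the description above, and the name tag_end is hypothetical.

```python
def tag_end(data: bytes, start: int, limit: int) -> int:
    """Return the index just past the tag name: the first space, or limit."""
    i = start
    while i < limit and data[i] != ord(" "):
        i += 1
    return i


with_attr = b'<a href="x">'
end_a = tag_end(with_attr, 1, with_attr.index(b">"))  # stops at the space

bare = b"<br>"
end_br = tag_end(bare, 1, bare.index(b">"))           # stops at the limit

print(with_attr[1:end_a], bare[1:end_br])
```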

html_parse

fn html_parse(buf: &i8, len: i64) -> &i64

Parse HTML buffer into events.
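
The following Python sketch shows one way a SAX-style pass could turn a buffer into (type, offset, length) events, mirroring the typ, off, and ln parameters of _html_add_evt and the type codes 1=open, 2=close, 3=text. It is an assumed model, not the library's algorithm: the function name parse_events is hypothetical, and the handling of attributes, comments, self-closing tags, and unterminated tags in the real parser is not documented here.

```python
OPEN, CLOSE, TEXT = 1, 2, 3


def parse_events(buf: bytes) -> list[tuple[int, int, int]]:
    """Return (type, offset, length) events for tags and text runs in buf."""
    events = []
    n = len(buf)
    pos = 0
    while pos < n:
        lt = buf.find(b"<", pos)
        if lt == -1:
            lt = n
        if lt > pos:
            events.append((TEXT, pos, lt - pos))   # text before the next tag
        if lt >= n:
            break
        gt = buf.find(b">", lt)
        if gt == -1:
            break  # assumption: this sketch drops an unterminated tag
        if buf[lt + 1 : lt + 2] == b"/":
            typ, start = CLOSE, lt + 2
        else:
            typ, start = OPEN, lt + 1
        end = start
        while end < gt and buf[end] != ord(" "):   # name ends at space or '>'
            end += 1
        events.append((typ, start, end - start))
        pos = gt + 1
    return events


doc = b'<p class="x">hi</p>'
events = parse_events(doc)
for typ, off, ln in events:
    print(typ, doc[off : off + ln])
```

Storing offsets and lengths instead of copied strings keeps the event list small and leaves copying to the accessor functions.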

html_event_count

fn html_event_count(h: &i64) -> i64

Return event count.

html_event_type

fn html_event_type(h: &i64, idx: i64) -> i64

Return event type at index (1=open, 2=close, 3=text).

html_event_tag

fn html_event_tag(h: &i64, idx: i64, out: &i8) -> i64

Copy tag name at event idx to out. Return length.

html_event_text

fn html_event_text(h: &i64, idx: i64, out: &i8) -> i64

Copy text content at event idx to out. Return length.
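
html_event_tag and html_event_text both follow a "copy into a caller-supplied buffer, return the length" convention. The Python sketch below models that convention under the assumption that each event records an offset and length into the original buffer, as the _html_add_evt signature suggests. The event list here is built by hand, and event_bytes is a hypothetical name.

```python
def event_bytes(buf: bytes, events: list, idx: int, out: bytearray) -> int:
    """Copy the bytes referenced by events[idx] into out and return the count."""
    _typ, off, ln = events[idx]
    out[0:ln] = buf[off : off + ln]  # caller must size out to at least ln bytes
    return ln


buf = b"<b>bold</b>"
# Hand-built (type, offset, length) events: open "b", text "bold", close "b".
events = [(1, 1, 1), (3, 3, 4), (2, 9, 1)]

out = bytearray(16)
n = event_bytes(buf, events, 1, out)
print(n, bytes(out[:n]))
```

Because only the length is returned, the caller should read out[:n] rather than relying on a terminator byte; whether the library writes one is not stated.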

html_strip_tags

fn html_strip_tags(buf: &i8, len: i64, out: &i8) -> i64

Strip all HTML tags, return plain text length.
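
Tag stripping can be modeled as a two-state scan that copies bytes only while outside a tag. The Python sketch below is an illustration under that assumption; strip_tags is a hypothetical name, and the real function may treat edge cases such as a stray ‘>’ or an unterminated tag differently.

```python
def strip_tags(buf: bytes, out: bytearray) -> int:
    """Copy the bytes of buf that lie outside <...> into out; return the count."""
    oi = 0
    in_tag = False
    for b in buf:
        if b == ord("<"):
            in_tag = True
        elif b == ord(">") and in_tag:
            in_tag = False
        elif not in_tag:
            out[oi] = b
            oi += 1
    return oi


src = b"<p>Hello <b>world</b></p>"
out = bytearray(len(src))  # stripped text is never longer than the input
n = strip_tags(src, out)
print(n, bytes(out[:n]))
```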

_html_poke_amp

fn _html_poke_amp(out: &i8, oi: i64) -> i64

Internal: write the literal byte sequence of an HTML escape into out at offset oi.

_html_poke_lt

fn _html_poke_lt(out: &i8, oi: i64) -> i64

_html_poke_gt

fn _html_poke_gt(out: &i8, oi: i64) -> i64

_html_poke_quot

fn _html_poke_quot(out: &i8, oi: i64) -> i64

html_escape

fn html_escape(buf: &i8, len: i64, out: &i8) -> i64

Escape HTML special characters: & < > "
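
A minimal Python model of the escaping pass is sketched below. It assumes the standard entities &amp;amp;, &amp;lt;, &amp;gt;, and &amp;quot; (suggested by the _html_poke_* helper names but not spelled out in this page), and the name escape is hypothetical.

```python
# Assumed mapping from special byte to its standard HTML entity.
ENTITIES = {
    ord("&"): b"&amp;",
    ord("<"): b"&lt;",
    ord(">"): b"&gt;",
    ord('"'): b"&quot;",
}


def escape(buf: bytes, out: bytearray) -> int:
    """Write buf into out with & < > " replaced by entities; return the length."""
    oi = 0
    for b in buf:
        rep = ENTITIES.get(b)
        if rep is None:
            out[oi] = b
            oi += 1
        else:
            out[oi : oi + len(rep)] = rep
            oi += len(rep)
    return oi


src = b'a<b & "c">'
out = bytearray(6 * len(src))  # worst case: every byte becomes the 6-byte &quot;
n = escape(src, out)
print(bytes(out[:n]))
```

Sizing the output at six times the input length covers the longest entity in this mapping, so the write can never run past the end of out.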