- Tokenisation (lexical analysis): the analysis of a stream of characters into minimal interpretable character sequences (symbols, tokens)
- File: stream of characters
- Text: stream of tokens
- Token: either
  - a character sequence enclosed in separators, or
  - a tag
- Separator:
  - White space: SP (space), NL (newline: carriage return and/or linefeed)
  - Special characters (e.g. <, >, =)
  - BOF (beginning of file)
  - EOF (end of file)
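
The definitions above can be illustrated with a minimal sketch in Python. It is not a full HTML tokeniser, and it rests on several assumptions that are not in the notes: the special characters are exactly the three examples <, > and =; a tag is anything from < up to the next >; separators are discarded rather than returned; and BOF and EOF are simply the start and end of the input string. The names TOKEN_RE and tokenise are hypothetical.

```python
import re

# Assumed token pattern (illustrative only):
#   1. a tag: '<', then any characters other than '<' or '>', then '>'
#   2. otherwise a maximal run of characters that are neither white space
#      nor one of the assumed special characters < > =
TOKEN_RE = re.compile(r"<[^<>]*>|[^\s<>=]+")


def tokenise(chars):
    """Turn a stream of characters (a file) into a list of tokens (a text).

    Separators (white space, stray special characters) are skipped;
    the beginning and end of the string play the role of BOF and EOF.
    """
    return TOKEN_RE.findall(chars)


print(tokenise("<a href=x.html>Next</a>"))
# prints ['<a href=x.html>', 'Next', '</a>']
```

Under these assumptions a tag is kept whole, including any white space or = inside it, while the same characters outside a tag only mark token boundaries.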
Dafydd Gibbon, Sat Oct 17 18:58:17 CEST 1998