Module optlex

This module does lexer-based optimizations.

Notes:

  • TODO: General string delimiter conversion optimizer.
  • TODO: (numbers) warn if a number has excess significant digits.

Functions

optimize (option, toklist, semlist, toklnlist) The main entry point.

Local Functions

atlinestart (i) Returns true if the current token is at the start of a line.
atlineend (i) Returns true if the current token is at the end of a line.
commenteols (lcomment) Counts comment EOLs inside a long comment.
checkpair (i, j) Compares two tokens (i, j) and returns the whitespace required between them.
repack_tokens () Repacks tokens, removing deletions caused by the optimization process.
do_number (i) Does number optimization.
do_string (I) Does string optimization.
do_lstring (I) Does long string optimization.
do_lcomment (I) Does long comment optimization.
do_comment (i) Does short comment optimization.
keep_lcomment (opt_keep, info) Returns true if a match string is found in the long comment.


Functions

optimize (option, toklist, semlist, toklnlist)

The main entry point.

  • currently, lexer processing is done in two passes
  • processing is done on a line-oriented basis, which keeps it easier to follow given the next point...
  • since various options can be enabled or disabled independently, the processing is a little messy or convoluted

Parameters:

  • option table the set of optimization options
  • toklist {string,...} the list of tokens
  • semlist {string,...} the list of token semantic information
  • toklnlist {int,...} the list of token line numbers

Returns:

  1. {string,...} toklist
  2. {string,...} semlist
  3. {int,...} toklnlist
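As a usage sketch (the module paths, llex field names, and option keys below are assumptions based on the usual LuaSrcDiet layout, not guaranteed by this page):

```lua
-- NOTE: module names, llex fields and option keys are assumed.
local llex   = require("luasrcdiet.llex")    -- companion lexer (assumed path)
local optlex = require("luasrcdiet.optlex")  -- this module (assumed path)

local src = "local x = 1.000  -- a comment   "
llex.init(src, "@example.lua")
llex.llex()

-- three parallel lists: token types, token text, token line numbers
local toklist, semlist, toklnlist = llex.tok, llex.seminfo, llex.tokln

local option = { ["opt-whitespace"] = true, ["opt-numbers"] = true }
toklist, semlist, toklnlist =
  optlex.optimize(option, toklist, semlist, toklnlist)
-- the three lists come back repacked, with deleted tokens removed
```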

Local Functions

atlinestart (i)
Returns true if the current token is at the start of a line.

It skips over deleted tokens via recursion.

Parameters:

  • i int

Returns:

    bool

atlineend (i)
Returns true if the current token is at the end of a line.

It skips over deleted tokens via recursion.

Parameters:

  • i int

Returns:

    bool
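Both predicates can be sketched as follows (a self-contained illustration: the token list shape and the "TK_DEL" deletion marker are assumptions made for this example; in the real module the functions close over the module-level token lists):

```lua
-- Parallel array of token types; a deleted token is marked "TK_DEL"
-- (assumed marker) and must be skipped over.
local toklist = { "TK_EOL", "TK_DEL", "TK_NAME", "TK_EOL" }

-- True if token i is at the start of a line: the previous
-- non-deleted token is an EOL (or there is no previous token).
local function atlinestart(i)
  local tok = toklist[i - 1]
  if i <= 1 or tok == "TK_EOL" then
    return true
  elseif tok == "TK_DEL" then
    return atlinestart(i - 1)   -- skip deleted token via recursion
  end
  return false
end

-- Mirror image: the next non-deleted token is an EOL (or end of list).
local function atlineend(i)
  local tok = toklist[i + 1]
  if i >= #toklist or tok == "TK_EOL" then
    return true
  elseif tok == "TK_DEL" then
    return atlineend(i + 1)     -- skip deleted token via recursion
  end
  return false
end

assert(atlinestart(3))  -- the deleted token at 2 is skipped; 1 is an EOL
assert(atlineend(3))    -- token 4 is an EOL
```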
commenteols (lcomment)
Counts comment EOLs inside a long comment.

In order to keep line numbering, EOLs need to be reinserted.

Parameters:

  • lcomment string the long comment text

Returns:

    int
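A minimal sketch of the counting, assuming the lexer has already normalized EOLs to "\n" (that normalization is an assumption of this example):

```lua
-- Count embedded EOLs in a long comment so that the same number of
-- newlines can be reinserted, keeping line numbers stable.
local function commenteols(lcomment)
  local count = 0
  for _ in string.gmatch(lcomment, "\n") do
    count = count + 1
  end
  return count
end

assert(commenteols("--[[one\ntwo\nthree]]") == 2)
```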
checkpair (i, j)
Compares two tokens (i, j) and returns the whitespace required.

See documentation for a reference table of interactions.

Only two grammar/real tokens are being considered:

  • if "", no separation is needed,
  • if " ", then at least one whitespace (or EOL) is required.

Note: This does not apply at the start or end of the token stream, nor to the EOS token!

Parameters:

  • i int
  • j int

Returns:

    string
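The idea can be sketched as below. This simplified version takes the token texts directly rather than the indices i and j, and implements only the alphanumeric-adjacency rule; the real table of interactions also covers cases such as two '-' tokens forming a comment:

```lua
-- Return " " if the two tokens would merge when abutted, "" otherwise.
local function checkpair(left, right)
  -- a name/keyword/number followed by another alphanumeric token
  -- ("local" .. "x" -> "localx") needs at least one space or EOL
  if string.match(left, "[%w_]$") and string.match(right, "^[%w_]") then
    return " "
  end
  return ""
end

assert(checkpair("local", "x") == " ")   -- "localx" would be one name
assert(checkpair("x", "=") == "")        -- "x=" stays two tokens
```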
repack_tokens ()
Repacks tokens, removing deletions caused by the optimization process.

do_number (i)
Does number optimization.

Optimization using string formatting functions is one way of doing this, but here we consider all the cases and handle them separately (possibly a naive approach...).

The scientific notation generated is not in canonical form; this may or may not be a bad thing.

Note: Intermediate portions need to fit into a normal number range.

Optimizations can be divided based on number patterns:

  • hexadecimal:
      1. no need to remove leading zeros, just skip to (2)
      2. convert to integer if equal or smaller in size; if the size is equal, losing the 'x' still reduces entropy
      3. the number is then processed as an integer
      4. note: does not make 0[xX] consistent
  • integer:
      1. reduce useless fractional part, if present, e.g. 123.000 -> 123
      2. remove leading zeros, e.g. 000123
  • float:
      1. split into digits, dot, digits
      2. if there is no integer portion, take it as zero (can omit later)
      3. handle the degenerate .000 case, after which the fractional part must be non-zero (if zero, it is matched as the float .0)
      4. remove trailing zeros from the fractional portion
      5. p.q where p > 0 and q > 0 cannot be shortened any more
      6. otherwise p == 0 and the form is .q, e.g. .000123
      7. if scientific is shorter, convert, e.g. .000123 -> 123e-6
  • scientific:
      1. split into (digits, dot, digits) [eE] ([+-] digits)
      2. if the significand is zero, just use .0
      3. remove leading zeros from the significand
      4. shift out trailing zeros from the significand
      5. examine the exponent and determine which format is best: number with fraction, or scientific

Note: A number with a fractional part and a scientific number are never converted to an integer, because Lua 5.3 distinguishes between integers and floats.

Parameters:

  • i int
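As one concrete instance of the float rules above, steps (6)-(7) for a pure fraction can be sketched like this (a simplified illustration handling only the ".q" form; the function name is invented for the example):

```lua
-- Convert a pure fraction like .000123 to scientific notation
-- (123e-6) when that representation is shorter.
local function shorten_fraction(num)
  local q = string.match(num, "^%.(%d+)$")
  if not q then return num end
  -- strip leading and trailing zeros from the fractional digits
  local digits = string.match(q, "^0*(%d-)0*$")
  if digits == "" then return ".0" end  -- degenerate .000 case
  local trail = #q - #string.match(q, "^0*") - #digits
  local exp = -(#q - trail)             -- value is digits * 10^exp
  local sci = digits .. "e" .. exp
  if #sci < #num then return sci end
  return num
end

assert(shorten_fraction(".000123") == "123e-6")
assert(shorten_fraction(".5") == ".5")  -- "5e-1" is not shorter
```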
do_string (I)
Does string optimization.

Note: It works on well-formed strings only!

Optimizations on characters can be summarized as follows:

 \a\b\f\n\r\t\v -- no change
 \\             -- no change
 \"\'           -- depends on the delimiter; the quote not matching it can drop the \
 \[\]           -- remove \
 \<char>        -- generic escape, remove \  (Lua 5.1 only)
 \<eol>         -- normalize the EOL only
 \ddd           -- if it maps to \a\b\f\n\r\t\v, change to the latter;
                   if the value is below ASCII 32, keep ddd but zap leading
                   zeros (not possible when a digit follows, as it would be
                   absorbed into the escape);
                   if the value is ASCII 32 or above, translate it into the
                   literal character, then also escape the \\, \", \' cases
 <other>        -- no change

Switch delimiters if string becomes shorter.

Parameters:

  • I int
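The \ddd rules from the table above can be sketched as follows (a simplified illustration: the function name is invented, and the literal-translation branch for values at or above ASCII 32 is left out):

```lua
-- Decimal-escape rules: map \ddd to a named escape where one exists,
-- otherwise strip leading zeros for control characters, but only when
-- no digit follows (rewriting \0015 as \15 would yield \155).
local named = { [7]="\\a", [8]="\\b", [9]="\\t", [10]="\\n",
                [11]="\\v", [12]="\\f", [13]="\\r" }

local function opt_ddd(ddd, following)
  local v = tonumber(ddd)
  if named[v] then return named[v] end       -- e.g. \009 -> \t
  if v < 32 then
    local short = tostring(v)
    if string.match(following or "", "^%d") and #short < 3 then
      return "\\" .. string.rep("0", 3 - #short) .. short  -- keep padding
    end
    return "\\" .. short                     -- e.g. \001 -> \1
  end
  return "\\" .. ddd  -- >= ASCII 32: literal translation handled elsewhere
end

assert(opt_ddd("009") == "\\t")
assert(opt_ddd("001", "x") == "\\1")
assert(opt_ddd("001", "5") == "\\001")  -- a digit follows, keep padding
```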
do_lstring (I)
Does long string optimization.

  • remove first optional newline
  • normalize embedded newlines
  • reduce '=' separators in delimiters if possible

Note: A warning is flagged if trailing whitespace is found; it is not trimmed.

Parameters:

  • I int
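The '=' separator reduction can be sketched as a search for the smallest level at which the closing bracket cannot occur inside the content (a simplified illustration; it ignores the edge case of content ending in ']', and as noted above the real function also removes the first optional newline and normalizes embedded EOLs):

```lua
-- Find the smallest n such that "]" .. ("="):rep(n) .. "]" does not
-- appear in the content, so it can be delimited as [=*n[ ... ]=*n].
local function min_level(content)
  local n = 0
  while string.find(content, "]" .. string.rep("=", n) .. "]", 1, true) do
    n = n + 1
  end
  return n
end

assert(min_level("plain text") == 0)    -- [[plain text]]
assert(min_level("t[2] = t[1]]") == 1)  -- contains "]]", needs [=[ ... ]=]
```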
do_lcomment (I)
Does long comment optimization.

  • trim trailing whitespace
  • normalize embedded newlines
  • reduce '=' separators in delimiters if possible

Note: It does not remove the first optional newline.

Parameters:

  • I int

do_comment (i)
Does short comment optimization.

  • trim trailing whitespace

Parameters:

  • i int
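The trim can be sketched in one pattern match (this illustration takes the comment text directly; in the real module the function works on token index i):

```lua
-- Strip trailing whitespace from a short comment's text.
local function trim_comment(s)
  return (string.match(s, "^(.-)%s*$"))
end

assert(trim_comment("-- a note   ") == "-- a note")
assert(trim_comment("--") == "--")
```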
keep_lcomment (opt_keep, info)
Returns true if a match string is found in the long comment.

This is a feature to keep copyright or license texts.

Parameters:

  • opt_keep string the match string, e.g. "Copyright"
  • info string the content of the long comment

Returns:

    bool
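A sketch of the check (the parameter roles here are inferred from the description above: opt_keep is the match string, info the comment text):

```lua
-- Keep the long comment when it contains the user-supplied match
-- string, e.g. a copyright or license marker.
local function keep_lcomment(opt_keep, info)
  if not opt_keep then return false end
  return string.find(info, opt_keep, 1, true) ~= nil
end

assert(keep_lcomment("Copyright", "[[ Copyright (c) 2011 ]]"))
assert(not keep_lcomment("Copyright", "[[ scratch notes ]]"))
```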
generated by LDoc 1.4.6