[climacs-devel] html syntax buglet

Robert Strandh strandh at labri.fr
Tue Jun 28 04:19:46 UTC 2005


Christophe Rhodes writes:
 > 
 > I think NEXT-LEXEME has a buglet in HTML syntax.

Looks plausible. 

 > 
 > in this clause:
 > 
 >         (#\< (fo) (cond ((or (end-of-buffer-p scan)
 >                              (not (eql (object-after scan) #\/)))
 >                          (make-instance 'start-tag-start))
 >                         (t (fo)
 >                            (make-instance 'end-tag-start))))
 > 
 > Here the /next/ character affects the identification of the current
 > lexeme.  This means that the lexer is no longer entirely local, and
 > so UPDATE-SYNTAX (which invalidates the lexemes based on the damaged
 > region) might not invalidate enough.  To see this, open an empty HTML
 > file and type
 > 
 > <html><head><title>foo</title>
 > 
 > and observe that the "</title>" is treated as a parse error.  The
 > problem is that the "<" of "</title>", when first typed, is treated as
 > a start-tag-start, and this is never invalidated.
 > 
 > I think the simplest way to fix this is probably to make #\< always
 > lex as a tag-start, rather than differentiating between
 > start-tag-start and end-tag-start, because then the update-syntax
 > method doesn't need to be updated.

You might be right.  However, this bug was introduced because I wanted
to have fewer lexemes in order to speed up the parser.  I'll think
about it some more before making this change.
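
To make the trade-off concrete, here is a rough toy sketch of the
problem and of the proposed fix.  It is not Climacs code: it lexes a
plain string, uses keywords in place of lexeme classes, makes every
lexeme one character long except the two-character "</", and assumes
that typing a character only invalidates lexemes extending past it.
All function names are made up for illustration.

```lisp
;;; Toy model only: strings instead of Climacs buffers, keywords instead
;;; of lexeme classes.  All names here are hypothetical.

(defun lex-one (text pos lookahead-p)
  "Return the kind and end position of the lexeme starting at POS."
  (let ((ch (char text pos))
        (next (when (< (1+ pos) (length text))
                (char text (1+ pos)))))
    (case ch
      (#\< (cond ((not lookahead-p)           ; proposed: one kind only
                  (values :tag-start (1+ pos)))
                 ((eql next #\/)              ; current: peek at next char
                  (values :end-tag-start (+ pos 2)))
                 (t (values :start-tag-start (1+ pos)))))
      (#\/ (values :slash (1+ pos)))
      (#\> (values :tag-end (1+ pos)))
      (t (values :text (1+ pos))))))

(defun relex (text lexemes lookahead-p)
  "Extend LEXEMES, a list of (KIND START END), until it covers TEXT."
  (let ((pos (if lexemes (third (first (last lexemes))) 0)))
    (loop while (< pos (length text))
          do (multiple-value-bind (kind end)
                 (lex-one text pos lookahead-p)
               (setf lexemes (append lexemes (list (list kind pos end)))
                     pos end)))
    lexemes))

(defun lex-from-scratch (text lookahead-p)
  "Lex all of TEXT in one pass and return the list of lexeme kinds."
  (mapcar #'first (relex text '() lookahead-p)))

(defun lex-while-typing (text lookahead-p)
  "Type TEXT one character at a time.  After each keystroke, keep every
lexeme that ends at or before the new character and relex from there."
  (let ((lexemes '()))
    (loop for n from 1 to (length text)
          do (setf lexemes
                   (relex (subseq text 0 n)
                          (remove-if (lambda (lx) (> (third lx) (1- n)))
                                     lexemes)
                          lookahead-p)))
    (mapcar #'first lexemes)))
```

With LOOKAHEAD-P true, (lex-while-typing "<b></b>" t) keeps the stale
:start-tag-start for the second "<" and so disagrees with
(lex-from-scratch "<b></b>" t), which is the behaviour reported above.
With LOOKAHEAD-P nil the two always agree, at the price of an extra
:slash lexeme per end tag that the grammar then has to handle.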

 > (I found this while briefly looking for the bug which affects Prolog
 > syntax: loading a Prolog file initially shows everything to be a parse
 > error; typing space at the top of the buffer forces a reparse and
 > everything is well from that point.  I have no idea why this happens.)

That's strange indeed.  Is it OK even after removing the space? 

-- 
Robert Strandh

---------------------------------------------------------------------
Greenspun's Tenth Rule of Programming: any sufficiently complicated C
or Fortran program contains an ad hoc informally-specified bug-ridden
slow implementation of half of Common Lisp.
---------------------------------------------------------------------


