From pao at ascent.com Wed Sep 15 17:27:23 2004
From: pao at ascent.com (Patrick O'Donnell)
Date: Wed, 15 Sep 2004 13:27:23 -0400 (EDT)
Subject: [cl-ppcre-devel] Porting to Genera, feature request
Message-ID: <200409151727.NAA20475@kamet.ascent.com>

Hi, Edi,

I just downloaded cl-ppcre to see if it would help with some parsing I'm working on. I thought I'd drop you a line just to alert you to the issues I ran into. I don't think they're worth a lot of attention, but just so you know they're there.

I'm doing development on Genera, and there were several issues with compiling cl-ppcre, mostly due to Genera not quite being ANSI compliant. I finally got it to compile and ran the test. Some (21) of the tests failed. Most of the failures, I've noted are based on assumptions about character codes that do not hold for Genera (e.g. test 432 -- line termination (char-code #\Return) -> #o215, not #o12.) The others I haven't thoroughly investigated. I believe that I'll be able to use the limited features I need, and if I run into trouble, I'll let you know.

The feature that I'd like to see relates to your s-expression parse tree capability. I'd like to abstract sub-parse-trees.
A small change to the end of convert-aux,

      (otherwise
        (let ((translation (get parse-tree 'parse-tree-synonym)))
          (if translation
              (convert-aux translation)
              (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
                                         parse-tree))))

and a quick macro,

  (defmacro DEFINE-PARSE-TREE-SYNONYM (name parse-tree)
    `(setf (get ',name 'ppcre::parse-tree-synonym) ',parse-tree))

and I can do something like:

  (define-parse-tree-synonym A
    (:CHAR-CLASS (:RANGE #\a #\z) (:RANGE #\A #\Z)))
  (define-parse-tree-synonym X
    (:CHAR-CLASS (:RANGE #\a #\z) (:RANGE #\A #\Z) :DIGIT-CLASS))
  (define-parse-tree-synonym N :DIGIT-CLASS)

  (define-parse-tree-synonym SMA-DATE
    (:SEQUENCE n n a a a))
  (define-parse-tree-synonym AIRLINE-DESIGNATOR
    (:SEQUENCE x x (:GREEDY-REPETITION 0 1 a)))
  (define-parse-tree-synonym FLIGHT-NUMBER
    (:GREEDY-REPETITION 3 4 n))
  (define-parse-tree-synonym OPERATIONAL-SUFFIX
    (:GREEDY-REPETITION 0 1 a))

  (defparameter *flight-scanner*
    (ppcre:create-scanner
      '(:sequence airline-designator flight-number operational-suffix
                  "/" sma-date)))

Much more perspicuous, especially in more complex parse trees where the abstracted elements are repeated.

Just a thought.

- Patrick O'Donnell
  pao at ascent.com

From pao at ascent.com Wed Sep 15 20:48:15 2004
From: pao at ascent.com (Patrick O'Donnell)
Date: Wed, 15 Sep 2004 16:48:15 -0400 (EDT)
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <200409151727.NAA20475@kamet.ascent.com> (pao@ascent.com)
References: <200409151727.NAA20475@kamet.ascent.com>
Message-ID: <200409152048.QAA20667@kamet.ascent.com>

    Date: Wed, 15 Sep 2004 13:27:23 -0400 (EDT)
    From: "Patrick O'Donnell"

    The feature that I'd like to see relates to your s-expression parse tree capability. I'd like to abstract sub-parse-trees.
    A small change to the end of convert-aux,

          (otherwise
            (let ((translation (get parse-tree 'parse-tree-synonym)))
              (if translation
                  (convert-aux translation)
                  (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
                                             parse-tree))))

Except that it has to be:

      (otherwise
        (let ((translation (and (symbolp parse-tree)
                                (get parse-tree 'parse-tree-synonym))))
          (if translation
              (convert-aux (copy-tree translation))
              (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
                                         parse-tree))))))))

- Pat

From edi at agharta.de Thu Sep 16 08:44:42 2004
From: edi at agharta.de (Edi Weitz)
Date: Thu, 16 Sep 2004 10:44:42 +0200
Subject: [cl-ppcre-devel] New version 0.8.0
Message-ID: <87k6uuob5x.fsf@miles.agharta.de>

Hi!

A new release is available from .

Here's the relevant part from the changelog:

Version 0.8.0
2004-09-16
Added parse tree synonyms (thanks to Patrick O'Donnell)

Have fun,
Edi.

From edi at agharta.de Thu Sep 16 08:53:04 2004
From: edi at agharta.de (Edi Weitz)
Date: Thu, 16 Sep 2004 10:53:04 +0200
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <200409151727.NAA20475@kamet.ascent.com> (Patrick O'Donnell's message of "Wed, 15 Sep 2004 13:27:23 -0400 (EDT)")
References: <200409151727.NAA20475@kamet.ascent.com>
Message-ID: <87brg6oarz.fsf@miles.agharta.de>

Hi Patrick!

On Wed, 15 Sep 2004 13:27:23 -0400 (EDT), "Patrick O'Donnell" wrote:

> I'm doing development on Genera, and there were several issues with
> compiling cl-ppcre, mostly due to Genera not quite being ANSI
> compliant.

If you send a #+/#- patch to make compilation on Genera work I'll gladly integrate it.

> I finally got it to compile and ran the test. Some (21)
> of the tests failed. Most of the failures, I've noted are based on
> assumptions about character codes that do not hold for Genera
> (e.g. test 432 -- line termination (char-code #\Return) -> #o215,
> not #o12.) The others I haven't thoroughly investigated. I believe
> that I'll be able to use the limited features I need, and if I run
> into trouble, I'll let you know.

I can't do anything about these assumptions because CL-PPCRE purports to be Perl-compatible but I'd be glad to add a note to the docs or the README file about these (expected) failures on Genera if you send details - preferably as a patch to the README file.

(I've bought an Alpha and a copy of Open Genera some months ago but haven't yet had the time to install it - let alone to play with it. Sigh...)

Let me know if there are any other problems.

> The feature that I'd like to see relates to your s-expression parse
> tree capability. I'd like to abstract sub-parse-trees.

Thanks. I've just released a new version which incorporates your patch. I've changed the API a little bit such that you also have a functional interface to these synonyms - see the docs. I've also wrapped the macro with EVAL-WHEN because it might break compiler macros otherwise.

Cheers,
Edi.

From pao at ascent.com Thu Sep 16 13:17:26 2004
From: pao at ascent.com (Patrick O'Donnell)
Date: Thu, 16 Sep 2004 09:17:26 -0400 (EDT)
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <87brg6oarz.fsf@miles.agharta.de> (message from Edi Weitz on Thu, 16 Sep 2004 10:53:04 +0200)
References: <200409151727.NAA20475@kamet.ascent.com> <87brg6oarz.fsf@miles.agharta.de>
Message-ID: <200409161317.JAA21086@kamet.ascent.com>

    From: Edi Weitz
    Date: Thu, 16 Sep 2004 10:53:04 +0200

    On Wed, 15 Sep 2004 13:27:23 -0400 (EDT), "Patrick O'Donnell" wrote:

    > I'm doing development on Genera, and there were several issues with
    > compiling cl-ppcre, mostly due to Genera not quite being ANSI
    > compliant.

    If you send a #+/#- patch to make compilation on Genera work I'll gladly integrate it.

OK. I'll have to go over it again to clean it up; some of the earlier conditionalizations I did were superseded by more comprehensive changes later.
Quick precis:

user::in-package is still a function, so (in-package #:cl-ppcre) doesn't work. I just changed all those to :cl-ppcre, the keyword clutter being less ugly to me than #+/#- for all those.

In the package declaration, I :use'd FUTURE-COMMON-LISP instead of CL, which almost worked. I also had to shadowing import LAMBDA from CL. (Weird.)

Either Genera doesn't handle the simple-string type right or future-common-lisp:simple-string isn't fully implemented. I'll want to investigate that better to determine the best solution. As it is, I just conditionalized all the simple-string usages to string.

There were a couple other minor things. I'll send a cleaned-up diff sometime when deadline pressure is relieved.

    I can't do anything about these assumptions because CL-PPCRE purports to be Perl-compatible but I'd be glad to add a note to the docs or the README file about these (expected) failures on Genera if you send details ...

OK. (For some of the tests, I'll still have to wrap my brain around the Perl syntax, to figure out what's going wrong, to see whether they are the char-code issue or something else!)

    (I've bought an Alpha and a copy of Open Genera some months ago but haven't yet had the time to install it - let alone to play with it. Sigh...)

I understand. I've had a 3650 in my basement for some years, now, and I still haven't time to set it up.
- Pat

From edi at agharta.de Thu Sep 16 20:49:51 2004
From: edi at agharta.de (Edi Weitz)
Date: Thu, 16 Sep 2004 22:49:51 +0200
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <200409161317.JAA21086@kamet.ascent.com> (Patrick O'Donnell's message of "Thu, 16 Sep 2004 09:17:26 -0400 (EDT)")
References: <200409151727.NAA20475@kamet.ascent.com> <87brg6oarz.fsf@miles.agharta.de> <200409161317.JAA21086@kamet.ascent.com>
Message-ID: <874qlyc51s.fsf@miles.agharta.de>

On Thu, 16 Sep 2004 09:17:26 -0400 (EDT), "Patrick O'Donnell" wrote:

> I'll send a cleaned-up diff sometime when deadline pressure is
> relieved.

Cool, that'd be nice. Take your time.

Thanks,
Edi.

From edi at agharta.de Tue Sep 28 23:14:12 2004
From: edi at agharta.de (Edi Weitz)
Date: Wed, 29 Sep 2004 01:14:12 +0200
Subject: [cl-ppcre-devel] Re: CL-PPCRE bug found in do-scans
In-Reply-To: (Sébastien Saint-Sevin's message of "Wed, 29 Sep 2004 00:23:50 +0200")
References:
Message-ID: <87oejqasvv.fsf@miles.agharta.de>

Hi Sébastien!

On Wed, 29 Sep 2004 00:23:50 +0200, Sébastien Saint-Sevin wrote:

> I think I found a bug when using do-scans :
>
> CL-USER 30 > (setq *scanner* (cl-ppcre:create-scanner "[0-9]-$"))
> #
>
> CL-USER 31 > (cl-ppcre:do-scans (s e rs re *scanner* "simple string-"))
> NIL
>
> CL-USER 32 > (cl-ppcre:do-scans (s e rs re *scanner* "simple string 4-"))
> ==> infinite loop. 100% cpu. memory growing indefinitely.
>
> The same regex with a simple scan is OK :
>
> CL-USER 34 > (cl-ppcre:scan *scanner* "simple string 4-")
> 14
> 16
> #()
> #()

Which version of CL-PPCRE (current is 0.8.0) and which Lisp is this? I can't reproduce this problem with CMUCL 19a or AllegroCL 7.0 beta.

Please use the mailing list for bug reports, thanks.

Cheers,
Edi.
From edi at agharta.de Tue Sep 28 23:18:58 2004
From: edi at agharta.de (Edi Weitz)
Date: Wed, 29 Sep 2004 01:18:58 +0200
Subject: [cl-ppcre-devel] Re: Cl-ppcre usage
In-Reply-To: (Sébastien Saint-Sevin's message of "Tue, 28 Sep 2004 16:12:21 +0200")
References:
Message-ID: <87k6ueasnx.fsf@miles.agharta.de>

Hi Sébastien!

On Tue, 28 Sep 2004 16:12:21 +0200, Sébastien Saint-Sevin wrote:

> I've got two questions regarding cl-ppcre usage.
>
> 1) do-scans
> I need to use multi-matching within a string so I use do-scans.
> However, I've got some troubles when I use ^ to force a match at the
> beginning of the string. It's like the start of the string is
> constantly reinit after each match within the string (and this is
> not what I want as you already suppose).
>
> Is there a way to do it?

I'm not really sure what you want to achieve - maybe an example would help. If your search pattern contains ^ then it can only match once, right? Have you looked at the m modifier?

> 2) user hook
> I want to combine regex and some kind of lookup tables (hash-table
> that stores string). In Michael Parker's regex engine, it is
> possible to give the engine a user hook function that, when a match
> is found, check it against this user function to finally validate
> it, otherwise backtrack the matching.
>
> How can I do that with CL-PPCRE ? (where do I need to hack the code
> otherwise ? ;-))

I have plans to implement this in the future but my current workload and other projects don't allow me to do that. If you provide a clean patch which implements this functionality I'd be glad to integrate it. (In my opinion this should only be possible in parse trees, though, in order not to break Perl compatibility.)

Cheers,
Edi.
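
Edi's pointer to the m modifier can be tried out with a short sketch. It assumes CL-PPCRE is already loaded and that create-scanner accepts the :multi-line-mode keyword as the counterpart of Perl's m modifier, as it does in current releases. The helper name LINE-START-WORDS is made up for this illustration. With multi-line mode on, ^ matches after every newline, so do-scans can report more than one anchored match in a single string.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. with (ql:quickload :cl-ppcre).
;; LINE-START-WORDS is a hypothetical helper, not part of CL-PPCRE.
(defun line-start-words (text)
  "Collect (START . END) for each word that begins a line of TEXT."
  (let ((scanner (cl-ppcre:create-scanner "^\\w+" :multi-line-mode t))
        (positions '()))
    ;; The sixth element of the DO-SCANS header is the result form,
    ;; evaluated once the loop has run out of matches.
    (cl-ppcre:do-scans (start end reg-starts reg-ends scanner text
                        (nreverse positions))
      (push (cons start end) positions))))

;; Two lines, so two anchored matches are expected.
(line-start-words (format nil "one two~%three four"))
```

Without :multi-line-mode the same scanner should report only the first word, which is the "can only match once" behaviour Edi describes.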
From pao at ascent.com Wed Sep 29 17:03:14 2004
From: pao at ascent.com (Patrick O'Donnell)
Date: Wed, 29 Sep 2004 13:03:14 -0400 (EDT)
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <874qlyc51s.fsf@miles.agharta.de> (message from Edi Weitz on Thu, 16 Sep 2004 22:49:51 +0200)
References: <200409151727.NAA20475@kamet.ascent.com> <87brg6oarz.fsf@miles.agharta.de> <200409161317.JAA21086@kamet.ascent.com> <874qlyc51s.fsf@miles.agharta.de>
Message-ID: <200409291703.NAA04141@kamet.ascent.com>

Edi,

    Date: Thu, 16 Sep 2004 22:49:51 +0200
    From: Edi Weitz

    On Thu, 16 Sep 2004 09:17:26 -0400 (EDT), "Patrick O'Donnell" wrote:

    > I'll send a cleaned-up diff sometime when deadline pressure is
    > relieved.

    Cool, that'd be nice. Take your time.

I took a few moments to clean things up a bit. The diff is included, below.

Most of the porting problems were taken care of by judicious package manipulation.

The change in convert.lisp was because Genera had problems failing to grow the hash table when the rehash threshold was 1.0 for certain sizes of table.

In errors.lisp, Symbolics didn't get around to adding the :default-initargs option. Just commenting this out causes the errors to not print, but that didn't bother me, so I haven't spent time fixing it.

In optimize.lisp and lexer.lisp, Genera had problems with the string type declarations. I just diked them.

I moved the defpackage of cl-ppcre-test from ppcre-tests.lisp into packages.lisp. That way, Genera could correctly utilize the package specification in the file attribute list. I could see no downside.
- Pat

diff -r cl-ppcre-0.8.0/convert.lisp cl-ppcre/convert.lisp
118c118
< :rehash-threshold 1.0)
---
> :rehash-threshold #-genera 1.0 #+genera 0.99)
diff -r cl-ppcre-0.8.0/errors.lisp cl-ppcre/errors.lisp
1c1
< ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE-LISP; Base: 10 -*-
---
> ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
44a45
> #-genera
diff -r cl-ppcre-0.8.0/lexer.lisp cl-ppcre/lexer.lisp
Warning: missing newline at end of file cl-ppcre-0.8.0/lexer.lisp
Warning: missing newline at end of file cl-ppcre/lexer.lisp
89c89
< (type string string))
---
> #-genera (type string string))
diff -r cl-ppcre-0.8.0/load.lisp cl-ppcre/load.lisp
30c30
< (in-package #:cl-user)
---
> (in-package :cl-user)
diff -r cl-ppcre-0.8.0/optimize.lisp cl-ppcre/optimize.lisp
Warning: missing newline at end of file cl-ppcre-0.8.0/optimize.lisp
Warning: missing newline at end of file cl-ppcre/optimize.lisp
48c48
< (declare (type string string))
---
> #-genera (declare (type string string))
54c54
< (declare (type string string))
---
> #-genera (declare (type string string))
diff -r cl-ppcre-0.8.0/packages.lisp cl-ppcre/packages.lisp
30c30
< (in-package #:cl-user)
---
> (in-package :cl-user)
35c35,36
< (:use #:cl)
---
> #+genera (:shadowing-import-from #:common-lisp #:lambda #:simple-string #:string)
> (:use #-genera #:cl #+genera #:future-common-lisp)
92a94,105
>
>
> #-:cormanlisp
> (defpackage #:cl-ppcre-test
> #+genera (:shadowing-import-from #:common-lisp #:lambda)
> (:use #-genera #:cl #+genera #:future-common-lisp #:cl-ppcre)
> (:export #:test))
>
> #+:cormanlisp
> (defpackage "CL-PPCRE-TEST"
> (:use "CL" "CL-PPCRE")
> (:export "TEST"))
diff -r cl-ppcre-0.8.0/ppcre-tests.lisp cl-ppcre/ppcre-tests.lisp
30,41d29
< (in-package #:cl-user)
<
< #-:cormanlisp
< (defpackage #:cl-ppcre-test
< (:use #:cl #:cl-ppcre)
< (:export #:test))
<
< #+:cormanlisp
< (defpackage "CL-PPCRE-TEST"
< (:use "CL" "CL-PPCRE")
< (:export "TEST"))
<
154c142
< :type nil :version nil
---
> :type #-genera nil #+genera :unspecific :version nil
diff -r cl-ppcre-0.8.0/regex-class.lisp cl-ppcre/regex-class.lisp
Warning: missing newline at end of file cl-ppcre-0.8.0/regex-class.lisp
Warning: missing newline at end of file cl-ppcre/regex-class.lisp
35a36,40
> ;;; Genera need the eval-when, here, or the types created by the
> ;;; class definitions aren't seen by the typep calls later in the
> ;;; file.
> (eval-when (:compile-toplevel :load-toplevel :execute)
>
238a244,245
> );;; End eval-when
>

From edi at agharta.de Wed Sep 29 18:04:04 2004
From: edi at agharta.de (Edi Weitz)
Date: Wed, 29 Sep 2004 20:04:04 +0200
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <200409291703.NAA04141@kamet.ascent.com> (Patrick O'Donnell's message of "Wed, 29 Sep 2004 13:03:14 -0400 (EDT)")
References: <200409151727.NAA20475@kamet.ascent.com> <87brg6oarz.fsf@miles.agharta.de> <200409161317.JAA21086@kamet.ascent.com> <874qlyc51s.fsf@miles.agharta.de> <200409291703.NAA04141@kamet.ascent.com>
Message-ID: <87r7ol6jfv.fsf@miles.agharta.de>

Hi Patrick!

On Wed, 29 Sep 2004 13:03:14 -0400 (EDT), "Patrick O'Donnell" wrote:

> I took a few moments to clean things up a bit. The diff is
> included, below.

Cool, thanks! I'll release a new version as soon as possible.

Two little questions:

1. The abstract lists the implementations CL-PPCRE is known to work with. What am I supposed to add in this case? Something like "Genera (version x.y on Symbolics LispMachine ZZZ)", i.e. what's the official name and version number of the Lisp implementation and the OS you're using.

2. You mentioned there are a couple of issues with the tests due to different character encodings. Is there a short sentence I could add to the README file like: "Note that some tests will fail on Genera because characters like ... have encodings which differ from Perl's expectations?"

Thanks again,
Edi.
From pao at ascent.com Wed Sep 29 18:47:50 2004
From: pao at ascent.com (Patrick O'Donnell)
Date: Wed, 29 Sep 2004 14:47:50 -0400 (EDT)
Subject: [cl-ppcre-devel] Porting to Genera, feature request
In-Reply-To: <87r7ol6jfv.fsf@miles.agharta.de> (message from Edi Weitz on Wed, 29 Sep 2004 20:04:04 +0200)
References: <200409151727.NAA20475@kamet.ascent.com> <87brg6oarz.fsf@miles.agharta.de> <200409161317.JAA21086@kamet.ascent.com> <874qlyc51s.fsf@miles.agharta.de> <200409291703.NAA04141@kamet.ascent.com> <87r7ol6jfv.fsf@miles.agharta.de>
Message-ID: <200409291847.OAA04426@kamet.ascent.com>

    From: Edi Weitz
    Date: Wed, 29 Sep 2004 20:04:04 +0200

    Two little questions:

    1. The abstract lists the implementations CL-PPCRE is known to work with. What am I supposed to add in this case? Something like "Genera (version x.y on Symbolics LispMachine ZZZ)", i.e. what's the official name and version number of the Lisp implementation and the OS you're using.

Genera 8.5.

    2. You mentioned there are a couple of issues with the tests due to different character encodings. Is there a short sentence I could add to the README file like: "Note that some tests will fail on Genera because characters like ... have encodings which differ from Perl's expectations?"

Return, Linefeed, and Tab. There are others, such as Back-Space and Page, but I don't think they appeared in the tests.

You also should mention the issue with ppcre's errors -- that incomplete ANSI compatibility in Genera means that attempts to print the errors will fail.

- Pat

From edi at agharta.de Thu Sep 30 10:06:57 2004
From: edi at agharta.de (Edi Weitz)
Date: Thu, 30 Sep 2004 12:06:57 +0200
Subject: [cl-ppcre-devel] New version 0.8.1
Message-ID: <87d604rry6.fsf@miles.agharta.de>

Hi!

A new release is available from .

Here's the relevant part from the changelog:

Version 0.8.1
2004-09-30
Patches for Genera 8.5 (thanks to Patrick O'Donnell)

Have fun,
Edi.
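
The parse tree synonym mechanism discussed earlier in this thread rests on two details from Patrick's corrected patch: only symbols may be looked up on a property list, and the stored tree is copied before it is handed on. The toy below illustrates just that mechanism in plain Common Lisp. It is a sketch under stated assumptions, not CL-PPCRE's code: DEFINE-SYNONYM and EXPAND-SYNONYMS are hypothetical names, and the real converter builds regex objects rather than returning an expanded list.

```lisp
;; Toy illustration only. DEFINE-SYNONYM and EXPAND-SYNONYMS are
;; hypothetical names and are not part of CL-PPCRE.
(defmacro define-synonym (name parse-tree)
  "Store PARSE-TREE on the property list of the symbol NAME."
  `(setf (get ',name 'parse-tree-synonym) ',parse-tree))

(defun expand-synonyms (tree)
  "Return a fresh copy of TREE with every synonym symbol expanded."
  (cond ((consp tree)
         (mapcar #'expand-synonyms tree))
        ;; GET is only defined for symbols, hence the SYMBOLP guard
        ;; that the second version of the patch added.
        ((and (symbolp tree) (get tree 'parse-tree-synonym))
         ;; Copy first, so later destructive processing cannot
         ;; damage the tree stored on the property list.
         (expand-synonyms (copy-tree (get tree 'parse-tree-synonym))))
        (t tree)))

(define-synonym n :digit-class)
(define-synonym flight-number (:greedy-repetition 3 4 n))

;; Synonyms may refer to other synonyms; expansion is recursive.
(expand-synonyms '(:sequence flight-number "/" n))
;; => (:SEQUENCE (:GREEDY-REPETITION 3 4 :DIGIT-CLASS) "/" :DIGIT-CLASS)
```

Strings, numbers and keywords without a stored synonym pass through unchanged, which is why an unknown token can still be reported by the caller.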
From seb-cl-mailist at matchix.com Thu Sep 30 13:33:40 2004
From: seb-cl-mailist at matchix.com (Sébastien Saint-Sevin)
Date: Thu, 30 Sep 2004 15:33:40 +0200
Subject: [cl-ppcre-devel] Re: Cl-ppcre usage
Message-ID:

Hi Edi & list,

two points :
1) virtual bugs
2) user hook

1) I feel really stupid : the two behaviours I mentioned in my previous mails were already corrected in recent releases. I was using 0.1.3 !!! The reason : the library was so good from its first releases that I simply forgot to upgrade ;-)

2) More interesting : I tried a few things to simply implement what I need, i.e. a simple lookup function to validate a match.

Here is what I came up with:

I added a (:FILTER ) parse-tree S-expr. It is similar to a (:BACK-REFERENCE ) in that it refers to the :REGISTER defined by . It is different in that it does not consume any char in the regex string, but submits the register substring to its predicate to validate or not the current register match.

So to achieve what I wanted, I just have to encapsulate the parsed tree with

  (:SEQUENCE (:REGISTER ) (:FILTER 1 MY-PREDICATE-SYMBOL))

If I still want to use (parse-string ) to build the main part of the tree, I just need to incf all registers within the parsed-tree by one to take into account the register I finally add to the tree.

What do you think about all that ?

Cheers,
Sebastien.

> -----Original Message-----
> From: cl-ppcre-devel-bounces at common-lisp.net
> [mailto:cl-ppcre-devel-bounces at common-lisp.net] On Behalf Of Edi Weitz
> Sent: Wednesday, 29 September 2004 01:19
> To: Sébastien Saint-Sevin
> Cc: cl-ppcre-devel at common-lisp.net
> Subject: [cl-ppcre-devel] Re: Cl-ppcre usage
>
>
> Hi Sébastien!
>
> On Tue, 28 Sep 2004 16:12:21 +0200, Sébastien Saint-Sevin
> wrote:
>
> > I've got two questions regarding cl-ppcre usage.
> >
> > 1) do-scans
> > I need to use multi-matching within a string so I use do-scans.
> > However, I've got some troubles when I use ^ to force a match at the
> > beginning of the string. It's like the start of the string is
> > constantly reinit after each match within the string (and this is
> > not what I want as you already suppose).
> >
> > Is there a way to do it?
>
> I'm not really sure what you want to achieve - maybe an example would
> help. If your search pattern contains ^ then it can only match once,
> right? Have you looked at the m modifier?
>
> > 2) user hook
> > I want to combine regex and some kind of lookup tables (hash-table
> > that stores string). In Michael Parker's regex engine, it is
> > possible to give the engine a user hook function that, when a match
> > is found, check it against this user function to finally validate
> > it, otherwise backtrack the matching.
> >
> > How can I do that with CL-PPCRE ? (where do I need to hack the code
> > otherwise ? ;-))
>
> I have plans to implement this in the future but my current workload
> and other projects don't allow me to do that. If you provide a clean
> patch which implements this functionality I'd be glad to integrate
> it. (In my opinion this should only be possible in parse trees,
> though, in order not to break Perl compatibility.)
>
> Cheers,
> Edi.
>
> _______________________________________________
> cl-ppcre-devel site list
> cl-ppcre-devel at common-lisp.net
> http://common-lisp.net/mailman/listinfo/cl-ppcre-devel
>
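
The lookup-table validation Sébastien describes can be approximated without changing the engine, by checking each match after the fact. The sketch below is not his :FILTER patch. It assumes CL-PPCRE is loaded, uses only do-scans, and introduces a hypothetical helper MATCHES-PASSING-FILTER plus a made-up *KNOWN-AIRLINES* table. The important difference from the proposal is that a rejected match is simply skipped, so the engine never backtracks to try an alternative.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. with (ql:quickload :cl-ppcre).
;; MATCHES-PASSING-FILTER and *KNOWN-AIRLINES* are hypothetical names
;; for this illustration and are not part of CL-PPCRE.
(defun matches-passing-filter (regex target predicate)
  "Return the register-1 text of each match of REGEX in TARGET
whose register-1 substring satisfies PREDICATE."
  (let ((accepted '()))
    (cl-ppcre:do-scans (start end reg-starts reg-ends regex target
                        (nreverse accepted))
      (let ((reg-start (aref reg-starts 0))
            (reg-end (aref reg-ends 0)))
        ;; A register that took no part in the match is NIL.
        (when reg-start
          (let ((text (subseq target reg-start reg-end)))
            (when (funcall predicate text)
              (push text accepted))))))))

(defparameter *known-airlines* (make-hash-table :test #'equal))
(setf (gethash "LH" *known-airlines*) t
      (gethash "BA" *known-airlines*) t)

;; "XX456" matches the pattern but fails the table lookup.
(matches-passing-filter "\\b([A-Z]{2})\\d{3,4}\\b"
                        "LH123 XX456 BA7890"
                        (lambda (code) (gethash code *known-airlines*)))
```

Post-filtering like this is enough when each candidate stands on its own. It cannot replace a validation step inside the match when a rejection should cause a shorter or different match to be tried.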