From edi at agharta.de Sat Aug 6 13:09:53 2005
From: edi at agharta.de (Edi Weitz)
Date: Sat, 06 Aug 2005 15:09:53 +0200
Subject: [regex-coach] Re: ^123$, Str:123; Result:No match ???
In-Reply-To: <6C11F971A507DF4B951EA4153880BDFA01C76B28@inblrm999msx.in002.siemens.net> (M. Nithyanandham's message of "Sat, 6 Aug 2005 16:23:24 +0530")
References: <6C11F971A507DF4B951EA4153880BDFA01C76B28@inblrm999msx.in002.siemens.net>
Message-ID:

[Please send questions and bug reports to the mailing list. The
address is /not/ regex-coach-owner at common-lisp.net but regex-coach
at common-lisp.net. You have to subscribe first.]

On Sat, 6 Aug 2005 16:23:24 +0530, "Nithyanandham, M" wrote:

> I currently use regex-coach that is developed by you. It is really
> helpful to learn regular expressions very fast. Meanwhile, I just
> found one bug with it and hence I wanted to inform you.
>
> Regular expression:
> ^123$
>
> Target string:
> 123
>
> Result: No match.
>
> Isn't that an incorrect result?

That would be incorrect. I can't reproduce that, though. If I enter
these strings, The Regex Coach shows a match. Are you using the newest
version (0.6.7)? Maybe there's some whitespace or some hidden
character in either the regex or the target string?

Cheers,
Edi.

From era+regex=coach at iki.fi Tue Aug 30 10:01:53 2005
From: era+regex=coach at iki.fi (era+regex=coach at iki.fi)
Date: Tue, 30 Aug 2005 13:01:53 +0300
Subject: [regex-coach] Apparent crash on complex regex in "Tree" tab
Message-ID: <1125396113.6889.241770226@webmail.messagingengine.com>

I came across your tool, and while it appears interesting, the first
thing I tried it with appears to have crashed it somehow.

I was trying to run a SpamAssassin rule against a message from the
SpamAssassin public corpus. Both of these are publicly available
from http://www.spamassassin.org but for completeness, I am also
attaching them below in exactly the form I was using them.

Running the regex from 20_head_tests_cf_UNRESOLVED_TEMPLATE.txt with
the /m modifier against the message headers in
easy_ham_00001-head.txt and clicking on the "Tree" tab causes a lot
of output to be printed on standard output (cf. third attachment)
and the main window to disappear after a short while. The process is
still alive, consuming around 90-95% of the CPU, though not
producing a lot of load (load average still well below one). I
finally killed it after allowing it to run for some 10 minutes wall
clock time. Incidentally, ctrl-C causes a message
"Quittingregex-coach/regex-coach : Interrupted - quitting" to be
printed (on standard output, not standard error, sigh), but does not
kill the process.

PS. Would you mind if your list was signed up for Gmane?

If I go ahead and sign up the list, do you have an mbox archive
handy which could be imported into Gmane?

/* era */

--
If this were a real .signature, it would suck less. Well, maybe not.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 20_head_tests_cf_UNRESOLVED_TEMPLATE.txt
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: easy_ham_00001-head.txt
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: regex-coach-out.txt
URL:

From edi at agharta.de Tue Aug 30 13:30:01 2005
From: edi at agharta.de (Edi Weitz)
Date: Tue, 30 Aug 2005 15:30:01 +0200
Subject: [regex-coach] Apparent crash on complex regex in "Tree" tab
In-Reply-To: <1125396113.6889.241770226@webmail.messagingengine.com> (era's message of "Tue, 30 Aug 2005 13:01:53 +0300")
References: <1125396113.6889.241770226@webmail.messagingengine.com>
Message-ID:

Hi!

On Tue, 30 Aug 2005 13:01:53 +0300, era+regex=coach at iki.fi wrote:

> I came across your tool, and while it appears interesting, the first
> thing I tried it with appears to have crashed it somehow.
>
> I was trying to run a SpamAssassin rule against a message from the
> SpamAssassin public corpus. Both of these are publicly available
> from http://www.spamassassin.org but for completeness, I am also
> attaching them below in exactly the form I was using them.
>
> Running the regex from 20_head_tests_cf_UNRESOLVED_TEMPLATE.txt with
> the /m modifier against the message headers in
> easy_ham_00001-head.txt and clicking on the "Tree" tab causes a lot
> of output to be printed on standard output (cf. third attachment)
> and the main window to disappear after a short while. The process is
> still alive, consuming around 90-95% of the CPU, though not
> producing a lot of load (load average still well below one). I
> finally killed it after allowing it to run for some 10 minutes wall
> clock time. Incidentally, ctrl-C causes a message
> "Quittingregex-coach/regex-coach : Interrupted - quitting" to be
> printed (on standard output, not standard error, sigh), but does not
> kill the process.

Which version of Regex Coach on which OS are you using? FWIW, I can't
reproduce your result with 0.6.7 on Windows XP.

> PS. Would you mind if your list was signed up for Gmane?

It's OK with me but note that the list is already archived on
common-lisp.net. See footer.

> If I go ahead and sign up the list, do you have an mbox archive
> handy which could be imported into Gmane?

No, sorry.

Cheers,
Edi.

From randy at randy.nu Tue Aug 30 17:27:30 2005
From: randy at randy.nu (Randy Gerritse)
Date: Tue, 30 Aug 2005 19:27:30 +0200
Subject: [regex-coach] feature request: Named Groups
Message-ID: <43149702.8080901@randy.nu>

Hi

First of all I've gotta say, having tested all the available regex
tools, your regex coach is by far the best.

However, one thing has been bugging me A LOT. You don't seem to
support named groups!
Since this makes the output in the language I
use, PHP, a lot more user-friendly (and less subject to change if you
alter the expression) I use this language feature a lot:

class="cell2">.*?(?P(\d{1,3},{0,1}){0,5})\s*&
.*?class="cell5">(?P.*?)<\/td>
.*?Publisher:<\/td>
\s*(?P.*?)<\/td>
.*?Limitations:<\/td>
\s*(?P.*?)<\/td>
.*?Date.added:<\/td>
\s*(?P.*?)<\/td>
.*?\s*?(?P.*?)\s*?<\/div>

like in the above example (which by the way your tool processes
correctly except for the named groups, where most other tested tools
failed!)

in the documentation I found it says that this format:

(?Pgroup)

...has been adapted from Python and made part of the standard... is
there a reason you have not put it in the tool? and is it possible
to start supporting it? that would be a BIG help!!! right now I need
to strip out all names before testing and that sux0rs... it would
make your tool the best out there, by a mile..!

From edi at agharta.de Tue Aug 30 22:58:47 2005
From: edi at agharta.de (Edi Weitz)
Date: Wed, 31 Aug 2005 00:58:47 +0200
Subject: [regex-coach] feature request: Named Groups
In-Reply-To: <43149702.8080901@randy.nu> (Randy Gerritse's message of "Tue, 30 Aug 2005 19:27:30 +0200")
References: <43149702.8080901@randy.nu>
Message-ID:

Hi!

On Tue, 30 Aug 2005 19:27:30 +0200, Randy Gerritse wrote:

> First of all I've gotta say, having tested all the available regex
> tools, your regex coach is by far the best.

Thanks... :)

> However, one thing has been bugging me A LOT. You don't seem to
> support named groups!
> Since this makes the output in the language I
> use, PHP, a lot more user-friendly (and less subject to change if you
> alter the expression) I use this language feature a lot:
>
> class="cell2">.*?(?P(\d{1,3},{0,1}){0,5})\s*&
> .*?class="cell5">(?P.*?)<\/td>
> .*?Publisher:<\/td>
> \s*(?P.*?)<\/td>
> .*?Limitations:<\/td>
> \s*(?P.*?)<\/td>
> .*?Date.added:<\/td>
> \s*(?P.*?)<\/td>
> .*?\s*?(?P.*?)\s*?<\/div>
>
> like in the above example (which by the way your tool processes
> correctly except for the named groups, where most other tested tools
> failed!)
>
> in the documentation I found it says that this format:
>
> (?Pgroup)
>
> ...has been adapted from Python and made part of the standard... is
> there a reason you have not put it in the tool? and is it possible
> to start supporting it? that would be a BIG help!!! right now I need
> to strip out all names before testing and that sux0rs... it would
> make your tool the best out there, by a mile..!

The reason for not supporting named groups is that they're not in
Perl. CL-PPCRE, my regex library on which Regex Coach is based, tries
to be Perl-compatible.

Cheers,
Edi.
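A note on the named-group thread above: the angle-bracketed group names in the quoted pattern appear to have been lost when the message was archived, so they cannot be shown here. The sketch below is only an illustration in Python, whose `re` module uses the `(?P<name>...)` syntax the request refers to. The group names (`publisher`, `downloads`), the sample HTML fragment and the helper `strip_group_names` are all hypothetical and are not part of Regex Coach, CL-PPCRE or PHP. It shows one possible way to automate the "strip out all names before testing" workaround, so that a pattern can be pasted into a tool that only accepts unnamed groups.

```python
import re

# Hypothetical pattern using Python-style named groups (names are made up).
NAMED = (
    r'Publisher:</td>\s*<td>(?P<publisher>.*?)</td>'
    r'.*?Downloads:</td>\s*<td>(?P<downloads>\d{1,3}(?:,\d{3})*)</td>'
)

def strip_group_names(pattern: str) -> str:
    """Rewrite (?P<name>...) as (...): still capturing, but unnamed.

    Naive by design: it does not handle escaped parentheses, character
    classes containing "(?P<", or (?P=name) backreferences.
    """
    return re.sub(r'\(\?P<[A-Za-z_][A-Za-z0-9_]*>', '(', pattern)

# Made-up input, only loosely modelled on the HTML the pattern targets.
SAMPLE = "<td>Publisher:</td> <td>Acme</td><td>Downloads:</td> <td>12,345</td>"

named = re.search(NAMED, SAMPLE, re.S)
unnamed = re.search(strip_group_names(NAMED), SAMPLE, re.S)

print(named.groupdict())   # captures looked up by name
print(unnamed.groups())    # same captures, looked up by position
```

The stripped pattern keeps the groups in the same order, so group 1 and group 2 of the unnamed version correspond to `publisher` and `downloads` in the named one. Non-capturing groups such as `(?:...)` are left alone.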