[cl-ppcre-devel] Byte vectors instead of strings

Edi Weitz edi at agharta.de
Mon Jul 18 00:09:07 UTC 2005


On Sun, 17 Jul 2005 20:02:05 -0400, pete-cl-ppcre at kazmier.com wrote:

> How hard would it be to modify cl-ppcre to work on byte vectors
> instead of strings?  I'm trying to obtain faster performance when
> parsing large log files.  Most of the time spent processing the logs
> is wasted on the creation of strings.  I want to use read-sequence
> with unsigned-byte as the element type to avoid that processing.
> Of course, this means I need a regexp library that can handle byte
> vectors.
>
> As a newbie, is it even worth hacking cl-ppcre to use byte vectors
> or is the difficulty level too high?  I am also considering learning
> FFI and just making an interface to a standard C regexp library
> which will work with bytes.  However, if I can use cl-ppcre, I'd
> prefer that, as it's written in CL.

Hi Pete!

If I'm not mistaken, this has already been done.  I seem to remember
that someone patched CL-PPCRE to work on arbitrary sequences for the
CLIMACS project.  If you can't find it in the CLIMACS sources, which
should be online somewhere, you could ask Robert Strandh - he should
know about it.  Google will find his homepage.  Maybe there's also an
initial conversation about this topic in the archives of this mailing
list.

Sorry that I can't be more helpful at the moment but I'm in a hurry.

Cheers,
Edi.

PS: And in case you have to do it yourself: It shouldn't be /too/ hard
    but maybe a bit tedious.


