From Jaroslaw.Tomczak at sanofi-aventis.com Mon Jul 4 11:24:54 2005
From: Jaroslaw.Tomczak at sanofi-aventis.com (Jaroslaw.Tomczak at sanofi-aventis.com)
Date: Mon, 4 Jul 2005 13:24:54 +0200
Subject: [tbnl-devel] Upload of text files
Message-ID:

Hi,

I have noticed a minor problem when uploading text files with TBNL
(version 0.5.5 + Apache 2.0.54 + mod_lisp2 + LispWorks 4.4.5, all on
Windows 2000) -- DOS/Windows text files (CR/LF) have one empty line
appended at the end when uploaded. Unix text files, however, are
transferred correctly.

Any ideas?

Jaroslaw Tomczak

P.S. Edi, thanks for your great libraries!

--
Dr. Jaroslaw Tomczak
Aventis Pharma Deutschland GmbH
A company of the sanofi-aventis group
SMA Chemical Sciences Drug Design I
Industriepark Hoechst, H840 room 104
D-65926 Frankfurt / Main
phone: +49 (69) 305-14710
fax: +49 (69) 305-13068

From edi at agharta.de Tue Jul 5 11:07:45 2005
From: edi at agharta.de (Edi Weitz)
Date: Tue, 05 Jul 2005 13:07:45 +0200
Subject: [tbnl-devel] Upload of text files
In-Reply-To: (Jaroslaw Tomczak's message of "Mon, 4 Jul 2005 13:24:54 +0200")
References:
Message-ID:

Hi!

On Mon, 4 Jul 2005 13:24:54 +0200, wrote:

> I have noticed a minor problem when uploading text files with TBNL
> (version 0.5.5 + Apache 2.0.54 + mod_lisp2 + LispWorks 4.4.5, all on
> Windows 2000) -- DOS/Windows text files (CR/LF) have one empty line
> appended at the end when uploaded. Unix text files, however, are
> transferred correctly.

Thanks for the report, I wasn't aware of this.

> Any ideas?

I sniffed around a bit and it looks like the problem is in the RFC
2388 library TBNL uses.
As a workaround try to replace the part of rfc2388.lisp which looks
like

  (3 ;; first dash in dash-boundary
   (cond ((char= char #\-)
          (enqueue-char)
          (setq state 4))
         (t
          (write-queued-chars)
          (write-char char result)
          (setq state 1))))

with this one:

  (3 ;; first dash in dash-boundary
   (cond ((char= char #\-)
          (enqueue-char)
          (setq state 4))
         ((char= char #\return)
          (write-queued-chars)
          (enqueue-char)
          (setq state 2))
         (t
          (write-queued-chars)
          (write-char char result)
          (setq state 1))))

I haven't really tested if this breaks anything else but at least it
seems to get rid of your problem. Let me know if it works for you.

I'll send a bug report to Janis.

> P.S. Edi, thanks for your great libraries!

You're welcome.

Cheers,
Edi.

From Jaroslaw.Tomczak at sanofi-aventis.com Tue Jul 5 13:58:42 2005
From: Jaroslaw.Tomczak at sanofi-aventis.com (Jaroslaw.Tomczak at sanofi-aventis.com)
Date: Tue, 5 Jul 2005 15:58:42 +0200
Subject: Re: [tbnl-devel] Upload of text files
Message-ID:

Hi Edi,

yes it seems to work for me. Thank you!

Jaroslaw

From hutch at recursive.ca Thu Jul 14 14:38:20 2005
From: hutch at recursive.ca (Bob Hutchison)
Date: Thu, 14 Jul 2005 10:38:20 -0400
Subject: [tbnl-devel] Serving static files
In-Reply-To: <051570acb557f5745ecbbfdf4843de07@recursive.ca>
References: <07de62902ae5369cfc2d86239d85f955@recursive.ca> <20050610133857.GA9758@parsec.no-spoon.de> <051570acb557f5745ecbbfdf4843de07@recursive.ca>
Message-ID: <1c366938db6df8f29738c3650a60979c@recursive.ca>

Hi,

I have implemented a handler for serving static files using TBNL. I
ended up doing this differently than I expected.

I extended create-prefix-dispatcher, calling the result
create-prefix-dispatcher/2, and added a little macro,
prefix-dispatcher/2, to simplify things a bit.
create-prefix-dispatcher/2 does everything that
create-prefix-dispatcher does (in the same way) and is intended to be
a drop-in replacement.

create-prefix-dispatcher/2 looks at its page-function argument and
handles symbols exactly as create-prefix-dispatcher did, but conses
are handled differently. When create-prefix-dispatcher/2 encounters a
page function that is a cons it assumes that it is a lambda with one
argument that returns a lambda of no arguments.
create-prefix-dispatcher/2 uses gensym to generate a symbol for the
function, then setfs its symbol-function to the lambda returned by
(funcall (eval page-function) prefix). The generated symbol is then
handled exactly as though that symbol had been passed in as
page-function. This seems a bit awkward to me, but it works.

The static-directory/*-handler takes 4 arguments:

  prefix -- the url prefix that matched the script-name. When this is
    removed from the front of the script-name, we are left with the
    path of the requested file relative to directory-path.

  directory-path -- the root directory of the files to be served

  default-type -- the default mime-type (can be nil)

  file-type-map -- an assoc list of file name extensions (e.g. ".gif")
    and mime types.

If the content-type cannot be determined it is not set (maybe not the
best idea, but...)

I've not been able to test this on anything other than LWM using OS/X
10.3.9 -- so only tbnl-bivalent-streams has been tested at all.

Sorry, the documentation is a bit weak. There is an example near the
end of how to use this stuff.

Hope somebody finds this useful.

Cheers,
Bob

-------

(defun static-directory/*-handler (prefix directory-path default-type file-type-map)
  "A TBNL handler that will serve static files located relative to a
directory. 'prefix' is what TBNL matched to the script-name (this
match provided the excuse to call this handler). If we remove the
prefix from the front of the script-name we get the path, relative to
'directory-path', that identifies the file. 'default-type' is the
default mime-type for the file, nil is okay. 'file-type-map' is an
assoc list of file extensions and mime types."
  (labels ((determine-content-type (relative-file-path)
             ;; first entry whose extension matches the end of the path
             (or (cdr (find-if (lambda (pair)
                                 (zerop (mismatch (car pair) relative-file-path
                                                  :from-end t)))
                               file-type-map))
                 default-type)))
    (let* ((script-name (script-name))
           (relative-file-path (subseq script-name (mismatch prefix script-name)))
           (path (concatenate 'string directory-path relative-file-path))
           ;; only ask for the write date of an existing file, since
           ;; FILE-WRITE-DATE may signal an error for a missing one
           (time (or (and (probe-file path) (file-write-date path))
                     (get-universal-time)))
           (content-type (determine-content-type relative-file-path)))
      (when content-type
        (setf (content-type) content-type))
      (unless (probe-file path)
        (setf (return-code) +http-not-found+)
        (throw 'tbnl-handler-done nil))
      #+:tbnl-bivalent-streams
      (progn
        (handle-if-modified-since time)
        ;; read the whole file as octets and return the buffer
        (with-open-file (file path :direction :input
                                   :element-type '(unsigned-byte 8)
                                   :if-does-not-exist nil)
          (let* ((len (file-length file))
                 (buf (make-array len :element-type '(unsigned-byte 8))))
            (read-sequence buf file)
            (setf (header-out "Last-Modified") (rfc-1123-date time))
            buf)))
      #-:tbnl-bivalent-streams
      (let ((buf (make-array 8192 :element-type 'character)))
        (handle-if-modified-since time)
        ;; copy the file in 8192-character chunks into a string
        (let ((str (with-output-to-string (out)
                     (with-open-file (file path :direction :input
                                                :if-does-not-exist nil)
                       (loop for pos = (read-sequence buf file)
                             until (zerop pos)
                             do (write-sequence buf out :end pos))))))
          (setf (header-out "Last-Modified") (rfc-1123-date time))
          str)))))

(defmacro prefix-dispatcher/2 (fn &rest args)
  "Constructs a function of one argument, prefix, that returns a
function of no arguments that calls fn with prefix as the first
argument followed by the args."
  `(lambda (prefix)
     (lambda ()
       (,fn prefix ,@args))))

(defun create-prefix-dispatcher/2 (prefix page-function)
  "Creates a dispatch function which will dispatch to the function
denoted by PAGE-FUNCTION if the file name of the current request
starts with the string PREFIX. This is exactly what
create-prefix-dispatcher does. However, if page-function is a cons,
then it must be of the form:

  (lambda (prefix) (lambda () ...))

This lambda serves as the page function."
  (when (consp page-function)
    (let ((fn (gensym "handler"))
          (fv (funcall (eval page-function) prefix)))
      (setf (symbol-function fn) fv)
      (setf page-function fn)))
  (lambda (request)
    (let ((mismatch (mismatch (script-name request) prefix :test #'char=)))
      (and (or (null mismatch)
               (>= mismatch (length prefix)))
           page-function))))

(setq *dispatch-table*
      (nconc
       (mapcar (lambda (args)
                 (apply #'create-prefix-dispatcher/2 args))
               '(("/sienna/image/"
                  (prefix-dispatcher/2 static-directory/*-handler "images/" nil
                                       '((".jpg" . "image/jpeg")
                                         (".jpeg" . "image/jpeg")
                                         (".gif" . "image/gif"))))
                 ("/sienna/images/"
                  (prefix-dispatcher/2 static-directory/*-handler "images/" nil
                                       '((".jpg" . "image/jpeg")
                                         (".jpeg" . "image/jpeg")
                                         (".gif" . "image/gif"))))
                 ("/sienna/css/"
                  (prefix-dispatcher/2 static-directory/*-handler "css/" nil
                                       '((".js" . "text/javascript")
                                         (".css" . "text/css")
                                         (".gif" . "image/gif"))))
                 ("/sienna/item/" show-item)))
       (list #'default-dispatcher)))

From ivan4th at gmail.com Wed Jul 27 23:11:52 2005
From: ivan4th at gmail.com (Ivan Shvedunov)
Date: Thu, 28 Jul 2005 03:11:52 +0400
Subject: [tbnl-devel] UTF-8 problems -- patch
Message-ID: <4d93c5bf05072716113df7032a@mail.gmail.com>

Hello.

Well, I've promised this patch somewhat earlier, but I didn't have
time to complete it...

I've discovered several problems with TBNL's handling of UTF-8.
Namely, there was a problem with url-decode in util.lisp which was
turning UTF-8 urlencoded strings into something incomprehensible, and
also there was a problem with Content-Length in modlisp.lisp which was
causing UTF-8 content to be truncated.

The attached patch works only with SBCL. I mean that it shouldn't
break other Lisps, but proper unicode handling is implemented only for
SBCL. I've tried to make it work with Allegro demo/LispWorks Personal
Edition, but with no luck.
Well, concerning Allegro, the problem here is that sockets that are
used to talk to mod_lisp are set to latin-1 encoding for some reason,
most likely KMRCL needs to be fixed a bit, again, unfortunately I just
have no time to complete this. As for LispWorks, I just don't know how
to turn a string into a series of octets and vice versa using the
current encoding - i.e. I didn't find something like Allegro/SBCL
octets-to-string/string-to-octets there.

Concerning implementation - I've introduced a :tbnl-unicode feature
that is set for supported Unicode-aware Lisps in specials.lisp (I'm
setting it for Allegro and SBCL, though it doesn't help Allegro much).
Also I've added supporting funcs, bytes-to-string and string-to-bytes
(defined only when #+tbnl-unicode) that do the dirty job of string
conversion.

Ivan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tbnl-0.5.5-unicode.patch
Type: text/x-patch
Size: 3861 bytes
Desc: not available
URL:

From edi at agharta.de Thu Jul 28 17:37:20 2005
From: edi at agharta.de (Edi Weitz)
Date: Thu, 28 Jul 2005 19:37:20 +0200
Subject: [tbnl-devel] UTF-8 problems -- patch
In-Reply-To: <4d93c5bf05072716113df7032a@mail.gmail.com> (Ivan Shvedunov's message of "Thu, 28 Jul 2005 03:11:52 +0400")
References: <4d93c5bf05072716113df7032a@mail.gmail.com>
Message-ID:

Hi!

On Thu, 28 Jul 2005 03:11:52 +0400, Ivan Shvedunov wrote:

> Well, I've promised this patch somewhat earlier, but I didn't have
> time to complete it...

Thanks for the patch. See my comments below.

> I've discovered several problems with TBNL's handling of
> UTF-8. Namely, there was a problem with url-decode in util.lisp
> which was turning UTF-8 urlencoded strings into something
> incomprehensible,

Note that you're calling COERCE twice in your version of URL-DECODE.

> and also there was problem with Content-Length in modlisp.lisp which
> was causing UTF-8 content to be truncated.
>
> The attached patch works only with SBCL.
> I mean that it shouldn't
> break other Lisps, but proper unicode handling is implemented only
> for SBCL. I've tried to make it work with Allegro demo/LispWorks
> Personal Edition, but with no luck. Well, concerning Allegro, the
> problem here is that sockets that are used to talk to mod_lisp are
> set to latin-1 encoding for some reason, most likely KMRCL needs to
> be fixed a bit, again, unfortunately I just have no time to
> complete this. As for LispWorks, I just don't know how to turn a
> string into series of octets and vice versa using current encoding -
> i.e. I didn't find something like Allegro/SBCL
> octets-to-string/string-to-octets there.

The file test/test.lisp demonstrates the usage of

  external-format:encode-lisp-string

for LispWorks. See also

> Concerning implementation - I've introduced :tbnl-unicode feature
> that is set for supported Unicode-aware Lisps in specials.lisp (I'm
> setting it for Allegro and SBCL, though it doesn't help Allegro
> much).

My main concern is that at the moment the external format is kind of
hard-coded into TBNL (or relying on some global setting), so if for
example you use UTF-8 you can't serve binary content like JPGs
anymore. Wouldn't it be better if content were always sent as a
sequence of octets? (That would also solve the AllegroCL problem you
mention above.)

> Also I've added supporting funcs, bytes-to-string and
> string-to-bytes (defined only when #+tbnl-unicode) that do the dirty
> job of string conversion.

I'd prefer if they were called "bytes" and not "octets" because a byte
doesn't necessarily have 8 bits.

They should also be exported from the TBNL package, shouldn't they?

Thanks,
Edi.
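
The Content-Length truncation discussed in this thread is what happens
when a header counts characters while the wire carries octets. The
following is only a minimal sketch of that difference, not code from
TBNL or from the attached patch: utf-8-octet-length is a hypothetical
helper, and it assumes a Lisp whose CHAR-CODE returns Unicode code
points (as is the case in SBCL).

```lisp
;; Hypothetical helper, not part of TBNL: counts the octets a string
;; needs in UTF-8. Assumes CHAR-CODE returns Unicode code points.
(defun utf-8-octet-length (string)
  "Number of octets STRING occupies when encoded as UTF-8."
  (loop for char across string
        for code = (char-code char)
        sum (cond ((< code #x80) 1)       ; ASCII range
                  ((< code #x800) 2)      ; e.g. Latin-1 letters, Cyrillic
                  ((< code #x10000) 3)    ; rest of the BMP
                  (t 4))))                ; supplementary planes
```

For pure ASCII content the result equals LENGTH, which is why the
problem only shows up with non-ASCII text: a one-character string
holding code point 228 already needs two octets, so a Content-Length
computed with LENGTH would cut the body short.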
From edi at agharta.de Thu Jul 28 18:18:08 2005
From: edi at agharta.de (Edi Weitz)
Date: Thu, 28 Jul 2005 20:18:08 +0200
Subject: [tbnl-devel] UTF-8 problems -- patch
In-Reply-To: <4d93c5bf050728110360e7877d@mail.gmail.com> (Ivan Shvedunov's message of "Thu, 28 Jul 2005 22:03:56 +0400")
References: <4d93c5bf05072716113df7032a@mail.gmail.com> <4d93c5bf050728110360e7877d@mail.gmail.com>
Message-ID:

On Thu, 28 Jul 2005 22:03:56 +0400, Ivan Shvedunov wrote:

> Well, I hope that (coerce bytes '(vector (unsigned-byte 8))) in
> bytes-to-string doesn't add much overhead when bytes are already
> '(vector (unsigned-byte 8)),

Me too... :)

> but it allows one to pass just a vector of numbers there without
> making Lisp complain about it.

OK.

>> I'd prefer if they were called "bytes" and not "octets" because a
>> byte doesn't necessarily have 8 bits.
>
> ? They _are_ called "bytes"...

I meant: I'd prefer if they were called "octets" and not "bytes,"
sorry.

> I'll try to build a more elaborate patch, but probably this will
> happen no earlier than next week.

Thanks!

Cheers,
Edi.

From ivan4th at gmail.com Thu Jul 28 18:03:56 2005
From: ivan4th at gmail.com (Ivan Shvedunov)
Date: Thu, 28 Jul 2005 22:03:56 +0400
Subject: [tbnl-devel] UTF-8 problems -- patch
In-Reply-To:
References: <4d93c5bf05072716113df7032a@mail.gmail.com>
Message-ID: <4d93c5bf050728110360e7877d@mail.gmail.com>

Hi.

On 7/28/05, Edi Weitz wrote:
> Hi!
>
> On Thu, 28 Jul 2005 03:11:52 +0400, Ivan Shvedunov wrote:
>
> > Well, I've promised this patch somewhat earlier, but I didn't have
> > time to complete it...
>
> Thanks for the patch. See my comments below.

You're welcome :)

> > I've discovered several problems with TBNL's handling of
> > UTF-8. Namely, there was a problem with url-decode in util.lisp
> > which was turning UTF-8 urlencoded strings into something
> > incomprehensible,
>
> Note that you're calling COERCE twice in your version of URL-DECODE.

Well, I hope that (coerce bytes '(vector (unsigned-byte 8))) in
bytes-to-string doesn't add much overhead when bytes are already
'(vector (unsigned-byte 8)), but it allows one to pass just a vector
of numbers there without making Lisp complain about it.

> > and also there was problem with Content-Length in modlisp.lisp which
> > was causing UTF-8 content to be truncated.
> >
> > The attached patch works only with SBCL. I mean that it shouldn't
> > break other Lisps, but proper unicode handling is implemented only
> > for SBCL. I've tried to make it work with Allegro demo/LispWorks
> > Personal Edition, but with no luck. Well, concerning Allegro, the
> > problem here is that sockets that are used to talk to mod_lisp are
> > set to latin-1 encoding for some reason, most likely KMRCL needs to
> > be fixed a bit, again, unfortunately I just have no time to
> > complete this. As for LispWorks, I just don't know how to turn a
> > string into series of octets and vice versa using current encoding -
> > i.e. I didn't find something like Allegro/SBCL
> > octets-to-string/string-to-octets there.
>
> The file test/test.lisp demonstrates the usage of
>
>   external-format:encode-lisp-string
>
> for LispWorks. See also
>

Thanks for the pointer, I'll look at it.

> > Concerning implementation - I've introduced :tbnl-unicode feature
> > that is set for supported Unicode-aware Lisps in specials.lisp (I'm
> > setting it for Allegro and SBCL, though it doesn't help Allegro
> > much).
>
> My main concern is that at the moment the external format is kind of
> hard-coded into TBNL (or relying on some global setting), so if for
> example you use UTF-8 you can't serve binary content like JPGs
> anymore. Wouldn't it be better if content were always sent as a
> sequence of octets? (That would also solve the AllegroCL problem you
> mention above.)

I think this will be DEFINITELY better. I just haven't studied the
TBNL sources enough and don't know whether this will require a lot of
changes.
Well, it's possible to make simple versions of bytes-to-string and
string-to-bytes funcs for non-Unicode lisps (utilizing
char-code/code-char) and then convert the code to binary output mode.

> > Also I've added supporting funcs, bytes-to-string and
> > string-to-bytes (defined only when #+tbnl-unicode) that do the dirty
> > job of string conversion.
>
> I'd prefer if they were called "bytes" and not "octets" because a byte
> doesn't necessarily have 8 bits.

? They _are_ called "bytes"...

> They should also be exported from the TBNL package, shouldn't they?

Yes, I think they can be useful.

I'll try to build a more elaborate patch, but probably this will
happen no earlier than next week.

Ivan.
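
The char-code/code-char fallback suggested above for non-Unicode Lisps
might look roughly like the sketch below. It is not taken from the
patch: simple-string-to-octets and simple-octets-to-string are
hypothetical names (following the "octets" naming preferred earlier in
the thread), and the sketch assumes every character code fits into a
single octet, which amounts to a Latin-1 style mapping.

```lisp
;; Hypothetical fallback helpers, not part of TBNL or of the patch.
;; Assumption: one character maps to exactly one octet via CHAR-CODE.
(defun simple-string-to-octets (string)
  "Convert STRING to a vector of octets, one octet per character."
  (let ((octets (make-array (length string)
                            :element-type '(unsigned-byte 8))))
    (loop for char across string
          for i from 0
          for code = (char-code char)
          do (when (> code 255)
               ;; no single-octet representation for this character
               (error "Character ~S does not fit into one octet." char))
             (setf (aref octets i) code))
    octets))

(defun simple-octets-to-string (octets)
  "Convert a sequence of octets back to a string, one character per octet."
  (map 'string #'code-char octets))
```

With these two functions a round trip such as
(simple-octets-to-string (simple-string-to-octets "abc")) gives back
the original string, and (length (simple-string-to-octets body)) is
then the octet count a binary output mode would need for
Content-Length.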