From ctdean at sokitomi.com Sat Feb 3 02:27:34 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Fri, 02 Feb 2007 18:27:34 -0800 Subject: [drakma-devel] Closing streams with :want-stream t Message-ID: I have a file handle leak in my code (that is, I run out of file handles after running for a while) and I suspect that I'm not using drakma correctly. I wish to use the :want-stream t parameter and read the resulting stream directly, so I have this code: (defun simple-get (url) "Download the url using GET and return the body as a string." (handler-case (multiple-value-bind (stream code headers dummy-uri dummy-stream must-close?) (drakma:http-request url :want-stream t :keep-alive nil :method :get) (declare (ignore headers dummy-stream dummy-uri)) (unwind-protect (and stream code (= code 200) (with-output-to-string (out) (do ((ch (read-char stream nil :eof) (read-char stream nil :eof))) ((not (characterp ch))) (princ ch out)))) (when (and stream must-close?) (ignore-errors (close stream))))) (error (condition) (format t "Error ~s: ~a~%" url condition) nil))) Is there anything extra I need to do to make sure that all the streams opened by drakma are closed? My production code is much more complex, but the simple stub above will generate the out of file handles problem. Besides the actual error I can use lsof on Linux and Mac OS and see many sockets stuck in CLOSED or CLOSE_WAIT states. This is all under LispWorks 5.0.1 Also, when using :want-stream nil I never encounter the problem. Cheers, Chris Dean From vodonosov at mail.ru Sat Feb 3 13:33:28 2007 From: vodonosov at mail.ru (Anton Vodonosov) Date: Sat, 03 Feb 2007 15:33:28 +0200 Subject: [drakma-devel] Closing streams with :want-stream t In-Reply-To: References: Message-ID: <45C48F28.1020302@mail.ru> Hi, Dean. I've tried your code, but I can't reproduce socket handle leak. I'm on Windows + Clisp. 
I've made several calls to (SIMPLE-GET "http://microsoft.com") and (SIMPLE-GET "http://google.com"); and watching sockets using netstat. All sockets are closed properly. What URLs lead to socket handle leak? May it be that URLs you use point to servers that use http 1.1 but don't return close http header properly? As far as I understand, MUST-CLOSE? = NIL means that stream may be reused in further calls of HTTP-REQUEST for the same server. If so and your are not intended to resuse stream in further calls, you can always CLOSE it. Try to always CLOSE returned streams, without regard to MUST-CLOSE?. Regards, -Anton From edi at agharta.de Sat Feb 3 15:09:48 2007 From: edi at agharta.de (Edi Weitz) Date: Sat, 03 Feb 2007 16:09:48 +0100 Subject: [drakma-devel] Closing streams with :want-stream t In-Reply-To: (Chris Dean's message of "Fri, 02 Feb 2007 18:27:34 -0800") References: Message-ID: On Fri, 02 Feb 2007 18:27:34 -0800, Chris Dean wrote: > I have a file handle leak in my code (that is, I run out of file > handles after running for a while) and I suspect that I'm not using > drakma correctly. > > I wish to use the :want-stream t parameter and read the resulting > stream directly, so I have this code: > > (defun simple-get (url) > "Download the url using GET and return the body as a string." > (handler-case > (multiple-value-bind (stream code headers dummy-uri dummy-stream > must-close?) > (drakma:http-request url :want-stream t :keep-alive nil > :method :get) > (declare (ignore headers dummy-stream dummy-uri)) > (unwind-protect > (and stream > code > (= code 200) > (with-output-to-string (out) > (do ((ch (read-char stream nil :eof) > (read-char stream nil :eof))) > ((not (characterp ch))) > (princ ch out)))) > (when (and stream must-close?) > (ignore-errors (close stream))))) > (error (condition) > (format t "Error ~s: ~a~%" url condition) > nil))) > > Is there anything extra I need to do to make sure that all the > streams opened by drakma are closed? 
> > My production code is much more complex, but the simple stub above > will generate the out of file handles problem. Besides the actual > error I can use lsof on Linux and Mac OS and see many sockets stuck > in CLOSED or CLOSE_WAIT states. This is all under LispWorks 5.0.1 > > Also, when using :want-stream nil I never encounter the problem. The meaning of the sixth return value (MUST-CLOSE) is that you're not allowed to re-use the stream, because according to the reply headers the server will close the stream on its side. However, if you do /not/ want to re-use the stream (which is obviously the case in your example as your function doesn't return the stream), you must of course always close it. Drakma can't close it for you as it doesn't know when you're done with it, and why would you want to keep an open stream hanging around in your image that can't be accessed by your code anyway? In other words: It should be (when stream (ignore-errors (close stream))))) above. I'll re-word this in the documentation to make it more clear (hopefully). HTH, Edi. From edi at agharta.de Sat Feb 3 15:37:48 2007 From: edi at agharta.de (Edi Weitz) Date: Sat, 03 Feb 2007 16:37:48 +0100 Subject: [drakma-devel] Closing streams with :want-stream t In-Reply-To: (Edi Weitz's message of "Sat, 03 Feb 2007 16:09:48 +0100") References: Message-ID: On Sat, 03 Feb 2007 16:09:48 +0100, Edi Weitz wrote: > I'll re-word this in the documentation to make it more clear > (hopefully). Although, after reading it once more, I think the documentation was already pretty clear: HTTP-REQUEST will always close the stream to the server before it returns unless WANT-STREAM is true or if the headers exchanged between Drakma and the server determine that the connection will be kept alive - for example if both client and server used the HTTP 1.1 protocol and no explicit "Connection: close" header was sent. In these cases /you/ will have to close the stream manually. [...] 
If WANT-STREAM is true, the message body is not read and instead the (open) socket stream is returned as the first return value. If the sixth value of HTTP-REQUEST is true, the stream should be closed (and not be re-used) after the body has been read. Anyway, I'll try to be even more precise... :) From ctdean at sokitomi.com Sat Feb 3 20:42:51 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Sat, 03 Feb 2007 12:42:51 -0800 Subject: [drakma-devel] Closing streams with :want-stream t In-Reply-To: <45C48F28.1020302@mail.ru> (Anton Vodonosov's message of "Sat, 03 Feb 2007 15:33:28 +0200") References: <45C48F28.1020302@mail.ru> Message-ID: Anton Vodonosov writes: > May it be that URLs you use point to servers that use http 1.1 but > don't return close http header properly? The problem is very data dependent and I have a test set of 1647 urls that exercises the leak. I can send the data in a private email if you wish. Thanks for running a test on Windows. Cheers, Chris Dean From ctdean at sokitomi.com Sat Feb 3 21:10:31 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Sat, 03 Feb 2007 13:10:31 -0800 Subject: [drakma-devel] Closing streams with :want-stream t In-Reply-To: (Edi Weitz's message of "Sat, 03 Feb 2007 16:09:48 +0100") References: Message-ID: Edi Weitz writes: > However, if you do /not/ want to re-use the stream (which is obviously > the case in your example as your function doesn't return the stream), > you must of course always close it. Sure, of course. > (when stream > (ignore-errors (close stream))))) Fair enough. FWIW, my production code looks exactly like this. (During my debugging I noticed that must-close was always t in my case.) Regardless, if I make that change I still see the leak. I have a data set I can send off-list if anyone is interested. Cheers, Chris Dean (defun simple-get (url) "Download the url using GET and return the body as a string." 
(handler-case (multiple-value-bind (stream code) (drakma:http-request url :want-stream t :keep-alive nil :method :get) (unwind-protect (and stream code (= code 200) (with-output-to-string (out) (do ((ch (read-char stream nil :eof) (read-char stream nil :eof))) ((not (characterp ch))) (princ ch out)))) (when stream (ignore-errors (close stream))))) (error (condition) (format t "Error ~s: ~a~%" url condition) nil))) From edi at agharta.de Sun Feb 4 23:25:08 2007 From: edi at agharta.de (Edi Weitz) Date: Mon, 05 Feb 2007 00:25:08 +0100 Subject: [drakma-devel] Re: drakma/chunga problem. In-Reply-To: (Asbjørn Bjørnstad's message of "Sun, 4 Feb 2007 13:45:25 +0800") References: Message-ID: Hi! On Sun, 4 Feb 2007 13:45:25 +0800, "Asbjørn Bjørnstad" wrote: > I'm not sure whether this is a bug or not. [Please use the mailing list to report bugs. See Cc.] > I'm planning to set up automatic bug reporting into trac. > (http://trac.edgewall.com) > > Posting the message works, but if I try to add a cookie jar, I get > an error. (The cookie jar is empty before the call, if that is not > how it's supposed to be used, you can safely ignore this.) As I said > it's working without cookies, and I don't plan to use cookies, just > thought you might want to know about a possible bug. This is with > the latest version of chunga and drakma. > > Backtrace attached as I don't know if gmail might break lines and > make it unreadable (Password changed, if you want a trac account to > test yourself, it can be arranged.) I think this is the relevant part of the backtrace: Call to CHUNGA:READ-NAME-VALUE-PAIR (offset 78) STREAM : # CHUNGA::VALUE-REQUIRED-P : NIL CHUNGA::COOKIE-SYNTAX : T Call to CHUNGA:READ-NAME-VALUE-PAIRS (offset 449) STREAM : # CHUNGA::VALUE-REQUIRED-P : NIL CHUNGA::COOKIE-SYNTAX : T CHAR : #\; DBG::|accumulator-| : (NIL ("expires" . "Sat, 05-May-2007 05:11:54 GMT") ("Path" . "/realtist")) DBG::|aux-var-| : (("Path" . 
"/realtist")) Binding frame: CHUNGA:*CURRENT-ERROR-MESSAGE* : NIL Catch frame: # Call to DRAKMA::PARSE-SET-COOKIE (offset 426) STRING : "trac_session=20ae843edfe4ed8c7a3815ec; expires=Sat, 05-May-2007 05:11:54 GMT; Path=/realtist;" DBG::OBJ : # DBG::DESC : (# T :DONT-CARE) STREAM : # CHUNGA:*CURRENT-ERROR-MESSAGE* : "While parsing cookie header \"trac_session=20ae843edfe4ed8c7a3815ec; expires=Sat, 05-May-2007 05:11:54 GMT; Path=/realtist;\":" FIRST : T DRAKMA::NEXT : #\t DRAKMA::NAME/VALUE : ("trac_session" . "20ae843edfe4ed8c7a3815ec") DRAKMA::PARAMETERS : NIL DBG::|accumulator-| : (NIL) DBG::|aux-var-| : (NIL) It seems the problem is the semicolon at the end and that this is a bug that was fixed in Chunga 0.2.3. Are you sure you're using the latest version? http://weitz.de/chunga/CHANGELOG.txt Cheers, Edi. From edi at agharta.de Mon Feb 5 00:22:02 2007 From: edi at agharta.de (Edi Weitz) Date: Mon, 05 Feb 2007 01:22:02 +0100 Subject: [drakma-devel] New version 0.5.5 (Was: Closing streams with :want-stream t) In-Reply-To: (Chris Dean's message of "Sat, 03 Feb 2007 13:10:31 -0800") References: Message-ID: On Sat, 03 Feb 2007 13:10:31 -0800, Chris Dean wrote: > Regardless, if I make that change I still see the leak. > > I have a data set I can send off-list if anyone is interested. OK, thanks for the data. I think I've found the leak: It happened if there was a redirect and the server explicitely wanted to close the first connection, the one which sent the 302. Drakma realized that it couldn't re-use the socket stream and created a new one, but it "forgot" to close the old one. Please try the new release and see if you still have the same problems. Thanks for the bug report, Edi. 
From ctdean at sokitomi.com Mon Feb 5 00:28:56 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Sun, 04 Feb 2007 16:28:56 -0800 Subject: [drakma-devel] New version 0.5.5 In-Reply-To: (Edi Weitz's message of "Mon, 05 Feb 2007 01:22:02 +0100") References: Message-ID: Edi Weitz writes: > Please try the new release and see if you still have the same > problems. I will, and I'll let you know the results of my tests. Cheers, Chris Dean From ctdean at sokitomi.com Mon Feb 5 02:20:05 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Sun, 04 Feb 2007 18:20:05 -0800 Subject: [drakma-devel] New version 0.5.5 In-Reply-To: (Edi Weitz's message of "Mon, 05 Feb 2007 01:22:02 +0100") References: Message-ID: > Please try the new release and see if you still have the same > problems. I've tried it out on my data and it is certainly much better. But there are still some connections left after a run of 1646 urls. I'll take another look at the code later tonight and see if I can discover anything. Also, I'll send along my run data in a separate email to interested parties. Cheers, Chris Dean From ctdean at sokitomi.com Mon Feb 5 07:45:34 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Sun, 04 Feb 2007 23:45:34 -0800 Subject: [drakma-devel] New version 0.5.5 In-Reply-To: (Chris Dean's message of "Sun, 04 Feb 2007 18:20:05 -0800") References: Message-ID: Chris Dean writes: > I've tried it out on my data and it is certainly much better. But > there are still some connections left after a run of 1646 urls. Some urls give an error when parsing the header. We do hit the final unwind-protect in http-request, but since the error occurs during the parsing of the header the caller (me) doesn't have the stream object available to close. One solution is below: set another flag that indicates whether or not to leave the stream open. I'll continue testing in case I come across any other issues. 
Cheers, Chris Dean -------------- next part -------------- A non-text attachment was scrubbed... Name: drakma-force-open.patch Type: text/x-patch Size: 1469 bytes Desc: drakma-force-open.patch URL: From edi at agharta.de Mon Feb 5 12:31:05 2007 From: edi at agharta.de (Edi Weitz) Date: Mon, 05 Feb 2007 13:31:05 +0100 Subject: [drakma-devel] Re: drakma/chunga problem. In-Reply-To: (Asbjørn Bjørnstad's message of "Mon, 5 Feb 2007 19:55:34 +0800") References: Message-ID: [Cc to mailing list.] On Mon, 5 Feb 2007 19:55:34 +0800, "Asbjørn Bjørnstad" wrote: > BTW, I got a bounce back from the mailing list as I am not a member. > Is that intentional? (Could stop some from submitting bug reports.) It's for subscribers only as are almost all mailing lists I know. Yes, you have to subscribe, but I think it's not asking too much if you want free support for software you didn't pay for. The alternative would be that the list would be swamped with spam which is not really an alternative to me. (And I don't like to handle questions and bug reports off list either, because it very often means that you have to say the same thing more than once.) Cheers, Edi. From edi at agharta.de Tue Feb 6 00:46:02 2007 From: edi at agharta.de (Edi Weitz) Date: Tue, 06 Feb 2007 01:46:02 +0100 Subject: [drakma-devel] New version 0.5.5 In-Reply-To: (Chris Dean's message of "Sun, 04 Feb 2007 18:20:05 -0800") References: Message-ID: On Sun, 04 Feb 2007 18:20:05 -0800, Chris Dean wrote: > I've tried it out on my data and it is certainly much better. But > there are still some connections left after a run of 1646 urls. I've now done a full run through the test URLs you sent and they provide for a lot of interesting problematic cases. I'll update Drakma and probably Chunga with bugfixes and/or workarounds in the next days. Thanks, Edi. 
From edi at agharta.de Thu Feb 8 14:30:11 2007 From: edi at agharta.de (Edi Weitz) Date: Thu, 08 Feb 2007 15:30:11 +0100 Subject: [drakma-devel] New Chunga release 0.2.4 Message-ID: ChangeLog: Version 0.2.4 2007-02-08 Allow more characters in cookie names/values according to original Netscape spec Robustified READ-COOKIE-VALUE Download: http://weitz.de/files/chunga.tar.gz Cheers, Edi. From edi at agharta.de Thu Feb 8 14:37:10 2007 From: edi at agharta.de (Edi Weitz) Date: Thu, 08 Feb 2007 15:37:10 +0100 Subject: [drakma-devel] New Drakma release 0.6.0 Message-ID: ChangeLog: Version 0.6.0 2007-02-08 Make sure stream is closed in case of early errors (thanks to Chris Dean for test data) Robustified cookie parsing Send all outgoing cookies in one fell swoop (for Sun's buggy web server) Deal with empty Location headers Deal with corrupted Content-Type headers Download: http://weitz.de/files/drakma.tar.gz Have fun, Edi. From edi at agharta.de Thu Feb 8 14:45:10 2007 From: edi at agharta.de (Edi Weitz) Date: Thu, 08 Feb 2007 15:45:10 +0100 Subject: [drakma-devel] Several fixes and workarounds (Was: New version 0.5.5) In-Reply-To: (Edi Weitz's message of "Tue, 06 Feb 2007 01:46:02 +0100") References: Message-ID: On Tue, 06 Feb 2007 01:46:02 +0100, Edi Weitz wrote: > I've now done a full run through the test URLs you sent and they > provide for a lot of interesting problematic cases. I'll update > Drakma and probably Chunga with bugfixes and/or workarounds in the > next days. OK, see the new releases of Drakma and Chunga. I can now run Drakma (tested on LWW 5.0.1) through Chris Dean's 1600+ test cases with only very few warnings and errors. These are: 1. Charsets like GB2312 that FLEXI-STREAMS doesn't know. 2. Headers sent by the server which are really corrupt. 3. Network-related errors like "Unknown host". 4. Five cases of "End of file while reading ...". I'm only concerned about #4, but unfortunately these aren't reproducible. 
I'll see what I can find out, but if someone has an idea, please step forward. FWIW, this is the function I used for testing: (defun simple-get (url) (handler-case (let ((puri:*strict-parse* nil) (flex:*provide-use-value-restart* t) (flex:*substitution-char* #\?)) (multiple-value-bind (stream code) (drakma:http-request url :cookie-jar (make-instance 'drakma:cookie-jar) :want-stream t) (unwind-protect (and stream (eql code 200) (with-output-to-string (out) (do ((ch (read-char stream nil :eof) (read-char stream nil :eof))) ((not (characterp ch))) (princ ch out)))) (when stream (ignore-errors (close stream :abort t)))))) (error (condition) (format t "~&Error (~A): ~A~%%" url condition) nil))) Edi. From ctdean at sokitomi.com Thu Feb 8 19:01:14 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Thu, 08 Feb 2007 11:01:14 -0800 Subject: [drakma-devel] Several fixes and workarounds In-Reply-To: (Edi Weitz's message of "Thu, 08 Feb 2007 15:45:10 +0100") References: Message-ID: That's great! I'll pull the new versions and run them through my code. Cheers, Chris Dean From saurabhnanda at gmail.com Fri Feb 9 12:01:18 2007 From: saurabhnanda at gmail.com (Saurabh Nanda) Date: Fri, 9 Feb 2007 17:31:18 +0530 Subject: [drakma-devel] Not following redirects and conditions Message-ID: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com> Hi, I'm trying to write some tests for an HTTP based API. I need to check whether a server responds with a 302 status code and then need to check the referred location as well, without actually visiting that link. How is it possible with drakma? If I use :redirect 0 then the http-request function throws up an error. TIA Nandz. -- http://nandz.blogspot.com http://foodieforlife.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From edi at agharta.de Fri Feb 9 12:10:30 2007 From: edi at agharta.de (Edi Weitz) Date: Fri, 09 Feb 2007 13:10:30 +0100 Subject: [drakma-devel] Not following redirects and conditions In-Reply-To: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com> (Saurabh Nanda's message of "Fri, 9 Feb 2007 17:31:18 +0530") References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com> Message-ID: Hi, On Fri, 9 Feb 2007 17:31:18 +0530, "Saurabh Nanda" wrote: > I'm trying to write some tests for an HTTP based API. I need to > check whether a server responds with a 302 status code and then need > to check the referred location as well, without actually visiting > that link. > > How is it possible with drakma? If I use :redirect 0 then the > http-request function throws up an error. How about setting :REDIRECT to NIL? See also :REDIRECT-METHODS. Cheers, Edi. From saurabhnanda at gmail.com Fri Feb 9 12:21:37 2007 From: saurabhnanda at gmail.com (Saurabh Nanda) Date: Fri, 9 Feb 2007 17:51:37 +0530 Subject: [drakma-devel] Not following redirects and conditions In-Reply-To: References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com> Message-ID: <794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com> Super! It works -- I should probably read the documentation more carefully the next time around! Nandz. On 2/9/07, Edi Weitz wrote: > Hi, > > On Fri, 9 Feb 2007 17:31:18 +0530, "Saurabh Nanda" > wrote: > > > I'm trying to write some tests for an HTTP based API. I need to > > check whether a server responds with a 302 status code and then need > > to check the referred location as well, without actually visiting > > that link. > > > > How is it possible with drakma? If I use :redirect 0 then the > > http-request function throws up an error. > > How about setting :REDIRECT to NIL? See also :REDIRECT-METHODS. > > Cheers, > Edi. 
> _______________________________________________ > drakma-devel mailing list > drakma-devel at common-lisp.net > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > -- http://nandz.blogspot.com http://foodieforlife.blogspot.com From saurabhnanda at gmail.com Fri Feb 9 13:19:29 2007 From: saurabhnanda at gmail.com (Saurabh Nanda) Date: Fri, 9 Feb 2007 18:49:29 +0530 Subject: [drakma-devel] Not following redirects and conditions In-Reply-To: <794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com> References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com> <794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com> Message-ID: <794f042d0702090519x5774113bx3522ec73c2705d62@mail.gmail.com> If ":redirect nil" is set and the server responds with a Set-Cooke: header and an HTTP redirect, will the cookie be set? I noticed that when the header is the following the cookie is set -- "Set-Cookie: cookie-name=some-randome-value" But when the header is the following, the cookie is not set -- "Set-Cookie: cookie-name=" Is this correct, or is it some bug in my tests? Regards, Saurabh. On 2/9/07, Saurabh Nanda wrote: > Super! It works -- I should probably read the documentation more > carefully the next time around! > > Nandz. > > On 2/9/07, Edi Weitz wrote: > > Hi, > > > > On Fri, 9 Feb 2007 17:31:18 +0530, "Saurabh Nanda" > > > wrote: > > > > > I'm trying to write some tests for an HTTP based API. I need to > > > check whether a server responds with a 302 status code and then need > > > to check the referred location as well, without actually visiting > > > that link. > > > > > > How is it possible with drakma? If I use :redirect 0 then the > > > http-request function throws up an error. > > > > How about setting :REDIRECT to NIL? See also :REDIRECT-METHODS. > > > > Cheers, > > Edi. 
> > _______________________________________________ > > drakma-devel mailing list > > drakma-devel at common-lisp.net > > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > > > > > -- > http://nandz.blogspot.com > http://foodieforlife.blogspot.com > -- http://nandz.blogspot.com http://foodieforlife.blogspot.com From edi at agharta.de Fri Feb 9 14:19:41 2007 From: edi at agharta.de (Edi Weitz) Date: Fri, 09 Feb 2007 15:19:41 +0100 Subject: [drakma-devel] Not following redirects and conditions In-Reply-To: <794f042d0702090519x5774113bx3522ec73c2705d62@mail.gmail.com> (Saurabh Nanda's message of "Fri, 9 Feb 2007 18:49:29 +0530") References: <794f042d0702090401s7f21b328y706aa892771fdef3@mail.gmail.com> <794f042d0702090421w5aa8b56eid016a53d27d3f3e6@mail.gmail.com> <794f042d0702090519x5774113bx3522ec73c2705d62@mail.gmail.com> Message-ID: On Fri, 9 Feb 2007 18:49:29 +0530, "Saurabh Nanda" wrote: > If ":redirect nil" is set and the server responds with a Set-Cooke: > header and an HTTP redirect, will the cookie be set? > > I noticed that when the header is the following the cookie is set -- > "Set-Cookie: cookie-name=some-randome-value" > > But when the header is the following, the cookie is not set -- > "Set-Cookie: cookie-name=" > > Is this correct, or is it some bug in my tests? The cookie should be set, with an empty string as its value. The value of :REDIRECT should not affect this. If it's not set, it's an error in Drakma and I'd be happy if you could send me a test case to reproduce it. From lispercat at gmail.com Mon Feb 12 20:44:40 2007 From: lispercat at gmail.com (Andrei Stebakov) Date: Mon, 12 Feb 2007 15:44:40 -0500 Subject: [drakma-devel] Problem with file uploading Message-ID: This is exaclty what I need. The GET method works just fine, but I have trouble with the POST method uploading the files. Edi, here is a question (I am not sure if it's the right mailing list to ask it...) 
When I say (this is part of a function, so I use back-quote for parameters): (drakma:http-request "/some/uri" :method :post :form-data t :parameters `(("Name1" . ,name1) ("Name2" . ,name2) ("File" . ,file-name)))) I get an "unknown error" from the remote host. It looks like there is a problem with streaming the file contents. I did a little debugging by printing the contents of the file buffer (in the send-content function); it looks like the file is being opened and read, but something goes wrong at the receiving end. I wonder how I can debug it further. When I do the same request from the FORM in Firefox everything works. What debugging techniques can I try here? (Sorry, I am still very new to Lisp.) Thank you, Andrew From ctdean at sokitomi.com Mon Feb 12 20:53:59 2007 From: ctdean at sokitomi.com (Chris Dean) Date: Mon, 12 Feb 2007 12:53:59 -0800 Subject: [drakma-devel] Problem with file uploading In-Reply-To: (Andrei Stebakov's message of "Mon, 12 Feb 2007 15:44:40 -0500") References: Message-ID: "Andrei Stebakov" writes: > This is exaclty what I need. The GET method works just fine, but I > have trouble with the POST method uploading the files. I've never used POST with drakma. So I can only offer very general advice. One way to debug the system is to test against your own server. You could, for example, use hunchentoot to easily create a test webserver. Once you have control of the server you can debug both sides of the problem. Cheers, Chris Dean From edi at agharta.de Mon Feb 12 21:01:50 2007 From: edi at agharta.de (Edi Weitz) Date: Mon, 12 Feb 2007 22:01:50 +0100 Subject: [drakma-devel] Problem with file uploading In-Reply-To: (Andrei Stebakov's message of "Mon, 12 Feb 2007 15:44:40 -0500") References: Message-ID: On Mon, 12 Feb 2007 15:44:40 -0500, "Andrei Stebakov" wrote: > (drakma:http-request "/some/uri" > :method :post :form-data t For file uploads you don't need the ":FORM-DATA T" part. 
> :parameters `(("Name1" . ,name1) > ("Name2" . ,name2) > ("File" . ,file-name))) Without knowing what "/some/uri", NAME1, NAME2, and FILE-NAME are this is hard to say. Is FILE-NAME really a pathname object? Or maybe the receiving web server can't cope with chunked transfer encoding (like Apache 1.x)? Then you'll have to add :CONTENT-LENGTH T to the call, but note that this will force Drakma to compose the whole request body in RAM before sending it which might not work for /very/ large files. HTH, Edi. From edi at agharta.de Mon Feb 12 21:04:35 2007 From: edi at agharta.de (Edi Weitz) Date: Mon, 12 Feb 2007 22:04:35 +0100 Subject: [drakma-devel] Problem with file uploading In-Reply-To: (Chris Dean's message of "Mon, 12 Feb 2007 12:53:59 -0800") References: Message-ID: On Mon, 12 Feb 2007 12:53:59 -0800, Chris Dean wrote: > One way to debug the system is to test against your own server. You > could, for example, use hunchentoot to easily create a test > webserver. Once you have control of the server you can debug both > sides of the problem. Of course, this won't help much if Hunchentoot and the /real/ server behave differently. (See my other email for an example - Hunchentoot knows how to handle chunked transfer encoding used by clients, Apache 1.x doesn't.) Another way to debug Drakma is to use *HEADER-STREAM* to see at least the headers flying by. http://weitz.de/drakma/#*header-stream* Or use something like Ethereal (or whatever it is called nowadays). 
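[Editorial sketch, not part of the original thread.] The two suggestions above (pass a pathname object together with :CONTENT-LENGTH T, and bind *HEADER-STREAM* to watch the headers) might be combined roughly as follows. The URL, the field names and the file path are placeholders invented for illustration, and the sketch assumes Drakma is already loaded into the image.

```lisp
;; Sketch only: URL, field names and file path are hypothetical placeholders.
;; Assumes the DRAKMA package is already loaded.
(let ((drakma:*header-stream* *standard-output*)) ; echo request/response headers
  (drakma:http-request "http://example.com/image.upload"
                       :method :post
                       ;; T asks Drakma to compose the body in memory and send
                       ;; a Content-Length header instead of using chunked
                       ;; transfer encoding.
                       :content-length t
                       ;; A pathname (not a string) marks the value as a file
                       ;; to upload, so :FORM-DATA T is not needed.
                       :parameters (list (cons "Name1" "some value")
                                         (cons "File" (pathname "/tmp/image.png")))))
```

As noted in the message above, :CONTENT-LENGTH T means the whole request body is held in RAM, so this shape may not suit very large files.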
From lispercat at gmail.com Wed Feb 14 18:56:28 2007 From: lispercat at gmail.com (Andrei Stebakov) Date: Wed, 14 Feb 2007 13:56:28 -0500 Subject: [drakma-devel] Problem with file uploading In-Reply-To: References: Message-ID: The headers printed are following: GET /authentication.getToken.cp?appKey=1234 HTTP/1.1 Host: domain.com User-Agent: Drakma/0.6.0 (CMU Common Lisp CVS release-19a 19a-release-20040728 + minimal debian patches; Linux; Linux version 2.2.20-idepci (herbert at gondolin) (gcc version 2.7.2.3) #1 Sat Apr 20 12:45:19 EST 2002; http://weitz.de/drakma/) Accept: */* Connection: close HTTP/1.1 200 OK Cache-Control: private Content-Length: 74 Content-Type: text/xml Date: Wed, 14 Feb 2007 18:52:53 GMT Connection: close POST /image.upload.cp HTTP/1.1 Host: domain.com User-Agent: Drakma/0.6.0 (CMU Common Lisp CVS release-19a 19a-release-20040728 + minimal debian patches; Linux; Linux version 2.2.20-idepci (herbert at gondolin) (gcc version 2.7.2.3) #1 Sat Apr 20 12:45:19 EST 2002; http://weitz.de/drakma/) Accept: */* Connection: close Content-Type: multipart/form-data; boundary=----------WueD0PVGvZzxvyK3835D6znnVITzpU5zaysqeYq41qhj1Nlv79 Content-Length: 521 HTTP/1.1 100 Continue HTTP/1.1 200 OK Cache-Control: private Content-Length: 94 Content-Type: text/xml; charset=utf-8 Date: Wed, 14 Feb 2007 18:53:01 GMT Set-Cookie: Coyote-2-c0a8017a=c0a8073f:0;Max-Age=1800;Path=/ Connection: close ((:CACHE-CONTROL . "private") (:CONTENT-LENGTH . "94") (:CONTENT-TYPE . "text/xml; charset=utf-8") (:DATE . "Wed, 14 Feb 2007 18:53:01 GMT") (:SET-COOKIE . "Coyote-2-c0a8017a=c0a8073f:0;Max-Age=1800;Path=/") (:CONNECTION . "close")) Maybe, as you mentioned, it's that the server I am trying to upload images to doesn't understand chunked stream? Thank you, Andrew On 2/12/07, Edi Weitz wrote: > > On Mon, 12 Feb 2007 12:53:59 -0800, Chris Dean > wrote: > > > One way to debug the system is to test against your own server. 
You > > could, for example, use hunchentoot to easily create a test > > webserver. Once you have control of the server you can debug both > > sides of the problem. > > Of course, this won't help much if Hunchentoot and the /real/ server > behave differently. (See my other email for an example - Hunchentoot > knows how to handle chunked transfer encoding used by clients, Apache > 1.x doesn't.) > > Another way to debug Drakma it to use *HEADER-STREAM* to see at least > the headers flying by. > > http://weitz.de/drakma/#*header-stream* > > Or use something like Ethereal (or whatever it is called nowadays). > _______________________________________________ > drakma-devel mailing list > drakma-devel at common-lisp.net > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lispercat at gmail.com Wed Feb 14 21:11:55 2007 From: lispercat at gmail.com (Andrei Stebakov) Date: Wed, 14 Feb 2007 16:11:55 -0500 Subject: [drakma-devel] Problem with file uploading In-Reply-To: References: Message-ID: Hi Edi, First of all sorry for bringing it to the lisp NG. I didn't want to discuss it there, I just wanted to hear what libs are available. For some reason, my gmail client didn't show your messages so the last one was the one from Chris. Basically setting :CONTENT-LENGTH T and sending a pathname object instead of string solve the problem. Now I understand that with :CONTENT-LENGTH nil it was sending the chunked data. I still don't understand why when I send the request without :CONTENT-LENGTH T and giving a file name starting with p# the lisp process hangs (cmucl), maybe it's just the lisp implementation. Anyway, the problem solved, thank you Edi and Chris! Andrew On 2/12/07, Edi Weitz wrote: > > On Mon, 12 Feb 2007 12:53:59 -0800, Chris Dean > wrote: > > > One way to debug the system is to test against your own server. 
You > > could, for example, use hunchentoot to easily create a test > > webserver. Once you have control of the server you can debug both > > sides of the problem. > > Of course, this won't help much if Hunchentoot and the /real/ server > behave differently. (See my other email for an example - Hunchentoot > knows how to handle chunked transfer encoding used by clients, Apache > 1.x doesn't.) > > Another way to debug Drakma it to use *HEADER-STREAM* to see at least > the headers flying by. > > http://weitz.de/drakma/#*header-stream* > > Or use something like Ethereal (or whatever it is called nowadays). > _______________________________________________ > drakma-devel mailing list > drakma-devel at common-lisp.net > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From edi at agharta.de Wed Feb 14 22:23:45 2007 From: edi at agharta.de (Edi Weitz) Date: Wed, 14 Feb 2007 23:23:45 +0100 Subject: [drakma-devel] Problem with file uploading In-Reply-To: (Andrei Stebakov's message of "Wed, 14 Feb 2007 16:11:55 -0500") References: Message-ID: On Wed, 14 Feb 2007 16:11:55 -0500, "Andrei Stebakov" wrote: > For some reason, my gmail client didn't show your messages so the > last one was the one from Chris. Maybe they ended up in the spam folder? > Basically setting :CONTENT-LENGTH T and sending a pathname object > instead of string solve the problem. Now I understand that with > :CONTENT-LENGTH nil it was sending the chunked data. Good. > I still don't understand why when I send the request without > :CONTENT-LENGTH T and giving a file name starting with p# the lisp > process hangs (cmucl), maybe it's just the lisp implementation. No, I think that's pretty clear: If the server doesn't understand chunked encoding, then Drakma will try to send the file (and it'll only do that if you're using a pathname, i.e. 
the #P"" syntax), but the server won't accept it - it doesn't know how much it is supposed to read from the stream. So, Drakma tries to send data, but on the other end nobody is reading the data. That's why the whole thing appears to be hanging. You'll probably get a timeout at some point if you wait long enough. From edi at agharta.de Thu Feb 22 18:09:59 2007 From: edi at agharta.de (Edi Weitz) Date: Thu, 22 Feb 2007 19:09:59 +0100 Subject: [drakma-devel] Darcs repositories Message-ID: [My apologies if you get this more than once.] Several people have asked for Darcs repositories of my software. These exist now: http://common-lisp.net/~loliveira/ediware/ Special thanks to Luís Oliveira who made this possible and who maintains the repositories. Cheers, Edi. From jeffrey at cunningham.net Sat Feb 24 17:07:25 2007 From: jeffrey at cunningham.net (Jeffrey Cunningham) Date: Sat, 24 Feb 2007 09:07:25 -0800 Subject: [drakma-devel] Bug handling bad html? Message-ID: <20070224170725.GA23865@achilles.olympus.net> I was playing with drakma and had it drop into the debugger when retrieving a commercial page. It looks like it might be a bug in flexi-streams, but I don't know how to isolate the input more specifically than what came up here: Unexpected value #xA0 at start of UTF-8 sequence. [Condition of type FLEXI-STREAMS:FLEXI-STREAM-ENCODING-ERROR] Restarts: 0: [ABORT] Abort SLIME compilation. 1: [ABORT] Return to SLIME's top level. 2: [TERMINATE-THREAD] Terminate this thread (#) Backtrace: 0: (FLEXI-STREAMS::SIGNAL-ENCODING-ERROR # "Unexpected value #x~X at start of UTF-8 sequence." 160) 1: (FLEXI-STREAMS::SIGNAL-ENCODING-ERROR # "Unexpected value #x~X at start of UTF-8 sequence.") 2: ((FLET #:BODY-FN327)) 3: ((SB-PCL::FAST-METHOD STREAM-READ-CHAR (FLEXI-STREAMS::FLEXI-UTF-8-INPUT-STREAM)) # # #) 4: ((SB-PCL::FAST-METHOD TRIVIAL-GRAY-STREAMS:STREAM-READ-SEQUENCE (FLEXI-STREAMS:FLEXI-INPUT-STREAM #1="#<...>" .
#1#)) # # # # # #) 5: (READ-SEQUENCE "y make a difference this holiday season. Our gift ideas are unique and of high quality.

Gift ideas for every occasion, Christmas, Birthday, Mother's day...
Gift ideas for every occasion, Christmas, Birthday, Mothers day, Graduation, Fathers day, Anniversary, Wedding, & Baby Shower.

Hanukkah card, Christmas gift idea and Holiday greeting cards from MixedBlessing
Greeting Cards for Interfaith and Multicultures from MixedBlesing. Hanukkah cards, Holiday cards, Christmas Gift Ideas, Holiday Gifts and more.. Find great gifts now!

..) 6: (DRAKMA::READ-BODY # ((:DATE . "Sat, 24 Feb 2007 06:30:03 GMT") (:SERVER . "Apache/2.0.46 (Red Hat)") (:SET-COOKIE . "GS_UUID=24.18.193.65.1172298603635841; path=/,PHPSESSID=e009a521cb2bf134a00df925e4f4d510; path=/,cart_hash=e009a521cb2bf134a00df925e4f4d510; expires=Tuesday, 27-Feb-07 06:30:03 GMT; path=/") (:X-POWERED-BY . "PHP/4.4.0") (:EXPIRES . "Thu, 19 Nov 1981 08:52:00 GMT") (:CACHE-CONTROL . "no-store, no-cache, must-revalidate, post-check=0, pre-check=0") ..)) 7: ((LABELS DRAKMA::FINISH-REQUEST) NIL NIL) 8: (HTTP-REQUEST # :PROXY NIL) 9: (RETRIEVE-URI "http://www.gifttree.com/Christmas/Christmas-gift-idea.html" NIL) 10: (WALK-SITE "http://www.gifttree.com/Christmas/Christmas-gift-idea.html" # # # # # #) 11: (SB-FASL::FOP-FUNCALL) 12: (SB-FASL::LOAD-FASL-GROUP #) 13: (SB-FASL::LOAD-AS-FASL # NIL #) 14: (SB-FASL::INTERNAL-LOAD #P"/tmp/fileIQGlqR.fasl" #P"/tmp/fileIQGlqR.fasl" :ERROR NIL NIL :BINARY NIL) 15: (SB-FASL::INTERNAL-LOAD #P"/tmp/fileIQGlqR.fasl" #P"/tmp/fileIQGlqR.fasl" :ERROR NIL NIL NIL :DEFAULT) 16: (LOAD #P"/tmp/fileIQGlqR.fasl") 17: ((LAMBDA (STRING &KEY #1="#<...>" . #1#)) "(print (walk-site \"http://www.gifttree.com\")) " :BUFFER "seo.lisp" :POSITION 27060 :DIRECTORY #) 18: ((LAMBDA ())) --more-- --Jeff From edi at agharta.de Sat Feb 24 20:47:15 2007 From: edi at agharta.de (Edi Weitz) Date: Sat, 24 Feb 2007 21:47:15 +0100 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070224170725.GA23865@achilles.olympus.net> (Jeffrey Cunningham's message of "Sat, 24 Feb 2007 09:07:25 -0800") References: <20070224170725.GA23865@achilles.olympus.net> Message-ID: On Sat, 24 Feb 2007 09:07:25 -0800, Jeffrey Cunningham wrote: > I was playing with drakma and had it drop into the debugger when > retrieving a commercial page. It looks like it might be a bug in > flexi-streams, but I don't know how to isolate the input more > specifically than what came up here: > > Unexpected value #xA0 at start of UTF-8 sequence. 
My guess is that the website sends wrong content-type headers. (Or, in other words, it claims to send UTF-8 but it doesn't.) This is not unusual. See the mailing list archive of the last weeks for similar problems and for workarounds. If you still think this is a bug in FLEXI-STREAMS, send a simple, reproducible test case and point out where in the sequence of characters FLEXI-STREAMS thinks it's not UTF-8 anymore although it is. Thanks, Edi. From edi at agharta.de Sat Feb 24 22:20:36 2007 From: edi at agharta.de (Edi Weitz) Date: Sat, 24 Feb 2007 23:20:36 +0100 Subject: [drakma-devel] Re: Portability of Drakma In-Reply-To: (Erik Huelsmann's message of "Sat, 24 Feb 2007 22:35:10 +0100") References: Message-ID: Hi Erik, I'm sending a copy of this to the mailing list where I think we should continue this discussion. On Sat, 24 Feb 2007 22:35:10 +0100, "Erik Huelsmann" wrote: > I've been working on a very portable library for sockets code. This > library is now more portable than trivial-sockets and supports more > functionality on all of its supported lisp implementations. > > If you want to support the same platforms (and all the ones I'll be > adding), you could switch from the -unmaintained- trivial-sockets to > usocket (http://common-lisp.net/project/usocket/). > > I'm merely sending this mail to point out the existence of the > library, in case you didn't know. Thanks for your time, attention > and continued support for Common Lisp libraries. I'm aware of usockets' existence because Andreas Fuchs pointed it out to me shortly after I had released the portable version of Drakma (using trivial-sockets). At that point I tried to switch to usocket and immediately ran into problems - IIRC it didn't even load on LispWorks on Windows (although ISTR the website claimed that LispWorks was a supported implementation), and it couldn't provide binary socket streams for all supported implementations. So, I dismissed it for the time being. 
It might well be the case that both of these issues have been fixed since, but I currently don't have the time to test again. I generally think it's better to rely on a maintained and documented library than on obscure and old code, but of course the new code should work at least as good as the old one. I'd be happy to accept patches to switch Drakma from trivial-sockets to usocket, but the following criteria should be met: - The LispWorks code should remain untouched (i.e. not use usocket). - The code should have been tested successfully on at least the Lisp/OS combinations that are currently supported by Drakma. The actual patch itself should be a piece of cake, but I guess the testing will take some time. Thanks, Edi. From jeffrey at cunningham.net Sun Feb 25 00:39:54 2007 From: jeffrey at cunningham.net (Jeffrey Cunningham) Date: Sat, 24 Feb 2007 16:39:54 -0800 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: References: <20070224170725.GA23865@achilles.olympus.net> Message-ID: <20070225003954.GA32401@achilles.olympus.net> On Sat Feb 24, 2007 at 09:47:15PM +0100, Edi Weitz wrote: > My guess is that the website sends wrong content-type headers. (Or, > in other words, it claims to send UTF-8 but it doesn't.) This is not > unusual. See the mailing list archive of the last weeks for similar > problems and for workarounds. > > If you still think this is a bug in FLEXI-STREAMS, send a simple, > reproducible test case and point out where in the sequence of > characters FLEXI-STREAMS thinks it's not UTF-8 anymore although it is. I believe you are right - incorrectly identified content-type. This gets it to work: (setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0)) (setf flexi-streams::*PROVIDE-USE-VALUE-RESTART* t) (http-request "http://www.gifttree.com/Christmas/Christmas-gift-idea.html") And I read about the performance hit associated with setting this up as a default. 
But it seems like it raises some issues - at least for what I'm doing, which is trying to automate updating information about some sites I have no control over. In this case I set it to make a substitution for the 'bad' character. Is it possible for there to be more than one? If so, how could that be handled? And more generally, should there not be a way to set drakma so it may take a performance hit but is guaranteed not to die on any html that is thrown at it? Thanks, --Jeff From edi at agharta.de Sun Feb 25 10:25:04 2007 From: edi at agharta.de (Edi Weitz) Date: Sun, 25 Feb 2007 11:25:04 +0100 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070225003954.GA32401@achilles.olympus.net> (Jeffrey Cunningham's message of "Sat, 24 Feb 2007 16:39:54 -0800") References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> Message-ID: On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham wrote: > In this case I set it to make a substitution for the 'bad' > character. Is it possible for there to be more than one? Not yet. See current discussion on the FLEXI-STREAMS mailing list. > And more generally, should there not be a way to set drakma so it > may take a performance hit but is guaranteed not to die on any html > that is thrown at it? It's not dying, it just signals an error. And, no, I don't think there's a way to provide meaningful results and at the same time to be prepared to accept whatever bogus data or headers the server chooses to send. If you find something like that, send patches, but it sounds like magic (or at least very good AI) to me. As for dealing with wrong character encodings, there are already ways to deal with that. You cited one yourself. Another one would be to read everything as binary data (and then to decode it yourself if needed).
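The last suggestion above, reading the body as binary data and decoding it yourself, might look roughly like the sketch below. It uses only standard Common Lisp and assumes the octets have already been fetched as a binary body (the fetching itself is not shown). The helper name `decode-latin-1` and the sample data are hypothetical, and the code assumes an implementation whose `code-char` follows Unicode for codes 0-255, as SBCL does. Under Latin-1 the byte #xA0 from the error report is simply a no-break space, which is one plausible reading of what the server actually sent.

```lisp
;; Hypothetical helper: decode a vector of octets as ISO-8859-1 (Latin-1).
;; Latin-1 maps every byte 0-255 to the character with the same code,
;; so this decoding can never signal an encoding error.
;; Assumption: CODE-CHAR follows Unicode for codes 0-255 (true on SBCL).
(defun decode-latin-1 (octets)
  (map 'string #'code-char octets))

;; Sample data: "a", then the byte #xA0 (not a valid start of a UTF-8
;; sequence), then "b".
(defparameter *sample-octets*
  (make-array 3 :element-type '(unsigned-byte 8)
                :initial-contents '(#x61 #xA0 #x62)))

(defparameter *decoded* (decode-latin-1 *sample-octets*))
```

Whether Latin-1 is the right guess for a given site is a separate question; the point is only that the choice of decoding moves into the caller's hands.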
From ehuels at gmail.com Sun Feb 25 15:35:39 2007 From: ehuels at gmail.com (Erik Huelsmann) Date: Sun, 25 Feb 2007 16:35:39 +0100 Subject: [drakma-devel] Fwd: Portability of Drakma In-Reply-To: References: Message-ID: Forwarding rejected message. I wasn't subscribed yet. Sorry. bye, Erik. ---------- Forwarded message ---------- From: Erik Huelsmann Date: Feb 25, 2007 2:38 PM Subject: Re: Portability of Drakma To: Edi Weitz Cc: drakma-devel at common-lisp.net On 2/24/07, Edi Weitz wrote: > Hi Erik, > > I'm sending a copy of this to the mailing list where I think we should > continue this discussion. Ah. Sorry about that, I wasn't aware of this list. > > I've been working on a very portable library for sockets code. This > > library is now more portable than trivial-sockets and supports more > > functionality on all of its supported lisp implementations. > > I'm aware of usockets' existence because Andreas Fuchs pointed it out > to me shortly after I had released the portable version of Drakma > (using trivial-sockets). At that point I tried to switch to usocket > and immediately ran into problems - IIRC it didn't even load on > LispWorks on Windows (although ISTR the website claimed that LispWorks > was a supported implementation), and it couldn't provide binary socket > streams for all supported implementations. So, I dismissed it for the > time being. That's both great and bad news: It's great you're aware of the usocket project, it's too bad you tried and failed. > It might well be the case that both of these issues have been fixed > since, but I currently don't have the time to test again. I generally > think it's better to rely on a maintained and documented library than > on obscure and old code, but of course the new code should work at > least as good as the old one. Absolutely. New code shouldn't be a step backward. With that requirement, a chicken-and-egg problem is introduced though: to develop well-tested code, it needs to be (widely) used. 
But to address your findings: you probably used one of the very first releases: With 0.3.0, binary streams are supported on all implementations. Next to that, I just downloaded and used LW5.0 to test a simple GET request: all seems to work well. Indeed have there been win32 related fixes to many backends. > I'd be happy to accept patches to switch Drakma from trivial-sockets > to usocket, but the following criteria should be met: > > - The LispWorks code should remain untouched (i.e. not use usocket). > > - The code should have been tested successfully on at least the > Lisp/OS combinations that are currently supported by Drakma. Is there a list somewhere as a reference to what I'm getting into? > The actual patch itself should be a piece of cake, but I guess the > testing will take some time. Yes. Not having a Mac, I won't be able to test OpenMCL myself, but maybe others can assist there? Thanks for your time. bye, Erik. From jeffrey at cunningham.net Sun Feb 25 16:26:45 2007 From: jeffrey at cunningham.net (Jeffrey Cunningham) Date: Sun, 25 Feb 2007 08:26:45 -0800 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> Message-ID: <20070225162645.GB16675@achilles.olympus.net> On Sun Feb 25, 2007 at 11:25:04AM +0100, Edi Weitz wrote: > On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham wrote: > > > In this case I set it to make a substitution for the 'bad' > > character. Is it possible for there to be more than one? > > Not yet. See current discussion on the FLEXI-STREAMS mailing list. > > > And more generally, should there not be a way to set drakma so it > > may take a performance hit but is guaranteed not to die on any html > > that is thrown at it? > > It's not dying, it just signals an error. 
> > And, no, I don't think there's a way to provide meaningful results and > at the same time to be prepared to accept whatever bogus data or > headers the server chooses to send. If you find something like that, > send patches, but it sounds like magic (or at least very good AI) to > me. I guess I disagree. If I try to access a page like that using: links, lynx, wget, mozilla, firefox, or any html parsing entity I can think of they don't stop functioning, signal an error, or whatever you want to call it. They give me their best approximation of the content. Seems like that ought to be the goal here, or at least a possibility. In an automated process, signaling an error means that processing has stopped (or 'died'). The source of the error signal may be in flexi-streams (I have read the discussions on that list), but it's Drakma that has to deal with its consequences. How do the above mentioned applications manage this problem? Certainly not by magic. And I doubt the AI in links or lynx is very sophisticated. --Jeff From vodonosov at mail.ru Sun Feb 25 17:00:03 2007 From: vodonosov at mail.ru (Anton Vodonosov) Date: Sun, 25 Feb 2007 19:00:03 +0200 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070225162645.GB16675@achilles.olympus.net> References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> Message-ID: <45E1C093.2060800@mail.ru> Hi, Jeff. "Signaling an error" means in this case that work can proceed. (setq *provide-use-value-restart* t) (handler-bind ((flexi-stream-encoding-error (lambda (condition) (use-value #\?)))) (drakma:http-request "http://bad-host/bad-page.html")) This is the example from the FLEXI-STREAMS documentation. You can easily get "the best approximation of the content" using drakma, but with more control. So it is unclear to me what problems you have.
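The handler-bind / use-value mechanism in the example above can be tried without any network access or libraries. The following is a minimal sketch in plain Common Lisp; the condition `bad-octet` and the functions `decode-octet` and `decode-leniently` are hypothetical stand-ins for illustration only, not how FLEXI-STREAMS is actually implemented. It shows that the handler is invoked once per bad octet, so any number of bad bytes can be replaced in one pass.

```lisp
;; Hypothetical stand-in for an encoding error (not the FLEXI-STREAMS one).
(define-condition bad-octet (error)
  ((octet :initarg :octet :reader bad-octet-octet))
  (:report (lambda (condition stream)
             (format stream "Unexpected value #x~X."
                     (bad-octet-octet condition)))))

;; Decode one octet as ASCII. For anything else, signal BAD-OCTET but
;; offer a USE-VALUE restart so that a handler can supply a replacement.
(defun decode-octet (octet)
  (restart-case
      (if (< octet 128)
          (code-char octet)
          (error 'bad-octet :octet octet))
    (use-value (replacement)
      replacement)))

;; Decode a whole sequence, replacing every bad octet with #\?.
;; HANDLER-BIND runs the handler without unwinding the stack, so
;; decoding resumes exactly where the error was signalled.
(defun decode-leniently (octets)
  (handler-bind ((bad-octet (lambda (condition)
                              (declare (ignore condition))
                              (use-value #\?))))
    (map 'string #'decode-octet octets)))

;; Two bad octets (#xA0 and #xFF) in one input.
(defparameter *result* (decode-leniently '(72 105 #xA0 33 #xFF)))
```

A handler written this way could just as well count the errors or log them before choosing a replacement, which is the extra control that a fixed substitution character does not give.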
-Anton Jeffrey Cunningham: > On Sun Feb 25, 2007 at 11:25:04AM +0100, Edi Weitz wrote: >> On Sat, 24 Feb 2007 16:39:54 -0800, Jeffrey Cunningham wrote: >> >>> In this case I set it to make a substitution for the 'bad' >>> character. Is it possible for there to be more than one? >> Not yet. See current discussion on the FLEXI-STREAMS mailing list. >> >>> And more generally, should there not be a way to set drakma so it >>> may take a performance hit but is guaranteed not to die on any html >>> that is thrown at it? >> It's not dying, it just signals an error. >> >> And, no, I don't think there's a way to provide meaningful results and >> at the same time to be prepared to accept whatever bogus data or >> headers the server choses to send. If you find something like that, >> send patches, but it sounds like magic (or at least very good AI) to >> me. > > I guess I disagree. > > If I try to access a page like that using: links, lynx, wget, mozilla, > firefox, or any html parsing entity I can think of they don't stop > functioning, signal an error, or whatever you want to call it. They > give me their best approximation of the content. Seems like that ought > be the goal here, or at least a possibility. > > In an automated process, signaling an error means that processing has > stopped (or 'died'). The source of the error signal may be in > flexi-streams (I have read the discussions in the that list), but its > drakma that has to deal with its consequences. > > How do the above mentioned applications manage this problem? Certainly > not by magic. And I doubt the AI in links or lynx is very > sophisticated. 
> > > --Jeff > > > > > _______________________________________________ > drakma-devel mailing list > drakma-devel at common-lisp.net > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > > From jeffrey at cunningham.net Sun Feb 25 17:23:45 2007 From: jeffrey at cunningham.net (Jeffrey Cunningham) Date: Sun, 25 Feb 2007 09:23:45 -0800 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <45E1C093.2060800@mail.ru> References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> Message-ID: <20070225172345.GA23630@achilles.olympus.net> On Sun Feb 25, 2007 at 07:00:03PM +0200, Anton Vodonosov wrote: > Hi, Jeff. > > "Signaling an error" means in this case that > work can proceed. > > (setq *provide-use-value-restart* t) > > (handler-bind > ((flexi-stream-encoding-error (lambda (condition) > > (use-value #\?)))) > (drakma:http-request "http://bad-host/bad-page.html")) > > > This is the example from the FLEXI-STREAMS documentation. > > You can easily get "the best approximation of the content" > using drakma, but with more control. So it is unclear to me > what problems you have. > > -Anton Hi Anton, Thanks for the help. Will the example above work for any bad character, or only the one set by (setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0)) The only example I've run across is the site I mentioned, but it seems like the possibilities for bad html are endless. --Jeff From vodonosov at mail.ru Sun Feb 25 17:43:26 2007 From: vodonosov at mail.ru (Anton Vodonosov) Date: Sun, 25 Feb 2007 19:43:26 +0200 Subject: [drakma-devel] Bug handling bad html?
In-Reply-To: <20070225172345.GA23630@achilles.olympus.net> References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> <20070225172345.GA23630@achilles.olympus.net> Message-ID: <45E1CABE.8020605@mail.ru> You misunderstand the meaning of *substitution-char*. This is the character that will be used as a substitution for all badly encoded characters. Thus, this example is equivalent to (setq flexi-streams::*provide-use-value-restart* t) (setf flexi-streams::*SUBSTITUTION-CHAR* #\?) You will have ? instead of any wrong character. I.e. you can use whatever mechanism you like: *substitution-char* for most cases, or the use-value restart if you want more control (for example, you want to use ? as a substitution for even-numbered wrong byte sequences and * for odd-numbered ones, count encoding errors, log them to a file, or something). Read the docs, http://weitz.de/flexi-streams/ -Anton Jeffrey Cunningham: > On Sun Feb 25, 2007 at 07:00:03PM +0200, Anton Vodonosov wrote: >> Hi, Jeff. >> >> "Signaling an error" means in this case that >> work can proceed. >> >> (setq *provide-use-value-restart* t) >> >> (handler-bind >> ((flexi-stream-encoding-error (lambda (condition) >> >> (use-value #\?)))) >> (drakma:http-request "http://bad-host/bad-page.html")) >> >> >> This is the example from the FLEXI-STREAMS documentation. >> >> You can easily get "the best approximation of the content" >> using drakma, but with more control. So it is unclear to me >> what problems you have. >> >> -Anton > > Hi Anton, > > Thanks for the help. Will the example above work for any bad > character, or only the one set by > > (setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0)) > > The only example I've run across is the site I mentioned, but it seems > like the possibilities for bad html are endless.
> > --Jeff > _______________________________________________ > drakma-devel mailing list > drakma-devel at common-lisp.net > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > > From nowhere.man at levallois.eu.org Sun Feb 25 18:06:25 2007 From: nowhere.man at levallois.eu.org (Pierre THIERRY) Date: Sun, 25 Feb 2007 19:06:25 +0100 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070225162645.GB16675@achilles.olympus.net> References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> Message-ID: <20070225180625.GW7500@bateleur.arcanes.fr.eu.org> Scribit Jeffrey Cunningham dies 25/02/2007 hora 08:26: > > If you find something like that, send patches, but it sounds like > > magic (or at least very good AI) to me. > > I guess I disagree. > > If I try to access a page like that using: links, lynx, wget, mozilla, > firefox, or any html parsing entity I can think of they don't stop > functioning, signal an error, or whatever you want to call it. They > give me their best approximation of the content. Seems like that ought > be the goal here, or at least a possibility. AFAICS, those browsers just substitute bad bytes with a single substitution glyph. My Firefox uses a white interrogation mark in a black diamond. You can already achieve that with flexi-streams, IIUC. Quickly, Pierre -- nowhere.man at levallois.eu.org OpenPGP 0xD9D50D8A -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From jeffrey at cunningham.net Sun Feb 25 19:34:06 2007 From: jeffrey at cunningham.net (Jeffrey Cunningham) Date: Sun, 25 Feb 2007 11:34:06 -0800 Subject: [drakma-devel] Bug handling bad html? 
In-Reply-To: <45E1CABE.8020605@mail.ru> References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> <20070225172345.GA23630@achilles.olympus.net> <45E1CABE.8020605@mail.ru> Message-ID: <20070225193406.GA26412@achilles.olympus.net> On Sun Feb 25, 2007 at 07:43:26PM +0200, Anton Vodonosov wrote: > You misunderstand the meaning of *substitution-char*. > This is the character that will be used as a > substitution for all badly encoded characters. > > Thus, this example is equivalent to > (setq flexi-streams::*provide-use-value-restart* t) > (setf flexi-streams::*SUBSTITUTION-CHAR* #\?) > > You will have ? instead of any wrong character. > > I.e. you can use whatever mechanism you like: > *substitution-char* for most cases, or the use-value restart > if you want more control (for example, you want to > use ? as a substitution for even-numbered wrong byte sequences > and * for odd-numbered ones, count encoding errors, > log them to a file, or something). You're right, Anton, I did misunderstand the meaning. Thank you for clearing that up. --Jeff From edi at agharta.de Sun Feb 25 20:38:15 2007 From: edi at agharta.de (Edi Weitz) Date: Sun, 25 Feb 2007 21:38:15 +0100 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070225162645.GB16675@achilles.olympus.net> (Jeffrey Cunningham's message of "Sun, 25 Feb 2007 08:26:45 -0800") References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> Message-ID: On Sun, 25 Feb 2007 08:26:45 -0800, Jeffrey Cunningham wrote: > If I try to access a page like that using: links, lynx, wget, > mozilla, firefox, or any html parsing entity I can think of they > don't stop functioning, signal an error, or whatever you want to > call it. They give me their best approximation of the content.
Seems > like that ought to be the goal here, or at least a possibility. > > In an automated process, signaling an error means that processing > has stopped (or 'died'). The source of the error signal may be in > flexi-streams (I have read the discussions on that list), but > it's Drakma that has to deal with its consequences. You are missing two crucial points: 1. The applications you listed are just that - monolithic applications. You either use them for what they are intended or you leave them alone. They'd better be as permissive as possible. Drakma, OTOH, is a library - a tool or building block used by programmers to build applications. It should do what it advertises to do correctly - not more and not less. And if that's not what the programmer expected, he can tweak it as much as he wants. (That doesn't imply that he modifies the library itself, but as Drakma is open source he can do even that, if deemed necessary.) 2. In Common Lisp, signalling an error doesn't mean that processing has stopped. If that is news to you, you might want to read, for example, the chapter about conditions and restarts in Peter Seibel's book. > How do the above mentioned applications manage this problem? > Certainly not by magic. In this specific case, they're usually doing it the same way you can do it with Drakma and FLEXI-STREAMS - they insert some kind of replacement character. I don't see where the problem is. Cheers, Edi. From edi at agharta.de Sun Feb 25 20:16:58 2007 From: edi at agharta.de (Edi Weitz) Date: Sun, 25 Feb 2007 21:16:58 +0100 Subject: [drakma-devel] Re: Portability of Drakma In-Reply-To: (Erik Huelsmann's message of "Sun, 25 Feb 2007 14:38:02 +0100") References: Message-ID: On Sun, 25 Feb 2007 14:38:02 +0100, "Erik Huelsmann" wrote: >> - The code should have been tested successfully on at least the >> Lisp/OS combinations that are currently supported by Drakma. > > Is there a list somewhere as a reference to what I'm getting into? No, unfortunately not.
I myself use mostly LispWorks (Windows and Linux/x86) and SBCL (Linux/x86). (LispWorks shouldn't be a problem anyway as it's not affected by the switch.) I think that at least LispWorks, SBCL, AllegroCL, CMUCL, CLISP, and OpenMCL should be supported, everything else being a bonus. Operating systems: Windows (where applicable), Linux, OS X. If you don't want to test all of this for yourself, how about offering a tarball of Drakma which uses usocket for download? Send an announcement to this mailing list including an overview of what you've tested and what not. We can ask "interested parties" to try it out and we'll switch to the new version in four weeks, say, unless someone objects. Does that sound OK? Cheers, Edi. From edi at agharta.de Sun Feb 25 20:40:00 2007 From: edi at agharta.de (Edi Weitz) Date: Sun, 25 Feb 2007 21:40:00 +0100 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070225172345.GA23630@achilles.olympus.net> (Jeffrey Cunningham's message of "Sun, 25 Feb 2007 09:23:45 -0800") References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> <20070225172345.GA23630@achilles.olympus.net> Message-ID: On Sun, 25 Feb 2007 09:23:45 -0800, Jeffrey Cunningham wrote: > The only example I've run across is the site I mentioned, but it > seems like the possibilities for bad html are endless. The problems you've encountered have nothing to do with bad HTML at all, and Drakma doesn't try to parse HTML. I think you're a bit confused. Cheers, Edi. From jeffrey at cunningham.net Sun Feb 25 20:47:39 2007 From: jeffrey at cunningham.net (Jeffrey Cunningham) Date: Sun, 25 Feb 2007 12:47:39 -0800 Subject: [drakma-devel] Bug handling bad html? 
In-Reply-To: References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> <20070225172345.GA23630@achilles.olympus.net> Message-ID: <20070225204739.GA28011@achilles.olympus.net> On Sun Feb 25, 2007 at 09:40:00PM +0100, Edi Weitz wrote: > > I think you're a bit confused. I agree, but I'm slowly getting less confused. Thanks for the help. -Jeff From vodonosov at mail.ru Sun Feb 25 21:02:30 2007 From: vodonosov at mail.ru (Anton Vodonosov) Date: Sun, 25 Feb 2007 23:02:30 +0200 Subject: [drakma-devel] Bug handling bad html? In-Reply-To: <20070225193406.GA26412@achilles.olympus.net> References: <20070224170725.GA23865@achilles.olympus.net> <20070225003954.GA32401@achilles.olympus.net> <20070225162645.GB16675@achilles.olympus.net> <45E1C093.2060800@mail.ru> <20070225172345.GA23630@achilles.olympus.net> <45E1CABE.8020605@mail.ru> <20070225193406.GA26412@achilles.olympus.net> Message-ID: <45E1F966.1010109@mail.ru> Jeffrey Cunningham: > You're right, Anton, I did misunderstand the meaning. Thank you for > clearing that up. > Not at all. Thanks Edi for all that great software he creates for us. -Anton From rsynnott at gmail.com Tue Feb 27 13:00:10 2007 From: rsynnott at gmail.com (Robert Synnott) Date: Tue, 27 Feb 2007 13:00:10 +0000 Subject: [drakma-devel] Fwd: Portability of Drakma In-Reply-To: References: Message-ID: <24f203480702270500x1ab963cbhbf508435c4993dc7@mail.gmail.com> On 2/25/07, Erik Huelsmann wrote: ... > Yes. Not having a Mac, I won't be able to test OpenMCL myself, but > maybe others can assist there? > > Thanks for your time. > > > bye, > > Erik. > _______________________________________________ > drakma-devel mailing list > drakma-devel at common-lisp.net > http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel > I can test on OpenMCL on a PPC Mac if desired. Rob