From edi at agharta.de Fri Feb 5 15:58:03 2010
From: edi at agharta.de (Edi Weitz)
Date: Fri, 5 Feb 2010 16:58:03 +0100
Subject: [flexi-streams-devel] a patch for chineses's cp936(gbk) encoding support.
In-Reply-To: <201002051831493546864@gmail.com>
References: <201002051831493546864@gmail.com>
Message-ID:

Thanks, I'll take a look at this when I'm less busy than now.

Again, please use the mailing list for patches and questions - see Cc.

Thanks,
Edi.

On Fri, Feb 5, 2010 at 11:31 AM, jingtaozf wrote:
> Hi Edi,
> I have modified the code to fit your requests, and the encoding name
> has changed from cp936 to gbk.
> My responses follow.
>
> 1. There are several new functions and macros which don't have a
> documentation string.
> ==> I have added documentation to the function get-multibyte-mapper;
> the others are based on your code.
> 2. There are tabs in the files and sometimes the indentation seems wrong.
> ==> I have removed the tabs.
> 3. There are parts of the code which are commented out. If they
> aren't used, they shouldn't be in there.
> ==> Yes, I deleted them.
> 4. There's at least one file (seems like a variant of the ASDF system
> definition) which shouldn't be in there at all.
> ==> Sorry, but I don't know what you mean.
> 5. The link in the HTML documentation is wrong.
> ==> I fixed the URL link for gbk.
> 6. Isn't get-multibyte-mapper the wrong name? You don't get the mapper, do you?
> ==> Yes, this function is borrowed from SBCL's source code, and I have
> pointed this out in the comment.
> 7. And the function itself looks like Scheme to me. I think it'd be
> easier to understand using normal iteration.
> ==> I have rewritten the decoding and encoding parts, but I am not
> familiar with Scheme. :(
> 8. I don't understand why the encoding factor is 1.5. Is the comment
> correct or just copied?
> ==> I don't know how to set the encoding factor. A gbk (cp936) character
> takes one or two octets, so I set it to 1.5 as an estimate.
> 9. I'm not familiar with CP936. Is it correct that there's only a
> big-endian version?
> ==> gbk (cp936) has no endianness issue. I had borrowed code from your
> UTF-16 implementation; I have now fixed it.
> 10. One of the new files contains DOS line endings.
> ==> Some of your source files have DOS line endings (such as ascii.lisp)
> and some use Unix line endings. I don't know why; maybe I changed it
> unintentionally.
>
> Furthermore, this patch file is encoded in UTF-8, because my tests
> include a file that is encoded in gbk.
> I have also attached the test files (in gbk format), because I am not
> sure the patch will generate them correctly.
>
> 2010-02-04
>
> jingtaozf
>
> From: Edi Weitz
> Sent: 2009-10-29 23:08:38
> To: jingtao xu
> Cc: General interest list about flexi-streams
> Subject: Re: a patch for chineses's cp936(gbk) encoding support.
>
> On Thu, Oct 29, 2009 at 3:17 PM, jingtao xu wrote:
>> Hi, Edi Weitz:
>> I want to use Drakma as my web development tool, but it uses
>> flexi-streams and could not decode Chinese cp936 characters, so I made a
>> patch for flexi-streams.
>>
>> SBCL already supports cp936 (via files provided by beinghe, whose CVS
>> path is sbcl/src/code/external-formats/enc-cn.lisp, enc-cn-tbl.lisp).
>> I use the file enc-cn-tbl.lisp here.
>>
>> The patch was made with the following command:
>> diff -urN ~/.sbcl/site/flexi-streams-1.0.7/ ~/Downloads/flexi-streams/ >flexi-streams-cp936.patch
>>
>> I am sending you both the patch file and the patched code.
>> The test code has been added and passes.
>> The documentation is updated.
>
> [I've put the mailing list on Cc where I think we should continue this.]
>
> First of all, I'd be happy to include this into the flexi-streams
> distribution. There are a couple of issues with the patch, though.
> What I saw from briefly looking at it:
>
> 1. There are several new functions and macros which don't have a
> documentation string.
> 2. There are tabs in the files and sometimes the indentation seems wrong.
> 3. There are parts of the code which are commented out. If they
> aren't used, they shouldn't be in there.
> 4. There's at least one file (seems like a variant of the ASDF system
> definition) which shouldn't be in there at all.
> 5. The link in the HTML documentation is wrong.
> 6. Isn't get-multibyte-mapper the wrong name? You don't get the mapper, do you?
> 7. And the function itself looks like Scheme to me. I think it'd be
> easier to understand using normal iteration.
> 8. I don't understand why the encoding factor is 1.5. Is the comment
> correct or just copied?
> 9. I'm not familiar with CP936. Is it correct that there's only a
> big-endian version?
> 10. One of the new files contains DOS line endings.
>
> There's probably more.
>
> If you could go over these and send a cleaned-up version of the patch,
> I'll review it. FWIW, here are some guidelines:
>
> http://weitz.de/patches.html
>
> Thanks,
> Edi.
>