From cmucl-devel at common-lisp.net  Tue Oct  5 12:04:03 2010
From: cmucl-devel at common-lisp.net (cmucl)
Date: Tue, 05 Oct 2010 12:04:03 -0000
Subject: [cmucl-ticket] [cmucl] #42: read-sequence vs unicode
In-Reply-To: <052.aaf6cdf1e447bfe169ffe0abdc956acd@common-lisp.net>
References: <052.aaf6cdf1e447bfe169ffe0abdc956acd@common-lisp.net>
Message-ID: <061.275064d512378e188d4cd0159991d27c@common-lisp.net>

#42: read-sequence vs unicode
---------------------+------------------------------------------------------
  Reporter:  rtoy    |       Owner:  somebody
      Type:  defect  |      Status:  new     
  Priority:  major   |   Milestone:          
 Component:  Core    |     Version:  20b     
Resolution:          |    Keywords:          
---------------------+------------------------------------------------------
Description changed by rtoy:

Old description:

> Cmucl has been able to use {{{READ-SEQUENCE}}} to read octets (and other
> integers types) from character streams.  With the introduction of Unicode
> support, this no longer works correctly in general.  The data that is
> read is not done from the last position, and the data that is read is not
> necessarily reflected in the next {{{READ-CHAR}}}.  That is,
> {{{READ-CHAR}}} might re-read the data that {{{READ-SEQUENCE}}} already read.
> (This depends on how much data has been read, and the internal stream
> buffering.)
>
> However, if the external format is :iso8859-1, then {{{READ-SEQUENCE}}}
> behaves as it used to.  Hence, as a workaround, the user can set the
> external format to :iso8859-1 before {{{READ-SEQUENCE}}} and set it back
> afterwards.  This works as expected.
>
> Perhaps {{{READ-SEQUENCE}}} should do that itself?  (Appropriately
> wrapping everything in {{{UNWIND-PROTECT}}} so that the stream external
> format isn't unexpected modified.)

New description:

 Cmucl has been able to use {{{READ-SEQUENCE}}} to read octets (and other
 integer types) from character streams.  With the introduction of Unicode
 support, this no longer works correctly in general.  The data is not read
 from the last position, and the data that is read is not necessarily
 reflected in the next {{{READ-CHAR}}}.  That is, {{{READ-CHAR}}} might
 re-read the data that {{{READ-SEQUENCE}}} already read.  (This depends on
 how much data has been read and on the internal stream buffering.)

 However, if the external format is :iso8859-1, then {{{READ-SEQUENCE}}}
 behaves as it used to.  Hence, as a workaround, the user can set the
 external format to :iso8859-1 before {{{READ-SEQUENCE}}} and set it back
 afterwards.  This works as expected.

 Perhaps {{{READ-SEQUENCE}}} should do that itself?  (Appropriately
 wrapping everything in {{{UNWIND-PROTECT}}} so that the stream external
 format isn't unexpectedly modified.)
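
 The workaround above can be sketched roughly as follows.  This is only an
 illustration, not the fix that was committed: it assumes that
 {{{STREAM-EXTERNAL-FORMAT}}} is {{{SETF}}}-able on the stream in question,
 and {{{READ-OCTETS}}} is a hypothetical helper name, not part of Cmucl.

 ```lisp
 ;; Sketch only.  Assumes (SETF STREAM-EXTERNAL-FORMAT) is supported for
 ;; STREAM.  READ-OCTETS is a hypothetical name used for illustration.
 (defun read-octets (buffer stream &key (start 0) end)
   "Fill BUFFER from STREAM with READ-SEQUENCE while the external format is
 temporarily :ISO8859-1.  Returns whatever READ-SEQUENCE returns."
   (let ((saved-format (stream-external-format stream)))
     (unwind-protect
          (progn
            ;; Switch to the format under which READ-SEQUENCE behaves as
            ;; it did before Unicode support was introduced.
            (setf (stream-external-format stream) :iso8859-1)
            (read-sequence buffer stream :start start :end end))
       ;; Cleanup form: runs on normal return and on non-local exit, so
       ;; the caller's external format is always put back.
       (setf (stream-external-format stream) saved-format))))
 ```

 A caller would pass an octet vector, for example one made with
 {{{(make-array 16 :element-type '(unsigned-byte 8))}}}, and could then
 continue with {{{READ-CHAR}}} under the original external format.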

--

-- 
Ticket URL: <http://trac.common-lisp.net/cmucl/ticket/42#comment:1>
cmucl <http://common-lisp.net/project/cmucl>
Cmucl is a high-performance, free Common Lisp implementation.

From cmucl-devel at common-lisp.net  Fri Oct 22 00:57:23 2010
From: cmucl-devel at common-lisp.net (cmucl)
Date: Fri, 22 Oct 2010 00:57:23 -0000
Subject: [cmucl-ticket] [cmucl] #42: read-sequence vs unicode
In-Reply-To: <052.aaf6cdf1e447bfe169ffe0abdc956acd@common-lisp.net>
References: <052.aaf6cdf1e447bfe169ffe0abdc956acd@common-lisp.net>
Message-ID: <061.f91909d5295e919fd332343802968ffc@common-lisp.net>

Comment(by rtoy):

 The October snapshot should have this mostly fixed.  An error is now
 signaled if the stream is of the wrong type, but if the stream is a
 binary-text-stream, {{{READ-SEQUENCE}}} works as before.  This is a
 change from 20a and 20b, and it is also incompatible with 19f and earlier,
 which allowed {{{READ-SEQUENCE}}} to work on character streams and on
 binary streams of {{{'(unsigned-byte 8)}}}.

-- 
Ticket URL: <http://trac.common-lisp.net/cmucl/ticket/42#comment:2>