[cl-pdf-devel] Mapping useful Unicode characters to single-byte-encoding

Dmitriy Ivanov divanov at aha.ru
Thu Jan 10 08:52:31 UTC 2008


Hello folks,

I guess some of you have run into problems when printing characters such as
the em dash with the standard fonts. The reason is as follows. The Lisp
system represents these characters internally as Unicode (usually UCS-2),
but their counterparts in Helvetica and other fonts are glyphs with codes in
the range [0..255].

For custom-encoding, char-external-code provides the necessary mapping via
the notion of a charset. For single-byte-encoding, however, we have nothing
comparable.

I suggest introducing an "extended charset": an alist of (char . code) pairs:

(defparameter *char-single-byte-codes*
  '((#.(code-char #x2014) . #x97)))  ; Em dash: 8212 -> 151

(defmethod charset ((encoding single-byte-encoding))
  *char-single-byte-codes*)

(defun char-external-code (char charset)
  (cond ((null charset)                 ; no charset: use the Lisp char code
         (char-code char))
        ((atom charset)                 ; external-format designator
         #+lispworks (ef:char-external-code char charset)
         #+allegro   (aref (excl:string-to-octets
                            (coerce `(,char) 'string)
                            :external-format charset) 0)
         #-(or lispworks allegro) (char-code char))
        ((cdr (assoc char charset)))    ; map to single-byte if possible
        (t (char-code char))))
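
To illustrate the alist lookup in isolation, here is a minimal stand-alone
sketch. The names demo-single-byte-codes, demo-external-code and
demo-encode-string are hypothetical and not part of cl-pdf, and the en dash
entry is an assumption modeled on Windows-1252. It also assumes a Lisp whose
characters cover Unicode, so that (code-char #x2014) returns a character.

```lisp
;; Hypothetical stand-alone names for illustration; not part of cl-pdf.
(defparameter *demo-single-byte-codes*
  (list (cons (code-char #x2014) #x97)    ; em dash: 8212 -> 151
        (cons (code-char #x2013) #x96))   ; en dash: 8211 -> 150 (assumed)
  "Alist of (char . code) pairs, i.e. an \"extended charset\".")

(defun demo-external-code (char charset)
  "Return the single-byte code CHARSET assigns to CHAR,
or CHAR's own char-code when CHARSET has no entry for it."
  (or (cdr (assoc char charset))   ; ASSOC compares with EQL, fine for chars
      (char-code char)))

(defun demo-encode-string (string charset)
  "Return a list with the external code of each character in STRING."
  (map 'list
       (lambda (char) (demo-external-code char charset))
       string))
```

With this alist, a character that has no entry, such as #\A, falls through
to its ordinary char-code, while the em dash comes out as a code that fits
in one byte.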

For single-byte-encoding, we would define methods on get-char-metrics and
write-to-page.

I already have all of this code working and can commit it.
--
Sincerely,
Dmitriy Ivanov
lisp.ystok.ru