[Python-talk] unicode handling in older Python versions
Lloyd Kvam
python at venix.com
Sat Oct 3 10:02:14 EDT 2009
On Sat, 2009-10-03 at 01:26 -0400, Arc Riley wrote:
> If anyone has Python <2.5, can you please try the following and report
> back on whether it worked? I have verified this as broken in
> different ways on 2.5, 2.6, 3.0, and 3.1.1
>
> It appears that every version of Python to date has a serious
> utf-8 >plane0 bug that has gone unnoticed until now. It may be
> useful to learn if the bug was introduced at some point in antiquity
> and we just lacked the unit test for it.
>
> $ python2.5
> Python 2.5.4 (r254:67916, Jan 24 2009, 01:30:20)
> [GCC 4.3.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> line = u'𐑑𐑧𐑕𐑑𐑦𐑙'
> >>> first = u'𐑑'
> >>> first
> u'\ud801\udc51'
>
> first should either be u'\U00010451' or obviously u'𐑑'
Emailing Unicode text directly can be a little problematic. Here's the
email source I received (quoted-printable encoded):
"""""""""""""""""""""""""""""""""""""
Type "help", "copyright", "credits" or "license" for more information.
>>> line =3D
u'=F0=90=91=91=F0=90=91=A7=F0=90=91=95=F0=90=91=91=F0=90=91=A6=
=F0=90=91=99'
>>> first =3D u'=F0=90=91=91'
>>> first
u'\ud801\udc51'
first should either be u'\U00010451' or obviously u'=F0=90=91=91'
"""""""""""""""""""""""""""""""""""""""""
I could not paste the string from the email into my Python window, so I
tried to build it up from its UTF-8 bytes instead. That attempt fails to
demonstrate the bug. I saw Kent's email showing the bug, so I assume I
am fouling up the test scenario somehow. I have Python 2.3 and 2.4
available for testing, but need a more reliable way to create the
problem string.
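One possible way around the pasting problem is to decode the
quoted-printable source itself rather than retype it. The sketch below
is only an illustration and assumes a current Python 3 interpreter, not
the 2.x versions under test; the names qp_source, raw, and code_points
are mine, and the byte values are copied from the email source above.

```python
import quopri

# Quoted-printable body of the "line" literal, copied from the email
# source. The trailing "=" before the newline is a soft line break.
qp_source = (
    b"=F0=90=91=91=F0=90=91=A7=F0=90=91=95=F0=90=91=91=F0=90=91=A6=\n"
    b"=F0=90=91=99"
)

# Undo the quoted-printable transfer encoding to get the raw UTF-8 bytes.
raw = quopri.decodestring(qp_source)

# Decode the UTF-8 bytes into text and list the code points, so the
# result can be inspected without a font that covers these characters.
line = raw.decode("utf-8")
code_points = ["U+%04X" % ord(ch) for ch in line]
print(code_points)
```

The decoded bytes could then be written as escapes in a 2.x session,
which avoids any dependence on what the terminal does with pasted text.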
IPython 0.8.4 [on Py 2.5.2] ## Fedora 10
[~]|20> first_ords = [0xf0,0x90,0x91,0x91]
[~]|21> first_str = ''.join(chr(n) for n in first_ords)
[~]|22> first_uni = first_str.decode('utf8')
[~]|23> first_uni
<23> u'\U00010451'
So either I'm not seeing the bug, or I'm not building first in the
proper way to demonstrate it.
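For comparison, here is a small sketch of how U+10451 maps onto the two
values in Arc's output. It is my guess, not something confirmed in this
thread, that the difference comes from a "narrow" (UCS-2) versus "wide"
(UCS-4) build of Python, since the pair d801/dc51 is exactly the UTF-16
surrogate encoding of that code point. The helper surrogate_pair is a
name I made up for illustration.

```python
import sys

def surrogate_pair(code_point):
    """Return the UTF-16 (high, low) surrogates for a code point above U+FFFF."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

# ASCII-only spelling of the first character, so nothing depends on the
# terminal or mail client preserving the raw UTF-8 bytes.
first = u'\U00010451'

high, low = surrogate_pair(0x10451)
print('surrogates: %04x %04x' % (high, low))

# 0xffff suggests a narrow build; 0x10ffff suggests a wide build
# (or Python 3.3 and later, where the distinction no longer exists).
print('maxunicode: %#x' % sys.maxunicode)

# Expected to be 2 where the character is stored as a surrogate pair,
# 1 otherwise.
print('len(first): %d' % len(first))
```

If that guess is right, the interesting values to report back from 2.3
and 2.4 would be sys.maxunicode and len(first), alongside the repr.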
--
Lloyd Kvam
Venix Corp
DLSLUG/GNHLUG library
http://dlslug.org/library.html
http://www.librarything.com/catalog/dlslug
http://www.librarything.com/rsshtml/recent/dlslug
http://www.librarything.com/rss/recent/dlslug