[Python-talk] unicode handling in older Python versions
Arc Riley
arcriley at gmail.com
Sat Oct 3 10:10:06 EDT 2009
I've been focusing on 3.1.1. What we found is that the attached script returns:
'\ud801\udc51'
'\U00010451'
The script is attached rather than pasted inline so that it transfers properly
over the email list :-)
Sadly, the workaround is to add .encode('utf-16').decode('utf-16'). It
appears that the utf-8 support is buggy.
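For anyone who can't open the attachment, here is a minimal sketch of the round trip described above. The u.py attachment isn't reproduced here, so the string literal is an assumption based on the two outputs quoted, and the 'surrogatepass' error handler is also an addition of this sketch: newer Python 3 releases reject surrogate code points in the utf-16 encoder unless it is given.

```python
# Hypothetical stand-in for the attached script (the real u.py is not shown):
# a string holding the surrogate pair D801 DC51 as two separate code points.
pair = '\ud801\udc51'

# Round trip through UTF-16 so the decoder joins the pair into one code point.
# 'surrogatepass' is an assumption of this sketch; it lets the encoder accept
# surrogates on newer Python 3 releases, which reject them by default.
joined = pair.encode('utf-16', 'surrogatepass').decode('utf-16')

print(ascii(pair))
print(ascii(joined))
```

The first line printed is the two-element surrogate form and the second is the single code point U+10451.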
Make sure that you have a "wide" Python build for this; you can test that
with:
>>> import sys
>>> sys.maxunicode
1114111
A narrow build will report 65535.
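The same check can be done from a script. This is only a sketch, and build_kind is a made-up helper name, not something from the standard library:

```python
import sys

def build_kind():
    """Return 'wide' or 'narrow' based on sys.maxunicode (hypothetical helper)."""
    # 0x10FFFF (1114111): every code point fits in one string element.
    # 0xFFFF (65535): non-BMP characters are stored as surrogate pairs.
    return 'wide' if sys.maxunicode > 0xFFFF else 'narrow'

print(build_kind(), hex(sys.maxunicode))
```

Note that Python 3.3 and later always report 1114111, so the narrow case only shows up on older interpreters.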
-------------- next part --------------
A non-text attachment was scrubbed...
Name: u.py
Type: text/x-python
Size: 71 bytes
Desc: not available
URL: <http://dlslug.org/pipermail/python-talk/attachments/20091003/ced8d0ad/attachment.py>