Tuesday, October 26, 2010

Writing unicode strings via sys.stdout in Python

Programmer Question

Assume for a moment that one cannot use print (and thus enjoy the benefit of automatic encoding detection). So that leaves us with sys.stdout. However, sys.stdout is so dumb as to not do any sensible encoding.



Now one reads the Python wiki page PrintFails and goes to try out the following code:



$ python -c 'import sys, codecs, locale; print str(sys.stdout.encoding); \
sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)'


However, this does not work either (at least on a Mac). To see why:



>>> import locale, sys
>>> locale.getpreferredencoding()
'mac-roman'
>>> sys.stdout.encoding
'UTF-8'


(UTF-8 is what my terminal understands).
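
To make the mismatch concrete, here is a minimal sketch of what happens when text is encoded as mac-roman but the terminal expects UTF-8. The sample characters are my own choice for illustration and do not come from the session above.

```python
# Illustrative only: the sample characters below are assumptions.
sample = u"\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

as_mac_roman = sample.encode("mac-roman")  # a single byte
as_utf8 = sample.encode("utf-8")           # two bytes

# A UTF-8 terminal that receives the mac-roman byte sees an invalid sequence.
try:
    as_mac_roman.decode("utf-8")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False

# Characters outside the mac-roman repertoire cannot be encoded at all.
try:
    u"\u0411".encode("mac-roman")  # CYRILLIC CAPITAL LETTER BE
    cyrillic_ok = True
except UnicodeEncodeError:
    cyrillic_ok = False
```

So a writer built from locale.getpreferredencoding() can either emit bytes the terminal misreads or raise an error outright, depending on the character.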



So one changes the above code to:



$ python -c 'import sys, codecs, locale; print str(sys.stdout.encoding); \
sys.stdout = codecs.getwriter(sys.stdout.encoding)(sys.stdout);


And now unicode strings are properly sent to sys.stdout and hence printed properly on the terminal (sys.stdout is attached to the terminal).
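
The wrapping step can be tried in isolation. The sketch below uses an in-memory io.BytesIO as a stand-in for the terminal so the written bytes can be inspected; the helper name wrap_for_unicode and the test string are hypothetical, chosen only for this example.

```python
import codecs
import io


def wrap_for_unicode(byte_stream, encoding):
    """Return a writer that encodes unicode text before writing to byte_stream.

    Hypothetical helper: it only packages the codecs.getwriter call used above.
    """
    return codecs.getwriter(encoding)(byte_stream)


# io.BytesIO stands in for a terminal that expects UTF-8 bytes (an assumption).
fake_terminal = io.BytesIO()
out = wrap_for_unicode(fake_terminal, "utf-8")
out.write(u"\u0411\n")  # CYRILLIC CAPITAL LETTER BE plus newline

written = fake_terminal.getvalue()
```

The wrapper encodes the text itself and hands plain bytes to the underlying stream, which is why the choice of encoding passed to codecs.getwriter matters so much.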



Is this the correct way to write unicode strings to sys.stdout, or should I be doing something else?



EDIT: Sometimes sys.stdout.encoding will be None (for example, when piping the output through less). In this case, the above code will fail.
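
One possible way to guard against the None case is to fall back to a fixed default. This is only a sketch: the function name choose_encoding and the "utf-8" default are assumptions, and whether UTF-8 is the right fallback depends on what consumes the output.

```python
def choose_encoding(stream, default="utf-8"):
    """Return stream.encoding, or default when it is missing or None.

    Hypothetical helper; the "utf-8" default is an assumption, not a rule.
    """
    return getattr(stream, "encoding", None) or default


# Stand-ins for a piped stdout and a terminal-attached stdout (illustrative).
class PipedStream(object):
    encoding = None


class TerminalStream(object):
    encoding = "UTF-8"


piped_choice = choose_encoding(PipedStream())
terminal_choice = choose_encoding(TerminalStream())
```

Under Python 2 the result could then be passed to the wrapper, as in codecs.getwriter(choose_encoding(sys.stdout))(sys.stdout).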


