Python/Django: How to convert utf-16 str bytes to unicode?


Fellows,

I am unable to parse a Unicode text file submitted using Django forms. Here are the quick steps I performed:

  1. Uploaded a text file (encoding: UTF-16; file contents: hello world 13).

  2. On the server side, received the file using filename = request.FILES['file_field']

  3. Went through it line by line: for line in filename: yield line

  4. type(filename) gives me <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>

  5. type(line) is <type 'str'>

  6. print line gives: '\xff\xfeh\x00e\x00l\x00l\x00o\x00 \x00w\x00o\x00r\x00l\x00d\x00 \x001\x003\x00'

  7. codecs.BOM_UTF16_LE == line[:2] returns True

  8. Now, I want to reconstruct a unicode or ASCII string "hello world 13" so that I can parse the integer from the line.
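The observations in steps 5 to 7 can be reproduced without Django. The following is only a minimal sketch: it assumes Python 3 (where the undecoded lines are `bytes` rather than `str`) and uses `io.BytesIO` as a hypothetical stand-in for the uploaded file object.

```python
import codecs
import io

# Hypothetical stand-in for the upload: UTF-16 bytes preceded by a BOM.
# "utf-16-le" is spelled out so the byte order matches the output in step 6.
raw = codecs.BOM_UTF16_LE + "hello world 13".encode("utf-16-le")
upload = io.BytesIO(raw)

# Iterating a binary file yields undecoded byte strings, as in steps 5 and 6.
lines = list(upload)
first_line = lines[0]

# Same check as step 7: the first two bytes are the little-endian BOM.
has_le_bom = first_line[:2] == codecs.BOM_UTF16_LE

print(type(first_line), has_le_bom)
print(first_line)
```

The interleaved `\x00` bytes are the high bytes of each two-byte UTF-16 code unit, which is why the line cannot be parsed as text until it is decoded.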

One of the ugliest ways of doing this is to retrieve the number using line[-5:] (= '\x001\x003\x00') and then construct it using line[-5:][1] and line[-5:][3].

I am sure there must be a better way of doing this. Please help.

Thanks in advance!

Use codecs.iterdecode() to decode the object on the fly:

    from codecs import iterdecode

    # Inside the generator from step 3: yield decoded text instead of raw bytes.
    for line in iterdecode(filename, 'utf16'):
        yield line
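To show how the answer above fits the original goal of parsing the integer, here is a minimal sketch. It assumes Python 3. `io.BytesIO` is a hypothetical stand-in for `request.FILES['file_field']`, and the helper name `decoded_chunks` and the second sample line are made up for illustration.

```python
import io
from codecs import iterdecode


def decoded_chunks(binary_file, encoding="utf-16"):
    """Yield text decoded incrementally from an iterable of byte strings."""
    for chunk in iterdecode(binary_file, encoding):
        yield chunk


# Hypothetical stand-in for request.FILES['file_field'].
upload = io.BytesIO("hello world 13\nbye 7\n".encode("utf-16"))

# Join the chunks before splitting into lines: the raw file is split on the
# byte 0x0A, which is only the first half of a little-endian UTF-16 newline,
# so the decoded chunks need not line up exactly with text lines.
text = "".join(decoded_chunks(upload))

# Take the last whitespace-separated token of each non-empty line as the integer.
numbers = [int(line.split()[-1]) for line in text.splitlines() if line.strip()]
print(numbers)
```

Joining the chunks gives up streaming, which is probably acceptable for a small in-memory upload. If each chunk is processed on its own instead, calling `strip()` on it first may be needed, since a newline can end up at the start of the following chunk.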
