Python - string.decode() function in Python 2

- April 15, 2014

So I am converting some code from Python 2 to Python 3. I don't understand the Python 2 encode/decode functionality well enough to determine what I should be doing in Python 3.

In Python 2, I can do the following things:

>>> c = '\xe5\xb8\x90\xe6\x88\xb7'
>>> print c
帐户
>>> c.decode('utf8')
u'\u5e10\u6237'

What did I just do there? Doesn't the 'u' prefix just mean unicode? Shouldn't the utf8 be '\xe5\xb8\x90\xe6\x88\xb7', since that was the input in the first place?

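Your variable c was not declared as unicode (with the 'u' prefix), so it is a byte string. If you decode it using the 'latin1' encoding, you get the same values back, because latin1 maps each byte to the code point with the same value: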

>>> c.decode('latin1')
u'\xe5\xb8\x90\xe6\x88\xb7'

Note that the result of decode is a unicode string:

>>> type(c)
<type 'str'>
>>> type(c.decode('latin1'))
<type 'unicode'>
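Since the goal is a Python 3 port, here is a minimal sketch of how the same session might look there, assuming the data arrives as bytes. In Python 3 the Python 2 str roughly corresponds to bytes, and unicode corresponds to str. The variable names (raw, text, round_trip) are only illustrative and are not from the original code.

```python
# Python 3: the raw UTF-8 bytes from the question, written as a bytes literal.
raw = b'\xe5\xb8\x90\xe6\x88\xb7'

# decode() turns bytes into text (str); this mirrors c.decode('utf8') in Python 2.
text = raw.decode('utf8')

print(type(raw))     # <class 'bytes'>
print(type(text))    # <class 'str'>
print(ascii(text))   # '\u5e10\u6237'

# Six bytes decode to two characters.
print(len(raw), len(text))   # 6 2

# encode() goes the other way and gives back the original bytes.
round_trip = text.encode('utf8')
print(round_trip == raw)     # True
```

The practical consequence for a port is that decode is only available on bytes and encode only on str, so each call site has to be clear about which of the two it holds.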

If you declare c as unicode and keep the same input, it will not print the same characters:

>>> c=u'\xe5\xb8\x90\xe6\x88\xb7'
>>> print c
å¸æ·

If you use the input '\u5e10\u6237', it will print the initial characters:

>>> c=u'\u5e10\u6237'
>>> print c
帐户
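The two unicode literals above differ only in which table was applied to the same six byte values. As a rough Python 3 sketch (variable names are illustrative), the garbled form is what you get when UTF-8 bytes are decoded as latin1, and it can be repaired by reversing that step:

```python
# Python 3 sketch: text decoded with the wrong table, then repaired.
raw = '\u5e10\u6237'.encode('utf8')       # b'\xe5\xb8\x90\xe6\x88\xb7'

# Wrong table: latin1 maps each byte to the code point with the same value,
# so six bytes become six unrelated characters.
garbled = raw.decode('latin1')
print(ascii(garbled))                     # '\xe5\xb8\x90\xe6\x88\xb7'

# Repair: go back to the original bytes, then decode with the right table.
repaired = garbled.encode('latin1').decode('utf8')
print(ascii(repaired))                    # '\u5e10\u6237'
```

This repair only works because latin1 assigns a character to every byte value, so no information is lost in the wrong decode.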

Encoding and decoding are just a matter of using a table of correspondence value<->character. The thing is that the same value does not render as the same character, depending on the encoding (i.e. the table) used.
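To make the table idea concrete, here is a small Python 3 sketch using encodings from the standard library. The specific byte and character are chosen only for illustration.

```python
# One value, two tables: the byte 0x80 means different things.
value = b'\x80'
as_latin1 = value.decode('latin1')   # U+0080, a control character
as_cp1252 = value.decode('cp1252')   # U+20AC, the euro sign
print(ascii(as_latin1), ascii(as_cp1252))   # '\x80' '\u20ac'

# One character, two tables: the stored values differ.
char = '\u00e5'                      # the letter a with ring above
print(char.encode('latin1'))         # b'\xe5'
print(char.encode('utf8'))           # b'\xc3\xa5'
```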

The main difficulty is when you don't know the encoding of an input string that you have to handle. Some tools can try to guess it, but they are not always successful (see https://superuser.com/questions/301552/how-to-auto-detect-text-file-encoding).
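When the encoding is unknown, one simple approach in Python 3 is to try a short list of candidates in order. The sketch below is only an assumption about how this might be done: decode_with_fallback is a hypothetical helper, not a standard library function, and the candidate list is arbitrary.

```python
def decode_with_fallback(data, candidates=('utf8', 'latin1')):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper for illustration; the candidate order is an assumption.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the candidate encodings matched')


# Valid UTF-8 is accepted by the first candidate.
text, used = decode_with_fallback(b'\xe5\xb8\x90\xe6\x88\xb7')
print(used)    # utf8

# A lone 0xE9 byte is not valid UTF-8, so the helper falls through to latin1.
text2, used2 = decode_with_fallback(b'caf\xe9')
print(used2)   # latin1
```

A clean decode does not prove the guess is right: latin1 accepts every byte sequence, so it should come last and its result should be treated as a best effort.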
