10.6. Python3 Migration

10.6.1. General Python3 Notes

  • there is no PyGtk for python3. instead use Gtk3 via gobject-introspection (the PyGObject bindings).

  • py3 strings

    py2          py3    typical usage
    -----------  -----  ---------------------------------------------------------
    str / bytes  bytes  bunch of bytes, possibly binary / non-readable
    unicode      str    text, bunch of unicode-codepoints, usually human readable

py2:

str == bytes # -> True
a = b"s\xc3\xb6me" # a is of type str (and bytes), length 5 bytes
a2 = "s\xc3\xb6me" # same as a
b = a.decode("utf-8") # b is of type unicode, length 4 unicode-codepoints/chars
c = u"söme" # c is of type unicode, length 4 chars
d = c.encode("utf-8") # d is again of type str, length 5 bytes

open("some_file", "rb").read() # returns str
open("some_file", "r").read() # returns str

py3:

str == bytes # -> False
a = "söme" # a is of type str, length 4 unicode-codepoints/chars
a2 = u"söme" # same as a
b = a.encode("utf-8") # b is of type bytes, length 5
c = b.decode("utf-8") # c is again of type str, length 4

open("some_file", "rb").read() # returns bytes
open("some_file", "r").read() # returns str, decodes contents with
# locale.getpreferredencoding() (usually utf-8)
  • py3 replaced the print statement with the print() function.

  • py3 dict has no .iteritems() (use .items() instead)

  • py3 very often returns iterators or views where py2 returned lists (e.g. map(), filter(), zip(), range(), dict.keys()). wrap the result in list() where a real list is needed.

  • py3 no longer understands the L-suffix for long integers (like 123L). just write 123, the py3 int has arbitrary precision.

  • py3 removed these builtins: unicode() (use str()), file() (use open()), xrange() (use range())
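the list above can be illustrated with a few py3 replacements for common py2 idioms. this is only a minimal sketch; the variable names are made up for illustration and have nothing to do with LN.

```python
# py3 replacements for the py2 idioms listed above (names are illustrative only)
d = {"a": 1, "b": 2}

# py2: for k, v in d.iteritems(): ...
pairs = sorted(d.items())

# map() returns an iterator in py3; wrap it in list() if a list is needed
doubled = list(map(lambda x: 2 * x, [1, 2, 3]))

# py2: xrange(3)  ->  py3: range(3), which is lazy like xrange was
indices = list(range(3))

# py2: 123L  ->  py3: plain int, arbitrary precision
big = 2 ** 64

# py2: unicode(42)  ->  py3: str(42)
as_text = str(42)

# py2: print "pairs:", pairs  ->  py3: print() is a function
print("pairs:", pairs)
```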

i suggest starting the migration by running 2to3 on your python source.
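2to3 mostly takes care of syntactic changes. whether a value is text or binary data is something it cannot decide for you, so file access usually needs a manual look. a minimal sketch of making that boundary explicit, assuming utf-8 encoded text files; read_text and read_binary are hypothetical helper names, not part of any library:

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """return the file contents as str, decoded with an explicit encoding."""
    # passing encoding= avoids depending on locale.getpreferredencoding()
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def read_binary(path):
    """return the file contents unmodified as bytes."""
    with open(path, "rb") as f:
        return f.read()


# usage: write 5 bytes of utf-8, read them back both ways
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "some_file")
    with open(path, "wb") as f:
        f.write("söme".encode("utf-8"))
    text = read_text(path)   # str, 4 codepoints
    raw = read_binary(path)  # bytes, 5 bytes
```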

10.6.2. LN py3 binding and char-type in message-definitions

LN has always recommended using the char-type within message definitions to transport text. accordingly, the py3 binding translates an array of char to a py3-str by automatically decoding it with the utf-8 codec (and, in the other direction, encodes a py3-str to utf-8 bytes for char).

BUT: never try to transport some random binary data via the char-type – as the binary sequence might not translate to valid unicode-codepoints! instead use the uint8_t-type for binary data! this will be passed unmodified as py3-bytes.
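why this matters can be shown in plain python, without LN at all. the sketch below only mimics the utf-8 decoding step described above with bytes.decode(); it does not use the LN binding, and the byte values are made up for illustration.

```python
# text survives the utf-8 round trip that a char field goes through
text = "söme"
as_char = text.encode("utf-8")         # 5 bytes on the wire
roundtrip = as_char.decode("utf-8")    # back to the same 4-codepoint str

# arbitrary binary data is usually not valid utf-8 (0xff can never occur in utf-8)
binary = bytes([0xFF, 0xFE, 0x00, 0x80])
try:
    binary.decode("utf-8")
    binary_ok = True
except UnicodeDecodeError:
    # this is the failure you would run into with binary data in a char field
    binary_ok = False
```

binary data like this has to stay bytes end to end, which is what the uint8_t-type gives you.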

sadly the file_services example client used the char-type in its message-definitions (write_file and read_from_file) to transport file-contents. those md’s were fixed and renamed to file_services2/write_file and file_services2/read_from_file.

the fs_sync python program and the file_services executable were adjusted for this.

please check your message definitions: use char for utf-8 encoded text and uint8_t for arbitrary binary data.

(as always: please rename your md’s on change if they were already released, e.g. put your major version into the name of your md’s. changing a md is changing the interface, which should trigger a major-version change…)