10.6. Python3 Migration
10.6.1. General Python3 Notes
there is no PyGtk for python3. instead use Gtk3 via gobject-introspection (PyGObject).
py3 strings

py2          | py3    | typical usage
str / bytes  | bytes  | bunch of bytes, possibly binary / non-readable
unicode      | str    | text, bunch of unicode-codepoints, usually human readable
py2:
str == bytes # -> True
a = b"s\xc3\xb6me" # a is of type str (and bytes), length 5 bytes
a2 = "s\xc3\xb6me" # same as a
b = a.decode("utf-8") # b is of type unicode, length 4 unicode-codepoints/chars
c = u"söme" # c is of type unicode, length 4 chars
d = c.encode("utf-8") # d is again of type str, length 5 bytes
open("some_file", "rb").read() # returns str
open("some_file", "r").read() # returns str
py3:
str == bytes # -> False
a = "söme" # a is of type str, length 4 unicode-codepoints/chars
a2 = u"söme" # same as a
b = a.encode("utf-8") # b is of type bytes, length 5
c = b.decode("utf-8") # c is again of type str, length 4
open("some_file", "rb").read() # returns bytes
open("some_file", "r").read() # returns str, decodes contents with
# locale.getpreferredencoding() (usually utf-8)
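because text mode depends on the locale, it is usually safer to name the encoding explicitly. the following py3 sketch shows the same round trip as above against a real file; the scratch path is a hypothetical one created just for this illustration.

```python
import os
import tempfile

text = "söme"                   # str: 4 unicode-codepoints
data = text.encode("utf-8")     # bytes: 5 bytes

# hypothetical scratch file, only used for this illustration
path = os.path.join(tempfile.mkdtemp(), "some_file")

# binary mode: write the bytes exactly as they are
with open(path, "wb") as fp:
    fp.write(data)

# binary mode: read returns bytes, no decoding happens
with open(path, "rb") as fp:
    raw = fp.read()

# text mode with an explicit encoding: read returns str and
# does not depend on locale.getpreferredencoding()
with open(path, "r", encoding="utf-8") as fp:
    decoded = fp.read()

print(type(raw).__name__, len(raw))          # bytes 5
print(type(decoded).__name__, len(decoded))  # str 4
```

passing encoding="utf-8" keeps the result identical on machines whose locale is not utf-8.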
- py3 replaced the print-instruction with a print()-function.
- py3 dict has no .iteritems() (use .items() instead).
- py3 very often returns iterators/generator-objects where py2 returned lists.
- py3 no longer understands the L-postfix for numbers (like 123L).
- py3 removed these builtins: unicode(), file(), xrange()
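the points above can be sketched in a few lines of py3. this is only an illustration with made-up values; the replacements shown (str, open, range) are the usual py3 counterparts of the removed py2 names.

```python
d = {"a": 1, "b": 2}

# print is a function, so it needs parentheses
print("number of entries:", len(d))

# .items() replaces .iteritems(); it returns a view, not a list
pairs = sorted(d.items())

# map() returns an iterator in py3; wrap it in list() when a list is needed.
# range() replaces xrange() and is lazy as well
lazy = map(lambda x: x * x, range(4))
squares = list(lazy)

# no L-postfix: py3 int has arbitrary precision
big = 2 ** 64

# str replaces unicode(), open() replaces file()
label = str(123)

print(pairs, squares, big, label)
```

a quick way to find such spots in old code is to look for places that index or len() the result of map(), filter(), zip() or dict.keys().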
i suggest starting the migration by running 2to3 on your python source.
10.6.2. LN py3 binding and char-type in message-definitions
LN has always recommended using the char-type within message definitions to transport text. accordingly, the py3 binding translates an array of char to a py3-str by automatically decoding it with the utf-8 codec (and encodes py3-str values to utf-8 bytes when filling a char array).
BUT: never try to transport arbitrary binary data via the char-type, as the byte sequence might not be valid utf-8 and then cannot be decoded to unicode-codepoints! instead use the uint8_t-type for binary data! this will be passed unmodified as py3-bytes.
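why this matters can be shown without LN at all. the sketch below only mimics the utf-8 decoding step described above in plain py3; is_valid_utf8 is a hypothetical helper for illustration and not part of the LN binding.

```python
def is_valid_utf8(data):
    """return True if data decodes as utf-8 (hypothetical helper, not LN api)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# utf-8 encoded text: safe to transport via char
text_payload = "söme".encode("utf-8")

# arbitrary binary data: 0xff never occurs in valid utf-8,
# so this should go into a uint8_t field instead
binary_payload = bytes([0xFF, 0xFE, 0x00, 0x80])

print(is_valid_utf8(text_payload))    # True
print(is_valid_utf8(binary_payload))  # False
```

any payload for which such a check fails would raise a UnicodeDecodeError when decoded as text, which is why file contents and similar data belong in uint8_t fields.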
sadly the file_services example client used the char-type in its message-definitions (write_file and read_from_file) to transport file-contents. those md’s were fixed and renamed to file_services2/write_file and file_services2/read_from_file.
the fs_sync python program and the file_services executable were adjusted for this.
please check your message definitions: use char for utf-8 encoded text and uint8_t for arbitrary binary data.
(as always: please rename your md’s on change if they were already released, e.g. put the major version into the name of your md’s. changing a md changes the interface, which should trigger a major-version change…)