Migrating to Python 3

I migrated the Python 2.7 code for hydrus to Python 3 (3.6 on FreeBSD). The automatic conversion tool, 2to3-3.6, handled most of the grunt work, which covers the first five conversions below. The others took a little more consideration.

  1. Turn print statement into a function
  2. Change except statements to use 'as' to introduce the exception variable
  3. module urlparse replaced by urllib.parse
  4. module HTMLParser replaced by html.parser
  5. module httplib replaced by http.client
  6. Use f-strings (f'...') for variable interpolation in message strings (optional)
  7. Specify an explicit UTF-8 encoding when opening files. Reading files can be tricky: if the LANG environment variable is set (for example to en_GB.UTF-8), its encoding is used as the default. If LANG is unset, the default is ASCII and read() raises an error on non-ASCII characters. E.g.
          Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/usr/local/lib/python3.6/encodings/ascii.py", line 26, in decode
          return codecs.ascii_decode(input, self.errors)[0]
          UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3277: ordinal not in range(128)
        
    It took me a while to work out why the CGI script raised errors under Apache but none from the command line: LANG was set in my shell, but not in the environment the script ran in under Apache.
  8. Reading a response from urllib.request returns a bytes object, which must be decoded to a str (here from UTF-8) before being passed to html.parser.HTMLParser.feed()
  9. Sorting is handled differently in Python 3, resulting in errors like:
          TypeError: '<' not supported between instances of 'ipa' and 'ipa'
        

    This occurred when sorting a list of tuples, where I'd used the 'decorate-sort' pattern to sort on the count attribute of ipa instances. Python 2 doesn't mind if the second object in the tuple is not orderable, but Python 3 clearly does: when two counts tie, it falls back to comparing the ipa instances themselves. The simplest fix was to add a key function to the sort, which explicitly sorts on the first element of the tuple:

          sorted_list = sorted(ls, key=lambda tup: tup[0])
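
As a minimal sketch of the fix in item 9, the snippet below uses a hypothetical stand-in for the ipa class (the real class is not shown here, so the Ipa name, its addr attribute, and the sample values are assumptions). It shows the tie that triggers the TypeError and how a key function avoids it.

```python
class Ipa:
    """Hypothetical stand-in for the ipa class; defines no ordering methods."""
    def __init__(self, addr, count):
        self.addr = addr
        self.count = count

items = [Ipa("10.0.0.1", 3), Ipa("10.0.0.2", 3), Ipa("10.0.0.3", 1)]

# Decorate: pair each instance with the value to sort on.
decorated = [(item.count, item) for item in items]

# A plain sort of the tuples fails on the tied counts, because Python 3
# then tries to compare the Ipa instances themselves.
try:
    sorted(decorated)
    tie_error = None
except TypeError as err:
    tie_error = err

# Sorting on the first tuple element only never compares the instances.
by_count = sorted(decorated, key=lambda tup: tup[0])

# Without the decoration, a key on the attribute gives the same order.
direct = sorted(items, key=lambda item: item.count)
```

Note that list.sort() sorts in place and returns None, whereas sorted() returns a new list, which is why sorted() is the one to use when assigning the result.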
        
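Items 1 to 6 can be seen together in a short Python 3 sketch. The TitleParser class, the URL, and the sample markup are hypothetical and not taken from the hydrus code; the comments show the Python 2 form that each line replaces.

```python
# Python 3 names of the renamed modules (items 3 to 5).
from urllib.parse import urlparse       # was: from urlparse import urlparse
from html.parser import HTMLParser      # was: from HTMLParser import HTMLParser
import http.client                      # was: import httplib

class TitleParser(HTMLParser):
    """Hypothetical parser that collects the text of the <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

url = "http://example.org/index.html"   # placeholder URL
host = urlparse(url).netloc

parser = TitleParser()
parser.feed("<html><head><title>Example</title></head></html>")

try:
    int("not a number")
except ValueError as err:               # was: except ValueError, err:
    message = f"conversion failed for {host}: {err}"   # f-string (item 6)

print(message)                          # was: print message
print(http.client.responses[200])       # standard reason phrase for a status code
```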

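For items 7 and 8, the following sketch shows the explicit encoding on open() and the decode step before HTMLParser.feed(). The file name, the sample text, and the TextCollector class are assumptions for illustration, and the bytes are built locally rather than fetched with urllib.request so that the example runs without a network.

```python
import os
import tempfile
from html.parser import HTMLParser

text = "it\u2019s"   # contains a non-ASCII apostrophe (UTF-8 bytes e2 80 99)

# Item 7: name the encoding explicitly so the result does not depend on
# LANG or the locale of the process running the script.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as fh:
    fh.write(text)
with open(path, encoding="utf-8") as fh:
    round_trip = fh.read()

# Item 8: a response body arrives as bytes and must be decoded to str.
raw = ("<p>" + text + "</p>").encode("utf-8")

# Decoding as ASCII reproduces the kind of error shown in the traceback.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = err

class TextCollector(HTMLParser):
    """Hypothetical parser that gathers the text between tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

collector = TextCollector()
collector.feed(raw.decode("utf-8"))   # feed() expects str, not bytes
collector.close()
```
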
Support for Python 3 version of libxml2 on OpenBSD

The libxml2 bindings for Python 3 do not appear to exist in packages or ports on OpenBSD 6.5. However, the source can be downloaded from pypi.org.

The module is built and installed as follows:

  [mark@chrome:~/dev/libxml2-python3-2.9.5]$ python setup.py build
  [mark@chrome:~/dev/libxml2-python3-2.9.5]$ doas python setup.py install
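
A quick way to confirm that the install worked is to check that the module can be found and parse a trivial document. This is only a sketch: the module_available helper is hypothetical, and the parseDoc() call assumes the bindings accept a str under Python 3.

```python
import importlib.util

def module_available(name):
    """Return True if the named top-level module can be found on sys.path."""
    return importlib.util.find_spec(name) is not None

if module_available("libxml2"):
    import libxml2
    # Parse a tiny in-memory document and report its root element name.
    doc = libxml2.parseDoc("<root><item>test</item></root>")
    print("libxml2 loaded, root element:", doc.getRootElement().name)
    doc.freeDoc()   # the bindings require documents to be freed explicitly
else:
    print("libxml2 bindings not found")
```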

Since httpd on OpenBSD runs chrooted, using Python 3 with CGI requires a set of support files to be copied into the chroot environment. I'd done this for Python 2 before, and this is how it looks for Python 3:

  #!/bin/sh
  #
  # Enable Python3 to operate in a chrooted environment (for httpd)
  #
  export CHROOT=/home/www
  rm -rf ${CHROOT}/usr ${CHROOT}/var ${CHROOT}/etc ${CHROOT}/sbin \
    ${CHROOT}/run ${CHROOT}/logs
  mkdir -p ${CHROOT}/usr/local/bin ${CHROOT}/usr/lib ${CHROOT}/usr/libexec \
    ${CHROOT}/sbin ${CHROOT}/var/run ${CHROOT}/etc \
    ${CHROOT}/usr/local/lib ${CHROOT}/run ${CHROOT}/logs
  cp -p /sbin/ldconfig ${CHROOT}/sbin
  cp -p /usr/local/bin/python3.7 ${CHROOT}/usr/local/bin/python
  cp -pr /usr/local/lib/python3.7 ${CHROOT}/usr/local/lib
  cp -p /usr/local/lib/libpython3.7m.so.*  ${CHROOT}/usr/local/lib
  cp -p /usr/local/lib/libintl.so.6.0 ${CHROOT}/usr/local/lib
  cp -p /usr/local/lib/libiconv.so.6.0 ${CHROOT}/usr/local/lib
  cp -p /usr/lib/libpthread.so.* ${CHROOT}/usr/lib
  cp -p /usr/lib/libutil.so.* ${CHROOT}/usr/lib
  cp -p /usr/lib/libstdc++.so.* ${CHROOT}/usr/lib
  cp -p /usr/lib/libm.so.* ${CHROOT}/usr/lib
  cp -p /usr/lib/libc.so.* ${CHROOT}/usr/lib
  cp -p /usr/libexec/ld.so ${CHROOT}/usr/libexec
  cp -p /usr/lib/libz.so.* ${CHROOT}/usr/lib
  cp -p /usr/lib/libssl.so* ${CHROOT}/usr/lib
  cp -p /usr/lib/libcrypto.so* ${CHROOT}/usr/lib
  cp -p /etc/pwd.db ${CHROOT}/etc
  # build ld.hints.so file so python can find its libraries
  chroot ${CHROOT} /sbin/ldconfig /usr/local/lib
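
Once the files are in place, a minimal CGI script can confirm that the interpreter starts inside the chroot. The sketch below is an assumption about how such a check might look, not part of the original setup; the build_response name is hypothetical, and the shebang uses the path the script above copies the interpreter to.

```python
#!/usr/local/bin/python
import sys

def build_response():
    """Return a minimal CGI response: header, blank line, then the body."""
    body = f"Python {sys.version.split()[0]} is running inside the chroot\n"
    return "Content-Type: text/plain; charset=utf-8\r\n\r\n" + body

if __name__ == "__main__":
    # The body is plain ASCII, so it prints safely even when LANG is unset.
    sys.stdout.write(build_response())
```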