Performing HTTP SSL server certificate validation from Python or Perl

Update: Since writing this post I’ve switched to using Python Requests, which is a much better way of making verified SSL connections.

SSL/TLS, love it or hate it, is the backbone of nearly all online communication. These days most network protocols are written atop HTTP and then wrapped inside TLS (HTTPS) to provide encryption. But HTTPS also provides verification and trust via certificates. That way you can be sure not only that your data is encrypted in transit, but also that you are talking to the real server rather than a rogue one.

You would think, given the prevalence of systems built on HTTPS, that writing an HTTPS client in Python or Perl would be easy and that all the complex security and verification would be done for you. Sadly not so. Simply opening an HTTPS connection is extremely easy in both languages: you can use urllib2 in Python and LWP in Perl. Both provide encryption, but certificate verification? Not so easy.

Perl is best placed here because the current version of LWP (or any version from 6.00) performs certificate validation. When you connect to an SSL HTTP server it will validate the certificate and ensure that the CN of the certificate matches the hostname you specified. For more details, see the LWP docs for SSL opts. This feature is relatively recent, however: it was only released in March 2011. As such, nearly all current stable Linux distributions ship an earlier version of LWP.

Previous releases could be made, with some difficulty, to validate the certificate, but not to verify that the CN of the certificate matched the hostname the code was connecting to. That wasn’t all that useful, of course, because the connection would succeed as long as the server presented /any/ valid SSL certificate trusted by the client, even one issued for a different host. So for LWP connections using SSL, make sure you update your module to LWP 6 or later; you can do this easily even on old versions of Perl.

What about Python? Surely Python’s famous ease of use and great built-in modules will mean it works? Alas, no. The situation is even worse. As the urllib2 documentation for Python 2.7 says, HTTPS connections do not perform certificate verification. Python 3.0 and 3.1 don’t support it either. Python 3.2 or later, thankfully, does support SSL certificate verification: it has new options for specifying the CA certificate details. It even does hostname (CN) verification. But Python 3.2 is very new as well. It was first released in February 2011 and is in virtually no Linux distributions, certainly no stable or enterprise releases.
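As a rough sketch of what verified connections look like with the standard library on a newer Python 3, the snippet below builds an SSL context that checks both the certificate chain and the hostname, then hands it to urllib. It assumes Python 3.4.3 or later, since ssl.create_default_context() and the context argument to urlopen() arrived after 3.2. The function names make_verifying_context and fetch are made up for illustration.

```python
import ssl
import urllib.request


def make_verifying_context(ca_bundle=None):
    """Return an SSLContext that validates the chain and the hostname.

    ca_bundle is an optional path to a PEM file of CA certificates; when it
    is None the system's default trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_bundle)
    # create_default_context() already enables both checks; they are set
    # explicitly here only to make the intent visible.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def fetch(url, ca_bundle=None, timeout=10):
    """Fetch url over HTTPS; an exception is raised if verification fails."""
    context = make_verifying_context(ca_bundle)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

Calling fetch("https://secure.domain.com/") would then fail with an exception, rather than silently succeeding, if the server's certificate is untrusted or was issued for a different hostname.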

However, there is another option. Many programs written in Python do perform proper SSL certificate checking. A little poking around in the source of some Red Hat utilities revealed the answer: they don’t use urllib, they use libcurl / pycurl instead. The Python curl module supports both SSL certificate verification and hostname verification, and as such it is used by virtually all the Python tools that need to talk over HTTPS securely. Sadly, many tools don’t use it, and were probably written with urllib on the assumption that it verified certificates.

Sadly, pycurl is really badly documented, so here is a little example of how to do it:

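import pycurl

curl = pycurl.Curl()
curl.setopt(pycurl.URL, "https://secure.domain.com/")
# Validate the server certificate against the CA bundle given in CAINFO.
curl.setopt(pycurl.SSL_VERIFYPEER, 1)
# Check that the certificate matches the hostname in the URL (must be 2).
curl.setopt(pycurl.SSL_VERIFYHOST, 2)
curl.setopt(pycurl.CAINFO, "/path/to/certificate-chain-bundle.crt")
# Raises pycurl.error if the request or the verification fails.
curl.perform()
curl.close()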

The “SSL_VERIFYPEER” flag means that cURL will check the validity of the certificate against the certificate chain / root CA certificates. The “SSL_VERIFYHOST” flag means that cURL will check that the certificate CN matches the hostname you connected to. The latter option must be set to 2; see the link for more information.
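In practice you will probably want to capture the response body and handle verification failures rather than let them propagate. The following is a minimal sketch of one way to do that, using the same options as above. The function name fetch_verified and its return convention are invented for this example, and it assumes the pycurl module is installed.

```python
from io import BytesIO

import pycurl


def fetch_verified(url, ca_bundle):
    """Return (True, body_bytes) on success or (False, error_message)."""
    body = BytesIO()
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.SSL_VERIFYPEER, 1)   # validate the certificate chain
    curl.setopt(pycurl.SSL_VERIFYHOST, 2)   # check the hostname as well
    curl.setopt(pycurl.CAINFO, ca_bundle)
    # Collect the response body in memory instead of printing it to stdout.
    curl.setopt(pycurl.WRITEFUNCTION, body.write)
    try:
        curl.perform()
    except pycurl.error as exc:
        # pycurl.error carries the libcurl error number and its message.
        code, message = exc.args[0], exc.args[1]
        return False, "curl error %s: %s" % (code, message)
    finally:
        curl.close()
    return True, body.getvalue()
```

A certificate that fails either check surfaces as a pycurl.error, so a call such as fetch_verified("https://secure.domain.com/", "/path/to/certificate-chain-bundle.crt") returns False together with the libcurl message instead of the page content.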