Skip to content

Commit de8cdfc

Browse files
committed
[bpo-28414] In SSL module, store server_hostname as an A-label
Historically, our handling of international domain names (IDNs) in the ssl module has been very broken. The flow went like: 1. User passes server_hostname= to the SSLSocket/SSLObject constructor. This gets normalized to an A-label by using the PyArg_Parse "et" mode: bytes objects get passed through unchanged (assumed to already be A-labels); str objects get run through .encode("idna") to convert them into A-labels. 2. newPySSLSocket takes this A-label, and for some reason decodes it *back* to a U-label, and stores that as the object's server_hostname attribute. 3. Later, this U-label server_hostname attribute gets passed to match_hostname, to compare against the hostname seen in the certificate. But certificates contain A-labels, and match_hostname expects to be passed an A-label, so this doesn't work at all. This PR fixes the problem by removing the pointless decoding at step 2, so that internally we always use A-labels, which matches how internet protocols are designed in general: A-labels are used everywhere internally and on-the-wire, and U-labels are basically just for user interfaces. This also matches the general advice to handle encoding/decoding once at the edges, though for backwards-compatibility we continue to use 'str' objects to store A-labels, even though they're now always ASCII. Technically there is a minor compatibility break here: if a user examines the .server_hostname attribute of an ssl-wrapped socket, then previously they would have seen a U-label like "pythön.org", and now they'll see an A-label like "xn--pythn-mua.org". But this only affects non-ASCII domain names, which have never worked in the first place, so it seems unlikely that anyone is relying on the old behavior. This PR also adds an end-to-end test for IDN hostname validation. Previously there were no tests for this functionality. Fixes bpo-28414.
1 parent 0df1905 commit de8cdfc

File tree

7 files changed

+110
-11
lines changed

7 files changed

+110
-11
lines changed

Doc/library/ssl.rst

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1243,6 +1243,12 @@ SSL sockets also have the following additional methods and attributes:
12431243

12441244
.. versionadded:: 3.2
12451245

1246+
.. versionchanged:: 3.7
1247+
When ``server_hostname`` is an internationalized domain name
1248+
(IDN), this attribute now stores the A-label form
1249+
(``"xn--pythn-mua.org"``), rather than the U-label form
1250+
(``"pythön.org"``).
1251+
12461252
.. attribute:: SSLSocket.session
12471253

12481254
The :class:`SSLSession` for this SSL connection. The session is available

Doc/whatsnew/3.7.rst

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -297,6 +297,16 @@ string
297297
expression pattern for braced placeholders and non-braced placeholders
298298
separately. (Contributed by Barry Warsaw in :issue:`1198569`.)
299299

300+
ssl
301+
---
302+
303+
Added support for validating server certificates containing
304+
internationalized domain names (IDNs). As part of this change, the
305+
:attr:`ssl.SSLSocket.server_hostname` attribute now stores the
306+
expected hostname in A-label form (``"xn--pythn-mua.org"``), rather
307+
than the U-label form (``"pythön.org"``). (Contributed by
308+
Nathaniel J. Smith in :issue:`28414`.)
309+
300310
unittest.mock
301311
-------------
302312

Lib/test/ssl-idn-ca.pem

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
-----BEGIN CERTIFICATE-----
2+
MIICQDCCAamgAwIBAgIUBg5Z+vupJbrxjKmrYeJ6nb3xnQEwDQYJKoZIhvcNAQEL
3+
BQAwQDEXMBUGA1UECgwOdHJ1c3RtZSB2MC40LjAxJTAjBgNVBAsMHFRlc3Rpbmcg
4+
Q0EgI3QxM0tzY2dCQm8xQzZpTnUwIBcNMDAwMTAxMDAwMDAwWhgPMzAwMDAxMDEw
5+
MDAwMDBaMEAxFzAVBgNVBAoMDnRydXN0bWUgdjAuNC4wMSUwIwYDVQQLDBxUZXN0
6+
aW5nIENBICN0MTNLc2NnQkJvMUM2aU51MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCB
7+
iQKBgQDRWbKkz1+y31q1bu5P5J/XOjSwEac1ESG2G7W6hbYsVTn6OqqjXvebE3ex
8+
+pNd/ciBHMv0SzlqKyo5l0BNLOjlth8C7j9LbUimddl4rpkpmtEuu4acwPT9pzts
9+
jHxSPehJsF26ixReg8qi/E8Rsrri+3swFbI0pos6pQZo81HvjwIDAQABozUwMzAd
10+
BgNVHQ4EFgQUYDMWkOTMwi9BwFK0blya/ou4r/YwEgYDVR0TAQH/BAgwBgEB/wIB
11+
CTANBgkqhkiG9w0BAQsFAAOBgQC6BjPf9juAfJPNVqRX3qAWIf6wpOVWX1CO/Qtc
12+
HSiCxxpTv2xAGX9ZAwK8liBKR4qGd0lDmujpKVLKAdsWlhFWJNgO3VyQgTkOYBzf
13+
6fq2RE1oKXmqg2H7ndcku6TACNVCyFv4hbN4RISXO0Al/gR4h3lL9+05BXxT2eb5
14+
dcUonA==
15+
-----END CERTIFICATE-----

Lib/test/ssl-idn-cert.pem

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
-----BEGIN RSA PRIVATE KEY-----
2+
MIICXQIBAAKBgQCjbzNwZrxp50RAVF55jSw5M//KD5/kswwdpUePux00JS5WnHh3
3+
SE98rnrWS34ryblBEMOEgIZPuYFjLs0fVN1XgmUHxs2cDPFUOpBrK2tf+nMDN0o9
4+
AZG4V3e6wbwOKPyIybJyhkyCe1jj5oXjhrYTcDB0TIteAmtRkLU4nZlLrwIDAQAB
5+
AoGAYhHyTfp4CRyLaga2gj3iUZkQXpGtorCGDqwFGwxu48GD4tkVuI4dlHWmpDy8
6+
w03S6mZCzJnK/sAUEg4dbDWic1D9QyocWtpFFPJ3RyWKEuzN9Ka518dAzRtKya+9
7+
oUovXCfCAiw3gwi2sO6QeADPbnScNgLlSiOZTOyTTJhydyECQQDPCnRxqXjLJjgT
8+
Wpur6oSLiLmtitG1KsU80d7X2yqCnScTysw4IwYoOdq4BmiwSSnwV2Glha4pZKz1
9+
trghZ9opAkEAyhT1vLNON4WDc6vpbWYCGFf3TaJRSqdd/hjaUmBeR9Wsa+bUccxS
10+
6Jk2RcfP0dv8NT4ZY6bILNktCUr4V9liFwJBAJ+jTA2nwn/BRFOH9agk93YvQhvR
11+
gcjS5anzmIOPdcOoMM1N/RD70G+LzF1Ac9AZWcD7X0slPBimi8YZ0PfQ/6ECQQCU
12+
hAT6EvlIGsq6Jz0d1ptxkqzBFKsT559PkKpbYlHID4RxpKq7m8PPCFL3w9q7TCa2
13+
ZpY4Q6nYNCBCNSQBRFUvAkBhGZjMj25DqZO3vsW2LqwlBVk+hIih0LrLHThCuzT1
14+
rKdC2AdCvu3FZCbNg0lpjiYXEYCXvCP4c5TqS3H8uaC8
15+
-----END RSA PRIVATE KEY-----
16+
-----BEGIN CERTIFICATE-----
17+
MIICjTCCAfagAwIBAgIUHg5Af4BmKSy4Xyot/yAGeSkO+PswDQYJKoZIhvcNAQEL
18+
BQAwQDEXMBUGA1UECgwOdHJ1c3RtZSB2MC40LjAxJTAjBgNVBAsMHFRlc3Rpbmcg
19+
Q0EgI3QxM0tzY2dCQm8xQzZpTnUwIBcNMDAwMTAxMDAwMDAwWhgPMzAwMDAxMDEw
20+
MDAwMDBaMEkxFzAVBgNVBAoMDnRydXN0bWUgdjAuNC4wMS4wLAYDVQQLDCVUZXN0
21+
aW5nIHNlcnZlciBjZXJ0ICNaRlFmTEh1MDZaYU56UFYxMIGfMA0GCSqGSIb3DQEB
22+
AQUAA4GNADCBiQKBgQCjbzNwZrxp50RAVF55jSw5M//KD5/kswwdpUePux00JS5W
23+
nHh3SE98rnrWS34ryblBEMOEgIZPuYFjLs0fVN1XgmUHxs2cDPFUOpBrK2tf+nMD
24+
N0o9AZG4V3e6wbwOKPyIybJyhkyCe1jj5oXjhrYTcDB0TIteAmtRkLU4nZlLrwID
25+
AQABo3kwdzAdBgNVHQ4EFgQU1BZqNqO4rmUwk8V015AXCb8+keQwDAYDVR0TAQH/
26+
BAIwADAfBgNVHSMEGDAWgBRgMxaQ5MzCL0HAUrRuXJr+i7iv9jAnBgNVHREBAf8E
27+
HTAbghl4bi0tcHl0aG4tbXVhLmV4YW1wbGUub3JnMA0GCSqGSIb3DQEBCwUAA4GB
28+
ABJ4tUqfj9gHEYGxousPf7HSX/ZHF8e9HW+qpOX/urPRdGM0ObYrUlPgKJ1NIlA2
29+
HSOPWGVQgvk6P84s0oBYLAJ0C2CrKg2AQsusFn9s8dAM9hlYNEK9rfTQILxrnCyz
30+
vpg6hKEGXN0UjYPb5HBPFKsWF0DbbNaWrr0co32yH2L8
31+
-----END CERTIFICATE-----

Lib/test/test_ssl.py

Lines changed: 42 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,15 @@ def data_file(*name):
8080
DHFILE = data_file("dh1024.pem")
8181
BYTES_DHFILE = os.fsencode(DHFILE)
8282

83+
# These were generated by doing 'pip install trustme', and then:
84+
# import trustme
85+
# ca = trustme.CA()
86+
# cert = ca.issue_server_cert("pythön.example.org")
87+
# ca.cert_pem.write_to_path("ssl-idn-ca.pem")
88+
# cert.private_key_and_cert_chain_pem.write_to_path("ssl-idn-cert.pem")
89+
IDN_CA = data_file("ssl-idn-ca.pem")
90+
IDN_CERT = data_file("ssl-idn-cert.pem")
91+
8392
# Not defined in all versions of OpenSSL
8493
OP_NO_COMPRESSION = getattr(ssl, "OP_NO_COMPRESSION", 0)
8594
OP_SINGLE_DH_USE = getattr(ssl, "OP_SINGLE_DH_USE", 0)
@@ -1473,16 +1482,6 @@ def test_subclass(self):
14731482
# For compatibility
14741483
self.assertEqual(cm.exception.errno, ssl.SSL_ERROR_WANT_READ)
14751484

1476-
def test_bad_idna_in_server_hostname(self):
1477-
# Note: this test is testing some code that probably shouldn't exist
1478-
# in the first place, so if it starts failing at some point because
1479-
# you made the ssl module stop doing IDNA decoding then please feel
1480-
# free to remove it. The test was mainly added because this case used
1481-
# to cause memory corruption (see bpo-30594).
1482-
ctx = ssl.create_default_context()
1483-
with self.assertRaises(UnicodeError):
1484-
ctx.wrap_bio(ssl.MemoryBIO(), ssl.MemoryBIO(),
1485-
server_hostname="xn--.com")
14861485

14871486
class MemoryBIOTests(unittest.TestCase):
14881487

@@ -2521,6 +2520,39 @@ def test_check_hostname(self):
25212520
"check_hostname requires server_hostname"):
25222521
client_context.wrap_socket(s)
25232522

2523+
def test_check_hostname_idn(self):
2524+
if support.verbose:
2525+
sys.stdout.write("\n")
2526+
2527+
server_context = ssl.SSLContext(ssl.PROTOCOL_TLS)
2528+
server_context.load_cert_chain(IDN_CERT)
2529+
2530+
context = ssl.SSLContext(ssl.PROTOCOL_TLS)
2531+
context.verify_mode = ssl.CERT_REQUIRED
2532+
context.check_hostname = True
2533+
context.load_verify_locations(IDN_CA)
2534+
2535+
# correct hostname should verify, when specified in several
2536+
# different ways
2537+
for server_hostname in ["pythön.example.org",
2538+
"xn--pythn-mua.example.org",
2539+
b"xn--pythn-mua.example.org"]:
2540+
server = ThreadedEchoServer(context=server_context, chatty=True)
2541+
with server:
2542+
with context.wrap_socket(socket.socket(),
2543+
server_hostname=server_hostname) as s:
2544+
s.connect((HOST, server.port))
2545+
cert = s.getpeercert()
2546+
self.assertTrue(cert, "Can't get peer certificate.")
2547+
2548+
# incorrect hostname should raise an exception
2549+
server = ThreadedEchoServer(context=server_context, chatty=True)
2550+
with server:
2551+
with context.wrap_socket(socket.socket(),
2552+
server_hostname="python.example.org") as s:
2553+
with self.assertRaises(ssl.CertificateError):
2554+
s.connect((HOST, server.port))
2555+
25242556
def test_wrong_cert(self):
25252557
"""Connecting when the server rejects the client's certificate
25262558
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
The ssl module can now validate hostnames that contain non-ASCII
2+
characters (IDNs).

Modules/_ssl.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -716,8 +716,11 @@ newPySSLSocket(PySSLContext *sslctx, PySocketSockObject *sock,
716716
self->owner = NULL;
717717
self->server_hostname = NULL;
718718
if (server_hostname != NULL) {
719+
/* server_hostname was encoded to an A-label by our caller; put it
720+
* back into a str object, but still as an A-label (bpo-28414)
721+
*/
719722
PyObject *hostname = PyUnicode_Decode(server_hostname, strlen(server_hostname),
720-
"idna", "strict");
723+
"ascii", "strict");
721724
if (hostname == NULL) {
722725
Py_DECREF(self);
723726
return NULL;

0 commit comments

Comments
 (0)