Skip to content

Conversation

@pfalcon
Copy link
Contributor

@pfalcon pfalcon commented Feb 20, 2018

A number of (HTTPS) sites on the modern Internet mandate use of
forward secrecy via ephemeral key exchange (DHE and ECDHE
cyphersuites). Elabling such cyphersuites how leads to increased
code size, mbedTLS heap usage, and runtime overheads, so enabling
them in the existing config-mini-tls1_2.h doesn't seem like a good
idea. Instead, create a new config, config-tls1_2.h, which includes
config-mini-tls1_2.h, and enables more options. The idea of this
file is to provide configuration as compatible as possible with
modern state of TLS 1.2 usage on the Internet, so more options
can be enabled in the future.

In the meantime, this config enables DHE by default, because it
leads to a moderate code size increase (~5K x86). However, DHE
cyphersuites are known to be slow during handshake phase (up to
3 times slower than with ephemeral key exchanges, based on
reports for OpenSSL). ECDHE is known to be faster (~50% time
overhead), but leads to much higher code impact (~15K). So, for
now only DHE is enabled.

Signed-off-by: Paul Sokolovsky [email protected]

A number of (HTTPS) sites on the modern Internet mandate use of
forward secrecy via ephemeral key exchange (DHE and ECDHE
cyphersuites). Elabling such cyphersuites how leads to increased
code size, mbedTLS heap usage, and runtime overheads, so enabling
them in the existing config-mini-tls1_2.h doesn't seem like a good
idea. Instead, create a new config, config-tls1_2.h, which includes
config-mini-tls1_2.h, and enables more options. The idea of this
file is to provide configuration as compatible as possible with
modern state of TLS 1.2 usage on the Internet, so more options
can be enabled in the future.

In the meantime, this config enables DHE by default, because it
leads to a moderate code size increase (~5K x86). However, DHE
cyphersuites are known to be slow during handshake phase (up to
3 times slower than with ephemeral key exchanges, based on
reports for OpenSSL). ECDHE is known to be faster (~50% time
overhead), but leads to much higher code impact (~15K). So, for
now only DHE is enabled.

Signed-off-by: Paul Sokolovsky <[email protected]>
@pfalcon
Copy link
Contributor Author

pfalcon commented Feb 20, 2018

Besides a generic issue statement in the commit message, here's a specific case which prompted going for such a solution: #5985 (comment) .

@codecov-io
Copy link

codecov-io commented Feb 20, 2018

Codecov Report

Merging #6271 into master will decrease coverage by 0.2%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6271      +/-   ##
==========================================
- Coverage   53.16%   52.96%   -0.21%     
==========================================
  Files         412      438      +26     
  Lines       40136    44329    +4193     
  Branches     7731     8875    +1144     
==========================================
+ Hits        21339    23479    +2140     
- Misses      15663    17468    +1805     
- Partials     3134     3382     +248
Impacted Files Coverage Δ
boards/posix/native_posix/cmdline.c 27.02% <0%> (-22.98%) ⬇️
subsys/net/ip/route.c 51.11% <0%> (-2.1%) ⬇️
boards/posix/native_posix/main.c 90% <0%> (-0.91%) ⬇️
...s/kernel/workq/work_queue_api/src/test_workq_api.c 100% <0%> (ø) ⬆️
subsys/net/ip/ipv6.h 100% <0%> (ø) ⬆️
include/bluetooth/hci.h 0% <0%> (ø) ⬆️
.../kernel/threads/lifecycle/lifecycle_api/src/main.c 75% <0%> (ø) ⬆️
arch/posix/include/posix_arch_internal.h 100% <0%> (ø) ⬆️
subsys/net/lib/http/http.c 0% <0%> (ø) ⬆️
tests/kernel/workq/work_queue_api/src/main.c 66.66% <0%> (ø) ⬆️
... and 49 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5490d28...4707bc9. Read the comment docs.

Copy link
Member

@jukkar jukkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having "#if 1....", why not create a config option for this so that it is easier to change by the user?

@pfalcon
Copy link
Contributor Author

pfalcon commented Feb 20, 2018

Instead of having "#if 1....", why not create a config option for this so that it is easier to change by the user?

By the same reason why you didn't want to introduce a similar Kconfig option in #6208 : it's unclear with such a low-level option would useful to user. Maybe, we would need to do it completely differently, e.g. introduce higher-level "profile" option, or maybe hardcode it at all. So, that "#if" captures a fact that there's a choice on developer's level, but it's not clear if an option on Kconfig level is useful yet.

@jukkar
Copy link
Member

jukkar commented Feb 20, 2018

By the same reason why you didn't want to introduce a similar Kconfig option in #6208

My motivation in that PR was slightly different (at least in my mind). There we could have various config options for different net_buf allocation timeouts, and that can be really hard to explain in config file especially if we have multiple such options.

The scenario here is a bit different. You have clear numbers for memory impact for different options and that could be placed in Kconfig help option. You could basically place the commit message as a Kconfig help.

@pfalcon
Copy link
Contributor Author

pfalcon commented Feb 20, 2018

@jukkar : Absolute the same picture in my view. In my view, #6208, is just a first step in deadlock-busting in the IP stack. There can be possible dozen(s) more, further steps iteratively improving - or completely overwriting - previous steps, so trying to make "config" based on the initial steps doesn't make sense (it will be changed).

With this TLS stuff, I tested 2 (two) sites, one work, one didin't, I proceeded to investigate why this one site didn't work, and figured it requires DHE or ECDHE cyphersuites, and tried both. That may seem as something useful, and indeed it is, as one step in TLS journey. Maybe trying next 100 sites, and each of them will have its own additional issues and options to tweak. So, I can add Kconfig-level option based in this single-site investigation, but the only thing we can be sure that it will be changed (likely, removed in the sense "generalized") later.

@jukkar
Copy link
Member

jukkar commented Feb 20, 2018

With this TLS stuff, I tested 2 (two) sites, one work, one didin't, I proceeded to investigate why this one site didn't work, and figured it requires DHE or ECDHE cyphersuites, and tried both. That may seem as something useful, and indeed it is, as one step in TLS journey. Maybe trying next 100 sites, and each of them will have its own additional issues and options to tweak. So, I can add Kconfig-level option based in this single-site investigation, but the only thing we can be sure that it will be changed (likely, removed in the sense "generalized") later.

Ok, makes sense. If every site requires different options, then it does not make sense to provide Kconfig options in this case.

@pfalcon
Copy link
Contributor Author

pfalcon commented Feb 20, 2018

If every site requires different options, then it does not make sense to provide Kconfig options in this case.

Maybe each site not, but there's not enough evidence pro or cons so far. There's no hurry to merge this, so it can wait for now, maybe more data and insight will be collected. So far, it serves just as an example of ideas in #6132 - "hierarchical configs", i.e. chain of configs where there's a common one, and few which include and extend it (perhaphs of 2-3 levels of such).

@d3zd3z
Copy link
Contributor

d3zd3z commented Mar 13, 2018

Actually, several security standards we are working to comply with will require forward secrecy, so I would argue we just need to change the default.

Realistically, if you are memory constrained, using certificates at all is unrealistic, and the application will almost certainly use a PSK-based ciphersuite.

@d3zd3z
Copy link
Contributor

d3zd3z commented Mar 13, 2018

Is there a use-case for an IoT device connecting to arbitrary hosts on the internet? In order to do this, you would need to include the rather large root certificate database (about 256K), in addition to requiring the large 16K in each direction buffers.

It seems more realistic that the IoT device would be communicating to either gateway devices, update servers, or cloud services that are defined for this particular device. In this case, we should have pretty good control over the TLS configuration on that end.

@d3zd3z
Copy link
Contributor

d3zd3z commented Mar 13, 2018

I put some of my additional thoughts on this here: https://www.davidb.org/post/tls-hard/

@pfalcon
Copy link
Contributor Author

pfalcon commented Mar 15, 2018

Actually, several security standards we are working to comply with will require forward secrecy, so I would argue we just need to change the default.

Forward secrecy features in mbedTLS requires noticeably larger amount of memory, so we'd better enable it for specific standards/applications/usecases, and keep other standards/applications/usecases unaffected.

Is there a use-case for an IoT device connecting to arbitrary hosts on the internet?

I'm not sure about the answer to this question. However, if posed as "Is there a use-case for a general-purpose (RT)OS to connect to an arbitrary hosts on the Internet?", then the answer is definitely "yes". Why would we limit Zephyr to not be able to connect to an arbitrary host on the Internet? ;-)

@nashif
Copy link
Member

nashif commented Jul 3, 2018

what is the story with this PR? Is this going to be updated and is it still required? If nothing happens within the next week, it will be closed.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jul 6, 2018

Ok, this was a next small shy step towards elaborating Zephyr's mbedTLS configs (#6132).

Since then, much bigger work was done in the frame of #7118 , but there it was done to apply only to that sockopt stuff, instead of being done on the level of generic Zephyr-mbedTLS integration. I made a suggestion that it was reworked to apply on generic mbedTLS level: #7118 (review) (p.2 there), I don't think it got a reply.

Anyway, this can be closed now, and further work might be done based on the updated #7118 results, whatever they are.

@pfalcon pfalcon closed this Jul 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants