-
Couldn't load subscription status.
- Fork 8.1k
net: mqtt: BSD socket based implementation #10125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: mqtt: BSD socket based implementation #10125
Conversation
|
Some numbers from building Old MQTT lib: frdm_k64f: frdm_k64f_tls: New MQTT lib: frdm_k64f: frdm_k64f_tls: |
Codecov Report
@@ Coverage Diff @@
## master #10125 +/- ##
==========================================
- Coverage 48.38% 48.36% -0.02%
==========================================
Files 265 270 +5
Lines 42193 42081 -112
Branches 10137 10130 -7
==========================================
- Hits 20413 20351 -62
+ Misses 17703 17653 -50
Partials 4077 4077
Continue to review full report at Codecov.
|
bbb9897 to
a82a987
Compare
include/net/mqtt_socket.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit subject "net: mqtt: Add MQTT socket implementation" contains redundant information, what about just saying "net: mqtt: Add BSD socket implementation"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will change.
include/net/mqtt_socket.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We typically do not use typedef for structs and enums in zephyr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if it's a requirement to avoid typedef in Zephyr, so I've left it as it was originally. Of course I can change that as it's requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do that, typically it is harder to read code when these xxx_t typedefs are used as one needs to constantly try to remember the actual type of a variable. Currently, at least in networking code, the typedefs are only used in function pointers which are annoying to use otherwise.
include/net/mqtt_socket.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there would be holes in the struct, what about ordering the fields so that the u8_t and the bitfields are at the end.
subsys/net/lib/CMakeLists.txt
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the MQTT_SOCKET prefix, what about calling it just MQTT and renaming the old mqtt as MQTT_LEGACY? The directory name we can change later when old MQTT library is removed but renaming config variable name is annoying later as people are using the old name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to touch the existing implementation, until there was a positive feedback on the new one. Renaming it to MQTT_LEGACY sounds like a good idea though.
include/net/mqtt_socket.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recall the original Nordic SDK MQTT source used stdint.h types, which made the code a bit more portable.
Any reason we cannot continue to use stdint.h, as was done here:
e37c3b3 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, source implementation used stdint.h types. I've changed them as I've thought Zephyr prefers to use u8_t etc, especially that checkpatch throws a warning when these standard types are used.
I have nothing against to stdint.h types though, so if we agree that it's fine to use it, I'll be happy to bring them back. @jukkar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change in Zephyr from uint8_t -> u8_t was a bit pointless but as we have that now used everywhere, it is better to have it used here too. So usage of u8_t is fine here.
a82a987 to
fdb4361
Compare
|
@jukkar Covered initial comments, I've got rid of Additionally, I've added commit with MQTT tests ported to the new API. |
cd92113 to
46d140d
Compare
46d140d to
3767ffb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is lot of code and I missed probably lot of things in this review. This needs more eyes but looks very good overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be static as I could not find any access to it outside of this source file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for this symbol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean ret < 0here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I'm surprised such a mistake slipped in. Correcting it rightaway.
3767ffb to
f22ea39
Compare
|
@jukkar Thank you for the comments, I've addressed them. |
So, quick feedback - your today's presentation at the networking forum meeting had such a nice feature list, why can't we have it here in the description? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions, looking mostly only at the sample code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per posix, this should be struct sockaddr_storage. struct sockaddr is good only to be a base of pointer type struct sockaddr*.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly related to this PR, but any idea to introduce Kconfig options for these (better sounding apparently), and using those instead? (That's the direction we want to move in, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm not 100% happy with these mbedTLS defines here. I don't find myself a Kconfig expert, but I believe it should be doable to have such symbols defined at Kconfig level, depending on selected configuration. We can add it to the wishlist, if mebdTLS Kconfig rework will take place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "id" the right name for this field? I'd say it's event type, id could be something else (e.g. changing (e.g. incrementing) id of a particular event).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing stuff like utf_strlen = strlen(param.message.topic.topic.utf_str) is very confusing. So, is utf_strlen really a length of utf string (in characters), then strlen() is the wrong function to use for, or is it in bytes actually, then why it's called utf_strlen?
(Let me get to the actual definition of that structure, but starting with a sample, that's the impression it leaves.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MQTT operates on byte-length, so utf_strlen is in bytes as well. I can rework the naming, to avoid confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps after peering into the example it will become clear why 4 is multiplied by APP_MAX_ITERATIONS, but just starting with it, nope, it's not clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, my guess after ~2min: because you try 4 different QoS types per iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given you concerns, I think I'll rework the application code a little bit, to be something more similar to the original version. I indeed might be over-optimized for the sample code at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in symbol name: EACTLY. Perhaps a sudden lower-case "o" is a typo too (or what a random person would think?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lower case "o" is typically found in Quality of Service acronym, I guess that's the reason it's here. It might be upper case though, for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be upper case though, for consistency.
+1 for consistency. If everything is upper-case, then CamelCase acronyms should be upper-case too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so that's why there's a > operation against an enum value above. Well, worth a comment there.
And generally, relying on a particular enum value ordering/nearness may be a good optimization trick for internal library code, but looks somewhat hacky in a user-facing sample.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: use utf8.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: use len.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: use bin or data.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: use len.
Compare with the title of #5985 which is "net: zstream API to abstract socket transport protocols (e.g. TCP vs TLS)". Compare that with https://github.com/zephyrproject-rtos/zephyr/pull/5985/files#diff-59b451503e52060a73241f8517bebc66R25 : So, the irony continues. There was a proposal to have consistent, universally useful transport/stream abstraction in Zephyr. "No, we don't need transport abstraction, we really need non-standard, proprietary socket extensions". Some time later, transport abstraction is there, because guess what - it's useful! But now we also have mis-layered API stack. And of course, each new protocol library can invent it's own transport abstraction. Like, one have read, then write, and another - write, then read in structures, because there's only so much to invent. |
|
What I like in this PR is that in doesn't introduce threads and stuff (again, this feature was presented in @rlubos slides and not very obvious here, hope that will change). Ping to @GAnthony re: our discussion that protocol implementations should avoid usage of threads and stuff whenever possible (leaving the decision to use or not use threads to applications). Also ping to @d3zd3z that people actually implement MQTT without using threads ;-). (Didn't review how well it works yet, but I doubt that threads would make it better, both in terms of clarity and resource usage.) |
|
Thanks @pfalcon for the review. I'll rework the code and sample and push the changes soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Robert, LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed new commit "net: mqtt: Handle publish payload separately". Yes, the propose idea is implemented right, thanks. There're just minor tweaks to achieve POSIX-like semantics required.
Re-reviewing all together is on TODO...
subsys/net/lib/mqtt_sock/mqtt.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is perfectly normal condition, EOF reached. Just return EOF (== length of 0).
subsys/net/lib/mqtt_sock/mqtt.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be the right error, can it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you recommend then? I was thinking about EBUSY.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, notionally, this is "state [transition] error". Would need to go thru errno's and see what's the closest to it - maybe nothing, POSIX errnos suck :-(
EINPROGRESS is probably more close to sockets, and thus confusing (is that socket error that we got or mqtt lib error?). EBUSY sounds good then, yeah.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized why it came up to me to use EPERM here, beacuse it doesn't make much sense basing on Zephyr's errno.h. In Zephyr we have:
EPERM 1 /* Not owner */
While errno man pages state:
EPERM Operation not permitted (POSIX.1-2001).
IMO the two descriptions are completely different, and the latter one makes this choice a little more reasonable, as mqtt_input shall not be called when payload is available to read. Anyway, if there's a better choice, let's change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, EPERM is "Operation not permitted", regardless of what Zephyr's bugs in the description say. And "not permitted" in the sense of "permission", i.e. access control, as the lexical form of EPERM clearly says. Regardless of any attempts to clarify its meaning/associate additional constraints, people will keep thinking of it as "there's not enough permission to do that".
For a classical reference (rant) on that matter, see http://blog.unclesniper.org/archives/2-Linux-programmers,-learn-the-difference-between-EACCES-and-EPERM-already!.html - calls like that are futile, things should be named right, to be used right.
(Note that explanation there slightly favors your usage - but not really, from my reading of it. It says EPERM should be used based on the static property of the system, like hardlinks to directory are forever not permitted, whereas we here talk about dynamic property, i.e. "in the current state, that operation is not permitted"). Again, it's sad that POSIX doesn't provide generic "invalid state" error :-(.
subsys/net/lib/mqtt_sock/mqtt.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spurious empty line? Should be after brace instead?
73262e7 to
f27d60b
Compare
|
@pfalcon Thanks, comments have been adressed. |
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still find it unfortunate that "mqtt_utf8" has "size", while "mqtt_binstr" - "len".
(Not requesting to fix anything.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, all of struct mqtt_puback_param, struct mqtt_pubrec_param, struct mqtt_pubrel_param, struct mqtt_pubcomp_param managed to end up with the member named message_id, but for some reason, mqtt_utf8 and mqtt_binstr have got to have it different.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we say here that payload field is to be ignored? Should we define separate struct for received PUBLISH?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll extend the PUBLISH event description.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A consequence of the good design - one less option for users to care about! ;-) (Please remove.)
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise.
include/net/mqtt.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mention that this should be called even before setting other fields in the client. I may suggest: "Shall be called to initialize client structure, before setting any client parameters and before connecting to broker."
subsys/net/lib/mqtt_sock/mqtt.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please-please let's change this to "goto out", or "done", or "finish", just not "error".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, @rlubos! I think that we have very solid and efficient MQTT library here, thanks for going thru all the refactoring. We will likely need to clarify/cleanup some behavioral aspects yet later. What's really left here is updating docstrings to not contain info no longer relevant after refactoring, clarify init behavior, and - I hope - to rename a got label. I'll -1 for now to avoid spurious merge. I'll be ready to +1 immediately after that.
My own checklist is below:
- publish() should send payload data out of user-supplied buffer, without copying it to lib buffer first.
- Receiving of incoming PUBLISH message should leave reading payload data to user, by supplying a wrapper function with a standard read() interface.
- Global array of clients should go, there should not be inter-dependencies among different client instances.
- 3.1. Which means that mqtt_live() should take mqtt client pointer like any other func.
- 3.2. And if there should be mutex there at all, then it belongs to the client structure, not global scope.
1ca9b9e to
b0e6113
Compare
|
@pfalcon Thank you for the feedback, I've addressed final remarks. As the API changes have been reviewed and I can see that there is a chance to have this PR merged soon, I've squashed the commits with the API change. |
Rename existing headers and sybols to mqtt_legacy, to allow new implementation to keep old config and header names. Signed-off-by: Robert Lubos <[email protected]>
Add new, socket based MQTT implementation, based on MQTT from Nordic nRF5 SDK, introducing the following features: * transport independent MQTT logic, with support for multiple transports * support for multiple MQTT versions (3.1.0 and 3.1.1 supported) * single event handler - no need to keep callback array in RAM * automatic send of Ping Requests, for connection keep-alive * message/event parameters wrapped into strucutres - easier extension for future MQTT versions * no separate thread needed to run MQTT - application only needs to call mqtt_input and mqtt_live periodically Signed-off-by: Robert Lubos <[email protected]>
Add TLS transport to socket MQTT implementation. Signed-off-by: Robert Lubos <[email protected]>
Port MQTT Publisher sample to new socket MQTT API. Signed-off-by: Robert Lubos <[email protected]>
Port existing MQTT tests to new socket MQTT API. Signed-off-by: Robert Lubos <[email protected]>
Add MQTT test to verify PUBLISH message reception. Signed-off-by: Robert Lubos <[email protected]>
b0e6113 to
16b03ec
Compare
|
And rebased to current master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Hello everyone,
This PR introduces new MQTT implementation, based on MQTT from Nordics nRF5 SDK. The library have been reworked, to fit into Zephyr, introducing changes proposed in #8975, which include:
mqtt_inputandmqtt_liveperiodicallyThe existing MQTT example (
mqtt_publisher) was ported to the new API. It was tested successfully on qemu_x86 (with TLS as well) and one of Nordic platforms (not present in this PR). Remaining boards supporting the sample were also verified that they do compile, but they'd require testing though.MQTT tests were ported as well.
I'm attaching presentation from the October Networking Forum (it can be find on the forum's Google Drive as well):
mqtt_zephyr.pdf
Feedback appreciated :)