Omnibus LwM2M changes for v1.14 (including socket API migration) #9832
|
@GAnthony Would it be possible to test the LwM2M client sample on your HW which requires the socket APIs to see if this PR is headed in the right direction? |
|
@nashif @jukkar @tbursztyka @laperie @pfalcon Please let me know if this is a direction we can go for socket API support. If so, I'll continue to break this down into acceptable upstream patches. |
5a8fa02 to
1b96de0
Compare
|
Updated all the samples / tests I could find using CoAP APIs. Let's check sanity now. |
1b96de0 to
ba22afd
Compare
|
@mike-scott I have also been working on CoAP-over-sockets support for the past couple of days. I didn't know that you had gone so far. My approach is quite different from yours: I modified only the net_pkt-related code in the CoAP library, whereas your approach changes a lot in net_buf, which is not required IMO. My changes are at an initial stage. I will create a WIP/RFC PR by EOD. |
|
I've briefly glanced through the changes, and without diving into details I have one general remark/question. Is it still a good approach to use |
It's definitely not. A litmus test for doing it right would be that the development actually happens on Linux, as more comfortable, and then everything just works on both Zephyr and Linux. (That's how for example #5985 was done.) |
|
@mike-scott This is my branch (https://github.com/rveerama1/zephyr/commits/coap_sock) and the commit is rveerama1@9f0647e. This is exactly the same as the current CoAP without the net_pkt code in the CoAP library. As I said, I just started a few days back and it's a work in progress. The coap-client sample doesn't even compile :). This is just for reference about my approach. |
|
@rveerama1 Your method seems to add a 2nd CoAP library for socket support? Are we sure we want to maintain and test 2 libs? This seems prone to bit rot. |
I think the plan is to deprecate the older net_buf based one. The grand master plan is to provide only socket based interfaces to applications. Internally net_buf's will be used later too. |
But those may be reasonably different net_buf's after #7578. Churn is everywhere! Sockets are the only safe haven ;-). |
|
@rveerama1 : Did you compare the RAM / flash resource usage before / after your PR?
To clarify: The PR moves
I think the keyword you said is "comfortable" and not "doing it right". The reality is: it won't be like Linux programming. Zephyr is an RTOS aimed at small HW. Have you seen how careless Linux programmers are with resources? While it might provide a general direction, your statement is filled with unicorns and fairy tales. There is a very concerning trend of heavier resource usage currently happening in Zephyr. I hope we keep in mind that currently you can run a Bluetooth stack, an IP stack, and an LwM2M client using DTLS all on HW with 64K RAM and 512K flash in a way where you can have room for 2 copies of the application in order to perform OTA updates in a secure manner. Very quickly we won't:
There has to be a way that we can still support the "simple" RTOS use-cases. I illustrated the resource jump in my PR description and I chose a fairly conservative approach for sharing multi-use If you look at companies introducing products into the market, they are already looking for ways to cut cost. The last thing they are going to do is "buy bigger" so they can use Zephyr. They'll just go elsewhere.
I think you meant to say net-app APIs are being deprecated, and that There is confusion between "applications" and "internal network library code" atm that probably needs to be ironed out.
Not so fast: #7590 |
|
@rveerama1 This means the CoAP APIs are not thread safe? |
Like I mentioned already, sample doesn't even compile. Changes are at early stage to compare those details. |
Note that those static buffers are only in the sample application, the library itself is just using what ever buffer that is given to it. |
include/net/buf.h
Copy-paste errors/confusion (?), missing opportunity to improve the API. E.g.:
offset Offset of input buffer
which "input buffer" and "offset of"??
pos Pointer to position of offset
Position of offset? What's that, in plain words?
Why these are still 2 params, if at least 95% of the code has those 2 params as pos, &pos?
after reading 2 bytes
Not 2, 1 here.
value Value is returned
Value is returned?
|
@mike-scott re: #9832 (comment) I don't know where to start ;-). Let's go quickly over some statements.
Yup ;-). But that doesn't mean that POSIX API is resource-bloat - it's carefully thought out and designed to balance well overheads, usability, and underlying technology abstraction.
We can even have a case study why ;-). So, some time ago one programmer decided to add a framework for "modem drivers". It went as low-level as handling IRQs from UART. It was pointed out to him that there's already an API in Zephyr which is very close to his needs, and abstracts away the need to deal with UART IRQs, but he decided to duplicate the functionality anyway. So, is it any wonder that Zephyr code size bloats up, and RAM of course too, because nobody works on optimization, everyone just adds their "original designs". P.S. Just had a look at your code again and noticed that you put to a k_pipe from an ISR. But see the Note at the end of the http://docs.zephyrproject.org/latest/kernel/data_passing/pipes.html#concepts section.
That's too bloated IMHO, we should target for much smaller ;-).
Socket API forever requires flat buffers. It's one of the biggest features. (No, it's not - it's just the obvious, baseline requirement.)
Really? I thought that LwM2M is UDP based, so you can't send more than MTU at once.
Let's fix it if you're concerned.
I'm with you on these ;-)
I see it a bit differently - as if you can't let those net_buf's go, and thus it prevents you from starting with a clean slate and implementing new exciting optimizations ;-). Just as examples:
Keep max MTU buffers only for LwM2M messages which can go up to the MTU. For messages of less than 128 bytes, keep buffers no larger than 128 bytes.
You can go up from there. Yeah, just imagine - you e.g. can make a message overflow check once, and if it passes, just write data directly, without any error checking. Compare that to calling 100 times some function with gazillion of params, result of each call requiring a check. Do you see where bloat comes from?
Yeah, they go elsewhere, to RTOSes which offer well-known, time-proven APIs. That's the way to cut cost - on R&D and support/maintenance. As for BOM, it doesn't have to be bigger. Again, compare direct pointer access vs calling some godawful functions with error checking. |
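For illustration, the "check for overflow once, then write directly" idea from the comment above could be sketched as follows. This is a minimal sketch under stated assumptions, not code from this PR or from Zephyr: `struct flat_msg` and `flat_msg_put_record()` are hypothetical names, and the 3-byte record header is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat message buffer; not a Zephyr type. */
struct flat_msg {
	uint8_t *data;
	size_t len;   /* bytes written so far */
	size_t size;  /* total capacity of data */
};

/* Append a 1-byte type, a 2-byte big-endian id and a payload.
 * A single bounds check covers the whole record, so the writes that
 * follow need no per-call error handling.
 * Returns 0 on success, -1 if the record does not fit.
 */
int flat_msg_put_record(struct flat_msg *msg, uint8_t type, uint16_t id,
			const void *payload, size_t payload_len)
{
	size_t room;
	uint8_t *p;

	assert(msg != NULL && msg->data != NULL);
	assert(msg->len <= msg->size);

	room = msg->size - msg->len;
	if (room < 3 || payload_len > room - 3) {
		return -1;
	}

	p = msg->data + msg->len;
	*p++ = type;
	*p++ = (uint8_t)(id >> 8);
	*p++ = (uint8_t)(id & 0xff);
	if (payload_len > 0) {
		memcpy(p, payload, payload_len);
	}
	msg->len += 3 + payload_len;

	return 0;
}
```

A failed call leaves `msg->len` untouched, so the caller can flush the buffer and retry without any cleanup.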
|
@mike-scott : And now some general comments. It's a big achievement that we all came to the conclusion that the BSD Sockets API is the way forward in Zephyr. At the same time, there should be absolutely no worries about the native (net_context-based) API and its wrappers like net_app. They aren't going anywhere until sockets-based solutions are proven to be at least not worse. With that in mind, I would imagine that someone starting to work with sockets does that out of excitement of using a well-known API and the design simplification it offers. However, with your code - and that's just my personal opinion - I don't see that approach. Instead I see an attempt to marry the approach from the old ad hoc API with sockets. While that is one of the possibilities, I (again, personally) don't see a point in doing it like that - sockets allow us to exercise new approaches, and the old net_app doesn't go away. I may be wrong. After all, you know LwM2M and its architecture, in Zephyr and overall, so maybe it makes sense. But note that trying to marry native API and sockets, you already introduce new entities like Anyway, let me challenge myself on these conclusions and try to advocate your approach - in the next message. |
|
@mike-scott, So, summing up, I "reject" the approach in this PR, in the sense that if you tell me that's the right approach, I don't believe you ;-). Note that I may be wrong. But let me try to find independent, external references for an approach which could have some similarity with the one used here. Some time ago I stumbled upon this doc https://github.com/sustrik/bsd-socket-revamp/blob/master/source.txt from Martin Sustrik, the guy behind ZeroMQ and its rewrite nanomsg. The title is clickbait; of course it's not about "revamping" the BSD Sockets API, which is perfect as it is. It's about designing a new, "modern" networking API supposed to be of the same level as the BSD Sockets API. So, as you can see, he advocates usage of a linked list of buffers for network I/O operations: https://github.com/sustrik/bsd-socket-revamp/blob/master/source.txt#L216 . Bingo! Of course, there are big differences too. First of all, there is no talk of fixed-size buffers. So, @tbursztyka's refactor is what will be a faithful implementation of this idea in Zephyr. Back to Sustrik's API, there's of course no "compact" function (which is a hilarious one anyway), nor net_buf_read_u8 and friends with 4 params and a return value which can signify an error - such can of course be implemented, though many apps will get along without them. Nor is there net_buf_insert_bytes(), because the presence of such a function signifies a design problem (protocol messages are constructed from beginning to end), but again, with linked lists it's possible and trivial. So, my approach towards this? It's simple - we start with plain ol' good BSD Sockets and implement everything in their native manner. Then maybe we implement novelties like Sustrik's API - especially if he really submits it as an IETF RFC, etc. Implement at the userspace level, I mean. Because in kernel space we'll have it with @tbursztyka's refactor. |
I do as a matter of fact need the ability to insert bytes into whatever memory structure is used for building the LwM2M protocol payload. Spoiler alert: it's not a design flaw. One example is how TLV formatting works, in the OMA TLV spec for LwM2M. There are "header" object entries such as "object instance" entry. It lands before a series of "resource" entries. This "object instance" entry includes the total length of the following resource data entries and that length value can be 1 or 2 bytes depending on how long it needs to be. Without literally doing everything twice, that needs to be inserted after the resource data is added and the length is known: |
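A minimal sketch of the insert-after-the-fact pattern described above (not the code from this PR; `flat_insert_bytes()` and `tlv_wrap_object_instance()` are hypothetical names). It assumes the OMA TLV type-byte layout in which bits 7-6 select the identifier type (00 for an object instance), bits 4-3 select the width of the length field, and bits 2-0 carry lengths below 8 inline. 16-bit identifiers and 24-bit lengths are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Open a gap of len bytes at offset in a flat buffer and copy src into it.
 * Returns 0 on success, -1 if the offset is invalid or there is no room.
 */
int flat_insert_bytes(uint8_t *buf, size_t *used, size_t size,
		      size_t offset, const uint8_t *src, size_t len)
{
	assert(buf != NULL && used != NULL && src != NULL);

	if (*used > size || offset > *used || len > size - *used) {
		return -1;
	}

	/* Shift the already-written tail up, then fill the gap. */
	memmove(buf + offset + len, buf + offset, *used - offset);
	memcpy(buf + offset, src, len);
	*used += len;

	return 0;
}

/* Prefix the resource TLVs stored in buf[start..*used) with an
 * "object instance" header whose size depends on the body length.
 */
int tlv_wrap_object_instance(uint8_t *buf, size_t *used, size_t size,
			     size_t start, uint8_t instance_id)
{
	uint8_t hdr[4];
	size_t hdr_len = 0;
	size_t body_len;

	assert(buf != NULL && used != NULL);

	if (start > *used) {
		return -1;
	}
	body_len = *used - start;

	if (body_len < 8) {
		hdr[hdr_len++] = (uint8_t)body_len;   /* length inline */
		hdr[hdr_len++] = instance_id;
	} else if (body_len <= 0xff) {
		hdr[hdr_len++] = 0x08;                /* 8-bit length */
		hdr[hdr_len++] = instance_id;
		hdr[hdr_len++] = (uint8_t)body_len;
	} else if (body_len <= 0xffff) {
		hdr[hdr_len++] = 0x10;                /* 16-bit length */
		hdr[hdr_len++] = instance_id;
		hdr[hdr_len++] = (uint8_t)(body_len >> 8);
		hdr[hdr_len++] = (uint8_t)(body_len & 0xff);
	} else {
		return -1;                            /* not handled here */
	}

	return flat_insert_bytes(buf, used, size, start, hdr, hdr_len);
}
```

The resources are written once, and only the 2 to 4 header bytes are inserted after their total length is known, which avoids encoding the body twice.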
As part of the migration from net_app APIs to socket APIs, let's stop referencing the net_pkt fragments throughout the LwM2M library. Establish a msg_data flat buffer inside lwm2m_message and use that instead. NOTE: As a part of this change we remove the COAP_NET_PKT setting. The COAP library reverts to COAP_SOCK behavior. This doesn't mean we use sockets in LwM2M (yet), it only means we use the socket-compatible COAP library which parses flat buffers instead of net_pkt fragments. Signed-off-by: Michael Scott <[email protected]>
net_app contexts save the remote address and we use this during observe notifications and pending handling. If we move to another network layer such as sockets, then the remote address becomes harder to reference. Let's save it as a part of the client context. Signed-off-by: Michael Scott <[email protected]>
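As a rough illustration of keeping the peer address in the client context, here is a sketch written against plain POSIX socket types rather than the actual LwM2M structures. `struct client_ctx` and `client_ctx_set_remote()` are hypothetical names chosen for this example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical client context; the real LwM2M context differs. */
struct client_ctx {
	struct sockaddr_storage remote_addr; /* large enough for IPv4 or IPv6 */
	socklen_t remote_addr_len;
};

/* Remember the peer so later notifications and retransmissions can be
 * sent without asking the network layer for the address again.
 * Returns 0 on success, -1 if the address does not fit.
 */
int client_ctx_set_remote(struct client_ctx *ctx,
			  const struct sockaddr *addr, socklen_t len)
{
	assert(ctx != NULL && addr != NULL);

	if (len == 0 || len > sizeof(ctx->remote_addr)) {
		return -1;
	}

	memset(&ctx->remote_addr, 0, sizeof(ctx->remote_addr));
	memcpy(&ctx->remote_addr, addr, len);
	ctx->remote_addr_len = len;

	return 0;
}
```

The stored address and length can later be passed to `sendto()` as they are.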
The JSON formatter is currently not enabled for incoming WRITE operations. To update the code in the formatter and not litter the input context with extra data, let's allow formatters to store their own user data. Signed-off-by: Michael Scott <[email protected]>
Update the parsing functions for JSON used by the JSON data formatter and enable it in the LwM2M engine. Signed-off-by: Michael Scott <[email protected]>
For bootstrap support, we need to store connection credentials in the security object. This way the client can start a connection at index 0 and after bootstrapping, move to the next connection. Let's add the needed fields and a config item to set the key length. Signed-off-by: Michael Scott <[email protected]>
In order to support bootstrap mode, we need to store server data in the security / server objects. Once the connection to the bootstrap server is made, it will clear these objects and add new server connection data. Signed-off-by: Michael Scott <[email protected]>
c0bc887 to
d6b7500
Compare
|
Rebased on today's master branch and removed merged patches. |
|
ping @dbkinder please re-review the changes |
Approving on the idea that this needs to get into 1.14 anyway.
subsys/net/lib/lwm2m/lwm2m_engine.c
Forgot I had added some stub DNS_RESOLVER code. This doesn't work. I'll remove the stub code from the socket addition patch and then add a new patch which implements DNS_RESOLVER.
Not applicable anymore due to PR split
Now that the security data can be loaded into and used from the security / server objects, we can add support for LwM2M bootstrap. This is a mode where initially a connection can be made to a server which can update several LwM2M objects (including security and server data) and then trigger a "bootstrap complete". Once this happens the client will start its connection process over but now with the new information. Signed-off-by: Michael Scott <[email protected]>
We can save some resources by removing the periodic service thread
and replacing it by queuing the services to the work queue.
Before (reel_board using BT + DTLS)
Memory region    Used Size    Region Size    %age Used
FLASH:            289464 B           1 MB       27.61%
SRAM:              75620 B         256 KB       28.85%
IDT_LIST:            136 B           2 KB        6.64%
After
Memory region    Used Size    Region Size    %age Used
FLASH:            289576 B           1 MB       27.62%
SRAM:              74596 B         256 KB       28.46%
IDT_LIST:            136 B           2 KB        6.64%
Signed-off-by: Michael Scott <[email protected]>
This commit resets the firmware status to IDLE after a bad download attempt. Previously, the firmware object would stay in an odd state and any further attempts to download firmware would return an error. Signed-off-by: Michael Scott <[email protected]>
d6b7500 to
142e3be
Compare
|
Updated the commit with the following:
If we need to unblock the other net_app removal PRs, this can be merged now and I'll continue to run tests and shake out bugs over the next 2 weeks. |
|
For reference: bootstrap support, new JSON write operation parsing and the buf utilities make up most of the code additions (+600ish lines) |
142e3be to
c543188
Compare
|
Pushed a small bugfix for proxy firmware download |
This commit removes the net_app layer from the LwM2M library and replaces it with BSD-sockets APIs. Signed-off-by: Michael Scott <[email protected]>
The LwM2M library has moved from the network application library APIs to BSD socket APIs. Let's make the needed changes in the LwM2M sample to follow those changes. Signed-off-by: Michael Scott <[email protected]>
Previously, the net_app layer handled DNS support as a part of network initialization. With the move to BSD-socket APIs, we need to add support for DNS to the LwM2M library. Signed-off-by: Michael Scott <[email protected]>
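With BSD-style sockets, name resolution is typically done with `getaddrinfo()` before the socket is opened. The sketch below uses the standard POSIX call and is not the code added by this commit. `resolve_udp_peer()` is a hypothetical helper, and whether a given Zephyr configuration exposes `getaddrinfo()` under this name is an assumption that depends on the socket and DNS resolver options enabled.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>  /* struct sockaddr_in, for callers */
#include <arpa/inet.h>   /* htons(), for callers */

/* Resolve host and port for a UDP peer and copy the first result
 * into out. Returns 0 on success, -1 on failure.
 */
int resolve_udp_peer(const char *host, const char *port,
		     struct sockaddr_storage *out, socklen_t *out_len)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;

	assert(host != NULL && port != NULL);
	assert(out != NULL && out_len != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
	hints.ai_socktype = SOCK_DGRAM;  /* CoAP / LwM2M over UDP */

	if (getaddrinfo(host, port, &hints, &res) != 0 || res == NULL) {
		return -1;
	}

	if (res->ai_addrlen > sizeof(*out)) {
		freeaddrinfo(res);
		return -1;
	}

	memcpy(out, res->ai_addr, res->ai_addrlen);
	*out_len = res->ai_addrlen;
	freeaddrinfo(res);

	return 0;
}
```

A numeric address such as `"127.0.0.1"` resolves without any DNS traffic, so the same code path can serve both literal addresses and host names. Port `"5683"` is the registered CoAP port.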
c543188 to
25df294
Compare
|
Re-pushed to fix a line over 80 char warning. |
This is a WIP PR so that some key stakeholders can discuss whether this approach is feasible.
After attempting to convert the LwM2M subsys lib to the socket API, I realized that many devices don't have the hardware resources necessary to run in a secure state. Previously, this was possible under the net-app APIs. To that end, I don't believe we can make a direct cutover and should support the net-app APIs until a better solution is presented.
Goals:
Discussion / Thought process:
WHAT WORKS:
CONFIG_LWM2M_NET_LAYER_NET_APP=y
or
CONFIG_LWM2M_NET_LAYER_SOCKET=y
Here are some size comparisons of the LwM2M client sample built for BLENano2 (nRF52832)
TODO:
Signed-off-by: Michael Scott <[email protected]>