Skip to content

Conversation

@phelter
Copy link
Contributor

@phelter phelter commented Nov 9, 2022

Bugfix for Socket lifetime issues

Have confirmed that this fix solves the issue originally reported in (#570)

Description

Added a new option to vSocketClose to select whether to destroy the socket at the end of close. bDestroy

  • Updated all uses of vSocketClose:
    • All user requested or new socket instances where it fails to create a new socket have bDestroy = true.
    • All internal close instances (such as in the TCP layer when far end is closed) have bDestroy = false.

Keep track of the lifetime of all sockets in 2 lists inside the FreeRTOS_Sockets.c file: xActiveSocketsList and xInactiveSocketsList.
Also check internally that the data arriving from events to the IPTask are valid and correct. - socketASSERT_IS_VALID().
Check internally that only vSocketClose can be performed on a socket that hasn't been destroyed, and all other events/actions can only be performed on active sockets

Test Steps

See Related issue.
Confirmed that previous flow of multiple closes (by internal and user) are now handled properly: Log file of working solution:

Gain: Socket 12345 now has 1 / 5 child
prvSocketSetMSS: 1460 bytes for C0A80064ip:49222
Socket 12345 -> C0A80064ip:49222 State eCLOSED->eSYN_FIRST
prvWinScaleFactor: uxRxWinSize 512 MSS 1460 Factor 4
Socket 12345 -> C0A80064ip:49222 State eSYN_FIRST->eSYN_RECEIVED
TCP: passive 12345 => C0A80064ip:49222 set ESTAB (scaling 1)
Socket 12345 -> C0A80064ip:49222 State eSYN_RECEIVED->eESTABLISHED
Gain: Socket 12121 now has 1 / 5 child
prvSocketSetMSS: 1460 bytes for C0A80064ip:49223
Socket 12121 -> C0A80064ip:49223 State eCLOSED->eSYN_FIRST
prvWinScaleFactor: uxRxWinSize 512 MSS 1460 Factor 4
Socket 12121 -> C0A80064ip:49223 State eSYN_FIRST->eSYN_RECEIVED
TCP: passive 12121 => C0A80064ip:49223 set ESTAB (scaling 1)
Socket 12121 -> C0A80064ip:49223 State eSYN_RECEIVED->eESTABLISHED
Socket 12345 -> C0A80064ip:49222 State eESTABLISHED->eLAST_ACK
Socket 12121 -> C0A80064ip:49223 State eESTABLISHED->eLAST_ACK
Socket 12345 -> C0A80064ip:49222 State eLAST_ACK->eCLOSE_WAIT
vTCPStateChange: Closing socket
Socket 12121 -> C0A80064ip:49223 State eLAST_ACK->eCLOSE_WAIT
Lost: Socket 12121 now has 0 / 5 children
FreeRTOS_closesocket[12121 to C0A80064ip:49223]: buffers 64 socks 4
Lost: Socket 12345 now has 0 / 5 children
FreeRTOS_closesocket[12345 to C0A80064ip:49222]: buffers 64 socks 3

Related Issue

See Issue (#570)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@phelter phelter requested a review from a team as a code owner November 9, 2022 20:08
@phelter phelter force-pushed the bugfix/socket-lifetime-multiple-close branch from c577157 to 72c72b2 Compare November 9, 2022 22:55
@phelter phelter force-pushed the bugfix/socket-lifetime-multiple-close branch from 72c72b2 to a30bd2d Compare November 10, 2022 01:03
@phelter
Copy link
Contributor Author

phelter commented Nov 10, 2022

Note - I've fixed up the existing unit-tests to pass with all of the changes. So the original code paths are all passing 100%, but there are new code paths.

The ci unit-tests are currently failing because there is an expectation that the code-coverage is 100%.

Please feel free to assign someone else to perform the task of increasing the code coverage to 100% by adding new test cases to cover the new code paths.

Note: the unit-test methodology that has been used for this project means the developer must write the code multiple times. Once for the actual code, and then another time for each and every execution path that is there (eg. per test).

I'd suggest looking into other test frameworks such as googletest which could provide much better abstraction layers and provide test frameworks where setting-up and tearing-down tests is performed by a test-framework so that the test suite can be partitioned and each test is only the things that are actually being tested (and not all the bring-up and tear-down operations). This would also allow private C functions to be tested in the environment that they are being used, and not require independent tests per private function.

I am all for unit-tests, code-coverage and advancing this further with static analysis runs on the unit tests; however, the technical methodology is putting an undue burden on the developer to change too much test code every time a change occurs.

@shubnil
Copy link
Contributor

shubnil commented Feb 24, 2023

Hi @phelter, Can you please try with this change #707 if it solves the issue that you are facing.

@shubnil
Copy link
Contributor

shubnil commented Mar 31, 2023

Hi @phelter, Gentle Reminder. Please try with this change #707 if it solves the issue that you are facing.

@moninom1
Copy link
Member

moninom1 commented Apr 17, 2023

Hi @phelter, Can you please check if the issue is resolved with #707.

@phelter
Copy link
Contributor Author

phelter commented Apr 18, 2023

Hi @phelter, Can you please check if the issue is resolved with #707.

Hello @moninom1 and @shubnil I won't be getting to this any time soon. The customer that I provided the fix for in my original fork of this repo is fine with the current change described in this PR and has another contractor maintaining the code at this point. So there is no incentive for me to revisit this in the short term.

If the customer comes back with a request to upgrade to a newer version of FreeRTOS+TCP then I will most likely suggest doing this check at that time.

You have closed the aforementioned bug #570 with a different fix to this one (#707). I am assuming you have in place a unit test that tests the specific use case outlined in : #570 (comment) and confirms that it passes when that particular use case occurs? If you have confirmation that the above use case failed before that change and now passes, then feel free to close; otherwise, please leave open until the customer decides to no longer use this fork and either upgrade to a newer version of FreeRTOS+TCP or use a different network stack.

Thanks

@shubnil
Copy link
Contributor

shubnil commented Apr 27, 2023

Hi @phelter,
Thanks for the confirmation.
We had a demo created to recreate this specific use-case. And with the change (#707) the issue was not observed.

With all these consideration, can we close the PR and it can be re-opened if it needs to be re-looked at in the future.

@kar-rahul-aws
Copy link
Member

Hi @phelter,
Closing the PR for now and it can be re-opened if it needs to be re-looked at in the future.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants