
[BUG] Invalid calls to vSocketClose() when system bogged down and multiple TCP ports are closed all at once #570

@phelter

Describe the bug

When the system is bogged down (CPU oversubscribed) with multiple TCP sockets open, and all of those sockets are closed at once, vSocketClose() is sometimes called a second time on a socket that has already been closed, so the call operates on bad socket data.

After much review of the environment and the message queue to the FreeRTOS-Plus-TCP IP task, I narrowed this down to a bug in the lifetime management of sockets in the socket layer.

Because of vSocketCloseNextTime(), a socket can be closed twice:

  • First by the TCP layer: it calls vSocketCloseNextTime( pxSocket ), which stores pxSocket in the xSocketToClose global variable. The socket is then closed by a later call to vSocketCloseNextTime( NULL ), which is the first thing called in prvIPTask().
  • Then by the user: FreeRTOS_closesocket() is called because the user still holds a local copy of the opened FreeRTOS_Socket_t.

I believe that at some point the TCP code issues vTCPStateChange( pxSocket, eCLOSE_WAIT ) without intervention by the user. The user then also calls FreeRTOS_closesocket() to terminate the socket, but by then the socket has already been closed.

Because vSocketClose() deletes the socket's resources, closing an already-closed socket accesses a bad address and attempts to delete a list item twice, which triggers a DataAbort exception.
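The sequence described above can be modelled in a few lines of plain C. This is only a sketch of the scenario as I understand it, not the library source: ModelSocket_t, vModelSocketClose(), vModelSocketCloseNextTime() and iModelDoubleCloseCount() are hypothetical stand-ins, and the model counts close calls instead of freeing memory so that it can run safely.

```c
#include <stddef.h>

/* Hypothetical stand-in for FreeRTOS_Socket_t: it only counts close calls. */
typedef struct ModelSocket
{
    int closeCount;
} ModelSocket_t;

/* Models the xSocketToClose global described in this report. */
static ModelSocket_t * pxModelSocketToClose = NULL;

/* Models vSocketClose(): the real function releases the socket's resources. */
static void vModelSocketClose( ModelSocket_t * pxSocket )
{
    pxSocket->closeCount++;
}

/* Models the deferred close: remember the socket now, close it on a later call. */
static void vModelSocketCloseNextTime( ModelSocket_t * pxSocket )
{
    if( ( pxModelSocketToClose != NULL ) && ( pxModelSocketToClose != pxSocket ) )
    {
        vModelSocketClose( pxModelSocketToClose );
    }

    pxModelSocketToClose = pxSocket;
}

/* Replays the reported sequence and returns how often the socket was closed. */
int iModelDoubleCloseCount( void )
{
    ModelSocket_t xSocket = { 0 };

    pxModelSocketToClose = NULL;
    vModelSocketCloseNextTime( &xSocket ); /* TCP layer schedules the close.   */
    vModelSocketCloseNextTime( NULL );     /* IP task performs the close.      */
    vModelSocketClose( &xSocket );         /* User's close request arrives.    */

    return xSocket.closeCount;
}
```

In this model iModelDoubleCloseCount() returns 2. In the real stack the second call would run on memory that has already been released.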

Target

  • Development board: [e.g. Zynq]
  • Instruction Set Architecture: [ARM CortexA9]
  • IDE and version: VSCode + CMake
  • Toolchain and version: arm-none-eabi-gcc 10.3.1

Host

  • Host OS: Ubuntu 18.04
  • Version: [e.g. Mojave 10.14.6]

To Reproduce

  • Unable to provide an example at this moment, but if you look at the code you can see this scenario is in fact possible.

Expected behavior
The socket layer should maintain a well-defined lifetime for every socket created by the user and ensure that, if a socket is closed for internal reasons at any point, it is not closed a second time.
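One possible way to get this behavior, sketched under my own assumptions and not taken from the library, is to track live sockets in a registry and let a close succeed only once per registered socket. All names here (TrackedSocket_t, xTrackedRegister(), xTrackedCloseOnce(), vTrackedCloseNextTime()) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define TRACKED_MAX_SOCKETS    8

/* Hypothetical stand-in for a socket: it only counts effective closes. */
typedef struct TrackedSocket
{
    int closeCount;
} TrackedSocket_t;

static TrackedSocket_t * pxLive[ TRACKED_MAX_SOCKETS ]; /* registry of open sockets */
static TrackedSocket_t * pxPendingClose = NULL;         /* deferred close, if any   */

/* Record a newly created socket; returns false when the registry is full. */
bool xTrackedRegister( TrackedSocket_t * pxSocket )
{
    for( size_t i = 0; i < TRACKED_MAX_SOCKETS; i++ )
    {
        if( pxLive[ i ] == NULL )
        {
            pxLive[ i ] = pxSocket;
            return true;
        }
    }

    return false;
}

/* Close only if the socket is still registered; returns false otherwise. */
bool xTrackedCloseOnce( TrackedSocket_t * pxSocket )
{
    if( pxSocket == NULL )
    {
        return false;
    }

    for( size_t i = 0; i < TRACKED_MAX_SOCKETS; i++ )
    {
        if( pxLive[ i ] == pxSocket )
        {
            pxLive[ i ] = NULL;

            /* Drop a deferred close that still refers to this socket. */
            if( pxPendingClose == pxSocket )
            {
                pxPendingClose = NULL;
            }

            pxSocket->closeCount++; /* real code would release resources here */
            return true;
        }
    }

    return false; /* already closed, or never registered */
}

/* Deferred close that goes through the same single-close check. */
void vTrackedCloseNextTime( TrackedSocket_t * pxSocket )
{
    TrackedSocket_t * pxPrevious = pxPendingClose;

    pxPendingClose = pxSocket;

    if( ( pxPrevious != NULL ) && ( pxPrevious != pxSocket ) )
    {
        ( void ) xTrackedCloseOnce( pxPrevious );
    }
}
```

With this sketch, whichever close arrives second (internal or user) is rejected instead of touching released memory. A pointer registry like this does not by itself protect against an address being reused by a newly created socket, and a real implementation would also need to consider which task is allowed to touch the registry.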

Screenshots
None

Wireshark logs
None

Additional context
Debug - additional messages from Close Execution:

vSocketClose: Caller name: 1315BCS
vSocketClose: pxSocket: 0x0036FDD0
vSocketClose: pxSocket.usLocalPort: 12121
Lost: Socket 12121 now has 0 / 5 children
FreeRTOS_closesocket[12121 to C0A80064ip:49305]: buffers 64 socks 4
______________________prvProcessIPEventsAndTimers: vSocketClose about to get called______________________
vSocketClose: Caller name: 12A184S
vSocketClose: pxSocket: 0x00308870
vSocketClose: pxSocket.usLocalPort: 12345
Lost: Socket 12345 now has 0 / 5 children
FreeRTOS_closesocket[12345 to C0A80064ip:49304]: buffers 64 socks 3
______________________prvProcessIPEventsAndTimers: vSocketClose about to get called______________________
vSocketClose: Caller name: 12A184S
vSocketClose: pxSocket: 0x0036FDD0
vSocketClose: pxSocket.usLocalPort: 12602
FreeRTOS_closesocket: xSocket: 0x0036FDD0
FreeRTOS_closesocket: xSocket.usLocalPort: 12602

Note that vSocketClose() is called twice on pxSocket 0x0036FDD0; the second call happens while another socket's close is being executed.

Where a dump of the callers is:

0x0012cfe0 vSocketClose(): FreeRTOS_Sockets.c, line 1550	
0x001315bc vSocketCloseNextTime(): FreeRTOS_TCP_IP.c, line 126	
0x0013d7c0 vCheckNetworkTimers(): FreeRTOS_IP_Timers.c, line 243	
0x00129ffc prvProcessIPEventsAndTimers(): FreeRTOS_IP.c, line 290	
0x00129fe8 prvIPTask(): FreeRTOS_IP.c, line 271	

So the first close of the first socket comes from vSocketCloseNextTime() and the second comes from the user's request. On the second close, a DataAbort occurs inside uxListRemove().
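To illustrate why a second removal of the same list item is hazardous, here is a small self-contained model of a doubly linked list with a sentinel node. It is loosely patterned on the idea of a list item that records its owning list, but ModelList_t, ModelListItem_t and the functions below are hypothetical and are not the FreeRTOS list API. The checked removal only helps while the item's memory is still intact; once the socket has been freed, the stored pointers cannot be trusted at all.

```c
#include <stddef.h>

typedef struct ModelList ModelList_t;

/* A list item that remembers which list currently owns it. */
typedef struct ModelListItem
{
    struct ModelListItem * pxNext;
    struct ModelListItem * pxPrevious;
    ModelList_t * pxContainer; /* NULL when the item is not in any list */
} ModelListItem_t;

struct ModelList
{
    size_t uxNumberOfItems;
    ModelListItem_t xListEnd; /* sentinel that closes the ring */
};

void vModelListInitialise( ModelList_t * pxList )
{
    pxList->uxNumberOfItems = 0;
    pxList->xListEnd.pxNext = &pxList->xListEnd;
    pxList->xListEnd.pxPrevious = &pxList->xListEnd;
    pxList->xListEnd.pxContainer = NULL;
}

void vModelListInsertEnd( ModelList_t * pxList, ModelListItem_t * pxItem )
{
    pxItem->pxNext = &pxList->xListEnd;
    pxItem->pxPrevious = pxList->xListEnd.pxPrevious;
    pxList->xListEnd.pxPrevious->pxNext = pxItem;
    pxList->xListEnd.pxPrevious = pxItem;
    pxItem->pxContainer = pxList;
    pxList->uxNumberOfItems++;
}

/* Returns 1 if the item was unlinked, 0 if it was not in a list. */
int xModelListRemoveChecked( ModelListItem_t * pxItem )
{
    ModelList_t * pxList = pxItem->pxContainer;

    if( pxList == NULL )
    {
        return 0; /* second removal: nothing to unlink */
    }

    pxItem->pxNext->pxPrevious = pxItem->pxPrevious;
    pxItem->pxPrevious->pxNext = pxItem->pxNext;

    /* Clear the links so that a repeated call cannot follow stale pointers. */
    pxItem->pxNext = NULL;
    pxItem->pxPrevious = NULL;
    pxItem->pxContainer = NULL;
    pxList->uxNumberOfItems--;

    return 1;
}
```

An unchecked removal would dereference pxNext and pxPrevious again on the second call. If those pointers are stale or point into freed memory, that matches the kind of bad-address access reported above.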
