Commit 3226b15

edumazet authored and kuba-moo committed
net: avoid 32 x truesize under-estimation for tiny skbs
Both virtio net and napi_get_frags() allocate skbs with a very small skb->head.

While using page fragments instead of a kmalloc-backed skb->head might give a small performance improvement in some cases, there is a huge risk of under-estimating memory usage.

For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations per page (order-3 page on x86), or even 64 on PowerPC.

We have been tracking OOM issues on GKE hosts hitting tcp_mem limits, but consuming far more memory for TCP buffers than instructed in tcp_mem[2].

Even if we force napi_alloc_skb() to only use order-0 pages, the issue would still be there on arches with PAGE_SIZE >= 32768.

This patch makes sure that small skb heads are kmalloc-backed, so that other objects in the slab page can be reused instead of being held for as long as skbs are sitting in socket queues.

Note that we might in the future use the sk_buff napi cache, instead of going through a more expensive __alloc_skb().

Another idea would be to use separate page sizes depending on the allocated length (to never have more than 4 frags per page).

I would like to thank Greg Thelen for his precious help on this matter; analysing crash dumps is always a time-consuming task.

Fixes: fd11a83 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Greg Thelen <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
1 parent 7da1762 commit 3226b15

File tree

1 file changed: +7 -2 lines changed

net/core/skbuff.c (7 additions, 2 deletions)

@@ -501,20 +501,25 @@ EXPORT_SYMBOL(__netdev_alloc_skb);
 struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 				 gfp_t gfp_mask)
 {
-	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+	struct napi_alloc_cache *nc;
 	struct sk_buff *skb;
 	void *data;
 
 	len += NET_SKB_PAD + NET_IP_ALIGN;
 
-	if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
+	/* If requested length is either too small or too big,
+	 * we use kmalloc() for skb->head allocation.
+	 */
+	if (len <= SKB_WITH_OVERHEAD(1024) ||
+	    len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
 	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
 		skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
 		if (!skb)
 			goto skb_fail;
 		goto skb_success;
 	}
 
+	nc = this_cpu_ptr(&napi_alloc_cache);
 	len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	len = SKB_DATA_ALIGN(len);