sys_heap heap allocator optimizations #24249
Conversation
|
All checks are passing now. checkpatch (informational only, not a failure). Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages. |
|
parent PR now merged, please rebase |
|
On Tue, 14 Apr 2020, Andrew Boie wrote:
parent PR now merged, please rebase
Done.
|
|
Bunch of tests failing with " Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/lifo/lifo_usage/src/main.c:329: test_timeout_lifo_thread: packet != NULL is false" Gonna kick CI again to make sure. |
|
Seeing a consistent assertion failure: kicking again to make sure |
|
On Mon, 11 May 2020, Andrew Boie wrote:
Seeing a consistent assertion failure:
```
starting test - test_mheap_block_desc
Assertion failed at WEST_TOPDIR/zephyr/tests/kernel/mem_heap/mheap_api_concept/src/test_mheap_concept.c:129: test_mheap_block_desc: block_fail is not NULL
```
OK, found it.
The last patch is titled "sys_heap: reduce the size of struct
z_heap_bucket by half". It is missing the following change to a file that
didn't exist when the patch was initially created:
```
diff --git a/include/mempool_heap.h b/include/mempool_heap.h
index df64322b50..439371556d 100644
--- a/include/mempool_heap.h
+++ b/include/mempool_heap.h
@@ -35,7 +35,7 @@ struct k_mem_pool {
* that k_heap does not. We make space for the number of maximum
* objects defined, and include extra so there's enough metadata space
* available for the maximum number of minimum-sized objects to be
- * stored: 8 bytes for each desired chunk header, and a 24 word block
+ * stored: 8 bytes for each desired chunk header, and a 12 word block
* to reserve room for a "typical" set of bucket list heads (this size
* was picked more to conform with existing test expectations than any
* rigorous theory -- we have tests that rely on being able to
@@ -46,7 +46,7 @@ struct k_mem_pool {
K_HEAP_DEFINE(poolheap_##name, \
((maxsz) * (nmax)) \
+ 8 * ((maxsz) * (nmax) / (minsz)) \
- + 24 * sizeof(void *)); \
+ + 12 * sizeof(void *)); \
struct k_mem_pool name = { \
.heap = &poolheap_##name \
}
```
Should be fine now.
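For illustration, here is how the sizing expression above works out for a hypothetical pool (the parameter values are made up for this example and are not taken from any Zephyr test):
```
/* Hypothetical pool parameters, chosen only for illustration. */
#define MINSZ 16	/* minimum block size in bytes */
#define MAXSZ 64	/* maximum block size in bytes */
#define NMAX   4	/* number of maximum-size blocks */

#include <stdio.h>

int main(void)
{
	/* Mirrors the K_HEAP_DEFINE() expression in the diff above:
	 * data area, plus 8 bytes of chunk header per minimum-sized
	 * object, plus 12 words reserved for bucket list heads.
	 */
	unsigned long data  = MAXSZ * NMAX;                  /* 256 */
	unsigned long hdrs  = 8 * ((MAXSZ * NMAX) / MINSZ);  /* 128 */
	unsigned long extra = 12 * sizeof(void *);           /* 96 on a 64-bit host */

	printf("heap size = %lu bytes\n", data + hdrs + extra);
	return 0;
}
```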
|
Still failing. |
andyross
left a comment
Just throwing up my +1 for when this exits the CI banging phase. These all look great to me.
|
On Tue, 12 May 2020, Andy Ross wrote:
Just throwing up my +1 for when this exits the CI banging phase. These all look great to me.
Thanks.
What is not great are all those tests: some of them are pure bollocks in
the context of the heap allocator.
Some patches in this series reduce the heap footprint overhead and then
some allocation tests pass when they were expected to fail with ENOMEM.
So futzing with the fake numbers within Z_MEM_POOL_DEFINE() makes them
pass, but then other tests start failing for the opposite reason.
So I just picked another semi-random number to satisfy all tests...
unless I missed some.
|
First, some renames to make accessors more explicit:
size() --> chunk_size()
used() --> chunk_used()
free_prev() --> prev_free_chunk()
free_next() --> next_free_chunk()
Then, the return type of chunk_size() is changed from chunkid_t to size_t, and chunk_used() from chunkid_t to bool. The left_size() accessor is used only once and can easily be substituted by left_chunk(), so it is removed. And in free_list_add() the variable b is renamed to bi so as to be consistent with its usage in sys_heap_alloc(). Signed-off-by: Nicolas Pitre <[email protected]>
Let's provide accessors for getting and setting every field, so that the chunk header layout is abstracted away from the main code. Those are:
SIZE_AND_USED: chunk_used(), chunk_size(), set_chunk_used() and set_chunk_size().
LEFT_SIZE: left_chunk() and set_left_chunk_size().
FREE_PREV: prev_free_chunk() and set_prev_free_chunk().
FREE_NEXT: next_free_chunk() and set_next_free_chunk().
To be consistent, the former chunk_set_used() is now set_chunk_used(). Signed-off-by: Nicolas Pitre <[email protected]>
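As a rough illustration of that accessor scheme (a simplified sketch with an assumed field layout, not the actual sys_heap code), the getters and setters might look like this:
```
/* Simplified sketch only: the real chunk header is packed differently. */
#include <stdbool.h>
#include <stddef.h>

typedef size_t chunkid_t;

struct demo_chunk {
	size_t left_size;	/* LEFT_SIZE field */
	size_t size;		/* SIZE_AND_USED: size part */
	bool   used;		/* SIZE_AND_USED: used part */
	chunkid_t free_prev;	/* FREE_PREV field */
	chunkid_t free_next;	/* FREE_NEXT field */
};

struct demo_heap {
	struct demo_chunk chunk[16];
};

static inline bool chunk_used(struct demo_heap *h, chunkid_t c)
{
	return h->chunk[c].used;
}

static inline size_t chunk_size(struct demo_heap *h, chunkid_t c)
{
	return h->chunk[c].size;
}

static inline void set_chunk_used(struct demo_heap *h, chunkid_t c, bool used)
{
	h->chunk[c].used = used;
}

static inline void set_chunk_size(struct demo_heap *h, chunkid_t c, size_t size)
{
	h->chunk[c].size = size;
}

static inline chunkid_t left_chunk(struct demo_heap *h, chunkid_t c)
{
	return c - h->chunk[c].left_size;
}

static inline void set_left_chunk_size(struct demo_heap *h, chunkid_t c,
					size_t size)
{
	h->chunk[c].left_size = size;
}

static inline chunkid_t prev_free_chunk(struct demo_heap *h, chunkid_t c)
{
	return h->chunk[c].free_prev;
}

static inline chunkid_t next_free_chunk(struct demo_heap *h, chunkid_t c)
{
	return h->chunk[c].free_next;
}

/* set_prev_free_chunk() and set_next_free_chunk() would follow the
 * same pattern as the setters above.
 */
```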
By storing the used flag in the LSB, it is no longer necessary to have a size_mask variable to locate that flag. This produces smaller and faster code. Replace the validation check in chunk_set() to base it on the storage type. Also clarify the semantics of set_chunk_size(), which allows for clearing the used flag bit unconditionally; this simplifies the code further. The idea of moving the used flag bit into the LEFT_SIZE field was raised. It turns out that this isn't as beneficial as it may seem, because the used bit is set only once, i.e. when the memory is handed off to a user, and the size field becomes frozen at that point. Modifications on the leftward chunk may still occur, and extra instructions to preserve that bit would be necessary if it were moved there. Signed-off-by: Nicolas Pitre <[email protected]>
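A minimal sketch of that encoding, assuming one word of SIZE_AND_USED storage per chunk (names and storage layout are illustrative only):
```
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef size_t chunkid_t;
typedef uint32_t chunk_field_t;

/* One packed SIZE_AND_USED field per chunk: size in the upper bits,
 * used flag in the LSB.
 */
static chunk_field_t size_and_used[16];

static inline bool chunk_used(chunkid_t c)
{
	return (size_and_used[c] & 1U) != 0U;
}

static inline size_t chunk_size(chunkid_t c)
{
	return size_and_used[c] >> 1;
}

static inline void set_chunk_used(chunkid_t c, bool used)
{
	size_and_used[c] = (size_and_used[c] & ~1U) | (used ? 1U : 0U);
}

/* Setting the size unconditionally clears the used bit, which is the
 * simplification mentioned above.
 */
static inline void set_chunk_size(chunkid_t c, size_t size)
{
	size_and_used[c] = (chunk_field_t)(size << 1);
}
```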
It is possible to remove a few fields from struct z_heap, removing some runtime indirections by doing so:
- The buf pointer is actually the same as the struct z_heap pointer itself. So let's simply create chunk_buf() that performs a type conversion. That type is also chunk_unit_t now rather than u64_t, so it can be defined based on CHUNK_UNIT.
- Replace the struct z_heap_bucket pointer by a zero-sized array at the end of struct z_heap.
- Make chunk #0 into an actual chunk with its own header. This allows for removing the chunk0 field and streamlining the code. This way h->chunk0 becomes right_chunk(h, 0).
This sets the table for further simplifications to come. Signed-off-by: Nicolas Pitre <[email protected]>
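A rough sketch of what those layout changes amount to (names here are illustrative, not the actual struct z_heap definition):
```
#include <stdint.h>

#define CHUNK_UNIT 8

typedef struct { char bytes[CHUNK_UNIT]; } chunk_unit_t;

struct demo_bucket {
	uint32_t next;
};

struct demo_heap {
	uint32_t len;
	uint32_t avail_buckets;
	/* Bucket list heads as a flexible array member at the end of the
	 * struct (the commit uses a zero-sized array, the GNU C
	 * equivalent), replacing a separate pointer field.
	 */
	struct demo_bucket buckets[];
};

/* The heap header itself occupies chunk #0, so the chunk buffer is just
 * a cast of the heap pointer: no separate buf field is needed.
 */
static inline chunk_unit_t *chunk_buf(struct demo_heap *h)
{
	return (chunk_unit_t *)h;
}
```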
With this we can remove magic constants, especially those used with big_heap(). Signed-off-by: Nicolas Pitre <[email protected]>
We already have chunk #0 containing our struct z_heap and marked as used. We can add a partial chunk at the very end that is also marked as used. By doing so there is no longer a need for checking heap boundaries at run time when merging/splitting chunks, meaning fewer conditionals in the code's hot path. Signed-off-by: Nicolas Pitre <[email protected]>
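Conceptually, the effect can be sketched like this (simplified, with made-up names; not the actual heap code): with used sentinel chunks at both ends, the merge path never needs to compare against the heap bounds.
```
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NCHUNKS 16

/* One entry per chunk-sized unit; chunks[c].size is the chunk length
 * in units, so c + size is the chunk to the right.
 */
static struct {
	size_t size;
	bool used;
} chunks[NCHUNKS];

static void heap_init(void)
{
	chunks[0].size = 1;			/* chunk #0: header, marked used */
	chunks[0].used = true;
	chunks[1].size = NCHUNKS - 2;		/* one big free chunk */
	chunks[1].used = false;
	chunks[NCHUNKS - 1].size = 1;		/* trailing partial chunk, used */
	chunks[NCHUNKS - 1].used = true;
}

/* Merge the right neighbour into c if it is free.  No bounds check is
 * needed: the walk always stops on one of the used sentinel chunks.
 */
static void merge_right_if_free(size_t c)
{
	size_t r = c + chunks[c].size;

	if (!chunks[r].used) {
		chunks[c].size += chunks[r].size;
	}
}

int main(void)
{
	heap_init();
	merge_right_if_free(1);	/* hits the end sentinel, nothing to merge */
	printf("chunk 1 size: %zu\n", chunks[1].size);
	return 0;
}
```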
Avoid redundancy and bucket_idx() usage when possible. Signed-off-by: Nicolas Pitre <[email protected]>
Make the LEFT_SIZE field first and the SIZE_AND_USED field last (for an allocated chunk) so they sit right next to the allocated memory. The current chunk's SIZE_AND_USED field points to the next (right) chunk, and from there the LEFT_SIZE field should point back to the current chunk. Many trivial memory overflows should trip that test. One way to make this test more robust could involve xor'ing the values within respective accessor pairs. But at least the fact that the size value is shifted by one bit already prevents fooling the test with a same-byte corruption. Signed-off-by: Nicolas Pitre <[email protected]>
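A sketch of that back-pointer round trip, using illustrative field names rather than the actual sys_heap helpers:
```
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-chunk header fields, illustrative layout only. */
static struct {
	size_t left_size;	/* LEFT_SIZE: size of the chunk to the left */
	size_t size;		/* SIZE_AND_USED: size of this chunk */
} chunks[16];

/* The current chunk's size leads to its right neighbour; that
 * neighbour's LEFT_SIZE must lead straight back.  An overflow that
 * clobbers either field breaks the round trip.
 */
static bool chunk_header_ok(size_t c)
{
	size_t r = c + chunks[c].size;		/* right_chunk(c) */

	return r - chunks[r].left_size == c;	/* left_chunk(r) == c */
}

int main(void)
{
	chunks[1].size = 4;
	chunks[5].left_size = 4;
	assert(chunk_header_ok(1));
	return 0;
}
```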
This struct is taking up most of the heap's constant footprint overhead. We can easily get rid of the list_size member as it is mostly used to determine if the list is empty, and that can be determined through other means. Signed-off-by: Nicolas Pitre <[email protected]>
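One way to test for an empty bucket without a counter is a per-bucket availability bit; the sketch below is illustrative and its field names are assumptions, not the actual sys_heap definitions:
```
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct demo_heap {
	uint32_t avail_buckets;	/* bit b set => bucket b has free chunks */
};

static inline bool bucket_is_empty(const struct demo_heap *h, int bidx)
{
	return (h->avail_buckets & (1U << bidx)) == 0U;
}

int main(void)
{
	struct demo_heap h = { .avail_buckets = 1U << 3 };

	printf("bucket 3 empty: %d\n", bucket_is_empty(&h, 3));	/* 0 */
	printf("bucket 4 empty: %d\n", bucket_is_empty(&h, 4));	/* 1 */
	return 0;
}
```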
|
There are some build failures here that I am not sure are attributable to this PR's changes. @nashif any ideas? |
|
The current failure in |
@carlescufi The QEMU ARC platform does not have icount enabled and can be quite unstable when running sanitychecks (basically the same issue we had with other platforms earlier). Re-triggering should fix the problem. EDIT: This is not an intermittent failure. I just locally ran this test and it consistently fails (both on the latest master and in this PR). It looks like this test is actually broken (see #26163, disabled in #26328). |
This goes on top of PR #23941 and therefore patches from PR #23941 are included here.
This series implements an assortment of optimizations providing both
execution efficiency gains and footprint overhead reduction. Because of that, a few tests presuming a higher overhead are now failing.
I'm posting this here for people to see and comment.