Misc. bug: Performance regression on aarch64 q4_0 #14134

### Name and Version

llama-cli --version
version: 5615 (f470bc36)
built with Android (13324770, +pgo, +bolt, +lto, +mlgo, based on r530567d) clang version 19.0.0 (https://android.googlesource.com/toolchain/llvm-project 97a699bf4812a18fb657c2779f5296a4ab2694d2) for x86_64-unknown-linux-gnu

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

llama-bench

### Command line

```shell

```

### Problem description & steps to reproduce

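Q4_0 prompt-processing throughput dropped sharply after commit f470bc36 (build 5615). With the same model and settings, pp512 fell from 138.02 t/s on the previous build 8f47e25f (5614) to 15.84 t/s, roughly 8.7x slower.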
/build-android-f470bc36/llama-bench -m ../gemma-2-2b-q4_0.gguf -p 512 -n 0                                            
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| gemma2 2B Q4_0                 |   1.51 GiB |     2.61 B | CPU        |       8 |           pp512 |         15.84 ± 0.01 |

build: f470bc36 (5615)
/build-android-8f47e25f/llama-bench -m ../gemma-2-2b-q4_0.gguf -p 512 -n 0                                            
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| gemma2 2B Q4_0                 |   1.51 GiB |     2.61 B | CPU        |       8 |           pp512 |        138.02 ± 8.88 |

build: 8f47e25f (5614)
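
To put a number on the regression, the two pp512 rows above can be compared directly. The following is a minimal sketch, not part of llama.cpp: `extract_tps` and `slowdown` are hypothetical helper names, and the column positions assume the markdown table layout that llama-bench printed in this report.

```shell
# Print the mean t/s of one llama-bench test (e.g. pp512) from markdown output on stdin.
# Assumption: the "test" and "t/s" columns are the 6th and 7th table columns, as above.
extract_tps() {
    awk -F'|' -v test="$1" '
        NF >= 8 {
            name = $7
            gsub(/ /, "", name)          # strip padding from the "test" cell
            if (name == test) {
                split($8, parts, " ")    # "t/s" cell is "mean ± stddev"; keep the mean
                print parts[1]
            }
        }'
}

# Print how many times slower the second figure is than the first.
slowdown() {
    awk -v good="$1" -v bad="$2" 'BEGIN { printf "%.2f\n", good / bad }'
}

# Rows copied from the two runs in this report.
good_row='| gemma2 2B Q4_0                 |   1.51 GiB |     2.61 B | CPU        |       8 |           pp512 |        138.02 ± 8.88 |'
bad_row='| gemma2 2B Q4_0                 |   1.51 GiB |     2.61 B | CPU        |       8 |           pp512 |         15.84 ± 0.01 |'

good=$(printf '%s\n' "$good_row" | extract_tps pp512)
bad=$(printf '%s\n' "$bad_row" | extract_tps pp512)
echo "pp512: $good -> $bad t/s ($(slowdown "$good" "$bad")x slower)"
```

With the figures reported here this prints `pp512: 138.02 -> 15.84 t/s (8.71x slower)`. In practice the output of each build's llama-bench run could be piped into `extract_tps pp512` instead of using the copied rows.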

### First Bad Commit

f470bc36 (build 5615); the immediately preceding build 8f47e25f (5614) does not show the regression.

### Relevant log output

```shell

```
