Add use of Clang attribute((nonnull)) in Emscripten system headers. #16545

juj · 2022-03-21T12:55:10Z

This allows users to get a compile time warning if they try to pass a statically null pointer value to a system header that is not intended to accept such a value.

sbc100 · 2022-03-21T18:55:26Z

Is it worth using a macro here in case these headers ever need to be used on a non-clang compiler? (could be it shorter and less visually noisy too).

sbc100 · 2022-03-21T19:01:16Z

Makes me wonder if we also want to use __restrict on some of these?

juj · 2022-03-22T08:07:52Z

I did actually first do this with a EMSCRIPTEN_NONNULL directive, however backed out on it, because

int emscripten_futex_wait(volatile void *addr EMSCRIPTEN_NONNULL, uint32_t val, double maxWaitMilliseconds);

vs

int emscripten_futex_wait(volatile void *addr __attribute__((nonnull)), uint32_t val, double maxWaitMilliseconds);

I found that in IDE (and looks like also on GitHub), using EMSCRIPTEN_NONNULL loses syntax highlighting, so visually it did parse off worse to my eye than directly using __attribute__((unused)) which at least highlights partially.

Also an indirection can cause developers to need to look up what the macro means, if they don't guess its meaning (or doubt themselves).

Finally, our system headers aren't particularly include clean anymore, so I wish not to add to the include chain dependency. Having almost every include file include one more header adds to that load. (not sure how much that matters for compilation unit perf though nowadays.. in the past ~decade+ ago it used to be noticeable, but I might be a dinosaur on this)

in case these headers ever need to be used on a non-clang compiler?

I wonder if that should be a use case even. In a distant future if that becomes a thing, we can look into it then?

juj · 2022-03-22T08:54:04Z

Makes me wonder if we also want to use __restrict on some of these?

My understanding is that restrict enables more optimizations on the implementation side of the function, not outside the function on the caller side (expect than to warn if one does attempt to pass aliasing pointers to the function). Most of these functions are implemented in JS side, so there wouldn't be any optimizations that LLVM would be able to perform there.

Skimming all of these declarations, I was not able to find any that would receive two or more pointers of the same type (pointers to different types are already always assumed to not alias by the compiler).

juj · 2022-03-22T08:56:31Z

I learn that there are two different types of nonnull annotations in Clang: __attribute__((nonnull)) and _Nonnull.

The first tells the compiler that it can optimize based on the assumption that the pointer will never be null. The second annotation does not affect optimizations, but simply emits diagnostics if a static null is passed. Adjusted the code to use _Nonnull in the places where we actually have error checking of inputs in place that should not get optimized out.

sbc100 · 2022-03-22T15:08:11Z

I did actually first do this with a EMSCRIPTEN_NONNULL directive, however backed out on it, because
int emscripten_futex_wait(volatile void *addr EMSCRIPTEN_NONNULL, uint32_t val, double maxWaitMilliseconds);
vs
int emscripten_futex_wait(volatile void *addr __attribute__((nonnull)), uint32_t val, double maxWaitMilliseconds);
I found that in IDE (and looks like also on GitHub), using EMSCRIPTEN_NONNULL loses syntax highlighting, so visually it did parse off worse to my eye than directly using __attribute__((unused)) which at least highlights partially.

That argument makes some sense to me.

Also an indirection can cause developers to need to look up what the macro means, if they don't guess its meaning (or doubt themselves).

Finally, our system headers aren't particularly include clean anymore, so I wish not to add to the include chain dependency. Having almost every include file include one more header adds to that load. (not sure how much that matters for compilation unit perf though nowadays.. in the past ~decade+ ago it used to be noticeable, but I might be a dinosaur on this)

There are some cases where include-bloat is an issue, but for things like this example, we already have split up our includes nad have a very-tiny header called em_types.h that is designed to be included everywhere. That is why em_types.h was created. Tiny C-only headers like this have near-zero cost compared the C++ header bloat the most compilation unit experience. I think we should be including em_types.h everywhere, and assume is has a non-measurable overhead unless proven otherwise.

in case these headers ever need to be used on a non-clang compiler?

I wonder if that should be a use case even. In a distant future if that becomes a thing, we can look into it then?

It used to be the case the the emscripten headers we designed to also be running able through gcc and/or msvc but I don't think I've actually don't that in years, so maybe its fine to make them clang-only? (I guess both these new attributes are gcc-compatible though?)

sbc100 · 2022-03-22T15:12:36Z

system/include/emscripten/atomic.h

-uint16_t emscripten_atomic_exchange_u16(void/*uint16_t*/ *addr, uint16_t newVal);
-uint32_t emscripten_atomic_exchange_u32(void/*uint32_t*/ *addr, uint32_t newVal);
-uint64_t emscripten_atomic_exchange_u64(void/*uint64_t*/ *addr, uint64_t newVal);
+uint8_t emscripten_atomic_exchange_u8(void/*uint8_t*/ *addr __attribute__((nonnull)), uint8_t newVal);


Wouldn't it also be nice to have the compiler warning if a user passes NULL here?

The compiler will warn if a user passes a null here - that is what __attribute__((nonnull)) will do. But in addition to issuing the warning, the compiler can optimize code inside the function, and at the calls to that function with the expectation that any non-statically-inferrable value passed here will not be a null.

So i.e. if one had silly code like

void foo(const char *x __attribute__((nonnull))) { printf("%s\n", x); } extern const char *bar(void); int main() { const char *s = bar(); foo(s); if (!s) // This can be assumed never to be true, since the contract with foo says it can only be called with nonnull pointers. Clang could issue a warning here. { // we cannot get here! printf("s was null!\n"); } }

However now I see that Clang does not actually do this kind of analysis when trying out in practice. :/

To answer the difference between __attribute__((nonnull)) and _Nonnull: if instead the function foo read

void foo(const char *x __attribute__((nonnull))) { if (x) printf("%s\n", x); }

the compiler will issue a diagnostic warning, because if (x) is tautologically true: the contract with x says it will never be null.

So if we have a function that does want to do something different when x actually is null, then we want to just use the weaker _Nonnull semantics:

void foo(const char * _Nonnull x) // issue a warning if x is null, but don't affect codegen { if (x) printf("%s\n", x); }

I see, from your earlier description had though that one triggered optimization and the triggers warnings, but its seem that the both trigger warning and in addition the the __attribute__ form allows optimization.

This makes sense to me now.

sbc100 · 2022-03-22T15:13:31Z

tests/test_html5_core.c

    if (!plce.isActive) {
      printf("Requesting pointer lock..\n");
-      ret = emscripten_request_pointerlock(0, 1);
+      ret = emscripten_request_pointerlock("#canvas", 1);


What was the effect of passing zero here previously?

Under the old semantics 0 meant to use #canvas, but that support was deprecated.

system/include/emscripten/threading.h

sbc100

lgtm % formatting

system/include/webgl/webgl2_ext.h

juj · 2023-03-06T12:32:51Z

Updated the PR for formatting.

sbc100

Still lgtm.

In practice, did you notice any new warnings showing up in real world code? i.e. did this reveal any latent bugs in real code?

system/include/emscripten/console.h

sbc100 · 2023-03-06T16:22:05Z

test/test_core.py

  @node_pthreads
  def test_emscripten_futexes(self):
    self.set_setting('USE_PTHREADS')
+    self.emcc_args += ['-Wno-nonnull']


Can we add a comment as to why this warning is being disabled? Is it strictly needed for the test?

Yeah, the test is explicitly checking that passing null returns in EINVAL error, which would trigger a diagnostic in the test.

juj · 2023-03-07T12:28:03Z

In practice, did you notice any new warnings showing up in real world code? i.e. did this reveal any latent bugs in real code?

I recall I started this PR after I saw a bug in Unity's code that would have been caught if these annotations had existed. I do not know this PR revealing any new bugs in Emscripten's codebase itself.

…s. (emscripten-core#16545) * Add use of Clang __attribute__((nonnull)) in Emscripten system headers. * Fix annotation on emscripten_futex_wake * Remove nonnull on emscripten_fetch_wait * Fix annotation and tests * Add use of _Nonnull attribute * Fix test * Break up long lines in webgl2_ext.h * Add notes about null pointers in emmalloc.h

juj force-pushed the attribute_nonnull branch 2 times, most recently from 7c3eeff to a6feeea Compare March 21, 2022 16:06

sbc100 reviewed Mar 22, 2022

View reviewed changes

system/include/emscripten/threading.h Show resolved Hide resolved

sbc100 reviewed Mar 23, 2022

View reviewed changes

system/include/webgl/webgl2_ext.h Outdated Show resolved Hide resolved

juj added 6 commits March 6, 2023 14:09

Add use of Clang __attribute__((nonnull)) in Emscripten system headers.

d331bb6

Fix annotation on emscripten_futex_wake

de63d6b

Remove nonnull on emscripten_fetch_wait

93f4de7

Fix annotation and tests

4aa4d1f

Add use of _Nonnull attribute

b112582

Fix test

98e3d98

juj force-pushed the attribute_nonnull branch from 02f0da5 to 98e3d98 Compare March 6, 2023 12:09

juj added 2 commits March 6, 2023 14:22

Break up long lines in webgl2_ext.h

d17721f

Add notes about null pointers in emmalloc.h

eaaf213

juj enabled auto-merge (squash) March 6, 2023 12:32

sbc100 approved these changes Mar 6, 2023

View reviewed changes

juj merged commit 5441b40 into emscripten-core:main Mar 6, 2023

Add use of Clang __attribute__((nonnull)) in Emscripten system headers. #16545

Add use of Clang __attribute__((nonnull)) in Emscripten system headers. #16545

Uh oh!

Conversation

juj commented Mar 21, 2022

Uh oh!

sbc100 commented Mar 21, 2022

Uh oh!

sbc100 commented Mar 21, 2022

Uh oh!

juj commented Mar 22, 2022

Uh oh!

juj commented Mar 22, 2022

Uh oh!

juj commented Mar 22, 2022

Uh oh!

sbc100 commented Mar 22, 2022

Uh oh!

sbc100 Mar 22, 2022

Choose a reason for hiding this comment

Uh oh!

juj Mar 23, 2022

Choose a reason for hiding this comment

Uh oh!

sbc100 Mar 23, 2022

Choose a reason for hiding this comment

Uh oh!

sbc100 Mar 22, 2022

Choose a reason for hiding this comment

Uh oh!

juj Mar 23, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sbc100 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

juj commented Mar 6, 2023

Uh oh!

sbc100 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

sbc100 Mar 6, 2023

Choose a reason for hiding this comment

Uh oh!

juj Mar 7, 2023

Choose a reason for hiding this comment

Uh oh!

juj commented Mar 7, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Add use of Clang attribute((nonnull)) in Emscripten system headers. #16545

Add use of Clang attribute((nonnull)) in Emscripten system headers. #16545