Skip to content

std.os.windows: LPSTR/LPCSTR/LPWSTR/LPCWSTR/etc being sentinel-terminated is not always correct #20858

@squeek502

Description

@squeek502

Zig Version

0.14.0-dev.625+432905223

Steps to Reproduce and Observed Behavior

From https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/50e9ef83-d6fd-4e22-a34a-2c6b4e3c24f3:

The LPWSTR type is a 32-bit pointer to a string of 16-bit Unicode characters, which MAY be null-terminated. The LPWSTR type specifies a pointer to a sequence of Unicode characters, which MAY be terminated by a null character (usually referred to as "null-terminated Unicode").

Zig defines LPWSTR and LPCWSTR and friends as sentinel-terminated:

zig/lib/std/os/windows.zig

Lines 2850 to 2851 in 390c7d8

pub const LPWSTR = [*:0]WCHAR;
pub const LPCWSTR = [*:0]const WCHAR;

The sentinel was added to these types in 2662e50

This is not a huge deal, but it makes using APIs that take a buffer as input a bit awkward. Take GetModuleFileNameW, for example:

[out] lpFilename
A pointer to a buffer that receives the fully qualified path of the module. If the length of the path is less than the size that the nSize parameter specifies, the function succeeds and the path is returned as a null-terminated string.

[in] nSize
The size of the lpFilename buffer, in WCHARs.

That is, the result is null-terminated, but the buffer does not need to be. With Zig's definition of LPWSTR, though, the buffer itself needs to be sentinel-terminated, which also means that using buf.len directly as the length is incorrect:

// Zig's bindings force the sentinel terminator here, but it's not actually necessary
var buf: [255:0]u16 = undefined;
// By using buf.len we actually undercount the size of the buffer by 1, instead `buf.len + 1` should be passed
const len = std.os.windows.kernel32.GetModuleFileNameW(null, &buf, buf.len);
const name = buf[0..len:0];

Expected Behavior

This is a "problem" introduced by the Zig type system actually being able to encode a pointer being sentinel terminated, so I think there are a few plausible solutions here:

  1. Remove the sentinel terminator from these types, and make callsites to deal with the ambiguity
  2. Add separate Zig-specific types for the sentinel-terminated version and the non-sentinel-terminated version and potentially deprecate LPSTR/LPCSTR/LPWSTR/LPCWSTR/etc.
  3. Remove the LPSTR/LPCSTR/LPWSTR/LPCWSTR types entirely and replace all instances of them with the specific type itself (so, for example, all LPWSTR instances would get changed to either [*]WCHAR or [*:0]WCHAR)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugObserved behavior contradicts documented or intended behavioros-windowsstandard libraryThis issue involves writing Zig code for the standard library.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions