Skip to content

Conversation

@eramongodb
Copy link
Contributor

@eramongodb eramongodb commented Jul 28, 2025

Resolves CXX-3233. Followup to #1412.

This is 3 out of an estimated 7 major PRs which in total are expected to resolve CXX-2745.

This PR provides implementations and tests for the v1 API introduced by CXX-3232 in #1412.

Important

This PR deliberately leaves the v_noabi API interfaces, implementations, and tests unchanged to facilitate direct, side-by-side comparisons between v1 and v_noabi. Redundancy and duplication between v1 and v_noabi will be addressed by the bsoncxx v_noabi refactor (CXX-3234).


In summary, notable changes (relative to the v_noabi API's implementation and tests) include:

  • Smaller ABI footprint: 136 (unique) exported v1 symbols vs. 250 (approximately equivalent) exported v_noabi symbols.
    • Note: v1 is still missing equivalents for builder::*, validate::*, and vector::*, which bring the v_noabi symbol count up to 325.
    • The biggest ABI footprint savings come from:
      • Nearly all array::* function defined in terms of document::* except v1::array::view::find(std::uint32_t) const.
      • Defining types::value constructors in terms of b_<type> overloads only.
      • Inlined definitions of comparison operator overloads and many other trivial functions.
  • "Internal" component headers lib/**/*.hh.
    • A pattern that is more common for the mongocxx v_noabi implementations, but unlike mongocxx, v1 keeps them lightweight (definitions in the .cpp to minimize recompilations).
    • Internal-only API within internal classes instead of friend declarations to access "private" interfaces or data.
    • The public static member functions within internal classes declare the exact set of private data being exposed for use by dependent components (including tests).
    • Invocations of specific internal public static member functions explicitly documents-as-code the dependents that require corresponding private data.
  • An error category is consistently defined as a static local immortal object within the error category function.
    • The name of the category is the fully-qualified name of the associated class (e.g. bsoncxx::v1::decimal128) or the error code enumeration otherwise (e.g. bsoncxx::v1::source_errc).
    • Every error code defines an equivalent() mapping to the source_errc and type_errc error condition enumerations.
  • More applications of the new BSONCXX_V1_TYPES_XMACRO X-macro pattern in place of #include <bsoncxx/enums/type.hpp> X-macro headers.
  • Implementation of the "inline PIMPL" idiom referenced in CXX-3232 (placement new of impl object into reserved data member storage rather than dynamic allocation).
  • A new internal and undocumented "invalid" state for v1::document::element used to implement the alternative approach to CXX-2120. Note the deliberate lack of any mention of this state in the public API: it is meant to be entirely unobservable except in service of corresponding exception error messages.
  • "Test" component headers (test/**/*.hh) defining Catch stringification for tested entities.
    • Alternative to the include-everything bsoncxx/test/catch.hh header to minimize recompilations.
    • Open to alternative design proposals which are not susceptible to ODR violation of StringMaker<T> specializations. 🫠
    • bsoncxx::test::stringify() as equivalent to Catch::Detail::Stringify() (deduces T in call to StringMaker<T>::convert()).
    • New "stringify" test cases for Catch expression decomposition behavior (implicit "to string" test coverage).
    • Catch stringification validated by running tests with the --success flag.
  • Catch test suite quality improvements.
    • Nearly 100% test coverage of bsoncxx v1 API (note: don't expect this for mongocxx...), including copy+move semantics (note: do expect this for mongocxx) to avoid problems like this, this, etc.
      • Especially important for inline PIMPL classes and v1::document::value to ensure correct impl and custom deleter behavior.
      • CHECK_NOFAIL used to document behavior which may vary across standard library implementations (surprising!).
    • [namespace][to][component] for better test case filtering.
      • [error] tag for testing error-code-specific behavior (e.g. mapping codes to conditions).
      • [test] tag for test-suite-specific behavior (e.g. Catch stringification).
    • All documented exception-throwing code paths are explicitly tested with CHECK_THROWS_WITH_CODE (excluding v1::oid::oid()...).
    • Static assertions of important compile-time properties (e.g. overload resolution behavior).
      • Defined in test components instead of library components to avoid impacting library-only build performance.

Note

To avoid v1 test components depending on the v_noabi API (no v1 library or test component includes v_noabi headers), BSON data are hand-written as binary data (e.g. std::uint8_t data[] = {...}) instead of using the bsoncxx::v_noabi::builder::* API. These may eventually be substituted with a v1 builder API, but I think there is also merit to keeping them independent of any builder API, such that the test cases are not dependent on unrelated behavior and minimizes cross-component test behavior dependencies. 🤔

@eramongodb eramongodb self-assigned this Jul 28, 2025
Comment on lines +46 to +48
if (!_view) {
return this->cend();
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not present in v_noabi). This is a consequence of the new, well-defined "invalid" state of an array view.

Comment on lines 54 to 62
if (!bson_init_static(&bson, _view.data(), _view.length())) {
throw v1::exception{code::invalid_data};
}

bson_iter_t iter;

if (!bson_iter_init_find_w_len(&iter, &bson, key.c_str(), static_cast<int>(key.length()))) {
return this->end();
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly different to v_noabi's bson_init_static -> bson_iter_init -> bson_iter_init_find implementation, but should still be equivalent in observable behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bson_iter_find functions may return false if they fail to find the element, but also may return false if they encounter a parse error (and set iter.err_off). Should this also throw invalid_data in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Yes, it should throw invalid_data.

Comment on lines +39 to +45
if (sv.empty()) {
throw v1::exception{code::empty_string};
}

if (sv.size() > std::size_t{INT_MAX}) {
throw v1::exception{code::invalid_string_length};
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks are not present in v_noabi.


namespace {

constexpr std::uint8_t k_default_view[5] = {5u, 0u, 0u, 0u, 0u};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the special empty document representation that is used to avoid v1::document::value allocation (with a no-op deleter).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Can this be a std::array<u8, 5>? (clang-tidy nags about arrays)

Comment on lines +68 to +70
if (!bytes) {
throw v1::exception{code::null_bytes_ptr};
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not present in v_noabi, which may lead to library-level undefined behavior when invoking memcpy().

Comment on lines +144 to +145
// For backward compatibility, do not prematurely truncate strings in bsoncxx API. Instead, defer handling of potential
// embedded null bytes to the bson library.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment documents an observation that v_noabi's make_copy_for_libbson (which in hindsight should really be moved out of private and into v_noabi along with a few more current private headers) uses memcpy rather than str(n)cpy to create owning BSON strings. The v1 implementation preserves this behavior, avoiding truncation and deferring null byte validation to the bson library.

#pragma pop_macro("X")
}

// BSONCXX_V1_TYPES_XMACRO: update below.
Copy link
Contributor Author

@eramongodb eramongodb Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overloads below implement the behavior currently handled by v_noabi's convert_to_libbson() in convert.hh, but without exposing bson_value_t to dependent components.

Comment on lines +194 to +196
if (v.value.size() > UINT32_MAX) {
throw v1::exception{code::invalid_length_u32};
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These UINT32_MAX checks are not in v_noabi and guard the subsequent casts to std::uint32_t.


} // namespace

view view::internal::make(bson_value_t const& v) {
Copy link
Contributor Author

@eramongodb eramongodb Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implements the behavior currently handled by v_noabi's convert_from_libbson() in convert.hh, but without exposing bson_value_t to dependent components.

@eramongodb eramongodb marked this pull request as ready for review July 28, 2025 20:00
@eramongodb eramongodb requested a review from a team as a code owner July 28, 2025 20:00
@eramongodb eramongodb requested review from kevinAlbs, mdb-ad and vector-of-bool and removed request for mdb-ad July 28, 2025 20:00
static v1::stdx::optional<bson_iter_t> to_bson_iter(view const& v);

private:
friend view;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving the private static methods of element::view::internal to element::view::impl:

// In bsoncxx::v1::element::view::impl:

static view::impl const& to_impl(view const& self) {
    return *reinterpret_cast<view::impl const*>(self._storage.data());
}

static view::impl const* to_impl(view const* self) {
    return reinterpret_cast<view::impl const*>(self->_storage.data());
}

static view::impl* to_impl(view* self) {
    return reinterpret_cast<view::impl*>(self->_storage.data());
}

That appears to allow removing the friend declaration. IIUC the internal classes are intended as a controlled internal view into private API from other components. In this case, no other components need to (or can) call these private static methods.

Comment similarly applies to types::value::internal.

@eramongodb eramongodb requested a review from kevinAlbs August 6, 2025 16:01
Copy link
Contributor

@vector-of-bool vector-of-bool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, but leaving open for a behavioral question regarding iteration exception-throwing

Comment on lines 54 to 62
if (!bson_init_static(&bson, _view.data(), _view.length())) {
throw v1::exception{code::invalid_data};
}

bson_iter_t iter;

if (!bson_iter_init_find_w_len(&iter, &bson, key.c_str(), static_cast<int>(key.length()))) {
return this->end();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bson_iter_find functions may return false if they fail to find the element, but also may return false if they encounter a parse error (and set iter.err_off). Should this also throw invalid_data in that case?


namespace {

constexpr std::uint8_t k_default_view[5] = {5u, 0u, 0u, 0u, 0u};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Can this be a std::array<u8, 5>? (clang-tidy nags about arrays)

Copy link
Contributor

@vector-of-bool vector-of-bool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants