Support RawBson objects. #133

@jcdyer

I'm working on an implementation of a zero-copy RawBson type / family of types that stores the input data as a &[u8] and operates directly on the bytes. This seems like a potentially better direct-from-the-wire format for the mongo crate: it avoids parsing and allocating up front, and it can still support many of the methods that the current Bson / OrderedDocument types support. Is this something you'd be interested in incorporating into this crate?

It's a WIP, but as it stands now, you can directly instantiate a raw document from a &[u8], and from that query into the document using doc.get("key")?.as_<type>()?, similar to the current implementation. Currently, keying into the document is a linear operation, but it could be sped up by validating the BSON and storing offsets in an index during construction.
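To make the linear-versus-indexed trade-off concrete, here is a minimal sketch, not the WIP code itself. The helper names `read_len` and `elements` are hypothetical, only a handful of element types are handled, and for brevity the walk collects into a Vec (a real zero-copy lookup would scan without allocating). It stores value slices rather than offsets, which carries the same information.

```rust
use std::collections::HashMap;

/// Reads a little-endian i32 at `at`; returns None if out of bounds or negative.
fn read_len(data: &[u8], at: usize) -> Option<usize> {
    let b = data.get(at..at.checked_add(4)?)?;
    let n = i32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    if n < 0 { None } else { Some(n as usize) }
}

/// Walks a BSON document once, returning (key, type byte, value bytes) per element.
/// Returns None on malformed input or on an element type this sketch does not handle.
fn elements(data: &[u8]) -> Option<Vec<(&str, u8, &[u8])>> {
    // The header length must match the slice, and the last byte must be the terminator.
    if data.len() < 5 || read_len(data, 0)? != data.len() || *data.last()? != 0 {
        return None;
    }
    let end = data.len() - 1; // position of the trailing 0x00
    let mut out = Vec::new();
    let mut pos = 4;
    while pos < end {
        let tag = data[pos];
        // The key is a NUL-terminated string directly after the type byte.
        let key_start = pos + 1;
        let key_len = data.get(key_start..end)?.iter().position(|&b| b == 0)?;
        let key = std::str::from_utf8(&data[key_start..key_start + key_len]).ok()?;
        let val_start = key_start + key_len + 1;
        let val_len = match tag {
            0x01 | 0x12 => 8,                                        // double, int64
            0x02 => 4usize.checked_add(read_len(data, val_start)?)?, // string: prefix + bytes
            0x03 | 0x04 => read_len(data, val_start)?,               // document, array
            0x08 => 1,                                               // boolean
            0x10 => 4,                                               // int32
            _ => return None,                                        // not covered here
        };
        let val_end = val_start.checked_add(val_len)?;
        if val_end > end {
            return None;
        }
        out.push((key, tag, &data[val_start..val_end]));
        pos = val_end;
    }
    Some(out)
}

fn main() {
    // {"hello": "world", "n": 7} encoded by hand.
    let bytes = b"\x1d\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x10n\x00\x07\x00\x00\x00\x00";
    let elems = elements(bytes).expect("well-formed document");

    // Linear lookup: scan the elements until the key matches.
    let linear = elems.iter().find(|(key, _, _)| *key == "n");
    println!("linear: {:?}", linear);

    // Indexed lookup: pay for one validating pass up front, then use hash lookups.
    let index: HashMap<&str, (u8, &[u8])> =
        elems.iter().map(|&(key, tag, value)| (key, (tag, value))).collect();
    println!("indexed: {:?}", index.get("hello"));
}
```

The index only pays off when a document is queried more than a few times; for a single get() the plain scan is likely cheaper.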

It would not support .get_mut() operations. If we wanted to add them in the future, they would need to return (parsed) Bson objects, since in the general case you can't manipulate the data without allocating new buffers.

I do envision having conversion functions from the raw types to the parsed ones. The other direction would complicate things, because the current implementation uses &[u8], and there would need to be an owner of the raw data (Cow<'_, [u8]>?).
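One way the ownership question could play out is sketched below, under assumptions: `RawDocBuf` and `encode_string_field` are hypothetical names, and the encoder handles only a single string field whose key contains no NUL byte. The point is that bytes produced by encoding have no external owner, so the raw type has to be able to hold them itself.

```rust
use std::borrow::Cow;

/// Hypothetical raw document that either borrows wire bytes or owns encoded ones.
pub struct RawDocBuf<'a> {
    data: Cow<'a, [u8]>,
}

impl<'a> RawDocBuf<'a> {
    /// Zero-copy path: wrap bytes that someone else owns (e.g. a network buffer).
    pub fn borrowed(data: &'a [u8]) -> Self {
        RawDocBuf { data: Cow::Borrowed(data) }
    }

    /// Parsed-to-raw path: freshly encoded bytes need an owner.
    pub fn owned(data: Vec<u8>) -> Self {
        RawDocBuf { data: Cow::Owned(data) }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.data
    }

    pub fn is_borrowed(&self) -> bool {
        matches!(self.data, Cow::Borrowed(_))
    }

    /// Detach from the source buffer, copying only if the bytes were borrowed.
    pub fn into_owned(self) -> RawDocBuf<'static> {
        RawDocBuf { data: Cow::Owned(self.data.into_owned()) }
    }
}

/// Encodes a document with one string field. Stands in for a real encoder.
fn encode_string_field(key: &str, value: &str) -> Vec<u8> {
    let mut body = vec![0x02]; // string element type
    body.extend_from_slice(key.as_bytes());
    body.push(0);
    // The string length prefix counts the trailing NUL.
    body.extend_from_slice(&((value.len() + 1) as i32).to_le_bytes());
    body.extend_from_slice(value.as_bytes());
    body.push(0);

    // Total length covers the 4-byte header, the body, and the terminator.
    let total = (4 + body.len() + 1) as i32;
    let mut out = total.to_le_bytes().to_vec();
    out.extend_from_slice(&body);
    out.push(0);
    out
}

fn main() {
    let wire = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00";
    let borrowed = RawDocBuf::borrowed(wire);
    let owned = RawDocBuf::owned(encode_string_field("hello", "world"));

    println!("same bytes: {}", borrowed.as_bytes() == owned.as_bytes());
    println!("borrowed: {}, {}", borrowed.is_borrowed(), owned.is_borrowed());
}
```

The cost of this design is that every accessor goes through the Cow; a separate owned type alongside the borrowed one would be an alternative.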

Code overview:

pub struct RawBsonDoc<'a> {
    data: &'a [u8],
}

impl<'a> RawBsonDoc<'a> {
    pub fn new(data: &'a [u8]) -> Option<RawBsonDoc<'a>>;
    pub fn get(&'a self, key: &str) -> Option<RawBson<'a>>;
    pub fn to_parsed_doc(&self) -> OrderedDocument;
}

pub struct RawBsonArray<'a> {
    data: &'a [u8],
}

impl<'a> RawBsonArray<'a> {
    pub fn new(data: &'a [u8]) -> Option<RawBsonArray<'a>>;
    pub fn get(&'a self, index: usize) -> Option<RawBson<'a>>;
    pub fn to_parsed_array(&self) -> ??;
}

pub struct RawBson<'a> {
    bsontype: BsonType,
    data: &'a [u8],
}

impl<'a> RawBson<'a> {
    // only constructed from RawBsonDoc::get(&str) or RawBsonArray::get(usize)
    pub fn as_str(&self) -> Option<&'a str>;
    pub fn as_doc(&self) -> Option<RawBsonDoc<'a>>;
    pub fn as_array(&self) -> Option<RawBsonArray<'a>>;
    pub fn to_parsed_bson(&self) -> Bson;
}

// Not publicly exposed
enum BsonType {
    String,
    Document,
    Array,
    // ...
}
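
As an illustration of how one of the as_<type>() accessors above might work, here is a minimal sketch of as_str. It assumes the element's data slice starts at the string's int32 length prefix; the actual WIP layout may differ. A type mismatch or malformed bytes yield None rather than a panic.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum BsonType {
    String,
    Document,
    Array,
}

pub struct RawBson<'a> {
    bsontype: BsonType,
    data: &'a [u8],
}

impl<'a> RawBson<'a> {
    /// Returns the string without copying, or None if this is not a valid string element.
    pub fn as_str(&self) -> Option<&'a str> {
        if self.bsontype != BsonType::String {
            return None;
        }
        let data: &'a [u8] = self.data;

        // Assumed layout: int32 length (counting the trailing NUL), bytes, NUL.
        let prefix = data.get(..4)?;
        let len = i32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]);
        if len < 1 {
            return None;
        }
        let len = len as usize;
        let body = data.get(4..4usize.checked_add(len)?)?;

        // Split off the terminator and check it before trusting the text.
        let (text, nul) = body.split_at(len - 1);
        if nul[0] != 0 {
            return None;
        }
        std::str::from_utf8(text).ok()
    }
}

fn main() {
    let value = RawBson {
        bsontype: BsonType::String,
        data: b"\x06\x00\x00\x00world\x00",
    };
    println!("{:?}", value.as_str());

    let not_a_string = RawBson {
        bsontype: BsonType::Document,
        data: b"\x05\x00\x00\x00\x00",
    };
    println!("{:?}", not_a_string.as_str());
}
```

Because the returned &str borrows from the original buffer with lifetime 'a, it can outlive the RawBson value it was read from.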

Still under design: how this interfaces with the encode and decode functions, with types that implement serde::Deserialize, and with the mongo query functions.

Labels: tracked-in-jira (ticket filed in Mongo's Jira system)