Skip to content

Conversation

@byrnedj
Copy link

@byrnedj byrnedj commented Feb 6, 2024

  • binary request generation and replay
  • fast forwarding of a trace
  • preloading requests into memory

This change is Reviewable

: ReplayGeneratorBase(config), binaryStream_(config) {
for (uint32_t i = 0; i < numShards_; ++i) {
stressorCtxs_.emplace_back(std::make_unique<StressorCtx>(i));
std::string_view s{"abc"};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

outputStreamKeys_ += nreqs * binReqSize;
folly::setThreadName("cb_binary_gen");
traceStream_.fastForwardTrace(fastForwardCount_);
genRequests(latch);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to pass the latch here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In generating the binary file I don't need the latch, but in the other path (reading the CSV file normally) we use the latch to wait for preloading some requests. So it's not used in binary trace generation but I still pass it because it's a parameter to genRequests.

Is there a better way to handle this passing the latch in this case? genRequests does require it (counts down once we have hit a number of requests).

req->req_.ttlSecs, keyOffset);
std::memcpy(outputStreamReqs_ + nreqs, &binReq, sizeof(binReq));
std::memcpy(outputStreamKeys_ + keyOffset, req->key_.c_str(), keySize);
if ((nreqs % BIN_REQ_INT) == 0 && nreqs > 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment for this part of the code?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

XDCHECK_LT(req->op_, 12);
auto key = req->getKey(binaryStream_.getKeyOffset());
OpType op;
switch (req->op_) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just cast OptType to uint8_t here and avoid the switch

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

setEOF();
}

const Request& BinaryKVReplayGenerator::getReq(uint8_t,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function (and the entire file) looks pretty similar to KVReplayGenerator.h Can't we unify them? Perhaps make this (and other) function a template (where either BinaryRequest* or std::unique_ptr is the arg)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually pretty hard because we don't have the same underlying request queue logic in the binary trace version.

ttl_(ttl),
keyOffset_(keyOffset) {}

std::string_view getKey(char* baseAddr) const {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you will store offset relative to this pointer than you do not need to pass baseAddr as a parameter. The code of the method will look something like this:

std::string_view getKey() const {
    return std::string_view(
        reinterpret_cast<const char*>(this+ keyOffset_), keySize_);
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just figured out that you store keys in separate file, correct? if so, my idea is not feasible. It is only possible if BinaryRequest objects and corresponding keys are stored in the same file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - thanks for the note!

@byrnedj byrnedj changed the base branch from develop to main April 22, 2024 16:00
------------------------
This offers much lower overhead of trace replaying. It
assumes the kvcache trace format and kvcache behavoir.
This patch supports the following:
- binary request generation and replay
- fast forwarding of a trace
- preloading requests into memory
- object size amplification
- queue free for even lower request overhead
- can parse many more requests per second than cachelib
  can process, so we can get 100% CPU usage

The limitations are:
- no trace amplification (however you can amplify the
original .csv trace and save it in binary format)
- ~4GB overhead per 100 million requests
- you need some disk space to store large traces
@byrnedj byrnedj force-pushed the binary_request branch 2 times, most recently from cf0a7c1 to 9d49e46 Compare April 22, 2024 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants