Skip to content

Conversation

@drozdziak1
Copy link
Contributor

@drozdziak1 drozdziak1 commented Oct 6, 2022

This changeset implements continuous reloads of the mapping account, without interruptions. The complexity bump is there but still appears worth the hassle.

Stan Drozd added 4 commits October 5, 2022 14:26
This change takes care of recoverable mapping crawling
errors (e.g. malformed single price on single product is no longer
dropping otherwise good different prices and products in the mapping
in favor of a warn message)
);

let results = futures::future::join_all(attestation_sched_futs).await; // May never finish for daemon mode
let mut batches: Vec<_> = attestation_cfg
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can compute these only when needed (when config has changed below). Performance-wise it's not much different but it will result in more clear logs.

@ali-behjati ali-behjati requested a review from jayantk October 19, 2022 10:58
@ali-behjati
Copy link
Collaborator

I think the code is technically correct and I don't see any issue (just why tilt is complaining?).

My concern is mostly the complexity of this code, it is very complex now.

Now that we are not using constraints to update prices (and we are doing it as fast as possible) we can let go of the constraint check totally, and then have it as two loop. One loop is always attesting based on the groups. One loop is crawling mapping accounts and updating the groups if needed. This will be reusing everything here and removing many existing code to make it simpler.

I think we can merge this one and go for simplifying it like ^ in another PR.

Let's hear what other folks think and if they agree I'll approve this PR.

Copy link
Collaborator

@ali-behjati ali-behjati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked with Stan about this PR and here is a summary:

  • The current code can be further simplified if we can move out the non-deamon logic and it will be much easier to reason about.
  • Also moving some logic in the long functions to separate functions will improve the readability of the code
  • There is no way forward to separate the loops with current design. To do it we should step back and change the whole structure of code. Given that it will be a huge refactor it's not clear for me that the outcome is worth the effort.

Copy link
Contributor

@jayantk jayantk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the cryptographic hash thing should be fixed now.

Aside from that, I think @ali-bahjati 's comment above is right. Let's merge this, but would be good to follow up with a refactoring PR to make the logic easier to understand.


/// Trigger attestation if price changes by the specified percentage.
/// Trigger attestation if price changes by the specified
/// percentage, expressed in integer parts per thousand.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suggest using basis points as the units here (1 basis point = 1/100th of a %), which is a very standard financial measurement

.collect();
info!("Spinning up attestation sched jobs");
if daemon {
let mut hasher = DefaultHasher::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want a cryptographic hash here -- if you get a hash collision, the code below isn't going to do the right thing.

/// percentage, expressed in integer parts per thousand.
#[serde(default)]
pub price_changed_pct: Option<f64>,
pub price_changed_ppt: Option<u64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i really really think you should use basis points here. 1 / 1000th is a weird unit for prices, but 1 / 10000 (1 bp) is very standard. This configuration option is going to confuse people.

message_q_mtx.clone(),
)
});
// This loop governs mapping account reloads in daemon mode. It is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is long and the logic is hard to understand because it supports all these different options around daemon / no daemon.

I think it would be good to refactor this by extracting out helper functions and then possibly having two separate top-level functions for daemon / non-daemon mode. Each of those functions could use the helpers to avoid code duplication.

Copy link
Contributor

@jayantk jayantk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! I left you a couple minor comments. I suggest addressing them in a fast follow up PR so that it's easier to review.

/// percentage, expressed in integer base points (1bp = 0.01%)
#[serde(default)]
pub price_changed_pct: Option<f64>,
pub price_changed_bp: Option<u64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: usually you abbreviate basis points to "bps"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

);

// Hash currently known config
let mut hasher = DefaultHasher::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i left a comment last time about using a cryptographic hash here. The code below depends on no collisions for correctness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I think the std HashMap has internal collision detection that isn't part of the hasher. I assumed it would be enough, but a nicer 32-byte scheme will fit better here.

debug!(
"Attestation config (includes mapping accounts):\n{:#?}",
attestation_cfg
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this debug statement needs to be in the if i think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also the code in the if statement above is duplicated across this method and the one below. consider extracting out a helper function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it here to print the config regardless, it's the "includes mapping accounts" part that's new.

let message_q_mtx = Arc::new(Mutex::new(P2WMessageQueue::new(
Duration::from_millis(attestation_cfg.min_msg_reuse_interval_ms),
attestation_cfg.max_msg_accounts as usize,
)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the state of this queue gets reset each time the attestation batches change. Does that mean we can end up reusing accounts more frequently than intended each time the mapping account reloads (and the batches get recreated)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a big deal given that the mapping accounts reload very infrequently. maybe leave a // todo comment if this is the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an easy fix, just moved the mutex constructor to before the loop so that it and its state is reused. Async runtimes must properly deallocate variables in scope of an aborted task (to avoid counter-intuitive mem leaks), so it should work well now and in the future

@drozdziak1 drozdziak1 merged commit 3eb2bea into main Oct 24, 2022
@drozdziak1 drozdziak1 deleted the drozdziak1/p2w-attest-cont-mapping-reload branch October 24, 2022 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants