Skip to content

Conversation

@lukasz-golebiewski
Copy link
Contributor

Description

Adds peer fork id validation as described in the section "validation rules" of https://eips.ethereum.org/EIPS/eip-2124

@lukasz-golebiewski lukasz-golebiewski force-pushed the feature/etcm-354/forkid-check branch 2 times, most recently from 423bc93 to 10cdc43 Compare June 23, 2021 14:47
@lukasz-golebiewski lukasz-golebiewski force-pushed the feature/etcm-354/forkid-check branch 2 times, most recently from 476bc61 to 7202b88 Compare June 24, 2021 09:14
Copy link
Contributor

@jvdp jvdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to my eyeballs but the code and tests could be simpler. And it will be more interesting once this is hooked into handshaking and running on a test net.

val (unpassedFork, i) = (forks :+ maxUInt64).zipWithIndex.find { case (fork, _) => currentHeight < fork }.get

// The checks are left biased -> whenever a result is found we need to short circuit
val validate: F[Option[ForkIdValidationResult]] = (for {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method looks a lot more complicated than it is. How about moving the tracing into the check methods, for a start? (Do we still need the tracing btw?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method looks a lot more complicated than it is.

This is vague, can you elaborate on that?

How about moving the tracing into the check methods, for a start?

It could be done, but I can't see how moving logging instructions around is going to make the method less complicated.

(Do we still need the tracing btw?)

Logging is part of our DoD and it might get useful when testing on testnet, so I wouldn't remove it.

res <- fromEither[F](Either.left[ForkIdValidationResult, Unit](ErrLocalIncompatibleOrStale))
} yield (res)).value
.map(_.swap)
.flatMap(res => Logger[F].debug(s"Validation result is: $res") >> Monad[F].pure(res.toOption))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also use a for-comprehension for this part? (And bind the existing one to a name.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


val maxUInt64 = (BigInt(0x7fffffffffffffffL) << 1) + 1 // scalastyle:ignore magic.number

def validatePeer[F[_]: Monad: Logger](
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do with a scaladoc since it's 'external' API, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, added

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need the monad ?
This is a synchronous function, so there is no need for an async abstraction. I get that logging is a kind of side effect, but from the caller point of view it does not really matter, and it adds some noise because everything needs to be inside a liftF now.

Copy link
Contributor Author

@lukasz-golebiewski lukasz-golebiewski Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Monad has nothing to do with asynchronicity, it allows chaining effectful computations.
Logging is a side effect. Are you saying we should ignore that fact?

genesisChecksum +: (forks.map { fork =>
crc.update(bigIntToBytes(fork, 8))
BigInt(crc.getValue())
}).toVector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code perhaps looks functional but it's really relying on the mutable CRC instance. It's probably more performant this way, but it could be more descriptive using List.unfold or similar. (Eg. first produce a collection of collections to produce the CRC with, then flatten it.)

(But this is just a suggestion ofc.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code isn't functional because of a side effect in map. It is because of the nature of Java's CRC as you correctly noted. This however doesn't break the referential transparency of calculateChecksum, so the function itself remains pure.

Some(Connect)
}
} else { None }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be nicer as a match on remoteId:

remoteId match {
  case ForkId(hash, _) if checksums(i) != hash => None
  case ForkId(_, Some(next)) if currentHeight >= next => Some(ErrLocalIncompatibleOrStale)
  case _ => Some(Connect)
}

(Gets rid of the nested if and the Option.get.)

"ForkIdValidator" must {
"correctly validate ETH peers" in {
// latest fork at the time of writing those assertions (in the spec) was Petersburg
val ethForksList: List[BigInt] = List(1150000, 1920000, 2463000, 2675000, 4370000, 7280000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these different forks interesting for the purposes of testing? I mean, you are testing it with 6, would it make a material difference to the test quality if it was 5 or 7 or 9000?

Also, would this be a good opportunity for property based testing perhaps?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now I'm curious - how does one property test an equation? by writing the same equation in a different way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but part of this 'equation' would be in how the arbitrary data is constructed. I agree that just re-implementing the same logic wouldn't make much sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh this isn't arbitrary data - it's hard fork heights.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was provided as part of the specification in https://eips.ethereum.org/EIPS/eip-2124 so we would have to have a really good reason to modify or not include it here.

Whether all the forks are interesting for the purposes of testing is probably a question to the original EIP's authors.

@jvdp what are the properties you would like to test here?


def validatePeer[F[_]: Monad: Logger](
genesisHash: ByteString,
forks: List[BigInt]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should forks be validated to not be empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary

@lukasz-golebiewski lukasz-golebiewski force-pushed the feature/etcm-354/forkid-check branch from a32e329 to 1050ee8 Compare June 24, 2021 13:17
@lukasz-golebiewski lukasz-golebiewski force-pushed the feature/etcm-354/forkid-check branch from 1050ee8 to 3a10074 Compare June 24, 2021 13:45
"org.codehaus.janino" % "janino" % "3.1.2"
"org.codehaus.janino" % "janino" % "3.1.2",
"org.typelevel" %% "log4cats-core" % "2.1.1",
"org.typelevel" %% "log4cats-slf4j" % "1.3.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need to add 3 deps for one class ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, it's only 2 my bad

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a limit for that? :-)

_ <- liftF(Logger[F].trace(s"Before checkSuperset"))
sup <- fromEither[F](checkSuperset(checksums, remoteId, i).toLeft("not in superset"))
_ <- liftF(Logger[F].trace(s"checkSuperset result: $sup"))
_ <- liftF(Logger[F].trace(s"No check succeeded"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could do with just one log between each step right ?

checksums: Vector[BigInt],
remoteId: ForkId,
currentHeight: BigInt,
i: Int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could directly pass the current fork block hash instead of i and a list. It's not really obvious that the index of the next fork is actually also the index of the current checksums.

forks.zipWithIndex.find { case (fork, _) => currentHeight < fork }.getOrElse((maxUInt64, forks.length))

// The checks are left biased -> whenever a result is found we need to short circuit
val validate = (for {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we find the hash in checksums which match remote hash first ? So that we directly know if we should do the matching/subset/superset check directly ? I feel that it would simplify the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I can see what you mean. Could you paste a code snippet with the suggested simplification?

val checksums: Vector[BigInt] = calculateCheckchecksums(genesisHash, forks)

// find the first unpassed fork and it's index
val (unpassedFork, i) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i -> nextForkIndex or unpassedForkIndex ?

} yield (res.getOrElse(Connect))
}

private def calculateCheckchecksums(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo calculateChecksums


val maxUInt64 = (BigInt(0x7fffffffffffffffL) << 1) + 1 // scalastyle:ignore magic.number

def validatePeer[F[_]: Monad: Logger](
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need the monad ?
This is a synchronous function, so there is no need for an async abstraction. I get that logging is a kind of side effect, but from the caller point of view it does not really matter, and it adds some noise because everything needs to be inside a liftF now.

Copy link
Contributor

@dzajkowski dzajkowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a set of topics in slack to continue the discussion but from my pov it LGTM

@lukasz-golebiewski lukasz-golebiewski merged commit 127524c into develop Jun 25, 2021
@lukasz-golebiewski lukasz-golebiewski deleted the feature/etcm-354/forkid-check branch June 25, 2021 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants