Turing.Experimental.Gibbs often causes test suites to fail due to high-variance estimates for the models in DynamicPPL.TestUtils.DEMO_MODELS.
At the moment, it's unclear to me whether these failures are due to more extensive testing of Turing.Experimental.Gibbs (I'm uncertain whether Gibbs is actually being tested as rigorously on DEMO_MODELS), or whether there's an actual statistical difference between the two samplers.
I'm making this issue to keep track of it; will have a look at it soon.