- 
                Notifications
    
You must be signed in to change notification settings  - Fork 3.4k
 
HBASE-29071 StochasticLoadBalancer candidate generators should use a Map, rather than ordinal based indexing #6598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
ab86f08    to
    647369a      
    Compare
  
    
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
647369a    to
    4712a09      
    Compare
  
    
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
d3cec9c    to
    21d0368      
    Compare
  
    
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| @Test | ||
| public void testBalanceCluster() throws Exception { | ||
| setMaxRunTime(Duration.ofMillis(1500)); | ||
| setMaxRunTime(Duration.ofMillis(2500)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was a little ambitious with the duration required here (in #6597 I drastically reduced balancer test suite runtime), and this test is much less flexible because it doesn't follow the standard balancer test framework. I could bite off a larger refactor, but I'm not sure it's worth it. So, anyway, I saw some flappiness in this test locally that was resolved by just giving a little bit more time for plan generation since it expects the plan to be perfect in one shot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'll find that the upstream execution needs quite a lot more time than local runs. Congratulations on your first flakey test!
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
        
          
                ...e-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | void initCosts(BalancerClusterState cluster) { | ||
| // Initialize the weights of generator every time | ||
| weightsOfGenerators = new double[this.candidateGenerators.size()]; | ||
| weightsOfGenerators = new HashMap<>(this.candidateGenerators.size()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're using getOrDefault everywhere, so can't you just clear() the map instead of re-seeding it with default values? Or you're protecting against non-default accesses from derived classes? In that case, make the instance private and force derived classes to interact with its contents though accessor methods.
Wait, it is private. Yeah, just clear it in the init function and access it via getOrDefault.
| @Test | ||
| public void testBalanceCluster() throws Exception { | ||
| setMaxRunTime(Duration.ofMillis(1500)); | ||
| setMaxRunTime(Duration.ofMillis(2500)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'll find that the upstream execution needs quite a lot more time than local runs. Congratulations on your first flakey test!
| partialSums.set(i, partialSums.get(i) / sum); | ||
| } | ||
| 
               | 
          ||
| // Generate a random number and pick the first generator whose partial sum is >= rand | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This weighted selection algorithm is pretty awkward to follow if you don't expect it. Can you isolate it within its own method so that the calling method is easier to read and the implementation could be picked up for reuse later. Also, it becomes testable independently, if such a thing is testable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I had some trouble understanding exactly what the goals were when working this the original code. I want to flag that this weighted selection algorithm is a pretty faithful attempt at maintaining the original approach — but with maps rather than arrays, some extra comments to try to explain the steps more explicitly, explicitly throwing rather than fetching a nonexistent item if generators are empty, falling back to random generators rather than some magic importance on the last in the list, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can clean this up a bit with the precondition and maybe pulling random generator picking out into its own method — but the logic necessary for picking a random generator is basically the entirety of this. Unless we want to make a static method with a ton of arguments just for testing's sake, and honestly I think this is really well tested by the current suite; if you don't pick the right generators at the right time, then the balancer doesn't work well at all
| logMessage.append(String.format(" - %s: %d steps, %d approvals%n", generator.getSimpleName(), | ||
| count, approvals)); | ||
| }); | ||
| LOG.info(logMessage.toString()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this really be logged at INFO level? Seems more like DEBUG to me.
| StringBuilder logMessage = new StringBuilder("CandidateGenerator activity summary:\n"); | ||
| generatorToStepCount.forEach((generator, count) -> { | ||
| long approvals = generatorToApprovedActionCount.getOrDefault(generator, 0L); | ||
| logMessage.append(String.format(" - %s: %d steps, %d approvals%n", generator.getSimpleName(), | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a String joiner here instead of naive append so that you don't have the extra newline at the end.
21d0368    to
    61bd05a      
    Compare
  
    
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
…Map, rather than ordinal based indexing
61bd05a    to
    b8ab794      
    Compare
  
    
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one @rmdmattingly
| 
           I don't understand why this build is marked as failed -- the console output is all clear. https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6598/7/console  | 
    
| 
           🎊 +1 overall 
 
 This message was automatically generated.  | 
    
| 
           💔 -1 overall 
 
 This message was automatically generated.  | 
    
| 
           Test failures are disappointing, but irrelevant  | 
    
…Map, rather than ordinal based indexing (#6598) Co-authored-by: Ray Mattingly <[email protected]>
…Map, rather than ordinal based indexing (#6598) Co-authored-by: Ray Mattingly <[email protected]>
…Map, rather than ordinal based indexing (#6598) Co-authored-by: Ray Mattingly <[email protected]>
…Map, rather than ordinal based indexing (#6598) (#6621) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…Map, rather than ordinal based indexing (#6598) (#6620) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…Map, rather than ordinal based indexing (#6598) (#6621) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…Map, rather than ordinal based indexing (apache#6598) Co-authored-by: Ray Mattingly <[email protected]>
…Map, rather than ordinal based indexing (apache#6598) Co-authored-by: Ray Mattingly <[email protected]>
…Map, rather than ordinal based indexing (#6598) (#6621) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…Map, rather than ordinal based indexing (#6598) (#6621) (#6631) Signed-off-by: Nick Dimiduk <[email protected]> Co-authored-by: Ray Mattingly <[email protected]>
…Map, rather than ordinal based indexing (apache#6598) (apache#6621) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…tors should use a Map, rather than ordinal based indexing (apache#6598) (apache#6621) (will be in 2.6.3) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…tors should use a Map, rather than ordinal based indexing (apache#6598) (apache#6621) (will be in 2.6.3) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…tors should use a Map, rather than ordinal based indexing (apache#6598) (apache#6621) (will be in 2.6.3) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…Map, rather than ordinal based indexing (apache#6598) (apache#6621) Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
This just makes the candidate generators nicer to work with. This is part of #6593 that I'm breaking off into an atomic piece
Existing test coverage provides great guardrails to ensure that we haven't broken anything